
Nanostructure Science and Technology Series Editor: David J. Lockwood

Igor Tsukerman

Computational Methods for Nanoscale Applications
Particles, Plasmons and Waves

Second Edition

Nanostructure Science and Technology

Series Editor
David J. Lockwood, FRSC
National Research Council of Canada
Ottawa, ON, Canada

More information about this series at http://www.springer.com/series/6331

Igor Tsukerman

Computational Methods for Nanoscale Applications
Particles, Plasmons and Waves

Second Edition


Igor Tsukerman
Akron, OH, USA

ISSN 1571-5744    ISSN 2197-7976 (electronic)
Nanostructure Science and Technology
ISBN 978-3-030-43892-0    ISBN 978-3-030-43893-7 (eBook)
https://doi.org/10.1007/978-3-030-43893-7

1st edition: © Springer Science+Business Media, LLC 2008
2nd edition: © Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

To the memory of my mother, to my father, and to the miracle of M.

Foreword

Nanotechnologies are no longer the revolution conjectured by Feynman on the eve of the 1960s; they are here, in our daily life, in your hand, in your pocket and sometimes even on your skin. Among all the facets of nanoscience, the interaction of light and electromagnetic radiation with nanoscale objects is probably the most compelling way to perceive how a structure can behave very differently when its dimensions shrink. Think of the appearance of a silver coin, shiny and gray, while silver colloidal nanoparticles are blue, red or yellow, depending on their size or shape. The nanoworld is full of surprises, and its interaction with light and microwaves is one of their most prominent indicators.

Comprehending these wonders is, however, quite challenging, since it is impossible to see the nanoworld with the naked eye. Most of the time, one must resort to models to understand how light–matter interactions take place at the nanoscale. This is the very topic of this book, for which it provides an invaluable reference.

The manuscript is organized like a solid, three-dimensional edifice. Chapters 2 and 3 provide the foundations and describe in detail the two main techniques for discretizing differential equations: finite differences and finite elements. Each topic is first introduced in very general terms, so that the novice reader can easily become acquainted with it. This first exposition is followed by more specialized topics, such as the flux-balance and Collatz Mehrstellen schemes for finite differences, or edge elements and error estimates for finite elements. The discussion in these two chapters is very general and not limited to the solution of Maxwell's equations. A later chapter, Chap. 7, focuses on finite differences in the time domain, a very popular method for studying electromagnetic problems, especially in nanophotonics. There actually exist quite a few commercial and noncommercial implementations of this algorithm, and Chap. 7 concludes with a useful table that can help the nonspecialist choose the best package.

Professor Igor Tsukerman has made important contributions to a numerical approach that aims at using the simplicity of the regular grid associated with finite differences, while overcoming some of its limitations, especially in terms of accuracy. This flexible local approximation method (FLAME) is the topic of Chap. 4, where the author guides us through the defining postulates of the method:


a broad pool of basis functions, finding an approximation for the solution instead of the equation, allowing different solutions to coexist within one computational domain, and exploiting as much as possible the flexibility of the numerical scheme. FLAME represents a novel approach to solving discretized equations, and this chapter provides a very comprehensive description for those who would like to unleash its potential. The author puts his own contributions well in perspective and discusses in detail the numerical landscape around FLAME, which includes popular methods such as discontinuous Galerkin, meshless and pseudospectral methods, to name a few.

Moving upward in the book edifice, in Chaps. 5 and 6 the reader is reminded that, at the nanoscale, individual charges and dipoles become the sources of interaction, and that this interaction can act over very large distances. This is a rather counterintuitive idea, since one may be tempted to think that at the nanoscale everything is local. That is of course not true; think, for example, of Wood's anomaly, which can produce a strongly enhanced electromagnetic field in the vicinity of a surface, yet is the emanation of an infinite periodic lattice. Several problems described in those chapters are still open today: how to handle very small separation distances between nanoscopic objects, and where the limit of a classical description based on Maxwell's equations lies, beyond which more sophisticated approaches are required.

The last three chapters of the book are made to tickle, and often challenge, the curiosity of the reader. They address many current problems in nanophotonics and metamaterials. Efficient links are made in those chapters with the foundation material presented in the first part of the book, so that a reader who is missing some background can easily find the appropriate references. In those last chapters, the author provides his original and perceptive view on problems such as photonic bandgap calculations, light confinement beyond the diffraction limit, plasmonic enhancement, homogenization of metamaterials and the definition of losses in conductors.

I have been developing numerical techniques to solve Maxwell's equations for over three decades, and I am still fascinated (and sometimes put off…) by the complexity associated with solving them at the nanoscale. I find the selection of topics in this book truly to the point, and its content will appeal both to those well versed in computational electromagnetics and to those entering the field. The presentation is very comprehensive and pedagogical, so that the book can serve for self-study, as well as a reference to keep at hand when developing codes.

Over the last decade, the art of computational electromagnetics has also become much more accessible, with several all-round numerical platforms readily available. This book is also for the users of these platforms: while it is very easy to produce a colorful image of your numerical solution, this naïve representation can easily be flawed by an inappropriate discretization mesh or the wrong boundary conditions. The very practical and concrete sides of this book will prevent you from falling into such pitfalls and help you interpret your numerical results correctly when you explore the unexplored.


A particularly abundant bibliography supports the reader who would like to delve further into a specific topic.

In summary, this book provides in one single place the key elements of the most popular numerical methods for nanophotonics, molecular dynamics and charge interactions in solvents; it will prove an invaluable reference for the practitioner. It will also appeal to the teacher or the student—physicist, chemist or computer scientist—who would like to develop an education in computational electromagnetics at the nanoscale. As the author writes in the introductory chapter, "computational simulation is not an exact science"; this very book, however, will help you choose the right approach to tackle the problem you are interested in, teach you how to set up a sound and solid numerical model, and help you transform the art of computation into a reliable ally for your explorations of the nanoworld.

May 2020

Olivier J. F. Martin
Swiss Federal Institute of Technology Lausanne (EPFL)
Lausanne, Switzerland

Preface

The purpose of this note … is to sort out my own thoughts … and to solicit ideas from others.
Lloyd N. Trefethen, Three Mysteries of Gaussian Elimination

Since 2008, when the first edition of this book was published, nanoscience, nanotechnology and computer simulation of nanoscale and molecular-scale systems have experienced steady growth. This is exemplified by the ISI database data in Fig. 1. It is clear that these research areas have reached a stage of maturity, yet are still being explored very actively. Another obvious conclusion: It is not humanly possible to keep track of all the research in nanoscale or molecular-scale phenomena and devices. This book deals primarily with electromagnetic analysis and simulation, but even that is way too broad. Naturally, the selection of topics is influenced strongly by my own research

Fig. 1 Academic publications 2008–2019, the ISI database. Keywords “nanoscale”; “nanoscale & simulation”. (The 2019 data may be incomplete.)


experience and expertise, but also by the adoption and acceptance of particular methods in the engineering and physics communities.

As a result, three new chapters have been included in the second edition. The first one deals with "finite difference time domain" (FDTD) methods, quite popular in electromagnetics and photonics simulations. FDTD is almost 60 years old and has become a powerful simulation technology in its own right and on all scales, covering virtually all types of numerical modeling of electromagnetic and acoustic phenomena, complex materials and devices. A large variety of software packages, both public domain and commercial, are available (Sect. 7.18). Related to FDTD is the Finite Integration Technique (FIT), Sect. 7.8.

The second new chapter is devoted to metamaterials—artificial structures judiciously designed to control refraction and propagation of waves and to produce physical effects not attainable in natural materials. The most notable of these effects are appreciable magnetic response at high frequencies, negative refraction and cloaking. I review these and other applications but concentrate on homogenization, one of the subjects of my own research in recent years. My particular emphasis is on non-asymptotic theories (not limited to vanishingly small lattice cell sizes) and on the role of boundary effects and conditions. Only with that in mind can accurate homogenized models be constructed.

As Fig. 2 shows, the body of research in FDTD and in metamaterials continues to expand rapidly, albeit for somewhat different reasons. FDTD is quite a mature area already, but continues to enjoy efficiency improvements and to find new applications. Metamaterials constitute a younger field of research; even though the ideas behind them can be traced back to the 1970s or even (with some stretch) to the 1950s–1960s, the field's growth was fueled by the advent of micro- and nanotechnology, as well as—importantly—by the demonstration of negative refraction, cloaking and other related effects in the 2000s.

The third new chapter (Chap. 10) is an assortment of curious and paradoxical facts related to the main content of the book. It is hoped that the readers will find this collection instructive.

Fig. 2 Academic publications 2008–2019, the ISI database. Keywords “finite difference time domain”; “metamaterials”. (The 2019 data may be incomplete.)


Many sections of the remaining seven chapters from the first edition of the book have been substantially revised and updated. Known typos have been corrected. A number of figures that were previously rendered in grayscale now appear in color.

***

I owe an enormous debt of gratitude to my parents for their incredible kindness and selflessness and to my wife for her tolerance of my character quirks and for her unwavering support under all circumstances. My son proofread parts of the book and made some helpful comments.

My work on the first edition of the book was interrupted by the sudden and heartbreaking death of my mother in the summer of 2006. I dedicate both editions to her memory.

***

March 2018 opened a void that can never be filled. Victor Lomonosov was a brilliant mathematician, well known by experts for his discoveries in functional analysis.¹ Victor would have certainly been able to set me straight on the mathematical part of Sect. 8.5. His accomplishments and mathematical fame notwithstanding, Victor was a very cordial and unassuming man, with many friends from every walk of life, who now miss him dearly.

***

ACKNOWLEDGMENT AND THANKS

Collaboration with Gary Friedman and his group, especially during my sabbatical in 2002–2003 at Drexel University, has influenced my research and the material of the first edition greatly. Gary’s energy, enthusiasm and innovative ideas have always been stimulating. During the same sabbatical year, I was fortunate to visit several research groups working on the simulation of colloids, polyelectrolytes, macro- and biomolecules. I am very grateful to all of them for their hospitality. I would particularly like to mention Christian Holm, Markus Deserno, Vladimir Lobaskin at the Max-Planck-Institut für Polymerforschung in Mainz, Germany; Rebecca Wade at the European Molecular Biology Laboratory in Heidelberg and Thomas Simonson at the Laboratoire de Biologie Structurale in Strasbourg, France.

¹ "In 1973, operator theorists were stunned by the generalization achieved by Lomonosov... The theorem Lomonosov obtained was a more general result than anyone had ever hoped to be able to prove." A. J. Michaels, Hilden's simple proof of Lomonosov's invariant subspace theorem. Advances in Mathematics 25, 56–58 (1977).


Advanced techniques developed by Alexei Sokolov (the University of Akron) and experiments in optical sensors and microscopy with molecular-scale resolution had a strong impact on my students' and my work. I thank Alexei for providing a great opportunity for joint research. Similarly, collaboration with Fritz Keilmann, an expert in near-field infrared microscopy, and his group (Max-Planck-Institut für Biochemie in Martinsried, Germany) was a great learning experience for us in 2006–2008.

In the course of the last three decades, I have benefited enormously from my communication with Alain Bossavit (Électricité de France and Laboratoire de Genie Electrique de Paris), from his very deep knowledge of all aspects of computational electromagnetism, and from his very detailed and thoughtful analysis of any difficult subject that would come up.

Isaak Mayergoyz of the University of Maryland at College Park has on many occasions shared his valuable insights with me. His knowledge of many areas of electromagnetism, physics and mathematics is profound and often unmatched.

My communication with Jon Webb (McGill University, Montréal) has always been thought-provoking and informative. His astute observations and comments make complicated matters look clear and simple. I was very pleased that Professor Webb devoted part of his time to our joint research on Flexible Local Approximation MEthods (FLAME, Chap. 4).

Yuri Kizimovich and I have worked jointly on a variety of projects for a long time. His original thinking and elegant solutions of practical problems have always been a great asset.

Even though over 30 years have already passed since the untimely death of my thesis advisor, Yu. V. Rakitskii, his students still remember very warmly his relentless striving for excellence and his quixotic attitude to scientific research. Rakitskii's main contribution was to numerical methods for stiff systems of differential equations. He was guided by the idea of incorporating, to the extent possible, analytical approximations into numerical methods. This approach is manifest in FLAME, which I believe Rakitskii would have liked.

My sincere thanks go to Elena Ivanova and Sergey Voskoboynikov (Technical University of St. Petersburg, Russia), for their very, very diligent work on FLAME (Chap. 4), and to Benjamin Yellen (Duke University), for many discussions, innovative ideas and for his great contribution to the NSF-NIRT project on magnetic assembly of particles.

I appreciate the help, support and opportunities provided by the International Compumag Society through the series of International Compumag Conferences and through personal communication with its Board and members: Jan K. Sykulski, Arnulf Kost, Kay Hameyer, François Henrotte, Oszkár Bíró, J.-P. Bastos, R. C. Mesquita, David Lowther, József Pávó, and others.

My Ph.D. students have contributed immensely to the research, and their work is frequently referred to throughout the book. Alexander Plaks worked on adaptive multigrid methods and generalized finite element methods for electromagnetic applications. Leonid Proekt was instrumental in the development of generalized FEM, especially for the vectorial case, and of absorbing boundary conditions.


Jianhua Dai has worked on generalized finite-difference methods. Frantisek Čajko developed schemes with flexible local approximation and carried out, with a great deal of intelligence and ingenuity, a variety of simulations in nanophotonics and nano-optics.

I gratefully acknowledge over two decades of financial support by the National Science Foundation and the NSF-NIRT program,² as well as private companies (Rockwell Automation, 3ga Co., Baker Hughes, ABB).

² Awards 9702364, 9812895, 0304453, 1216927, 1620112.

A number of workshops and tutorials at the University of Minnesota in Minneapolis have been exceptionally interesting and educational for me. I sincerely thank the organizers: Douglas Arnold, Boris Shklovskii, Alexander Grosberg, and others.

Special thanks to Serge Prudhomme, the reviewer of the first edition of the book, for many insightful comments, numerous corrections and suggestions, and especially for his careful and meticulous analysis of the chapters on finite-difference and finite element methods. The reviewer did not wish to remain anonymous, which greatly facilitated our communication and helped to improve the text.

A substantial portion of the book forms the basis of the graduate course "Simulation of Nanoscale Systems," which I developed and taught at the University of Akron, Ohio.

My work on both editions of the book became possible due to my sabbatical leaves in 2002–2003, 2010–2011 and 2017–2018. I am grateful to the University administration for granting these leaves, as well as to my colleagues at the Department of Electrical & Computer Engineering and four Department Chairs—Robert Veillette, Joan Carletta, Alexis De Abreu Garcia and Nathan Ida—for their support and encouragement.

I am also extremely grateful to all my national and international hosts over the years for sharing their knowledge, for their hospitality, patience and financial support through various funding agencies in their respective countries and institutions:

• Weng Cho Chew and Li Jun Jiang—Hong Kong University; Electrical Engineering
• Graeme Milton—The University of Utah; Mathematics
• Stéphane Clénet—ENSAM Lille, France; Electrical Engineering
• Sergey Bozhevolnyi—The University of Southern Denmark; Physics
• Oszkár Bíró—TU Graz, Austria; Electrical Engineering
• George Schatz—Northwestern University; Chemistry
• Yidong Chong—Nanyang Technological University, Singapore; Physics
• Vadim Markel, Sebastien Guenneau, Boris Gralak—Institut Fresnel, France
• Olav Breinbjerg, Radu Malureanu and especially Andrei Lavrinenko—DTU Fotonik and DTU Electrical Engineering (Otto Mønsteds Fond, Denmark)
• Olivier Martin—École polytechnique fédérale de Lausanne, Switzerland; Nanophotonics

• Karl Hollaus and Joachim Schöberl—TU Vienna, Austria; Scientific Computing
• Guglielmo Rubinacci and Carlo Forestiere—Università degli Studi di Napoli Federico II, Italy; Electrical Engineering
• Leszek Demkowicz and Gregory Rodin—The ICES Institute, UT Austin (JTO Faculty Fellowship)
• Che-Ting Chan—Hong Kong University of Science & Technology; Physics
• Ralf Hiptmair—ETH Zürich, Switzerland; Applied Mathematics

I have benefited greatly from communication with my colleagues on a number of research projects managed by Alex Fedoseyev and Edward Kansa: Semyon Tsynkov, Sergey V. Petropavlovsky, Uri Shumlak, who are all top-tier researchers.

Many thanks to teams of conference organizers for the opportunities to give plenary/keynote talks at a number of conferences. This has been a thrill and a humbling experience at the same time. The organizers include Roberto Graglia, Piergiorgio Uslenghi, Guido Lombardi, Luca Dal Negro, Kurt Busch, Roland Schnaubelt, Sebastien Guenneau, Boris Gralak, Herbert De Gersem, Sebastian Schöps and many others.

Over the last ten years, I have been extremely fortunate to have Vadim Markel as a key collaborator. His breadth of knowledge and depth of physical insight are truly impressive.

Finally, I thank Springer's editors, especially David Packer, David Lockwood, Jeffrey Taub, Barbara Amorese, Jayanthi Narayanaswamy for their help, cooperation and patience.

Akron, OH, USA
August 9, 2020

Igor Tsukerman

On Units and Conventions

If something can be misunderstood, it will be. If something cannot be misunderstood, it will be anyway.
Corollary to Murphy's laws

There are two main systems of units: Gaussian, used primarily by physicists and applied mathematicians, and SI, used primarily by electrical engineers. There are two sign conventions for the time phasor, exp(±iωt), and two symbols for the imaginary unit (i and j). Altogether, this produces eight notational possibilities. Fortunately, only 2–3 of those are in active use, but that is still quite messy. This creates a dilemma for a book like this one, whose subject is at the intersection of different disciplines and whose prospective readers may prefer different conventions. Moreover, various methods and theories originated in different scientific and research communities. For example, developments in finite-difference time domain (FDTD) methods have historically been published in the electrical engineering literature primarily, while the majority of publications in optics and photonics are on the physics side.

The beauty of each set of conventions is in the eye of the beholder. My own background is in electrical engineering, and this biased me toward the use of SI units and the exp(+iωt) phasor convention in the first edition of this book. One shortcoming of the SI system is the redundancy in electromagnetic parameters: two primary constants ε0 and μ0 with silly numerical values, as opposed to just the vacuum speed of light c in the Gaussian system.³ Having to carry ε0 and μ0 through all expressions is a particular nuisance in optics, photonics and homogenization theory of periodic structures.⁴

³ Yet one prominent researcher expressed an opinion that the free-space impedance √(μ0/ε0) ≈ 376.7 Ω has a significant physical meaning. Significance, like beauty, is truly "in the eye of the beholder".
⁴ There is one counterpoint, though. The relation B = μ0H, rather than just B = H in a vacuum, gives us a hint that B and H can be considered as fundamentally different objects. This lies in the foundation of the differential-geometric treatment of electromagnetic theory (see remarks on Sects. 7.2 and 9.3.7). Still, this differential-geometric treatment involves the so-called Hodge operators rather than just a factor like μ0.


To the extent possible, I tried to make the exposition independent of the system of units and occasionally introduced generic factors that have different values in different systems or, when practical, provided both Gaussian and SI versions of equations. In other cases, most notably in the chapter on metamaterials, the presence of ε0 and μ0 would have made expressions more cumbersome than they need to be, and so the Gaussian system is used. This also makes the notation consistent with most of the metamaterials literature. Similarly, the phasor convention exp(−iωt) is more prevalent in the optics and photonics (and, generally, physics) literature. Under that convention, a plane wave propagating, say, in the +x direction has the spatial exponential factor exp(+ikx) rather than the slightly confusing exp(−ikx) under the opposite phasor convention. Also, the imaginary part of the dielectric permittivity of passive media is positive under exp(−iωt). The readers may of course exercise their discretion and pay more attention to the substance of the analysis rather than the system-dependent factors in particular expressions.
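For quick reference, the two phasor conventions discussed above can be written side by side; the following compact summary is an editorial addition (not part of the original text), but it follows directly from the conventions stated in this section, with E0, k, ω and ε carrying their usual meanings.

```latex
% Summary of the two phasor conventions (editorial addition for reference).
\begin{align*}
&\text{Physics/optics convention, time factor } e^{-i\omega t}: &
E(x,t) &= \operatorname{Re}\bigl[E_0\, e^{\,i(kx-\omega t)}\bigr], &
\operatorname{Im}\varepsilon &\ge 0 \ \text{(passive media)};\\
&\text{Engineering convention, time factor } e^{+i\omega t}: &
E(x,t) &= \operatorname{Re}\bigl[E_0\, e^{\,i(\omega t-kx)}\bigr], &
\operatorname{Im}\varepsilon &\le 0 \ \text{(passive media)}.
\end{align*}
% In SI, the vacuum speed of light is c = 1/\sqrt{\varepsilon_0 \mu_0};
% in the Gaussian system, c appears directly and \varepsilon_0, \mu_0 are absent.
```

Both rows describe the same wave propagating in the +x direction; only the bookkeeping differs, which is why results obtained under one convention map onto the other by complex conjugation.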

Contents

1 Introduction
   1.1 Why Deal with the Nanoscale?
   1.2 Why Special Models for the Nanoscale?
   1.3 How to Hone the Computational Tools
   1.4 So What?

2 Finite-Difference Schemes
   2.1 Introduction
   2.2 A Primer on Time-Stepping Schemes
   2.3 Exact Schemes
   2.4 Some Classic Schemes for Initial Value Problems
      2.4.1 The Runge–Kutta Methods
      2.4.2 The Adams Methods
      2.4.3 Stability of Linear Multistep Schemes
      2.4.4 Methods for Stiff Systems
   2.5 Schemes for Hamiltonian Systems
      2.5.1 Introduction to Hamiltonian Dynamics
      2.5.2 Symplectic Schemes for Hamiltonian Systems
   2.6 Schemes for One-Dimensional Boundary Value Problems
      2.6.1 The Taylor Derivation
      2.6.2 Using Constraints to Derive Difference Schemes
      2.6.3 Flux-Balance Schemes
      2.6.4 Implementation of 1D Schemes for Boundary Value Problems
   2.7 Schemes for Two-Dimensional Boundary Value Problems
      2.7.1 Schemes Based on the Taylor Expansion
      2.7.2 Flux-Balance Schemes
      2.7.3 Implementation of 2D Schemes
      2.7.4 The Collatz "Mehrstellen" Schemes in 2D
   2.8 Schemes for Three-Dimensional Problems
      2.8.1 An Overview
      2.8.2 Schemes Based on the Taylor Expansion in 3D
      2.8.3 Flux-Balance Schemes in 3D
      2.8.4 Implementation of 3D Schemes
      2.8.5 The Collatz "Mehrstellen" Schemes in 3D
   2.9 Consistency and Convergence of Difference Schemes
   2.10 Summary and Further Reading
   Appendix: Frequently Used Vector and Matrix Norms
   Appendix: Matrix Exponential

3 The Finite Element Method
   3.1 Everything Is Variational
   3.2 The Weak Formulation and the Galerkin Method
   3.3 Variational Methods and Minimization
      3.3.1 The Galerkin Solution Minimizes the Error
      3.3.2 The Galerkin Solution and the Energy Functional
   3.4 Essential and Natural Boundary Conditions
   3.5 Mathematical Notes: Convergence, Lax–Milgram and Céa's Theorems
   3.6 Local Approximation in the Finite Element Method
   3.7 The Finite Element Method in One Dimension
      3.7.1 First-Order Elements
      3.7.2 Higher-Order Elements
   3.8 The Finite Element Method in Two Dimensions
      3.8.1 First-Order Elements
      3.8.2 Higher-Order Triangular Elements
   3.9 The Finite Element Method in Three Dimensions
   3.10 Approximation Accuracy in FEM
      3.10.1 Appendix: The Ladyzhenskaya–Babuška–Brezzi Condition
      3.10.2 Appendix: A Peculiar Case of Finite Element Approximation
   3.11 An Overview of System Solvers
   3.12 Electromagnetic Problems and Edge Elements
      3.12.1 Why Edge Elements?
      3.12.2 The Definition and Properties of Whitney-Nédélec Elements
      3.12.3 Implementation Issues
      3.12.4 Historical Notes on Edge Elements
      3.12.5 Appendix: Several Common Families of Tetrahedral Edge Elements
   3.13 Adaptive Mesh Refinement and Multigrid Methods
      3.13.1 Introduction
      3.13.2 Hierarchical Bases and Local Refinement
      3.13.3 A Posteriori Error Estimates
      3.13.4 Multigrid Algorithms
   3.14 Special Topic: Element Shape and Approximation Accuracy
      3.14.1 Introduction
      3.14.2 Algebraic Sources of Shape-Dependent Errors: Eigenvalue and Singular Value Conditions
      3.14.3 Geometric Implications of the Singular Value Condition
      3.14.4 Condition Number and Approximation
      3.14.5 Discussion of Algebraic and Geometric a Priori Estimates
   3.15 Special Topic: Generalized FEM
      3.15.1 Description of the Method
      3.15.2 Trade-Offs
   3.16 Summary and Further Reading
   3.17 Appendix: Generalized Curl and Divergence

4 Flexible Local Approximation MEthods (FLAME)
   4.1 A Preview
   4.2 Perspectives on Generalized FD Schemes
      4.2.1 Perspective #1: Basis Functions Not Limited to Polynomials
      4.2.2 Perspective #2: Approximating the Solution, Not the Equation
      4.2.3 Perspective #3: Multivalued Approximation
      4.2.4 Perspective #4: Conformity Versus Flexibility
      4.2.5 Why Flexible Approximation?
      4.2.6 A Preliminary Example: The 1D Laplace Equation
   4.3 Trefftz Schemes with Flexible Local Approximation
      4.3.1 Overlapping Patches
      4.3.2 Construction of the Schemes
      4.3.3 The Treatment of Boundary Conditions
      4.3.4 Trefftz–FLAME Schemes for Inhomogeneous and Nonlinear Equations
      4.3.5 Consistency and Convergence of the Schemes
   4.4 Trefftz–FLAME Schemes: Case Studies
      4.4.1 1D Laplace, Helmholtz and Convection–Diffusion Equations
      4.4.2 The 1D Heat Equation with Variable Material Parameter
      4.4.3 The 2D and 3D Laplace Equation
      4.4.4 The Fourth-Order Nine-Point Mehrstellen Scheme for the Laplace Equation in 2D
      4.4.5 The Fourth-Order 19-Point Mehrstellen Scheme for the Laplace Equation in 3D
      4.4.6 The 1D Schrödinger Equation. FLAME Schemes by Variation of Parameters
      4.4.7 Super-High-Order FLAME Schemes for the 1D Schrödinger Equation
      4.4.8 A Singular Equation
      4.4.9 A Polarized Elliptic Particle
      4.4.10 A Line Charge Near a Slanted Boundary
      4.4.11 Scattering from a Dielectric Cylinder
   4.5 Approximation in Trefftz Subspaces
   4.6 Existing Methods Featuring Flexible or Nonstandard Approximation
      4.6.1 The Treatment of Singularities in Standard FEM
      4.6.2 Generalized FEM by Partition of Unity
      4.6.3 Homogenization Schemes Based on Variational Principles
      4.6.4 Discontinuous Galerkin Methods
      4.6.5 Homogenization Schemes in FDTD
      4.6.6 Meshless Methods
      4.6.7 Special Finite Element Methods
      4.6.8 Domain Decomposition
      4.6.9 Pseudospectral Methods
      4.6.10 Special FD Schemes
   4.7 Discussion
   4.8 Appendix: Variational FLAME
      4.8.1 References
      4.8.2 The Model Problem
      4.8.3 Construction of Variational FLAME
      4.8.4 Summary of the Variational-Difference Setup
   4.9 Appendix: Coefficients of the Nine-Point Trefftz–FLAME Scheme for the Wave Equation in Free Space
   4.10 Appendix: The Fréchet Derivative

5 Long-Range Interactions in Free Space
   5.1 Long-Range Particle Interactions in a Homogeneous Medium
   5.2 Real and Reciprocal Lattices
   5.3 Introduction to Ewald Summation
      5.3.1 A Boundary Value Problem for Charge Interactions
      5.3.2 A Re-formulation with "Clouds" of Charge
      5.3.3 The Potential of a Gaussian Cloud of Charge
      5.3.4 The Field of a Periodic System of Clouds
      5.3.5 The Ewald Formulas
      5.3.6 The Role of Parameters
   5.4 Grid-Based Ewald Methods with FFT
      5.4.1 The Computational Work
      5.4.2 On Numerical Differentiation
      5.4.3 Particle–Mesh Ewald
      5.4.4 Smooth Particle–Mesh Ewald Methods
      5.4.5 Particle–Particle Particle–Mesh Ewald Methods
      5.4.6 The York–Yang Method
      5.4.7 Methods Without Fourier Transforms
   5.5 Notes on Ewald Formulas in the Literature
   5.6 Summary and Further Reading
   5.7 Appendix: The Fourier Transform of "Periodized" Functions
   5.8 Appendix: An Infinite Sum of Complex Exponentials

6 Long-Range Interactions in Heterogeneous Systems
   6.1 Introduction
   6.2 FLAME Schemes for Static Fields of Polarized Particles in 2D
      6.2.1 Computation of Fields and Forces for Cylindrical Particles
      6.2.2 A Numerical Example: Well-Separated Particles
      6.2.3 A Numerical Example: Small Separations
   6.3 Static Fields of Spherical Particles in a Homogeneous Dielectric
      6.3.1 FLAME Basis and the Scheme
      6.3.2 A Basic Example: Spherical Particle in Uniform Field
   6.4 Introduction to the Poisson–Boltzmann Model
   6.5 Limitations of the PBE Model
   6.6 Numerical Methods for 3D Electrostatic Fields of Colloidal Particles
   6.7 3D FLAME Schemes for Particles in Solvent
   6.8 The Numerical Treatment of Nonlinearity
   6.9 The DLVO Expression for Electrostatic Energy and Forces
   6.10 London, Van der Waals and Casimir Forces: Theory and Computation
      6.10.1 Definitions
      6.10.2 Casimir Effect: Interpretations and Controversies
      6.10.3 Computation of Casimir and van der Waals Forces
   6.11 Thermodynamic Potential, Free Energy and Forces
   6.12 Comparison of FLAME and DLVO Results
   6.13 Summary and Further Reading
   6.14 Appendix: Thermodynamic Potential for Electrostatics in Solvents
   6.15 Appendix: Generalized Functions (Distributions)

7 Finite-Difference Time-Domain Methods for Electrodynamics
   7.1 Introduction
   7.2 Basic Ideas and Schemes
   7.3 Consistency of the Yee Scheme in 1D
   7.4 The Yee Scheme in Fourier Space (1D)
   7.5 Stability of the Yee Scheme in 1D
   7.6 The Yee Scheme in 2D
      7.6.1 Formulation (2D)
      7.6.2 Numerical Dispersion and Stability (2D)
   7.7 The Yee Scheme in 3D
      7.7.1 Formulation (3D)
      7.7.2 Numerical Dispersion and Stability (3D)
   7.8 The Finite Integration Technique (FIT)
   7.9 Advanced Techniques
      7.9.1 Implicit Schemes
      7.9.2 Pseudospectral Time-Domain Methods
   7.10 Exterior Boundary Conditions
      7.10.1 Introduction
      7.10.2 Perfectly Matched Layers
      7.10.3 PML: Complex Coordinate Transforms
      7.10.4 Discrete Perfectly Matched Layers
      7.10.5 Absorbing Conditions
   7.11 Long-Term Stability and the Lacunae Method
   7.12 The Treatment of Material Interfaces
      7.12.1 Introduction
      7.12.2 An Overview of Existing Techniques
      7.12.3 Order of a Difference Scheme Revisited: Trefftz Test Matrix
   7.13 The Total-Field/Scattered-Field Formulations
   7.14 The Treatment of Frequency-Dependent Parameters
   7.15 Simulation of Periodic Structures
   7.16 Near-to-Far-Field Transformations
      7.16.1 Introduction
      7.16.2 Example: "Far Field" in 1D
      7.16.3 Far Field in Maxwell's Electrodynamics
   7.17 Historical Notes
   7.18 Codes
   7.19 Appendix: The Yee Scheme Is Exact in 1D for the "Magic" Time Step
   7.20 Appendix: Green's Functions for Maxwell's Equations
      7.20.1 Green's Functions for the Helmholtz Equation
      7.20.2 Maxwell's Equations
      7.20.3 Subcase Jm = 0
      7.20.4 Subcase Je = 0
      7.20.5 Excitation by both Electric and "Magnetic" Currents
      7.20.6 Summary: Near-to-Far-Field Transformation

8 Applications in Nano-Photonics
   8.1 Introduction
   8.2 Maxwell's Equations
   8.3 One-Dimensional Problems of Wave Propagation
      8.3.1 The Wave Equation and Plane Waves
      8.3.2 Signal Velocity and Group Velocity
      8.3.3 Group Velocity and Energy Velocity
   8.4 Analysis of Periodic Structures in 1D
   8.5 Remarks and Further Reading on Bloch Waves
   8.6 Band Structure by Fourier Analysis (Plane Wave Expansion) in 1D
   8.7 Characteristics of Bloch Waves
      8.7.1 Fourier Harmonics of Bloch Waves
      8.7.2 Fourier Harmonics and the Poynting Vector
      8.7.3 Bloch Waves and Group Velocity
   8.8 Two-Dimensional Problems of Wave Propagation
   8.9 Photonic Bandgap in Two Dimensions
   8.10 Band Structure Computation: PWE, FEM and FLAME
      8.10.1 Solution by Plane Wave Expansion
      8.10.2 The Role of Polarization
      8.10.3 Accuracy of the Fourier Expansion
      8.10.4 FEM for Photonic Bandgap Problems in 2D
      8.10.5 A Numerical Example: Band Structure Using FEM
      8.10.6 Flexible Local Approximation Schemes for Waves in Photonic Crystals
      8.10.7 Band Structure Computation Using FLAME
   8.11 Photonic Bandgap Calculation in Three Dimensions: Comparison with the 2D Case
      8.11.1 Formulation of the Vector Problem
      8.11.2 FEM for Photonic Bandgap Problems in 3D
      8.11.3 Historical Notes on the Photonic Bandgap Problem
   8.12 Negative Permittivity and Plasmonic Effects
      8.12.1 Electrostatic Resonances for Spherical Particles
      8.12.2 Plasmon Resonances: Electrostatic Approximation
      8.12.3 Wave Analysis of Plasmonic Systems
      8.12.4 Some Common Methods for Plasmon Simulation
      8.12.5 Trefftz–FLAME Simulation of Plasmonic Particles
      8.12.6 Plasmonic Nano-Focusing: Finite Element Simulation
   8.13 The Diffraction Limit: Can It Be Broken?
      8.13.1 Motivation
      8.13.2 Superoscillations
      8.13.3 Subdiffraction Focusing and Imaging Techniques: A Brief Summary
   8.14 Plasmonic Enhancement in Scanning Near-Field Optical Microscopy
      8.14.1 Apertureless and Dark-Field Microscopy
      8.14.2 Simulation Examples for Apertureless SNOM
   8.15 Backward Waves, Negative Refraction and Superlensing
      8.15.1 Introduction and Historical Notes
      8.15.2 Negative Permittivity and the "Perfect Lens" Problem
      8.15.3 Forward and Backward Plane Waves in a Homogeneous Isotropic Medium
      8.15.4 Backward Waves in Mandelshtam's Chain of Oscillators
      8.15.5 Backward Waves and Negative Refraction in Photonic Crystals
      8.15.6 Are There Two Species of Negative Refraction?
   8.16 Appendix: The Bloch Transform
   8.17 Appendix: Eigenvalue Solvers

9 Metamaterials and Their Parameters
   9.1 Introduction
   9.2 Applications of Metamaterials
      9.2.1 An Overview
      9.2.2 Imaging: Perfect and Imperfect Lenses
      9.2.3 Transformation Optics
      9.2.4 Tunable, Reconfigurable, Superconducting and Other Metamaterials and Metadevices
      9.2.5 Metamaterial Absorbers
   9.3 Homogenization
      9.3.1 Introduction
      9.3.2 Parameter Retrieval: Procedures
      9.3.3 Parameter Retrieval: Anomalies and Controversies
      9.3.4 Two-Parameter Homogenization
      9.3.5 "High-Frequency Homogenization"
      9.3.6 Wave Vector-Dependent Tensor
      9.3.7 Non-asymptotic Homogenization
      9.3.8 The Uncertainty Principle
      9.3.9 Polarization, Magnetization, and Classical Effective Medium Theories
      9.3.10 Summary on Homogenization of Periodic Structures
   9.4 Appendix: Parameters of Split-Ring Resonators
   9.5 Appendix: Coordinate Mappings and Tensor Transformations
   9.6 Appendix: Dielectric Cylinders and Spheres in a Uniform Electrostatic Field
      9.6.1 Dielectric Cylinders in a Uniform Electrostatic Field
      9.6.2 Dielectric Spheres in a Uniform Electrostatic Field
   9.7 Appendix: Wave Propagation Through a Homogeneous Slab
      9.7.1 Maxwell's Equations
      9.7.2 E-mode (s-mode), with Possible Nonlocality
      9.7.3 H-mode (p-mode), with Possible Nonlocality

10 Miscellany
   10.1 Good or Poor Conductors for Low Loss?
   10.2 The "Source Current"
   10.3 Boundary Conditions in Effective Medium Theory
   10.4 "Spurious Modes"
   10.5 The Moment Method and FEM
   10.6 The Magnetostatic "Source Field" and the Biot–Savart Law
   10.7 TE and TM Modes
   10.8 FDTD Versus Discontinuous Galerkin and FETD
   10.9 1D Poisson Equation: FEM Solution with Exact Nodal Values
   10.10 Good or Poor Conductors for Low Loss? (A Hint)

11 Conclusion: "Plenty of Room at the Bottom" for Computational Methods

References

Index

Chapter 1

Introduction

May you live in interesting times.
Eric Frank Russell, "U-Turn" (1950)

It’s been said that a good presentation should address three key questions: (1) Why? (2) How? and (3) So What? The following sections answer these questions, and a few more.

1.1 Why Deal with the Nanoscale?

The complexity and variety of applications on the nanoscale are as great as, or arguably greater than, on the macroscale. While a detailed account of nanoscale problems in a single book is impossible, one can make a general observation on the importance of the nanoscale: the properties of materials are strongly affected by their nanoscale structure. Many remarkable effects, physical phenomena, materials and devices have already been discovered or developed: nanocomposites, carbon nanotubes, nanowires and nanodots, nanoparticles of different types, photonic crystals, metamaterials, and so on. On a more fundamental level, research in nanoscale physics may provide clues to the most profound mysteries of nature. Where is the frontier of physics? asks L. S. Schulman in the Preface to his book [Sch97]:

   Some would say 10⁻³³ cm, some 10⁻¹⁵ cm and some 10⁺²⁸ cm. My vote is for 10⁻⁶ cm. Two of the greatest puzzles of our age have their origins at the interface between the macroscopic and microscopic worlds. The older mystery is the thermodynamic arrow of time, the way that (mostly) time-symmetric microscopic laws acquire a manifest asymmetry at larger scales. And then there's the superposition principle of quantum mechanics, a profound revolution of the twentieth century. When this principle is extrapolated to macroscopic scales, its predictions seem widely at odds with ordinary experience.

The second “puzzle” that Professor Schulman refers to is the apparent contradiction between the quantum-mechanical representation of micro-objects in a superposition of quantum states and a single unambiguous state that all of us really observe for macro-objects. Where and how exactly is this transition from the quantum world to the macro-world effected? The boundary between particle- or atomic-size quantum objects and macro-objects is on the nanoscale; that is where the “collapse of the quantum-mechanical wave function” from a superposition of states to one well-defined state would have to occur. Remarkable double-slit experiments published by M. Arndt et al. in 1999 show no evidence of “collapse” of the wave function and prove the wave nature of large molecules with masses of up to 1,632 mass units and sizes of up to 2 nm [ANVA+99]. Fourteen years later, S. Eibenberger et al. demonstrated quantum interference of molecules exceeding 10,000 atomic mass units and having 810 atoms in a single particle [EGA+13]. Further experiments along similar lines will undoubtedly be captivating; see the excellent reviews by K. Hornberger et al. [HGH+12] and F. Fröwis et al. [FSD+18]. Still, this book covers only classical (non-quantum-mechanical) models, sufficient for many nanoscale and some molecular-scale problems.

In addition to theoretical and computational issues, the book covers some practical aspects of molecular and nano-science: molecular dynamics, near-field optics, plasmonic field enhancement, high-resolution imaging, cloaking, and more. Countless other equally fascinating applications in numerous other areas could be given. Like it or not, we live in interesting times.

1.2 Why Special Models for the Nanoscale?

A good model can advance fashion by ten years. Yves Saint Laurent

First, a general observation. A simulation model consists of a physical and mathematical formulation of the problem at hand and a computational method. The formulation tells us what to solve and the computational method tells us how to solve it. Frequently more than one formulation is possible, and almost always several computational techniques are available; hence there are potentially numerous combinations of formulations and methods. Ideally, one strives to find the best such combination(s) in terms of efficiency, accuracy, robustness, algorithmic simplicity, and so on. It is not surprising that the formulations of nanoscale problems are indeed special. The scale is often too small for continuous-level macroscopic laws to be fully applicable; yet it is too large for a first-principles atomic simulation to be feasible. Computational compromises are reached in several different ways. In some cases, continuous parameters can be used with some caution and with suitable adjustments.


One example is light scattering by small particles and the related “plasmonic” effects (Chap. 8), where the dielectric constant of metals or dielectrics can be adjusted to account for the size of the scatterers. In other situations, multiscale modeling is used, where a hierarchy of problems is solved and the information obtained on a finer level is passed on to the coarser ones and back. Multiscale often goes hand-in-hand with multiphysics: for example, molecular dynamics on the finest scale is combined with continuum mechanics on the macroscale. The Society for Industrial and Applied Mathematics (SIAM) publishes a journal devoted entirely to this subject: Multiscale Modeling and Simulation, inaugurated in 2003. While multiscale and multiphysics models are not the main theme of this book, the applications and problems considered in it do have some multiscale features; examples are the Flexible Local Approximation MEthod (FLAME) of Chap. 4, scanning near-field optical microscopy (SNOM, Sect. 8.14), and homogenization of periodic structures (Chap. 9). Sometimes fine-scale degrees of freedom can be “integrated out”. In colloidal simulation (Chap. 6), this leads to the Poisson–Boltzmann equation applicable on the scale of colloidal particles (approximately from 10 to 1000 nm).

Let us now discuss the computational side of nanoscale models. Computational analysis is a mature discipline combining science, engineering and elements of art. It includes general and powerful techniques such as finite difference, finite element, spectral or pseudospectral, integral equation and other methods; it has been applied to every physical problem and device imaginable. Are these existing methods good enough for nanoscale problems? The answer is a resounding “maybe”.

• When continuum models are still applicable, traditional methods work well. A relevant example is the simulation of light scattering by plasmon nanoparticles and of plasmon-enhanced components for ultra-sensitive optical sensors and near-field microscopes (Chap. 8). Despite the nanoscale features of the problem, equivalent material parameters (dielectric permittivity and magnetic permeability) can still be used, possibly with some adjustments. Consequently, commercial finite-element software is suitable for this type of modeling.

• When the system size is even smaller, as in macromolecular simulation, the use of equivalent material parameters is more questionable. In electrostatic models of protein molecules in solvents—an area of extensive and intensive research due to its enormous implications for biology and medicine—two main approaches coexist. In implicit models, the solvent is characterized by equivalent continuum parameters (dielectric permittivity and the Debye length). In the layer of the solvent immediately adjacent to the surface of the molecule, these equivalent parameters are dramatically different from their values in the bulk (A. Rubinstein & S. Sherman [RS04]). In contrast, explicit models directly include molecular dynamics of the solvent. This approach is in principle more accurate, as no approximation of the solvent by an equivalent medium is made, but the computational cost is extremely high due to a very large number of degrees of freedom corresponding to the molecules of the solvent. For more information on protein simulation, see T. Schlick’s book [Sch02] and T. Simonson’s review paper [Sim03] as a starting point.

• When the problem reduces to a system of ordinary differential equations, the computational analysis is on very solid ground—this is one of the most mature areas of numerical mathematics (Chap. 2). It is highly desirable to use numerical schemes that preserve the essential physical properties of the system. In Molecular Dynamics, such fundamental properties are the conservation of energy and momentum, and—more generally—symplecticness of the underlying Hamiltonian system (Sect. 2.5). Time-stepping schemes with analogous conservation properties are available and their advantages are now widely recognized (J. M. Sanz-Serna & M. P. Calvo [SSC94], Yu. B. Suris [Sur87, Sur96], R. D. Skeel et al. [RDS97]).

• Quantum mechanical effects require special computational treatment. The models are substantially different from those of continuum media for which the traditional methods (such as finite elements or finite differences) were originally designed and used. Nevertheless these traditional methods can be very effective at certain stages of quantum mechanical analysis. For example, classical finite-difference schemes (in particular, the Collatz “Mehrstellen” schemes, Chap. 2) have been successfully applied to the Kohn–Sham equation—the central procedure in Density Functional Theory. (This is the Schrödinger equation, with the potential expressed as a function of electron density.) For a detailed description, see E. L. Briggs et al. [BSB96] and T. L. Beck [Bec00]. Moreover, difference schemes can also be used to find the electrostatic potential from the Poisson equation with the electron density in the right-hand side.

• Colloidal simulation considered in Chap. 6 is an interesting and special computational case. As explained in that chapter, classical methods of computation are not particularly well suited for this problem. Finite element meshes become too complex and impractical to generate even for a moderate number of particles in the model; standard finite-difference schemes require unreasonably fine grids to represent the boundaries of the particles accurately; the Fast Multipole Method does not work too well for inhomogeneous and/or nonlinear problems. A new finite-difference calculus of Flexible Local Approximation MEthods (FLAME) is a promising alternative (Chap. 4).

This list could easily be extended to include other examples, but the main point is clear: a variety of computational methods, both traditional and new, are very helpful for the efficient simulation of nanoscale systems.


1.3 How to Hone the Computational Tools

A computer makes as many mistakes in two seconds as 20 men working 20 years make. Murphy’s Laws of Computing

Computer simulation is not an exact science. If it were, one would simply set a desired level of accuracy ε of the numerical solution and prove that a certain method achieves that level with the minimal number of operations θ = θ(ε). The reality is of course much more intricate. First, there are many possible measures of accuracy and many possible measures of the cost (keeping in mind that human time needed for the development of algorithms and software may be more valuable than the CPU time). Accuracy and cost both depend on the class and subclass of problems being solved. For example, numerical solution becomes substantially more complicated if discontinuities and edge or corner singularities of the field need to be represented accurately. Second, it is usually close to impossible to guarantee, at the mathematical level of rigor, that the numerical solution obtained has a certain prescribed accuracy.¹ Third, in practice it is never possible to prove that any given method minimizes the number of arithmetic operations. Fourth, there are modeling errors—approximations made in the formulation of the physical problem; these errors are a particular concern on the nanoscale, where direct and accurate experimental verification of the assumptions made is very difficult. Fifth, a host of other issues—from the algorithmic implementation of the chosen method to roundoff errors—are quite difficult to take into account. Parallelization of the algorithm and the computer code is another complicated matter.

With all this in mind, computer simulation turns out to be partially an art. There is always more than one way to solve a given problem numerically and, with enough time and resources, any reasonable approach is likely to produce a result eventually. Still, it is obvious that not all approaches are equal. Although the accuracy and computational cost cannot be determined exactly, some qualitative measures are certainly available and are commonly used. The main characteristic is the asymptotic behavior of the number of operations and memory required for a given method as a function of some accuracy-related parameter. In mesh-based methods (finite elements, finite differences, Ewald summation, etc.) the mesh size h or the number of nodes n usually act as such a parameter. The “big-oh” notation is standard; for example, the number of arithmetic operations θ being O(n^γ) as n → ∞ means that c_1 n^γ ≤ θ ≤ c_2 n^γ, where c_{1,2} and γ are some positive constants independent of n. Computational methods with the operation count and memory O(n) are considered as asymptotically optimal; the doubling of the number of nodes (or some other such parameter) leads, roughly, to the doubling of the number of operations and memory size. For several classes of problems, there exist divide-and-conquer or hierarchical strategies with either optimal O(n) or slightly suboptimal O(n log n) complexity. The most notable examples are Fast Fourier Transforms (FFT), Fast Multipole Methods, multigrid methods, and FFT-based Ewald summation. Clearly, the numerical factors c_{1,2} also affect the performance of the method. For real-life problems, they can be determined experimentally and their magnitude is not usually a serious concern. A salient exception is the Fast Multipole Method for multiparticle interactions; its operation count is close to optimal, O(n_p log n_p), where n_p is the number of particles, but the numerical prefactors are very large, so the method outperforms the brute-force approach (O(n_p^2) pairwise particle interactions) only for a large number of particles, tens of thousands and beyond.

Given that the choice of a suitable method is partially an art, what is one to do? As a practical matter, the availability of good public domain and commercial software in many cases simplifies the decision. Examples of such software are²

• Molecular Dynamics packages AMBER (Assisted Model Building with Energy Refinement); NAMD, GROMACS, CHARMM/CHARMm (Chemistry at HARvard Macromolecular Mechanics); TINKER, DL POLY.
• A finite difference Poisson–Boltzmann solver DelPhi.
• Finite Element software developed by ANSYS, and Comsol Multiphysics.
• A software suite from Rsoft Group for design of photonics components and optical networks.
• Finite difference time-domain simulation software (Sect. 7.18).

This list is certainly not exhaustive and, among other things, does not include software for ab initio electronic structure calculation, as this subject matter lies outside the scope of the book. The obvious drawback of using somebody else’s software is that the user cannot extend its capabilities and apply it to problems for which it was not designed. Some tricks are occasionally possible (for example, equations in cylindrical coordinates can be converted to the Cartesian system by a mathematically equivalent transformation of material parameters), but by and large the user is out of luck if the code is proprietary and does not handle a given problem. For open-source software, users may in principle add their own modules to accomplish a required task, but, unless the revisions are superficial, this requires detailed knowledge of the code. Whether the reader of this book is an intelligent user of existing software or a developer of his own algorithms and codes, the book will hopefully help him/her to understand how the underlying numerical methods work.

¹ There is a notable exception in variational methods: rigorous pointwise error bounds can, for some classes of problems, be established using dual formulations (see Sect. 3.13.3.3 for more information). However, this requires numerical solution of a separate auxiliary problem for Green’s function at each point where the error bound is sought.

² In the first edition of the book, I provided web links to these packages. But some of the resources, especially ones in the public domain, tend to migrate to different sites over the years. The interested reader can easily find the relevant information via Google (or whatever the equivalent of Google is a hundred years from now, if the book is perused then).


1.4 So What?

Avoid clichés like the plague! William Safire’s Rules for Writers

There is one cliché that I feel compelled to use: nanoscale science and technology are interdisciplinary. The book is intended to be a bridge between two broad fields: computational methods, both traditional and new, on the one hand, and several nanoscale or molecular-scale applications on the other. It is my hope that the reader who has a background in physics, physical chemistry, electrical engineering or related subjects, and who is curious about the inner workings of computational methods, will find this book helpful for crossing the bridge between the disciplines. Likewise, experts in computational methods may be interested in browsing the application-related chapters. At the same time, readers who wish to stay on their side of the “bridge” may also find some topics in the book to be of interest. An example of such a topic for numerical analysts is the FLAME schemes of Chap. 4; a novel feature of this approach is the systematic use of local approximation spaces in the FD context, with basis functions not limited to Taylor polynomials. Similarly, in the chapter on Finite Element analysis (Chap. 3), the theory of shape-related approximation errors is nonstandard and yields some interesting error estimates.

Since the prospective reader will not necessarily be an expert in any given subject of the book, I have tried, to the extent possible, to make the text accessible to researchers, graduate and even senior-level undergraduate students with a good general background in physics and mathematics. While part of the material is related to mathematical physics, the style of the book can be characterized as physical mathematics³—“physical” explanation of the underlying mathematical concepts. I hope that this style will be tolerable to the mathematicians and beneficial to the reader with a background in physical sciences and engineering. Sometimes, however, a more technical presentation is necessary. This is the case in the analysis of consistency errors and convergence of difference schemes in Chap. 2, Ewald summation in Chap. 5, and the derivation of FLAME basis functions for particle problems in Chap. 6. In many other instances, references to a rigorous mathematical treatment of the subject are provided.

I cannot stress enough that this book is very far from being a comprehensive treatise on nanoscale problems and applications. The selection of subjects is strongly influenced by my research interests and experience. Topics where I felt I could contribute some new ideas, methods and results were favored. Many subjects that are covered thoroughly in the existing literature were not included. For example, material on Molecular Dynamics was, for the most part, left out because of the abundance of good literature on this subject.⁴ However, one of the most challenging parts of Molecular Dynamics—the computation of long-range forces in a homogeneous medium—appears as a separate chapter in the book (Chap. 5). The novel features of this analysis are a rigorous treatment of “charge allocation” to grid and the application of finite-difference schemes, with the potential splitting, in real space.

Chapter 2 gives the necessary background on Finite Difference (FD) schemes; familiarity with numerical methods is helpful but not required for reading and understanding this chapter. In addition to the standard material on classical methods, their consistency and convergence, this chapter includes introduction to flexible approximation schemes, Collatz “Mehrstellen” schemes, and schemes for Hamiltonian systems.

Chapter 3 is a concise self-contained description of the Finite Element Method (FEM). No special prior knowledge of computational methods is required to read most of this chapter. Variational principles and their role are explained first, followed by a tutorial-style exposition of FEM in the simplest 1D case. Two- and three-dimensional scalar problems are considered in the subsequent sections of the chapter. A more advanced subject is edge elements, crucial for vector field problems in electromagnetic analysis. Readers already familiar with FEM may be interested in the new treatment of approximation accuracy as a function of element shape; this is a special topic in Chap. 3.

Chapter 4 introduces the Finite Difference (FD) calculus of Flexible Local Approximation MEthods (FLAME). Local analytical solutions are incorporated into the schemes, which often leads to much higher accuracy than would be possible in classical FD. A large assortment of examples illustrating the usage of the method is presented.

Chapter 6 can be viewed as an extension of Chap. 5 to multiparticle problems in heterogeneous media. The simulation of such systems, due to its complexity, has received relatively little attention, and good methods are still lacking. Yet the applications are very broad—from colloidal suspensions to polymers and polyelectrolytes; in all of these cases, the media are inhomogeneous because the dielectric permittivities of the solute and solvent are usually quite different. Ewald methods can only be used if the solvent is modeled explicitly, by including polarization on the molecular level; this requires a very large number of degrees of freedom in the simulation. An alternative is to model the solvent implicitly by continuum parameters and use the FLAME schemes of Chap. 4. Application of these schemes to the computation of the electrostatic potential, field and forces in colloidal systems is described in Chap. 6.

Chapter 7, new in the second edition of the book, is an overview of finite difference time domain methods (FDTD). In no way is it a substitute for a comprehensive treatise on the subject by A. Taflove & S. C. Hagness [TH05] or similar monographs by other authors; rather, my bias is toward the aspects of FDTD that I consider critical and nontrivial. Many references are included.

Chapter 8 deals with applications in nano-photonics and nano-optics. It reviews the mathematical theory of Bloch modes, in connection with the propagation of electromagnetic waves in periodic structures; describes plane wave expansion, FEM and FLAME for photonic bandgap computation; provides a theoretical background for plasmon resonances and considers various numerical methods for plasmon-enhanced systems. Such systems include optical sensors with very high sensitivity, as well as scanning near-field optical microscopes with molecular-scale resolution, unprecedented in optics. Chapter 8 also touches upon negative refraction and nanolensing—areas of intensive research and debate—and includes new material on the inhomogeneity of backward wave media.

Chapter 9 is new in the second edition of the book and is devoted to metamaterials—artificial judiciously designed structures which can manipulate waves in unusual ways. This is still a growing area of research, with a wide array of potential applications. The most detailed section of that chapter is related to one of the critical aspects of metamaterial design—effective medium theory (homogenization). I stress that a homogenization theory accurately representing nontrivial characteristics of metamaterials (most notably, appreciable magnetic response) must, by necessity, be nonasymptotic and must account for boundary effects.

Chapter 10 is a random collection of curious or controversial subjects related to electromagnetic analysis and simulation.

I sincerely thank you for reading, or at least browsing, this book. Suggestions and critique are welcome.

³ Not exactly the same as “engineering mathematics,” a more utilitarian, user-oriented approach.

⁴ J.M. Haile, Molecular Dynamics Simulation: Elementary Methods, Wiley-Interscience, 1997; D. Frenkel & B. Smit, Understanding Molecular Simulation, Academic Press, 2001; D.C. Rapaport, The Art of Molecular Dynamics Simulation, Cambridge University Press, 2004; T. Schlick [Sch02], and others.

Chapter 2

Finite-Difference Schemes

We must look for consistency. Where there is a want of it we must suspect deception. Sir Arthur Conan Doyle, The Casebook of Sherlock Holmes

2.1 Introduction

Due to its relative simplicity, finite-difference (FD) analysis was historically the first numerical technique for boundary value problems in mathematical physics. The excellent review paper by V. Thomée [Tho01] traces the origin of FD to a 1928 paper by R. Courant, K. Friedrichs and H. Lewy, and to a 1930 paper by S. Gerschgorin. There are interesting, and practically important, trade-offs between FD and the finite element method (FEM) that emerged in the 1960s (Chap. 3). On the one hand, the great geometric flexibility of FEM, combined with modern techniques of hp-adaption, parallel multilevel preconditioning and domain decomposition, gives FEM unmatched power. On the other hand, this comes at the expense of higher algorithmic and computational complexity, and often more demanding memory requirements than in the FD case. All in all, both FD and FEM are extremely valuable tools. FD has become particularly popular for time-domain problems in photonics as “finite-difference time-domain” (FDTD) methods (Chap. 7).

This chapter starts with a gentle introduction to FD schemes and proceeds to a more detailed review. Sections 2.2–2.4 are addressed to readers with little or no background in finite-difference methods. Section 2.3, however, introduces a nontraditional perspective and may be of interest to more advanced readers as well. By approximating the solution of the problem rather than a generic smooth function, one can achieve much higher accuracy. This idea is further developed in Chap. 4. Section 2.4 gives an overview of classical FD schemes for ordinary differential equations (ODE) and systems of ODE; Sect. 2.5—an overview of Hamiltonian systems that are particularly important in molecular dynamics.


Sections 2.6–2.8 describe FD schemes for boundary value problems in one, two and three dimensions. Some ideas of this analysis, such as minimization of the consistency error for a constrained set of functions, are nonstandard. Finally, Sect. 2.9 summarizes important results on consistency and convergence of FD schemes. In addition to providing a general background on FD methods, this chapter sets the stage for the generalized FD analysis with “flexible local approximation” described in Chap. 4. For a more comprehensive treatment and analysis of FD methods—in particular, elaborate time-stepping schemes for ordinary differential equations, schemes for gas and fluid dynamics, etc.—I defer to the many excellent, more specialized monographs. Highly recommended are the books by C. W. Gear [Gea71] (ODE, including stiff systems), U. M. Ascher & L. R. Petzold [AP98], K. E. Brenan et al. [KEB96] (ODE, especially the treatment of differential-algebraic equations), S. K. Godunov & V. S. Ryabenkii [GR87a] (general theory of difference schemes and hyperbolic equations), J. Butcher [But87, But03] (time-stepping schemes and especially Runge–Kutta methods), T. J. Chung [Chu02] and S. V. Patankar [Pat80] (schemes for computational fluid dynamics).

2.2 A Primer on Time-Stepping Schemes

The following example is the simplest possible illustration of key principles of finite-difference analysis.¹ Suppose we wish to solve the ordinary differential equation
$$\frac{du}{dt} = \lambda u \quad \text{on } [0, t_{\max}], \qquad u(0) = u_0, \qquad \operatorname{Re}\lambda < 0 \tag{2.1}$$
numerically. The exact solution of this equation
$$u_{\text{exact}} = u_0 \exp(\lambda t) \tag{2.2}$$
obviously has infinitely many values at infinitely many points within the interval. In contrast, numerical algorithms have to operate with finite (discrete) sets of data. We therefore introduce a set of points (grid) t_0 = 0, t_1, …, t_{n−1}, t_n = t_max over the given interval. For simplicity, let us assume that the grid size Δt is the same for all pairs of neighboring points: t_{k+1} − t_k = Δt, so that t_k = kΔt. We now consider Eq. (2.1) at a moment of time t = t_k:
$$\frac{du}{dt}(t_k) = \lambda u(t_k) \tag{2.3}$$

¹ I am grateful to Serge Prudhomme for very helpful suggestions and comments on the material of this section.

The first derivative du/dt can be approximated on the grid in several different ways:
$$\frac{du}{dt}(t_k) = \frac{u(t_{k+1}) - u(t_k)}{\Delta t} + O(\Delta t)$$
$$\frac{du}{dt}(t_k) = \frac{u(t_k) - u(t_{k-1})}{\Delta t} + O(\Delta t)$$
$$\frac{du}{dt}(t_k) = \frac{u(t_{k+1}) - u(t_{k-1})}{2\Delta t} + O((\Delta t)^2)$$
These equalities—each of which can be easily justified by Taylor expansion—lead to the algorithms known as forward Euler, backward Euler and central-difference schemes, respectively:
$$\frac{u_{k+1} - u_k}{\lambda \Delta t} - u_k = 0 \tag{2.4}$$
or, equivalently,
$$u_{k+1} - (1 + \lambda \Delta t)\, u_k = 0 \quad \text{(forward Euler)} \tag{2.5}$$
$$\frac{u_k - u_{k-1}}{\lambda \Delta t} = u_k \tag{2.6}$$
or
$$(1 - \lambda \Delta t)\, u_k - u_{k-1} = 0 \quad \text{(backward Euler)} \tag{2.7}$$
$$\frac{u_{k+1} - u_{k-1}}{2\lambda \Delta t} = u_k \tag{2.8}$$
or
$$u_{k+1} - 2\lambda \Delta t\, u_k - u_{k-1} = 0 \quad \text{(central difference)} \tag{2.9}$$
where u_{k−1}, u_k and u_{k+1} are approximations to u(t) at discrete times t_{k−1}, t_k and t_{k+1}, respectively. For convenience of analysis, the schemes above are written in the form that makes the dimensionless product λΔt explicit.

The (discrete) solution for the forward Euler scheme (2.4) can be easily found by time-stepping: start with the given initial value u(0) = u_0 and use the scheme to find the value of the solution at each subsequent step:
$$u_{k+1} = (1 + \lambda \Delta t)\, u_k \tag{2.10}$$

This difference scheme was obtained by approximating the original differential equation, and it is therefore natural to expect that the solution of the original equation will approximately satisfy the difference equation. This can be easily verified because in this simple example the exact solution is known. Let us substitute the exact solution (2.2) into the left-hand side of the difference Eq. (2.4):
$$\epsilon_c = u_0 \left[ \frac{\exp(\lambda (k+1)\Delta t) - \exp(k\lambda \Delta t)}{\lambda \Delta t} - \exp(k\lambda \Delta t) \right]
= u_0 \exp(k\lambda \Delta t) \left[ \frac{\exp(\lambda \Delta t) - 1}{\lambda \Delta t} - 1 \right]
= u_0 \exp(k\lambda \Delta t)\, \frac{\lambda \Delta t}{2} + \text{h.o.t.} \tag{2.11}$$
where the very last equality was obtained via the Taylor expansion for Δt → 0, and “h.o.t.” are higher-order terms with respect to the time step Δt. Note that the exponential factor exp(kλΔt) goes to unity if Δt → 0 and the other parameters are fixed; however, if the moment of time t = t_k is fixed, then this exponential is proportional to the value of the exact solution.

Symbol ε_c stands for the consistency error that is, by definition, obtained by substituting the exact solution into the difference scheme. The consistency error (2.11) is indeed “small”—it tends to zero as Δt tends to zero. More precisely, the error is of order one with respect to Δt. In general, the consistency error ε_c is said to be of order p with respect to Δt if
$$c_1 \Delta t^p \le |\epsilon_c| \le c_2 \Delta t^p \tag{2.12}$$
where c_{1,2} are some positive constants independent of Δt. (In the case under consideration, p = 1.) A very common equivalent form of this statement is the “big-oh” notation: |ε_c| = O((Δt)^p) (see also introduction, Sect. 1.3).

While consistency error is a convenient and very important intermediate quantity, the ultimate measure of accuracy is the solution error, i.e. the deviation of the numerical solution from the exact one:
$$\epsilon_k = u_k - u_{\text{exact}}(t_k) \tag{2.13}$$

The connection between consistency and solution errors will be discussed in Sect. 2.9. In our current example, we can evaluate the numerical error directly. The repeated “time-stepping” by the forward Euler scheme (2.10) yields the following numerical solution:
$$u_k = (1 + \lambda \Delta t)^k u_0 \equiv (1 - \xi)^k u_0 \tag{2.14}$$
where ξ = −λΔt. (Note that Re ξ > 0, as Re λ is assumed negative.) The kth time step corresponds to the time instant t_k = kΔt, and so in terms of time the numerical solution can then be rewritten as
$$u_k = \left[ (1 - \xi)^{1/\xi} \right]^{-\lambda t_k} u_0 \tag{2.15}$$
From basic calculus, the expression in the square brackets tends to e^{−1} as ξ → 0, and hence u_k tends to the exact solution (2.2) u_0 exp(λt_k) as Δt → 0. Thus, in the limit of small time steps, the forward Euler scheme works as expected.

However, in practice, when equations and systems much more complex than our example are solved, very small step sizes may lead to prohibitively high computational costs due to a large number of time steps involved. It is therefore important to examine the behavior of the numerical solution for any given positive value of the time step rather than only in the limit Δt → 0. Three qualitatively different cases emerge from (2.14):
$$\begin{cases} |1 + \lambda \Delta t| < 1 \;\Leftrightarrow\; \Delta t < \Delta t_{\min}, & \text{numerical solution decays (as it should);} \\ |1 + \lambda \Delta t| > 1 \;\Leftrightarrow\; \Delta t > \Delta t_{\min}, & \text{numerical solution diverges;} \\ |1 + \lambda \Delta t| = 1 \;\Leftrightarrow\; \Delta t = \Delta t_{\min}, & \text{numerical solution oscillates,} \end{cases}$$
where
$$\Delta t_{\min} = -\frac{2\operatorname{Re}\lambda}{|\lambda|^2}, \qquad \operatorname{Re}\lambda < 0 \tag{2.16}$$
$$\Delta t_{\min} = \frac{2}{|\lambda|}, \qquad \lambda < 0 \ (\lambda \text{ real}) \tag{2.17}$$
For the purposes of this introduction, we shall call a difference scheme stable if, for a given initial condition, the numerical solution remains bounded for all time steps; otherwise, the scheme is unstable.² It is clear that in the second and third case above, the numerical solution is qualitatively incorrect. The forward Euler scheme is stable only for sufficiently small time steps—namely for
$$\Delta t < \Delta t_{\min} \quad \text{(stability condition for the forward Euler scheme)} \tag{2.18}$$

Schemes that are stable only for a certain range of values of the time step are called conditionally stable. Schemes that are stable for any positive time step are called unconditionally stable.

It is a misconception to attribute this numerical instability to roundoff errors.³ While roundoff errors can exacerbate the situation, it is clear from (2.14) that the instability will manifest itself even in exact arithmetic if the time step is not sufficiently small.

The backward Euler difference scheme (2.6) is substantially different in this regard. The numerical solution for that scheme is easily found to be
$$u_k = (1 - \lambda \Delta t)^{-k} u_0 \tag{2.19}$$
In contrast with the forward Euler method, for negative Re λ this solution is bounded (and decaying in time) regardless of the step size Δt. That is, the backward Euler scheme is unconditionally stable. However, there is a price to pay for this advantage: the scheme is an equation with respect to u_{k+1}. In the current example, solution of this equation is trivial (just divide by 1 − λΔt), but for nonlinear differential equations, and especially for (linear and nonlinear) systems of differential equations, the computational cost of computing the solution at each time step may be high. Difference schemes that require solution of a system of equations to find u_{k+1} are called implicit; otherwise, the scheme is explicit. The forward Euler scheme is explicit, and the backward Euler scheme is implicit.

The derivation of the consistency error for the backward Euler scheme is completely analogous to that of the forward Euler scheme, and the result is essentially the same, except for a sign difference:
$$\epsilon_c = -u_0 \exp(k\lambda \Delta t)\, \frac{\lambda \Delta t}{2} + \text{h.o.t.} \tag{2.20}$$

² More specialized definitions of stability can be given for various classes of schemes; see e.g. C. W. Gear [Gea71], J. C. Butcher [But03], E. Hairer et al. [HrW93] as well as the following sections of this chapter.

³ This misconception may be due to confusion with other cases where numerical instability is entirely due to roundoff. Example: direct solvers (Sect. 3.11).

As in the forward Euler case, the exponential factor tends to unity as the time step goes to zero, if k and λ are fixed.

The very popular Crank–Nicolson scheme⁴ can be viewed as an approximation of the original differential equation at time t_{k+1/2} ≡ t_k + Δt/2:
$$\frac{u_{k+1} - u_k}{\lambda \Delta t} - \frac{u_k + u_{k+1}}{2} = 0, \qquad k = 0, 1, \ldots \tag{2.21}$$
Indeed, the first term in this equation is the central-difference approximation (completely analogous to (2.8), but with a twice smaller time step), while the second term approximates the value of u(t_{k+1/2}). The time-stepping procedure for the Crank–Nicolson scheme is
$$\left(1 - \frac{\lambda \Delta t}{2}\right) u_{k+1} = \left(1 + \frac{\lambda \Delta t}{2}\right) u_k, \qquad k = 0, 1, \ldots \tag{2.22}$$
and the numerical solution of the model problem is
$$u_k = \left( \frac{1 + \lambda \Delta t/2}{1 - \lambda \Delta t/2} \right)^{k} u_0 \tag{2.23}$$
Since the absolute value of the fraction here is less than one for all positive (even very large) time steps and Re λ < 0, the Crank–Nicolson scheme is unconditionally stable. Its consistency error is again found by substituting the exact solution (2.2) into the scheme (2.21). The result is
$$\epsilon_c = -u_0 \exp(k\lambda \Delta t)\, \frac{(\lambda \Delta t)^2}{12} + \text{h.o.t.} \tag{2.24}$$
The consistency error is seen to be of second order—as such, it is (for sufficiently small time steps) much smaller than the error of both Euler schemes.

⁴ Often misspelled as Crank-Nicholson. After John Crank (1916–2006), British mathematical physicist, and Phyllis Nicolson (1917–1968), British physicist. The original paper is: J. Crank and P. Nicolson, A practical method for numerical evaluation of solutions of partial differential equations of the heat-conduction type, Proc. Cambridge Philos. Soc., vol. 43, pp. 50–67, 1947. [Re-published in: John Crank 80th birthday special issue of Adv. Comput. Math., vol. 6, pp. 207–226, 1997.]

(2.25)

More specifically for the equation under consideration u k+1 uk − = 0 exp(−λtk ) exp(−λtk+1 )

(2.26)

u k − u k+1 exp(λt) = 0

(2.27)

Equivalently,

Obviously, by construction of the scheme, the analytical solution satisfies the difference equation exactly—that is, the consistency error of the scheme is zero. One cannot do any better than that! The first reaction may be to dismiss this construction as cheating: the scheme makes use of the exact solution that in fact needs to be found. If the exact solution is known, the problem has been solved and no difference scheme is needed. If the solution is not known, the coefficients of this exact scheme are not available. Yet the idea of “exact” schemes like (2.25) proves very useful. Even though the exact solution is usually not known, excellent approximations for it can frequently be found and used to construct a difference scheme. One key observation is that such approximations need not be global (i.e. valid throughout the computational domain).

18

2 Finite-Difference Schemes

Fig. 2.1 Numerical solution for different one-step schemes Time step t = 0.05. λ = −10

Since difference schemes are local, all that is needed is a good local approximation of the solution. Local approximations are much more easily obtainable than global ones. In fact, the Taylor series expansion, which was implicitly used to construct the Euler and Crank–Nicolson schemes, and which will be more explicitly used in the following subsection, is just an example of a local approximation. The construction of “exact” schemes represents a shift in perspective. The objective of Taylor-based schemes is to approximate the differential operator—for example, d/dt—with a suitable finite difference, and consequently, the differential equation with the respective FD scheme. The objective of the “exact” schemes is to approximate the solution. Approximation of the differential operator is a very powerful tool, but it carries substantial redundancy: it is applicable to all sufficiently smooth functions to which the differential operator could be applied. By focusing on the solution only, rather than on a wide class of smooth functions, one can reduce or even eliminate this redundancy. As a result, the accuracy of the numerical solution can be improved dramatically. This set of ideas will be explored in Chap. 4. The accuracy of different one-step schemes for our simple model problem with parameter λ = −10 is illustrated in several figures. Figure 2.1 shows the analytical and numerical solutions for time step t = 0.05. It is evident that the Crank–Nicolson scheme is substantially more accurate than the Euler schemes. The numerical errors are quantified in Fig. 2.2. As expected, the exact scheme gives the true solution up to the roundoff error. For a larger time step t = 0.25, the forward Euler scheme exhibits instability (Fig. 2.3). The exact scheme still yields the analytical solution to machine precision. The backward Euler and Crank–Nicolson schemes are stable (Fig. 2.4), but the numerical errors are higher than for the smaller time step. R. E. Mickens [Mic94] derives “exact” schemes from a different perspective and extends them to a family of “nonstandard” schemes defined by a set of heuristic rules. We shall see in Chap. 4 that the “exact” schemes are a very natural particular case

2.3 Exact Schemes

19

Fig. 2.2 Numerical errors for different one-step schemes. Time step t = 0.05. λ = −10

Fig. 2.3 Numerical solution for the forward Euler scheme.Time step t = 0.25. λ = −10

of a special finite-difference calculus—“Flexible Local Approximation MEthods” (FLAME).

2.4 Some Classic Schemes for Initial Value Problems For completeness, this section presents a brief overview of a few popular timestepping schemes for Ordinary Differential Equations (ODE).

20

2 Finite-Difference Schemes

Fig. 2.4 Numerical solution for different one-step schemes.Time step t = 0.25. λ = −10

2.4.1 The Runge–Kutta Methods This introduction to Runge–Kutta (R-K) methods follows the elegant exposition by E. Hairer et al. [HrW93]. The main idea dates back to C. Runge’s original paper of 1895. The goal is to construct high-order difference schemes for the ODE y  (t) = f (t, y), y(t0 ) = y0

(2.28)

Our starting point is a simpler problem, with the right-hand side independent of y: y  (t) = f (t), y(t0 ) = y0

(2.29)

This problem not only has an analytical solution t f (τ )dτ y(t) = y0 +

(2.30)

t0

but also admits accurate approximations via numerical quadratures. For example, the midpoint rule gives  y1 ≡ y(t1 ) ≈ y0 + t0 f

t0 t0 + 2



  t1 y2 ≡ y(t2 ) ≈ y1 + t1 f t1 + 2

2.4 Some Classic Schemes for Initial Value Problems

21

and so on. Here t0 , t1 , etc., are a discrete set of points in time, and the time steps t0 = t1 − t0 , t1 = t2 − t1 , etc., do not have to be equal. It is straightforward to verify that this numerical quadrature (that doubles as a time-stepping scheme) has second-order accuracy with respect to the maximum time step. An analogous formula for taking the numerical solution of the original Eq. (2.28) from a generic point t in time to t + t would be    t t ,y t+ y(t + t) ≈ y(t) + t f t + 2 2

(2.31)

The obstacle is that the value of y at the midpoint t + t is not directly available. 2 However, this value may be found approximately via the forward Euler scheme with the time step t/2:   t t ≈ y(t) + f (t, y(t)) (2.32) y t+ 2 2 A valid difference scheme can now be produced by inserting this midpoint value into the numerical quadrature (2.31). The customary way of writing the overall procedure is as the following sequence: k1 = f (t, y)   t t , y(t) + k1 k2 = f t + 2 2 y(t + t) = y(t) + t k2

(2.33) (2.34) (2.35)

This is the simplest R-K method with two stages (k1 is computed at the first stage and k2 at the second). The generic form of an s-stage explicit R-K method is as follows [HrW93]: k1 = f (t0 , y0 ) k2 = f (t0 + c2 t, y0 + t a21 k1 ) k3 = f (t0 + c3 t, y0 + t (a31 k1 + a32 k2 )) ...



ks = f t0 + cs t, y0 + t as1 k1 + · · · + as,s−1 ks−1 y(t + h) = y0 + t (b1 k1 + b2 k2 + · · · + bs ks ) The procedure is indeed explicit, as the computation at each subsequent stage depends only on the values computed at the previous stages. The “input data” for the R-K method at any given time step consists only of one value y0 at the beginning of this step and does not include any other previously computed values. Thus the R-K time step sizes can be chosen independently, which is very useful for adaptive

22

2 Finite-Difference Schemes

algorithms. The multistage method should not be confused with multistep schemes (e.g. the Adams methods, Sect. 2.4.2 below) where the input data at each discretetime point contains the values of y at several previous steps. Changing the time step in multistep methods may be cumbersome and may require “re-initialization” of the algorithm. To write R-K schemes in a compact form, it is standard to collect all the coefficients a, b and c in J. Butcher’s tableau: 0 c2 c3 ... ... cs

a21 a31 ... ... as1 b1

a32 ... ... as2 b2

... ... ... ...

... ... ... . . . . . . as,s−1 . . . . . . bs

One further intuitive observation is that the k parameters in the R-K method are values of function f at some intermediate points. As a rule, one wants these intermediate points to be close to the actual solution y(t) of (2.28). Then, according to (2.28), the ks also approximate the time derivative of y over the current time step. Thus, at the ith stage of the procedure function f is evaluated, roughly speaking, at point (t0 + ci t, y0 + (ai1 + · · · + ai,s−1 )y  (t0 )t). From these considerations, condition ci = ai1 + · · · + ai,s−1 , i = 2, 3 . . . s emerges as natural (although not, strictly speaking, necessary). The number of stages is in general different from the order of the method (i.e. from the asymptotic order of the consistency error with respect to the time step), and one wishes to find the free parameters a, b and c that would maximize the order. For s ≥ 5, no explicit s-stage R-K method of order s exists (E. Hairer et al. [HrW93], J. C. Butcher [But03]). However, a family of four-stage explicit R-K methods of fourth order are available [HrW93, But03]. The most popular of these methods are 0 1/2 1/2 1/2 0 1/2 0 0 1 1 1/6 2/6 2/6 1/6 and

2.4 Some Classic Schemes for Initial Value Problems

23

Fig. 2.5 Stability regions in the λt-plane for explicit Runge–Kutta methods of orders one through four

0 1/3 1/3 2/3 -1/3 1 1 -1 1 1 1/8 3/8 3/8 1/8 Stability conditions for explicit Runge–Kutta schemes can be obtained along the following lines. For the model scalar equation (2.1) dy = λy on [0, tmax ], u(0) = u 0 dt

(2.36)

the exact solution changes by the factor of exp(λh) over one-time step. If the R-K method is of order p, the respective factor in the growth of the numerical solution is the Taylor approximation T (ξ) =

p ξk k=0

k!

,

ξ ≡ λt

to this exponential factor. Stability regions then correspond to |T (ξ)| < 1 in the complex plane ξ ≡ λt (Fig. 2.5). Further analysis of R-K methods can be found in monographs by J. Butcher [But03], E. Hairer et al. [HrW93], and C. W. Gear [Gea71].

24

2 Finite-Difference Schemes

2.4.2 The Adams Methods Adams methods are a popular class of multistep schemes, where the solution values from several previous time steps are utilized to find the numerical solution at the subsequent step. This is accomplished by polynomial interpolation. The following brief summary is due primarily to E. Hairer et al. [HrW93]. Consider again the general ODE (2.28) (reproduced here for easy reference): y  (t) = f (t, y), y(t0 ) = y0

(2.37)

Let the grid be uniform, ti = t0 + it, and integrate the differential equation over one-time step: tn+1 y(tn+1 ) = y(tn ) + f (t, y(t)) dt (2.38) tn

The integrand is a function of the unknown solution and obviously is not directly available; however, it can be approximated by a polynomial p(t) passing through k previous numerical solution values (ti , f (yi )). The numerical solution at time step n + 1 is then found as tn+1

yn+1 = yn +

p(t) dt

(2.39)

tn

Coefficients of p(t) can be found explicitly (e.g. via backward differences), and the scheme is then obtained after inserting the expression for p into (2.39). This explicit calculation appears in all texts on numerical methods for ODE and is not included here. Adams methods can also be used in the Nordsieck form, where instead of the values of function f at the previous time steps approximate Taylor coefficients for the solution are stored. These approximate coefficients form the Nordsieck vector 2 k (yn , t yn , t2 yn , …, tk! yn(k) ). This form makes it easier to change the time step size as needed.

2.4.3 Stability of Linear Multistep Schemes It is clear from the introduction in Sect. 2.2 that stability characteristics of the difference scheme are of critical importance for the numerical solution. Stability depends on the intrinsic properties of the underlying differential equation (or a system of ODE), as well as on the difference scheme itself and the mesh size. This section highlights the key points in the stability analysis of linear multistep schemes; the results and conclusions will be used, in particular, in the next section (stiff systems).

2.4 Some Classic Schemes for Initial Value Problems

25

Stability of linear multistep schemes is covered in all texts on FD schemes for ODE (e.g. C. W. Gear [Gea71], J. Butcher [But03], E. Hairer et al. [HrW93], U. M. Ascher & L. R. Petzold [AP98]). A comprehensive classification of types of stability is given in the book by J. D. Lambert [Lam91]. This section, for the most part, follows Lambert’s presentation. Consider the test system of equations y  = Ay,

y ∈ Rn

(2.40)

where all eigenvalues of matrix A are for simplicity assumed to be distinct and to have strictly negative real parts, so that the system is stable. Further, let a linear k-step method be k k α j y+ j = t β j f+ j (2.41) j=0

j=0

where f is the right-hand side of the system, h is (as usual) the mesh size, and index + j indicates values at the jth time step (the “current” step corresponding to j = 0). In our case, the right-hand side f = Ay, and the multistep scheme becomes k (α j I − tβ j A) y+ j = 0

(2.42)

j=0

Since A is assumed to have distinct eigenvalues, it is diagonalizable, i.e. Q −1 AQ =  ≡ diag(λ1 , . . . , λn )

(2.43)

where Q is a nonsingular matrix. The same transformation can then be applied to the whole scheme (2.42) by multiplying it with Q −1 on the left and introducing a variable change y = Qz. It is easy to see that, since the system matrix becomes diagonal upon this transformation, the system splits up into completely decoupled equations for each z i , i = 1, 2, . . . , n. With some abuse of notation now, dropping the index i for z i and the respective eigenvalue λi , we get the scalar version of the scheme k (α j − tβ j λ)z + j = 0 (2.44) j=0

From the theory of difference equations, it is well known that stability is governed by the roots5 rs (s = 1,2, . . . , k) of the characteristic equation k (α j − tλβ j ) r j = 0 j=0

5 Lambert’s

notation is used here.

(2.45)

26

2 Finite-Difference Schemes

Clearly, stability depends on the (dimensionless) parameter hλ. The multistep method is said to be absolutely stable for given λt if all the roots rs of the characteristic polynomial for this value of λt lie strictly inside the unit circle in the complex plane. The set of points λt in the λt-plane for which the scheme is absolutely stable is called the region of absolute stability. For illustration, let us recall the simplest case—one-step schemes for the scalar equation y  = λy: y+1 − y0 = λ (θy0 + (1 − θ)y+1 ) t

(2.46)

For θ = 0 and 1, this is the implicit/explicit Euler method, respectively; for θ = 0.5 it is the Crank–Nicolson (trapezoidal) scheme. The characteristic equation is obtained in a standard way, by formally substituting r 1 for y+1 and r 0 = 1 for y0 : r −1 = λ (θ + (1 − θ)r ) t The root is r =

1 + λθt 1 − λ(1 − θ)t

(2.47)

(2.48)

For the explicit Euler scheme (θ = 1)
$$r_{\text{expl.Euler}} = 1 + \lambda \Delta t \tag{2.49}$$
and so the region of absolute stability in the λΔt-plane is the unit circle centered at −1 (Fig. 2.6). For the implicit Euler scheme (θ = 0)
$$r_{\text{impl.Euler}} = \frac{1}{1 - \lambda \Delta t} \tag{2.50}$$
the region of absolute stability is outside the unit circle centered at 1 (Fig. 2.7).

Fig. 2.6 Stability region of the explicit Euler method is the unit circle (shaded)


Fig. 2.7 Stability region of the implicit Euler method is the shaded area outside the unit circle

Fig. 2.8 Stability region of the Crank–Nicolson scheme is the left half-plane

This stability region includes all negative values of λΔt—that is, for a negative λ, the scheme is stable for any (positive) time step. In addition, curiously enough, the scheme is stable in a vast area with positive λΔt—i.e. the numerical solution may decay exponentially when the exact one grows exponentially. This latter feature is somewhat undesirable but is typically of little significance, as in most cases the underlying differential equations describe stable systems with decaying solutions.

What about the Crank–Nicolson scheme? For θ = 0.5 we have
$$r_{\text{Crank–Nicolson}} = \frac{1 + \lambda \Delta t/2}{1 - \lambda \Delta t/2} \tag{2.51}$$
and it is then straightforward to verify that the stability region is the left half-plane Re(λΔt) < 0 (Fig. 2.8). The region of stability is clearly a key consideration for choosing a suitable class of schemes and the mesh size, such that λΔt lies inside the region of stability.
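The stability regions of Figs. 2.6–2.8 can also be checked numerically from the root (2.48). The Python sketch below is an illustration, not code from the book; it samples a square in the complex λΔt-plane and reports, for each value of θ in (2.46), the fraction of sampled points for which |r| < 1 (small for the explicit Euler disk, about one half for Crank–Nicolson, close to one for implicit Euler).

```python
import numpy as np

def stable(theta, z):
    """Absolute stability of the theta-scheme (2.46): |r| < 1, with r from Eq. (2.48)
    and z = lambda*dt a complex number (theta = 1: explicit Euler, theta = 0: implicit Euler)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (1 + theta * z) / (1 - (1 - theta) * z)
        return np.abs(r) < 1

x = np.linspace(-4.0, 4.0, 401)                 # sample a square in the lambda*dt-plane
z = x[None, :] + 1j * x[:, None]
for theta, name in ((1.0, "explicit Euler"), (0.0, "implicit Euler"), (0.5, "Crank-Nicolson")):
    frac = stable(theta, z).mean()
    print("%-15s stable fraction of the sampled square: %.3f" % (name, frac))
```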

2.4.4 Methods for Stiff Systems

One can identify two principal constraints on the choice of the time step in a numerical scheme for ODE. The first constraint has to do with the desired approximation accuracy (i.e. consistency error): if the solution varies smoothly and slowly in time, it can be approximated with sufficient accuracy even if the time step is large.


The second constraint is imposed by stability of the scheme. Let us recall, for example, that the stability condition for the simplest one-step scheme—the forward Euler method—is Δt < 2/|λ| (2.18), (2.17) for real negative λ, in reference to the test Eq. (2.1)
$$\frac{dy}{dt} = \lambda y \quad \text{on } [0, t_{\max}], \qquad y(0) = y_0 \tag{2.52}$$
More advanced explicit methods may have broader stability regions: see e.g. Fig. 2.5 for Runge–Kutta methods in Sect. 2.4.1. However, the improvement is not dramatic; for example, for the four-stage fourth-order Runge–Kutta method, the step size cannot exceed ∼2.785/|λ|.

For a single scalar equation (2.52) with λ < 0 and a decaying exponential solution, the accuracy and stability restrictions on the time step size are commensurate. Indeed, accuracy calls for a step size on the order of the relaxation time 1/|λ| or less, which is well within the stability limit even for the simplest forward Euler scheme. However, for systems of equations the stability constraint on the step size can be much more severe than the accuracy limit. Consider the following example:
$$\frac{dy_1}{dt} = \lambda_1 y_1; \qquad \lambda_1 = -1 \tag{2.53}$$
$$\frac{dy_2}{dt} = \lambda_2 y_2; \qquad \lambda_2 = -1000 \tag{2.54}$$
The second component (y_2) dies out when t ≫ 1/|λ_2| = 10^{-3} and can then be neglected; beyond that point, the approximation accuracy would suggest a time step commensurate with the relaxation time of the first component, 1/|λ_1| = 1. However, the stability condition Δt ≤ c/|λ| (where c depends on the method but is not much greater than 2–3 for most practical explicit schemes) has to hold for both values of λ and limits the time step to approximately 1/|λ_2| = 10^{-3}. In other words, the time step that would provide good approximation accuracy exceeds the stability limit by a factor of about 1000. A brute force approach is to use a very small time step and accept the high computational cost as well as the tremendous redundancy in the numerical solution that will remain virtually unchanged over one time step.

An obvious possibility for a system with decoupled components is to solve the problem separately for each component. In the example above, one could time-step y_1 with Δt_1 ∼ 0.1 for about 50 steps (after which y_1 will die out) and y_2 with Δt_2 ∼ 10^{-4} also for about 50 steps. However, decoupled systems are a luxury that one seldom has in practical problems. For example, the system of ODEs
$$z'(t) = -Az; \qquad z(t) \in \mathbb{R}^2; \qquad
A = \begin{pmatrix} 500.5 & -499.5 \\ -499.5 & 500.5 \end{pmatrix} \tag{2.55}$$
poses the same stability problem for explicit schemes as the previous example—simply because matrix A is obtained from the diagonal matrix D = diag(1, 1000) of the previous example by an orthogonal transformation A = Q^T D Q, with
$$Q = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

The “fast” and “slow” components, with their respective time scales, are now mixed up, but this is no longer immediately obvious. Recovering the two components is equivalent to solving a full eigenvalue–eigenvector problem for the system matrix, which can be done for small systems but is inefficient or even impossible in practice for large ones. The situation is even more complicated for nonlinear problems and systems with time-varying coefficients.

A practical alternative lies in switching to implicit difference schemes. In return for excellent stability properties, one pays the price of having to solve for the unknown value of the numerical solution y_{n+1} at the next time step. This is in general a nonlinear equation (for a scalar ODE) or a nonlinear system of algebraic equations (for a system of ODEs, y being in that case a Euclidean vector). Recall that for the ODE
$$y'(t) = f(t, y) \tag{2.56}$$
the simplest implicit scheme—the backward Euler method—is
$$y_{n+1} - y_n = \Delta t\, f(t_{n+1}, y_{n+1}) \tag{2.57}$$
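To see the stability constraint in action on the coupled system (2.55), the following Python sketch (illustrative, not code from the book; NumPy assumed) advances it with the forward Euler scheme and with the backward Euler scheme (2.57), using Δt = 0.01, far above the explicit stability limit of about 2/1000; the explicit solution blows up while the implicit one decays as it should.

```python
import numpy as np

A = np.array([[500.5, -499.5],
              [-499.5, 500.5]])          # Eq. (2.55); eigenvalues of A are 1 and 1000
z_expl = np.array([1.0, 0.0])
z_impl = z_expl.copy()
dt, n_steps = 0.01, 50                   # dt is five times the explicit limit 2/1000

M = np.linalg.inv(np.eye(2) + dt * A)    # backward Euler: z_{n+1} = (I + dt*A)^(-1) z_n
for _ in range(n_steps):
    z_expl = z_expl - dt * (A @ z_expl)  # forward Euler for z' = -A z
    z_impl = M @ z_impl                  # backward Euler, Eq. (2.57)

print("forward Euler  |z| =", np.linalg.norm(z_expl))   # grows roughly like 9**n
print("backward Euler |z| =", np.linalg.norm(z_impl))   # bounded and decaying
```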

A set of schemes that generalize the backward Euler algorithm to higher orders is due to C. W. Gear [Gea67, Gea71, HrW93] and is called “Backward Differentiation Formulae” (BDF). For illustration, let us derive the second-order BDF scheme, the derivation of higher-order schemes being analogous. The second-order scheme involves a “grid molecule”⁶ of three points: t_{−1} = t_0 − Δt, t_0 and t_{+1} = t_0 + Δt; quantities related to the “current” time step t_0 will be labeled with index 0, quantities related to the previous and the next step will be labeled with −1 and +1, respectively (Fig. 2.9).

The starting point is almost the same as for explicit Adams methods: an interpolation polynomial p(t) (quadratic for the second-order scheme) that passes through three points (t_{−1}, y_{−1}), (t_0, y_0) and (t_{+1}, y_{+1}). The values y_0 and y_{−1} of the solution at the current and previous steps are known. The value y_{+1} at the next step is an unknown parameter, and a suitable condition is needed to evaluate it. In BDF, the following condition is imposed: the interpolating polynomial p(t) must satisfy the underlying differential equation at time t_{+1}, i.e.
$$p'(t_{+1}) = f(t_{+1}, y_{+1}) \tag{2.58}$$

Fig. 2.9 Second-order BDF involves quadratic polynomial interpolation over three points: (t_{−1}, y_{−1}), (t_0, y_0) and (t_{+1}, y_{+1})

To find this interpolation polynomial and then the BDF scheme itself, let us for convenience move the origin of the coordinate system to the midpoint of the grid molecule and set t_0 = 0. Lagrange interpolation through the three points then gives
$$p(t) = y_{-1}\, \frac{t\,(t - \Delta t)}{(-\Delta t)\cdot(-2\Delta t)}
      + y_0\, \frac{(t + \Delta t)(t - \Delta t)}{\Delta t \cdot (-\Delta t)}
      + y_{+1}\, \frac{(t + \Delta t)\, t}{2\Delta t \cdot \Delta t}
      = y_{-1}\, \frac{t\,(t - \Delta t)}{2\Delta t^2}
      - y_0\, \frac{(t + \Delta t)(t - \Delta t)}{\Delta t^2}
      + y_{+1}\, \frac{(t + \Delta t)\, t}{2\Delta t^2} \tag{2.59}$$
The derivative of p (needed to impose condition (2.58) at the next step) is
$$p'(t) = \frac{y_{-1}}{2\Delta t^2}\,(2t - \Delta t) - \frac{y_0}{\Delta t^2}\, 2t + \frac{y_{+1}}{2\Delta t^2}\,(2t + \Delta t) \tag{2.60}$$
Condition (2.58) is obtained by substituting t = t_{+1}:
$$p'(t_{+1}) = \frac{y_{-1}}{2\Delta t} - \frac{2 y_0}{\Delta t} + \frac{3 y_{+1}}{2\Delta t} = f(t_{+1}, y_{+1}) \tag{2.61}$$
or equivalently
$$\frac{3}{2}\, y_{+1} - 2 y_0 + \frac{1}{2}\, y_{-1} = \Delta t\, f(t_{+1}, y_{+1}) \tag{2.62}$$

⁶ As far as I know, this locution was coined by J. P. Webb [PW09]. Surprisingly, there does not seem to be a standard term for a set of nodes over which an FD scheme is defined. In the past, I have used the word “stencil” for that purpose; however, by “stencil” most researchers mean the set of coefficients of a scheme rather than the set of nodes.

This is Gear’s second-order method. The scheme is implicit—it constitutes a (generally nonlinear) equation with respect to y+1 or, in the case of a vector problem (y ∈ Rn ), a system of equations. In practice, iterative linearization by the Newton–Raphson method is used and suitable linear system solvers are applied in the Newton–Raphson loop.

2.4 Some Classic Schemes for Initial Value Problems

31

For reference, here is a list of BDF of orders k from one through six [HrW93]. The first-order BDF scheme coincides with the implicit Euler method. BDF schemes of orders higher than six are unstable.

147 y+1 60

3 y+1 − 2 11 3 y+1 − 3y0 + 6 2 25 4 y+1 − 4y0 + 3 y−1 − 12 3 137 10 5 y+1 − 5y0 + 5 y−1 − y−2 + 60 3 4 15 20 15 6 y−1 − y−2 + y−3 − − 6y0 + 2 3 4 5

y+1 − y0 1 2y0 + y−1 2 1 y−1 − y−2 3 1 y−2 + y−3 4 1 y−3 − y−4 5 1 y−4 + y−5 6

= t f +1 = t f +1 = t f +1 = t f +1 = t f +1 = t f +1

Since stability considerations are of paramount importance in the choice of difference schemes for stiff problems, an elaborate classification of schemes based on their stability properties—or more precisely, on their regions of absolute stability (see Sect. 2.4.3)—has been developed. The relevant material can be found in C. W. Gear’s monograph [Gea71] and, in a more complete form, in J. D. Lambert’s book [Lam91]. What follows is a brief summary of this stability classification. A hierarchy of definitions of stability classes with progressively wider regions of stability is (Lambert’s definitions are adopted): A0 -stability ⇐= A(0)-stability ⇐= A(α)-stability ⇐= stiff-stability ⇐= A-stability ⇐= L-stability Definition 1 A method is said to be A0 -stable if its region of absolute stability includes the (strictly) negative real semiaxis. Definition 2 ([Gea71, Lam91]) A method is said to be A(α)-stable, 0 < α < π/2, if its region of absolute stability includes the “angular” domain | arg(λt) − π| ≤ α in the λt-plane (Fig. 2.10). A method is said to be A(0)-stable if it is A(α)-stable for some 0 < α < π/2. Definition 3 ([Gea71, Lam91]) A method is said to be A-stable if its region of absolute stability includes the half-plane Re (λt) < 0. Definition 4 A method is said to be stiffly-stable if its region of absolute stability includes the union of two domains (Fig. 2.11): (i) Re (λt) < −a, and (ii) −a ≤ Re (λt) < 0, |Im(λt)| < c, where a, c are positive real numbers. Thus stiff stability differs from A-stability in that slowly decaying but highly oscillatory solutions are irrelevant for stiff stability. The rationale is that for such solutions

32

2 Finite-Difference Schemes

Fig. 2.10 A(α)-stability region

Fig. 2.11 Stiff-stability region

the time step is governed by accuracy requirements for the oscillatory components as much, or perhaps even more, than it is governed by stability requirements—hence, this is not truly a stiff case. Definition 5 ([Gea71, Lam91]) A method is said to be L-stable if it is A-stable and, in addition, when applied to the scalar test equation y  = λy, Re λ < 0, it yields yn+1 = R(λt) yn , with |R(λt)| → 0 as Re λt → −∞. The notion of L-stability is motivated by the following test case. Consider one more time the Crank–Nicolson scheme applied to the model scalar equation y  = λy: yn+1 − yn yn+1 + yn = λ h 2 The numerical solution is easily found to be

(2.63)

2.4 Some Classic Schemes for Initial Value Problems

 yn = y0

1 + λt/2 1 − λt/2

33

n (2.64)

As already noted, the Crank–Nicolson scheme is absolutely stable for any λt with a negative real part. The solution above reflects this fact, as the expression in parentheses has the absolute value less than one for Re λt < 0. Still, the numerical solution exhibits some undesirable behavior for “highly negative” values of λ, i.e. for λ < 0, |λ|t 1. Indeed, in this case the actual solution decays very rapidly in time as exp(λt), whereas the numerical solution decays very slowly but is highly oscillatory because the expression in parentheses in (2.64) is close to −1. This is a case where the numerical solution disagrees with the exact one not just quantitatively but qualitatively. The problem is in fact much broader. If the difference scheme is not chosen judiciously, the character of the solution may be qualitatively incorrect (such as an oscillatory numerical solution vs. a rapidly decaying exact one). Further, important physical invariants (most notably energy or momentum) may not be conserved in the numerical solution, which may render the computated results nonphysical. This is important, in particular, in molecular dynamics, where energy conservation and, more generally, “symplecticness” of the underlying Hamiltonian system (Sect. 2.5) should be preserved. With regard to stiff systems, an alternative solution strategy that does not involve difference schemes can sometimes be effective. The solution of a linear system of ODE can be analytically expressed via matrix exponential exp At (see Appendix 2.10). Computing this exponential is by no means easy (many caveats are discussed in the excellent papers by C. Moler & C. Van Loan [ML78, ML03]); nevertheless the recursion relation exp At = (exp(At/n))n is helpful. The idea is that for n sufficiently large matrix At/n is “small enough” for its exponential to be computed relatively easily with sufficient accuracy; n is usually chosen as an integer power of two, so that the nth power of the matrix can be computed by repeated squaring. Two interesting motifs of this and the following section can now be noted: • difference methods that ensure a qualitative/physical agreement between the numerical solutions and the exact ones; • methods blending numerical and analytical approximations. Many years ago, my advisor Iu. V. Rakitskii [Rak72, RUC79, RSY+85] was an active proponent of both themes. Nowadays, the qualitative similarity between discrete and continuous models is an important trend in mathematical studies and their applications. Undoubtedly, Rakitskii would have been happy to see the contribution of Yu. B. Suris, his former student, to the development of numerical methods preserving the physical invariants of Hamiltonian systems [Sur87]–[Sur96], as well as to discrete differential geometry (A. I. Bobenko & Yu. B. Suris [BS08]). Another “Rakitskii-style” development is the generalized finite-difference calculus of Flexible Local Approximation MEthods (FLAME, Chap. 4) that seamlessly incorporates local analytical approximations into difference schemes.

34

2 Finite-Difference Schemes

2.5 Schemes for Hamiltonian Systems 2.5.1 Introduction to Hamiltonian Dynamics Note: no prior knowledge of Hamiltonian systems is necessary for reading this section. As a starting example, consider a (classical) harmonic oscillator, such as a mass on a spring, described by the ODE m q¨ = − kq

(2.65)

(mass times acceleration equals force), where mass m and the spring constant k are known parameters and q is a coordinate. The general solution to this equation is q(t) = q0 cos(ω0 t + φ);

ω02 =

k m

(2.66)

for some parameters q0 and φ. Even though the above expression in principle contains all the information about the solution, recasting the differential equation in a different form brings a deeper perspective. The new insights are even more profound for multiparticle problems with multiple degrees of freedom. The Hamiltonian of the oscillator—the energy function H expressed in terms of q and q—comprises ˙ the kinetic and potential terms7 : H =

1 1 m q˙ 2 + kq 2 2 2

(2.67)

We shall view H as a function of two variables: coordinate q and momentum p = m q; ˙ in terms of these variables, H (q, p) =

kq 2 p2 + 2m 2

(2.68)

The original second-order differential equation splits up into two first-order equations ⎧ ⎨ q˙ = m −1 p ⎩

(2.69) p˙ = − kq

or in matrix–vector form 7 More

generally in mechanics, the Hamiltonian can be defined by its relationship with the Lagrangian of the system and is indeed equal to the energy of the system if expressions for the generalized coordinates do not depend on time.

2.5 Schemes for Hamiltonian Systems

w˙ = Aw, w =

35

  q ; p

 A =

0 m −1 −k 0

 (2.70)

The right-hand side of differential equations (2.69) is in fact directly related to the partial derivatives of H (q, p): ∂ H (q, p) p = ∂p m

(2.71)

∂ H (q, p) = kq ∂q

(2.72)

We thus arrive at the equations of Hamiltonian dynamics, with their elegant symmetry: ⎧ ∂ H (q, p) ⎪ = q˙ ⎨ ∂p (2.73) ⎪ ⎩ ∂ H (q, p) = − p˙ ∂q Energy conservation follows directly from these Hamiltonian equations by chain-rule differentiation: ∂H ∂H ∂H = p˙ + q˙ = q˙ p˙ − p˙ q˙ = 0 ∂t ∂p ∂q In the phase plane (q, p), constant energy levels correspond to ellipses H (q, p) =

p2 kq 2 + = const 2m 2

(2.74)

For the Hamiltonian system, any particular solution (q(t), p(t)), viewed as a (moving) point in the phase plane, moves along the ellipse corresponding to the energy of the oscillator. Further insight is gained by following the evolution of the w = (q, p) points corresponding to a collection of oscillators (or the same oscillator observed repeatedly under different conditions). The initial coordinates and momenta of a family of oscillators are represented by a set of points in the phase plane. One may imagine that these points fill a certain geometric domain (0) at t = 0 (shaded area in Fig. 2.12). With time, each of the points will follow its own elliptic trajectory, so that at any given moment of time t the initial domain (0) will be transformed into some other domain (t). By definition, it is the solutions of the Hamiltonian system that effect the mapping from (0) to (t). These solutions are given by matrix exponentials (see Appendix 2.10):     q(t) q(0) w(t) = = exp(At) (2.75) p(t) p(0)

36

2 Finite-Difference Schemes

Fig. 2.12 The motion of a harmonic oscillator is represented in the (q, p) phase plane by a point moving around an ellipse. Domain (0) contains a collection of such points (corresponding to an ensemble of oscillators or, equivalently, to a set of different initial conditions for one oscillator) at time t = 0. Domain (t) contains the points corresponding to the same oscillators at some arbitrary moment of time t. The area of (t) turns out not to depend on time

The Jacobian of this mapping is the determinant of exp At; as known from linear algebra; this determinant is equal to the product of eigenvalues λ1,2 (exp At): det (exp At) = λ1 (exp At) λ2 (exp At) = exp (λ1 (At)) exp (λ2 (At)) (2.76) = exp (λ1 (At) + λ2 (At)) = exp (Tr(At)) = 1 (The eigenvalues of exp At are equal to the exponents of the eigenvalues of At; if this sounds unfamiliar, see Appendix 2.10.) Since the determinant of the transformation is unity, the evolution operator preserves the oriented area of (t), in addition to energy conservation that was demonstrated earlier. This result generalizes to higher-dimensional phase spaces in multiparticle systems. Such phase spaces comprise the generalized coordinates qi and momenta pi of N particles. If particle motion is three-dimensional, there are three degrees of freedom per particle8 and hence i = 1, 2, . . . , 3N ; the dimension of the phase space is thus 6N . The most direct analogy with area conservation is that the 6N -dimensional phase volume is conserved under the evolution map [Arn89, HrW93, SSC94]. However, there is more. For any two-dimensional surface in the phase space, take its projections onto the individual phase planes ( pi , qi ) and sum up the oriented areas of these projections; this sum is conserved during the Hamiltonian evolution of the 8 Disregarding any degrees of freedom that may be associated with the internal structure of particles

and with their rotation.

2.5 Schemes for Hamiltonian Systems

37

surface. Transformations that have this conservation property for the sum of the areas are called symplectic. There is a very deep and elaborate mathematical theory of Hamiltonian phase flows on symplectic manifolds. A symplectic manifold is an even-dimensional differentiable manifold endowed with a closed nondegenerate differential 2-form; these notions, however, are not covered in this book. Further mathematical details are described in the monographs by V. I. Arnol’d [Arn89] and J. M. Sanz-Serna & M. P. Calvo [SSC94].

2.5.2 Symplectic Schemes for Hamiltonian Systems This subsection gives a brief summary of FD schemes that preserve the symplectic property of Hamiltonian systems. The material comes from the paper by R. D. Skeel et al. [RDS97], from the results on Runge–Kutta schemes due to Yu. B. Suris [Sur87]– [Sur90] and J. M. Sanz-Serna [SSC94], and from the compendium of symplectic symmetric Runge–Kutta methods by W. Oevel & M. Sofroniou [OS97]. The governing system of ODEs in Newtonian mechanics and, in particular, molecular dynamics is (2.77) r¨ = f (r ), r ∈ Rn where r is the position vector for a collection of n interacting particles and f is the normalized force vector (vector of forces divided by particle masses). It is assumed that the forces do not explicitly depend on time. The simplest, and yet effective, difference scheme for this problem is known as the Störmer–Verlet method9 : rn+1 − 2rn + rn−1 = f (rn ) t 2

(2.78)

The left-hand side of the Störmer scheme is a second-order (with respect to the time step t) approximation of r¨ ; this approximation is very common. The velocity vector can be computed from the position vector by central differencing: rn+1 − rn−1 (2.79) vn = 2t Time-stepping for both vectors r and v simultaneously can be arranged in a “leapfrog” manner: vn+1/2 = vn−1/2 + t f (rn ) (2.80) rn+1 = rn + t v(n + 1/2) 9 Skeel

(2.81)

et al. [RDS97] cite S. Toxvaerd’s statement [Tox94] that “the first known published appearance [of this method] is due to Joseph Delambre (1791)”.

38

2 Finite-Difference Schemes

The leapfrog scheme (2.80), (2.81) is theoretically equivalent to the Störmer scheme (2.78), (2.79). The advantage of these schemes is that they are symplectic and at the same time explicit: no systems of equations need to be solved in the process of time-stepping. Several other symplectic integrators are considered by R. D. Skeel et al. [RDS97], but they are all implicit. With regard to the Runge–Kutta methods, the Suris–Sanz-Serna condition of symplecticness is bi ai j + b j a ji − bi b j = 0,

i, j = 1, 2, . . . s

(2.82)

where bi , ai j are the coefficients of an s-stage Runge–Kutta method defined on Sect. 2.4.1, except that here the scheme is no longer explicit—i.e. ai j can be nonzero for any pair of indexes i, j. W. Oevel & M. Sofroniou [OS97] give the following summary of symplectic Runge–Kutta schemes. There is a unique one-stage symplectic method with the Butcher tableau 1 2

1 2

1 It represents the implicit scheme rn+1

  t 1 , (rn + rn+1 ) = rn + t f tn + 2 2

(2.83)

The following two-stage method is also symplectic: 1 2 1 2

± 2√1 3 ∓ 2√1 3

1 4

1 4

1 4



1 √ 2 3

± 1 4 1 2

1 √ 2 3 1 2

W. Oevel & M. Sofroniou [OS97] list a number of other methods, up to six-stage ones; these methods were derived using symbolic algebra.

2.6 Schemes for One-Dimensional Boundary Value Problems 2.6.1 The Taylor Derivation After a brief review of time-stepping schemes, we turn our attention to FD schemes for boundary value problems. Such schemes can be applied to various physical fields and potentials in one-dimension (this section), two and three dimensions (the

2.6 Schemes for One-Dimensional Boundary Value Problems

39

following sections). The most common and straightforward way of generating FD schemes is by Taylor expansion. As the simplest example, consider the Poisson equation in 1D: d 2u (2.84) − 2 = f (x) dx where f (x) is a given function that in physical problems represents the distribution of sources. The minus sign in the left-hand side is customary in many physical problems (electrostatics, heat transfer, etc.). Let us introduce a grid, for simplicity with a uniform spacing h, and consider a three-point grid molecule xk−1 , xk , xk+1 , where xk±1 = xk ± h. We shall look for the difference scheme in the form s−1 u k−1 + s0 u k + s+1 u k+1 = f (xk )

(2.85)

where the coefficients s (mnemonic for “scheme”) are to be determined. These coefficients are chosen to approximate, with the highest possible order in terms of the grid size h, the Poisson equation. (2.84). More specifically, let u ∗ be the exact solution of this equation, and let us write out the Taylor expansions of the values of u ∗ at the stencil nodes: 1 2 ∗  h u k + h.o.t. 2 u ∗k = u ∗k 1 + h 2 u ∗ k + h.o.t. 2

u ∗k−1 = u ∗k − hu ∗ k + u ∗k+1 = u ∗k + hu ∗ k

where the primes denote derivatives at the midpoint of the grid molecule, x = xk , and “h.o.t.” as before stands for “higher order terms.” Substituting these Taylor expansions into the difference scheme (2.85) and collecting the powers of h, one obtains (s−1 + s0 + s+1 ) u ∗k + (−s−1 + s+1 ) u ∗ k h +

1 (s−1 + s+1 ) u ∗ k h 2 + h.o.t. = −u ∗k  2

(2.86) where in the right-hand side, we took note of the fact that f (xk ) = −u ∗k  . The consistency error of the scheme is, by definition, c = (s−1 + s0 + s+1 )u ∗k + (−s−1 + s+1 ) u ∗ k h   2 1 s−1 + s+1 + 2 u ∗ k h 2 + h.o.t. + 2 h The consistency error tends to zero as h → 0 if and only if

(2.87)

40

2 Finite-Difference Schemes

s−1 + s0 + s+1 = 0 −s−1 + s+1 = 0 s−1 + s+1 + 2/ h 2 = 0 from which the coefficients of the scheme are immediately found to be s−1 = s+1 = − 1/ h 2 ; s0 = 2/ h 2

(2.88)

and the difference equation thus reads −u k−1 + 2u k − u k+1 = f (xk ) h2

(2.89)

It is easy to verify that this scheme is of second order with respect to h, i.e. its consistency error c = O(h 2 ). The Taylor analysis leading to this scheme is general, however, and can be extended to generate higher-order schemes, provided that the grid molecule is extended as well. As an exercise, the reader may verify that a fivepoint scheme with the coefficients [1, −16, 30, −16, 1]/(12h 2 ) is of order four of a uniform grid. Practical implementation of FD schemes involves forming a system of equations for the nodal values of function u, imposing the boundary conditions, solving this system and processing the results. The implementation is described in Sect. 2.6.4.

2.6.2 Using Constraints to Derive Difference Schemes In this subsection, a slightly different way of deriving difference schemes is presented. The idea is most easily illustrated in 1D but will prove to be fruitful in 2D and 3D, particularly for the development of the so-called “Mehrstellen” schemes (see Sects. 2.7.4, 2.8.5). For the 1D Poisson equation, we are looking for a three-point FD scheme of the form (2.90) s−1 u k−1 + s0 u k + s+1 u k+1 = s f Parameter s f in the right-hand side is not specified a priori and will be determined, along with s±1 and s0 , as a result of a formal procedure described below. Let us again expand the exact solution u into the Taylor series around the midpoint xk of the grid molecule: u(x) = c0 + c1 (x − xk ) + c2 (x − xk )2 + c3 (x − xk )3 + c4 (x − xk )4 + h.o.t. (2.91)

2.6 Schemes for One-Dimensional Boundary Value Problems

41

The coefficients cα are of course directly related to the derivatives of u at xk but will initially be treated as undetermined parameters; later on, information available about them will be taken into account. Consistency error of scheme (2.90) can be evaluated by substituting the Taylor expansion (2.91) into the scheme. Upon collecting similar terms for all coefficients cα , we get c = − s f + (s−1 + s0 + s+1 )c0 + (−s−1 + s+1 )hc1 + (s−1 + s+1 )h 2 c2 + (−s−1 + s+1 )h 3 c3 + (s−1 + s+1 )h 4 c4 + h.o.t.

(2.92)

If no information about the coefficients cα were available, the best one could do to minimize the consistency error would be to set s f = 0, s−1 + s0 + s+1 = 0, and −s−1 + s+1 = 0, which yields u k−1 − 2u k + u k+1 = 0. Not surprisingly, this scheme is not suitable for the Poisson equation with a nonzero right-hand side: we have not yet made use of the fact that u satisfies this equation—that is, that the Taylor coefficients cα are not arbitrary. In particular, u  (xk ) = 2c2 = − f (xk )

(2.93)

This condition can be taken into account by using an idea that is, in a sense, dual to the method of Lagrange multipliers in constrained optimization. (Here, we are in fact dealing with a special optimization problem—namely, minimization of the consistency error in the asymptotic sense.) In typical constrained optimization, restrictions are imposed on the optimization parameters being sought; in our case, these parameters are the coefficients s of the difference scheme. Note that constraints on optimization parameters, generally speaking, inhibit optimization. In contrast, in our case, the constraint applies to the parameters of the function being minimized. This narrows down the set of target functions and facilitates optimization. To incorporate the constraint on c2 (2.93) into the minimization problem, one can introduce an analog of the Lagrange multiplier λ: c = −s f + (s−1 + s0 + s+1 )c0 + (−s−1 + s+1 )hc1 + (s−1 + s+1 )h 2 c2 + (−s−1 + s+1 )h 3 c3 + (s−1 + s+1 )h 4 c4 + h.o.t. − λ[2c2 + f (xk )] or equivalently c = (−s f − λ f (xk )) + (s−1 + s0 + s+1 )c0 + (−s−1 + s+1 )hc1 + (s−1 h 2 + s+1 h 2 − 2λ)c2 + (−s−1 + s+1 )h 3 c3 + (s−1 + s+1 )h 4 c4 + h.o.t. (2.94)

42

2 Finite-Difference Schemes

where λ is an arbitrary parameter that one is free to choose in addition to the coefficients of the scheme. As Sects. 2.7.4 and 2.8.5 show, in 2D and 3D there are several such constraints and therefore several extra free parameters at our disposal. Maximization of the order of the consistency error (2.94) yields the following conditions: −s f − λ f (xk ) = 0 s−1 + s0 + s+1 = 0 −s−1 + s+1 = 0 s−1 h 2 + s+1 h 2 − 2λ = 0 This gives, up to an arbitrary factor, λ = 1, s±1 = h −2 , s0 = −2h −2 , s f = − f (xk ), and the resultant difference scheme is −u k−1 + 2u k − u k+1 = f (xk ) h2

(2.95)

This new “Lagrange-like” derivation produces a well-known scheme in one dimension, but in 2D/3D the idea will prove to be more fruitful and will lead to “Mehrstellen” schemes introduced by L. Collatz [Col66].

2.6.3 Flux-Balance Schemes The previous analysis was implicitly based on the assumption that the exact solution was sufficiently smooth to admit the Taylor approximation to a desired order. However, Taylor expansion typically breaks down in a number of important practical cases—particularly so in the vicinity of material interfaces. In 1D, this is exemplified by the following problem:



d dx

  du λ(x) = f (x) on  ≡ [a, b], u(a) = u a , u(b) = u b (2.96) dx

where the boundary values u a , u b are given. In this equation, λ is the material parameter whose physical meaning varies depending on the problem: it is thermal conductivity in heat transfer, dielectric permittivity in electrostatics, magnetic permeability in magnetostatics (if the magnetic scalar potential is used), and so on. This parameter is usually discontinuous across interfaces of different materials. In such cases, the solution satisfies the interface boundary conditions that in the 1D case are u(x0− ) = u(x0+ );

λ(x0− )

du(x0− ) du(x0+ ) = λ(x0+ ) dx dx

(2.97)

2.6 Schemes for One-Dimensional Boundary Value Problems

43

where x0 is the discontinuity point for λ(x), and the − and + labels correspond to the values immediately to the left and to the right of x0 , respectively. The quantities −λ(x)du/d x typically have the physical meaning of fluxes: for example, the heat flux (i.e. energy passed through point x per unit time) in heat transfer problems or the flux of charges (i.e. electric current) in electric conduction, etc. The fundamental physical principle of energy or flux conservation can be employed to construct a difference scheme. For any chosen subdomain (often called “control volume”—in 1D, a segment), the outgoing energy flow (e.g. heat flux) is equal to the total capacity of sources (e.g. heat sources) within that subdomain. In electro- or magnetostatics, with the electric or magnetic scalar potential formulation, a similar principle of flux balance is used instead of energy balance. For Eq. (2.96), energy or flux balance can mathematically be derived by integration. Indeed, let ω = [α, β] ⊂ .10 Integrating the underlying Eq. (2.96) over ω, we obtain β du du (α) − λ(β) (β) = f (x) d x (2.98) λ(α) dx dx α which from the physical point of view is exactly the flux balance equation (outgoing flux from ω is equal to the total capacity of sources inside ω). Fig. 2.13 illustrates the construction of the flux-balance scheme; α and β are chosen as the midpoints of intervals [xk−1 , xk ] and [xk , xk+1 ], respectively. The fluxes in the left-hand side of the balance equation (2.98) are approximated by finite differences to yield h

−1

  β u k − u k−1 u k+1 − u k λ(α) − λ(β) = h −1 f (x) d x h h α

(2.99)

If the central point xk of the grid molecule is placed at the material discontinuity (as shown in Fig. 2.13), λ(α) ≡ λ− and λ(β) ≡ λ+ . The factor h −1 is introduced to normalize the right-hand side of this scheme to O(1) with respect to the mesh size (i.e. to keep the magnitude of the right-hand side approximately constant as the mesh size decreases). The integral in the right-hand side can be computed either analytically, if f (x) admits that, or by some numerical quadrature—the simplest one being just f (xk )(β − α). This flux-balance scheme has a solid foundation as a discrete energy conservation condition. From the mathematical viewpoint, this translates into favorable properties of the algebraic system of equations (to be considered in Sect. 2.6.4): matrix symmetry and, as a consequence, the discrete reciprocity principle. If the middle node of the grid molecule is not located exactly at the material boundary, the flux-balance scheme (2.99) is still usable, with λ(α) and λ(β) being the values of λ in the material where the respective point α or β happens to lie. However, numerical accuracy deteriorates significantly. This can be shown analytically

symbol  refers to the whole computational domain, ω denotes its subdomain (typically “small” in some sense).

10 While

44

2 Finite-Difference Schemes

Fig. 2.13 A three-point flux balance scheme near a material interface in one dimension

Fig. 2.14 Solution of the 1D problem with material discontinuity. λ− = 1, λ+ = 10

by substituting the exact solution into the flux-balance scheme and evaluating the consistency error. Rather than performing this algebraic exercise, we simply consider a numerical illustration. Problem (2.96) is solved in the interval √ [0, 1]. The material boundary point is chosen to be an irrational number a = 1/ 2, so that in the course of the numerical experiment it does not coincide with a grid node of any uniform grid. There are no sources (i.e. f = 0) and the Dirichlet conditions are u(0) = 0, u(1) = 1. The exact solution and the numerical solution with 10 grid nodes are shown in Fig. 2.14. The log–log plot of the relative error norm of the numerical solution vs. the number of grid nodes is given in Fig. 2.15. The dashed line in the figure is drawn for reference to identify the O(h) slope. Comparison with this reference line reveals that the convergence rate is only O(h). Were the discontinuity point to coincide with a grid node, the scheme could easily be shown to be exact—in practice, the numerical solution would be obtained with machine precision. The farther the discontinuity point is from the nearest grid node (relative to the grid size), the higher the numerical error tends to be. This relative distance to the nearest node is plotted in Fig. 2.16 and does indeed correlate clearly with the numerical error in Fig. 2.15.

2.6 Schemes for One-Dimensional Boundary Value Problems

45

Fig. 2.15 Flux-balance scheme: errors vs. the number of grid points for the 1D problem with material discontinuity. λ− = 1, λ+ = 10

Fig. 2.16 Relative distance (as a fraction of the grid size) between the discontinuity point and the nearest grid node

As in the case of Taylor-based schemes of the previous section, the flux-balance schemes prove to be a very natural particular case of “Trefftz–FLAME” schemes considered in Chap. 4; see in particular Sect. 4.4.2. Moreover, in contrast with standard schemes, in FLAME the location of material discontinuities relative to the grid nodes is almost irrelevant.

46

2 Finite-Difference Schemes

2.6.4 Implementation of 1D Schemes for Boundary Value Problems Difference schemes like (2.89) or (2.99) constitute a local relationship between the values at the neighboring nodes of a particular grid molecule. Putting these local relationships together, one obtains a global system of equations. With the grid nodes numbered consecutively from 1 to n,11 the n × n matrix of this system is tridiagonal. Indeed, row k of this matrix corresponds to the difference equation—in our case, either (2.89) or (2.99)—that connects the unknown values of u at nodes k − 1, k and k + 1. For example, the flux-balance scheme (2.99) leads to a matrix L with diagonal entries L kk = (λ+ + λ− )/ h and the off-diagonal ones L k−1,k = −λ− / h, L k,k+1 = −λ+ / h, where as before λ− and λ+ are the values of material parameter λ at the midpoints of intervals [xk−1 , xk ] and [xk , xk+1 ], respectively. These entries are modified at the end points of the interval to reflect the Dirichlet boundary conditions.12 At the boundary nodes, the Dirichlet condition can be conveniently enforced by setting the corresponding diagonal matrix entry to one, the other entries in its row to zero, and the respective entry in the right-hand side to the given Dirichlet value of the solution. In addition, if j is a Dirichlet boundary node and i is its neighbor, the L i j u j term in the ith difference equation is known and therefore gets moved (with the opposite sign) to the right hand side, while the (i, j) matrix entry is simultaneously set to zero. The same procedure is valid in two and three dimensions, except that in these cases, a boundary node can have several neighbors.13 The system matrix L corresponding to this three-point scheme is tridiagonal, and the system can be easily solved by Gaussian elimination (A. George & J. W-H. Liu [GL81]) or its modifications (S. K. Godunov & V. S. Ryabenkii [GR87a]).

− 1 is often more convenient, and is the default in languages like C/C++. However, I have adopted the default numbering of MATLAB and of the classic versions of FORTRAN. 12 The implementation of Neumann and other boundary conditions is covered in all textbooks on FD schemes: L. Collatz [Col66], A. A. Samarskii [Sam01], J. C. Strikwerda [Str04], W. E. Milne [Mil70], and many others. 13 The same is true in 1D for higher-order schemes with more than three stencil nodes in the interior of the domain (more than two nodes in boundary stencils). 11 Numbering from 0 to n

2.7 Schemes for Two-Dimensional Boundary Value Problems

47

Fig. 2.17 A five-point grid “molecule” for difference scheme (2.101) in 2D

2.7 Schemes for Two-Dimensional Boundary Value Problems 2.7.1 Schemes Based on the Taylor Expansion For illustration, let us again turn to the Poisson equation—this time in two dimensions:   2 ∂2u ∂ u = f (x, y) (2.100) + − ∂x 2 ∂ y2 We introduce a Cartesian grid with grid sizes h x , h y and the number of grid subdivisions N x , N y in the x and y directions, respectively. To keep the notation simple, we consider the grid to be uniform along each axis; more generally, h x could vary along the x-axis and h y could vary along the y-axis, but the essence of the analysis would remain the same. Each node of the grid can be characterized in a natural way by two integer indices n x and n y corresponding to the x and y directions; 1 ≤ n x ≤ N x + 1, 1 ≤ n y ≤ N y + 1. To generate a Taylor-based difference scheme for the Poisson equation (2.100), it is sensible to approximate the x- and y- partial derivatives separately in exactly the same way as done in 1D. The resulting scheme for grid nodes not adjacent to the domain boundary is −u n x −1,n y + 2u n x ,n y − u n x +1,n y h 2x −u n x ,n y −1 + 2u n x ,n y − u n x ,n y +1 + = f (xn , yn ) h 2y

(2.101)

where xn , yn are the coordinates of the grid node (n x , n y ). Note that difference scheme (2.101) involves the values of u on a five-point grid “molecule” (three points in each coordinate direction, with the middle node shared, Fig. 2.17).

48

2 Finite-Difference Schemes

As in 1D, scheme (2.101) is of second order, i.e. its consistency error is O(h 2 ), where h = max(h x , h y ). By expanding the stencil, it is possible—again by complete analogy with the 1D case—to increase the order of the scheme. For example, on the grid molecule with nine nodes (five in each coordinate direction, with the middle node shared) a fourth-order scheme can be obtained by combining two–fourth-order schemes in the x and y directions on their respective five-point stencils. Other stencils can be used to construct higher-order schemes, and other ideas can be applied to this construction (see for example the Collatz “Mehrstellen” schemes on a 3 × 3 grid molecule in Sect. 2.7.4).

2.7.2 Flux-Balance Schemes Let us now turn our attention to a more general 2D problem with a varying material parameter  − ∇ · ((x, y)∇u) = f (x, y) (2.102) where  may depend on coordinates but not—in the linear case under consideration— on the solution u. Moreover,  will be assumed piecewise-smooth, with possible discontinuities only at material boundaries.14 At any material interface boundary, the following conditions hold: −

∂u − ∂u + = + ∂n ∂n

(2.103)

where “−” and “+” refer to the values on the two sides of the interface boundary and n is the normal to the boundary in a prescribed direction. The integral form of the differential equation (2.102) is, by Gauss’s Theorem, −

γ

(x, y)

∂u dγ = ∂n

ω

f (x, y) dω

(2.104)

where ω is a subdomain of the computational domain , γ is the boundary of ω, and n is the outward normal to that boundary. The physical meaning of this integral equation is either energy conservation or flux balance, depending on the application. For example, in heat transfer this equation expresses the fact that the net flow of heat through the surface of volume ω is equal to the total amount of heat generated inside the volume by sources f . In electrostatics, (2.104) is an expression of Gauss’s Law (the flux of the displacement vector D is equal to the total charge inside the volume). 14 Throughout the book, “smoothness” is not characterized in a mathematically precise way. Rather, it is tacitly assumed that the level of smoothness is sufficient to justify all mathematical operations and analysis.

2.7 Schemes for Two-Dimensional Boundary Value Problems

49

Fig. 2.18 Construction of the flux-balance scheme. The net flux out of the shaded control volume is equal to the total capacity of sources inside that volume

The integral conservation principle (2.104) is valid for any subdomain ω. Fluxbalance difference schemes are generated by applying this principle to a discrete set of subdomains (“control volumes”) such as the shaded rectangle shown in Fig. 2.18. The grid nodes involved in the construction of the scheme are the same as in Fig. 2.17 and are not labeled to avoid overcrowding the picture. For this rectangular control volume, the surface flux integral in the balance equation (2.104) splits up into four fluxes through the edges of the rectangle. Each of these fluxes can be approximated by a finite difference; for example, Flux1 ≈ 1 h y

u n x ,n y − u n x +1,n y hx

(2.105)

where 1 is the value of the material parameter at the edge midpoint marked with an asterisk in Fig. 2.18; the h y factor is the length of the right edge of the shaded rectangle. (If the grid were not uniform, this edge length would be the average value of the two consecutive grid sizes.) The complete difference scheme is obtained by summing up all four edge fluxes: u n x ,n y − u n x +1,n y u n x ,n y − u n x ,n y +1 + 2 h x hx hy u n x ,n y − u n x −1,n y u n x ,n y − u n x ,n y −1 + 3 h y + 4 h x = f (xn , yn ) h x h y hx hy

1 h y

The approximation of fluxes by finite differences hinges on the assumption of smoothness of the solution. At material interfaces, this assumption is violated, and accuracy deteriorates. The reason is that the Taylor expansion fails when the solution or its derivatives are discontinuous across boundaries. One can try to remedy that by generalizing the Taylor expansion and accounting for derivative jumps (A. Wiegmann & K. P. Bube [WB00]); however, this approach leads to unwieldy expressions. Another alternative is to replace the Taylor expansion with a linear combination of suitable basis functions that satisfy the discontinuous boundary conditions and therefore

50

2 Finite-Difference Schemes

approximate the solution much more accurately. This idea is taken full advantage of in FLAME (Chap. 4).

2.7.3 Implementation of 2D Schemes By applying a difference scheme on all suitable grid molecules, one obtains a system of equations relating the nodal values of the solution on the grid. To write this system in matrix form, one needs a global numbering of nodes from 1 to N , where N = (N x + 1)(N y + 1). The numbering scheme is in principle arbitrary, but the most natural order is either row-wise or column-wise along the grid. In particular, for row-wise numbering, node (n x , n y ) has the global number n = (N x + 1)(n y − 1) + n x − 1,

1≤n≤N

(2.106)

With this numbering scheme, the global node numbers of the two neighbors of node n = (n x , n y ) in the same row are n − 1 and n + 1, while the two neighbors in the same column have global numbers n + (N x + 1) and n − (N x + 1), respectively. For nodes adjacent to the domain boundary, fictitious “neighbors” located outside of the computational domain are ignored. It is then easy to observe that the five-point stencil of the difference scheme leads to a five-diagonal system matrix, two of the subdiagonals corresponding to node– node connections in the same row, and the other two to connections in the same column. All other matrix entries are zero. The Dirichlet boundary conditions are handled in a way similar to the 1D case. Namely, for a boundary node, the corresponding diagonal entry of the system matrix can be set to one (the other entries in the same row being zero), and the entry of the right-hand side set to the required Dirichlet value. Moreover, if j is a boundary node and i is its non-boundary neighbor, the term L i j u j in the difference scheme is known and is therefore moved to the right-hand side (with the respective matrix entry (i, j) reset to zero). There is a rich selection of computational methods for solving such linear systems of equations with large sparse matrices. Broadly speaking, these methods can be subdivided into direct and iterative solvers. Direct solvers are typically based on variants of Gaussian or Cholesky decomposition, with node renumbering and possibly block partitioning; see A. George & J. W-H. Liu [GL81, GLe] and Sect. 3.11. The second one is iterative methods—variants of conjugate gradient or more general Krylov subspace iterations with preconditioners (R. S. Varga [Var00], Y. Saad [Saa03], D. K. Faddeev & V. N. Faddeeva [FF63], H. A. van der Vorst [vdV03a]) or, alternatively, domain decomposition and multigrid techniques (W. Hackbusch [Hac85], J. Xu [Xu92], A. Quarteroni & A. Valli [QV99]); see also Sect. 3.13.4.

2.7 Schemes for Two-Dimensional Boundary Value Problems

51

Fig. 2.19 The nine-point grid molecule with the local numbering of nodes as shown. The central node is numbered first, followed by the remaining nodes of the standard five-point grid molecule, and then by the four corner nodes

2.7.4 The Collatz “Mehrstellen” Schemes in 2D For the Poisson equation in 2D − ∇2u = f

(2.107)

consider now a nine-point grid molecule of 3 × 3 neighboring nodes. The node numbering is shown in Fig. 2.19. We set out to find a scheme 9 α=1

sα u α =

9 α=1

wα f α

(2.108)

with coefficients {sα }, {wα } (α = 1,2, . . . , 9) such that the consistency error has the highest order with respect to the mesh size. For simplicity, we shall now consider schemes with only one nonzero coefficient w corresponding to the central node (node #1) of the grid molecule. It is clear that w1 in this case can be set to unity without any loss of generality, as the coefficients s still remain undetermined; thus 9 α=1

sα u α = f 1

(2.109)

The consistency error of this scheme is, by definition, c =

9 α=1

sα u ∗α − f 1 =

9 α=1

sα u ∗α + ∇ 2 u ∗1

(2.110)

where u ∗ is the exact solution of the Poisson equation and u ∗α is its value at node α. The goal is to minimize the consistency error in the asymptotic sense—i.e. to maximize its order with respect to h—by the optimal choice of the coefficients sα of the difference scheme. Suppose first that no additional information about u ∗ – other than it is a smooth function—is taken into consideration while evaluating consistency error (2.110). Then, expanding u ∗ into the Taylor series around the central point of the nine-point grid molecule, after straightforward algebra one concludes that only a second-order

52

2 Finite-Difference Schemes

scheme can be obtained—that is, asymptotically the same accuracy level as for the five-point grid molecule. However, a scheme with higher accuracy can be constructed if additional information about u ∗ is taken into account. To fix ideas, let us consider the Laplace (rather than the Poisson) equation (2.111) ∇ 2u∗ = 0 Differentiation of the Laplace equation with respect to x and y yields a few additional pieces of information: ∂3u∗ ∂3u∗ = 0 (2.112) + ∂x 3 ∂x 2 ∂ y ∂3u∗ ∂3u∗ + = 0 ∂x∂ y 2 ∂ y3

(2.113)

Another three equations of the same kind can be obtained by taking second derivatives of the Laplace equation, with respect to x x, x y, and yy. As the way these equations are produced is obvious, they are not explicitly written here to save space. All these additional conditions on u ∗ impose constraints on the Taylor expansion of u ∗ . It is quite reasonable to seek a more accurate difference scheme if only the possible exact solutions are targeted, rather than a whole class of sufficiently smooth functions. More specifically, let u ∗ (x, y) = c0 + c1 x + c2 x 2 + c3 x 3 + c4 x 4 + c5 y + c6 x y + c7 x 2 y + c8 x 3 y + c9 y 2 + c10 x y 2 + c11 x 2 y 2 + c12 y 3 + c13 x y 3 + c14 y 4 + h.o.t.

(2.114)

where cα (α = 1, 2, . . . , 14) are some coefficients (directly related, of course, to the partial derivatives of u ∗ ). For convenience, the origin of the coordinate system has been moved to the midpoint of the nine-point grid molecule. To evaluate and minimize the consistency error (2.110) of the difference scheme, we need the nodal values of the exact solution u ∗ . To this end, let us first rewrite expansion (2.114) in a more compact matrix–vector form: u ∗ (x, y) = p T c

(2.115)

where p T is a row vector of 15 polynomials in x, y in the order of their appearance in expansion (2.114): p T = [1, x, x 2 , . . . , x y 3 , y 4 ] ; c ∈ R15 is a column vector of expansion coefficients. The vector of nodal values of u ∗ on the grid molecule will be denoted with N u ∗ and is equal to N u ∗ = N c + h.o.t.

(2.116)

2.7 Schemes for Two-Dimensional Boundary Value Problems

53

The 9 × 15 matrix N comprises the 9 nodal values of the 15 polynomials on the grid molecule, i.e. (2.117) Nαβ = pβ (xα , yα ) Such matrices of nodal values will play a central role in the “Flexible Local Approximation MEthod” (FLAME) of Chap. 4. Consistency error (2.110) for the Laplace equation then becomes Laplace = s T N c + h.o.t. c

(2.118)

where s ∈ R9 is a Euclidean vector of coefficients. If no information about the expansion coefficients c (i.e. about the partial derivatives of the solution) were available, the consistency error would have to be minimized for all vectors c ∈ R15 . In fact, however, u ∗ satisfies the Laplace equation, which imposes constraints on its secondorder and higher-order derivatives. Therefore the target space for optimization is actually narrower than the full R15 . If more constraints on the c coefficients are taken into account, higher accuracy of the difference scheme can be expected. A “Lagrange-like” procedure (Sect. 2.6.2) for incorporating the constraints on u ∗ is in some sense dual to the standard technique of Lagrange multipliers: these multipliers are applied not to the optimization parameters but rather to the parameters of the target function u ∗ . Thus, we introduce five Lagrange-like multipliers λ1−5 to take into account five constraints on the c coefficients: Laplace = s T N c − λ1 (c2 + c9 ) − λ2 (3c3 + c10 ) − λ3 (c7 + 3c12 ) c − λ4 (6c4 + c11 ) − λ5 (6c14 + c11 ) − λ6 (6c8 + c13 ) + h.o.t.

(2.119)

For example, the constraint represented by λ1 is just the Laplace equation itself 2 ∗ 2 ∗ (since c2 = 21 ∂∂xu2 , c9 = 21 ∂∂ yu2 ); the constraint represented by λ2 is the derivative of the Laplace equation with respect to x (see (2.112)), and so on. In matrix form, Eq. (2.119) becomes Laplace = s T N c − λT Qc + h.o.t. c

(2.120)

where matrix Q corresponds to the λ-terms in (2.119). The same relationship can be rewritten in the block-matrix form   N

c + h.o.t. (2.121) c = s T λT −Q As in the regular technique of Lagrange multipliers, the problem is now treated as unconstrained. The consistency error is reduced just to the higher-order terms if  

s ∈ Null N T ; −Q T λ assuming that this null space is nonempty.

(2.122)

54

2 Finite-Difference Schemes

The computation of matrices N and Q, as well as the null space above, is straightforward by symbolic algebra. As a result, the following coefficients are obtained for a grid molecule with mesh sizes h x = qx h, h y = q y h in the x and y directions, respectively: s1 = 20h −2 s2,3 = − 2h −2 (5qx2 − q y2 )/(q y2 + qx2 ) s4,5 = − 2h −2 (5q y2 − qx2 )/(q y2 + qx2 ) s6−9 = − h −2 . If qx = q y (i.e. h x = h y ), the scheme simplifies: s = h −2 [20, −4, −4, −4, −4, −1, −1, −1, −1] 20 corresponds to the central node, the −4’s—to the mid-edge nodes, and the −1’s— to the corner nodes. This scheme was derived, from different considerations, by L. Collatz in the 1950’s [Col66] and called a “Mehrstellenverfahren” scheme.15 (See also A. A. Samarskii [Sam01] for yet another derivation.) It can be verified that this scheme is of order four in general but of order 6 in the special case of h x = h y .16 It will become clear in Sects. 4.4.4 and 4.4.5, that the “Mehrstellen” schemes are a natural particular case of Flexible Local Approximation MEthods (FLAME) considered in Chap. 4. More details about the “Mehrstellen” schemes and their application to the Poisson equation in 2D and 3D can be found in the same monographs by Collatz and Samarskii. The 3D case is also considered in Sect. 2.8.5, as it has important applications to long-range electrostatic forces in molecular dynamics (e.g. C. Sagui & T. Darden [SD99]) and in electronic structure calculation (E. L. Briggs et al. [BSB96]).

2.8 Schemes for Three-Dimensional Problems 2.8.1 An Overview The structure and subject matter of this section are very similar to those of the previous section on 2D schemes. To avoid unnecessary repetition, issues that are completely analogous in 2D and 3D will be reviewed briefly, but the differences between the 3D and 2D cases will be highlighted. 15 In 16

the English translation of the Collatz book, these methods are called “Hermitian”. A different perspective on the order of FD schemes is given in Sect. 4.5.

2.8 Schemes for Three-Dimensional Problems

55

We again start with low-order Taylor-based schemes and then proceed to higherorder schemes, control volume/flux-balance schemes, and “Mehrstellen” schemes.

2.8.2 Schemes Based on the Taylor Expansion in 3D The Poisson equation in 3D has the form  −

∂2u ∂2u ∂2u + + ∂x 2 ∂ y2 ∂z 2

 = f (x, y, z)

(2.123)

Finite difference schemes can again be constructed on a Cartesian grid with the grid sizes h x , h y , h z and the number of grid subdivisions N x , N y , Nz in the x, y and z directions, respectively. Each node of the grid is characterized by three integer indices n x n y , n z : 1 ≤ n x ≤ N x + 1, 1 ≤ n y ≤ N y + 1, 1 ≤ n z ≤ Nz + 1. The simplest Taylor-based difference scheme for the Poisson equation is constructed by combining the approximations of the x-, y- and z− partial derivatives: −u n x −1,n y ,n z + 2u n x ,n y ,n z − u n x +1,n y ,n z ) h 2x −u n x ,n y −1,n z + 2u n x ,n y ,n z − u n x ,n y +1,n z + h 2y −u n x ,n y ,n z −1 + 2u n x ,n y ,n z − u n x ,n y ,n z +1 + = f (xn , yn , z n ) h 2z

(2.124)

where xn , yn , z n are the coordinates of the grid node (n x , n y , n z ). This difference scheme involves a 7-point grid molecule (three points in each coordinate direction, with the middle node shared between them). As in 1D and 2D, scheme (2.124) is of second order, i.e. its consistency error is O(h 2 ), where h = max(h x , h y , h z ). Higher-order schemes can be constructed in a natural way by combining the approximations of each partial derivative on its extended 1D grid molecule; for example, a 3D molecule with 13 nodes is obtained by combining three five-point molecules in each coordinate direction, with the middle node shared. The resultant scheme is of fourth order. Another alternative is Collatz “Mehrstellen” schemes, in particular the fourth-order scheme on a 19-point grid molecule considered in Sect. 2.8.5.

56

2 Finite-Difference Schemes

2.8.3 Flux-Balance Schemes in 3D Consider now a 3D problem with a coordinate-dependent material parameter: − ∇ · ((x, y, z)∇u) = f (x, y, z)

(2.125)

As before,  will be assumed piecewise-smooth, with possible discontinuities only at material boundaries. The potential is continuous everywhere. The flux continuity conditions at material interfaces have the same form as in 2D: −

∂u − ∂u + = + ∂n ∂n

(2.126)

where “−” and “+” again refer to the values on the two sides of the interface boundary. The integral form of the differential equation (2.125) is, by Gauss’s Theorem (x, y, z)

− S

∂u dS = ∂n

ω

f (x, y, z) dω

(2.127)

where ω is a subdomain of the computational domain , S is the boundary surface of ω, and n is the normal to that boundary. As in 2D, the physical meaning of this integral condition is energy or flux balance, depending on the application. A “control volume” ω to which the flux balance condition can be applied is (2.104) is shown in Fig. 2.20. The flux-balance scheme is completely analogous to its 2D counterpart (see (2.106)): u n x ,n y ,n z − u n x +1,n y ,n z u n x ,n y ,n z − u n x ,n y +1,n z + 2 h x h z hx hy u n x ,n y ,n z − u n x −1,n y ,n z u n x ,n y ,n z − u n x ,n y −1,n z + 3 h y h z + 4 h x h z hx hy

1 h y h z

Fig. 2.20 Construction of the flux-balance scheme in three dimensions. The net flux out of the shaded control volume is equal to the total capacity of sources inside that volume. The grid nodes are shown as circles. For flux computation, the material parameters are taken at the midpoints of the faces

2.8 Schemes for Three-Dimensional Problems

57

u n x ,n y ,n z − u n x ,n y ,n z +1 u n x ,n y ,n z − u n x ,n y ,n z −1 + 6 h x h y hz hz = f (xn , yn , z n ) h x h y h z (2.128)

+ 5 h x h y

As in 2D, the accuracy of this scheme deteriorates in the vicinity of material interfaces, as the derivatives of the solution are discontinuous. Suitable basis functions satisfying the discontinuous boundary conditions are used in FLAME schemes (Chap. 4), which dramatically reduces the consistency error.

2.8.4 Implementation of 3D Schemes Assuming for simplicity that the computational domain is a rectangular parallelepiped, one introduces a Cartesian grid with N x , N y and Nz subdivisions in the respective coordinate directions. The total number of nodes Nm in the mesh (including the boundary nodes) is Nm = (N x + 1)(N y + 1)(Nz + 1). A natural node numbering is generated by letting, say, n x change first, n y second and n z third, which assigns the global number n = (N x + 1)(N y + 1)(n z − 1) + (N x + 1)(n y − 1) + n x − 1,

1≤n≤N (2.129) to node (n x , n y , n z ). When, say, a seven-point scheme is applied on all grid molecules, a 7-diagonal system matrix results. Two subdiagonals correspond to the connections of the central node (n x , n y , n z ) of the molecule to the neighboring nodes (n x ± 1, n y , n z ), another two subdiagonals to neighbors (n x , n y ± 1, n z ), and the remaining two subdiagonals to nodes (n x , n y , n z ± 1). Boundary conditions are handled in a way completely analogous to the 2D case. The selection of solvers for the resulting linear system of equations is in principle the same as in 2D, with direct and iterative methods being available. However, there is a practical difference. In two dimensions, thousands or tens of thousands of grid nodes are typically needed to achieve reasonable engineering accuracy; such problems can be easily solved with direct methods that are often more straightforward and robust than iterative algorithms. In 3D, the number of unknowns can easily reach hundreds of thousands or millions, in which case iterative methods may be the only option.17

17 Even for the same number of unknowns in a 2D and a 3D problem, in the 3D case the number of nonzero entries in the system matrix is greater, the sparsity pattern of the matrix is different, and the 3D solver requires more memory and CPU time.

58

2 Finite-Difference Schemes

Fig. 2.21 For the Laplace equation, this fourth-order “Mehrstellen”-Collatz scheme on the 19point grid molecule is a direct particular case of Trefftz–FLAME. The grid sizes are equal in all three directions. For visual clarity, the molecule is shown as three slices along the y-axis. (Reprinted by permission from [Tsu06] ©2006 Elsevier.)

2.8.5 The Collatz “Mehrstellen” Schemes in 3D The derivation and construction of the “Mehrstellen” schemes in 3D are based on the same ideas as in the 2D case, Sect. 2.7.4. For the Laplace equation, the “Mehrstellen” scheme can also be obtained as a direct and natural particular case of FLAME schemes in Chap. 4. The 19-point grid molecule for a fourth-order “Mehrstellen” scheme is obtained by discarding the eight corner nodes of a 3 × 3 × 3 node cluster. The coefficients of the scheme for the Laplace equation on a uniform grid with h x = h y = h z are visualized in Fig. 2.21. In the more general case of unequal mesh sizes in the x, y and z directions, the “Mehrstellen” scheme is derived in the monographs by L. Collatz and A. A. Samarskii. E. L. Briggs et al. [BSB96] list the coefficients of the scheme in a concise table form. The end result is as follows.  The coefficient corresponding to the central node of the grid molecule is 4/3 αh −2 α (where α = x, y, z). The coefficients corresponding to the two immediate neighbors  −2 of the central node in the α direction are −5/6h −2 α + 1/6 β h β (β = x, y or z). Finally, the coefficients corresponding to the nodes displaced by h α and h β in both −2 α- and β-coordinate directions relative to the central node are −1/12h −2 α −1/12h β . If the Poisson equation (2.123) rather than the Laplace equation is solved, with f = f (x, y, z) a smooth function of coordinates, the right-hand side of the 19-point

2.8 Schemes for Three-Dimensional Problems

59

1 6 Mehrstellen scheme is f h = 21 f 0 + 12 α=1 f α , where f 0 is the value of f at the middle node of the grid molecule and f α are the values of f at the six immediate neighbors of that middle node. Thus the computation of the right-hand side involves the same seven-point molecule as for the standard second-order scheme for the Laplace equation, not the whole 19-point molecule. HODIE schemes by R. E. Lynch & J. R. Rice [LR80] generalize the Mehrstellen schemes and include additional points in the computation of the right-hand side.

2.9 Consistency and Convergence of Difference Schemes

This section presents elements of the convergence and accuracy theory of FD schemes. A more comprehensive and rigorous treatment is available in many monographs (e.g. L. Collatz [Col66], A. A. Samarskii [Sam01], J. C. Strikwerda [Str04], W. E. Milne [Mil70], S. K. Godunov & V. S. Ryabenkii [GR87a]). Consider a differential equation in 1D, 2D or 3D

$Lu = f$

(2.130)

that we wish to approximate by a difference scheme

$L_h^{(i)} u_h^{(i)} = f_{hi}$

(2.131)

on a grid molecule (i). Here $u_h^{(i)}$ is the Euclidean vector of the nodal values of the numerical solution on the grid molecule. Merging the difference schemes on all molecules into a global system of equations, one obtains

$L_h u_h = f_h$

(2.132)

where u h and f h are the numerical solution and the right-hand side, respectively; they can be viewed as Euclidean vectors of nodal values on the whole grid. However, the convenient choice of norms for u h and f h is not necessarily Euclidean, but rather some scaled versions of Euclidean norms (see below). Exactly in what sense does (2.132) approximate the original differential equation (2.130)? A natural requirement is that the exact solution u ∗ of the differential equation should approximately satisfy the difference equation. To write this condition rigorously, we need to substitute u ∗ into the difference scheme (2.132). Since this scheme operates on the nodal values of u ∗ , a notation for these nodal values is in order. We shall use the calligraphic letter N for this purpose: N u ∗ will mean the Euclidean vector of nodal values of u ∗ on the whole grid. Similarly, N (i) u ∗ is the Euclidean vector of nodal values of u ∗ on a given grid molecule (i).


The consistency error vector $\underline{c} \equiv \{c_i\}_{i=1}^{n}$ of scheme (2.132) is the residual obtained when the exact solution is substituted into the difference equation; that is,

$L_h \mathcal{N}u^* = f_h + \underline{c}$   (2.133)

where as before the underscored symbols are Euclidean vectors. The consistency error (a number) is defined as a norm of the error vector:

consistency error $\equiv \epsilon_c(h) = \|\underline{c}\| = \|L_h \mathcal{N}u^* - f_h\|_{Fh}$   (2.134)

where the norm $\|\cdot\|_{Fh}$ is typically chosen in such a way that for sufficiently smooth functions it turns in the h → 0 limit into a conventional continuous-space norm; for example,

$\|f_h\|_{Fh}^2 \;\overset{\mathrm{def}}{=}\; h^d\, \|f_h\|_2^2$   (2.135)

where d = 1, 2, 3 is the dimensionality of the problem and of the grid. For a detailed discussion of the choice of norms $\|\cdot\|_{Uh}$ for the numerical solution and $\|\cdot\|_{Fh}$ for the right-hand side, see S. K. Godunov & V. S. Ryabenkii (G&R) [GR87a, Sect. 5.13], and also Sect. 7.12.3. There is, however, one additional caveat. According to (2.134), multiplying any difference equation through by, say, $h^{100}$ increases the order of the scheme by 100. To remedy this scaling artifact, one may seek a suitable normalization condition; for example, require that the right-hand side have h-independent bounds:

$c_1 f(r_i) \;\le\; f_{hi} \;\le\; c_2 f(r_i), \qquad \forall\ \text{node } i,\ r_i \in \Omega$

(2.136)

where $c_{1,2}$ do not depend on i and h. A different perspective on FD approximation and scaling is introduced in Sect. 4.5. A scheme is called consistent if the consistency error tends to zero as h → 0:

$\epsilon_c = \|L_h \mathcal{N}u^* - f_h\|_{Fh} \to 0 \quad \text{as } h \to 0$

(2.137)

Consistency is usually relatively easy to establish. For example, the Taylor expansions in Sect. 2.6.1 show that the consistency error of the three-point scheme for the Poisson equation in 1D is O(h²); see (2.87)–(2.89). This scheme is therefore consistent. Unfortunately, consistency by itself does not guarantee convergence. To see why, let us compare the difference equations satisfied by the numerical solution and the exact solution, respectively:

$L_h \mathcal{N}u^* = f_h + \underline{c}$

(2.138)

$L_h u_h = f_h$

(2.139)


These are Eqs. (2.132) and (2.133) written together for convenience. Clearly, the systems of equations for the exact solution $u^*$ (more precisely, its nodal values $\mathcal{N}u^*$) and for the numerical solution $u_h$ have slightly different right-hand sides. The consistency error $\underline{c}$ is a measure of the residual of the difference equation, which is different from the accuracy of the numerical solution of this equation. Does the small difference $\underline{c}$ in the right-hand sides of (2.138) and (2.139) translate into a comparably small difference in the solutions themselves? If yes, the scheme is called stable. A formal definition of stability is as follows:

$\epsilon_h \equiv \|\underline{\epsilon}_h\|_{Uh} \equiv \|u_h - \mathcal{N}u^*\|_{Uh} \;\le\; C\,\|\underline{c}\|_{Fh}$

(2.140)

where the factor C may depend on the exact solution $u^*$ but not on the mesh size h. The stability constant C is linked to the properties of the inverse operator $L_h^{-1}$. Indeed, subtracting (2.138) from (2.139), one obtains an expression for the error vector:

$\underline{\epsilon}_h \equiv u_h - \mathcal{N}u^* = L_h^{-1}\underline{c}$

(2.141)

(assuming that $L_h$ is nonsingular). Hence the numerical error can be estimated as

$\epsilon_h \equiv \|\underline{\epsilon}_h\|_{Uh} \equiv \|u_h - \mathcal{N}u^*\|_{Uh} \;\le\; \|L_h^{-1}\|\,\|\underline{c}\|_{Fh}$

(2.142)

where the matrix norm for $L_h^{-1}$ is induced by the vector norms, i.e., for a generic square matrix A,

$\|A\| \;=\; \max_{x \ne 0} \dfrac{\|Ax\|_{Uh}}{\|x\|_{Fh}}$

(see Appendix 2.10). In summary, convergence of the scheme follows from consistency and stability. This result is known as the Lax–Richtmyer Equivalence Theorem (see e.g. J. C. Strikwerda [Str04] and also Sect. 7.12.3). To find the consistency error of a scheme, one needs to substitute the exact solution into it and evaluate the residual (e.g. using Taylor expansions). This is a relatively straightforward procedure. In contrast, stability (and, by implication, convergence) are in general much more difficult to establish. For conventional difference schemes and the Poisson equation, convergence is proved in standard texts (e.g. W. E. Milne [Mil70] or J. C. Strikwerda [Str04]). This convergence result in fact applies to a more general class of monotone schemes.

Definition 6 A difference operator $L_h$ (and the respective $N_m \times N_m$ matrix) is called monotone if $L_h x \ge 0$ for vector $x \in \mathbb{R}^{N_m}$ implies $x \ge 0$, where vector inequalities are understood entry-wise.

In other words, if $L_h$ is monotone and $L_h x$ has all nonnegative entries, vector x must have all nonnegative entries as well. Algebraic conditions related to monotonicity are reviewed at the end of this subsection.
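Before turning to monotone schemes, the notion of consistency order can be illustrated numerically. The sketch below (not from the book; the test solution and grid sizes are illustrative assumptions) evaluates the scaled residual norm (2.134)–(2.135) of the standard three-point scheme for two grid sizes and confirms second-order consistency:

import numpy as np

def consistency_error(n):
    # -u'' = f on (0, 1) with u*(x) = sin(pi x), f = pi^2 sin(pi x), u*(0) = u*(1) = 0
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)          # interior nodes
    u = np.sin(np.pi * x)                 # exact solution at the nodes
    f = np.pi**2 * np.sin(np.pi * x)
    up = np.concatenate(([0.0], u, [0.0]))
    resid = (2 * up[1:-1] - up[:-2] - up[2:]) / h**2 - f
    return np.sqrt(h * np.sum(resid**2))  # scaled norm, cf. (2.135) with d = 1

e1, e2 = consistency_error(100), consistency_error(200)
print(e1 / e2)   # approximately 4, i.e. the scheme is O(h^2)-consistent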


To analyze the convergence of monotone schemes, the following Lemma will be needed.

Lemma 1 If the scheme is scaled according to (2.136) and the consistency condition (2.134) holds, there exists a reference nodal vector $u_{1h}$ such that

$u_{1h} \le U_1 \quad \text{and} \quad L_h u_{1h} \ge \sigma_1 > 0,$

(2.143)

with numbers $U_1$ and $\sigma_1$ independent of h. (All vector inequalities are understood entry-wise.)

Remark 1 (Notation.) Subscript 1 is meant to show that, as seen from the proof below, the auxiliary potential $u_{1h}$ may be related to the solution of the differential equation with the unit right-hand side.

Proof The reference potential $u_{1h}$ can be found explicitly by considering the auxiliary problem

$Lu_1 = 1$   (2.144)

with the same boundary conditions as the original problem. Condition (2.137) applied to the nodal values of $u_1$ implies that for sufficiently small h the consistency error will fall below $\frac{1}{2}c_1$, where $c_1$ is the parameter in (2.136):

$\bigl| s^{(i)T} \mathcal{N}^{(i)} u_1 - f_{hi} \bigr| \;\le\; \tfrac{1}{2} c_1$

Therefore, since f = 1 in (2.136),

$\bigl| s^{(i)T} \mathcal{N}^{(i)} u_1 \bigr| \;\ge\; |f_{hi}| - \bigl| f_{hi} - s^{(i)T} \mathcal{N}^{(i)} u_1 \bigr| \;\ge\; c_1 - \tfrac{1}{2} c_1 = \tfrac{1}{2} c_1$

(2.145)

(the vector inequality is understood entry-wise). Thus one can set $u_{1h} = \mathcal{N}u_1$, with $\sigma_1 = \tfrac{1}{2}c_1$ and $U_1 = \|u_1\|_\infty$.

Theorem 1 Let the following conditions hold for difference scheme (2.132):
1. Consistency in the sense of (2.137), (2.136).
2. Monotonicity: if

$L_h x \ge 0, \quad \text{then} \quad x \ge 0$

(2.146)

Then the numerical solution converges in the nodal norm, and

$\|u_h - \mathcal{N}u^*\|_\infty \;\le\; \epsilon_c\, U_1 / \sigma_1$   (2.147)

where $\sigma_1$ is the parameter in (2.143).


Proof Let $\underline{\epsilon}_h = u_h - \mathcal{N}u^*$. By consistency,

$L_h \underline{\epsilon}_h \;\le\; \epsilon_c \;\le\; \epsilon_c\, L_h u_{1h} / \sigma_1 \;=\; L_h (\epsilon_c\, u_{1h} / \sigma_1)$

where (2.143) was used. Hence, due to monotonicity,

$\underline{\epsilon}_h \;\le\; \epsilon_c\, u_{1h} / \sigma_1$   (2.148)

It then also follows that

$\underline{\epsilon}_h \;\ge\; -\epsilon_c\, u_{1h} / \sigma_1$   (2.149)

Indeed, if that were not true, one would have $(-\underline{\epsilon}_h) \nleq \epsilon_c\, u_{1h} / \sigma_1$, which would contradict the error estimate (2.148) for the system with (−f) instead of f in the right-hand side.

We now summarize sufficient and/or necessary algebraic conditions for monotonicity. Of particular interest is the relationship of monotonicity to diagonal dominance, as the latter is trivial to check for any given scheme. The summary is based on the monograph of R. S. Varga [Var00] and the reference book of V. Voevodin & Yu. A. Kuznetsov [VK84]. The mathematical facts are cited without proof.

Proposition 1 A square matrix A is monotone if and only if it is nonsingular and $A^{-1} \ge 0$. [As a reminder, all matrix and vector inequalities in this section are understood entry-wise.]

Definition 7 A square matrix A is called an M-matrix if it is nonsingular, $a_{ij} \le 0$ for all $i \ne j$, and $A^{-1} \ge 0$.

Thus an M-matrix, in addition to being monotone, has nonpositive off-diagonal entries.

Proposition 2 All diagonal elements of an M-matrix are positive.

Proposition 3 Let a square matrix A have nonpositive off-diagonal entries. Then the following conditions are equivalent:
1. A is an M-matrix.
2. There exists a positive vector w such that $A^{-1}w$ is also positive.
3. Re λ > 0 for any eigenvalue λ of A.
(See [VK84, Sect. 36.15] for additional equivalent statements.)

Notably, the second condition above allows one to demonstrate monotonicity by exhibiting just one special vector satisfying this condition, which is simpler than verifying this condition for all vectors as stipulated in the definition of monotonicity. Even more practical is the connection with diagonal dominance [VK84].


Proposition 4 Let a square matrix A have nonpositive off-diagonal entries. If this matrix has strong diagonal dominance, it is an M-matrix.

Proposition 5 Let an irreducible square matrix A have nonpositive off-diagonal entries. If this matrix has weak diagonal dominance, it is an M-matrix. Moreover, all entries of $A^{-1}$ are then (strictly) positive.

A matrix is called irreducible if it cannot be transformed to a block-triangular form by permuting its rows and columns. The definition of weak diagonal dominance for a matrix A is

$|A_{ii}| \;\ge\; \sum_{j \ne i} |A_{ij}|$   (2.150)

in each row i. The condition of strong diagonal dominance is obtained by changing the inequality sign to strict. Thus diagonal dominance of matrix $L_h$ of the difference scheme is a sufficient condition for monotonicity if the off-diagonal entries of $L_h$ are nonpositive. As a measure of the relative magnitude of the diagonal elements, one can use

$q \;=\; \min_i \dfrac{|L_{h,ii}|}{\sum_j |L_{h,ij}|}$

(2.151)

with matrix L h being weakly diagonally dominant for q = 0.5 and diagonal for q = 1. Diagonal dominance is a strong condition that unfortunately does not hold in general.
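The conditions above are easy to verify numerically for a concrete scheme. The following sketch (an illustration, not from the book; the 1D three-point matrix with Dirichlet conditions is used as an example) checks the sign pattern, the diagonal dominance measure q of (2.151) and the positivity of the inverse, in line with Proposition 5:

import numpy as np

n, h = 50, 1.0 / 51
L = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

offdiag_nonpos = np.all(L - np.diag(np.diag(L)) <= 0)        # Z-matrix sign pattern
row_offsums = np.sum(np.abs(L), axis=1) - np.abs(np.diag(L))
weak_dd = np.all(np.abs(np.diag(L)) >= row_offsums)          # weak diagonal dominance
q = np.min(np.abs(np.diag(L)) / np.sum(np.abs(L), axis=1))   # measure (2.151)
inverse_positive = np.all(np.linalg.inv(L) > 0)              # cf. Proposition 5

print(offdiag_nonpos, weak_dd, q, inverse_positive)   # True, True, 0.5 (approx.), True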

2.10 Summary and Further Reading

This chapter is an introduction to the theory and practical usage of finite difference schemes. Classical FD schemes are constructed by the Taylor expansion over grid molecules; this was illustrated in Sects. 2.1–2.2 and parts of Sects. 2.6–2.8. The chapter also touched upon classical schemes (Runge–Kutta, Adams and others) for ordinary differential equations and special schemes that preserve physical invariants of Hamiltonian systems. Somewhat more special are the Collatz “Mehrstellen” schemes for the Poisson equation. These schemes (nine-point in 2D and 19-point in 3D) are described in Sects. 2.7.4 and 2.8.5. Higher approximation accuracy is achieved, in essence, by approximating the solution of the Poisson equation rather than a generic smooth function. We shall return to this idea in Chap. 4 and will observe that the Mehrstellen schemes are, at least for the Laplace equation, a natural particular case of “Flexible Local Approximation MEthods” (FLAME) considered in that chapter. In fact, in FLAME the classic FD schemes and the Collatz Mehrstellen schemes stem from one single principle and one single definition of the scheme. Very important are the schemes based on flux or energy balance for a control volume; see Sects. 2.6.3, 2.7.2, and 2.8.3. Such schemes are known to be quite robust,


which is particularly important for problems with inhomogeneous media and material interfaces. The robustness can be attributed to the underlying solid physical principles (conservation laws). For further general reading on FD schemes, the interested reader may consider the monographs by L. Collatz [Col66], J. C. Strikwerda [Str04], A. A. Samarskii [Sam01]. A comprehensive source of information not just on FD schemes but also on numerical methods for ordinary and partial differential equations in general is the book by A. Iserles [Ise96]. It covers one-step and multistep schemes for ODE, Runge–Kutta methods, schemes for stiff systems, FD schemes for the Poisson equation, the finite element method, algebraic system solvers, multigrid and other fast solution methods, diffusion and hyperbolic equations. For readers interested in schemes for fluid dynamics, S. V. Patankar’s text [Pat80] may serve as an introduction. A more advanced book by T. J. Chung [Chu02] covers not only finite-difference, but also finite-volume and finite element methods for fluid flow. Also well-known and highly recommended are two monographs by R. J. LeVeque: one on schemes for advection-diffusion equations, with the emphasis on conservation laws [LeV96], and another one with a comprehensive treatment of hyperbolic problems [LeV02a]. The book by H.-G. Roos et al. [RST96], while focusing (as the title suggests) on the mathematical treatment of singularly perturbed convection–diffusion problems, is also an excellent source of information on finite-difference schemes in general. For theoretical analysis and computational methods for fluid dynamics on the microscale, see books by G. Karniadakis et al. [KBA01] and by J. A. Pelesko & D. H. Bernstein [PB02]. Several monographs and review papers are devoted to schemes for electromagnetic applications. The literature on finite-difference time-domain (FDTD) schemes for electromagnetic wave propagation is especially extensive (Chap. 7). The book by A. F. Peterson et al. [PRM98] covers, in addition to FD schemes in both time and frequency domain, integral equation techniques and the finite element method for computational electromagnetics.

Appendix: Frequently Used Vector and Matrix Norms

The following vector and matrix norms are used most frequently.

$\|x\|_1 = \sum_{i=1}^{n} |x_i|$   (2.152)

$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |A_{ij}|$   (2.153)

$\|x\|_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2}$   (2.154)

$\|A\|_2 = \max_{1 \le i \le n} \lambda_i^{1/2}(A^* A)$   (2.155)

where $A^*$ is the Hermitian conjugate (= the conjugate transpose) of matrix A, and $\lambda_i$ are the eigenvalues.

$\|x\|_\infty = \max_{1 \le i \le n} |x_i|$   (2.156)

$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |A_{ij}|$   (2.157)

See linear algebra textbooks, e.g. Y. Saad [Saa03], R.A. Horn & C.R. Johnson [HJ90], F.R. Gantmakher [Gan90] for further analysis and proofs.
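For readers who wish to experiment, the definitions (2.152)–(2.157) can be cross-checked against NumPy's built-in norms; the snippet below is an illustration (not from the book; the random test data are assumptions):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
A = rng.standard_normal((5, 5))

assert np.isclose(np.sum(np.abs(x)), np.linalg.norm(x, 1))                       # (2.152)
assert np.isclose(np.max(np.sum(np.abs(A), axis=0)), np.linalg.norm(A, 1))       # (2.153)
assert np.isclose(np.sqrt(np.sum(np.abs(x)**2)), np.linalg.norm(x, 2))           # (2.154)
lam = np.linalg.eigvalsh(A.conj().T @ A)                                         # eigenvalues of A*A
assert np.isclose(np.sqrt(lam.max()), np.linalg.norm(A, 2))                      # (2.155)
assert np.isclose(np.max(np.abs(x)), np.linalg.norm(x, np.inf))                  # (2.156)
assert np.isclose(np.max(np.sum(np.abs(A), axis=1)), np.linalg.norm(A, np.inf))  # (2.157)
print("all norm identities verified")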

Appendix: Matrix Exponential

It is not uncommon for an operation over some given class of objects to be defined in two (or more) different ways which for this class are equivalent. Yet one of these ways could have a broader range of applicability and can hence be used to generalize the definition of the operation. This is exactly the case for the exponential operation. One way to define exp x is via simple arithmetic operations—first for x integer via repeated multiplications, then for x rational via roots, and then for all real x.18 While this definition works well for real numbers, its direct generalization to, say, complex numbers is not straightforward (because of the branches of roots), and generalization to more complicated objects like matrices is even less clear. At the same time, the exponential function admits an alternative definition via the Taylor series

$\exp x = \sum_{n=0}^{\infty} \dfrac{x^n}{n!}$   (2.158)

that converges absolutely for all x. This definition is directly applicable not only to complex numbers but to matrices and operators. The matrix exponential can be defined as

$\exp A = \sum_{n=0}^{\infty} \dfrac{A^n}{n!}$   (2.159)

where A is an arbitrary square matrix (real or complex). This infinite series converges for any matrix, and exp(A) defined this way can be shown to have many of the usual properties of the exponential function—most notably,

18 The rigorous mathematical theory—based on either Dedekind’s cuts or Cauchy sequences—is, however, quite involved; see e.g. W. Rudin [Rud76].


$\exp((\alpha + \beta)A) = \exp(\alpha A)\,\exp(\beta A), \qquad \forall \alpha \in \mathbb{C},\ \forall \beta \in \mathbb{C}$

(2.160)

If A and B are two commuting square matrices of the same size, AB = BA, then

$\exp(A + B) = \exp A\,\exp B, \qquad \text{if } AB = BA$

(2.161)

Unfortunately, for non-commuting matrices this property is not generally true. For a system of ordinary differential equations written in matrix–vector form as

$\dfrac{dy(t)}{dt} = Ay, \qquad y \in \mathbb{R}^n$

(2.162)

the solution can be expressed via the matrix exponential in a very simple way:

$y(t) = \exp(At)\, y_0$

(2.163)

Note that if matrices A and $\tilde{A}$ are related via a similarity transform

$A = S^{-1} \tilde{A} S$   (2.164)

then

$A^2 = S^{-1} \tilde{A} S\, S^{-1} \tilde{A} S = S^{-1} \tilde{A}^2 S$

and $A^3 = S^{-1} \tilde{A}^3 S$, etc.—i.e. powers of A and $\tilde{A}$ are related via the same similarity transform. Substituting this into the Taylor series (2.159) for the matrix exponential, one obtains

$\exp A = S^{-1} \exp\tilde{A}\; S$   (2.165)

This is particularly useful if matrix A is diagonalizable; then $\tilde{A}$ can be made diagonal and contains the eigenvalues of A, and $\exp(\tilde{A})$ is a diagonal matrix containing the exponentials of these eigenvalues.19 Since the matrix exponential is intimately connected with such difficult problems as full eigenvalue analysis and solution of general ODE systems, it is not surprising that the computation of exp(A) is itself highly complex in general. The curious reader may find it interesting to see the “nineteen dubious ways to compute the exponential of a matrix” (C. Moler & C. Van Loan, [ML78, ML03]; see also W. A. Harris et al. [WFS01]).

19 Matrices with distinct eigenvalues are diagonalizable; so are symmetric matrices.
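A small numerical illustration of (2.163) and (2.165) is given below (not from the book; SciPy's expm is used as the reference implementation, and the random matrices and parameters are assumptions):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
y0 = rng.standard_normal(4)
t = 0.7

# (2.163): y(t) = exp(At) y0 solves dy/dt = A y; compare with a fine explicit Euler march.
y = expm(A * t) @ y0
y_euler, nsteps = y0.copy(), 200000
for _ in range(nsteps):
    y_euler = y_euler + (t / nsteps) * (A @ y_euler)
print(np.max(np.abs(y - y_euler)))        # small; the discrepancy is the Euler error

# (2.165): if A = S^{-1} Atilde S, then exp(A) = S^{-1} exp(Atilde) S.
S = rng.standard_normal((4, 4))
Atilde = S @ A @ np.linalg.inv(S)         # so that A = S^{-1} Atilde S
print(np.max(np.abs(expm(A) - np.linalg.inv(S) @ expm(Atilde) @ S)))   # near machine precision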

Chapter 3

The Finite Element Method

Elementary, my dear Watson! One of the best known phrases that Sherlock Holmes never said. https://quoteinvestigator.com/2016/07/14/watson/

3.1 Everything Is Variational

The finite element method (FEM) belongs to the broad class of variational methods; so it is natural to start this chapter with an introduction and overview of such methods. This section emphasizes the importance of the variational approach to computation: It can be claimed—with only a small bit of exaggeration—that all numerical methods are variational. To understand why, let us consider the Poisson equation in one, two or three dimensions as a model problem:

$Lu \equiv -\nabla^2 u = \rho \qquad \text{in } \Omega$

(3.1)

This equation describes, for example, the distribution of the electrostatic potential u corresponding to the volume charge density ρ if the dielectric permittivity is normalized to unity.1 Solution u is sought in a functional space V(Ω) containing functions with a certain level of smoothness and satisfying some prescribed conditions on the boundary of domain Ω; let us assume zero Dirichlet conditions for definiteness.

1 The Poisson equation (3.1) is written in the SI system. In the Gaussian system, the right-hand side is 4πρ.

For purposes of

this introduction, the precise mathematical details about the level of smoothness of the right-hand side ρ and the boundary of the 2D or 3D domain Ω are not critical, and I mention them only as a footnote.2 It is important to appreciate that solution u has infinitely many “degrees of freedom”—in mathematical terms, it lies in an infinite-dimensional functional space. In contrast, any numerical solution can only have a finite number of parameters. A general and natural form of such a solution is a linear combination of a finite number n of linearly independent approximating functions $\psi_\alpha \in V(\Omega)$:

$u_{\mathrm{num}}(\mathbf{r}) = \sum_{\alpha=1}^{n} c_\alpha \psi_\alpha(\mathbf{r})$   (3.2)

where $\mathbf{r}$ is the position vector, and $c_\alpha$ are some coefficients (in the example, real; for other problems, these coefficients could be complex). We may have in mind a set of polynomial functions as a possible example of $\psi_\alpha$ ($\psi_1 = 1$, $\psi_2 = x$, $\psi_3 = y$, $\psi_4 = xy$, $\psi_5 = x^2$, etc., in 2D). One important constraint, however, is that these functions must satisfy the Dirichlet boundary conditions, and so only a subset of polynomials will qualify. One of the distinguishing features of finite element analysis is a special procedure for defining piecewise-polynomial approximating functions. This procedure will be discussed in more detail in subsequent sections. The key question now is: What are the “best” parameters $c_\alpha$ that would produce the most accurate numerical solution (3.2)? Obviously, we first need to define “best.” It would be ideal to have a zero residual

$R \equiv Lu_{\mathrm{num}} - \rho$

(3.3)

in which case the numerical solution would in fact be exact. That being in general impossible, the requirements on R need to be relaxed. While R may not be identically zero, let us require that there be a set of “measures of fitness” of the solution—numbers $f_\beta(R)$—that are zero:

$f_\beta(R) = 0, \qquad \beta = 1, 2, \ldots, n$

(3.4)

It is natural to have the number of these measures, i.e. the number of conditions (3.4), equal to the number of undetermined coefficients $c_\alpha$ in expansion (3.2). In mathematical terms, the numbers $f_\beta$ are functionals: Each of them acts on a function (in this case, R) and produces a number $f_\beta(R)$. The functionals can be real or complex, depending on the problem.

2 The domain is usually assumed to have a Lipschitz-continuous boundary; $f \in L_2(\Omega)$, $u \in H^2(\Omega)$, where $L_2$ and $H^2$ are the Lebesgue and Sobolev spaces standard in mathematical analysis. The requirements on the smoothness of u are relaxed in the weak formulation of the problem considered later in this chapter. Henri Léon Lebesgue (1875–1941) was a French mathematician who developed measure and integration theory. Sergei L’vovich Sobolev (1908–1989) was a Russian mathematician, renowned for his work in mathematical analysis (Sobolev spaces, weak solutions and generalized functions).


To summarize: the numerical solution is sought as a linear combination of n approximating functions, with n unknown coefficients; to determine these coefficients, one imposes n conditions (3.4). As it is difficult to deal with nonlinear constraints, the functionals $f_\beta$ are almost invariably chosen as linear.

Example 1 Consider the 1D Poisson equation with the right-hand side ρ(x) = cos x over the interval [−π/2, π/2]:

$-\dfrac{d^2 u}{dx^2} = \cos x, \qquad u\!\left(-\dfrac{\pi}{2}\right) = u\!\left(\dfrac{\pi}{2}\right) = 0$   (3.5)

The obvious exact solution is $u^*(x) = \cos x$. Let us find a numerical solution using the ideas outlined above. Let the approximating functions $\psi_\alpha$ be polynomials in x. To keep the calculation as simple as possible, the number of approximating functions in this example will be limited to two only. Linear polynomials (except for the one identically equal to zero) do not satisfy the zero Dirichlet boundary conditions and hence are not included in the approximating set. As the solution must be an even function of x, a sensible (but certainly not unique) choice of the approximating functions is

$\psi_1 = \left(x - \dfrac{\pi}{2}\right)\left(x + \dfrac{\pi}{2}\right), \qquad \psi_2 = \left(x - \dfrac{\pi}{2}\right)^2 \left(x + \dfrac{\pi}{2}\right)^2$   (3.6)

The numerical solution is thus

$u_{\mathrm{num}} = u_1 \psi_1 + u_2 \psi_2$

(3.7)

Here $\underline{u}$ is a Euclidean coefficient vector in $\mathbb{R}^2$ with components $u_{1,2}$. Euclidean vectors are underlined to distinguish them from functions of spatial variables. The residual (3.3) then is

$R = -u_1 \psi_1'' - u_2 \psi_2'' - \cos x$

(3.8)

As a possible example of “fitness measures” of the solution, consider two functionals that are defined as the values of R at the points x = 0 and x = π/4:3

$f_1(R) = R(0); \qquad f_2(R) = R\!\left(\dfrac{\pi}{4}\right)$   (3.9)

With this choice of the test functionals, residual R, while not zero everywhere (which would be ideal but ordinarily not achievable), is forced by conditions (3.4) to be zero at least at the points x = 0 and x = π/4. Furthermore, due to the symmetry of the problem, R will automatically be zero at x = −π/4 as well; this extra point comes as a bonus in this example. Finally, the residual is zero at the boundary points because

3 It is clear that these functionals are linear. Indeed, to any linear combination of two different Rs there corresponds a similar linear combination of their pointwise values.


both exact and numerical solutions satisfy the same Dirichlet boundary condition by construction. The reader may recognize functionals (3.9) as Dirac delta functions δ(x) and δ(x − π/4), respectively. The use of Dirac deltas as test functionals in variational methods is known as collocation; the value of the residual is forced to be zero at a certain number of “collocation points”—in this example, two: x = 0 and x = π/4. The two functionals (3.9), applied to residual (3.8), produce a system of two equations with two unknowns $u_{1,2}$:

$-u_1 \psi_1''(0) - u_2 \psi_2''(0) - \cos 0 = 0$
$-u_1 \psi_1''(\pi/4) - u_2 \psi_2''(\pi/4) - \cos(\pi/4) = 0$

In matrix–vector form, this system is

$L\underline{u} = \underline{\rho}, \qquad L = -\begin{pmatrix} \psi_1''(0) & \psi_2''(0) \\ \psi_1''(\tfrac{\pi}{4}) & \psi_2''(\tfrac{\pi}{4}) \end{pmatrix}; \qquad \underline{\rho} = \begin{pmatrix} \cos 0 \\ \cos\tfrac{\pi}{4} \end{pmatrix} = \begin{pmatrix} 1 \\ \tfrac{\sqrt{2}}{2} \end{pmatrix}$   (3.10)

It is not difficult to see that for an arbitrary set of approximating functions ψ and test functionals f the entry $L_{\alpha\beta}$ of this matrix is $f_\alpha(L\psi_\beta)$. In the present example, with the approximating functions chosen as (3.6), matrix L is easily calculated to be

$L \approx \begin{pmatrix} -2 & 9.869604 \\ -2 & 2.467401 \end{pmatrix}$

with seven digits of accuracy. The vector of expansion coefficients then is

$\underline{u} \approx \begin{pmatrix} -0.3047378 \\ 0.03956838 \end{pmatrix}$

With these values of the coefficients, and with the approximating functions of (3.6), the numerical solution becomes (Fig. 3.1)

$u_{\mathrm{num}} \approx -0.3047378 \left(x - \dfrac{\pi}{2}\right)\left(x + \dfrac{\pi}{2}\right) + 0.03956838 \left(x - \dfrac{\pi}{2}\right)^2 \left(x + \dfrac{\pi}{2}\right)^2$   (3.11)

Fig. 3.1 Solution by collocation (3.11) in Example 1 (solid line) is almost indistinguishable from the exact solution $u^* = \cos x$ (markers). See also error plot in Fig. 3.2

The numerical error is shown in Fig. 3.2, and its absolute value is in the range of (3 ÷ 8) × 10⁻³. The energy norm of this error is ∼0.0198. (Energy norm is defined as

$\|w\|_E = \left( \int_{-\pi/2}^{\pi/2} \left(\dfrac{dw}{dx}\right)^2 dx \right)^{1/2}$   (3.12)

for any differentiable function w(x) satisfying the Dirichlet boundary conditions.)4 Given that the numerical solution involves only two approximating functions with only two free parameters, the result certainly appears to be remarkably accurate.5 This example, with its more than satisfactory end result, is a good first illustration of variational techniques. Nevertheless the approach described above is difficult to turn into a systematic and robust methodology, for the following reasons:

1. The approximating functions and test functionals (more specifically, the collocation points) have been chosen in an ad hoc way; no systematic strategy is apparent from the example.
2. It is difficult to establish convergence of the numerical solution as the number of approximating functions increases, even if a reasonable way of choosing the approximating functions and collocation points is found.
3. As evident from (3.10), the approximating functions must be twice differentiable. This may be too strong a constraint. It will become apparent in the subsequent sections of this chapter that the smoothness requirements can be relaxed.

The following example (Example 2) addresses the convergence issue and produces an even better numerical solution for the 1D Poisson equation considered above. The Finite Element Method covered in the remainder of this chapter provides an elegant framework for resolving all three matters on the list.

Example 2 Let us consider the same Poisson equation as in the previous example and the same approximating functions $\psi_{1,2}$ (3.6). However, the test functionals $f_{1,2}$ are now chosen in a different way:

$f_\alpha(R) = \int_{-\pi/2}^{\pi/2} R(x)\, \psi_\alpha(x)\, dx$   (3.13)

4 In a more rigorous mathematical context, w would be treated as a function in the Sobolev space $H_0^1[-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$, but for the purposes of this introduction this is of little consequence.
5 Still, an even better numerical solution will be obtained in the following example (Example 2).


Fig. 3.2 Error of solution by collocation (3.11) in Example 1. (Note the $10^{-3}$ scaling factor)

In contrast with collocation, these functionals “measure” weighted averages rather than pointwise values of R.6 Note that the weights are taken to be exactly the same as the approximating functions ψ; this choice signifies the Galerkin method. Substituting R(x) (3.8) into the Galerkin equations (3.13), we obtain a linear system

$L\underline{u} = \underline{\rho}, \qquad L_{\alpha\beta} = -\int_{-\pi/2}^{\pi/2} \psi_\beta''(x)\, \psi_\alpha(x)\, dx; \qquad \rho_\alpha = \int_{-\pi/2}^{\pi/2} \rho(x)\, \psi_\alpha(x)\, dx$   (3.14)

Notably, the expression for the matrix entries $L_{\beta\alpha}$ can be made more elegant using integration by parts and taking into account the zero boundary conditions:

$L_{\alpha\beta} = \int_{-\pi/2}^{\pi/2} \psi_\alpha'(x)\, \psi_\beta'(x)\, dx$   (3.15)

This reveals the symmetry of the system matrix. The symmetry is due to two factors: (i) the operator L of the problem—in this case, the Laplacian in the space of functions with zero Dirichlet conditions—is self-adjoint; this allowed the transformation of the integrand to the symmetric form; (ii) the Galerkin method was used. The Galerkin integrals in the expressions for the system matrix (3.15) and the right-hand side (3.14) can be calculated explicitly7:

$L = \dfrac{\pi^3}{105} \begin{pmatrix} 35 & -7\pi^2 \\ -7\pi^2 & 2\pi^4 \end{pmatrix}; \qquad \underline{\rho} = \begin{pmatrix} -4 \\ 48 - 4\pi^2 \end{pmatrix}$   (3.16)

6 Loosely speaking, collocation can be viewed as a limiting case of weighted averaging, with the weight concentrated at one point as the Dirac delta.
7 In more complicated cases, numerical quadratures may be needed.


Fig. 3.3 Error of the Galerkin solution (3.7) in Example 2. (Note the $10^{-4}$ scaling factor—about an order of magnitude better than for collocation in Fig. 3.2)

Naturally, this matrix is different from the matrix in the collocation method of the previous example (albeit denoted with the same symbol). In particular, the Galerkin matrix is symmetric, while the collocation matrix is not. The expansion coefficients in the Galerkin method are

$\underline{u} = L^{-1}\underline{\rho} = \dfrac{1}{\pi^7} \begin{pmatrix} -60\pi^2(3\pi^2 - 28) \\ -840(\pi^2 - 10) \end{pmatrix} \approx \begin{pmatrix} -0.3154333 \\ 0.03626545 \end{pmatrix}$

The numerical values of these coefficients differ slightly from the ones obtained by collocation in the previous example. The Galerkin solution is

$u_{\mathrm{num}} \approx -0.3154333 \left(x - \dfrac{\pi}{2}\right)\left(x + \dfrac{\pi}{2}\right) + 0.03626545 \left(x - \dfrac{\pi}{2}\right)^2 \left(x + \dfrac{\pi}{2}\right)^2$   (3.17)

The error of solution (3.17) is plotted in Fig. 3.3; it is seen to be substantially smaller than the error for collocation. Indeed, the energy norm of this error is ∼0.004916, which is about four times less than the same error measure for collocation. The higher accuracy of the Galerkin solution (at least in the energy norm) is not an accident. The following section shows that the Galerkin solution in fact minimizes the energy norm of the error; in that sense, it is the “best” of all possible numerical solutions representable as a linear combination of a given set of approximating functions ψ.
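The two examples above are easy to reproduce numerically. The sketch below (an illustration, not from the book; SciPy quadrature is an implementation choice) builds both 2 × 2 systems for the basis (3.6) and compares the energy norms of the errors:

import numpy as np
from scipy.integrate import quad

a = np.pi / 2
psi   = [lambda x: x**2 - a**2,   lambda x: (x**2 - a**2)**2]
dpsi  = [lambda x: 2*x,           lambda x: 4*x*(x**2 - a**2)]
d2psi = [lambda x: 2.0 + 0*x,     lambda x: 12*x**2 - np.pi**2]

def energy_norm_of_error(u):
    # derivative of (u_num - cos x) is u1*psi1' + u2*psi2' + sin x
    err2 = lambda x: (u[0]*dpsi[0](x) + u[1]*dpsi[1](x) + np.sin(x))**2
    return np.sqrt(quad(err2, -a, a)[0])

# Collocation (Example 1): zero residual at x = 0 and x = pi/4
pts = [0.0, np.pi/4]
Lc = np.array([[-d2psi[j](x) for j in range(2)] for x in pts])
rc = np.array([np.cos(x) for x in pts])
uc = np.linalg.solve(Lc, rc)

# Galerkin (Example 2): L_ab = int psi_a' psi_b' dx, rho_a = int cos(x) psi_a dx
Lg = np.array([[quad(lambda x, i=i, j=j: dpsi[i](x)*dpsi[j](x), -a, a)[0]
                for j in range(2)] for i in range(2)])
rg = np.array([quad(lambda x, i=i: np.cos(x)*psi[i](x), -a, a)[0] for i in range(2)])
ug = np.linalg.solve(Lg, rg)

print(uc, energy_norm_of_error(uc))   # approx [-0.3047, 0.0396], error ~0.020
print(ug, energy_norm_of_error(ug))   # approx [-0.3154, 0.0363], error ~0.005

The printed values should match, to plotting accuracy, the coefficients and energy-norm errors quoted in the text, with the Galerkin error roughly four times smaller.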

3.2 The Weak Formulation and the Galerkin Method

In this section, the variational approach outlined above is cast in a more general and precise form; however, it does make sense to keep the last example (Example 2) in mind for concreteness. Let us consider a generic problem of the form


$Lu = \rho, \qquad u \in V = V(\Omega)$

(3.18)

of which the Poisson equation (3.1) is a simple particular case. Here operator L is assumed to be self-adjoint with respect to a given inner product (· , ·) in the functional space V under consideration: (Lu, v) = (u, Lv), ∀u, v ∈ V

(3.19)

The reader unfamiliar with the notion of inner product may view it just as a shorthand notation for integration:

$(w, v) \equiv \int_\Omega w\, v\, d\Omega$

This definition is not general8 but sufficient in the context of this section. Note that operators defined in different functional spaces (or, more generally, in different domains) are mathematically different, even if they can be described by the same expression. For example, the Laplace operator in a functional space with zero boundary conditions is not the same as the Laplace operator in a space without such conditions. One manifestation of this difference is that the Laplace operator is self-adjoint in the first case but not so in the second. Applying to the operator equation (3.18) inner product with an arbitrary function v ∈ V (in the typical case, multiplying both sides with v and integrating), we obtain (Lu, v) = (ρ, v), ∀v ∈ V

(3.20)

Clearly, this inner-product equation follows from the original one (3.18). At the same time, because v is arbitrary, it can be shown under fairly general mathematical assumptions that the converse is true as well: Original equation (3.18) follows from (3.20); that is, these two formulations are equivalent (see also Sect. 3.4). The left-hand side of (3.20) is a bilinear form in u, v; in addition, if L is self-adjoint, this form is symmetric. This bilinear form will be denoted as L(u, v) (making symbol L slightly overloaded):

$L(u, v) \equiv (Lu, v), \qquad \forall v \in V$

(3.21)

To illustrate this definition: in Examples 1, 2 this bilinear form is

$L(u, v) \equiv -\int_{-\pi/2}^{\pi/2} u''\, v\, dx = \int_{-\pi/2}^{\pi/2} u'\, v'\, dx$   (3.22)

8 Generally, inner product is a bilinear (sesquilinear in the complex case) (conjugate-)symmetric positive definite form.


Fig. 3.4 Rounding off the corner provides a smooth approximation

The last integration-by-parts transformation appears innocuous but has profound consequences. It replaces the second derivative of u with the first derivative, thereby relaxing the required level of smoothness of the solution. The following illustration is simple but instructive. Let u be a function with a “sharp corner”—something like |x| in Fig. 3.4: it has a discontinuous first derivative and no second derivative (in the sense of regular calculus) at x = 0. However, this function can be approximated, with an arbitrary degree of accuracy, by a smooth one—it is enough just to “round off” the corner. “Accuracy” here is understood in the energy-norm sense: If the smoothed function is denoted with $\tilde{u}$, then the approximation error is

$\|\tilde{u} - u\|_E \equiv \left( \int_\Omega \left( \dfrac{d\tilde{u}}{dx} - \dfrac{du}{dx} \right)^2 dx \right)^{1/2}$   (3.23)

where the precise specification of the domain (segment) Ω is unimportant. For the smooth function $\tilde{u}$, both expressions for the bilinear form (3.21) are valid and equal. For u, the first definition, containing u″ in the integrand, is not valid, but the second one, with u′, is. It is quite natural to extend the definition of the bilinear form to functions that, while not necessarily smooth enough themselves, can be approximated arbitrarily well—in the energy norm sense—by smooth functions:

$L(u, v) \equiv \int_\Omega \dfrac{du}{dx}\, \dfrac{dv}{dx}\, d\Omega, \qquad u, v \in H_0^1(\Omega)$   (3.24)
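The “corner rounding” argument can also be checked numerically. In the sketch below (an illustration, not from the book; the interval, the smoothing family and the ε values are assumptions), |x| is approximated by √(x² + ε²) on [−1, 1] and the energy-norm error (3.23) is evaluated for decreasing ε:

import numpy as np
from scipy.integrate import quad

def energy_error(eps):
    # derivative of sqrt(x^2 + eps^2) minus derivative of |x|
    integrand = lambda x: (x / np.sqrt(x**2 + eps**2) - np.sign(x))**2
    return np.sqrt(quad(integrand, -1.0, 1.0, points=[0.0])[0])

for eps in (0.1, 0.01, 0.001):
    print(eps, energy_error(eps))   # the energy-norm error shrinks as eps -> 0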

Such functions form the Sobolev space $H^1(\Omega)$. The subspace $H_0^1(\Omega) \subset H^1(\Omega)$ contains functions with zero Dirichlet conditions at the boundary of domain Ω.9 Similarly, for the electrostatic equation (with the dielectric permittivity normalized to unity)

9 A rigorous mathematical characterization of “boundary values” (more precisely, traces) of functions in Sobolev spaces is outside the scope of this book. See R. A. Adams [AF03] or K. Rektorys [Rek80].


Lu ≡ − ∇ · ∇u = ρ

(3.25)

in a two- or three-dimensional domain Ω with zero Dirichlet boundary conditions,10 the weak formulation is

$L(u, v) \equiv (\nabla u, \nabla v) = (\rho, v), \qquad u, v \in H_0^1(\Omega)$   (3.26)

where parentheses denote the $L_2^3(\Omega)$ inner product of vector functions or the $L_2(\Omega)$ inner product of scalar functions, depending on the context:

$(\nabla u, \nabla v) \equiv \int_\Omega \nabla u \cdot \nabla v\, d\Omega$   (3.27)

$(\rho, v) \equiv \int_\Omega \rho\, v\, d\Omega$   (3.28)



The analysis leading to the weak formulation (3.26) is analogous to the 1D case: The differential equation is inner-multiplied (i.e. multiplied and integrated) with a “test” function v; then integration by parts moves one of the ∇ operators over from u to v, so that the formulation can be extended to a broader class of admissible functions, with the smoothness requirements relaxed. The weak formulation (3.20) (of which (3.26) is a typical example) provides a very natural way of approximating the problem. All that needs to be done is to restrict both the unknown function u and the test function v in (3.20) to a finite-dimensional subspace $V_h \subset V$:

$L(u_h, v_h) = (\rho, v_h), \qquad \forall v_h \in V_h(\Omega)$   (3.29)

In Examples 1 and 2 space $V_h$ had just two dimensions; in engineering practice, the dimension of this space can be on the order of hundreds of thousands and even millions. Also in practice, construction of $V_h$ typically involves a mesh (this was not the case in Examples 1 and 2, but will be the case in the subsequent sections in this chapter); then subscript “h” indicates the mesh size. If a mesh is not used, h can be understood as some small parameter; in fact, one usually has in mind a family of spaces $V_h$ that can approximate the solution of the problem with arbitrarily high accuracy as h → 0. Let us assume that an approximating space $V_h$ of dimension n has been chosen and that $\psi_\alpha$ (α = 1, …, n) is a known basis set in this space. Then, the approximate solution is a linear combination of the basis functions:

$u_h = \sum_{\alpha=1}^{n} u_\alpha \psi_\alpha$   (3.30)

10 Neumann conditions on the domain boundary and interface boundary conditions between different media will be considered later.


Here, $\underline{u}$ is a Euclidean vector of coefficients in $\mathbb{R}^n$ (or, in the case of problems with complex solutions, in $\mathbb{C}^n$). This expansion establishes an intimate relationship between the functional space $V_h$ to which $u_h$ belongs and the Euclidean space of coefficient vectors $\underline{u}$. If functions $\psi_\alpha$ are linearly independent, there is a one-to-one correspondence between $u_h$ and $\underline{u}$. Moreover, the bilinear form $L(u_h, u_h)$ induces an equivalent bilinear form over Euclidean vectors:

$(L\underline{u}, \underline{v}) = L(u_h, v_h)$   (3.31)

for any two functions $u_h, v_h \in V_h$ and their corresponding Euclidean vectors $\underline{u}, \underline{v} \in \mathbb{R}^n$. The left-hand side of (3.31) is the usual Euclidean inner product of vectors, and L is a square matrix. From basic linear algebra, each entry $L_{\alpha\beta}$ of this matrix is equal to $(Le_\alpha, e_\beta)$, where $e_\alpha$ is column #α of the identity matrix (the only nonzero entry #α is equal to one); similarly for $e_\beta$. At the same time, $(Le_\alpha, e_\beta)$ is, by definition of L, equal to the bilinear form involving $\psi_\alpha, \psi_\beta$; hence

$L_{\alpha\beta} = (Le_\alpha, e_\beta) = L(\psi_\alpha, \psi_\beta)$   (3.32)

∀u ∈ V (c > 0)

(3.33)

where c is a constant, then matrix L is strictly positive definite and, moreover, (Lu, u) ≥ c (Mu, u),

∀u ∈ Rn

(3.34)

Matrix M is such that the Euclidean form (Mu, v) corresponds to the L 2 inner product of the respective functions: (Mu, v) = (u h , vh )

(3.35)

Mαβ = (ψα , ψβ )

(3.36)

so that the entries are

These expressions for matrix M are analogous to expressions (3.31) and (3.32) for matrix L. In FEM, M is often called the mass matrix and L—the stiffness matrix, due to the roles they play in problems of structural mechanics where FEM originated.

80

3 The Finite Element Method

Example 3 To illustrate the connection between Euclidean inner products and the respective bilinear forms of functions, let us return to Example 2 and choose the two coefficients arbitrarily as u 1 = 2, u 2 = −1. The corresponding function is   π π 2  π 2 π x+ − x− u h = u 1 ψ1 + u 2 ψ2 = 2 x − x+ 2 2 2 2

(3.37)

This function of course lies in the two-dimensional space Vh spanned by ψ1,2 . Similarly, let v 1 = 4, v 2 = −3 (also as an arbitrary example); then   π π 2  π 2 π x+ −3 x− x+ (3.38) v h = v 1 ψ1 + v 2 ψ2 = 4 x − 2 2 2 2 In the left-hand side of (3.31), matrix L was calculated to be (3.16), and the Euclidean inner product is  (Lu, v) =

π3 105



35 −7π 2

− 7π 2 − 2π 4

The right-hand side of (3.31) is





π 2 π 2

   2π 5 2π 7 8π 3 2 4 + + , = −1 −3 3 3 35 (3.39)

u h vh d x

where functions u h , vh are given by their expansions (3.37), (3.38). Substitution of these expansions into the integrand above yields exactly the same result as the right-hand side of (3.39), namely 2π 5 2π 7 8π 3 + + 3 3 35 This illustrates that the Euclidean inner product of vectors u, v in (3.31) (of which the left-hand side of (3.39) is a particular case) is equivalent to the bilinear form L(u, v) of functions u, v (of which the right hand side of (3.39) is a particular case). By setting vh consecutively to ψ1 , ψ2 , . . ., ψn in (3.29), one arrives at the following matrix–vector form of the variational formulation (3.29): Lu = ρ

(3.40)

with L αβ = L(ψα , ψβ );

ρα = (ρ, ψα )

This is a direct generalization of system (3.14).

(3.41)

3.3 Variational Methods and Minimization

81

3.3 Variational Methods and Minimization 3.3.1 The Galerkin Solution Minimizes the Error The analysis in this section is restricted to operator L that is self-adjoint in a given functional space V , and the corresponding symmetric (conjugate-symmetric in the complex case) form L(u, v). In addition, if L(u, u) ≥ c(u, u), ∀u ∈ V

(3.42)

for some positive constant c, the form is called elliptic (or, synonymously, coercive). The weak continuous problem is L(u, v) = (ρ, v), u ∈ V ; ∀v ∈ V

(3.43)

We shall assume that this problem has a unique solution u ∗ ∈ V and shall refer to u as the exact solution (as opposed to a numerical one). Mathematical conditions for the existence and uniqueness are cited in Sect. 3.5. The numerical Galerkin problem is obtained by restricting this formulation to a finite-dimensional subspace Vh ⊂ V : ∗

L(u h , vh ) = (ρ, vh ), u h ∈ Vh ; ∀vh ∈ Vh

(3.44)

where u h is the numerical solution. Keep in mind that u h solves the Galerkin problem in the finite-dimensional subspace Vh only; in the full space V , there is, in general, a nonzero residual R(u h , v) ≡ (ρ, v) − L(u h , v) = , v ∈ V

(3.45)

In matrix–vector form, this problem is Lu = ρ

(3.46)

with matrix L and the right-hand side ρ defined in (3.41). If matrix L is nonsingular, a unique numerical solution exists. For an elliptic form L – a particularly important case in theory and practice—matrix L is positive definite and hence nonsingular. The numerical error is (3.47) h = u h − u A remarkable property of the Galerkin solution for a symmetric form L is that it minimizes the error functional E(u h ) ≡ L(h , h ) ≡ L(u h − u, u h − u)

(3.48)

82

3 The Finite Element Method

In other words, of all functions in the finite-dimensional space Vh , the Galerkin solution u hG is the best approximation of the exact solution in the sense of measure (3.48). For coercive forms L, this measure usually has the physical meaning of energy. To prove this minimization property, let us analyze the behavior of functional (3.48) in the vicinity of some u h —that is, examine E(u h + λvh ), where vh ∈ Vh is an increment and λ is an adjustable numerical factor introduced for mathematical convenience. (This factor could be absorbed into vh but, as will soon become clear, it makes sense not to do so. Also, λ can be intuitively understood as “small” but this has no bearing on the formal analysis.) Then, assuming a real form for simplicity, E(u h + λvh ) = L(h + λvh , h + λvh ) = L(h , h ) + 2λL(h , vh ) + λ2 L(vh , vh )

(3.49) At a stationary point of E—and in particular at a maximum or minimum—the term linear in λ must vanish: L(h , vh ) = 0, ∀vh ∈ Vh This condition is nothing other than L(u h , vh ) = L(u, vh ) = ( f, vh ) (The last equality follows from the fact that u is the solution of the weak problem.) This is precisely the Galerkin equation. Thus, the Galerkin solution is a stationary point of functional (3.48). If the bilinear form L is elliptic, expression (3.49) for the variation of the energy functional then indicates that this stationary point is in fact a minimum: The term linear in λ vanishes and the quadratic term is positive for a nonzero vh .

3.3.2 The Galerkin Solution and the Energy Functional Error minimization (in the energy norm sense) is a significant strength of the Galerkin method. A practical limitation of the error functional (3.48), however, is that it cannot be computed explicitly: This functional depends on the exact solution that is unknown. At the same time, for self-adjoint problems, there is another—and computable—functional for which both the exact solution (in the original functional space V ) and the numerical solution (in the chosen finite-dimensional space Vh ) are stationary points. This functional is Fu = (ρ, u) −

1 L(u, u), u ∈ V 2

(3.50)

Indeed, for an increment λv, where λ is an arbitrary number and v ∈ V , we have F ≡ F (u + λv) − F u = (ρ, λv) −

1 1 1 L(λv, u) − L(u, λv) − L(λv, λv) 2 2 2

3.3 Variational Methods and Minimization

83

which for a symmetric real form L is F = λ[(ρ, v) − L(u, v)] −

1 2 λ L(v, v) 2

The zero linear term in λ thus corresponds precisely to the weak formulation of the problem. By a very similar argument, the Galerkin solution is a stationary point of F in Vh . Furthermore, if the bilinear form L is elliptic, the quadratic term λ2 L(v, v) is nonnegative, and the stationary point is a maximum. In electrostatics, magnetostatics and other physical applications, functional F is often interpreted as energy. It is indeed equal to field energy if u is the exact solution of the underlying differential equation (or, almost equivalently, of the weak problem). Other values of u are not physically realizable, and hence F in general lacks physical significance as energy and should rather be interpreted as “action” (an integrated Lagrangian). It is not therefore paradoxical that the solution maximizes— not minimizes—the functional.11 This matter is taken up again in Sect. 6.11 and in Appendix 6.14 in the context of electrostatic simulation. Functional F (3.50) is part of a broader picture of complementary variational principles; see the book by A. M. Arthurs [Art80] (in particular, examples in Sect. 1.4 of his book12 ).

3.4 Essential and Natural Boundary Conditions So far, for brevity of exposition, only Dirichlet conditions on the exterior boundary of the domain were considered. Now let us turn our attention to quite interesting, and in practice very helpful, circumstances that arise if conditions on part of the boundary are left unspecified in the weak formulation. We shall use the standard electrostatic equation in 1D, 2D or 3D as a model: − ∇ · ∇u = ρ in ; u = 0 on ∂ D ⊂ ∂

(3.51)

At first, the dielectric permittivity  will be assumed a smooth function of coordinates; later, we shall consider the case of piecewise-smooth  (e.g. dielectric bodies in a host medium). Note that u satisfies the zero Dirichlet condition only on part of the domain boundary; the condition on the remaining part is left unspecified for now, so the boundary value problem is not yet fully defined. 11 One could reverse the sign of F , in which case the stationary point would be a minimum. However,

this functional would no longer have the meaning of field energy, as its value at the exact solution u would be negative, which is thermodynamically impossible for electromagnetic energy (see L. D. Landau and E. M. Lifshitz [LLP84]. 12 A note for the reader interested in the Arthurs book and examples therein. In the electrostatic case, the quantities in these examples are interpreted as follows: U ≡ D (the electrostatic displacement field), v ≡  (the permittivity), Φ = u (potential), q ≡ ρ (charge density).

84

3 The Finite Element Method

The weak formulation is u, ∀v ∈ H01 (, ∂ D )

(∇u, ∇v) = (ρ, v),

(3.52)

H01 (, ∂ D ) is the Sobolev space of functions that have a generalized derivative and satisfy the zero Dirichlet condition on ∂ D .13 Let us now examine, a little more carefully than we did before, the relationship between the weak problem (3.52) and the differential formulation (3.51). To convert the weak problem into a strong one, one integrates the left-hand side of (3.52) by parts:  ∂u v d S − (∇ · ∇u, v) = (ρ, v) (3.53) ∂n ∂−∂ D It is tacitly assumed that u is such that the differential operator ∇ · ∇u makes sense. Note that the surface integral is taken over the non-Dirichlet part of the boundary only, as the “test” function v vanishes on the Dirichlet part by definition. The key observation is that v is arbitrary. First, as a particular choice, let us consider test functions v vanishing on the domain boundary. In this case, the surface integral in (3.53) disappears, and we have  (∇ · ∇u + ρ, v) ≡



v(∇ · ∇u + ρ) d = 0

(3.54)

This may hold true for arbitrary v only if the integrand I ≡ ∇ · ∇u + ρ

(3.55)

in (3.54) is identically zero. The proof, at least for continuous I , is simple. Indeed, if I were strictly positive at some point r0 inside the domain, it would, by continuity, have to be positive in some neighborhood of that point. By choosing the test function that is positive in the same neighborhood and zero elsewhere (imagine a sharp but smooth peak centered at r0 as such a test function), one arrives at a contradiction, as the integral in (3.54) is positive rather than zero. This argument shows that the Poisson equation must be satisfied for the solution u of the weak problem. Further observations can be made if we now consider a test function that is nonzero on the non-Dirichlet part of the boundary. In the integratedby-parts weak formulation (3.53), the volume integrals, as we now know, must vanish if u is the solution, because the Poisson equation is satisfied. Then, we have

13 These are functions that are either smooth themselves or can be approximated by smooth functions, in the H 1 -norm sense, with any degree of accuracy. Boundary values, strictly speaking, should be considered in the sense of traces (R. A. Adams and J. J. F. Fournier [AF03], K. Rektorys [Rek80]).

3.4 Essential and Natural Boundary Conditions

 ∂−∂ D

v

85

∂u dS = 0 ∂n

(3.56)

Since v is arbitrary, the integrand must be identically zero – the proof is essentially the same as for the volume integrand I in (3.55). We come to the conclusion that solution u must satisfy the Neumann boundary condition ∂u = 0 ∂n

(3.57)

on the non-Dirichlet part of the domain boundary (for  = 0). This is really a notable result. In the weak formulation, if no boundary condition is explicitly imposed on part of the boundary, then the solution will satisfy the Neumann condition. Such “automatic” boundary conditions that follow from the weak formulation are called natural. In contrast, conditions that have to be imposed explicitly are called essential. Dirichlet conditions are essential. For cases other than the model electrostatic problem, a similar analysis is needed to identify natural boundary conditions. As a rule of thumb, conditions containing the normal derivative at the boundary are natural. For example, Robin boundary conditions (a combination of values of u and its normal derivative) are natural. Importantly, the continuity of flux  ∂u/∂n across material interfaces is also a natural condition. The analysis is similar to that of the Neumann condition. Indeed, let Γ be the boundary between materials #1,2 with their respective parameters 1,2 . Separately within each material,  varies smoothly, but a jump may occur across Γ . With the weak problem (3.52) taken as a starting point, integration by parts yields  

  ∂u ∂u d S − (∇ · ∇u, v) = (ρ, v) [. . .] + v  −  ∂n 1 ∂n 2 ∂−∂ D Γ (3.58) Subscripts 1 and 2 indicate that the respective electric flux density  ∂u/∂n is taken in materials 1, 2; n is the unit normal to Γ , directed into material #2 (this choice of direction is arbitrary). The integrand on the exterior boundary is omitted for brevity, as it is the same as considered previously and leads, as we already know, to the Neumann boundary condition on  − ∂ D . Consider first the volume integrals (inner products) in (3.58). Using the fact that v is arbitrary, one can show in exactly the same way as before that the electrostatic differential equation must be satisfied throughout the domain, except possibly for the interface boundary where the differential operator may not be valid in the sense of ordinary calculus. Turning then to the surface integral over Γ and again noting that v is arbitrary on that surface, one observes that the integrand—i.e. the flux jump—across the surface must be zero if u is the solution of the weak problem. This is a great practical advantage because no special treatment of material interfaces is needed. For the model electrostatic problem, the finite element algorithm for heterogeneous media is essentially the same as for the homogeneous case. However, 



86

3 The Finite Element Method

for more complicated problems interface conditions may need special treatment and may result in additional surface integrals.14 It is in principle possible to impose natural conditions explicitly – that is, incorporate them into the definition of the functional space and choose the approximating and test functions accordingly. However, this is usually inconvenient and redundant, and therefore is hardly ever done in practice.

3.5 Mathematical Notes: Convergence, Lax–Milgram and Céa’s Theorems This section summarizes some essential facts about weak formulations and convergence of Galerkin solutions. The mathematical details and proofs are omitted, one exception being a short proof of Céa’s theorem. There are many excellent books on the mathematical theory: an elaborate exposition of variational methods by K. Rektorys [Rek80] and by S. G. Mikhlin [Mik64, Mik65], as well as the wellknown text by R. Weinstock [Wei74]; classical monographs on FEM by P. G. Ciarlet [Cia80], by B. Szabó and I. Babuška [SB91], and a more recent book by S. C. Brenner and L. R. Scott [BS02], among others. Those readers who are not interested in the mathematical details may skip this digest of the underlying mathematical theory without substantial harm to their understanding of the rest of the chapter. Theorem 2 (Lax–Milgram [BS02, Rek80]) Given a Hilbert space V , a continuous and elliptic bilinear form L(· , ·) and a continuous linear functional f ∈ V  , there exists a unique u ∈ V such that L(u, v) = f (v),

∀v ∈ V

(3.59)

As a reminder, a bilinear form is elliptic if L(u, u) ≥ c1 (u, u),

∀u ∈ V

and continuous if L(u, v) ≤ c2 u v ,

∀u, v ∈ V

for some positive constants c1,2 . Here, the norm is induced by the inner product: 1

v ≡ (v, v) 2

(3.60)

14 One interesting example is a hybrid formulation of eddy current problems, with the magnetic vector potential inside a conducting body and the magnetic scalar potential outside. The weak formulation contains a surface integral on the boundary of the conductor. The interested reader may see C. R. I. Emson and J. Simkin [ES83], D. Rodger [Rod83] for the formulation and my mathematical analysis in [Tsu90].

3.5 Mathematical Notes: Convergence, Lax–Milgram and Céa’s Theorems

87

Finally, in the formulation of the Lax–Milgram theorem, V′ is the space of continuous linear functionals over V. A linear functional is continuous if |f(v)| ≤ c‖v‖, where c is some constant.

Conditions of the Lax–Milgram theorem correspond to the weak formulations of many problems of mathematical physics, including the model electrostatic problem of the previous section. The Lax–Milgram theorem establishes uniqueness and existence of the (exact) solution of such problems. Under the Lax–Milgram conditions, it is clear that uniqueness and existence also hold in any subspace of V—in particular, for the approximate Galerkin solution.

The Lax–Milgram theorem can be proved easily for symmetric forms. Indeed, if L is symmetric (in addition to its continuity and ellipticity required by the conditions of the theorem), this form represents an inner product in V: [u, v] ≡ L(u, v). Then f(v), being a linear continuous functional, can be expressed, by the Riesz Representation Theorem (one of the basic properties of Hilbert spaces), via this new inner product as f(v) = [u, v] ≡ L(u, v), which is precisely what the Lax–Milgram theorem states. The more complicated proof for nonsymmetric forms is omitted.

Theorem 3 (Céa [BS02, Rek80]) Let V be a subspace of a Hilbert space H and L(·, ·) be a continuous elliptic (but not necessarily symmetric) bilinear form on V. Let u ∈ V be the solution of equation (3.59) from the Lax–Milgram theorem. Further, let u_h be the solution of the Galerkin problem

\[
L(u_h, v_h) \;=\; f(v_h), \qquad \forall v_h \in V_h
\tag{3.61}
\]

in some finite-dimensional subspace V_h ⊂ V. Then

\[
\|u - u_h\| \;\le\; \frac{c_2}{c_1} \, \min_{v \in V_h} \|u - v\|
\tag{3.62}
\]

where c_1 and c_2 are the ellipticity and continuity constants of the bilinear form L.

Céa's theorem is a principal result, as it relates the error of the Galerkin solution to the approximation error. The latter is much more easily amenable to analysis: Good approximation can be produced by various forms of interpolation, while the Galerkin solution emerges from solving a large system of algebraic equations. For a symmetric form L and for the norm induced by L, the constants are c_1 = c_2 = 1 and the Galerkin solution is best in the energy-norm sense, as we already know.

Proof The error of the Galerkin solution is

\[
\epsilon_h \;\equiv\; u_h - u, \qquad u_h \in V_h
\tag{3.63}
\]

where u is the (exact) solution of the weak problem (3.59) and u_h is the solution of the Galerkin problem (3.61). This error itself satisfies a weak problem obtained simply by subtracting the Galerkin equation from the exact one:

\[
L(\epsilon_h, v_h) \;=\; 0, \qquad \forall v_h \in V_h
\tag{3.64}
\]

This can be interpreted as a generalized orthogonality relationship: The error is "L-orthogonal" to V_h. (If L is not symmetric, it does not induce an inner product, so the standard definition of orthogonality does not apply.) Such an interpretation has a clear geometric meaning: The Galerkin solution is a projection (in a proper sense) of the exact solution onto the chosen approximation space. Then we have

\[
L(\epsilon_h, \epsilon_h) \;\equiv\; L(\epsilon_h, u_h - u) \;=\; L(\epsilon_h, u_h - v_h - u) \;\equiv\; L(\epsilon_h, w_h - u), \qquad v_h \in V_h
\]

The first identity is trivial, as it reiterates the definition of the error. The second equality is crucial and is due to the generalized orthogonality (3.64). The last identity is just a variable change, w_h = u_h − v_h. Using now the ellipticity and continuity of the bilinear form, we get

\[
c_1 \|\epsilon_h\|^2 \;=\; c_1 (\epsilon_h, \epsilon_h) \;\le\; L(\epsilon_h, \epsilon_h) \;=\; L(\epsilon_h, w_h - u) \;\le\; c_2 \, \|\epsilon_h\| \, \|w_h - u\|
\]

which, after dividing through by ‖ε_h‖, yields precisely the result of Céa's theorem:

\[
\|\epsilon_h\| \;\le\; \frac{c_2}{c_1} \, \|w_h - u\| \qquad\qquad \square
\]

Céa's theorem simplifies error analysis greatly: It is in general extremely difficult to evaluate the Galerkin error directly because the Galerkin solution emerges as a result of solving a (usually large) system of equations; it is much easier to deal with some good approximation w_h of the exact solution (e.g. via an interpolation procedure). Céa's theorem relates the Galerkin solution error to the approximation error via the stability and continuity constants of the bilinear form. From a practical point of view, Céa's theorem is the source of robustness of the Galerkin method. In fact, the Galerkin method proves to be surprisingly reliable even for non-elliptic forms: although Céa's theorem is silent about that case, a more general result known as the Ladyzhenskaya–Babuška–Brezzi (or just LBB) condition15 is available (O. A. Ladyzhenskaya [Lad69], I. Babuška, [Bab58], F. Brezzi [Bre74]; see also B. Szabó and I. Babuška [SB91], I. Babuška and T. Strouboulis [BS01] and Appendix 3.10.1).

3.6 Local Approximation in the Finite Element Method

The Galerkin method happily resolves (at least for elliptic problems) two of the three shortcomings of collocation listed in Sect. 3.2. Indeed, the way to choose the test functions is straightforward (they are the same as the approximating functions), and Céa's theorem gives an error bound for the Galerkin solution.

15 Occasionally used with some permutations of the names.

The only missing ingredient is a procedure for choosing "good" approximating functions. The Finite Element Method does provide such a procedure, and the following sections explain how it works in one, two and three dimensions.

The guiding principle is local approximation of the solution. This usually makes perfect physical sense. It is true that in certain cases a global approximation over the whole computational domain is effective; these cases usually involve homogeneous media with a smooth distribution of sources or no sources at all, with the field approximated by a Fourier series or a polynomial expansion. In practical problems, however, local geometric and physical features of systems and devices, with the corresponding local behavior of fields and potentials, are typical. Discontinuities at material interfaces, peaks, boundary layers, complex behavior at edges and corners and many other features make it all but impossible to approximate the solution globally.16

Local approximation in FEM is closely associated with a mesh: The computational domain is subdivided into small subdomains, called elements. A large assortment of geometric shapes of elements can be used: Triangular or quadrilateral elements are most common in 2D, tetrahedral and hexahedral in 3D. Note that the term "element" is overloaded: Depending on the context, it may mean just the geometric figure or, in addition to that, the relevant approximating space and degrees of freedom (more about that later). For example, linear and quadratic approximations over a triangle give rise to different finite elements in the sense of FEM, even though the geometric figure is the same.

For illustration, Figs. 3.5, 3.6 and 3.7 present FE meshes for a few particles of arbitrary shapes: the first two of these figures in 2D, and the third one in 3D. The mesh in the second figure (Fig. 3.6) was obtained by global refinement of the mesh in the first figure: Each triangular element was subdivided into four. Mesh refinement can be expected to produce a more accurate numerical solution, albeit at a higher computational cost. Global refinement is not the most effective procedure: A smarter way is to make an effort to identify the areas where the numerical solution is least accurate and refine the mesh there. This idea leads to local adaptive mesh refinement (Sect. 3.13).

Each approximating function in FEM is nonzero only over a small number of adjacent elements and is thus responsible for local approximation without affecting the approximation elsewhere. The following sections explain how this is done.

16 Analytical approximations over homogeneous subdomains, with proper matching conditions at the interfaces of these subdomains, can be a viable alternative but are less general than FEM. One example is the Multiple Multipole Method popular in some areas of high-frequency electromagnetic analysis and optics; see e.g. T. Wriedt (ed.), [Wri99].

Fig. 3.5 Illustrative example of a finite element mesh in 2D

Fig. 3.6 Global refinement of the mesh of Fig. 3.5, with each triangular element subdivided into four by connecting the midpoints of the edges


Fig. 3.7 Example of a finite element mesh in 3D

Fig. 3.8 “Hat” function for first-order 1D elements

3.7 The Finite Element Method in One Dimension

3.7.1 First-Order Elements

In one dimension, the computational domain is a segment [a, b], the mesh is a set of nodes x_0 = a, x_1, ..., x_n = b, and the elements (in the narrow geometric sense) are the segments [x_{i−1}, x_i], i = 1, 2, ..., n. The simplest approximating function is shown in Fig. 3.8 and is commonly called a "hat function" or, much less frequently, a "tent function."17 The hat functions form a convenient basis of the simplest finite element vector space, as discussed in more detail below.

17 About 50 times less, according to Google. "Hut function" also makes some intuitive sense but is used very infrequently.


For notational convenience only, we shall often assume that the grid is uniform; i.e. the grid size h = x_i − x_{i−1} is the same for all nodes i. For nonuniform grids, there are no conceptual changes and only trivial differences in the algebraic expressions. A formal expression for ψ_i on a uniform grid is

\[
\psi_i(x) \;=\;
\begin{cases}
h^{-1}(x - x_{i-1}), & x_{i-1} \le x \le x_i \\
h^{-1}(x_{i+1} - x), & x_i \le x \le x_{i+1} \\
0, & \text{otherwise}
\end{cases}
\tag{3.65}
\]

The hat function ψ_i straddles two adjacent elements (segments) and satisfies the obvious Kronecker delta property on the grid: It is equal to one at x_i and zero at all other nodes. This property is not critical in theoretical analysis but is very helpful in practice. In particular, for any smooth function u(x), piecewise-linear interpolation on the grid can be written simply as the linear combination

\[
u_{\text{interp}}(x) \;=\; \sum_{i=1}^{n} u(x_i)\, \psi_i(x)
\]
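For readers who wish to experiment, here is a minimal MATLAB sketch (an addition to the text; the variable names and the anonymous helper "hat" are illustrative, not from the book) that evaluates the hat functions (3.65) on a uniform grid and assembles the piecewise-linear interpolant of sin x:

a = 0;  b = pi;  n = 10;                   % assumed sample parameters
h = (b - a)/n;
x_nodes = a + (0:n)*h;                     % nodes x_0, ..., x_n
hat = @(x, xi) max(0, 1 - abs(x - xi)/h);  % hat function centered at node xi, cf. (3.65)
x_plot = linspace(a, b, 501);
u_interp = zeros(size(x_plot));
for i = 0:n                                % boundary nodes included for the interpolation
    u_interp = u_interp + sin(x_nodes(i+1)) * hat(x_plot, x_nodes(i+1));
end
plot(x_plot, sin(x_plot), x_plot, u_interp, '--');
legend('u(x) = sin x', 'piecewise-linear interpolant');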

Indeed, the fact that the nodal values of u and u_interp are the same follows directly from the Kronecker delta property of the ψs. We now have all the prerequisites for solving an example problem.

Example 4

\[
-\frac{d^2 u}{dx^2} \;=\; \sin x, \qquad \Omega = [0, \pi], \quad u(0) = u(\pi) = 0
\tag{3.66}
\]

The obvious theoretical solution u(x) = sin x is available for evaluating the accuracy of the finite element result. Let us use a uniform grid x_0 = 0, x_1 = h, ..., x_n = π with the grid size h = π/n. In numerical experiments, the number of nodes will vary, and we can expect higher accuracy (at higher computational cost) for larger values of n. The weak formulation of the problem is

\[
\int_0^{\pi} \frac{du}{dx}\,\frac{dv}{dx}\, dx \;=\; \int_0^{\pi} \sin x \; v(x)\, dx,
\qquad u, \; \forall v \in H_0^1([0, \pi])
\tag{3.67}
\]

The FE Galerkin formulation is simply a restriction of the weak problem to the subspace P_h^0([0, π]) of piecewise-linear functions satisfying zero Dirichlet conditions; this is precisely the subspace spanned by the hat functions ψ_1, ..., ψ_{n−1}:18

\[
\int_0^{\pi} \frac{du_h}{dx}\,\frac{dv_h}{dx}\, dx \;=\; \int_0^{\pi} v_h(x) \sin x \, dx,
\qquad u_h, \; \forall v_h \in P_h^0([0, \pi])
\tag{3.68}
\]

18 Functions ψ_0 and ψ_n are not included, as they do not satisfy the Dirichlet conditions. Implementation of boundary conditions will be discussed in more detail later.


As we know, this formulation can be cast in matrix–vector form by substituting the expansion \(\sum_{i=1}^{n-1} u_{hi}\,\psi_i\) for u_h and by setting v_h, sequentially, to ψ_1, ..., ψ_{n−1}, to obtain (n − 1) equations for the (n − 1) unknown nodal values u_{hi}:

\[
L\mathbf{u} \;=\; \mathbf{f}, \qquad \mathbf{u}, \mathbf{f} \in \mathbb{R}^{n-1}
\tag{3.69}
\]

where, as we also know, the entries of matrix L and the right-hand side f are

\[
L_{ij} \;=\; \int_0^{\pi} \frac{d\psi_i}{dx}\,\frac{d\psi_j}{dx}\, dx; \qquad
f_i \;=\; \int_0^{\pi} \psi_i(x)\, \sin x \; dx
\tag{3.70}
\]

As already noted, the discrete problem, being just a restriction of the continuous one to the finite-dimensional FE space, inherits the algebraic properties of the continuous formulation. This implies that the global stiffness matrix L is positive definite in this example (and in all cases where the bilinear form of the problem is elliptic).

Equally important is the sparsity of the stiffness matrix: most of its entries are zero. Indeed, the Galerkin integrals for L_{ij} in (3.70) are nonzero only if ψ_i and ψ_j are simultaneously nonzero over a certain finite element. This implies that either i = j or nodes i and j are immediate neighbors. In 1D, the global matrix is therefore tridiagonal. In 2D and 3D, the sparsity pattern of the FE matrix depends on the topology of the mesh and on the node numbering (see Sects. 3.8 and 3.9).

Algorithmically, it is convenient to compute these integrals on an element-by-element basis, gradually accumulating the contributions to the integrals as the loop over all elements progresses. Clearly, for each element the nonzero contributions will come only from functions ψ_i and ψ_j that are both nonzero over this element. For element #i—that is, for segment [x_{i−1}, x_i]—there are four such nonzero contributions altogether:

\[
L^{\mathrm{elem}\,i}_{i-1,i-1} \;=\; \int_{x_{i-1}}^{x_i} \frac{d\psi_{i-1}}{dx}\,\frac{d\psi_{i-1}}{dx}\, dx
\;=\; \int_{x_{i-1}}^{x_i} \frac{1}{h^2}\, dx \;=\; \frac{1}{h}
\]
\[
L^{\mathrm{elem}\,i}_{i-1,i} \;=\; \int_{x_{i-1}}^{x_i} \frac{d\psi_{i-1}}{dx}\,\frac{d\psi_i}{dx}\, dx
\;=\; -\int_{x_{i-1}}^{x_i} \frac{1}{h^2}\, dx \;=\; -\frac{1}{h}
\]
\[
L^{\mathrm{elem}\,i}_{i,i-1} \;=\; L^{\mathrm{elem}\,i}_{i-1,i} \;\; \text{by symmetry};
\qquad
L^{\mathrm{elem}\,i}_{i,i} \;=\; \frac{1}{h} \;\; (\text{same as } L^{\mathrm{elem}\,i}_{i-1,i-1})
\]
\[
f_{i-1} \;=\; \int_{x_{i-1}}^{x_i} \psi_{i-1}(x)\, \sin x \, dx
\;=\; \frac{\sin x_i - x_i\cos x_i + x_{i-1}\cos x_i - \sin x_{i-1}}{h}
\]
\[
f_i \;=\; \int_{x_{i-1}}^{x_i} \psi_i(x)\, \sin x \, dx
\;=\; -\,\frac{\sin x_i - x_i\cos x_{i-1} + x_{i-1}\cos x_{i-1} - \sin x_{i-1}}{h}
\]


These results can be conveniently arranged into a 2 × 2 matrix

\[
L^{\mathrm{elem}\,i} \;=\; \frac{1}{h}
\begin{pmatrix}
1 & -1 \\
-1 & 1
\end{pmatrix}
\tag{3.71}
\]

called, for historical reasons, the element stiffness matrix, and the element contribution to the right-hand side is a vector

\[
f^{\mathrm{elem}\,i} \;=\; \frac{1}{h}
\begin{pmatrix}
\sin x_i - x_i\cos x_i + x_{i-1}\cos x_i - \sin x_{i-1} \\
-\sin x_i + x_i\cos x_{i-1} - x_{i-1}\cos x_{i-1} + \sin x_{i-1}
\end{pmatrix}
\tag{3.72}
\]
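To make the assembly step concrete before the full listing later in this section, the following minimal sketch (with assumed variable names, not the book's code) accumulates the element matrices (3.71) into the global stiffness matrix for the interior nodes of a uniform grid; the result is the expected tridiagonal matrix:

% Minimal sketch (assumed names): assemble the element matrices (3.71)
% into the global stiffness matrix for the interior nodes 1..n-1
% of a uniform grid on [0, pi].
n = 8;  h = pi/n;
L_elem = 1/h * [1 -1; -1 1];                 % element stiffness matrix (3.71)
L_global = sparse(n-1, n-1);
for elem = 1:n                               % element #elem = [x_{elem-1}, x_elem]
    nodes = [elem-1, elem];                  % global node numbers (0 and n are Dirichlet)
    for a = 1:2
        for b = 1:2
            if nodes(a) >= 1 && nodes(a) <= n-1 && nodes(b) >= 1 && nodes(b) <= n-1
                L_global(nodes(a), nodes(b)) = L_global(nodes(a), nodes(b)) + L_elem(a, b);
            end
        end
    end
end
full(L_global*h)   % tridiagonal: 2 on the diagonal, -1 off the diagonal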

Remark 2 A word of caution: In the engineering literature, it is not uncommon to introduce "element equations" of the form

\[
L^{\mathrm{elem}\,i}\, u^{\mathrm{elem}\,i} \;=\; f^{\mathrm{elem}\,i} \quad (!!??)
\]

Such equations are devoid of mathematical meaning. The actual Galerkin equation involves a test function that spans a group of adjacent elements (two in 1D), and so there is no valid equation for a single element. Incidentally, triangular meshes have approximately two times more elements than nodes; so, if "element equations" were taken seriously, there would be about twice as many equations as unknowns!

A sample MATLAB code at the end of this section (Sect. 3.7.1) gives a "no-frills" implementation of the FE algorithm for the 1D model problem. To keep the code as simple as possible, much of the formulation is hard-coded, including the specific interval Ω, expressions for the right-hand side and (for verification and error analysis) the exact solution. The only free parameter is the number of elements n. In actual computational practice, such hard-coding should of course be avoided. Commercial FE codes strive to provide maximum flexibility in setting up geometrical and physical parameters of the problem, with a convenient user interface.

Some numerical results are shown in the following figures. Figure 3.9 provides a visual comparison of the FE solutions for 6 and 12 finite elements with the exact solution. Not surprisingly, the solution with 12 elements is more accurate. Figure 3.10 displays several measures of the error:

• The relative nodal error, defined as

\[
\epsilon_{\text{nodal}} \;=\; \frac{\|\mathbf{u} - \mathcal{N}u^*\|}{\|\mathcal{N}u^*\|}
\]

where u ∈ R^{n−1} is the Euclidean vector of nodal values of the FE solution, u*(x) is the exact solution, and N u* denotes the vector of nodal values of u* on the grid.

• The L² norm of the error

\[
\epsilon_{L^2} \;=\; \|u_h - u^*\|_{L^2}
\]


Fig. 3.9 FE solutions with 8 and 16 elements versus the exact solution sin x

This error measures the discrepancy between the numerical and exact solutions as functions over [0, π] rather than Euclidean vectors of nodal values.

• The L² norm of the derivative

\[
\epsilon_{H^1} \;=\; \left\| \frac{d(u_h - u^*)}{dx} \right\|_{L^2}
\]

For the zero Dirichlet boundary conditions, this norm differs by no more than a constant factor from the H¹-norm; hence the notation.

Due to the simplicity of this example and of the exact solution, these measures can be computed up to the roundoff error. For more realistic problems, particularly in 2D and 3D, the errors can only be estimated. In Fig. 3.10, the three error measures are plotted versus the number of elements. The linearity of the plots on the log–log scale implies that the errors are proportional to h^γ, and the slopes of the lines correspond to γ = 2 for the nodal and L² errors and γ = 1 for the H¹ error. The derivative of the solution is computed less accurately than the solution itself. This certainly makes intuitive sense and also agrees with theoretical results quoted in Sect. 3.10.

Example 5 How will the numerical procedure change if the boundary conditions are different? First consider inhomogeneous Dirichlet conditions. Let us assume that in the previous example the boundary values are u(0) = 1, u(π) = −1, so that the exact solution is now u*(x) = cos x (note that the exact solution cos x corresponds to the right-hand side cos x). In the hat-function expansion of the (piecewise-linear) FE solution

\[
u_h(x) \;=\; \sum_{i=0}^{n} u_{hi}\, \psi_i(x)
\]

the summation now includes boundary nodes in addition to the interior ones. However, the coefficients u_{h0} and u_{hn} at these nodes are the known Dirichlet values, and


Fig. 3.10 Several measures of error versus the number of elements for the 1D model problem: relative nodal error, L 2 error norm, H1 error norm. Note the log–log scale

hence no Galerkin equations with test functions ψ_0 and ψ_n are necessary. In the Galerkin equation corresponding to the test function ψ_1,

\[
(\psi_0', \psi_1')\, u_{h0} \;+\; (\psi_1', \psi_1')\, u_{h1} \;=\; (f, \psi_1)
\]

the first term is known and gets moved to the right-hand side:

\[
(\psi_1', \psi_1')\, u_{h1} \;=\; (f, \psi_1) \;-\; (\psi_0', \psi_1')\, u_{h0}
\tag{3.73}
\]

As usual, parentheses in these expressions are L² inner products and imply integration over the computational domain. Neumann conditions in the Galerkin formulation are natural19 and therefore do not require any algorithmic treatment: Elements adjacent to the Neumann boundary are treated exactly the same as interior elements.

The necessary algorithmic adjustments should now be clear. There is no change in the computation of element matrices. However, whenever an entry of the element matrix corresponding to a Dirichlet node is encountered,20 this entry is not added to the global system matrix. Instead, the right-hand side is adjusted as prescribed by (3.73). A similar adjustment is made for the other boundary node (x_n = π) as well. In 2D and 3D problems, there may be many Dirichlet nodes, and all of them are handled in a similar manner. The appropriate changes in the MATLAB code are left as an exercise for the interested reader.

The FE solution for a small number of elements is compared with the exact solution (cos x) in Fig. 3.11, and the error measures are shown as a function of the number of elements in Fig. 3.12.

19 We showed that Neumann conditions are natural – i.e. automatically satisfied—by the solution of the continuous weak problem. The FE solution does not, as a rule, satisfy the Neumann conditions exactly but should do so in the limit of h → 0, although this requires a separate proof. 20 Clearly, this may happen only for elements adjacent to the boundary.
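One possible way to visualize the adjustment (3.73), shown here as a self-contained sketch rather than as the modification of the sample code (which is left as an exercise above): assemble the full matrix including the boundary nodes and move the known boundary terms to the right-hand side in block form. The variable names and the simple "lumped" load approximation are assumptions made for brevity.

% Self-contained sketch (assumed names): inhomogeneous Dirichlet
% conditions handled in block form, eq. (3.73).
n = 16;  h = pi/n;  x = (0:n)'*h;
A = gallery('tridiag', n+1, -1, 2, -1) / h;    % full 1D stiffness matrix, all nodes
f = h * cos(x);                                % lumped load for the right-hand side cos x
                                               % (a simple approximation of the exact integrals)
ib = [1, n+1];                                 % boundary (Dirichlet) nodes
ii = 2:n;                                      % interior nodes
ub = [1; -1];                                  % prescribed values u(0) = 1, u(pi) = -1
u = zeros(n+1, 1);  u(ib) = ub;
u(ii) = A(ii, ii) \ (f(ii) - A(ii, ib) * ub);  % known boundary terms moved to the rhs
plot(x, u, 'o-', x, cos(x));                   % compare with the exact solution cos x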


Fig. 3.11 FE solutions versus exact solution cos x of a 1D test problem with nonzero Dirichlet conditions

Fig. 3.12 Numerical errors for a 1D test problem with nonzero Dirichlet conditions. Note the log–log scale
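A possible way to reproduce the error measures plotted in Fig. 3.10 is sketched below; it assumes the sample function FEM_1D_example1_Poisson listed at the end of this section and uses MATLAB's integral for the element-wise quadrature. Variable names are illustrative.

% Sketch (assumed names): compute the three error measures for the
% piecewise-linear FE solution of Example 4 (exact solution sin x).
n = 16;  h = pi/n;
res = FEM_1D_example1_Poisson(n);
u = [0; full(res.u_FEM); 0];               % prepend/append the zero Dirichlet values
x = (0:n)' * h;                            % all nodes
u_exact = sin(x);
err_nodal = norm(u(2:n) - u_exact(2:n)) / norm(u_exact(2:n));   % relative nodal error
err_L2 = 0;  err_H1 = 0;
for i = 1:n                                % loop over elements [x(i), x(i+1)]
    uh    = @(t) u(i) + (u(i+1) - u(i)) .* (t - x(i)) / h;      % local linear interpolant
    slope = (u(i+1) - u(i)) / h;
    err_L2 = err_L2 + integral(@(t) (uh(t) - sin(t)).^2, x(i), x(i+1));
    err_H1 = err_H1 + integral(@(t) (slope - cos(t)).^2, x(i), x(i+1));
end
err_L2 = sqrt(err_L2);  err_H1 = sqrt(err_H1);
fprintf('nodal %.3e   L2 %.3e   H1 %.3e\n', err_nodal, err_L2, err_H1);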

Despite its simplicity, the one-dimensional example above contains the key ingredients of general FE algorithms:

1. Mesh generation and the choice of FE approximating functions. In the 1D example, "mesh generation" is trivial, but it becomes complicated in 2D and even more so in 3D. Only piecewise-linear approximating functions have been used here so far; higher-order functions are considered in the subsequent sections.

2. Local and global node numbering. For the computation of element matrices (see below), it is convenient to use local numbering (e.g. nodes 1, 2 for a segment in 1D, nodes 1, 2, 3 for a triangular element in 2D, etc.) At the same time, some global numbering of all mesh nodes from 1 to n is also needed. This global numbering is produced by a mesh generator that also puts local node numbers for each element in correspondence with their global numbers. In the 1D example, mesh generation is trivial, and so is the local-to-global association of node numbers: For element (segment) #i (i = 1, 2, ..., n), local node 1 (the left node) corresponds to global node i − 1, and local node 2 corresponds to global node i. The 2D and 3D cases are considered in Sects. 3.8 and 3.9.


3. Computation of element matrices and of element-wise contributions to the right-hand side. In the 1D example, these quantities were computed analytically; in more complicated cases, when analytical expressions are unavailable (this is frequently the case for curved or high-order elements in 2D and 3D), Gaussian quadratures are used.

4. Assembly of the global matrix and of the right-hand side. In a loop over all elements, the element contributions are added to the global matrix and to the right-hand side; in the FE language, the matrix and the right-hand side are "assembled" from element-wise contributions. The entries of each element matrix are added to the respective entries of the global matrix and right-hand side. See Sect. 3.8 for more details in the 2D case.

5. The treatment of boundary conditions. The Neumann conditions in 1D, 2D or 3D do not require any special treatment—in other words, the FE algorithm may simply "ignore" these conditions and the solution will, in the limit, satisfy them automatically. The Robin condition containing a combination of the potential and its normal derivative is also natural but results in an additional boundary integral that will not be considered here. Finally, the Dirichlet conditions have to be taken into account explicitly. The following algorithmic adjustment is made in the loop over all elements. If L_{ij} is an entry of the element matrix and j is a Dirichlet node but i is not, then L_{ij} is not added to the global stiffness matrix. Instead, the quantity L_{ij}u_j, where u_j is the known Dirichlet value of the solution at node j, is subtracted from the right-hand side entry f_i, as prescribed by equation (3.73). If both i and j are Dirichlet nodes, L_{ij} is set to zero.

6. Solution of the FE system of equations. System solvers are reviewed in Sect. 3.11.

7. Postprocessing of the results. This may involve differentiation of the solution (to compute fields from potentials), integration over surfaces (to find field fluxes, etc.), and various contour, line or surface plots. Modern commercial FE packages have elaborate postprocessing capabilities and sophisticated graphical user interfaces; this subject is largely beyond the scope of this book, but some illustrations can be found in Chap. 8.

At the same time, there are several more advanced features of FE analysis that are not evident from the 1D example and will be considered (at a varying level of detail) in the subsequent sections of this chapter:

• Curved elements—used in 2D and 3D for more accurate approximation of curved boundaries.

• Adaptive mesh refinement (Sect. 3.13). The mesh is refined locally, in the subregions where the numerical error is estimated to be highest. (In addition, the mesh may be unrefined in subregions with lower errors.) The problem is then solved again on the new grid. The key to the success of this strategy is a sensible error indicator that is computed a posteriori, i.e. after the FE solution is found.

• Vector finite elements (Sect. 3.12). The most straightforward way of dealing with vector fields in FE analysis is to approximate each Cartesian component separately by scalar functions. While this approach is adequate in some cases, it turns out not to be the most solid one in general. One deficiency is fairly obvious from the outset: Some field components are discontinuous at material interfaces, which is not a natural condition for scalar finite elements and requires special constraints. This is, however, only one manifestation of a deeper mathematical structure: Fundamentally, electromagnetic fields are better understood as differential forms (Sect. 3.12).

A Sample MATLAB Code for the 1D Model Problem

function FEM_1D_example1 = FEM_1D_example1_Poisson(n)
% Finite element solution of the Poisson equation
%   -u'' = sin x  on [0, pi];  u(0) = u(pi) = 0
% Input:
%   n -- number of elements
domain_length = pi;      % hard-coded for simplicity of this sample code
h = domain_length / n;   % mesh size (uniform mesh assumed)
% Initialization:
system_matrix = sparse(zeros(n-1, n-1));
rhs = sparse(zeros(n-1, 1));
% Loop over all elements (segments)
for elem_number = 1 : n
    node1 = elem_number - 1;
    node2 = elem_number;
    % Coordinates of nodes:
    x1 = h*node1;
    x2 = x1 + h;
    % Element stiffness matrix:
    elem_matrix = 1/h * [1 -1; -1 1];
    elem_rhs = 1/h * [sin(x2) - x2 * cos(x2) + x1 * cos(x2) - sin(x1); ...
                     -(sin(x2) - x2 * cos(x1) + x1 * cos(x1) - sin(x1))];
    % Add element contribution to the global matrix;
    % contributions involving the Dirichlet nodes 0 and n are skipped
    if node1 ~= 0
        system_matrix(node1, node1) = system_matrix(node1, node1) ...
            + elem_matrix(1, 1);
        rhs(node1) = rhs(node1) + elem_rhs(1);
    end
    if (node1 ~= 0) && (node2 ~= n)
        system_matrix(node1, node2) = system_matrix(node1, node2) ...
            + elem_matrix(1, 2);
        system_matrix(node2, node1) = system_matrix(node2, node1) ...
            + elem_matrix(2, 1);
    end
    if node2 ~= n
        system_matrix(node2, node2) = system_matrix(node2, node2) ...
            + elem_matrix(2, 2);
        rhs(node2) = rhs(node2) + elem_rhs(2);
    end
end   % end element cycle

u_FEM = system_matrix \ rhs;   % refrain from using matrix inversion inv()!

FEM_1D_example1.a = 0;
FEM_1D_example1.b = pi;
FEM_1D_example1.n = n;
FEM_1D_example1.u_FEM = u_FEM;
return;
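A possible call sequence for the sample function above (assuming it is saved as FEM_1D_example1_Poisson.m on the MATLAB path; the variable names are illustrative):

res = FEM_1D_example1_Poisson(16);
x_interior = (1:res.n-1)' * (res.b - res.a)/res.n;   % interior node coordinates
plot(x_interior, full(res.u_FEM), 'o', x_interior, sin(x_interior), '-');
legend('FE nodal values', 'exact solution sin x');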

3.7.2 Higher-Order Elements

There are two distinct ways to improve the numerical accuracy in FEM. One is to reduce the size h of (some or all) the elements; this approach is known as (local or global) h-refinement.

Remark 3 It is very common to refer to a single parameter h as the "mesh size," even if finite elements in the mesh have different sizes (and possibly even different shapes). With this terminology, it is tacitly assumed that the ratio of maximum/minimum element sizes is bounded and not too large; then the difference between the minimum, maximum or some average size is relatively unimportant. However, several recursive steps of local mesh refinement may result in a large disparity of the element sizes; in such cases, reference to a single mesh size would be misleading.

The other way to improve the accuracy is to increase the polynomial order p of approximation within (some or all) elements; this is (local or global) p-refinement.

Let us start with second-order elements in one dimension. Consider a geometric element—in 1D, a segment of length h. We are about to introduce quadratic polynomials over this element; since these polynomials have three free parameters, it makes sense to deal with their values at three nodes and to place these nodes at x = 0, h/2, h relative to a local coordinate system. The canonical approximating functions satisfy the Kronecker delta conditions at the nodes. The first function is thus equal to one at node #1 and zero at the other two nodes; this function is easily found to be

\[
\psi_1 \;=\; \frac{2}{h^2}\left(x - \frac{h}{2}\right)(x - h)
\tag{3.74}
\]

(The factors in the parentheses are due to the roots at h/2 and h; the scaling coefficient 2/h² normalizes the function to ψ_1(0) = 1.) Similarly, the remaining two functions are

\[
\psi_2 \;=\; \frac{4}{h^2}\, x\,(h - x)
\tag{3.75}
\]
\[
\psi_3 \;=\; \frac{2}{h^2}\, x\left(x - \frac{h}{2}\right)
\tag{3.76}
\]

Fig. 3.13 Three quadratic basis functions over one 1D element. h = 0.1 as an example

Figure 3.13 displays all three quadratic approximating functions over a single 1D element. While the "bubble" ψ_2 is nonzero within one element only, functions ψ_{1,3} actually span two adjacent elements, as shown in Fig. 3.14. The entries of the element stiffness matrix L and mass matrix M (that is, the Gram matrix of the ψs) are

\[
L_{ij} \;=\; \int_0^{h} \psi_i'\, \psi_j' \, dx
\]

where the prime sign denotes the derivative, and

\[
M_{ij} \;=\; \int_0^{h} \psi_i\, \psi_j \, dx
\]

These matrices can be computed by straightforward integration:

\[
L \;=\; \frac{1}{3h}
\begin{pmatrix}
7 & -8 & 1 \\
-8 & 16 & -8 \\
1 & -8 & 7
\end{pmatrix}
\tag{3.77}
\]


Fig. 3.14 Quadratic basis function over two adjacent 1D elements. h = 0.1 as an example

\[
M \;=\; \frac{h}{30}
\begin{pmatrix}
4 & 2 & -1 \\
2 & 16 & 2 \\
-1 & 2 & 4
\end{pmatrix}
\tag{3.78}
\]
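As an optional numerical check (an addition, not part of the original text), the matrices (3.77) and (3.78) can be reproduced by direct quadrature of the basis functions (3.74)–(3.76); the value h = 0.1 and the use of MATLAB's integral are arbitrary choices:

% Verify the quadratic element matrices (3.77), (3.78) by quadrature.
h = 0.1;
psi  = {@(x) 2/h^2 * (x - h/2) .* (x - h), ...
        @(x) 4/h^2 * x .* (h - x), ...
        @(x) 2/h^2 * x .* (x - h/2)};
dpsi = {@(x) 2/h^2 * (2*x - 3*h/2), ...
        @(x) 4/h^2 * (h - 2*x), ...
        @(x) 2/h^2 * (2*x - h/2)};
L = zeros(3);  M = zeros(3);
for i = 1:3
    for j = 1:3
        L(i,j) = integral(@(x) dpsi{i}(x) .* dpsi{j}(x), 0, h);
        M(i,j) = integral(@(x) psi{i}(x) .* psi{j}(x), 0, h);
    end
end
disp(L - 1/(3*h) * [7 -8 1; -8 16 -8; 1 -8 7]);   % ~0, cf. (3.77)
disp(M - h/30 * [4 2 -1; 2 16 2; -1 2 4]);        % ~0, cf. (3.78)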

Naturally, both matrices are symmetric. The matrix assembly procedure for second-order elements in 1D is conceptually the same as for first-order elements. There are some minor differences:

• For second-order elements, the number of nodes is about double the number of elements.

• Consequently, the correspondence between the local node numbers (1, 2, 3) in an element and their respective global numbers in the grid is a little less simple than for first-order elements.

• The element matrix is 3 × 3 for second-order elements versus 2 × 2 for first-order ones; the global matrices are five- and three-diagonal, respectively.

Elements of order higher than two can be introduced in a similar manner. The element of order n is, in 1D, a segment of length h with n + 1 nodes x_0, x_1, ..., x_n = x_0 + h. The approximating functions are polynomials of order n. As with first- and second-order elements, it is most convenient if polynomial #i has the Kronecker delta property: equal to one at the node x_i and zero at the remaining n nodes. This is the Lagrange interpolating polynomial

\[
\ell_i(x) \;=\; \frac{(x - x_0)(x - x_1)\cdots(x - x_{i-1})(x - x_{i+1})\cdots(x - x_n)}
{(x_i - x_0)(x_i - x_1)\cdots(x_i - x_{i-1})(x_i - x_{i+1})\cdots(x_i - x_n)}
\tag{3.79}
\]


Indeed, the roots of this polynomial are x_0, x_1, ..., x_{i−1}, x_{i+1}, ..., x_n, which immediately leads to the expression in the numerator. The denominator is the normalization factor needed to make ℓ_i(x) equal to one at x = x_i. The focus of this chapter is on the main ideas of finite element analysis rather than on technical details. With regard to the computation of element matrices, assembly procedures and other implementation issues for high-order elements, I defer to more comprehensive FE texts cited at the end of this chapter.
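For concreteness, here is a small illustrative helper (hypothetical, not from the book) that evaluates the Lagrange polynomial (3.79) for an arbitrary set of element nodes:

function li = lagrange_basis(x, i, nodes)
% Evaluate the i-th Lagrange polynomial (3.79) at the points x.
%   x     -- evaluation points
%   i     -- index of the basis polynomial (1-based, MATLAB convention)
%   nodes -- vector of the element nodes x_0, ..., x_n
li = ones(size(x));
for k = 1:length(nodes)
    if k ~= i
        li = li .* (x - nodes(k)) / (nodes(i) - nodes(k));
    end
end

Calling, for example, lagrange_basis(nodes, 2, nodes) returns a vector with 1 in the second position and 0 elsewhere, confirming the Kronecker delta property.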

3.8 The Finite Element Method in Two Dimensions

3.8.1 First-Order Elements

In two dimensions, most common element shapes are triangular (by far) and quadrilateral. Figure 3.15 gives an example of a triangular mesh, with the global node numbers displayed. Element numbering is not shown to avoid congestion in the figure. This section deals with first-order triangular elements. The approximating functions are linear over each triangle and continuous in the whole domain. Each approximating function spans a cluster of elements (Fig. 3.16) and is zero outside that cluster.

Expressions for element-wise basis functions can be derived in a straightforward way. Let the element nodes be numbered 1, 2, 3²¹ in the counterclockwise direction²² and let the coordinates of node i (i = 1, 2, 3) be x_i, y_i. As in the 1D case, it is natural to look for the basis functions satisfying the Kronecker delta condition. More specifically, the basis function ψ_1 = a_1 x + b_1 y + c_1, where a_1, b_1 and c_1 are coefficients to be determined, is equal to one at node #1 and zero at the other two nodes:

\[
\begin{aligned}
a_1 x_1 + b_1 y_1 + c_1 &= 1 \\
a_1 x_2 + b_1 y_2 + c_1 &= 0 \\
a_1 x_3 + b_1 y_3 + c_1 &= 0
\end{aligned}
\tag{3.80}
\]

or equivalently in matrix–vector form

\[
X d_1 = e_1, \qquad
X = \begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix}; \quad
d_1 = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \end{pmatrix}; \quad
e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\tag{3.81}
\]

21 These are local numbers that have their corresponding global numbers in the mesh; for example, in the shaded element of Fig. 3.15 (bottom) global nodes 179, 284 and 285 could be numbered as 1, 2, 3, respectively. 22 The significance of this choice of direction will become clear later.


Fig. 3.15 Example of a triangular mesh with node numbering (top) and a fragment of the same mesh (bottom)

Similar relationships hold for the other two basis functions, ψ_2 and ψ_3, the only difference being the right-hand side of system (3.81). It immediately follows from (3.81) that the coefficients a, b, c for all three basis functions can be collected together in a compact way:

\[
X D = I, \qquad
D = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix}
\tag{3.82}
\]


Fig. 3.16 Piecewise-linear basis function in 2D over a cluster of triangular elements. Circles indicate mesh nodes. The basis function is represented by the surface of the pyramid

where I is the 3 × 3 identity matrix. Hence, the coefficients of the basis functions can be expressed succinctly as

\[
D = X^{-1}
\tag{3.83}
\]

From analytical geometry, the determinant of X is equal to 2S, where S is the area of the triangle. (That is where the counterclockwise numbering of nodes becomes important; for clockwise numbering, the determinant would be equal to minus 2S.) This leads to simple explicit expressions for the basis functions:

\[
\psi_1 \;=\; \frac{(y_2 - y_3)x + (x_3 - x_2)y + (x_2 y_3 - x_3 y_2)}{2S}
\tag{3.84}
\]

with the other two functions obtained by cyclic permutation of the indexes. Since the basis functions are linear, their gradients are just constants:

\[
\nabla\psi_1 \;=\; \frac{y_2 - y_3}{2S}\,\hat{x} \;+\; \frac{x_3 - x_2}{2S}\,\hat{y}
\tag{3.85}
\]

with the formulas for ψ_{2,3} again obtained by cyclic permutation. These expressions are central in the FE Galerkin formulation. It would be straightforward to verify from (3.84), (3.85) that

\[
\psi_1 + \psi_2 + \psi_3 = 1
\tag{3.86}
\]
\[
\nabla\psi_1 + \nabla\psi_2 + \nabla\psi_3 = 0
\tag{3.87}
\]


However, these results can be obtained without any algebraic manipulation. Indeed, due to the Kronecker delta property of the basis, any function u(x, y) linear over the triangle can be expressed via its nodal values u_{1,2,3} as

\[
u(x, y) \;=\; u_1 \psi_1 + u_2 \psi_2 + u_3 \psi_3
\]

Equation (3.86) follows from this simply for u(x, y) ≡ 1.

Functions ψ_{1,2,3} are also known as barycentric coordinates and have an interesting geometric interpretation (Fig. 3.17). For any point (x, y) in the plane, ψ_1(x, y) is the ratio of the shaded area to the area of the whole triangle: ψ_1(x, y) = S_1(x, y)/S. Similar expressions are of course valid for the other two basis functions. Indeed, the fact that S_1/S is equal to one at node #1 and zero at the other two nodes is geometrically obvious. Moreover, it is a linear function of coordinates because S_1 is proportional to the height l of the shaded triangle (the "elevation" of point (x, y) over the "base" segment 2–3), and l can be obtained by a linear transformation of coordinates (x, y).

The three barycentric coordinates are commonly denoted by λ_{1,2,3}, so the linear FE basis functions are just ψ_i ≡ λ_i (i = 1, 2, 3). Higher-order FE bases can also be conveniently expressed in terms of λ (Sect. 3.8.2).

The element stiffness matrix for first-order elements is easy to compute because the gradients (3.85) of the basis functions are constant:

\[
(\nabla\lambda_i, \nabla\lambda_j) \;\equiv\; \int \nabla\lambda_i \cdot \nabla\lambda_j \, dS
\;=\; \nabla\lambda_i \cdot \nabla\lambda_j \; S, \qquad i, j = 1, 2, 3
\tag{3.88}
\]

where the integration is over a triangular element and S is the area of this element. Expressions for the gradients are available (3.85) and can be easily substituted into (3.88) if an explicit formula for the stiffness matrix in terms of the nodal coordinates is desired.

Computation of the element mass matrix (the Gram matrix of the basis functions) is less simple, but the result is quite elegant. The integral of, say, the product λ_1λ_2 over the triangular element can be found using an affine transformation of this element to the "master" triangle with nodes 1, 2, 3 at (1, 0), (0, 1) and (0, 0), respectively. Since the area of the master triangle is 1/2, the Jacobian of this transformation is equal to 2S and we have²³

\[
(\lambda_1, \lambda_2) \;\equiv\; \int \lambda_1\lambda_2 \, dS
\;=\; 2S \int_0^1 x\, dx \int_0^{1-x} y\, dy \;=\; \frac{S}{12}
\]

Similarly, (λ_1, λ_1) = 2S/12, and the complete element mass matrix is

23 The Jacobian is positive for the counterclockwise node numbering convention.


Fig. 3.17 Geometric interpretation of the linear basis functions: ψ1 (x, y) = S1 (x, y)/S , where S1 is the shaded area and S is the area of the whole triangle. (Similar for ψ2,3 .)

\[
M \;=\; \frac{S}{12}
\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}
\tag{3.89}
\]

The expressions for the inner products of the barycentric coordinates are a particular case of a more general formula that appears in many texts on FE analysis and is quoted here without proof:

\[
\int \lambda_1^{i} \lambda_2^{j} \lambda_3^{k} \, dS
\;=\; \frac{i!\, j!\, k!}{(i + j + k + 2)!}\; 2S
\tag{3.90}
\]

for any nonnegative integers i, j, k. M_{11} of (3.89) corresponds to i = 2, j = k = 0; M_{12} corresponds to i = j = 1, k = 0; etc.

Remark 4 The notion of "master element" (or "reference element") is useful and long established in finite element analysis. Properties of FE matrices and FE approximations are usually examined via affine transformations of elements to the "master" ones. In that sense, analysis of finite element interpolation errors in Sect. 3.14.2 below (p. 154) is less typical.

Example 6 Let us find the basis functions and the FE matrices for a right triangle with node #1 at the origin, node #2 on the x-axis at (h_x, 0), and node #3 on the y-axis at (0, h_y) (mesh sizes h_x, h_y are positive numbers). The coordinate matrix is

\[
X \;=\; \begin{pmatrix} 0 & 0 & 1 \\ h_x & 0 & 1 \\ 0 & h_y & 1 \end{pmatrix}
\]

which yields the coefficient matrix

\[
D \;=\; X^{-1} \;=\;
\begin{pmatrix}
-h_x^{-1} & h_x^{-1} & 0 \\
-h_y^{-1} & 0 & h_y^{-1} \\
1 & 0 & 0
\end{pmatrix}
\]


Each column of this matrix is a set of three coefficients for the respective basis function; thus, the three columns translate into

\[
\psi_1 \;=\; 1 - h_x^{-1} x - h_y^{-1} y, \qquad
\psi_2 \;=\; h_x^{-1} x, \qquad
\psi_3 \;=\; h_y^{-1} y
\]

The sum of these functions is identically equal to one, as it should be according to (3.86). Functions ψ_2 and ψ_3 in this case are particularly easy to visualize: ψ_2 is a linear function of x equal to one at node #2 and zero at the other two nodes; ψ_3 is similar. The gradients are

\[
\nabla\psi_1 \;=\; -h_x^{-1}\hat{x} - h_y^{-1}\hat{y}, \qquad
\nabla\psi_2 \;=\; h_x^{-1}\hat{x}, \qquad
\nabla\psi_3 \;=\; h_y^{-1}\hat{y}
\]

Computing the entries of the element stiffness matrix is easy because the gradients of the λs are (vector) constants. For example,

\[
(\nabla\lambda_1, \nabla\lambda_1) \;=\; \int \nabla\lambda_1 \cdot \nabla\lambda_1 \, dS
\;=\; (h_x^{-2} + h_y^{-2})\, S
\]

Since S = h_x h_y / 2, the complete stiffness matrix is

\[
L \;=\; \frac{h_x h_y}{2}
\begin{pmatrix}
h_x^{-2} + h_y^{-2} & -h_x^{-2} & -h_y^{-2} \\
-h_x^{-2} & h_x^{-2} & 0 \\
-h_y^{-2} & 0 & h_y^{-2}
\end{pmatrix}
\tag{3.91}
\]

This expression becomes particularly simple if h_x = h_y = h:

\[
L \;=\; \frac{1}{2}
\begin{pmatrix}
2 & -1 & -1 \\
-1 & 1 & 0 \\
-1 & 0 & 1
\end{pmatrix}
\tag{3.92}
\]

The mass matrix is, according to the general expression (3.89),

\[
M \;=\; \frac{S}{12}
\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}
\;=\; \frac{h_x h_y}{24}
\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}
\tag{3.93}
\]
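The matrices of Example 6 can be reproduced numerically with a few lines that mirror the construction D = X⁻¹ used in the sample code below; this optional check is an addition to the text, and the specific values hx = hy = 0.2 are an arbitrary choice:

% Reproduce the element matrices of Example 6 from D = inv(X).
hx = 0.2;  hy = 0.2;                         % hx = hy = h, cf. (3.92)
X = [0 0 1; hx 0 1; 0 hy 1];                 % coordinate matrix, as in (3.81)
D = inv(X);                                  % coefficients of the basis functions
grads = D(1:2, :);                           % constant gradients, one column per function
area = abs(det(X)) / 2;
stiff = area * (grads' * grads);             % should equal (3.92) for hx = hy
mass  = area/12 * (eye(3) + ones(3));        % should equal (3.93)
disp(stiff - 0.5*[2 -1 -1; -1 1 0; -1 0 1]);
disp(mass - hx*hy/24 * [2 1 1; 1 2 1; 1 1 2]);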

An example of MATLAB implementation of FEM for a triangular mesh is given at the end of this section; see p. 113 for the description and listing of the code. As an illustrative example, consider a dielectric particle with some nontrivial shape—say, T-shaped – in a uniform external field. The geometric setup is clear from Figs. 3.18 and 3.19.


Fig. 3.18 Finite element mesh for the electrostatic problem: a T-shaped particle in an external field. The mesh has 422 nodes and 782 triangular elements

Fig. 3.19 Potential distribution for the electrostatic example: a T-shaped particle in an external field

The potential of the applied external field is assumed to be u = x and is imposed as the Dirichlet condition on the boundary of the computational domain. Since the particle disturbs the field, this condition is not exact but becomes more accurate if the domain boundary is moved farther away from the particle; this, however, increases the number of nodes and consequently the computational cost of the simulation. Domain truncation is an intrinsic difficulty of electromagnetic FE analysis (unlike, say, analysis of stresses and strains confined to a finite mechanical part). Various ways of reducing the domain truncation error are known: radiation boundary conditions and perfectly matched layers (PMLs) for wave problems (Sect. 7.10 in Chap. 4, A. Paganini et al. [PSHT16]), hybrid finite element/boundary element methods, infinite elements, “ballooning,” spatial mappings (A. Plaks et al. [PTPT00]) and various other techniques (Sect. 7.10). Since domain truncation is only tangentially related to the material of this section, it is not considered here further but will reappear in Chaps. 7 and 8. For inhomogeneous Dirichlet conditions, the weak formulation of the problem has to be modified, with the corresponding minor adjustments to the FE algorithm.


The underlying mathematical reason for this modification is that functions satisfying a given inhomogeneous Dirichlet condition form an affine space rather than a linear space (e.g. the sum of two such functions has a different value at the boundary). The remedy is to split the original unknown function u up as

\[
u \;=\; u_0 + u_{\neq 0}
\tag{3.94}
\]

where u_{≠0} is some sufficiently smooth function satisfying the given inhomogeneous boundary condition, while the remaining part u_0 satisfies the homogeneous one. The weak formulation is

\[
L(u_0, v_0) \;=\; (f, v_0) \;-\; L(u_{\neq 0}, v_0),
\qquad u_0 \in H_0^1(\Omega), \;\; \forall v_0 \in H_0^1(\Omega)
\tag{3.95}
\]

In practice, the implementation of this procedure is more straightforward than it may appear from this expression. The inhomogeneous part u_{≠0} is spanned by the FE basis functions corresponding to the Dirichlet nodes; the homogeneous part of the solution is spanned by the basis functions for all other nodes. If j is a Dirichlet boundary node, the solution value u_j at this node is given, and hence the term L_{ij}u_j in the global system of FE equations is known as well. It is therefore moved (with the opposite sign, of course) to the right-hand side.

In the T-shaped particle example, the mesh has 422 nodes and 782 triangular elements, and the stiffness matrix has 2446 nonzero entries. The sparsity structure of this matrix (also called the adjacency structure)—the set of index pairs (i, j) for which L_{ij} ≠ 0—is exhibited in Fig. 3.20. The distribution of nonzero entries in the matrix is quasi-random, which has implications for the solution procedures if direct solvers are employed. Such solvers are almost invariably based on some form of Gaussian elimination; for symmetric positive definite matrices, it is Cholesky decomposition U^T U, where U is an upper-triangular matrix.24 While Gaussian elimination is a very reliable25 and relatively simple procedure, for sparse matrices it unfortunately produces "fill-in": Zero entries become nonzero in the process of elimination (or Cholesky decomposition), which substantially degrades the computational efficiency and memory usage. In the present example, Cholesky decomposition applied to the original stiffness matrix with 2446 nonzero entries26 produces the Cholesky factor with 24,969 nonzeros and hence requires about 20 times more memory (if symmetry is taken advantage of); compare Figs. 3.20 and 3.21. For more realistic practical cases, where matrix sizes are much greater, the effect of fill-in is even more dramatic.

It is worth noting—in passing, since this is not the main theme of this section—that several techniques are available for reducing the amount of fill-in in Cholesky

24 Cholesky decomposition is usually written in the equivalent form of LL^T, where L is a lower triangular matrix, but symbol L in this chapter is already used for the FE stiffness matrix.
25 It is known to be stable for symmetric positive definite matrices but may require pivoting in general.
26 Of which only a little more than one half need to be stored due to matrix symmetry.


Fig. 3.20 Sparsity (adjacency) structure of the global FE matrix in the T-shaped particle example

Fig. 3.21 Sparsity structure of the Cholesky factor of the global FE matrix in the T-shaped particle example

factorization. The main ideas behind these techniques are clever permutations of rows and columns (equivalent to renumbering of nodes in the FE mesh), block algorithms (including divide-and-conquer type recursion), and combinations thereof. A. George and J. W. H. Liu give a detailed and lucid exposition of this subject [GL81]. In the current example, the so-called reverse Cuthill–McKee ordering reduces the number of nonzero entries in the Cholesky factor to 7230, which is more than three times better than for the original numbering of nodes (Figs. 3.22 and 3.23). The “minimum degree” ordering [GL81] is better by another factor of ∼ 2: The number of nonzeros in the Cholesky triangular matrix is equal to 3717 (Figs. 3.24 and 3.25). These permutation algorithms will be revisited in the solver section (p. 124).
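The effect of reordering on Cholesky fill-in is easy to reproduce in MATLAB; the sketch below (an addition to the text) uses a generic sparse positive definite test matrix from gallery('poisson') as a stand-in for the FE matrix, together with the built-in reordering functions symrcm and symamd:

% Compare Cholesky fill-in for different node orderings.
A = gallery('poisson', 30);            % sparse SPD stand-in for the FE matrix
p_rcm = symrcm(A);                     % reverse Cuthill-McKee permutation
p_amd = symamd(A);                     % (approximate) minimum degree permutation
fprintf('nnz(A) = %d\n', nnz(A));
fprintf('Cholesky nnz: original %d, RCM %d, min degree %d\n', ...
        nnz(chol(A)), nnz(chol(A(p_rcm, p_rcm))), nnz(chol(A(p_amd, p_amd))));
spy(chol(A(p_amd, p_amd)));            % sparsity pattern of the factor, cf. Fig. 3.25

The qualitative outcome mirrors the numbers quoted above: both reorderings reduce the fill-in substantially, with the minimum degree ordering typically doing better than reverse Cuthill–McKee.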

Fig. 3.22 Sparsity structure of the global FE matrix after the reverse Cuthill–McKee reordering of nodes

Fig. 3.23 Sparsity structure of the upper-triangular Cholesky factor of the global FE matrix after the reverse Cuthill–McKee reordering of nodes

Fig. 3.24 Sparsity structure of the global FE matrix after the minimum degree reordering of nodes


Fig. 3.25 Sparsity structure of the upper-triangular Cholesky factor of the global FE matrix after the minimum degree reordering of nodes

Appendix: Sample MATLAB Code for FEM with First-Order Triangular Elements

The MATLAB code below is intended to be the simplest possible illustration of the finite element procedure. As such, it uses first-order elements and is optimized for algorithmic simplicity rather than performance. For example, there is some duplication of variables for the sake of clarity, and symmetry of the FE stiffness matrix is not taken advantage of. Improvements become fairly straightforward to make once the essence of the algorithm is understood.

The starting point for the code is a triangular mesh generated by COMSOL™, a commercial finite element package. The input data structure fem generated by COMSOL in general contains the geometric, physical and FE mesh data relevant to the simulation. For the purposes of this section, only mesh data (the field fem.mesh) is needed. Second-order elements are the default in COMSOL, and it is assumed that this default has been changed to produce first-order elements for the sample MATLAB code.

The fem.mesh structure (or simply mesh for brevity) contains several fields:

• mesh.p is a 2 × n matrix, where n is the number of nodes in the mesh. The ith column of this matrix contains the (x, y) coordinates of node #i.

• mesh.e is a 7 × n_be matrix, where n_be is the number of element edges on all boundaries: the exterior boundary of the domain and material interfaces. The first and second rows contain the node numbers of the starting and end points of the respective edge. The sixth and seventh rows contain the region (subdomain) numbers on the two sides of the edge. Each region is a geometric entity that usually corresponds to a particular medium, e.g. a dielectric particle or air. Each region is assigned a unique number. By convention, the region outside the computational domain is labeled as zero, which is used in the MATLAB code below to identify the exterior boundary edges and nodes in mesh.e. The remaining rows of this matrix will not be relevant to us here.


• mesh.t is a 4 × n_elems matrix, where n_elems is the number of elements in the mesh. The first three rows contain the node numbers of each element in counterclockwise order. The fourth row is the region number identifying the medium where the element resides.

The second input parameter of the MATLAB code, in addition to the fem structure, is an array of dielectric permittivities by region number. In the T-shaped particle example, region #1 is air, and the particle includes regions #2–#4, all with the same dielectric permittivity. The following sequence of commands could be used to call the FE solver:

% Set parameters:
epsilon_air = 1;
epsilon_particle = 10;
epsilon_array = [epsilon_air epsilon_particle*ones(1, 5)];
% Solve the FE problem
FEM_solve = FEM_triangles (fem, epsilon_array)

The operation of the MATLAB function FEM_triangles below should be clear from the comments in the code and from Sect. 3.8.1.

function FEM_triangles = FEM_triangles (fem, epsilon_array)
% Input parameters:
%   fem -- structure generated by COMSOL.
%          (See comments in the code and text.)
%   epsilon_array -- material parameters by region number.
mesh = fem.mesh;            % duplication for simplicity
n_nodes = length(mesh.p);   % array p has dimension 2 x n_nodes;
                            % contains x- and y-coordinates of the nodes.
n_elems = length(mesh.t);   % array t has dimension 4 x n_elements.
                            % First three rows contain node numbers for each element.
                            % The fourth row contains region number for each element.
% Initialization
rhs = zeros(n_nodes, 1);
global_stiffness_matrix = sparse(n_nodes, n_nodes);
dirichlet = zeros(1, n_nodes);   % flags Dirichlet conditions for the nodes
                                 % (=1 for Dirichlet nodes, 0 otherwise)
% Use COMSOL data on boundary edges to determine Dirichlet nodes:
boundary_edge_data = mesh.e;     % mesh.e contains COMSOL data on element
                                 % edges at the domain boundary
number_of_boundary_edges = size(boundary_edge_data, 2);
for boundary_edge = 1 : number_of_boundary_edges
    % Rows 6 and 7 in the array are region numbers
    % on the two sides of the edge
    region1 = boundary_edge_data(6, boundary_edge);
    region2 = boundary_edge_data(7, boundary_edge);
    % If one of these region numbers is zero, the edge is at the
    % boundary, and the respective nodes are Dirichlet nodes:
    if (region1 == 0) | (region2 == 0)   % boundary edge
        node1 = boundary_edge_data(1, boundary_edge);
        node2 = boundary_edge_data(2, boundary_edge);
        dirichlet(node1) = 1;
        dirichlet(node2) = 1;
    end
end
% Set arrays of nodal coordinates:
for elem = 1 : n_elems               % loop over all elements
    elem_nodes = mesh.t(1:3, elem);  % node numbers for the element
    for node_loc = 1 : 3
        node = elem_nodes(node_loc);
        x_nodes(node) = mesh.p(1, node);
        y_nodes(node) = mesh.p(2, node);
    end
end
% Matrix assembly -- loop over all elements:
for elem = 1 : n_elems
    elem_nodes = mesh.t(1:3, elem);
    region_number = mesh.t(4, elem);
    for node_loc = 1 : 3
        node = elem_nodes(node_loc);
        x_nodes_loc(node_loc) = x_nodes(node);
        y_nodes_loc(node_loc) = y_nodes(node);
    end
    % Get element matrices:
    [stiff_mat, mass_mat] = elem_matrices_2D(x_nodes_loc, y_nodes_loc);
    for node_loc1 = 1 : 3
        node1 = elem_nodes(node_loc1);
        if dirichlet(node1) ~= 0
            continue;
        end
        for node_loc2 = 1 : 3   % symmetry not taken advantage of, to simplify code
            node2 = elem_nodes(node_loc2);
            if dirichlet(node2) == 0   % non-Dirichlet node
                global_stiffness_matrix(node1, node2) = ...
                    global_stiffness_matrix(node1, node2) ...
                    + epsilon_array(region_number) ...
                    * stiff_mat(node_loc1, node_loc2);
            else   % Dirichlet node; update rhs
                rhs(node1) = rhs(node1) - ...
                    stiff_mat(node_loc1, node_loc2) * ...
                    dirichlet_value(x_nodes(node2), y_nodes(node2));
            end
        end
    end
end
% Equations for Dirichlet nodes are trivial:
for node = 1 : n_nodes
    if dirichlet(node) ~= 0   % a Dirichlet node
        global_stiffness_matrix(node, node) = 1;
        rhs(node) = dirichlet_value(x_nodes(node), y_nodes(node));
    end
end
solution = global_stiffness_matrix \ rhs;
% Output fields:
FEM_triangles.fem = fem;                       % record the fem structure
FEM_triangles.epsilon_array = epsilon_array;   % material parameters by region number
FEM_triangles.n_nodes = n_nodes;               % number of nodes in the mesh
FEM_triangles.x_nodes = x_nodes;               % array of x-coordinates of the nodes
FEM_triangles.y_nodes = y_nodes;               % array of y-coordinates of the nodes
FEM_triangles.dirichlet = dirichlet;           % flags for the Dirichlet nodes
FEM_triangles.global_stiffness_matrix = global_stiffness_matrix;   % save matrix for testing
FEM_triangles.rhs = rhs;                       % right hand side for testing
FEM_triangles.solution = solution;             % nodal values of the potential
return;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [stiff_mat, mass_mat] = elem_matrices_2D(x_nodes, y_nodes)
% Compute element matrices for a triangle.
% Input parameters:
%   x_nodes -- x-coordinates of the three nodes, in counter-clockwise order
%   y_nodes -- the corresponding y-coordinates
coord_mat = [x_nodes' y_nodes' ones(3, 1)];   % matrix of nodal coordinates,
                                              % with an extra column of ones
coeffs = inv(coord_mat);              % coefficients of the linear basis functions
grads = coeffs(1:2, :);               % gradients of the linear basis functions
area = 1/2 * abs(det(coord_mat));     % area of the element
stiff_mat = area * grads' * grads;    % the FE stiffness matrix
mass_mat = area / 12 * (eye(3) + ones(3, 3));   % the FE mass matrix
return;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function dirichlet_value = dirichlet_value (x, y)
% Set the Dirichlet boundary condition
dirichlet_value = x;   % as a simple example
return;


Fig. 3.26 Second-order triangular element. The six nodes can be labeled with triplets of indexes (k_1, k_2, k_3), k_i = 0, 1, 2. Each node has the corresponding basis function ℓ_{k_1}(λ_1) ℓ_{k_2}(λ_2) ℓ_{k_3}(λ_3)

3.8.2 Higher-Order Triangular Elements

The discussion in Sect. 3.8.1 suggests that in a triangular element the barycentric variables λ (p. 106) form a natural set of coordinates (albeit not independent, as their sum is equal to unity). For first-order elements, the barycentric coordinates themselves double as the basis functions. They can also be used to generate FE bases for higher-order triangular elements.

A second-order element has three corner nodes #1–#3 and three midpoint nodes (Fig. 3.26). All six nodes can be labeled with triplets of indexes (k_1, k_2, k_3); each index k_i increases from 0 to 1 to 2 along the edges toward node i (i = 1, 2, 3). To each node, there corresponds an FE basis function that is a second-order polynomial in λ with the Kronecker delta property. The explicit expression for this polynomial is ℓ_{k_1}(λ_1) ℓ_{k_2}(λ_2) ℓ_{k_3}(λ_3). For example, the basis function corresponding to node (0, 1, 1)—the midpoint node at the bottom—is ℓ_1(λ_2) ℓ_1(λ_3). Indeed, it is the Lagrange polynomial ℓ_1 that is equal to one at the midpoint and to zero at the corner nodes of a given edge, and it is the barycentric coordinates λ_{2,3} that vary (linearly) along the bottom edge.

This construction can be generalized to elements of order p. Each side of the triangle is subdivided into p segments; the nodes of the resulting triangular grid are again labeled with triplets of indexes, and the corresponding basis functions are defined in the same way as above. Details can be found in the FE monographs cited at the end of the chapter.

3.9 The Finite Element Method in Three Dimensions

Tetrahedral elements, by analogy with triangular ones in 2D, afford the greatest flexibility in representing geometric shapes and are therefore the most common type in many applications. Hexahedral elements are also frequently used. This section


describes the main features of tetrahedral elements; further information about elements of other types can be found in specialized FE books (Sect. 3.16).

Due to a direct analogy between tetrahedral and triangular elements (Sect. 3.8), results for tetrahedra are presented below without further ado. Let the coordinates of the four nodes be x_i, y_i, z_i (i = 1, 2, 3, 4). A typical linear basis function—say, ψ_1—is ψ_1 = a_1 x + b_1 y + c_1 z + d_1 with some coefficients a_1, b_1, c_1, d_1. The Kronecker delta property is desired:

\[
\begin{aligned}
a_1 x_1 + b_1 y_1 + c_1 z_1 + d_1 &= 1 \\
a_1 x_2 + b_1 y_2 + c_1 z_2 + d_1 &= 0 \\
a_1 x_3 + b_1 y_3 + c_1 z_3 + d_1 &= 0 \\
a_1 x_4 + b_1 y_4 + c_1 z_4 + d_1 &= 0
\end{aligned}
\tag{3.96}
\]

Equivalently, in matrix–vector form,

\[
X f_1 = e_1, \qquad
X = \begin{pmatrix}
x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1 \\
x_4 & y_4 & z_4 & 1
\end{pmatrix}; \quad
f_1 = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{pmatrix}; \quad
e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\tag{3.97}
\]

with similar relationships for the other three basis functions. In compact notation,

\[
X F = I, \qquad
F = \begin{pmatrix}
a_1 & a_2 & a_3 & a_4 \\
b_1 & b_2 & b_3 & b_4 \\
c_1 & c_2 & c_3 & c_4 \\
d_1 & d_2 & d_3 & d_4
\end{pmatrix}
\tag{3.98}
\]

where I is the 4 × 4 identity matrix. The coefficients of the basis functions thus are

\[
F = X^{-1}
\tag{3.99}
\]

The determinant of X is equal to 6V, where V is the volume of the tetrahedron (assuming that the nodes are numbered in a way that produces a positive determinant). The basis functions can be found from (3.99), say, by Cramer's rule. Since the basis functions are linear, their gradients are constants. The sum of the basis functions is unity, for the same reason as for triangular elements:

\[
\psi_1 + \psi_2 + \psi_3 + \psi_4 = 1
\tag{3.100}
\]

The sum of the gradients is zero:

\[
\nabla\psi_1 + \nabla\psi_2 + \nabla\psi_3 + \nabla\psi_4 = 0
\tag{3.101}
\]


Functions ψ_{1,2,3,4} are identical with the barycentric coordinates λ_{1,2,3,4} of the tetrahedron. They have a geometric interpretation as ratios of tetrahedral volumes—an obvious analog of the similar property for triangles (Fig. 3.17). The element stiffness matrix for first-order elements is (noting that the gradients are constant)

\[
(\nabla\lambda_i, \nabla\lambda_j) \;\equiv\; \int \nabla\lambda_i \cdot \nabla\lambda_j \, dV
\;=\; \nabla\lambda_i \cdot \nabla\lambda_j \; V, \qquad i, j = 1, 2, 3, 4
\tag{3.102}
\]

where the integration is over the tetrahedron and V is its volume. The element mass matrix (the Gram matrix of the basis functions) turns out to be

\[
M \;=\; \frac{V}{20}
\begin{pmatrix}
2 & 1 & 1 & 1 \\
1 & 2 & 1 & 1 \\
1 & 1 & 2 & 1 \\
1 & 1 & 1 & 2
\end{pmatrix}
\tag{3.103}
\]

which follows from the formula

\[
\int \lambda_1^{i} \lambda_2^{j} \lambda_3^{k} \lambda_4^{l} \, dV
\;=\; \frac{i!\, j!\, k!\, l!}{(i + j + k + l + 3)!}\; 6V
\tag{3.104}
\]

for any nonnegative integers i, j, k, l. Higher-order tetrahedral elements are constructed in direct analogy with the triangular ones (Sect. 3.8.2). The second-order tetrahedron has ten nodes (four main vertices and six edge midpoints); the cubic tetrahedral element has 20 nodes (two additional nodes per edge subdividing it into three equal segments, and four nodes at the barycenters of the faces). Detailed descriptions of tetrahedral elements, as well as first- and high-order elements of other shapes (hexahedra, triangular prisms and others) are easy to find in FE monographs (Sect. 3.16).
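A short sketch (an addition to the text, with arbitrary sample coordinates) that computes the first-order tetrahedral element matrices from (3.99), (3.102) and (3.103), mirroring the 2D construction used in the sample code of Sect. 3.8.1:

% First-order tetrahedral element matrices from F = inv(X).
nodes = [0 0 0; 1 0 0; 0 1 0; 0 0 1];        % 4 x 3 array of nodal coordinates (example)
X = [nodes, ones(4, 1)];                     % matrix X of (3.97)
F = inv(X);                                  % coefficients of the basis functions, (3.99)
grads = F(1:3, :);                           % constant gradients, one column per function
V = abs(det(X)) / 6;                         % volume of the tetrahedron
stiff = V * (grads' * grads);                % element stiffness matrix, (3.102)
mass  = V/20 * (eye(4) + ones(4));           % element mass matrix, (3.103)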

3.10 Approximation Accuracy in FEM

Theoretical considerations summarized in Sect. 3.5 show that the accuracy of the finite element solution is directly linked to, and primarily depends on, the approximation accuracy. In particular, for symmetric elliptic forms L, the Galerkin solution is actually the best approximation of the exact solution in the sense of the L-norm (usually interpreted as an energy norm). In the case of a continuous elliptic, but not necessarily symmetric, form, the solution error depends also on the ellipticity and continuity constants, according to Céa's theorem; however, the approximation error is still key. The same is true in the general case of continuous but not necessarily symmetric or elliptic forms; then the so-called Ladyzhenskaya–Babuška–Brezzi


(LBB) condition relates the solution error to the approximation error via the inf-sup constant (Sect. 3.10.1). In all cases, the central role of FE approximation is clear. The main theoretical results on approximation accuracy in FEM are summarized below. But first, let us consider a simple intuitive 1D picture. The exact solution (solid line in Fig. 3.27) is approximated on a FE grid of size h; several finite elements (e) are shown in the figure. The most natural and easy-to-analyze form of approximation is interpolation, with the exact and approximating functions sharing the same nodal values on the grid. The FE solution of a boundary value problem in general will not interpolate the exact one, although there is a peculiar case where it does (see Appendix on p. 123). However, due to Céa’s theorem (or Galerkin error minimization or the LBB condition, whichever may be applicable), the smallness of the interpolation error guarantees the smallness of the solution error. It is intuitively clear from Fig. 3.27 that the interpolation error decreases as the mesh size becomes smaller. The error will also decrease if higher-order interpolation, say piecewise-quadratic, is used. (Higher-order nodal elements have additional nodes that are not shown in the figure.) If the derivative of the exact solution is only piecewise-smooth, the approximation will not suffer as long as the points of discontinuity, typically material interfaces, coincide with some of the grid nodes. The accuracy will degrade significantly if a material interface passes through a finite element. For this reason, FE meshes in any number of dimensions are generated in such a way that each element lies entirely within one medium. For curved material boundaries, this is strictly speaking possible only if the elements themselves are curved; nevertheless, approximation of curved boundaries by piecewise-planar element surfaces is often adequate in practice. P. G. Ciarlet and P. A. Raviart gave the following general and powerful mathematical characterization of interpolation accuracy [CR72]. Let a finite set of points in Rⁿ be given, and let the polynomial Iu interpolate a given function u, in the Lagrange or Hermite sense, over that set. Notably, the only significant assumption in the Ciarlet–Raviart theory is the uniqueness of such a polynomial. Then

$$\sup\{\, \|D^m u(x) - D^m (Iu)(x)\| \,;\; x \in K \,\} \;\le\; C\, M_{p+1}\, \frac{h^{p+1}}{\rho^m}, \qquad 0 \le m \le p \tag{3.105}$$

Here K is the closed convex hull of the point set; h is the diameter of K; p is the maximum order of the interpolating polynomial; M_{p+1} = sup{‖D^{p+1}u(x)‖; x ∈ K}; ρ is the supremum of the diameters of spheres inscribed in K; and C is a constant. While the result is applicable to abstract sets, in the FE context K is a finite element (as a geometric figure).


Fig. 3.27 Piecewise-linear FE interpolation of the exact solution

Let us examine the factors that the error depends upon. M_{p+1}, being the magnitude of the (p + 1)st derivative of u, characterizes the level of smoothness of u; naturally, the polynomial approximation is better for smoother functions. The geometric factor can be split up into shape and size components:

$$\frac{h^{p+1}}{\rho^m} \;=\; \left(\frac{h}{\rho}\right)^{m} h^{\,p+1-m}$$

The ratio h/ρ is dimensionless and depends only on the shape of K; we shall return to the dependence of FE errors on element shape in Sect. 3.14. The following observations about the second factor, h^{p+1−m}, can be made:

• Example: the maximum interpolation error by linear polynomials is O(h²) (p = 1, m = 0). The error in the first derivative is asymptotically larger, O(h) (p = 1, m = 1).
• The interpolation error behaves as a power function of the element size h but depends exponentially on the interpolation order p, provided that the exact solution has at least p + 1 derivatives.
• The interpolation accuracy is lower for higher-order derivatives (parameter m).

Most of these observations make clear intuitive sense. A related result is cited in Sect. 4.4.4.
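A quick 1D experiment, included here only as an illustration of the orders predicted by (3.105) for p = 1 (the function, interval and grid sizes below are arbitrary choices, not from the book), confirms the O(h²) behavior of the error in function values and the O(h) behavior for the first derivative:

```python
# Sketch: observed convergence orders of piecewise-linear interpolation in 1D.
import numpy as np

u, du = np.sin, np.cos        # a smooth test function and its derivative

for n in (16, 32, 64, 128):
    x = np.linspace(0.0, np.pi, n + 1)
    h = x[1] - x[0]
    xf = np.linspace(0.0, np.pi, 20 * n + 1)           # fine evaluation points
    err0 = np.max(np.abs(np.interp(xf, x, u(x)) - u(xf)))
    slope = np.diff(u(x)) / h                          # slope of the interpolant per element
    err1 = np.max(np.abs(slope - du(x[:-1])))          # derivative error at element endpoints
    print(f"h = {h:.4f}   |u - Iu| = {err0:.2e}   |u' - (Iu)'| = {err1:.2e}")
# Halving h reduces the first error by ~4 (O(h^2)) and the second by ~2 (O(h)).
```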

3.10.1 Appendix: The Ladyzhenskaya–Babuška–Brezzi Condition

For elliptic forms, the Lax–Milgram theorem guarantees well-posedness of the weak problem and Céa’s theorem relates the error of the Galerkin solution to the approximation error (Sect. 3.5). For non-elliptic forms, the Ladyzhenskaya–Babuška–Brezzi


(LBB) condition plays a role similar to the Lax–Milgram–Céa results, although the analysis is substantially more involved. Conditions for the well-posedness of the weak problem were derived independently by O. A. Ladyzhenskaya, I. Babuška and F. Brezzi [Lad69, BA72, Bre74]. In addition, the Babuška and Brezzi theories provide error estimates for the numerical solution. Unfortunately, the LBB condition is in many practical cases not easy to verify. As a result, less rigorous criteria are common in engineering practice; for example, the “patch test” that is not considered in this book but is easy to find in the FE literature (e.g. O. C. Zienkiewicz et al. [ZTZ05]). Non-rigorous conditions should be used with caution; I. Babuška and R. Narasimhan [BN97] give an example of a finite element formulation that satisfies the patch test but not the LBB condition. They also show, however, that convergence can still be established in that case, provided that the input data (and hence the solution) are sufficiently smooth. A mathematical summary of the LBB condition is given below for reference. It is taken from the paper by J. Xu and L. Zikatanov [XZ03]. Let U and V be two Hilbert spaces, with inner products (·, ·)_U and (·, ·)_V, respectively. Let B(·, ·): U × V → R be a continuous bilinear form:

$$B(u, v) \;\le\; \|B\| \, \|u\|_U \, \|v\|_V \tag{3.106}$$

Consider the following variational problem: Find u ∈ U such that

$$B(u, v) = \langle f, v \rangle, \quad \forall v \in V \tag{3.107}$$

where f ∈ V* (the space of continuous linear functionals on V) and ⟨·, ·⟩ is the usual pairing between V* and V. . . . problem (3.107) is well posed if and only if the following conditions hold . . .:

$$\inf_{u \in U} \sup_{v \in V} \frac{B(u, v)}{\|u\|_U \, \|v\|_V} \;>\; 0 \tag{3.108}$$

Furthermore, if (3.108) holds, then

$$\inf_{u \in U} \sup_{v \in V} \frac{B(u, v)}{\|u\|_U \, \|v\|_V}
\;=\; \inf_{v \in V} \sup_{u \in U} \frac{B(u, v)}{\|u\|_U \, \|v\|_V}
\;\equiv\; \alpha \;>\; 0 \tag{3.109}$$

and the unique solution of (3.107) satisfies

$$\|u\|_U \;\le\; \frac{\|f\|_{V^*}}{\alpha} \tag{3.110}$$

. . . Let U_h ⊂ U and V_h ⊂ V be two nontrivial subspaces of U and V, respectively. We consider the following variational problem: Find u_h ∈ U_h such that

$$B(u_h, v_h) = \langle f, v_h \rangle, \quad \forall v_h \in V_h \tag{3.111}$$

. . . problem (3.111) is uniquely solvable if and only if the following conditions hold:

$$\inf_{u_h \in U_h} \sup_{v_h \in V_h} \frac{B(u_h, v_h)}{\|u_h\|_U \, \|v_h\|_V}
\;=\; \inf_{v_h \in V_h} \sup_{u_h \in U_h} \frac{B(u_h, v_h)}{\|u_h\|_U \, \|v_h\|_V}
\;\equiv\; \alpha_h \;>\; 0 \tag{3.112}$$

(End of quote from J. Xu and L. Zikatanov [XZ03].)


The LBB result, slightly strengthened by Xu and Zikatanov, for the Galerkin approximation is

Theorem 4 Let (3.106), (3.108) and (3.112) hold. Then

$$\|u - u_h\|_U \;\le\; \frac{\|B\|}{\alpha_h} \, \inf_{w_h \in U_h} \|u - w_h\|_U \tag{3.113}$$
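For finite-dimensional trial and test spaces of equal dimension, the discrete inf-sup constant α_h of (3.112) can be evaluated numerically once the matrix of the bilinear form and the Gram matrices of the two bases are available: α_h is the smallest singular value of the scaled matrix M_V^{-1/2} B M_U^{-1/2}. The sketch below (illustrative, with assumed matrix names; not a procedure from the book) implements this evaluation:

```python
# Sketch: discrete inf-sup constant from B[i, j] = B(phi_j, psi_i) and Gram matrices.
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def inf_sup_constant(B, M_U, M_V):
    L_U = cholesky(M_U, lower=True)           # M_U = L_U L_U^T
    L_V = cholesky(M_V, lower=True)
    S = solve_triangular(L_V, B, lower=True)  # L_V^{-1} B
    S = solve_triangular(L_U, S.T, lower=True).T   # ... times L_U^{-T}
    return np.linalg.svd(S, compute_uv=False).min()

# Toy check: for an SPD form and an orthonormal basis, alpha_h is the
# smallest eigenvalue of the "stiffness" matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
B = A @ A.T + 5 * np.eye(5)
M = np.eye(5)
print(inf_sup_constant(B, M, M))
print(np.linalg.eigvalsh(B).min())
```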

3.10.2 Appendix: A Peculiar Case of Finite Element Approximation

The curious special case considered in this Appendix is well known to expert mathematicians but much less so to applied scientists and engineers. I am grateful to B. A. Shoykhet for drawing my attention to this case many years ago and to D. N. Arnold for insightful comments and for providing a precise reference, the 1974 paper by J. Douglas and T. Dupont [DD74, p. 101]. Consider the 1D Poisson equation

$$-\frac{d^2 u}{dx^2} = f(x), \qquad \Omega = [a, b]; \qquad u(a) = u(b) = 0 \tag{3.114}$$

where the zero Dirichlet conditions are imposed for simplicity only. Let us examine the finite element solution u_h of this equation using first-order elements. The Galerkin problem for u_h on a chosen mesh is

$$(u_h', \, v_h') \;=\; (f, \, v_h), \quad \forall v_h \in P_0^h \tag{3.115}$$

where the primes denote derivatives and P_0^h is the space of continuous functions that are linear within each element (segment) of the chosen grid and satisfy the zero Dirichlet conditions. The inner products are those of L². We know from Sect. 3.3.1 that the Galerkin solution is the best approximation (in P_0^h) of the exact solution u_*, in the sense of minimum “energy” (u_h' − u_*', u_h' − u_*'). Geometrically, it is the best (in the same energy sense) representation of the curve u_*(x) by a piecewise-linear function compatible with a given mesh. Surprisingly, in the case under consideration the best approximation actually interpolates the exact solution; in other words, the nodal values of the exact and numerical solutions are the same. In reference to Fig. 3.27, approximation of the exact solution (solid line) by the piecewise-linear interpolant (dotted line) on a fixed grid cannot be improved by shifting the dotted line up or down a bit.

Proof Let us treat v_h in the Galerkin problem (3.115) for u_h as a generalized function (distribution; see Appendix 6.15).27 Then

27 The reviewer of this book noted that in a purely mathematical text the use of distributional derivatives would not be appropriate without presenting a rigorous theory first. However, distributions (Dirac delta functions in particular) make our analysis here much more elegant and simple. I rely on the familiarity of applied scientists and engineers—the intended audience of this book—with delta functions, even if the usage is not backed up by full mathematical rigor.


$$-\langle u_h, \, v_h'' \rangle \;=\; (f, \, v_h), \quad \forall v_h \in P_0^h$$

where the angle brackets denote a linear functional acting on u_h, and v_h'' is the second distributional derivative of v_h. This transformation of the left-hand side is simply due to the definition of the distributional derivative. The right-hand side is transformed in a similar way, after noting that f = −u'', where u is the exact solution of the Poisson equation. We obtain ⟨u_h, v_h''⟩ = ⟨u, v_h''⟩, or

$$\langle u_h - u, \; v_h'' \rangle \;=\; 0, \quad \forall v_h \in P_0^h \tag{3.116}$$

It remains to be noted that v_h' is a piecewise-constant function28 and hence v_h'' is a set of Dirac delta functions residing at the grid nodes. This makes it obvious that (3.116) is satisfied if and only if u_h indeed interpolates the exact solution at the nodes of the grid. □

Exactness of the FE solution at the grid nodes is an extreme particular case of the more general phenomenon of superconvergence: the accuracy of the FE solution at certain points (e.g. element nodes or barycenters) is asymptotically higher than the average accuracy. The large body of research on superconvergence includes books, conference proceedings and many journal publications.29
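The nodal exactness just proved is easy to observe numerically. The short script below (an illustration, not the book's code) solves −u″ = 1 with homogeneous Dirichlet conditions by first-order elements on a random non-uniform grid; the computed nodal values coincide with the exact solution u(x) = x(1 − x)/2 to machine precision:

```python
# Sketch: nodal exactness of linear FE for the 1D Poisson equation.
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(np.concatenate(([0.0, 1.0], rng.random(8))))   # non-uniform grid
h = np.diff(x)
n = len(x) - 2                                              # interior nodes

A = np.zeros((n, n))
b = np.zeros(n)
for i in range(n):                      # tridiagonal stiffness matrix of hat functions
    A[i, i] = 1.0 / h[i] + 1.0 / h[i + 1]
    if i + 1 < n:
        A[i, i + 1] = A[i + 1, i] = -1.0 / h[i + 1]
    b[i] = 0.5 * (h[i] + h[i + 1])      # (f, phi_i) with f = 1, computed exactly

u_h = np.linalg.solve(A, b)
u_ex = x[1:-1] * (1.0 - x[1:-1]) / 2.0
print(np.max(np.abs(u_h - u_ex)))       # ~1e-16: exact at the grid nodes
```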

28 With zero mean due to the Dirichlet boundary conditions for v_h, but otherwise arbitrary.
29 M. Křížek, P. Neittaanmäki and R. Stenberg, eds., Finite Element Methods: Superconvergence, Post-Processing, and A Posteriori Estimates, Lecture Notes in Pure and Applied Mathematics, vol. 196, Marcel Dekker: New York, 1998. L. B. Wahlbin, Superconvergence in Galerkin Finite Element Methods, Berlin; New York: Springer-Verlag, 1995. M. Křížek, Superconvergence phenomena on three-dimensional meshes, Int. J. of Num. Analysis and Modeling, vol. 2, pp. 43–56, 2005.

3.11 An Overview of System Solvers

The finite element method leads to systems of equations with large matrices—in practice, the dimension of the system can range from thousands to millions. When the method is applied to differential equations, the matrices are sparse because each basis function is local and spans only a few neighboring elements; nonzero entries in the FE matrices correspond to the overlapping supports of the neighboring basis functions. (The situation is different when FEM is applied to integral equations. The integral operator is nonlocal and typically all unknowns in the system of equations


are coupled; the matrix is full. Integral equations are considered in this book only in passing.)

Fig. 3.28 Matrix sparsity structure as a graph: an example

The sparsity (adjacency) structure of a matrix is conveniently described as a graph. For an n × n matrix, the graph has n nodes.30 To each nonzero entry a_{ij} of the matrix there corresponds the graph edge i–j. If the structure of the matrix is not symmetric, it is natural to deal with a directed graph and distinguish between edges i → j and j → i (each of them may or may not be present in the graph, independently of the other one). Symmetric structures can be described by undirected graphs. As an example, the directed graph corresponding to the matrix

$$\begin{pmatrix} 2 & 0 & 3 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ -1 & 0 & 0 & 3 \end{pmatrix} \tag{3.117}$$

is shown in Fig. 3.28. For simplicity, the diagonal entries of the matrix are always tacitly assumed to be nonzero and are not explicitly represented in the graph. An important question in finite difference and finite element analysis is how to solve such large sparse systems effectively. One familiar approach is Gaussian elimination of the unknowns one by one. As the simplest possible illustration, consider a system of two equations of the form

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \;=\;
\begin{pmatrix} f_1 \\ f_2 \end{pmatrix} \tag{3.118}$$

For the natural order of elimination of the unknowns (x₁ eliminated from the first equation and substituted into the others, etc.) and for a nonzero a₁₁, we obtain x₁ = (f₁ − a₁₂x₂)/a₁₁ and

$$\left(a_{22} - a_{21} a_{11}^{-1} a_{12}\right) x_2 \;=\; f_2 - a_{21} a_{11}^{-1} f_1 \tag{3.119}$$
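The same elimination idea extends verbatim to an n × n system: eliminate one unknown, update the remaining block, and repeat. The sketch below (a generic textbook algorithm written for illustration, not code from the book) performs these steps without pivoting; the update of the trailing block is exactly where the fill-in discussed next appears. It is applied to the sample matrix (3.117):

```python
# Sketch: Gaussian elimination as repeated "eliminate and update" (outer-product LU).
import numpy as np

def lu_no_pivoting(A):
    """Plain LU factorization A = L U (no pivoting; assumes nonzero leading minors)."""
    A = A.astype(float).copy()
    n = A.shape[0]
    L = np.eye(n)
    for k in range(n):
        L[k + 1:, k] = A[k + 1:, k] / A[k, k]                  # multipliers
        # update of the trailing block: this is where fill-in is created
        A[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], A[k, k + 1:])
        A[k + 1:, k] = 0.0
    return L, np.triu(A)

A = np.array([[ 2., 0., 3., 1.],
              [ 1., 1., 0., 0.],
              [ 0., 0., 4., 0.],
              [-1., 0., 0., 3.]])
L, U = lu_no_pivoting(A)
print(np.allclose(L @ U, A))                                   # True
print(np.count_nonzero(U) - np.count_nonzero(np.triu(A)))      # fill-in created in U
```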

30 For matrices arising in finite difference or finite element methods, the nodes of the graph typically correspond to mesh nodes; otherwise graph nodes are abstract mathematical entities.


This simple result looks innocuous at first glance but in fact foreshadows a problem with the elimination process. Suppose that in the original system (3.118) the diagonal entry a₂₂ is zero. In the transformed system (3.119) this is no longer so: The entry corresponding to x₂ (the only entry in the remaining 1 × 1 matrix) is a₂₂ − a₂₁a₁₁⁻¹a₁₂. Such transformation of zero matrix entries into nonzeros is called “fill-in.” For the simplistic example under consideration, this fill-in is of no practical consequence. However, for large sparse matrices, fill-in tends to accumulate in the process of Gaussian elimination and becomes a serious complication. In our 2 × 2 example with a₂₂ = 0, the fill-in disappears if the order of equations (or equivalently the sequence of elimination steps) is changed:

$$\begin{pmatrix} a_{21} & 0 \\ a_{11} & a_{12} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \;=\;
\begin{pmatrix} f_2 \\ f_1 \end{pmatrix}$$

Obviously, x₁ is now found immediately from the first equation, and x₂ is computed from the second one, with no additional nonzero entries created in the process. In general, permutations of rows and columns of a sparse matrix may have a dramatic effect on the amount of fill-in, and hence on the computational cost and memory requirements, in Gaussian elimination. Gaussian elimination is directly linked to matrix factorization into lower- and upper-triangular factors. More specifically, the first factorization step can be represented in the following form:

$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & & & \\ \vdots & & A_1 & \\ a_{n1} & & & \end{pmatrix}
\;=\;
\begin{pmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & & & \\ \vdots & & L_1 & \\ l_{n1} & & & \end{pmatrix}
\begin{pmatrix} u_{11} & u_{12} & \dots & u_{1n} \\ 0 & & & \\ \vdots & & U_1 & \\ 0 & & & \end{pmatrix} \tag{3.120}$$

The fact that this factorization is possible (and even not unique) can be verified by direct multiplication of the factors in the right-hand side. This yields, for the first diagonal element, first column and first row, respectively, the following conditions:

$$l_{11} u_{11} = a_{11}$$
$$l_{21} u_{11} = a_{21}, \quad l_{31} u_{11} = a_{31}, \;\dots,\; l_{n1} u_{11} = a_{n1}$$
$$l_{11} u_{12} = a_{12}, \quad l_{11} u_{13} = a_{13}, \;\dots,\; l_{11} u_{1n} = a_{1n}$$

where n is the dimension of matrix A. Fixing l₁₁ by, say, setting it equal to one defines the column vector l₁ = (l₁₁, l₂₁, ..., l_{n1})ᵀ and the row vector u₁ᵀ = (u₁₁, u₁₂, ..., u_{1n}) unambiguously:

$$l_{11} = 1; \qquad u_{11} = a_{11} \tag{3.121}$$

$$l_{21} = u_{11}^{-1} a_{21}, \quad l_{31} = u_{11}^{-1} a_{31}, \;\dots,\; l_{n1} = u_{11}^{-1} a_{n1} \tag{3.122}$$


$$u_{12} = a_{12}, \quad u_{13} = a_{13}, \;\dots,\; u_{1n} = a_{1n} \tag{3.123}$$

Further, the condition for the matrix blocks L₁ and U₁ follows directly from factorization (3.120):

$$L_1 U_1 + l_1 u_1^T = A_1$$

or equivalently

$$L_1 U_1 = \tilde{A}_1, \qquad \text{where} \qquad \tilde{A}_1 \equiv A_1 - l_1 u_1^T$$

The updated matrix Ã₁ is a particular case of the Schur complement (R. A. Horn and C. R. Johnson [HJ90], Y. Saad [Saa03]). Explicitly, the entries of Ã₁ can be written as

$$\tilde{a}_{1,ij} \;=\; a_{ij} - l_{i1} u_{1j} \;=\; a_{ij} - a_{i1} a_{11}^{-1} a_{1j} \tag{3.124}$$

Thus, the first step of Gaussian factorization A = LU is accomplished by computing the first column of L (3.121), (3.122), the first row of U (3.121), (3.123) and the updated block Ã₁ (3.124). The factorization step is then repeated for Ã₁, etc., until (at the nth stage) the trivial case of a 1 × 1 matrix results. Theoretically, it can be shown that this algorithm succeeds as long as all leading minors of the original matrix are nonzero. In practical computation, however, care should be taken to ensure computational stability of the process (see below). Once the matrix is factorized, solution of the original system of equations reduces to forward elimination and backward substitution, i.e. to solving systems with the triangular matrices L and U, which is straightforward. An important advantage of Gaussian elimination is that, once matrix factorization has been performed, equations with the same matrix but multiple right-hand sides can be solved at the very little cost of forward elimination and backward substitution only. Let us review a few computational aspects of Gaussian elimination.

1. Fill-in. The matrix update formula (3.124) clearly shows that a zero matrix entry a_{ij} can become nonzero in the process of LU factorization. The 2 × 2 example considered above is the simplest possible case of such fill-in. A quick look at the matrix update Eq. (3.124) shows how the fill-in is reflected in the directed sparsity graph. If at some step of the process node k is being eliminated, any two edges i → k and k → j produce a new edge i → j (corresponding to a new nonzero matrix entry ij). This is reminiscent of the usual “head-to-tail” rule of vector addition. Figure 3.29 may serve as an illustration. Similar considerations apply for symmetric sparsity structures represented by undirected graphs. Methods to reduce fill-in are discussed below.

2. The computational cost. For full matrices, the number of arithmetic operations (multiplications and additions) in LU factorization is approximately 2n³/3. For sparse matrices, the cost depends very strongly on the adjacency structure and


Fig. 3.29 Block arrows indicate fill-in created in a matrix after elimination of unknown #1

can be reduced dramatically by clever permutations of rows and columns of the matrix and other techniques reviewed later in this section.31 3. Stability. Detailed analysis of LU factorization (J. H. Wilkinson [Wil94], G. H. Golub and C.F. Van Loan [GL96], G. E. Forsythe and C. B. Moler [FM67], N. J. Higham [Hig02]) shows that numerical errors (due to roundoff) can accumulate if the entries of L and U grow. Such growth can, in turn, be traced back to small diagonal elements arising in the factorization process. To rectify the problem, the leading diagonal element at each step of factorization is maximized either via complete pivoting – reshuffling of rows and columns of the remaining matrix block—or via partial pivoting—reshuffling of rows only. The existing theoretical error estimates for both types of pivoting are much more pessimistic than practical experience indicates.32 In fact, partial pivoting works so well in practice that it is used almost exclusively: Higher stability of complete pivoting is mostly theoretical but its higher computational cost is real. Likewise, orthogonal factorizations such as Q R, while theoretically more stable than LU factorization, are hardly ever used as system solvers because their computational cost is approximately

the O(n 3 ) operation count is not asymptotically optimal for solving large systems with full matrices of size n × n. In 1969, V. Strassen discovered a trick for computing the product of two 2 × 2 block matrices with seven block multiplications instead of eight that would normally be needed [Str69]. When applied recursively, this idea leads to O(n γ ) operations, with γ = log2 7 ≈ 2.807. Theoretically, algorithms with γ as low as 2.375 now exist, but they are computationally unstable and have very large numerical prefactors that make such algorithms impractical. I. Kaporin has developed practical (i.e. stable and faster than straightforward multiplication for matrices of moderate size) algorithms with the asymptotic operation count O(N 2.7760 ) [Kap04]. Note that solution of algebraic systems with full matrices can be reduced to matrix multiplication (V. Pan [Pan84]). See also S. Robinson [Rob05] and H. Cohn et al. [CKSU05]. 32 J. H. Wilkinson [Wil61] showed that for complete pivoting the growth factor for the numerical error does not exceed 31 Incidentally,

n 1/2 (21 × 31/2 × 41/3 × . . . × n 1/(n−1) )1/2 ∼ Cn 0.25 log n (which is ∼ 3500 for n = 100 and ∼ 8.6 × 106 for n = 1000). In practice, however, there are no known matrices with this growth factor higher than n. For partial pivoting, the bound is 2n−1 , and this bound can in fact be reached in some exceptional cases.


twice that of LU .33 L. N. Trefethen [Tre85] gives very interesting comments on this and related matters. Remarkably, the modern use of Gaussian elimination can be traced back to a 1948 paper by A. M. Turing34 [Tur48, Bri92]. N. J. Higham writes [Hig02, pp. 184–185]: “ [Turing] formulated the . . . LDU factorization of a matrix, proving [that the factorization exists and is unique if all leading minors of the matrix are nonzero] and showing that Gaussian elimination computes an LDU factorization. He introduced the term “condition number” . . . He used the word “preconditioning” to mean improving the condition of a system of linear equations (a term that did not come into popular use until the 1970s). He described iterative refinement for linear systems. He exploited backward error ideas. . . . he analyzed Gaussian elimination with partial pivoting for general matrices and obtained [an error bound]. ”

The case of sparse symmetric positive definite (SPD) systems has been studied particularly well, for two main reasons. First, such systems are very common and important in both theory and practice. Second, it can be shown that the factorization process for SPD matrices is always numerically stable (A. George and J. W. H. Liu [GL81], G. H. Golub and C. F. Van Loan [GL96], G. E. Forsythe and C. B. Moler [FM67]). Therefore, one need not be concerned with pivoting (permutations of rows and columns in the process of factorization) and can concentrate fully on minimizing the fill-in. The general case of nonsymmetric and/or nonpositive definite matrices will not be reviewed here but is considered in several monographs: books by O. Østerby and Z. Zlatev [sZZ83], by I. S. Duff et al. [DER89], by T. A. Davis [Dav06]. The remainder of this section deals exclusively with the SPD case and is, in a sense, a digest of the excellent treatise by A. George and J. W. H. Liu [GL81]. For SPD matrices, it is easy to show that in the LU factorization U can be taken as L T , leading to Cholesky factorization L L T already mentioned on p. 111. Cholesky decomposition has a small overhead of computing the square roots of the diagonal entries of the matrix; this overhead can be avoided by using the L DL T factorization instead (where D is a diagonal matrix). Methods for reducing fill-in are based on reordering of rows and columns of the matrix, possibly in combination with block partitioning. Let us start with the permutation algorithms. The simplest case where the sparsity structure can be exploited is that of banded matrices. The band implies part of the matrix between two subdiagonals parallel to the main diagonal or, more precisely, the set of entries with indexes i, j such that −k1 ≤ i − j ≤ k2 , where k1,2 are nonnegative integers. A matrix is banded if its entries are all zero outside a certain band (in practice, usually k1 = k2 = k). 33 Q R

algorithms are central in eigenvalue solvers; see Appendix 8.17. Mathison Turing (1912–1954, www.turing.org.uk/turing), the legendary inventor of the Turing machine and the Bombe device that broke, with an improvement by Gordon Welchman, the German Enigma codes during World War II. (Very interesting Youtube videos are available from Numberphile: Enigma Machine www.youtube.com/watch?v=G2_Q9FoD-oQ and Flaw in the Enigma Code www.youtube.com/watch?v=V4V2bpZlqx8.) Also well known is the Turing test that defines a “sentient” machine. Overall, Turing lay the foundation of modern computer science.

34 Alan


Fig. 3.30 Symmetric sparsity structure as a graph: an example

The importance of this notion for Gaussian (or Cholesky) elimination lies in the easily verifiable fact that the band structure is preserved during factorization; i.e., no additional fill is created outside the band. Cholesky decomposition for a band matrix requires approximately k(k + 3)n/2 multiplicative operations, which for k  n is much smaller than the number of operations needed for the decomposition of a full matrix n × n. A very useful generalization is to allow the width of the band to vary row by row: k = k(i). Such a variable-width band is called an envelope. Figures 3.22 and 3.23 may serve as a helpful illustration. Again, no fill is created outside the envelope. Since the minimal envelope is obviously a subset of the minimal band, the computational cost of the envelope algorithm is generally lower than that of the band method.35 The operation count for the envelope method can be found in George and Liu’s book [GL81], along with a detailed description and implementation of the Reverse Cuthill–McKee ordering algorithm that reduces the envelope size. There is no known algorithm that would minimize the computational cost and/or memory requirements for a matrix with any given sparsity structure, even if pivoting is not involved, and whether or not the matrix is SPD. D. J. Rose and R. E. Tarjan [RT75] state (but do not include the proof) that this problem for a non-SPD matrix is NP-complete and conjecture that the same is true in the SPD case. However, powerful heuristic algorithms are available, and the underlying ideas are clear from adjacency graph considerations. Figure 3.30 shows a small fragment of the adjacency graph; thick lines in Fig. 3.31 represent the corresponding fill-in if node #1 is eliminated first. These figures are very similar to Figs. 3.28 and 3.29, except that the graph for a symmetric structure is unordered. Elimination of a node couples all the nodes to which it is connected. If nodes 2, 3 and 4 were to be eliminated prior to node 1, there would be no fill-in in this fragment of the graph. This simple example has several ramifications. First, a useful heuristic is to start the elimination with the graph vertices that have the fewest number of neighbors, i.e. the minimum degree. (Degree of a vertex is the number of edges incident to it.) The minimum degree algorithm, first introduced by W. F. Tinney and J. W. Walker [TW67], is quite useful and effective in practice, although there is of course no guarantee that local minimization of fill-in at each 35 I

disregard the small overhead related to storage and retrieval of matrix entries in the band and envelope.


Fig. 3.31 Fill-in (block arrows) created in a matrix with symmetric sparsity structure after elimination of unknown #1

step of factorization will lead to global optimization of the whole process. George and Liu [GL81] describe the Quotient Minimum Degree (QMD) method, an efficient algorithmic implementation of MD in the SPARSPAK package that they developed. Second, it is obvious from Fig. 3.31 that elimination of the root of a tree in a graph is disastrous for the fill-in. The opposite is true if one starts with the leaves of the tree. This observation may not seem practical at first glance, as adjacency graphs in FEM are very far from being trees.36 What makes the idea useful is block factorization and partitioning. Suppose that graph G (or, almost equivalently, the finite element mesh) is split into two parts G₁ and G₂ by a separator S, so that G = G₁ ∪ G₂ ∪ S and G₁ ∩ G₂ = ∅; this corresponds to block partitioning of the system matrix. The partitioning has a tree structure, with the separator as the root and G₁,₂ as the leaves. The system matrix has the following block form:

$$L \;=\; \begin{pmatrix} L_{G1} & 0 & L_{G1,S} \\ 0 & L_{G2} & L_{G2,S} \\ L_{G1,S}^T & L_{G2,S}^T & L_S \end{pmatrix} \tag{3.125}$$

Elimination of block L G1 leaves the zero blocks unchanged, i.e. does not—on the block level—generate any fill in the matrix. For comparison, if the “root” block L S were eliminated first (quite unwisely), zero blocks would be filled. George and Liu [GL81, GL89] describe two main partitioning strategies: OneWay Dissection (1WD) and Nested Dissection (ND). In 1WD, the graph is partitioned by several dissecting lines that are, if viewed as geometric objects on the FE mesh, approximately “parallel.”37 Taken together, the separators form the root of a tree structure for the block matrix; the remaining disjoint blocks are the leaves of the tree. Elimination of the leaves generates fill-in in the root block, which is acceptable as long as the size of this block is moderate. To get an idea about the computational savings 36 For first-order elements in FEM, the mesh itself can be viewed as the sparsity graph of the system matrix, element nodes corresponding to graph vertices and element edges to graph edges. For a 2D triangular mesh with n nodes, the number of edges is approximately 2n, whereas for a tree it is n − 1. 37 The separators need not be straight lines, as their construction is topological (based on the sparsity graph) rather than geometric. The word “parallel” therefore should not be taken literally.


of 1WD as compared to the envelope method, one may consider an m × l rectangular grid (m < l) in 2D38 and optimize the number of operations or, alternatively, memory requirements with respect to the chosen number of separators, each separator√being a grid line with m nodes. The end result is that the memory in 1WD can be ∼ 6/m times smaller than for the envelope method [GL81]. For example, if m = 100, the √ savings are by about a factor of four ( 6/100 ≈ 0.25). A typical ND separator in 2D can geometrically be pictured as two lines, horizontal and vertical, that split the graph into four approximately equal parts. The procedure is then applied recursively to each of the disjoint subgraphs. For a regular m × m grid in 2D, one can write a recursive relationship for the amount of computer memory MND (m) needed for ND; this ultimately yields [GL81] MND (m) =

(31/4) m² log₂ m + O(m²)

Hence, for 2D problems, ND is asymptotically almost optimal in terms of its memory requirements: The memory is proportional to the number of nodes times a relatively mild logarithmic factor. However, the computational cost is not optimal even for 2D meshes: The number of multiplicative operations is approximately 829 3 m + O(m 2 log2 m) 84 That is, the computational cost grows as the number of nodes n to the power of 1.5. Performance of direct solvers further deteriorates in three dimensions. For example, the computational cost and memory for ND scale as O(n 2 ) and O(n 4/3 ), respectively, when the number of nodes n is large. Some improvement has been achieved by combining the ideas of 1WD, ND and QMD, with a recursive application of multisection partitioning of the graph. These algorithms are implemented in the SPOOLES software package39 developed by C. Ashcraft, R. Grimes, J. Liu and others [AL98, AG99]. For illustration, Fig. 3.32 shows the number of nonzero entries in the Cholesky factor for several ordering algorithms as a function of the number of nodes in the finite element mesh. This data is for the scalar electrostatic equation in a cubic domain; Nested Dissection and one of the versions of Multistage Minimum Degree from the SPOOLES package perform better than other methods in this case.40 The limitations of direct solvers for 3D finite element problems are apparent, the main bottleneck being memory requirements due to the fill in the Cholesky factor (or the LU factors in the nonsymmetric case): tens of millions of nonzero entries for meshes of fairly moderate size, tens of thousands of nodes. The difficulties are 38 A

similar estimate can also be easily obtained for 3D problems, but in that case 1WD is not very efficient. 39 SParse Object Oriented Linear Equations Solver, netlib.org/linalg/spooles/spooles.2.2.html. 40 I thank Cleve Ashcraft for his detailed replies to my questions on the usage of SPOOLES 2.2.


Fig. 3.32 Comparison of memory requirements (number of nonzero entries in the Cholesky factor) as a function of the number of finite element nodes for the scalar electrostatic equation in a cubic domain. Algorithms: Quotient Minimum Degree, Nested Dissection and two versions of Multistage Minimum Degree from the SPOOLES package

exacerbated in vector problems, in particular the ones that arise in electromagnetic analysis in 3D. Therefore for many 3D problems, and for some large 2D problems, iterative solvers are indispensable, their key advantage being a very limited amount of extra memory required.41 In comparison with direct solvers, iterative ones are arguably more diverse, more dependent on the algebraic properties of matrices and would require a more wide-ranging review and explanation. Instead of including it in this chapter, I refer the reader to the excellent monographs and review papers on iterative solvers by Y. Saad and H. A. van der Vorst [Saa03, vdV03b, SvdV00], L. A. Hageman and D. M. Young [You03, HY04], O. Axelsson [Axe96], W. Hackbusch [Hac18].
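The effect of ordering on fill-in is easy to reproduce in a few lines. The toy example below (illustrative only; not from the book) mirrors the "root of a tree" discussion above: a star-shaped (arrowhead) SPD matrix factored with the hub node first fills in completely, while the same matrix with the hub node ordered last produces no fill at all.

```python
# Sketch: ordering versus fill-in for a symmetric positive definite arrowhead matrix.
import numpy as np

n = 8
hub_first = np.eye(n) * n
hub_first[0, 1:] = hub_first[1:, 0] = -1.0    # node 0 coupled to everyone ("tree root")

perm = list(range(1, n)) + [0]                # reorder: eliminate the hub last
hub_last = hub_first[np.ix_(perm, perm)]

def factor_nonzeros(A, tol=1e-12):
    L = np.linalg.cholesky(A)
    return int(np.count_nonzero(np.abs(L) > tol))

print(factor_nonzeros(hub_first))   # dense factor: n(n+1)/2 = 36 nonzeros
print(factor_nonzeros(hub_last))    # no fill at all: 2n - 1 = 15 nonzeros
```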

41 Typically several auxiliary vectors in Krylov subspaces and sparse preconditioners need to be stored; see references below.

3.12 Electromagnetic Problems and Edge Elements

3.12.1 Why Edge Elements?

In electromagnetic analysis and a number of other areas of physics and engineering, the unknown functions are often vector rather than scalar fields. A straightforward finite element model would involve approximation of the Cartesian components of the fields. This approach was historically the first to be used and is still in use today. However, it has several flaws—some of them obvious and some hidden.


An obvious drawback is that nodal element discretization of the Cartesian components of a field leads to a continuous approximation throughout the computational domain. This is inconsistent with the discontinuity of some field components, in particular the normal components of E and H, at material boundaries. The treatment of such conditions by nodal elements is possible but inelegant: The interface nodes are “doubled,” and each of the two coinciding nodes carries the field value on one side of the interface boundary. Constraints then need to be imposed to couple the Cartesian components of the field at the double nodes. Although this difficulty is more of a nuisance than a serious obstacle for implementing the component-wise formulation, it is also an indication that something may be “wrong” with this formulation on a more fundamental level (more about that below). So-called spurious modes—the hidden flaw of the component-wise treatment—were noted in the late 1970s and provide further evidence of some fundamental limitations of Cartesian approximation. These modes are frequently branded as “notorious,” and indeed hundreds of papers on this subject have been published and are still being published.42 As a representative example, consider the computation of the eigenfrequencies ω and the corresponding electromagnetic field modes in a cavity resonator. The resonator is modeled as a simply connected domain Ω with perfectly conducting walls ∂Ω. The governing equation for the electric field is

$$\nabla \times \mu^{-1} \nabla \times \mathbf{E} \;-\; \omega^2 \epsilon \mathbf{E} \;=\; 0 \ \text{ in } \Omega; \qquad \mathbf{n} \times \mathbf{E} = 0 \ \text{ on } \partial\Omega \tag{3.126}$$

where the standard notation for the electromagnetic material parameters μ, ε and for the exterior normal n to the domain boundary ∂Ω is used. The ideally conducting walls cause the tangential component of the electric field to vanish on the boundary. Mathematically, the proper functional space for this problem is H₀(curl, Ω), the space of square-integrable vector functions with a square-integrable curl and a vanishing tangential component at the boundary:

$$H_0(\mathrm{curl}, \Omega) \;\equiv\; \{\mathbf{E} : \mathbf{E} \in \mathbf{L}^2(\Omega),\; \nabla \times \mathbf{E} \in \mathbf{L}^2(\Omega),\; \mathbf{n} \times \mathbf{E} = 0 \text{ on } \partial\Omega\} \tag{3.127}$$

The weak formulation is obtained by inner-multiplying the eigenvalue equation by an arbitrary test function E′ ∈ H₀(curl, Ω):

$$(\nabla \times \mu^{-1} \nabla \times \mathbf{E},\, \mathbf{E}') \;-\; \omega^2 (\epsilon \mathbf{E},\, \mathbf{E}') \;=\; 0, \quad \forall \mathbf{E}' \in H_0(\mathrm{curl}, \Omega) \tag{3.128}$$

where the inner product is that of L²(Ω), i.e.

42 891 references in the ISI database for the term “spurious modes” as of June 2019, compared to 457 such references before 2008, when the first edition of this book was published. This reference search does not include alternative relevant terminology such as spectral convergence, spurious-free approximation, “vector parasites,” etc., so the actual number of papers is much greater.


$$(\mathbf{X}, \mathbf{Y}) \;\equiv\; \int_\Omega \mathbf{X} \cdot \mathbf{Y} \, d\Omega$$

for vector fields X and Y in H₀(curl, Ω). Using the vector calculus identity

$$\nabla \cdot (\mathbf{X} \times \mathbf{Y}) \;=\; \mathbf{Y} \cdot \nabla \times \mathbf{X} \;-\; \mathbf{X} \cdot \nabla \times \mathbf{Y} \tag{3.129}$$

with X = μ⁻¹∇ × E, Y = E′, Eq. (3.128) can be integrated by parts to yield

$$(\mu^{-1} \nabla \times \mathbf{E},\, \nabla \times \mathbf{E}') \;-\; \omega^2 (\epsilon \mathbf{E},\, \mathbf{E}') \;=\; 0, \quad \forall \mathbf{E}' \in H_0(\mathrm{curl}, \Omega) \tag{3.130}$$

(It is straightforward to verify that the surface integral resulting from the application of the divergence theorem to the left-hand side of (3.129) vanishes, due to the fact that n × E′ = 0 on the wall.) The discrete problem is obtained by restricting E and E′ to a finite element subspace of H₀(curl, Ω); a “good” way of constructing such a subspace is the main theme of this section. The mathematical theory of convergence for the eigenvalue problem (3.130) is quite involved and well beyond the scope of this book43; however, some uncomplicated but instructive observations can be made. The continuous eigenproblem in its strong form (3.126) guarantees, for nonzero frequencies, zero divergence of the D vector (D = εE). This immediately follows by applying the divergence operator to the equation. For the weak formulation (3.130), the zero-divergence condition is satisfied in the generalized form (see Appendix 3.17):

$$(\epsilon \mathbf{E}, \, \nabla \phi') \;=\; 0 \tag{3.131}$$

This follows by using, as a particular case, an arbitrary curl-free test function E′ = ∇φ′ in (3.130).44 It is now intuitively clear that the divergence-free condition will be correctly imposed in the discrete (finite element) formulation if the FE space contains a “sufficiently dense”45 population of gradients E′ = ∇φ′. This argument was articulated for the first time (to the best of my knowledge) by A. Bossavit in 1990 [Bos90]. From this viewpoint, a critical deficiency of component-wise nodal approximation is that the corresponding FE space does not ordinarily contain “enough” gradients. The reason for that can be inferred from Fig. 3.33 (2D illustration for simplicity). Suppose that there exists a function φ vanishing outside a small cluster of elements and such that its gradient is in P₁³, i.e. continuous throughout the computational domain and linear within each element. It is clear that φ must be a piecewise-quadratic function of coordinates. Furthermore, since ∇φ vanishes on the outer side of edge 23, due to the continuity of the gradient along that edge φ can only vary in proportion

43 References: the book by P. Monk [Mon03], papers by P. Monk, L. Demkowicz, D. Boffi, S. Caorsi and their collaborators: [MD01, BFea99, Bof01, BDC03, Bof07, BCD+11, BG19, CFR00].
44 The equivalence between curl-free fields and gradients holds true for simply connected domains.
45 The quotation marks are used as a reminder that this analysis does not have full mathematical rigor.


Fig. 3.33 Fragment of a 2D finite element mesh. A piecewise-quadratic function φ vanishes outside a cluster of elements. For ∇φ to be continuous, φ must be proportional to n₂₃² within element 123 and to n₃₄² within element 134. However, these quadratic functions are incompatible on the common edge 13, unless the normals n₂₃ and n₃₄ are parallel

to n 223 within element 123, where n 23 is the normal to edge 23. Similarly, φ must be proportional to n 234 in element 134. However, these two quadratic functions are incompatible along the common edge 13 of these two elements, unless the normals n 23 and n 34 are parallel. This observation illustrates very severe constraints on the construction of irrotational continuous vector fields that would be piecewise-linear on a given FE mesh. As a result, the FE space does not contain a representative set of gradients for the divergence-free condition to be enforced even in weak form. This failure to impose the zero-divergence condition on the D vector usually leads to nonphysical solutions. Additional considerations are given in my paper [Tsu03a]. The arguments presented above are insightful but from a rigorous mathematical perspective incomplete. A detailed analysis can be found in the literature cited in Footnote 43. For our purposes, the important conclusion is that the lack of spectral convergence (i.e. the appearance of “spurious modes”) is inherent in componentwise finite element approximation of vector fields. Attempts to rectify the situation by imposing additional constraints on the divergence, penalty terms, etc., are counterproductive. A radical improvement can be achieved by using edge elements described in Sect. 3.12.2 below. As we shall see, the approximation provided by these elements is, in a sense, more “physical” than the component-wise representation of vector fields; the corresponding mathematical structures also prove to be quite elegant.

3.12.2 The Definition and Properties of Whitney–Nédélec Elements

As became apparent in Sect. 3.8.1 and in Sect. 3.9, a natural coordinate system for triangular and tetrahedral elements is formed by the barycentric coordinates λα


(α = 1, 2, 3 for triangles and α = 1, 2, 3, 4 for tetrahedra). Each function λ is linear and equal to one at one of the nodes and zero at all other nodes. Since the barycentric coordinates play a prominent role in the finite element approximation of scalar fields, it is sensible to explore how they can be used to approximate vector fields as well, and not in the component-wise sense.

Remark 5 The most mathematically sound framework for the material of this section is provided by the treatment of physical fields as differential forms rather than vector fields. A large body of material, well written and educational, can be found on A. Bossavit’s website.46 Other references are cited in Sect. 3.12.4 and in Sect. 3.16. While differential geometry is a standard tool for mathematicians and theoretical physicists, it is not so for many engineers and applied scientists. For this reason, only regular vector calculus is used in this section and in the book in general; this is sufficient for our purposes.

Natural “vector offspring” of the barycentric coordinates are the gradients ∇λα. These, however, are constant within each element and can therefore represent only piecewise-constant and, even more importantly, only irrotational vector fields. Next, we may consider the products ψ^{6−12}_{αβ} = λα∇λβ; it is sufficient to restrict them to α ≠ β because the gradients are linearly dependent, Σα ∇λα = 0. The superscript “6–12” indicates that there are six such functions for a triangle and 12 for a tetrahedron. A little later, we shall consider a two times smaller set ψ^{3−6}_{αβ}.

It almost immediately transpires that these new vector functions have one of the desired properties: Their tangential components are continuous across element facets (edges for triangles and faces for tetrahedra), while their normal components are in general discontinuous. The most elegant way to demonstrate the tangential continuity is by noting that the generalized curl ∇ × ψ^{6−12}_{αβ} = ∇ × (λα∇λβ) = ∇λα × ∇λβ is a regular function, not only a distribution, because the λs are continuous.47 (A jump in the tangential component would result in a Dirac delta term in the curl; see Appendix 3.17 and Formula (3.216) in particular.) The tangential components can also be examined more explicitly. The circulation of ψ^{6−12}_{αβ} over the corresponding edge αβ is

$$\int_{\text{edge } \alpha\beta} \psi^{6-12}_{\alpha\beta} \cdot \hat{\tau}_{\alpha\beta} \, d\tau
\;=\; \int_{\text{edge } \alpha\beta} \lambda_\alpha \nabla\lambda_\beta \cdot \hat{\tau}_{\alpha\beta} \, d\tau
\;=\; \nabla\lambda_\beta \cdot \hat{\tau}_{\alpha\beta} \int_{\text{edge } \alpha\beta} \lambda_\alpha \, d\tau
\;=\; \frac{1}{l_{\alpha\beta}} \cdot \frac{1}{2}\, l_{\alpha\beta} \;=\; \frac{1}{2} \tag{3.132}$$

 edge αβ

6−12 ψαβ · τˆαβ dτ =

edge αβ

 = ∇λβ · τˆαβ

edge αβ

λα dτ =

λα ∇λβ · τˆαβ dτ

1 1 1 lαβ = lαβ 2 2

(3.132)

where τˆαβ is the unit edge vector pointing from node α to node β, and lαβ is the edge length. In the course of the transformations above, it was taken into account that (i) 46 http://lgep.geeps.centralesupelec.fr/index.php?page=alain-bossavit; last accessed on 10 May 2020. Bossavit is one of the key developers and proponents of edge element analysis. 47 Here each barycentric coordinate is viewed as a function defined in the whole domain, continuous everywhere but nonzero only over a cluster of elements sharing the same node.



∇λβ is a (vector) constant, (ii) λα is a function varying from zero to one linearly along the edge, so that the component of its gradient along the edge is 1/lαβ and the mean value of λα over the edge αβ is 1/2. 6−12 Thus, the circulation of each function ψαβ is equal to 1/2 over its respective edge αβ and (as is easy to see) zero over all other edges. One type of edge element is defined by introducing (i) the functional space spanned 6−12 basis, and (ii) a set of degrees of freedom, two per edge: the tangential by the ψαβ components E αβ of the field (say, electric field E) at each node α along each edge αβ emanating from that node. The number of degrees of freedom and the dimension of the functional space are six for triangles and 12 for tetrahedra. It is not difficult to verify that the space in fact coincides with the space of linear vector functions within the element. A major difference, however, is that the basis functions for edge elements are only tangentially continuous, in contrast with fully continuous component-wise approximation by nodal elements. The FE representation of the field within the edge element is  6−12 E αβ ψαβ Eh = α=β 6−12 An interesting alternative is obtained by observing that each pair of functions ψαβ , 6−12 ψβα have similar properties: Their circulations along the respective edge (but taken in the opposite directions) are the same, and their curls are opposite. It makes sense to combine each pair into one new function as 3−6 6−12 6−12 ≡ ψαβ − ψβα = λα ∇λβ − λβ ∇λα ψαβ

(3.133)

6−12 3−6 It immediately follows from the properties of ψαβ that the circulation of ψαβ is one along its respective edge (in the direction from node α to node β) and zero along all other edges. The FE representation of the field is almost the same as before

Eh =



3−6 cαβ ψαβ

α=β

except that summation is now over a twice smaller set of basis functions, one per edge: Three for triangles and six for tetrahedra; cαβ are the circulations of the field along the edges. Figure 3.34 helps to visualize two such functions for a triangular element; for tetrahedra, the nature of these functions is similar. Their rotational character is obvious from the figure, the curls being equal to 3−6 = 2∇λα × ∇λβ ∇ × ψαβ

The (generalized) divergence of these vector basis functions (see Appendix 3.17) is also of interest:



3−6 3−6 Fig. 3.34 Two basis functions ψ 3−6 visualized for a triangular element: ψ23 (left) and ψ12 (right)

3−−6 ∇ · ψαβ = λα ∇ 2 λβ − λβ ∇ 2 λα

When viewed as regular functions within each element, the Laplacians in the righthand side are zero because the barycentric coordinates are linear functions. However, these Laplacians are nonzero in the sense of distributions and contain Dirac delta terms on the interelement boundaries due to the jumps of the normal component of the gradients of λ. Disregard of the distributional term has in the past been the source of two misconceptions about edge elements: 1. The basis set ψ 3−6 allegedly cannot be used to approximate fields with nonzero divergence. However, if this were true, linear elements, by similar considerations, could not be used to solve the Poisson equation with a nonzero right-hand side because the Laplacian of the linear basis functions is zero within each element. 2. Since the basis functions have zero divergence, spurious modes are eliminated. While the conclusion is correct, the justification would only be valid if divergence were zero in the distributional sense. Furthermore, there are families of edge elements that are not divergence-free and yet do not produce spurious modes. Rigorous mathematical analysis of spectral convergence is quite involved (see footnote 43).

3.12.3 Implementation Issues As already noted on p. 135, the finite element formulation of the cavity resonance problem (3.130) is obtained by restricting E and E to a finite element subspace Wh ⊂ H0 (curl, ) (μ−1 ∇ × Eh , ∇ × Eh  ) − ω 2 (Eh , Eh  ) = 0, ∀E ∈ Wh

(3.134)

Subspace Wh can be spanned by either of the two basis sets introduced in the previous section for tetrahedral elements (one or two degrees of freedom per edge)



or, alternatively, by higher-order tetrahedral bases or bases on hexahedral elements (Sect. 3.12.4). In the algorithmic implementation of the procedure, the role of the edges is analogous to the role of the nodes for nodal elements. In particular, the matrix sparsity structure is determined by the edge-to-edge adjacency: For any two edges that do not belong to the same element, the corresponding matrix entry is zero. An excellent source of information on adjacency structures and related algorithms (albeit not directly in connection with edge elements) is S. Pissanetzky’s monograph [Pis84]. A new algorithmic issue, with no analogs in node elements, is the orientation of the edges, as the sign of field circulations depends on it. To make orientations consistent between several elements sharing the same edge, it is convenient to use global node numbers in the mesh. One suitable convention is to define the direction from the smaller global node number to the greater one as positive.
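As a small illustration of these conventions (a sketch with arbitrary node coordinates, not code from the book), the script below constructs the first-order Whitney functions λα∇λβ − λβ∇λα on a tetrahedron, orients every edge from the smaller to the larger node number as suggested above, and verifies numerically that each function has unit circulation along its own edge and zero circulation along the other five:

```python
# Sketch: circulation property of the lowest-order Whitney edge functions.
import numpy as np
from itertools import combinations

nodes = np.array([[0., 0., 0.], [2., 0., 0.], [0., 1., 0.], [0., 0., 3.]])
F = np.linalg.inv(np.hstack([nodes, np.ones((4, 1))]))   # barycentric coefficients
grads = F[:3, :].T                                        # constant grad(lambda_i)

def lam(i, p):
    return F[:3, i] @ p + F[3, i]

def whitney(a, b, p):
    return lam(a, p) * grads[b] - lam(b, p) * grads[a]

def circulation(a, b, c, d, npts=200):
    """Line integral of w_ab along edge c -> d (midpoint rule)."""
    t = (np.arange(npts) + 0.5) / npts
    pts = nodes[c] + np.outer(t, nodes[d] - nodes[c])
    tangent = nodes[d] - nodes[c]                         # includes the edge length
    return sum(whitney(a, b, p) @ tangent for p in pts) / npts

edges = list(combinations(range(4), 2))                   # oriented small -> large
for (a, b) in edges:
    print([round(circulation(a, b, c, d), 3) for (c, d) in edges])  # 6x6 identity
```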

3.12.4 Historical Notes on Edge Elements In 1980 and 1986, J.-C. Nédélec proposed two families of tetrahedral and hexahedral edge elements [Néd80, Néd86]. For tetrahedral elements, Nédélec’s six- and twelve-dimensional approximation spaces are spanned by the vector basis functions λα ∇λβ − λβ ∇λα and λα ∇λβ , respectively, as discussed in the previous section. Nédélec’s exposition is formally mathematical and rooted heavily in the calculus of differential forms. As a result, there was for some time a disconnect between the outstanding mathematical development and its use in the engineering community. To applied scientists and engineers, finite element analysis starts with the basis functions. This makes practical sense because one cannot actually solve an FE problem without specifying a basis. Many practitioners would be surprised to hear that a basis is not part of the standard mathematical definition of a finite element. In the mathematical literature, a finite element is defined, in addition to its geometric shape, by a (finite-dimensional) approximation space and a set of degrees of freedom—linear functionals over that approximation space (see e.g. the classical book by P. G. Ciarlet [Cia80]). Nodal values are the most typical such functionals, but there certainly are other possibilities as well. As we already know, in Nédélec’s elements the linear functionals are circulations of the field along the edges. Nédélec built upon related ideas of P.-A. Raviart and J. M. Thomas who developed special finite elements on triangles in the late 1970s [RT77]. It took almost a decade to transform edge elements from a mathematical theory into a practical tool. A. Bossavit’s contribution in that regard is exceptional. He presented, in a very lucid way, the fundamental rationale for edge elements [Bos88b, Bos88a] and developed their applications to eddy current problems [BV82, BV83], scattering [BM89], cavity resonances [Bos90], force computation [Bos92] and other



areas. Stimulated by prior work of P. R. Kotiuga48 and the mathematical papers of J. Dodziuk [Dod76], W. Müller [Mül78] and J. Komorowski [Kom75], Bossavit discovered a link between the tetrahedral edge elements with six degrees of freedom and differential forms in the 1957 theory of H. Whitney [Whi57]. Nédélec’s original papers did not explicitly specify any bases for the FE spaces. Since practical computation does rely on the bases, the engineering and computational electromagnetics communities in the late 1980s and in the 1990s devoted much effort to more explicit characterization of edge element spaces. A detailed description of various types of elements would lead us too far astray, as this book is not a treatise on electromagnetic finite element analysis. However, to give the reader a flavor of some developments in this area and to provide a reference point for the experts, succinct definitions of several common edge element spaces are compiled in Appendix 3.12.5 (see also [Tsu03b]). Further information can be found in the monographs by P. Monk [Mon03], J. Jin [Jin02] and J. L. Volakis et al. [VCK98]. Comparative analysis of edge element spaces by symbolic algebra can be found in [Tsu03b]. Families of hierarchical and adaptive elements developed independently by J. P. Webb [WF93, Web99, Web02] and by L. Vardapetyan and L. Demkowicz [VD99] deserve to be mentioned separately. In hierarchical refinement, increasingly accurate FE approximations are obtained by adding new functions to the existing basis set. This can be done both in the context of h-refinement (reducing the element size and adding functions supported by smaller elements to the existing functions on larger elements) and p-refinement (adding, say, quadratic functions to the existing linear ones). Hierarchical and adaptive refinement are further discussed in Sect. 3.13 for the scalar case. The vectorial case is much more complex, and I defer to the papers cited above for additional information. One more paper by Webb [Web93] gives a concise but very clear exposition of edge elements and their advantages.

3.12.5 Appendix: Several Common Families of Tetrahedral Edge Elements Several representative families of elements, with the corresponding bases, are listed below. The list is definitely not exhaustive; for example, Demkowicz–Vardapetyan elements with hp-refinement and R. Hiptmair’s general perspective on high-order edge elements are not included. My paper [Tsu03b] shows how various edge element families and the respective functional spaces can be compared and analyzed using symbolic algebra. In the expressions below, λi , as before, is the barycentric coordinate corresponding to node i (i = 1,2,3,4) of a tetrahedral element.

48 Kotiuga was apparently the first to note, in his 1985 Ph.D. thesis, the connection of finite element analysis in electromagnetics with the fundamental branches of mathematics: differential geometry and algebraic topology.



1. The Ahagon–Kashimoto basis (20 functions) [AK95].  {4λ1 (λ2 ∇λ3 − λ3 {12 “edge” functions (4λi − 1)(λi ∇λ j − λ j ∇λi ), i= j} ∇λ2 ), 4λ2 (λ3 ∇λ1 − λ1 ∇λ3 ), 4λ1 (λ3 ∇λ4 − λ4 ∇λ3 ), 4λ4 (λ1 ∇λ3 − λ3 ∇λ1 ), 4λ1 (λ2 ∇λ4 − λ4 ∇λ2 ), 4λ2 (λ1 ∇λ4 − λ4 ∇λ1 ), 4λ2 (λ3 ∇λ4 − λ4 ∇λ3 ), 4λ4 (λ2 ∇λ3 − λ3 ∇λ2 )}. 2. The Lee–Sun–Cendes basis (20 functions) [LSC91]. {12 edge-based functions  λi ∇λ j , i = j} { λ1 λ2 ∇λ3 , λ1 λ3 ∇λ2 , λ2 λ3 ∇λ4 , λ2 λ4 ∇λ3 , λ3 λ4 ∇λ1 , λ3 λ1 ∇λ4 , λ4 λ1 ∇λ2 , λ4 λ2 ∇λ1 }.  3. The Kameari basis (24 functions) [Kam99]. {the Lee basis} { ∇(λ2 λ3 λ4 ), ∇(λ1 λ3 λ4 ), ∇(λ1 λ2 λ4 ), ∇(λ1 λ2 λ3 ) }. 4. The Ren–Ida basis (20 functions) [RI00]. {12 edge-based functions λi ∇λ j ,  i = j} { λ1 λ2 ∇λ3 − λ2 λ3 ∇λ1 , λ1 λ3 ∇λ2 − λ2 λ3 ∇λ1 , λ1 λ2 ∇λ4 − λ4 λ2 ∇λ1 , λ1 λ4 ∇λ2 − λ4 λ2 ∇λ1 , λ1 λ3 ∇λ4 − λ4 λ3 ∇λ1 , λ1 λ4 ∇λ3 − λ3 λ4 ∇λ1 , λ2 λ3 ∇λ4 − λ4 λ3 ∇λ2 , λ2 λ4 ∇λ3 − λ3 λ2 ∇λ4 }.  5. The Savage–Peterson basis [SP96]. {12 edge-based functions λi ∇λ j , i = j} { λi λ j ∇λk − λi λk ∇λ j , λi λ j ∇λk − λ j λk ∇λi , 1 ≤ i < j < k ≤ 4}. 6. The Yioultsis–Tsiboukis basis  (20 functions) [YT97]. {(8λi 2 − 4λi )∇λ j + (−8λi λ j + 2λ j )∇λi , i = j} {16λ1 λ2 ∇λ3 − 8λ2 λ3 ∇λ1 − 8λ3 λ1 ∇λ2 ; 16λ1 λ3 ∇λ2 − 8λ3 λ2 ∇λ1 − 8λ2 λ1 ∇λ3 ; 16λ4 λ1 ∇λ2 − 8λ1 λ2 ∇λ4 − 8λ2 λ4 ∇λ1 ; 16λ4 λ2 ∇λ1 − 8λ2 λ1 ∇λ4 − 8λ1 λ4 ∇λ2 ; 16λ2 λ3 ∇λ4 − 8λ3 λ4 ∇λ2 − 8λ4 λ2 ∇λ3 ; 16λ2 λ4 ∇λ3 − 8λ4 λ3 ∇λ2 − 8λ3 λ2 ∇λ4 ; 16λ3 λ1 ∇λ4 − 8λ1 λ4 ∇λ3 − 8λ4 λ3 ∇λ1 ; 16λ3 λ4 ∇λ1 − 8λ4 λ1 ∇λ3 − 8λ1 λ3 ∇λ4 }. 7. The Webb–Forghani basis(20 functions) [WF93]. {6 edge-based functions  {6 edge-based functions ∇(λi λ j ), i = j} λi ∇λ j − λ j ∇λi , i = j} { λ1 λ2 ∇λ3 , λ1 λ3 ∇λ2 , λ2 λ3 ∇λ4 , λ2 λ4 ∇λ3 , λ3 λ4 ∇λ1 , λ3 λ1 ∇λ4 , λ4 λ1 ∇λ2 , λ4 λ2 ∇λ1 }. 8. The Graglia–Wilton–Peterson basis (20 functions) [GWP97]. { (3λi − 1)(λi ∇λ j  − λ j ∇λi ), i = j} 9/2 × {λ2 (λ3 ∇λ4 − λ4 ∇λ3 ), λ3 (λ4 ∇λ2 − λ2 ∇λ4 ), λ3 (λ4 ∇λ1 − λ1 ∇λ4 ), λ4 (λ1 ∇λ3 − λ3 ∇λ1 ), λ4 (λ1 ∇λ2 − λ2 ∇λ1 ), λ1 (λ4 ∇λ2 − λ2 ∇ λ4 ), λ1 (λ2 ∇λ3 − λ3 ∇λ2 ), λ2 (λ1 ∇λ3 − λ3 ∇λ1 )}.

3.13 Adaptive Mesh Refinement and Multigrid Methods

3.13.1 Introduction

One of the most powerful ideas, which has shaped the development of finite element analysis since the 1980s, is adaptive refinement. Once an FE problem has been solved on a given initial mesh, special a posteriori error estimates or indicators49 are used to identify the subregions with relatively high error. The mesh is then refined in these areas, and the problem is re-solved. It is also possible to "unrefine" the mesh in the regions where the error is perceived to be small. The procedure is then repeated recursively and is typically integrated with efficient system solvers such as multigrid cycles or multilevel preconditioners (Sect. 3.13.4).

49 Estimates provide an approximate numerical value of the actual error. Indicators show whether the error is relatively high or low, without necessarily predicting its numerical value.

There are two main versions of mesh refinement. In h-refinement, the mesh size h is reduced in selected regions to improve the accuracy. In p-refinement, the element-wise order p of local approximating polynomials is increased. The two versions can be combined in an hp-refinement procedure. There are numerous ways of error estimation (Sect. 3.13.3) and numerous algorithms for effecting the refinement. To summarize, adaptive techniques are aimed at generating a quasi-optimal mesh adjusted to the local behavior of the solution, while maintaining a high convergence rate of the iterative solver. Three different but related issues arise:
1. Implementation of local refinement without violating the geometric conformity of the mesh.
2. Efficient multilevel iterative solvers.
3. Local a posteriori error estimates.

Fig. 3.35 Local mesh refinement (2D illustration for simplicity). Left: continuity of the solution at "slave" nodes must be maintained. Right: "green refinement." (Reprinted by permission from [TP99a] © 1999 IEEE.)

Figure 3.35 shows nonconforming ("slave") nodes appearing on a common boundary between two finite elements e1 and e2 if one of these elements (say, e1) is refined and the other one (e2) is not. The presence of such nodes is a deviation from the standard set of requirements on an FE mesh. If no restrictions are imposed, the continuity of the solution at slave nodes will generally be violated. One remedy is a transitory (so-called green) refinement of element e2 (W. F. Mitchell [Mit89, Mit92], F. Bornemann et al. [BEK93]) as shown in Fig. 3.35, right. However, green refinement generally results in non-nested meshes, which may affect the performance of iterative solvers.
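The solve–estimate–mark–refine cycle just described can be illustrated by the following self-contained 1D sketch (mine, not an algorithm from the literature cited here); the simple jump-based indicator, the marking fraction and the peaked source are purely illustrative assumptions.

```python
import numpy as np

def solve_p1(x, f):
    """P1 Galerkin solution of -u'' = f on (0, 1) with u(0) = u(1) = 0 on the nodes x."""
    n = len(x)
    h = np.diff(x)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for k in range(n - 1):                                   # element-by-element assembly
        A[k:k + 2, k:k + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h[k]
        b[k:k + 2] += 0.5 * h[k] * f(0.5 * (x[k] + x[k + 1]))  # midpoint rule for the load
    A[0, :] = 0.0; A[-1, :] = 0.0; A[0, 0] = 1.0; A[-1, -1] = 1.0
    b[0] = 0.0; b[-1] = 0.0                                  # homogeneous Dirichlet conditions
    return np.linalg.solve(A, b)

def indicator(x, u):
    """Crude element indicator: jumps of u' at interior nodes, weighted by the local h."""
    du = np.diff(u) / np.diff(x)
    jump = np.abs(np.diff(du))
    eta = np.zeros(len(x) - 1)
    eta[:-1] += 0.5 * jump
    eta[1:] += 0.5 * jump
    return eta * np.diff(x)

def refine(x, eta, frac=0.3):
    """Bisect the elements carrying the largest error indicators."""
    marked = np.argsort(eta)[-max(1, int(frac * len(eta))):]
    return np.sort(np.concatenate([x, 0.5 * (x[marked] + x[marked + 1])]))

f = lambda s: 1.0 / (0.01 + (s - 0.5) ** 2)                  # source peaked near x = 0.5
x = np.linspace(0.0, 1.0, 11)
for it in range(6):                                          # solve -> estimate -> mark -> refine
    u = solve_p1(x, f)
    eta = indicator(x, u)
    print(f"pass {it}: {len(x):4d} nodes, max indicator {eta.max():.3e}")
    x = refine(x, eta)
```

The nodes cluster where the source (and hence the solution curvature) is largest, which is the qualitative behavior expected of any adaptive procedure.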

3.13.2 Hierarchical Bases and Local Refinement

Fig. 3.36 Fragment of a two-level 1D mesh. (Reprinted by permission from [TP99a] © 1999 IEEE.)

Alternatively, nonconforming nodes may be retained if proper continuity conditions are imposed. This can be accomplished in a natural way in the hierarchical basis (H. Yserentant [Yse86], W. F. Mitchell [Mit89, Mit92], U. Rüde [Rüd93]). A simple 1D example (Fig. 3.36) illustrates the hierarchical basis representation of a function. In the nodal basis, a piecewise-linear function has a vector of nodal values u(N) = (u1, u2, u3, u4, u5, u6)T. Nodes 5 and 6 are generated by refining the coarse-level elements 1–2 and 2–3. In the hierarchical basis, the degrees of freedom at nodes 5 and 6 correspond to the difference between the values on the fine level and the interpolated value from the coarse level. Thus, the vector in the hierarchical basis is

u(H) = (u1, u2, u3, u4, u5 − ½(u1 + u2), u6 − ½(u2 + u3))T        (3.135)

This formula effects the transformation from nodal to hierarchical values of the same piecewise-linear function. More generally, let a few levels of nested FE meshes (in one, two or three dimensions) be generated by recursively subdividing some or all elements on a coarser level into several smaller elements. For simplicity, only first-order nodal elements will be considered and it will be assumed that new nodes are added at the midpoints of the existing element edges. (The ideas are quite general, however, and can be carried over to high-order elements and edge elements; see e.g. P. T. S. Liu and J. P. Webb [LW95], J. P. Webb and B. Forghani [WF93].) The hierarchical representation of a piecewise-linear function can be obtained from its nodal representation by a recursive application of elementary transforms similar to (3.135). Precise theory and implementation are detailed by H. Yserentant [Yse86].

An advantage of the hierarchical basis is the natural treatment of slave nodes (Fig. 3.35, left). The continuity of the solution is ensured by simply setting the hierarchical basis value at these nodes to zero.

Remark 6 In the nonconforming refinement of Fig. 3.35 (left), element shapes do not deteriorate. However, this advantage is illusory. Indeed, the FE space for the "green refinement" of Fig. 3.35 (right) obviously contains the FE space of Fig. 3.35


(left), and therefore, the FE solution with slave nodes cannot be more accurate than for green refinement. Thus, the effective "mesh quality," unfortunately, is not preserved with slave nodes.

For tetrahedral meshes, subdividing an element into smaller ones when the mesh is refined is not trivial; careless subdivision may lead to degenerate elements. S. Y. Zhang [Zha95] proposed two schemes, "labeled edge subdivision" and "short-edge subdivision," guaranteeing that tetrahedral elements do not degenerate in the refinement process. The initial stage of both methods is the same: The edge midpoints of the tetrahedron are connected, producing four corner tetrahedra and a central octahedron. The octahedron can be further subdivided into four tetrahedra in three different ways [Zha95] by using one additional edge. The difference between Zhang's two refinement schemes is in the way this additional edge is chosen. The "labeled edge subdivision" algorithm relies on a numbering convention for nodes being generated (see [Zha95] for details). In the "short-edge subdivision" algorithm the shortest of the three possible interior edges is selected. For tetrahedra without obtuse planar angles between edges both refinement schemes are equivalent, provided that the initial refinement is the same—i.e. for a certain numbering of nodes of the initial element [Zha95]. Zhang points out that "in general, it is not simple to find the measure of degeneracy for a given tetrahedron" [Zha95] and uses as such a measure the ratio of the maximum edge length to the radius of the inscribed sphere. A. Plaks and I used a more precise criterion, the minimum singular value condition (Sect. 3.14), to compare the two refinement schemes. Short-edge subdivision in general proves to be better than labeled edge subdivision [TP99b].
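Returning to the two-level transform (3.135), a minimal sketch of the nodal-to-hierarchical conversion is given below; the `parents` map, the 0-based indexing and the sample values are my own illustrative assumptions, not the book's data structures.

```python
import numpy as np

def nodal_to_hierarchical(u_nodal, parents):
    """Two-level nodal -> hierarchical transform in the spirit of (3.135).

    u_nodal : nodal values; the leading entries belong to the coarse level.
    parents : dict mapping each fine-level node index to the pair of coarse
              nodes whose edge midpoint it is (illustrative data layout).
    """
    u_h = np.array(u_nodal, dtype=float)
    for k, (i, j) in parents.items():
        u_h[k] -= 0.5 * (u_nodal[i] + u_nodal[j])   # subtract the coarse-level interpolant
    return u_h

# The 1D example of Fig. 3.36: nodes 5 and 6 (0-based: 4 and 5) are the
# midpoints of the coarse edges 1-2 and 2-3 (0-based: (0, 1) and (1, 2)).
u_nodal = [1.0, 2.0, 4.0, 3.0, 1.8, 2.9]
print(nodal_to_hierarchical(u_nodal, {4: (0, 1), 5: (1, 2)}))
# last two entries are u5 - (u1 + u2)/2 and u6 - (u2 + u3)/2
```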

3.13.3 A Posteriori Error Estimates

Adaptive hp-refinement requires some information about the distribution of numerical errors in the computational domain. The FE mesh is refined in the regions where the error is perceived to be higher and left unchanged, or even unrefined, in regions with lower errors. Numerous approaches have been developed for estimating the errors a posteriori—i.e. after the FE solution has been found. Some of these approaches are briefly reviewed below; for comprehensive treatment, see monographs by M. Ainsworth and J. T. Oden [AO00], I. Babuška and T. Strouboulis [BS01], R. Verfürth [Ver96], and W. Bangerth and R. Rannacher [BR03]. Much information and many references for this section were provided by S. Prudhomme, the reviewer of this book; his help is greatly appreciated. The overview below follows the book chapter by Prudhomme and Oden [PO02] as well as W. F. Mitchell's paper [Mit89].


3.13.3.1 Recovery-Based Error Estimators

These methods were proposed by O. C. Zienkiewicz and J. Z. Zhu. According to the ISI database, as of June 2019, their 1987 and 1992 papers [ZZ87, ZZ92a, ZZ92b] were cited approximately 1450, 1260 and 640 times, respectively. The essence of the method, in a nutshell, is in field averaging. The computed field within an element is compared with the value obtained by double interpolation: element-to-node first and then node-to-element. The intuitive observation behind this idea is that the field typically has jumps across element boundaries; these jumps are a numerical artifact that can serve as an error indicator. The averaging procedure captures the magnitudes of the jumps. Some versions of the Zienkiewicz–Zhu method rely on superconvergence properties of the FE solution at special points in the elements. For numerical examples and validation of gradient-recovery estimators, see e.g. I. Babuška et al. [BSU+94]. The method is easy to implement and in my experience (albeit limited mostly to magnetostatic problems) works well [TP99a].50 One difficulty is in handling nodes at material interfaces, where the field jump can be a valid physical property rather than a numerical artifact. In our implementation [TP99a] of the Zienkiewicz–Zhu scheme, the field values were averaged at the interface nodes separately for each of the materials involved. Ainsworth and Oden [AO00] note some drawbacks of recovery-based estimators and even present a 1D example where the recovery-based error estimate is zero, while the actual error can be arbitrarily large. Specifically, they consider a 1D Poisson equation with a rapidly oscillating sinusoidal solution. It can be shown (see Appendix 3.10.2) that the FE Galerkin solution with first-order elements actually interpolates the exact solution at the FE mesh nodes. Hence, if these nodes happen to be located at the zeros of the oscillating exact solution, the FE solution, as well as all the gradients derived from it, are identically zero! Prudhomme and Oden also point out that for problems with shock waves gradientrecovery methods tend to indicate mesh refinement around the shock rather than at the shock itself.
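As a rough illustration of the recovery idea (element-to-node averaging of the field, followed by comparison with the raw element-wise values), here is a 1D sketch; it is not the Zienkiewicz–Zhu implementation, and the length-weighted averaging is an assumption of mine.

```python
import numpy as np

def recovery_indicator_1d(x, u):
    """Gradient-recovery indicator for a 1D P1 solution: average the element
    gradients to the nodes (length-weighted), interpolate the recovered gradient
    back over each element, and measure its L2 distance to the raw element gradient."""
    h = np.diff(x)
    g = np.diff(u) / h                                   # element-wise constant gradient
    g_nodes = np.empty(len(x))
    g_nodes[0], g_nodes[-1] = g[0], g[-1]
    g_nodes[1:-1] = (g[:-1] * h[:-1] + g[1:] * h[1:]) / (h[:-1] + h[1:])
    d0 = g_nodes[:-1] - g                                # recovered minus raw at element ends
    d1 = g_nodes[1:] - g
    return np.sqrt(h * (d0**2 + d0 * d1 + d1**2) / 3.0)  # exact L2 norm of a linear function

x = np.linspace(0.0, 1.0, 21)
u = np.sin(np.pi * x)                                    # stand-in for an FE solution
eta = recovery_indicator_1d(x, u)
print("largest indicators on elements", np.argsort(eta)[-3:])
```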

50 Joint work with A. Plaks.

3.13.3.2 Residual-Based Methods

While the solution error is not directly available, the residual (the difference between the right- and left-hand sides of the equation) is. For a problem of the form

Lu = ρ        (3.136)

and the corresponding weak formulation

L(u, v) = (ρ, v)        (3.137)

the residual is

Ruh ≡ ρ − Luh        (3.138)

or, in the weak form,

R(uh, v) ≡ (ρ, v) − L(uh, v)        (3.139)

Symbols L and R here are overloaded (with little possibility of confusion) as operators and the corresponding bilinear forms. The numerical solution uh satisfies the Galerkin equation in the finite-dimensional subspace Vh. In the full space V, residuals (3.138) or (3.139) are, in general, nonzero and can serve as a measure of accuracy. In principle, the error, and hence the exact solution, can be found by solving the problem with the residual in the right-hand side. However, doing so is no less difficult than solving the original problem in the first place. Instead, one looks for computationally inexpensive ways of extracting useful information about the magnitude of the error from the magnitude of the residual.

One of the simplest element-wise error estimators of this kind combines, with proper weights, two residual-related terms: (Lu − ρ)² integrated over the volume (area) of the element and the jump of the normal component of flux density, squared and integrated over the facets of the element (R. E. Bank and A. H. Sherman [BS79]). P. Morin et al. [MNS02] develop convergence theory for adaptive methods with this estimator and emphasize the importance of the volume-residual term that characterizes possible oscillations of the solution.

A different type of method, proposed by I. Babuška and W. C. Rheinboldt in the late 1970s, makes use of auxiliary problems over small clusters ("patches") of adjacent elements [BR78b, BR78a, BR79]. To gain any additional nontrivial information about the error, the auxiliary local problem must be solved with higher accuracy than the original global problem, i.e. the FE space has to be locally enriched (usually using h- or p-refinement). An alternative interpretation (W. F. Mitchell [Mit89]) is that such an estimator measures how strongly the FE solution would change if the mesh were to be refined locally. Yet another possibility is to solve the problem with the residual globally but approximately, using only a few iterations of the conjugate gradient method (Prudhomme and Oden [PO02]).
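A compact 1D sketch of an element indicator of this volume-residual-plus-flux-jump type follows; the weights and the quadrature are illustrative assumptions of mine rather than the Bank–Sherman coefficients.

```python
import numpy as np

def residual_indicator_1d(x, u, f):
    """Element indicator combining a volume-residual term and flux-jump terms:
    eta_K^2 = h_K^2 * ||f||_{L2(K)}^2 + (1/2) * h_K * (jumps of u' at the element ends)^2.
    For P1 elements in 1D, u'' vanishes inside each element, so the interior
    residual reduces to f itself."""
    h = np.diff(x)
    du = np.diff(u) / h
    xm = 0.5 * (x[:-1] + x[1:])
    eta2 = h**2 * (f(xm)**2 * h)                         # midpoint rule for ||f||^2_{L2(K)}
    jump2 = np.diff(du)**2                               # squared jumps at interior nodes
    eta2[:-1] += 0.5 * h[:-1] * jump2
    eta2[1:] += 0.5 * h[1:] * jump2
    return np.sqrt(eta2)

x = np.linspace(0.0, 1.0, 17)
u = x * (1.0 - x)                                        # nodal values of a sample function
print(residual_indicator_1d(x, u, lambda s: 2.0 + 0.0 * s).round(4))
```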

3.13.3.3 Goal-Oriented Error Estimation

In practice, the FE solution is often aimed at finding specific quantities of interest—for example, field, temperature, stress, etc. at a certain point (or points), equivalent parameters (e.g. capacitance or resistance between electrodes), and so on. Naturally, the effort should then be concentrated on obtaining these quantities of interest, rather than the overall solution, with maximum accuracy. Pointwise estimates have a long history dating back at least to the 1940s–1950s (H. J. Greenberg [Gre48]; C. B. Maple [Map50]; K. Washizu [Was53]). The key idea can be briefly summarized as follows. One can express the value of the solution


u at a point r0 using the Dirac delta functional as

u(r0) = ⟨u, δ(r − r0)⟩        (3.140)

(Appendix 6.15 gives an introduction to generalized functions (distributions), with the Dirac delta among them.) Further progress can be made by using Green's function g of the L operator51: Lg(r, r0) = δ(r − r0). Then

u(r0) = (u, Lg(r, r0)) = (L∗u, g(r, r0)) = L∗(u, g(r, r0))        (3.141)

where symbol L∗ is the adjoint operator and (again with overloading) the corresponding bilinear form L∗ (u, v) ≡ L(v, u). The role of Green’s function in this analysis is to convert the delta functional (3.140) that is hard to evaluate directly into an L-form that is closely associated with the problem at hand. The right-hand side of (3.141) typically has the physical meaning of the mutual energy of two fields. For example, if L is the Laplace operator (self-adjoint if the boundary conditions are homogeneous), then the right-hand side is (∇u, ∇g)—the inner product (mutual energy) of fields −∇u (the solution) and −∇g (field of a point source). Importantly, due to the variational nature of the problem, lower and upper bounds can be established for u(r0 ) of (3.141) (A. M. Arthurs [Art80]). Moreover, bounds can be established for the pointwise error as well. In the finite element context (1D), this was done in 1984 by E. C. Gartland [Gar84]. Also in 1984, in a series of papers [BM84a, BM84b, BM84c], I. Babuška and A.D. Miller applied the duality ideas to a posteriori error estimates and generalized the method to quantities of physical interest. In Babuška and Miller’s example of an elasticity problem of beam deformation, such quantities include the average displacement of the beam, the shear force, the bending moment, etc. For a contemporary review of the subject, including both the duality techniques and goal-oriented estimates with adaptive procedures, see R. Becker and R. Rannacher [BR01] and J. T. Oden and S. Prudhomme [OP01]. For electromagnetic applications, methods of this kind were developed by R. Albanese, R. Fresa and G. Rubinacci [AF98, AFR00], by J. P. Webb [Web05] and by P. Ingelstrom and A. Bondeson [IB03].

3.13.3.4 Fully Adaptive Multigrid

In this approach, developed by W. F. Mitchell [Mit89, Mit92] and U. Rüde [Rüd93], solution values in the hierarchical basis (Sect. 3.13.2) characterize the difference between numerical solutions at two subsequent levels of refinement and can therefore serve as error estimators.

51 The functional space where this operator is defined, and hence the boundary conditions, remain fixed in the analysis.


3.13.4 Multigrid Algorithms

The presentation of multigrid methods in this book faces a dilemma. These methods are first and foremost iterative system solvers—the subject matter not in general covered in the book. On the other hand, multigrid methods, in conjunction with adaptive mesh refinement, have become a truly state-of-the-art technique in modern FE analysis and an integral part of commercial FE packages; therefore the chapter would be incomplete without mentioning this subject.

Fortunately, several excellent books exist. The one by W. L. Briggs et al. [BHM00] gives a clear explanation of key ideas and elements of the theory. For a comprehensive exposition of the mathematical theory, the monographs by W. Hackbusch [Hac85], S. F. McCormick [McC89], P. Wesseling [Wes91] and J. H. Bramble [Bra93], as well as the seminal paper by A. Brandt [Bra77], are highly recommended; see also the review paper by C. C. Douglas [Dou96]. On a historical note, the original development of multilevel algorithms is attributed to the work of the Russian mathematicians R. P. Fedorenko [Fed61, Fed64] and N. S. Bakhvalov [Bak66] in the early 1960s. There was an explosion of activity after A. Brandt further developed the ideas and put them into practice [Bra77].

As a guide for the reader unfamiliar with the essence of multigrid methods, this section gives a narrative description of the key ideas, with "hand-waving" arguments only. Consider the simplest possible model 1D equation

Lu ≡ −d²u/dx² = f  on Ω = [0, a];   u(0) = u(a) = 0        (3.142)

where f is a given function of x. FE Galerkin discretization of this problem leads to a system of equations

Lu = f        (3.143)

where u and f are Euclidean vectors and L is a square matrix; u represents the nodal values of the FE solution. For first-order elements, the matrix L is tridiagonal, with 2 on the main diagonal and −1 on the adjacent ones. (The modification of the matrix due to boundary conditions, as described in Sect. 3.7.1, will not be critical in this general overview.)

Operator L has a discrete set of spatial eigenfrequencies and eigenmodes, akin to the modes of a guitar string. As Fig. 3.37 illustrates, the discrete operator L of (3.143) inherits the oscillating behavior of the eigenmodes but has only a finite number of them. There is a Nyquist limit for the highest spatial frequency that can be adequately represented on a grid of size h. Figure 3.37 exhibits the eigenmodes with the lowest and highest frequencies on a uniform grid with 16 elements.

Fig. 3.37 Eigenvectors with lowest (top) and highest (bottom) spatial frequency. Laplace operator discretized on a uniform grid with 16 elements

Any iterative solution process for equation (3.143)—including multigrid solvers—involves an approximation v to the exact solution vector u. The error vector

e ≡ u − v        (3.144)

is of course generally unknown in practice; however, the residual r = f − Lv is computable. It is easy to see that the residual is equal to Le:

r = f − Lv = Lu − Lv = Le        (3.145)

The following sequence of observations leads to the multigrid methodology. 1. High-frequency components of the error – or, equivalently, of the residual— (similar to the bottom part of Fig. 3.37) can be easily and rapidly reduced by applying basic iterative algorithms such as Jacobi or Gauss–Seidel. In contrast, low-frequency components of the error decay very slowly. See [BHM00, Tre97, GL96] for details. 2. Once highly oscillatory components of the error have been reduced and the error and the residual have thus become sufficiently smooth, the problem can be effectively transferred to a coarser grid (typically, twice coarser). The procedure for information transfer between the grids is outlined below. The spatial frequency


of the eigenmodes relative to the coarser grid is higher than on the finer grid, and the components of the error that are oscillatory relative to the coarse grid can be again eliminated with basic iterative solvers. This is effective not only because the relative frequency is higher, but also because the system size on the coarser grid is smaller. 3. It remains to see how the information transfer between finer and coarser grids is realized. Residuals are transferred from finer to coarser grids. Correction vectors obtained after smoothing iterations on coarser grids are transferred to finer grids. There is more than one way of defining the transfer operators. Vectors from a coarse grid can be moved to a fine one by some form of interpolation of the nodal values. The simplest fine-to-coarse transfer is injection: The values at the nodes of the coarse grids are taken to be the same as the values at the corresponding nodes of the fine grid. However, it is often desirable that the coarse-to-fine and fine-to-coarse transfer operators be adjoint to one another,52 especially for symmetric problems, to preserve the symmetry. In that case the fine-to-coarse transfer is different from injection. Multigrid utilizes these ideas recursively, on a sequence of nested grids. There are several ways of navigating these grids. V-cycle starts on the finest grid and descends gradually to the coarsest one; then moves back to the finest level. W-cycle also starts by traversing all fine-to-coarse levels; then, using the coarsest level as a base, it goes back-and forth in rounds spanning an increasing number of levels. Finally, full multigrid cycle starts at the coarsest level and moves back-and-forth, involving progressively more and more finer levels. A precise description and pictorial illustrations of these algorithms can be found in any of the multigrid books. Convergence of multigrid methods depends on the nature of the underlying problem: primarily, in mathematical terms, on whether or not the problem is elliptic and on the level of regularity of the solution, on the particular type of the multigrid algorithm employed, and to a lesser extent on other details (the norms in which the error is measured, smoothing algorithms, etc.) For elliptic problems, convergence can be close to optimal—i.e. proportional to the size of the problem, possibly with a mild logarithmic factor that in practice is not very critical. Furthermore, multigrid methods can be used as preconditioners in conjugate gradient and similar solvers; particularly powerful are the Bramble–Pasciak–Xu (BPX) preconditioners developed in J. Xu’s Ph.D. thesis [Xu89] and in [BPX90]. Since BPX preconditioners are expressed as double sums over all basis functions and over all levels, they are relatively easy to parallelize. A broad mathematical framework for multilevel preconditioners and for the analysis of convergence of multigrid methods in general is established in Xu’s papers [Xu92, Xu97]. Results of numerical experiments with BPX for several electromagnetic applications are reported in our papers with A. Plaks [Tsu94, TPB98, PTPT00]. 52 There is an interesting parallel with Ewald methods of Chap. 5, where charge-to-grid and gridto-charge interpolation operators must be adjoint for conservation of momentum in a system of charged particles to hold numerically; see p. 266.
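For readers who prefer code to narrative, here is a minimal V-cycle for the model problem (3.142)/(3.143), with weighted Jacobi smoothing, full-weighting restriction and linear-interpolation prolongation. It is a textbook-style sketch under the stated simplifications (uniform grid, re-discretized coarse operators), not the production algorithms referenced in this section.

```python
import numpy as np

def laplacian(n):
    """Matrix for -u'' on n interior nodes of [0, 1]: (1/h^2) tridiag(-1, 2, -1)."""
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def jacobi(A, u, f, sweeps=3, omega=2.0 / 3.0):
    """Weighted Jacobi smoothing: rapidly damps the oscillatory error components."""
    d = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (f - A @ u) / d
    return u

def restrict(r):
    """Full-weighting transfer of a fine-grid residual to the next coarser grid."""
    return 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])

def prolong(e, n_fine):
    """Linear-interpolation transfer of a coarse-grid correction to the fine grid."""
    ef = np.zeros(n_fine)
    ef[1:-1:2] = e
    ef[0:-2:2] += 0.5 * e
    ef[2::2] += 0.5 * e
    return ef

def v_cycle(u, f):
    """One V-cycle; the coarse operator is obtained by re-discretization."""
    n = len(u)
    A = laplacian(n)
    if n <= 3:
        return np.linalg.solve(A, f)          # coarsest grid: solve directly
    u = jacobi(A, u, f)                       # pre-smoothing
    r_c = restrict(f - A @ u)                 # residual transfer, cf. Eq. (3.145)
    e_c = v_cycle(np.zeros(len(r_c)), r_c)    # coarse-grid correction
    u = u + prolong(e_c, n)
    return jacobi(A, u, f)                    # post-smoothing

n = 2**7 - 1                                  # interior nodes; grids nest as 127, 63, 31, ...
f = np.ones(n)
u = np.zeros(n)
for cycle in range(8):
    u = v_cycle(u, f)
    print(cycle, np.linalg.norm(f - laplacian(n) @ u))
```

The residual norm drops by a roughly constant factor per cycle, independent of the grid size, which is the hallmark behavior described above.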


Another very interesting development is algebraic multigrid (AMG) schemes, where multigrid ideas are applied in an abstract form (K. Stüben et al. [Stü83, SL86, Stü00]). The underlying problem may or may not involve any actual geometric grids; for example, there are applications to electric circuits and to coupled field-circuit problems (D. Lahaye et al. [LVH04]). In AMG, a hierarchical structure typical of multigrid methods is created artificially, by examining the strength of the coupling between the unknowns. The main advantage of AMG is that it can be used as a “black box” solver. For further information, the interested reader is referred to the books cited above and to the tutorials posted on the MGNet website.53

53 http://www.mgnet.org/mgnet-tuts.html. Last accessed 10 May 2020.

3.14 Special Topic: Element Shape and Approximation Accuracy

The material of this section was inspired by my extensive discussions with Alain Bossavit and Pierre Asselin in 1996–1999. (By extending the analysis of J. L. Synge [Syn57], Asselin independently obtained a result similar to the minimum singular value condition on p. 164.) Numerical experiments were performed jointly with Alexander Plaks. I also thank Ivo Babuška and Randolph Bank for informative conversations in 1998–2000.

3.14.1 Introduction

Common sense, backed up by rigorous error estimates (Sect. 3.10), tells us that the accuracy of the finite element approximation depends on the element size and on the order of polynomial interpolation. More subtle is the dependence of the error on element shape. Anyone who has ever used FEM knows that a triangular element similar to the one depicted on the left side of Fig. 3.38 is "good" for approximation, while the element shown on the right is "bad." The flatness of the second element should presumably lead to poor accuracy of the numerical solution. But how flat are flat elements? How can element shape in FEM be characterized precisely and how can the "source" of the approximation error be identified? Some of the answers to these questions are classical but some are not yet well known, particularly the connection between approximation accuracy and FE matrices (Sect. 3.14.2), as well as the minimum singular value criterion for the "edge shape matrix" (Sects. 3.14.2 and 3.14.3). The reader need not be an expert in FE analysis to understand the first part of this section; the second part is more advanced. Overall, the section is based on my joint work with A. Plaks [Tsu98b, Tsu98a, Tsu98c, TP98, TP99b, Tsu99].


Fig. 3.38 “Good” and “bad” element shape (details in the text)

For triangular elements, one intuitively obvious consideration is that small angles should be avoided. The mathematical basis for that is given by Zlámal's minimum angle condition [Zlá68]: If the minimum angle of elements is bounded away from zero, φmin ≥ φ0 > 0, then the FE interpolation error tends to zero for the family of meshes with decreasing mesh sizes. Geometrically equivalent to Zlámal's condition is the boundedness of the ratio of the element diameter (maximum element edge lmax) to the radius ρ of the inscribed circle.

Zlámal's condition implies that small angles should be avoided. But must they? In mathematical terms, one may wonder if Zlámal's condition is not only sufficient but in some sense necessary for accurate approximation. If Zlámal's condition were necessary, a right triangle with a small acute angle would be unsuitable. However, on a regular mesh with right triangles, first-order FE discretization of the Laplace equation is easily shown to be identical with the standard 5-point finite-difference scheme. But the FD scheme does not have any shape-related approximation problems. (The accuracy is limited by the maximum mesh size but not by the aspect ratio.) This observation suggests that Zlámal's condition could be too stringent. Indeed, a less restrictive shape condition for triangular elements exists. It is sufficient to require that the maximum angle of an element be bounded away from π. In particular, according to this condition, right triangles, even with very small acute angles, are acceptable (what matters is the maximum angle, which remains equal to π/2).

The maximum angle condition appeared in J. L. Synge's monograph [Syn57, pp. 209–213] in 1957, before the finite element era. (Synge considered piecewise-linear interpolation on triangles without calling them finite elements.) In 1976, I. Babuška and A. K. Aziz [BA76] published a more detailed analysis of FE interpolation on triangles and showed that the maximum angle condition was not only sufficient, but in a sense essential for the convergence of FEM. In addition, they proved the corresponding Wp1-norm estimate. In 1992, M. Křížek [Kři92] generalized the maximum angle condition to tetrahedral elements: The maximum angle for all triangular faces and the maximum dihedral angle should be bounded away from π. Other estimates for tetrahedra (and, more generally, simplices in Rd) were given by Yu. N. Subbotin [Sub90] and S. Waldron [Wal98a]. P. Jamet's condition [Jam76] is closest to the result of this section but is more difficult to formulate and apply.
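The competing shape criteria just discussed are easy to experiment with numerically. The sketch below (an illustration of mine, not from the cited works) computes, for a triangle, the minimum and maximum angles and the diameter-to-inradius ratio; for a right triangle with a shrinking acute angle, the Zlámal-type ratio degrades while the maximum angle stays at 90 degrees.

```python
import numpy as np

def triangle_shape_measures(p1, p2, p3):
    """Minimum angle, maximum angle (degrees) and diameter/inradius of a triangle."""
    P = [np.asarray(p, dtype=float) for p in (p1, p2, p3)]
    l = np.array([np.linalg.norm(P[(k + 1) % 3] - P[(k + 2) % 3]) for k in range(3)])
    angles = []
    for k in range(3):
        a = P[(k + 1) % 3] - P[k]
        b = P[(k + 2) % 3] - P[k]
        angles.append(np.degrees(np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))))
    s = 0.5 * l.sum()
    area = np.sqrt(s * np.prod(s - l))                      # Heron's formula
    return min(angles), max(angles), l.max() / (area / s)   # inradius = area / s

# Right triangles with a shrinking acute angle.
for eps in (0.5, 0.1, 0.02):
    print(eps, np.round(triangle_shape_measures((0, 0), (1, 0), (0, eps)), 2))
```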


On a more general theoretical level, the study of piecewise-polynomial interpolation in Sobolev spaces, with applications to spline interpolation and FEM, has a long history dating back to the fundamental works of J. Deny and J. L. Lions, J. H. Bramble and S. R. Hilbert [BH70], I. Babuška [Bab71], and the already cited Ciarlet and Raviart paper. Two general approaches systematically developed by Ciarlet and Raviart have now become classical. The first one is based on the multipoint Taylor formula (P. G. Ciarlet and C. Wagschal [CW71]); the second approach (e.g. Ciarlet [Cia80]) relies on the Deny-Lions and Bramble–Hilbert lemmas. In both cases, under remarkably few assumptions, error estimates for Lagrange and Hermite interpolation on a set of points in Rn are obtained. For tetrahedra, the “shape part” of Ciarlet and Raviart’s powerful result (p. 120) translates into the ratio of the element diameter (i.e. the maximum edge) to the radius of the inscribed sphere. Boundedness of this ratio ensures convergence of FE interpolation on a family of tetrahedral meshes with decreasing mesh sizes. However, as in the 2D case, such a condition is a little too restrictive. For example, “right tetrahedra” (having three mutually orthogonal edges) are rejected, even though it is intuitively felt, by analogy with right triangles, that there is in fact nothing wrong with them. A precise characterization of the shape of tetrahedral elements is one of the particular results of the general analysis that follows. An algebraic, rather than geometric, source of interpolation errors for arbitrary finite elements is identified and its geometric interpretation for triangular and tetrahedral elements is given.

3.14.2 Algebraic Sources of Shape-Dependent Errors: Eigenvalue and Singular Value Conditions

First, we establish a direct connection between interpolation errors and the maximum eigenvalue (or the trace) of the appropriate FE stiffness matrices. This is different from the more standard consideration of matrices of the affine transformation to/from a reference element (as done e.g. by N. Al Shenk [She94]). As shown below, the maximum eigenvalue of the stiffness matrix has a simple geometric meaning for first- and higher-order triangles and tetrahedra. Even without a geometric interpretation, the eigenvalue/trace condition is useful in practical FE computation, as the matrix trace is available at virtually no additional cost. Moreover, the stiffness matrix automatically reflects the chosen energy norm, possibly for inhomogeneous and/or anisotropic parameters. For the energy-seminorm approximation on first-order tetrahedral nodal elements, or equivalently, for L2-approximation of conservative fields on tetrahedral edge elements (Sect. 3.12), the maximum eigenvalue analysis leads to a new criterion in terms of the minimum singular value of the "edge shape matrix". The


columns of this matrix are the Cartesian representations of the unit edge vectors of the tetrahedron. The new singular value estimate has a clear algebraic and geometric meaning and proves to be not only sufficient, but in some strong sense necessary for the convergence of FE interpolation on a sequence of meshes. The minimum singular value criterion is a direct generalization of the Synge–Babuška–Aziz maximum angle condition to three (and more) dimensions.

Even though the approach presented here is general, let us start with first-order triangular elements to fix ideas. Let Ω ⊂ R² be a convex polygonal domain. Following the standard definition, we shall call a set M of triangular elements Ki, M = {K1, K2, ..., Kn}, a triangulation of the domain if
(a) ∪i=1,...,n Ki = Ω̄;
(b) any two triangles either have no common points, or have exactly one common node, or exactly one common edge.
Let hi = diam Ki; then the mesh size h is the maximum of hi for all elements in M (i.e. the maximum edge length of all triangles). Let N be the geometric set of nodes {ri} (i = 1, 2, ..., n, ri ∈ Ω̄) of all triangles in M, and let P1(M) be the space of functions that are continuous in Ω and linear within each of the triangular elements Ki.54 Let P1(Ki) be the restriction of P1(M) to a specific element Ki. Thus P1(Ki) is just the (three-dimensional) space of linear functions over the element. Considering interpolation of functions in C2(Ω̄) for simplicity, one can define the interpolation operator Π : C2(Ω̄) → P1(M) by

(Πu)(ri) = u(ri),  ∀ri ∈ N, ∀u ∈ C2(Ω̄)        (3.146)
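Before proceeding to the error analysis, here is a small numerical illustration (mine, not the book's code) of the "edge shape matrix" introduced above: its columns are the unit edge vectors of a tetrahedron, and its minimum singular value collapses for a sliver element; the test shapes are arbitrary.

```python
import numpy as np

def min_singular_value_edge_matrix(verts):
    """Minimum singular value of the matrix whose columns are the unit edge
    vectors of a tetrahedron (the "edge shape matrix" of the text)."""
    verts = np.asarray(verts, dtype=float)
    edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    E = np.column_stack([verts[j] - verts[i] for i, j in edges])
    E = E / np.linalg.norm(E, axis=0)        # normalize each edge vector
    return np.linalg.svd(E, compute_uv=False).min()

regular = [(0, 0, 0), (1, 0, 0), (0.5, np.sqrt(3) / 2, 0),
           (0.5, np.sqrt(3) / 6, np.sqrt(2.0 / 3.0))]
sliver = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.5, 0.5, 1e-3)]   # nearly flat tetrahedron
print("regular tetrahedron:", min_singular_value_edge_matrix(regular))
print("sliver tetrahedron :", min_singular_value_edge_matrix(sliver))
```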

We are interested in evaluating the interpolation error Πu − u in the energy norm ‖·‖E induced by an inner product (·, ·)E ("E" for "energy," not to be confused with Euclidean spaces).55

Remark 7 In FE applications, u is normally the solution of a certain boundary value problem in Ω. The error bounds for interpolation and for the Galerkin or Ritz projection are closely related (e.g. by Céa's lemma or the LBB condition, Sect. 3.5). Although this provides an important motivation to study interpolation errors, here u need not be associated with any boundary value problem.

Consider a representative example where the inner product and the energy seminorm in C2(Ω̄) are introduced as

(u, v)E,Ω = ∫Ω ∇u · ∇v dΩ        (3.147)

54 Elsewhere in the book, symbol N denotes the nodal values of a function. The usage of this symbol for the set of nodes is limited to this section only and should not cause confusion.
55 The analysis is also applicable to seminorms instead of norms if the definition of the energy inner product is relaxed to allow (u, u)E = 0 for a nonzero u.


|u|E = (u, u)E^1/2        (3.148)

(If Dirichlet boundary conditions on a nontrivial part of the boundary are incorporated in the definition of the functional space, the seminorm is in fact a norm.) The element stiffness matrix A(K i ) for a given basis {ψ1 , ψ2 , ψ3 } of P 1 (K i ) corresponds to the energy inner product (3.147) viewed as a bilinear form on P 1 (K i ) × P 1 (K i ): (u, v) E,K i = (A(K i ) u(K i ), v(K i )), ∀u, v ∈ P 1 (K i )

(3.149)

where vectors of nodal values of a given function are underscored: u(Ki) is an E3 vector of node values on a given element and u is an En vector of node values on the whole mesh. The standard E3 inner product is implied in the right-hand side of (3.149). Explicitly, the entries of the element stiffness matrix are given by

A(Ki)jl = (ψj, ψl)E,Ki = ∫Ki ∇ψj · ∇ψl dΩ,   j, l = 1, 2, 3        (3.150)

To obtain an error estimate over a particular element Ki, we shall use, as an auxiliary function, the first-order Taylor approximation Tu of u ∈ C2(Ω̄) around an arbitrary point r0 within that element:

(Tu)(r0, r) = u(r0) + ∇u(r0) · (r − r0)

Figure 3.39 illustrates this in 1D. The difference between the nodal values of the Taylor approximation Tu and the exact function u (or its FE interpolant Πu) is "small" (on the order of O(h2) for linear approximation) and shape-independent in 2D and 3D. At the same time, the difference between Tu and Πu in the energy norm is generally much greater: Not only is the order of approximation lower, but also the error can be adversely affected by the element shape. Obviously, somewhere in the transition from the nodal values to the energy norm the precision is lost. Since the energy norm in the FE space is governed by the FE stiffness matrix, the large error in the energy norm indicates the presence of a large eigenvalue of the matrix.

For a more precise analysis, let us write the function u as its Taylor approximation plus a (small) residual term R(r0, r):

u(r) = (Tu)(r0, r) + R(r0, r),   r ∈ Ki,

where R(r0, r) can be expressed via the second derivatives of u at an interior point of the segment [r, r0]:

R(r0, r) = Σ|α|=2 [Dα u(r0 + θ(r − r0)) / α!] (r − r0)α,   0 ≤ θ < 1        (3.151)


Fig. 3.39 Taylor approximation versus FE interpolation. Function u (solid line) is approximated by its piecewise-linear node interpolant Π u (dashed line) and by element-wise Taylor approximations T u (dotted lines). The energy norm difference between Π u and T u is generally much greater than the difference in their node values

with the standard shorthand notation for the multi-index α = (α1, α2, ..., αd) (in the current example d = 2), |α| = α1 + α2 + ... + αd, α! = α1! α2! ... αd!, and partial derivatives

Dα u = ∂|α| u / (∂x1^α1 ∂x2^α2 ... ∂xd^αd)

It follows from (3.151) that the residual term is indeed small, in the sense that

|R(r0, r)| = |(Tu)(r0, r) − u(r)| ≤ ‖u‖2,∞,Ki |r − r0|²        (3.152)

|∇R(r0, r)| = |∇(Tu)(r0, r) − ∇u(r)| ≤ ‖u‖2,∞,Ki |r − r0|        (3.153)

where

‖u‖m,∞,K = Σ|α|=m ‖Dα u‖L∞(K)        (3.154)

The key observations leading to the maximum eigenvalue condition can be informally summarized as follows: 1. The Taylor approximation is uniformly accurate within the element due to (3.152), (3.153) and is completely independent of the element geometry. Therefore, for the purpose of evaluating the dependence of the interpolation error on shape, T u can be used in lieu of u, i.e. one can consider the difference Π u − T u instead of Π u − u. 2. The energy norm of the difference Π u − T u is generally much higher than the nodal values of Π u − T u: The nodal values are of the order O(h 2 ) and indepen-


dent of element shape due to (3.152), while the energy norm is O(h) and depends on the shape. 3. The above observations imply that in the transition from node values to the energy norm the accuracy is lost. Since within the element K i both u and T u lie in the FE space P 1 (K i ), and since in this space the energy norm is induced by the element stiffness matrix A(K i ), a large energy norm can be attributed to the presence of a large eigenvalue in that stiffness matrix. The first of these statements can be made precise by writing

Π u − u E,K i ≤ Π u − T u E,K i + T u − u E,K i 1

≤ Π u − T u E,K i + ch i Vi 2 u 2,∞,K i

(3.155)

where the second inequality follows from estimate (3.153) of the Taylor residual, Vi = meas(K i ), and c is an absolute constant independent of the element shape and of u. We now focus on the term Π u − T u E,K i in (3.155). Restrictions of both u and T u to K i lie in the FE space P 1 (K i ), and therefore

Π u − T u E,K i =



A(K i )(u(K i ) − T u(K i )), u(K i ) − T u(K i )

 21

(3.156)

The standard Euclidean inner product in E 3 is implied in the right-hand side of (3.156), and we recall that the underscore denotes Euclidean vectors of nodal values. It follows immediately from (3.156) that 

Π u − T u E,K i ≤

max

x=0,x∈R3

(A(K i )x, x) (x, x)

 21

  u(K i ) − T u(K i )

E3

(3.157)

that is, 1   2

Π u − T u E,K i ≤ λmax (A(K i )) u(K i ) − T u(K i ) E 3

(3.158)

In the right-hand side of (3.158), λmax is the maximum eigenvalue of the element stiffness matrix (3.149), (3.150). The difference u(K i ) − T u(K i ) is the error vector for the Taylor expansion at element nodes, and due to the uniformity (3.152), (3.153) of the Taylor approximation, we have   u(K i ) − T u(K i )

E3

≤ ch i2 |u|2,∞,K i

(3.159)

(the generic constant c is not necessarily the same in all occurrences). Combining (3.158) and (3.159), we obtain the element-wise estimate 1

2 (A(K i ))|u|2,∞,K i

Π u − T u E,K i ≤ ch i2 λmax

(3.160)


or, taking into account the triangle inequality (3.155),  1 1 2 (A(K i )) + h i Vi 2 |u|2,∞,K i

Π u − u E,K i ≤ c h i2 λmax

(3.161)

The corresponding global estimate is ⎡

Π u − u E, ≤ c|u|2,∞, ⎣

 

⎤ 21  h i4 λmax (A(K i )) + h i2 Vi ⎦

(3.162)

K i ∈M

 This  result can be simplified by noting that λmax (A(K i )) ≤ tr A(K i ), K i tr A(K i ) = tr A, K i Vi = V , where A is the global stiffness matrix and V = meas():  1 1

Π u − u E, ≤ c|u|2,∞, h 2 (tr 2 A + hV 2 )

(3.163)

Alternatively, one can factor out the element area Vi in (3.161) to obtain   1 1 2 ˆ i )) + h i

Π u − u E,K i ≤ cVi 2 |u|2,∞,K i h i2 (λmax ( A(K

(3.164)

ˆ i ) = A(K i )/Vi . Then, where the hat denotes the scaled element stiffness matrix A(K the global error estimate simplifies to 1

Π u − u E, ≤ cV 2 max

K i ∈M



 1 2 ˆ i )) + h i |u|2,∞,K i h i2 λmax ( A(K

(3.165)

The maximum eigenvalue can again be replaced with the (easily computable) matrix trace. Remark 8 The trace- and max-terms in estimates (3.163), (3.165) are not of the ˆ i )) are O(h −2 ). order O(h 2 ) as it might appear, but O(h), since both tr A and λmax ( A(K The analysis above can be generalized, without any substantial changes, to elements of any geometric shape and order: Theorem 5 Let M be a finite element mesh in a bounded domain  ∈ Rd (d ≥ 1) and let the following assumptions hold for any (scalar or vector) function u ∈ ¯  → Rs , with some nonnegative integers m and s. (C m+1 )s (): (A.1) A given energy (semi)norm is bounded as |u|2E,K i ≤

ν 

c2j |u|2j,∞,K i , cν > 0, Vi = meas(K i )

(3.166)

j=0

for any element K i , with constants c j independent of the element. (A.2) The FE approximation space over K i contains all polynomials of degree ≤ m.


(A.3) The FE degrees of freedom—linear functionals ψ j over the FE space—are bounded as μ  c˜l2 |u|l,∞,K i , c˜μ > 0 (3.167) |ψ j (K i )u| ≤ l=0

for a certain μ ≥ 0, with some absolute constants c˜l . Then,   1 1

Π u − u E, ≤ c|u|m+1,∞, h κ tr 2 A + h τ V 2

(3.168)

where V = meas(), κ = m + 1 − μ, τ = m + 1 − ν, and the global stiffness matrix A is given by (3.149), (3.150). Alternatively,   1 1 2 ˆ i )) + h τ

Π u − u E, ≤ c|u|m+1,∞, V 2 max h iκ λmax ( A(K Ki

(3.169)

ˆ i ) = A(K i )/Vi . where A(K The meaning of the parameters in the theorem is as follows: m characterizes the level of smoothness of the function that is being approximated; s = 1 for scalar functions and s > 1 for vector functions with s components, approximated component-wise; ν is the highest derivative “contained” in the energy (semi)norm; μ is the highest derivative in the degrees of freedom. Example 7 First-order tetrahedral node elements satisfy assumptions (A.1–A.3). Indeed, for the energy norm (3.148), condition (3.166) holds with ν = 1, c0 = 0, c1 = 1. (A.2) is satisfied with m = 1, and (A.3) is valid because of the uniformity (3.152) of the Taylor approximation within a sufficiently small circle. More generally, (A.3) is satisfied if FE degrees of freedom are represented by a linear combination of values of the function and its derivatives at some specified points of the finite element. Example 8 First-order triangular nodal elements. Let the seminorm be (3.148), (3.147). Then, the trace of the scaled element stiffness matrix has a simple geometric interpretation. The diagonal elements of the matrix are equal to d −2 j ( j = 1, 2, 3), where the d j s are the altitudes of the triangle (Fig. 3.40). Therefore, denoting interior angles of the triangle with φ j and its sides with l j , and assuming h i = diam(K i ) = l1 ≥ l2 ≥ l3 , one obtains ˆ i )) ≤ Tr A(K ˆ i) = λmax ( A(K

3 

d −2 = h i−2 j

j=1

< h i−2



l2 + l3 d1

2

 +

l1 d2

2

 +

l1 d3

2

 3   l1 2 j=1

dj

≤ 3h i−2 (sin−2 φ2 + sin−2 φ3 ) (3.170)


Fig. 3.40 Geometric parameters of a triangular element K i

which leads to Zlámal’s minimum angle condition. This result is reasonable but not optimal, which shows that the maximum eigenvalue criterion does not generally guarantee the sharpest estimates. Nevertheless the optimal condition for first-order elements—the maximum angle condition—will be obtained below by applying the maximum eigenvalue criterion to the Nédélec–Whitney–Bossavit edge elements. Example 9 For first-order tetrahedral elements, the trace of the scaled nodal stiffness matrix can also be interpreted geometrically. A simple transformation similar to (3.170) [Tsu98b] yields the minimum–maximum angle condition for angles φ jl between edges j and faces l: φ jl are to be bounded away from both zero and π to ensure that the interpolation error tends to zero as the element size decreases. For higher-order scalar elements on triangles and tetrahedra, the matrix trace is evaluated in an analogous but lengthier way, and the estimate is similar, except for an additional factor that depends on the order of the element.56 Example 10 L 2 -approximation of scalar functions on tetrahedral or triangular node elements. Suppose that  is a two- or three-dimensional polygonal (polyhedral) domain and that continuous and discrete spaces are taken as L 2 () and P 1 (M), respectively, for a given triangular/tetrahedral mesh. Assume that the energy inner product and norm are the standard L 2 ones. This energy norm in the FE space is induced by the “mass matrix”  A(K i ) jl =

∫Ki φi φl dΩ;   Â(Ki)jl = Vi⁻¹ ∫Ki φi φl dΩ        (3.171)

For first-order tetrahedral elements, this matrix is given by (3.103), repeated here for convenience:

Â(Ki) = (1/20) [ 2 1 1 1 ; 1 2 1 1 ; 1 1 2 1 ; 1 1 1 2 ]        (3.172)

56 Here we are discussing shape dependence only, as the factor related to the dependence of the approximation error on the element size is obvious.


The maximum eigenvalue of Aˆ is equal to 1/4 and does not depend on the element shape. Assumptions (A.1–A.3) of Theorem 5 hold with m = 1, μ = ν = 0, c0 = c˜0 = 1, and therefore approximation of the potential is shape-independent due to (3.169). This known result is obtained here directly from the maximum eigenvalue condition. Analysis for first-order triangular elements is completely similar, and the conclusion is the same. Example 11 (L 2 )3 -approximation of conservative vector fields on tetrahedral or triangular meshes. In lieu of the piecewise-linear approximation of u on a triangular or tetrahedral mesh, one may consider the equivalent piecewise-constant approximation of ∇u on the same mesh. Despite the equivalence of the two approximations, the corresponding error estimates are not necessarily the same, since the maximum eigenvalue criterion is not guaranteed to give optimal results in all cases. It therefore makes sense to apply the maximum eigenvalue condition to interpolation errors in L 32 () for a conservative field q = ∇u on a tetrahedral mesh. To this end, a version of the first-order edge element on a tetrahedron K may be defined by the Whitney–Nédélec–Bossavit space (see Sect. 3.12) spanned by functions w jl , 1≤ j G(0, ρ) = 0 (see Eq. (6.105) below). Remark 20 G can be viewed as a mathematical function of different sets of independent variables. In (6.104), the variables are u and ρ; however, when u = u ∗ ≡ u ∗ (ρ), G(ρ) ≡ G(u ∗ (ρ), ρ) can be considered as a function of ρ only. Furthermore, in the computation of forces via virtual work, we shall need to introduce the displacement of the body on which the electrostatic force is acting; then, clearly, G also depends on that displacement. Mathematically, these cases correspond to different functions, defined in different mathematical domains. Nevertheless for simplicity, but with some abuse of notation, the same symbol G will be used for all such functions, the distinguishing feature being the set of arguments.34 It is well known that for u = u ∗ , the expression for G(u, ρ) simplifies because 



ρu d =



(∇u ∗ ) · ∇u ∗ d

(To prove this, integrate the right-hand side by parts and take into account the electrostatic equation for u ∗ and the boundary conditions.) Hence, for u = u ∗ , action is in fact equal to the energy of the electrostatic field: G(ρ) ≡ G(u ∗ (ρ), ρ) = 34 In

1 2



(∇u ∗ ) · ∇u ∗ d

(6.105)

modern programming languages, such overloading of “functions” or “methods” is the norm.


Remark 21 It is for this reason that G is often considered in the physical literature to be the free energy functional for the field; see K. A. Sharp and B. Honig [SH90], E. S. Reiner and C. J. Radke [RR90], M. K. Gilson et al. [GDLM93]. (In these papers, an additional term corresponding to microions in the electrolyte is included, as explained below.) However, the unqualified identification of G with energy is misleading, for the following reasons. First, G is mathematically defined for arbitrary u but its physical meaning for potentials not satisfying the electrostatic equation is unclear. (What is the physical meaning of a quantity that cannot physically exist?) Second, as already noted, G is maximized, not minimized, by u = u ∗ , which is rather strange if G is free energy.35 F. Fogolari and J. M. Briggs [FB97] make very similar observations. “Action” is a term from theoretical mechanics; in thermodynamics, G is commonly referred to as thermodynamic potential. An accurate physical interpretation and treatment of potential G are essential for computing electrostatic forces via virtual work, as forces are directly related to free energy rather than to the more abstract Lagrangian. More precisely, if a (possibly charged) body, such as a colloidal particle, is subject to a (“virtual”) displacement dξ, the electrostatic force F acting on the body satisfies F · dξ = − dG ∗ (ξ) ≡ − dG(ξ, u ∗ (ρ(ξ)), ρ(ξ))

(6.106)

where G ∗ , the thermodynamic potential evaluated at u = u ∗ , is the energy of the field according to (6.105). The definition of G is overloaded (see Remark 20): It now includes an additional parameter ξ, the displacement of the body.36 Importantly, the notation for G in (6.106) makes it explicit that solution u ∗ is a function of charge density ρ, which in turn depends on the position of the body. Then dG(ξ, u ∗ (ρ(ξ)), ρ(ξ)) = where



∂u ≡ ∂ξ

∂G(u ∗ ) ∂u ∗ ∂ρ ∂G ∂ρ ∂G + + ∂ξ ∂u ∂ρ ∂ξ ∂ρ ∂ξ 

∂u ∂u ∂u , , ∂ξx ∂ξ y ∂ξz

 · dξ



Since u ∗ is a stationary point of the thermodynamic potential, ∂G(u ∗ )/∂u = 0, the second term in the right-hand side vanishes and the differential becomes dG(ξ, u ∗ (ρ(ξ)), ρ(ξ)) =



∂G ∂ρ ∂G + ∂ξ ∂ρ ∂ξ

 · dξ

(6.107)

35 One could reverse the sign of F , in which case the stationary point would be a minimum; however,

this functional would no longer have the meaning of field energy, as its value at the exact solution u would be negative. See the same comment in footnote 11 in Sect. 3.3.2. 36 For a deformable structure, there exists a deeper and more general mathematical description of motion as a diffeomorphism ξt :  → , parameterized by time t; see, e.g. A. Bossavit [Bos92]. For the purposes of this section, a simpler definition will suffice.


Let us now consider an alternative interpretation of G, where u is not, from the outset, constrained to be u ∗ . In this case,   ∂G ∂G ∂ρ · dξ (6.108) dG(ξ, u, ρ(ξ)) = + ∂ξ ∂ρ ∂ξ In this interpretation, u is an independent variable in function G, and hence, its partial derivative with respect to the displacement does not appear. Evaluation of this version of dG at u = u ∗ thus yields the same result as in the previous case (6.107), where u was constrained to be u ∗ from the beginning. In summary, one can compute the electrostatic energy first, by fixing u = u ∗ in the thermodynamic potential or by any other standard means, and then apply the virtual work principle for forces. Alternatively, it is possible to apply virtual work directly to the thermodynamic potential (even though it is not energy for an arbitrary u) and then set u = u ∗ ; the end result is the same. Potential G PB (u, ρ) for the PBE includes, in addition to (6.104), an entropic term related to the distribution of microions in the solvent. Theoretical analysis and derivation of G PB go back to the classical DLVO theory and the subsequent work of G. M. Bell and S. Levine [BL58] (1958). A systematic analysis is given by M. Deserno and C. Holm [DH01] and M. Deserno and H.-H. von Grünberg [DvG02] (2001– 2002). In the context of macromolecular simulation, thermodynamic functionals, free energy, electrostatic and osmotic forces were studied by K. A. Sharp and B. Honig [SH90] (1990) and by M. K. Gilson et al. [GDLM93] (1993). These developments are considered in more detail below. A much more advanced treatment that goes beyond mean field theory, and beyond the scope of this book, is due to R. D. Coalson and A. Duncan [CD92], R. R. Netz and H. Orland [NO99, NO00], Y. Levin [Lev02b], A. Yu. Grosberg et al. [GNS02], T. T. Nguyen et al. [NGS00]. There are several equivalent representations of the thermodynamic potential. The following expression for the canonical ensemble (fixed total number of ions N , volume V and temperature T ) is essentially the same as given by Deserno and von Grünberg [DvG02] and by Dobnikar et al. [DHM+04]: G PB (u, ρ) =

R3



    1 ρu + k B T dV n α log n α λ3T 2 α

(6.109)

where n α is the (position-dependent) volume concentration of species α of the microions and ρ is the total charge density equal to the sum of charge densities ρ f of macroions (“fixed” ions) and ρm of microions (mobile ions). The normalization factor λT —the thermal De Broglie wavelength—renders the argument of the logarithmic function dimensionless and makes the classical and quantum mechanical expressions compatible:  λT =

2π  mk B T

6.11 Thermodynamic Potential, Free Energy and Forces

339

For the canonical ensemble, this factor adds a non-essential constant to the entropy. If u = u ∗ is the solution of the Poisson equation37 with charge density ρ, then G PB (u ∗ , ρ) is equal to the Helmholtz free energy of the system. Indeed, the right-hand side of (6.109) has in this case a natural interpretation as electrostatic energy minus temperature times the entropy of the microions. Details are given in Appendix 6.14. Solution u PB of the Poisson–Boltzmann equation is in fact a stationary point of G PB , under two constraints: (i) u is the electrostatic potential corresponding to ρ (i.e. u satisfies the Poisson equation with ρ as a source), and (ii) electroneutrality of the solvent. This is verified in Appendix 6.14 by computing the variation of G PB with respect to u. The osmotic pressure force is given by the following expression [GDLM93, DHM+04] (Appendix 6.14) 

Fosm = − k B T S

n α dS

(6.110)

α

This is not surprising: Since correlations are ignored, the microions behave as an ideal gas with pressure n α k B T for each species. Naturally, gas pressure depends on the density, and a non-uniform distribution of the microions around a colloidal particle in general produces a net force on it. In the numerical implementation, surface integral (6.110) is a simple amendment to the Maxwell stress tensor integral over a surface enclosing the particle under consideration (M. Fushiki [Fus92], J. Dobnikar [DHM+04]).

6.12 Comparison of FLAME and DLVO Results In this numerical example of two charged colloidal particles in a solvent, the following parameters are used: particle radius normalized to unity; the solvent and solute dielectric constants are 80 and 2, respectively; the size of the computational domain is 10 × 10 × 10; charges of the two particles are equal and normalized to unity. The linearized PBE, with the Debye length of 0.5, is applied in the solvent. For comparison and verification, the problem is solved both with FEM and FLAME. In addition, an approximate analytical solution is available as a superposition of two Yukawa potentials.38 Finite element simulations were run using COMSOL multiphysics, a commercial finite element package.39 Two FE meshes with second-order tetrahedra are generated: a coarser one with 4,993 nodes, 25,195 elements, 36,696 degrees of freedom and to the solution of the Poisson–Boltzmann equation if, and only if, ρm obeys the Boltzmann distribution. 38 The Yukawa potential is the exact solution for a single particle in a homogeneous solvent, not perturbed by the presence of any other particles. 39 http://www.comsol.com. 37 Equivalent

340

6 Long-Range Interactions in Heterogeneous Systems

Fig. 6.19 Sample FE mesh for two particles

a finer one (Fig. 6.19) with 18,996 nodes, 97,333 elements and 138,053 degrees of freedom. Two FLAME grids are used: 32 × 32 × 32 and 64 × 64 × 64. The FLAME scheme is applied on seven-point stencils in the vicinity of each particle—more precisely, if the midpoint of the stencil is within the distance rp + h from the center of the particle with radius rp (as usual, h is the mesh size). Otherwise, the standard seven-point scheme is used. Figure 6.20 shows the potential distribution along the line connecting the centers of the two particles. The FEM and FLAME results, as well as the approximate analytical solution, are all in good agreement.

Fig. 6.20 Electrostatic potential along the line passing through the centers of two particles. FLAME and FEM results are almost indistinguishable

6.12 Comparison of FLAME and DLVO Results

341

As in the 2D case of Sect. 6.2.1, electrostatic forces can be computed via the Maxwell stress tensor (MST). The 3D analysis in this section also includes osmotic pressure forces due to the “gas” of microions. The electrostatic energy for linear dielectric materials is (J. D. Jackson [Jac99], W. K. H. Panofsky and M. Phillips [PP62]) W el =

1 2

R3

E · D dV

(6.111)

where, as usual, E and D are the electric field and displacement vectors, respectively. Noting that E = −∇u, ∇ · D = ρ (where ρ is the total electric charge density, including that of colloids and microions), and integrating by parts, one obtains another well-known expression for the total energy: W

el

1 = 2

ρu d V

(6.112)

R3

← → The electrostatic part T el of the MST is defined as (see (6.12); J. D. Jackson [Jac99], J. A. Stratton [Str41] or W. K. H. Panofsky and M. Phillips [PP62]) ⎛ 2 1 2 ⎞ Ex − 2 E Ex E y Ex Ez ← →el E y2 − 21 E 2 E y Ez ⎠ =  ⎝ E y Ex T 2 Ez Ex Ez E y E z − 21 E 2

(6.113)

where  is the dielectric constant of the medium in which the particles are immersed, E is the amplitude of the electric field, and E x,y,z are its Cartesian components. The electrostatic force acting on a particle is ← →el T · dS = 

Fel = S

 1 (E · n)E ˆ − E 2 nˆ d S 2 S

(6.114)

where S is any surface enclosing one, and only one, particle. Theoretically, the value of the force does not depend on the choice of the integration surface, but for the numerical results this is not exactly true. In the FLAME experiments, the integration surface is usually chosen as spherical and is slightly larger than the particle. Adaptive numerical quadratures in the ϕ–θ plane are used for the integration. Obviously, the integration knots in general differ from the nodes of the FLAME grid, and therefore, interpolation is needed. This involves a linear combination of the FLAME basis functions (six functions in the case of a seven-point scheme), plus the particular solution of the inhomogeneous equation in the vicinity of a charged particle. The interpolation procedure is completely analogous to the 2D one (Sect. 6.2.1).

342

6 Long-Range Interactions in Heterogeneous Systems

Fig. 6.21 Example of potential distribution (in arbitrary units) near two colloidal particles. The potential is plotted in the symmetry plane between the particles. (Simulation by E. Ivanova and S. Voskoboynikov.)

Fig. 6.22 Example of potential distribution (in arbitrary units) around eight colloidal particles. In the plane of the plot, only four of the particles produce a visible effect. (Simulation by E. Ivanova and S. Voskoboynikov.)

It is interesting to compare FLAME results for the electrostatic force between two particles with the DLVO values from (6.85).40 For this comparison, the main quantities are rendered dimensionless by scaling: r˜ = r/rp , F˜ = rp F/kT . FLAME is applied to the linearized PBE, with periodic boundary conditions. Typical surface plots of the potential distribution are shown in Figs. 6.21 and 6.22 for illustration. FLAME versus DLVO forces are plotted in Figs. 6.23 and 6.24. The first of these figures corresponds to the Debye length equal to the diameter of the particle (or κrp = 0.5). In the second figure, the Debye length is five times greater (κrp = 0.1), so that the electrostatic interactions decay more slowly. Other parameters are listed in the figure captions. 40 FLAME

simulations were performed by E. Ivanova and S. Voskoboynikov.

6.12 Comparison of FLAME and DLVO Results

343

Fig. 6.23 Comparison of FLAME and DLVO forces between two particles. Parameters: Z = 4, e2 s k B T /rp = 0.012, p = 1, s = 80, κrp = 0.5, domain size 20. (Simulations by E. Ivanova and S. Voskoboynikov.)

Fig. 6.24 Comparison of FLAME and DLVO forces between two particles. Parameters: same as in Fig. 6.23, except for κrp = 0.1. (Simulations by E. Ivanova and S. Voskoboynikov.)

Both the DLVO and FLAME results are approximations, and some discrepancy between them is to be expected. For small separations, the difference between the results can be attributed primarily to the approximations taken in the DLVO formula (6.85) for the ψ0 and β parameters (Sect. 6.9). For intermediate distances between the particles, the agreement between DLVO and FLAME is excellent. For large separations comparable with the size of the computational box, FLAME suffers from the artifacts of periodic boundary conditions: The field and forces are affected by the periodic images of the particles.41 For example, when the distance between a pair of particles A and B is half the size of the computational cell, the forces on A due to B and due to the periodic image of B on the opposite side of A cancel out. (More remote images have a similar but weaker effect, due to the Debye screening.) Obviously, this undesirable effect can be reduced by increasing the size of the box or by imposing approximate boundary conditions as a superposition of the Yukawa potentials.

41 As we know from Chap. 5, a similar “periodic imaging” phenomenon is central in Ewald methods.

344

6 Long-Range Interactions in Heterogeneous Systems

6.13 Summary and Further Reading Heterogeneous electrostatic models on the micro- and nanoscale, particularly in the presence of electrolytes, are of critical importance in a broad range of physical and biophysical applications: colloidal suspensions, polyelectrolytes, polymer- and biomolecules, etc. Due to the enormous complexity of these problems, any substantial improvement in the computational methodology is welcome. Ewald methods that are commonly used in current computational practice (Chap. 5) work very well for homogeneous media. While in colloidal simulation, the dielectric contrast between the solvent and solute can be neglected with an acceptable degree of accuracy, in macromolecular simulation this contrast cannot be ignored. From this perspective, the Flexible Local Approximation MEthods (FLAME) appear to be a step in the right direction. In FLAME, the numerical accuracy is improved— in many cases significantly—by incorporating accurate local approximations of the solution into the difference scheme. The literature on colloidal, polyelectrolyte and molecular systems is vast. The following brief, and certainly incomplete, list includes only publications that are closely related to the material of this chapter: H. C. Ottinger [Ott96], M. O. Robbins et al. [RKG88], M. Fushiki [Fus92], J. Dobnikar et al. [DHM+04], M. Deserno et al. [DHM00, DH01], B. Honig and A. Nicholls [HN95], W. Rocchia et al. [RAH01], N. A. Baker et al. [BSS+01], T. Simonson [Sim03], D. A. Case et al. [CCD+05].

6.14 Appendix: Thermodynamic Potential for Electrostatics in Solvents In this Appendix, thermodynamic potential (6.109) (reproduced here for convenience) G PB (u, ρ) =

R3



 1 ρu + k B T n α (log(n α λ3T ) − 1) 2 α

 dV

(6.115)

is considered in more detail. The total charge density ρ = ρ f + ρm is the sum of charge densities of macro- and microions, and λT is the thermal De Broglie wavelength h (6.116) λT = √ 2πmk B T Although the integral in (6.115) is formally written over the whole space, in reality the integration can of course be limited just to the finite volume of the solvent. Alternative forms of the thermodynamic functional (M. Deserno and C. Holm [DH01], M. Deserno and H.-H. von Grünberg [DvG02], K. A. Sharp and B. Honig [SH90],

6.14 Appendix: Thermodynamic Potential for Electrostatics in Solvents

345

M. K. Gilson et al. [GDLM93], J. Dobnikar et al. [DHM+04]) are considered later in this Appendix. If u = u ∗ is the solution of!the Poisson equation with the total charge density ρ as the source, then the first term R3 21 ρu ∗ d V is, as is well known from electromagnetic theory, equal to the energy of the electrostatic field. Free energy—the amount of energy available for reversible work—is different from the electrostatic energy due to heat transfer between the microions and the “heat bath” of the solvent. The Helmholtz free energy is F = E − T S where the angle brackets indicate statistical averaging. This coincides with expression (6.115) for G PB (u ∗ , ρ) because the entropy of the “gas” of microions is S = kB

 R3

n α (log(n α λ3T ) − 1) d V

α

Let us now show that the solution u PB of the Poisson–Boltzmann equation is a stationary point of the thermodynamic potential G PB (u ∗ , ρ), subject to two constraints. The first one is electroneutrality:

 

R3

 qα n α − ρ

f

dV = 0

(6.117)

α

The second constraint (or more precisely, a set of constraints—one for each species of the microions) in the canonical ensemble is a fixed total number Nα of ions of species α: R3

n α d V = Nα

(6.118)

To handle the constraints, terms with a set of Lagrange multipliers λ and λα are included in the functional:     1 ∗ ∗ f ρu − λ G PB (u , ρ, λ, λα ) = qα n α − ρ 2 R3 α + kB T

 α

 n α (log(n α λ3T )

− 1)

dV −

 α

 λα

 R3

n α d V − Nα

(6.119)

Note that the functional is evaluated at u = u ∗ , the solution of the Poisson equation; clearly, u ∗ is the only electrostatic potential that can physically exist for a given charge density ρ. The stationary point of this functional is found by computing the variation δG PB . The integration-by-parts identity

346

6 Long-Range Interactions in Heterogeneous Systems

R3

δρ u ∗ d V =



ρ δu ∗ d V R3

helps to simplify the electrostatic part of δG PB : ∗



δG PB (u , ρ, λ, λα ) =

R3

+ kB T

" 

u ∗ qα δn α − λ



α



qα δn α −

α



λα δn α

α

# log(n α λ3T ) δn α

dV

α

(The obvious relationship ρα = qα n α between charge density and concentration has been taken into account.) Since the variations δn α are arbitrary, the following conditions emerge: u ∗ qα + k B T (log(n α λ3T ) + 1) − λqα − λα = 0 This immediately yields the Boltzmann distribution for the ion density: n α = n α0

  qα u ∗ exp − kB T

(6.120)

Thus, the Poisson–Boltzmann distribution of the microions is indeed the stationary point of the thermodynamic potential, under the constraints of electroneutrality and a fixed number of ions. It was already argued, on physical grounds, that the thermodynamic functional (6.115), evaluated at u = u PB —the solution of the Poisson–Boltzmann equation— yields the free energy of the colloidal system. Since this result is fundamental and has important implications (in particular, for the computation of forces as derivatives of free energy with respect to [virtual] displacement), it is desirable to derive it in a systematic and rigorous way. The classical work on this subject goes back to the 1940s and 1950s (E. J. W. Verwey and J. Th. G. Overbeek [VO48], G. M. Bell and S. Levine [BL58]). Here I review more recent contributions that are most relevant to the material of the present chapter: K. A. Sharp and B. Honig [SH90], E. S. Reiner and C. J. Radke [RR90], M. K. Gilson et al. [GDLM93], and M. Deserno and C. Holm [DH01]. Sharp and Honig [SH90] note that a thermodynamic potential similar to G PB above is minimized by the solution of the Poisson–Boltzmann equation. Therefore, they argue, this potential represents the free energy of the system. While the conclusion itself is correct, the argument leading to it lacks rigor. First, it is not difficult to verify that the functional is actually maximized, not minimized, by the PB solution. More importantly, there are infinitely many different functionals that are stationary at u PB . This was already noted in Remark 21.

6.14 Appendix: Thermodynamic Potential for Electrostatics in Solvents

347

Reiner and Radke [RR90] address this latter point by postulating that free energy must be a function F of the action functional and that F must have additive properties with respect to the volume and surfaces of the system. They then proceed to show that F may alter G PB only by an unimportant additive term and a scaling factor—in other words, G PB is essentially a unique representation of free energy. However, the initial postulate is not justified: The fact that two functionals share the same stationary point does not imply that one of them can be expressed as a function of the other. For example, all functionals of the form Um =

R3

|u|m d V,

m = 1, 2, . . .

have the same obvious minimization point u = 0. Yet it is impossible to express, say, U100 as a function of just U1 —much more information about the underlying function u is needed.42 Deserno and Holm’s derivation [DH01] is based on the principles of statistical mechanics and combines rigor with relative simplicity. Their analysis starts with the system Hamiltonian for N microions (only one species for brevity) treated as point charges: N  pi2 + H (r, p) = 2m i=1

 1≤i< j≤N

q2 + 4π |ri − r j |



N  R3 i=1

qρ f (r) dV 4π |ri − r|

(6.121) where q and m are the charge and mass of each microion; ri and pi are the position and momentum vectors of the i-th microion. Mutual interactions of fixed charges are not included in the Hamiltonian, as that would only add an inessential constant. The Hamiltonian can be rewritten using potentials u m and u f of the microions and fixed ions, respectively:  N N    pi2 1 m f H (r, p) = +q u (ri ; r) + u (ri ) 2m 2 i=1 i=1

(6.122)

Remark 22 In this last form, the Hamiltonian includes self-energies of the microions, and so the expression should strictly speaking be adjusted (as done in Chap. 5) to eliminate the singularities. However, anticipating that the microcharges will eventually be smeared and treated as a continuum, we turn a blind eye to this complication and opt for simpler notation. Remark 23 The microion potential u m (ri ; r), is “measured” at point ri but depends on the 3N -vector r of coordinates of all charges. This coupling of all coordinates 42 In case the reader is unconvinced, here is a simple 1D illustration. Let a family of rectangular pulses u  be defined as equal to −1 on [0, ] ( > 0) and zero otherwise. These pulses have the same U1 but very different U100 . It is therefore impossible to determine U100 based on U1 alone.

348

6 Long-Range Interactions in Heterogeneous Systems

makes precise statistical analysis extremely difficult. In the mean field approximation, the situation is simplified dramatically by averaging out the contribution to u m (ri ) of all charges other than i. As is well known from thermodynamics, the partition function Z is obtained, in the classical limit, by integrating the exponentiated Hamiltonian43 : 1 Z = N ! h 3N

exp(−β H ) dr dp,

β≡

1 kB T

(6.123)

where the integral is over the whole 6N -dimensional phase space. Z serves as a normalization factor for the probability density of finding the system near a given energy value H : f (r1 , . . . , r N , p1 , . . . , p N ) = Z −1 exp(−β H )

(6.124)

The Helmholtz free energy is, as is also well known, F = − k B T log Z

(6.125)

The momentum part of Z gets integrated out of (6.123) quite easily and yields   3 Fp = k B T log(N ! λ3N T ) ≈ N log(N λT ) − 1)

(6.126)

where the Stirling formula for the factorial has been used. The position part of Z , unlike the momentum part, is impossible to evaluate exactly, due to the pairwise coupling of the coordinates of all microions via the |ri − r j | terms in the Hamiltonian. The mean field approximation decouples these coordinates (see Remark 23), thereby splitting the system Hamiltonian into a sum of the individual Hamiltonians of all microions. Consequently, the joint probability density (6.124) becomes a product of the individual probability densities of the ions, implying that the correlations between the ions are neglected. The limitations of this assumption are summarized in Sect. 6.5. Once the coordinates are (approximately) decoupled, the N -fold integration of exp(−β H ) in Z (6.123) yields the following expression for thermodynamic potential (M. Deserno and C. Holm [DH01]):       1 m f 3 qn(r) u (r) + u (r) + k B T n(r) log(n(r)λT ) − 1 d V 2 R3 (6.127) where both the momentum part (6.126) and the mean field coordinate part are included. In addition, the continuum limit has been taken, so that the microions G˜ PB =

43 Partition



function is arguably a misnomer: It is in fact the result of integration or summation, which is the opposite of partitioning. “Sum over states” (a direct translation from the original German Zustandssumme) is a more appropriate but less frequently used term.

6.14 Appendix: Thermodynamic Potential for Electrostatics in Solvents

349

are now represented by the equivalent volume density n(r). The tilde sign in G˜ PB is used to recognize that the electrostatic energy part in this functional is different from a more natural expression 1 ρu d V 3 2 R appearing in (6.115). However, the difference is not essential. Indeed, splitting the total charge density ρ and the total electrostatic potential u up into the microion and fixed charge parts, we get R3

1 1 ρu d V = 2 2 =

R3



R3

(ρm u m + ρm u f + ρ f u m + ρ f u f ) d V

1 m m 1 ρ u + ρm u f + ρ f u f 2 2

 dV

where the reciprocity principle (or, mathematically, integration by parts) was used to reveal two equal terms. The last term, involving only the fixed charges, is constant and can therefore safely be dropped from the potential. This immediately makes the expression equivalent to the electrostatic part of G PB (6.115). Alternative forms of the thermodynamic functional can be obtained under an additional constraint: Potential u satisfies the electrostatic equation for the Boltzmann distribution of the microions (6.120). An equivalent expression for the Boltzmann distribution is qα u + const log n α = − kB T Hence, the entropic term in the functional—for the Boltzmann distribution of the ion density—can be rewritten as R3

kB T



n α (log(n α λ3T ) − 1) d V = −

α

R3



n α qα u d V + const

α

= −

R3

ρm u d V + const

(6.128)

6.15 Appendix: Generalized Functions (Distributions) The first part of this Appendix is an elementary introduction to generalized functions, or distributions. The second part outlines their applications to boundary value problems and to the treatment of interface boundary conditions. The history of mathematics is full of examples where the existing notions and objects work well for a while but then turn out to be insufficient and need to be

350

6 Long-Range Interactions in Heterogeneous Systems

Fig. 6.25 Steep ramp (top) approximates the Heaviside step function. The derivative of this ramp function is a sharp pulse (bottom). However, as  → 0, the pointwise limit of this derivative is not meaningful

extended to make further progress. That is, for example, how one proceeds from natural numbers to integers and then to rational, real and complex numbers. In each case, there are desirable operations (e.g. division of integers) that cannot be performed within the existing class, which calls for an extension of this class. A different example that involves an extension of the exponential function from numbers to matrices and operators is outlined in Appendix 2.10. Why would functions in standard calculus need to be generalized? What features are they lacking? One notable problem is differentiation. As an example, the Heaviside unit step function44 H (x), equal to one for x ≥ 0 and zero otherwise, in regular calculus does not have a derivative at zero. In an attempt to generalize the notion of derivative and make it applicable to the step function, one may consider an approximation H to H (x) (Fig. 6.25). The derivative of H (x) is a rectangular pulse equal to 1/ for |x| < /2 and zero for |x| > /2. (In standard calculus, this derivative is undefined for x = ±/2.) As  → 0, H tends to the step function, but the limit of the derivative H (x) in the usual

(x) is equal to infinity at sense is not meaningful. Indeed, this pointwise limit H→0 x = 0 and zero everywhere else. In contrast with the usual integration/differentiation operations that are inverses of one another, in this irregular case the original unit step

(x). Indeed, although the existence of the step H (x) cannot be recovered from H→0

44 Oliver Heaviside (1850–1925) is a British physicist and mathematician, the inventor of operational

calculus, whose work profoundly influenced electromagnetic theory and analysis of transmission lines. The modern vector form of Maxwell’s equations was derived by Heaviside (Maxwell had 20 equations with 20 unknowns). .

6.15 Appendix: Generalized Functions (Distributions)

351

can be inferred from H→0 (x), the information about the magnitude of the step is lost. A critical observation in regard to the sequence of narrow and tall pulses with  → 0 is that the precise pointwise values of these pulses are unimportant; what matters is the “action” of such pulses on some system to which they may be applied. A mathematically meaningful definition of this action is the integral

R

H (x)ψ(x) d x

(6.129)

where ψ(x) is any smooth function that can be viewed as a “test” function to which H (x) is applied.45 It is easy to see that for  → 0, the integral in (6.129), unlike H itself, has a simple limit: /2 H (x)ψ(x) d x = −1 ψ(x) d x → ψ(0) −/2

R

Thus, the “action” of H (x) on any smooth function ψ(x) is just ψ(0). The proper mathematical term for this action is a linear functional: It takes a smooth function ψ and maps it to a number, in this particular case to ψ(0). This insight ultimately leads to the far-reaching notion of generalized functions, or distributions: linear functionals defined on smooth “test” functions. Example 21 The above functional that maps any smooth function ψ to its value at zero is the famous Dirac delta: δ, ψ = ψ(0)

(6.130)

where the angle brackets denote a linear functional. For instance, δ, exp(x) = exp(0) = 1, δ, x 2 + 3 = 3, etc.46 There is an inconsistency between the proper mathematical treatment of the Dirac delta (and other distributions) as a linear functional and the ! popular informal notation δ(x) (implying that the Dirac delta is a function of x) and δ(x)ψ(x)d x. The integral sign, strictly speaking, should be understood only as a shorthand notation for a linear functional. Example 22 Any regular function f (x) can be viewed also as a distribution by associating it with the linear functional f (x)ψ(x) d x

 f, ψ =

(6.131)

R

technical reasons, in the usual definition of generalized functions it is assumed that ψ(x) is differentiable infinitely many times and has a compact support. For the mathematical details, see the monographs cited at the end of this Appendix. 46 Strictly speaking, since exp(x) and x 2 + 3 do not have a compact support, these expressions are not valid without additional elaboration. 45 For

352

6 Long-Range Interactions in Heterogeneous Systems

It can be shown that the distributions corresponding to different integrable functions are indeed different, and so this definition is a valid one. For ! example, the sinusoidal function sin x is associated with the generalized function R sin x ψ(x) d x. Example 23 While any regular function can be identified with a distribution, the opposite is not true. The Dirac delta is one example of a generalized function that does not correspond to any regular one. Another such example is the Cauchy principal value distribution   1 ψ(x) p.v. , ψ = lim dx (6.132) →0+ |x|> x x This distribution cannot be identified, in the sense of (6.131), just with the function 1/x, as the integral 1 ψ(x) d x R x does not in general exist if ψ(0) = 0. Generalized functions have very vast applications to differential equations: Suffice it to say that Green’s functions are, by definition, solutions of the equation with the right-hand side equal to the Dirac delta. The remainder of this Appendix covers the most essential features and notation relevant to the content of this chapter. While functions in classic calculus are not always differentiable, generalized functions are. To see how the notion of derivative can be generalized, start with a differentiable (in the calculus sense) function f (x) and consider the “action” of its derivative on any smooth test function ψ(x):

f (x)ψ(x) d x = − R



f (x)ψ (x) d x

(6.133)

R

This is an integration-by-parts identity, where the term outside the integral vanishes because the test function ψ, by definition, has a compact support and therefore must vanish at ±∞. Since differentiation has been removed from f , the right-hand side of (6.133) has a wider range of applicability and can now be taken as a definition of the generalized derivative of f even if f is not differentiable in the calculus sense. Namely the generalized derivative of f is defined as the linear functional



f (x)ψ (x) d x

 f , ψ = −

(6.134)

R

Example 24 Applying this definition to the Heaviside step function H , we have



H , ψ = −





H (x)ψ (x) d x = − R

0

ψ (x) d x = ψ(0) = δ, ψ (6.135)

6.15 Appendix: Generalized Functions (Distributions)

353

In more compact notation, this is a well-known identity H = δ The derivative of the unit step function (in the sense of distributions) is the delta function. Example 25 As a straightforward but practically very useful generalization of the previous example, consider a function f (x) that is smooth everywhere except for a few discrete points xi , i = 1, . . . , n, where it may have jumps [ f ]i ≡ f (xi +) − f (xi −). Then the distributional derivative of f is f = { f } +

n 

[ f ]i δ(x − xi )

(6.136)

i=1

where δ(x − xi ) is, by definition, the functional47 δ(x − xi ), ψ = ψ(xi ) In (6.136), the braces denote regular derivatives48 viewed as generalized functions. The generalized derivative of f is thus equal to the regular one, plus a set of Dirac deltas corresponding to the jumps of f . The derivation of (6.136) is a straightforward extension of that of (6.135). Example 26 For f (x) = H (x) cos x, where H (x) is the Heaviside step function, f (x) = { f (x)} + δ(x), with { f (x)} = −H (x) sin x. Example 27 We now make a leap over to three dimensions. In 3D, distributions are also defined as linear functionals acting on smooth “test” functions with a compact support. For instance, the Dirac delta in 3D is δ, ψ = ψ(0)

(6.137)

which is formally the same definition as in 1D, except that now ψ is a function of three coordinates and zero in the right-hand side means the origin x = y = z = 0. Generalized partial derivatives are defined by analogy with the 1D case; for example, 

∂f ,ψ ∂x



= −

R3

f (x)

∂ψ dx ∂x

(6.138)

Of particular interest in this chapter is generalized divergence. The divergence equation ∇ · D = ρ is valid for volume charge density ρ; however, if divergence is underis an inconsistency between the popular notation δ(x − xi ), suggesting that δ is a function of x, and the mathematical meaning of δ as a linear functional. More proper notation would be δ(xi , ψ). 48 This is V. S. Vladimirov’s notation [Vla84].. 47 There

354

6 Long-Range Interactions in Heterogeneous Systems

stood in the sense of distributions, this equation becomes applicable to surface charges as well. If D is a smooth field, then for any “test” function ψ integration by parts yields49 R3

ψ∇ · D d V = −

R3

D · ∇ψ d V

(6.139)

The extra term outside the integral vanishes because ψ has a compact support and is therefore zero at infinity. The above identity suggests, by analogy with generalized derivative, a definition of generalized divergence as a linear functional ∇ · D, ψ = −

R3

D · ∇ψ d V

(6.140)

Consider now the generalized derivative for the case where the normal component of D may have a jump across a surface S enclosing a domain . (In electrostatic problems,  may be a body with a dielectric permittivity different from that of the outside medium, and S may carry a surface charge.) Then the generalized derivative is transformed, by splitting the integral into regions inside and outside  and again using integration by parts, to ∇ · D, ψ = −

R3

D · ∇ψ d V =



ψ∇ · D d V +

R3 −

∇ · D ψ dV +

S

ψ[Dn ] d S

(6.141)

where [Dn ] is the jump of the normal component of [D] across the surface: [Dn ] = (Dout − Din ) · n and n is the outward normal to the surface of . In more compact form, generalized divergence (6.141) can be written as ∇ · D = {∇ · D} + [Dn ] δ S

(6.142)

where the curly brackets again denote “calculus style” divergence in the volume and δ S is the surface delta defined formally as the functional δ S , ψ =

ψ dS S

The physical meaning of expression (6.142) is transparent: Generalized divergence is equal to regular divergence (that can be defined via the usual derivatives everywhere except for the surface), plus the surface delta term corresponding to the jump. This result is analogous to the 1D expression for generalized derivative (6.136) in the presence of jumps.

49 Test

functions are smooth by definition.

6.15 Appendix: Generalized Functions (Distributions)

355

The last example shows, as a consequence of (6.142), that Maxwell’s divergence equation ∇ · D = ρ is valid for both volume and surface charges (or any combination thereof) if divergence is understood in the generalized sense. This point of view is very convenient, as it allows one to treat interface boundary conditions as a natural part of the differential equations rather than as some extraneous constraints. In particular, zero generalized divergence of the D field in electrostatics implies zero volume charges and zero surface charges—the continuity of the normal component of D across the surface. Further Reading The original book by L. Schwartz [Sch66] is a very good introduction to the theory of distributions, at the mathematical level accessible to engineers and physicists. V. S. Vladimirov’s book [Vla84] focuses on applications of distributions in mathematical physics and is highly relevant to the content of this chapter. A simpler introduction, with the emphasis on electromagnetic problems, is given by D. G. Dudley [Dud94]. There is also a vast body of advanced mathematical literature on the theory of distributions, but that is well beyond the scope of this book.

Chapter 7

Finite-Difference Time-Domain Methods for Electrodynamics

7.1 Introduction Since electromagnetic fields are described by differential equations (Maxwell’s equations, Sects. 7.2, 8.2), a natural way of modeling them computationally is by finitedifference schemes (Chap. 2). This was, indeed, one of the first approaches in computational electromagnetism1 dating back to the seminal 1966 paper by K. S. Yee [Yee66]. Wikipedia gives an excellent summary of that: 2 Finite

difference time-domain or Yee’s method ... is a numerical analysis technique used for modeling computational electrodynamics... Since it is a time-domain method, FDTD solutions can cover a wide frequency range with a single simulation run, and treat nonlinear material properties in a natural way. The FDTD method belongs in the general class of grid-based differential numerical modeling methods (finite-difference methods). The time-dependent Maxwell’s equations (in partial differential form) are discretized using central-difference approximations to the space and time partial derivatives. The resulting finite-difference equations are solved in either software or hardware in a leapfrog manner: the electric field vector components in a volume of space are solved at a given instant in time; then the magnetic field vector components in the same spatial volume are solved at the next instant in time; and the process is repeated over and over again until the desired transient or steady-state electromagnetic field behavior is fully evolved.

This chapter includes an introduction to FDTD, where the above definition is elaborated upon, as well as several more advanced topics important in practical electromagnetic and nanoscale photonic simulations. The term “FDTD” will be understood broadly and will apply also to the conceptually related Finite Integration Techniques

1 Along

with integral equation methods, known in the engineering community as “the moment method” (R. F. Harrington [Har93]). 2 https://en.wikipedia.org/wiki/Finite-difference_time-domain_method. © Springer Nature Switzerland AG 2020 I. Tsukerman, Computational Methods for Nanoscale Applications, Nanostructure Science and Technology, https://doi.org/10.1007/978-3-030-43893-7_7

357

358

7 Finite-Difference Time-Domain Methods for Electrodynamics

(FIT, Sect. 7.8).3 This terminology is adopted for the sake of brevity only; as we shall note, there are similarities but also substantive differences between FIT and FDTD proper. For further reading, in addition to the “FDTD encyclopedia” by A. Taflove & S. C. Hagness [TH05], a number of other monographs exist: K. S. Kunz & R. J. Luebbers [KL93], S. D. Gedney [Ged11], D. M. Sullivan [Sul00]. J. B. Schneider has made his book freely available online [Sch10].

7.2 Basic Ideas and Schemes Let us apply the material of Chap. 2 to Maxwell’s equations (ME); this will lead us to the basic ideas of FDTD. One caveat is that ME are written differently in different systems of units. FDTD originated in the electrical engineering community, where the SI system is common, but is being widely used and studied by physicists and mathematicians, who generally prefer the Gaussian system. As a compromise (which will make everybody equally happy or equally unhappy), for purposes of introduction I will write ME in the following unified form: ζ ∇ × E = −∂t B

(7.1)

ζ ∇ × H = ∂t D

(7.2)

where the auxiliary system-unifying coefficient ζ in the SI system is just equal to one and can therefore be ignored; but in the Gaussian system, ζ is the the speed of light c in a vacuum:  1, SI ζ = (7.3) c, Gaussian To fix the ideas of FDTD, we start with the simplest possible case: EM waves depending on only one coordinate—say, x—and traveling in a homogeneous isotropic medium with a permittivity  and permeability μ. Also for maximum simplicity, let the fields have only one Cartesian component: E = yˆ E, H = zˆ H . Maxwell’s equations then assume a particularly simple form:

3 However,

ζ ∂x E = −μ ∂t H

(7.4)

ζ ∂x H = − ∂t E

(7.5)

the Transmission Line Modeling method is not covered in this chapter. This method relies on a physical analogy between electromagnetic fields and a grid of transmission lines in space, rather than on mathematical approximations (C. Christopoulos [Chr95]).

7.2 Basic Ideas and Schemes

359

These equations need to be supplemented with appropriate initial and boundary conditions, which we shall discuss later. Note that in the SI system  and μ are absolute permittivity and permeability—that is, they include the factors of 0 and μ0 , which are absent in the Gaussian system. Recall (and/or directly verify) that (7.1), (7.2) admit particular solutions in the form of traveling waves (TW) E T∓W (x, t) = E 0 f (x ∓ v p t); H P∓W (x, t) = H0 f (x ∓ v p t) 

where Z H0 = E 0 , Z =

μ 

(7.6)

(7.7)

Z being the intrinsic impedance of the medium. In (7.6), the ∓ sign correspond to waves traveling in the ±x directions, respectively; f is an arbitrary function (waveform), sufficiently smooth for Maxwell’s equations (7.4), (7.5) to be valid. If these equations are understood in strong form, then f must be differentiable. A sinusoidal waveform f produces plane waves (PW) as particular analytical solutions in a homogeneous medium: E P W (x, t) = E 0 sin(kx − ωt); H P W (x, t) = H0 sin(kx − ωt) where the wavenumber k is k =

ω√ μ ζ

(7.8)

(7.9)

Let us determine the most promising class of FD schemes for the differential equations (7.4), (7.5). Several major considerations were noted in Chap. 2: 1. 2. 3. 4.

Consistency and order. Stability. Computational cost. Qualitative similarity between discrete and continuous models (see e.g. Sect. 2.5).

For definitions of consistency and stability, see Sect. 2.9; here is a quick informal reminder. Consistency error is obtained if the nodal values4 of the exact solution are substituted into the difference scheme. The order of this residual error with respect to the grid size and time step is, by definition, the order of the scheme.5 Stability implies that the FD solution remains bounded as a function of (discrete) time. Item 4 above sounds open-ended but is as important as the other three. For wave equations in particular, one wants to preserve the oscillatory nature of the solution.

4 Or,

more generally, the relevant degrees of freedom such as edge circulations or face fluxes. orders with respect to space and time usually are, but do not necessarily have to be, equal. If one of them is lower, the solution accuracy will naturally be constrained by that lower order.

5 The

360

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.1 Staggered E and H grids in the xt-plane, for Eqs. (7.4), (7.5). The × signs indicate typical points where the derivatives in (7.4), (7.5) are approximated to second order by central differences (only a few of these points are marked to avoid overcrowding the picture)

If actual waves do not decay, their numerical counterpart must not decay either, or at least experience negligible attenuation. One type of scheme, known as the Yee scheme [Yee66], satisfies all four conditions particularly well and has withstood the test of time. Yee’s key idea is to apply central differencing consistently, which for a homogeneous medium produces second-order accuracy—in contrast with forward or backward schemes which would only be first order. (See Sect. 2.2 for the analysis of this and related matters.) Central-difference schemes for (7.4), (7.5) are set up on staggered grids (Fig. 7.1). That is, the grids for the E and H fields are shifted relative to one another by half a step in both space and time. In Fig. 7.1, the E-grid is indicated with solid circles, the H -grid—with empty squares, and the × signs indicate a typical example of a point where the exact solution is approximated to second order: E n+1,m − E n,m + O(x 2 ) x

(7.10)

Hn+1/2,m+1/2 − Hn+1/2,m−1/2 + O(t 2 ) t

(7.11)

∂x E(x× , t× ) = ∂t H (x× , t× ) =

From these two equations, one can immediately deduce that the following FD scheme for Maxwell’s equation (7.4) is of second order in both space and time: ζ

Hn+1/2,m+1/2 − Hn+1/2,m−1/2 E n+1,m − E n,m = −μ x t

(7.12)

Clearly, similar considerations apply to the other Maxwell equation, (7.5). The Yee scheme for it is ζ

Hn+1/2,m+1/2 − Hn−1/2,m+1/2 E n,m+1 − E n,m−1 = − x t

(7.13)

7.3 Consistency of the Yee Scheme in 1D

361

7.3 Consistency of the Yee Scheme in 1D Let us determine the consistency error of the Yee scheme in one particular but important case when the exact solution of Maxwell’s equations is a plane wave (7.8). Substituting this solution into the FD equations (7.10), one obtains ζ E0 k 3 [μx 2 − t 2 ζ 2 ] [ζ 2 t 2 k 2 + μ x 2 k 2 − 80μ] + h.o.t. 1920 2 μ2 (7.14) where “h.o.t.” stands for higher order terms in x and t. Since the (squared) phase velocity of waves in a homogeneous medium is = ∇×E c

v 2p =

ζ2 μ

(7.15)

one can simplify (7.14) by attributing the factor of k 2 /(μ) to the first pair of square brackets, and another 1/(μ) to the second pair:   ζ E0 k  (kx)2 − (v p kt)2 (kv p t)2 + (kx)2 − 80 + h.o.t. 1920 (7.16) The appearance of k in the first factor is natural because the curl of E has the units of inverse length. Other terms in (7.16) are dimensionless. The consistency error for (7.11) is completely similar and not reproduced here. Expression (7.16) reconfirms, in quantitative terms, that consistency error of the Yee scheme is of second order in x and t. In the curious case of the special (sometimes called “magic”) relation between the steps, x = v p t, not only does the lead error term in (7.16) vanish, but the error is in fact identically zero—that is, the Yee scheme in that case is exact. This can be shown by direct substitution of the plane wave solution—or, more generally, any traveling wave solution—into the Yee scheme (Appendix 7.19). Why, then, would not one always use the “magic” step? That is because the magic-step solution is exact only in the trivial 1D case of a homogeneous medium. In practice, obviously, one is interested in inhomogeneous media, usually in more than one spatial dimension; then, unfortunately, the magic disappears, and numerical modeling becomes much more complex. = ∇×E c

7.4 The Yee Scheme in Fourier Space (1D) Stability analysis is more straightforward in Fourier space, where FDTD schemes turn into simple algebraic relations. It is convenient to define the Fourier transform of variables on staggered grids as a particular case of the continuous transform. To this end, we write any generic grid-based function u  as

362

7 Finite-Difference Time-Domain Methods for Electrodynamics

u  (x, t) ≡



u(n) δ(x − xn x , t − tn t )

(7.17)

n∈Z2

where n = (n x , n t ) ∈ Z2 ; xn x ≡ x0 + n x x; tn t ≡ t0 + n t t

(7.18)

    u(n) are the grid values of a given function, xn x and tn t are the space and time grids, and x0 , t0 are arbitrary reference points, not necessarily zero (allowing one to consider staggered grids). The Fourier transform of u  (x, t) in (7.17) is, by definition, FT{u(n)}(k, ω) ≡ U (k, ω) = u  (x, t) exp(ikx) exp(iωt) d x dt R2

=



u(n) exp(ik(x0 + n x x)) exp(iω(t0 + n t t))

n

= exp(ikx0 ) exp(iωt0 )



u(n) exp(ikn x x) exp(iωn t t)

(7.19)

n∈Z2

Since we are interested in initial value problems, as well as in possible instabilities of the time-stepping procedure, the Fourier transform will be understood in the generalized sense of complex frequency ω = ω  + iω  ;

⇒ exp(iωt) = exp(iω  t) exp(−ω  t)

(7.20)

This turns the Fourier transform essentially into the Laplace transform, under the tacit assumption that the solution is set to zero before some initial moment of time (usually t = 0) when the time-stepping process starts. Clearly, U (k, ω) is periodic in both k and ω, wit the periods of 2π/x and 2π/t, respectively. Since (7.19) can be viewed as a Fourier series for U , the standard expression for the coefficients applies: u(n) =

x t 4π 2

0

k0



ω0

exp(−ik(x0 + n x x)) exp(−iω(t0 + n t t)) dk dω

0

(7.21) 2π 2π k0 ≡ , ω0 ≡ x t

The pair (7.19), (7.21) is well known as forward and inverse Fourier transform taking discrete nonperiodic functions to continuous periodic ones and back. Remark 24 Here we are sacrificing some mathematical rigor in favor of simplicity of analysis. If u(n) is unbounded in space or time, then the existence of the Fourier transform has to be mathematically justified. Furthermore, if u(n) is periodic on the lattice, then a fully discrete transform would need to be introduced. Alternatively,

7.4 The Yee Scheme in Fourier Space (1D)

363

if u(n) is considered only in a bounded domain of the xt-plane (which is always the case in practical simulations), then the transform exists without any additional stipulations. However, in that case boundary conditions have to be specified, which significantly complicates matters (Sect. 7.10.5). It is also well known that a space or time shift in the xt-domain translates into a phase shift in the Fourier domain. Indeed, the FT of the space-shifted function u(n ˜ x , n t ) ≡ u(n x − 1, n t ) is, according to (7.19), FT{u(n x + 1, n t )} = exp(ikx0 ) exp(iωt0 )



u(n x + 1, n t ) exp(ikn x x) exp(iωn t t)

n∈Z2

= exp(ikx0 ) exp(iωt0 )



u(n x , n t ) exp(ik(n x − 1)x) exp(iωn t t)

n ∈Z2

= exp(−ikx) U (k, ω)

(7.22)

The same result is also easy to obtain from the inverse transform (7.21). Quite similarly, (7.23) FT{u(n x , n t + 1)} = exp(−iωt) U (k, ω) With the above preliminaries in mind, we can now Fourier-transform the Yee equations (7.12), (7.13) simply by replacing the generic variable u with either E or H and setting x0 = 0, t0 = 0 for the E grid and x0 = x/2, t0 = t/2 for the H grid: exp(−ikx) − 1 ˜ E(k, ω) = x



t exp(−iωt) − 1 ˜ x exp −iω H (k, ω) μ exp −ik 2 2 t ζ

˜ H˜ are the Fourier transforms of the lattice fields E(n), H (n). The above where E, expression is easily simplified by collecting the exponentials with x and t: ζ sin

ωt 1 ˜ kx 1 ˜ E(k, ω) = μ sin H (k, ω) 2 x 2 t

(7.24)

Similarly, the second Yee equation becomes, in the Fourier domain, ζ sin

kx 1 ˜ ωt 1 ˜ H (k, ω) =  sin E(k, ω) 2 x 2 t

(7.25)

The dispersion relation can now be derived, say, by multiplying Eqs. (7.24) and (7.25):

364

7 Finite-Difference Time-Domain Methods for Electrodynamics

ζ

2

1 kYee x sin x 2

2



ωYee t 1 sin = μ t 2

2 (7.26)

where generic variables k, ω have been replaced with kYee , ωYee to emphasize the fact that this dispersion relation is valid for the FD solution corresponding to the Yee scheme. Expression (7.26) can be made more transparent by noting that phase velocity v 2p = ζ 2 /(μ): 1 kYee x 1 ωYee t sin = ± sin (7.27) x 2 v p t 2 In this form, the dispersion relation is convenient for analysis and, as a small bonus, is the same in the SI and Gaussian systems of units. In the limit of vanishingly small time and space steps the discrete dispersion relation (7.27) tends to the exact one, as should certainly be the case for a valid scheme: ωYee , as (x, t) → (0, 0) (7.28) kYee → kexact = vp More generally, the Taylor expansion of kYee for small around x = 0 yields the following relative error in kYee :

(kexact x)2 1 − S 2 v p t kYee − kexact + h.o.t., S ≡ held constant (7.29) = kexact 24 x where notation S for the Courant stability factor is chosen for consistency with [TH05].

7.5 Stability of the Yee Scheme in 1D A minor rearrangement of terms in (7.27) helps to analyze stability: sin

v p t kYee x ωYee t = ± sin 2 x 2

(7.30)

For stability, ωYee must be real; otherwise, the FD equation will have exponential solutions, one of which will be growing. (Although it is theoretically possible to engineer the initial conditions in a way that would give rise only to the decaying discrete solution, even in that case the growing solution will emerge due to roundoff errors.) With this in mind, the absolute value of the right-hand side of (7.30) must not exceed one. Moreover, since all spatial frequencies are in general present in the Fourier transform, we demand that ωYee be real for any kYee , which leads to the

7.5 Stability of the Yee Scheme in 1D

365

following stability condition: v p t ≤ x

⇐⇒

S=

v p t ≤ 1 x

(7.31)

This relationship has a simple intuitive interpretation: over the time step t, the wave does not travel farther than the spatial step x. Despite the relative simplicity of the above analysis, there is one point of potential confusion. In the generic 1D wave equation ∂x2 u(x, t˜) = ∂t˜2 u(x, t˜), t˜ ≡ v 2p t is mathematically invariant with respect to the interchange of variables x ↔ t. It appears, then, that this interchange could turn the stability argument around: one could rewrite (7.30) as sin

x ωt kYee x = ± sin 2 v p t 2

and demand that the right-hand side of this equality be less or equal to one, for kYee to be real. This would lead to a stability condition exactly opposite to (7.31). What gives? In fact, the space vs. time symmetry is broken by the fact that an initial value problem in time is being considered. It is tacitly assumed that the initial conditions have a well-defined Fourier transform for real k, whereas in the time-domain exponential instabilities may be possible, leading to our use of complex frequencies.

7.6 The Yee Scheme in 2D 7.6.1 Formulation (2D) The system-unifying ζ factor was convenient for introductory purposes but gets cumbersome to retain as we progress toward more advanced topics. This factor is henceforth omitted. This makes no difference for the SI system; in the Gaussian system, one may imagine that the vacuum speed of light c is normalized to unity. Let us start with the E-mode (s-mode), where the electric field has one component (say, E = E z ) in a chosen Cartesian system, and the magnetic field is orthogonal to it, H = Hx xˆ + Hy yˆ . In this case, Maxwell’s curl equations can be written as ∂x Hy − ∂ y Hx =  ∂t E z

(7.32)

∂ y E z = −μ ∂t Hx

(7.33)

366

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.2 Yee grids in 2D (one lattice cell shown). Left: the E-mode (s-mode). Right: the H -mode ( p-mode)

∂x E z = μ ∂t Hy

(7.34)

To fix ideas, we focus on these equations themselves and defer the discussion of initial and especially boundary conditions for later. We are developing an FDTD scheme for (7.32)–(7.34), and the idea of staggered grids carries over to this problem. The left panel in Fig. 7.2 shows the staggered Yee grids for the E-mode; for compactness, only one grid cell is depicted, but there of course is a repeated pattern of such cells in the x y-plane. In the time domain, the E and H variables are also placed in a staggered (leapfrog) fashion: That is, there is a half-step time shift between the E variables and the respective H variables. As in 1D, the half-step shifts in both space and time lead to a second-order scheme. Each space and time derivative in (7.32)–(7.34) is approximated by a central difference. For example,

∂x Hy

t xmid , ymid , t0 + nt + 2

Hy xright , ymid , t0 + nt +

t 2

=

− Hy xleft , ymid , t0 + nt + x

t 2

+ O(x 2 )

(7.35) where the indexes “left,” “right” and “mid” correspond to the left edge, right edge, and the midpoint of the cell in Fig. 7.2 (left panel). As in the 1D case, the E field is sampled at the moments t0 , t0 + t, . . . , t0 + nt, . . ., and the H field—at t0 + t/2, t0 + t + t/2, . . . , t0 + nt + t/2, . . . Approximation of time derivatives by central differences is equally straightforward. For example,

t = ∂t E z xmid , ymid , t0 + nt + 2 E z (xmid , ymid , t0 + (n + 1)t) − E z (xmid , ymid , t0 + nt) + O(t 2 ) t

(7.36)

7.6 The Yee Scheme in 2D

367

From this equation, the electric field at each subsequent time step can be easily expressed via the E and H fields at the current time step, and thus, time marching for E can be implemented in a straightforward way. Similarly, time marching for both components of the H field can be implemented from the FD discretizations of (7.33), (7.34). We omit these explicit FD equations because they are analogous to (7.36), and the key principle is clear. Once all three differential equations (7.32)–(7.34) have been discretized, they are time-marched in a leapfrog fashion. That is, the E field is updated based on the current values of the H field, and then the H field is updated based on the values of the E field. The H mode ( p mode; H = Hz , E = E x xˆ + E y yˆ ) is modeled in a completely analogous way. For reference, here are the relevant differential equations6 : ∂x E y − ∂ y E x = −μ ∂t Hz

(7.37)

∂ y Hz =  ∂t E x

(7.38)

∂x Hz = − ∂t E y

(7.39)

Each derivative is again approximated by the respective central difference. This procedure is straightforward, but FD equations become lengthy if all the terms are written out explicitly, so there is not much benefit in doing so.

7.6.2 Numerical Dispersion and Stability (2D) The Fourier analysis also extends quite naturally from 1D to 2D. The discrete dispersion relation (7.27) (valid for both s and p polarizations) generalizes to kYee,y y 1 1 kYee,x x ωYee t 1 + = 2 2 sin2 sin2 sin2 x 2 2 y 2 2 v p t 2

(7.40)

It is clear that in the limit (x, y, t) → (0, 0, 0) (7.40) gives the exact dispersion relation

2 ω 2 2 kx + k y = (7.41) vp However, for finite values of (x, t) the discrete dispersion relation (7.27) is only an approximation. In particular, it lacks isotropy. Indeed, at a fixed frequency, for a wave traveling at any given angle θ to the x-axis, (7.27) leads to

6 [Yee66]

has a typo in (7.38):  appears on the wrong side of the equality.

368

7 Finite-Difference Time-Domain Methods for Electrodynamics

    ˜ cos θ + sin2 kYee (θ) ˜ sin θ = const sin2 kYee (θ)

(7.42)

where, to shorten the expression, we assumed equal grid sizes in both spatial directions, and  ˜ ≡ ,  ≡ x = y  2 As is easy to see, it is only in the limit  → 0 that (7.42) leads to kYee independent of the angle θ. Otherwise the wavenumber is different for different directions of propagation, and isotropy is lost as a numerical artifact of FD discretization. In connection with this, one can only state the obvious: no numerical method is generally free from various approximation errors, except either for trivial problems with very simple solutions or for mathematical limits such as  → 0. In engineering practice, one tries to solve a given problem using a sequence of meshes with diminishing spatial and time steps. If the results are close, within an accepted tolerance, then one can be reasonably sure that the numerical solution is valid. Stability analysis can also be carried out in full analogy with the 1D case. For stability, the (generalized) frequency must be purely real, and hence, the squared sine function in the right-hand side of the discrete dispersion relation (7.40) must not exceed unity. A sufficient condition for that is v p t ≤ 

xy x 2 + y 2

(7.43)

which for a square grid x = y =  simplifies to  v p t ≤ √ 2 Due to the presence of the factor of than the 1D condition (7.31).



(7.44)

2, this condition is somewhat more restrictive

7.7 The Yee Scheme in 3D 7.7.1 Formulation (3D) In 3D, the electromagnetic field is described by Maxwell’s curl equations (7.1), (7.2). In this section, we will still consider only homogeneous media; the treatment of inhomogeneities is reviewed in Sect. 7.12. As in 1D and 2D, the 3D Yee scheme is produced by consistently applying central differencing to all derivatives in the equations, which ultimately results in secondorder accuracy.

7.7 The Yee Scheme in 3D

369

Fig. 7.3 Yee grid in 3D. One lattice cell for ∇ × E is outlined with a solid blue frame A similar cell for ∇ × H is not explicitly shown to avoid overcrowding the picture, but one face of that cell corresponding to (∇ × H) y is indicated with yellow spherical balls

Fig. 7.4 Front face of the 3D Yee cell of Fig. 7.3

In the time domain, there is the familiar “leapfrog” shift by t/2 between the E and H variables. The geometric setup of the spatial grids is shown in Fig. 7.3. This setup looks convoluted but becomes clear once we parse it out plane by plane. For example, in the front face of the grid cell, redrawn separately for clarity in Fig. 7.4, the arrangement is the same as in the 2D case of the H mode ( p mode). In reference to this setup, one has (∇ × E) y = ∂z E x − ∂x E z =

E z,right − E z,left E x,top − E x,bot − + O(x 2 + z 2 ) z x

(7.45) The notation here is self-explanatory: x and z, as usual, refer to the geometric dimensions of the cell; “top,” “bottom,” “right” and “left”—to the midpoints of the respective edges. Clearly, this relationship can be used to approximate the Maxwell ∇ × E equation to second order, provided that time discretization is performed in the usual leapfrog fashion (that is, the E and H variables are shifted by half the time step).

370

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.5 Schematic representation of the interlocked EH-grids for the Yee scheme in 3D (after https://en.wikipedia.org/ wiki/Finite-difference_timedomain_method#/media/ File:FDTD_Yee_grid_2d-3d. svg). The green cell on the upper left and the brown cell on the lower right carry the E and H degrees of freedom, respectively

The FD cell for ∇ × H is similar and shifted relative to the ∇ × E cell in each spatial direction by half of the respective step. To avoid overcrowding the picture, this ∇ × E cell is not explicitly shown, but one face of that cell, corresponding to the y component of ∇ × H is indicated with yellow spherical balls in Fig. 7.3. Figure 7.57 may help to clarify this further: The Yee cells carrying the E and H degrees of freedom are shifted by half-step with respect to one another, in all three directions. A similar time shift between E and H is implied but cannot be rendered graphically in one figure. From this equation, the electric field at each subsequent time step can be easily expressed via the E and H fields at the current time step, and thus time marching for E can be implemented in a straightforward way. Similarly, time marching for both components of the H field can be implemented from the FD discretizations of (7.33), (7.34). We omit these explicit FD equations because they are analogous to (7.36), and the key principle is clear. Equation (7.45) makes it clear how all other spatial derivatives in Maxwell’s curl equations are discretized. Discretization of the time derivatives is essentially the same as in 1D and 2D, and so is the time-marching procedure. Once this is understood, expressions for the Yee scheme become completely straightforward to write but are lengthy, so there is little benefit in including them here explicitly.

7.7.2 Numerical Dispersion and Stability (3D)

Despite the fully vectorial nature of the electromagnetic problem in 3D, dispersion and stability analysis proceed the same way as in 2D and produce similar results. The dispersion relation for the Yee scheme in a homogeneous medium reads


(1/Δx²) sin²(k_{Yee,x} Δx/2) + (1/Δy²) sin²(k_{Yee,y} Δy/2) + (1/Δz²) sin²(k_{Yee,z} Δz/2) = (1/(v_p² Δt²)) sin²(ω_{Yee} Δt/2)   (7.46)

This again tends to the exact dispersion relation in the mathematical limit of all step sizes going to zero:

k_x² + k_y² + k_z² = (ω/v_p)²,   (Δx, Δy, Δz, Δt) → (0, 0, 0, 0)   (7.47)

As in 2D, this dispersion relation in general lacks isotropy for nonvanishing values of space and time steps; that is, the numerical wavenumber depends on the direction of wave propagation. A sufficient stability condition for the 3D Yee scheme is, in the case of a cubic grid Δx = Δy = Δz = Δ for simplicity,

v_p Δt ≤ Δ/√3   (7.48)

The factor of √3 makes this condition more restrictive than the respective 1D and 2D conditions (7.31), (7.43).
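As a quick numerical illustration, the sketch below (Python/NumPy; the resolutions and directions are illustrative assumptions) solves the dispersion relation (7.46) for ω_Yee on a cubic grid and reports the relative phase-velocity error for axial and diagonal propagation, with the time step chosen just inside the bound (7.48).

import numpy as np

def yee_omega(kvec, delta, dt, vp=1.0):
    # Solve the 3D Yee dispersion relation (7.46) for omega_Yee,
    # given an exact wavevector 'kvec' and cubic grid spacing 'delta'.
    s = sum(np.sin(k * delta / 2.0)**2 / delta**2 for k in kvec)
    return (2.0 / dt) * np.arcsin(vp * dt * np.sqrt(s))

vp, delta = 1.0, 1.0
dt = 0.99 * delta / (vp * np.sqrt(3.0))      # just inside the bound (7.48)

for cells_per_lambda in (10, 20, 40):
    k = 2 * np.pi / (cells_per_lambda * delta)
    for name, direction in (("axis", (1, 0, 0)), ("diagonal", (1, 1, 1))):
        d = np.array(direction, dtype=float)
        d /= np.linalg.norm(d)
        w = yee_omega(k * d, delta, dt, vp)
        err = (w / k - vp) / vp              # relative phase-velocity error
        print(f"{cells_per_lambda:3d} cells/lambda, {name:8s}: error = {err:+.2e}")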

7.8 The Finite Integration Technique (FIT)

By approximating Maxwell's equations in their integral rather than differential form, one may devise discrete models that mimic several important properties of their continuous counterparts. Toward that end, one uses edge circulations of E and H and surface fluxes of B and D, rather than the more traditional nodal values, as the primary degrees of freedom (DOF). More specifically, consider a hexahedral grid cell. Clearly, Faraday's law for any face of that cell (e.g. the shaded one on the right in Fig. 7.6) can be written exactly in terms of the four edge circulations of the E field and the B flux through that face:

E_1 + E_2 − E_3 − E_4 = − d_t Φ   (7.49)

where E_α ≡ ∫_{edge α} E · dl (α = 1, 2, 3, 4) and Φ ≡ ∫_{face} B · dS. These discrete manifestations of Faraday's law, collected over the whole grid, can be cast in matrix-vector form:

C_e e = − d_t Φ   (7.50)


Fig. 7.6 Block arrows schematically indicate the edge circulations E_α ≡ ∫_{edge α} E · dl (α = 1, 2, 3, 4) of the E field and the flux Φ ≡ ∫_{face} B · dS of the B field. With some abuse of schematic representation, circulations and flux are shown as arrows in the reference direction of the respective fields, even though these mathematical quantities are scalars

where e is a Euclidean vector containing all edge circulations of E, and Φ is a vector of all face fluxes of B. The C_e matrix, closely related to the curl of the E field (hence the "C" notation chosen) contains only 0, ±1 as its entries and reflects only the topology of the grid (which edge belongs to which face and in which of the two possible directions the edge circulation is calculated). The dimension of C_e is N_F × N_E, where N_F and N_E are the numbers of faces and edges in the grid, respectively (with the E grid in mind; since the H grid is assumed to have the same number of edges, faces, and cells, we do not make a distinction between these two grids).
It is also meaningful to introduce a discrete divergence-related quantity—namely, the sum of six fluxes out of the faces of each hexahedral grid cell. (Only one such flux is shown in Fig. 7.6 to avoid overcrowding the picture.) Over the whole grid, these sums (i.e. the total fluxes out of each grid cell) can be represented by a "topological" matrix D_e (not to be confused with the D vector; "D" here stands for divergence). The dimension of D_e is N_V × N_F, where N_V is the number of grid cells ("volumes"), and the entries of this matrix are 0, ±1, indicating which face in the global grid belongs to which cell, and whether the chosen normal direction for that face points in or out of the cell. It goes without saying that both matrices C_e, D_e are quite sparse, and that their particular structure depends on the chosen global numbering of edges, faces and volumes in the grid, and on the chosen directions of edge circulations and face fluxes.
For the C_e and D_e matrices so defined, a discrete analog of the calculus identity ∇ · (∇×) ≡ 0 holds:

D_e C_e = 0   (7.51)

An outline of the proof is as follows. Consider any hexahedral cell V_α in the E grid and any given Euclidean vector e ∈ R^{N_E} of edge circulations. There are six rows in C_e corresponding to the faces of V_α; each of these rows contains exactly four nonzero entries ±1. Let us call the collection of these rows (a submatrix of C_e) C_6. Then, C_6 e


is a 6-vector, and each of its entries is the sum of four circulations of e, with their respective signs. Further, let D_α be the row of D_e corresponding to the grid cell V_α; this row contains exactly six nonzero entries ±1 corresponding to the faces of V_α. Then, the product D_α C_6 e contains 6 × 4 = 24 contributing terms—the circulations in e,—but they cancel in a pairwise fashion because each edge of V_α is traversed twice in two opposite directions. This is analogous to the standard semiformal derivations of the div and curl identity and Stokes' theorem (see e.g. https://en.wikipedia.org/wiki/Stokes%27_theorem#Underlying_principle).
Obviously, completely similar considerations apply to the H grid, displaced relative to the E grid by half the mesh size in all spatial directions. Matrices D_h and C_h, fully analogous to D_e and C_e, can be defined, and

D_h C_h = 0   (7.52)
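The identity (7.51) is easy to verify numerically. The following minimal Python/NumPy sketch assembles the face-edge and cell-face incidence matrices for a single hexahedral cell, with an edge/face numbering chosen purely for illustration (an assumption, not the book's convention), and checks that their product vanishes.

import numpy as np

def edge_index(axis, a, b):
    # axis 0: x-edges labeled (j, k); axis 1: y-edges (i, k); axis 2: z-edges (i, j)
    return 4 * axis + 2 * a + b

C = np.zeros((6, 12), dtype=int)   # circulation of E around each face
D = np.zeros((1, 6), dtype=int)    # net flux out of the single cell

for i in (0, 1):                   # x-faces, normal along +x
    C[i, edge_index(1, i, 0)] = +1
    C[i, edge_index(2, i, 1)] = +1
    C[i, edge_index(1, i, 1)] = -1
    C[i, edge_index(2, i, 0)] = -1
    D[0, i] = -1 if i == 0 else +1
for j in (0, 1):                   # y-faces, normal along +y
    C[2 + j, edge_index(2, 0, j)] = +1
    C[2 + j, edge_index(0, j, 1)] = +1
    C[2 + j, edge_index(2, 1, j)] = -1
    C[2 + j, edge_index(0, j, 0)] = -1
    D[0, 2 + j] = -1 if j == 0 else +1
for k in (0, 1):                   # z-faces, normal along +z
    C[4 + k, edge_index(0, 0, k)] = +1
    C[4 + k, edge_index(1, 1, k)] = +1
    C[4 + k, edge_index(0, 1, k)] = -1
    C[4 + k, edge_index(1, 0, k)] = -1
    D[0, 4 + k] = -1 if k == 0 else +1

print(D @ C)   # all zeros: the discrete analog of div(curl) = 0, cf. (7.51)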

With the discrete analogs of circulation and flux operations (or, almost equivalently, of div and curl) so defined, Maxwell's equations can be discretized with zero consistency error. That is, the edge circulations and face fluxes of any actual solution of Maxwell's equations satisfy the discrete analogs of Maxwell's curl equations such as (7.50) exactly. It would clearly be too good to be true if the numerical solution obtained from these schemes were itself exact. One obvious source of error is FD approximation of the time derivatives of fields. But there is another, more consequential source. The system of discrete circulation-flux equations needs to be closed by specifying material relations between the DB and EH degrees of freedom. For spatially uniform fields, such relations are trivial:

Φ_B = (μ/l_edge) Ĥ S_face;   Φ_D = (ε/l_edge) Ê S_face

where the notation is self-explanatory. For nonuniform fields, and especially in the presence of material interfaces, the flux-circulation relations become complicated. This is one of the central problems in FD-type methods in general and FIT in particular. To get a flavor of the issues involved and various ideas available, the interested reader may turn to Sect. 7.12 and multiple references there—in particular, papers by T. Weiland, I. Zagorodnov, M. Clemens, H. De Gersem [CW02, ZSW03, Wei03, SW04, DMW08]. More fundamentally, the flux-circulation material relations can be viewed from a rigorous differential-geometric perspective, via “discrete Hodge operators.” This subject is beyond the scope of the book, and I refer to papers by R. Hiptmair, A. Bossavit, T. Tarhasaari, F. Trevisan, L. Kettunen, F. Teixeira, W. C. Chew [TKB99, TC99, BK00, Hip01, TK04, Tei14]. Related work by E. Tonti [Ton02], C. Mattiussi [Mat97, Mat00], J.M. Hyman & M. Shashkov [HS99] must also be mentioned.



T. Weiland’s development of FIT [Wei77, Wei96, Wei03] formed a basis for commercial software packages by Dassault Systèmes—Computer Simulation Technology (CST) (Sect. 7.18).

7.9 Advanced Techniques

7.9.1 Implicit Schemes

Although explicit schemes are simple and efficient, they are subject to stability requirements. Namely, the maximum time step is related to the spatial grid size, and therefore, stability constraints are particularly severe for problems with fine spatial grids. In contrast, for implicit schemes the time step is constrained only by accuracy considerations rather than stability (Sect. 7.9.1); but, on the downside, large systems of equations have to be solved at each time step. An interesting trade-off is achieved in the alternating direction implicit (ADI) method, which can be traced back to the work of D. W. Peaceman & H. H. Rachford [PR55] and J. Douglas [Dou55] in the 1950s. ADI can be applied to matrix equations of the form AX − XB = C, where A, B, C are given matrices and X is unknown, or to FD systems arising from elliptic, parabolic or hyperbolic boundary value problems. To highlight the key ideas, let us consider the following setup which, as a bonus, has some interesting but not widely known features.
Suppose that we are solving a time-dependent boundary value problem in two spatial dimensions, on a regular Cartesian grid in a rectangular computational domain. For simplicity of exposition, consider a parabolic problem of the same form as in the work of S. P. Voskoboynikov & Yu. V. Rakitskii [DRV84]:

∂_t u = L u + f(x, y),   L = L_x + L_y   (7.53)

L_x u = ∂_x(a_x(x) ∂_x u) + b_x(x) ∂_x u + c_x(x) u   (7.54)

L_y u ≡ ∂_y(a_y(y) ∂_y u) + b_y(y) ∂_y u + c_y(y) u   (7.55)

Note the limitation: The coefficients a, b, c depend on one coordinate only. For definiteness, assume that Eqs. (7.53), (7.54), (7.55) are supplemented by proper Dirichlet boundary conditions and initial conditions for u(x, y) at t = 0. The traditional way of representing the FD discretization of this problem on a regular Cartesian grid with n x × n y = n nodes is by unwrapping the numerical solution into a column vector in Rn (or Cn for complex-valued solutions). An interesting alternative, however, is to keep the numerical solution in its natural 2D form, i.e. as an n x × n y matrix U . The FD problem can then be written as a matrix equation


d_t U(t) = L_x U(t) + U(t) L_y + F(t);   U(0) = U_0   (7.56)

where the square matrices L_{x,y} represent a given discretization of the operators L_{x,y}, respectively. Note that L_x multiplies U from the left and can be viewed as acting on each column of U separately, which in our representation corresponds to a varying x at a fixed y. Similarly, L_y acts on the rows of U, which, as expected, corresponds to the L_y operator.
One could apply basic explicit and implicit schemes to (7.56). For example, the Crank–Nicolson scheme reads

(U(t + Δt) − U(t)) / Δt = L_x (U(t + Δt) + U(t))/2 + ((U(t + Δt) + U(t))/2) L_y   (7.57)

where F is set to zero for brevity; this is also the most common case in FDTD simulations. The above equation for U(t + Δt) is computationally costly to solve because of the combination of the L_x and L_y terms. The ADI idea is to decouple those terms by splitting the time step into two half-steps:

(U(t + Δt/2) − U(t)) / (Δt/2) = L_x U(t + Δt/2) + U(t) L_y   (7.58)

(U(t + Δt) − U(t + Δt/2)) / (Δt/2) = L_x U(t + Δt/2) + U(t + Δt) L_y   (7.59)

The key observation is that the first half-step, implicit with respect to U(t + Δt/2), requires solution of a matrix equation with L_x only, whereas the second half-step involves solution of a system with L_y. These separate systems can be much easier to solve than one with both L_x and L_y. Typical, for example, is a three-point stencil per coordinate direction (as in the Yee scheme), in which case each of the two matrices L_{x,y} is tridiagonal, and the respective systems can be solved just with O(n_{x,y}) operations.
The ADI ideas extend to Maxwell's equations (ME); however, due to the vectorial three-dimensional nature of ME and multi-index notation, expressions become cumbersome. I do not reproduce them here and refer instead to the original work by R. Holland [Hol84] and T. Namiki [Nam99], as well as to the lucid paper by M. Chai et al. [CXZL07]. In addition, H. de Raedt gives an excellent exposition of FDTD-ADI methods, including a general framework involving approximations of matrix exponentials [TH05, Chap. 18].
A different splitting idea for Maxwell's equations was proposed by F. Zheng et al. [ZCZ99, ZCZ00]. Consider first Maxwell's ∇ × H equations in the Cartesian representation:

∂_t E_x = ε^{-1} (∂_y H_z − ∂_z H_y)   (7.60)

∂_t E_y = ε^{-1} (∂_z H_x − ∂_x H_z)   (7.61)


∂_t E_z = ε^{-1} (∂_x H_y − ∂_y H_x)   (7.62)

In conventional ADI schemes in 3D, one time step is broken up into three substeps, implicit with respect to the x, y and z coordinates. In contrast, there are two time substeps in the procedure due to F. Zheng et al., and the splitting is with respect to the two terms in the right-hand side of each of Maxwell’s equations (7.60), (7.61), (7.62). That is, the first substep is implicit with respect to the first terms (∂ y Hz , ∂z Hx , ∂x Hy ), and the second substep—with respect to the second terms (∂z Hy , ∂x Hz , ∂ y Hx ).
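For a concrete picture of the half-steps (7.58), (7.59), here is a minimal Python/NumPy sketch for the model problem (7.53) with L_x = L_y = the standard 1D Laplacian; dense solves are used for readability, whereas a practical code would exploit the tridiagonal structure. All names and parameter values are illustrative assumptions.

import numpy as np

n, dt = 50, 1e-3
h = 1.0 / (n + 1)
L1 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
      + np.diag(np.ones(n - 1), -1)) / h**2
Lx = Ly = L1
I = np.eye(n)

def adi_step(U, dt):
    # First half-step, implicit in x: (I - dt/2 Lx) U* = U (I + dt/2 Ly)
    rhs = U @ (I + 0.5 * dt * Ly)
    Ustar = np.linalg.solve(I - 0.5 * dt * Lx, rhs)
    # Second half-step, implicit in y: Unew (I - dt/2 Ly) = (I + dt/2 Lx) U*
    rhs = (I + 0.5 * dt * Lx) @ Ustar
    return np.linalg.solve((I - 0.5 * dt * Ly).T, rhs.T).T

# Smooth initial condition; the solution should decay monotonically.
x = np.linspace(h, 1 - h, n)
U = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
for _ in range(100):
    U = adi_step(U, dt)
print("max |U| after 100 steps:", np.abs(U).max())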

7.9.2 Pseudospectral Time-Domain Methods

It is well known that smooth periodic functions and their derivatives can be approximated very accurately via a Fourier series, so that the approximation errors can fall off exponentially as the number of terms in the expansion increases. The relevant mathematical theory can be found in S. A. Orszag's paper [Ors80]; see also [TMCM19, Sect. 2] for a recent summary. This allows one to construct FD schemes much more accurate than the traditional ones. In such schemes, the relevant spatial derivatives of the numerical solution are obtained in a standard fashion from its fast Fourier transform (FFT); time derivatives are still approximated by finite differencing. This idea has been around for a long time. For example, H. O. Kreiss & J. Oliger noted in 1972 [KO72, Sect. 4]:
... using [the Fourier] method, we need only two points per wave length to represent the wave exactly, compared to seven points for the fourth order scheme allowing an error of 10%, and thirteen points allowing 1% error.

In 1975, B. Fornberg published a detailed mathematical analysis of FFT-FD methods for hyperbolic equations in one spatial dimension [For75]. In the early 1980s, the practical advantages of this class of methods for 2D problems in acoustics were recognized (J. Gazdag [Gaz81], D. D. Kosloff & E. Baysal [KB82], among others). One limitation of FFT-FD is the requirement of spatial periodicity, which until the 1990s impeded applications of these methods in FDTD electromagnetic analysis, especially in 3D. Imposing periodic boundary conditions on a hexahedral computational domain leads to a numerical artifact—the “wraparound effect,” whereby a wave leaving the domain on one side reemerges on the opposite side. After J. P. Bérenger’s breakthrough in perfectly matched layers (PML) [Ber96], Q. H. Liu noted in 1997 [Liu97] that FFT-FD can be gainfully combined with the PML. The latter allows the outgoing waves to exit the domain, while suppressing the incoming wraparound ghosts. Since 1997, the FFT-TD-PML method, known as “pseudospectral time domain” (PSTD), has gained recognition and popularity. The interested reader may find the technical details and further references e.g. in the papers by G. Chen et al. [CYK08] and M. Chai et al. [CXZL07].
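The accuracy advantage of spectral differentiation underlying PSTD is easy to see in a few lines of Python/NumPy; the test function and grid size below are illustrative assumptions.

import numpy as np

n, L = 32, 2.0 * np.pi
x = np.arange(n) * L / n
u = np.exp(np.sin(x))                    # smooth, periodic test function
du_exact = np.cos(x) * u

k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)     # wavenumbers
du_fft = np.real(np.fft.ifft(1j * k * np.fft.fft(u)))   # spectral derivative
du_fd = (np.roll(u, -1) - np.roll(u, 1)) / (2 * L / n)  # 2nd-order central FD

print("max error, FFT derivative :", np.abs(du_fft - du_exact).max())
print("max error, central FD     :", np.abs(du_fd - du_exact).max())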


7.10 Exterior Boundary Conditions

7.10.1 Introduction

In FDTD, as well as in all other methods based on differential formulations, unbounded domains must for obvious computational reasons be truncated by an artificial exterior boundary. Conditions on such boundaries turn out to be highly nontrivial. A person unfamiliar with the problem might suggest moving this boundary sufficiently far away from all sources and scatterers, and then imposing, say, the Dirichlet conditions on the field at that boundary. This may indeed be feasible (albeit not necessarily efficient) for static fields or diffusion problems. For waves, however, this approach is qualitatively wrong because Dirichlet conditions transform an open-boundary domain to a cavity with perfectly conducting walls. All outgoing waves are then reflected back into the computational domain and produce spurious interference. If the artificial boundary is placed far away, and the simulated time interval is short and insufficient for a wave to make a round trip from the sources and scatterers to the boundary and back, then there is no problem; but this situation is uncommon in practice. The general goal is to devise exterior boundary conditions under which outgoing waves would either be absorbed or will exit the domain with as little reflection as possible.
The critical role of artificial boundary conditions for finite difference or finite element solution of wave problems was recognized early on. Important ideas and methods were put forward in the 1970s–90s, but this is still an active research field. The range of proposed solutions is vast; they can be subdivided into three major categories:
1. Nonlocal conditions in space and/or time: L. Ting, M. J. Miksis, D. Givoli, J. B. Keller & M. J. Grote [TM86, KG89, Giv91, GK95, GK96, GK98, Giv99].
2. Absorbing boundary conditions (ABC), also known as nonreflecting conditions (NRBC); these are local and approximate: B. Engquist, A. Majda, G. Mur, R. Higdon, A. Bayliss, E. Turkel, T. Hagstrom, S. Hariharan, T. Warburton [EM77, Mur81, Hig86, Hig87, BT80, BGT82, HH98, Giv01, GN03, HHT03, HW04, HMOG08, HWG10, ZT13].
3. Perfectly Matched Layers: J.-P. Bérenger, Z. S. Sacks et al., S. Abarbanel & D. Gottlieb, F. L. Teixeira & W. C. Chew, S. D. Gedney, F. Collino & P. B. Monk, E. Bécache & P. Joly [Ber96, SKLL95, AG98, TC98a, Ged96, CM98b, BJ02, BPG04b].
A number of excellent reviews and analyses are available for all the approaches above: D. Givoli [Giv04], S. Tsynkov [Tsy98], T. Hagstrom [Hag99]. This section is not a comprehensive exposition of this vast subject; rather, the objective is twofold: (i) a concise review, with a few historical notes; (ii) a different perspective on some absorbing conditions, showing that they can be viewed as peculiar versions of Trefftz methods, with ideas similar to those of Trefftz–FLAME (Chap. 4).


The historical development of non-reflecting boundary conditions (NRBC) has been nicely captured by D. Givoli [Giv04] (in the following quote, the original reference numbers from D. Givoli's paper are replaced with this book's references):
The main milestones in the history of NRBCs are as follows: (1) Till the late 1970s the use of the Sommerfeld-like NRBC has dominated the field [Giv92, Giv91]. However, soon it became clear that this NRBC provides a very crude approximation. In fact, today it is thought of as a "zero-order" boundary condition. (2) From the late 1970s to the mid-1980s, a few improved low-order NRBCs have been proposed. Some of them became well-known, e.g., the Engquist–Majda NRBCs [EM77] and the Bayliss–Turkel NRBCs [BT80]. The second-order NRBCs in these two sequences have especially become popular... (3) The late 1980s and early and mid-1990s have been characterized by the emerging of exact non-local NRBCs like those based on the Dirichlet-to-Neumann (DtN) map [KG89, GK90] and on the Difference Potential Method (DPM) [RT95, TTA96]. (4) In the mid-1990s the perfectly matched layer (PML) [Ber96] was invented. (5) Since the mid-1990s high-order local NRBCs have been developed.

To this list, we can now add modern versions of absorbing conditions and PML— by T. Hagstrom et al. [HGRB14], V. Druskin et al. [DGK16], S. D. Gedney & coworkers [RG00]. The “discrete PML” by A. Chern deserves separate attention; see Sect. 7.10.4. Even though absorbing (non-reflecting) conditions were historically developed prior to the PML, the latter proved to be more practical, so we start this overview with the PML.

7.10.2 Perfectly Matched Layers In the mid-1990s, J.-P. Bérenger’s idea of Perfectly Matched Layers (PMLs) [Ber94, Ber96] as an alternative to absorbing conditions revolutionized this field of research. A. Taflove & S. C. Hagness [TH05, Chap. 7] provide the following concise summary: Since 1994, a new fervor ... has been created by Berenger’s introduction of a highly effective absorbing-material ABC designated the perfectly matched layer (PML). The innovation of Berenger’s PML is that plane waves of arbitrary incidence, polarization, and frequency are matched at the boundary. Perhaps of equal importance is that the PML can be used as an absorbing boundary to terminate domains comprised of inhomogeneous, dispersive, anisotropic, and even nonlinear media, which was previously not possible with analytically derived ABC. In his pioneering work, Berenger derived a novel split-field formulation of Maxwell’s equations where each vector field component is split into two orthogonal components. Maxwell’s curl equations were also appropriately split, leading to a set of 12 coupled first-order partial differential equations. ... In a continuous space, the PML absorber and the host medium are perfectly matched. However, in the discrete FDTD lattice ... discretization errors can degrade the ideal behavior of the PML. 10 In the following quote, the original reference numbers from D. Givoli’s paper are replaced with this book’s references.


Technical details of J.-P. Bérenger’s PML can be found in his original papers [Ber94, Ber96]. His formulation is not reproduced here because it is rather cumbersome and, more importantly, has been superseded by better versions of PMLs, as already noted in Sect. 7.10.1. Bérenger’s proposal turned out, in subsequent studies and numerical experiments, to have subtle instabilities. S. Abarbanel & D. Gottlieb’s mathematical analysis showed that Bérenger’s PML is not strongly well-posed [AG97, AG98]; see also S. Abarbanel et al. [AGH02]. More precisely, as stated by P. Petropoulos [Pet00], The Berenger PML does not correspond to a physical process, and the mathematical manifestation of this is the weak well-posedness of the relevant Cauchy problem. ... the hyperbolic Berenger system is not symmetric, ... allows wave modes which grow linearly, in time ... it is ill-posed under perturbations.

Bérenger’s idea gave rise to a large volume of research on various other versions of PML: uniaxial PML (UPML, S. D. Gedney, [Ged96], Z. S. Sacks et al. [SKLL95]), convolution PML (CPML, S. D. Gedney & J. A. Roden [RG00]), and PML based on complex coordinate transforms, closely related to UPML (W. C. Chew, W. Weedon, F. L. Teixeira [CW94, TC97, TC98b]).

7.10.3 PML: Complex Coordinate Transforms

In the mid-1990s, W. C. Chew and collaborators developed a PML based on complex coordinate transformations [CW94, TC97, TC98b]. This approach has proved to be quite fruitful and is outlined below, following closely a lucid exposition by S. Johnson [Joh10] and a rigorous mathematical presentation by A. Chern [Che19]. For brevity of notation, let us start with a 1D scalar wave equation, and then consider extensions to Maxwell's equations in multiple dimensions.
Let u(z) = exp(ikz) be a traveling wave satisfying the Helmholtz equation in a homogeneous medium at a fixed frequency ω and the corresponding wavenumber k:

d_z² u + k² u = 0   (7.63)

This Helmholtz equation itself and its plane wave solution can be complexified. That is, one may consider their analytical continuation from the real z-axis into the complex plane:

d_Z² u + k² u = 0,   (7.64)

u(Z) = exp(ikZ),   (7.65)

with

Z = z + i f(z)   (7.66)


Fig. 7.7 Path P in the complex plane for the construction of a coordinate-stretched PML. (After S. G. Johnson’s notes [Joh10].)

where f is some suitable function of z. Note that (7.66) defines a path in the complex plane, dependent on the choice of f. The key idea of this complexification is that for f(z) > 0 the wave exp(ikZ) acquires the attenuation factor exp(−k f(z)); but if f = 0, this wave is obviously unaffected.
Let us assume that the exterior boundary of the domain is located at z = z_0, and then consider a path P going along the real axis for z < z_0 but deviating into the first quadrant for z ≥ z_0 (Fig. 7.7; this first-quadrant part of the path is shown as a straight ray in the figure, for reasons that will become clear shortly). From the italicized statement of the previous paragraph, it is clear that the first-quadrant part of P (z ≥ z_0) acts as a perfectly matched layer indeed: The outgoing wave remains unchanged for z < z_0 (i.e. no reflected wave exists) but gets attenuated for z > z_0. Once the attenuation reaches a desired level (in principle, the amplitude of the wave could even be reduced to machine precision), the domain can be truncated and a zero Dirichlet condition imposed.
Thus, the problem of absorbing conditions could be reduced to solving the wave equation along a (suitably truncated) path P. This approach is, however, unconventional, so it makes practical sense to pull the differential equation back to the real z-axis. This is accomplished by the straightforward substitution

d/dZ ≡ d/d(z + i f(z)) = (1/(1 + i f′(z))) d/dz

Hence, the wave equation (7.64) along the complex-plane path can be converted to the standard equation

d_z² u + (1 + i f′(z))² k² u = 0   (7.67)

Rewriting this equation in terms of frequency ω and refractive index n of the original homogeneous medium, we get, since k = nω/c,

d_z² u + (1 + i f′(z))² (nω/c)² u = 0   (7.68)

or just


d_z² u + (ñω/c)² u = 0,   (7.69)

ñ ≡ (1 + i f′(z)) n   (7.70)

We thus arrived at a conventional wave problem with a PML material described by (7.69). We have also observed that a coordinate transformation—in this case, (7.66)—can be recast equivalently as a material transformation (7.70). This equivalence is the cornerstone of transformation optics (A. J. Ward, J. B. Pendry et al. [WP96, PSS06]; Sect. 9.2.3) but has a very long history (see e.g. I. R. Ciric & S. H. Wong [CW86], E. M. Freeman & D. A. Lowther [FL89], A. Stohchniol [Sto92], A. Plaks et al. [PTPT00]).
Any physically reasonable function f(z) will give rise to a valid PML; however, not all choices are equally efficient in practice. It is clear from (7.68) that the local attenuation rate around a point z is governed by the product f′(z)nω/c. For time-domain problems, where the solution has a broad frequency spectrum, it is desirable to make this attenuation rate frequency independent. This is achieved if

f′(z) = σ(z)/ω   (7.71)

To simplify this further, σ could be a constant. The refractive index (7.70) becomes

ñ ≡ (1 + iσ/ω) n   (7.72)

A mild downside of this choice of f is that it defines an artificial PML medium with frequency dispersion (frequency-dependent index). However, there are several efficient implementations of FDTD for dispersive media; see Sect. 7.14. So far, we have considered outgoing waves traveling in a given fixed direction (z). In 3D, for a planar exterior boundary surface perpendicular to z, outgoing waves will in general impinge on that boundary at arbitrary angles. The above analysis holds with the replacement of the wavenumber k with its z-component k z . This still defines a PML, although the attenuation rate in it will now depend on the angle of incidence. This seems to be an unavoidable feature of any PML, since waves traveling at grazing incidence cannot be expected to be strongly attenuated. This is not a serious problem, though, since in practice the PML is placed at a distance from all sources and scatterers, and so grazing incidence of outgoing waves can be avoided. Since the coordinate transformation is performed in the direction perpendicular to the exterior boundary, while the tangential directions remain unchanged, the artificial PML is uniaxial. There are ways of deriving the respective material parameters directly, by imposing the condition that the reflection coefficient be zero at all angles (Z. S. Sacks et al. [SKLL95]).
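A short numerical check of the frequency-independent attenuation implied by (7.71), (7.72) is given below (Python; all parameter values are illustrative assumptions): with ñ = (1 + iσ/ω)n, the complex wavenumber is k̃ = nω/c + iσn/c, so the amplitude after a one-way pass through a layer of thickness d decays by exp(−σnd/c) regardless of ω.

import numpy as np

c, n, sigma, d = 1.0, 1.0, 2.0, 3.0
for omega in (0.5, 1.0, 5.0, 20.0):
    n_tilde = (1.0 + 1j * sigma / omega) * n
    k_tilde = n_tilde * omega / c
    amplitude = abs(np.exp(1j * k_tilde * d))       # |exp(i k~ d)|
    print(f"omega = {omega:5.1f}: attenuation = {amplitude:.6f}, "
          f"exp(-sigma*n*d/c) = {np.exp(-sigma * n * d / c):.6f}")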


7.10.4 Discrete Perfectly Matched Layers

Bérenger's PML and most of its later improvements were developed in continuous space and time. These PMLs were—true to their name—indeed perfectly matched. That is, no reflection is generated for any outgoing waves incident on the layer at any acute angle. Unfortunately, once the continuous equations are discretized, the matching degrades and is no longer perfect; spurious reflections do appear. Thus, an interesting idea is to devise PMLs directly on the discrete level. Such development is discretization-dependent and hence not as universal as it is in the continuous case; but on the plus side, one may hope that discrete absorbing layers could indeed be perfectly matched and generate no reflection for any lattice-based outgoing waves.
W. C. Chew & J. M. Jin moved in this direction in the mid-1990s [CJ96]. They examined discrete analogs of the Chew–Weedon coordinate stretching [CW94] and concluded:
A perfectly matched interface is shown not to exist in the discretized space, even though it exists in the continuum space. Numerical simulations both using finite difference method and finite element method confirm that such discretization error exists. A numerical scheme using the finite element method is then developed to optimize the PML with respect to its parameters.

A few versions of absorbing conditions (rather than PMLs) on the discrete level are described in Sect. 7.10.5. A. Modave et al. [MDG14] analyzed various spatial discretizations of continuous-space PMLs and optimized the respective damping functions. In all these cases, though, the ABC and PMLs are only approximately non-reflecting.
When this chapter was being written, a perfect discrete PML was published by A. Chern [Che19]. This appears to be a breakthrough; time will tell if this is indeed so. The crux of this development is Discrete Complex Analysis, which produces lattice-based analogs of analytical functions and analytical continuation. As a result, it turns out to be possible to engineer a PML by discrete coordinate stretching. What follows is an outline of A. Chern's idea.
Consider a lattice—not necessarily rectangular—in the complex z plane, indexed by an integer pair (m, n). For any complex-valued function f on this lattice, one may introduce a discrete analog of the Cauchy–Riemann condition:

(f_{m+1,n+1} − f_{m,n}) / (z_{m+1,n+1} − z_{m,n}) = (f_{m,n+1} − f_{m+1,n}) / (z_{m,n+1} − z_{m+1,n})   (7.73)

Note that these fractions have a transparent geometric meaning: they correspond to the two diagonals of the lattice cell (“face”) indexed by m, m + 1, n, n + 1. If (7.73) is satisfied for a given face, its left or right-hand side is defined to be the discrete complex derivative of f over that face. If this derivative exists (i.e. the discrete Cauchy–Riemann condition holds) for all faces, then function f is said to be discrete holomorphic.
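As a small illustration of definition (7.73), the following Python/NumPy sketch verifies that f(z) = z² is discrete holomorphic on a uniform square lattice; the lattice size and spacing are illustrative assumptions.

import numpy as np

h, M, N = 0.1, 6, 6
m, n = np.meshgrid(np.arange(M + 1), np.arange(N + 1), indexing="ij")
z = m * h + 1j * n * h          # uniform square lattice in the complex plane
f = z**2

# Diagonal difference quotients (7.73) over every lattice face
lhs = (f[1:, 1:] - f[:-1, :-1]) / (z[1:, 1:] - z[:-1, :-1])
rhs = (f[:-1, 1:] - f[1:, :-1]) / (z[:-1, 1:] - z[1:, :-1])
print("max |lhs - rhs| over all faces:", np.abs(lhs - rhs).max())   # ~ machine zero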


A. Chern then considers the standard central-difference scheme for the scalar wave equation in the time domain and in any number of spatial dimensions; in a homogeneous domain, it can be Fourier-transformed to the central-difference scheme for the Helmholtz equation. It then turns out that a discrete analytical continuation (in the sense of discrete holomorphic functions, as defined above) can be constructed for the difference scheme and its solutions. This produces a full discrete analog of continuous coordinate stretching (Sect. 7.10.3) and of the respective PML:

∂_t² u = Σ_{α=1}^{d} (τ_α u − 2u + τ_α^{-1} u) / h²  +  Σ_{α=1}^{d} (σ_α (τ_α Φ_α) − (τ_α^{-1} σ_α)(τ_α^{-1} Φ_α)) / h   (7.74)

∂_t Φ_α = − ((τ_α^{-1} σ_α)(τ_α^{-1} Ψ_α) + σ_α Ψ_α) / 2  −  (τ_α u − τ_α^{-1} u) / (2h)   (7.75)

∂_t Ψ_α = − ((τ_α^{-1} σ_α) Φ_α + σ_α (τ_α Φ_α)) / 2  −  (τ_α u − τ_α^{-1} u) / (2h)   (7.76)

The notation here is as follows:
• d is the number of spatial dimensions.
• h is the mesh size, for simplicity the same in all coordinate directions.
• α is the coordinate index.
• τ_α is the spatial shift operator—translation of the solution by one grid cell in the direction of coordinate number α. In particular, the first term in the right-hand side of (7.74) and the last terms in (7.75), (7.76) are the standard central differences.
• σ_α are damping coefficients, equal to zero outside the PML and adjustable parameters within the PML.
• Φ_α and Ψ_α are auxiliary variables defined only within the PML.
Once this notation is understood, implementation of the discrete PML for the scalar wave equation is straightforward. One important limitation of the method, however, is that it has to be tailored to each difference equation separately. Quoting A. Chern [Che19]:
Each discrete wave equation requires a new investigation in the perspective of discrete complex analysis and discrete differential geometry. A discrete wave equation based on a higher order scheme may require a different discrete Cauchy–Riemann relation. ... Another notable property of the current discrete wave equation is that it allows one to investigate the PML dimension by dimension, each of which fortunately admits a straightforward discrete complex extension. Such a property may not be true for other discretized wave equations. A discrete wave equation formulated on an unstructured mesh also requires new coordinate-free theories for discussing any notion of discrete analytic continuation. ... The current work also relies on taking Fourier transform in time ... Such a treatment may need to be revised for the discrete Maxwell's equations based on Yee's scheme.


7.10.5 Absorbing Conditions

7.10.5.1 General Considerations

This section is not a comprehensive study or review of absorbing (non-reflecting) boundary conditions (ABC). Rather, we look at a number of these conditions from a non-traditional perspective, consistent with other material in this book—namely, Trefftz approximations. These are used in Chap. 4 to construct high-order finite-difference schemes and in Chap. 9 to develop two-scale homogenization procedures. For detailed information about various absorbing conditions, see reviews by S. V. Tsynkov [Tsy98] and D. Givoli [Giv04]. The remainder of this subsection follows closely my preprint [Tsu14], as well as its modified and more condensed published version (A. Paganini et al. [PSHT16]).
To fix ideas, we shall consider the scalar wave equation either in the frequency domain

∇² u + k_0² u = f in R^n, n = 1, 2, 3;   supp f ⊂ Ω ⊂ R^n   (7.77)

or, alternatively, in the time domain

v² ∇² u − ∂_tt² u = f in R^n, n = 1, 2, 3;   supp f ⊂ Ω × [0, ∞)   (7.78)

As indicated in these equations, sources f are assumed to be confined to a bounded domain Ω in space. In (7.77), k_0 is a given positive wavenumber. In (7.78), v is the velocity of waves, for simplicity assumed to be position-independent, although the analysis can be extended to more complex cases. When convenient, v will be normalized to unity. We shall deal primarily with 2D problems, although all ideas can be extended to 3D. Let us assume that ∂Ω is a rectangle (a parallelepiped in 3D); conditions at the corners and edges will not be ignored.
Problem (7.77) requires radiation boundary conditions (e.g. Sommerfeld) at infinity, but our task is to replace these theoretical conditions with approximate but accurate and practical ones on the exterior surface ∂Ω away from the sources. This is to be done in such a way that the solution subject to these artificial conditions be by some measure close to the true solution in Ω. Similarly, we shall seek approximate boundary conditions on ∂Ω for problem (7.78) as well. Initial conditions for (7.78) are assumed to be given and are tangential to our analysis.
Let us first consider a straight artificial boundary in 2D, for convenience at x = 0, with the computational domain Ω situated on the positive x side. One classical nonreflecting condition, due to B. Engquist & A. Majda [EM77], follows from the dispersion relation

k_x² + k_y² − k² = 0   ⟹   k_x = −k (1 − k_y²/k²)^{1/2}   (7.79)


which holds for problem (7.77) if k = k_0 and for the Fourier-transformed problem (7.78) if k = ω/v, ω ∈ (−∞, ∞). The negative sign of k_x in (7.79) corresponds to outgoing waves (waves moving in the −x direction) under the exp(−iωt) phasor convention. If, instead of the square root, (7.79) contained a rational fraction of k_x, k_y, k_0, then the inverse transform of the corresponding dispersion relation would be an exact nonreflecting boundary condition involving a combination of x, y and t derivatives. It is then clear that a sequence of approximate absorbing conditions can be derived using Taylor or Padé approximations of the square root and inverse-transforming these relationships back to real space; see e.g. W. Cai's monograph [Cai13] for details. (Engquist & Majda's analysis is ultimately equivalent but couched in the language of pseudodifferential operators.)
Another classical idea, due to E. Turkel and collaborators [BT80, ZT13], involves a cylindrical (2D) or spherical (3D) harmonic expansion of radiated fields. A sequence of differential operators annihilating progressively higher numbers of terms in this expansion constitutes absorbing conditions of progressively higher orders.
One well-recognized shortcoming of these classical methods is their reliance on high-order derivatives that are difficult to deal with in numerical simulations. To overcome this deficiency, several clever reformulations have been proposed by D. Givoli, T. Hagstrom and coworkers [GN03, HH98, HW04, HWG10], with a sequence of auxiliary variables on the exterior boundary instead of high-order derivatives. Methods of this type will remain out of the scope of this book. Rather, we focus on a semiautomatic "machine" for generating approximate absorbing schemes that include, but are certainly not limited to, the classical Engquist–Majda and Bayliss–Turkel conditions. Several examples of such schemes are presented below.
The "machine" has two main ingredients. The first one is a set of local basis functions ψ_α (α = 1, 2, . . . , n) approximating the solution near a given point on the exterior boundary. These functions can be chosen as outgoing waves of the form g(k̂ · r − t), where g is a given function (e.g. sinusoidal) and k̂ is a unit vector at an acute angle to the outward normal on ∂Ω. The second ingredient is a set of m degrees of freedom (DOF)—linear functionals l_β(u) (β = 1, 2, . . . , m); m is not usually equal to n.
To elaborate, let the exact solution be approximated locally as a linear combination

u_a = Σ_α c_α ψ_α = c^T ψ   (7.80)

where c is a Euclidean coefficient vector and ψ is a vector of basis functions. (Vectors are underlined to distinguish them from other entities.) The functions and coefficients can be real-valued or complex-valued, as will be clear from the context. Coefficients c may be different at different boundary points, but for simplicity of notation this is not explicitly indicated.


We are looking for a suitable boundary condition of the form

Σ_β s_β l_β(u_a) = 0   (7.81)

where s ∈ R^m (or C^m in the complex case) is a set of coefficients ("scheme") to be determined. We require that the scheme be exact for any u_a, i.e. for any linear combination of basis functions:

Σ_β s_β l_β (Σ_α c_α ψ_α) = 0

or in matrix form

c^T N^T s = 0

where N^T is an n × m matrix with entries N^T_{αβ} = l_β(ψ_α). Since the above equality is required to hold for all c, one must have

s ∈ Null N^T   (7.82)

This whole development is completely analogous to that of FLAME in Chap. 4, although the goal there was to construct a finite-difference scheme rather than an absorbing condition. The DOF in FLAME are the nodal values of the solution on a given grid "molecule." (It is for the sake of compatibility of notation with the material on FLAME that the matrix has been denoted N^T rather than just N.) It is, however, interesting to bring more general linear functionals into consideration, as outlined below. S. Gratkowski [Gra09] uses similar ideas to derive analytical boundary conditions, albeit for static problems only and without the null-space formula (7.82). As multiple examples below and in [Tsu07, Tsu05a, Tsu06, Tv08, Tsu09] demonstrate, this formula, despite its simplicity, is rich and leads to a variety of useful schemes, both numerical and analytical.
One important measure of the quality of the boundary condition is the reflection coefficient R(θ), defined as follows. Consider an outgoing complex-exponential wave u_o = A_o exp(i(−x cos θ + y sin θ + t)) and the corresponding reflected wave u_r = A_r exp(i(x cos θ + y sin θ + t)), where A_o, A_r are complex amplitudes. Further, let the absorbing condition be defined by a set of coefficients s_β. Then, by definition, R satisfies

Σ_β s_β l_β(u_o + R u_r) = 0

or



R = − (Σ_β s_β l_β(u_o)) / (Σ_β s_β l_β(u_r))   (7.83)

We now consider several applications of the null-space formula (7.82).

7.10.5.2 Example: Basis of Cylindrical Harmonics, Derivatives as DOF

Consider the 2D Helmholtz equation (7.77). As we shall see in this section, the "machine" described above produces, with a natural choice of basis functions and degrees of freedom, the classical Bayliss–Turkel conditions. Indeed, the scattered field outside Ω can be expanded into cylindrical harmonics as

u(r) = Σ_{n=−∞}^{∞} c_n h_{|n|}(k_0 r) exp(inθ),   (7.84)

where h_n is the Hankel function (of the first kind under the exp(−iωt) phasor convention for time-harmonic functions, and of the second kind under exp(+iωt)). It is convenient to replace Hankel functions with their asymptotic expansions at infinity, viz.:

h_n(w) = (2/(πw))^{1/2} exp(i(w − nπ/2 − π/4)) Σ_{l=0}^{∞} a_l / w^l

with some coefficients a_l, expressions for which are rather cumbersome and unimportant for our purposes. Substituting this Hankel expansion into series (7.84) for u, one obtains

u(r) = (2/(πk_0 r))^{1/2} Σ_{n=−∞}^{∞} c_n exp(i(k_0 r − |n|π/2 − π/4)) exp(inθ) Σ_{l=0}^{∞} a_l/(k_0 r)^l ∼ (2/(πk_0 r))^{1/2} exp(ik_0 r) Σ_{l=0}^{∞} g_l(θ)/r^l   (7.85)

Here, g_l(θ) are some functions that absorb both a_l and the n-index summation and whose explicit form will not be needed. The ∼ sign indicates that this well-known series is, as a more rigorous analysis shows, an asymptotic rather than necessarily a convergent one (S. N. Karp [Kar61], A. Zarmi & E. Turkel [ZT13]). Even though functions g_l depend on the solution and therefore are unknown a priori, we still proceed and use the first few terms in (7.85) as basis functions for our "machine." This works because g_l depend only on the angle θ, while we deliberately choose the DOF to be independent of θ. The general idea is best illustrated with a particular case of only two basis functions


ψ_0 = (exp(ik_0 r)/√(k_0 r)) g_0(θ),   ψ_1 = (exp(ik_0 r)/(r √(k_0 r))) g_1(θ)

Since our DOF need to be independent of θ (see above), radial derivatives are a natural choice:

l_β(u) = ∂^β u / ∂r^β,   β = 0, 1, 2

Applying these DOF to the basis set, one obtains by straightforward calculation

N^T = {l_β(ψ_α)} = (exp(ik_0 r)/√k_0) ×
  [ g_0(θ)/r^{1/2}    g_0(θ)(2ik_0 r − 1)/(2r^{3/2})    −g_0(θ)(k_0²r² + ik_0 r − 3/4)/r^{5/2} ]
  [ g_1(θ)/r^{3/2}    g_1(θ)(2ik_0 r − 3)/(2r^{5/2})    −g_1(θ)(k_0²r² + 3ik_0 r − 15/4)/r^{7/2} ]

The null space of this matrix is seen to be independent of θ, and the coefficients for the absorbing condition are calculated to be

s = Null N^T = [ 3/(4r²) − 3ik_0/r − k_0²,   3/r − 2ik_0,   1 ]^T

More explicitly, the boundary condition is Lu = 0,

Lu ≡ Σ_β s_β l_β(u) = (3/(4r²) − 3ik_0/r − k_0²) u + (3/r − 2ik_0) ∂u/∂r + ∂²u/∂r²

which is none other than the second-order Bayliss–Turkel condition.

7.10.5.3 Example: Sinusoidal Basis, Mixed Derivatives as DOF

Now consider the time-dependent wave equation (7.78) in the half-plane x > 0. To run our "machine," let us choose the basis of outgoing waves

ψ_α(x, y, t) = ∂^α/∂θ^α [exp(i(−x cos θ − y sin θ + t))] |_{θ=0},   α = 0, 1, . . . , n − 1   (7.86)

The rationale for this choice of functions is that they are expected to provide accurate approximation of outgoing waves near normal incidence. Explicit expressions for the first five of these functions are ψ0 = exp(i(−x + t)); ψ1 = −i y exp(i(−x + t)); ψ2 = (−y 2 + i x) exp(i(−x + t))

ψ3 = (i + i y 2 + 3x)y exp(i(−x + t));


ψ_4 = (−ix − 3x² + 4y² + y⁴ − 6ixy²) exp(i(−x + t))

As DOF, let us introduce

l_β(u) = ∂_x^{m_x} ∂_y^{m_y} ∂_t^{m_t} u

where m_x = 0, 1; m_y = 0, 2; m_x + m_y + m_t is either 2 (a second-order method) or 3 (a third-order method). The omission of m_y = 1 reflects the symmetry of the problem with respect to y. The respective N^T matrices for n = 3 and n = 5 basis functions are

N^T = ⎛ −1   0    1 ⎞
      ⎜  0   0    0 ⎟      (n = 3)
      ⎝  0  −2   −1 ⎠

N^T = ⎛ −i   0     i     0   ⎞
      ⎜  0   0     0     0   ⎟
      ⎜  0  −2i   −i    2i   ⎟      (n = 5)
      ⎜  0   0     0     0   ⎟
      ⎝  0   8i    i  −20i   ⎠

Calculating the null space of these matrices, one arrives at the Engquist–Majda conditions of order two and three, respectively. Thus, not only the Bayliss–Turkel but also the Engquist–Majda conditions can be generated by the proposed machine.
Remark 25 Clearly, with an elementary degree of foresight, the odd-numbered basis functions ψ_{1,3} could have been omitted from the basis set, as they produce zero rows of N^T due to symmetry. These functions were retained, however, to demonstrate the operation of the Trefftz machine in semiautomatic mode, with as little "human intervention" as possible.
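Continuing the n = 3 example, the null-space formula (7.82) can be evaluated in a few lines of Python/NumPy; the DOF ordering assumed below matches the matrix as printed above.

import numpy as np

# DOF ordering assumed: (d^2u/dt^2, d^2u/dy^2, d^2u/(dx dt))
NT = np.array([[-1, 0, 1],
               [0, 0, 0],
               [0, -2, -1]], dtype=float)
_, sv, vh = np.linalg.svd(NT)
s = vh[-1]            # right singular vector for the zero singular value
s = s / s[0]          # normalize the u_tt coefficient to 1
print("scheme coefficients:", s)
# Expected: [1, -1/2, 1], i.e. u_tt - (1/2) u_yy + u_xt = 0, the second-order
# Engquist-Majda condition (with v normalized to 1), as stated in the text.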

7.10.5.4 Two Absorbing Schemes in the Frequency Domain

In this section, we compare two absorbing schemes for the 2D Helmholtz equation (7.77) in the frequency domain. The domain is a square Ω = [−L, L] × [−L, L].
1. First, we consider the previously developed FLAME scheme [Tsu06, Tsu07, Tsu05a] over a six-point stencil on the sides of ∂Ω and over a four-point stencil at the corners of ∂Ω. The basis set consists of five (on the straight sides) or three (for corner stencils) outgoing plane waves; see details below. The DOF are, as in finite-difference analysis, the nodal values of the solution.
2. Same as above, but with the new basis set (7.86). The rationale for this choice is to maximize the accuracy of approximation around normal incidence. However, approximation turns out to be good not only for very small angles but in a fairly broad range of angles of incidence.


The detailed setup of these two methods is as follows. In the first one (FLAME schemes), the basis over the straight part x = 0 of the boundary consists of five plane waves ψ_α(x, y) = exp(ik_0(−x cos θ_α − y sin θ_α)), with θ_α = απ/6, α = −2, −1, 0, 1, 2. The DOF are the nodal values of these plane waves on the six-point stencil (x_β, y_β), β = 1, . . . , 6. The coordinates of the stencil nodes are x_{1..6} = {0, h, 0, h, 0, h}, y_{1..6} = {0, 0, −h, −h, h, h}, where for simplicity the origin is set at the first node and h is the grid size. The coefficient vector of the FLAME scheme, i.e. of the absorbing condition, is s = Null N^T, where N^T_{αβ} = ψ_α(x_β, y_β). Expressions for these coefficients are too cumbersome to be listed here but easily obtainable with symbolic algebra. For reference, the numerical values of these coefficients for h = λ_0/12 = 2π/(12k_0) are s_1 = 0.35149777 − 1.3321721i, s_2 = −1.36164086 − 0.21016073i, s_3 = −0.39962632 + 0.91667814i, s_4 = 1, s_5 = −0.39962632 + 0.91667814i, s_6 = 1. At a corner (placed for simplicity at the origin), the four-point stencil is x_{1..4} = {0, h, 0, h}, y_{1..4} = {0, 0, h, h}, and the three basis functions are ψ_α(x, y) = exp(ik_0(−x cos θ_α − y sin θ_α)), θ_α = απ/6, α = 0, 1, 2.
In Method 2 above, the grid stencils are the same as in Method 1, but with the basis set (7.86). The absorbing scheme is again found as the null space of the respective matrix N^T, although this matrix is of course different from that of Method 1. The scheme is simple enough to be written out explicitly:

s_1 = −2 exp(−ik_0 h) (5(k_0 h)² + 3ik_0 h − 3) / ((k_0 h)² + 3ik_0 h + 3);   s_2 = −2 (−5(k_0 h)² + 3ik_0 h + 3) / ((k_0 h)² + 3ik_0 h + 3);

s_3 = s_5 = − exp(−ik_0 h) ((k_0 h)² − 3ik_0 h + 3) / ((k_0 h)² + 3ik_0 h + 3);   s_4 = s_6 = 1

The respective scheme at the corner for Method 2 is

s_{1,corner} = −s_{3,corner} = exp(−ik_0 h);   s_{4,corner} = −s_{2,corner} = 1

The absolute value of the reflection coefficient for both methods is plotted in Fig. 7.8 (20 points per wavelength, i.e. λ_0/h = 20 ⇔ k_0 h = π/10); the respective results for the Engquist–Majda conditions of orders one through three are also shown for reference. It is evident that |R(θ)| in Method 2, which is based on the θ-derivative basis (7.86), is virtually indistinguishable from that of the Engquist–Majda condition of order three. Since basis functions in Method 2 are tailored toward approximation of waves near normal incidence, it is not surprising that this method outperforms Method 1 for small angles of incidence (θ ≲ π/6, Fig. 7.8). At greater angles, it is Method 1 that yields lower reflection. In practical simulations, one may therefore expect that if the artificial boundary is placed far away from the scatterers and consequently the scattered field impinges on it at close-to-normal incidence, Method 2 will be preferable; otherwise Method 1 can be expected to perform better.
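The computation behind the Method 1 curve of Fig. 7.8 can be sketched as follows (Python/NumPy). The node ordering and normalization follow the description above; the sampled angles are an illustrative assumption, and the printed coefficients should agree, up to roundoff and sign conventions, with the reference values quoted earlier.

import numpy as np

k0 = 2 * np.pi          # lambda0 = 1
h = 1.0 / 12.0          # h = lambda0 / 12, as in the reference values above
xs = np.array([0, h, 0, h, 0, h])
ys = np.array([0, 0, -h, -h, h, h])

# Basis: five outgoing plane waves at angles alpha*pi/6, alpha = -2..2
angles = np.array([-2, -1, 0, 1, 2]) * np.pi / 6
NT = np.exp(1j * k0 * (-np.outer(np.cos(angles), xs) - np.outer(np.sin(angles), ys)))

_, _, vh = np.linalg.svd(NT)
s = vh[-1].conj()       # null-space vector of NT (the scheme, up to scale)
s = s / s[3]            # normalize as in the text: s4 = 1
print("scheme coefficients:", np.round(s, 8))

def R(theta):
    # Reflection coefficient (7.83) with nodal values as DOF
    uo = np.exp(1j * k0 * (-xs * np.cos(theta) + ys * np.sin(theta)))
    ur = np.exp(1j * k0 * (+xs * np.cos(theta) + ys * np.sin(theta)))
    return -np.dot(s, uo) / np.dot(s, ur)

for deg in (0, 15, 30, 45, 60):
    print(f"theta = {deg:2d} deg: |R| = {abs(R(np.radians(deg))):.3e}")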


Fig. 7.8 Absolute value of the reflection coefficient vs. the angle of incidence of a plane wave. Results for the "Trefftz machine" with sinusoidal and θ-derivative bases, in comparison with the Engquist–Majda conditions

7.10.5.5 Summary on Absorbing Conditions

We have reviewed selected types of absorbing conditions, with a focus on a semiautomatic generator of such conditions for wave problems. This generator relies on a set of local Trefftz basis functions (outgoing waves) and a commensurate set of linear functionals (degrees of freedom, DOF). Degrees of freedom involving nodal values on a grid give rise to numerical (finite-difference-type) nonreflecting conditions, while DOF involving derivatives produce analytical ones. The schemes, analytical as well as numerical, are given by the simple null-space formula (7.82). Consistency of such schemes can be established in a way similar to the analysis of [Tsu06, Tsu07]; stability and convergence cannot be guaranteed a priori and need to be examined on a case-by-case basis. Classical boundary conditions such as Engquist–Majda and Bayliss–Turkel, and likely also their extensions [ZT13], can be reproduced faithfully by the proposed “Trefftz machine.” Corners and edges are treated algorithmically the same way as straight boundaries. This may open up various avenues for the development of new approximate boundary conditions and gives an opportunity to look at the existing ones from a different perspective.

7.11 Long-Term Stability and the Lacunae Method

Extensive practical simulations have shown that many, if not all, versions of ABC and PML suffer from long-term numerical instabilities, the sources of which are not necessarily clear (see papers by S. Abarbanel and collaborators [AG97, AG98,


AGH02], E. Bécache, P. Joly, P. G. Petropoulos, S. D. Gedney [BJ02, BPG04a], T. Hagstrom and collaborators [HL07, HWG10], S. Petropavlovsky, S. Tsynkov and collaborators [Tsy03, QT08, FKT+16, PT17]). A universal remedy for these mild instabilities—the lacunae method—was developed by V. S. Ryaben’kii’s mathematical school, especially S. V. Tsynkov, S. V. Petropavlovsky and their coworkers [RTT01, QT08, FKT+16, PT17]. The idea of this method is rather simple, but its mathematical foundation and efficient algorithmic implementation, especially for Maxwell’s electrodynamics, are not. A key physical observation is that any wave packet traveling in a finite computational domain  has a finite lifetime tlife in that domain (tlife ∼ diam ()/c if the domain is empty; much more complicated in the presence of scatterers and resonators). If the field is created by sources whose lifetime tsrc is also finite, then the field will vanish entirely after time T = tsrc + tlife . Under the reasonable assumption that this time interval is shorter than the characteristic time scale over which the ABC/PML instabilities might develop, a stable solution is obtained by simply setting it to zero for t > T . If the sources of the field are continuous in time, their excitation can be artificially partitioned into the intervals of length T , each of which gives rise to a stable subsolution. The bookkeeping associated with this procedure is not trivial, however. For Maxwell’s equations there is an additional complication: the time-partitioning of source currents may result in violations of the zero-divergence condition for these currents, leading to artificial electric charges and related electrostatic fields. S. V. Petropavlovsky & S.V. Tsynkov have been able to overcome these complications [PT12, PT17].

7.12 The Treatment of Material Interfaces

7.12.1 Introduction

The treatment of material interfaces is a critical problem for FDTD; there are over 100 papers on the subject in engineering, mathematical and physical literature. As we discussed in Chap. 2, it is difficult to construct schemes of order higher than one at material interfaces. In connection with FDTD, the topic is quite intricate, with many different facets:
• Material interfaces parallel to the grid lines are relatively easy to handle, but slanted and curved ones are not.
• There are significant style differences between the engineering, mathematical and physical treatment of the problem. Rigorous analysis of the order of the schemes is often missing in the engineering and physical literature. On the other hand, practical implementation of some of the mathematical methods can be complicated.


• While stability of the standard Yee scheme in a homogeneous domain is well established, any deviations from it due to the presence of material interfaces can potentially result in instabilities, especially long-term instabilities.
• There are substantial differences between the treatment of perfect electric conductors and dielectric/dielectric interfaces.
• There is a large variety of approaches to the problem, ranging from purely heuristic ones to mathematically rigorous.
• There is a significant trade-off between the simplicity of regular Yee grids and more complicated meshes geometrically conforming to the boundaries. The latter range from split cells to hybrid Yee—finite element methods to finite element—time domain, Discontinuous Galerkin and other methods on unstructured meshes.
• Staggered grids introduce an additional layer of algorithmic complexity, since different components of different fields are defined at different locations relative to interface boundaries. Therefore even a single way of treating the interface conditions may branch out into different interpolation schemes for the field components; an insightful analysis and comparison is provided by D. M. Shyroki [Shy11].
• There is a significant difference between the order of local errors near interfaces and global solution errors over the whole domain, including errors in global quantities such as energy, resonance frequencies, etc. Therefore claims about, say, second-order convergence may be ambiguous.
It should be clear from the above that a concise review of the subject would not be possible. I consider here only Yee-like schemes on regular Cartesian grids rather than more complex geometrically conforming meshes. A few popular approaches are outlined below, but no detailed discussion of all their flavors is attempted. Section 7.12.3 addresses an important question of whether second-order local approximation is possible for Yee-like schemes. In connection with the material of Chap. 4, it can be mentioned that a general recipe for high-order schemes is provided by Flexible Local Approximation MEthods (FLAME). Their applications to many frequency-domain and static problems have been successful but in the time domain have so far been limited.

7.12.2 An Overview of Existing Techniques

FDTD approximations of fields at interfaces are diverse and difficult to classify, but two related but different threads can still be distinguished. The first one is known as the "contour-path FDTD" method (CP-FDTD), which relies on Maxwell's equations in integral form. Papers on CP-FDTD by T. G. Jurgens & A. Taflove appeared in the early 1990s [JTUM92, JT93], but there is a clear conceptual connection with finite integration techniques (FIT) developed by T. Weiland much earlier, in the 1970s [Wei77, Wei96, Wei03]. A brief review of FIT can be found in Sect. 7.8. The original versions of CP-FDTD suffered from long-term instability, due apparently


Fig. 7.9 Illustration of field averaging within a pixel intersected by a dielectric/dielectric interface. The permittivities are ε_{1,2}; n is the unit normal to the interface. The normal component D_n = D · n and the tangential component E_τ are continuous across the interface. The effective permittivity tensor for the cell can be defined as E ≡ diag(⟨ε⟩, ⟨ε^{-1}⟩^{-1}). See text for details

to the nonreciprocal terms in the difference schemes, which has led to a number of amendments (e.g. C. J. Railton & I. J. Craddock [RC95]). Another FD scheme, along similar lines but simpler and with a superior performance was later devised by R. Mittra and collaborators [DM97, DMC97, YM01]. In their scheme, applicable to perfect electric conductors, the contour integral of the electric field is numerically approximated over the boundary of a partially filled grid cell. Since the electric field is zero inside the perfect conductor, only parts of the grid edges lying in the air contribute to this contour integral, and that is properly taken into account in the update equations for the H field. For further references on this subject, the interested reader may see papers by C. J. Railton & I. J. Craddock [RC95], J. Häggblad, B. Engquist & O. Runborg [Häg10, HE12, HR14], A. Mohammadi et al. [MNA05], T. Xiao & Q. H. Liu [XL08]. The main idea behind the second thread is smoothing (“smearing”) of the permittivity in the grid cells (“pixels” in 2D or “voxels” in 3D) at dielectric/dielectric interfaces (Fig. 7.9). The permittivities of the dielectrics are 1,2 ; n is the unit normal to the interface. The normal component Dn = D · n and the tangential component E τ are continuous across the interface. Neglecting the curvature of the interface and the variation of the fields within each homogeneous subregion, one has Dτ  = 1 E τ 1 + 2 E τ 2 ≈ E τ ; −1 −1 E n  = −1 1 Dn 1 + 2 Dn 2 ≈  Dn

This leads to the definition of the effective permittivity tensor for the cell as E ≡ diag(Eτ τ , Enn ); Eτ τ = , Enn = −1 −1 This matrix form corresponds to the τ n coordinates but can be converted to the x y coordinates by the standard tensor transformation.

7.12 The Treatment of Material Interfaces

395

This “effective pixel tensor” idea is quite natural and exemplified by the “subpixel smoothing” technique of A. Farjadpour et al. [FRR+06] but appears in one form or another in a variety of other publications, e.g. A. Mohammadi et al. [MNA05]. However, there are many implementation details, especially in the case of anisotropic dielectrics (A. F. Oskooi et al. [OKJ09] and C. A. Bauer et al. [BWC11]).

7.12.3 Order of a Difference Scheme Revisited: Trefftz Test Matrix Near material interfaces, the order of a difference scheme is more difficult to evaluate than in homogeneous media. As noted in Sect. 7.1, the problem is exacerbated by the availability of various interpolation schemes for staggered grids (D. M. Shyroki [Shy11]) and the difference in the numerical accuracy of local and global quantities. In the engineering literature, analysis is rarely performed with full mathematical rigor, so ambiguities do arise. A typical and important example is the “subpixel smoothing” technique, which in [FRR+06, OKJ09] is claimed to be of second order. But C. A. Bauer et al. state that this algorithm “... has first-order error (possibly obscured up to high resolutions or high dielectric contrasts)” [BWC11]. D. M. Shyroki [Shy11] provides numerical evidence for second-order convergence of Yee-like schemes with a particular type of interpolation between different E and D field components on staggered grids. However, this convergence is with respect to a global quantity (resonance frequency) rather than the local fields near a slanted or curved interface. (For material boundaries parallel to the gridlines, it is relatively easy to construct second-order schemes for the local quantities; see e.g. T. Hirono et al. [HSL+00] or K.-P. Hwang & A. C. Cangellaris [HC01].) As noted in Sect. 2.9 and Sect. 4.5, the order of an FD scheme is not as straightforward a notion as it might seem. Indeed, multiplying any FD equation by a factor depending on the grid size and/or time step, one may alter the order of the scheme arbitrarily. One very simple example is the 2D Laplace equation, for which the standard five-point stencil is {−1, −1, 4, −1, −1}h −2 , where 4 corresponds to the central node; this scheme is of order 2 (Sect. 2.7.1). At the same time, the finite element— Galerkin procedure on a regular mesh with linear triangular elements (Sect. 3.8.1) produces the same scheme but without the h −2 factor, which formally makes it a scheme of order 4 (!?). Needless to say, rescaling of the scheme does not affect the FD solution (in the absence of roundoff errors); rather, it alters the balance between the consistency and stability estimates (Sect. 2.9). Thus, a formal way to fix the “true” order of a scheme would be to restrict consideration to schemes for which the stability constant is O(1). In practice, though, one typically wishes to separate the issues of consistency and stability to the extent possible, because the former lends itself to analysis much more easily than the latter.

396

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.10 Schematic representation of the Trefftz–FLAME scheme (7.87)–(7.92) for the 2D Helmholtz equation in free space. Grid size h in both coordinate directions

Many standard schemes feature a 1:1 correspondence between derivatives in the original differential equation and the respective FD terms (e.g. dx2 ↔ {1, −2, 1}h −2 ), which makes the proper scaling intuitively clear. However, this direct correspondence may not be quite as obvious or may not even exist in other cases—notably, for Trefftz– FLAME schemes (Chap. 4). To illustrate this point, consider as an example the high-order 3 × 3 FD stencil for the 2D Helmholtz equation ∇ 2 u + k 2 u = 0 with, for simplicity, a real positive wavenumber k ([Tsu05a, p. 2218], [Tsu06, p. 695], [vT08, p. 1379], Eqs. (4.102)– (4.104) on Sect. 4.9, and a similar scheme due to I. Babuška et al. [BIPS95, p. 342]): scenter =

 A  B C e 21 + 1 , smid−edge = − , scorner = e 1 D D D −2

(7.87)

where A = e 21 e1 + 2e 12 e0 − 4e− 21 e1 + e 12 − 4e− 21 + e1 + 2e0 + 1

(7.88)

B = e 23 e0 − 2e 12 e1 + 2e 12 e0 − 2e 21 + e0

(7.89)

C = 2e 21 e0 − e− 21 e1 − 2e− 21 e0 − e− 21 + 2e0

(7.90)

D = (e0 − 1)2 (e− 21 − 1)4

(7.91)

1 3 1 eγ = exp(2γ i hk), γ = − , 0, , 1, 2 2 2

(7.92)

These coefficients are schematically represented in Fig. 7.10. For (7.87)–(7.92) and similar schemes, it is not immediately obvious what the “right” scaling should be (e.g. whether multiplication by a certain power of h would be appropriate). In a comprehensive convergence theory, which would include stability analysis in addition to approximation errors, the scaling issue would be moot. However, since stability is much more difficult to study than approximation, it is

7.12 The Treatment of Material Interfaces

397

desirable to establish some universal measure by which the approximation accuracy of difference schemes can be compared in a consistent manner. S. K. Godunov & V. S. Ryabenkii (G&R) discuss a closely related but different matter in [GR87a, Sect. 5.13]. They note that consistency error can be formally reduced by changing the norm in which this error is measured. Incorporating an arbitrary power of h into that norm or, even more dramatically, a factor like 2−1/ h , one could magically “improve” the accuracy of the scheme, although the numerical solution would remain unchanged. Given a discrete space Uh where an FD scheme “lives” and the corresponding continuous space U for the underlying boundary value problem, (G&R) write: It is customary to choose a norm in the space Uh in such a way that, as h tends to zero, it will go over into some norm for functions given on the whole [domain], i.e. so that lim u h Uh = uU

h→0

(7.93)

Trivial examples of that  are (i) the maximum norms in both spaces; (ii) the discrete norm u h U2 h = h d m |u m |2 , which transforms into the L 2 norm as h → 0. G&R’s condition (7.93) is quite natural, but it does not address the scaling issue. Indeed, even if the norms in both discrete and continuous spaces are chosen in a reasonable way—for example, set to either (i) or (ii) of the previous paragraph—the scheme itself can still be rescaled by an arbitrary factor, including an h-dependent factor. Outlined below (Sect. 7.12.3) is a way of casting consistency errors in a scaleindependent form, allowing an “apples vs. apples” comparison of various schemes. But first, let us recall the all-important Lax–Richtmyer theorem (a.k.a. the Lax equivalence theorem) relating consistency, stability and convergence (e.g. J. C. Strikwerda [Str04], G&R [GR87a], and also Sect. 2.9). The connection is easy to see if the difference systems for the numerical and exact solutions are written side by side: Sh u h = f h ;

Sh u ∗h = f h + c

Subtracting these equations, one immediately observes that

or equivalently

Sh s = c , s ≡ u h − u ∗h

(7.94)

s = Sh−1 c

(7.95)

Here s ∈ Cn (or Rn , depending on the type of the problem) is the solution error vector. This is a simple but critical result relating consistency and solution errors. With these preliminary considerations in mind, let us take a closer look at the accuracy of Yee-like schemes near material interfaces. Consider an FD scheme in the following generic form, in the absence of sources: u (i)T s (i) (h, t, , μ) = 0

(7.96)

398

7 Finite-Difference Time-Domain Methods for Electrodynamics

where u (i) is a Euclidean vector of degrees of freedom corresponding to a particular grid “molecule” i 13 with node locations rβ(i) , tβ(i) (β = 1, 2, . . . , n), and s is the coefficient vector defining the scheme. As indicated in (7.96), these coefficients depend on the electromagnetic parameters  = (r) and μ = μ(r) (assuming linear characteristics of all media). It is convenient to introduce a small local domain (i) containing the grid molecule (this could be, for example, the convex hull of the nodes of the molecule); diam (i) is thus of the order of the grid size. The local (stencil-wise) consistency error (i) c is defined as (i)T N (i) u ∗ (i) c (h, t, , μ) = s

(7.97)

In this expression, the operator N (i) : T ((i) ) → Rn (or Cn ) in (7.95) produces n given degrees of freedom (typically, the nodal values) of any smooth function u ∗ in (i) . It makes sense to restrict u ∗ to the local Trefftz space T ((i) )—the space of functions satisfying the underlying weak-form differential equations in (i) ; thus, u ∗ ∈ T ((i) ). Our goal is to establish an accuracy measure which, unlike (i) c (7.97), would be independent of the scaling of the scheme s (i) . Clearly, the numerical solution and the error in it do not, in the absence of roundoff errors, depend on this scaling. If all stencils were to be simultaneously rescaled by any nonzero factor λ, the inverse matrix Sh−1 in (7.95) would also be simultaneously rescaled, but by the inverse factor λ−1 . Ultimately, therefore, the scaling would not matter. Theoretical analysis, however, faces a twofold problem: 1. Rigorous analytical estimates of the stability factor related to the norm of S −1 are available for fairly simple model problems and may be difficult or impossible to obtain for more complicated practical ones. 2. Even more importantly, the Lax–Richtmyer estimate (7.94) applies to global errors and does not characterize the influence of local consistency errors on local solution errors. Such local estimates are available for finite element methods, thanks to variational formulations and duality principles (A. H. Schatz, L. B. Wahlbin, V. Thomee, A. Demlow, D. Leykekhman [SW77, SW78, SW79, SW95, STW98, Dem07, DLSW12]) but not for FD schemes. Let us consider one way of examining the local errors. A related subject, “schemeexact Trefftz subspaces,” is discussed in Sect. 4.5.14 Obviously, for any single solution u ∗ one can generate infinitely many exact schemes—that is, schemes with a zero consistency error c (7.97). (Since only one FD stencil, with its consistency error, is considered in the remainder, the superscript (i) indicating the stencil number is now dropped.) Suppose, though, that we have 13 As

far as I know, the “molecule” locution was coined by J. P. Webb [PW09]. Surprisingly, there does not seem to be a standard term for a set of nodes over which an FD scheme is defined. In the past, I have used the word “stencil” for that purpose; however, by “stencil” most researchers mean the set of coefficients of a scheme rather than the set of nodes. [This is a repeat of the footnote on Sect. 2.4.4 for easy reference.]. 14 Disclaimer: both methods are still unpublished at the time of this writing (end of 2019).

7.12 The Treatment of Material Interfaces

399

a Trefftz test set of m different solutions u ∗1,...,m independent of the mesh size and time step. By definition, each u ∗α satisfies the differential equation of the problem and interface boundary conditions (if any) within the convex hull of the grid molecule. We want the consistency error to be small for each test solution. Let us arrange the nodal values of u ∗1,...,m in a matrix U , analogous to matrix N T in Sect. 4.3. That is, each row α of U contains the nodal values of the test solution u ∗α , so that (7.98) Uαβ = u ∗α (rβ ) Assembling the consistency errors (7.97) for all test solutions into one Euclidean vector c ∈ Rm , we get c ≡ U s

⇒ c 2 ≥ σmin (U )s2

(7.99)

where σmin (U ) is the minimum singular value of U = U (h, t, , μ). This is a lower bound of the consistency error. Using different sets of Trefftz functions, or one large set, one can obtain different bounds of this form. A rigorous mathematical solution of the stability and scaling problems listed on Sect. 7.12.3 is not in general available, but the error bound (7.99) suggests an alternative on physical/engineering grounds. Namely, one can compare the minimum singular values of matrix U (h, t, , μ) and its free-space instantiation U1 (h, t) = U (h, t,  = 1, μ = 1): σmin (U (h, t, , μ)) (7.100) ζ = σmin (U (h, t, 1, 1)) Clearly, in the inhomogeneous case one may expect the estimate (7.99) to be worse than the respective one for free space; hence typically ζ > 1 and quite possibly ζ  1. Let us consider illustrative examples. Example 28 To start with, we calculate only the free-space value σmin (U1 ) for the 2D Laplace equation ∇ 2 u = 0. Introduce the standard five-point stencil and assume, for simplicity of notation, the same grid size h in both coordinate directions; the node coordinates are x1,...,5 = {0, 0, −h, h, 0}; y1,...,5 = {0, −h, 0, 0, h} (thus, node #1 is the central node in this grid molecule). Let the Trefftz basis {u ∗α (x, y)} consist of harmonic polynomials up to order 4 (these are the real and imaginary parts of (x + i y)k , k = 0, . . . , 4): u ∗1,...,8 = {1, x, y, x 2 − y 2 , 2x y, x 3 − 3x y 2 , 3x 2 y − y 3 , x 4 − 6x 2 y 2 + y 4 } Then the U matrix is

400

7 Finite-Difference Time-Domain Methods for Electrodynamics

⎛ 1 ⎜1 ⎜ U =⎜ ⎜1 ⎝1 1

0 0 −h h 0

0 −h 0 0 h

0 − h2 h2 h2 − h2

0 0 0 0 0

0 0 − h3 h3 0

0 h3 0 0 − h3

⎞ 0 h4⎟ ⎟ h4⎟ ⎟ h4⎠ h4

(Recall that each column of this matrix contains the nodal values of the respective test function. Obviously, function 2x y, corresponding to the zero column, could have been omitted from the basis.) Symbolic algebra gives the following lowest-order terms for the singular values of U : 1 1 1 1 σ1,...,5 (N ) ∼ {2h 2 , 2 2 h, 2 2 h, 2 · 5− 2 h 4 , 5 2 } Hence

σmin (U ) = σ4 (U ) ∼ 2 · 5− 2 h 4 = O(h 4 ) h1

1

Example 29 Consider now the 2D wave equation v 2 ∇ 2 u − ∂t2 u = 0 We follow the same exact steps as in the previous example. The simplest grid “molecule” has seven nodes in the x yt space: x1,...,7 = {0, 0, −h, h, 0, 0, 0} y1,...,7 = {0, −h, 0, 0, h, 0, 0} t1,...,7 = {0, 0, 0, 0, 0, −t, t}; where the Trefftz test set {u α (x, y)} is now chosen as u ∗1,...,17 = {1, (x ± vt)k , (y ± vt)k }, k = 1, 2, 3, 4

(7.101)

For the U matrix corresponding to the chosen grid molecule and test set (7.101), symbolic algebra yields h1

σmin (U ) ∼ c4 h 4 The analytical expression for the coefficient c4 is too cumbersome to be reproduced here, but it is plotted, as a function of vt/ h, in Fig. 7.11. Example 30 Following the preliminary examples above, we are in a position to examine the Yee setup in 2D in a similar fashion. Consider two dielectric media with permittivities 1,2 , separated by a slanted interface boundary (Fig. 7.12). Of most interest is the H -mode ( p-mode), with a one-component H field perpendicular

7.12 The Treatment of Material Interfaces

401

Fig. 7.11 c4 vs. vt/ h, where σmin (U ) ∼ c4 h 4

Fig. 7.12 Yee-like schemes near material boundaries. H -mode ( p-mode): a one-component H field perpendicular to the plane of the figure and a two-component E field in the plane. Two dielectric media (1,2 ) with an interface boundary slanted with respect to the Yee grid. Yellow spheres: H nodes; horizontal arrows: E x nodes; vertical arrows: E y nodes. The central H sphere indicates a triple node with respect to time (t0 , t0 ± t), and the E x,y arrows indicate double nodes (t0 ± t/2); t0 is any “current” moment of time

to the plane of the figure and a two-component E field whose normal component is discontinuous across the interface. The H nodes are labeled with the yellow spheres in the figure, and the E x,y components—with the arrows. The time axis is for simplicity not shown, but it is understood that the central H sphere indicates in fact a triple Yee node (t0 , t0 ± t), and that the E x,y arrows indicate double nodes (t0 ± t/2); t0 is any “current” moment of time. There are 7 degrees of freedom for the H field (five at t0 and two at t0 ± /2) and 2 × 4 = 8 for E x,y —altogether 15 degrees of freedom. The horizontal and vertical black lines in Fig. 7.12 belong to the H grid with a size h in both directions; the finer blue grid indicates h/4 subdivisions as a visual aid. The slope θ of the interface is an adjustable parameter. For definiteness, we take θ = 30◦ and assume that the interface boundary divides the “horizontal” segment between two adjacent H nodes in the ratio of 1:3, as indicated in the figure. With

402

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.13 log10 σmin (N ) vs. log10 h. Solid line: log10 σmin (U ); markers: linear fit with slope K≈ 3. Indexes of refraction n 1,2 =1, 2. ct/ h = 0.1. Geometric setup of Sect. 7.12.3; material interface at angle θ = π/6. 15-node stencil: 7 nodes for H , 2 × 2 = 4 nodes for E x , 2 × 2 = 4 nodes for E y . Trefftz test basis: polynomial traveling waves in each dielectric, matched via the standard interface boundary conditions (see text)

this geometric setup, two H nodes (the left one and the top one) happen to lie in the 1 dielectric, as does the top “double node” E x . All other nodes happen to lie in the second dielectric. The Trefftz test set is a slightly generalized version of (7.101). Now that two media with their respective phase velocities v1,2 are present, each polynomial of the form (x ± v1 t)k , (y ± v1 t)k in the first subdomain is matched, via the standard interface boundary conditions, with the corresponding polynomial (x ± v2 t)k , (y ± v2 t)k in the second subdomain; we take k = 0, 1, 2, 3. The rest of the analysis proceeds the same way as in the previous examples, except that a final closed-form analytical result for σmin (U ), where U is in this case a 15 × 16 matrix, is no longer feasible to obtain via symbolic algebra. Instead, this smallest singular value is computed in variable precision arithmetic (32 digits). The results are plotted in Fig. 7.13 and clearly indicate that σmin (U ) = O(h 3 )



ct fixed h

(7.102)



ct fixed h

(7.103)

At the same time, in free space σmin (U ) = O(h 4 )

That is, in the presence of a dielectric interface one cannot achieve the same order of convergence of Yee-like schemes as in free space, and the deterioration factor ζ of (7.100) (Sect. 7.12.3) is

7.12 The Treatment of Material Interfaces

ζYee = O h −3 / h −4 = O(h)

403

(7.104)

This conclusion—not entirely unexpected—is valid for the setup examined in this example. Its main limitation is that our analysis has been applied to specific degrees of freedom adopted in classical Yee schemes—that is, the nodal values of the fields on staggered grids. This analysis can be generalized to other degrees of freedom— notably, to edge circulations and/or surface fluxes—and to non-staggered grids.15 However, the odds of preserving the second order of the free-space Yee scheme in the presence of interfaces are slim; see also A.-K. Tornberg & B. Engquist [TE08]. On the positive side, second-order schemes are available for interface boundaries parallel to the gridlines, and there is also compelling evidence that global quantities can also be evaluated with second-order accuracy if FDTD schemes are constructed judiciously (e.g. C. A. Bauer et al. [BWC11], D. M. Shyroki [Shy11]).

7.13 The Total-Field/Scattered-Field Formulations Occasionally, simple observations have significant consequences. The subject of this section is one such example. Consider a different scheme s T uh = 0

(7.105)

where s is a column vector of coefficients (a.k.a. an FD stencil), and u h is a column vector of the chosen degrees of freedom (for definiteness, let those be nodal values over a certain grid “molecule”). The right-hand side of (7.105) is assumed to be zero for convenience of exposition. The simple observation in this case is that if

and u h satisfies (7.105), then

uh = ua + ub

(7.106)

s T ua = − s T ub

(7.107)

u a,b being an absolutely arbitrary splitting of u h . (By “absolutely arbitrary” I mean that u a,b separately need not have any physical meaning, and that their values need not be correlated in any systematic way, as long as (7.106) holds.) One application of that is the standard superposition of incident (‘inc’) and scattered (‘sc’) fields in wave problems, viz. (7.108) u tot = u inc + u sc u tot being the total field over any given grid molecule. In electromagnetic problems, this generic u may stand for either of the fields or their components. 15 Such

grids are used e.g. in pseudospectral time-domain methods (PSTD), Sect. 7.9.2.

404

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.14 Total and scattered-field regions separated by an artificial boundary Γ

Inside the computational domain (“in the bulk”), it is usually more convenient to formulate the problem in terms of total fields, whereas absorbing conditions at exterior boundaries are more naturally expressed in terms of scattered fields. This mild inconsistency can be easily reconciled via (7.106), (7.107). Indeed, let us introduce an artificial boundary Γ , on one side of which we wish to use the total field, and on the other side—the scattered field (Fig. 7.14). For simplicity, only five nodes (1–5) in the grid molecule are shown, even though there will typically be more. Let the incident-/scattered-field splitting be introduced only in the “scattered-field” region (upper left in the figure). The incident field is considered to be given and could be arbitrary (the field of a point source, a plane wave, etc.) Suppose that a scheme (7.105) is written for the total field; more explicitly, s1 u tot1 + s2 u tot2 + s3 u tot3 + s4 u tot4 + s5 u tot5 = 0

(7.109)

Since nodes 2 and 5 in the illustrative example of Fig. 7.14 happen to lie in the scattered region, the incident-/scattered-field splitting of the total field produces the following scheme: s1 u tot1 + s2 u sc2 + s3 u tot3 + s4 u tot4 + s5 u sc5 = −s2 u inc2 − s5 u inc5

(7.110)

This example illustrates how one can easily switch between the total- and scatteredfield representations at some or all nodes in a given grid molecule. The total-/incident-/scattered-field formulations are ubiquitous in FDTD and other methods with a natural field splitting. As another example, quite similar ideas were used in FLAME when “particular solutions” needed to be introduced for problems with nonzero right-hand side (Sect. 4.3.4).

7.14 The Treatment of Frequency-Dependent Parameters

405

7.14 The Treatment of Frequency-Dependent Parameters The permittivity of any material with a linear dielectric response must depend on frequency ω; this follows, in particular, from the Kramers–Kronig causality relations (V. Lucarini et al. [LSPV05]). Thus, one writes—in the isotropic case for simplicity: D(r, ω) = (r, ω) E(r, ω)

(7.111)

This frequency-domain product corresponds to time convolution D(t) =

t −∞

E(t − τ ) E(τ )dτ

(7.112)

Here E and (ω) are related via the Fourier transform; the dependence on r has been dropped for brevity. In FDTD, a direct implementation of (7.112) would be impractical, since the whole history of the time-dependent fields would have to be stored and processed. Fortunately, computationally efficient ways to handle the convolution integral exist for a broad and important class of dielectric functions (ω) of the following general form: (ω) = ∞ ±

N 

a p ω 2p

p=1

c p ω 2 − iωγ p − f p ω 2p

≡ ∞ ±

N 

 p (ω)

(7.113)

p=1

This general expression encompasses Drude, Debye and Lorentz models with an arbitrary number N of the poles; ω p and γ p are the frequency and damping factor of the pth pole; a p , c p and f p are given coefficients (zero in some models). The ± sign in the right hand side of (7.113) depends on the phasor convention16 : the guiding principle is that, for passive media, Im  > 0 for the exp(−iωt) convention, and Im  < 0 for exp(+ jωt). One implementation of (7.113) in FDTD has a particularly clear physical meaning. It involves auxiliary variables D p (ω) =  p (ω)E

(7.114)

Since iω corresponds to the time derivative, one arrives at the following ordinary differential equations for D p : c p ∂t2 D p (t) + γ p ∂t D p (t) + f p ω 2p D p (t) = a p ω 2p E

16 This

(7.115)

sign could have been incorporated into the coefficient a p , but I chose not to do so, since the sign ambiguity is resolved in the final differential equations (7.115).

406

7 Finite-Difference Time-Domain Methods for Electrodynamics

Fig. 7.15 Oblique incidence of a time-dependent plane wave on a periodic structure. Boundary conditions across a lattice cell, and ways of implementing them in FDTD, are considered in this section

Note that the phasor-dependent sign ambiguity has disappeared in these time-domain equations. Thus, the FDTD scheme has been supplemented with a set of ordinary differential equations, which can be discretized and time-stepped along with the “regular” FDTD equations. Details of the numerical implementation vary widely (A. Taflove & S. C. Hagness [TH05], M. A. Alsunaidi & A. A. Al-Jabr [AAJ09], S.-C. Kong et al. [KSB08], Z. Lin & L. Thylen [LT09]).

7.15 Simulation of Periodic Structures In a variety of applications, one needs to model electromagnetic fields in periodic structures—e.g. various types of gratings, photonic crystals, metamaterials. Fig. 7.15 schematically illustrates a typical situation: a plane wave impinges on a structure periodic in the x direction, at an angle θinc . In that case, the incident wave is of the form Einc (r, t) = f(r · vˆ p − v p t), Hinc (r, t) = Z −1 f(r · vˆ p − v p t),

(7.116)

Here v p is phase velocity, vˆ p is the unit vector in the direction of wave propagation, Z is the intrinsic impedance of the medium, and f is an arbitrary waveform (assumed differentiable, so that the wave satisfies Maxwell’s equations in strong form). The lattice layers in Fig. 7.15 could contain an arbitrary microstructure; what matters is x-periodicity with a given period a. This 2D setup generalizes in a natural way to 3D. We look for solutions satisfying the following time-delay conditions on the side boundaries of the lattice cell: E(x + a, y, t) = E(x, y, t − ta ),

(7.117)

H(x + a, y, t) = H(x, y, t − ta )

(7.118)

7.15 Simulation of Periodic Structures

The time delay is

407

ta = v −1 p a sin θinc

(7.119)

In the particular case of normal incidence (θinc = 0 and ta = 0), the above boundary conditions become periodic and can be imposed in a straightforward way, as described e.g. in Sect. 8.10.7. To recap the main idea briefly: consider a scalar equation for a function u(x) and an FD grid x0 = 0, x1 = x0 + x, . . ., xn = a. To impose periodic conditions, one identifies node n with node 0, so that the immediate neighbors of node 0 are nodes n and 1, while those of node n − 1 are n − 2 and 0. With this adjustment in mind, the FD scheme is then constructed in an otherwise standard way. For the Yee scheme, the idea is the same, but is applied to staggered grids. Unfortunately, for oblique incidence (θinc = 0, and ta = 0) this construction of boundary conditions no longer works. This is because one of the actual neighbors of node 0 is node ‘−1’, x−1 = x0 − x; but that node is not part of the scheme if only one lattice cell is modeled. In the FD scheme, the value u(x−1 , t) would need to be replaced with u(xn−1 , t + ta ). The problem is that the latter, time-advanced, value is not available at any given time t. Several workarounds are available; we consider two major ones. For other approaches and details, the reader may consult specialized FDTD literature—e.g. the paper by I. Valuev et al. [VDB08] and references therein. The first workaround, due to M. E. Veysoglu et al., J. A. Roden et al., G. Zheng et al., and others [VSK93, RGK+98, ZKGY06], is to introduce an auxiliary variable which would satisfy periodic boundary conditions rather than being subject to time advance or delay. To fix this idea, let us for simplicity consider a pair of scalar wave equations—a 1D analog of Maxwell’s equations: v p ∂x u(x, t) = ∂t v(x, t)

(7.120)

v p ∂x v(x, t) = ∂t u(x, t)

(7.121)

Let both functions u, v satisfy the time-shift boundary conditions on the interval [0, a]: (7.122) u(a, t) = u(0, t − ta ), v(a, t) = v(0, t − ta ) To transform the time-shift conditions to periodic ones, let us introduce auxiliary functions u(x, ˜ t), v(x, ˜ t), defined as time-advanced versions of u(x, t) and v(x, t). ˜ t − v −1 u(x, ˜ t) = u(x, t + v −1 p x sin θinc ) ⇔ u(x, t) = u(x, p x sin θinc ) (7.123) ˜ t − v −1 v(x, ˜ t) = v(x, t + v −1 p x sin θinc ) ⇔ v(x, t) = v(x, p x sin θinc ) (7.124) It is straightforward to verify that u˜ is indeed lattice-periodic: (7.122)

(7.119)

−1 u(a, ˜ t) = u(a, t + v −1 p x sin θinc ) = u(a, t + v p x sin θinc − ta ) = u(0, t)

408

7 Finite-Difference Time-Domain Methods for Electrodynamics

The same argument of course applies to v. ˜ Now let us see how Eqs. (7.120), (7.121) get rewritten in terms of the tildefunctions: ˜ t  ) − sin θinc ∂t u(x, ˜ t  ); t  = t − v −1 v p ∂x u(x, t) = v p ∂x u(x, p x sin θinc The differentiation above takes into account the fact that u˜ depends on x via both arguments. Thus, (7.120) becomes, in terms of u, ˜ v, ˜ ˜ t  ) − sin θinc ∂t u(x, ˜ t  ) = ∂t v(x, ˜ t  ), v p ∂x u(x,

(7.125)

because the time derivatives of v and v˜ are the same. A completely similar transformation applies to (7.121) and yields ˜ t  ) − sin θinc ∂t v(x, ˜ t  ) = ∂t u(x, ˜ t ) v p ∂x v(x,

(7.126)

Since the tilde-functions in (7.125), (7.126) are subject to periodic boundary conditions on the lattice, the problem of time advance is thereby solved. However, there is a downside: Eqs. (7.125), (7.126) are more cumbersome than the original wave equations (7.120), (7.121). First, each of the tilde equations contains the time and space derivatives of the same variable; therefore the standard leapfrog scheme needs to be modified. Secondly, the stability conditions for the appropriate FD schemes are more complicated and in general more restrictive than for the standard Yee scheme. Absorbing conditions or perfectly matched layers also need to be modified and adjusted to the tilde-variables. For further details, see [TH05, RGK+98, VSK93]. The second workaround for the time-shift conditions is applicable in the case of monochromatic excitation, when a steady-state fixed-frequency solution is sought. In that case, one may consider complex-valued fields in the time domain. At first blush, this idea looks strange, since complex-valued fields are commonly used in the frequency domain. A typical complex phasor, in its short form, is u 0 (r) exp(iφ(r)), where u 0 and φ are the (position-dependent) amplitude and phase of a generic potential or field component u. However, nothing prevents one from considering the full phasor (7.127) u 0 (r) exp(iφ(r)) exp(±iωt) as a time-dependent complex-valued mathematical solution of Maxwell’s equations; the ± sign reflects two possible sign conventions for this phasor. If desired, expression (7.127) can always be de-complexified and turned into a pair of coupled real-valued functions, with the cos ωt and sin ωt time waveforms. The key benefit of using complex-valued fields in the time domain is that the time-shift boundary conditions (7.117), (7.118) turn into the standard Bloch–Floquet conditions, which for the E and H fields are E(x + a, ω) = exp(iωta ) E(x, ω) ≡ exp(iωta ) E(x, ω)

(7.128)

7.15 Simulation of Periodic Structures

H(x + a, ω) = exp(iωta ) H(x, ω) ≡ exp(iωta ) H(x, ω)

409

(7.129)

with ta = v −1 p a sin θinc as per (7.119). This formulation generalizes readily to 3D. Its main advantage is that the fields and governing equations are not modified, and hence the standard FDTD-Yee schemes, stability criteria and PML or absorbing boundary conditions apply. On the downside, the approach described above has been derived for a singlefrequency steady-state regime and cannot be directly applied to broadband excitation. However, there is an interesting way to partly overcome this limitation. Assume that one fixes the ωta parameter—or, equivalently, ω sin θinc —rather than ω itself. This defines the Bloch prefactor in the boundary conditions (7.128), (7.129). One can then introduce a Constant Transverse Wavenumber (CTW) wave (A. Aminian & Y. Rahmat-Samii [ARS06]) as a superposition of waves with a varying ω and simultaneously varying sin θinc ∼ 1/ω. By construction, the Bloch–Floquet boundary conditions are the same for all frequency components of such a wave. Hence, this wave can be computed in one FDTD simulation which sweeps the whole curve ω sin θinc = const in the (ω, θ) plane, as opposed to just a single point in that plane.

7.16 Near-to-Far-Field Transformations 7.16.1 Introduction In wave scattering problems, one is often interested in far-field patterns—that is, the angular distribution of the fields and radiation power in free space far away from the scatterers. This creates complications for methods like FDTD and FEM, where the computational domain is by necessity finite in size. (In contrast, methods based on integral equations treat unbounded domains in a natural way and are suited less well for finite domains.) This limitation of FDTD and FEM can be overcome using analytical techniques, outlined below. Further details can be found in [TH05, Chap. 8], [Sch10, Chap. 14]. We fix ideas for frequency-domain problems, leaving the more complicated timedomain case to specialized literature (e.g. [GGG00]). Even though this chapter deals exclusively with electromagnetic analysis, it is instructive to start with a more abstract setting. Consider a bounded domain  and the corresponding exterior region17 ext = n R − . Let theis exterior region be physically homogeneous (typically, free space); however, no limitations are imposed on the media and sources in . Let a function u(r) satisfy the equation Lu(r) = 0, r ∈ ext 17 Our

(7.130)

analysis can also be applied to bounded exterior regions, except that in this case one cannot usually obtain a closed-form expression for Green’s functions in (7.135) below.

410

7 Finite-Difference Time-Domain Methods for Electrodynamics

where L is a differential operator with constant coefficients. It will prove useful to consider this operator in the whole space, even though within  the function u(r) need not satisfy (7.130) and may be completely arbitrary. For simplicity, we temporarily assume that u and L are scalar; extensions to vector problems are straightforward and will be considered later on. Finally, it is assumed that u satisfies the standard boundary conditions lim r γ Bu(r) = 0, γ > 0

r →∞

(7.131)

where operator B and parameter γ depend on the type of the problem. In particular, B may be viewed as shorthand for Dirichlet, Neumann, or radiation conditions. Let g(r) be Green’s function of this boundary value problem in the whole space Rn ; that is, Lg(r) = δ(r) (7.132) where δ is the Dirac delta function.18 Since L has constant coefficients, Green’s function is translation-invariant and depends only on position r of an observation point relative to a source point. Consider next the boundary value problem Lw(r) = f (r) in Rn

(7.133)

with a generic right-hand side f and the same boundary condition (7.131). Solution w can be expressed as the convolution integral −1

w(r) ≡ L

f =

Rn

f (r ) g(r − r ) dr

(7.134)

This basic property of Green’s functions is of course well known and can be verified by applying L to (7.134), under the assumption that integration and differentiation can be interchanged. This explicit integral representation of the solution is powerful, but depends critically on the translational invariance of Green’s function (otherwise it would be a function of two vector variables—positions of the source and observation points, which would complicate the matters significantly). Equation (7.134), when applied to solution u of (7.130), provides a representation of the field everywhere in the exterior domain—in particular, far field: u(r) =

Rn

Lu(r ) g(r − r ) dr

(7.135)

where the substitutions w → u and f → Lu were made. This equation is almost a tautology: convolution with Green’s function yields the inverse of L. This becomes 18

Clearly, (7.132) must be understood in the sense of distributions.

7.16 Near-to-Far-Field Transformations

411

less trivial, however, once we note that, due to (7.130)—the assumed absence of sources in the exterior region—the integrand of (7.135) vanishes there, and hence, integration can be truncated to the closure of : u(r) =

¯ 

Lu(r ) g(r − r ) dr

(7.136)

The complication is that u is not known inside , so (7.136) appears useless at first glance. But this Gordian Knot is easy to cut. One just needs to define an auxiliary function—call it u(r)—equal ˜ to u in the exterior region, but with any desired values (strictly) inside . The only constraint on the definition of u˜ inside  is for Lu to be valid in the sense of distributions. Then, (7.136) turns into Lu(r ˜  ) g(r − r ) dr , ∀r ∈ Rn (7.137) u(r) ˜ = ¯ 

Equivalently, since in the exterior domain u˜ = u by construction, we have u(r) =

¯ 

Lu(r ˜  ) g(r − r ) dr , ∀r ∈ ext

(7.138)

It might seem counter-intuitive that this integral expression for the exterior field does not depend on how u˜ is defined in the interior. Mathematically, this is just a reflection of the tautology noted above: the L operator is first applied to u˜ and then inverted via the convolution. Still, one may gain some additional insight from the 1D case study below (Sect. 7.16.2). Since the choice of u˜ inside  is arbitrary, it is natural to simply set u˜ to zero, which can formally be written as  u˜ = us(r), s(r) =

0, r ∈  1, r ∈ /

(7.139)

Note that, as part of the above definition, on the boundary of  s(r) = 1, r ∈ ∂

(7.140)

With this definition of u, ˜ the convolution integral in (7.138) reduces just to the boundary of : u(r) =

∂

Lu(r ˜  ) g(r − r ) dr , ∀r ∈ ext , u(r ˜  ) = 0, ∀r ∈ 

(7.141)

where all operations are understood in the sense of distributions. To make all these ideas as transparent as possible, we start with a 1D example and then move on to Maxwell’s electrodynamics in 3D.

412

7 Finite-Difference Time-Domain Methods for Electrodynamics

7.16.2 Example: “Far Field” in 1D The example in this section is the simplest, but still nontrivial, illustration of the key ideas behind the near-to-far-field transformation. Let   = (−∞, 0) (7.142) ext = [0, ∞) Here  is unbounded, which simplifies the algebra but is otherwise unimportant. In ext , consider the Helmholtz equation u  (x) + k 2 u(x) = 0, k = const > 0

(7.143)

The radiation condition in this case is trivial, u  − iku = 0 (x > 0), and defines a unique solution as an outgoing wave exp(ikx) on [0, ∞). Despite this simplicity, the 1D case still has pedagogical value. Following the general prescription of the previous section, let us introduce an auxiliary function u(x). ˜ In ext (i.e. for x ≥ 0), this function is, by definition, set to be equal to u(x). For x < 0, u(x) ˜ can be chosen arbitrarily. For illustration, let us choose an exponential function decaying at −∞: u(x) ˜ = a exp(αx), x < 0, a, α = const; α > 0

(7.144)

The trivial choice a = 0 is perfectly valid, but our objective is to verify that the “far field” u(x), x → ∞ will not depend on a and α. For algebraic manipulation, it is convenient to use the unit step function  s(x) =

0, x < 0 1, x ≥ 0

(7.145)

which is consistent with the general definition of s(r) from the previous section. Then (7.146) u(x) ˜ = a exp(αx)(1 − s(x)) + exp(ikx)s(x) Using the fact that s  = δ and the properties of the δ function, we derive u˜  (x) = aα exp(αx)(1 − s(x)) − aδ(x) + ik exp(ikx)s(x) + δ(x)

(7.147)

u˜  (x) = aα2 exp(αx)(1 − s(x)) − aαδ(x) − aδ  (x) − k 2 exp(ikx)s(x) + ikδ(x) + δ  (x)

f (x) ≡ Lu(x) ˜ ≡ u˜  (x) + k 2 u(x) =

(7.148)

aα2 exp(αx)(1 − s(x)) − aαδ(x) − aδ  (x) −k 2 exp(ikx)s(x) + ikδ(x) + δ  (x) + k 2 [a exp(αx)(1 − s(x)) + exp(ikx)s(x)]

7.16 Near-to-Far-Field Transformations

413

= (1 − s(x))a exp(αx)(α2 + k 2 ) + (ik − aα)δ(x) + δ  (x) − aδ  (x)

(7.149)

Green’s function for the 1D Helmholtz equation is known to be g(x) =

1 exp(ik|x|) 2ik

(7.150)

under the exp(−iωt) phasor convention. We can now evaluate the convolution integral of (7.141) with the integrand f ≡ Lu˜ from (7.149). For convenience, (7.149) can be split up into two parts: one corresponding to a = 0 (the most natural choice in practice) and the other one proportional to a: f (x) = f 0 (x) + a f a (x); f 0 (x) = ikδ(x) + δ  (x); f a (x) = (1 − s(x)) exp(αx)(α2 + k 2 ) − αδ(x) − δ  (x)

(7.151)

The convolution integrals are u 0 (x) ≡ f 0 ∗ g = [ikδ(x) + δ  (x)] ∗ g(x) = ikg(x) + g  (x)

=

1 1 + exp(ik|x|) = exp(ik|x|) 2 2

(7.152)

  u a (x) ≡ f a ∗ g = (1 − s(x)) exp(αx)(α2 + k 2 ) − αδ(x) − δ  (x) ∗ g(x) = = (α2 + k 2 ) α2 + k 2 = 2ik

x>0

=



−∞

exp(αx  )g(x − x  )d x  − αg(x) − g  (x) =



1 α exp(ikx) − exp(ikx) exp(αx  ) exp ik(x − x  ) d x  − 2ik 2 −∞ 0

α2 + k 2 exp(ikx) 2ik =

0



1 α exp(ikx) − exp(ikx) exp (α − ik)x  d x  − 2ik 2 −∞ 0

α2 + k 2 α 1 exp(ikx) − exp(ikx) − exp(ikx) = 0 2ik(α − ik) 2ik 2

Thus, for x > 0 we have the correct result u(x) = u 0 (x) + u a (x) = exp(ikx)



(7.153)

414

7 Finite-Difference Time-Domain Methods for Electrodynamics

7.16.3 Far Field in Maxwell’s Electrodynamics Key ideas of the previous section can be extended to vectorial problems of Maxwell’s electrodynamics in 2D and 3D. An outline follows. Let the scatterers be contained in a bounded domain ; as in the previous section, the exterior region ext is assumed to be homogeneous (typically, empty space). As ˜ H, ˜ equal to the actual fields E, H in in (7.146), let us introduce auxiliary fields E, ext and set to zero within . Formally, ˜ ˜ E(r) = s(r)E(r); H(r) = s(r)H(r)

(7.154)

where, as before, s(r) is an analog of the step function  s(r) =

0, r ∈  1, r ∈ /

(7.155)

In particular, this implies that s = 1 on the boundary ∂. Clearly, the tilde-fields satisfy Maxwell’s equations in the whole space, except (in the sense of distributions) on ∂, where the tangential components of the fields have a jump from their generally nonzero values outside  to zero inside. Thus, the generalized curl of E and H (see Sect. 3.17) is generally nonzero. Formally, one has ∇ × E˜ − ik0 B˜ = ∇ × (sE) − ik0 sB = s (∇ × E − ik0 B) + ∇s × E = δ∂ nˆ × E ≡ Jm

(7.156)

where δ∂ is the surface delta function, nˆ is the outward unit normal to the boundary, and Jm is shorthand for δ∂ nˆ × E, which has an intuitive interpretation as the surface density of some equivalent “magnetic current”—a quantity that does not physically exist but is of mathematical convenience. Similarly, ˜ + ik0 B˜ = δ∂ nˆ × H ≡ Je (7.157) ∇ ×H where Je is the equivalent electric current on the boundary ∂.

7.17 Historical Notes This section summarizes the key stages in the history of FDTD, with my brief comments. Further details can be found in the FDTD encyclopedia by A. Taflove & S. C. Hagness [TH05] and on Wikipedia.19 19 https://en.wikipedia.org/wiki/Finite-difference_time-domain_method.

7.17 Historical Notes

415

The Yee scheme [Yee66] (1966). K. Yee’s idea was to use central differences and staggered grids, resulting in an explicit second-order scheme in homogeneous regions (Sects. 7.6, 7.7). At the time of this writing (spring 2019), Yee’s seminal paper has garnered about 8,500 citations, according to Web of Science. Early developments (a small sample). C. D. Taylor et al. [TLS69] (1969) applied the Yee scheme to time-varying media. They remarked: Because of the symmetry that is imposed, the scattering problem treated is only two dimensional. With the IBM 360 model 40 computer made available for this study the memory capacity limited the study to two-dimensional geometry. However, a study is presently underway to use a CDC 6600 computer with additional memory (about 400 000 words) to treat three-dimensional scattering from arbitrarily shaped objects. In 1977, R. Holland published a description of a 3D Yee-scheme code called THREDE [Hol77]: THREDE is a time-domain, linear, finite-difference, three-dimensional ... coupling and scattering code. In its present form, it can accomodate [sic] a problem space consisting of a 30 × 30 × 30 mesh. ... Problem-space boundaries are provided with a radiating condition ... The scatterer must be a perfect conductor...

The FDTD acronym itself was introduced by A. Taflove [Taf80] (1980). Finite Integration Techniques (FIT), 1970s. FIT, developed by T. Weiland, is closely related to FDTD but distinctive enough conceptually to be considered as a separate methodology. It is based on the integral, rather than differential, form of Maxwell’s equations. Further details can be found in Sect. 7.8, and in the original papers by T. Weiland, M. Clemens, R. Schuhmann and their collaborators [Wei77, Wei96, Wei03, CW02, SW04]. As other methods on regular grids, FIT also faces the problem of curved or slanted material interfaces; “subcell” discretization and other ways of dealing with this issue are discussed by I. Zagorodnov et al. [ZSW03]. Radiation boundary conditions (1970s–1990s and beyond). This is a longstanding problem for all methods based on differential rather than integral equations, when an unbounded domain has to be truncated to a finite size for computational purposes. Section 7.10 outlines the main developments in that arena, from absorbing conditions by A. Bayliss & E. Turkel [BT80], G. Mur [Mur81], R. L. Higdon [Hig86], B. Engquist & A. Majda [EM77], T. Hagstrom & collaborators [HH98, HW04, HMOG08, HWG10] to perfectly matched layers (PML) by J. P. Bérenger [Ber96], Z. S. Sacks et al. [SKLL95], W.-C. Chew & collaborators [CW94, TC97, TC98b], S. D. Gedney [Ged96], and many others. Stability and accuracy enhancements (since 1990s). Explicit schemes of classical FDTD imposes serious stability constraints on the time step (Sects. 7.7.2, 7.7.2). These can be circumvented, at a moderate computational cost, by the alternating direction implicit (ADI) method; see Sect. 7.9.1 for details and references. Accuracy can be substantially improved, especially in large homogeneous regions and for interface boundaries parallel to the gridlines, by fast Fourier transforms with semi-analytical differentiation, leading to pseudospectral time-domain methods (Sect. 7.9.2).

416

7 Finite-Difference Time-Domain Methods for Electrodynamics

Applications are too numerous to list. They span all frequencies and all areas of electromagnetic analysis, including various problems in optics and photonics. This broad array of applications has emerged due to the universality and at the same time relative simplicity of FDTD. Although FDTD is a time-domain method by its nature, it has found frequency-domain applications as well, because the computer memory constraints are much less severe for time marching than for large-scale systems of equations arising, say, in finite element analysis or integral equations.

7.18 Codes Below is a list of selected software packages based on FDTD or closely related methods. Code Altair FekoTM altairhyperworks.com/product/Feko

License Commercial

Angora www.angorafdtd.org Dassault Systèmes— CST Microwave Studio www.cst.com FDTD++ www.fdtdxx. com

GNU Public License Commercial

FullWAVE https://www.synopsys. com/photonicsolutions.html Lumerical www.lumerical.com MEEP meep.readthedocs.io OptiFDTD optiwave.com/optifdtd-overview

Method An integrated suite of different solvers FDTD FIT (Sect. 7.8)

Capabilities Antenna design, EMC, waveguides, microstrip circuits, etc.

Complex optical beams, dispersive and random media, optical imaging 3D analysis of high frequency devices

Free with open C++ source code Commercial

FDTD

Numerical solutions to Maxwell’s equations in 3D, 2D, or 1D

FDTD

2D, radial, and 3D simulations. Photonic waveguide devices, photonic crystals, surface plasmons, etc.

Commercial

FDTD

Free, opensource Commercial

FDTD

3D/2D Maxwell’s solver for nanophotonic devices 1D, 2D, 3D; anisotropic, dispersive and nonlinear materials Photonic components: wave propagation, scattering, reflection, diffraction, polarization and nonlinear phenomena Antenna design, biomedical, EMI/EMC, microwave devices, radar, scattering, etc. Periodic 1D, 2D, and 3D structures; anisotropic materials

FDTD

REMCOM XFdtd www.remcom.com

Commercial

“Specialized FDTD”

WOLFSIM sourceforge.net/projects/wolfsim

GNU General Public License

FDTD

7.18 Codes

417

Acknowledgements I thank Allen Taflove, “father of the finite-difference time-domain technique,”20 for quite an invigorating and informative conversation in the summer of 2017. I also gratefully acknowledge communication, over many years, with experts on FDTD and FIT, especially the group of Thomas Weiland and his former students and coworkers: Markus Clemens, Herbert De Gersem, Irina Munteanu, Rolf Schuhmann.

7.19 Appendix: The Yee Scheme Is Exact in 1D for the “Magic” Time Step For easy reference, let us repeat the curl-E Yee equation (7.12) in 1D: ζ

Hn+1/2,m+1/2 − Hn+1/2,m−1/2 E n+1,m − E n,m = −μ x t

(7.158)

and consider the “magic” case x = v p t. The Yee equation then becomes ζ (E n+1,m − E n,m ) = −μv p (Hn+1/2,m+1/2 − Hn+1/2,m−1/2 )

(7.159)

We intend to show that this FD equation is in fact exact for any traveling wave (TW)—for definiteness, moving in the +x direction: E T W (x, t) = E 0 f (x − v p t); HT W (x, t) = H0 f (x − v p t)  Z H0 = E 0 , Z =

(7.160)

μ 

here f is an arbitrary differentiable function, Z is the intrinsic impedance of the medium, and E 0 is an arbitrary amplitude. Substituting this wave into the left-hand side of (7.159), we have l.h.s. = ζ E 0



f (xn + x − v p tm ) − f (xn − v p tm )



Similarly, the right-hand side is  r.h.s. = −μv p H0

1 1 1 1 f (xn + x − v p (tm + t)) − f (xn + x − v p (tm − t) 2 2 2 2



which for the “magic” step simplifies to r.h.s. = μv p H0



f (xn + x − v p tm ) − f (xn − v p tm )



√ √ Since v p = ζ/ μ and H0 = E 0 /μ, we see that indeed r.h.s. = l.h.s. 20 https://www.nature.com/articles/nphoton.2014.305.



418

7 Finite-Difference Time-Domain Methods for Electrodynamics

7.20 Appendix: Green’s Functions for Maxwell’s Equations 7.20.1 Green’s Functions for the Helmholtz Equation By definition, Green’s function g(r) for the Helmholtz equation satisfies, in n dimensions, the equation (7.161) − ∇ 2 g(r) − k 2 g(r) = δ(r) and the Sommerfeld radiation condition lim r 2 (n−1) (∂r g ± ikg) = 0 for the exp(±iωt) phasor convention 1

r →∞

(7.162)

As usual, δ(r) is the Dirac delta function, and k > 0 is a fixed wavenumber. The negative sign in the left-hand side of (7.161) is introduced as just a matter of convenience, to eliminate the respective negative sign in the expressions for Green’s functions. The radiation condition reflects the fact that under the “electrical engineering” phasor convention exp(+iωt) [≡ exp(+ jωt)], the phase of an outgoing wave is asymptotically exp(−ikr ). Under the “physics” phasor convention exp(−iωt), the signs are opposite. Expressions for Green’s functions in 1D, 2D and 3D are widely available from various sources and are given below without derivation. They can be verified by direct substitution in the Helmholtz equation, provided that the singularity at the origin is treated with care in the sense of distributions. g1D (x) =

1 exp(∓ik|x|) for exp(±iωt) 2ik

(7.163)

i (γ) h (kr ) 4 0

(7.164)

g2D (r ) = (γ)

where h 0 is the Hankel function of type γ = 1 for the exp(−iωt) phasor convention, and of type γ = 2 for exp(+iωt). Hankel functions are numerically expensive to compute, so their far-field (kr → ∞) asymptotic expansions are ordinarily used:  (γ) h 0 (z)



Finally, g3D (r ) =

  π  2 exp −i z − πz 4

exp(∓ikr ) for exp(±iωt) 4πr

(7.165)

7.20 Appendix: Green’s Functions for Maxwell’s Equations

419

7.20.2 Maxwell’s Equations Maxwell’s equations with a nonzero right-hand side, in the frequency domain, in a homogeneous linear isotropic medium filling the whole 3D space, are written in the SI system under the exp(iωt) phasor convention as ∇ × E = −iωB + Jm

(7.166)

∇ × H = iωD + Je

(7.167)

D = E, B = μH

(7.168)

where  is the dielectric permittivity, and μ is the magnetic permeability; Jm and Je are magnetic and electric current densities, respectively. These currents may be either auxiliary quantities introduced for mathematical analysis or, in the case of Je , actual electric currents (however, see Sect. 10.2). Assume further that the whole space is filled with a linear isotropic homogeneous medium, with a permittivity  and a permeability μ. By applying the divergence operator to (7.166), (7.167), one immediately obtains ∇ ·B = − ∇ ·D =

i ∇ · Jm ω

i ∇ · Je ω

(7.169) (7.170)

It is convenient to consider the general excitation as a superposition of electromagnetic fields induced by the electric current Je alone, and by the “magnetic current” Jm alone.

7.20.3 Subcase Jm = 0 First, let us consider the fields induced by Je , with Jm = 0. Then, from (7.166), ∇ · B = 0 and B = ∇ × A, where A is the magnetic vector potential; hence, also from from (7.166), E = −iωA − ∇φ (7.171) where φ is an electric scalar potential. Then (7.167) becomes

or

∇ × H = iω(−iωA − ∇φ) + Je

(7.172)

∇ × ∇ × A = μ ω 2 A − iω∇φ + μJe

(7.173)

420

7 Finite-Difference Time-Domain Methods for Electrodynamics

Using the standard calculus identity ∇ × ∇ = ∇∇ · −∇ 2 , one rewrites (7.173) in terms of the Laplace operator:

− ∇ 2 A = μ ω 2 A − iω∇φ − ∇∇ · A + μJe

(7.174)

So far only the curl of A has been defined (it is equal to B), and therefore the magnetic vector potential is not unique. One can choose the divergence of A arbitrarily. A convenient choice simplifying (7.174) is the Lorenz gauge ∇ · A = −iωμφ

(7.175)

This gauge decouples A and φ in (7.174): − (∇ 2 A + k 2 A) = μJe , k 2 = ω 2 μ

(7.176)

The respective equation for φ can be obtained from (7.170), (7.171), with D = E: −iω∇ · A − ∇ 2 φ =

i ∇ · Je ω

With the Lorenz gauge, this simplifies to iω(iωμφ) − ∇ 2 φ = or − (∇ 2 + k 2 )φ =

i ∇ · Je ω

i ∇ · Je ω

(7.177)

Let g(r) be Green’s function of the scalar Helmholtz operator ∇ 2 + k 2 in the lefthand side of (7.177). Explicit expressions for g(r), with relevant references, are given in Sect. 7.20.1. Solutions of Eqs. (7.176), (7.177) can then be written as A = μJe ∗ g

φ=

i ∇ · Je ∗ g ω

(7.178) (7.179)

where “*” denotes the standard convolution integral over the whole space. From these expressions for the potentials, one obtains the electric field (7.171) E = −iωμJe ∗ g − ∇

i i ∇ · Je ∗ g = −iωμJe ∗ g − ∇∇ · Je ∗ g ω ω

or E =



i ∇∇ g ∗ Je −iωμ − ω

(7.180)

(7.181)

7.20 Appendix: Green’s Functions for Maxwell’s Equations

421

The last transformation relies on the fact that differentiation (in this case, the ∇ operator) can be applied to either term of a convolution. In general, care must be exercised when dealing with the singularity of Green's functions at the origin; however, in far-field analysis this issue does not arise, since all Green's functions in the far field are smooth. Readers familiar with the notion of dyadic Green's functions will recognize the first term in the convolution (7.181) as exactly that; but such readers are likely to be familiar with the material of this section anyway. Dyadic Green's functions are not used anywhere else in this book, so (7.180) is sufficient for our needs.

With the electric field determined by the convolution integral (7.180) or, equivalently, (7.181), the magnetic field is found from the Maxwell curl equation as

H = (i/(ωμ)) ∇ × E    (7.182)
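As a concrete illustration of the convolution solutions (7.178), (7.179) and the field reconstruction (7.171), the short Python sketch below evaluates the potentials of a discretized current distribution by direct summation. It is only a minimal numerical sketch, not code from this book: the scalar Green's function is assumed here to take the standard 3D outgoing-wave form g(r) = exp(−ikr)/(4πr), consistent with the exp(iωt) convention of Sect. 7.20.2 (the explicit expressions of Sect. 7.20.1 should be used in practice), and the test current (a short z-directed filament), the parameters and all variable names are illustrative.

```python
# Minimal sketch of (7.178)-(7.179) and E = -i*omega*A - grad(phi), eq. (7.171).
# Assumptions (not from the book): g(r) = exp(-i k r)/(4 pi r); a short z-directed
# current filament as the source; grad(phi) taken by central differences.
import numpy as np

eps0, mu0 = 8.854e-12, 4e-7 * np.pi
omega = 2 * np.pi * 300e6                 # 300 MHz -> free-space wavelength of about 1 m
k = omega * np.sqrt(eps0 * mu0)

def g(r):                                 # scalar Helmholtz Green's function
    return np.exp(-1j * k * r) / (4 * np.pi * r)

L, I0, n = 0.05, 1.0, 201                 # filament length, current, number of segments
z = np.linspace(-L/2, L/2, n)
dz = z[1] - z[0]

def A_z(x, y, zz):                        # (7.178): A = mu * (Je * g), z-component only
    r = np.sqrt(x**2 + y**2 + (zz - z)**2)
    return mu0 * I0 * np.sum(g(r)) * dz

def phi(x, y, zz):                        # (7.179): for a uniform filament, div Je reduces
    r_top = np.sqrt(x**2 + y**2 + (zz - L/2)**2)   # to point "charges" at the two ends
    r_bot = np.sqrt(x**2 + y**2 + (zz + L/2)**2)
    return (1j / (omega * eps0)) * I0 * (g(r_bot) - g(r_top))

obs = np.array([2.0, 0.0, 0.3])           # observation point, about two wavelengths away
h = 1e-3
grad_phi = np.array([(phi(*(obs + h*e)) - phi(*(obs - h*e))) / (2*h) for e in np.eye(3)])
E = -1j * omega * np.array([0.0, 0.0, A_z(*obs)]) - grad_phi
print("E at the observation point:", E)
```

The magnetic field could then be obtained from (7.182) by numerical differentiation of E, or directly from H = ∇ × A / μ.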

7.20.4 Subcase Je = 0

The analysis of fields induced by Jm is completely similar to the analysis for Je of the previous section, so only a brief summary is needed. To avoid cumbersome notation, the fields in this section are still denoted generically as E and H, but one should bear in mind that these fields are induced by different sources and hence differ from the fields in the previous section.

If Je = 0, then, from (7.167), ∇ · D = 0 and D = ∇ × F, where F is the electric vector potential; hence

H = iωF − ∇ψ    (7.183)

Here ψ is the magnetic scalar potential. Then (7.166) becomes

∇ × E = −iωμ(iωF − ∇ψ) + Jm    (7.184)

or

∇ × (ε⁻¹ ∇ × F) = −iωμ(iωF − ∇ψ) + Jm    (7.185)

Then,

−∇²F + ∇∇ · F = ω²μεF + iωμε∇ψ + εJm    (7.186)

The Lorenz-like gauge

∇ · F = iωμεψ    (7.187)

decouples F and ψ in (7.186):

−(∇²F + k²F) = εJm    (7.188)


For the magnetic scalar potential ψ, one obtains, from (7.169), (7.183), with B = μH:

∇ · [μ(iωF − ∇ψ)] = −(i/ω) ∇ · Jm

or

iωμ∇ · F − μ∇²ψ = −(i/ω) ∇ · Jm

With the gauge (7.187), this simplifies to

−(∇² + k²)ψ = −(i/(ωμ)) ∇ · Jm    (7.189)

Solutions of Eqs. (7.188), (7.189) can be written as convolutions with Green's function:

F = ε Jm ∗ g    (7.190)

ψ = −(i/(ωμ)) ∇ · Jm ∗ g    (7.191)

The magnetic field (7.183) is then

H = iωF − ∇ψ = iωε Jm ∗ g + (i/(ωμ)) ∇∇ · Jm ∗ g    (7.192)

or

H = (iωε + (i/(ωμ)) ∇∇·) g ∗ Jm    (7.193)

and the electric field follows from (7.167) with Je = 0:

E = −(i/(ωε)) ∇ × H    (7.194)

7.20.5 Excitation by both Electric and “Magnetic” Currents In general, when both Je and Jm are nonzero, the electromagnetic field is just a superposition of fields induced by Je and Jm separately—that is, the sum of the final results of the previous two sections. For the electric field, the relevant equations are (7.180) or (7.181) and (7.182); for the magnetic field—(7.192) or (7.193) and (7.194).


7.20.6 Summary: Near-to-Far-Field Transformation

The key ideas of near-to-far-field transformation in the frequency domain can thus be summarized as follows.

• In a typical wave scattering problem, sources and inhomogeneities (e.g. scattering bodies) are confined within a bounded domain Ω, the exterior of which is homogeneous (usually just air). The field inside Ω can be simulated using FDTD, with proper absorbing conditions or PML on the boundary ∂Ω.
• The far-field pattern, many wavelengths away from Ω, is very often of great practical interest. The objective of near-to-far-field transformation is to derive that pattern from the near field, which has been computed within Ω.
• To this end, it is convenient to consider an auxiliary field, equal to the actual field in the exterior domain but artificially set to zero inside Ω. This auxiliary field satisfies Maxwell's equations everywhere except for the boundary ∂Ω, where jumps in the tangential components of the E and H fields are conceptually attributable to surface currents—electric and magnetic.
• These surface currents can then be viewed as sources of the far field. Since the exterior domain is homogeneous, the far field can be expressed as a convolution of the surface currents with Green's function (Sect. 7.20). In 2D, Green's function is a computationally expensive Hankel function, but large-argument asymptotic expansions can be used for the far field (a brief numerical check of these asymptotics is given at the end of this section). Convolutions with 3D Green's functions are described above.

In the time domain, the near-to-far-field transformation is, naturally, much more involved. The interested reader is referred to papers by T. B. Hansen & A. D. Yaghjian, S. González García et al., A. Shlivinski & A. Boag, J. Li & B. Shanker, C.-C. Oetting & L. Klinkenbusch [HY94b, HY94a, GGG00, OK05, SB09, LS15].
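The large-argument behavior mentioned in the last bullet above can be checked directly. The following minimal Python sketch (not from the book) compares the Hankel function appearing in the 2D Helmholtz Green's function with its standard leading-order asymptotic form; whether the first or the second kind is appropriate depends on the phasor convention, and for real arguments the two are complex conjugates, so the comparison for H0 of the first kind is representative. The sample kr values are illustrative.

```python
# Compare H0^(1)(kr) with its large-argument asymptotic form
# sqrt(2/(pi*k*r)) * exp(i*(k*r - pi/4)).
import numpy as np
from scipy.special import hankel1

kr = np.array([2.0, 10.0, 50.0, 200.0])          # from "near" to "far"
exact = hankel1(0, kr)
asympt = np.sqrt(2.0 / (np.pi * kr)) * np.exp(1j * (kr - np.pi / 4))
for x, e, a in zip(kr, exact, asympt):
    print(f"kr = {x:6.1f}   relative error = {abs(e - a) / abs(e):.2e}")
```

Already for kr of a few dozen, i.e. several wavelengths away from the scatterer, the asymptotic form is accurate to a fraction of a percent, which is why it is routinely used in far-field computations.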

Chapter 8

Applications in Nano-Photonics

8.1 Introduction

Visible light consists of electromagnetic waves with submicron wavelengths—between ∼400 nm (blue light) and ∼700–750 nm (red light) in free space. Therefore, propagation of light through materials is affected greatly by their submicron features and structures. Moreover, the ability to create and control such small features has led to peculiar physical effects, technologies and devices, as discussed later in this chapter. Truly nanoscale features, much smaller than the wavelength, can also be crucial. In particular, one exciting direction in photonics involves nanoscale (5–50 nm) "plasmon" particles and structures that exhibit peculiar resonance behavior in the optical frequency range (Sect. 8.12).

This chapter is not a comprehensive review of nano-photonics; rather, it covers selected intriguing applications and related methods of computer simulation. For a broader view, see S. V. Gaponenko's and P. N. Prasad's monographs [Gap10, Pra03, Pra04]. References on more specific subjects (photonic crystals, plasmonics, nano-optics, etc.) are given in the respective sections of this chapter.

The indispensable starting point in a discussion of photonics is Maxwell's equations, which describe electromagnetic fields in general and propagating electromagnetic waves in particular. After a brief review of Maxwell's equations, the chapter gives an introduction to band structure and the photonic bandgap (PBG) phenomenon, plasmonic particles and plasmon-enhanced scanning near-field optical microscopy (SNOM), backward waves, negative refraction and nano-focusing, with related simulation examples.

8.2 Maxwell's Equations

Even though Maxwell's equations appear in many places in this book, they are summarized here for easy reference. There is the "curl part"


SI:

∇ × E = −∂t B    (8.1)

∇ × H = ∂t D + J    (8.3)

Gaussian:

∇ × E = −(1/c) ∂t B    (8.2)

∇ × H = (1/c) ∂t D + (4π/c) J    (8.4)

and the "divergence part"

SI:

∇ · D = ρ    (8.5)

∇ · B = 0    (8.7)

Gaussian:

∇ · D = 4πρ    (8.6)

∇ · B = 0    (8.8)

In these equations, E and H are the electric and magnetic fields, respectively; D and B are the electric and magnetic flux densities, respectively; ρ is the electric charge density, and J is the electric current density. For physical definitions of these vector quantities and a detailed physical discussion, see well-known textbooks by L. D. Landau and E. M. Lifshitz [LLP84], J. A. Stratton [Str41], R. P. Feynman et al. [FLS89], W. K. H. Panofsky and M. Phillips [PP62], R. Harrington [Har01].

The physical meaning of Maxwell's equations becomes more transparent if they are rewritten in integral form using the standard vector calculus identities. The first two equations become

SI:

∮_∂S E · dl = −d/dt ∫_S B · dS    (8.9)

∮_∂S H · dl = d/dt ∫_S D · dS + ∫_S J · dS    (8.11)

Gaussian:

∮_∂S E · dl = −(1/c) d/dt ∫_S B · dS    (8.10)

∮_∂S H · dl = (1/c) d/dt ∫_S D · dS + (4π/c) ∫_S J · dS    (8.12)

These relationships are valid for any open surface S with its closed-contour boundary ∂ S oriented in the standard way. Equations (8.9), (8.10)—known as Faraday’s law—mean that the electromotive force (emf) over a closed contour is induced by the changing magnetic flux passing through that contour. (The emf is defined as the


line integral of the electric field.)1 Unlike the emf equations (8.9), (8.10), equations (8.11), (8.12) for the magnetomotive force (mmf, the contour integral of the magnetic field) contain two terms on the right-hand side. The mmf is due to the changing electric flux and to the electric current passing through the closed contour. The lack of complete symmetry between the emf and mmf equations is due to the apparent absence of magnetic charges (monopoles).2 If monopoles are ever discovered, presumably the Faraday law will have to be amended, as magnetic currents would contribute to the emf over a closed contour.

Next, the integral form of the divergence equations (8.5), (8.6) and (8.8) is, for any 3D domain Ω bounded by a closed surface ∂Ω,

∮_∂Ω D · dS = [4π] QΩ,   QΩ = ∫_Ω ρ dΩ    (8.13)

∮_∂Ω B · dS = 0    (8.14)

The first of these equations, known as Gauss's law, relates the flux of the D vector through any closed surface to the total electric charge inside that surface; the 4π factor in the brackets applies only in the Gaussian system. The second equation, for the flux of the B field, is analogous, except that there is no magnetic charge (see footnote 2).

As it stands, the system of four Maxwell's equations is still underdetermined. Generally speaking, a vector field in the whole space (and vanishing at infinity) is uniquely defined by both its curl and divergence, whereas Maxwell's equations specify the curl of E and the divergence of D, not E. The same is true for the pair of magnetic fields H and B. To close the system of equations, one needs to specify the relationships, known as constitutive laws, between E, D, H and B. In linear isotropic materials,

D = εE,   ε = ε(x, y, z)    (8.15)

B = μH,   μ = μ(x, y, z)    (8.16)

In other types of media, however, relationships between the fields can be substantially more complicated—they can be nonlinear and can include the time history of the electromagnetic process. The dependence on the history is called hysteresis (I. D. Mayergoyz [May03]). Moreover, the magnetic and electric fields can be coupled (see, e.g., J. Lekner [Lek96], A. Lakhtakia and collaborators [LVV89, Lak06, ML10], Y. N. Obukhov and F. W. Hehl [OH09], and Sect. 9.2.5.1, with multiple references therein). Our discussion and examples, however, will be limited to the linear isotropic case (8.15), (8.16).

1 An alternative approach, where—loosely speaking—the emf is taken as a primary quantity and the field is defined via the emf, is arguably more fundamental but requires the notions of differential geometry and differential forms, which are beyond the standard engineering curriculum. See monographs by P. Monk [Mon03] and A. Bossavit [Bos98], as well as the sections on edge elements (Sect. 3.12) and on the finite integration technique (FIT) in Sect. 7.8.

2 On February 14, 1982, a monopole-related event may have been registered in the laboratory of Blas Cabrera (B. Cabrera, First results from a superconductive detector for moving magnetic monopoles, Phys. Rev. Lett., vol. 48, pp. 1378–1381, 1982). An abrupt change in the magnetic flux through a superconducting loop was recorded (the magnetic flux is known to be quantized). A magnetic monopole passing through the loop would cause a similar flux jump. However, nobody has been able to reproduce this result.

There is a connection between the curl and divergence equations. Indeed, since divergence of curl is zero, by applying the divergence operator to both sides of the curl equations (8.1), (8.2) and (8.3), (8.4), one obtains

∂t ∇ · B = 0    (8.17)

and

∇ · (∂t D + [4π]J) = 0    (8.18)

The first equation implies zero divergence for B if, in addition, zero divergence is imposed as the initial condition at any given moment of time. Alternatively, zero divergence can be easily deduced from Faraday's law if the fields are time-harmonic (i.e. sinusoidal in time—more about this case below). Without such additional assumptions, zero divergence does not in general follow from Faraday's law. Similar considerations show a close connection, but not complete equivalence, between the divergence equation (8.18) and conservation of charge. Substituting the div D equations (8.5), (8.6) into (8.18) gives

∇ · J = −∂t ρ    (8.19)

which is a mathematical expression of charge conservation.3 This logic cannot be completely reversed to produce the divergence equation for D from charge conservation and the curl equation for H. Indeed, substituting conservation of charge (8.19) into (8.18), one obtains

∂t (∇ · D − [4π]ρ) = 0    (8.20)

which makes the divergence equation ∇ · D = ρ true at all moments of time, provided that it holds at any given moment of time.

Time-harmonic fields can be described by complex phasors. It will always be clear from the context whether a time function or a phasor is being considered, so for simplicity of notation phasors will be denoted with the same symbols as the corresponding time-dependent fields (H, D, etc.), with little danger of confusion. At the same time, we are facing a dilemma with regard to notational conventions on complex phasors themselves (see remarks on "On Units and Conventions").

3 Charge conservation is more easily noted if this equation is put into integral form, ∮_∂Ω J · dS = −dQΩ/dt: the current flowing out of a closed volume is equal to the rate of depletion of electric charge inside that volume.


Physicists usually assume that the actual E field can be obtained from its phasor as Re{E exp(−iωt)}, and similarly for other fields. Electrical engineers take the plus sign, exp(+iωt), in the complex exponential. This notational difference is equivalent to replacing all phasors with their complex conjugates. Unfortunately, material parameters also get replaced with their conjugates, and confusion may arise, say, if engineers take the dielectric permittivity from the physical data measured in the "wrong" quadrant. In addition, physicists and mathematicians typically use symbol i for the imaginary unit, while engineers prefer j. With these considerations in mind, Maxwell's equations for the phasors of time-harmonic fields are

SI:

∇ × E = −iωB    (8.21)

∇ × H = iωD + J    (8.23)

Gaussian:

∇ × E = ik0 B,   k0 ≡ ω/c    (8.22)

∇ × H = −ik0 D + (4π/c) J    (8.24)

(Note that, as written, the SI and Gaussian forms above correspond to the exp(+iωt) and exp(−iωt) conventions, respectively.)

Maxwell’s “divergence equations” (8.5), (8.6), (8.8) do not involve time derivatives and are therefore unchanged in the frequency domain. For time-harmonic fields, zero divergence for B follows directly and immediately from (8.21), (8.22), and conservation of charge follows from (8.23), (8.24).
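The practical consequence of the two phasor conventions discussed above is easy to check numerically. The short Python sketch below (not from the book; all numbers are illustrative) verifies that a phasor under the physics convention and its complex conjugate under the engineering convention reproduce the same physical time-domain field, which is why material parameters such as a lossy permittivity must also be conjugated when switching conventions.

```python
# Check that Re{E * exp(-i w t)} (physics convention) equals
# Re{conj(E) * exp(+i w t)} (engineering convention) for an arbitrary phasor E.
import numpy as np

omega = 2 * np.pi * 1.0e9          # arbitrary angular frequency
E_phys = 3.0 - 2.0j                # arbitrary phasor under the exp(-i*omega*t) convention
E_eng = np.conj(E_phys)            # the same field under the exp(+i*omega*t) convention

t = np.linspace(0.0, 2e-9, 7)
field_phys = np.real(E_phys * np.exp(-1j * omega * t))
field_eng = np.real(E_eng * np.exp(+1j * omega * t))
print(np.allclose(field_phys, field_eng))   # True

# Consequently, a lossy permittivity written as eps' + i*eps'' in the physics
# convention becomes eps' - i*eps'' in the engineering convention.
```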

8.3 One-Dimensional Problems of Wave Propagation

8.3.1 The Wave Equation and Plane Waves

The simplest, and yet important and instructive, case for electromagnetic analysis involves fields that are independent of two Cartesian coordinates (say, y and z) and may depend only on the third one (x); the medium is assumed to be source-free (ρ = 0, J = 0), isotropic and homogeneous, with parameters ε and μ independent of the spatial coordinates and time. Divergence equations (8.5), (8.6), (8.8) in this case yield

∂Dx/∂x = 0,   ∂Bx/∂x = 0    (8.25)

and hence Dx and Bx must be constant. These trivial uniform electro- and magnetostatic fields are completely disassociated from the rest of the analysis and will hereafter be ignored. The following equations are written in the SI system, where the explicit 1/c factor does not appear; μ and ε are the absolute (rather than relative) permeability and permittivity of a linear medium.


In the absence of the x-component of the fields, the curl equations (8.1), (8.3) become

∂Ey/∂x = −μ ∂Hz/∂t ;   ∂Ez/∂x = μ ∂Hy/∂t    (8.26)

∂Hz/∂x = −ε ∂Ey/∂t ;   ∂Hy/∂x = ε ∂Ez/∂t    (8.27)

It is not hard to see that the equations have decoupled into two pairs:

∂Ey/∂x = −μ ∂Hz/∂t ;   ∂Hz/∂x = −ε ∂Ey/∂t    (8.28)

∂Ez/∂x = μ ∂Hy/∂t ;   ∂Hy/∂x = ε ∂Ez/∂t    (8.29)

These pairs of equations correspond to two separate waves: one with the (Ey, Hz) components of the fields and the other one with the (Ez, Hy) components. These are different polarizations of the wave; by convention, it is the direction of the electric field that defines polarization. Thus, the wave of (8.28) is said to be polarized in the y direction, while the wave of (8.29) is polarized in the z direction.

We can now focus on one of the waves—say, on the (Ey, Hz) wave (8.28)—because the other one is completely similar. The magnetic field can be eliminated by differentiating the first equation in (8.28) with respect to x, the second one with respect to time and then adding these equations to remove the mixed derivative of the H field. This leads to the wave equation

∂²Ey/∂x² − με ∂²Ey/∂t² = 0    (8.30)

It is straightforward to verify, using the chain rule of differentiation, that any field of the form

Ey(x, t) = g(vp t ± x)    (8.31)

satisfies the governing equation (8.30) if g is an arbitrary twice-differentiable function and vp is

vp = 1/√(με)    (8.32)

For example, Ey(x, t) = (vp t − x)² and Ey(x, t) = cos k(vp t − x), where k is a given parameter, are valid waves satisfying the electromagnetic equations. Physically, (8.31) represents a waveform that propagates in space without changing its shape (the shape is specified by the g function). Let us trace the motion of any point with a fixed value of Ey on the waveform. The fixed value of the field implies zero full differential

dEy = (∂Ey/∂t) dt + (∂Ey/∂x) dx = g′ vp dt ± g′ dx = 0    (8.33)

and hence (for a nonzero derivative g′)

dx/dt = −(∂Ey/∂t) / (∂Ey/∂x) = ∓vp    (8.34)

(8.36)

where E ± are some amplitudes and √ k = ω μ

(8.37)

is the wavenumber. Since k enters the solution (8.36) with both plus and minus signs, it is at this point unimportant which branch of the square root is chosen to define k in (8.37). This issue will become non-trivial later, in the context of backward waves and negative refraction.

8.3.2 Signal Velocity and Group Velocity Plane waves cannot be used as “signals”; they do not transfer energy or information because, by definition, they exist forever and everywhere. Thus, unavoidably, information transfer must involve more than one frequency.

432

8 Applications in Nano-Photonics

Now, the standard textbook argument goes like this: Consider a superposition of two waves, for simplicity of the same amplitude, with slightly different frequencies ω ± ω (ω  ω). Simple algebra gives exp [i ((k + k)x − (ω + ω)t)] + exp [i ((k − k)x − (ω − ω)t)] = 2 exp [i(kx − ωt)] cos (xk − tω) The cosine term can be viewed as a low-frequency (ω) “signal” and the complex exponential as a high-frequency (ω) carrier wave. The “signal” cos(xk − tω) manifests itself as beats on the carrier wave and propagates with the group velocity vg = ω/k (the “group” consisting of just two waves in this idealized case). The ω → 0 limit ∂ω (8.38) vg = ∂k is then declared to be “signal velocity”—different from the phase velocity v p = ω/k. However, if a single monochromatic wave contains zero information, one may wonder how two such waves—or any finite number of plane waves for that matter— could carry a nonzero amount of information.4 Indeed, the train of beats is no less predictable than a single plane wave and also is present, theoretically, everywhere and forever. It cannot therefore be used as a signal any more than a single plane wave could. A rigorous analysis must rely on precise definitions of “information” and “signal” —a territory into which I will not attempt to venture here and which would take us too far from the main subjects of this chapter. Instead, following the books by L. Brillouin [Bri60] and P. W. Milonni [Mil04, Sect. 1.5], let us note that an observer can receive a nonzero amount of information only if the future behavior of the wave cannot be determined from its values in the past. This implies, in particular, that an information-carrying wave has to be, in the mathematical sense, non-analytic. As a characteristic example, consider a pointwise source capable of generating an arbitrary (not necessarily analytic!) field at x = 0. Let us use this source to produce amplitude modulation E(x = 0, t) = E(x = 0, t) exp(−iω0 t)

(8.39)

where E(x = 0, t) is a low-frequency waveform that can be used to carry information; ˆ = 0, ω) of E(x = 0, t) is ω0 is the carrier frequency. The Fourier transform E(x ˆ = 0, ω) = E(x



∞ −∞

ˆ = 0, ω − ω0 ) E(x = 0, t) exp(−iω0 t) exp(iωt)dt = E(x

That is, the modulation shifts the spectrum of E by ω0 , as is well known in signal analysis. The complex field phasor of an outgoing wave at an arbitrary x > 0 then is 4 This

is why the word “signal” was put in quotes in the previous paragraph.

8.3 One-Dimensional Problems of Wave Propagation

433

ˆ = 0, ω − ω0 ) exp(ik(ω)x) ˆ E(x, ω) = E(x

(8.40)

If there is no dispersion, i.e. the velocity of the wave is frequency-independent, k(ω) = ω/v p and   x ˆ ˆ (no dispersion) E(x, ω) = E(x = 0, ω − ω0 ) exp iω vp the inverse Fourier transform of which is   x (no dispersion) E(x, t) = E t − vp The wave arrives at the observation point x unmolested, only with a time delay x/v p . We are, however, interested in the general case with dispersion. The timedependent field can be found from its Fourier transform (8.40) as  E(x, t) =



−∞

ˆ = 0, ω − ω0 ) exp(ik(ω)x) exp(−iωt) dω E(x

(8.41)

which gives the low-frequency “signal” E(x, t)  E(x, t) = E(x, t) exp(iω0 t) = where

∞ −∞

ˆ = 0, ω  ) exp(ik(ω  )x) exp(−iω  t) dω  E(x (8.42)

ω  ≡ ω − ω0

The velocity of this signal can be found from the condition of zero differential dE(x, t) in full analogy with equations (8.33) and (8.34); this velocity is the ratio of partial differentials of E(x, t) with respect to t and x. These partial derivatives are ∂E = i ∂x and





−∞

k(ω  ) E(x = 0, ω  ) exp(ik(ω  )x) exp(−iω  t) dω 

∂E = −i ∂t



∞ −∞

ω  E(ω  ) exp(ik(ω  )x) exp(−iω  t) dω 

(8.43)

(8.44)

So far, the expressions have been exact; now an approximation is needed to find a relationship between the two partial derivatives. Since E(t) is a low-frequency function, the main contribution to the Fourier transforms comes from the small values of ω  = ω − ω0 . First-order approximation of ω  (k) for small k is ω ≈ k

∂ω  (0) ∂k

(small k)

434

8 Applications in Nano-Photonics

and consequently ∂E ∂ω  ≈ −i (0) ∂t ∂k



∞ −∞

k E(ω  ) exp(ikx) exp(−iω  t) dt

Therefore, the velocity of the signal is vsignal ≈

∂ω ∂E ∂E = ≡ vg / ∂t ∂x ∂k

(8.45)

Thus, group velocity ∂ω/∂k is only an approximation of signal velocity (P. W. Milonni elaborates on this in [Mil04, Sect. 1.5]). As the above analysis shows, the accuracy of this approximation depends on the deviation of the dispersion curve ω(k) from a straight line within the frequency range [ω0 − ωE , ω0 + ωE ], where [−ωE , ωE ] is a characteristic frequency band for the signal E beyond which its amplitude spectrum is zero or can be neglected; it is assumed that ωE  ω0 . One may not be satisfied with these approximations and may wish to define signal velocity exactly. However, the precise definition is elusive. Indeed, consider a broadband signal such as a sharp pulse. Its high-frequency components can, at least in principle, be used to convey information. But at high frequencies the material parameters tend to their free space values (0 and μ0 in SI and just unity in the Gaussian system), and hence group velocity tends to the speed of light. Thus— as a matter of principle and disregarding all types of noise—information can be transferred with the velocity of light in any medium. An equivalent and instructive physical interpretation is given by A. Sommerfeld [Bri60, p. 19], with attribution to W. Voigt: We will show here that the wave front velocity is always identical with the velocity of light in vacuum, c, irrespective of whether the material is normally or anomalously dispersive, whether it is transparent or opaque, or whether it is simply or doubly refractive. The proof is based on the theory of dispersion of light, which explains the various optical properties of materials on the basis of the forced oscillations of the particles of the material, either electrons or ions. . . . According to our present knowledge . . ., there exists only one isotropic medium for electrodynamic phenomena, the vacuum, and the deviations from vacuum properties can be traced back to the forced oscillations of charges. When the wave front of our signal makes its way through the optical medium, it finds the particles which are capable of oscillating originally at rest . . ., (except for their thermal motion which has no effect on propagation, due to its randomness). Originally, therefore, the medium seems optically empty; only after the particles are set into motion, can they influence the phase and form of the light waves. The propagation of the wavefront, however, proceeds undisturbed with the velocity of light in vacuum, independently of the character of the dispersing ions.

8.3.3 Group Velocity and Energy Velocity The relationship between group velocity and the Poynting vector has substantial physical significance in its own right but even more so in connection with back-

8.3 One-Dimensional Problems of Wave Propagation

435

ward waves and negative refraction, to be discussed in Sect. 8.15. Let us consider a homogeneous source-free isotropic material with frequency-dependent parameters (ω) and μ(ω). Losses at a given operating frequency (but not necessarily at other frequencies) will be neglected, so that both  and μ are real at that frequency. A y-polarized plane wave propagating in the x direction is governed by the equation (8.46) E  (x) + k 2 E(x) = 0 where k 2 = ω 2 (ω)μ(ω) in the SI system; E = E y , and has the form E(x) = E 0 exp(ikx)

(8.47)

under the exp(−iωt) phasor convention. The magnetic field H = Hz is H (x) = H0 exp(ikx);

H0 =

k E0 = ωμ

  21  E0 μ

(8.48)

Power flux is characterized by the time-averaged Poynting vector with the xcomponent only:   21 1 1  ∗ 2 P ≡ P x = Re(E H ) = |E 0 | Re 2 2 μ

(8.49)

(In the Gaussian system, there is an additional factor c/(4π).) If one is interested in the wave with power flow in the +x direction, then the real part of k is positive and the square root in (8.49) is the one with a positive real part. Since group velocity and the Poynting vector are related to the propagation of signals and energy, respectively, there is a connection between them. For group velocity, we have vg−1 =

∂k = ∂ω

  ∂ ∂μ 1 1 2μ + ωμ + ω (μ)− 2 ∂ω ∂ω 2

(8.50)

The amount of field energy transferred through a surface element d S = dy dz over the time interval dt is equal to w d S d x = w d S v E dt, where w is the volume energy density and v E is energy velocity. On the other hand, the same transferred energy is equal to P d S dt; hence, w d S v E dt = Pd Sdt or simply w vE = P

(8.51)

If one assumes that energy, like signals, propagates with group velocity (under the approximation assumptions considered above), i.e. v E = vg , then the volume energy

436

8 Applications in Nano-Photonics

density can be obtained from (8.51) and (8.50). After some algebra, w =

Pvg−1

1 = 4



∂(ω) ∂(ωμ) |E|2 + |H |2 ∂ω ∂ω

 (8.52)

where the relationship between the electric and magnetic field amplitudes, as specified in (8.48), has been worked into this expression to make it symmetric with respect to both fields. This result for dispersive media is well established in the physics literature (L. D. Landau and E. M. Lifshitz [LLP84], L. Brillouin [Bri60]) and is notably different from the classical formula for static fields wstatic =

 1   |E|2 + μ |H |2 2

The difference between the numerical factors in the “static” and “dynamic” expressions for the energy density is natural, as the additional 1/2 in (8.52) reflects the usual “effective value” of sinusoidally oscillating quantities. More interesting is the dependence of energy in a dispersive medium on the ω-derivatives of  and μ. The physical nature of these additional terms is explained by Brillouin [Bri60, pp. 88–89]: The energy . . . at the time when E passes through zero is quite different from the zero energy that the dielectric has after being isolated from an electric field for a long time. In order to explain the fact that the permittivity  of the dielectric is different from that of the vacuum, 0 , one must admit that the medium contains mobile charges, electrons or ions in motion or electric dipoles capable of orientation; then, one takes as the zero energy of the system the condition that all the charged particles are at rest in their equilibrium positions. …all the charged particles may pass by their equilibrium positions at the time t = 0 when the field vanishes, but they pass them with nonzero velocity. [The additional term] represents the kinetic energy of all the charged particles contained in the dielectric.

8.4 Analysis of Periodic Structures in 1D Much research in nano-photonics is related to electromagnetic wave propagation in periodic structures with a characteristic size comparable with, but smaller than, the wavelength. The mathematical side of the analysis is centered on differential equations with periodic coefficients. We therefore start with a summary of the relevant mathematical theory for ordinary differential equations. Subsequent sections deal with generalizations to two and three dimensions. In media with one-dimensional periodicity along the x-axis, the source-free onecomponent field satisfies equations (8.140) or (8.143), which are particular cases of Hill’s equation (8.53) dx (P(x) dx u) + Q(x)u = 0

8.4 Analysis of Periodic Structures in 1D

437

Here, u is the single Cartesian component of either the electric or magnetic field; dx denotes the x-derivative. P(x), Q(x) are known functions (in applications, piecewise-smooth and possibly complex-valued), periodic in x with a period a: P(x + a) = P(x),

Q(x + a) = Q(x), ∀x ∈ R

(8.54)

For theoretical analysis of Hill’s equation, it is convenient to rewrite this secondorder equation as a system of two first-order ones with a vector of unknowns (u, v)T , where v ≡ P(x) dx u: dx u = P −1 (x) v dx v = −Q(x) u

(8.55) (8.56)

or in matrix–vector form   u dx w = Aw, w ≡ , v

 A ≡

0 P −1 (x) −Q(x) 0

 (8.57)

Under quite general assumptions on the smoothness of P(x), Q(x), two linearly independent solutions ψ1,2 (x) of this system exist; each of them is a column vector of height two, real or complex depending on the nature of the coefficients P and Q. It is helpful to combine these linearly independent solutions into a 2 × 2 matrix (x) with columns ψ1 (x) and ψ2 (x). Clearly, this matrix itself satisfies the differential equation (8.57), i.e. (8.58) dx (x) = A(x) because this equation holds true columnwise. Matrix (x) is called the fundamental matrix of the system. ˜ Any other solution ψ(x) can be expressed as a linear combination of ψ1 (x), ψ2 (x): ˜ ψ(x) = (x) c

(8.59)

˜ of where c is some constant column vector in C2 . Consequently, any solution (x) matrix equation (8.58) is linearly related to the fundamental matrix (x): ˜ (x) = (x) C˜

(8.60)

where C˜ is some time-independent 2 × 2 matrix. Let us now take into account the periodicity of the coefficients. It is clear that translation of any solution by the spatial period a is also a solution. In particular, (x + a) is a solution. As such, it must be linearly related to the fundamental matrix by (8.60), i.e. (x + a) = (x) C (8.61)

438

8 Applications in Nano-Photonics

where matrix C is a particular instance of the generic matrix C˜ in (8.60). Setting x = 0 in (8.61) yields (8.62) C = (0)−1 (a) Further, let us be more specific and choose ψ1 (x) and ψ2 (x) as a pair of functions ψˆ 1 (x) and ψˆ 2 (x) corresponding to the initial conditions ˆ (0) = I

(8.63)

Such functions are known as normalized solutions. In that case, (8.62) simplifies just ˆ to C = (a), and (8.61) becomes ˆ + a) = (x) ˆ ˆ (x (a)

(8.64)

ˆ ˆ Thus, (a) can be interpreted as a transfer matrix that translates the matrix (x) ˆ + a). In the theory of ordinary of normalized solutions over one period to (x ˆ differential equations with periodic coefficients (Floquet theory), (a) is called the monodromy matrix. At first glance, since the coefficients of the underlying equation are periodic, one may want to look for two linearly independent solutions that would also be periodic with period a. This quickly turns out to be a false trail. In fact, even a single periodic solution in general does not exist. A simple example is the equation y  − y = 0: It has only non-periodic exponential solutions even though its coefficients are constant and hence trivially periodic with any desired period. Rather than making a priori assumptions about the possible behavior of solutions to (8.57) or (8.58), one can derive this behavior from the translation relation (8.64). Indeed, applying this relation recursively, we get m ˆ + ma) = (x) ˆ ˆ (x (a) , m = 0, ±1, ±2, . . .

(8.65)

It is clear from this recursion that the behavior of the normalized solutions ψˆ 1 (x), ˆ hence any other solutions which have to be ψˆ 2 (x) embodied in matrix —and linear combinations of ψˆ 1 (x), ψˆ 2 (x)—depends on the properties of the monodromy m ˆ ˆ is governed by matrix (a). More specifically, the qualitative behavior of (a) ˆ its eigenvalues if this matrix is diagonalizable, or by its Jordan form if (a) is defective5 : ˆ (a) = S J S −1 (8.66) ˆ is not defective, then J is a diagwhere matrix S defines a similarity transform. If  onal matrix of its eigenvalues, and the columns of S are the respective eigenvectors:

5 Recall

that a square matrix is called defective if it does not have a complete basis of eigenvectors. Defective matrices are not diagonalizable but can be converted to Jordan form (8.68) by a similarity transform (8.66).

8.4 Analysis of Periodic Structures in 1D

439

J = diag(λ1 , λ2 )

(8.67)

ˆ But if (a) is defective, J is a Jordan matrix: J = Then, (8.64) becomes

or equivalently

  λ 1 0 λ

(8.68)

ˆ + a) = (x) ˆ (x S J S −1 ˜ + a) = (x) ˜ (x J;

˜ ≡ ˆ S 

(8.69)

(8.70)

Recursion (8.65) becomes ˜ + ma) = (x) ˜ (x J m , m = 0, ±1, ±2, . . .

(8.71)

To make this result more transparent, let us start with the generic case of a diagonal matrix J = diag(λ1 , λ2 ). It is convenient, and customary, to introduce a new parameter q, called the Bloch (or Bloch–Floquet) wavenumber, such that λ1,2 = exp(iqa)

(8.72)

ˆ = This substitution is legitimate because, as we shall see (Sect. 8.4), λ1 λ2 = det (a) 1. Note, though, that q = log λ/(ia) is multivalued, defined modulo 2π/a. We shall return to this issue soon. For now, let us assume that the value of q closest to zero (i.e. with the minimum absolute value) is chosen. Under these conditions, according to (8.70), the following relations hold: ψˆ1 (x + a) = exp(iqa) ψˆ1 (x); ψˆ2 (x + a) = exp(−iqa) ψˆ2 (x)

(8.73)

Functions satisfying these conditions are known as Bloch waves.6 When q is purely real, and consequently λ1,2 in (8.72) are complex conjugates on the unit circle, the complex exponential in (8.73) is a pure phase factor; qa is the phase shift over one lattice cell. The Bloch wave in this case can then be interpreted as a traveling wave. Such waves play a central role in the analysis of periodic structures. Note that in general the wavelength 2π/q corresponding to the exp(iq x) factor is different from the spatial period a and is only loosely related to it. If, however, Im q = 0, each solution ψˆ1 (x) exhibits an exponential decay in one of the ±x directions and an exponential growth in the other, by analogy with the textbook case of evanescent waves (which are particular cases of Bloch waves in a homogeneous medium). 6 Felix Bloch (1905–1983), famous Swiss-American physicist, https://en.wikipedia.org/wiki/Felix_

Bloch.

440

8 Applications in Nano-Photonics

In physics books, analysis of electromagnetic fields in infinite periodic structures is limited to Bloch waves (8.73). However, the case of defective monodromy matrices also deserves attention and is qualitatively different. The relevant theory and illustrative examples appear in Sect. 8.4. But first, let us consider the simplest possible case, which will help us to flesh out the abstract algebraic relations above. Example 31 The case of a homogeneous medium is trivial but helps to illustrate the concepts and results. Hill’s equations then are dx u = P −1 v

(8.74)

dx v = −Q u

(8.75)

where P and Q are now just numbers (possibly complex). In matrix–vector form, these equations can be written as dx w = Aw, w ≡

  u , v

 A ≡

0 P −1 −Q 0

 (8.76)

The relevance of these equations to 1D analysis of electromagnetic waves can easily be seen if u, v are identified with (suitably normalized) E and H fields, and P, Q with the material parameters. For now, we keep the equations in their slightly more abstract mathematical form, which makes them applicable not only in electromagnetics but, e.g., in acoustics equally well. Direct algebra shows that the fundamental solutions of (8.76), organized as ˆ matrix, are columns of the    cos q x P −1 q −1 sin q x exam ˆ (x) = (8.77) , q = P −1 Q  −Pq sin q x cos q x [The branch of the square root does not matter, since all matrix entries are even functions of q.] ˆ exam (x) = 1; as we shall soon see, this is not accidental. Direct Notably, det  ˆ exam (a) yields computation of the eigenvalues and eigenvectors of  = exp(±iqa) λexam 1,2  S ˜ = ˆS = 

exam



=

−i i Pq Pq

−i exp(iq x) Pq exp(iq x)

(8.78)



i exp(−iq x) Pq exp(−iq x)

(8.79)  (8.80)

The columns of this matrix are the Bloch waves; their respective wavenumbers are of course ±q, in accordance with (8.78).

8.4 Analysis of Periodic Structures in 1D

441

ˆ Let us now confirm that one special feature of (x) observed in the above example is true for Hill’s equations in general. Namely, ˆ det (x) = 1, ∀x ∈ R

(8.81)

This can be derived from the Abel–Liouville–Jacobi–Ostrogradskii identity for the Wronskian; see, e.g., E. Hairer et al. [HrW93], W. Walter [Wal98b]:  det W (x) = det W (0) exp

x

Tr A(ξ) dξ

(8.82)

0

This identity is valid for any linear system dx w = A(x)w; the columns of matrix W (x) form a set of linearly independent solutions of this system; as a reminder, the determinant of W is called the Wronskian.7 For Hill’s equation, matrix A is defined in (8.57) and has a zero diagonal; hence, Tr A = 0 and the Abel–Liouville–Jacobi–Ostrogradskii identity yields det W (x) = det W (0) ˆ In particular, if (x) is the fundamental matrix of normalized solutions, then (8.81) ˆ follows immediately, since (0) = I by definition. It also follows that, for Hill’s equation, (8.83) λ 1 λ2 = 1 ˆ ˆ The characteristic polynomial for (a) is where λ1,2 are the eigenvalues of (a). ˆ λ+1 = 0 λ2 − Tr (a)

(8.84)

If the coefficients of the differential system, i.e. functions P(x) and Q(x), are real, ˆ then matrix (a) is real as well, and the eigenvalues λ1,2 can be either real and reciprocal or, alternatively, complex conjugate and lying on the unit circle. The characteristic equation has solutions λ1,2

 

1 2 ˆ ˆ Tr (a) ± Tr (a) − 4 = 2

(8.85)

ˆ and hence the type of λ will depend on whether |Tr (a)| is greater or less than two, ˆ |Tr (a)| = 2 being the borderline case. 7 Josef

Hoëné de Wronski (1778–1853) proposed theories of everything in the universe based on properties of numbers, designed caterpillar-like vehicles intended to replace railroad transportation, tried to square the circle and attempted to build both a perpetual motion machine and a device to predict the future. He also studied infinite series whose coefficients are the determinants now known as the Wronskians. http://en.wikipedia.org/wiki/Josef_Wronski; http://www.angelfire.com/scifi2/ rsolecki/jozef_maria_hoene_wronski.html.

442

8 Applications in Nano-Photonics

ˆ If |Tr (a)| > 2, the eigenvalues are real and the corresponding Bloch parameter q of (8.72) is purely imaginary. Equation (8.65) then shows a trend of exponential increase of the solution for x → ∞ or x → −∞ (depending on the sign of λ). If the differential equation describes the field behavior in an infinite medium, such exponentially growing solutions are deemed non-physical. ˆ In contrast, for |Tr (a)| < 2 the eigenvalues are complex conjugate and lie on the unit circle. As mentioned previously, this physically corresponds to Bloch waves, which are central in the electromagnetic analysis of periodic structures not only in 1D, but also in 2D and 3D (see subsequent sections). ˆ For the borderline case |Tr (a)| = 2, with its subcases, the eigenproblem is analyzed in detail by M. S. P. Eastham [Eas73] but is hardly ever considered in the physics literature. The presentation below is different from, but ultimately equivalent to, Eastham’s analysis. ˆ are both In this special case, it follows immediately from (8.85) that λ1,2 ((a)) equal to +1 or −1. This gives four options for the canonical matrix J in (8.71):  J1−4 = I2 , − I2 ,

  1 1 −1 , 0 1 0

 1 −1

(8.86)

where I2 is the 2 × 2 identity matrix. Four versions of matrix J m , which according ˆ at x and x + a for any x, then are to (8.71) relates the solution values     1 m (−1)m (−1)m−1 m , m = 0, ±1, ±2, . . . , 0 (−1)m 0 1 (8.87) The qualitative behavior of the Bloch waves in each of these cases is clear: For J1 = I2 , the waves are periodic with the period a, and for J2 = −I2 —antiperiodic (and periodic with the period 2a); for J3,4 , there will be a linear trend in the norm of the solutions as functions of m. We now consider three examples of second-order equations with periodic coefficients. The first two (Examples 32, 33) illustrate special cases where fields in a periodic medium do not behave as classical Bloch waves. The third one (Example 34, Sect. 8.4) is key to understanding multilayered optical structures and photonic crystals, as discussed in Sects. 8.6, 8.9. 

m = I2 , (−1)m I2 , J1−4

Example 32 Equation u  (x) + exp(iκx) u(x) = 0;

κ = 2πa −1

(8.88)

is an interesting illustrative case. Let us first assume that solution u(x) has a valid Fourier transform U (k) at least in the sense of distributions (a discrete spectrum is viewed as a particular case of a continuous spectrum—a set of Dirac delta functions at some frequencies; see Appendix 6.15, Sect. 6.5). Since multiplication by exp(iκx) amounts simply to a spatial frequency shift in the Fourier domain, and the second derivative translates

8.4 Analysis of Periodic Structures in 1D

443

into multiplication by −k 2 , Eq. (8.88) in Fourier space is − k 2 U (k) + U (k − κ) = 0

(8.89)

Viewing this as a recursion relation U (k − κ) = k 2 U (k)

(8.90)

one observes that the sequence of values U (k − κ, U (k − 2κ), U (k − 3κ), . . . , will generally be unbounded, with rapidly growing magnitudes. There is only one exception: This backward recursion gets terminated if k = nκ for some positive integer n. Then, U (−κ) = U (−2κ) = . . . = 0 due to (8.90). In this exceptional case, the spectrum is discrete, with some values Un at spatial frequencies kn = nκ (n = 0,1, . . .). Normalizing U0 to unity and reversing recursion (8.90) to get Un (8.91) Un+1 = 2 κ (n + 1)2 one obtains Un =

1 (n! κn )2

(8.92)

Hence, one solution is expressed via the Fourier series u(x) =

∞ n=0

1 exp(inκx) (n! κn )2

(8.93)

Due to the presence of factorials in the denominators in (8.93), the Fourier series and its derivatives are uniformly convergent, so it is legal to differentiate the series and verify that its sum satisfies the original equation (8.88). This Fourier series solution is obviously periodic with the period a. What about a second linearly independent solution? From the Fourier analysis above, it is clear that the second solution cannot have a valid Fourier transform, and (8.87) suggests that it will tend to grow linearly. The following numerical results for a = 1 (κ = 2π) illustrate the behavior of the solutions. The fundamental system ψ1,2 was computed by high-order Runge– Kutta methods (see Sect. 2.4.1) for Eq. (8.88). MATLAB function ode45 was used, with the relative and absolute tolerances of 10−10 . For ψ1 , the initial conditions are ψ1 (0) = 1, dt ψ1 (0) = 0; for ψ2 , ψ2 (0) = 0 and dt ψ2 (0) = 1. The real and imaginary parts of these functions are plotted in Figs. 8.1, 8.2 for reference. ˆ The governing matrix (a) comprising the values of these solutions at x = a = 1, is, with six digits of accuracy, ˆ (a) ≈



1 − 0.165288i 1.051632 0.0259787 1 + 0.165288i

 (8.94)

444

8 Applications in Nano-Photonics

Fig. 8.1 Real (left) and imaginary (right) parts of the first fundamental solution for the example equation (8.88)

Fig. 8.2 Real (left) and imaginary (right) parts of the second fundamental solution for the example equation (8.88)

ˆ Matrix (a) has a double eigenvalue of one, which numerically also holds with six digits of accuracy. The Fourier series solution (8.93) of the original equation (8.88) is a linear combination of ψ1,2 with the coefficients 1.025491 and 0.161179i. One way of finding ˆ this coefficient vector is to solve the linear system with matrix (a) and the righthand side vector containing the values of the Fourier series solution and its derivative at x = a = 1. This right-hand side is (1.025491, 0.161179i)T —not coincidentally, identical with the coefficient vector above, as both of them are nothing other than ˆ the eigenvector of (a) corresponding to the unit eigenvalue. The above example goes to show that, contrary to what standard physics texts may lead us to believe, solutions on periodic lattices do not always reduce to Bloch waves (8.73) with a real or complex parameter q. There are borderline cases where a linearly growing solution exists. It may be tempting to attribute this behavior to the fact that the differential operator in the left-hand side of (8.88) is non-Hermitian. However, the following example shows that this guess is not correct; see also “Further reading” on Sect. 8.5. Example 33 Consider an equation similar to the one in the previous example, but with real coefficients:

8.4 Analysis of Periodic Structures in 1D

445

Fig. 8.3 Two canonical normalized solutions ψ1,2 obtained with MATLAB’s ode45 for the Mathieu equation (8.95). The slight wobble in ψ2 is a numerical artifact

u  (x) + (α − 2β cos 2x) u(x) = 0, α, β ∈ R

(8.95)

The coefficients of this equation are obviously π-periodic.8 Solutions are known as Mathieu functions, the theory of which is extensive and detailed: N. W. McLachlan [McL64], J. Meixner et al. [MSW80], F. M. Arscott [Ars13], M. M. Bibby and A. F. Peterson [BP13]. For our purposes here, only a small piece of that theory is needed. Namely, for any real β there exist infinitely many values of α, called characteristic numbers, for which the monodromy matrix of (8.95) is defective. For numerical illustration, we pick the smallest characteristic number, given for small β by the following asymptotic expansion (M. Abramowitz and I. A. Stegun (AS) [AS83, 20.2.25, p. 724], NIST DLMF [NIS, 28.6.1]): 1 7 4 29 6 68687 8 β − β + β + ... α AS = − β 2 + 2 128 2304 18874368

(8.96)

I chose β = 0.1, for which the smallest characteristic number is9 α ≈ −0.004994544, and both eigenvalues of the monodromy matrix differ from unity by ∼10−6 numerically. As in the previous example, the two normalized solutions were obtained with MATLAB’s ode45, used with the relative and absolute tolerances of 10−10 (Fig. 8.3). As in the previous example, it is evident that, while one of the solutions is periodic, the other one grows linearly and is not a Bloch wave with complex exponential scaling across a lattice cell. Additional remarks can be found in Sect. 8.5. Example 34 We now turn to a case that is directly applicable to 1D-periodic multilayered structures in photonics. Consider a layered structure with alternating electromagnetic material parameters 1 , μ1 and 2 , μ2 (Fig. 8.4). Let us focus on normal rather than α and q rather than β; this had to be changed to avoid conflict with the lattice cell size a and the Bloch wavenumber q in this chapter. 9 With a numerical adjustment on the order of 10−7 relative to the (8.96) series. 8 The standard notation in this equation is a

446

8 Applications in Nano-Photonics

Fig. 8.4 An electromagnetic wave traveling through a multilayered 1D structure with normal incidence

incidence (direction of propagation k perpendicular to the slabs); oblique incidence does not create any principal difficulties. As theory prescribes, we first find the fundamental solutions and compute the ˆ transfer matrix (a). However, since the coefficients of the underlying differential equation are now discontinuous, the equation should be treated in weak form or, equivalently, the proper boundary conditions at the material interfaces should be imposed: −1   E 1 (d1 ) = E 2 (d1 ); μ−1 1 E 1 (d1 ) = μ2 E 2 (d1 ) at the interface x = d1

(8.97)

where the origin (x = 0) is assumed to be at the left edge of the layer of thickness d1 . Similar conditions hold at x = d1 + d2 and all other interfaces. The general solution of the differential equation within layer 1 is E 1 (x) = E 0 cos(k1 x) + k1−1 E 0 sin(k1 x), k1 = ω(μ1 1 ) 2

1

(8.98)

where the prime denotes x-derivatives and the coefficients E 0 and E 0 are equal to the values of E 1 and its derivative, respectively, at x = 0. The √ choice of the square root branch is not critical here; for definiteness, we can take 1 = 1. We shall now “propagate” this solution through layers 1 and 2, with the final goal of obtaining the transfer matrix once the solution is evaluated over the whole period a = d1 + d2 . First, we “follow” the solution to the interface between the layers, where it becomes (8.99) E 1 (d1 ) = E 0 cos(k1 d1 ) + k1−1 E 0 sin(k1 d1 ) and its derivative, one the side of layer 1, is

8.4 Analysis of Periodic Structures in 1D

447

E 1 (d1 ) = −k1 E 0 sin(k1 d1 ) + E 0 cos(k1 d1 )

(8.100)

Due to the interface boundary condition, the electric field and its derivative at x = d1 in the second layer are E 2 (d1 ) = E 1 (d1 ) = E 0 cos(k1 d1 ) + k1−1 E 0 sin(k1 d1 ) E 2 (d1 ) =

μ2 μ2  E 0 sin(k1 d1 ) + E 0 cos(k1 d1 ) E 1 (d1 ) = − k1 μ1 μ1

(8.101) (8.102)

Repeating this calculation for the second layer, with the “starting” values of the field and its derivative defined by (8.101), (8.102), one obtains the general solution just beyond the second layer at x = (d1 + d2 )+0 . (Subscript “+0” indicates the limiting value from the right.) The first fundamental solution is obtained by setting E 0 = 1, E 0 = 0 and the ˆ 1 + d2 ) has these second one by setting E 0 = 0, E 0 = 1. The transfer matrix (d two solutions as its columns and is calculated to be ˆ 11 (d1 + d2 )+0 = cos(k1 d1 ) cos(k2 d2 ) − 

sin(k1 d1 ) cos(k2 d2 ) μ2 cos(k1 d1 ) sin(k2 d2 ) + k1 μ1 k2

ˆ 12 (d1 + d2 )+0 =  ˆ 21 (d1 + d2 )+0 = − 

k1 μ2 sin(k1 d1 ) sin(k2 d2 ) k2 μ1

(8.103)

(8.104)

μ1 k2 cos(k1 d1 ) sin(k2 d2 ) − k1 sin(k1 d1 ) cos(k2 d2 ) (8.105) μ2

ˆ 22 (d1 + d2 )+0 = − 

μ1 k2 sin(k1 d1 ) sin(k2 d2 ) + cos(k1 d1 ) cos(k2 d2 ) (8.106) μ2 k1

We already know that the nature of Bloch-periodic solutions depends on the trace of ˆ 1 + d2 ): (d ˆ 1 + d2 ) = 2 cos(k1 d1 ) cos(k2 d2 ) − Tr (d



k1 μ2 k2 μ1 + k2 μ1 k1 μ2

 sin(k1 d1 ) sin(k2 d2 ) (8.107)

This expression is well known in optics—see, e.g., J. Li et al. [LZCS03], I. V. Shadrivov et al. [SSK05], P. Yeh [Yeh05]. In the literature, one may find alternative, but ultimately equivalent, ways of deriving (8.107). Numerical illustration. In the periodic structure of Fig. 8.4, assume that the widths of the layers are equal and normalized to unity, d1 = d2 = 1; materials are non-magnetic (relative permeabilities μ1 = μ2 = 1); the relative dielectric constants are chosen as 1 = 1, 2 = 5. For any given frequency ω, we can then calculate the trace of the transfer matrix by (8.107), with k1,2 = ωc−1 (μ1,2 1,2 )1/2 . This trace is plotted in Fig. 8.5. (The speed

448

8 Applications in Nano-Photonics

Fig. 8.5 Trace of the transfer matrix as a function of frequency. Periodic structure with d1 = d2 = 1; μ1 = μ2 ; 1 = 1, 2 = 5. Shaded areas indicate photonic bandgaps

Fig. 8.6 Bandgap structure: frequency versus Bloch wavenumber. Periodic structure with d1 = d2 = 1; μ1 = μ2 = 1; 1 = 1, 2 = 5

of light in free space is for simplicity normalized to one by a suitable choice of units.) As we have seen earlier in this section, propagating waves cannot exist in the infinite structure if the absolute value of the matrix trace exceeds two; the corresponding frequency gaps are shaded in Fig. 8.5. It is the relationship between this wavenumber and frequency that characterizes the bandgap structure. The wavenumber is typically displayed on the horizontal axis and frequency on the vertical one (Fig. 8.6).

8.5 Remarks and Further Reading on Bloch Waves

449

8.5 Remarks and Further Reading on Bloch Waves In this reference section, let us start with a brief summary of Floquet theory in one dimension (M. S. P. Eastham [Eas73], W. Magnus and S. Winkler [MW79], A. Cabada et al. [CACLS17]). Here is a slightly shortened excerpt from [Eas73, Sect. 1.1], with very minor changes of notation.10 We begin with the general second-order equation a0 (x) y  (x) + a1 (x) y  (x) + a2 (x) y(x) = 0

(8.108)

in which the coefficients ar (x) are complex-valued, piecewise-continuous, and periodic, all with the same period a. Thus ar (x + a) = ar (x) (0 ≤ r ≤ 2) where a is a non-zero real constant. It is also assumed that the left- and right-hand limits of a0 (x) at every point are non-zero, so that the usual theory of linear differential equations without singular points applies. ... The first remark we make about (8.108) is that the equation is unchanged when x is replaced by x + a. It follows that, if ψ(x) is a solution of (8.108), then so also is ψ(x + a). Generally, however, ψ(x + a) is not the same as ψ(x) and, in fact, (8.108) need not have any nontrivial solution with period a. For example, the solutions of y  (x) − y(x) = 0 are linear combinations of e x and e−x , and no non-trivial one has any real period a. Nevertheless, we prove below that (8.108) does have the property that there are a non-zero constant ρ and a non-trivial solution ψ(x) such that ψ(x + a) = ρψ(x)

(8.109)

This property leads to the existence of two linearly independent solutions of (8.108) having the important special form given in Theorem 9 below. These results and their proofs are known as the Floquet theory after G. Floquet (1883).

Theorem 9 (Floquet, Theorem 1.1.2 in [Eas73]) There are linearly independent solutions ψ1,2 (x) of (8.108) such that either (i) ψ1 (x) = exp(m 1 x) p1 (x), ψ2 (x) = exp(m 2 x) p2 (x), where m 1 and m 2 are constants, not necessarily distinct, and p1 (x) and p2 (x) are periodic with period a; or (ii) ψ1 (x) = exp(mx) p1 (x), ψ2 (x) = exp(mx) {x p1 (x) + p2 (x)}, where m is a constant and p1 (x) and p2 (x) are periodic with period a. As the reader will easily see, this piece of theory is consistent with our analysis in Sect. 8.4, subject only to minor notational differences. Note again a subtle but important point: as Eastham states, there is a non-trivial solution satisfying the Bloch–Floquet scaling condition (8.109). Although usually two such linearly independent solutions exist, there are exceptional cases—(ii) in Theorem 9—when the second solution grows linearly with x. Obviously, in multiple dimensions the behavior of various solutions may be even more involved. 10 Also,

the numbering of equations has been made consistent with that of this chapter.

450

8 Applications in Nano-Photonics

Since the exceptional case (ii) does not normally occur in practice, the physics literature focuses, understandably, on ordinary Bloch waves satisfying (8.109). Typically, the “Bloch theorem” is referred to but not rigorously formulated; only an expression similar to (8.109) is given. As an example, in their well-known textbook, N. W. Ashcroft and N. D. Mermin write [AM76, Chap. 8]11 : [Bloch’s] Theorem. The eigenstates ψ of the one-electron Hamiltonian H = −2 ∇ 2 /2m + U (r), where U (r + R) = U (r) for all R in a Bravais lattice, can be chosen ... [as] ψnk = eik·r u nk (r)

(8.110)

u nk (r + R) = u nk (r)

(8.111)

where for all R in a Bravais lattice.

In a footnote, Ashcroft and Mermin identify the 1D version of this statement as Floquet’s theorem. From the mathematical point of view, however, Floquet’s theorem is richer than just (8.110), as we have seen. One elegant attempt at proving the Bloch theorem (however formulated) is particularly popular in the physics literature and can be condensed to three steps: 1. Maxwell’s differential operators commute with the lattice shift (translation) operator T : ψ(x) → ψ(x + a) in 1D, with a straightforward generalization to multiple dimensions. 2. Commuting Hermitian operators possess a simultaneous set of eigenfunctions. 3. The eigenfunctions of the translation operator T have the form of Bloch waves. Hence, the eigenfunctions of Maxwell’s differential operators must also have that form.   However, as Examples 32, 33 show, there must be holes in this line of reasoning. The differential operators in these examples do commute with the translation operator, yet lack a complete basis set of Bloch waves. The problematic part of the analysis is item 2 above. It is formulated vaguely and is, generally speaking, not correct. Mathematicians do recognize that; for example, B. C. Hall [Hal13, Sect. 6.1] writes12 : Simple examples ... show that a self-adjoint operator may not have any eigenvectors. Consider, for example, H = L 2 ([0, 1]) and an operator A on H defined by (Aψ)(x) = xψ(x)

(8.112)

Then A satisfies ϕ, Aψ for all ϕ, ψ ∈ L 2 ([0, 1]), and yet A has no eigenvectors. ... only the zero element of L 2 ([0, 1]) satisfies Aψ = λψ. Now, a physicist would say that the operator A in (8.112) does have eigenvectors, namely the distributions δ(x − λ). ... These distributions indeed satisfy xδ(x − λ) = λδ(x − λ), but they do not belong to the Hilbert space L 2 ([0, 1]). Such “eigenvectors,” which belong transition from quantum mechanics to photonics, one replaces U with  and ignores the coefficient before the Laplace operator. 12 Minor notational changes made. 11 To

8.5 Remarks and Further Reading on Bloch Waves

451

to some larger space than H , are known as generalized eigenvectors. Even though these generalized eigenvectors are not actually in the Hilbert space, we may hope that there is some sense in which they form something like a orthonormal basis. ... Let us mention in passing that our simple expectation of a true orthonormal basis of eigenvectors is realized for compact self-adjoint operators ... The operators of interest in quantum mechanics, however, are not compact.

When applied to electromagnetic analysis, B. C. Hall’s statements are even stronger than for quantum mechanics, since, in addition to not being compact, the differential operators may be non-Hermitian (e.g. for lossy media). Note, incidentally, that even if a linear operator has no eigenfunctions, it still commutes with the identity operator, which has the whole Hilbert space of eigenfunctions. This makes it clear that the simultaneous eigenstate theorem, as formulated in the physics texts,13 cannot be true without additional conditions. Part of the problem is that true statements about matrices and linear operators in finite-dimensional spaces cannot be automatically translated to infinite-dimensional Hilbert spaces. But even for commuting matrices, the simultaneous eigenbase theorem is true only under certain conditions. For reference, here are two linear algebra results. Theorem 10 (K. M. Hoffman and R. Kunze [HK71, Chap. 6]) Let F be a commuting family of diagonalizable linear operators on the finite-dimensional vector space V . There exists an ordered basis for V such that every operator in F is represented in that basis by a diagonal matrix. Theorem 11 (F. R. Gantmacher [Gan90, Chap. IX, §10, Lemma 1]) Permutable operators A and B (AB = B A) always have a common characteristic vector. In Theorem 10, note the stipulation of diagonalizability. In Theorem 11, note the difference: “a common vector” rather than a complete set of common vectors. What does one make of all that? Physicists may be right to ignore the technicalities and assume that the statement about the common eigenstates of commuting operators is unconditionally true for Hermitian ones in quantum mechanics and even for nonHermitian ones in electromagnetism. Exceptional cases are indeed unlikely to have practical significance. To a mathematician, however, the underlying theory is quite intricate (M. Reed and B. Simon [RS78, Chap. XIII], G. Teschl [Tes14, Sect. 4.2]). Floquet theory in 1D is much more clear-cut but still non-trivial; a few monographs have already been cited (M. S. P. Eastham [Eas73], W. Magnus and S. Winkler [MW79], W. Walter [Wal98b]). A functional analytic treatment of problems with periodic coefficients in 1D and multiple dimensions is presented by P. Kuchment in [Kuc93, Sect. 4.3], where further references are given. Note, however, that Kuchment’s book is accessible only to expert mathematicians.

13 For example, N. W. Ashcroft and N. D. Mermin refer to it as a “fundamental theorem of quantum mechanics” [AM76, Chap. 8]. For the proof, they cite D. Park [Par64], who in turn cites H. A. Kramers [Kra57, p. 139] and E. Merzbacher [Mer70].

452

8 Applications in Nano-Photonics

For physical theories and applications related to electromagnetic waves in periodic structures, the standard references are the books by P. Yeh [Yeh05], K. Sakoda [Sak05], S. G. Johnson and J. D. Joannopoulos [JJ01].

8.6 Band Structure by Fourier Analysis (Plane Wave Expansion) in 1D The fundamental matrix that played a central role in Sect. 8.4 is more important for theoretical analysis than for practical computation, as it contains analytical solutions that may be complicated or unavailable. In particular, the approach cannot be extended to two and three dimensions, where infinitely many independent solutions exist and usually cannot be obtained analytically. Fourier analysis (plane wave expansion, PWE) is a common practical alternative for analyzing and computing the band structure in any number of dimensions. The 1D case is considered in this section, and 2D–3D computation is taken up later in this chapter. For simplicity of exposition, let us assume a lossless non-magnetic periodic medium, where the electric field E = E y (x) is governed by the wave equation E  (x) + k02 (x)E(x) = 0; k0 =

ω c

(8.113)

This equation is written in the Gaussian system, mainly to avoid the nuisance factors 0 and μ0 .14 (x) is assumed to be a piecewise-smooth function with a period a. We are looking for a solution in the form of the Bloch–Floquet wave ˜ E(x) = E(x) exp(iq x)

(8.114)

˜ ˜ where E(x) is an a-periodic function and q is the Bloch wavenumber. Both E(x) and q are a priori unknown and need to be determined. It is helpful, and straightforward, to rewrite (8.113) in terms of the periodic factor ˜ ˜ ˜ exp(iq x), which E(x). Differentiation yields [ E(x) exp(iq x)] = [ E˜  (x) + iq E(x)] can be represented by the symbolic substitution ∇ → ∇ + iq; in 1D, ∇ should be understood simply as the x-derivative. After the second differentiation, (8.113) becomes ˜ = 0 (8.115) E˜  (x) + 2iq E˜  (x) + (k02 (x) − q 2 ) E(x) ˜ As a periodic function, E(x) can be represented by its Fourier series with some coefficients em (m = 0, ±1, ±2, . . .)

14 In

the SI system, the prefactor in the second term of (8.113) is ω 2 μ0 instead of k02 .

8.6 Band Structure by Fourier Analysis (Plane Wave Expansion) in 1D

˜ E(x) =



em exp(imκ0 x), κ0 =

m=−∞

453

2π a

(8.116)

Similarly, (x) can be expressed as a Fourier series with coefficients m : ∞

(x) =

m exp(imκ0 x)

(8.117)

m=−∞

The Fourier coefficients em are given by the standard integral expressions em = a

−1



˜ E(x) exp(−imκ0 x) d x

(8.118)

a

and similarly for m , where the integration is over any period of length a. Now we are in a position to Fourier-transform the modified wave equation (8.115). ˜ In Fourier space, multiplication (x) E(x) (i.e. multiplication of the Fourier series (8.116) and (8.117)) turns into convolution. The first derivative E˜  (x) turns into multiplication of each term by imκ0 , and the second derivative—into multiplication by (imκ0 )2 = −(κ0 m)2 . Equation (8.115) in Fourier space becomes K2 e = k02 e

(8.119)

Here, e = (. . . , e−2 , e−1 , e0 , e1 , e2 , . . .)T is the (infinite) column vector of Fourier coefficients of the field; K is an infinite diagonal matrix with the entries km = q + κ0 m; that is, (8.120) K = q I + κ0 N , where I is the identity matrix and ⎛ ... ⎜. . . ⎜ ⎜. . . ⎜ N = ⎜ ⎜. . . ⎜. . . ⎜ ⎝. . . ...

... −2 ... ... ... ... ...

... ... −1 ... ... ... ...

... ... ... 0 ... ... ...

... ... ... ... 1 ... ...

... ... ... ... ... 2 ...

⎞ ... . . .⎟ ⎟ . . .⎟ ⎟ . . .⎟ ⎟ . . .⎟ ⎟ . . .⎠ ...

(8.121)

Matrix  in (8.119) represents the Fourier-space convolution  ∗ E˜ and is composed of the Fourier coefficients of : ml = m−l (8.122) for any row m and column l (−∞ < m, l < ∞).

454

8 Applications in Nano-Photonics

The infinite-dimensional eigenproblem (8.119) must in practice be truncated to a finite number of harmonics. The computational trade-off is clear: As the number of harmonics grows, accuracy tends to increase, but so does the computational cost. Example 35 Volume grating. This problem is mentioned in L. I. Mandelshtam’s paper [Man45] and will be of even greater interest to us in the context of backward waves and negative refraction (Sect. 8.15). Consider a volume grating characterized by a sinusoidally changing permittivity of the form (x) = 1 + 2 cos(2πx/a), with some parameters 1 > 2 > 0. As a numerical example, let 1 = 2, 2 = 1, a = 1, so that the permittivity and its Fourier decomposition are (x) = 2 + cos 2πx = 2 +

1 1 exp(i2πx) + exp(−i2πx) 2 2

Thus,  has only three nonzero Fourier coefficients: ±1 = 1/2, 0 = 2. (The SI permittivity of free space is not used in this example, so there should be no confusion with the Fourier coefficient 0 .) The diagonal matrix K2 has entries Km2 = (q + 2πm)2 , m = 0, ±1, ±2, . . . and matrix  is tridiagonal, with the entries in the mth row equal to m,m = 0 = 2; m±1,m = ±1 =

1 2

For any given value of the Bloch parameter q, numerical solution can be obtained by truncating the infinite system to the algebraic eigenvalue problem with 2M + 1 equations (m = −M, −M + 1, . . . M − 1, M). The first four dispersion bands k0 a(qa) are shown in Fig. 8.7; there are two frequency bandgaps in the figure, approximately [1.98, 2.55] and [4.40, 4.68] for the values of k0 a, and infinitely many more gaps beyond the range of the chart. (Obviously, one could equally well use the frequency ω = k0 c as a parameter instead of k0 .) The numerical results are plotted for 41 equally spaced values of the normalized Bloch number qa/π in [−1, 1]. There is no appreciable difference between the numerical results for M = 5 (11 equations) and M = 20 (41 equations). The high accuracy of the eigenfrequencies for a small number of plane waves in the expansion is due to the smooth variation of the permittivity. Discontinuities in  would require a higher number of harmonics (Sect. 8.10.3). In addition to the eigenvalues k02 = ω 2 /c2 of (8.119), the eigenvectors e are also of interest. As an example, let us set qa = π/10. Stem plots of the four eigenvectors corresponding to the four smallest eigenvalues (k0 a)2 ≈ 0.049, 18.29, 23.12 and 77.83 are shown in Fig. 8.8. The first Bloch wave in Fig. 8.8(a) is almost a plane wave; the amplitudes of all harmonics other than e0 are very small (but not zero, as it might appear from the figure); for example, e−1 ≈ 0.00057, e1 ≈ 0.00069.

8.6 Band Structure by Fourier Analysis (Plane Wave Expansion) in 1D

455

Fig. 8.7 Bandgap structure for the volume grating with (x) = 2 + cos 2πx. Solid line—M = 5 (2 × 5 + 1 = 11 plane waves); circles—M = 20 (2 × 20 + 1 = 41 plane waves)

Fig. 8.8 Amplitudes of the plane wave components of the first four Bloch waves a–d for the volume grating with (x) = 2 + cos 2πx. Solution with 41 plane waves. qa = π/10

456

8 Applications in Nano-Photonics

It is interesting to note that dispersion curves with positive and negative slopes ∂ω/∂q (i.e. positive and negative group velocity) alternate in the diagram. Group velocity is positive for the lowest-frequency curve ω1 (q), negative for ω2 (q), positive again for ω3 (q), etc. This interesting issue will be further discussed in the context of backward waves and negative refraction (Sect. 8.15.4.1).

8.7 Characteristics of Bloch Waves 8.7.1 Fourier Harmonics of Bloch Waves For analysis and physical interpretation of the properties of Bloch waves—in particular, energy flow and the meaning of phase velocity— it is convenient to view these waves as a suite of Fourier harmonics. Unfortunately, some authors call these individual harmonics “Bloch waves”; we shall avoid this confusion. It is important to note from the outset, as B. Lombardet et al. do in [LDFH05], that the individual plane wave components of the electromagnetic Bloch wave do not satisfy Maxwell’s equations in the periodic medium and therefore do not represent physical fields. Only taken together do these Fourier harmonics form a valid electromagnetic field. The ideas are most easily explained in the 1D case but will be extended to 2D and 3D in subsequent sections. Consider one more time the Bloch wave ˜ E(x) = E(x) exp(iq x)

(8.123)

As before, the tilde field E˜ indicates a spatially periodic function with a given period a. Here again is its Fourier series expansion (8.116) for easy reference: ˜ E(x) =

∞ m=−∞

em exp(imκ0 x), κ0 =

2π a

(8.124)

This decomposition has a clear physical interpretation as a superposition of plane waves: E(x) =



E m (x),

E m (x) ≡ em exp(ikm x), km ≡ q + mκ0

(8.125)

m=−∞

To simplify the algebraic expressions, let us, as before, assume intrinsically nonmagnetic materials. At optical frequencies, this is the case anyhow (L. D. Landau and E. M. Lifshitz [LLP84], §60).15 Then, the above expression for E(x) leads, via the Maxwell ∇ × E equation, to a similar decomposition of the magnetic field 15 Magnetic

response of metamaterials, discussed in detail in Chap. 9, is a separate “emergent” phenomenon, not directly related to the intrinsic permeability of the constituents.

8.7 Characteristics of Bloch Waves

H ≡ Hz : H (x) =

457

∞ 1 ∂E 1 (ikm ) em exp(ikm x) = ik0 μ ∂x ik0 μ m=−∞

=

∞ km em exp(ikm x) k μ m=−∞ 0

(8.126)

8.7.2 Fourier Harmonics and the Poynting Vector Consider now the Fourier decomposition of the time-averaged Poynting vector (power flow) P = Re{E × H∗ }/2. In the 1D case, this vector has only one component P = Px . In the system, 1 Re{E(x)H ∗ (x)} (SI) 2

(8.127)

c 1 Re{E(x)H ∗ (x)} (Gaussian) 4π 2

(8.128)

P(x) = P(x) =

In Fourier space, the product E H ∗ turns into convolution-like summation. The expression simplifies for lossless materials ( real) because then the Poynting vector must be constant; hence, pointwise values P(x) are obviously equal to the spatial average P . This average value over one period of the structure is easy to find due to the orthogonality of Bloch harmonics ψm = exp(ikm x) (km = q + mκ0 ):  (ψm , ψl ) ≡ a

ψm ψl∗ d x =

 exp(ikm x) exp(−ikl x) d x a

 =

exp[i(m − l)κ0 x] d x = 0, l = m a

The last equality represents orthogonality of the standard Fourier harmonics over one period. The Bloch harmonics have the same property because the exp(iq x) factor in one term of the integrand is canceled by the exp(−iq x) factor in the other, complex conjugate, term. (This is true for lossless media when the Bloch wavenumber q is purely real.) Parseval’s theorem then allows us to rewrite the Poynting vector of the Bloch wave (8.127), (8.128), in the lossless case, as the sum of the Poynting vectors of the individual plane waves:

458

8 Applications in Nano-Photonics ∞

km c km |em |2 (SI), |em |2 Gaussian, m = 0, ±1, . . . 2ωμ 4π 2k μ 0 m=−∞ (8.129) In 2D and 3D, an analogous identity holds true for the time–space averaged Poynting vector (B. Lombardet et al. [LDFH05])—again, due to the orthogonality of the Fourier harmonics. In 1D, the Poynting vector is constant and hence the spatial averaging is redundant. P =

Pm ; Pm =

8.7.3 Bloch Waves and Group Velocity For the same reason as in homogeneous media (Sect. 8.3.3), one may anticipate a connection between the Poynting vector, group and energy velocities of Bloch waves. The Poynting vector and group velocity are associated with energy flow and signal (information) transfer, respectively. One can define group velocity in essentially the same way as for waves in homogeneous media: ∂ω (8.130) vg = ∂q q being the Bloch wavenumber. Recall that q generates a whole “comb” of wavenumbers q + mκ0 , where m is an arbitrary integer and κ0 = 2π/a. Since any two numbers in the comb differ by a constant independent of q, differentiation in (8.130) can in fact be performed with respect to any of the comb values q + mκ0 . Loosely speaking, the group velocities of all plane wave components of the Bloch wave are the same. (“Loosely”—because these components do not exist separately as valid physical waves in the periodic medium, and therefore their group velocities are mathematical but arguably not physical quantities.) To see that this definition of group velocity bears more than superficial similarity to the same notion for homogeneous media, we need to demonstrate that vg in (8.130) is in fact related to signal velocity. To this end, let us follow the analysis in Sect. 8.3.2. We shall again consider, as a characteristic case, a pointwise source that produces amplitude modulation with a low-frequency waveform E(0, t) at x = 0 (8.39): E(x = 0, t) = E(x = 0, t) exp(−iω0 t)

(8.131)

In a homogeneous medium, each frequency component of this source gives rise to a plane wave, which leads to expression (8.41), Sect. 8.3.2 for the field at an arbitrary location x > 0. In the periodic medium, plane waves are replaced with Bloch waves, so that in lieu of (8.41) one has  E(x, t) =

∞ −∞

ˆ ω − ω0 ) E(x, ˜ E(0, ω) exp[iq(ω)x] exp(−iωt) dω

(8.132)

8.7 Characteristics of Bloch Waves

459

˜ where E(x, ω) is the space-periodic factor in the Bloch wave normalized for convenience to unity at x = 0. Of the two possible Bloch waves, Eq. (8.132) contains the one with the Poynting vector (energy flow) in the +x direction. The respective low-frequency “signal” E(x, t) is  E(x, t) = E(x, t) exp(iω0 t) = where as before



−∞

ˆ = 0, ω  ) E(x, ˜ E(x ω  ) exp[iq(ω  )x] dω  (8.133)

ω  ≡ ω − ω0

The velocity of this signal can again be found by setting the differential dE(x, t) to zero. This velocity is the ratio of partial differentials of E(x, t) with respect to t and x. For homogeneous media, these partial derivatives are given by expressions (8.43) and (8.44) on Sect. 8.3.2. For Bloch waves, due to the dependence of E˜ on x, the x-derivative acquires an additional (and unwanted) term 

 ˜ ˆ ω  ) ∂ E(x, ω ) exp[iq(ω  )x] dω  E(0, ∂x −∞ ∞

This field contains rapidly oscillating spatial components: ∞ ˜ ∂ E(x, ω ) = iκ0 em m exp(imκ0 x) ∂x m=−∞

(8.134)

A useful “macroscale” signal can be defined in a natural way as the average of this field over the lattice cell. For the mth spatial harmonic, this average is, as seen from (8.134) a −1

 iκ0 em m exp(imκ0 x) exp(iq x) d x = em κ0 a

exp(iqa) − 1 2πm + qa

This term is small under the additional constraint qa  1—that is, if the Bloch wavelength 2π/q is much greater than the lattice size a. In that case, the analysis on Sect. 8.3.2 remains essentially unchanged and leads to the familiar expression for group velocity (8.130). Other reservations discussed on Sect. 8.3.3 in connection with signal velocity (8.45) must also be borne in mind.

8.8 Two-Dimensional Problems of Wave Propagation Time-harmonic Maxwell’s equations simplify significantly if the fields do not depend on one of the Cartesian coordinates—say, on z—and if there is no coupling in the material parameters between that coordinate and the other two (i.e. x z = 0, etc.).

460

8 Applications in Nano-Photonics

Upon writing out field equations (8.22) and (8.24) (Sect. 8.3, Gaussian system) in Cartesian coordinates, one observes that they break up into two decoupled systems. The first system involves E z , Hx and Hy and for isotropic materials with scalar  = (x, y), μ = μ(x, y) has the form ∂ y E z = ik0 μHx ∂x E z = −ik0 μHy ∂x Hy − ∂ y Hx = −ik0 E z

(8.135) (8.136) (8.137)

In the SI system, one has ω instead of k0 in the right-hand side of the above equations; also, under the exp(+iωt) phasor convention, these right-hand sides would have the opposite signs. It is well known that the magnetic field can be eliminated from this set of equations, with the Helmholtz equation resulting for E z . Indeed, multiplying the first two equations by μ−1 and differentiating, we get ∂ y (μ−1 ∂ y E z ) = ik0 ∂ y Hx ∂x (μ−1 ∂x E z ) = −ik0 ∂x Hy

(8.138) (8.139)

The difference of these two equations, with (8.137) in mind, leads to ∇ · (μ−1 ∇ E z ) + k02 E z = 0

(8.140)

[In the SI system, ω appears in this equation instead of k0 .] In the special but important case of constant μ, this becomes ∇ 2 E z + k 2 (r)E z = 0, μ = const

(8.141)

Thankfully, at this point the Gaussian and SI forms of Maxwell’s equations, as well as the exp(±iωt) phasor conventions, are unified, except that k 2 = k02 μ (Gaussian); k 2 = ω 2 μ (SI)

(8.142)

and in the SI system one uses the “absolute” rather than “relative” values of the permittivity and permeability, with 0 , μ0 embedded in them. The complementary equation for the triple Hz , E x and E y in the Gaussian system is, quite analogously, (8.143) ∇ · (−1 ∇ Hz ) + k02 μHz = 0 which for constant  simplifies to ∇ 2 Hz + k 2 Hz = 0,  = const

(8.144)

8.8 Two-Dimensional Problems of Wave Propagation Table 8.1 Definitions of the TE mode differ One-component E, two-component H T. Fujisawa and M. Koshiba [FK04], I. V. Shadrivov et al. [SSK05], A. Ishimaru et al. [ITJ05], H. H. Sheinfux et al. [HSKP+14], V. Popov et al. [PLN16]

461

One-component H , two-component E G. Shvets and Y.A. Urzhumov [SU04], S. G. Johnson & J. D. Joannopoulos [JJ01, p. 179], S. Yamada et al. [YWK+02], R. Meisels et al. [MGKH06]

Table 8.2 A few representative quotes on TE/TM modes One-component E, two-component H One-component H , two-component E “Transmission of s- (TE-) polarized light through the metal-dielectric structure...” [PLN16] “TE-polarized ... waves ... have the component of the electric field parallel to the layers (E = E y )” [SSK05] [The primary variables are] “electric field ex for TE modes and magnetic field h x for TM modes, respectively.” [FK04]

“...consider a TE-polarized electromagnetic wave, with nonvanishing Hz , E x , and E y components” [SU04] “the magnetic field along z (TE fields) or the electric field along z (TM fields)” [JJ01, p. 179] “... The TM mode in which the electric field is parallel to the axis of the holes, and the TE mode in which it is perpendicular.” [YWK+02] “The polarization of the incident wave is TM in the first band and TE in the second band (E parallel and perpendicular to the rods, respectively)” [MGKH06]

The two decoupled solutions (E z , Hx , Hy ) and (Hz , E x , E y ) are called TE and TM modes, respectively or rather, TM and TE modes, respectively. There is a regrettable ambiguity in the terminology used by different engineering and research communities. The “T” in “TE” and “TM” stands for “transverse,” meaning, according to the dictionary definition, “in a crosswise direction, at right angles to the long axis.” So, the electric field in a TE mode and the magnetic field in a TM mode are transverse... to what? In waveguide applications, they are transverse to the longitudinal axis of the guide; a TM mode in the guide thus lacks the Hz component of the magnetic field and is described by Eq. (8.141) for the E field.16 However, for 2D-periodic structures in photonics applications (photonic crystals), the same Eq. (8.141) describes the electric field that is “transverse” to the cross section of the crystal and therefore some authors call it a TE mode. Others refer to the same field as a TM mode by analogy with waveguides. Tables 8.1, 8.2 illustrate the terminological differences. In optics, the waves with only one component of the electric field (perpendicular to the plane of incidence) are referred to as s-waves (or s-polarized waves); waves with a single H -component are p-waves. 16 In

waveguides, even though some field components may be zero, the fields in general depend on all three coordinates, and hence the Laplacian operator in field equations should be interpreted as ∇ 2 = ∂x2 + ∂ y2 + ∂z2 . If the field does not depend on z, as in many 2D problems in photonics, the z-derivative in the Laplacian disappears.

462

8 Applications in Nano-Photonics

From the computational (as well as analytical) perspective, fields with only one Cartesian component are of particular interest, as equations for these fields are scalar and thus much easier to deal with than the more general vector equations. With this in mind, in the remainder of this chapter I shall simply call waves with one component of the electric field E-waves (or E-modes); H-waves have an analogous definition. It is hoped that the reader will find this nomenclature straightforward and unambiguous.

8.9 Photonic Bandgap in Two Dimensions In 2D and especially in 3D periodic structures, the bandgap phenomenon is much richer, and more difficult to analyze, than in 1D (Sect. 8.4). The Bloch wavenumber, scalar in 1D, becomes a wave vector in 2D and 3D, as the Bloch wave can travel in different directions. Moreover, electromagnetic wave propagation in general depends on polarization—i.e. on the direction of the E vector in the wave; this adds one more degree of freedom to the analysis. For each direction of propagation and for each polarization, there may exist a forbidden frequency range—a bandgap—where the corresponding Bloch wavenumber q is imaginary and hence no propagating modes exist. If these bandgaps happen to overlap for all directions of propagation and for both polarizations, so that no Bloch waves can travel in any direction, a complete bandgap is said to exist. Let us consider a photonic crystal example that is general enough to contain many essential features of the two-dimensional problem. A square cell of the crystal, of size a × a, contains a dielectric rod with a radius rrod and the relative dielectric permittivity rod (Fig. 8.9). The medium outside the rod has permittivity out . All media are non-magnetic. The crystal lattice is obtained by periodically replicating the cell infinitely many times in both coordinate directions.

Fig. 8.9 A typical example of a square cell of a photonic crystal lattice. The (infinite) crystal is an array of dielectric rods obtained by periodic replication of the cell in both coordinate directions

8.9 Photonic Bandgap in Two Dimensions

463

Fig. 8.10 First Brillouin zone for the square photonic crystal lattice

In the Fourier space of Bloch vectors q, the corresponding “master” cell—called the first Brillouin zone17 —is [−π/a, π/a] × [−π/a, π/a] (Fig. 8.10). This zone can also be periodically replicated infinitely many times in both qx and q y directions to produce a reciprocal (i.e. Fourier space) lattice. However, all possible Bloch waves E˜ exp(iq · r) are already accounted for in the first Brillouin zone. Indeed, adding 2π/a to, say, qx introduces just a periodic factor exp(i2πx/a), with period a, which ˜ can as well be “absorbed” into the periodic Bloch component E(x, y). Note that this picture is conceptually simple only for real q. When the Bloch vector is complex (which is always the case in the presence of losses), band diagrams “live” in a multidimensional (2D or 3D) complex space, which significantly complicates the analysis (V. A. Markel and IT [MT16]). Standard notation for some special points in the first Brillouin zone is shown in Fig. 8.10. The Γ point is q = 0; the X point is q = (π/a, 0); the M point is q = (π/a, π/a);  is a generic point on Γ X (i.e. with q y = 0);  is a generic point on Γ M. The problem can now be formulated as follows. First, the E-mode (one component of the electric field E = E z ) is described by Eq. (8.141), repeated here for easy reference: (8.145) ∇ 2 E + k 2 (r)E = 0, for μ = const where the E field is sought as a Bloch wave with a (yet undetermined) wave vector q: ˜ (8.146) E(r) = E(r) exp(iq · r); r ≡ (x, y), q = (qx , q y ) There are two general options: solving for the full E field of (8.145) or, alternatively, ˜ for the periodic factor E(x, y). In the first case, the governing equation is fairly simple (Helmholtz) but the boundary conditions are nonstandard due to the Bloch exponential exp(iq · r) (details below). In the second case, with E˜ as the unknown, 17 Léon

N. Brillouin, 1889–1969, an outstanding French and American physicist.

464

8 Applications in Nano-Photonics

standard periodic boundary conditions apply, but the differential operator is more complicated. More precisely, the problem for the full E field includes the Helmholtz equation (8.145) in the square [−a/2, a/2] × [−a/2, a/2] and the Bloch boundary condition E

a 2

,y



 a  a a = exp(iqx a) E − , y ; − ≤ y ≤ 2 2 2

 a  a a a = exp(−iq y a) E x, − ; − ≤ x ≤ E x, 2 2 2 2

(8.147)

(8.148)

In the alternative formulation, with E˜ as the main unknown, the Helmholtz equation takes on a different form because     ˜ ˜ ˜ ∇ E(r) exp(iq · r) = ∇ E(r) + iq E(r) exp(iq · r) (8.149) Formally, the ∇ operator acting on E is replaced with the ∇ + iq operator acting on ˜ Similarly, applying the divergence operator to (8.149), one obtains the Laplacian E. ˜ exp(iq · r) ∇ 2 E = [(∇ + iq) · (∇ + iq) E] ˜ exp(iq · r) = [∇ 2 E˜ + 2iq · ∇ E˜ − q2 E]

(8.150)

Note that if q is not real, then q2 ≡ q · q is a complex number not equal to the real |q|2 . The Bloch problem for E˜ thus becomes (after canceling the common complex exponential in all terms) ∇ 2 E˜ + 2iq · ∇ E˜ + (k02 μ(r) − q2 ) E˜ = 0

(8.151)

with the periodic boundary conditions  a  a  a a , y = E˜ − , y ; − ≤ y ≤ E˜ 2 2 2 2  a   a a a E˜ x, = E˜ x, − ; − ≤ x ≤ 2 2 2 2

(8.152) (8.153)

The dielectric permittivity in (8.151) is a function of position. In principle, the magnetic permeability may also depend on coordinates, but this is not the case in our present example or at optical frequencies in general. ˜ are unusual, as they Both eigenvalue problems (in terms of E or, alternatively, E) have three (and in the 3D case four) scalar eigenparameters: frequency ω = k0 c and the components qx , q y of the Bloch vector. Solving for all three parameters, and the respective eigenmodes, simultaneously is impractical. The usual approach is to fix the q vector and solve the resultant eigenvalue problem for ω only; then, repeat the

8.9 Photonic Bandgap in Two Dimensions

465

computation for a set of values of q.18 Of most interest are the values on the symmetry lines in the Brillouin zone (Fig. 8.10) Γ → X → M → Γ ; eigenfrequencies ω corresponding to these values are typically plotted in a single chart. For the lattice of cylindrical rods, this bandgap structure is computed below as an example. It is quite interesting to analyze the behavior of Bloch waves in the limiting case of a quasi-homogeneous material, when the lattice cell size tends to zero relative to the wavelength in a vacuum. This will be discussed in Sect. 8.15.6, in connection with backward waves and negative refraction in metamaterials. In addition to the two ways of formulating the photonic bandgap problem, there are several approaches to solving it. We shall consider two of them: Finite element analysis and plane wave expansion (i.e. Fourier transform).

8.10 Band Structure Computation: PWE, FEM and FLAME 8.10.1 Solution by Plane Wave Expansion As a periodic function of coordinates, factor E˜ (8.152), (8.153) can be expanded into a Fourier series with some (yet unknown) coefficients em , E˜ =



em exp(ikm · r), km =

m∈Z2

2π 2π m ≡ (m x , m y ) a a

(8.154)

with integers m x , m y . The full field E is obtained by multiplying E˜ with the Bloch exponential: E = E˜ exp(iq · r) =



em exp (i(km + q) · r)

(8.155)

m∈Z2

The dielectric permittivity (x, y) is also a periodic function of coordinates and can be expanded into a similar Fourier series. However, it is often advantageous to deal with the inverse of , γ = −1 . The reason is that, after multiplying the governing equation (8.145) through by γ, one arrives at an eigenvalue problem without any coordinate-dependent coefficients in the right-hand side: − γ(x, y) ∇ 2 E = k02 μE, (μ = const)

(8.156)

in Flexible Local Approximation Method (FLAME, Sect. 8.10.6) it is ω that acts as an “independent variable” because the basis functions in FLAME depend on it. The Bloch wave vector is computed as a function of frequency. Also, for lossy materials q is complex, and it makes sense to fix ω and solve for q.

18 However,

466

8 Applications in Nano-Photonics

[In the SI system, ω 2 appears in this equation instead of k02 .] This ultimately leads to a standard eigenvalue problem of the form Ax = λx rather than a more complicated generalized problem Ax = λBx. (See also Sect. 8.10.4 on FEM, where a generalized eigenproblem arises due to the presence of the FE “mass matrix.”) As before, E satisfies the scaled-periodic boundary conditions with the complex exponential Bloch factor. The downside of the multiplication by γ is that the operator in the left-hand side of the eigenvalue problem (8.156) is not self-adjoint. (The coordinate-dependent factor γ(x, y) outside the divergence operator gets in the way of the usual integrationby-parts argument for self-adjointness.) The original formulation,−∇ 2 E = k02 μ  (x, y) E, has self-adjoint operators on both sides if the medium is lossless (real ). The choice thus is between a Hermitian but generalized eigenvalue problem and a regular but non-Hermitian one. For the Bloch–Floquet E field (8.155), the negative of the Laplace operator turns, in the Fourier domain, into multiplication by |km + q|2 . Further, the product −γ∇ 2 E in the left-hand side of (8.156) turns into convolution; the mth Fourier harmonic of this product is Fm {−γ ∇ 2 E} =



|qm + q|2 γ(m ˜ − s) E˜ s ,

qm =

s∈Z2

2π m a

(8.157)

where γ˜ are the Fourier coefficients for γ = −1 : γ =



γ(m) ˜ exp(ikm · r),

(8.158)

m∈Z2

Putting together the left- and right-hand sides of equation (8.156) in the Fourier domain, we obtain an eigenvalue problem for the Fourier coefficients:

˜ ˜ |qm + q|2 γ(m ˜ − s) E(s) = k02 μ E(m);

(8.159)

s∈Z2

m = (m x , m y ); m x , m y = 0, ±1, ±2, . . . This is an infinite set of equations for the eigenfrequencies and eigenmodes. For computational purposes, the system needs to be truncated to a finite size; this size is an adjustable parameter in the computation. Numerical results for a cylindrical rod lattice are presented in Sects. 8.10.4 and 8.15.5.3).

8.10 Band Structure Computation: PWE, FEM and FLAME

467

8.10.2 The Role of Polarization To avoid repetition, we have so far considered E-polarization only, with the corresponding equation (8.156) for the one-component E field. The problem for H polarization is very similar: − ∇ · (γ(x, y)∇ H ) = k02 μH

(8.160)

but its algebraic properties are better. Namely, the operator in the left-hand side of (8.160), unlike the operator for the E-problem (8.156), is self-adjoint and nonnegative definite (which is easy to show using integration by parts and taking into account Remark 27 on boundary conditions, Sect. 8.10.4). This unequal status of the E- and H -problems is due to the assumption that all materials are non-magnetic. If this is not the case and μ depends on coordinates, the E- and H -problems are fully analogous.

8.10.3 Accuracy of the Fourier Expansion The main factor limiting the accuracy of the plane wave solution is the Fourier approximation of the dielectric permittivity (x, y) or, alternatively, its inverse γ(x, y). Abrupt changes in the dielectric constant lead in its Fourier representation to the ringing effect (the “Gibbs phenomenon,” well known in Fourier analysis). For illustration, let us use the cylindrical rod example (Fig. 8.9 on Sect. 8.9). The inverse dielectric constant in this case is  1 γrod , r ≤ rrod , r ≡ (x 2 + y 2 ) 2 , (x, y) ∈  (8.161) γ(x, y) = γout , r > rrod The Fourier coefficients γ(m) ˜ (i.e. the plane wave expansion coefficients) for this function of coordinates are found by integration:  γ(m) ˜ =



γ(r) exp(−ikm · r) d x d y

(8.162)

This integration can be carried out analytically by switching to the polar coordinate system and using the Bessel function expansion for the complex exponential; see, e.g., K. Sakoda [Sak05]. The end result is  γ(m) ˜ =

f γrod + (1 − f )γout , m = 0 2(γrod − γout )(km rrod )−1 f J1 (km rrod ), m = 0

(8.163)

Figure 8.11 shows a plot of γ(x, y) ≡ −1 (x, y) along the straight line x = y, i.e. at 45◦ to the axes of the computational cell. The following parameters are assumed: cell

468

8 Applications in Nano-Photonics

Fig. 8.11 An illustration of the Gibbs phenomenon for the Fourier series approximation of the inverse permittivity of a cylindrical rod in a square lattice cell. Cell size a = 1 in each direction; rod = 9; rrod = 0.38. Top: 20 Fourier harmonics retained per coordinate direction; bottom: 50 harmonics

size a = 1 in each direction; rod = 9; rrod = 0.38. The true plot of γ is of course √a rectangular pulse that changes abruptly from γrod = 1/9 to γout = 1 at x = rrod / 2 ≈ 0.2687. Summation of a finite number of harmonics in the Fourier series produces typical ringing around the points of abrupt changes of the material parameter. When the number of Fourier harmonics retained in the series is increased, this ringing becomes less pronounced but does not fully disappear—compare the plots corresponding to 20 and 50 harmonics per component of the wave vector, Fig. 8.11. In practice, the number of plane waves in the expansion is limited by the computational cost of the procedure (see Appendix 8.17), which in turn limits the numerical accuracy of plane wave expansion. Because of that, in some cases the computational results initially reported in the literature had to be revised later. A. Moroz [Mor02] (p. 115109-3) gives one such example—the PBG of a diamond lattice of non-overlapping dielectric spheres in air.

8.10 Band Structure Computation: PWE, FEM and FLAME

469

Remark 26 An alternative approach used by Moroz is the Korringa–Kohn– Rostoker19 (KKR) method developed initially for the Schrödinger equation in the band theory of solids [KR54] and later adapted and adopted in photonics. KKR combines multipole expansions with transformations of lattice sums. This book deals with lattice sums for static cases only, in the context of Ewald methods (Chap. 5). The wave case is substantially more involved, and the interested reader is referred to Chap. 2 of [Yas06] (by L. C. Botten et al.), to the work of R. C. McPhedran et al. [MNB05] and references therein. To reduce the numerical errors associated with the Gibbs phenomenon in plane wave expansion, local homogenization (“pixel smoothing”) can be used to smooth out the dielectric permittivity at material interfaces; see R. D. Meade et al. [MRB+93] (with the erratum [MRB+97]). In particular, this approach is implemented in the MIT photonic band eigenmode solver, a public-domain software package developed by the research groups of S. G. Johnson and J. Joannopoulos [JJ01]. “Pixel smoothing” is discussed in a different context (the order of finite-difference schemes) in Sect. 7.12.2.

8.10.4 FEM for Photonic Bandgap Problems in 2D The finite element method (FEM, Chap. 3) can be applied to either of the two formulations: for the full E field (8.145), (8.147), (8.148) or for the spatial-periodic factor E˜ (8.151), (8.152), (8.152). In 2D, both routes are analogous, but we focus on the first one to highlight the treatment of the special Bloch boundary conditions. (In 3D, FE analysis is more involved; see Sect. 8.11.) The FE formulation starts with the definition of appropriate functional spaces (continuous and discrete) and with the weak form of the governing equations. This setup is needed not only as a mathematical technicality, but also for correct practical implementation of the algorithm—in particular, in the case under consideration, for the proper treatment of boundary conditions. A natural functional space B() ⊂ H 1 () (B for “Bloch”) in our 2D example is the subspace of Bloch-periodic functions in the Sobolev space H 1 (): B() = {E : E ∈ H 1 (); E satisfies boundary conditions (8.147), (8.148)} (8.164) The weak formulation of the problem is Find E ∈ B() :

    −1 μ ∇ E, ∇ E  = k02 E, E  , ∀E  ∈ B()

(8.165)

    ∇ E, ∇ E  = k02 μ E, E  , ∀E  ∈ B()

(8.166)

or, for μ = const, Find E ∈ B() : 19 Sometimes

incorrectly spelled as “Rostocker.”.

470

8 Applications in Nano-Photonics

Remark 27 The line integral (surface integral in 3D) that typically appears in the transition from the strong to the weak formulation and back (see Chap. 3) in this case vanishes:  ∂ E ∗ E dΓ = 0; ∀E, E  ∈ B() (8.167) Find E ∈ B() : Γ ∂n where Γ is the boundary of the computational cell  and n is the outward normal to this boundary. Indeed, the E field on the right edge of  has an additional Bloch factor b = exp(iqx a) as compared to the left edge; similarly, the complex conjugate of the test function E  has an additional factor b∗ . The integrals over the right and left edges then cancel out because bb∗ = 1 (real q is assumed) and the directions of the outward normals on these edges are opposite. The integrals over the lower and upper edges cancel out for the same reason. Next, assume that a finite element mesh (e.g. triangular or quadrilateral) has been generated. One special feature of the mesh is needed for the most natural implementation of the Bloch boundary conditions. The right and left edges of the computational domain  (a square in our example) need to be subdivided by the grid nodes in an identical fashion, so that the nodes on the right and left edges come in pairs with the same y-coordinate. A completely similar condition applies on the lower and upper edges.20 In each pair of boundary nodes, one node is designated as a “master” node (M) and the other one as a “slave” node (S).21 The Bloch boundary condition directly relates the field values at the slave nodes to the respective values at their master nodes: E(r S ) = exp (iq · (r S − r M )) E(r M )

(8.168)

where r S , r M are the position vectors of any given slave–master pair of nodes. Remark 28 For edge elements (see Chap. 3), one would consider pairs of master– slave edges rather than nodes. We can now move on to the discrete FE formulation. Let Ph () be one of the standard FE spaces of continuous piecewise-polynomial functions on the chosen mesh; see Chap. 3. The simplest such space is that of continuous piecewise-linear functions on a triangular grid. Any function E h ∈ Ph can be represented as a linear combination of standard nodal FE basis functions ψα (x, y) (e.g. piecewise-linear “hat” functions, Chap. 3): n E α ψα , α = 1, 2, . . . , n (8.169) Eh = α=1

20 For definiteness, let us attribute the corner nodes to the lower/upper edge pairs rather than to the left/right. 21 For each pair of nodes, this assignment of M-S labels is in principle arbitrary; however, for consistency it is convenient to treat all nodes on, say, the left and lower edges as “masters” and the nodes on the right and upper edges as the respective “slaves.”.

8.10 Band Structure Computation: PWE, FEM and FLAME

471

where n is (for nodal elements) the number of nodes of the mesh. The nodal values E α of the field can be combined in one Euclidean vector E ∈ Cn . The linear combination (8.169) establishes a one-to-one correspondence between each FE function E h and the respective vector of nodal values E. Bilinear forms in Ph × Ph and Cn × Cn are also related directly: (∇ E h , ∇ E h ) = (L E, E  ), ∀E h ∈ Ph ()

(8.170)

(E h , E h ) = (M E, E  ), ∀E h ∈ Ph ()

(8.171)

In the left-hand side of these two equations, the inner products are those of (L 2 ())2 and L 2 (), i.e.   ∗ ∗ (∇ E h , ∇ E h ) ≡ ∇ E h · ∇ E  h d; (E h , E h ) ≡ E h E  h d (8.172) 



In the right-hand sides, the inner products are in Cn : (E, E  ) =

n



Eα E α

(8.173)

α=1

Matrices L of (8.170) and M of (8.171) are, in the FE terminology, the “stiffness” matrix and the “mass” matrix, respectively (Chap. 3). Equations (8.170), (8.171) can be taken as definitions of these matrices. The entries of L and M can also be written out explicitly: (8.174) L αβ = (∇ψα , ∇ψβ ) 1 ≤ α, β ≤ n Mαβ = ( ψα , ψβ )

1 ≤ α, β ≤ n

(8.175)

where the inner products are again those of L 2 () and the ψs are the FE basis functions. To complete the FE formulation of the Bloch–Floquet problem, we need the subspace Bh ⊂ Ph of piecewise-polynomial functions that satisfy the Bloch boundary condition (8.168) for each pair of master–slave nodes. (Practical implementation will be discussed shortly.) The FE Galerkin formulation is nothing else but the weak form of the problem restricted to the FE space Bh :     ∇ E, ∇ E  = k02 μ E, E  , ∀E  ∈ Bh () (8.176) [In the SI system, ω = k0 c appears in (8.176) in the place of k0 .] In (8.176), it is tacitly assumed that  = (r) is real (lossless media). In the presence of losses, frequency dispersion, and/or for evanescent modes in a bandgap, it makes sense to set up a different eigenproblem, with ω (or k0 ) fixed and q as an eigenparameter to be found; see Sect. 8.10.7. Find E ∈ Bh (), k0 ∈ R :

472

8 Applications in Nano-Photonics

If there were no boundary constraints, the Galerkin problem (8.176) in matrix– vector form would be Find E ∈ Cn , k0 ∈ R :



   L E, E  = k02 μ M E, E  , ∀E  ∈ Cn

where L and M are the stiffness and mass matrices previously defined. However, the Bloch boundary conditions must be honored. To accomplish this algorithmically, let us separate out the slave nodes in the Euclidean vectors:  E =

 E nonS ; E nonS ∈ Cn−n S ; E S ∈ Cn S ES

(8.177)

where n S is the number of slave nodes. Vector E S includes the field values associated with slave nodes; vector E nonS is associated with “non-slaves,” i.e. the non-boundary nodes and the master nodes. Since the nodal values of slave nodes are completely defined by non-slaves, the full vector E can be obtained from its non-slave part by a linear operation: E = C E nonS

(8.178)

where C is a rectangular matrix  C =

I



CnonS→S

(8.179)

Each row of the matrix block CnonS→S corresponds to a slave node and contains exactly one nonzero entry, the complex exponential Bloch factor of (8.168), in the column corresponding to the respective master node. The problem now takes on the following Galerkin matrix–vector form: Find E nonS ∈ Cn−n S , k0 ∈ R : 

   LC E nonS , C E nonS = k02 μ MC E nonS , C E nonS , ∀E nonS ∈ Cn−n S

(8.180)

This immediately translates into the eigenvalue problem

where

L˜ E nonS = k02 μ M˜ E nonS

(8.181)

L˜ = C ∗ LC; M˜ = C ∗ MC

(8.182)

8.10 Band Structure Computation: PWE, FEM and FLAME

473

It is straightforward to show that both matrices L˜ and M˜ are Hermitian; L˜ is positive definite if the Bloch wavenumber q is real and nonzero; M˜ is always positive definite.22 In practice, there is no need to multiply matrices in the formal way of (8.182). Instead, the following procedure can be applied. Consider a stage of the matrix assembly process where an entry (i, j) of the stiffness or mass matrix is being formed. If i happens to be a slave node with its master M(i), the matrix entry gets multiplied by the Bloch exponential factor b(i, M(i)) (8.168) and attributed to row M(i) rather than row i. Likewise, if j is a slave node with the corresponding master node M( j), the matrix entry gets multiplied by b∗ ( j, M( j)) = exp(iq · (r j − r M( j) )) (note the complex conjugate) and the result gets attributed to column M( j) instead of column j. In this procedure, the rows and columns corresponding to slave nodes remain empty and in the end can be removed from the matrices. However, it may be algorithmically simpler not to change the dimension and structure of the matrices and simply fill the “slave” entries in the diagonals with some dummy numbers—say, ones for matrix M and some large number X for matrix L. This will produce extraneous modes “living” on the slave nodes only and corresponding to eigenvalues k02 μ = X . These modes can be easily recognized and filtered out in postprocessing. A disadvantage of FEM for the bandgap structure calculation is that it leads to a generalized eigenvalue problem, of the form L x = λM x rather than L x = λx. This increases the computational complexity of the solver. Note, however, that if the Cholesky decomposition23 of M (M = T T ∗ , where T is a lower triangular matrix) is not too expensive, the generalized problem can be reduced to a regular one by substitution y = T ∗ x: L x = λT T ∗ x ⇒ T −1 L T −∗ y = λy

(8.183)

If iterative eigensolvers are used, matrix inverses need not be computed directly; instead, systems of equations with upper or lower triangular matrices are solved to find T −1 L T −∗ y for an arbitrary vector y. However, in the numerical example below the matrices are of very moderate size and the MATLAB QZ algorithm (a direct solver for generalized eigenvalue problems) is employed.

 E nonS ) =  |∇ E h |2 d, ∀E h ∈ Bh . Since E h for q = 0 cannot be constant due to the Bloch boundary condition, this energy integral is strictly ˜ positive. Similar considerations apply to M. 23 André-Louis Cholesky (1875–1918), a French mathematician. It is customary to write the Cholesky decomposition as L L T or L L ∗ , but in our case symbol L is already taken, so T is used instead. 22 Indeed, by definition of the FE matrices, ( L ˜E

nonS ,

474

8 Applications in Nano-Photonics

Fig. 8.12 Two finite element meshes for one cell of a photonic crystal lattice and with cylindrical dielectric rods. The rod is shaded for visual clarity. Left: 404 nodes, 746 triangular elements. Right: 1553 nodes, 2984 triangular elements

8.10.5 A Numerical Example: Band Structure Using FEM The numerical data was chosen the same as in the computational example of K. Sakoda ([Sak05], pp. 28–29), where the bandgap structure was computed using Fourier analysis (plane wave expansion). Our finite element results can then be compared with those of [Sak05]. The general setup, with a cylindrical dielectric rod in a square lattice cell, is already shown in Fig. 8.9 (Sect. 8.9). The cell size is taken as a = 1, and the radius of the cylindrical rod is rrod = 0.38. The dielectric constant of the rod is rod = 9; the medium outside the rod is air, with out = 1. The FE mesh is generated by COMSOL Multiphysics™and exported to the MATLAB environment; an FE matrix assembly for the Bloch problem is then performed in MATLAB. As already noted, the MATLAB QZ solver is used. Postprocessing is again done in COMSOL™. The initial FE mesh is fairly coarse, with 404 nodes and 746 first-order triangular elements (Fig. 8.12, left). The main result of the FE simulation is the bandgap structure shown in Fig. 8.13 for the E-mode (s-polarization, one-component E field). The first four normalized eigenfrequencies ω˜ = ωa/(2πc) are plotted versus the normalized Bloch wavenumber qa/π over the M → Γ → X → M loop in the Brillouin zone. The chart in Fig. 8.13 is almost exactly the same as the one in [Sak05]. The bandgaps, where no (real) eigenfrequencies exist for any q, are shaded in the figure. The normalized frequency ranges for the first two gaps are, according to the FE calculation, [0.2462, 0.2688] and [0.4104, 0.4558]. To estimate the accuracy of this numerical result, the computation was repeated on a finer mesh, with 1553 nodes and 2984 first-order triangular elements (Fig. 8.12,

8.10 Band Structure Computation: PWE, FEM and FLAME

475

Fig. 8.13 Photonic band structure (plots correspond to the first four eigenfrequencies as a function of the wave vector) for a photonic crystal lattice; E-mode (one-component E field). Dielectric cylindrical rods in air; cell size a = 1, radius of the cylinder rrod = 0.38; the relative dielectric permittivity rod = 9

Fig. 8.14 E field distribution for the first (left) and the second (right) Bloch π modes for q = ( 2a , 0). Same setup and parameters as in Fig. 8.13

right).24 On the finer mesh, the first two bandgaps are calculated to be [0.2457, 0.2678] and [0.4081, 0.4527], which differs from the results on the coarser mesh by 0.2–0.7%. For comparison, the first two bandgap frequency ranges reported for the same problem by K. Sakoda [SS97, Sak05] are [0.247, 0.277] and [0.415, 0.466]. This result was obtained by Fourier analysis, with expansion into 441 plane waves; the estimated accuracy is about 1% according to Sakoda. The field distribution of lowest order Bloch modes is illustrated by Figs. 8.14 and π , 0) (a -point exactly in the 8.15. The first figure is for the Bloch vector q = ( 2a π π , 4a ). middle of Γ X ), and the second one is for point q = ( 2a This relatively simple comparison example of FEM versus Fourier expansion is not a basis for far-reaching conclusions. Both methods have their strengths and weaknesses. A clear advantage of FEM is its effective and accurate treatment of geometrically complex structures, possibly with high dielectric contrasts. Another advantage is the sparsity of the system matrices. Unfortunately, FEM leads to a gen-

24 In

modern FE analysis, much more elaborate hp-refinement procedures exist to estimate and improve the numerical accuracy. See Chap. 3.

476

8 Applications in Nano-Photonics

Fig. 8.15 E field distribution for the first (left) and the second (right) Bloch π π , 4a ). modes for q = ( 2a Same setup and parameters as in Fig. 8.13

eralized eigenvalue problem, with the FE “mass” matrix in the right-hand side.25 A special FE technique known as “mass lumping” makes the mass matrix diagonal, with applications to both eigenvalue and time-dependent problems. Mass lumping is usually achieved by applying, in the FE context, numerical quadratures with the integration knots chosen to coincide with element nodes. For details, see papers by M. G. Armentano and R. G. Durán [AD03]; A. Elmkies and P. Joly [EJ97a, EJ97b]; G. Cohen and P. Monk [CM98a]; and references there. In addition, as already noted, the generalized problem can be converted to a regular one by Cholesky decomposition.

8.10.6 Flexible Local Approximation Schemes for Waves in Photonic Crystals As an alternative to plane wave expansion and to finite element analysis, the Flexible Local Approximation Method (FLAME, Chap. 4) can be used for wave simulation in photonic crystal devices. FLAME incorporates accurate local approximations of the solution into a difference scheme. Applications of FLAME to photonic crystals are attractive because local analytical approximations for typical photonic crystal structures are indeed available and the corresponding FLAME basis functions can be worked out once and for all. In particular, for crystals with cylindrical rods the FLAME basis functions are obtained by matching, via the boundary conditions on the rod, cylindrical harmonics inside and outside the rod. These Bessel-based basis functions were already derived in Chap. 4 for the problem of electromagnetic scattering from a cylinder. In 3D, FLAME bases for electromagnetic fields near dielectric spheres could be constructed by matching the (vector) spherical harmonics inside and outside the sphere as in Mie theory (J. A. Stratton [Str41] or R. F. Harrington [Har01]).

25 The presence of the mass matrix is also a disadvantage in time-dependent problems, where this matrix is associated with the time derivative term and makes explicit time-stepping schemes difficult to apply..

8.10 Band Structure Computation: PWE, FEM and FLAME

477

When the dielectric structures are not cylindrical or spherical, the field can still be expanded into cylindrical/spherical harmonics, and the T- (“transition”) matrix provides the relevant relationships between the coefficients of incoming and outgoing waves. A comprehensive treatment of T-matrix methods and related electromagnetic theory can be found, for example, in the books and articles by M. I. Mishchenko and collaborators [MTM96, MTL02, MTL06], and T. Wriedt and collaborators [Wri99, DWE06]. Public domain codes are available from both groups [MT98, DWE].26 In contrast to methods that analytically combine multipole expansions and lattice sums (see Remark 26 on Sect. 8.10.3), the role of multipole expansions in FLAME is to generate a difference scheme. As an illustrative example, we consider a photonic crystal analyzed by T. Fujisawa and M. Koshiba [FK04, Web07]. The waveguide with a bend is obtained by eliminating a few dielectric cylindrical rods from a 2D array (Fig. 8.16). Fujisawa and Koshiba used a finite element–beam propagation method in the time domain to study fields in such a waveguide, with nonlinear characteristics of the rods. The use of complex geometrically conforming finite element meshes may well be justified in this 2D case. However, regular Cartesian grids have the obvious advantage of simplicity, especially with extensions to 3D in mind. This is illustrated by numerical experiments below. The problem is solved in the frequency domain, and the material characteristic of the cylindrical rods is assumed linear, with the index of refraction n = 3. The radius of the cylinders and the wavenumber are normalized to unity; the air gap between the neighboring rods is equal to their radius. The field distribution is shown in Fig. 8.16. For bandgap operation, the field is essentially confined to the guide, and the boundary conditions do not play a critical role. To get numerical approximation of these conditions out of the picture in this example, the field on the surface of the crystal was simply set equal to an externally applied plane wave. For comparison, FE simulations (COMSOL Multiphysics™) with three meshes were run: the initial mesh with 9702 nodes, 19,276 elements and 38,679 degrees of freedom (d.o.f.); a mesh obtained by global refinement of the initial one (38,679 nodes, 77,104 elements, 154,461 d.o.f.); and an adaptively refined mesh with 27,008 nodes, 53,589 elements, 107,604 d.o.f. The elements were second-order triangles in all cases. The agreement between FLAME and FEM results is excellent. This is evidenced, for example, by Fig. 8.17, where almost indistinguishable FEM and FLAME plots of the field distribution along the central line of the crystal are shown. Yet, a closer look at the central peak of the field distribution (Fig. 8.18) reveals that FLAME has essentially converged for the 50 × 50 grid, while FEM solutions approach the FLAME result as the FE mesh is refined. FEM needs well above 100,000 d.o.f. to achieve the level of accuracy comparable with the FLAME solution with 2500 d.o.f. [Tsu05a]. Figure 8.19 shows a visual comparison of FEM and Trefftz– FLAME meshes that provide the same accuracy level.

26 I thank Thomas Wriedt for helpful information and references.


Fig. 8.16 Real part of the electric field in the photonic crystal waveguide bend. The imaginary part looks qualitatively similar

Fig. 8.17 Field distribution in the Fujisawa–Koshiba photonic crystal along the central line y = 0. FLAME versus FE solutions. (Reprinted by permission from [Tsu05a] ©2005 IEEE.)

Note that for the 50 × 50 grid there are about 10.5 points per wavelength (ppw) in the air but only 3.5 ppw in the rods, and yet the FLAME results are very accurate because of the special approximation used. Any alternative method, such as FE or FD, that employs a generic (piecewise-polynomial) approximation would require a substantially higher number of ppw to achieve the same accuracy.

Remark 29 As described in more detail in Sect. 8.10.7, the FLAME computation of Bloch modes proceeds in a different manner than in the FE or plane wave methods.


Fig. 8.18 Convergence of the field near the center of the bend. Trefftz–FLAME has essentially converged for the 50 × 50 grid (2500 d.o.f.); FEM results approach the FLAME values as the FE mesh is refined. FEM needs well over 100,000 d.o.f. for accuracy comparable with FLAME. (Reprinted by permission from [Tsu05a] ©2005 IEEE.)

Fig. 8.19 The 50 × 50 FLAME grid (2500 d.o.f.) provides the same level of accuracy as the finite element mesh with 38,679 nodes, 77,104 elements and 154,461 d.o.f. (Reprinted by permission from [Tsu05a] ©2005 IEEE.)

FLAME schemes rely on local analytical solutions that can be evaluated numerically only for a given (known) frequency. Hence, ω becomes an independent variable in the simulation, and the Bloch wave vector (say, along any given symmetry line in the Brillouin zone) is a parameter to be determined from a generalized eigenvalue problem. FLAME eigenmode analysis has been performed by H. Pinheiro et al. [PWT07] in application to photonic crystal waveguides. The crystal is again formed by dielectric cylindrical rods. The waveguides “carved out” of the crystal lattice have


Fig. 8.20 (Credit: H. Pinheiro et al. Reprinted by permission from [PWT07] ©2007 IEEE.) Transmission and reflection coefficients of a directional coupler

ports that carry energy in and out of the device. What follows is a brief summary of the computational approach and results of [PWT07].

First, FLAME is used to compute waveguide modes whose energy is contained mostly within the guide. Toward this end, FLAME is applied to one layer of cylindrical rods, with the Bloch boundary condition imposed on two of its sides and the FLAME PML (perfectly matched layer) on the other two. This is a generalized eigenvalue problem that for moderate matrix sizes can be quickly solved using the QZ algorithm. There is normally no need to generate large matrices, as the convergence of FLAME is extremely rapid (see the following section).

Second, the boundary conditions for the field at the ports can be expressed via the dominant waveguide modes determined as described above. For the excited port(s), the excitation is assumed known; for other ports, zero Dirichlet conditions are used. FLAME is then applied again, this time for the whole crystal, with the proper boundary conditions at the ports and PML conditions on inactive surfaces.

The results of the first step of the analysis—computation of the propagation constant—show very good agreement with the plane wave expansion method when the FLAME grid has 6 × 6 nodes per lattice cell. Further, FLAME is applied to a 90◦ waveguide bend; the results obtained with 7744 degrees of freedom for FLAME agree well with those calculated by the FETD beam propagation method using 158,607 d.o.f. (M. Koshiba et al. [KTH00]). Equally favorable is the comparison of FLAME with FETD-BPM for photonic crystals with Y- and T-branches. For a T-branch, FLAME results with 25,536 d.o.f. are the same as FDTD results with 5,742,225 d.o.f. FLAME solutions exhibit very fast convergence as the grid is refined. As an example, Fig. 8.20 shows transmission and reflection coefficients of a directional coupler (H. Pinheiro et al. [PWT07]).


8.10.7 Band Structure Computation Using FLAME

As an alternative to plane wave expansion (Sect. 8.10.1) and FEM (Sect. 8.10), let us now consider FLAME for band structure calculation. The familiar case of a dielectric cylindrical rod of radius r_rod and dielectric permittivity ε_rod in a square lattice cell will again serve as a computational example. This section follows [Tv08] closely, and the exp(iωt) phasor convention is used for consistency with that paper. In particular, outgoing cylindrical waves are described, under this convention, by the Hankel function of the second kind.

In the vicinity of a cylindrical rod centered at the origin of a polar coordinate system (r, φ), the FLAME basis ψ_α^(i) contains Bessel/Hankel functions (see also Sects. 4.4.11, 8.10.6, 8.12.5):

ψ_α^(i) = a_n J_n(k_cyl r) exp(inφ),   r ≤ r_rod
ψ_α^(i) = [c_n J_n(k_air r) + H_n^(2)(k_air r)] exp(inφ),   r > r_rod

where J_n is the Bessel function, H_n^(2) is the Hankel function of the second kind [Har01], and the coefficients a_n, c_n are found by matching the values of ψ_α^(i) inside and outside the rod. The 9-point (3 × 3) stencil with a grid size h is used, and 1 ≤ α ≤ 8. The eight basis functions ψ are obtained by retaining the monopole harmonic (n = 0), two harmonics for each of the orders n = 1, 2, 3 (i.e. dipole, quadrupole and octupole) and one of the harmonics of order n = 4. This set of basis functions produces a 9-point scheme as the null vector of the respective matrix of nodal values (Sects. 4.4.11, 8.10.6, 8.12.5).

The Bloch wave satisfying the second-order differential equation calls for two boundary conditions—for the E field and for its derivative in the direction of wave propagation (or, equivalently, for the H field). Consequently, there are two discrete boundary conditions per Cartesian coordinate (compare this with a similar treatment in [PWT07] (Sect. 8.10.6) where, however, the algorithm is effectively one-dimensional). The implementation of these discrete conditions is illustrated by Fig. 8.21. As an example, the square lattice cell is covered with a 5 × 5 grid of “master” nodes (filled circles). In addition, there is a border layer of “slave” nodes (empty circles). The FLAME scheme is generated for each of the master nodes (“M”). At slave nodes (“S”), the field is constrained by the Bloch condition rather than by the difference scheme:

E(r_S) = exp(iq · (r_S − r_M)) E(r_M)   (8.184)

Here, r_S, r_M are the position vectors of any given slave–master pair of nodes. Several such pairs are indicated in Fig. 8.21 by arrows for illustration. Note that the corner nodes are the “slaves of slaves”: for example, master node M1 for slave S1 is itself a slave S2 of node M2. This is algebraically equivalent to linking node S1 to M2; however, if the link S1 → M2 were imposed directly rather than via


Fig. 8.21 Implementation of the Bloch–Floquet boundary conditions in FLAME. Empty circles—“slave” nodes, filled circles—“master” nodes. A few of the “slave–master” links are indicated with arrows. The corner nodes are the “slaves of slaves”

S1 → M1 → M2, the corresponding factor would be the product of two Bloch exponentials in the x and y direction, leading to a complicated eigenvalue problem, bilinear with respect to the two exponentials.

Example equations for the Bloch boundary conditions, in reference to Fig. 8.21, are

E_S1 = b_x E_M1;   b_y E_S3 = E_M3   (8.185)

where b_x and b_y are the Bloch factors

b_x = exp(iq_x L_x);   b_y = exp(iq_y L_y)   (8.186)

In matrix–vector form, the FLAME eigenvalue problem is

L E = (b_x B_x + b_y B_y) E   (8.187)

where E is the Euclidean vector of nodal values of the field. The rows of matrix L corresponding to the master nodes contain the coefficients of the FLAME scheme, and the respective rows of matrices B_x,y are zero. Each slave node row of matrices L and B contains only one nonzero entry—either 1 or b_x,y, as exemplified by (8.185). Matrices L and (especially) B are sparse; typical sparsity patterns, for a 10 × 10 grid, are shown in Fig. 8.22.

Problem (8.187) contains three key parameters: ω, on which the FLAME scheme and hence the L matrix depend (for brevity, this dependence is not explicitly indicated), and the Bloch exponentials b_x,y. Finding three or even two independent eigenparameters simultaneously is not feasible. Rather, one chooses a value of ω and constructs the corresponding difference operator L = L(ω). In principle, for any given value of either of the b parameters (say, b_x) one could solve for the other parameter and scan the (b_x, b_y)-plane that way. Typically, however, the focus is only on the symmetry lines Γ → X → M → Γ of the first Brillouin zone. On Γ X, b_y = 1 and


Fig. 8.22 Sparsity structure of the FLAME matrices for a 10 × 10 grid: L (left) and B = Bx + B y (right)

b_x is the only unknown; on X M, the only unknown is b_y; on MΓ, the single unknown is b = b_x = b_y.

For comparison purposes, the numerical data in the example was chosen to be the same as in the PWE computation of [Sak05], pp. 28–29. In the lattice of cylindrical rods, the size of the computational square cell is a = 1, and the radius of the cylindrical rod is r_rod = 0.38. The dielectric constant of the rod is ε_rod = 9; the medium outside the rod is air, with ε_out = 1. In our FLAME simulation, due to the very rapid convergence of the method, matrices L and B need only be of very moderate size, in which case the MATLAB QZ algorithm (a direct solver for generalized eigenvalue problems) is very efficient.

Figure 8.23 shows the same band diagram for the E-mode as Fig. 8.13, but the focus now is on the accuracy of FLAME and its comparison with other methods. Plotted in the figure are the first four normalized eigenfrequencies ω̃ = ωa/(2πc) (c being the speed of light in free space) versus the normalized Bloch wavenumber q̃ = qa/π over the M → Γ → X → M loop in the Brillouin zone. The bandgaps, where no (real) eigenfrequencies exist for any q_B, are shaded in the figure.

The excellent agreement between PWE, FEM and FLAME gives us full confidence in these results and allows us to proceed to a more detailed assessment of the numerical errors.27 The accuracy of FLAME is much higher than that of PWE or FEM, with negligible errors achieved already for a 10 × 10 grid. Indeed, inspecting the computed Bloch wavenumbers as the FLAME grid size decreases, we observe that 6–8 digits in the result stabilize once the grid exceeds 10 × 10, and 8–10 digits stabilize once the grid exceeds 20 × 20.

27 All numerical results were also checked for consistency on several meshes and for an increasing number of PWE terms.


Fig. 8.23 Photonic band structure (first four eigenfrequencies as a function of the wave vector) for a photonic crystal lattice; E-mode. FEM (circles), PWE (solid lines), FLAME, grid 5 × 5 (diamonds), FLAME, grid 20 × 20 (squares). Dielectric cylindrical rods in air; cell size a = 1, radius of the cylinder r_rod = 0.38; the relative dielectric permittivities ε_rod = 9, ε_out = 1

Fig. 8.24 Numerical errors in the Bloch wavenumber. Same parameters as in the previous figure. FLAME grids: 5 × 5 (diamonds), 8 × 8 (squares), 10 × 10 (triangles), 20 × 20 (circles). FEM, 404 d.o.f. (empty squares)

This clearly establishes the 40 × 40 results as an “overkill” solution that can be taken as quasi-exact for the purpose of error analysis. Errors in the Bloch wavenumber are plotted in Fig. 8.24. Very rapid convergence of FLAME with respect to the number of grid nodes is obvious from the figure. Further, the FLAME error for the Bloch number is about six orders of magnitude lower than the FEM error for approximately the same number of unknowns: 484 nodes (including “slaves”) in FLAME and 404 nodes in FEM.
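The master–slave machinery of (8.184)–(8.187) is easy to prototype. The Python sketch below deliberately uses a one-dimensional analog: a standard second-order difference scheme for the Helmholtz equation in a homogeneous cell (rather than a FLAME scheme), one slave node on each side of the cell, and the resulting generalized eigenvalue problem for the Bloch factor b = exp(iqa). All names and parameter values are illustrative; the recovered q should match the discrete dispersion relation of the scheme.

import numpy as np
from scipy.linalg import eigvals

a, N = 1.0, 40                 # cell size and number of master nodes
h = a / N
k = 2.0                        # given frequency parameter (wavenumber)
n_tot = N + 2                  # masters at x = 0..(N-1)h, slaves at x = -h and x = Nh
sl_left, sl_right = N, N + 1   # positions of the two slave unknowns in the vector
L = np.zeros((n_tot, n_tot))
B = np.zeros((n_tot, n_tot))

def col(j):                    # grid index -> position in the unknown vector
    if j == -1: return sl_left
    if j == N:  return sl_right
    return j

for j in range(N):             # 3-point Helmholtz scheme at every master node
    L[j, col(j-1)] += 1.0
    L[j, col(j)]   += (k*h)**2 - 2.0
    L[j, col(j+1)] += 1.0

# Bloch links, the 1D analog of (8.185): E(Nh) = b E(0) and b E(-h) = E((N-1)h)
L[sl_right, sl_right] = 1.0;  B[sl_right, 0] = 1.0
L[sl_left,  N - 1]    = 1.0;  B[sl_left, sl_left] = 1.0

w = eigvals(L, B)              # generalized eigenvalues b; most are infinite since B is singular
b = w[np.isfinite(w)]
b = b[np.abs(np.abs(b) - 1.0) < 1e-6]          # keep the propagating modes, |b| = 1
q = np.sort(np.abs(np.angle(b))) / a
q_exact = np.arccos(1.0 - (k*h)**2 / 2.0) / h  # dispersion of the 3-point scheme
print("recovered q:", q, "  discrete dispersion:", q_exact, "  k:", k)

In the actual 2D computation the master rows contain the FLAME coefficients rather than the generic 3-point scheme, and two Bloch factors b_x, b_y appear, as in (8.187).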


In the numerical example presented, FLAME provides 6–8 orders of magnitude higher accuracy in the photonic band diagram than PWE or FEM with the same number of degrees of freedom (∼400). This high accuracy was indispensable in a peculiar case of backward waves in plasmonic crystals [Tsu09]:

“In general the cell size has to be above a certain threshold for backward waves and negative index of refraction to be possible. However, the case of plasmonic resonances is an interesting exception, where the lattice cell bounds derived in [Tsu08] degenerate and become nonrestrictive. … Backward-wave modes are demonstrated in crystals with the lattice cell as small as 1/40th of the vacuum wavelength. … the plasmonic crystal bands in this example are quite subtle and can be captured only in simulations with very high numerical accuracy. Traditional methods such as FEM, PWE and classical FD fall short of providing the needed accuracy in this case.”

To apply FLAME to more general shapes of dielectric structures, one needs accurate local approximations of the theoretical solution. This can be achieved, for example, by approximating the air–dielectric boundaries with osculating arcs in a piecewise fashion and then using cylindrical harmonics as described above [Tsu11c]. Alternatively, basis functions can be obtained as accurate finite element or boundary element solutions of local problems that are much smaller than the global one [DT06].

Extensions of the methodology to 3D are possible but not trivial. H. Pinheiro and J. P. Webb applied FLAME, with spherical harmonics as Trefftz bases, to 3D scattering of electromagnetic waves by dielectric spheres and arrays of spheres [PW09]. The high accuracy of the method was verified through comparison with alternative analytical and numerical results. Perfect conductors, edges and corners require special attention due to the complicated behavior of fields in the vicinity of such materials and features (O. AlKhateeb and IT [AT13], C. Classen et al. [CBST10]).

8.11 Photonic Bandgap Calculation in Three Dimensions: Comparison with the 2D Case

This section reviews the main ideas of PBG analysis in three dimensions, highlighting the most substantial differences with the 2D case and the complications that arise.

8.11.1 Formulation of the Vector Problem

One of the most salient new features of the 3D formulation, as compared to 2D, is that it is no longer a scalar problem. Maxwell's equations for time-harmonic fields, with no external currents (J = 0), are

∇ × E = ik_0 B   (8.188)


∇ × H = −ik_0 D   (8.189)

These equations are written here in the Gaussian system under the exp(−iωt) phasor convention; under the opposite convention, the signs in the right-hand sides of (8.188), (8.189) would be reversed. In the SI system, one has ω instead of k_0 = ω/c in these equations. We shall assume simple material relationships B = μH and D = εE, where μ and ε can depend on coordinates (in photonics, however, materials are usually nonmagnetic and then μ = 1 in the Gaussian system or μ = μ_0 in SI). Taking the curl of either one of the Maxwell equations and substituting into the other one yields a single second-order equation for the field:

∇ × μ⁻¹ ∇ × E − k_0² εE = 0   (8.190)

or, alternatively,

∇ × ε⁻¹ ∇ × H − k_0² μH = 0   (8.191)

These two equations are analogous but may not be computationally equivalent, as we shall see. For simplicity of exposition, let us assume a cubic primary cell [−a/2, a/2]³ in real space; extensions to hexahedral and triclinic cells are possible both in plane wave methods and in FE analysis. (The plane wave method is currently used much more widely in PBG calculation than FEM.) As in 2D, the E field in formulation (8.190) is sought as a Bloch wave with some wave vector q:

E(r) = Ẽ(r) exp(iq · r);   r ≡ (x, y, z)   (8.192)

One can solve for the full E field of (8.190) satisfying the Bloch condition or, alternatively, for the factor Ẽ(x, y, z) satisfying periodic conditions on the boundary of the computational cell. As in 2D, the trade-off between these two formulations is in the relative complexity of the boundary conditions versus that of the differential operator. The Bloch condition for the full E field is

E(a/2, y, z) = exp(iq_x a) E(−a/2, y, z)   (8.193)

and analogous conditions hold for the two other pairs of faces. By analogy with the 2D case, the formulation for Ẽ can be obtained by formally replacing the ∇× operator applied to E with the (∇ + iq)× operator applied to Ẽ, and the boundary conditions for Ẽ are purely periodic. A detailed and mathematically rigorous exposition, with the finite element (more specifically, edge element) solution, is given by D. C. Dobson and J. E. Pasciak [DP01]; they use the Ẽ formulation.

We turn to the plane wave method first; the finite element solution will be considered later in this section. As in 2D, the periodic factor Ẽ can be expanded into a


Fourier series with some coefficients E_m to be determined:

Ẽ(r) = Σ_{m∈Z³} E_m exp(ik_m · r),   k_m = (2π/a) m,   m ≡ (m_x, m_y, m_z)   (8.194)

with integers m_x, m_y, m_z. The full field E is obtained by multiplying Ẽ with the Bloch exponential:

E(r) = Ẽ(r) exp(iq · r) = Σ_{m∈Z³} E_m exp(i(k_m + q) · r)   (8.195)

The dielectric permittivity ε = ε(x, y, z) or its inverse γ = ε⁻¹ is also a periodic function of coordinates and can be expanded into a similar Fourier series. For the E-problem (8.190), there is, as in 2D, a trade-off between a generalized Hermitian problem and a regular non-Hermitian one. The latter is obtained if the equation for the E field is divided through by ε, so that the right-hand side of the eigenvalue problem does not contain any coordinate-dependent functions:

γ ∇ × μ⁻¹ ∇ × E = k_0² E   (8.196)

For the E field in the Bloch form (8.195), the curl operator translates in the Fourier domain into vector multiplication by i(k_m + q)×. Materials are assumed non-magnetic. Multiplication by γ turns into convolution. Overall, the Fourier transformation of the differential equation is similar to the 2D case. The eigenvalue problem for the Fourier coefficients is (see, e.g., K. Sakoda [Sak05])

− Σ_{s∈Z³} γ̃(m − s) (k_s + q) × [(k_s + q) × E_s] = k_0² μ E_m;   (8.197)
m = (m_x, m_y, m_z);   m_x, m_y, m_z = 0, ±1, ±2, …

where the Fourier coefficients γ̃ are

γ̃(m) = ∫_Ω γ(r) exp(−ik_m · r) dx dy dz   (8.198)

In practice, the infinite set of equations (8.197) is truncated and the resultant eigenvalue problem for a finite set of coefficients is solved by direct or iterative methods (Appendix 8.17). If M reciprocal (Fourier) vectors km are retained, the system comprises M vector equations or equivalently 3M scalar ones. An undesirable feature of the E-formulation is the presence of static eigenmodes (ω = 0) that for purposes of wave analysis in photonics can be considered spurious. These static modes are gradients of scalar potentials exp(i(km + q) · r). Indeed, these gradients satisfy (in a trivial way) the curl–curl Maxwell equation (8.190) as well as


the Bloch boundary conditions on the cell. The number of these static modes is M, out of the 3M vector modes.

In the H-formulation (8.191), these electrostatic modes can be eliminated from the outset by employing only transverse waves as a basis:

H = Σ_{m∈Z³} H_m exp(i(k_m + q) · r),   H_m · (k_m + q) = 0   (8.199)

The transversality condition H_m ⊥ (k_m + q) eliminates the electrostatic modes because those would be longitudinal (field in the direction of the wave vector):

∇ exp(i(k_m + q) · r) = i(k_m + q) exp(i(k_m + q) · r)

No longitudinal H-modes exist because ∇ · H = 0. The absence of these spurious static modes makes the H field expansion substantially different from that of the E field. The dimension of the system is reduced from 3M to 2M: each wave vector k_m has two associated plane waves, with two independent directions of the H field perpendicular to (k_m + q).

Another advantage of the H-formulation in the lossless case (real γ) is that its differential operator, ∇ × γ(x, y, z)∇×, is Hermitian,28 unlike the operator γ(x, y, z)∇ × ∇× of the E-formulation. This is completely analogous to the two-dimensional case and can be verified using integration by parts. In Fourier space, the corresponding problem is also Hermitian. Real-space operations in the differential equation are translated into reciprocal space in the usual manner (∇× becomes i(k_m + q)×, and multiplication turns into convolution), and the eigenvalue equations for the H-formulation become

− Σ_{s∈Z³} γ̃(m − s) (k_m + q) × [(k_s + q) × H_s] = k_0² μ H_m   (8.200)

A small but significant difference from the E-formulation is that the wave vector in the left cross product now corresponds to the equation index m rather than the dummy summation index s; this reflects the interchanged order of operations, ∇ × γ ∇× rather than γ ∇ × ∇×, and makes the system matrix in the Fourier domain Hermitian.

Although the E and H fields appear in Maxwell's equations in a symmetric way (at least in the absence of given electric currents), the E- and H-formulations for the photonic bandgap problem are not equivalent, as we have seen. The symmetry between the formulations is broken due to the different behaviors of the dielectric permittivity and magnetic permeability: while μ at optical frequencies is essentially equal to unity, ε is a function of coordinates. This disparity works in favor of the formulation where ε appears in the differential operator and the term with the eigenfrequency ω does not contain coordinate-dependent factors.

28 All operators are considered in the space of functions satisfying the Bloch boundary conditions. The permittivity tensor is assumed to be real symmetric.
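A compact numerical transcription of the H-formulation (8.199)–(8.200) may be helpful. The Python sketch below assembles the 2M × 2M matrix in the transverse-wave basis for a dielectric sphere in air, with the Fourier coefficients γ̃ obtained by an FFT of γ = ε⁻¹ sampled in the cell. All parameter values (sphere permittivity and radius, truncation order, Bloch vector) are illustrative, and the tiny truncation used here is far from converged; it only demonstrates the structure of the method.

import numpy as np

a = 1.0                                   # lattice constant (cubic cell)
eps_sph, R = 13.0, 0.25 * a               # sphere permittivity and radius (illustrative)
Ng = 48                                   # sampling grid for the FFT of gamma = 1/eps
x = np.arange(Ng) * a / Ng
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
r2 = (X - a/2)**2 + (Y - a/2)**2 + (Z - a/2)**2
gamma = np.where(r2 <= R**2, 1.0/eps_sph, 1.0)
gamma_hat = np.fft.fftn(gamma) / Ng**3    # Fourier coefficients gamma~(m), index taken mod Ng

Mmax = 2                                  # retain reciprocal vectors with |m_i| <= Mmax
ms = [np.array(t) - Mmax for t in np.ndindex(2*Mmax+1, 2*Mmax+1, 2*Mmax+1)]
q = np.array([0.3, 0.0, 0.0]) * np.pi / a # Bloch wave vector (a point on Gamma-X)
kv = [2*np.pi/a*m + q for m in ms]

def transverse_pair(k):
    # two orthonormal vectors perpendicular to k: the transverse basis of (8.199)
    t = np.array([0.0, 0.0, 1.0]) if abs(k[2]) < 0.9*np.linalg.norm(k) else np.array([1.0, 0.0, 0.0])
    e1 = np.cross(k, t); e1 /= np.linalg.norm(e1)
    e2 = np.cross(k, e1); e2 /= np.linalg.norm(e2)
    return e1, e2

pol = [transverse_pair(k) for k in kv]
M = len(ms)
A = np.zeros((2*M, 2*M), dtype=complex)
for p in range(M):
    for s in range(M):
        g = gamma_hat[tuple((ms[p] - ms[s]) % Ng)]
        for i in range(2):
            for j in range(2):
                # -gamma~(m-s) (k_m+q) x [(k_s+q) x e_j(s)], projected on e_i(m)
                A[2*p+i, 2*s+j] = -g * pol[p][i].dot(np.cross(kv[p], np.cross(kv[s], pol[s][j])))

k0sq = np.linalg.eigvalsh(A)              # Hermitian for real gamma; eigenvalues are k_0^2
omega_norm = np.sqrt(np.abs(k0sq)) * a / (2*np.pi)   # normalized frequencies omega a/(2 pi c)
print(omega_norm[:6])

Refining the calculation means increasing Mmax (and Ng); the number of unknowns grows as 2(2 Mmax + 1)³, which is why convergence acceleration and iterative eigensolvers matter in practice.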


8.11.2 FEM for Photonic Bandgap Problems in 3D

As in 2D (Sect. 8.10.4), the finite element method can be applied either to the full E field (or, alternatively, H field) or to the spatially periodic factor Ẽ (or H̃). In the first instance, one deals with the usual differential operator but boundary conditions (Bloch) that are nonstandard for FEM; the second case has standard periodic boundary conditions but an unusual operator. This second case is considered rigorously by D. C. Dobson and J. E. Pasciak in their terse but mathematically comprehensive paper [DP01]. As an alternative, and in parallel with Sect. 8.10.4, we now review the first formulation.

A natural functional space B(curl, Ω) for this problem is the subspace of “scaled-periodic” functions—not in the Sobolev space H¹(Ω) as in 2D but rather in H(curl, Ω):

B(curl, Ω) = {E : E ∈ H(curl, Ω); n × E satisfies the Bloch boundary conditions with wave vector q}   (8.201)

H(curl, Ω) is the space of vector functions in (L²(Ω))³ whose curl is also in (L²(Ω))³; the tangential component n × E of vector fields in this space is mathematically well defined. The B space depends on the given value of q, although for simplicity of notation this is not explicitly indicated. At this book's level of rigor, the technical details of this definition will not be required; they are available, e.g., in P. Monk's monograph [Mon03], which is also a very useful reference on edge element formulations. The weak form of the H field problem is:

Find H ∈ B(curl, Ω):   (γ∇ × H, ∇ × H′) = k_0² (μH, H′),   ∀H′ ∈ B(curl, Ω)   (8.202)

The surface integral in the derivation of the weak formulation vanishes for the same reason as in 2D (Remark 27 in Sect. 8.10.4).

Since the early 1980s, thanks to the work by J. C. Nédélec [Néd80, Néd86], A. Bossavit [BV82, BV83, Bos88b, Bos88a, Bos98], R. Kotiuga [Kot85], D. Boffi [BFea99, Bof01], P. Monk [MD01, Mon03] and many others, the mathematical and engineering research communities have come to realize that the “right” FE discretization of electromagnetic vector fields is via edge elements, where the degrees of freedom are associated with the element edges rather than nodes. For eigenvalue problems, the use of edge elements is particularly important because they, in contrast to nodal elements, do not produce spurious (non-physical) modes; see Sect. 3.12.1. Further details and references on the edge element formulation are given in Chap. 3.

From the finite element perspective, the only nonstandard feature of the problem at hand is the Bloch boundary condition. It is dealt with in full analogy with the scalar case in 2D (Sect. 8.10.4), with “master–slave” edge pairs instead of node pairs.
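The way such Bloch conditions enter an assembled finite element system can be sketched independently of the element type. In the Python fragment below, K and M stand for already-assembled (Hermitian) stiffness and mass matrices, and `pairs` lists slave–master degree-of-freedom pairs with their Bloch phase factors; the reduction matrix T eliminates the slaves, and the chained “slaves of slaves” are assumed to have been resolved beforehand. This is a schematic fragment with placeholder data, not the book's implementation.

import numpy as np

def bloch_reduce(K, M, pairs, phases):
    """Eliminate slave d.o.f.: x_full = T x_red, then K_red = T^H K T, M_red = T^H M T.

    pairs[i]  = (slave_dof, master_dof)
    phases[i] = exp(i q . (r_slave - r_master))
    """
    n = K.shape[0]
    slaves = {s for s, _ in pairs}
    kept = [d for d in range(n) if d not in slaves]
    pos = {d: j for j, d in enumerate(kept)}
    T = np.zeros((n, len(kept)), dtype=complex)
    for d in kept:
        T[d, pos[d]] = 1.0
    for (s, m), ph in zip(pairs, phases):
        T[s, pos[m]] = ph
    return T.conj().T @ K @ T, T.conj().T @ M @ T

# tiny synthetic usage: a 4-d.o.f. system with d.o.f. 3 slaved to d.o.f. 0
K = np.diag([2.0, 2.0, 2.0, 2.0]) - np.eye(4, k=1) - np.eye(4, k=-1)
M = np.eye(4)
Kr, Mr = bloch_reduce(K, M, pairs=[(3, 0)], phases=[np.exp(1j*0.4)])
print(Kr.shape)   # (3, 3); the reduced generalized eigenproblem is Kr v = k0^2 Mr v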


8.11.3 Historical Notes on the Photonic Bandgap Problem

It is well known that the seminal papers by E. Yablonovitch [Yab87, YG89], S. John [Joh87] and K. M. Ho et al. [HCS90] led to an explosion of interest in photonic bandgap structures. An earlier body of work, dating back to at least 1972, is not, however, known nearly as widely. The 1972 and 1975 papers by V. P. Bykov [Byk72, Byk75] (see also [Byk93]), originally published in Russian behind the Iron Curtain, were perhaps ahead of their time. A. Moroz on his Web site gives a condensed but informative review of the early history of photonic bandgap research.29 The following excerpts from the Web site and the original papers speak for themselves.

A. Moroz: “A study of wave propagation in periodic structures has a long history, which stretches back to, at least, Lord Rayleigh classical article on the influence of obstacles arranged in rectangular order upon the properties of a medium.30 …Later on, wave propagation in periodic structures was a subject of the book [BP53] …by Brillouin and Parodi. …Some of early history of acoustic and photonic crystals can also be found in a review [Kor94] by Korringa. A detailed investigation of the effect of a photonic bandgap on the spontaneous emission (SE) of embedded atoms and molecules has been performed by V. P. Bykov [Byk72, Byk75]. For a toy one-dimensional model, he obtained the energy and the decay law of the excited state with transition frequency in the photonic band gap, and calculated the spectrum which accompanies this decay. Bykov's detailed analytic investigation revealed that the SE can be strongly suppressed in volumes much greater than the wavelength.”

V. P. Bykov ([Byk75, “Discussion of Results,” p. 871]): “The most interesting qualitative conclusion is the possibility of influencing the spontaneous emission and, particularly, suppressing it in large volumes. …in a large volume we can use a periodic structure and thus control the spontaneous emission. Control of the spontaneous emission and particularly its suppression may be important in lasers. For example, the active medium of a laser may have a three-dimensional periodic structure. Let us assume that this structure has such anisotropic properties that at the transition frequency of a molecule there is a narrow cone of directions in which the propagation of electromagnetic waves is allowed, whereas all the other directions are forbidden. Then, the laser threshold of this medium (in the allowed direction) should be much lower than that of a medium without a periodic structure …”

8.12 Negative Permittivity and Plasmonic Effects

The standard linear constitutive relationships between the electric field E, polarization P and the displacement field D are, in the SI system,

P = ε_0 χE   (8.203)

29 http://www.wave-scattering.com/pbgheadlines.html and http://www.wave-scattering.com/pbgprehistory.html.
30 Lord Rayleigh, On the influence of obstacles arranged in rectangular order upon the properties of a medium, Philos. Mag. 34, 481–502 (1892).


D = ε_0 E + P = εE,   ε = ε_0 (1 + χ)   (8.204)

or, in the Gaussian system,

P = χE   (8.205)

D = E + 4πP = εE,   ε = 1 + 4πχ   (8.206)

Under static conditions, the dielectric susceptibility χ must be nonnegative (L. D. Landau, E. M. Lifshitz and L. P. Pitaevskii [LLP84, Sect. II.7, Sect. II.14]). In high-frequency time-harmonic fields, χ and ε become complex, and their behavior can be quite rich from both theoretical and practical perspectives.

A well-known phenomenological description of polarization is obtained by applying Newton's equation of motion to an individual electron in the medium:

m r̈ + mΓ ṙ + mω_0² r = −eE(t)   (8.207)

The mass of the electron is m, and its charge is −e; r is the position vector; Γ is a phenomenological damping constant that can physically be interpreted as the rate of collisions—the reciprocal of the mean time between collisions. For electrons bound to atoms, the third term in the left-hand side represents the restoring force with the “spring constant” mω_0²; if the electrons are not bound (e.g. in metals), ω_0 = 0. For time-harmonic excitation E(t) = E_0 exp(±iωt), where the ± sign refers to the two possible phasor conventions, one solves Newton's equation (8.207) by switching to complex phasors:

r = − (e/m) E_0 / (ω_0² − ω² ± iΓω)   (8.208)

where the same symbols are used for the complex phasors as for the time functions, with little possibility of confusion. With the standard definition of polarization as dipole moment per unit volume, one has P = −N_e e r, where N_e is the volume concentration of the electrons,31 and hence

P = (N_e e²/m) E_0 / (ω_0² − ω² ± iΓω)   (8.209)

The dielectric susceptibility is thus (in the SI system for definiteness)

χ = ω_p² / (ω_0² − ω² ± iΓω),   ω_p² = N_e e² / (ε_0 m)   (8.210)

Parameter ω p is called the plasma frequency.

31 Averaging over r for all electrons is implied and for simplicity omitted in the expressions.


This phenomenological description of polarization is known as the Lorentz model. A particular case is the Drude model, where ω_0 = 0 (no restoring force in Newton's equation (8.207)):

χ = ω_p² / (−ω² ± iΓω)   (8.211)

The relative dielectric constant is

ε_r = 1 + χ = (1 − ω_p²/(Γ² + ω²)) ∓ i ω_p² Γ / (ω(Γ² + ω²))   (8.212)

A peculiar feature of this result is the behavior of the real part of ε_r (the expression in the large parentheses). For frequencies ω below the plasma frequency (more precisely, for ω² < ω_p² − Γ²), the real part of the dielectric constant is negative—in stark contrast to the normal values greater than one for simple dielectrics.

The negative permittivity is, in the Drude model, ultimately due to the fact that for ω_0 = 0 (no restoring force on the electrons) and sufficiently small damping forces, Newton's law (8.207) puts acceleration—rather than displacement—approximately in sync with the applied electrostatic force. Acceleration, being the second derivative of the displacement, is shifted by 180◦ relative to the displacement. Therefore, displacement and hence polarization are shifted by approximately 180◦ with respect to the applied force, leading to negative susceptibility. For frequencies below the plasma frequency, the real part of susceptibility is even less than −1, which makes the real part of the dielectric constant negative.

Why would anyone care about negative permittivity? As we shall see shortly, it opens many interesting opportunities in subwavelength optics, with far-reaching practical implications: strong resonances, with very high local enhancement of optical fields and signals; nano-focusing of light; propagation of surface plasmon polaritons (charge density waves on metal–dielectric interfaces); anomalous transmission of light through arrays of holes; and so on. This area of research and development—one of the hottest in applied physics—is known as plasmonics; see U. Kreibig and M. Vollmer [KV95], S. A. Maier and H. A. Atwater [MA05], S. A. Maier [Mai07], and a recent broad review (“roadmap”) by a large group of authors [SKB+18], with references therein.32 Also associated with negative permittivity is the superlensing effect of metal nanolayers (J. B. Pendry [Pen00], N. Fang et al. [FLSZ05], D. O. S. Melville and R. J. Blaikie [MB05]). These subjects are discussed later in this chapter.
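The sign pattern of (8.212) is easy to inspect numerically. The short Python fragment below tabulates the Drude permittivity with the sign choice that makes the imaginary part positive; the parameter values are merely illustrative, of the order of those quoted for noble metals.

import numpy as np

omega_p = 1.4e16        # plasma frequency, rad/s (illustrative)
Gamma = 1.0e14          # damping constant, 1/s (illustrative)
for w in np.linspace(0.2, 1.6, 8) * omega_p:
    chi = omega_p**2 / (-w**2 - 1j*Gamma*w)     # Eq. (8.211), lower sign
    eps_r = 1.0 + chi
    print(f"omega/omega_p = {w/omega_p:4.2f}   Re eps_r = {eps_r.real:9.3f}   Im eps_r = {eps_r.imag:7.4f}")
# Re(eps_r) is negative below the plasma frequency (omega^2 < omega_p^2 - Gamma^2)
# and approaches +1 well above it.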

32 Mark Brongersma discovered what almost certainly would be the first paper on the subject of plasmonics; it dates back to 1972. Unfortunately for the physicists, the article is in fact devoted to communication by fish (M. D. Moffler, Plasmonics: Communication by radio waves as found in Elasmobranchii and Teleostii fishes, Hydrobiologia, vol. 40 (1), pp. 131–143, 1972, https://doi.org/10.1007/BF00123598). Intriguingly, the author discovered “the phenomenon of fish communication, via hydronic radio waves” that are “neither sonic nor electrical.”


8.12.1 Electrostatic Resonances for Spherical Particles

Exhibit A for electrostatic resonances33 is the classic example of the electrostatic field distribution around a dielectric spherical particle immersed in a uniform external field. The electrostatic potential can easily be found via spherical harmonics. In fact, since the uniform field (say, in the z direction) has only one dipole harmonic (u = −E_0 z = −E_0 r cos θ, in the usual notation), the solution also contains only the dipole harmonic. However, later on in this section higher-order harmonics will also be needed, and so for the sake of generality let us recall the expansion of the potential into an infinite series of harmonics. The potential inside the particle is (e.g. W. K. H. Panofsky and M. Phillips [PP62], R. F. Harrington [Har01] or W. B. Smythe [Smy89])

u_in(r, θ, φ) = Σ_{n=0}^{∞} Σ_{m=−n}^{n} a_nm r^n P_n^m(cos θ) exp(imφ)   (8.213)

where the standard notation for the associated Legendre polynomials P_n^m and the spherical angles θ, φ is used; a_nm are some coefficients. The potential outside, in the presence of the applied field E_0 in the z direction, is

u_out(r, θ, φ) = −E_0 z + Σ_{n=0}^{∞} Σ_{m=−n}^{n} b_nm r^{−n−1} P_n^m(cos θ) exp(imφ)   (8.214)

The coefficients a_nm and b_nm for the field inside/outside are related via the boundary conditions on the surface of the particle:

u_in(r_p, θ, φ) = u_out(r_p, θ, φ);   ε_in ∂u_in(r_p, θ, φ)/∂r = ε_out ∂u_out(r_p, θ, φ)/∂r   (8.215)

Substitution of the harmonic expansions (8.213), (8.214) into these boundary conditions yields a system of decoupled equations for each spherical harmonic. For the special case n = 1, m = 0 (the dipole term), noting the contribution of the applied field −E_0 r cos θ ≡ −E_0 r P_1(cos θ), we obtain

a_10 r_p = −E_0 r_p + b_10 r_p^{−2}   (8.216)

ε_in a_10 = −ε_out (E_0 + 2 b_10 r_p^{−3})   (8.217)

where ε_in, ε_out are the dielectric constants of the media inside and outside the particle, respectively. The Legendre polynomials have disappeared because they are the same in all terms. The coefficients a_10, b_10 are easily found from this system:

33 I thank Isaak Mayergoyz for introducing the term “electrostatic resonances” to me in the early 2000s; I believe he coined this term.


a_10 = − 3ε_out/(2ε_out + ε_in) E_0,   b_10 = − (ε_out − ε_in)/(2ε_out + ε_in) r_p³ E_0   (8.218)

This result is very well known [Har01, Smy89, PP62]. The dipole moment of the particle is p = −b_10 ẑ, where ẑ is the unit vector in the z direction, and the polarizability (dipole moment per unit applied field) is

α = (ε_out − ε_in)/(2ε_out + ε_in) r_p³   (8.219)

For simple dielectrics with the dielectric constant greater than or equal to that of a vacuum, there is nothing unusual about this formula. However, if the permittivity can be negative, as in the quasi-static regime for metals at frequencies below the plasma frequency, the denominator of (8.219) can approach zero. The obvious special case—the plasmon resonance condition—for a spherical particle is

ε_in = −2ε_out   (8.220)

If the relative permittivity of the outside medium is unity (air or vacuum), then the resonance occurs for the relative permittivity of the particle equal to −2. Notably, this resonance condition does not depend on the size of the particle—as long as this size remains sufficiently small for the electrostatic approximation to be valid. This size independence turns out to be true for particles of any shape, not necessarily spherical.

It is worth repeating that although plasmon resonance phenomena usually manifest themselves at optical frequencies, their qualitative interpretation and approximation are as quasi-static effects, for particles much smaller than the wavelength; see U. Kreibig and M. Vollmer [KV95] and D. R. Fredkin and I. D. Mayergoyz [FM03, MFZ05a]. Still, full-wave simulation is needed for higher accuracy (see Sects. 8.12.3, 8.14.2). From the analytical viewpoint, the field can be expanded into an asymptotic series with respect to the small parameter—the size of the particle relative to the wavelength [MFZ05a], the zeroth term of this expansion being the electrostatic problem.

At the resonance, division by zero in the expression for the polarizability (8.219) and in similar expressions for the dipole moment and field indicates a non-physical situation. In reality, losses (represented in our model by the imaginary part of the permittivity), nonlinearities and dephasing/retardation will quench the singularity. Under the electrostatic approximation, a source-free field can exist if losses are neglected. In the case of a spherical particle, the boundary conditions for any spherical harmonic n, m (not necessarily dipole) are

a_nm r_p^n = b_nm r_p^{−n−1}   (8.221)

ε_in n a_nm r_p^{n−1} = −(n + 1) ε_out b_nm r_p^{−n−2}   (8.222)

It is straightforward to find that this system of two equations has a non-trivial solution a_nm, b_nm if the permittivity of the particle is

ε_in = − (n + 1)/n ε_out   (8.223)

In particular, for n = 1 this is the already familiar condition ε_in = −2ε_out. The resonance permittivity is different for particles of different shapes; although no simple closed-form expression for this resonance value exists in general, theoretical and numerical considerations for finding it are presented in the following sections.
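A few lines of Python make these conditions tangible: the multipole resonance permittivities (8.223) and the growth of the polarizability (8.219) as ε_in approaches −2ε_out, with a small imaginary part standing in for losses. All numbers are illustrative.

import numpy as np

eps_out, r_p = 1.0, 1.0
print("multipole resonance permittivities, Eq. (8.223):")
for n in range(1, 6):
    print(f"  n = {n}:  eps_in = {-(n + 1)/n * eps_out:.4f}")

print("polarizability (8.219) near the dipole resonance (0.05i models losses):")
for eps_in in (-1.6 + 0.05j, -1.8 + 0.05j, -2.0 + 0.05j, -2.2 + 0.05j):
    alpha = (eps_out - eps_in) / (2*eps_out + eps_in) * r_p**3
    print(f"  eps_in = {eps_in}:  |alpha| = {abs(alpha):8.2f}")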

8.12.2 Plasmon Resonances: Electrostatic Approximation

If the characteristic dimension of the system under consideration (e.g. the size of a plasmonic particle) is small relative to the wavelength, the analysis can be simplified dramatically by the electrostatic approximation—the zero-order term in the asymptotic expansion of the solution with respect to the characteristic size (see the previous section). The governing equation for the electrostatic potential u is

−∇ · (ε∇u) = 0;   u(∞) = 0   (8.224)

An unusual feature here is the zero right-hand side of the equation, along with the zero boundary condition. Normally, this would yield only a trivial solution: the operator in the left-hand side is self-adjoint and, if the dielectric constant has a positive lower bound, ε(x, y, z) ≥ ε_min > 0, positive definite. More generally, however, the dielectric constant can be complex, so the operator is no longer positive definite and for a real negative permittivity can have a non-trivial null space. This is the plasmon resonance case, which we have already observed for spherical particles.

To study plasmonic resonances, let us revisit the formulation of the problem in the electrostatic limit. Since the dielectric constant need not be smooth (it is often piecewise-constant, with jumps at material interfaces), the derivatives in the differential equation (8.224) are to be understood in the generalized sense. It is therefore helpful to write the equation in weak form:

(ε∇u, ∇u′)_{L₂³(R³)} = 0;   ∀u′ ∈ H¹(R³)   (8.225)

In contrast to standard electrostatics, for complex ε this bilinear form is not in general elliptic. Importantly, ε can be (at least approximately) real and negative in some regions, and the equation can therefore admit non-trivial solutions. To make further progress in the analysis, let us consider a specific practical case: region(s) Ω_p with one dielectric constant ε_p (particles, particle clusters, layers, etc.) embedded in some “background” medium with another dielectric constant ε_bg ≠ ε_p. It is assumed that ε_p and ε_bg do not depend on coordinates. The weak form of the governing equation can then be rewritten as


ε_bg (∇u, ∇u′)_{L₂³(R³)} + (ε_p − ε_bg)(∇u, ∇u′)_{L₂³(Ω_p)} = 0;   ∀u′ ∈ H¹(R³)   (8.226)

or equivalently

(∇u, ∇u′)_{L₂³(Ω_p)} = λ (∇u, ∇u′)_{L₂³(R³)};   ∀u′ ∈ H¹(R³)   (8.227)

λ = ε_bg / (ε_bg − ε_p)   (8.228)

This is a generalized eigenvalue problem. Setting u′ = u reveals that all eigenvalues λ must lie in the closed interval [0, 1]. Indeed, both inner products with u′ = u are always real and nonnegative, and the inner product over Ω_p obviously cannot exceed the one over the whole R³. Thus, we have

0 ≤ ε_bg / (ε_bg − ε_p) ≤ 1

and consequently

ε_p / ε_bg < 0   (8.229)

This result again highlights the key role of negative permittivity—without it, the resonance in the strict sense of the word (the presence of a source-free eigenmode) is not possible. If the dielectric constant of the particle is close, but not exactly equal, to its resonance value (e.g. ε_p has a non-negligible imaginary part), one can expect strong local amplification of applied external fields in the vicinity of the particle,34 giving rise to various practical applications.

To find the actual numerical values of the eigenparameter λ in (8.227)—and hence the corresponding value of the dielectric constant—one can discretize the problem using finite element analysis, finite differences (K. Li et al. [LSB03]), integral equation methods (D. R. Fredkin and I. D. Mayergoyz [FM03, MFZ05a]), T-matrix methods and other techniques. It goes without saying that the plasmon modes and their spectrum do not depend on a specific formulation of the problem or on a specific method of solving it. In particular, regardless of the formulation, the problem with two media (e.g. host and particles) splits up into a purely “geometric” eigenproblem (8.227) with no material parameters and the relationship (8.228) between the eigenvalue λ and the permittivity ε. As a side note, such splittings play a major role in analytical homogenization theories—especially in establishing bounds for effective material parameters (D. J. Bergman [Ber76a, Ber78, Ber80, Ber81, Ber82]; G. Milton [Mil02, Chaps. 18, 27]).
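As a simple illustration of the “geometric” eigenproblem (8.227), the Python sketch below discretizes its two-dimensional analog by finite differences: the gradient Gram matrices over the particle and over a large bounding box (standing in for free space, with a zero Dirichlet condition on its boundary) are assembled edge by edge, and the eigenvalues λ are converted to resonance permittivities via (8.228). For a circular cross section the electrostatic resonances of a cylinder should come out near ε_p = −ε_bg = −1, with higher-order modes increasingly affected by the discretization and by the finite box; grid and box sizes are illustrative.

import numpy as np
from scipy.linalg import eigh

N, L = 41, 4.0                      # interior grid and box size; the disk has radius 1
h = L / (N + 1)
xs = (np.arange(N) + 1) * h - L/2
X, Y = np.meshgrid(xs, xs, indexing="ij")
inside = (X**2 + Y**2) <= 1.0       # nodes belonging to the particle Omega_p
idx = np.arange(N*N).reshape(N, N)

A_full = np.zeros((N*N, N*N))
A_part = np.zeros((N*N, N*N))

def add_edge(A, i, j):              # one grid edge of a gradient Gram matrix
    A[i, i] += 1.0; A[j, j] += 1.0
    A[i, j] -= 1.0; A[j, i] -= 1.0

for ix in range(N):
    for iy in range(N):
        for dx, dy in ((1, 0), (0, 1)):
            jx, jy = ix + dx, iy + dy
            if jx < N and jy < N:
                add_edge(A_full, idx[ix, iy], idx[jx, jy])
                if inside[ix, iy] and inside[jx, jy]:
                    add_edge(A_part, idx[ix, iy], idx[jx, jy])
        # edges to the Dirichlet boundary (u = 0 there) keep A_full positive definite
        A_full[idx[ix, iy], idx[ix, iy]] += (ix == 0) + (ix == N-1) + (iy == 0) + (iy == N-1)

lam = eigh(A_part, A_full, eigvals_only=True)       # generalized eigenvalues, all in [0, 1]
lam = lam[(lam > 1e-3) & (lam < 1 - 1e-3)]          # discard the trivial 0- and 1-clusters
eps_bg = 1.0
eps_res = eps_bg * (1.0 - 1.0/lam)                  # Eq. (8.228) solved for eps_p
print(lam.size, "surface modes;  eps_p quartiles:", np.round(np.percentile(eps_res, [25, 50, 75]), 2))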

34 Unless the external field happens to be orthogonal to the respective resonance eigenmode.


8.12.3 Wave Analysis of Plasmonic Systems

Although the electrostatic approximation does provide very useful insight into plasmon resonance phenomena, accurate evaluation of resonance conditions and field enhancement requires electromagnetic wave analysis. Effective material parameters ε and μ are needed for Maxwell's equations, but questions do arise about the applicability of bulk permittivity to nanoparticles. Various physical mechanisms affecting the value of the effective dielectric constant in individual nanoparticles and in particle clusters are discussed in detail in the physics literature: U. Kreibig and C. von Fragstein [Fra69], U. Kreibig and M. Vollmer [KV95], A. Liebsch [Lie93a, Lie93b], B. Palpant et al. [PPL+98], M. Quinten [Qui96, Qui99], L. B. Scaffardi and J. O. Tocho [ST06].

As an example of such complicated physical phenomena, at the surfaces of silver particles, due to quantum effects, the 5s electron density “spills out” into the vacuum, where 5s electronic oscillations are not screened by the 4d electrons [Lie93a, Lie93b]. Further, for small particles the damping constant Γ in the Drude model is increased due to additional collisions of free electrons with the boundary of the particle [Fra69, KV95]; Scaffardi and Tocho [ST06] and Quinten [Qui96] provide the following approximation:

Γ = Γ_bulk + C v_F / r_p

where v_F is the electron velocity at the Fermi surface and r_p is the radius of the particle (v_F ∼ 14.1 · 10^14 nm · s⁻¹ for gold; C is of the order of 0.1–2 [ST06]).

Fortunately, the cumulative effect of the nanoscopic factors affecting the value of the permittivity may be relatively mild, as suggested by spectral measurements of plasmon resonances of extremely thin nanoshells by C. L. Nehl et al. [NGG+04]: “the resonance line widths fit Mie theory without the inclusion of a size-dependent surface scattering term.” Moreover, the measurements by P. Stoller et al. [SJS06] show that bulk permittivity is applicable to gold particles as small as 10–15 nm in diameter. There is a large body of literature on the optical behavior of small particles. In addition to the publications cited above, see M. Kerker et al. [KWC80] and K. L. Kelly et al. [KCZS03].

In the remainder of this section, our focus is on the computational tools rather than the physics of effective material parameters. Hence, these parameters will be considered as given, with the implicit assumption that proper adjustments have been made for the difference between the parameters in the particles and in the bulk. However, it should be kept in mind that such adjustments may not be valid if non-local effects of the electron charge distribution are appreciable. Physical and computational models of such effects—in particular, the hydrodynamic model of the free electron gas—are quite complicated and computationally challenging, but remain out of the scope of this book; see the studies by M. Moeferdt et al. [MKS+18] and S. Raza et al. [RBWM15].
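For orientation, the size correction to the damping constant is easy to evaluate. In the Python lines below, the bulk damping value is an assumed placeholder, while v_F and the form of the correction follow the approximation quoted above.

v_F = 14.1e14          # nm/s, electron velocity at the Fermi surface (gold)
C = 1.0                # dimensionless constant, of order 0.1-2
Gamma_bulk = 1.0e14    # 1/s, assumed bulk collision rate (placeholder)
for r_p in (5.0, 10.0, 25.0, 50.0):        # particle radius in nm
    Gamma = Gamma_bulk + C * v_F / r_p
    print(f"r_p = {r_p:5.1f} nm:  Gamma = {Gamma:.2e} 1/s")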


8.12.4 Some Common Methods for Plasmon Simulation

This section is a brief summary of computational methods that are frequently used for simulations in plasmonics. In the following sections, two other computational tools—the generalized finite-difference method with flexible local approximation and the finite element method—are considered in greater detail.

8.12.4.1 Analytical Solutions

As an analytical problem, scattering of electromagnetic waves from dielectric objects is quite involved. Closed-form solutions are available only for a few cases (see, e.g., M. I. Mishchenko et al. [MTL02]): an isotropic homogeneous sphere (the classic Lorenz–Mie–Debye case); concentric core–mantle spheres; concentric multilayered spheres; radially inhomogeneous spheres; a homogeneous infinite circular cylinder; an infinite elliptical cylinder; and homogeneous and core–mantle spheroids. For objects other than homogeneous spheres or infinite cylinders, the complexity of analytical solutions (if they are available) is so high that the boundary between analytical and numerical methods becomes blurred. At present, further extensions of purely analytical techniques seem unlikely. On the other hand, with the available analytical cases in mind, local analytical approximations to the field are substantially easier to construct than global closed-form solutions. Such local analytical approximations can be incorporated into “Flexible Local Approximation Method” (FLAME), Sect. 8.12.5 and Chap. 4.

8.12.4.2 T-Matrix Methods

T-matrix methods are widely used in scattering problems. References to books and public domain software already appeared in Sect. 8.10.6: [MTM96, MTL02, MTL06, Wri99, DWE06, MT98, DWE]. If a monochromatic wave impinges on a scattering dielectric object of arbitrary shape, both the incident and scattered waves can be expanded into spherical harmonics around the scatterer. If the electromagnetic properties of the scatterer (the permittivity and permeability) are linear, then the expansion coefficients of the scattered wave are linearly related to the coefficients of the incident wave. The matrix governing this linear relationship is called the T- (“transition”) matrix. For a collection of scattering particles, the overall field can be sought as a superposition of the individual harmonic expansions around each scatterer. The transformation of vector spherical harmonics centered at one particle to harmonics around another one is accomplished via well-established translation and rotation rules (theorems) (e.g. D. W. Mackowski [Mac91], M. I. Mishchenko et al. [MTL02], D. W. Mackowski and M. I. Mishchenko [MM96], Y.-l. Xu [lX95]).


Self-consistency of the multicentered expansions then leads to a linear system of equations for the expansion coefficients. Since the system matrix is dense, the computational cost may become prohibitively high if the number of scatterers is large. For spherical, spheroidal and other particles that admit a closed-form solution of the wave problem (see above), the T-matrix can be found analytically. For other shapes, the T-matrix is computed numerically. If the scatterer is homogeneous, the “Extended Boundary Condition Method” (EBCM) (e.g. P. Barber and C. Yeh [BY75], M. I. Mishchenko et al. [MTL02]) is usually the method of choice. EBCM is a combination of integral equations for equivalent surface currents and expansions into vector spherical harmonics (R. F. Harrington [Har01] or J. A. Stratton [Str41]). While the T-matrix method is quite suitable for a moderate number of isolated particles and is also very effective for random distributions and orientations of particles (e.g. in atmospheric problems), it is not designed to handle large continuous dielectric regions. It is possible, however, to adapt the method to particles on an infinite substrate at the expense of additional analytical, algorithmic and computational work: Plane waves reflected off the substrate are added to the superposition of spherical harmonics scattered from the particles themselves (A. Doicu et al. [DEW99], T. Wriedt and A. Doicu [WD00]).

8.12.4.3 The Multiple Multipole Method

In the multiple multipole method (MMP), the computational domain is decomposed into homogeneous subdomains, and an appropriate analytical expansion—often, a superposition of multipole expansions as the name suggests—is introduced within each of the subdomains. A system of equations for the expansion coefficients is obtained by collocation of the individual expansions at a set of points on subdomain boundaries. Applications of MMP in computational electromagnetics and optics include simulations of plasmon resonances (E. Moreno et al. [MEHV02]) and of plasmon-enhanced optical tips (R. Esteban et al. [EVK06]). A shortcoming of MMP is that no general systematic procedure for choosing the centers of the multiple multipole expansions is available. The choice of expansions remains partly a matter of art and experience, which makes it difficult to evaluate and systematically improve the accuracy and convergence. The MaX platform developed by C. Hafner [Haf99b, Haf99a] has overcome some of the difficulties; see also D. Casati’s PhD thesis [Cas19].

8.12.4.4 The Discrete Dipole Method

The discrete dipole method belongs to the general category of integral equation methods but admits a very simple physical interpretation. Scattering bodies are approximated by a collection of dipoles, each of which is directly related to the local value of the polarization vector. Starting with the volume integral equation for the electric field, one can derive a self-consistent system of equations for the equivalent


dipoles (B. Draine and P. Flatau [DF94, DF03], P. J. Flatau [Fla97], A. Lakhtakia and G. Mulholland [LM03], J. Peltoniemi [Pel96]). The method has gained popularity in the simulation of plasmonic particles, as well as other scattering problems, because of its conceptual simplicity, relative ease of use and the availability of public domain software DDSCAT [DF94, DF03] by Draine and Flatau. For application examples, see papers by K. L. Kelly et al. [KCZS03], M. D. Malinsky et al. [MKSD01], K.-H. Su et al. [SWZ+03]. DDM has some disadvantages typical of integral equation methods. First, the treatment of singularities in DDM is quite involved (Lakhtakia and Mulholland [LM03], Peltoniemi [Pel96]). Second, the system matrix for the coupled dipoles is dense, and therefore the computational time increases rapidly with the increasing number of dipoles. If the dipoles are arranged geometrically on a regular grid, the numerical efficiency can be improved by using fast Fourier transforms to speed up matrix–vector multiplications in the iterative system solver. However, for such a regular arrangement of the sources DDM shares one additional disadvantage not with integral equation methods but rather with finite-difference algorithms: a “staircase” representation of curved or slanted material boundaries. In DDM simulations (e.g. N. Félidj et al. [FAL99], M. D. Malinsky et al. [MKSD01]), there are typically thousands of dipoles in each particle and tens of thousands of dipoles for problems with a few particles on a substrate. As an example, in [MKSD01] 11,218 dipoles are used in the particle and 93,911 dipoles in the particle and substrate together, so that the overall system of equations has a dense matrix of dimension 280,000.

8.12.5 Trefftz–FLAME Simulation of Plasmonic Particles

This section shows an application of generalized finite-difference schemes with flexible local approximation (FLAME, Chap. 4) to the computation of electromagnetic waves and plasmon field enhancement around one or several cylindrical rods. The axes of all rods are aligned in the z direction, and the field is assumed to be independent of z, so that the computational problem is effectively two-dimensional. Two polarizations can be considered: the E-mode, with the E field in the z direction, and the H-mode. (The reason for using this terminology, rather than the more common “TE/TM” modes, is explained in Sect. 8.8.) Note that it is in the H-mode (one-component H field perpendicular to the xy-plane and the electric field in the plane) that the electric field “goes through” the plasmon particles, thereby potentially giving rise to plasmon resonances. The governing equation for the H-mode is

∇ · (ε⁻¹ ∇H) + ω² μH = 0   (SI)   (8.230)

∇ · (ε⁻¹ ∇H) + k_0² μH = 0   (Gaussian)   (8.231)


Fig. 8.25 Two cylindrical plasmonic particles. Setup due to Kottmann and Martin [KM01]. (This is one of the two cases they consider.)

In plasmonics, the permeability can be assumed to be unity throughout the domain, while the permittivity has a complex and frequency-dependent value within the plasmonic particles. Standard radiation boundary conditions for the scattered wave apply.

Let us consider an illustrative example proposed by J. P. Kottmann and O. J. F. Martin [KM01]: two cylindrical plasmon particles with a small separation between them (Fig. 8.25). Kottmann and Martin used integral equations in their simulation. In this section, as an alternative, Trefftz–FLAME schemes of Chap. 4 on a 9-point (3 × 3) stencil are applied. It is natural to choose the basis functions as cylindrical harmonics in the vicinity of each particle and as plane waves away from the particles. “Vicinity” is defined by an adjustable threshold: r ≤ r_cutoff, where r is the distance from the midpoint of the stencil to the center of the nearest particle, and the threshold r_cutoff is typically chosen as the radius of the particle plus a few grid layers. Away from the particles, the eight basis functions are taken as plane waves propagating toward the central node of the 9-point stencil from each of the other eight nodes:

ψ_α = exp(ik r̂_α · r),   α = 1, 2, …, 8,   k² = ω² μ_0 ε_0   (8.232)

(see Appendix 4.9). The 9 × 8 nodal matrix (4.14) of FLAME comprises the values of the chosen basis functions at the stencil nodes, i.e.

N_βα = ψ_α(r_β) = exp(ik r̂_α · r_β),   α = 1, 2, …, 8;   β = 1, 2, …, 9   (8.233)

The coefficient vector of the Trefftz–FLAME scheme (Chap. 4) is s = Null(N^T). A straightforward symbolic algebra computation shows that this null space is indeed of dimension one, so that a single valid Trefftz–FLAME scheme exists (Appendix 4.9). Substituting the nodal values of a “test” plane wave exp(ik r̂ · r), where r̂ = x̂ cos φ + ŷ sin φ, into the difference scheme, one obtains, after some additional symbolic algebra manipulation, the consistency error

ε_c = (1/12096) (hk)⁶ (cos φ − 1) cos²φ (cos φ + 1)(2cos²φ − 1)²   (8.234)


where for simplicity the mesh size h is assumed to be the same in both coordinate directions. The φ-dependent factor has its maximum of (2 − 2^{1/2})/8 at cos 2φ = (1/2 + 2^{1/2}/4)^{1/2}. Hence, the consistency error ε_c ≤ (hk)⁶ (2 − 2^{1/2})/96,768 for any “test” plane wave. Since any solution of the Helmholtz equation in the air region can be locally represented as a superposition (Fourier integral) of plane waves, this result for the consistency error has general applicability. Note that by construction the scheme is exact for plane waves propagating in any of the eight special directions (along the grid axes and, if h_x = h_y = h, at ±45◦ to them).

The domain boundary is treated using a FLAME-style PML (perfectly matched layer), as mentioned in Sect. 4.4.11; see also [Tsu05a, Tsu06]. In the vicinity of each particle, the “Trefftz” basis functions satisfying the wave equation are chosen as cylindrical harmonics:

ψ_α^(i) = a_n J_n(k_cyl r) exp(inφ),   r ≤ r_0
ψ_α^(i) = [b_n J_n(k_air r) + H_n^(l)(k_air r)] exp(inφ),   r > r_0

Jn is the Bessel function, Hn(l) is the Hankel function of the first kind (l = 1) for the exp(−iωt) phasor convention or of the second kind (l = 2) for the exp(+iωt) convention [Har01], and an , bn are coefficients to be determined. These coefficients are found via the standard conditions on the particle boundary; the actual expressions for these coefficients are too lengthy to be worth reproducing here but are easily usable in computer codes. Eight basis functions are obtained by retaining the monopole harmonic (n = 0), two harmonics of orders n = 1, 2, 3 (i.e. dipole, quadrupole and octupole) and one of harmonics of order n = 4. Numerical experiments for scattering from a single cylinder, where the analytical solution is available for comparison and verification, show convergence (not just consistency error!) of order six for this scheme [Tsu05a]. In Fig. 8.26, the electric field computed with Trefftz–FLAME is compared with the quasi-analytical solution via the multicenter-multipole expansion of the wave (V. Twersky [Twe52], M. I. Mishchenko et al. [MTL02]), for the following parameters.35 The radius of each silver nanoparticle is 50 nm. The wavelength of the incident wave varies as labeled in the figure; the complex permittivity of silver at each wavelength is obtained by spline interpolation of the Johnson and Christy values [JC72]. As evident from the figure, the results of FLAME simulation are in excellent agreement with the quasi-analytical computation. Kottman and Martin applied volume integral equation methods where “the particles are typically discretized with 3000 triangular elements” [KM01]. For two particles, this gives about 6000 unknowns and a full system matrix with 36 million nonzero entries. For comparison, FLAME simulations were run on grids from 100 × 100 to 250 × 250 (∼100–500 thousand nonzero entries in a very sparse matrix).

35 The

ˇ analytical expansion was implemented by Frantisek Cajko.

8.12 Negative Permittivity and Plasmonic Effects

503

ˇ Fig. 8.26 (Credit: F. Cajko.) The magnitude of the electric field along the line connecting two silver plasmonic particles. Comparison of FLAME and multipole-multicenter results. Particle radii 50 nm; varying wavelength of incident light. (Reprinted by permission from [Tsu06] ©2006 Elsevier)

8.12.6 Plasmonic Nano-Focusing: Finite Element Simulation As we have seen, plasmonic resonances of metal particles may lead to very high local enhancement of light. Cascade amplification may produce an even stronger effect. As an illustration, an interesting self-similar cascade arrangement of particles in 3D, where an extremely high plasmon field enhancement can be achieved, was proposed by K. Li, M. I. Stockman and D. Bergman [LSB03] (Fig. 8.27). Three spherical silver particles, with the radii 45, 15 and 5 nm as a characteristic example, are aligned on a straight line; the air gap is 9 nm between the 45 and 15 nm particles, and 3 nm between the 15 and 5 nm particles. Each of the smaller particles is in the field amplified by its bigger neighbor, hence cascade amplification of the field. The quasi-static approximation of [LSB03] is helpful if the size of the system is much smaller than the wavelength. Electrodynamic effects were reported by another group of researchers (Z. Li et al. [LYX06]) to result in correction factors on the order of two for the maximum value of the electric field. However, as K. Li et al. argue in [LSB06], the grid size in the finite-difference time-domain (FDTD) simulation of [LYX06] was too coarse to accurately represent the rapid variation of the field at the focus of the “lens.” To analyze the impact of electrodynamic effects on the nano-focusing of the field more accurately, J. Dai et al. [DTvS08] use adaptive finite element analysis in the frequency domain, which is more straightforward and reliable that reaching the sinusoidal steady state in FDTD. Spherical boundaries are accurately rendered in FEM.

504

8 Applications in Nano-Photonics

8

Fig. 8.27 A cascade of three particles and reference points for field enhancement

5 9

6

3 7 4

2 1

Some of the results by J. Dai et al. are reported below. It is assumed (as was done in [LSB03]) that, to a reasonable degree of approximation, the permittivity of the particles is equal to its bulk value for silver. As already noted, the optical response of small particles is very difficult to model accurately due to nonlocality, surface roughness, “spillout” of electrons and other factors. Nevertheless, the bulk value of the permittivity may still provide a meaningful approximation (Sect. 8.12.3). Under the electrostatic approximation, the maximum field enhancement in the Li– Stockman–Bergman cascade is calculated to occur in the near ultraviolet at ω = 3.37 eV, with the corresponding wavelength of ∼367.9 nm in a vacuum and the corresponding frequency ∼814.8 THz. The relative permittivity at this wavelength is, under the exp(+iωt) phasor convention, −2.74 − 0.232 i according to the Johnson and Christy data [JC72]. In [DTvS08], FEM is applied to the electric field equation in the SI system: ∇ × μ−1 ∇ × E − ω 2 E = 0

(8.235)

For analysis and simulation—particularly for imposing radiation boundary conditions—it is customary to decompose the total field into the sum of the incident field Einc and the scattered field Es ; by definition, Es = E − Einc . In our simulations, the incident field is always a plane wave with the amplitude of the electric field normalized to unity. The governing equation for the scattered field is ∇ × ∇ × Es − ω 2 μ0 Es = −(∇ × ∇ × Einc − ω 2 μ0 Einc )

(8.236)

(for μ = μ0 at optical frequencies). The differential operators should be understood in the sense of generalized functions (distributions) that include surface delta functions for charges and currents (Appendix 6.15). The right-hand side of the equation is nonzero due to these surface terms and due to the volume term inside the particles, as the incident field is governed by the wave equation with the wavenumber of free space. In the electrostatic limit, the governing equation is written for the total electrostatic potential φ: (8.237) ∇ · ∇φ = 0; φ(r) → φext (r) as r → ∞

8.12 Negative Permittivity and Plasmonic Effects

505

Fig. 8.28 Electric field enhancement factor around the cascade of three plasmonic spheres. (Simulation by J. Dai and ˇ F. Cajko.)

where φext (r) is the applied potential (typically a linear function of position r, corresponding to a constant external field). The differential operators in (8.237) should again be understood in the generalized sense. In FEM, (8.236) is rewritten in the weak (variational) form. Boundary conditions on the surfaces are natural—that is, the solution of the variational problem satisfies these conditions automatically. The mathematical and technical details of this approach are very well known (e.g. P. Monk [Mon03], J. Jin [Jin02]). J. Dai et al. [DTvS08] used the commercial software package HFSS™ by ANSYS Corp. for electrodynamic analysis36 and COMSOL Multiphysics in the electrostatic case. Both packages are FEM-based: second-order triangular nodal elements for the electrostatic problem and tetrahedral edge elements with 12 degrees of freedom for wave analysis. HFSS employs automatic adaptive mesh refinement for higher accuracy and either radiation boundary conditions or perfectly matched layers to truncate the unbounded domain. To assess the numerical accuracy, J. Dai et al. first considered a single particle. The average difference between Mie theory [Har01] and HFSS field values is ∼2.3% for a dielectric particle with  = 10 and ∼4.9% for a silver particle with s = −2.74 − 0.232 i. At the surface of the particle, the computed normal component of the displacement vector, in addition to smooth variation, was affected by some numerical noise. The noise was obvious in the plots and was easily filtered out. The HFSS mesh had 20,746 elements in all simulations. Let us now turn to the simulations of particle cascades. A sample distribution of the field enhancement factor (i.e. the ratio of the amplitude of the total electric field to the incident field) in the cross section of the cascade is shown in Fig. 8.28 for illustration; the incident wave is polarized along the axis of the cascade and propagates in the downward direction. Four independent combinations of the directions of wave propagation and polarization can be considered (left–right and up–down directions are in reference to Fig. 8.27):

36 Caution

should be exercised when representing the measured Johnson and Christy data [JC72], with its exp(−iωt) convention for phasors, as the HFSS input, with its exp(+iωt) default.

506

8 Applications in Nano-Photonics

Table 8.3 Field enhancement for different directions of propagation and polarization of the incident ˇ wave. P1–P9 are the reference points shown in Fig. 8.27. (Simulation by J. Dai and F. Cajko.) Case P1 P2 P3 P4 P5 P6 P7 P8 P9 ⇐⊥ ⇒⊥ ⇑⊥ ⇑

5.45 6.37 2.44 90.8

17.3 6.49 8.48 35.9

10.2 2.41 6.65 250

9.43 1.43 7.60 146

34.4 4.17 23.3 10.3

10.7 3.39 8.31 70.9

5.53 3.91 4.69 51.9

10.4 11.2 10.1 2.72

3.21 2.00 2.61 6.47

1. The incident wave propagates from right to left. Electric and magnetic fields are both perpendicular to the axis of the cascade. (Mnemonic label: ⇐⊥.) 2. Same as above, but the wave impinges from the left. (⇒ ⊥) 3. The direction of propagation and electric field are both perpendicular to the axis of the cascade. (⇑ ⊥) 4. The direction of propagation is perpendicular to the cascade axis, and the electric field is parallel to it. (⇑ ) Table 8.3 shows the field enhancement factors at the reference points for cases (i)–(iv) [DTvS08]. The “hottest spot,” i.e. the point of maximum enhancement, is indicated in bold and is different in different cases. When the electric field is perpendicular to the axis of the cascade, the local field is amplified by a very modest factor g < 40. Not surprisingly, enhancement is much greater (g ≈ 205) in case (iv), when the field and the dipole moments that it induces are aligned along the axis. To gauge the influence of electrodynamic effects, field enhancement is analyzed as a function of scaling of the system size. Scaling is applied across the board to all dimensions: All the radii of the particles and the air gaps between them are multiplied by the same factor. The radius of the smallest particle, with its original value [LSB03] of 5 nm as reference, is used as the independent variable for plotting and tabulating the results (Fig. 8.29). The enhancement factor decreases rapidly as the size of the system increases. This can be easily explained by dephasing effects. Conversely, as the system size is reduced, the local field increases significantly. It is, however, somewhat counterintuitive that the electrostatic limit does not produce the highest enhancement factor (Fig. 8.29). Further, the point of maximum enhancement does not necessarily lie on ˇ the axis of the cascade. As noted by F. Cajko, some clues can be gleaned by approximating each particle as an equivalent dipole in free space and neglecting higher-order spherical harmonics. The electric field of a Hertzian dipole is given by the textbook formula (in the SI system)

8.12 Negative Permittivity and Plasmonic Effects

507

Fig. 8.29 Maximum field enhancement versus radius of the smallest particle. All dimensions of the system are scaled proportionately. LSB: the specific example by K. Li et al. [LSB03], where ˇ the radius of the smallest particle is 5 nm. ES: the electrostatic limit. Credit: J. Dai and F. Cajko

Edip

     kω p exp(−ikr ) 1 1 2 = − η0 2 cos θ rˆ + 4πr ikr ikr 

1 + + θˆ 1 + ikr



1 ikr

2 

 sin θ , η0 =



μ0 0

 21 (8.238)

where the dipole with moment p is directed along the z-axis of the spherical system (r, θ, φ). In the case under consideration, kr is on the order of unity, and no near-/farfield simplification is made in the formula. Since all dipole moments approximately scale as the cube of a characteristic system size l, the magnitude of the field, say, on the axis θ = 0 behaves as ∝ c1 + c2 l 2 with some positive coefficients c1,2 . This explains the mild local minimum of the field in the electrostatic limit in Fig. 8.29. Furthermore, since (8.238) includes both sin θ and cos θ variations, it is clear that the maximum magnitude of the field cannot in general be expected to occur on the axis θ = 0. To summarize, while electrostatic analysis provides a useful insight into plasmonic field enhancement, electrodynamic effects lead to appreciable corrections. Field enhancement factors on the order of a few hundred by self-similar chains of plasmonic particles may be realizable. Maximum enhancement does not necessarily correspond to polarization along the axis of the cascade and to the electrostatic limit; hence, the size of the system is a non-trivial variable in the optimization of optical nano-lenses.

508

8 Applications in Nano-Photonics

Fig. 8.30 In geometric optics, an ideal lens can focus light to a single point, but in reality the focusing is limited by diffraction. In this case, the diffraction limit can be linked to the Heisenberg uncertainty principle (see text)

Fig. 8.31 Abbe’s formula for the diffraction limit is set in stone at a monument in Jena. (Source: [Wik19], author: D. Mietchen.) Classical papers by Lord Rayleigh and E. Abbe

8.13 The Diffraction Limit: Can It Be Broken? 8.13.1 Motivation In geometric optics, an ideal lens can focus a beam of light to a single point (Fig. 8.30); but in reality the focus is smeared to an area on the order of half-wavelength in size. This is the well-known diffraction limit, which restricts the resolution and/or focusing of any optical system and goes back to the classical papers by Lord Rayleigh [F.R79] and E. Abbe [Abb82, Abb83] (Fig. 8.31).

8.13 The Diffraction Limit: Can It Be Broken?

509

The diffraction limit is often viewed as a manifestation of the Heisenberg uncertainty principle  (8.239) y p y ∼ 2 where  is the reduced Planck constant (∼1.05457 × 10−34 m2 · kg/s); y, p y are the uncertainties in the position and momentum of a quantum particle (in our case, a photon) along a given direction labeled in the formula as y. A photon with frequency ω arriving at the focus of a lens (Fig. 8.30) has the magnitude of momentum p = k =  2π/λ, where λ is the wavelength in the medium around the lens. Since the photon can come from any angle θ between some −θmax and +θmax , the uncertainty in the y-component of its momentum is p y = 2 p sin θmax = 4π sin θmax /λ and hence the uncertainly in its position is, by the Heisenberg principle, y ∼

λ  = 2p y 8π sin θmax

(8.240)

Thus, the uncertainly principle prohibits ideal focusing of light by a conventional lens. On the other hand, plasmonic and other “hot spots” (Sect. 8.12.5) seem to be in stark contradiction with the diffraction limit. Examples of plasmon resonances, especially in particle cascades and clusters, show that light can in principle be focused to a small, highly subwavelength, “hot spot”; this can be interpreted as nano-focusing or nanolensing. Moreover, plasmonic field enhancement is not the only way to accomplish that goal. It is well known, for example, that fields have a singularity in the vicinity of a sharp material corner, so in the mathematical sense field enhancement can be unlimited. The physical reality is obviously much more complicated: Focusing can never be infinitely sharp due to the absence of mathematically perfect corners or edges, non-local material response and even a complete breakdown of continuum electrodynamics on the molecular-to-nanometer scale. Still, deep subwavelength focusing of light, or electromagnetic waves in general, is a valid proposition. Does this contradict the diffraction limit? To answer this question, we need to examine the diffraction limit more closely, but will do so on the conceptual, qualitative level, since a technical discussion of diffraction theories in physical optics (M. Born and E. Wolf [BW99]) would lead us too far astray from the main themes of this book.

510

8 Applications in Nano-Photonics

8.13.2 Superoscillations Let us start with the following simple setup. Consider an electromagnetic field in an air region at a fixed frequency ω, and assume that this field is representable as a superposition of plane waves with the wavelength λ = 2πc/ω.37 It would then be natural to conjecture that this field cannot have spatial oscillations on a scale much finer than λ and that, in particular, a sharp peak (“focus”) much narrower than λ cannot be obtained. This conjecture is true “typically” but not generally. To understand why, let us consider the following superposition of complex exponentials in the scalar 1D case: f (τ ) =

N

cα exp(ikα τ ), kα = k cos θα

(8.241)

α=1

Here, cα are some undetermined coefficients; the complex exponentials are τ -axis restrictions of N plane waves, all having the same wavenumber k = ω/c but propagating at different angles θα to the τ -axis.38 One can then introduce N knots τβ (β = 1, 2, . . . , N ) and consider the interpolation problem f (τβ ) = f β

(8.242)

where f β are arbitrary complex numbers and f (x) is given by (8.241). This interpolation problem leads to a system of N linear equations with respect to the coefficients cα , and under nonrestrictive assumptions, these coefficients can be uniquely determined from that system.39 Given the fact that f β are arbitrary and that the knots xβ can be arbitrarily close, it is clear that any behavior of the total field f (x)—including a highly oscillatory behavior—can in principle be produced in any region, however small. This phenomenon is known as superoscillations. Superoscillating functions, i.e. functions that are bandlimited and yet oscillate locally at a rate faster than their highest Fourier component, are of interest for applications from fundamental physics to engineering. The interpolation procedure outlined above is illustrated numerically in Fig. 8.32 for the following parameters: N = 5; λ = 1; k0 = θα =

2π ; kα = k0 sin θα ; λ

1 π 1 · {0, 1, 2, 3, 4}; τβ = 10−3 · {−1, − , 0, , 1} 10 2 2

(8.243)

37 A mathematical justification for that is given in the papers by R. Hiptmair, A. Moiola and I. Perugia

[MHP11, HMP11, HMPS14, HMP16b, HMP16a]. 38 Alternatively, in signal analysis τ could be interpreted as time, and {k }—as a set of frequencies. α 39 This is trigonometric interpolation: G. Meinardus [Mei67], D. Jackson [Jac30], A. P. Austin, L. N. Trefethen, J. A. C. Weideman [AT17, TW14], IT et al. [TMCM19, Sect. 2].

8.13 The Diffraction Limit: Can It Be Broken? Fig. 8.32 A simple example of “superoscillations.” λ = 1; f (τ ) oscillates between ±1 for τ = −0.001, −0.0005, 0, 0.0005, 0.001

511

512

8 Applications in Nano-Photonics

so that the (small) interpolation interval is [−10−3 , 10−3 ]. The values of f (x) are chosen to oscillate between ±1 over the five knots; these values are indicated in Fig. 8.32 with red circles. Clearly, rapid oscillations occur on the scale of ∼10−3 λ in this numerical example, even though this may appear counterintuitive at first blush. There is thus a sharp “focus” of the field over the interval [−0.0005, 0.0005] λ. What gives? The answer becomes clear once we zoom out of the small interval: The focus is produced at the expense of huge, exponentially growing sidelobes away from the focal point. Before exploring this further, let us note that there exist many alternative mathematical expressions for superoscillating functions. One example, familiar to experts in this field, is the periodic function  x N x g(x) = cos + ia sin N N

(8.244)

(see, e.g., M. Berry et al. [BZA+19]). In this expression, N " 1 is an even integer, and a > 1 is an adjustable parameter. Function (8.244) has been extensively analyzed in the literature ([BZA+19, Sect. 2] and references there). Here, I highlight the most salient features. First, replacing the trigonometric functions with complex exponentials and applying the binomial expansion in the right-hand side of (8.244), one observes that g(x) is a linear combination of harmonics exp(imx/N ), m = 0, 1, , . . . , N − 1. Hence, g(x) is bandlimited. It is also periodic with the period 2π N for any integer N and π N for N even. Second, straightforward asymptotic analysis shows that for any fixed N and small x function g(x) ∼ exp(iax), which is superoscillatory for large a. But as |x| increases, g(x) grows rapidly and at x = ±N π/2, when the sine in (8.244) is equal to one and the cosine is zero, g(x) = a N . This again is an example of a superoscillatory function with giant “sidelobes” (Fig. 8.33). The literature on superoscillations is now quite rich and covers both theoretical and experimental aspects. The main focus (no pun) of the application-oriented papers is on the best trade-offs between the intensity of oscillations versus sidelobes, either in free space or in other setups—e.g. waveguide modes (A. M. H. Wong and G. V. Eleftheriades [WE10]). N. I. Zheludev [Zhe08], G. Gbur [Gbu18] and others paint an optimistic picture: [Zhe08] ... microscopy can tolerate much higher losses than communications applications can. If the photon throughput inefficiency of the system is the price to pay for improved resolution, one can reasonably work with only a few detected photons per second, giving about 19 orders of magnitude to play with (a 1-watt laser generates about 1019 photons per second). Such a power reserve will in fact be needed because the cost of a decrease in the hot-spot size is a polynomial increase in the power going into the sidebands. The relative intensity and phase stability of emitters coherently excited by one light source are easy to maintain in the optical system. The only serious barrier to the development of optical superoscillation generators is manufacturing accuracy. ... one day, thanks to nanotechnology, a schoolboy will be able to screw a nano-array lens to his science class microscope and see a DNA molecule.

8.13 The Diffraction Limit: Can It Be Broken? Fig. 8.33 Superoscillations (8.244). The real part of g(x) (8.244), with N = 10, a = 10 over progressively increasing intervals. Superoscillations are evident within [−1, 1] but are accompanied by giant sidelobes for |x| > 2

513

514

8 Applications in Nano-Photonics

[Gbu18] The extremely low amplitude of superoscillations seemed, at first, to relegate them to the status of mathematical curiosity, with no practical application. Nevertheless, in recent years numerous researchers have explored using the phenomenon to improve the focusing characteristics of imaging systems, and even the resolution of such systems.

A vast array of references on superoscillations are available in the aforementioned review by G. Gbur [Gbu18] and in the “roadmap” by M. Berry et al. [BZA+19]; see also M. V. Berry and P. Shukla [BS19], M Mansuripur and P. K. Jakobsen [MJ19], M. K. Smith and G. J. Gbur [SG16], K. G. Makris et al. [MPT16], L. Chojnacki and A. Kempf [CK16], A. M. H. Wong and G. V. Eleftheriades [WE15], F. M. Huang, E. T. F. Rogers, N. I. Zheludev et al. [RLR+12, RZ13, HZ09], M. V. Berry and S. Popescu [BP06], P. J. S. G. Ferreira and A. Kempf [FK99]. While these references are fairly recent, the ideas of superoscillations can be traced back to the 1950s–60s (G. T. Di Francia [DF52], C. W. McCutchen [McC67]) and even to the early 1940s—S. A. Schelkunoff’s work on superdirectivity of antenna arrays [Sch43].

8.13.3 Subdiffraction Focusing and Imaging Techniques: A Brief Summary From the discussion in the previous sections, it is natural to segue to a more general question: Can the diffraction limit be broken, and if yes, to what extent and under what circumstances? Superoscillations and other cases to follow (Sects. 8.14, 8.15), as well as the above quotes from N. I. Zheludev’s and G. Gbur’s papers (Sect. 8.13.2) suggest that the answer is in the affirmative. However, the situation is much more complicated, because the notions of “diffraction limit” and “broken diffraction limit” are ambiguous enough and elude a precise definition. A variety of factors contribute to this vagueness: whether the focusing is in the near or far field; whether or not it occurs in the presence of material objects in close proximity to the “hot spot” (see, e.g., Sect. 8.14); and whether special physical processes are involved in addition to classical electromagnetism (e.g. selective deactivation of fluorophores in stimulated emission depletion (STED), ground state depletion (GSD), photo-activation localization microscopy (PALM) and other techniques of super-resolution microscopy). E. T. F. Rogers and N. I. Zheludev summarize various approaches to superresolution as follows [BZA+19, Sect. 3]40 : The Abbe-Rayleigh diffraction limit of conventional optical instruments has long been a barrier for studies of micro and nano-scale objects. The earliest attempts to overcome it exploited recording of the evanescent field of object: contact photography ... and scanning near-field imaging (SNOM) ... Such near-field techniques can provide nanoscale resolution, but capturing evanescent fields requires a probe (or photosensitive material) to be in the immediate proximity of the object. Therefore, these techniques cannot be used to image inside cells or 40

The numbered references are omitted in the quotation below for brevity.

8.13 The Diffraction Limit: Can It Be Broken?

515

silicon chips, for example. More recently, other techniques have been proposed to reconstruct and capture evanescent fields: including the far-field Veselago-Pendry ‘super-lens’, which uses a slab of negative index metamaterial as a lens to image the evanescent waves of an object to a camera ... This approach, however, faces substantial technological challenges in its optical implementation and has not yet been developed as practical imaging technique. Biological super-resolution imaging is dominated by the powerful stimulated emission depletion (STED) ... and single-molecule localization (SML) ... microscopies: far-field techniques that have demonstrated the possibility of nanoscale imaging without capturing evanescent fields (which decay over a scale of about one wavelength away from the object). These techniques, while they have become widely used, also have their own limitations: both STED and some of the SML techniques use an intense beam to deplete or bleach fluorophores in the sample. Indeed, the resolution of STED images is fundamentally linked to the intensity of the depletion beam. The harmful influence of these intense beams is known as phototoxicity, as they damage samples, particularly living samples, either stressing or killing them. SML is also inherently slow, requiring thousands of images to be captured to build a single high resolution image. Moreover, STED and SML require fluorescent reporters within the sample, usually achieved by genetic modification or immune labelling with fluorescent dyes or quantum dots ... These labels cannot be applied to solid nanostructures such as silicon chips and are known to change the behaviour of molecules or biological systems being studied ...

A. A. Maznev and O. B. Wright published an excellent analysis of various superfocusing and super-resolution techniques. The following quote from their paper [MW17], while lengthy, is instructive and worth including with only minor ellipses41 : ... scientists and engineers have been trying to do better than the diffraction limit seems to prescribe, and, in many instances, succeeded. For example, in fluorescence microscopy it is now possible to resolve features much smaller than half the optical wavelength λ... Likewise, in photolithography, it is possible to print features much smaller than λ/2 ... Interestingly, in both cases the propagation of light remains strictly within the constraints imposed by the diffraction limit. What made these developments possible is that, for example, in photolithography, one is ultimately interested in producing small features in the photoresist rather than in the optical intensity pattern. Consequently, various intrinsically nonlinear techniques such as double exposure ... can be used to fabricate photoresist features much smaller than the smallest possible far-field focused laser spot. Likewise, in microscopy with the manipulation of fluorescence-based detection, such as stimulated emission depletion or on-and-off stochastic switching of fluorophore molecules ..., one can resolve subwavelength features of an object even if such features in the optical intensity distribution cannot be resolved. It is also possible to resolve subwavelength features using near-field optical methods, in which case structures with subwavelength dimensions such as needles, tapered fibers or optical antennas are used to confine or scatter light... An important development occurred in the year 2000, when Pendry ... extended the concepts of focusing using materials with negative permittivity  and permeability μ developed by Veselago ... to show that focusing of light to a subwavelength spot (in theory, to a point) is possible with a slab of a double-negative material  = μ = −1 ... It turned out, however, that in any practical situation... the superlens only works at subwavelength distances in the near-field of the source ... Indeed, the effect discovered by Pendry should be classed as a near-field effect, as it relies on evanescent rather than propagating waves... The near-field ‘superlens’ and the far-field ‘Veselago lens’ are two different phenomena: the latter only requires that both  and μ be negative and their product be unity... Pendry’s ideal superlens effect, on the other hand, is specific to the case of a lossless double-negative medium with  = μ = −1.” 41 The

numbered references are again omitted for brevity.

516

8 Applications in Nano-Photonics

“It is instructive to observe that the superlens effect can be achieved without a negativeindex material if one uses an array of deeply subwavelength sources (for example, acoustic transducers or radio-frequency antennas) to recreate the same field as would be produced by a negative-index slab. However, in order to transmit subwavelength features to an image plane located in the far field using evanescent waves, one would need to drive the transducers at unrealistically high amplitudes. ... Despite the limitations of the superlens restricting its effect to the near field, the excitement generated by Pendry’s paper led to extensive work on subwavelength focusing and imaging, and in the ensuing years multiple groups reported ‘breaking’ the diffraction limit in the far field in both optics and acoustics: sub-diffractionlimited focusing or imaging was observed with metamaterials and without metamaterials, with negative refraction and without negative refraction, with Helmholtz resonators in acoustics and with Maxwell’s fish eye lenses in optics ... The general mood was expressed by a commentary in Nature Materials entitled “What diffraction limit?” ..., implying that the diffraction limit was all but irrelevant. In this paper, we will argue that the diffraction limit has not become irrelevant”. The question thus arises, is it possible to define the diffraction limit in a sensible way? We believe that it is. We thus propose an alternative definition of the diffraction limit in focusing: (ii) More than 50% of the total energy cannot be focused into a spot smaller than ∼λ/2N in diameter. We find that such reports generally fall into three categories... Super-resolution/superoscillations. A reduction of the FWHM of the central spot below λ/2 at the expense of large sidebands has ... been observed ...[but] no contradiction to the proposed definitions of the diffraction limit arises... Solid immersion lenses with metamaterials. ... The principal question that should be asked here is what the optical/acoustic wavelength λ is in the metamaterial medium. In fact, one finds that statements of ‘subwavelength’ resolution are invariably based on a comparison with the wavelength λ0 either in vacuum (in optics) or in the surrounding conventional medium (in acoustics). If the wavelength in the metamaterial is considered, no violations of the diffraction limit are found. ... Time-reversal allows an impressive degree of control over focusing ..., but it does not enable sub-diffraction-limited resolution. The sharp focusing observed in time-reversal experiments is not subwavelength with respect to the wavelength in the metamaterial ..., and can be achieved without time reversal ... ... Thus ‘subwavelength’ focusing with metamaterials is similar in essence to focusing with a solid immersion lens made of a natural material with a high refractive index ... Near-field ‘hot spots’. It is well known that a structure with subwavelength features such as an antenna can produce a deeply subwavelength ‘hot spot’ of an optical or acoustic field. ... This is a near-field effect... ... in any instance of reported sub-diffraction-limited focusing one needs to check for the presence of subwavelength structures in the proximity of the ‘focal spot’. ... The desire to overcome the diffraction limit has motivated a lot of great work... 
Yet, the concept of the diffraction limit stands firm: far from becoming irrelevant, it is in fact even more useful in analyzing recent experiments involving complex materials, negative refraction, time-reversal, etc. What is the wavelength in the metamaterial medium? Do we see near-field effects from subwavelength structures involved? Is subwavelength focusing achieved at the expense of large sidebands? ... Precisely defining what we mean by saying ‘diffraction limit’ is more than just a question of semantics. Defining things clearly helps us understand what exactly is achieved when new results are reported and better appreciate the limitations and opportunities for using light or sound to probe small length scales.

To conclude: The notion of diffraction limit is not as clear-cut as it may seem at first glance and is subject to many caveats, which were noted in the beginning

8.13 The Diffraction Limit: Can It Be Broken?

517

of this section on Sect. 8.13.3 and can be exploited in physical experiments and engineering practice. Therefore, rather than asking whether the diffraction limit has been “broken” in any particular instance, it is arguably better to inquire what specific practical benefits or applications may ensue.

8.14 Plasmonic Enhancement in Scanning Near-Field Optical Microscopy This section reflects some results of collaborative work with A. P. Sokolov and his group at the Department of Polymer Science, the University of Akron, and with F. Keilmann and R. Hillenbrand’s group at the Max-Planck-Institut für Biochemie in Martinsried, Germany. The simulations in this ˇ section were performed by F. Cajko.

In this section, we consider strong plasmon amplification of the field in scanning near-field optical microscopy (SNOM). SNOM is a very significant enhancement of more traditional scanning probe microscopy (SPM). The first type of SPM, the scanning tunneling microscope (STM), was developed by Gerd Binnig and Heinrich Rohrer at the IBM Zürich Research Laboratory in the early 1980s [BRGW82] (see also [BR99]). For this work, Binnig and Rohrer were awarded the 1986 Nobel Prize in Physics.42 The main part of the STM is a sharp metallic tip in close proximity (∼10 Å or less) to the surface of the sample; the tip is moved by a piezoelectric device. A small voltage, from millivolts to a few volts, is applied between the tip and the surface, and the system measures the quantum tunneling current (from pico- to nano-Amperes) that results. Since the probability of tunneling depends exponentially on the gap, the device is extremely sensitive. Binnig and Rohrer were able to map the surface with atomic resolution. STMs normally operate in a constant current mode, while the tip is scanning the surface. The constant tunneling current is maintained by adjusting the elevation of the tip, which immediately identifies the topography of the surface. The second type of scanning probe microscopy is atomic force microscopy (AFM). Instead of the tunneling current, AFM measures the interaction force between the tip and the surface (short-range repulsion or van der Waals attraction), which provides information about the surface structure and topography. To achieve atomic-scale resolution in all types of SPM, the position of the tip has to be controlled with extremely high precision and the tip has to be very sharp, up to just one atom at its very apex. Modern SPM technology satisfies both requirements. While the level of resolution in atomic force and tunneling microscopes is amazing, these devices are blind—they can only “feel” but not see the surface. Vision—a tremendous enhancement of the scanning probe technology—is acquired in scanning near-field optical microscopy.

42 Ernst

Ruska received his share of that prize “for his fundamental work in electron optics, and for the design of the first electron microscope”.

518

8 Applications in Nano-Photonics

Fig. 8.34 A schematic of aperture SNOM. An optical fiber tip is scanned across a sample surface to form an image. The tip is coated with metal everywhere except for a narrow aperture at the apex. (Reprinted by permission from D. Richards [Ric03] ©2003 The Royal Society of London.)

Two main approaches currently exist in SNOM. In the first one, light illuminates the sample after passing through a small (subwavelength) pinhole; the size of the hole determines the level of resolution. The idea dates back to E. H. Synge’s papers in 1928 and 1932 [Syn28, Syn32]. In modern realization, the “pinhole” is actually a metal-coated fiber (Fig. 8.34 and caption to it). An interesting timeline for the development of this aperture-limited type of SNOM is posted on the Web site of Nanonics Imaging Ltd.43 : 1928/1932 E.H. Synge proposes the idea of using a small aperture to image a surface with subwavelength resolution using optical light. For the small opening, he suggests using either a pinhole in a metal plate or a quartz cone that is coated with a metal except for at the tip. He discusses his theories with A. Einstein, who helps him develop his ideas. … 1956 J. A. O’Keefe, a mathematician, proposes the concept of Near-Field Microscopy without knowing about Synge’s earlier papers. However, he recognizes the practical difficulties of near-field microscopy and writes the following about his proposal: “The realization of this proposal is rather remote, because of the difficulty providing for relative motion between the pinhole and the object, when the object must be brought so close to the pinhole.” [J. A. O’Keefe, “Resolving power of visible light,” J. of the Opt. Soc. of America, 46, 359 (1956)]. In the same year, Baez performs an experiment that acoustically demonstrates the principle of near-field imaging. At a frequency of 2.4 kHz (λ = 14 cm), he shows that an object (his finger) smaller than the wavelength of the sound can be resolved. 43 http://www.nanonics.co.il/nsom-navigation/a-brief-history-and-simple-description-of-nsomsnom-technology Nanonics Imaging Ltd. specializes in near-field optical microscopes combined with atomic force microscopes.

8.14 Plasmonic Enhancement in Scanning Near-Field Optical Microscopy

519

Fig. 8.35 The optical probe particle (a) intercepts an incident laser beam, of frequency ωin , and concentrates the field in a region adjacent to the sample surface (b). The Raman signal from the sample surface is reradiated into the scattered field at frequency ωout . The surface is scanned by moving the optically transparent probe tip holder (c) by piezoelectric translators (d). (Reprinted with permission from J. Wessel [Wes85] ©1985 The Optical Society.)

1972 E. A. Ash and G. Nichols demonstrate λ/60 resolution in a scanning near-field microwave microscope using 3 cm radiation. [E. A. Ash and G. Nichols, “Super-resolution aperture scanning microscope,” Nature 237, 510 (1972).] 1984 The first papers on the application of NSOM/SNOM appear. These papers . . . show that NSOM/SNOM is a practical possibility, spurring the growth of this new scientific field. [A. Lewis, M. Isaacson, A. Harootunian and A. Murray, Ultramicroscopy 13, 227 (1984); D. W. Pohl, W. Denk and M. Lanz [PDL84]].

[End of quote from the Nanonics Web site.] In aperture-limited SNOM, high resolution, unfortunately, comes at the expense of significant attenuation of the useful optical signal: The transmission coefficient through the narrow fiber is usually in the range of ∼10−3 –10−5 , which limits the applications of this type of SPM only to samples with very strong optical response. A very promising alternative is apertureless SNOM that takes advantage of local amplification of the field by plasmonic particles. This idea was put forward by J. Wessel in [Wes85]; his design is shown in Fig. 8.35 and is summarized in the caption to this figure. A remarkably high optical resolution of ∼15–30 nm has already been demonstrated by several research groups (T. Ichimura et al. [IHH+04], N. Anderson et al. [AHCN05]), albeit with rather weak useful optical signals. To realize the full potential of apertureless SNOM, the local field amplification by plasmonic particles needs to be maximized. However, this amplification is quite sensitive to the geometric and physical design of plasmon-enhanced tips. For a radical improvement in

520

8 Applications in Nano-Photonics

the strength of the useful optical signal, one needs to unify accurate simulation with effective measurements of the efficiency of the tips and with fabrication. As an illustration, in A. P. Sokolov’s laboratory at the University of Akron in 2005–0644 a stable and reproducible enhancement of for the Raman signal on the order of ∼103 –104 was achieved for gold- and silver-coated Si3 N4 - and Si-tips. As noted by Sokolov, this enhancement may be sufficient for the analysis of thin (a few nanometer) films. However, for thicker samples, due to the large volume contributing to the far-field signal relative to the volume contributing to the near-field signal, the Raman enhancement of ∼104 does not produce a high enough ratio between nearfield and far-field signals. At the same time, a dramatically higher Raman enhancement, by a factor of ∼106 or more, appears to be within practical reach if tip design is optimized. This would constitute an enormous qualitative improvement over the existing technology, as the useful Raman signal would exceed the background field. Since plasmon enhancement is a subtle and sensitive physical effect, and since human intuition with regard to its optimization is quite limited, computer simulation—the main subject of this book— becomes crucial. The computational methods and simulation examples for plasmon-enhanced SNOM are described in Sect. 8.14.2. For general information on SNOM, the interested reader is referred to the books by M. A. Paesler and P. J. Moyer [PM96] and by P. N. Prasad [Pra03, Pra04].

8.14.1 Apertureless and Dark-Field Microscopy This section briefly describes the experimental setup in A. P. Sokolov’s laboratory at the University of Akron. The figures in this section are courtesy of A. P. Sokolov. For further details, see D. Mehtani et al. [MLH+05, MLH+06]. A distinguishing feature of the setup is side-collecting optics (Fig. 8.36, top) that does not suffer from the shadowing effect of more common illumination/collection optics above the tip. Another competing design, with illumination from below, works only for optically transparent substrates, whereas side illumination can be used for any substrates and samples. Finally, the polarization of the wave coming from the side can be favorable for plasmon enhancement. Indeed, it is easy to see that the electric field, being perpendicular to the direction of propagation of the incident wave, can have a large vertical component that will induce a plasmon-resonant field just below the apex of the tip, as desired. In contrast, for top or bottom illumination the direction of wave propagation is vertical, and hence the electric field has to be horizontal, which is not conducive to plasmon enhancement underneath the tip. Before a plasmon-enhanced tip can be used, it is important to evaluate the level of field amplification at the apex. Direct measurements of the optical response of the

44 A

brief description of their experimental setup for Raman spectroscopy is given below.

8.14 Plasmonic Enhancement in Scanning Near-Field Optical Microscopy

521

Fig. 8.36 Experimental setup. Top: schematics of side illumination/collection optics. Bottom: dark-field microscopy for measuring plasmon field enhancement at the apex of the tip. (Figure courtesy A.P. Sokolov. Bottom part reproduced with permission from D. Mehtani et al. [MLH+06] ©2006 IOP Publishing

tip are not effective because the measured spectrum of the tip as a whole may differ significantly from the spectrum of the plasmon area at the apex. An elegant solution is dark-field microscopy (C. C. Neacsu et al. [NSR04], D. Mehtani et al. [MLH+06]). The apex of the tip is placed in the evanescent field that exists above the surface of a glass prism due to total internal reflection (Fig. 8.36, bottom). Away from the glass surface, the evanescent field falls off exponentially and therefore is not seen by the collecting system. At the same time, the evanescent field does induce a plasmon resonance. Indeed, such resonance is, to a good degree of approximation, a quasi-static effect that will manifest itself once an external electric field is present and once the effective dielectric constant of the plasmonic structure is close to its resonance value. The exponential decay of the field matters only insofar as it can induce higher-order plasmon modes; this happens if the particle size is large enough for the variation of the field over the particle to be appreciable. The frequency of light affects the result indirectly, via frequency dependence of the dielectric permittivities. The side-collecting optics is critical for dark-field measurements, as it allows virtually unobstructed collection of optical signals from the apex of the tip.

522

8 Applications in Nano-Photonics

ˇ Fig. 8.37 (Credit: F. Cajko. Reprinted from [BSv+08] under Open Access Publishing Agreement.) Computational domain and finite element mesh for simulating the field distribution near a realistic s-SNOM cantilevered tip probing near the edge of a 30-nm-thick Au film on a SiC substrate

8.14.2 Simulation Examples for Apertureless SNOM The dependence of plasmon-amplified fields on geometric and physical parameters, as well as the dependence of these parameters (dielectric permittivities) on frequency, is so complex that computer modeling is indispensable in tip design and optimization.

8.14.2.1

Wave Simulations of Optical Tips

This section is based on our joint paper [BSv+08] with F. Keilmann’s group at the Max-Planck-Institut für Biochemie in Martinsried, Germany. All computer simulaˇ tions were performed by F. Cajko using the commercial package HFSS™, which employs the finite element method with adaptive mesh refinement (Sect. 3.13). In [BSv+08], we considered a realistic 3D structure of a cantilevered, platinumcoated AFM tip in close proximity to a flat SiC substrate partly covered with a 30-nm-thick gold film. Of most interest is the situation where the tip is near the edge of the film. The computational domain (Fig. 8.37) includes the whole conical tip of 22 µm height, with its axis tilted at 7◦ , a 25◦ full-cone angle and a 50-nm radius of curvature at the apex. The cantilever is represented by a 1-µm-thick platinum disk with a 12.5 µm radius. The gap between the tip and the SiC surface is 100 nm. The computational domain is a hexahedron 30 × 13.5 × 32 μm3 , with the solution extended by symmetry to 30 × 27 × 32 μm3 . The structure is illuminated with a p-polarized monochromatic Gaussian beam of 15 µm width (FWHM) incident at the angle of 30◦ . The dielectric permittivities of Pt, Au and SiC were obtained from the literature (Table 8.4). Full electrodynamic wave analysis was performed; see [BSv+08] for further technical details.

8.14 Plasmonic Enhancement in Scanning Near-Field Optical Microscopy Table 8.4 Complex dielectric permittivities of materials Spectroscopic Pt Au wavenumber λ−1 900 cm−1 950 cm−1

−1560 + 964i −1431 + 868i

−6136 + 1473i −5538 + 1259i

523

SiC −4.265 + 0.301i −0.757 + 0.148i

The second-order radiation boundary condition built into HFSS™and used in the simulations has the form (∇ × E)τ = jk0 E τ −

j j ∇τ × (∇τ × E τ ) + ∇τ (∇τ · E τ ) k0 k0

(8.245)

where τ indicates the tangential direction to the domain boundary. HFSS uses the exp(+ jωt) phasor convention, and therefore care should be exercised when using the dielectric permittivity data presented under the exp(−iωt) convention. The main computational challenge has to do with the multiscale nature of the problem: The free space wavelength (∼10 µm) and the length of the tip (22 µm) are orders of magnitude greater than the tip radius at the apex, the Au film thickness and the penetration depth in Pt and Au, all of which are less than 50 nm. The simulations were restricted for two frequencies below the Reststrahlen edge of SiC. Especially interesting is the case of the spectroscopic wavenumber λ−1 = 950 cm−1 , where experiments revealed extraordinarily strong back-scattering. The pattern of field enhancement at 950 cm−1 is shown in Fig. 8.38; see [BSv+08] for analysis and details.

8.15 Backward Waves, Negative Refraction and Superlensing 8.15.1 Introduction and Historical Notes In the first decade of the 21st century, negative refraction has become one of the most intriguing areas of research in nano-photonics, with quite a few books and review papers available: P. W. Milonni [Mil04], G. V. Eleftheriades and K. G. Balmain (eds.) [EB05], S. A. Ramakrishna [Ram05], V. M. Shalaev in [Sha06], S. A. Ramakrishna and T. M. Grzegorczyk [RG08] and many others. In his 1967 paper [Ves68],45 V. G. Veselago showed that materials with simultaneously negative dielectric permittivity  and magnetic permeability µ would exhibit quite unusual behavior of wave propagation and refraction. More specifically: 45 Published in 1967 in Russian. In the English translation that appeared in 1968, the original Russian paper is mistakenly dated as 1964.

524

8 Applications in Nano-Photonics

ˇ Fig. 8.38 (Credit: F. Cajko. Reprinted from [BSv+08] under Open Access Publishing Agreement.) Calculated distribution of the total field amplitude (normalized to the input field at beam center) in the central plane of the computational domain, for the spectroscopic wavenumber 950 cm−1 ; tip over Au, at 2.4 µm from the film edge. The scale in the upper plot is logarithmic. The lower, zoomed plots (linear scales) show enhanced near fields around the tip apex (left) and the film edge (right)

• Vectors E, H and k, in that order, form a left-handed system. • Consequently, the Poynting vector E × H and the wave vector k have opposite directions. • The Doppler and Vavilov–Cerenkov effects are “reversed.” The sign of the Doppler shift in frequency is opposite to what it would be in a regular material. The Poynting vector of the Cerenkov radiation forms an obtuse angle with the direction of motion of a superluminal particle in a medium, while the wave vector of the radiation is directed toward the trajectory of the particle. • Light propagating from a regular medium into a double-negative material bends “the wrong way” (Fig. 8.39). In Snell’s law, this corresponds to a negative index of refraction. A slab with  = −1, μ = −1 in air acts as an unusual lens (Fig. 8.40). Subjects closely related to Veselago’s work had been in fact discussed in the literature well before his seminal publication—as early as in 1904. S. A. Tretyakov [Tre05], C. L. Holloway et al. [HKBJK03] and A. Moroz46 provide the following references: • A 1904 paper by H. Lamb47 on waves in mechanical (rather than electromagnetic) systems. 46 http://www.wave-scattering.com/negative.html. 47 “On

group-velocity,” Proc. London Math. Soc. 1, pp. 473–479, 1904.

8.15 Backward Waves, Negative Refraction and Superlensing

525

Fig. 8.39 At the interface between a regular medium and a double-negative medium, light bends “the wrong way”; in Snell’s law, this implies a negative index of refraction. Arrows indicate the direction of the Poynting vector that in the double-negative medium is opposite to the wave vector

Fig. 8.40 The Veselago slab of a double-negative material acts as an unusual lens. Due to the negative refraction at both surfaces of the slab, a point source S located at a distance a < d has a virtual image S  inside the slab and a real image I outside. The arrows indicate the direction of the Poynting vector, not the wave vector

526

8 Applications in Nano-Photonics

• A. Schuster’s monograph [Sch04], pp. 313–318; a 1905 paper by H. C. Pocklington.48 • Negative refraction of electromagnetic waves was in fact considered by L. I. Mandelshtam more than two decades prior to Veselago’s paper.49 Mandelshtam’s short paper [Man45] and, even more importantly, his lecture notes [Man47, Man50] already described the most essential features of negative refraction. The 1945 paper, but not the lecture notes, is cited by Veselago. • A number of papers on the subject appeared in Russian technical journals from the 1940s to the 1970s: by D. V. Sivukhin (1957) [Siv57], V. E. Pafomov (1959) [Paf59] and R. A. Silin (1959, 1978) [Sil59, Sil72]. • Silin’s earlier review paper (1972) [Sil72], where he focuses on wave propagation in artificial periodic structures. In one of his lectures cited above, Mandelshtam writes, in reference to a figure similar to Fig. 8.39 ([Man50], pp. 464–465)50 : ... at the interface boundary the tangential components of the fields . . . must be continuous. It is easy to show that these conditions cannot be satisfied with a reflected wave (or a refracted wave) alone. But with both waves present, the conditions can always be satisfied. From that, by the way, it does not at all follow that there must only be three waves and not more: the boundary conditions do allow one more wave, the fourth one, traveling at the angle π − φ1 in the second medium. Usually it is tacitly assumed that this fourth wave does not exist, i.e. it is postulated that only one wave propagates in the second medium. . . . [the law of refraction] is satisfied at angle φ1 as well as at π − φ1 . The wave . . . corresponding to φ1 moves away from the interface boundary. . . . The wave corresponding to π − φ1 moves toward the interface boundary. It is considered self-evident that the second wave cannot exist, as light impinges from the first medium onto the second one, and hence in the second medium energy must flow away from the interface boundary. But what does energy have to do with this? The direction of wave propagation is in fact determined by its phase velocity, whereas energy moves with group velocity. Here therefore there is a logical leap that remains unnoticed only because we are accustomed to the coinciding directions of propagation of energy and phase. If these directions do coincide, i.e. if group velocity is positive, then everything comes out correctly. If, however, we are dealing with the case of negative group velocity – quite a realistic case, as I already said, – then everything changes. Requiring as before that energy in the second medium flow away from the interface boundary, we arrive at the conclusion that phase must run toward this boundary and, therefore, the direction of propagation of the refracted wave will be at the π − φ1 angle to the normal. However unusual this setup may be, there is, of course, nothing surprising about it, for phase velocity does not tell us anything about the direction of energy flow.

A quote from Silin’s 1972 paper: 48 H.

C. Pocklington, Growth of a wave-group when the group velocity is negative, Nature 71, pp. 607–608, 1905. 49 Leonid Isaakovich Mandelshtam (Mandelstam), 1879–1944, an outstanding Russian physicist. Studied at the University of Novorossiysk in Odessa and the University of Strasbourg, Germany. Together with G. S. Landsberg (1890–1957), observed Raman (in Russian—“combinatorial”) scattering simultaneously or even before Raman did but published the discovery a little later than Raman. The 1930 Nobel Prize in physics went to Raman alone; for an account of these events, see I. L. Fabelinskii [Fab98], R. Singh and F. Riess [SR01] and E. L. Feinberg [Fei02]. 50 My translation from the Russian. A similar quote is given by S. A. Tretyakov in [Tre05].


Let a wave be incident from free space onto the dielectric. In principle one may construct two wave vectors β2 and β3 of the refracted wave …Both vectors have the same projection onto the boundary of the dielectric and correspond to the same frequency. One of them is directed away from the interface, while the other is directed toward it. The waves corresponding to the vectors β2 and β3 are excited in media with positive and negative dispersion, respectively. In conventional dielectrics the dispersion is always positive, and a wave is excited that travels away from the interface. … The direction of the vector β3 toward the interface in the medium with negative dispersion coincides with the direction of the phase velocity …and is opposite to the group velocity vgr . The velocity vgr is always directed away from the interfaces, so that the energy of the refracted wave always flows in the same direction as the energy of the incident wave.

Of the earlier contributions to the subject, a notable one was made by R. Zengerle in his PhD thesis on singly and doubly periodic waveguides in the late 1970s. His journal publication of 1987 [Zen87] contains, among other things, a subsection entitled “Simultaneous positive and negative ray refraction.” Quote:

Figure 10 shows refraction phenomena in a periodic waveguide whose effective index …in the modulated region is …higher than …in the unmodulated region. The grating lines, however, are not normal to the boundaries. As a consequence of the boundary conditions, two Floquet-Bloch waves corresponding to the upper and lower branches of the dispersion contour …are excited simultaneously …resulting generally in two rays propagating in different directions. This ray refraction can be described by two effective ray indices: one for ordinary refraction …and the other …with a negative refraction angle …

The first publication on what today would be called a (quasi-)perfect cylindrical lens was a 1994 paper by N. A. Nicorovici et al. [NMM94] (now there are also more detailed follow-up papers by G. W. Milton et al. [MNMP05, MN06]).51 These authors considered a coated dielectric cylinder, with the core of radius r_core and permittivity ε_core, the shell (coating) with the outer radius r_shell and permittivity ε_shell, embedded in a background medium with permittivity ε_bg. It turns out, first, that such a coated cylinder is completely transparent to the outside H-mode field (the H field along the axis of the cylinder) under the quasi-static approximation if ε_core = ε_bg = 1, ε_shell → −1. (The limiting case ε_shell → −1 should be interpreted as the imaginary part of the permittivity tending to zero, while the real part is fixed at −1: ε_shell = −1 + iε″_shell, ε″_shell → 0.)52 Second, under these conditions for the dielectric constants, many unusual imaging properties of coated cylinders are observed. For example, a line source placed outside the coated cylinder at a radius r_src < r_shell³/r_core² would have an image outside the cylinder, at r_image = r_shell⁴/(r_core² r_src).

A turning point in the research on double-negative materials came in 1999–2000, when J. B. Pendry et al. [PHRS99] showed theoretically, and D. R. Smith et al. [SPV+00] confirmed experimentally, negative refraction in an artificial material with split-ring resonators [SPV+00].

51 I am grateful to N.-A. Nicorovici for pointing these contributions out to me in the mid-2000s and to G. W. Milton for subsequent discussions. Dr. Nicorovici (1944–2010) was an outstanding theoretical physicist/applied mathematician at the University of Sydney and the University of Technology Sydney.
52 The exp(−iωt) convention is used here for complex phasors.


A further breakthrough was Pendry’s “perfect lens” paper in 2000 [Pen00]. It was known from Veselago’s publications that a slab of negative index material could work as a lens focusing light from a point-like source on one side to a point on the other side.53 Veselago’s argument was based purely on geometric optics, however. Pendry’s electromagnetic analysis showed, for the first time, that the evanescent part of light emitted by the source will be amplified by the slab, with the ultimate result of perfect transmission and focusing of both propagating and evanescent components of the wave.

The research field of negative refraction and superlensing is so vast that a more detailed review would be well beyond the scope of this book. Further reading may include J. B. Pendry and S. A. Ramakrishna [PR03], J. B. Pendry and D. R. Smith [PS04], S. A. Ramakrishna [Ram05], A. L. Pokrovsky and A. L. Efros [PE02, PE03] and references therein. Selected topics, however, will be examined in the remainder of this chapter.

One notable subject, alluded to above, is the Veselago–Pendry “perfect lens” [Ves68, Pen00], which is, in principle, capable of producing ideal (non-distorted) images.54 This is possible because evanescent waves with large wavenumbers k_x, k_y in the image plane xy, or equivalently with large components of momentum p_x = ℏk_x, p_y = ℏk_y, resolve [Pen01b] the apparent contradiction [Wil01] between the diffraction limit and the uncertainty principle. Indeed, the dispersion relation for waves in free space (air) is

    k_x² + k_y² + k_z² = (ω/c)²

In the evanescent field, k_x and k_y can be arbitrarily large, with the corresponding imaginary value of k_z and negative k_z². The uncertainty in the xy-components of the photon momentum is therefore infinite, and there is no uncertainty in the position in the ideal case.

8.15.2 Negative Permittivity and the “Perfect Lens” Problem

This section gives a numerical illustration of Pendry’s “perfect lens” in the limiting case of a thin slab. If the thickness of the slab is much smaller than the wavelength, the problem becomes quasi-static and the electric and magnetic fields decouple.

53 V. G. Veselago remarks that this is not a lens “in the usual sense of the word” because it does not focus a parallel beam to a point.
54 The perfect lensing effect has been challenged by many researchers (N. Garcia and M. Nieto-Vesperinas [GNV02, NVG03], J. M. Williams [Wil01], A. L. Pokrovsky and A. L. Efros [PE03], P. M. Valanju et al. [VWV02, VWV03]) but for the most part has survived the challenge (see J. B. Pendry and D. R. Smith [PS04], J. R. Minkel [Min03]). Part of the difficulties and the controversy arise because the problem with the “perfect lens” parameters (ε = −1, μ = −1 for a slab) is ill-posed, and the analysis depends on regularization and on the technical details of passing to the small loss and (in some cases) low-frequency limits (A. Farhi and D. J. Bergman [FB14]).


Fig. 8.41 A finite element mesh for Pendry’s lens example with two line sources

Analysis of the (decoupled) electric field brings us back from a brief overview of negative index materials to media with a negative real part of the dielectric permittivity.

Rather than repeating J. B. Pendry’s analytical calculation for a thin metal slab, let us, in the general spirit of this book, consider a numerical example illustrating the analytical result. The problem, in the electrostatic limit, can be easily solved by finite element analysis. The geometric and physical setup is, for the sake of comparison, chosen to be the same as in Pendry’s paper [Pen00]. A COMSOL Multiphysics™ mesh for 2D simulation is shown in Fig. 8.41. A metal slab of thickness 40 nm acts, under special conditions, as a lens. To demonstrate the lensing effect, two line charges (represented in the simulation by circles of 5 nm radius, not drawn exactly to scale in the figure) are placed 20 nm above the surface of the slab, at points (x, y) = (±40, 40) nm. (The y-axis is normal to the slab.) In the simulations reported below, the FE mesh has 30,217 nodes and 60,192 second-order triangular elements, with 120,625 degrees of freedom. Naturally, for the FE analysis the domain and the (theoretically infinite) slab had to be truncated sufficiently far away from the source charges.

In Pendry’s example ([Pen00], p. 3969), the relative permittivity of the slab is ε_slab ≈ −0.98847 + 0.4i,55 which corresponds to silver at ∼356 nm. The magnitude of the electric field in the source plane y = 40 nm is shown, as a function of x, in Fig. 8.42 and, as expected, exhibits two sharp peaks corresponding to the line sources. The lensing effect of the slab is manifest in Fig. 8.43, where the field distributions with and without the slab are compared in the “image” plane (y = −40 nm).56

55 With the exp(−iωt) convention for phasors.
56 A similar distribution of the electrostatic potential in the image plane has a flat maximum at x = 0 rather than two peaks. Note also that the maximum value theorem for the Laplace equation prohibits the potential from having a local maximum (or minimum) strictly inside the domain with respect to all coordinates. Viewed as a function of one coordinate, with the other ones fixed, the potential can have a local maximum.
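For a quick back-of-the-envelope counterpart to the finite element model, one can evaluate the standard quasi-static transmission coefficient of a single evanescent Fourier component through a slab of permittivity ε and thickness d in vacuum, T(k) = 4ε e^(−kd)/[(1 + ε)² − (1 − ε)² e^(−2kd)], which tends to e^(+kd) as ε → −1 (amplification of evanescent waves). The sketch below is not the COMSOL model described above; it is a minimal NumPy illustration using the slab thickness and the silver permittivity quoted in this section, plus a nearly lossless ε = −1 for comparison, to show how losses cap the amplification.

```python
import numpy as np

def slab_transfer(k, d, eps):
    # Quasi-static transmission of a single evanescent component (transverse
    # wavenumber k) through a slab of permittivity eps and thickness d in vacuum.
    e = np.exp(-k * d)
    return 4 * eps * e / ((1 + eps) ** 2 - (1 - eps) ** 2 * e ** 2)

d = 40e-9                                # slab thickness, 40 nm (as above)
kd = np.arange(1, 11)                    # k*d from 1 to 10
for eps in (-0.98847 + 0.4j,             # "silver" value used in this section
            -1.0 + 1e-6j):               # almost lossless eps = -1
    T = np.abs(slab_transfer(kd / d, d, eps))
    print(f"eps = {eps}:", np.array2string(T, precision=2))
# For eps -> -1 the magnitude grows as exp(k*d); the losses of real silver
# cut the amplification off at moderate k*d, limiting the attainable resolution.
```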


Fig. 8.42 Magnitude of the electric field in the source plane (y = 40 nm) as a function of x. The two line sources are manifest. (The field abruptly goes to zero at the very center of each cylindrical line of charge.)

Perfect lensing is a very subtle phenomenon and is extremely sensitive to all physical and geometric parameters of the model. Ideally, the distance between the source and the surface of the slab has to be equal to half of the thickness of the slab; the relative permittivity has to be −1. In addition, if the thickness of the slab is not negligible relative to the wavelength, the permeability also has to be equal to −1. R. Merlin [Mer04] (see also D. R. Smith et al. [SSR+03]) derived an analytical formula for the spatial resolution Δ of a slightly imperfect lens of thickness d and the refractive index n = −(1 − δ)^(1/2), with δ small:

    Δ = | 2πd / log(δ/2) |    (8.246)

According to this result, for a modest resolution Δ equal to the thickness of the slab, the deviation δ must not exceed ∼0.0037. For Δ/d = 0.25, δ must be on the order of 10⁻¹¹; i.e. the index of refraction must be almost perfectly equal to −1, which is impossible in practice.

For a qualitative illustration of this sensitivity to parameters, let us turn to the electrostatic limit again and visualize how a slight variation of the numbers affects the potential distribution. In Figs. 8.44, 8.45 and 8.46, the dielectric constant is purely real and takes on the values −0.9, −1 and −1.02; although these values are close, the results corresponding to them are completely different. Similarly, in Figs. 8.47, 8.48 and 8.49 the imaginary part of the permittivity of the slab varies, with the real part fixed at −0.98847 as in Pendry’s example. Again, the results are very different.
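A two-line numerical check of these numbers, assuming the reconstruction of (8.246) above with a natural logarithm; the function simply inverts the formula to obtain the largest admissible deviation δ for a prescribed resolution Δ (NumPy only).

```python
import numpy as np

def max_deviation(resolution_over_d):
    # Largest delta for which Delta = resolution_over_d * d, per Eq. (8.246).
    return 2.0 * np.exp(-2.0 * np.pi / resolution_over_d)

for ratio in (1.0, 0.25):
    print(f"Delta/d = {ratio}:  delta <= {max_deviation(ratio):.3g}")
# Expected: ~3.7e-3 for Delta = d and ~2.4e-11 for Delta = d/4,
# consistent with the numbers quoted in the text.
```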


Fig. 8.43 Magnitude of the electric field in the image plane (y = −40 nm) as a function of x, with and without the silver slab. The lensing effect of the slab is evident. The staircase artifacts are caused by finite element discretization

Fig. 8.44 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −0.9

Fig. 8.45 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −1


Fig. 8.46 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −1.02

Fig. 8.47 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −0.98847

Fig. 8.48 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −0.98847 + 0.1i

As damping is increased, “multicenter” plasmon modes (no damping, Fig. 8.47) turn into two-center and then to one-center Teletubbies-like57 distributions (Fig. 8.49).

57 https://www.facebook.com/teletubbies.


Fig. 8.49 Potential distribution for Pendry’s lens example with two line sources; ε_slab = −0.98847 + 0.4i

8.15.3 Forward and Backward Plane Waves in a Homogeneous Isotropic Medium

In backward waves, energy and phase propagate in opposite directions (Sect. 8.15.1). We first examine this counterintuitive phenomenon in a hypothetical homogeneous isotropic medium with unusual material parameters (the “Veselago medium”). In subsequent sections, we turn to forward and backward Bloch waves in periodic dielectric structures; plane wave decomposition of Bloch waves will play a central role in that analysis.

Let us review the behavior of plane waves in a homogeneous isotropic medium with arbitrary constant complex parameters ε and μ at a given frequency. The only stipulation is that the medium be passive (no generation of energy), which under the exp(−iωt) phasor convention implies positive imaginary parts of ε and μ. It will be helpful to assume that these imaginary parts are strictly positive and to view lossless materials as a limiting case of small losses: ε″ → +0, μ″ → +0. The goal is to establish conditions for the plane wave to be forward or backward. In the latter case, one has a “Veselago medium.”

Let the plane wave propagate along the x-axis, with E = E_y and H = H_z. Then, we have

    E = E_y = E₀ exp(ikx)    (8.247)
    H = H_z = H₀ exp(ikx)    (8.248)

where E₀, H₀ are some complex amplitudes. It immediately follows from Maxwell’s equations that

    H₀ = (k/(ωμ)) E₀    (8.249)

    k = ω √(εμ)  (which branch of the square root?)    (8.250)


Which branch of the square root “should” be implied in the formula for the wavenumber? In an unbounded medium, there is complete symmetry between the +x and −x directions, and waves corresponding to both branches of the root are equally valid. It is clear, however, that each of the waves is unbounded in one of the directions, which is not physical. For a more physical picture, it is tacitly assumed that the unbounded growth is truncated: e.g. the medium and the wave occupy only half of the space, where the wave decays. With this in mind, let us focus on one of the two waves—say, the one with a positive imaginary part of k:

    k″ > 0    (8.251)

(The analysis for the other wave is completely analogous.) Splitting up the real and imaginary exponentials

    exp(ikx) = exp(i(k′ + ik″)x) = exp(−k″x) exp(ik′x)

we observe that this wave decays in the +x direction. On physical grounds, one can argue that energy in this wave must flow in the +x direction as well. This can be verified by computing the time-averaged Poynting vector

    P = P_x = ½ Re(E₀H₀*) = ½ Re(k/(ωμ)) |E₀|²    (8.252)

To express P via material parameters, let

    ε = |ε| exp(iφ_ε);    μ = |μ| exp(iφ_μ);    0 < φ_ε, φ_μ < π

Then, the square root with a positive imaginary part, consistent with the wave (8.251) under consideration, gives

    k = ω √(|ε||μ|) exp(i(φ_ε + φ_μ)/2)    (8.253)

Ignoring all positive real factors irrelevant to the sign of P in (8.252), we get

    sign P = sign Re(k/μ) = sign cos((φ_ε − φ_μ)/2)

The cosine, however, is always positive, as 0 < φ_ε, φ_μ < π. Thus, as expected, P_x is positive, indicating that energy flows in the +x direction indeed. The type of the wave (forward vs. backward) therefore depends on the sign of phase velocity ω/k′—that is, on the sign of k′. The wave is backward if

    k′k″ < 0  ⇔  Im k² < 0  ⇔  Im(εμ) < 0    (8.254)


As follows from (8.253),

    sign k′ = sign cos((φ_ε + φ_μ)/2)

and, for k″ > 0 (8.251), the wave is backward if and only if the cosine is negative, or

    φ_ε + φ_μ > π    (8.255)

An algebraically equivalent criterion can be derived by noting that the cosine function is monotonically decreasing on [0, π] and hence φ_ε > π − φ_μ is equivalent to cos φ_ε < cos(π − φ_μ), or cos φ_ε + cos φ_μ < 0. This coincides with the Depine–Lakhtakia condition [DL04] for backward waves:

    ε′/|ε| + μ′/|μ| < 0    (8.256)

This last expression is invariant with respect to complex conjugation and is therefore valid for both phasor conventions exp(±iωt). Note that the analysis above relies only on Maxwell’s equations and the definitions of the Poynting vector and phase velocity. No considerations of causality, so common in the literature on negative refraction, were needed to establish the backward wave conditions (8.255), (8.256).
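The forward/backward classification above is easy to automate. The minimal sketch below (not from the book’s code base) picks the root of (8.250) with Im k > 0, evaluates the signs of the Poynting flux (8.252) and of the phase velocity, and compares the outcome with the Depine–Lakhtakia criterion (8.256). NumPy is assumed; constant positive factors such as ω are omitted since only signs matter.

```python
import numpy as np

def classify(eps, mu):
    k = np.sqrt(eps * mu)                  # one branch of (8.250)
    if k.imag < 0:
        k = -k                             # enforce Im k > 0, Eq. (8.251)
    poynting_sign = np.sign((k / mu).real)           # sign of Px in (8.252)
    phase_sign = np.sign(k.real)                     # sign of phase velocity
    backward = poynting_sign * phase_sign < 0
    depine_lakhtakia = eps.real / abs(eps) + mu.real / abs(mu) < 0   # (8.256)
    return backward, depine_lakhtakia

# Ordinary lossy dielectric vs. a double-negative ("Veselago") medium:
print(classify(4.0 + 0.01j, 1.0 + 0.01j))    # (False, False): forward wave
print(classify(-1.0 + 0.01j, -1.0 + 0.01j))  # (True, True): backward wave
```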

8.15.4 Backward Waves in Mandelshtam’s Chain of Oscillators

A classic case of backward waves in a chain of mechanical oscillators is due to L. I. Mandelshtam. His four-page paper [Man45]58 published by Mandelshtam’s coworkers in 1945 after his death is very succinct, so a more detailed exposition below will hopefully prove useful. An electromagnetic analogy of this mechanical example (an optical grating) is the subject of the following section.

58 The paper is also reprinted in Mandelshtam’s lecture course [Man47].

Consider an infinite 1D chain of masses, with the nearest neighbors separated by an equilibrium distance d and connected by springs with a spring constant f. Newton’s equation of motion for the displacement ξ_n of the nth mass m_n is


    ξ̈(n) = ω_n² [ξ(n − 1) − 2ξ(n) + ξ(n + 1)],    ω_n² = f/m_n    (8.257)

For brevity, dependence of ξ on time is not explicitly indicated. For waves at a given frequency ω, switching to complex phasors yields

    ω²ξ(n) + ω_n² [ξ(n − 1) − 2ξ(n) + ξ(n + 1)] = 0    (8.258)

Mandelshtam considers periodic chains of masses, focusing on the case with just two alternating masses, m₁ and m₂. The discrete analog of the Bloch wave then has the form

    ξ(n) = ξ_PER(n) exp(iqnd)    (8.259)

where q is the Bloch wavenumber. ξ_PER is a periodic function of n with the period of two and can hence be represented by a Euclidean vector ξ ≡ (a, b) ∈ R², where a and b are the values of ξ_PER(n) for odd and even n, respectively.59 Substituting this discrete Bloch-type wave into the difference equation (8.258), we obtain

    [ ω₂²(λ² + 1)     λ(ω² − 2ω₂²) ] [a]
    [ λ(ω² − 2ω₁²)    ω₁²(λ² + 1)  ] [b] = 0,    λ ≡ exp(iqd)    (8.260)

Hence, (a, b) is the null vector of the 2 × 2 matrix in the left-hand side of (8.260). Equating the determinant to zero yields two eigenfrequencies ω_B1, ω_B2 of the Bloch wave

    ω_B1,B2² = ω₁² + ω₂² ∓ λ⁻¹ √[(ω₁²λ² + ω₂²)(ω₂²λ² + ω₁²)]

To analyze group velocity of Bloch waves, compute the Taylor expansion of these eigenfrequencies around q = 0, keeping in mind that λ = exp(iqd):

    ω_B1² = [2d²ω₁²ω₂² / (ω₁² + ω₂²)] q²

    ω_B2² = 2(ω₁² + ω₂²) − [2d²ω₁²ω₂² / (ω₁² + ω₂²)] q²

59 Alternatively and equally well, ξ_PER can be represented via its two-term Fourier sum, familiar from discrete-time signal analysis:

    ξ_PER(n) = ξ̃(0) + ξ̃(1) exp(inπ) = ξ̃(0) + (−1)ⁿ ξ̃(1)

where

    ξ̃(0) = ½(ξ(0) + ξ(1));    ξ̃(1) = ½(ξ(0) − ξ(1))


which coincides with Mandelshtam’s formulas at the bottom of p. 476 of his paper. Group velocity v_g = ∂ω_B/∂q of long-wavelength Bloch waves is positive for the “acoustic” branch ω_B1 but negative for the “optical” branch ω_B2.60

For q = 0 (i.e. λ = 1), simple algebra shows that the components of the second null vector (a_B2, b_B2) of (8.260) are proportional to the two particle masses:

    a_B2 = −(m₂/m₁) b_B2    (q = 0)    (8.261)

(The first null vector a_B1 = b_B1, corresponding to the zero eigenfrequency for zero q, represents just a translation of the chain as a whole and is uninteresting.)

Next, consider energy transfer along the chain. The force that mass n − 1 exerts upon mass n is

    F_{n−1,n} = [ξ(n − 1) − ξ(n)] f

The mechanical “Poynting vector” is the power generated by this force:

    P_{n−1,n}(t) = F_{n−1,n}(t) ξ̇(n, t)

the time average of which, via complex phasors, is

    P_{n−1,n} = ½ Re[F_{n−1,n} (iωξ(n))*]

For the “optical” mode, i.e. the second eigenfrequency of oscillations, direct computation leads to Mandelshtam’s expression

    P = ½ f ω a b sin(qd)

The subscripts for P have been dropped because the result is independent of n, as should be expected from physical considerations: No continuous energy accumulation occurs in any part of the chain.

We have now arrived at the principal point in this example. For small positive q (qd ≪ 1), the Bloch wave has a long-wavelength component exp(iqnd). Phase velocity ω/q of the Bloch wave—in the sense discussed in more detail below—is positive. At the same time, the Poynting vector and hence the group velocity are negative because a_B2 and b_B2 have opposite signs in accordance with (8.261). Thus, mechanical oscillations of the chain in this case propagate as a backward wave. An electromagnetic analogy of such a wave is mentioned very briefly in Mandelshtam’s paper and is the subject of the following subsection.

60 On the acoustic branch, by definition, ω → 0 as q → 0; on optical branches, ω does not tend to 0.
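A small numerical experiment confirms this picture. The sketch below (illustrative masses and spring constant, not Mandelshtam’s values) evaluates the dispersion branches from the determinant of (8.260), checks that the optical branch has a negative group velocity dω/dq at small positive q while the long-wavelength factor exp(iqnd) advances in the +x direction, and reproduces the mode-shape ratio (8.261) at q = 0.

```python
import numpy as np

m1, m2, f, d = 1.0, 3.0, 1.0, 1.0              # illustrative parameters
w1s, w2s = f / m1, f / m2                      # omega_1^2, omega_2^2

def branches(q):
    lam = np.exp(1j * q * d)
    root = np.real(np.sqrt((w1s * lam**2 + w2s) * (w2s * lam**2 + w1s)) / lam)
    return np.sqrt(w1s + w2s - root), np.sqrt(w1s + w2s + root)   # (acoustic, optical)

q, dq = 0.1 * np.pi / d, 1e-6
w_ac, w_opt = branches(q)
vg_opt = (branches(q + dq)[1] - branches(q - dq)[1]) / (2 * dq)   # d(omega)/dq
print(f"optical branch: omega = {w_opt:.4f}, group velocity = {vg_opt:+.4f}")
print(f"phase velocity of the exp(iqnd) factor = {w_opt / q:+.4f}")

# Mode shape of the optical branch at q = 0: a/b = -m2/m1, Eq. (8.261)
w2_opt0 = 2 * (w1s + w2s)
a_over_b = -(w2_opt0 - 2 * w2s) / (2 * w2s)    # from the first row of (8.260), lam = 1
print(f"a/b at q = 0: {a_over_b:+.4f}  (expected {-m2 / m1:+.4f})")
```

The output shows a positive phase velocity of the long-wavelength component together with a negative group velocity on the optical branch—precisely the backward-wave regime described above.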


Fig. 8.50 Poynting components Pm of the first four Bloch waves a–d for the volume grating with ε(x) = 2 + cos 2πx. Solution with 41 plane waves; qa = π/10

8.15.4.1 Backward Waves in Mandelshtam’s Grating

We now revisit Example 35 (Sect. 8.6) of a 1D volume grating, to examine the similarity with Mandelshtam’s particle chain and the possible presence of backward waves. For definiteness, let us use the same numerical parameters as before and assume a periodic variation of the permittivity ε(x) = 2 + cos 2πx. The Bloch problem, in its algebraic eigenvalue form K²e = k₀²εe (8.119), Sect. 8.6 (or K²e = ω²μ₀εe in the SI system), was already solved numerically in Example 35 (Sect. 8.6), and the band diagram is presented in Fig. 8.7. We now discuss the splitting of the Poynting vector into the individual “Poynting components” P_m = k_m|e_m|²/(2ωμ) (8.129), SI system; this splitting has implications for the nature of the wave.

The distribution of P_m for the first four Bloch modes in the grating is displayed in Fig. 8.50. The first mode shown in Fig. 8.50a is almost a pure plane wave (P_{±1} are on the order of 10⁻⁵; P_{±2} are on the order of 10⁻¹³, and so on) and does not exhibit any unusual behavior. Let us therefore focus on mode #2 (upper right corner of the figure). There are four non-negligible harmonics altogether. The stems to the right of the origin (k_m > 0) correspond to plane wave components propagating to the right, i.e. in the +x direction. Stems to the left of the origin correspond to plane waves propagating to the left, and hence their Poynting values are negative. It is obvious from the figure that the negative components dominate and, as a result, the total Poynting value for the Bloch wave is negative.
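This per-harmonic analysis is straightforward to reproduce with a truncated plane wave expansion. The sketch below (a minimal NumPy/SciPy illustration, not the solver used for Fig. 8.50) assembles the Fourier-space Bloch problem for ε(x) = 2 + cos 2πx with period a = 1 and c = 1, extracts the second mode at q = 0.1π, and prints its Poynting components up to positive constant factors, for comparison with Table 8.5 below.

```python
import numpy as np
from scipy.linalg import eigh, toeplitz

M = 20                                    # harmonics m = -M..M (41 plane waves)
m = np.arange(-M, M + 1)
q = 0.1 * np.pi
k = q + 2 * np.pi * m                     # the Bloch "comb" of wavenumbers

col = np.zeros(2 * M + 1)
col[0], col[1] = 2.0, 0.5                 # Fourier coefficients of 2 + cos(2*pi*x)
E = toeplitz(col)                         # permittivity convolution matrix
K2 = np.diag(k ** 2)

omega2, modes = eigh(K2, E)               # K^2 e = omega^2 eps e   (c = 1)
e = modes[:, 1]                           # second Bloch mode
P = k * np.abs(e) ** 2                    # Poynting components, positive factors dropped

print("second eigenfrequency:", np.sqrt(omega2[1]))      # ~4.276, cf. the text
print("sign of the total Poynting flux:", np.sign(P.sum()))
for mm in range(-3, 4):                   # the harmonics listed in Table 8.5
    print(f"k_m/pi = {k[mm + M] / np.pi:+.1f}   P_m = {P[mm + M]:+.3e}")
```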


Table 8.5 Principal components of the second Bloch mode in the grating

    k_m/π    e_m        P_m
    −5.9    −0.0023    −1.79 × 10⁻⁵
    −3.9    −0.0765    −0.013
    −1.9    −0.948     −0.997
     0.1     0.174      0.00177
     2.1     0.253      0.0783
     4.1     0.0179     0.000767
     6.1     0.000495   8.73 × 10⁻⁷

The numerical values of the Poynting components and of the amplitudes of the plane wave harmonics are summarized in Table 8.5.

Now, the characterization of this wave as forward or backward hinges on the definition and sign of phase velocity. The smallest absolute value of the wavenumber in the Bloch “comb” q = 0.1π determines the plane wave component with the longest wavelength (bold numbers in Table 8.5). If one defines phase velocity v_ph = ω/q based on q = 0.1π, then phase velocity is positive, and since the Poynting vector was found to be negative, one has a backward wave. However, the amplitude of the q = 0.1π harmonic (e₀ ≈ 0.174) is much smaller by the absolute value than that of the q − κ₀ = −1.9π wave (italics in the table). A common convention (P. Yeh [Yeh79], B. Lombardet et al. [LDFH05]) is to use this highest-amplitude component as a basis for defining phase velocity. If this convention is accepted in our present example, then phase velocity becomes negative and the wave is a forward one (since the Poynting vector is also negative).

One may then wonder what the value of phase velocity “really” is. This question is not a mathematically sound one, as one cannot truly argue about mathematical definitions. From the physical viewpoint, however, two aspects of the notion of phase velocity are worth considering. First, boundary conditions at the interface between two homogeneous media are intimately connected with the values of phase velocities and indexes of refraction (defined for homogeneous materials in the usual unambiguous sense). Fundamentally, however, it is the wave vectors in both media that govern wave propagation, and it is the continuity of their tangential components that constrains the fields. Phase velocity plays a role only due to its direct connection with the wavenumber. For periodic structures, there is not one but a whole “comb” of wavenumbers that all need to be matched at the interface. We shall return to this subject in Sect. 8.15.5. Second, in many practical cases phase velocity can be easily and clearly visualized. As an example, Fig. 8.51 shows two snapshots, at t = 0 and t = 0.5, of the second Bloch mode described above. For the visual clarity of this figure, low-pass filtering has been applied—without that filtering, the rightward motion of the wave is obvious


Fig. 8.51 Two snapshots, at t = 0 and t = 0.5, of the second Bloch mode (low-pass filtering applied for visual clarity). The wave moves to the right with phase velocity corresponding to the smallest positive Bloch wavenumber q = 0.1π

in the animation but is difficult to present in static pictures. The Bloch wavenumber in the first Brillouin zone in this example is q = 0.1π, and the corresponding second eigenfrequency is ω ≈ 4.276. The phase velocity—if defined via the first Brillouin zone wavenumber—is v_ph = ω/q ≈ 4.276/0.1π ≈ 13.61. Over the time interval Δt = 0.5 between the snapshots, the displacement of the wave consistent with this phase velocity is 13.61 × 0.5 ≈ 6.8. This corresponds quite accurately to the actual displacement in Fig. 8.51, proving that the first Brillouin zone wavenumber is indeed relevant to the perceived visual motion of the Bloch wave.

So, what is one to make of all this? The complete representation of a Bloch wave is given by a comb of wavenumbers q + mκ₀ and the respective amplitudes e_m of the Fourier harmonics. Naturally, one is inclined to distill this theoretically infinite set of data to just a few parameters that include the Poynting vector, phase and group velocities.61 While the Poynting vector and group velocity for the wave are rigorously and unambiguously defined, the same is in general not true for phase velocity. However, there are practical cases where phase velocity is meaningful. The situation is clear-cut when the Bloch wave has a strongly dominant long-wavelength component. (This case will become important in Sect. 8.15.6.) Then, the Bloch wave is, in a sense, close to a pure plane wave, but non-trivial effects may still arise. Even though the amplitudes of the individual higher-order harmonics may be small, it is possible for their collective effect to be significant. In particular, as the example in this section has shown, the higher harmonics taken together may carry more energy than the dominant component and in the opposite direction. In this case, one has a backward wave, where phase velocity is defined by the dominant long-wavelength harmonic, while the Poynting vector is due to a collective contribution of all harmonics.

61 As a mathematical trick, any finite (or even any countable) set of numbers can always be combined into a single one simply by intermixing the decimals: For example, e = 2.71828... and π = 3.141592... can be merged into 2.3711481258.... Of course, this is not a serious proposition in physics.


An alternative generalization of phase velocity in 1D is the velocity v_field of points with a fixed magnitude of the E field. From the zero differential

    dE = (∂E/∂x) dx + (∂E/∂t) dt = 0

one obtains

    v_field = dx/dt = − (∂E/∂t) / (∂E/∂x)

(see also Eqs. (8.34), Sect. 8.3.2 and (8.45), Sect. 8.3.3). Unfortunately, this definition does not generalize easily to 2D and 3D, where an analogous velocity would be a tensor quantity (a separate velocity vector for each Cartesian component of the field).

8.15.5 Backward Waves and Negative Refraction in Photonic Crystals

8.15.5.1 Introduction

As noted in Sect. 8.15, R. Zengerle in the late 1970s–early 1980s examined and observed negative refraction in singly and doubly periodic waveguides. In 2000, M. Notomi [Not00] noted similar effects in photonic crystals. For crystals with a sufficiently strong periodic modulation, there may exist a physically meaningful effective index of refraction within certain frequency ranges near the band edge. Under such conditions, anomalous refractive effects can arise at the surface of the crystal. Negative refraction is one of these possible effects. Another one is “open cavity” formation where light can run around closed paths in a structure with alternating positive–negative index of refraction (Fig. 8.52), even though there are no reflecting walls. Notomi’s specific example involves a 2D GaAs (index n ≈ 3.6) hexagonal photonic crystal, with the diameter of the rods equal to 0.7 of the cell size.

In the 2000s, there were a number of publications on negative refraction and the associated lensing effects in photonic crystals. To name just a few:

1. The photonic structure proposed by C. Luo et al. [LJJP02] is a bcc lattice of air cubes in a dielectric with the relative permittivity of ε = 18. The dimension of the cubes is 0.75 a, and their sides are parallel to those of the lattice cell. The computation of the band diagram and equifrequency surfaces in the Bloch space, as well as FDTD simulations, demonstrate “all-angle negative refraction” (AANR), i.e. negative refraction for all angles of the incident wave at the air–


Fig. 8.52 [After M. Notomi [Not00].] “Open cavity” formation: light rays can form closed paths in a structure with alternating positive–negative index of refraction

crystal interface. AANR occurs in the frequency range from 0.375(2πc/a) to 0.407(2πc/a) in the third band.
2. E. Cubukcu et al. [CAO+03] experimentally and theoretically demonstrate negative refraction and superlensing in a 2D photonic crystal in the microwave range. The crystal is a square array of dielectric rods in air, with the relative permittivity of ε = 9.61, diameter 3.15 mm and length 15 cm. The lattice constant is 4.79 mm. Negative refraction occurs in the frequency range from 13.10 to 15.44 GHz.
3. R. Moussa et al. [MFZ+05b] experimentally and theoretically studied negative refraction and superlensing in a triangular array of rectangular dielectric bars with ε = 9.61. The dimensions of each bar are 0.40a × 0.80a, where the lattice constant a = 1.5875 cm. The length of each bar is 45.72 cm. At the operational frequency of 6.5 GHz, which corresponds to λ_air ≈ 4.62 cm and a/λ_air ≈ 0.344, the effective index is n ≈ −1 with very low losses. Only TM modes are considered (the E field parallel to the rods).
4. V. Yannopapas and A. Moroz [YM05] show that negative refraction can be achieved in a composite structure of polaritonic spheres occupying the lattice sites. A specific example involves LiTaO₃ spheres with the radius of 0.446 µm; the lattice constant is 1.264 µm, so that the fcc lattice is almost close-packed. Notably, the wavelength-to-lattice-size ratio is quite high, 14:1, but the relative permittivity of materials is also very high, on the order of 10².
5. M. S. Wheeler et al. [WAM06], independently of Yannopapas and Moroz, study a similar configuration. Wheeler et al. show that a collection of polaritonic spheres coated with a thin layer of Drude material can exhibit a negative index of refraction at infrared frequencies. The existence of negative effective magnetic permeability is due to the polaritonic material, while the Drude material is responsible for negative effective electric permittivity. The negative index region is centered at 3.61 THz, and the value of n_eff = −1, important for subwavelength focusing, is approached. The cores of the spheres are made of LiTaO₃, and their radius is


4 µm. The coatings have the outer radius of 4.7 µm, and their Drude parameters are ω_p/2π = 4.22 THz, Γ = ω_p/100. The filling fraction is 0.435.
6. S. Foteinopoulou and C. M. Soukoulis [FS03, FS05] provide a general analysis of negative refraction at the air–crystal interfaces and, as a specific case, examine Notomi’s example (a 2D hexagonal lattice of rods with permittivity 12.96 and the radius of 0.35 lattice size).
7. P. V. Parimi et al. [PLV+04] analyze and observe negative refraction and left-handed behavior of the waves in microwave crystals. The structure is a triangular lattice of cylindrical copper rods of height 1.26 cm and radius 0.63 cm. The ratio of the radius to lattice constant is 0.2. The TM-mode excitation is at frequencies up to 12 GHz. Negative refraction is observed, in particular, at 9.77 GHz.

For the analysis of anomalous wave propagation and refraction, it is important to distinguish intrinsic and extrinsic characteristics, as explained in the following subsection.

8.15.5.2 “Extrinsic” and “Intrinsic” Characteristics

This terminology, albeit not standard, reflects the nature of wave propagation and refraction in periodic structures such as photonic crystals and metamaterials. Intrinsic properties of the wave imply its characterization as either forward or backward, that is, whether the Poynting vector and phase velocity (if it can be properly defined) are in the same or opposite directions (or, more generally, at an acute or obtuse angle). Extrinsic properties refer to conditions at the interface of the periodic structure and air or another homogeneous medium. The key point is that refraction at the interface depends not only on the intrinsic characteristics of the wave in the bulk, but also on the way the Bloch wave is excited. This can be illustrated as follows.

Let the x-axis run along the interface boundary between air and a material with a-periodic permittivity ε(x). For simplicity, we assume that ε does not vary along the normal coordinate y. Such a periodic medium can support Bloch E-modes of the form

    E(r) = Σ_{m=−∞}^{+∞} e_m exp(imκ₀x) exp(iq_x x) exp(iq_y y)

lucid term due to B. Lombardet et al. [LDFH05].

544

8 Applications in Nano-Photonics

First, suppose that the “main” channel (m = 0) is used, so that qx = k xair . If the Bloch wave in the material is a forward one, then the y-components of the Poynting vector Py and the wave vector q y are both directed away from the interface, and the usual positive refraction occurs. If, however, the wave is backward, then q y is directed toward the surface (against the Poynting vector) and it can easily be seen that refraction is negative. This is completely consistent with Mandelshtam’s explanation quoted on Sect. 8.15. Exactly, the opposite will occur if the Bloch wave is excited through an excitation channel where qx + κ0 m is negative (say, for m = 1). The matching condition at the interface then implies that the x-component of the wave vector in the air is negative in this case. Repeating the argument of the previous paragraph, one discovers that for a forward Bloch wave refraction is now negative, while for a backward wave it is positive. In summary, refraction properties at the interface are a function of the intrinsic characteristics of the wave in the bulk and the excitation channel, with four substantially different combinations possible. This conclusion summarizes the results already available but dispersed in the literature [BST04, LDFH05, GMKH05].

8.15.5.3

Negative Refraction in Photonic Crystals: Case Study

To illustrate the concepts discussed in the sections above, let us consider, as one of the simplest cases, the structure proposed by R. Gajic and R. Meisels et al. [GMKH05, MGKH06]. Their photonic crystal is a 2D square lattice of alumina rods (rod = 9.6) in air. The radius of the rod is rrod = 0.61 mm, the lattice constant a = 1.86 mm, so that rrod /a ≈ 0.33. The length of the rods is 50 mm. Gajic and Meisels et al. study various cases of wave propagation and refraction. In the context of this section, of most interest to us is negative refraction for small Bloch numbers in the second band of the H -mode. The band diagram for the H -mode appears in Fig. 8.53. The diagram, computed using the plane wave method with 441 waves, is very close (as of course it should be) to the one provided by Gajic et al. Fig. 8.53 shows the normalized frequency ω˜ = ωa/(2πc); in the Gajic paper, the diagram is for the absolute frequency f = ω/2π = ωc/a. ˜ We observe that the second-band dispersion curve is mildly convex around q = 0 (ω˜ ≈ 0.427), indicating a negative second derivative ∂ 2 ω/∂q 2 and hence a negative group velocity for small positive q and a possible backward wave. As we are now aware, an additional condition for a backward wave must also be satisfied: The plane wave component corresponding to the small positive Bloch number must be appreciable (or better yet, dominant). Let us therefore consider the plane wave composition of the Bloch wave. The amplitudes of the plane wave harmonics for the Gajic et al. crystal are shown in Fig. 8.54. For q = 0, the spectrum is symmetric and characteristic of a standing wave. As q becomes positive and increases, the spectrum gets skewed, with the backward components (km < 0) increasing and the forward ones decreasing.

8.15 Backward Waves, Negative Refraction and Superlensing

545

Fig. 8.53 H -mode band diagram of the Gajic et al. crystal

Fig. 8.54 Amplitudes h m of the plane wave harmonics for the Gajic et al. crystal (arb. units). Second-band H -mode near q = 0 on the Γ → X line

The numerical values of the amplitudes of a few spatial harmonics from Fig. 8.54 are also listed in Table 8.6 for reference. From the figure and table, it can be seen that the amplitudes of the spatial harmonics of this Bloch wave in the first Brillouin zone (the first four rows of numbers in the table) are quite small. It is therefore debatable whether a valid phase velocity can be attributed to this wave. The Bloch wave itself is pictured in Fig. 8.55 for illustration. The distribution of Poynting components of the same wave and for the same set of values of the Bloch wavenumber is shown in Fig. 8.56. It is clear from the figure that

546

8 Applications in Nano-Photonics

Table 8.6 Amplitudes of the spatial harmonics of the second-band Bloch wave for the Gajic et al. photonic crystal Normalized wavenumber qx a/π Amplitude h m of the plane wave harmonic 0 0.1 0.2 0.4 2 2.1 2.2

0 0.00124 0.00477 0.0159 0.4023 0.3589 0.3175

Fig. 8.55 H field of the second H -mode (arb. units) for the Gajic et al. crystal. Point q = 0.2π on the Γ → X line

the negative components outweigh the positive ones, so power flows in the negative direction.

8.15.6 Are There Two Species of Negative Refraction? Negative refraction is commonly classified as two species: first, homogeneous materials with double-negative effective material characteristics, as stipulated in Veselago’s original paper [Ves68]; second, periodic dielectric structures (photonic crystals) capable of supporting modes with group and phase velocity at an obtuse angle to one another. The second category has been extensively studied theoretically, and negative refraction has been observed experimentally (see the list on Sect. 8.15.5.3). Truly homogeneous materials, in the Veselago sense, are not currently known and could be found in the future only if some new molecular-scale magnetic phenomena are discovered. Consequently, much effort has been devoted to the development

8.15 Backward Waves, Negative Refraction and Superlensing

547

Fig. 8.56 Plane wave Poynting components Pm for the Gajic et al. crystal (arb. units). Second H -mode near the Γ point on the Γ → X line

of artificial metamaterials capable of supporting backward waves and producing negative refraction. Selected developments of this kind, going back to early 2000, are summarized in Table 8.7. The numerical values in the table are approximate, and the list is in no way exhaustive. The right column of the table displays an important parameter: the ratio of the lattice cell size to the vacuum wavelength. One might have hoped that further improvements in nanofabrication and design could bring the cell size down to a small fraction of the wavelength, thereby approaching the Veselago case of a homogeneous material. However, the main message of this section is that the cell size is constrained not only by the fabrication technologies. There are fundamental limitations on how small the lattice size can be for negative index materials. Homogeneous negative index materials may not in fact be realizable as a limiting case of spatially periodic dielectric structures with a small cell size. The following analysis, available in a more detailed form in [Tsu08], shows that negative refraction disappears in the homogenization limit when the size of the lattice cells tends to zero, provided that other physical parameters, including frequency, are fixed. To streamline the mathematical development, let us focus on square Bravais lattice cells with size a in 2D and introduce dimensionless coordinates x˜ = x/a, y˜ = y/a, so that in these tilde coordinates the 2D problem is set up in the unit square. (The 3D case is considered in [Tsu08].) The E-mode in the tilde coordinates is described by the familiar 2D wave equation ∇˜ 2 E + ω˜ 2 r E = 0,

(8.262)

where the exp(+iωt) phasor convention and notation from [Tsu08] are used, and ω˜ =

a ωa = 2π c λ0

(8.263)

548

8 Applications in Nano-Photonics

Table 8.7 Selected designs and parameters of negative index metamaterials. The numerical values are approximate f

λ

a

a/λ

2000

D. R. Smith et al. [SPV+00]

Copper SRR and wires

4.85 GHz

6.2 cm

8 mm

0.13

2001

R. A. Shelby et al. [SSS01]

Copper SRR and strips

10 GHz

3 cm

5 mm

0.17

2003

C. G. Parazzoli et al. [PGL+03]

A stack of SRRs with metal strips

12.6 GHz

2.38 cm

0.33 cm

0.14

2003

A. A. Houck et al. [HBC03]

Composite wire and SRR prisms

10 GHz

3 cm

0.6 cm

0.2

2004

D. R. Smith & D. C. Vier [SV04]

Copper SRR and strips

11 GHz

2.7 cm

3 mm

0.11

2005

V. M. Shalaev et al. [Sha06]

Pairs of nanorods

200 THz

1.5 µm

2005

S. Zhang et al. [ZFP+05]

Nano-fishnet (circular voids in metal)

150 THz

2 µm

2005– 2006

S. Zhang et al. [ZFM+05, ZFM+06]

Nano-fishnet with rectangular/ elliptical voids

215, 170 THz

1.4, 1.8 µm

0.801, 0.787 µm

0.57, 0.44

2006– 2007

G. Dolling et al. [DEW+06, DWSL07]

Nano-fishnet with rectangular voids

210, 380 THz

1.45, 0.78 µm

0.6, 0.3 µm

0.41, 0.38

Year

Publication

Design

0.64 × 1.8 µm 0.838 µm

0.42–1.2 0.42

Here, c and λ0 are the speed of light and the wavelength in free space, respectively. The relative permittivity r is a periodic function of coordinates over the lattice. The fundamental solutions of the field equation are a Bloch wave; in the tilde coordinates, E(˜r) = EPER (˜r) exp(−i q˜ B · r˜ )

(8.264)

where r˜ is the position vector. As in Sect. 8.7.2, it is convenient to view this Bloch wave as a suite of spatial Fourier harmonics (plane waves): E(˜r) =

n

En ≡

n

e˜ n exp(i2πn · r˜ ) exp(−i q˜ B · r˜ )

(8.265)

(Summation in this and subsequent equations is over the integer lattice Z2 .) As also noted in Sect. 8.7.2, the time- and cell-averaged Poynting vector P = 21 Re{E × H∗ } can be represented as the sum of the Poynting vectors for the individual plane waves [LDFH05]:

8.15 Backward Waves, Negative Refraction and Superlensing

P =

n

Pn ;

Pn =

549

πn |˜en |2 ωμ ˜ 0

(8.266)

As we know, in Fourier space the scalar wave equation (8.262) becomes |q˜ B − 2πn|2 e˜n = ω˜ 2

m

˜n−m e˜ m ,

n ∈ Z2

(8.267)

where ˜n are the Fourier coefficients of the dielectric permittivity :  =

n

˜n exp (i2πn · r˜ )

(8.268)

The normalized band diagram, such as the one in Fig. 8.53, indicates that negative refraction disappears in the homogenization limit when the size of the lattice cells tends to zero, provided that other physical parameters, including frequency, are fixed. Indeed, the homogenization limit is obtained by considering the small cell size—long wavelength condition a → 0, q˜ → 0 (see [SEK+05, Sjö05] for additional mathematical details). As these limits are taken, the problem and the dispersion curves in the ˜ approaches normalized coordinates remain unchanged, but the operating point (ω, ˜ q) the origin along a fixed dispersion curve—the acoustic branch. In this case, phase ˆ ω/ql = ω/ ˜ q˜l , is well defined and equal to group velocity in any given direction l, velocity ∂ω/∂ql simply by definition of the derivative. No backward waves can be supported in this regime. This conclusion is not surprising from the physical perspective. As the size of the lattice cell diminishes, the operating frequency increases, so that it is not the absolute frequency ω but the normalized quantity ω˜ that remains (approximately) constant. Indeed, a principal component of metamaterials with negative refraction is a resonating element [SPV+00, SV04, Ram05, Sha06] whose resonance frequency is approximately inverse proportional to size [LED+06]. It is pivotal here to make a distinction between strongly and weakly inhomogeneous cases of wave propagation. The latter is intended to resemble an ideal “Veselago medium,” with the Bloch wave being as close as possible to a long-length plane wave. Toward this end, the following conditions characterizing the weakly inhomogeneous backward wave regime are put forth: • The first Brillouin zone component of the Bloch wave must be dominant; this component then defines the phase velocity of the Bloch wave. • The other plane wave components collectively produce energy flow at an obtuse angle to phase velocity. • The lattice cell size a is small relative to the vacuum wavelength λ0 ; a/λ0  1. • At the air–material interface, it is the long-wavelength, first Brillouin zone, plane wave component that serves as the excitation channel for the Bloch wave. If any of the above conditions are violated, the regime will be characterized as strongly inhomogeneous: The EM wave can “see” the inhomogeneities of the material. By this definition, in the weakly inhomogeneous case the normalized Bloch wavenumber q˜ must be small, q˜ ≡ qa  π. Larger values of q would indicate a strongly

550

8 Applications in Nano-Photonics

inhomogeneous (or, synonymously, “photonic crystal” or “grating”) regime, where the lattice size is comparable with the Bloch wavelength. Let us show that, under reasonable physical assumptions, backward waves cannot be supported in the weakly inhomogeneous case; strong inhomogeneity is required, ˜ when the medium and there is a lower bound for the relative cell size a/λ0 = ω/2π could still support backward waves. To simplify mathematical analysis, we focus on the limiting case q = 0, but the conclusions will apply, by physical continuity, to small q˜ = qa  1. We first turn to the E-mode governed by the 2D equation (8.262): (8.269) E = −η ∇˜ 2 E, η = ω˜ −2 (ω˜ = 0) Further analysis relies on the inversion of ∇˜ 2 . To do this unambiguously, let us split E up into the zero-mean term E ⊥ and the remaining constant E 0 : E = E 0 + E ⊥ . Symbol “⊥” indicates orthogonality to the null space of the Laplacian (i.e. to constants). To eliminate the constant component E 0 , we integrate (8.269) over the lattice cell. Integrating by parts and noting that the boundary term vanishes due to the periodic boundary conditions (q = 0), we get 

E 0 = − ˜−1 0



E ⊥ d,

˜0 = 0

(The exceptional case ˜0 = 0 is mathematically quite intricate and may constitute a special research topic.) With E 0 eliminated, the eigenvalue problem for E ⊥ becomes !  E⊥ −

˜−1 0

 

"  E ⊥ d = −η ∇˜ 2 E ⊥

Since E ⊥ by definition is zero mean, 

∇˜ ⊥−2  E ⊥ − ˜−1 0 (, E ⊥ )

= − η E⊥

(8.270)

where ∇˜ ⊥−2 is the zero-mean inverse of the Laplacian. #Fourier # analysis easily shows # # that this inverse is bounded (the Poincaré inequality): #∇˜ ⊥−2 # ≤ (4π 2 )−1 . Then, taking the norm of both sides of (8.270), we get |η| ≤ (4π 2 )−1 ||max (1 + ||max / |˜0 |)

(8.271)

This result that can be viewed as a generalization of the Poincaré inequality to cases with variable r leads to a simple lower bound for the lattice cell size, with the mean and maximum values of  as parameters: 

a λ0

2 =

ω˜ 2 1 ≥ 4π 2 ||max (1 + ||max / |˜0 |)

(8.272)

This completes the theoretical derivation of lattice size bounds for the E-mode.

8.15 Backward Waves, Negative Refraction and Superlensing

551

The main conclusion is that for periodic structures capable of supporting backward waves and producing negative refraction, the lattice cell size, as a fraction of the vacuum wavelength and/or the Bloch wavelength, must be above certain thresholds. These thresholds depend on the maximum, minimum and mean values of the complex dielectric permittivity as key parameters. In the presence of good conductors (e.g. at microwave frequencies), such theoretical constraints are not very restrictive. However, at optical frequencies and/or for nonmetallic structures the bounds on the cell size must be honored and may help to design metamaterials and photonic crystals with desired optical properties.

8.16 Appendix: The Bloch Transform In Sects. 8.4–8.11, we had an occasion to consider individual Bloch–Floquet waves in periodic structures. More generally, the electric or magnetic field can be represented via the Bloch Transform—a “continuous superposition” of Bloch waves. It can be viewed as a reformulation of the Fourier transform and is considered in this Appendix in 1D. The general idea is rather simple: As we know, a Bloch wave in Fourier space is just a “comb” of plane waves at the spatial frequencies q + mκ0 (m = 0, ±1, ±2, . . .). A generic Fourier transform can be viewed as a continuous superposition of such combs for a varying q. The details are as follows. Let a function f (x) be expressed via its Fourier transform F(k): f (x) =

1 2π





F(k) exp(ikx) dk

(8.273)

−∞

Let the k-axis be subdivided into intervals of equal length κ0 : [mκ0 , (m + 1)κ0 ], m = 0, ±1, ±2, . . . . The Fourier transform F(k) can then be viewed as a superposition of “Bloch combs” . . . , q − 2κ0 , q − κ0 , q, q + κ0 , q + 2κ0 , . . . with q ∈ [0, κ0 ]; one such comb is shown in Fig. 8.57 for illustration. More precisely, Fourier integration can be broken up into “combs” by setting k = q + mκ0 , which yields 1 f (x) = 2π



κ0 0





 F(q + mκ0 ) exp(imκ0 x) exp(iq x) dq

(8.274)

m=−∞

For a fixed q, the expression in the curly brackets is a Fourier series, with the coefficients Fm = F(q + mκ0 ). The sum of this series is a periodic function of x, with the period a = 2π/κ0 . Let us denote this sum with f PER (q, x), where “PER” implies

552

8 Applications in Nano-Photonics

Fig. 8.57 From Fourier to Bloch: Fourier transform as a superposition of “Bloch combs”

periodicity with respect to x: f PER (q, x) ≡



F(q + mκ0 ) exp(imκ0 x)

(8.275)

m=−∞

With this in mind, the Bloch transform is, in essence, Fourier transform in terms of the Bloch variable q: f (x) =

1 2π



κ0

f PER (q, x) exp(iq x) dq

(8.276)

0

8.17 Appendix: Eigenvalue Solvers The computational work in the bandgap structure calculation is dominated by the solution of the eigenvalue problem. This area of numerical analysis is remarkably rich in ideas and includes several classes of methods. An excellent compendium of computational strategies for solving eigenvalue problems is the Templates book [BDD+00]. The following quote from this book identifies the key questions for choosing the best strategy: 1. Mathematical properties of the problem: Is it a Hermitian (real symmetric, self-adjoint) or a non-Hermitian eigenproblem? Is it a standard problem involving only one matrix or a generalized problem with two matrices? 2. Desired spectral information: Do we need just the smallest eigenvalue, a few eigenvalues at either end of the spectrum, a subset of eigenvalues “inside” the spectrum, or most eigenvalues? Do we want associated eigenvectors, invariant subspaces, or other quantities? How much accuracy is desired? 3. Available operations and their costs: Can we store the full matrix as an array and perform a similarity transformation on it? Can we solve a linear system with the matrix (or shifted matrix) by a direct factorization routine, or perhaps an iterative method? Or can we only multiply a vector by the matrix, and perhaps by its transpose? If several of these operations are possible, what are their relative costs?

8.17 Appendix: Eigenvalue Solvers

The computational work in the bandgap structure calculation is dominated by the solution of the eigenvalue problem. This area of numerical analysis is remarkably rich in ideas and includes several classes of methods. An excellent compendium of computational strategies for solving eigenvalue problems is the Templates book [BDD+00]. The following quote from this book identifies the key questions for choosing the best strategy:

1. Mathematical properties of the problem: Is it a Hermitian (real symmetric, self-adjoint) or a non-Hermitian eigenproblem? Is it a standard problem involving only one matrix or a generalized problem with two matrices?
2. Desired spectral information: Do we need just the smallest eigenvalue, a few eigenvalues at either end of the spectrum, a subset of eigenvalues “inside” the spectrum, or most eigenvalues? Do we want associated eigenvectors, invariant subspaces, or other quantities? How much accuracy is desired?
3. Available operations and their costs: Can we store the full matrix as an array and perform a similarity transformation on it? Can we solve a linear system with the matrix (or shifted matrix) by a direct factorization routine, or perhaps an iterative method? Or can we only multiply a vector by the matrix, and perhaps by its transpose? If several of these operations are possible, what are their relative costs?

553

The Templates book is perhaps the best starting point for studying eigenvalue algorithms. G. H. Golub and H. A. van der Vorst have written excellent reviews of the subject [GvdV00, vdV04]. In addition to the classical monographs by J. H. Wilkinson [Wil65], G. H. Golub and C. F. van Loan [GL96] and B. N. Parlett [Par80], more recent books by Y. Saad [Saa92], J. W. Demmel [Dem97] and L. N. Trefethen and D. Bau [Tre97] are highly recommended. This section summarizes the eigenvalue methods relevant to the PBG analysis. The answers to the Templates questions quoted above are different for Fourier-space and finite element algorithms; we therefore deal with these two approaches separately. Let us start with plane wave expansion and assume that a regular (rather than generalized) eigenvalue problem is solved; this is the case when all coordinatedependent quantities are incorporated into the differential operator on the left-hand side of the eigenvalue equation and do not appear on the right. The system matrix is generally full because products in real space turn into convolutions in the Fourier space, with all spatial harmonics coupled. The matrix is Hermitian for the H -problem and non-Hermitian for the E-problem. The standard solution procedure has two stages. At the first stage, the matrix is converted to a simpler form—usually upper Hessenberg form (upper triangular plus one lower subdiagonal)—by a sequence of orthogonal similarity transformations. The transformations are usually Householder reflections with respect to judiciously chosen hyperplanes. The purpose of conversion to upper Hessenberg form is to make QR iterations (see below) much more efficient. Figure 8.58 shows a geometric illustration of a Householder reflection. The picture is drawn just in two dimensions because the hypervolume of this book is limited. The hyperplane of reflection is orthogonal to a certain vector u that does not have to be a unit vector. The projection of an arbitrary vector x onto u is easily calculated to be xu = u

uT x uT u

The “mirror image” of x with respect to the hyperplane is therefore uT x x = x − 2xu = x − 2u T = u u 



uu T I −2 T u u

 x

The matrix in the parentheses formally defines the Householder reflection and is easily shown to be symmetric and orthogonal. It can also be demonstrated (see any of the monographs cited above) that a series of suitable Householder transforms will convert any matrix, column by column, to upper Hessenberg (“almost triangular”) form. Since orthogonal similarity transforms preserve symmetry, for a symmetric (Hermitian in the complex case) matrix the upper Hessenberg form is also symmetric (Hermitian) and hence tridiagonal.

554

8 Applications in Nano-Photonics

Fig. 8.58 A geometric illustration of Householder reflection. An arbitrary vector x is reflected with respect to a hyperplane orthogonal to some vector u

The second stage of the typical eigenvalue algorithm consists in applying QR iterations to the upper Hessenberg (tridiagonal in the symmetric case) matrix.63 As known from linear algebra, any matrix A can be factored as A = QR

(8.277)

where Q is an orthogonal matrix (Q T Q = I ) in the real case or a unitary matrix Q ∗ Q = I ) in the complex case; R is an upper triangular matrix. This factorization is applicable not only to square matrices, but to rectangular m × n matrices with m ≥ n as well. However, we only need to consider square matrices here. The QR decomposition is not a similarity transform and therefore is not by itself suitable for eigenvalue analysis. A similarity transform is obtained by multiplying Q and R in the opposite order: A = R Q = Q ∗ AQ

(8.278)

where the complex case is assumed for generality and R is expressed as Q ∗ A from (8.277). Remarkably, by repeating these two operations (QR decomposition and then RQ multiplication) one obtains a sequence of matrices A, A , A , . . . that usually converges very rapidly to a triangular matrix (diagonal in the symmetric/ 63 QR iterations should not be confused with QR decomposition. The iterative process does have QR factorization as its central part but also involves RQ multiplication and shifts, as summarized in the text below.
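To make the two-stage procedure concrete, here is a small MATLAB experiment (a sketch of mine, not the implementation behind any results in this book): a symmetric matrix with known eigenvalues is reduced to tridiagonal form by the built-in hess function and then subjected to unshifted QR iterations; production codes such as eig use the shifted variant mentioned in the footnote.

n = 6;
[Q0, ~] = qr(randn(n));                    % a random orthogonal matrix
A = Q0 * diag([10 7 5 3 2 1]) * Q0';       % symmetric test matrix with known eigenvalues

T = hess(A);                               % stage 1: Householder reduction (tridiagonal, since A = A')

for iter = 1:40                            % stage 2: unshifted QR iterations (illustration only)
    [Q, R] = qr(T);
    T = R*Q;                               % T <- Q'*T*Q, an orthogonal similarity transform
end

disp(norm(tril(T, -1)));                   % the subdiagonal entries are now small
disp(sort(diag(T))');                      % ~ [1 2 3 5 7 10], the eigenvalues of A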


Since similarity transforms preserve the eigenvalues, the diagonal of this triangular matrix in fact contains the eigenvalues of the original matrix. The eigenvectors of the original matrix can be computed by "undoing" the orthogonal similarity transforms.

The first stage of the overall procedure—the transformation to upper Hessenberg (or tridiagonal, in the symmetric/Hermitian case) form—for a full n × n matrix requires O(n³) operations. QR iterations, the second stage of the procedure, typically exhibit quadratic convergence at least; hence, the number of iterations per eigenvalue is virtually independent of the size of the system, and the total operation count for the QR stage is O(n²) if only the eigenvalues are sought. If the eigenvectors are also needed, the cost is O(n³). In summary, the total computational cost for the standard eigenvalue solver with Householder reflections and QR iterations is, as a rule, asymptotically proportional to the cube of the system size. In practical applications to photonic bandgap structure calculation, the computational cost limits the number of PWE terms and hence the accuracy of the solution.

QR-based methods are usually viewed as direct solvers. An alternative is iterative eigenvalue solvers. Strictly speaking, all eigenvalue solvers for systems of dimension greater than four are, by necessity, iterative (as there is no general way of computing the roots of the corresponding characteristic polynomial in a finite number of operations). However, due to the fast convergence of QR iterations, the direct part (Householder reflections) is dominant, and for practical purposes QR solvers are viewed as direct.

Iterative methods do not require explicit access to the matrix entries, as long as a matrix–vector multiplication procedure is available. The system matrix (or matrices, for the generalized eigenproblem) remains unchanged in the course of the iterations, and hence for sparse matrices no additional fill-in is created (i.e. zero entries remain zero). Moreover, in large-scale finite element simulations matrix–vector operations can be carried out on an element-by-element basis, without having to store the entries of the global matrix. However, additional memory is needed for an auxiliary set of orthogonal vectors, as described below.

The literature on iterative methods is vast, and the algorithms are quite elaborate. Here, I only highlight the main ideas, from the perspective of solving photonic bandgap problems; further details and references can be found in the monographs and review papers already cited in Sect. 8.17.

A key part of iterative algorithms is projection onto a suitably chosen subspace—in practice, with a small number of dimensions. Let us consider an eigenvalue problem Ax = λx, where A is an n × n matrix and x ∈ C^n, and assume that an orthonormal set of m vectors q_k ∈ C^n, k = 1, 2, . . . , m, is available. Construction of this set is
the second key part of the procedure and a defining feature of particular classes of iterative methods. To find an approximate solution of the eigenvalue problem within the subspace Qm spanned by vectors qk , a natural (but not the only) option is to apply the Galerkin method. The approximate solution xm is a linear combination of m basis vectors qk , and the same vectors are used to test the fidelity of this solution. In matrix form, xm = Q m cm ,

c_m ∈ C^m

where the n × m matrix Q m comprises the m column vectors qk and cm is a coefficient vector. The Galerkin equations are (Axm , qk ) = μ(xm , qk ),

k = 1, 2, . . . , m

where μ is an approximate eigenvalue. Substituting xm = Q m cm and putting the system in matrix–vector form lead to Q ∗m AQ m cm = μcm

(8.279)

The right-hand side got simplified due to the orthogonality of the columns of Q m : Q ∗m Q m = Im . This eigenvalue problem reduced to subspace Qm has m eigensolutions that are called the Ritz values and Ritz vectors of A with respect to Qm . There are two main approaches for constructing the orthonormal sequence of vectors qk . The first one involves the Krylov spaces. By definition, these spaces are spanned by the Krylov vectors y1 , y2 = Ay1 , y3 = Ay2 = A2 y1 , . . ., ym = Am−1 y1 , where y1 is some starting vector. The orthonormal basis qk in the Krylov space is obtained by the modified Gram–Schmidt algorithm.65 Let us focus on the non-degenerate case where the n Krylov vectors, up to yn , are linearly independent, and assemble these vectors as columns into an n × n matrix Y . Since the dimension of the whole space is n, vector yn+1 = Ayn must be a linear combination of the n Krylov vectors, with some coefficients −s. (The minus sign is included for compatibility with the conventional form of the “companion matrix” S below.) Then, we have [Dem97] AY = Y S

(8.280)

where

        | 0    0    ...   ...   0    -s1 |
        | 1    0    ...   ...   0    -s2 |
  S  =  | 0    1    ...   ...   0    -s3 |          (8.281)
        | ...  ...  1     ...   0    -s4 |
        | ...  ...  ...   ...   ...  ... |
        | ...  ...  ...   ...   1    -sn |

65 The modified version of the Gram–Schmidt algorithm is mathematically equivalent to the classical one but is more stable, due to the way the orthogonalization coefficients are calculated. See, e.g., J. W. Demmel [Dem97], p. 107.

This is just a matrix-form expression of two facts: (i) Multiplying A with each column of matrix Y (i.e. with each Krylov vector) except for the very last one produces, by construction of the Krylov sequence, the next column; (ii) multiplying A by the last column of Y produces a vector that is a linear combination of all columns. Since S is upper Hessenberg, the matrix identity (8.280) can be interpreted as a transformation of the original matrix A to the upper Hessenberg form: Y −1 AY = S

(8.282)

Although this transformation is non-orthogonal and computationally not robust, its conceptual importance is in establishing the connection between the Krylov vectors and upper Hessenberg matrices. An orthogonal transformation can be obtained by the QR decomposition of Y, which is nothing other than the Gram–Schmidt procedure already mentioned:

Y = Q_Y R_Y

Upon substitution into (8.282), this yields

R_Y⁻¹ Q_Y^* A Q_Y R_Y = S

or equivalently

Q_Y^* A Q_Y = H ≡ R_Y S R_Y⁻¹      (8.283)

Since S is upper Hessenberg, so is H. This conversion of the original matrix to upper Hessenberg form using the orthogonalized Krylov sequence is known as the Arnoldi algorithm. In the Hermitian case, this algorithm simplifies dramatically: the upper Hessenberg matrix H becomes tridiagonal, and the orthogonal vector sequence can be generated by a simple three-term recurrence, as described in all texts on numerical linear algebra. This procedure for complex Hermitian or real symmetric matrices is known as the Lanczos method.

The Lanczos method is, in exact arithmetic, a direct method, in the sense that it orthogonally transforms the matrix to a tridiagonal one in a finite number of operations (after which the eigenproblem can be efficiently solved by QR iterations). However, in practice the Lanczos method and numerous other related Krylov space algorithms are used almost exclusively as iterative solvers to compute the Ritz approximations (8.279) to the eigenpairs. The main reason is that roundoff errors destroy orthogonality, so the procedure cannot be successfully completed in finite-precision arithmetic without additional measures such as reorthogonalization.
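The following bare-bones MATLAB sketch of the Arnoldi process (again my illustration, not code from this book) uses modified Gram–Schmidt and extracts Ritz values from the projected matrix, cf. (8.279); robust implementations add reorthogonalization, restarts and convergence tests.

A = gallery('poisson', 20);               % a 400 x 400 sparse symmetric test matrix
n = size(A, 1);  m = 40;                   % Krylov subspace dimension
Q = zeros(n, m+1);  H = zeros(m+1, m);
q = randn(n, 1);  Q(:,1) = q/norm(q);      % starting vector

for k = 1:m                                % Arnoldi with modified Gram-Schmidt
    w = A * Q(:,k);
    for j = 1:k
        H(j,k) = Q(:,j)' * w;
        w = w - H(j,k) * Q(:,j);
    end
    H(k+1,k) = norm(w);
    Q(:,k+1) = w / H(k+1,k);
end

Hm   = H(1:m, 1:m);
ritz = sort(eig((Hm + Hm')/2));            % Ritz values (symmetrized to suppress round-off)
exact = sort(eig(full(A)));
disp([ritz(end-2:end), exact(end-2:end)])  % the extreme Ritz values converge first

For the symmetric matrix used here, Hm is numerically tridiagonal, i.e. the loop effectively performs Lanczos iterations.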


The highest and lowest eigenvalues in the Lanczos method converge faster, as a function of m in (8.279), than the values in the interior of the spectrum. Applying the algorithm to a shifted and inverted matrix (A − σI)⁻¹ instead of A will yield faster convergence for the eigenvalues closest to the shift σ. This, however, necessitates solving linear systems with the matrix (A − σI), which may be prohibitively expensive for large matrices. Many preconditioning techniques, which can be viewed as approximate inverses in some sense, have been developed to overcome this difficulty. In addition to the review papers by H. A. van der Vorst [vdV04] and G. H. Golub and van der Vorst [GvdV00] already cited, see papers by A. V. Knyazev [Kny01], P. Arbenz [AHLT05] and references therein.

An interesting alternative to the Krylov subspace solvers is the Jacobi–Davidson method. Van der Vorst [vdV04] notes the origin of this method in the 1846 paper by C. G. J. Jacobi [Jac46], the 1975 paper by E. R. Davidson [Dav75] and, finally, the 1996 paper by G. L. G. Sleijpen and H. A. van der Vorst [SvdV96]. An instructive way to view the Jacobi–Davidson method is as Newton–Raphson iterations [vdV04]. Let a unit vector y and a number θ (real for the Hermitian case) be an approximation to an eigenvector/eigenvalue pair of matrix A, so that

y^* A y = θ

(8.284)

We are looking for suitable corrections Δy and Δθ that would improve this approximation. Since it does not make sense to update y along its own direction, Δy is sought to be orthogonal to y:

y^* Δy = 0      (8.285)

The target condition for the corrections is

A(y + Δy) = (θ + Δθ)(y + Δy)

(8.286)

Ignoring the second-order product Δθ Δy, as in the Newton–Raphson linearization, and moving the Δ-terms to the left-hand side yields

(A − θI) Δy − Δθ y = −(A − θI) y

(8.287)

Since the left-hand side contains two unknown increments, a relationship between Δθ and Δy needs to be established to close the procedure. This can be done by premultiplying the target condition (8.286) with y^* [vdV04]:

y^* A (y + Δy) = y^* (θ + Δθ)(y + Δy)

After taking into account the orthogonality condition y^* Δy = 0, the eigenvalue condition y^* A y = θ and the unit length of y (y^* y = 1), one obtains

y^* A Δy = Δθ

(8.288)


Fig. 8.59 A geometric illustration of the (I − yy ∗ ) projector

Substituting this Δθ into the Newton–Raphson formula (8.287) yields an equation for Δy alone:

(I − y y^*)(A − θI) Δy = −(A − θI) y

Since y^* Δy = 0, the last equation can be symmetrized:

(I − y y^*)(A − θI)(I − y y^*) Δy = −(A − θI) y

(8.289)
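A single correction step can be mimicked in a few lines of MATLAB (a schematic sketch of mine, with an exact dense solve standing in for the approximate, preconditioned solves used in practical Jacobi–Davidson codes):

n = 200;
A = full(gallery('tridiag', n, -1, 2, -1));     % a symmetric test matrix
[V, ~] = eig(A);
p = randn(n,1);  p = p/norm(p);
y = V(:,10) + 0.1*p;  y = y/norm(y);            % a deliberately perturbed eigenvector as the current guess
theta = y' * A * y;                             % Rayleigh quotient, cf. (8.284)

r  = (A - theta*eye(n)) * y;                    % right-hand side of (8.289)
P  = eye(n) - y*y';                             % projector onto the hyperplane orthogonal to y
dy = pinv(P*(A - theta*eye(n))*P) * (-r);       % "exact" solve of the correction equation
dy = dy - y*(y'*dy);                            % keep the correction orthogonal to y, cf. (8.285)

y1 = (y + dy)/norm(y + dy);
fprintf('residual before: %.2e   after: %.2e\n', ...
    norm(A*y - theta*y), norm(A*y1 - (y1'*A*y1)*y1));

In an actual Jacobi–Davidson solver, the new direction dy would be orthogonalized against the current search subspace and appended to it, and (8.289) would be solved only roughly, e.g. by a few iterations of a preconditioned Krylov method.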

The entries of matrix yy^* are pairwise products of the components of y, and hence matrix (I − yy^*) is in general full. However, it need not be computed explicitly; to carry out the iterative procedure, one only needs the product of (I − yy^*) with an arbitrary vector z, which is easily calculated right to left: (I − yy^*)z = z − y(y^*z). The meaning of (I − yy^*) as a projection operator is clear from Fig. 8.59. Indeed, y^*z (a scalar) is geometrically the length⁶⁶ of the projection of z onto y; y(y^*z) is this projection itself; finally, z − y(y^*z) is the projection of z onto the hyperspace orthogonal to y.

⁶⁶ For the geometric interpretation, vectors should be viewed as real.

In the Jacobi–Davidson method, the Newton–Raphson iteration outlined above defines a new direction Δy to be added to the Ritz approximation space. The complete Jacobi–Davidson algorithm is quite involved and includes, in addition to the computation of Δy: modified Gram–Schmidt orthogonalization; restarting procedures when the dimension of the Ritz subspace gets too large; solution of the eigenvalue problem in the subspace; and deflation (i.e. projection onto a hyperspace orthogonal to the eigenvectors already found—a generalization of I − yy^*). Furthermore, preconditioning for eigensolvers in general, and within the Jacobi–Davidson algorithm in particular, is a more complicated matter than for linear system solvers. Parallel implementation of these algorithms is a very rich area of research as well. On all these subjects (except for parallel eigensolvers), the Templates book is again a comprehensive and condensed initial source of information; in addition, see papers

by G. L. G. Sleijpen and F. W. Wubs [SW03], A. V. Knyazev [Kny98, Kny01], A. Basermann [Bas99, Bas00]. What is, then, the bottom line for the photonic bandgap computation as far as eigensolvers are concerned? Methods that are based on plane wave (or spherical/cylindrical wave) expansion lead to dense matrices. For problems of small or moderate size, direct methods are typically the best choice. However, large-scale problems are common in computational practice: Tens of thousands or more unknowns are often needed for adequate accuracy. At the same time, only a small number of low-frequency eigenmodes may be of interest. In such cases, iterative eigensolvers have an advantage. S. G. Johnson and J. D. Joannopoulos, in their highly cited paper [JJ01], apply preconditioned conjugate gradient minimization of the block Rayleigh quotient or, alternatively, the Davidson method. Finite element analysis leads to generalized eigenproblems, with the “stiffness” matrix on the left and the “mass” matrix on the right. Both matrices are sparse. For regular media with symmetric material tensors, the matrices are Hermitian; the stiffness matrix is nonnegative definite, and the mass matrix is positive definite. The use of direct solvers is justified only for small-size problems; the relevant direct solver for the generalized eigenproblem is the QZ method by C. B. Moler and G. W. Stewart [MS73]. A review of iterative methods for this type of problem is given in the Templates [BDD+00]. W. Axmann and P. Kuchment [AK99] offer an interesting comparison of FEM with plane wave expansion; they use the “simultaneous coordinate overrelaxation method” as an eigenvalue solver for finite element photonic (and acoustic) bandgap calculation in two dimensions.
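As a generic illustration of this workflow (not tied to any particular finite element code discussed in this book), the MATLAB fragment below assembles the one-dimensional "stiffness" and "mass" matrices for −u″ = λu with linear elements and homogeneous Dirichlet conditions, and then calls eigs, which performs shift-and-invert Lanczos/Arnoldi iterations of the kind described above:

N = 500;  h = 1/(N+1);                         % uniform 1D mesh, Dirichlet boundary conditions
e = ones(N, 1);
K = spdiags([-e 2*e -e], -1:1, N, N) / h;      % sparse "stiffness" matrix (linear elements)
M = spdiags([ e 4*e  e], -1:1, N, N) * h/6;    % sparse consistent "mass" matrix

lam = eigs(K, M, 5, 'smallestabs');            % five lowest eigenvalues of K*x = lambda*M*x
disp(sort(lam) ./ (pi*(1:5)').^2)              % ratios ~1: the exact eigenvalues are (k*pi)^2

Only a handful of eigenpairs are computed, and K and M are never converted to dense form, which is precisely the situation encountered in large-scale photonic bandgap calculations.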

Chapter 9

Metamaterials and Their Parameters

“What’s this?” thought the Emperor. “I can’t see anything. ... Am I a fool? Am I unfit to be the Emperor? ... – “Oh! It’s very pretty,” he said. “It has my highest approval.” Hans Christian Andersen, “The Emperor’s New Clothes” (1837). [Translation by Jean Hersholt.]

9.1 Introduction

Physicists and engineers have for a long time been interested in designing artificial periodic structures with unusual properties, possibly unattainable in natural materials. This chapter focuses on electromagnetic properties of periodic structures such as photonic crystals (PhCs) and metamaterials (MMs), but some references on acoustic and (to a lesser extent) elastic properties are included.

The difference between PhC and MM is more a matter of degree than principle. These are periodic structures that can be viewed, with a varying level of accuracy, as effective media characterized by some macroscopic material parameters. MMs are typically distinguished by the presence of resonating elements and by a relatively high accuracy of their effective medium description. Note, though, that any structure can resonate under the right conditions; one textbook example is Mie resonances of a dielectric sphere. MMs are usually engineered to have tunable resonances, adjustable to a particular frequency range and/or excitation conditions (e.g. polarization of the wave). The archetypal resonating element is a split-ring resonator (SRR)—a conducting ring with a gap, or a concentric pair of such rings.


Fig. 9.1 (From D. R. Smith’s group, people.ee.duke.edu/∼drsmith/metamaterials/metamaterial_ elements.htm) Examples of split-ring resonators (SRRs) and SRR metamaterials

Figure 9.1 reproduces a few examples from D. R. Smith's Web site at Duke University. A simple intuitive SRR model is an RLC circuit, with a small resistance R, an inductance L due to the current-supporting ring(s), and a capacitance C due to the gap(s) across which charges can accumulate. Changing the diameter of the ring(s) and the width of the gap(s), one can tune the LC parameters and hence the resonance frequency of this element (Appendix 9.4). Many other modifications of size and shape of the elementary resonators are possible and have been explored analytically and experimentally: helical, rectangular, circular, etc. Moreover, SRR parameters can be tuned by various means, depending on the frequency range. For example, between ∼10⁸ and ∼10¹⁰ Hz, SRRs can be loaded with varactors.

Detailed information and a large number of references can be found in a comprehensive 2019 review by M. Kadic et al. [KMvHW19], as well as in an earlier review by C. Soukoulis and M. Wegener [SW11]. There are quite a few respectable books as well: by S. Tretyakov [Tre03], C. Caloz and T. Ito [CI08], W. Cai and V. Shalaev [CS09], L. Solymar and E. Shamonina [SS09], I. Shadrivov et al. (Eds.) [SLK15], M. A. Noginov and V. A. Podolskiy (Eds.) [NP16], D. H. Werner [Wer17]. A detailed analysis and applications of SRRs, especially to RFID tags, can be found in S. Zuffanelli's thesis [Zuf18]. On a more esoteric side, connections between metamaterials science and cosmology are explored by I. I. Smolyaninov [Smo18].

It is not my intention here to recount all relevant historical developments; there is a large amount of information in the introductory part of the books and review papers referenced above. The widely recognized milestones are:

• Early work by L. I. Mandelshtam in the 1940s [Man45, Man47, Man50] and D. V. Sivukhin in the 1950s [Siv57]. V. G. Veselago [Ves68] speculated about the unusual properties of materials with simultaneously negative permittivity ε and permeability μ, which results in negative index of refraction (Sect. 8.15). There are over 7,200 citations of Veselago's paper at the time of this writing (spring 2019).
• In the 1980s–early 1990s, M. V. Kostin and V. V. Shevchenko [KS88, KS93] analyzed arrays of metallic rings in a host dielectric—a clear prototype of today's metamaterials. Such arrays may exhibit diamagnetic characteristics due to the reaction field of the rings and the Lenz law. The idea of rings and SRR (J. B. Pendry et al. [PHRS99]) was not entirely new, however, and could be traced back to the fundamental monograph by S. A. Schelkunoff and H. T. Friis published in 1952 [SF52].
• J. B. Pendry's "perfect lens" paper [Pen00] (about 8,000 citations) led to an explosion of interest in negative refraction.
• Negative refraction demonstrated by D. R. Smith's group in 2000–2001 [SPV+00, SSS01, PS04] for frequencies around 10 GHz and later by other research groups at telecommunication and optical wavelengths (G. Dolling et al. [DEW+06, DWSL07]).
• The idea of cloaking put forward and demonstrated by J. B. Pendry, D. R. Smith, D. Schurig and coworkers [SMJ+06] and, independently, by U. Leonhardt [Leo06] (Sect. 9.2.3).

S. A. Tretyakov provides a much more detailed and very interesting review of historical developments [Tre05].
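A crude back-of-the-envelope check of the LC resonance estimate mentioned at the beginning of this section can be done in a few lines of MATLAB (my illustration; the geometric parameters are arbitrary, and the simple loop-inductance and parallel-plate formulas ignore fringing fields and inter-ring capacitance, so the result is an order-of-magnitude figure at best):

mu0  = 4*pi*1e-7;   eps0 = 8.854e-12;
R    = 1.5e-3;      % mean ring radius [m]       (illustrative values only)
rw   = 0.1e-3;      % effective wire radius [m]
g    = 0.2e-3;      % gap width [m]
Agap = (0.2e-3)^2;  % effective "plate" area of the gap [m^2]

L  = mu0 * R * (log(8*R/rw) - 2);     % textbook single-loop inductance estimate
C  = eps0 * Agap / g;                 % parallel-plate estimate of the gap capacitance
f0 = 1/(2*pi*sqrt(L*C));              % LC resonance frequency
fprintf('L = %.2e H,  C = %.2e F,  f0 = %.1f GHz\n', L, C, f0/1e9);

Shrinking the gap (larger C) or enlarging the ring (larger L) lowers the resonance frequency, which is the tuning knob referred to above.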

9.2 Applications of Metamaterials

9.2.1 An Overview

Although prototypes of electromagnetic metamaterials were known for quite some time (see M. V. Kostin and V. V. Shevchenko's paper cited above [KS88, KS93], as well as the monograph by S. A. Schelkunoff and H. T. Friis [SF52]), it was the phenomenon of negative refraction that catalyzed an enormous amount of interest in metamaterials. This was a result of two major developments around the year 2000: (i) J. B. Pendry's "negative refraction makes a perfect lens" paper [Pen00] and (ii) an experimental demonstration of negative refraction by D. R. Smith's research group [SPV+00, SVKS00]. See Sect. 8.15, as well as Sect. 9.2.2, for more details. Since perfect lensing is the holy grail of optics, the expectations at the time were quite high. For example, J. B. Pendry wrote in 2001 [Pen01a]:

    ... to make an enduring contribution we must find applications. This would present no problems if a perfect negative material were easily available. These applications would include DVDs that could store 100 times more data than at present, optical lithography in the semiconductor industry with a resolution 10 times better than the current standard, and MRI scanners an order of magnitude cheaper than current models. ...


    The range of potential applications for these materials is vast. Indeed, even if we fail to reach our most ambitious goals, it would be surprising if some of these applications were not realized. The outlook is very positive for negative materials.

[Color emphasis mine.] This sentiment was echoed by N. Engheta and R. W. Ziolkowski in 2005: The future is indeed very positive for [double-negative metamaterials] [NZ05].

However, at least two significant limiting factors became clear in the 2000s. The first one is losses, which are frequency-dependent but never negligible for any natural conducting materials. The second limiting factor is the lattice cell size, which must be an appreciable fraction of the wavelength for strong magnetic effects and negative refraction to be achievable [Tsu08]. Constraints imposed by these two factors are particularly severe for imaging, but strongly affect other applications as well (see subsequent sections). One can identify several major avenues taken by various research groups to partly circumvent these constraints.

• There are a few "poor man's" versions of perfect physical effects and devices. One example is the electrostatic lens proposed by J. B. Pendry along with the "perfect lens" (Sect. 8.15). Another example is "carpet cloak" at optical frequencies—concealing an object under a metamaterial layer rather than inside a 3D metamaterial shell (J. Valentine et al. [VLZ+09], M. Gharghi et al. [GGZ+11]). The main impediment to full-fledged 3D cloaking, apart from fabrication challenges, is again losses—even though a negative index is not required for cloaking.
• Loss mitigation, active metamaterials, and the search for low-loss materials. Since these subjects are only tangentially related to the main themes of this book, I simply mention several books and papers, with references therein: I. Shadrivov et al. (Eds.) [SLK15], J. B. Khurgin [Khu15, Khu17], A. Boltasseva and H. A. Atwater [BA11], O. Hess et al. [HPM+12], S. Xiao et al. [XDK+10], Q. Zhao et al. [ZZZL09], J. Valentine et al. [VLZ+09].
• As the old adage goes, "if you can't beat them, join them." In this case, the idea is to "turn losses into gain." J. C. Ndukaife, V. M. Shalaev, A. Boltasseva, and others pointed out that losses can be beneficial in a number of applications: loss-induced heating in plasmon-enhanced optical tweezing; optical data storage and encryption; heat-assisted magnetic recording; plasmonic photothermal cancer therapy; broadband solar absorbers and emitters, etc. [NSB16]. Most of these applications involve plasmonic particles and structures, but not metamaterials per se.

In the following sections, I review select application areas, especially those where analysis and simulation play an important role.


9.2.2 Imaging: Perfect and Imperfect Lenses

The principal ideas of "superlenses" based on negative refraction were outlined in Sect. 8.15. To reiterate, a slab of hypothetical material with the dielectric permittivity ε = −1 and magnetic permeability μ = −1 would produce a perfect image of any source. This lens is not subject to the usual diffraction limit and can resolve two ideal point sources perfectly, no matter how close these sources are.

This perfect picture is in reality tainted by several major factors. The first one is losses; they can be mitigated (Sect. 9.2.1) but will not in practice go away. Losses can affect the performance of the lens dramatically, as can any other deviations from the ideal parameters ε = −1, μ = −1. For example, analytical and numerical calculations by D. R. Smith et al. [SSR+03] show that for s-polarization and the thickness of the slab equal to 2/3 of the vacuum wavelength λ, an order-of-magnitude resolution enhancement (relative to the standard diffraction limit) will require the deviation of μ from −1 by less than 10⁻¹⁸! If the thickness is 0.1λ, this deviation can be "as much as ∼0.002 to achieve the same resolution enhancement." Even in the electrostatic limit, for "poor man's perfect lens," losses may have a dramatic effect on performance, as exemplified by Fig. 3 in the paper by J. T. Shen and P. M. Platzman [SP02].

Natural materials can in many cases be approximated by perfectly homogeneous media, since the characteristic spatial scale of their atomic structure is much finer than the wavelength even at optical frequencies. This is not the case for metamaterials. For strong magnetic response and negative refraction in particular, the lattice cell size must constitute an appreciable fraction of the vacuum wavelength; see Sect. 8.15. This granularity of the metamaterial structure limits the highest possible resolution of the lens. According to an estimate by D. R. Smith et al. [SSR+03, Eq. (9)], there is no resolution enhancement if, for example, the cell size is around 0.1–0.2λ and the thickness of the slab is ∼2/3λ.

Yet another major factor, which is not fully appreciated in the literature, is that metamaterials do not behave as a homogeneous medium for evanescent waves, on which perfect lensing depends as a matter of principle. This is intuitively clear in the case of rapidly decaying waves whose penetration depth into the material is comparable with or smaller than the lattice cell size. Some of these waves may not even "see" the physical content of the cell or inclusions contained therein. A detailed analysis of homogenization limits under such circumstances is given by V. A. Markel [MT16].


9.2.3 Transformation Optics

9.2.3.1 Motivation

The underlying principle of transformation optics is that in Maxwell's electrodynamics coordinate mappings are equivalent to transformations of material tensors. As a simple introductory illustration of that, let us consider a two-dimensional electrostatic problem for a potential u in free space, in an annulus r1 ≤ r ≤ r2. In the polar system (r, θ), the Laplace equation for this potential is

(1/r) ∂r(r ∂r u) + (1/r²) ∂θ² u = 0

(9.1)

It is instructive to rewrite the Laplace equation as

∂r(r ∂r u) + ∂θ( (1/r) ∂θ u ) = 0

(9.2)

Recall now that the electrostatic equation in a Cartesian coordinate system (ξ, η) and in the presence of an anisotropic dielectric reads

∂ξ(ε_ξ ∂ξ u) + ∂η(ε_η ∂η u) = 0

(9.3)

assuming that (ξ, η) are the anisotropy axes and that ε_ξ, ε_η are the dielectric permittivities along those axes. It is clear that Eqs. (9.2), (9.3) have exactly the same form and become equivalent for a hypothetical material with

ε_ξ = r;   ε_η = 1/r

(9.4)

Let us interpret this result. The standard cylindrical-to-Cartesian transformation maps the annulus under consideration onto a rectangle (Fig. 9.2). As we just discovered, a

Fig. 9.2 Free-space potential in the annulus, when mapped to the respective rectangle, remains a valid electrostatic potential, but in a medium with a particular anisotropic tensor. See text for details


free-space potential in the annulus, when mapped to the respective rectangle in the (ξ, η) plane (with ξ ≡ r , η ≡ θ in this case), remains a valid electrostatic potential, but in a medium with a particular anisotropic tensor (9.4). It is in that sense that coordinate mappings are equivalent to material transformations. This fact was known well before the advent of transformation optics and was used, in particular, to convert field problems from unbounded to bounded domains, which is helpful for finite element modeling (see, e.g., E. M. Freeman and D. A. Lowther [FL89], A. Stohchniol [Sto92], A. Plaks et al. [PTPT00]). Importantly, as we shall see in the following section, this transformation equivalence remains true beyond electrostatics, in full Maxwell’s electrodynamics. In comparison with the earlier developments cited above [FL89, Sto92, PTPT00], the novelty of transformation optics is that the position-dependent anisotropic tensors arising from spatial transformations can be implemented with judiciously designed metamaterials. Implicit in this idea is the assumption that a wide spectrum of material characteristics is accessible in practice. This assumption holds to a varying degree of accuracy, depending on specific applications, the frequency range, the properties of natural materials that could be used as ingredients of the metamaterials, and other factors. We shall return to these issues in the section on homogenization. For now, let us look at a simple 2D illustration of the key idea. Consider a plane wave u(r) = exp(ikx)

(9.5)

propagating in free space. Then, consider an arbitrary disk r ≤ r2 and its mapping to an annulus 0 < r1 ≤ r′ ≤ r2:

r′ = r1 + (1 − r1/r2) r   for r < r2;     r′ = r   for r ≥ r2;     θ′ = θ      (9.6)

Figure 9.3 shows what the plane wave (9.5) turns into under this mapping. A very simple MATLAB code, which was used for plotting, is included below for reference (Table 9.1). Note that this 2D transformation is singular, as it maps the origin onto the whole circle r′ = r1—an idiosyncratic feature in both theoretical analysis and practical implementation. But there are two curious implications:

• Since, by design, the transformation is an identity map outside the outer radius r2, the wave in this outside region is unchanged. This implies that an outside observer would have no way of distinguishing between the waves before and after the transformation (the left and right parts of Fig. 9.3).
• After the transformation, a void r′ < r1 has been created. This disk is inaccessible to the wave. Any object placed inside that void will be invisible ("cloaked") to any outside observer.

These considerations form a basis for the field of transformation optics, which in recent years has expanded enormously (Sect. 9.2.3.3).


Fig. 9.3 Left: the original plane wave. Right: the wave after the coordinate transformation r′ = r1 + (1 − r1/r2) r for r < r2, and r′ = r otherwise; θ′ = θ. Note that this transformation maps the disk r ≤ r2 into the annulus r1 ≤ r′ ≤ r2, leaving the circle r = r2 unchanged

9.2.3.2 Transformation Optics: A Glimpse of Theory

To extend the ideas of the previous section, let us consider a generic setup of Fig. 9.4, 2D for simplicity of drawing. Assume that a field F(x) satisfies a differential equation Lx F = 0 in a domain x , Lx being a differential operator expressed in coordinates x = (x1 , x2 ). Naturally, F may represent a pair of fields, and L may symbolize a system of equations, Maxwell’s equations being of our primary interest. Now, let there be a differentiable mapping (as a rule, one-to-one and non-singular) of x onto a domain x . The latter is equipped with its own coordinate system x = (x1 , x2 ). With this mapping in place, the field F can be defined in x simply by the pullback to x . That is, the field at any point x ∈ x is the same as the field at the preimage x(x ). However, the coordinate components of F will in general be different in the two systems (Fig. 9.4). Note that these coordinate systems need not be orthogonal, although mathematical expressions of transformation optics do simplify in the case of orthogonality. The key observation is the same as in the case of the Laplace equation of the previous section. Namely, if the field F satisfies Maxwell’s equations in x in the xcoordinates, it will also satisfy Maxwell’s equations in x in the x -coordinates, but with the material tensors  and μ properly transformed. This fundamental property has been understood for a long time (E. J. Post [Pos62, Pos97]), and there are two major ways to derive tensor transformations corresponding to coordinate mappings: • Differential forms and differential geometry. There are many excellent references, which include, in the context of electromagnetics, a very readable introduction by K. F. Warnick and P. Russer [WR14]; a good exposition and a vast collection of references in F. L. Teixeira’s paper [Tei01]; and a whole body of work by A. Bossavit [Bos98, Bos, Bos88b, Bos91]. In the 1980s, P. R. Kotiuga’s PhD thesis [Kot85] was a seminal contribution to the application of differential forms in computational electromagnetics. The role of differential geometry in electromagnetic theory was recognized long ago (D. van Dantzig [vD34]).

9.2 Applications of Metamaterials Table 9.1 A simple MATLAB code demonstrating the cloaking idea in 2D function wave_cloaking_2D_example(xmax, ymax, r1, r2, k, nx, ny) % A simple demo of the cloaking idea % Domain: [-xmax, xmax] x [-ymax, ymax] % Disk r < r2 mapped to an annulus r1 < r < r2 % k -- the wavenumber % nx, ny - number of grid points for plotting x_grid = y_grid = [X, Y] = field_2D [N1, N2]

linspace(-xmax, xmax, nx); linspace(-ymax, ymax, ny); meshgrid(x_grid, y_grid); = exp(1i*k*X); % a plane wave = size(X);

[Theta, R] = cart2pol(X, Y); R_transformed = R; % initialize the transformation s = 1 - r1/r2; % the contraction coefficient (disk to annulus) for n1 = 1 : N1 for n2 = 1 : N2 r = R(n1, n2); if r < r2 R_transformed(n1, n2) = r1 + r*s; end end end [X_transformed, Y_transformed] = pol2cart(Theta, R_transformed); figure(’Color’, [1 1 1]); % white background contour(X, Y, real(field_2D), 20); axis equal; xlabel(’$x$’, ’Interpreter’, ’LaTeX’, ’FontSize’, 24); ylabel(’$y$’, ’Interpreter’, ’LaTeX’, ’FontSize’, 24); colorbar; tilt = 0; % an auxiliary variable for drawing figure(’Color’, [1 1 1]); % white background contour(X_transformed, Y_transformed, real(field_2D), 20); hold on; plot_arbitrary_ellipse(0, 0, r1, r1, tilt); % a utility function plot_arbitrary_ellipse(0, 0, r2, r2, tilt); axis equal; xlabel(’$x$’, ’Interpreter’, ’LaTeX’, ’FontSize’, 24); ylabel(’$y$’, ’Interpreter’, ’LaTeX’, ’FontSize’, 24); colorbar; return; x_max = 10; y_max = 10; r1 = 3; r2 = 6; k = 2*pi; nx = 200; ny = 200; wave_cloaking_2D_example(x_max, y_max, r1, r2, k, nx, ny);



9 Metamaterials and Their Parameters

Fig. 9.4 Domain x is mapped onto x . If a field F satisfies Maxwell’s equations in x in the x-coordinates, it will also satisfy Maxwell’s equations in x in the x -coordinates, but with the material tensors  and μ properly transformed

• The machinery of tensor calculus is, by definition, well suited for tensor transformations, and this is how they were derived in the context of transformation optics (A. J. Ward and J. B. Pendry [WP96]); see Appendix 9.5. For the reader not familiar with the differential geometric viewpoint, here is a pedestrian summary of the notions involved. A detailed exposition can be found in the literature cited above; see also S. H. Weintraub’s book [Wei14] for a lucid and readable account of the mathematical theory. It can be argued that the elementary circulations and fluxes of fields are more fundamental than the fields themselves. Indeed, suppose that one considers as fundamental the differential dC = E · dl, with the respective line integral of an electric field E over a given path, and the differential d = D · dS, with the respective integral of a displacement field D over a given surface. (Here, C stands for “circulation,” and  for flux.) Similar notions can obviously be considered for the H and B fields. One salient advantage of this viewpoint is that circulations and fluxes, unlike the fields themselves, are metric-independent, and hence Maxwell’s equations can themselves be written in metric-independent form. In this treatment, metric enters only the constitutive relations but not the differential from analogs of the curl and div equations. The proper mathematical formalism for objects like dC is 1-forms, which are linear functionals on certain vector spaces called tangent spaces.1 Likewise, objects like d are rigorously defined as antisymmetric bilinear forms (2-forms) on tangent spaces. In this framework, material properties are relations between 1-forms (representing the E and H fields) and 2-forms (representing the D and B fields). I hasten to reiterate that this detour into the differential geometric treatment of Maxwell’s electrodynamics is simplistic and incomplete; the intention is only to give the interested reader a flavor of the ideas involved.

1 To

each point in a given smooth manifold, there corresponds its space of vectors tangential to the manifold at that point.

9.2 Applications of Metamaterials

9.2.3.3


Historical and Other Notes on Transformation Optics

Amazingly, G. W. Milton and his collaborators have been able to trace the concept of transformation optics and cloaking back to a 1961 paper by L. S. Dolin [Dol61], who writes, in particular: If a transformation of coordinates changes the metric within a bounded region, the parameters of the medium and the expressions for the fields outside this region are preserved. Such transformations can be used for constructing non-reflecting inhomogeneities.

[Color emphasis mine.] A recent “roadmap on transformation optics” [MPG+18], with its 27 co-authors, 44 pages and 166 references, is a comprehensive review of the major trends and future directions, as well as a testament to an explosive growth of this field over the last 10–15 years. This growth was catalyzed by the theoretical idea and experimental demonstration of cloaking (J. B. Pendry, D. R. Smith, D. Schurig and coworkers [SMJ+06], U. Leonhardt [Leo06]). The apparent possibility of making macroscopic objects invisible aroused a great deal of interest in the scientific community, in funding agencies, and in the general public.2 A reference to Harry Potter became popular, although it was initially intended as a joke.3 Rational or irrational exuberance aside, broadband 3D cloaking does not appear to be possible. Existing demonstrations are mostly limited to carpet cloaks (J. Valentine et al. [VLZ+09], M. Gharghi et al. [GGZ+11]). Abnormal material parameters required for invisibility cloaking typically stem from the resonance behavior of metamaterial “atoms,” which is a narrowband phenomenon. In a different vein, R. Schittny et al. study cloaking in the regime of diffuse light propagation [SNM+16]. The main theme of this book requires that transformation methods be mentioned in the context of computational electromagnetics: A. Nicolet and collaborators [NRM+94, NZAG08, NZG10], E. M. Freeman and D. A. Lowther [FL89], A. Stohchniol [Sto92], A. Plaks et al. [PTPT00].

9.2.4 Tunable, Reconfigurable, Superconducting and Other Metamaterials and Metadevices There is a very vast array of existing and potential applications of metamaterials, where different physical effects are ingeniously combined to produce controllable or tunable metadevices with peculiar functionalities. For the themes of this book, a few remarks are in order:

2 For

example: Scientists Take Step Toward Invisibility, NY Times, October 20, 2006. https://www. nytimes.com/2006/10/20/science/20cloak.html. 3 “Transforming optics: an interview with Sir John Pendry,” Advanced Photonics 1(1), 010502 (28 January 2019). https://doi.org/10.1117/1.AP.1.1.010502.


9 Metamaterials and Their Parameters

• Computer simulation of such devices is a complicated engineering task, best performed with industrial software packages—e.g. FEM- or FDTD-based (see Chaps. 3, 7). • Coupled physical phenomena require special treatment and a combination of computational tools that are often problem-dependent. • The field of metamaterials and metadevices is evolving so rapidly that any review is likely to become outdated soon after this book gets published. Due to all these considerations, I simply refer here to the review article by N. I. Zheludev and Y. S. Kivshar [ZK12]. Using it as a starting point, the interested reader will certainly find a way to keep tabs on further developments in this area in the coming years.

9.2.5 Metamaterial Absorbers Under the optimistic assumption that the whole “passive” part of the -μ space of effective parameters is accessible via a suitable design of metamaterials, one can envision perfect absorbers: materials whose impedance is matched to air (i.e. the relative permittivity and permeability are equal), while the imaginary parts of these parameters are positive to facilitate absorption. This idea was developed by N. I. Landy et al. [LSM+08], who write As an effective medium ..., MMs [metamaterials] can be characterized by a complex electric permittivity ˜ (ω) = 1 + i2 and magnetic permeability μ(ω) ˜ = μ1 + iμ2 . ... the oftoverlooked loss components of the optical constants (2 and μ2 ) have much potential for the creation of exotic and useful materials as well. For example, they can be manipulated to create a high absorber. By manipulating resonances in  and μ independently, it is possible to absorb both the incident electric and magnetic field. Additionally, by matching  and μ, a MM can be impedance-matched to free space, minimizing reflectivity. ... we show that MMs can be fashioned to create narrow-band perfect absorbers.

A basic lattice cell design along these lines is metal–insulator–metal, with one or both metal layers patterned (N. I. Landy et al. [LSM+08], J. Hao et al. [HWL+10], X. Liu et al. [LSSP10], K. Aydin et al. [AFBA11], C. M. Watts et al. [WLP12]). Various embellishments of this basic design have been proposed (e.g. C. Wu et al. [WNS+11], Y. Avitzour et al. [AUS09]). Even though these developments were stimulated by the effective medium treatment of metamaterials, such treatment is by necessity approximate, especially when the lattice cell size is an appreciable fraction of the vacuum wavelength (Sect. 9.3). As an alternative, absorption effects in metal–insulator–metal structures can be interpreted simply via wave interference—that is, as Fabry–Pérot-type interferometers of varying complexity. Further details can be found in papers by H.-T. Chen [CZO+10, Che12].

9.2 Applications of Metamaterials


K. Bhattarai et al. write [BSS+17]: ...the metamaterial perfect absorber behaves as a meta-cavity bounded between a resonant metasurface and a metallic thin-film reflector. The perfect absorption is achieved by the Fabry-Perot cavity resonance via multiple reflections between the “quasi-open” boundary of resonator and the “close” boundary of reflector.

9.2.5.1

Chiral Metamaterials

By definition, an object or structure is chiral if it is not congruent (i.e. cannot be superimposed by translations or rotations) to its mirror image. Classic everyday examples are a glove and a corkscrew. Such structures are ubiquitous in chemistry, where molecules that are mirror images of each other are called enantiomers. In biological cells, DNA, amino acids (the building blocks of proteins) and sugars are chiral, and one of the big mysteries in the origin of life on Earth is homochirality: Why these molecules are utilized in one form only. Chiral metamaterials may be conceptually simple to design (although challenging to fabricate in 3D). One idea that almost immediately comes to mind is helical structures; see, e.g., a 2017 review by A. Passaseo et al. [PECT17] or J. Kaschke and M. Wegener’s 2016 paper [KW16]. Figures 9.5, 9.6, 9.7 and 9.8 show a few other representative designs of chiral metamaterials; there are many more. These examples are taken from the papers by J. Zhou et al. [ZDW+09], Z. Li et al. [LZK+10], Y. Cui et al. [CKL+14], and W. Ma et al. [MCL18]; see also M. Qiu et al. [QZT+18] and reviews by Z. Li et al. [LMO13], V. K. Valev et al. [VBSV13], Z. Wang et al. [WCWL16]. There is also a significant amount of research on planar chiral structures (B. Bai et al. [BSTV07], L. Hecht and L. D. Barron [HB94]). A closely related but separate subject is extrinsic chirality which arises from the mutual orientation of an

Fig. 9.5 (Source J. Zhou et al. arxiv.org/abs/0907.1121.) a Schematic representation of one unit cell of the cross-wire structure. b Photograph of one side of a fabricated microwave-scale crosswire sample. The geometry parameters are given by ax = a y = 15 mm, l = 14 mm, w = 0.8 mm, s = 1.6 mm, φ0 = 45◦ , and φ = 30◦ . The thickness of the copper is tm =36 µm


9 Metamaterials and Their Parameters

Fig. 9.6 (Source Z. Li et al. arxiv.org/abs/1008.4950.) a Schematic of a unit cell of the chiral metamaterials consisting of four “U” split-ring resonators. b A photograph of the experimental sample. The geometric parameters: ax = a y = 15 mm, t = 1.6 mm, d = 1.5 mm, w = 0.7 mm, and s = 6 mm. The copper thickness is 0.03 mm

c Fig. 9.7 (Reprinted with permission from W. Ma et al. [MCL18] 2018 American Chemical Society.) Schematic of the chiral metamaterial. The unit cell consists of two stacked gold SRRs twisted at a certain angle and separated by two spacing dielectric layers with a continuous gold reflector at the bottom. The thickness and width of the gold SRRs are 50 and 200 nm, respectively, whereas the period of the unit cell is fixed at 2.5 µm. The reflection spectra of interest are set in the mid-infrared region from 30 to 80 THz

intrinsically non-chiral structures and the incident beam (E. Plum, V. A. Fedotov and N. I. Zheludev [PFZ09, PFZ10]). What can chiral metamaterials do for us? First, it is probably not surprising that circular dichroism may be appreciable; that is, circularly polarized waves may behave differently in chiral materials, depending on the direction of rotation. Giant optical activity (polarization rotation) has been reported; I refer to the review papers cited above for all these effects [PECT17, LMO13, VBSV13, WCWL16]. More surprising is the “chiral route” to negative refraction, discovered by S. Tretyakov and his collaborators [TNA+03, TSJ05], and subsequently analyzed by a number of research groups (Z. Li et al. [LZK+10, LMO13], J. B. Pendry [Pen04], E. Plum et al. [PZD+09], J. Zhou et al. [ZDW+09].

9.2 Applications of Metamaterials


c Fig. 9.8 (Reprinted with permission from Y. Cui et al. [CKL+14] 2014 American Chemical Society.) Structural geometry and SEM images of the chiral metamaterial. a Schematic of a unit cell of the twisted-arc metamaterial, wherein the silver arcs in each pair have an angular offset of α + β with respect to the stacking axis passing through the center of the arcs. Geometrical parameters: α = 100◦ , β = 10◦ , r = 140 nm, w = 70 nm, t = 50 nm, and d = 180 nm. The unit cell is arranged in a two-dimensional square lattice with a lattice constant of 630 nm. b Oblique view of the sample under an electron microscope, with the magenta and green colors indicating the top and bottom arcs, respectively. Insets: enlarged SEM images of a single meta-atom at the normal and oblique incidence, respectively

All of the properties above follow from the analysis of the dispersion relations in chiral media. The material relations are phenomenologically written as follows:

D = εE + iκH

(9.7)

B = μH − iκE

(9.8)

There are other forms of such relations (N. Engheta, S. Bassiri, C. H. Papas, D. L. Jaggard [EJ88, BPE88])—some similar or equivalent to (9.7), (9.8), and some generalizations—notably to bi-isotropic media:

D = εE + γH

(9.9)

B = μH + βE

(9.10)

(C. Monzon and D. W. Forester [MF05], D.-H. Kwon et al. [KWKS08].) Further analysis and discussion of these issues, including some controversies related to material parameters of chiral and bi-isotropic materials (e.g. “the Post constraint”), can be found in a variety of publications: B. V. Bokut’ and A. N. Serdyukov [BS72], Y. N. Obukhov and F. W.Hehl [OH09], A. Lakhtakia [Lak06], A. Sihvola and S. Tretyakov [ST08].
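As a side note (mine, not the book's): with the constitutive relations (9.7)–(9.8) and the normalization commonly used in the chiral-metamaterial literature, the two circularly polarized eigenwaves propagate with refractive indices n± = sqrt(εμ) ± κ, so a sufficiently strong chirality κ makes one of the two indices negative even when ε and μ are both positive. A two-line MATLAB check of this statement:

eps_r = 1.2;  mu_r = 1.0;               % relative permittivity and permeability of the host
kappa = 0 : 0.05 : 2;                   % chirality parameter (large values occur near resonance)
n_plus  = sqrt(eps_r*mu_r) + kappa;     % indices of the two circularly polarized eigenwaves
n_minus = sqrt(eps_r*mu_r) - kappa;     % n_minus < 0 once kappa exceeds sqrt(eps_r*mu_r)
plot(kappa, n_plus, kappa, n_minus); grid on;
xlabel('\kappa');  ylabel('n_\pm');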


9.2.5.2

9 Metamaterials and Their Parameters

Current Directions and Future Trends in the Applications of Metamaterials

One recent trend has been a focus on metasurfaces (one-layer structures). Incidentally, many of the designs studied in the 2000s were single-layer films, i.e. metasurfaces, due to the relative ease of their fabrication. But now this trend may be reversing, at least in part—advantages of metasurfaces notwithstanding. M. Kadic et al. write in their review of 3D metamaterials [KMvHW19]: At present, the number of researchers working on 2D metasurfaces is much larger than the number working on 3D metamaterials. It is argued that 2D structures are easier to fabricate, bringing the field closer to applications. Flat electromagnetic and optical metalenses are a prominent example. However, recent advanced meta-lenses use two or more layers to obtain additional degrees of freedom in the design process, in analogy to ordinary refractive-lens systems; otherwise, certain aberrations simply cannot be corrected. This means that the 2D metasurface field is partly moving to 3D architectures. This step is not surprising in view of the fact that, conceptually, the possibilities of 2D structures are only a subset of the possibilities of 3D structures. Furthermore, with the rapid progress of 3D additive manufacturing, in future years, 3D structures might be as easy to manufacture as 2D structures are today.

A variety of reviews of metasurfaces and their applications are available: for example, H.-T. Chen et al. [CTY16], P. Genevet et al. [GCA+17], G. Li et al. [LZZ17], F. Ding et al. [DPB18], S.Kamali et al. [KAAF18], I. Epstein et al. [ETA16], W. Wan et al. [WGY17], Z.-L. Deng and G. Li [DL17], Q.-B. Fan and T. Xu [QBT17], J. Cheng et al. [CFC19], S. Chen et al. [CLC+18], F. Ding et al. [DYDB18], P. Genevet and F. Capasso [GC15]. The unusual artificially designed and tuned properties of metamaterials, especially the resonance effects associated with them, continue to stimulate research in a number of unorthodox application areas: nonlinear, quantum, graphene-based, topological metamaterials and metasurfaces, with a variety of applications. I have not conducted independent research in these areas, and their relevance to the main theme of this book (analysis and simulation) varies widely. Therefore, the interested reader is referred to a collection of review papers, with numerous references therein: S. Yoo and Q.-H. Park [YP19], F. Fan et al. [FCC17], R. Bogue [Bog17], Z. Wang et al. [WCWL16], W. Adams et al. [ASG16], G. Oliveri et al. [OWM15], J. P. Turpin et al. [TBM+14], G. R. Keiser et al. [KFZA13], A. Q. Liu et al. [LZTZ12], S. Gredeskul et al. [GKA+12], M. Kadic et al. [KMvHW19], C. M. Soukoulis and M. Wegener [SW11]. Note that these references deal primarily with electromagnetic phenomena. There is an equally large body of research and publications on acoustic, elastic and even fluidic metamaterials.

9.3 Homogenization


9.3 Homogenization This section is joint work with V. A. Markel and is partly based on our papers [Tsu11a, Tsu11b, TM14, TM16, Tsu17].

9.3.1 Introduction Metamaterials were conceived as direct analogs of natural materials, with judiciously designed artificial “atoms” (such as SRRs or other inclusions of varying complexity) producing unusual material properties, in many cases unattainable in nature. This implies that at the macroscopic level, much coarser than the lattice cell size, these artificial structures can be accurately characterized by effective material tensors, as is the case for natural materials. Hence, there is a need for a rigorous and sufficiently general homogenization theory. Development of such theories is the main subject of this section, and the exposition proceeds along the following lines: An outline of classical effective medium theories in physics and applied mathematics. One would assume that some or all of these existing theories can be successfully applied to metamaterials, so that no new ideas are needed; but this turns out not to be the case (Sect. 9.3.7). Classical effective medium theories are inadequate for metamaterials. It turns out that classical theories cannot account for the apparent magnetic properties of structures with intrinsically non-magnetic constituents. Lattice cells must be sufficiently large. An important observation is that many unusual phenomena and properties of metamaterials (negative index of refraction, non-trivial magnetic characteristics, etc.) can manifest themselves only if the lattice cell size is an appreciable fraction of the vacuum wavelength λ (in typical designs, on the order of 0.2–0.3λ). In the mathematical limit of a vanishingly small cell size, these unusual phenomena disappear. (One exception is two-parameter homogenization reviewed in Sect. 9.3.4.) The role of material boundaries. In classical homogenization, material boundaries play only a perfunctory role. In contrast, magnetic effects in metamaterials are boundary-dependent (Sect. 9.3.7.3). The need for non-asymptotic theories. In applied mathematics, well-established homogenization theories are asymptotic—that is, valid in the limit of the lattice cell size vanishingly small relative to some characteristic scale. (For wave problems, this scale is the free-space wavelength; in statics, it is the scale of variation of the applied field and/or the size of the material sample.) One can draw the following principal conclusion: A non-asymptotic homogenization theory is needed—that is, a procedure valid for an arbitrary composition of the lattice cell and a broad range of lattice cell sizes, not necessarily vanishingly small.


9 Metamaterials and Their Parameters

Parameter retrieval (Sect. 9.3.2). This is a natural and most common procedure, whereby effective material parameters are determined from the reflection and transmission coefficients. This is an inverse problem, and its basic version, where only normal incidence is considered, is ill-posed; more advanced, well-posed procedures were developed by V. A. Markel [MT13]. Two-parameter homogenization is a clever way of capturing the magnetic effects that vanish in the standard zero-cell-size limit (Sect. 9.3.4). There exists a welldefined asymptotic limit when the cell size a tends to zero, while the dielectric permittivity  of an inclusion within the cell simultaneously tends to infinity in a certain way. By definition, this procedure is limited to a single inclusion with a constant permittivity. A more important shortcoming is that the homogenization limit depends strongly on the trajectory chosen in the a– parameter plane. It is not obvious why the particular trajectory specified in the method should necessarily produce more accurate results than any other one (Sect. 9.3.4). In addition, I am not aware of a systematic demonstration that Maxwell’s boundary conditions on the material/air interface are honored in this double-parameter limit. High-frequency homogenization is another attempt to break away from the a/λ → 0 limit (Sect. 9.3.5). It accurately represents the dispersion relation near the edges of the Brillouin zone—but not necessarily the boundary conditions on the material/air interface. Our non-asymptotic and, by extension, non-local procedures are outlined in Sect. 9.3.7. One way of describing these procedures is via approximations by Trefftz functions— a set of Bloch waves on the fine scale and the respective generalized plane waves on the coarse scale. The coarse-level approximation is constructed in such a way that (i) the boundary conditions on the material interface and (ii) Maxwell’s equations in the bulk are satisfied as accurately as possible. More examples of non-asymptotic and non-local homogenization are presented in Sect. 9.3.7. Back in time to classical theories. Analysis of metamaterials, while interesting and important in its own right, gives a good opportunity to revisit the familiar notions of polarization, magnetization, average fields, etc. Section 9.3.9 explores the connection between non-asymptotic homogenization and classical physics of the nineteenth–early twentieth century. While the respective theories come from very different perspectives, we shall see that classical theories fit nicely into the proposed framework. Classical effective medium theory dates back to the works of O.-F. Mossotti (1850), L. Lorenz (1869), H. Lorentz (1878), R. Clausius (1879), and J. C. Maxwell Garnett (1904). A number of more advanced physical theories followed: by L. Lewin (1947), N. A. Khizhnyak (1957–59), P. C. Waterman and N. E. Pedersen (1986) [Lew47, Khi59, WP86]. Many modern physical theories of homogenization are presented in T. C. Choy’s monograph [Cho99]. The mathematics literature is also very rich and includes the monographs by N. S. Bakhvalov and G. Panasenko, A. Bensoussan et al., V. V. Jikov et al., G. Dal Maso, E. Sanchez-Palencia, L. Tartar and G. Milton [BP89, BLP78, JKO94, Mas92, SP80, Tar09, Mil02].

9.3 Homogenization

579

A comprehensive review of these theories, even if it were possible in a single chapter, would lead us too far astray. Altogether, according to the ISI database, there are about 40,000 papers covering mathematical, physical, numerical and engineering aspects of effective medium theory (homogenization). However, the existing theories are applicable, for the most part, in the homogenization limit, when the cell size of the microstructure (or another similar parameter) tends to zero. Indeed, for periodic structures the solution is typically sought as an asymptotic series expansion with respect to the lattice cell size a (N. S. Bakhvalov and G. Panasenko [BP89], A. Bensoussan et al. [BLP78], G. Allaire [All92]):  x  x  x + au 1 x, + a 2 u 2 x, + ··· u a (x) = u 0 x, a a a

(9.11)

where the us are some undetermined functions periodic in their second argument. Asymptotic theories do have shortcomings. The zero-order term in (9.11) not only is technically challenging for Maxwell’s electrodynamics of metamaterials (N. Wellander and G. Kristensson [WK03]), but is not sufficient. Indeed, non-trivial magnetic properties can exist in intrinsically non-magnetic metamaterials only for sufficiently large cell sizes. This conclusion has been reached independently by several research groups who have approached the subject from very different directions (A. Bossavit et al. [BGM05], D. Sjöberg et al. [SEK+05]). Specifically, in the zerocell-size limit a → 0, the electromagnetic behavior of a periodic composite becomes trivial. That is, electric and magnetic effects decouple and the material is described by two separate static tensors of dielectric permittivity and magnetic permeability, the latter being unity for intrinsically non-magnetic materials. I showed in 2008 [Tsu08] that magnetic effects may exist only for lattice cell sizes above a certain minimum threshold. V. A. Markel and J. Schotland, using an asymptotic theory, have proved the same by considering Fresnel coefficients at a halfspace boundary of a layered medium [MS10] and of a general 3D composite [MS12]. Various attempts have been made to relax the asymptotic conditions for homogenization. C. Conca and M. Vanninathan et al. [CV02] derive higher-order Bloch approximations with respect to a small parameter, but those are not easy to use and applicable to elliptic problems only. Y. Capdeville and J.-J. Marigo [CM07] consider an asymptotic expansion of elastic waves with respect to a small parameter up to second order, but their analysis applies to layered media only (in essence, a 1D problem). V. Kamotski et al. [KMS07] demonstrate an asymptotic small-parameter expansion with an exponentially small error for the solution of second-order elliptic PDEs with periodic coefficients and very smooth (analytic or, more generally, Gevrey regular [Rod93]) free terms. Other examples of higher-order homogenization are known in various special cases: P. Ponte Castañeda [Cas96], S. Moskow and M. Vogelius [MV97], S. E. Golowich and M. I. Weinstein [GW03, GW05]. In classical physics, effective medium theories rely on simplification assumptions that work very well for relatively simple mixtures but require extensions and enhancements in more complicated cases (L. Lewin [Lew47], N. A. Khizhnyak [Khi59], P. C. Waterman and N. E. Pedersen [WP86]).

580

9 Metamaterials and Their Parameters

c Fig. 9.9 (Reprinted from [Tsu17] 2017 with permission from Elsevier.) A periodic structure replaced with an equivalent (in the sense explained in the text) homogeneous sample. Coarse-level fields should be defined in such a way as to satisfy Maxwell’s equations and interface boundary conditions exactly or as accurately as possible

Informally, the essence of the problem is illustrated in Fig. 9.9. A given periodic structure is to be replaced with a homogeneous sample of the same geometric shape and size, with some material tensor M to be defined, in such a way that reflection and transmission of waves (or, equivalently, the far-field pattern) remain, to the extent possible, unchanged. This informal description is made more precise in Sect. 9.3.7. We have adopted this formulation as the most natural one, but it is not the only one possible. One may ask, for example, how a metamaterial sample will behave if placed inside a waveguide; since the boundary conditions and surface waves at the walls will be different from those in free space, it is not obvious that the effective material parameters will have to be the same (Y. Hadad [Had18]). Homogenization is fundamentally a nonlinear inverse problem which in each particular case could in principle be solved by brute numerical force. However, this is very expensive computationally (especially for large samples) and—perhaps more importantly—gives little insight into the physical behavior of waves in the material and into its design. Instead, we solve this inverse problem approximately, using, as an intermediate step, a special construction of coarse-grained fields.

9.3.2 Parameter Retrieval: Procedures S-parameter retrieval is a well-established inverse problem of determining material parameters from the measured or calculated transmission and reflection coefficients. Typically, wave propagation through a metamaterial slab (sometimes even a single layer) is considered at normal incidence. We review this case first and then consider some improvements to the retrieval methodology. Under the assumption that the effective medium can be described by two scalar parameters  and μ, analysis of wave propagation of a semi-infinite slab with a given thickness is fairly straightforward. For a given incident wave, there are four waves with unknown amplitudes: the reflected wave, two waves traveling in the opposite directions within the slab, and the transmitted wave. These unknown amplitudes can be found from the four standard boundary conditions, two on each side of the slab.

9.3 Homogenization

581

The relevant algebra can be found, for instance, in P. Yeh’s monograph [Yeh05, Chap. 4] or in the well-known J. A. Kong’s textbook [Kon86, Chap. III]. The following final expressions are cited from papers by D. R. Smith et al. [SScvS02, SVKS05], with a slight change of notation. These formulas are valid at normal incidence.  1  1 − (R 2 − (T  )2 )  2T (1 + R)2 − (T  )2 η = ± (1 − R)2 − (T  )2

cos nk0 d =

(9.12)

(9.13)

In these expressions, n is the index of the homogenized slab, η is its wave impedance, d is the thickness of the slab, k0 = ω/c is the free-space wavenumber, R and T are the (measured or calculated) reflection and transmission coefficients, ant T  is phaseadjusted transmission coefficient T  = exp(ik0 d) T

(9.14)

Well-known relations between index and impedance on the one hand and material parameters on the other exist. For homogeneous isotropic media,  =

n ; η

μ = nη

(9.15)

Even a cursory glance at Eqs. (9.12), (9.13) shows that index and impedance, and hence the corresponding material parameters (9.15), cannot be determined uniquely from (9.12), (9.13). First, there are sign ambiguities; but these can be resolved by asserting that Re η > 0 and Im n ≥ 0 for passive media. More importantly, nk0 d can be determined from (9.12) only up to an integer multiple of 2π. Note that k = nk0 is the wavenumber in the medium, with the respective wavelength λ = 2π/(nk0 ); hence nk0 d = 2πd/λ. Thus, adding 2π to nk0 d is equivalent to increasing the index and “fitting” an additional (shorter) wavelength into the slab, without any change in the T R coefficients. Strictly speaking, one cannot even distinguish between positive and negative values of Re n, since adding 2πm/(k0 d) to Re n, for any integer m, will make no measurable difference for an outside observer. Several semi-heuristic procedures to rectify the ill-posedness of the inverse problem have been proposed: e.g. X. Chen et al. [CGW+04] or S. Feng [Fen10]. A rigorous approach is to incorporate TR data for incidence other than normal, at which point the problem becomes well posed and overdetermined. That is, effective parameters fitting the TR data perfectly will not in general exist. A fairly obvious but often underappreciated conclusion is that effective parameters of metamaterials are only an approximation. We shall revisit this conclusion in Sect. 9.3.8.. V. A. Markel worked out several well-posed retrieval procedures in [MT13, Appendix C]. A concise summary of one of these procedures is as follows.

582

9 Metamaterials and Their Parameters

Let z be the normal direction to the slab, and let κ0 denote the normal component of the incident wave vector:

κ0 = k02 − kτ2 where τ is the tangential direction to the slab in the plane of incidence. Denote x = k0 L

(9.16)

Then, the tangential components of the electric and magnetic field at the left and right faces of the slab are related by

EL M11 M12 E 0 = HL M21 M22 H0

where M is the transfer matrix with det M = 1. If the slab has a center of symmetry, then M11 = M22 , and the transfer matrix can be written as [Fen10]

cos θ (−i/Z) sin θ M= , −iZ sin θ cos θ where θ and Z are the optical depth and the generalized impedance of the slab. The transmission and reflection coefficients can be expressed in terms of θ and Z as 1 , cos θ − i X + (Z0 , Z) sin θ −i X − (Z0 , Z) sin θ R= cos θ − i X + (Z0 , Z) sin θ

T =

where X ± (Z1 , Z2 ) =

1 2



Z1 Z2 ± Z2 Z1

(9.17a) (9.17b)



and Z0 = κ0 /k0 is the generalized impedance of free space. As usual, the quantities T and R (9.17) are the ratios of the amplitudes of the transmitted and reflected tangential fields (electric for s-polarization or magnetic for p-polarization) to those of the incident wave at z = 0. (T is measured at z = L and R at z = 0). To make the above expressions more specific, consider a slab that occupies a domain 0 < z < L and is characterized by local diagonal tensors  = diag(⊥ , ⊥ ,  ) and μ = diag(μ⊥ , μ⊥ , μ ). Then, θ = qz L , Z = qz /(k0 η ) ,

(9.18)



 k02  μ − k x2 η /η⊥

(9.19)

where qz =

9.3 Homogenization

583

and η refers to μ for s-polarization and to  for p-polarization. The transmission and reflection coefficients are rewritten by V. A. Markel in the following form: 4 pC , (C + 1)2 − p 2 (C − 1)2 (1 − p)2 (C 2 − 1) R= . (C + 1)2 − p 2 (C − 1)2

T =

Here, C=

(9.20a) (9.20b)

qz η , p = exp (iqz L) . κ0

If T and R are known at some incidence angle (parameterized by k x ), so are C and p. The expressions for C and p in terms of T and R [inversion of (9.20)] are unique up to the branch of a square root: C=

1 + T 2 − R2 ± D ±D , p = T 2 − (1 − R)2 2T

where D=

 (1 + T 2 − R 2 )2 − (2T )2

One may seek the index of refraction that fits the function p(t) to fourth order in t; then, use the function C(t) to find the impedance. (This is Method 3 of [MT13, Appendix C].) Then, the following relations hold: F2 ≡

η p  (0) = −i x , p(0) nη⊥

F4 ≡

i + xn p  (0) = −3x p(0) n

(9.21a) 

η nη⊥

2 .

(9.21b)

Eliminating the ratio η /η⊥ from the above set of linear equations and solving for the index of refraction n, we obtain n=

i . x(F4 /3F22 − 1)

F2 and F4 can in turn be related to p(t) (available from measurements or simulations) at some small but nonzero values of t, say, τ1 and τ2 s:

584

9 Metamaterials and Their Parameters

τ22 b1 − τ12 b2 , τ22 − τ12 b2 − b1 F4 = 12 2 , τ2 − τ12

F2 =

where

2 bk = 2 τk



p(τk ) − 1 , k = 1, 2 p(0)

With the index of refraction n so determined, we can compute η and η⊥ ([MT13, Appendix C]): xC(0) . η = C(0)n , η⊥ = −i F2

9.3.3 Parameter Retrieval: Anomalies and Controversies In the early 2000s, as research in metamaterials progressed and experience in parameter retrieval accumulated, some anomalies in the behavior of the effective parameters eff (ω), μeff (ω) became apparent. Th. Koschny et al. [KME+05] give a list of these anomalies, of which I will mention two most interesting ones: Negative imaginary parts of eff (ω), μeff (ω) near resonances. Resonances and antiresonances in eff (ω) and μeff (ω) go hand-in-hand: That is, peaks in one of these parameters typically correspond to troughs in the other. The appearance of the negative imaginary parts of eff (ω) and/or μeff (ω) has been particularly controversial, since it seems to violate passivity of the material. This anomaly has been observed by many research groups: S. O’Brien and J. B. Pendry [OP02b, Fig. 7], [OP02a, Fig. 5] Th. Koschny et al. [KMSS03, KME+05], P. Markoš and C. M. Soukoulis [MS03], and others. The authors of [KMSS03, KME+05] attribute the negative imaginary parts of the material parameters to “periodicity”—an apparent misnomer which actually seems to mean the finite size of the metamaterial sample. The implication is that the retrieved parameters accurately describe reflection and transmission through a given finitethickness slab and are not otherwise universally applicable. Further, Th. Koschny et al. state that the negative imaginary part of one of the material parameters does not necessarily violate passivity, since the conventional expression for power absorption includes both electric and magnetic contributions and may still be positive even if one of these contributions is negative. This interpretation was criticized by A. L. Efros [Efr04], who wrote that conditions Im (ω) > 0

(9.22)

Im μ(ω) > 0

(9.23)

9.3 Homogenization

585

for any system in thermodynamic equilibrium (passive system) [follow] directly from the second law of thermodynamics and that this is one of the most important theorems of macroscopic electrodynamics.

Efros further argues that (9.22) and (9.23) must be valid separately—not just as a combination ensuring positive absorption in passive media. Indeed, one can envision a situation where a material sample is placed inside a charged capacitor, ensuring that the electrical field is strongly dominant, and hence  places a critical role. Efros also states that, similarly, by placing the sample in a solenoid one can ensure that μ plays a principal role and hence (9.23) must hold. However, V. A. Markel pointed out [Mar08] that this second statement is not as clear-cut. Indeed, an electric field will be induced by the changing magnetic field in the solenoid, and electrical effects may turn out not to be weaker than magnetic ones, especially at sufficiently high frequencies where μ may deviate significantly from unity. Causality-related properties of material parameters deserve particular attention. The classical Kramers–Kronig expression for the real part of the permeability reads 2 μ (ω) = 1 + π 



∞ 0

ω  μ (ω  ) dω  ; μ ≡ Re μ, μ ≡ Im μ ω 2 − ω2

(9.24)

This form of the Kramers–Kronig relation can be derived from Titchmarsh’s theorem in complex analysis, taking into account the fact that magnetic susceptibility is a realvalued function in the time domain. Relations similar to (9.24) hold for the dielectric permittivity and also for  and μ . An important corollary of (9.24) is that if, indeed, μ (ω  ) > 0 at all frequencies, then μ (0) > 1, which contradicts the existence of diamagnetics whose static permeability is less than one. V. A. Markel arrives at the conclusion that the μ (ω  ) > 0 condition may in fact be violated [Mar08]. L. D. Landau, E. M. Lifshitz and L. P. Pitaevskii [LLP84, §84, p. 287] deduce from the Kramers–Kronig relations the following condition, valid in the frequency ranges with negligible losses ( ≈ 0): ∂ω (ω) > 1

(9.25)

A similar condition holds for μ.4 Therefore, as noted by C. R. Simovski [Sim09, Sim11], ... at frequencies where losses can be neglected, the real parts of the material parameters should increase with frequency.

“Retrieved” material parameters do not always satisfy this condition. This is not, in my view, surprising. First of all, representing a metamaterial as an effective medium is an approximation, not necessarily very accurate (Sect. 9.3.8). Secondly, and more importantly, coarse-level (“macroscopic”) fields cannot be defined in any meaningful way at wavelengths shorter than the lattice cell size—that is, at frequencies not that 4 An analogous statement in the theory of electrical networks is known as Foster’s reactance theorem.

586

9 Metamaterials and Their Parameters

much higher than the typical operating frequency of a metamaterial device. One may then question the applicability of the Kramers–Kronig relations, where the integration extends to ω → ∞. Contrast that with natural materials, where the effective medium representation breaks down only at very short wavelengths (very high frequencies).

9.3.4 Two-Parameter Homogenization A mathematical technique involving two-parameter limits was initially developed for fluid flows in fractured porous media (G. I. Barenblatt and Y. T. Zheltov [BZ60], T. Arbogast et al. [AJH90]) as well as for diffusion (G. Allaire [All92]) and elasticity (G. V. Sandrakov [San99], V. V. Zhikov [Zhi00]). D. Felbacq, G. Bouchitté and collaborators applied similar ideas to electromagnetic metamaterials with highpermittivity inclusions in the lattice cell [FB05, FGBB09]. In this approach, the cell size is scaled as a = ηa0 and concomitantly the permittivity of inclusions is scaled as  = 0 /η 2 , where a0 and 0 are two constants. The problem is then treated asymptotically in the limit η√→ 0. The rationale behind this particular scaling is that it preserves the product  a = na, where n is the index of the inclusion, or equivalently the ratio λincl /dincl , where λincl , dincl are the wavelength in the inclusion and its diameter. The latter ratio is a critical parameter under resonance conditions. Thus, the double-parameter limit is an attempt to capture the resonance behavior of the structure. Despite its physical and mathematical appeal, two-parameter homogenization has several limitations: • This two-parameter method applies to high-permittivity inclusions in an otherwise empty lattice cell—a mathematically and physically interesting, but still special, case. • In practice, for a given (finite) cell size a and (finite) permittivity of the inclusion , it is not clear whether and why the double-parameter limit specified above should give a more accurate result than any other conceivable trajectory in the (a, ) parameter plane, including the classical single limit a → 0,  = const. Interestingly, N. A. Nicorovici et al. showed that the two limits a → 0 and  → ∞ are noncommuting [NMB95b, NMB95a]. • Finally, the (ηa0 , 0 /η 2 ) method is in general not self-contained and may require additional fitting parameters not derivable from first principles [FGBB09]. The above reservations notwithstanding, two-parameter homogenization is being actively studied and extended, especially in the mathematical literature: G. Bouchitté, B. Schweizer, R. Lipton and collaborators [BS10, LS18], and M. Ohlberger and B. Verfürth [OV18, Ver19]. Even time-domain homogenization has been proposed (T. Dohnal et al. [DLS14]).

9.3 Homogenization

587

9.3.5 “High-Frequency Homogenization” R. V. Craster and collaborators [CKP10, CKNG12, HMC16] developed extensions of the classical two-scale homogenization procedure ([BP89, BLP78], Sect. 9.3.1) to a special case of high-frequency waves in periodic structures. Namely, Craster’s method is valid at the Brillouin zone edges, when the field is a standing wave, the spatial period being double the lattice cell size. The method has found a variety of applications within its range of validity [CKP10, CKNG12, HMC16], but I am not aware of its generalizations to less specific situations. The asymptotic analysis is focused on the dispersion relation, and it is not clear how the boundary conditions on a finite sample are handled.

9.3.6 Wave Vector-Dependent Tensor In one line of research on homogenization of metamaterials, artificial sources are introduced within the material, resulting in wave vector-dependent effective tensors [Sil07, Sil09, FS10, Alu11, AYSS11, SHD19]. Specific stages of this type of procedure are as follows. 1. Magnetic and/or electric currents of the form J = J0 exp(ik · r) are introduced inside a metamaterial sample. 2. These currents induce electromagnetic fields in the material of the Bloch-like form: e(r) = e˜ (r) exp(ik · r), e˜ being a lattice-periodic factor, similarly for the magnetic field. Importantly, however, k is arbitrary and does not have to be equal to a Bloch wave vector which would exist in the absence of sources. 3. Volume averages (i.e. the zero-order harmonics) of d˜ and e˜ are found. Note that they are functions of both frequency and k, which in this approach are treated as independent variables. A linear relation between these averages is a tensor eff (ω, k) or, equivalently, eff (k0 , k), k0 = ω/c. 4. The eff (ω, k) tensor represents a non-local material relation.5 It is claimed that local material parameters eff , μeff (no longer dependent on k) can then be deduced from eff (ω, k). This approach is an adaptation of a theory in crystal optics, due primarily to V. M. Agranovich and V. L. Ginzburg [AG84]. We have critiqued this adaptation in a number of publications [Mar10, MT13, Mar18] on the following grounds: • Currents inside a metamaterial are non-physical [Mar10, Mar18]. If they are adopted as a purely mathematical construct, then the intrinsic material relations in all constituents of the periodic structure are broken, and the problem being solved is different from the original one.

5 Since

dependence on k translates into a convolution integral in real space.

588

9 Metamaterials and Their Parameters

• Setting k = q, where q is a Bloch wave vector in the structure, results in a singularity. • These artificial currents are in fact unnecessary; the eff (ω, k) tensor can be produced without recourse to these currents [MT13]. • The procedure, based entirely on volume averaging and field behavior in the bulk, cannot account for the boundary conditions and, consequently, for reflection and refraction of waves, with the respective coefficients. This is a principal shortcoming of the methodology.

9.3.7 Non-asymptotic Homogenization 9.3.7.1

Preamble

Much of the material in this section is joint work with V. A. Markel and follows our publications [Tsu11a, MT13, TM16, TM14, MT16, Tsu17]. To justify our approach, it is instructive to first consider some pitfalls in the standard ways of homogenization, with application to metamaterials, and then develop a procedure that avoids these pitfalls. Specifically, we consider the role of interface boundaries and inadequacy of the standard volume averaging of fields in the non-asymptotic case. Separately, I discuss the classical notions of polarization and magnetization (Sect. 9.3.9) and see why they are not as clear-cut and not as easily applicable to metamaterials as it might seem at first glance.

9.3.7.2

Scaling Invariance of Maxwell’s Equations

Maxwell’s equations are written differently, in different systems of units, by physicists, electrical engineers and mathematicians. To keep everybody equally happy (or equally unhappy), in this section we shall write these equations in a unified form, as is also done in Sect. 7.2: (9.26) ζ∇ × E = −∂t B ζ ∇ × H = ∂t D

(9.27)

These equations are valid in the absence of external sources. The “system unification factor” is  1, SI ζ = (9.28) c, Gaussian Critical for this section is the observation that the ∇ × H equation (9.27) is invariant with respect to any simultaneous rescaling of the H and D fields:

9.3 Homogenization

589

H → ξH;

D → ξD

(9.29)

ξ being an arbitrary number. This is a simple particular case of what some researchers call the Serdyukov–Fedorov transformation (A. P. Vinogradov [Vin02]). The ∇ × E equation (9.26) is also rescalable in a similar manner. However, the E and B fields govern physical forces acting on charges, and for this reason we shall not consider their rescaling. To the contrary, rescaling of H and D does not violate any physical principles and, moreover, is closely related to the well-known fact that the total current in a medium can be split non-uniquely into “polarization” and “magnetization” components, proportional to ∂t P and ∇ × M, respectively. (The familiar notions of polarization and magnetization are revisited in Sect. 9.3.9.) This scaling invariance has principal implications for homogenization theory of metamaterials. Indeed, consider even the trivial case of an infinite homogeneous isotropic medium. If one, say, were to double the H and D fields simultaneously, this would result in the multiplication of the dielectric permittivity  by a factor of two and division of the magnetic permeability μ by the same factor. The product μ would remain the same; this makes physical sense, since this product determines the phase velocity of waves in the medium. However, the ratio μ/ would change. We conclude, therefore, that even in the simplest case material parameters cannot be uniquely determined from the bulk behavior alone (i.e. from the behavior of waves in an infinite medium). Obviously, for complex media such as metamaterials this conclusion is even more true. This brings us to the principal role of boundaries in non-asymptotic homogenization of electromagnetic periodic media.

9.3.7.3

The Role of Boundaries

The role of interface boundaries in homogenization is schematically illustrated in Fig. 9.10. Consider a smooth wave (e.g. a plane wave) impinging from the air on a periodic structure. Sketched in the figure is the tangential component Hτ of the H field; near-field effects in the air are for simplicity not shown. In the material, the wave is subject to fine-scale oscillations. One simple but critical observation is that an arbitrary rescaling (9.29) can no longer be applied to the field within the material, because Hτ at the interface boundary is constrained (one may say, “calibrated”) by Hτ in the air.6 An equally valid alternative interpretation of the role of boundaries is that Maxwell’s boundary conditions depend on the wave impedance of the medium. In the simplest case of a linear isotropic medium, this fixes the ratio μ/ and, together with the product μ (on which the index and phase velocity depend), determines μ and  separately. This critical role of interface boundaries of metamaterials is not accepted as widely as it should be, but has been noted by various researchers in different contexts, with 6 In

principle, of course, H could be rescaled in the whole 3D space, including air. But this would lead to B = H (or B = μ0 H) in the air, which would make little physical or practical sense.

590

9 Metamaterials and Their Parameters

Fig. 9.10 Schematic illustration of the role of interface boundaries in homogenization. The tangential component of H in the air “calibrates” this component in the material, so rescaling (9.29) is no longer possible. An equally valid alternative interpretation of the role of boundaries is that they fix the wave impedance of the medium

varying degrees of relevance to homogenization (D. Felbacq [Fel00], F. J. Lawrence et al. [LdSB+13], S. Boscolo et al. [BCMS02] and especially C. R. Simovski, who introduced the notion of “Bloch impedance” [Sim09, Sim11]). In light of all these observations, I propose the following classification: Partial problem of homogenization: Approximate (“homogenize”) the dispersion relations only—i.e. wave propagation in an infinite medium. Magnetic effects in intrinsically non-magnetic media cannot in general be accounted for in this type of problem. Full problem of homogenization: Find the full effective material tensor, including the magnetic part and, possibly, magnetoelectric coupling. Unavoidably, boundaries must be taken into account. Most publications to date are devoted to the partial problem. There are a few notable exceptions, where wave propagation through a finite structure is considered, and the relevant material parameters derived: e.g. Y. Liu et al. [LGG13], V. Popov et al. [PLN16]. Our own methodology is presented in Sect. 9.3.7.5.

9.3.7.4

Inadequacy of Volume Averaging

Given Maxwell fields and sources rapidly varying in space on a fine scale,7 one wishes to define their coarse-scale averages and the corresponding governing equations on the coarse (“macroscopic”) level. Toward this end, a natural inclination is to apply volume averaging, and this indeed is taken for granted in most textbooks and research papers.8 However, the situation is not quite that simple; for natural materials, this is further discussed in Sect. 9.3.9 and for metamaterials can again be inferred from the sketch (Fig. 9.10). 7 Molecular scale for natural materials and subcell-size scale for periodic structures (photonic crys-

tals or metamaterials. and ensemble averaging are interesting subjects in their own right, but with little relevance to the content of this section.

8 Time

9.3 Homogenization

591

Suppose that the dashed green line in that figure represents the volume average h τ V of the tangential component h τ = bτ (or h τ = μ−1 0 bτ in the SI system), and suppose further that one were to define the coarse-level field Hτ ≡ h τ V , —that is, the dashed green line would indicate the coarse-scale Hτ . Then, the key observation is that this volume average will in general violate Maxwell’s tangential continuity condition at the material/air interface; this is schematically indicated by the solid green jump segment in the figure. There are two main ways to rectify that. First, one may introduce an equivalent electric surface current that would account for the jump of Hτ . Then, a separate constitutive relation between this current and the H field would need to be devised. This could probably be done but would lead to a nonstandard formulation, not directly compatible with well-established analytical tools and numerical procedures implemented in most software packages. A more appealing way is to define the coarse-scale H field in such a way that Maxwell’s tangential continuity condition is honored, as schematically indicated by the dashed red line in Fig. 9.10. This definition of the H field is different from volume averaging and will ultimately account for non-trivial magnetic effects, as we shall see in Sect. 9.3.7.5. Similar considerations apply to the definition of the coarse-level E field, which must also satisfy Maxwell’s tangential continuity condition. Tangential continuity of the H and E fields is in fact a starting point of our non-asymptotic homogenization procedure. Let us also note a few additional complications related to volume averaging. There are two different ways of volume averaging: (i) The simplest one is just the  lattice cell average ξC = VC−1 C ξd V , where VC is the volume of the lattice cell C and ξ stands for any physical quantity. (ii) The moving average ξ = ξ ∗ , where  is a chosen convolution kernel (e.g. a Gaussian). The cell average and moving average are substantially different, and there are problems with each of them, as explained below. The “insanity pitfall”. Consider the four-field model with the e, d, h, b fields on the fine scale; their coarse-scale counterparts E, D, H, B need to be defined. On the fine scale, for intrinsically non-magnetic materials h = b (or h = μ−1 0 b in the SI system). Magnetic effects are in general expected on the coarse scale; that is, H = B. Let us suppose that one applies volume averaging to h and obtains H. Then, one repeats the same averaging for b → B. Having done the same thing twice (since h ≡ b), one expects to get different results (H = B), which falls under the definition of insanity commonly attributed to Einstein.9 Let us briefly discuss the items above. 9 This

apparently is a misattribution; the most likely source of this adage is—believe it or not— Alcoholics Anonymous (https://quoteinvestigator.com/2017/03/23/same/ or https://en.wikiquote. org/wiki/Narcotics_Anonymous). Incidentally, Einstein’s infamous “biggest blunder” quote is most likely made up as well. http://www.theatlantic.com/technology/archive/2013/08/einstein-likelynever-said-one-of-his-most-oft-quoted-phrases/278508/ Einstein Likely Never Said One of His Most Oft-Quoted Phrases. The great scientist certainly regretted introducing the “cosmological constant” into his equations, but calling it his “biggest blunder”? Not so much, it seems..

592

9 Metamaterials and Their Parameters

An obvious feature of cell-wise averaging is that it produces piecewise-constant fields, with jumps between the adjacent lattice cells (unless the cell averages are the same, which can only happen in exceptional circumstances for uniform static fields, but certainly not for waves). Such fields do not satisfy Maxwell’s equations, since their curls and divergences are zero within the cells and are surface delta functions on the intercell boundaries. This argument should be sufficient to dissuade a mathematician from using cellwise volume averaging, but a physicist might still argue that the stepwise variation of the fields could be interpolated smoothly and become legitimate. There are two objections to this point of view. First, it is not at all clear that an interpolation satisfying Maxwell’s equations can be arranged. Secondly, there is an inherent and non-negligible error in cell averaging. Consider, for example, a cubic cell C with (i) a point dipole p at its center and (ii) a uniform polarization P = p/VC . Clearly, the absolute or relative difference between (i) and (ii) is negligible in the far field. However, on the boundary of the cell the relative difference in the electrostatic field between these two cases is in fact O(1)—that is, it does not diminish as the cell size goes to zero. Thus, extreme caution should be exercised if one wishes to use the standard volume averages in homogenization of metamaterials. What about the moving average—that is, a convolution of fine-scale fields with a mollifier kernel (e.g. a Gaussian)? In contrast to cell-wise averaging, the coarselevel fields produced by this operation not only are smooth but satisfy Maxwell’s equations exactly, because convolution commutes with Maxwell’s curl operators. A pedagogical exposition of this is given by G. Russakoff [Rus70]. We will return to this subject in Sect. 9.3.9, but one important issue should be noted right away: the “spillover” effect across interfaces. The fields near the metamaterial/air boundary are smeared from the metamaterial into the air, in a boundary layer whose width is commensurate with that of the convolution kernel. For natural materials, the moving average can be performed on a deeply subwavelength scale, and hence the spillover effect can safely be neglected. That is not the case for metamaterials whose lattice cell size is typically an appreciable fraction of the wavelength. Hence, a mesoscale for the mollifier kernel—i.e. an intermediate scale between the lattice cell size and the vacuum wavelength—simply does not exist for metamaterials, in contrast to natural materials. Let us now turn to the second item on the list above: h = b but, paradoxically, H = B. There are proponents of the “three-field” theory who may dismiss this paradox as a non-issue, since the fine-scale h field in that model simply does not exist, and the coarse-scale H field is obtained not by averaging h but rather by introducing magnetization and subtracting it from B.10 In natural materials, one may indeed consider b as the only magnetic field existing on the molecular scale and adopt the “three-field” model (e.g. L. D. Landau, E. M. Lifshitz & L. P. Pitaevskii [LLP84]). But for metamaterials, this point of view is not satisfactory, for the following reasons.

10 More

precisely, H = μ−1 0 B − M in the SI units and H = B − 4πM in the Gaussian system.

9.3 Homogenization

593

• For natural materials, magnetization (and polarization) are defined phenomenologically, with the respective material parameters determined experimentally. In contrast, Maxwell’s electrodynamics of metamaterials is completely self-contained. The availability of experimental measurements notwithstanding, ways must be found to define magnetization and the material tensor analytically, without recourse to phenomenological arguments. • Thus, an unambiguous definition of M has to be given. For a stand-alone magneticdipole-like inclusion inside an otherwise homogeneous lattice cell, the standard approach is to find its magnetic dipole moment m and define magnetization M = m/VC . However, this approach runs into problems, quite similar to the ones discussed above in the case of polarization. First, cell-wise magnetization is a piecewise-constant function corresponding to zero currents within each cell and some spurious electric currents on intercell boundaries. Second, the near field of a small inclusion at the center of the cell is significantly different from the near field due to magnetization smeared uniformly over the cell; hence, the interface boundary conditions are distorted by this smearing. • Obviously, the assumption b = h (or b = μ0 h) excludes from consideration intrinsically magnetic materials. This is inconsequential in the optical range, but at microwave and lower frequencies intrinsic magnetism of some materials may play an important role (see L. Chao et al. [CAO15] as just one example). • While it is very common to treat B and H (or b and h) as objects of the same type—namely, vector fields,—a serious argument can be made that they should be treated as fundamentally different objects. Namely, B fields are associated with fluxes and H fields with circulations. The proper mathematical language for this is that of differential geometry and differential forms; in that framework, b and h are not identical and cannot be equated, but are rather connected via the so-called Hodge operator. This subject is beyond the scope of the book, but a few additional remarks and references can be found in Sect. 7.8. If b and h are viewed as different entities, then a four-field model, rather than a three-field one, is called for. The main conclusion is that uncritical translation of familiar quantities and notions from the physics of natural materials to metamaterials is fraught with peril. Moreover, a careful analysis of electromagnetic fields in metamaterials may help to clarify the seemingly familiar notions such as polarization and magnetization; we revisit these notions in Sect. 9.3.9.

9.3.7.5

The Ideas of Trefftz Homogenization

The homogenization procedure presented in this section is a slightly simplified version of the one introduced in [TM14, Tsu17], but with very similar results. The exposition follows closely our papers [TM14, TM16, Tsu17], with some modifications noted in the course of the presentation below. As previously stated, “non-asymptotic” homogenization implies that mathematical limits with respect to any physical parameters are not considered; instead, the

594

9 Metamaterials and Their Parameters

procedure applies to periodic structures with any given composition and size of the lattice cell, under a broad range of illumination conditions. By necessity, nonasymptotic homogenization is an approximation; our procedure has a built-in error estimate allowing one to assess the accuracy of homogenization and conclude qualitatively and quantitatively whether a given metamaterial can be represented as an effective medium under given conditions. As emphasized previously, material boundaries play a critical role in homogenization of metamaterials, and effective medium theory must apply to finite samples. It has to be recognized, however, that a rigorous analysis of samples finite in all three coordinate directions is extremely complicated due to the presence of edges, corners and complex surface waves. As a compromise, we consider metamaterial slabs of finite width and finite but long in the remaining two directions. Such slabs have indeed become standard objects of theoretical analysis and, among other things, form a basis for J. B. Pendry’s “superlens” [Pen00]. The electromagnetic problem is formulated in the frequency domain with the exp(−iωt) phasor convention. At a working frequency ω, the free-space wavenumber k0 = ω/c = 2π/λ, where λ is the free-space wavelength. The problem under consideration has two principal scales (levels). Fine-level fields e, d, h and b are the exact solutions of Maxwell’s equations for given illumination conditions and for a given sample. In general, their variation in space is rapid and consistent with the microstructure of metamaterial cells. These fields are assumed to satisfy the constitutive relations d(r) = ˜(r)e(r) , b(r) = h(r)

(9.30)

Coarse-level fields vary on a characteristic scale greater than the cell size. Importantly, there are two distinct types of such fields: (i) The actual physical fields in and around a homogeneous sample. These can be obtained (e.g. numerically) only after the equivalent material tensor has been determined, and therefore cannot play a role in any valid homogenization procedure free of circular reasoning. (ii) Some smoothed (averaged) versions of the fine-level ones. In contrast to (i), such fields are auxiliary mathematical constructions rather than measurable physical quantities.11 These auxiliary fields are denoted with capital letters E, D, H, B and

11 P. Penfield and H. A. Haus write [PH67, pp. 42–43]: “We now indicate why fields inside materials are impossible to measure.... within the polarized body the fields E and H are mathematical constructs defined in terms of equivalent smoothed charge and current distributions. A measurement of an electric or magnetic field inside the polarized body would not necessarily measure these fields because the spatial and time averages taken by a probe, such as an energetic electron beam shot through the body or electrons of a conduction current, are not necessarily the same as the averages implied in [macroscopic Maxwell’s equations]. ... the fields predicted from [macroscopic Maxwell’s equations] have physical meaning only in the space surrounding a polarized body or in slots cut into the body for the purpose of measuring the fields but have no direct physical meaning inside the body itself.”

9.3 Homogenization

595

depend on the respective fine-scale fields. Although the latter are not known a priori, suitable approximations can be used in their stead, as explained below. Let us assume that a periodic composite is contained between the planes z = 0 and z = L. In the other two directions (x and y), the size of the sample is assumed to be finite but much greater than L, so that the influence of edges and corners can be disregarded. The fine-level fields satisfy Maxwell’s equations of the form ˜ ∇ × e(r) = ik0 h(r) ∇ × h(r) = −ik0 ε(r)e(r);

(9.31)

everywhere in space, supplemented by the usual radiation boundary conditions at infinity. Here, the tilde sign indicates lattice-periodic quantities, as in [TM14]: f˜(x + n x ax , y + n y a y , z + n z az ) = f˜(x, y, z)

(9.32)

where ax,y,z are the lattice periods and n x,y,z are arbitrary integers. The intrinsic permittivity ˜(r) is assumed to be lattice-periodic and scalar. For notational simplicity and with an eye on applications in optics and photonics, the intrinsic magnetic permeability of all material constituents is unity, μ(r) ˜ = 1; however, the homogenization procedure generalizes in a straightforward fashion to linear magnetic materials as well. All constitutive relationships are assumed to be local and linear. Let the sample be illuminated by monochromatic waves with a given far-field pattern; these waves are reflected by the metamaterial. Outside the slab, the most general solution to (9.31) can be written as a superposition of incident, transmitted and reflected waves. For the electric field, we can write these in the form of angular spectrum expansions [TM14]:  ei (r) = et (r) = er (r) =

 

si (k x , k y )ei (kx x+k y y+kz z ) dk x dk y ,

(9.33a)

st (k x , k y )ei (kx x+k y y+kz z ) dk x dk y , z > L ,

(9.33b)

sr (k x , k y )ei (kx x+k y y−kz z ) dk x dk y , z < 0 ,

(9.33c)

where kz =



k02 − k x2 − k 2y ,

(9.34)

and the square root branch is defined by the condition 0 ≤ arg(k z ) < π. Expressions for the magnetic field are obtained from (9.33) using the second Maxwell equation in (9.31). In (9.33), si (k x , k y ), st (k x , k y ) and sr (k x , k y ) are the angular spectra of the incident, transmitted and reflected fields. Waves included in these expansions can be both evanescent and propagating. For propagating waves, k x2 + k 2y < k02 ; otherwise the waves are evanescent.

596

9 Metamaterials and Their Parameters

Everywhere in space, the total electric field e(r) can be written as a superposition of the incident and scattered fields: e(r) = ei (r) + es (r) .

(9.35)

Outside the material, the reflected and transmitted fields form the scattered field:  e (r), z < 0 , (9.36) es (r) = r et (r), z > L The scattered field inside the material is also formally defined by (9.35). In the case of homogeneous slabs, a single incident plane wave gives rise to one reflected and one transmitted plane wave, all having the same tangential component of the wave vector. For a metamaterial slab, lattice periodicity leads to additional surface waves analyzed by V. A. Markel in [MS12, MT13]. Under the natural “Bragg constraint” that the lattice periods in the directions along the slab are smaller than the free-space wavelength λ, these surface waves are evanescent and localized near the boundary.12 It is natural to approximate fine-level fields via a basis set of Bloch waves traveling in different directions: eα (r) = e˜ α (r) exp(iqα · r) , hα = h˜ α (r) exp(iqα · r)

(9.37)

where index α labels both the wave vector and the polarization state of the Bloch wave; e˜ α (r), h˜ α (r) are the respective lattice-periodic factors. It is assumed that, for practical purposes, the Bloch modes can be computed numerically. In practice, for a 2D problem (s- or p-modes) one may use, say, 8 basis waves propagating at the angles lπ/4, l = 0, 1, . . . , 7; for a generic 3D problem, one could take, as a minimum, 12 waves traveling in the ±x, ±y, ±z directions, with two possible polarizations per direction. Bases smaller or bigger than that are possible, with some trade-offs discussed in Sect. 9.3.7.6. In [TM14], we considered cell-wise approximations of the fields, whereby the Bloch bases could in principle be different in different cells. Equation (9.37) is a streamlined version of the respective equation in [TM14]: It is now assumed that the Bloch bases are in fact the same in all cells—that is, the Bloch approximations span the whole metamaterial sample. This simplifies the exposition and notation, with little practical difference. For compactness of expressions, we merge the e and h fields into a single mathematical vector (9.38) ψα (r) = {˜eα (r), h˜ α (r)}

12 Otherwise,

surface waves become propagating in free space, and the transmission and reflection coefficients lose their traditional meaning.

9.3 Homogenization

597

On the coarse scale, a natural counterpart of the fine-scale Bloch basis is a set of generalized plane waves α = {Eα , Hα } = {E0α , H0α } exp(iqα · r)

(9.39)

which satisfy Maxwell’s equations in a homogeneous but possibly anisotropic medium; subscript “0” indicates the field amplitudes to be determined. The abstract vector of the coarse field  = {E, H} within the slab can be approximated as (r) =



cα α (r) , α (r) ≡ {E0α , H0α } exp(iqα · r) ,

(9.40)

α

The key requirement for constructing coarse-level fields is that they satisfy Maxwell’s equations and all interface boundary conditions as accurately as possible. Unless the metamaterial is homogeneous to begin with, some approximation errors are unavoidable and need to be minimized. We consider a two-step optimization procedure: (i) Determine the amplitudes Eα , Hα for which the interface boundary conditions are satisfied as accurately as possible (for the chosen wave basis). (ii) Determine the material tensor for which Maxwell’s equations within the sample are satisfied as accurately as possible (for the chosen wave basis). An implementation of these steps can be described semi-formally as follows (a more formal description can be found in [TM14]). First of all, to avoid unnecessary phase errors, one takes the coarse-level wave vector for each plane wave α to be the same as its counterpart for the corresponding Bloch wave, which is already reflected in our notation above (9.37), (9.39). Next, we note that along the interface boundary eτ α (τ ) = e˜ατ exp(iqατ τ ); E τ α (τ ) = E ατ 0 exp(iqατ τ )

(9.41)

τ being the tangential coordinate in the plane of incidence. Completely similar equations hold for h τ , Hτ . Since the exponential factor exp(iqατ τ ) is common for both fine- and coarse-scale fields at the boundary, it is obvious that the amplitudes of the plane waves should be defined as E ατ 0 = e˜ατ Γ (C)

(9.42)

where Γ (C) is the interface between a boundary lattice cell C and air, and the angle brackets indicate the boundary average (not volume average!). Obviously, a similar expression holds for the magnetic field: Hατ 0 = h˜ ατ Γ (C)

(9.43)

Note again that the averages in (9.42), (9.43) involve the periodic factors of the Bloch wave (since the complex exponential factor is the same on the fine and coarse scales).

598

9 Metamaterials and Their Parameters

This completes step (i) of the homogenization procedure—-approximation of Maxwell’s boundary conditions. We now move to step (ii)—approximation of the dispersion relations, i.e. Maxwell’s equations in the bulk of the material on the coarse level. Omitting the formal mathematical derivation of [TM14], let us fast-forward to the final result—a system of algebraic equations admitting an intuitive interpretation (Fig. 9.11). Each column of the rightmost rectangular matrix (call it  E H ) corresponds to a given coarse-level basis function (mode) α, and the entries of that column are the x yz-components of the wave amplitudes E0α , H0α . The number of columns n is equal to the chosen number of basis functions. The leftmost rectangular matrix (call it  D B ) is completely analogous to  E H and contains the DB amplitudes derived from Maxwell’s curl equations, viz.: B0α = k0−1 qα × E0α , D0α = −k0−1 qα × H0α

(9.44)

The (local) material tensor is represented by a 6 × 6 matrix. Since the number of columns in matrix  E H is typically greater than the number of rows, the matrix equation for the material tensor is solved in the least square sense: M =  D B  E+H ;

δl.s. =  D B − M E H

(9.45)

where  E+H is the Moore–Penrose pseudoinverse of  E H , δl.s. is the associated least square error, and (9.46) δl.s. =  D B − M E H · is the matrix 2-norm. Some trade-offs related to the choice of the bases can be inferred from the expressions above and from Fig. 9.11. If more functions are included in the basis, then fine-scale fields can, generally speaking, be approximated more accurately; but, as a trade-off, matrices  D B and  E H become “more rectangular,” leading in general to a higher least square error δl.s. . A high error indicates the existence of many physically different modes not simultaneously and accurately representable by the same material tensor. This is not a deficiency of the proposed methodology but rather a principal limitation of local homogenization due to the complexity of fields in a lattice cell. An alternative intuitive interpretation of (9.45) and Fig. 9.11 is in terms of “information compression.” Namely, (partial) information about fields in the lattice cell is contained in the chosen set of n-modes. If these modes are significantly different and n  6, this information cannot be compressed without loss to a 6 × 6 matrix. From these qualitative observations, one can see two ways of reducing the error (9.45)—that is, improving the accuracy of homogenization. First, one can reduce the number of columns in  D B and  E H , thereby limiting the set of basis modes. The material tensor will then be tailored to a subset of illumination conditions representable by those modes. An extreme example is consideration of normal incidence only, in which case the matrices will have only four columns (two opposite direc-

9.3 Homogenization

599

c Fig. 9.11 (Reprinted from [Tsu17] 2017 with permission from Elsevier.) A schematic representation of the matrix equation  D B =l.s. M E H . “l.s.” stands for “least squares”

tions of propagation and two polarizations), and infinitely many material tensors will render δl.s. = 0. A complementary possibility of reducing the error is to expand the number of rows in the matrices. This amounts to considering, in addition to field amplitudes, other degrees of freedom—e.g. integral ones. This way of making  E H “more square” leads to non-local homogenization (Sects. 9.3.7.6 and 9.3.7.7). Remark 30 The fine-scale and coarse-scale bases used here contain so-called Trefftz functions, which, by definition, satisfy the underlying homogeneous differential equation (in electrodynamics, source-free Maxwell’s equations). On the fine scale, these are Bloch waves traveling in different directions, and on the coarse scale—the respective generalized plane waves. In computational and applied mathematics, Trefftz approximations are gaining popularity due to their excellent accuracy in many circumstances (R. Hiptmair, A. Moiola et al. [HMP11, MHP11, HMP13, HMP16b, PSHT16], as well as my publications with collaborators [KSE+15, EKS+15, MTC17, TMCM19]). Correspondingly, we often refer to the method described here as Trefftz homogenization.

9.3.7.6

Non-local Homogenization

This section is a brief summary of my paper [Tsu17]. A non-local material relationship between two fields is typically written in the form  E(r, r ) E(r ) d (9.47) D(r) = 

600

9 Metamaterials and Their Parameters

where E is a convolution kernel and  is the region occupied by the material in question. A local relationship is an obvious particular case if the kernel is the Dirac delta.13 In the physical literature, non-local models are almost exclusively derived and used in Fourier space (k-space). This is common, in particular, in plasma physics (V. L. Ginzburg [Gin62], Yu. A. Il’inskii and L. V. Keldysh [IK13]), crystal optics (V. M. Agranovich and V. L. Ginzburg [AG84]), and plasmonics (C. David and F. J. Garcia de Abajo [DGdA11]). However, the use of Fourier transforms relies on translational invariance of the convolution kernel, which does not hold in the presence of interface boundaries One helpful idea, due to V. A. Markel [Mar16c], is to restrict the dependence of the integral kernel in the vicinity of a boundary to the tangential directions τ only: E(r, r ) = E(nˆ × r, nˆ × r )

(9.48)

A natural choice is a Gaussian E(r, r ) = E0 exp(−|nˆ × (r − r )|2 /τ02 )

(9.49)

where the amplitude E0 and width τ0 are adjustable parameters, and nˆ is the unit normal vector. Although differential degrees of freedom such as spatial derivatives of the fields are in principle possible, they unfortunately lead to higher-order equations requiring additional non-trivial boundary conditions. A schematic representation of the non-local model in matrix form is shown in Fig. 9.12. The key amendment to the local model (Fig. 9.11) is additional degrees of freedom leading to the corresponding expansion of  E H downward and of the material tensor rightward. The tensor thus acquires, in addition to its local part, a non-local one. This is illustrated with an example in Sect. 9.3.7.7.

9.3.7.7

Homogenization of Layered Media

We consider a standard generic setup for wave propagation in a layered medium (Fig. 9.13). A plane wave impinges on a stack of layers at an angle θinc . Each of the layers is characterized by scalar linear parameters l , μl (l = 1, 2, . . . , N ) and has a given width wl . The incident wave gets partly reflected, at the angle θr = θinc , and partly transmitted through the stack, at an angle θt . (If the media on both sides of the stack are the same, then θt = θinc .) As usual, an incident s-polarized wave has the form e(n, τ ) = ez (n, τ ) = e0inc exp(inkn ) exp(iτ kτ ), and a p-polarized wave is h(n, τ ) = h z (n, τ ) = h 0inc exp(inkn ) exp(iτ kτ ). The coordinate system (n, τ , z) is shown in the figure; other symbols in the above expressions have their usual meaning. The wave equation in a layered medium admits a well-known analytical solution. First, one notes that the tangential component kτ of the wave vector must be the same 13 Written,

with the usual abuse of notation, as an integral rather than a linear functional.

9.3 Homogenization

601

c Fig. 9.12 Reprinted from [Tsu17] 2017 with permission from Elsevier. A schematic representation of the matrix equation  D B =l.s. M E H in the non-local case. The equation is analogous to that of Fig. 9.11, but the matrices are expanded, as described in [Tsu17] Fig. 9.13 Generic setup for wave propagation in a layered medium

in all regions, due to the continuity of the tangential components of the e and h fields across all interfaces. Consider a given kτ , dictated by the incident wave. In each layer l, there are two waves, backward and forward, with the normal components of the wave vector 1 knl = ±(k02 l μl − kτ2 ) 2 . Thus, there is a pair of undetermined wave amplitudes in each layer. Maxwell’s boundary conditions prescribe a linear relation between these pairs of amplitudes in two consecutive layers. This linear relation can be expressed via a transfer matrix for a given layer. The transfer matrix for the whole structure is a product of these layer-wise transfer matrices. Expressions for these matrices are widely available; see, e.g., P. Yeh [Yeh05] and Sect. 8.4. A related but more complicated case of a chiral slab is presented in Sect. 9.7. Despite the availability and relative simplicity of these analytical results, homogenization of layered media is notoriously difficult. Papers on this subject continue to be published: A. V. Chebykin et al. [COS+12], R.-L. Chern [Che13], L. Sun et al. [SLL+15], A. Maurel and J.-J. Marigo [MM18], V. Popov et al. [PLN16], H. Sheinfux et al. [HSKP+14], S. V. Zhukovsky et al. [ZAT+15], A. Andryieuski et al. [ALZ15]. Interesting examples from the last three of these papers are considered in Sect. 9.3.7.8.

602

9 Metamaterials and Their Parameters

In [MT13, TM14], we introduced several benchmark examples of layered media and explored their non-asymptotic but local homogenization. In [Tsu17], Example A of these papers is extended to non-local homogenization. In that example, the layered medium consists of a finite number of stacked inversion-symmetric lattice cells, each of which contains three intrinsically non-magnetic layers of widths a/4, a/2 and a/4, and scalar permittivities 1 , 2 , and 1 , respectively (1 = 4 + 0.1i and 2 = 1). We assume that λ is fixed and a changes; s-polarization (one-component electric field and two-component magnetic field) is considered; results for p-polarization are qualitatively similar. Integral kernel degrees of freedom in non-local homogenization are introduced as described in Sect. 9.3.7.7, with the kernel dependent only on the tangential coordinate τ (i.e. on the direction along the layers). The two additional dof are of the form    R W (τ )G(τ )dτ , where W is either E or Hτ , and     τ 2 1 exp − G = √ τ0 πτ0

(9.50)

is a Gaussian kernel with a given width parameter τ0 and area normalized to unity. The respective non-local tensor can be represented as a 3 × 6 matrix (see Fig. 9.12 as a visual aid). Figures 9.14 and 9.15 show the absolute errors in the reflection coefficient as a function of a/λ in the first photonic band and of sin θinc for a/λ = 0.2, respectively, for a metamaterial slab with a 10-cell thickness in the normal direction. To avoid any dubious parameter fitting, the integral kernel width τ0 was simply set equal to the cell size a. It can be seen that the errors of the non-local homogenization are far lower than those of local homogenization and of the standard static (asymptotic) homogenization. I am not aware of any method with a level of accuracy for transmission and reflection comparable with that of Fig. 9.14; the published results improve the accuracy only of the dispersion relation at best. Further numerical results are presented in the following section.

Fig. 9.14 Example A of [TM14, MT13]. Absolute error in R versus free-space wavelength

9.3 Homogenization

603

Fig. 9.15 Example A of [TM14, MT13]. Absolute error in R versus angle

9.3.7.8

Example: A “Breakdown” of Effective Medium Theory and the Role of Magnetoelectric Coupling

As an instructive example, let us consider the “effective medium theory breakdown” reported by H. Sheinfux et al. in 2014 [HSKP+14]. The setup is quite simple: a periodic layered medium whose lattice cell contains two dielectric layers “a” and “b” of equal width (da = db ≡ d ≡ a/2 = 10 nm) with the respective dielectric permittivities a = 5, b = 1. The free-space wavelength λ = 500 nm; s-mode is analyzed, with a one-component E field parallel to the layers and a two-component H field. The periodic structure under consideration has a finite number N of lattice cells and is immersed in dielectric media with the permittivity in = 4 on the illumination side and out = 3 on the transmission side. The effective permittivity of the medium in the static limit (a/λ → 0) is, for the s-mode, the simple average stat = (a da + b db )/a = (5 + 1)/2 = 3. In this limit, since in > stat , total internal reflection manifests itself for the angle of incidence greater than the critical angle θcrit = sin

−1

√ n stat 3 −1 = 60◦ = sin n in 2

Notably, once the angle of incidence reaches the critical value, the wave in the “b” layer (b = 1) becomes evanescent, while the “a” layer wave (a = 5) is still a propagating one. More specifically, at the critical angle, the normal component of the wavenumber knb in layer “b” is imaginary: knb =

kb2



[∗] kτ2 =



k0 b − stat

√ √ 2πi 2 = ik0 2 = λ

[*] The tangential component of the wave vector, the same for all layers, at the critical angle is equal to kτ = k0 n stat ; i.e. kτ2 = k02 stat .

604

9 Metamaterials and Their Parameters

Across√the “b” layer, this evanescent wave gets attenuated by the factor exp(−2π 2d/λ) ≈ 0.84, which is appreciable. The ratio d/λ is 10/500 = 0.02, which Sheinfux et al. refer to as “deeply subwavelength.” Note, though, that freespace waves do not actually exist in this example; a more relevant wavelength is λ/n in , and the more relevant dimension is the cell size a rather than the width of an individual layer.14 Hence, the relevant small parameter in this example is a˜ = n in a/λ = 0.08. Whether this normalized dimension qualifies as deeply subwavelength depends on one’s definition of the word “deeply.” Be it as it may, the phenomenon uncovered by Sheinfux et al. is interesting and instructive. The actual reflection and transmission coefficients (TR) through the slab with N lattice cells at the angles of incidence close to critical are quite different from the ones for the rversusespective slab with the static permittivity stat . Moreover, the TR coefficients depend on the side of illumination (or, equivalently, on the order of layers, a-b-a-b... versus b-a-b-a...)—this distinction is lost entirely in classical homogenization. In sum, the classical homogenization limit gives inaccurate predictions in this case. In hindsight, this is not so surprising, since wave behavior near the critical angle is quite special and, furthermore, the relative lattice cell size a˜ = 0.08 is not really negligible. Detailed analytical calculations are presented by X. Lei [LMLW17], and experimental verification of the “breakdown” in a similar setting, along with analysis, is given by A. Andryieuski et al. [ALZ15] and S. V. Zhukovsky et al. [ZAT+15]. Rather than discussing the “breakdown” per se, it is instructive to view this special case as a challenge to our non-asymptotic and non-local homogenization procedures. One principal issue is noted above: The layered sample lacks inversion symmetry, the transmission and reflection coefficients in general depend on which side the sample is illuminated from, while after homogenization this asymmetry seems to be lost. Thus, it appears that the lack of symmetry imposes principal constraints on the accuracy of homogenization. We shall see now that this conclusion is incorrect; moreover, our homogenization procedure of Sect. 9.3.7.5 automatically produces a qualitatively and quantitatively correct description in the non-symmetric case (Fig. 9.16). To see how this comes about, let us recall that the end result of our local homogenization is a tensor M which, for the s-mode in the case under consideration, can be represented by a 3 × 3 matrix M: ⎞ ⎛ ⎞ E D ⎝ Bn ⎠ = M ⎝ Hn ⎠ , Bτ Hτ ⎛



⎞  ζ12 ζ13 M = ⎝ζ21 μn μnτ ⎠ ζ31 μτ n μτ

(9.51)

Two points are in order. First, the ζ entries in the first row and first column of M represent magnetoelectric coupling. Secondly, one must distinguish between this tensor as an abstract coordinate-independent object and its representation in 14 Since any lattice cell can be conceptually, or even physically, subdivided into any number of layers, using the width of a single layer as the characteristic dimension is dubious.

9.3 Homogenization

605

Fig. 9.16 A layered sample: the geometric setup. The lattice cell of the periodic structure contains two layers, “a” and “b”. The coordinate systems (n, τ , z) and (n  , τ  , z) are, by definition, righthanded

a particular coordinate system. The former, from the mathematical perspective, is a linear map between vector fields or, alternatively, a bilinear form that takes two vectors (in our case, complex vectors in C3 ) and produces a (complex) number. The M tensor has different matrix representations in the coordinate systems (n, τ , z) and (n  , τ  , z). Specifically, the ζ entries will have opposite signs in the two systems. This can be deduced from general coordinate transformations or simply by noting that n  = −n, τ  = −τ , but z ≡ z, so any entry that couples the z component with either n or τ will have its sign reversed upon the (n, τ , z) → (n  , τ  , z) transformation. It is the presence of magnetoelectric coupling that accounts for the symmetry breaking in the homogenized sample. To illustrate this, let us turn to specific numerical results. “Trefftz homogenization” of Sect. 9.3.7.5 is applied to the problem. The Trefftz basis on the fine level contained Bloch waves with the tangential components of the Bloch wave vector k0−1 qτ = m

n stat , nq − 1

m = 0, 1, . . . , n q − 1

To each of these values of qτ , there correspond two Bloch waves in the forward and backward directions through the slab; hence, the total size of the Trefftz basis is 2n q . On the coarse level, the Trefftz basis consists of the respective 2n q generalized plane waves, as described in Sect. 9.3.7.5. For n q ≥ 5, the numerical results are virtually independent of n q ; n q = 7 was chosen in all numerical simulations below. Shown in Fig. 9.17 are the real and imaginary parts of the reflection coefficient R versus the number of lattice cells in the slab. (Since each cell in this example contains two layers, the total number of layers is double the number of cells.) The “tensor 3 × 3” lines correspond to non-asymptotic homogenization, “tensor 3 × 5”—to nonlocal homogenization. In the non-local case, in addition to the field amplitudes E 0 , Hn0 , Hτ 0 , the two extra degrees of freedom were convolutions of the fine-scale basis waves e(τ ) and h τ (τ ) with the Gaussian kernel (9.49) of width a. As can be seen from Fig. 9.17, non-asymptotic homogenization gives almost perfect results for the reflection coefficient. Notably, when the order of layers is reversed (b-a-b-a... rather than a-b-a-b...), the results remain equally accurate, even though the value of R changes significantly, in agreement with the analysis of Sheinfux et al. Moreover, numerical results for the transmitted power are also quite accurate,

606

9 Metamaterials and Their Parameters

Fig. 9.17 Real and imaginary parts of the reflection coefficient R for the “EMT breakdown” example of H. Sheinfux et al. [HSKP+14], at the angle very close to critical (sin θinc = 0.999 sin θcrit ). Trefftz homogenization yields almost perfect results, while the results for static tensor (green markers) are qualitatively incorrect. Both the original and reversed orders of layers are represented accurately due to the magnetoelectric coupling in the effective material tensor. The geometric and material parameters are indicated in the text and in the figure

for both the original and reversed orders of layers (Fig. 9.18). Non-local homogenization can be viewed as overkill in this case, since local non-asymptotic homogenization already yields near-perfect accuracy. At the same time, static homogenization (green markers) is qualitatively incorrect. Remark 31 The reflection coefficient is defined in the standard way, as the ratio of the E field amplitudes of the reflected and incident waves: R = E 0r /E 0,inc . The transmission coefficient for the field amplitudes is defined analogously. However, there is a caveat for the transmitted power. When the media on the incoming and transmitted side are optically different (as is the case in our current example), the

9.3 Homogenization

607

Fig. 9.18 Power transmission coefficient Tpower for the “EMT breakdown” example of H. Sheinfux et al. [HSKP+14], at the angle very close to critical (sin θinc = 0.999 sin θcrit ). Trefftz homogenization yields almost perfect results. Both the original and reversed orders of layers are represented accurately due to the magnetoelectric coupling in the effective material tensor. The static tensor (green markers) yields qualitatively incorrect results

angles of incident and transmission are different as well; hence, power is concentrated in the beams of different widths. If the transmitted beam is narrower (which is the case for n out < n in ), the Poynting vector may, counterintuitively, exceed unity. The energy conservation relation reads (F. A. Jenkins and H. E. White [JW76, §25.2]): |R|2 + |T |2

n out cos θout = 1 n in cos θin

(9.52)

Plotted in Fig. 9.18 is the coefficient Tpower ≡ |T |2

n out cos θout n in cos θin

(9.53)

which cannot exceed unity. One cannot help but note that the field transmission coefficients T for the original (“ab”) and reversed (“ba”) order of layers are different, Tab = Tba . We will return to this interesting observation in Sect. 9.3.8. Obviously, it is instructive to see how the effective material tensor differs from the classical static one. Here are two samples. For a/λ = 0.04, as in the original case due to H. Sheinfux et al., ⎛

M0.04

3.02 0 1 ≈ ⎝ 0 0.128i 0

⎞ − 0.128i ⎠ 0 1

(9.54)

608

9 Metamaterials and Their Parameters

For a/λ = 0.08 (twice shorter wavelength), ⎛

M0.08

3.09 0 1 ≈ ⎝ 0 0.268i 0

⎞ − 0.268i ⎠ 0 1

(9.55)

(The numerical values are rounded off to three digits, to eliminate numerical “noise.”) We observe that the effective permittivity differs slightly from its static value of 3, but the key feature of the non-asymptotic tensor is the presence of the magnetoelectric terms coupling the D and Hτ fields. This relation can be written in coordinateindependent form: (9.56) D = E − i ξna→b × H na→b × B = i ξE + μτ na→b × H

(9.57)

na→b is the unit vector a → b

(9.58)

where

It is worth mentioning that the presence of magnetoelectric coupling in this case does not indicate optical activity; polarization of the wave is not affected, and the wave remains in s-mode everywhere. In contrast, constitutive relations for optically active media typically include a contribution to the D field in the direction of the B field, as well as a contribution to the B field in the direction of the E field (C. F. Bohren [Boh74], S. Bassiri, C. H. Papas and N. Engheta, D. L. Jaggard [BPE88, EJ88]). Several curves are plotted in Fig. 9.19. Perhaps of most interest is the deviation of the non-asymptotic tensor from the static one, M = M − diag(3, 1, 1). Two

Fig. 9.19 Deviation of the non-asymptotic tensor from the static one, in the example due to H. Sheinfux et al. Also shown is the homogenization error estimate (9.46)

9.3 Homogenization

609

Fig. 9.20 Errors in the reflection and transmission coefficients versus a/λ, in the example due to H. Sheinfux et al.; sin θinc = 0.5 sin θcrit High accuracy of homogenized models with ME coupling is manifest. The static tensor (green markers) yields qualitatively incorrect results, except for very long wavelengths (small a/λ)

measures of this deviation are shown in the figure: the mean and the maximum value of Mαβ , α, β = 1, 2, 3. The conclusion is not surprising: For a/λ → 0, the static limit is recovered; but for larger cells, the non-asymptotic tensor deviates substantially from the static case, primarily due to the magnetoelectric coupling terms. The homogenization error estimate (9.46) is also plotted in Fig. 9.19 and behaves as expected for shorter and longer wavelengths. Note the logarithmic scale in the figure. Error plots in Fig. 9.20 demonstrate the high accuracy of homogenized models with ME coupling. Let us now return to the salient difference between the transmission coefficients Tab and Tba for the original and reversed order of the layers in the lattice cell (Fig. 9.18). Alternatively, one may consider illumination of the original structure from two opposite sides. A natural question is whether the inequality Tab = Tba violates the reciprocity principle. In the original “ab” configuration, the incoming wave impinges on the layered structure at an angle θinc and is transmitted at some angle θt ; since the media on the two sides are different, θinc = θt . In the reciprocal case, the wave would impinge on the reversed (“ba”) structure at θt and exit at θinc . Simply flipping the structure while keeping the angle of incidence unchanged does not constitute a reciprocal case; hence, reciprocity is not violated. The above analysis seems satisfactory at first glance but upon closer inspection turns out to be incomplete. Direct calculation shows that Tab = Tba even at normal incidence, θinc = θt = 0. What gives? A more careful consideration [Tsu20b] yields the following reciprocity relation at normal incidence:

610

9 Metamaterials and Their Parameters

n out Tab = n in Tba

(9.59)

n out E ab,t E ba,inc = n in E ba,t E ab,inc

(9.60)

This can be rewritten as

Squaring this relation and noting that power flow (the Poynting vector) S is proportional to n|E|2 for a real index n, we have Sab,t Sba,inc = Sba,t Sab,inc or

power

Tab

power

= Tba

(9.61)

That is, if the ambient media on the two sides of the structure are different, then reciprocity holds in its standard form for power transmission rather than for the fields.15

9.3.8 The Uncertainty Principle This section follows in part my joint paper with V. A. Markel [TM16]. There is ample evidence in the existing literature that effective parameters of metamaterials may have limited accuracy and validity. As an instructive example (C. R. Simovski [Sim09]), the celebrated negative-index metamaterial due to D. R. Smith et al. [SPV+00] cannot be homogenized for a range of wavelengths in the vicinity of the second Γ -point, even though these wavelengths are relatively long (a/λ ∼ 0.1). Another notable example is the work of the Jena and Lyngby groups (C. Menzel et al. [MRIL10]), who show that high symmetry of a metamaterial cell does not imply optical isotropy, especially in frequency ranges where the effective index is negative. As noted in Sect. 9.3.1, magnetic characteristics of metamaterials become trivial in the zero-cell-size limit (assuming that the intrinsic material parameters remain bounded). Thus, a strong magnetic response can only be achieved if the cell size forms an appreciable fraction of the vacuum wavelength. In our papers [MT13, TM14], we brought to the fore an interplay between magnetic response, the accuracy of homogenization and the range of illumination conditions. In [TM16], we further argue that not only negative index, but a strong magnetic response must unfortunately be accompanied by lower accuracy of effective parameters. We call this an “uncertainty principle” (UP) of local homogenization. Let M be an effective material tensor found using any method (say, parameter retrieval as the most common 15 I

thank C. T. Chan for pointing out to me that (9.61) follows almost immediately from (9.59).

9.3 Homogenization

611

example). To avoid unnecessary mathematical complications and to keep our focus on the physical essence of the problem, we present our analysis of the UP for the s-mode (the E field in the z direction, the H field in the x y-plane), with a plane wave impinging in the x y-plane on a half-space filled with a metamaterial, i.e. a dielectric structure characterized by a permittivity (r) periodic in the x and y directions with the same (for simplicity) lattice constant a. A Bloch wave with a wave vector q is e B (r, q) = e˜ B (r) exp(iq · r)

(9.62)

Subscript “B” in this section indicates a Bloch wave-related quantity. The tangential component of the respective h field is h(r) =

1 ∂e = h˜ B (r) exp(iq · r), ik ∂n

(9.63)

(only the tangential component is used in the analysis, and therefore subscript “τ ” is dropped for brevity of notation). The periodic factor for the magnetic field is qn 1 ∂ e˜ B (r) e˜ B (r) + h˜ B (r) = k ik ∂n

(9.64)

We now compare wave propagation from the air into a half-space filled with (i) A metamaterial. (ii) A homogeneous medium, with its material tensor yet to be determined to minimize the TR-discrepancy.16 In both cases (i) and (ii), the field in the air is given by   eair (r) = E inc exp(ikinc · r) + R exp(ikr · r)

(9.65)

h air (r) = E inc cos θinc (exp(ikinc · r) − R exp(ikr · r))

(9.66)

where the tangential component is again implied for h. It will be convenient to assume that the reflection coefficient R is exactly the same in cases (i) and (ii). Strictly speaking, there can be (and in practice will be) some approximation tolerance; however, introducing this tolerance explicitly would obscure the analysis while adding little to its physical substance. In the case of a metamaterial, we ignore the surface wave. Then, the field in the metamaterial is just the Bloch wave (9.62), (9.63). This should be compared with the transmitted wave in the homogeneous half-space: E T (r) = E T 0 exp(ikT · r)

(9.67)

16 Transmission is easy to define only for a finite-thickness slab; for a half-space, the focus is on reflection as a function of the angle of incidence.

612

9 Metamaterials and Their Parameters

HT (r) = HT 0 exp(ikT · r)

(9.68)

Phase matching between (9.67) and (9.62) implies that for best approximation one must have (9.69) kT = q Further, due to the boundary conditions at the material/air interface, the amplitudes of the transmitted wave in the equivalent homogenized medium must be E T 0 = e˜ B Γ

(= (1 + R)E inc )

HT 0 = h˜ B Γ

(9.70) (9.71)

where , ·, Γ denotes the air/cell boundary average. Indeed, if, say, condition (9.70) were to be violated, then in the homogenized case there would be a spurious jump of the E field across the air/material interface, with the nonzero mean E T 0 − e˜ B Γ = E T 0 − (1 + R)E inc (assuming that the interface boundary is at n = 0). This jump will result in a commensurate far-field error. Remark 32 We require that boundary conditions hold in the sense of averages (9.70), (9.71) rather than pointwise because zero-mean discrepancies between a Bloch wave and a plane wave at the boundary are unavoidable. Indeed, the Bloch wave in an inhomogeneous medium has higher-order spatial harmonics that cannot be matched by a plane wave. Conditions (9.70), (9.71) ensure that the discrepancy between the Bloch field on the material side and plane waves on the air side affects only the near field, as long as a < λ. Now that the field amplitudes in the homogenized material have been determined, we can find the material tensor for which the dispersion relation (in essence, Maxwell’s equations) will be satisfied. We will be primarily interested in the case of fourfold (C4 group) symmetry, which is particularly instructive. For C4 cells, the material tensor is diagonal (in particular, there is no magnetoelectric coupling) and, moreover, μτ τ = μnn . In the remainder, we shall focus on the μτ τ entry of the tensor. Maxwell’s ∇ × E-equation for the generalized plane wave (9.67), (9.68) gives the amplitude of the tangential component of the B field in this wave: BT 0 =

1 kT n E T 0 ik

(9.72)

or, substituting kT = q from (9.69) and E T 0 from (9.70), BT 0 =

qn e˜ B Γ k

(9.73)

9.3 Homogenization

613

This, along with Eq. (9.71) for the amplitude of HT , leads to the following expression for the effective magnetic permeability: μτ τ =

BT 0 qn e˜ B Γ qn e˜ B Γ = = ˜ HT 0 q  e ˜  kh B Γ n B Γ − i∂n e˜ B Γ

(9.74)

where we inserted expression (9.64) for h˜ B . Switching for algebraic convenience from permeability to reluctivity, we arrive at the following surprisingly simple expression: i ∂n e˜ B Γ i ∂n e˜ B Γ = (9.75) ζτ τ ≡ 1 − μ−1 ττ = qn e˜ B Γ q cos θ B e˜ B Γ where θ B is the propagation angle for the Bloch wave. It is instructive to split e˜ B in (9.75) into its mean value e0 and zero mean eZM ,  e˜ B ≡ e0 + eZM , e0 = const, eZM dC = 0 C

Then, (9.75) becomes ζτ τ =

i ∂n eZM Γ q cos θ B (e0 + eZM Γ )

(9.76)

It becomes immediately clear that magnetic effects in metamaterials are due entirely to higher-order spatial harmonics of the Bloch wave, manifested in eZM . (If eZM = 0, the Bloch wave is just a plane wave, and ζτ τ = 0.) To avoid any misunderstanding, note that eZM by definition has a zero average in the volume of the cell but in general not on its surface, which makes all the difference in (9.76). The behavior of Bloch waves in inhomogeneous lattice cells is complicated, and there are no simple closed-form expressions for these waves. (Approximations are well known but exist only as formal solutions of large linear systems developed in a finite basis, e.g. in a plane wave basis.) From the qualitative physical perspective, however, one may conclude that, due to the complex dependence of eZM on θ B (i.e. on the illumination conditions), ζτ τ must in general be angle-dependent. Moreover, this angular dependence will tend to be stronger when the magnetic effects (nonzero ζτ τ ) are themselves stronger, as both are controlled by eZM . This conclusion can also be supported quantitatively [TM16] but does not have the status of a mathematical theorem; the door is therefore still open for engineering design and optimization, with a compromise between the strength of magnetic response and homogenization accuracy.

614

9 Metamaterials and Their Parameters

9.3.9 Polarization, Magnetization, and Classical Effective Medium Theories [W]e must ... go beyond a focus on charge densities, instead considering currents, ... to arrive at a suitable definition of electric polarization. David Vanderbilt, “Berry Phases in Electronic Structure Theory,” Cambridge University Press (2018).

9.3.9.1

Motivation

A natural inclination in homogenization of metamaterials is to translate the familiar notions of classical effective medium theories to periodic electromagnetic structures. However, extreme caution should be exercised if this path is followed. For example, as we have seen, volume averaging of fine-scale fields and sources is a questionable way of defining coarse-scale fields. This section addresses an even more fundamental question: how can polarization and magnetization be rigorously defined, and to what extent are their classical definitions valid? In metamaterials, we are obligated to construct these quantities in a “hands-on” way, from the fundamental principles; this may teach us a lesson about the proper treatment of natural materials in macroscopic electrodynamics as well.

9.3.9.2

“Dipole Moment per Unit Volume”

Textbook definitions of polarization and magnetization almost invariably involve the notion of “dipole moment per unit volume.” Let us examine this more closely. The standard notion of “moment” is the integral of a given quantity multiplied with the position vector.17 If one follows this definition, the dipole moment in a volume V becomes  pV = rρ d V (9.77) V

Contrary to the standard claims—explicit or tacit—in many textbooks, this expression does not lead to a valid definition of polarization as the “moment per unit volume.” Indeed, consider  pV 1 rρd V (9.78)  PV ≡ = V V V 17 Or,

in general, with an independent variable, as is the case, e.g. in probability and statistics.

9.3 Homogenization

615

The cross-out ( P) indicates quantities that will prove not to be physically meaningful and useful. First, a very natural attempt to pass to the limit V → 0 yields a trivial and meaningless result  Plim ≡ lim  PV = rρ V →0

Clearly, ∇·  Plim = ∇ · (rρ) = −ρ as would have been expected if  Plim were to be a physically valid vector of electric polarization. Also, ∇·  Plim = −ρ, where the angle brackets denote volume averaging. The situation is not any better if one attempts to define polarization as (9.78) for a given “physically small” volume V without passing to the limit; it is still obvious that the divergence condition will not hold. L. D. Landau and E. M. Lifshitz [LLP84] (L&L) first introduce polarization axiomatically, as a vector P such that ∇ · P = −ρ From this condition, L&L deduce that   rρd V = Pdv

(9.79)

(9.80)

where—importantly—the integral is taken over the whole volume of the dielectric. Unfortunately, this cannot qualify as a rigorous definition of polarization either. Indeed, since only the divergence of P is specified, this vector is not uniquely defined. L&L state that uniqueness is established by associating P with the dipole moment per unit volume, but, as we just saw, the latter has not been independently defined, so the logic is circular. Thus far, we considered a general charge distribution in the volume of the material. If one insists on finding a rigorous definition of the “dipole moment per unit volume,” progress can be made in a particular case of charge density represented by a collection of point (in reality, molecular-scale) dipoles. Then, an attempted constructive definition of polarization could be  P =

1  pi V V

(9.81)

where summation is over the dipole moments pi of all molecules inside a given mesoscopic volume V —large enough for meaningful averaging and yet small enough from the macroscale perspective. Note that V must represent a “moving” volume; i.e. it can be centered at an arbitrary point in space (otherwise (9.81) would not define polarization as a function of position). Despite its intuitive appeal, (9.81) does not constitute a rigorous definition and cannot be easily turned into one, as indicated again by the cross-out sign. Indeed, note first of all that if the boundary of volume V is allowed to cross through the

616

9 Metamaterials and Their Parameters

polarized (or polarizable) molecules of the medium, then the total charge inside V may be nonzero and hence the dipole moment of this volume will depend on the origin and thus will be ill-defined. On the other hand, if the boundary is not allowed to cross through the molecules, then the total charge inside is always zero and the usual condition ∇ · P = −ρ cannot be satisfied in any meaningful way. Even more importantly, any rigorous definition of P requires it to be a well-defined function of coordinates, so that its divergence could make mathematical and physical sense. To this end, (9.81) can be rewritten more explicitly as  P (r) = (rρ) ∗ W ≡

 R3

r ρ(r )W (r − r )dr

(9.82)

where W (x, y, z) is a “window” function equal to 1/L 3 within a cube of size L, [−L/2, L/2]3 , and zero outside (the integral of W over the cube is obviously equal to one). But divergence of (9.82) does not correspond to the charge density. This is easy to see if one recalls that differentiation can be applied to either factor of a convolution: ∇·  P (r) = [∇ · (rρ)] ∗ W

(9.83)

The expression in the square brackets bears almost no relation to the volume average ρ, and even less to −ρ, so defining polarization as (9.82) makes no sense. This is true for any choice of the W mollifier, not necessarily just a window function as initially assumed. Furthermore, it is doubtful that an unambiguous definition of polarization purely in terms of charge density is even possible at all. For example, the following postulates seem quite natural but do not in fact lead to a meaningful definition: • ∇ · P = −ρ • P depends only locally on ρ • P = 0 outside dielectric bodies. Indeed, for electrostatic fields inside a homogeneous medium ρ ≡ 0, and hence if P were to be derived from the average charge density locally, such “polarization” would have to be identically zero as well.

9.3.9.3

Constructive Definitions of Polarization: Russakoff’s Averaging

G. Russakoff [Rus70] describes a rigorous averaging procedure which follows the work of J. H. Van Vleck [Vle32], P. Mazur and B. R. A. Nijboer [MN53, Maz57], S. R. De Groot [Gro65] and others. I will refer to this whole body of work as “Russakoff’s averaging” or “Russakoff’s analysis.” This is done for brevity only, with no intention to diminish the fundamental contributions of other scientists. As a starting point, consider a single polarized molecule that can be represented as a few point charges around the origin:

9.3 Homogenization

617

ρ(r) =



qm δ(r − rm )

(9.84)

m

The notation is standard and self-evident. Note a subtle but critical feature of this model: It involves displacements of charges from a number of “equilibrium centers” rm rather than the absolute positions of these charges with respect to one common origin. A smoothed charge density is obtained by convolution with a suitable mollifier w (e.g. a sharp Gaussian function): ρ(r) = ρ ∗ w =



qm δ(r − rm ) ∗ w =

m



qm w(r − rm )

(9.85)

m

Since w is smooth, it can be expanded into a Taylor series around the origin with respect to rm : 1 w(r − rm ) = w(r) − rmα ∂α w(r) + rmα rmβ ∂αβ w(r) + h.o.t. 2 where the usual summation convention over repeated indices α, β(= x, y, z) is adopted; “h.o.t.” stands for higher-order terms. With this expansion, (9.85) becomes 1 2 w(r) + h.o.t. ρ(r) = qm w(r) − qm rmα ∂α w(r) + qm rmα rmβ ∂αβ 2 again with summation implied over α, β and also over m. If the total charge of the molecule is zero, the first term vanishes, and we have 1 2 w(r) + h.o.t. ρ(r) = −qm rmα ∂α w(r) + qm rmα rmβ ∂αβ 2

(9.86)

This can be rewritten in the desirable form ∇ · (P ∗ w) = P ∗ ∇w if, up to high-order terms, P is chosen as P(r) = −qm rm + qm

1 rm rmβ ∂β w(r) 2

(9.87)

Mathematically, it is always the case that the curl of an arbitrary smooth function can be added to P without changing ∇ · P = −ρ. However, the leading term in (9.87) can be justified on physical grounds: One can argue that the displacements rm are approximately proportional to the electric field, and hence a linear relation between the first term in P and the field will follow. At the same time, the second term will not be linear with respect to the field, which is troubling.

618

9.3.9.4

9 Metamaterials and Their Parameters

A Valid Definition of Polarization

Russakoff’s analysis outlined in the previous section is mathematically unassailable, yet not fully satisfactory, for the following reasons. • Russakoff’s setup is restrictive: It hinges on the representation of the medium as a collection of individual well-separated molecules. This is realistic only in special cases (e.g. molecular solids). Many materials do not lend themselves to an unambiguous subdivision into discretized charges; one example is ionic crystals such as sodium chloride. The complexity of electron charge density distributions on the atomic scale is explicitly illustrated by exquisite electron microscopy measurements [GAW+19, Fig. 3], by [RV07, Fig. 2], [YLJ+11, Fig. 5], and by Fig. 9.21 (reprinted from [FSA19, Fig. 5]).18 It is hard to see how Russakoff’s treatment could be applied in any such cases. Indeed, a natural question to ask when looking at these and similar figures is: Where are the discrete dipoles? • Higher-order moments, and hence the multipole expansion (9.87) as a whole, depend on the origin. It is often suggested that the origin should be placed at the “center of gravity” (or mass) of a given molecule. Since this recommendation is not meant to be taken literally (the masses of the individual charges are completely irrelevant in the present context), the proper (in some sense) choice of the origin remains an open question. R. E. Raab and O. L. de Lange [RdL04, RdL10] develop an intricate theory of material parameters that do not, to a certain order of accuracy, depend on the origin of multipole expansions. This is achieved by redefining the D and H fields in a rather cumbersome way.19 • Furthermore, it is legitimate to ask whether polarization can be defined in a meaningful and unambiguous way for any given distribution of charge density ρ(r) in space.20 Not surprisingly, the answer is “No”. Notably, though, this question may be reformulated in such a way that the answer becomes positive, showing that the partitioning of charges into discrete dipoles is superfluous. The analysis is given below. The issues above can be resolved and inconsistencies avoided if one takes a point of view that prior to the 1990s would have been considered completely unconventional. Let us treat polarization as a characteristic not of a single state of a physical medium but of a pair of states—one unperturbed and the other one perturbed. Consider a transient process that occurs when an external field is applied to a dielectric sample —or, in a more abstract setting, to a collection of charges treated as classical particles with known positions. There is an electric current J = J(t) such that ∇ · J = −∂t ρ the figure, the charge density is expressed in “electrons per cubic Bohr radius”; rBohr ≈ 5.29 · 10−11 m. 19 This is not to say that multipole expansions are not useful; of course they are, and there also are some interesting extensions “beyond multipoles” (N. A. Nemkov [NBF18]). However, “To every thing there is a season.” 20 Only classical (non-quantum mechanical) treatment of this problem is considered here. 18 In

9.3 Homogenization

619

Fig. 9.21 (From T. Fujima et al. [FSA19, Fig. 5]; open access under Creative Commons CC BY license. https://www.mdpi.com/ openaccess) Electron density distribution in an AlMgB14 -based material sample in the planes containing a Mg sites and b Al sites (details in [FSA19]). The unit cell includes four Al sites, four Mg sites, and 56 B sites. Where are the discrete dipoles? (See text)

Integrating this relation over time, ∇ · I = −δρ where I has the meaning of current but is a vector quantity defined simply as the integral of J over the time of transition from the original unperturbed state ρ0 to a perturbed one with ρ = ρ0 + δρ. Let us simply identify I with p—a fine-scale quantity from which polarization P can be obtained by filtering with a mollifier as in Russakoff’s approach.21 Then, the ambiguities and paradoxes noted above disappear. Polarization becomes a characteristic of a perturbation of a charge density. No approximation is made; the key relation ∇ · p = −δρ, and hence ∇ · P = −δρ, is exact if p = I. There is no need to grapple with the origin dependence of multipole expansions; these expansions are not in fact needed at all. Nor is there a need to group charges into “molecules” of any kind; it is sufficient to consider the (small) displacement of each elementary charge due, e.g., to the applied field. It is also clear now that Russakoff’s definition is valid precisely because it implicitly involves virtual displacements of charges relative to the molecular centers—in contrast to the failed attempt to define polarization via (9.78). Furthermore, the definition of P via current density applies, somewhat counterintuitively, to systems of charges that are not necessarily electrically neutral and ultimately even to a single charge.22 To illustrate this, consider a single point charge q that has been displaced from a position x0 to x0 + δx along the x-axis. It is then easy to see that

21 For

further details and the treatment of boundaries, see [Tsu20a]. am not arguing that the notion of polarization is necessarily useful in all such instances; only that this notion can be introduced without logical or physical contradictions.

22 I

620

9 Metamaterials and Their Parameters

p ≡ I = xq ˆ (x0 , x0 + δx) where (a, b) is the unit pulse function equal to one within [a, b] and zero outside that interval. Indeed, ∇ · p = ∂x p = q (δ(x − x0 ) − δ(x − x0 − δx)) = −δρ The definition of polarization outlined above relies on the classical notions of charges and charge density. A full quantum mechanical (QM) treatment has become known as the “modern theory of polarization” (A. Dal Corso et al. [DCPRB94], R. Resta and D. Vanderbilt [Res92, Res94a, Res94b, Res02, RV07, Res10, Res18]). This QM theory has been implemented in a number of ab initio software packages [Res18]. The downside is that this theory is valid, in its natural version, at zero temperature and for spontaneous polarization in the absence of applied fields. Hence, it seems sensible to consider the classical counterpart of this theory, which is simpler and not subject to the same restrictions, but obviously not without its own limitations [Tsu20a]. To conclude, here are a few quotes from [Res18], closely related to the list in Sect. 9.3.9.4. Contrary to a widespread incorrect belief, P has nothing to do with the periodic charge distribution of the polarized crystal: the former is essentially a property of the phase of the electronic wave function, while the latter is a property of its modulus. Analogously, the orbital term in M has nothing to do with the periodic current distribution in the magnetized crystal. The basic concepts of the modern theory of polarization have also started reaching a few textbooks ... though very slowly; most of them are still plagued with erroneous concepts and statements. A material is insulating, in principle, only at T = 0, hence the modern theory of polarization is intrinsically a T = 0 theory. The [Clausius–Mossotti] model applies only to the extreme case of molecular crystals, where the polarizable units can be unambiguously identified; for any other material—including the alkali halides—such decomposition is severely nonunique. ... the polarization difference is then equal to the time-integrated transient macroscopic current that flows through the insulating sample during the switching process. The popular textbooks typically attempt a microscopic definition of P in terms of the dipole moment per cell ..., but such approaches are deeply flawed ... Only a few undergraduate textbooks are free from such flaws.

9.3.9.5

The Clausius–Mossotti–Lorenz–Lorentz Model Revisited

It is instructive to establish a connection between our non-asymptotic homogenization and classical effective medium theories within the range of validity of the latter.

9.3 Homogenization

621

In particular, let us revisit the venerable Clausius–Mossotti and Lorenz–Lorentz model in electrostatics. The ideas of Trefftz homogenization are very different from those of classical effective medium theories. We have made no mention of, e.g., dipole moments, polarization, Lorentz local fields, volume averages, and so on. Let us nevertheless see how classical theories fit the general methodology outlined above. Although in the previous sections our homogenization framework (“Trefftz homogenization”) was described in the context of full Maxwell’s electrodynamics, the concepts are applicable in electrostatics as well, albeit with some changes. In particular, since the Bloch wave vector in the static case is zero, the Bloch basis functions correspond just to two (2D) or three (3D) polarization states. As a particular but interesting case, consider in the x y-plane a very large square lattice of identical circular dielectric cylinders, each of which has a polarizability α; that is, its dipole moment is p = αE , where E is the field at the location of the dipole, due to all other sources. Higher-order multipoles are neglected, and the cylinders are assumed to be infinitely long in the z direction, so the electrostatic problem is twodimensional. The 3D case is completely analogous but results in slightly different and lengthier expressions. Let the dipole array be immersed in a uniform external (“incident”) field Einc in the x direction, and let us focus on a lattice cell [− 21 a, 21 a]2 in the middle of the array (“in the bulk”); a is the cell size, and the origin for convenience is placed at the center of the cell. It is easy to see that each of the four faces of the cell parallel to the x-axis (i.e. to the direction of the applied field) lies in a symmetry plane of the structure and of the electrostatic potential.23 Similarly, the two faces perpendicular to x lie in antisymmetry planes. It then follows that the potential within a cell “in the bulk” is subject to the following boundary conditions (see, e.g., N. S. Bakhvalov and G. Panasenko [BP89] for a rigorous mathematical justification): nˆ · d = 0, Neumann; face parallel to applied field; nˆ × e = 0, Dirichlet; face perpendicular to applied field

(9.88)

where n is the unit normal vector on the cell boundary (undefined at the edges and corners). The differential equation for the potential u within the cell is, in the Gaussian system, (9.89) ∇ 2 u = −4πρ, ρ = ∂x δ(r) where δ(r) is the Dirac delta function residing at the center of the cell and ∂x δ is a unit dipole. The boundary value problem with sources (9.89) and boundary conditions (9.88) defines the potential and field in the cell uniquely, up to a multiplicative constant. This field can be taken as one of the basis functions, ψ1 (r), on the fine scale. The 23 This

is strictly true for an infinite array and approximately true for a very large array.

622

9 Metamaterials and Their Parameters

other basis function, ψ2 (r), is defined in a similar way, for the applied field in the y direction. At the coarse level, the corresponding functions, by construction, satisfy the electrostatic equations in a homogeneous medium and can be taken to be simply constant and, as before, are found as the boundary averages of the respective fine-level fields. The electrostatic problem with the boundary conditions (9.88) is well known, and various solutions have been put forward in the literature—see, for example, semianalytical analyses by D. J. Bergman and coworkers [Ber76a, Ber76b, Ber78, Ber80, Ber81, Ber82, BD92, BS92]. Once this solution has been found one way or another, (9.45) assumes the form 

Ex x Ex y E yx E yy

  (nˆ × e1 )x ∂C (nˆ × e2 )x ∂C (nˆ × e1 ) y ∂C (nˆ × e2 ) y ∂C

  (nˆ · d1 )x ∂C (nˆ · d2 )x ∂C = (nˆ · d1 ) y ∂C (nˆ · d2 ) y ∂C

(9.90)

The E tensor can be found from this equation directly by matrix inversion. Note that in this particular example all matrices in (9.90) are actually diagonal, so the inversion is trivial. Next, we argue that the Clausius–Mossotti (C-M) theory can be interpreted as a way of approximating the circulations and fluxes of the basis functions ψ, as required for (9.90). The key quantities in C-M, as shown, e.g. in V. A. Markel’s tutorials [Mar16a, Mar16b], are (i) polarization and (ii) the volume integral of the electric field within the cell. Although the relation between these quantities and the ones in (9.90) may not be obvious at first glance, it is in fact easy to explain (D. Bergman [Ber78]). Indeed, consider any rectangular loop with two edges lying on the opposite “Dirichlet” faces of the cell; these edges (call them edges 1,2) are perpendicular to the applied field. The other two edges of the loop (edges 3, 4) cut through the cell and can be located anywhere in the volume of the cell; they are parallel to the applied field. Since the circulation of the field over edges 1,2 is zero due to the Dirichlet condition, it is clear that the line integrals of the field over edges 3, 4 (taken in the same direction) are equal. From that, it is easy to deduce that the volume average of the electric field over the cell is equal to the boundary average of its tangential component—i.e. exactly the quantity needed in (9.90). In other words, the volume average in this case is a perfect proxy for the boundary average. The calculation of the volume integral must account for the singular term in the dipole field [Mar16a, Mar16b] and yields in the 2D case under consideration (nˆ × e1 )∂C = e1 C = Einc − 2π P1 , P1 ≡ p1 (r)C

(9.91)

A similar relation of course exists for the second basis function, i.e. between e2 and P2 . The derivation also relies on the fact that the volume average of the field

9.3 Homogenization

623

produced by all dipoles outside the cell is zero. This is true for an infinite lattice due to symmetry and approximately true for a very large one. Now let us turn to the normal component of D and flux through the boundary. Consider the volume contained between two arbitrary cross sections of the cell perpendicular to the applied field (in our case, perpendicular to the x-axis). Because of the Neumann boundary conditions on the four faces of this volume, it is clear that the flux through the two cross sections must be the same. Hence, the flux of D through any cross section perpendicular to x is the same, and face averages in (9.90) can be replaced with volume averages; we have nˆ · d(r)C = d(r)C = e(r) + 4πp(r)C = E0 − 2πP + 4πP = E0 + 2πP

(9.92)

 where we used the well-known integration-by-parts identity  r ∇ · p(r) d V =   p(r) d V ≡ P. The Maxwell Garnett formula then immediately follows from (9.91), (9.92) and the definition (9.90) of eff : eff =

DC 1 + 2π α˜ , = EC 1 − 2π α˜

α˜ ≡

α VC

9.3.10 Summary on Homogenization of Periodic Structures Our objective has been to develop a first-principle non-asymptotic homogenization theory for periodic electromagnetic structures. Fine-level fields in such structures vary rapidly on the scale finer than the lattice cell size.24 One is looking for coarsescale fields as suitable spatial averages of the fine-scale ones and for constitutive laws relating these coarse-scale fields. But what makes an averaging procedure “suitable”? We require that coarse-level fields (i) satisfy Maxwell’s equations and interface boundary conditions as accurately as possible and (ii) be as close as possible to the actual fields outside the periodic structure. Requirement (ii) is obvious—after all, we want the homogenized model to predict the fields in the air accurately. Requirement (i) also appears to be quite natural and unassailable, but a few remarks are in order. First, if one succeeds in constructing coarse-level fields satisfying some other equations within the sample, such a model would in principle be a valid one. Hence, there is a substantial leeway in constructing the coarse-scale fields. Incidentally, models in which fields on the fine and coarse levels satisfy equations of different type are not purely hypothetical. For example, in some homogenization problems involving eddy currents in laminated magnetic 24 It should be emphasized that the fine scale is not the atomic scale; fine-scale fields are still assumed

to satisfy the equations of continuous electrodynamics.

624

9 Metamaterials and Their Parameters

cores, it turns out that eddy currents in the iron sheets can be accounted for on the fine level but not on the coarse scale (I. Niyonzima et al. [NSD+13, NGS16], M. Schöbinger et al. [SHT20]); consequently, the coarse-scale equation may be magnetostatic rather than eddy current [SHT20]. But for electromagnetic waves in metamaterials, it makes perfect sense to adopt the most natural approach and devise coarse-scale fields satisfying Maxwell’s equations, with material relations to be determined. Even though (ii) appears indisputable, most of the literature on homogenization of metamaterials focuses on the wave behavior in the bulk, while boundary conditions are considered only as an afterthought or not considered at all. Yet boundary conditions are as essential as dispersion relations; in particular, the magnetic response of metamaterials is as much a surface effect as it is a volume effect. The next conceptual step in homogenization is to recognize that one needs information about fine-level fields. In our approach, this information is carried by a basis of Bloch waves traveling in different directions.25 A typical 2D example would be 8 waves propagating at the multiples of π/4 relative to one another. In 3D, one may consider a basis of 12 Bloch waves traveling in the ±x, ±y, ±z directions, with two polarizations per direction. Obviously, these are just examples; trade-offs between the size of the basis and homogenization accuracy were considered in this section. Information about the coarse-level fields is contained in a basis of plane waves, each of which corresponds to the respective Bloch wave in the fine-scale basis. The E H amplitudes of these plane waves are found from Maxwell’s boundary conditions—as boundary averages of the periodic factors in Bloch waves. Once that is done, one finds the D B amplitudes within the homogenized sample from Maxwell’s curl equations. The final step of the procedure is to find the material tensor as the optimal D B versus E H relationship. This tensor automatically accounts for anisotropy and magnetoelectric coupling if these effects manifest themselves; additional nonlocal degrees of freedom (convolution integrals with a sharp kernel) may be included for higher accuracy. Our benchmark examples demonstrate that the systematic approach outlined in this section often leads to an accuracy improvement by an order of magnitude or more, compared to the results published in the literature. This is especially clear in the case of layered media, where exact analytical results are available for comparison and validation. Notably, magnetoelectric coupling, automatically accounted for in our model, correctly represents the symmetry breaking when a structure lacking inversion symmetry is illuminated from two opposite sides [THC20]. As we have seen, classical effective medium theories (Clausius–Mossotti, Lorenz– Lorentz, Maxwell Garnett) fit well—within their own range of validity—into the proposed new framework. If deemed necessary, non-local effects can be included in the model, making further accuracy improvements possible. Also, material parameters can be automatically tailored to a given range of illumination conditions, if so desired.

25 Evanescent

waves could also be included in the basis.

9.3 Homogenization

625

A particularly challenging problem for future research is to determine what effective material tensors are attainable for given constituents of a metamaterial with their given properties, and how the lattice cell could be designed to produce such tensors. For example, what is the maximum effective permeability achievable or what is the strongest (by some measure) chirality, under given physical and engineering constraints? Bounds for effective parameters are currently known only for relatively simple settings, such as static dielectric permittivity of mixtures with two ingredients. Our homogenization methodology may be a modest step toward addressing problems of this kind.

9.4 Appendix: Parameters of Split-Ring Resonators A single conducting ring with a gap—a split ring—can be approximated via a circuit model with an equivalent inductance due to the ring and capacitance due to the gap. A double split-ring resonator (SRR) contains two concentric split rings, and the gap between them provides additional capacitance (Fig. 9.22). This is a popular type of metamaterial “atom,” and the qualifier “double” is often omitted. Analytical approximations of SRR’s electromagnetic response have been derived by many researchers via equivalent circuit models (A. J. Ward and J. B. Pendry [WP96], R. Marqués et al. [MMREI02], S. A. Ramakrishna [Ram05], B. Sauviac et al. [SST04], P. A. Belov and C. R. Simovski [BS05], S. Zuffanelli [Zuf18]). The following result for magnetic polarizability α(ω) is due to B. Sauviac, C. R. Simovski, S. A. Tretyakov and P. A. Belov [SST04, BS05] and is reproduced here in their notation. Let α(ω) =

Aω 2 , ω02 − ω 2 + jωΓ

A =

μ2h π 2 r 4 L+M

(9.93)

where the resonant frequency ω02 =

Fig. 9.22 (Double) split-ring resonator. Reprinted with permission from [BS05] (P. A. Belov c and C. R. Simovski) 2005 by the American Physical Society. http://dx.doi.org/10. 1103/PhysRevE.72.026615

1 (L + M)Cr

(9.94)

626

9 Metamaterials and Their Parameters

L is the inductance of each ring (neglecting the small difference between these two inductances):   32R L = μh r ln −2 (9.95) w M is the mutual inductance of the two rings:   w+d 4 −2+ξ , ξ = M = μh r (1 − ξ) ln ξ 2r

(9.96)

Cr is the effective capacitance of the SRR, C r = h

r 2w cosh−1 π d

(9.97)

Γ is the radiation reaction factor: Γ =

Aωkh3 6πμ0

(9.98)

As indicated in Fig. 9.22, r is the inner radius of the inner ring, w is the width of the √ rings, and d is the distance between the edges of the rings; h , μh and kh = ω h μh are the permittivity, permeability and the wavenumber of the host medium. The above expressions are valid if w, d  r , if the gaps in the rings are large enough compared to d, and if conducting losses in the rings are neglected [BS05]. More advanced models that take into account electric polarizability and bianisotropy of SRRs were developed by R. Marqués et al. [MMREI02]. I should also mention, without much technical detail, two other types of resonators—closely related, but different from the SRR: electric-LC (ELC) [SMS06] and complementary ELC resonators [HGS+08]. The following excerpt from [SMS06] provides an excellent summary of the ideas behind ELC. ... we introduce an inductive-capacitive (LC) resonator with a fundamental mode that couples strongly to a uniform electric field, and negligibly to a uniform magnetic field. We will refer to such a resonator as an electric-LC (ELC) resonator. In this nomenclature, an SRR with two or four balanced splits would be a magnetic-LC resonator, since it couples only to magnetic field, and a single-split SRR would be an electromagnetic LC (EMLC) resonator, since it couples to both electric and magnetic fields. (Note that Pendry’s original SRR with two coaxial, single-split rings is not balanced, and is thus an EMLC resonator.) The presented ELC resonator is suitable for implementing media with desired positive or negative permittivity in one to three dimensions, and does not require intercell or interplane electrical connectivity. Additionally, due to the fact that it is a local and self-contained oscillator, it should be more robust with regard to maintaining its bulk properties close to a boundary or interface. The ELC resonator can be described qualitatively in terms of its equivalent circuit [Fig. 9.23d]. A capacitor-like structure couples to the electric field and is connected in parallel, to two loops, which provide inductance to the circuit. This allows the electric field to drive the LC resonance providing both positive and negative electric polarization at different frequencies along the resonance curve, where the phase of the resonator response is in phase and out of


Fig. 9.23 (Reprinted from [SMS06], © 2006, with the permission of AIP Publishing). a The fabricated sample. b The geometric design: a = 3.333 mm, d = 3 mm, l = 1 mm, w = g = 0.25 mm. The copper thickness was 0.017 mm, and the FR4 substrate thickness was 0.203 mm. c A proposed multilayer design with lower resonant frequency. d An equivalent circuit

phase with the driving field, respectively. The two inductive loops are connected in parallel, so that the resonant frequency of the circuit model is ω_0 = \sqrt{2/(LC)}. ... The symmetry of the structure also dictates its coupling to a magnetic field. Because the two inductive loops are equivalent but oppositely wound with respect to the capacitor, they act as a magnetic field gradiometer that does not couple to a uniform magnetic field; a uniform magnetic field cannot drive the fundamental LC resonance. Like an SRR, the ELC resonator has both inductive and capacitive elements, only one type of which is involved in coupling the fundamental mode to the desired external field. For the SRR, the inductive element (the ring) couples to the magnetic field, and the capacitive element (the split) does not. This gives useful independent control over coupling strength and resonant frequency. Analogously, in the ELC resonator, the capacitive element couples strongly to the electric field and the inductive loops do not.

CELC—Babinet's complement²⁶ of ELC—was studied by T. H. Hand et al. [HGS+08]. CELC "achieves a purely magnetic response with no cross coupling. The CELC metamaterial structure offers potential applications in both free-space and waveguide environments [with applications to] passive phase shifters, filters, power splitters, etc."²⁷

²⁶ Material and air regions interchanged.
²⁷ References omitted from this quote.
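To make the circuit-model formulas (9.93)–(9.98) concrete, here is a minimal numerical sketch (not code from the book). All geometric and material values below are hypothetical, chosen only to produce a representative microwave resonance; SI units are assumed.

```python
import numpy as np

# Sketch of the SRR circuit model, Eqs. (9.93)-(9.98).
# All parameter values are hypothetical illustrations, not taken from [SST04] or [BS05].

mu0 = 4e-7 * np.pi                 # permeability of free space, H/m
eps0 = 8.854e-12                   # permittivity of free space, F/m

eps_h, mu_h = eps0, mu0            # host medium: vacuum
r = 1.5e-3                         # inner radius of the inner ring, m
R = 1.8e-3                         # ring radius used in the inductance formula, m
w = 0.2e-3                         # ring width, m
d = 0.1e-3                         # distance between ring edges, m

# Circuit parameters, Eqs. (9.95)-(9.97)
L = mu_h * r * (np.log(32 * R / w) - 2.0)
xi = (w + d) / (2 * r)
M = mu_h * r * ((1 - xi) * np.log(4 / xi) - 2 + xi)
Cr = eps_h * (r / np.pi) * np.arccosh(2 * w / d)

# Resonance and amplitude, Eqs. (9.93)-(9.94)
omega0 = 1.0 / np.sqrt((L + M) * Cr)
A = mu_h**2 * np.pi**2 * r**4 / (L + M)

def alpha(omega):
    """Magnetic polarizability alpha(omega) of Eq. (9.93)."""
    kh = omega * np.sqrt(eps_h * mu_h)               # host wavenumber
    gamma = A * omega * kh**3 / (6 * np.pi * mu0)    # radiation reaction, Eq. (9.98)
    return A * omega**2 / (omega0**2 - omega**2 + 1j * omega * gamma)

if __name__ == "__main__":
    print(f"resonant frequency f0 = {omega0 / (2 * np.pi) / 1e9:.2f} GHz")
    for frac in (0.8, 0.95, 1.05, 1.2):
        om = frac * omega0
        print(f"omega/omega0 = {frac:4.2f}:  alpha = {alpha(om):.3e}")
```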

9.5 Appendix: Coordinate Mappings and Tensor Transformations

In the context of transformation optics, the relevant tensor transformations appeared in the 1996 paper by A. J. Ward and J. B. Pendry [WP96]. A concise and clear exposition (two pages only) is provided by S. G. Johnson [Joh10]. The results are as follows.


Consider an electromagnetic field in a source-free medium with linear material relations D = εE, B = μH, where ε and μ can be position-dependent and tensorial. Let there be a differentiable and invertible map of the domain x occupied by this medium onto a domain x′. As a particular case, x and x′ may in fact correspond to the same physical region of space represented in two different coordinate systems x = (x_1, x_2, x_3) and x′ = (x′_1, x′_2, x′_3). Typically, however, x and x′ are physically different regions, each equipped with its own coordinate system. Following S. G. Johnson [Joh10], define the Jacobian matrix of the transformation:

J_{αβ} ≡ \frac{∂x′_α}{∂x_β}    (9.99)

For simplicity, let both coordinate systems be right-handed. Then, the field and tensor transformations can be written as follows:

E′ = (J^T)^{-1} E    (9.100)

H′ = (J^T)^{-1} H    (9.101)

ε′ = (\det J)^{-1} J ε J^T    (9.102)

μ′ = (\det J)^{-1} J μ J^T    (9.103)

As a very simple illustration, let us consider a uniform scaling x′ = sx with some factor s > 0. For example, the physical domain may remain the same, but coordinates x were measured in meters, and x′ in centimeters; in that case, s = 100. Then, J = s I_{3×3}, where I_{3×3} is the 3 × 3 identity matrix. With this J, det J = s^3, and Eqs. (9.100)–(9.103) yield

E′ = s^{-1} E    (9.104)

H′ = s^{-1} H    (9.105)

ε′ = s^{-1} ε    (9.106)

μ′ = s^{-1} μ    (9.107)

The transformations of E and H make clear physical sense, as they leave the circulations of these fields invariant (coordinates are scaled by s, and fields by s −1 ). Similarly, fluxes of D and B remain invariant if these fields are scaled by s −2 . Thus, the D/E ratio is scaled as s −2 /s −1 = s −1 , which is consistent with (9.106). The same conclusion holds for the ratio of the B and H fields. For static fields, these transformations have appeared over the years in a variety of publications and applications; see, e.g., E. M. Freeman and D. A. Lowther [FL89], A. Stohchniol [Sto92], A. Plaks et al. [PTPT00], G. W. Milton et al. [MBW06].
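The transformation rules are straightforward to exercise numerically. The short sketch below (an illustration, not code from the book) applies (9.100)–(9.103) to the uniform-scaling example above and checks the special cases (9.104)–(9.107); the sample anisotropic permittivity is arbitrary test data.

```python
import numpy as np

# Sketch of the tensor and field transformations (9.100)-(9.103).
# The Jacobian below (uniform scaling by s) reproduces the example in the text;
# the sample permittivity tensor is arbitrary.

def transform_tensor(tensor, J):
    """Apply Eq. (9.102)/(9.103): (det J)^(-1) * J * tensor * J^T."""
    return J @ tensor @ J.T / np.linalg.det(J)

def transform_field(field, J):
    """Apply Eq. (9.100)/(9.101): (J^T)^(-1) * field."""
    return np.linalg.solve(J.T, field)

s = 100.0                          # meters -> centimeters, as in the text
J = s * np.eye(3)                  # Jacobian of the uniform scaling x' = s x

eps = np.diag([2.0, 4.0, 9.0])     # arbitrary anisotropic permittivity (relative)
mu = np.eye(3)

eps_p = transform_tensor(eps, J)
mu_p = transform_tensor(mu, J)

# Special case (9.106)-(9.107): eps' = eps / s, mu' = mu / s
assert np.allclose(eps_p, eps / s)
assert np.allclose(mu_p, mu / s)

E = np.array([1.0, 0.0, 0.0])
print("E' =", transform_field(E, J))   # expect E / s, Eq. (9.104)
print("eps' diagonal =", np.diag(eps_p))
```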


9.6 Appendix: Dielectric Cylinders and Spheres in a Uniform Electrostatic Field

9.6.1 Dielectric Cylinders in a Uniform Electrostatic Field

For easy reference, let us reproduce the well-known equations for a dielectric cylinder in a uniform electrostatic field. Let the permittivity of the cylinder be ε_cyl, its radius be r_0, and let z be its axis. The cylinder is immersed in a host dielectric with a permittivity ε_host. The potential of an electrostatic field E_a applied along the x-axis is

u_a = -E_a x = -E_a r cos θ

with the standard notation for the polar coordinates (r, θ). Since this potential contains only the first cylindrical harmonic, it can be easily demonstrated that the potentials inside and outside the cylinder also have only one angular harmonic, viz.

u_in = c_in r cos θ

u_out = (-E_a r + c_out r^{-1}) cos θ

The boundary conditions for the potential and its normal derivative at r = r_0 are

c_in r_0 = -E_a r_0 + c_out r_0^{-1}

ε_cyl c_in = ε_host (-E_a - c_out r_0^{-2})

where c_in and c_out are coefficients to be determined. The solution of these two equations with two unknowns is

c_in = -E_a \frac{2 ε_host}{ε_cyl + ε_host}

c_out = p = E_a r_0^2 \frac{ε_cyl - ε_host}{ε_cyl + ε_host}

The polarizability

α ≡ \frac{p}{E_a} = r_0^2 \frac{ε_cyl - ε_host}{ε_cyl + ε_host}

The potential and the field components inside and outside the cylinder are

E_in = -c_in = E_a \frac{2 ε_host}{ε_cyl + ε_host}


u_out = -E_a x + \frac{c_out x}{x^2 + y^2}

E_{x,out} = -\frac{∂u_out}{∂x} = E_a - \frac{c_out}{x^2 + y^2} + \frac{2 c_out x^2}{(x^2 + y^2)^2}

E_{y,out} = -\frac{∂u_out}{∂y} = \frac{2 c_out x y}{(x^2 + y^2)^2}

9.6.2 Dielectric Spheres in a Uniform Electrostatic Field

The electrostatic problems for a dielectric sphere and cylinder in a uniform field are analogous.²⁸ Let the permittivity of the sphere (or "particle") be ε_p, and its radius be r_0. The electrostatic potential of a uniform field applied in the x direction is, as before,

u_a = -E_a x = -E_a r cos θ

Again, only the first angular harmonic is nonzero in this setup, and the potentials inside and outside the sphere are

u_in = c_in r cos θ

u_out = (-E_a r + c_out r^{-2}) cos θ

The boundary conditions for the potential and its normal derivative at r = r_0 are

c_in r_0 = -E_a r_0 + c_out r_0^{-2}

ε_p c_in = ε_host (-E_a - 2 c_out r_0^{-3})

The solution is

-c_in = E_in = E_a \frac{3 ε_host}{ε_p + 2 ε_host}

c_out = p = E_a r_0^3 \frac{ε_p - ε_host}{ε_p + 2 ε_host}

The polarizability

α ≡ \frac{p}{E_a} = r_0^3 \frac{ε_p - ε_host}{ε_p + 2 ε_host}

²⁸ It is straightforward to combine both cases: the dipole potential depends on r as c_out r^{-d+1}, where d = 2, 3 is the number of dimensions. I still decided to keep the two cases separate for easy reference.
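As a quick sanity check of the closed-form coefficients above, the following short symbolic sketch (an illustration, not the author's code) verifies that they satisfy the interface conditions at r = r_0 for both the cylinder and the sphere.

```python
import sympy as sp

# Symbolic check of the interface conditions for the cylinder (9.6.1) and sphere (9.6.2).
eps_i, eps_h, Ea, r0 = sp.symbols("eps_in eps_host E_a r_0", positive=True)

# Cylinder: u_in = c_in r cos(theta), u_out = (-Ea r + c_out / r) cos(theta)
c_in = -Ea * 2 * eps_h / (eps_i + eps_h)
c_out = Ea * r0**2 * (eps_i - eps_h) / (eps_i + eps_h)
print(sp.simplify(c_in * r0 - (-Ea * r0 + c_out / r0)))              # 0: continuity of u
print(sp.simplify(eps_i * c_in - eps_h * (-Ea - c_out / r0**2)))     # 0: continuity of eps du/dr

# Sphere: u_in = c_in r cos(theta), u_out = (-Ea r + c_out / r^2) cos(theta)
c_in = -Ea * 3 * eps_h / (eps_i + 2 * eps_h)
c_out = Ea * r0**3 * (eps_i - eps_h) / (eps_i + 2 * eps_h)
print(sp.simplify(c_in * r0 - (-Ea * r0 + c_out / r0**2)))           # 0
print(sp.simplify(eps_i * c_in - eps_h * (-Ea - 2 * c_out / r0**3))) # 0
```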


9.7 Appendix: Wave Propagation Through a Homogeneous Slab

9.7.1 Maxwell's Equations

Maxwell's equations in the Gaussian system under the exp(-iωt) phasor convention are

∇ × e = ik_0 b    (9.108)

∇ × h = -ik_0 d    (9.109)

The small letters e, d, h, b denote fine-scale fields in any given structure. The respective capital letters E, D, H, B denote coarse-level fields obtained by some homogenization procedure. k0 = ω/c is the wavenumber in free space. In the remainder, we consider a slab 0 ≤ n ≤ w of a material with linear but otherwise general electromagnetic characteristics, possibly non-local; direction n is normal to the slab. If the slab is homogeneous, the distinction between small-letter and capital-letter fields is lost.

9.7.2 E-mode (s-mode), with Possible Nonlocality

For the s-mode (E-mode), E = ẑE, H = n̂H_n + τ̂H_τ, where n and τ are the normal and tangential directions to the slab, respectively, and (n, τ, z) is assumed to be a right-handed system. Maxwell's equations in the Gaussian system become in this case

∂_n E = -ik_0 B_τ
∂_τ E = ik_0 B_n
∂_n H_τ - ∂_τ H_n = -ik_0 D

We seek a particular solution of the above equations in the form of a generalized plane wave E = E_0 exp(iq · r), etc., where q is a wave vector (either given or to be found, depending on the problem at hand), and E_0 is an amplitude. Then, Maxwell's equations get converted from the differential to matrix form (since ∂_n → iq_n, etc.):

\begin{pmatrix} q̃_n & 0 & 0 \\ q̃_τ & 0 & 0 \\ 0 & q̃_τ & -q̃_n \end{pmatrix}
\begin{pmatrix} E_0 \\ H_{n0} \\ H_{τ0} \end{pmatrix}
= P \begin{pmatrix} D_0 \\ B_{n0} \\ B_{τ0} \end{pmatrix}, \qquad
P = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}    (9.110)


where q̃ = q/k_0. Let the constitutive relation be

\begin{pmatrix} D \\ B_n \\ B_τ \end{pmatrix}
= M_L \begin{pmatrix} E \\ H_n \\ H_τ \end{pmatrix}
+ M_{NL} \, \mathcal{E} ∗ \begin{pmatrix} E \\ H_n \\ H_τ \end{pmatrix}

where "∗" denotes spatial convolution with a given integral kernel \mathcal{E}; M_L, M_{NL} are the local and non-local parts of the effective tensor. Substituting this into (9.110) and noting that convolution turns into multiplication in reciprocal space, one obtains

(q̃_n Q_n + q̃_τ Q_τ) ψ_0 = P (M_L + M_{NL} \mathcal{E}(q̃_n, q̃_τ) I) ψ_0

where

Q_n = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad
Q_τ = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
ψ_0 = \begin{pmatrix} E_0 \\ H_{n0} \\ H_{τ0} \end{pmatrix}    (9.111)

In general, this is a nonlinear problem with respect to q̃_n (for a given q̃_τ). However, if \mathcal{E} does not depend on q̃_n, then one has a generalized linear eigenproblem

q̃_n Q_n ψ_0 = [P (M_L + M_{NL} \mathcal{E}(q̃_τ) I) - q̃_τ Q_τ] ψ_0    (9.112)

As a side note, recall, in particular, that the Fourier transform of the Gaussian kernel

\mathcal{E}(τ) = \frac{β}{\sqrt{π}} \exp(-β^2 τ^2)

(an "approximate δ-function" for large β) is

\mathcal{E}(k) = \exp\left(-\frac{k^2}{4β^2}\right)

The assumption that the kernel depends only on τ and not on n has particular physical significance: no complications arise near the material/air interface, since the support of the kernel does not, by construction, intersect with the boundary.

With the eigenvalues q_n and eigenvectors ψ_0 defined by (9.112), one can proceed to the analysis of wave propagation through the slab. For a given q_τ, (9.112) has two eigensolutions q_{nα}, ψ_{0α} (α = 1, 2) corresponding to waves traveling in the opposite directions with respect to the n-axis. Hence, the field in the slab is a superposition of two eigenmodes:

ψ(n) = \begin{pmatrix} E_{01} & E_{02} \\ H_{n01} & H_{n02} \\ H_{τ01} & H_{τ02} \end{pmatrix}
\begin{pmatrix} \exp(iq_{1n} n) & 0 \\ 0 & \exp(iq_{2n} n) \end{pmatrix} c    (9.113)


Here, H_{τ01} (for example) is the amplitude of H_τ for the first mode; c ∈ ℂ² is a two-component complex coefficient vector, the "weights" of the two modes in the solution. The complex exponential exp(iq_τ τ) is from now on suppressed in all expressions because it is a common factor for all terms. It makes sense to introduce shorthand notation for the two matrices in (9.113) — the 3 × 2 matrix of modal amplitudes Φ_{EH} and the 2 × 2 diagonal matrix of exponentials Q(n) — and rewrite that equation as

ψ(n) = Φ_{EH} Q(n) c    (9.114)

One remaining step is to find the c coefficients. It is convenient to start from the transmission side (n = w), since there is only one wave (the transmitted one) in the air. The boundary conditions involve the tangential components of the field, i.e. E and H_τ for the s-mode under consideration. To separate these components out from (9.113), we may formally use the following matrix, which selects rows 1 and 3:

P_{13} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}



With this notation, the boundary conditions for (9.114) on the transmission side can be written as

P_{13} Φ_{EH} Q(w) c = \begin{pmatrix} E_{0,out} \\ H_{τ0,out} \end{pmatrix}, \qquad
H_{τ0,out} = -\frac{k_{n,out}}{k_0 μ_{out}} E_{0,out}    (9.115)

where "out" labels the quantities on the transmission side (k_{n,out} being the normal component of the wavevector there). Thus,

c = E_{0,out} \left[ P_{13} Φ_{EH} Q(w) \right]^{-1}
\begin{pmatrix} 1 \\ -\frac{k_{n,out}}{k_0 μ_{out}} \end{pmatrix}

With the c coefficients now determined, the field in the slab is known completely. We are particularly interested in the boundary conditions on the incident side (n = 0):

\begin{pmatrix} E_{in} \\ H_{τ,in} \end{pmatrix} = P_{13} Φ_{EH} Q(0) c

or, substituting the coefficients c,

\begin{pmatrix} E_{in} \\ H_{τ,in} \end{pmatrix}
= E_{0,out} \, P_{13} Φ_{EH} Q(0) \left[ P_{13} Φ_{EH} Q(w) \right]^{-1}
\begin{pmatrix} 1 \\ -\frac{k_{n,out}}{k_0 μ_{out}} \end{pmatrix}    (9.116)

Finally, the fields at the n = 0 boundary of the slab are a superposition of the incident and reflected waves, and one can easily show that

\begin{pmatrix} E_{in} \\ H_{τ,in} \end{pmatrix} = G \begin{pmatrix} E_{0,inc} \\ E_{0,r} \end{pmatrix}, \qquad
G = \begin{pmatrix} 1 & 1 \\ -\frac{k_{n,in}}{k_0 μ_{in}} & \frac{k_{n,in}}{k_0 μ_{in}} \end{pmatrix}


Putting everything together, we can now relate the amplitudes of the incident and reflected waves to that of the transmitted one:

\begin{pmatrix} E_{0,inc} \\ E_{0,r} \end{pmatrix}
= E_{0,out} \, G^{-1} P_{13} Φ_{EH} Q(0) \left[ P_{13} Φ_{EH} Q(w) \right]^{-1}
\begin{pmatrix} 1 \\ -\frac{k_{n,out}}{k_0 μ_{out}} \end{pmatrix}

Although this expression looks cumbersome, its implementation is straightforward, especially in a high-level language such as MATLAB. The reflection and transmission coefficients R = E_{0,r}/E_{0,inc} and T = E_{0,out}/E_{0,inc} follow from the above relationship immediately.
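For a concrete, minimal illustration of this procedure, here is a Python/NumPy sketch (the book suggests MATLAB; any high-level language will do). It is not the author's code: it specializes to a local, isotropic slab for the s-mode, solves the eigenproblem (9.112) numerically, and recovers R and T from (9.113)–(9.116). All numerical parameters are arbitrary test values; for the lossless settings below, the printed power balance should be close to 1.

```python
import numpy as np
from scipy.linalg import eig

# S-mode reflection/transmission for a local, isotropic slab, following (9.110)-(9.116).
k0 = 2 * np.pi / 1.0                 # free-space wavenumber (wavelength = 1)
w = 0.25                             # slab thickness
eps_s, mu_s = 4.0, 1.0               # slab material (local: M_NL = 0)
eps_in, mu_in = 1.0, 1.0             # incidence half-space
eps_out, mu_out = 1.0, 1.0           # transmission half-space
theta = np.deg2rad(30.0)             # angle of incidence
qt = np.sqrt(eps_in * mu_in) * np.sin(theta)   # q_tau / k0 (conserved)

# Matrices of (9.110)-(9.111)
P  = np.array([[0, 0, -1], [0, 1, 0], [1, 0, 0]], dtype=complex)
Qn = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]], dtype=complex)
Qt = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=complex)
ML = np.diag([eps_s, mu_s, mu_s])    # D = eps*E, B = mu*H for the s-mode

# Generalized eigenproblem (9.112): A psi = qtilde_n Qn psi
A = P @ ML - qt * Qt
vals, vecs = eig(A, Qn)
order = np.argsort(np.abs(vals))     # discard the spurious (infinite) eigenvalue
qn = k0 * vals[order[:2]]            # the two propagation constants along n
Phi = vecs[:, order[:2]]             # 3 x 2 matrix of modal amplitudes (Phi_EH)

def Q(n):
    return np.diag(np.exp(1j * qn * n))

P13 = np.array([[1, 0, 0], [0, 0, 1]], dtype=complex)
kn_in = k0 * np.sqrt(eps_in * mu_in - qt**2)
kn_out = k0 * np.sqrt(eps_out * mu_out - qt**2)

# (9.115): coefficients c for a unit transmitted amplitude E_{0,out} = 1
rhs_out = np.array([1.0, -kn_out / (k0 * mu_out)], dtype=complex)
c = np.linalg.solve(P13 @ Phi @ Q(w), rhs_out)

# (9.116) and the G matrix: incident and reflected amplitudes
tang_in = P13 @ Phi @ Q(0.0) @ c
G = np.array([[1, 1], [-kn_in / (k0 * mu_in), kn_in / (k0 * mu_in)]], dtype=complex)
E_inc, E_ref = np.linalg.solve(G, tang_in)

R, T = E_ref / E_inc, 1.0 / E_inc
balance = abs(R)**2 + abs(T)**2 * (kn_out * mu_in / (kn_in * mu_out)).real
print(f"R = {R:.4f},  T = {T:.4f},  power balance = {balance:.4f}")
```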

9.7.3 H-mode (p-mode), with Possible Nonlocality

Equations for the p-mode (H-mode: H = ẑH, E = n̂E_n + τ̂E_τ) are

∂_n h = ik_0 d_τ
∂_τ h = -ik_0 d_n
∂_n e_τ - ∂_τ e_n = ik_0 b

In a homogenized medium, we seek the fields (E, H, D, B) as a generalized plane wave

H = H_0 exp(iq · r), etc.    (9.117)

Maxwell's equations for these fields become

\begin{pmatrix} 0 & 0 & q̃_n \\ 0 & 0 & q̃_τ \\ -q̃_τ & q̃_n & 0 \end{pmatrix}
\begin{pmatrix} E_n \\ E_τ \\ H \end{pmatrix}
= P \begin{pmatrix} D_n \\ D_τ \\ B \end{pmatrix}, \qquad
P = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}

where q̃ = q/k_0. Let the constitutive relation be

\begin{pmatrix} D_n \\ D_τ \\ B \end{pmatrix}
= M_L \begin{pmatrix} E_n \\ E_τ \\ H \end{pmatrix}
+ M_{NL} \begin{pmatrix} \mathcal{E} ∗ E_n \\ \mathcal{E} ∗ E_τ \\ \mathcal{E} ∗ H \end{pmatrix}

where M_L, M_{NL} are the local and non-local parts of the effective tensor. For the generalized plane wave (9.117), Maxwell's equations become

(q̃_n Q_n + q̃_τ Q_τ) ψ = P [M_L + \mathcal{E}(q_n, q_τ) M_{NL}] ψ    (9.118)


where




Q_n = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
Q_τ = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0 \end{pmatrix}, \qquad
ψ = \begin{pmatrix} E_n \\ E_τ \\ H \end{pmatrix}

As in the case of the E-mode, this is in general a nonlinear problem with respect to q̃_n (for a given q̃_τ). However, if \mathcal{E} does not depend on q̃_n, then we have a generalized linear eigenproblem

q̃_n Q_n ψ = [P (M_L + \mathcal{E}(q_τ) M_{NL}) - q̃_τ Q_τ] ψ    (9.119)

The rest of the analysis coincides with that of the E-mode.

Chapter 10

Miscellany

A Miscellany is a collection without a natural ordering relation; I shall not attempt a spurious unity by imposing artificial ones. I hope that variety may compensate for this lack... J. E. Littlewood, “A mathematician’s miscellany” (1960)

This short chapter is a collection of paradoxical or controversial subjects related to the main topics of this book, but not necessarily covered in other chapters.

10.1 Good or Poor Conductors for Low Loss?

In optics and photonics, the use of conducting materials (e.g. gold or silver) leads to losses. This is well known, in particular, as a major impediment in applications of plasmonics and metamaterials (see, e.g., J. B. Khurgin's paper [Khu15] and references there, as well as Sect. 8.12). Low loss in these applications is associated with a small or negligible imaginary part of the dielectric permittivity, |Im ε| ≪ 1.¹ This appears non-controversial until we realize that in everyday electric power applications it is good conductors (such as copper or aluminum) that are used to minimize losses. Since electric conductivity σ = ω|Im ε|, in power applications one seeks |Im ε| ≫ 1. A natural question therefore is: Are low losses associated with good or poor conductors, i.e. with |Im ε| ≪ 1 or |Im ε| ≫ 1? In an informal poll of my colleagues,

¹ The absolute value is used because the sign of Im ε depends on the exp(±iωt) phasor convention.



the number of different answers approximately matched the number of people polled. I give my own hint at the end of this chapter.

10.2 The "Source Current"

In the engineering literature, it is common practice to subdivide the current density J in conductors into two parts: 1. "External," "impressed" or "source" currents J_s. 2. "Induced" currents J_ind = σE. Thus, one sees equations like

J = J_s + σE [caution!]    (10.1)

Typical examples are [CC77, Kon81], but there are many more. Unfortunately, (10.1) is misleading at best and incorrect at worst, depending on its interpretation. If the conventional constitutive equation in conductors holds—and in the vast majority of practical applications we assume it does—then J = σE, period. An individual moving charge does not know, and does not care, whether its motion is "impressed" or "induced." Calling J_s the "source current" is not just a matter of pure semantics; it may lead to significant errors. One might think, for example, that in a stand-alone conductor, not connected to any sources, J_s must be zero, but this is not the case. Instead of the "source current" J_s, it would be far better to use the unambiguous and rigorous notion of the electrostatic potential and its gradient; for details, see [TKL92, TKMS93, Tsu95a, Tsu02]. It may be noted in passing that (10.1) would be technically correct if interpreted in the following sense:

J = \begin{cases} J_s, & \text{known current density; formally, set } σ = 0 \\ σE, & \text{unknown current density} \end{cases}    (10.2)

What relevance does this have to electromagnetic problems on the nanoscale? It is common—this time, in publications on optics and photonics—to posit external current sources within (meta)materials; see notes in Sect. 9.3.6, as well as papers by V. A. Markel, e.g. [Mar10, MT13, Mar18, MT20].

10.3 Boundary Conditions in Effective Medium Theory It is common in the literature to assume that effective parameters of periodic electromagnetic structures could be derived from the dispersion relations alone—that is, from the behavior of Bloch waves in the bulk. The analysis of Sect. 9.3.7 shows,


however, that boundary conditions on the surface of the (by necessity, finite) structure are as important as is the bulk behavior of waves; see also [TM14, Tsu17]. This conclusion is not yet accepted as widely as it should be, for two main reasons. First, the standard tool of analysis in physics is Fourier transforms, which work in a straightforward manner only in infinite media. Second, classical homogenization theories, which are valid in the limit of a vanishingly small lattice cell size, do not depend significantly on the boundary effects.

10.4 “Spurious Modes” It was observed in the 1970s and 1980s (e.g. J. B. Davies et al. [DFP82]) that nodal element solutions of electromagnetic cavity resonance problems contained non-physical “spurious” modes with nonzero divergence, even though good physical results had been obtained with node elements for a variety of other problems— magnetostatics, eddy currents, etc. This problem is discussed in Sect. 3.12.1; the summary below follows [Tsu03a]. Several misconceptions and incomplete theories of spectral convergence existed in the past. As one salient example, the fact that the lowest order Nedelec–Whitney edge elements (Sect. 3.12.1) are divergence-free was construed to imply that (a) the “spurious modes” in electromagnetic cavity resonance problems are absent; (b) edge elements cannot be applied to fields with nonzero divergence [Mur98]. If such statements were true, then first-order nodal elements would be perfect for the Laplace equation (because the Laplacian of a linear function is exactly zero) and could not be applied at all to, say, a general Poisson equation. In fact, the FEM Galerkin method is based not on a pointwise rendering of derivatives but rather on the weak formulation of the problem, which takes into account not only what happens inside the element but the interelement boundaries as well. For spurious modes, the divergence-free property of edge elements is in fact irrelevant (S. Caorsi et al. [CFR00]). In the mathematical literature, a consensus has emerged that the so-called discrete compactness condition plays a key role in the analysis of spectral convergence: S. Caorsi et al. [CFR00], D. Boffi et al. [BFea99, Bof01], P. Monk and L. Demkowicz [MD01].

10.5 The Moment Method and FEM One of the most common procedures for the solution of integral equations of electromagnetic fields (usually high frequency) has been known, since the 1960s, as “the moment method” (R. F. Harrington [Har93]). Even though one does not normally argue about definitions, in this case the terminology is misleading. The original


method of moments dates back to Galerkin’s work ca. 1915 and can be applied to both differential and integral equations. In particular, FEM is also a moment method. Computational procedures consist of a formulation and a numerical method. Integral equations in electromagnetic or other applications can be solved not only by the moment method but also, say, by numerical quadratures. Likewise, boundary value problems can be solved by FD methods or by moment methods (FEM Galerkin); see [KT93] for further discussion. Rather than appropriating the “moment method” moniker for one specific class of high-frequency problems, it would be desirable to use more descriptive terms, such as “the BEM Galerkin method,” “FEM Galerkin” and “numerical quadratures for integral equations.” Unfortunately, this suggestion is half a century too late—in applied electromagnetics, “the moment method” seems to be firmly ensconced in one specific niche.

10.6 The Magnetostatic "Source Field" and the Biot–Savart Law

In magnetostatics, the governing equation ∇ × H = J (or 4πc⁻¹J in the Gaussian system) is very often solved by introducing an auxiliary "source field" H_s which satisfies the same curl equation but not necessarily the zero-divergence or boundary conditions. Then, for simply connected regions, H = H_s - ∇u, where u is the magnetic scalar potential. Typically, the Biot–Savart volume integral is used for the analytical construction of H_s. However, there is a much more efficient approach involving a line integral rather than the volume integral:

H_s = \int_L J × dl    (10.3)

where the direction dl is fixed; for examples and details about the integration path L, see V. V. Belousova and V. L. Chechurin et al. [BBC78, Che79]. This formula can be verified by direct evaluation of ∇ × Hs . Analogous expressions, with Lamé coefficients, are valid in the cylindrical and spherical systems and have for a long time been standard in the Russian literature [BBC78] but are not widely known elsewhere. Assuming that the starting points for integration in (10.3) form a smooth surface, one observes that Hs is zero on that surface. Its tangential components being zero, the normal component of J = ∇ × H must be zero, too. This is also a sufficient condition for (10.3) to be valid. The source field defined by (10.3) has no component in the dl direction, which may reduce computational work in some cases. The “tree–cotree” gauge (F. Dubois [Dub90, Sect. 4.1]; R. Albanese and G. Rubinacci [AR90]) for edge elements can be viewed as a discrete analog of (10.3). When written in the spherical system, (10.3) coincides with the Poincaré gauge (W. E. Brittin et al. [BSW82], F. H. J. Cornish [Cor84]).
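A quick symbolic check of (10.3) (an illustration, not from the book): for a straight integration path along z starting on the plane z = 0, and an arbitrary divergence-free test current whose normal component vanishes on that plane, ∇ × H_s indeed reproduces J.

```python
import sympy as sp

# Check Eq. (10.3) for a straight path along z starting at the plane z = 0:
#     H_s(x, y, z) = \int_0^z J(x, y, z') x zhat dz'
# The test current is arbitrary: it is divergence-free and J_z(z = 0) = 0,
# as required in the text.
x, y, z, zp = sp.symbols("x y z zp")

J = sp.Matrix([-2 * x * z, 0, z**2])
assert sp.simplify(sp.diff(J[0], x) + sp.diff(J[1], y) + sp.diff(J[2], z)) == 0

zhat = sp.Matrix([0, 0, 1])
integrand = J.subs(z, zp).cross(zhat)                       # J x dl, with dl = zhat dz'
Hs = integrand.applyfunc(lambda f: sp.integrate(f, (zp, 0, z)))

curl_Hs = sp.Matrix([
    sp.diff(Hs[2], y) - sp.diff(Hs[1], z),
    sp.diff(Hs[0], z) - sp.diff(Hs[2], x),
    sp.diff(Hs[1], x) - sp.diff(Hs[0], y),
])
print(sp.simplify(curl_Hs - J))                             # expect the zero vector
```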


10.7 TE and TM Modes

In waveguide theory, transverse electric (TE) and transverse magnetic (TM) modes are defined unambiguously as having no electric field or no magnetic field, respectively, along the axis of the guide. Unfortunately, when translated to wave propagation in 2D structures (e.g. photonic crystals), this terminology becomes confusing. As documented in Sect. 8.8 and in Tables 8.1, 8.2, the same term (TE or TM) may mean diametrically opposite conditions when used by different authors. I wish the TE/TM terminology were confined to the analysis of waveguides. For 2D structures, unambiguously defined are the s- and p-modes (Sect. 8.8); in the book, I also call them the E- and H-modes, respectively, indicating the field with only one component.

10.8 FDTD Versus Discontinuous Galerkin and FETD Finite-difference methods have a salient weakness: It is difficult to model accurately the behavior of fields around geometrically complex objects in general, and slanted or curved material interfaces in particular. In the frequency domain, FLAME (Chap. 4) often qualitatively improves the accuracy. In time-domain problems, several partial remedies have been proposed (Sect. 7.12). In contrast, various versions of the finite element method, which operate on geometrically conforming meshes, have a distinct and natural advantage in problems with complex structures. Aren’t FDTD-type methods becoming obsolete and getting displaced by the advent of sophisticated finite element time-domain (FETD) and discontinuous Galerkin (DG, Sect. 4.6.4) methods? Some researchers believe so; but a quick literature review tells a different story. Shown in Figs. 10.1 and 10.2 are the numbers of ISI journal articles on FD, FEM and DG in the time domain, over the last decade, for photonics and antenna applications, respectively. There are two obvious caveats with regard to that data. First, the database search results depend to some extent on the particular choice of keywords. Second, the usage of various methods may change in the future. Nevertheless, the overall conclusion is, for the moment, unmistakable: In a variety of practical applications, FDTD methods are used much more widely than FETD or DG. Acceptance of different methods depends not only, and perhaps not primarily, on their pure mathematical strengths, but also on many other factors: ease of implementation, CPU and memory requirements, availability of public domain and commercial software, etc. It is a safe bet that development of FDTD, FEM and DG will continue unabated in future years, and the arguments for or against each of these methods will continue to be deployed.


Fig. 10.1 FD, FEM and DG in the time domain: ISI journal articles over the last decade; photonics applications. Keywords used in the database search are indicated in the legend (The 2019 data may be incomplete)

Fig. 10.2 FD, FEM and DG in the time domain: ISI journal articles over the last decade; antenna applications. Keywords used in the database search are indicated in the legend (The 2019 data may be incomplete)

10.9 1D Poisson Equation: FEM Solution with Exact Nodal Values A curious special case of finite element approximation in 1D is considered in Appendix 3.10.2 (see also J. Douglas and T. Dupont [DD74, p. 101]). Namely, if first-order finite elements are used to solve a 1D Poisson equation with an arbitrary admissible right-hand side, then the nodal values of the theoretical and numerical solutions are exactly the same. This is surprising, since in general the nodal values


are subject to approximation errors commensurate with the order of the scheme (or sometimes one order higher), but in this special case these errors just vanish. This is an extreme case of superconvergence (Sect. 3.11).
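A small numerical illustration (not from the book): first-order FEM for -u″ = f on a random nonuniform mesh, with the load vector computed by accurate quadrature, reproduces the exact solution at the nodes to within quadrature and roundoff error, far below the usual O(h²) interpolation error.

```python
import numpy as np
from scipy.integrate import quad

# First-order FEM for -u'' = f on (0, 1), u(0) = u(1) = 0, nonuniform mesh.
# Test case: f = pi^2 sin(pi x), exact solution u = sin(pi x).
rng = np.random.default_rng(0)
x = np.sort(np.concatenate(([0.0, 1.0], rng.uniform(0, 1, 8))))  # mesh nodes
f = lambda t: np.pi**2 * np.sin(np.pi * t)
u_exact = lambda t: np.sin(np.pi * t)

n = len(x) - 2                      # number of interior nodes
h = np.diff(x)                      # element lengths

# Stiffness matrix for piecewise-linear ("hat") basis functions
A = np.zeros((n, n))
for i in range(1, n + 1):
    A[i - 1, i - 1] = 1 / h[i - 1] + 1 / h[i]
    if i < n:
        A[i - 1, i] = A[i, i - 1] = -1 / h[i]

# Load vector b_i = integral of f(t) * phi_i(t), via adaptive quadrature
b = np.zeros(n)
for i in range(1, n + 1):
    left = lambda t: f(t) * (t - x[i - 1]) / h[i - 1]
    right = lambda t: f(t) * (x[i + 1] - t) / h[i]
    b[i - 1] = quad(left, x[i - 1], x[i])[0] + quad(right, x[i], x[i + 1])[0]

u_fem = np.linalg.solve(A, b)
# The nodal error is at the quadrature/roundoff level, not O(h^2):
print("max nodal error:", np.abs(u_fem - u_exact(x[1:-1])).max())
```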

10.10 Good or Poor Conductors for Low Loss? (A Hint)

Are low losses associated with good or poor conductors, i.e. with |Im ε| ≪ 1 or |Im ε| ≫ 1? This question—not as silly as it might appear at first glance—was posed in Sect. 10.1. As an analogy, consider two expressions for power dissipation in a resistor R:

P = I^2 R    (10.4)

P = \frac{V^2}{R}    (10.5)

with the standard notation for voltage V and current I . These expressions are, of course, mathematically equivalent. But if one wishes to maintain a fixed value of the current, (10.4) clearly shows that low loss corresponds to low resistance. On the other hand, if a fixed value of the voltage needs to be maintained, (10.5) shows that low loss corresponds to high resistance.2 The reader will surely see the connection of this example with power applications on the one hand, when currents need to be supplied, and photonics applications on the other, when “voltages” (more precisely, electric fields) are key.

² In electric machine design, cases analogous to these two are sometimes called "current driven" and "voltage driven," respectively.

Chapter 11

Conclusion: “Plenty of Room at the Bottom” for Computational Methods

Room at the Top is a 1959 British film based on the 1957 novel of the same name by John Braine. ... Room at the Top was widely lauded, and was nominated for six Academy Awards. https://en.wikipedia.org/wiki/Room_at_the_Top_(novel) https://en.wikipedia.org/wiki/Room_at_the_Top_(1959_film)

Cut out the poetry, Watson. Sir Arthur Conan Doyle, The Casebook of Sherlock Holmes.

Google returns over 400,000 references to R. Feynman's talk "There's Plenty of Room at the Bottom" [Fey59]. This talk, given on December 29, 1959, at the annual meeting of the American Physical Society, presaged the development of nanoscale science and technology. As yet another reference to Feynman, I argue in this book that there is "plenty of room at the bottom" for modeling and simulation. This point is illustrated with a number of examples: computation of long-range interactions between charged or polarized particles in free space and in heterogeneous media; plasmonic field enhancement by metal layers or cascades of particles; the bandgap structure and light waves in photonic crystals; scattering and enhancement of light in near-field optical microscopy; backward waves, negative refraction and perfect lensing; metamaterials and homogenization. The content of the book lies in the intersection of computational methods and applications, especially to electromagnetic problems on the nanoscale. Whenever


possible, I have tried to give a commonsense explanation of mathematical, physical and computational ideas, hoping that sizeable portions of the text will be accessible and understandable to specialists in diverse areas of nanoscale science and technology: physicists, engineers, chemists, mathematicians, numerical analysts. Many sections in the book should be suitable for graduate students doing interdisciplinary research and for undergraduate students interested in simulation and in nanoscience.

Some of the material, however, is more advanced. For example, in addition to traditional techniques such as the finite element method, the new finite-difference calculus of Flexible Local Approximation MEthods (FLAME) is presented and its applications in colloidal simulation and photonics are discussed. In many cases, the accuracy of FLAME is much higher than that of the standard finite-difference schemes and even of the finite element method. Another more advanced topic is finite element error estimates, with unconventional eigenvalue and singular value accuracy conditions presented in Sect. 3.14. Yet another example is the York–Yang "fast Fourier–Poisson" method, including its versions without Fourier transforms; this method is less well known than its more conventional Ewald counterparts.

Due to the natural constraints of time and of my own expertise, only a limited number of topics in nanoscale simulation could be included. For example, boundary integral methods, T-matrix methods, "rigorous coupled wave analysis," discrete dipole approximations, fast multipole algorithms and other computational techniques appear primarily in the review sections and as alternatives to other techniques that are discussed in much greater detail (Ewald summation, standard and generalized FD, FEM and FDTD). On the application side, subjects that are not discussed in the book (nanotubes, nanodots, nanocomposites, topological properties of periodic structures, and so on) would form a much longer list than the ones that are included (selected topics in molecular dynamics, colloidal systems and photonics).

My goal will be achieved if some of this book's ideas in the intersection of numerical methods and nanoscale applications help to stimulate further analytical, computational and experimental work. Not only for technology, but also for theory and simulation there is indeed "plenty of room at the bottom".

References

[AAJ09]

[Abb82] [Abb83] [ABCM02]

[ABPS02]

[AD03]

[AF98] [AF03] [AFBA11]

[AFR00] [AG84] [AG97] [AG98] [AG99]

Mohammad A. Alsunaidi and Ahmad A. Al-Jabr. A general ADE-FDTD algorithm for the simulation of dispersive structures. IEEE Photonics Technology Letters, 21:817–819, 2009. E. Abbe. The relation of aperture and power in the microscope. Journal of the Royal Microscopical Society, 2:300–309, 1882. E. Abbe. The relation of aperture and power in the microscope (continued). Journal of the Royal Microscopical Society, 3:790–812, 1883. D. N. Arnold, F. Brezzi, B. Cockburn, and L. D. Marini. Unified analysis of discontinuous Galerkin methods for elliptic problems. SIAM J. Numer. Analysis, 39(5):1749–1779, 2002. P. Alotto, A. Bertoni, I. Perugia, and D. Schötzau. Efficient use of the local discontinuous Galerkin method for meshes sliding on a circular boundary. IEEE Trans. on Magn., 38:405–408, 2002. Maria G. Armentano and Ricardo G. Durán. Unified analysis of discontinuous Galerkin methods for elliptic problems. Num. Meth. for Partial Diff. Equations, 19(5):653–664, 2003. R. Albanese and R. Fresa. Upper and lower bounds for local electromagnetic quantities. Int. J. for Numer. Meth. in Eng., 42(3):499–515, 1998. Robert A. Adams and John J. F. Fournier. Sobolev Spaces. Amsterdam; Boston: Academic Press, 2003. Koray Aydin, Vivian E. Ferry, Ryan M. Briggs, and Harry A. Atwater. Broadband polarization-independent resonant light absorption using ultrathin plasmonic super absorbers. Nature Communications, 2:517, 2011. R. Albanese, R. Fresa, and G. Rubinacci. Local error bounds for static and stationary fields. IEEE Trans. Magn., 36(4):1615–1618, 2000. V. M. Agranovich and V. L. Ginzburg. Crystal Optics with Spatial Dispersion, and Excitons. Berlin; New York: Springer-Verlag, 1984. S. Abarbanel and D. Gottlieb. A mathematical analysis of the PML method. Journal of Computational Physics, 134(2):357–363, 1997. S. Abarbanel and D. Gottlieb. On the construction and analysis of absorbing layers in CEM. Applied Numerical Mathematics, 27(4):331–340, 1998. C. Ashcraft and R.G. Grimes. SPOOLES: an object-oriented sparse matrix library. In Proc. 1999 SIAM Conf. Parallel Processing for Scientific Computing, 1999.



648 [AGH02]

[AHCN05]

[AHLT05]

[AJH90]

[AK95] [AK99]

[AL98] [All92] [ALT98]

[Alu11] [ALZ15]

[AM76] [And87] [ANVA+99] [AO00] [AP98]

[App85] [AR90]

[Arf85] [Arn89] [Arn04] [ARS06]

References S. Abarbanel, D. Gottlieb, and J. S. Hesthaven. Long time behavior of the perfectly matched layer equations in computational electromagnetics. Journal of Scientific Computing, 17(1):405–422, 2002. N. Anderson, A. Hartschuh, S. Cronin, and L. Novotny. Nanoscale vibrational analysis of single-walled carbon nanotubes. J. Am. Chem. Soc., 127:2533–2537, 2005. Peter Arbenz, Ulrich L. Hetmaniuk, Richard B. Lehoucq, and Raymond S. Tuminaro. A comparison of eigensolvers for large-scale 3d modal analysis using AMGpreconditioned iterative methods. Int. J. for Numer. Meth. in Eng., 64:204–236, 2005. Todd Arbogast, Jim Douglas, Jr., and Ulrich Hornung. Derivation of the double porosity model of single phase flow via homogenization theory. SIAM J. Math. Anal., 21(4):823–836, 1990. A. Ahagon and T. Kashimoto. 3-dimensional electromagnetic-wave analysis using high-order edge elements. IEEE Trans. Magn., 31(3):1753–1756, 1995. W. Axmann and P. Kuchment. An efficient finite element method for computing spectra of photonic and acoustic band-gap materials - I. Scalar case. J. of Comp. Phys., 150(2):468–481, 1999. C. Ashcraft and J. W. H. Liu. Robust ordering of sparse matrices using multisection. SIAM J. on Matrix Analysis & Appl., 19(3):816–832, 1998. Grégoire Allaire. Homogenization and two-scale convergence. SIAM Journal on Mathematical Analysis, 23(6):1482–1518, 1992. E Allahyarov, H Lowen, and S Trigger. Effective forces between macroions: The cases of asymmetric macroions and added salt. Physical Review E, 57(5, B):5818– 5824, May 1998. A. Alu. First-principles homogenization theory for periodic metamaterials. Phys Rev B, 84:075153, 2011. Andrei Andryieuski, Andrei V Lavrinenko, and Sergei V Zhukovsky. Anomalous effective medium approximation breakdown in deeply subwavelength all-dielectric photonic multilayers. Nanotechnology, 26(18):184001, Apr 2015. Neil W. Ashcroft and N. David Mermin. Solid State Physics. Fort Worth: Saunders College Publishing, 1976. V. V. Andrievskii. On approximation of functions by harmonic polynomials. Mathematics of the USSR-Izvestiya, 51(1):1–13, 1987. M Arndt, O Nairz, J Vos-Andreae, C Keller, G van der Zouw, and A Zeilinger. Wave-particle duality of C-60 molecules. Nature, 401(6754):680–682, 1999. Mark Ainsworth and J. Tinsley Oden. A Posteriori Error Estimation in Finite Element Analysis. John Wiley & Sons, 2000. U. M. Ascher and Linda Ruth Petzold. Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. Society for Industrial & Applied Mathematics, 1998. A. W. Appel. An efficient program for many-body simulation. SIAM J. Sci. Stat. Comput., 6:85–103, 1985. R. Albanese and G. Rubinacci. Magnetostatic field computations in terms of twocomponent vector potentials. International Journal for Numerical Methods in Engineering, 29(3):515–532, 1990. G. Arfken. Mathematical Methods for Physicists. Orlando, FL: Academic Press, 1985. Vladimir Igorevich Arnol’d. Mathematical Methods of Classical Mechanics. New York : Springer-Verlag, 1989. 2nd ed. Axel Arnold. Computer simulations of charged systems in partially periodic systems. PhD thesis, Johannes Gutenberg Universität, 2004. A. Aminian and Y. Rahmat-Samii. Spectral FDTD: a novel technique for the analysis of oblique incident plane wave on periodic structures. IEEE Transactions on Antennas and Propagation, 54(6):1818–1825, 2006.

References [Ars13] [Art80] [AS83]

[AS02]

[ASG16]

[AT13]

[AT17]

[AUS09]

[Axe96] [AYSS11] [AZ98] [BA72]

[BA76] [BA11] [Bab58] [Bab71] [Bad14]

[Bak66]

[Bas99]

[Bas00]

649 F. M. Arscott. Periodic Differential Equations: An Introduction to Mathieu, Lamé, and Allied Functions. Pergamon Press, 1964, 2013. A. M. Arthurs. Complementary Variational Principles. Oxford : Clarendon Press; New York : Oxford University Press, 1980. Milton Abramowitz and Irene Ann Stegun, editors. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. United States Department of Commerce, National Bureau of Standards; Dover Publications, 1964–1983. S. N. Atluri and S. P. Shen. The meshless local Petrov-Galerkin (MLPG) method: A simple & less-costly alternative to the finite element and boundary element methods. CMES – Computer Modeling in Engineering & Sciences, 3(1):11–51, 2002. Wyatt Adams, Mehdi Sadatgol, and Durdu O. Guney. Review of near-field optics and superlenses for sub-diffraction-limited nano-imaging. AIP Advances, 6(10), 2016. Osama AlKhateeb and Igor Tsukerman. A boundary difference method for electromagnetic scattering problems with perfect conductors and corners. IEEE Trans on Antennas and Propagation, 61(10):5117–5126, 2013. Anthony P. Austin and Lloyd N. Trefethen. Trigonometric interpolation and quadrature in perturbed points. SIAM Journal on Numerical Analysis, 55(5):2113– 2122, 2017. Yoav Avitzour, Yaroslav A. Urzhumov, and Gennady Shvets. Wide-angle infrared absorber based on a negative-index plasmonic metamaterial. Phys. Rev. B, 79:045131, 2009. Owe Axelsson. Iterative Solution Methods. Cambridge University Press, 1996. A. Alu, A. D. Yaghjian, R. A. Shore, and M. G. Silveirimha. Causality relations in the homogenization of metamaterials. Phys Rev B, 84:054305, 2011. S. N. Atluri and T. Zhu. A new meshless local Petrov-Galerkin (MLPG) approach in computational mechanics. Computational Mechanics, 22:117–127, 1998. I. Babuška and A. K. Aziz. Survey lectures on the mathematical foundation of the finite element method. In The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations, pages 5–359. Academic Press, New York, 1972. I. Babuška and A.K. Aziz. On the angle condition in the finite element method. SIAM J. Numer. Analysis, 13(2):214–226, 1976. Alexandra Boltasseva and Harry A. Atwater. Low-loss plasmonic metamaterials. Science, 331(6015):290–291, 2011. Ivo Babuška. On the schwarz algorithm in the theory of differential equations of mathematical physics. Tchecosl. Math. J., 8:328–342, 1958. Ivo Babuška. Error bounds for the finite element method. Numer. Math., 16:322– 333, 1971. Zsolt Badics. Trefftz-discontinuous Galerkin and finite element multi-solver technique for modeling time-harmonic EM problems with high-conductivity regions. IEEE Transactions on Magnetics, 50(2), 2014. Nikolai Sergeevitch Bakhvalov. On the convergence of a relaxation method under natural constraints on an elliptic operator. Zhurnal vychislitel’noj matemaki i matematicheskoj fiziki, 6:861–883, 1966. Achim Basermann. Parallel preconditioned solvers for large sparse Hermitian eigenvalue problems. In Jack Dongarra and Vicente Hernandez, editors, Springer Series: Lecture Notes in Computer Science, volume 1573, pages 72–85. Springer, 1999. Achim Basermann. Parallel block ilut preconditioning for sparse eigenproblems and sparse linear systems. Num. Linear Alg. with Appl., 7:635–648, 2000.

650 [BBC78]

[BBO03]

[BCD+11]

[BCMS02]

[BCO94]

[BD92] [BDC03]

[BDD+00]

[Bec00] [BEK93]

[Bel00] [Ber66] [Ber76a] [Ber76b] [Ber78] [Ber80]

[Ber81] [Ber82] [Ber94] [Ber96]

[BFea99]

References V. V. Belousova, I. S. Bomshtein, and B. B. Chashin. On finding the sources of rotational fields in the scalar potential method. Izvestiia Akad. Nauk SSSR, Energetika i Transport (Power Engineering, pages 151–155, 1978. Ivo Babuška, Uday Banerjee, and John E. Osborn. Survey of meshless and generalized finite element methods: A unified approach. Acta Numerica, 12:1–125, 2003. Daniele Boffi, Martin Costabel, Monique Dauge, Leszek Demkowicz, and Ralf Hiptmair. Discrete compactness for the p-version of discrete differential forms. SIAM Journal on Numerical Analysis, 49(1):135–158, 2011. Stefano Boscolo, Claudio Conti, Michele Midrio, and Carlo G. Someda. Numerical analysis of propagation and impedance matching in 2-d photonic crystal waveguides with finite length. J. Lightwave Technol., 20(2):304, 2002. I. Babuška, G. Caloz, and J. E. Osborn. Special finite-element methods for a class of 2nd-order elliptic problems with rough coefficients. SIAM J. Numer. Analysis, 31(4):945–981, 1994. D. J. Bergman and K. J. Dunn. Bulk effective dielectric-constant of a composite with a periodic microgeometry. Phys Rev B, 45(23):13262–13271, 1992. D Boffi, L Demkowicz, and M Costabel. Discrete compactness for p and hp 2D edge finite elements. Mathematical Models & Methods in Applied Sciences, 13(11):1673–1687, 2003. Zhaojun Bai, James Demmel, Jack Dongarra, Axel Ruhe, and Henk van der Vorst, editors. Templates for the Solution of Algebraic Eigenvalue Problems : a Practical Guide. Society for Industrial and Applied Mathematics: Philadelphia, PA, 2000. Thomas L. Beck. Real-space mesh techniques in density-functional theory. Rev. Mod. Phys., 72(4):1041–1080, 2000. Folkmar Bornemann, Bodo Erdmann, and Ralf Kornhuber. Adaptive multilevel methods in three space dimensions. Int. J. for Numer. Meth. Eng., 36:3187–3203, 1993. L Belloni. Colloidal interactions. Journal of Physics – Condensed Matter, 12(46):R549–R587, NOV 20 2000. Stefan Bergman. Approximation of harmonic functions of three variables by harmonic polynomials. Duke Math. J., 33(2):379–387, 1966. D. J. Bergman. Calculation of bounds for some average bulk properties of composite materials. Phys Rev B, 14(10):4304–4312, 1976. D. J. Bergman. Variational bounds on some bulk properties of a 2-phase compositematerial. Phys Rev B, 14(4):1531–1542, 1976. D. J. Bergman. Dielectric-constant of a composite-material – problem in classical physics. Physics reports – Review section of Physics Letters, 43(9):378–407, 1978. D. J. Bergman. Exactly solvable microscopic geometries and rigorous bounds for the complex dielectric-constant of a 2-component composite-material. Physical Review Letters, 44(19):1285–1287, 1980. D. J. Bergman. Bounds for the complex dielectric-constant of a 2-component composite material. Physical Review B, 23(6):3058–3065, 1981. D. J. Bergman. Rigorous bounds for the complex dielectric-constant of a 2component composite. Annals of Physics, 138(1):78–114, 1982. Jean-Pierre Berenger. A perfectly matched layer for the absorption of electromagnetic waves. Journal of Computational Physics, 114(2):185–200, 1994. Jean-Pierre Berenger. Three-dimensional perfectly matched layer for the absorption of electromagnetic waves. Journal of Computational Physics, 127(2):363– 379, 1996. D. Boffi, P. Fernandes, and L. Gastaldi et al. Computational models of electromagnetic resonators: Analysis of edge element approximation. SIAM J. on Numer. Analysis, 36(4):1264–1290, 1999.

References [BFR98]

[BFTT09]

[BG] [BG19] [BGM05]

[BGT82]

[BGTR04]

[BH70]

[BH86] [BHM00] [BIPS95]

[BJ02]

[BK00]

[BKO+96]

[BL58] [BLG94] [BLP78] [BM84a]

[BM84b]

[BM84c]

651 F. Brezzi, L. P. Franca, and A. Russo. Further considerations on residual free bubbles for advective-diffusive equations. Computer Meth. in Appl. Mech. & Eng., 166:25–33, 1998. Guy Baruch, Gadi Fibich, Semyon Tsynkov, and Eli Turkel. Fourth order schemes for time-harmonic wave equations with discontinuous coefficients. Communications in Computational Physics, 5(2-4):442–455, 2009. Rick Beatson and Leslie Greengard. A short course on fast multipole methods. http://www.math.nyu.edu/faculty/greengar/. Daniele Boffi and Lucia Gastaldi. Adaptive finite element method for the Maxwell eigenvalue problem. SIAM Journal on Numerical Analysis, 57(1):478–494, 2019. A. Bossavit, G. Griso, and B. Miara. Modelling of periodic electromagnetic structures bianisotropic materials with memory effects. J. Math. Pures & Appl., 84(7):819–850, 2005. A Bayliss, M Gunzburger, and E Turkel. Boundary conditions for the numerical solution of elliptic equations in exterior regions. SIAM J Appl Math, 42(2):430– 451, 1982. M. Bathe, A. J. Grodzinsky, B. Tidor, and G. C. Rutledge. Optimal linearized Poisson–Boltzmann theory applied to the simulation of flexible polyelectrolytes in solution. J. Chem. Phys., 121(16):7557–7561, 2004. J. H. Bramble and S. R. Hilbert. Estimation of linear functionals on sobolev spaces with applications to fourier transforms and spline interpolations. SIAM J. Numer. Anal., 7:113–124, 1970. J. Barnes and P. Hut. A hierarchical O(N log N ) force-calculation algorithm. Nature, 324:446–449, 1986. William L. Briggs, Van Emden Henson, and Steve F. McCormick. A Multigrid Tutorial. Philadelphia, PA : Society for Industrial and Applied Mathematics, 2000. Ivo Babuška, Frank Ihlenburg, Ellen T Paik, and Stefan A Sauter. A generalized finite element method for solving the Helmholtz equation in two dimensions with minimal pollution. Comput Meth Appl Mech & Eng, 128:325–359, 1995. Eliane Bécache and Patrick Joly. On the analysis of Bérenger’s perfectly matched layers for Maxwell’s equations. ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, 36(1):87–119, 2002. A. Bossavit and L. Kettunen. Yee-like schemes on staggered cellular grids: a synthesis between fit and fem approaches. IEEE Transactions on Magnetics, 36(4):861–867, 2000. T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, and P. Krysl. Meshless methods: an overview and recent developments. Computer Meth. in Appl. Mech. & Eng., 139(1–4):3–47, 1996. G. M. Bell and S. Levine. Statistical thermodynamics of concentrated colloidal solutions II. Transactions of the Faraday Society, 54:785–798, 1958. T. Belytschko, Y. Y. Lu, and L. Gu. Element-free Galerkin methods. Int. J. for Numer. Meth. Eng., 37:229–256, 1994. A. Bensoussan, J. L. Lions, and G. Papanicolaou. Asymptotic Methods in Periodic Media. North Holland, 1978. I. Babuška and A. D. Miller. The post-processing approach in the finite element method, I: Calculations of displacements, stresses and other higher derivatives of the displacements. Int. J. for Numer. Meth. Eng., 20:1085–1109, 1984. I. Babuška and A. D. Miller. The post-processing approach in the finite element method, II: The calculation of stress intensity factors. Int. J. Numer. Meth. Eng., 20:1111–1129, 1984. I. Babuška and A. D. Miller. The post-processing approach in the finite element method, III: A posteriori error estimation and adaptive mesh selection. Int. J. Numer. Meth. Eng., 20:2311–2324, 1984.

652 [BM89] [BM97] [BM08]

[BMM01] [BMS02]

[BN97] [Bof01] [Bof07]

[Bog17] [Boh74] [Boo01] [Bos] [Bos88a] [Bos88b] [Bos90] [Bos91] [Bos92] [Bos98] [Boy01] [BP53] [BP89]

[BP06]

[BP13] [BPE88]

[BPG04a]

References A. Bossavit and I. Mayergoyz. Edge-elements for scattering problems. IEEE Trans. Magn., 25(4):2816–2821, 1989. I. Babuška and J. M. Melenk. The partition of unity method. Int. J. for Numer. Meth. in Eng., 40(4):727–758, 1997. Zsolt Badics and Yoshihiro Matsumoto. Trefftz discontinuous Galerkin methods for time-harmonic electromagnetic and ultrasound transmission problems. International Journal of Applied Electromagnetics and Mechanics, 28(1-2):17–24, 2008. M. Bordag, U. Mohideen, and V.M. Mostepanenko. New developments in the casimir effect. Physics Reports, 353:1–205, 2001. C. L. Bottasso, S. Micheletti, and R. Sacco. The discontinuous Petrov– Galerkin method for elliptic problems. Computer Meth. in Appl. Mech. & Eng., 191(31):3391–3409, 2002. I. Babuška and R. Narasimhan. The babuška-brezzi condition and the patch test: an example. Computer Meth. in Appl. Mech. & Eng., 140(1-2):183–199, 1997. D. Boffi. A note on the de Rham complex and a discrete compactness property. Appl. Math. Lett., 14(1):33–38, 2001. Daniele Boffi. Approximation of eigenvalues in mixed form, Discrete Compactness Property, and application to hp mixed finite elements. Computer Methods in Applied Mechanics and Engineering, 196(37–40):3672–3681, 2007. Robert Bogue. Sensing with metamaterials: a review of recent developments. Sensor Review, 37(3):305–311, 2017. Craig F. Bohren. Light scattering by an optically active sphere. Chemical Physics Letters, 29(3):458–462, 1974. Carl De Boor. A Practical Guide to Splines. New York: Springer, 2001. A. Bossavit. Applied differential geometry: A compendium. A. Bossavit. A rationale for “edge elements” in 3-d fields computations. IEEE Trans. Magn., 24(1):74–79, 1988. A. Bossavit. Whitney forms: A class of finite elements for three-dimensional computations in electromagnetism. IEE Proc. A, 135:493–500, 1988. A. Bossavit. Solving Maxwell equations in a closed cavity, and the question of “spurious modes”. IEEE Trans. Magn., 26(2):702–705, 1990. A. Bossavit. Differential forms and the computation of fields and forces in electromagnetism. European Journal of Mechanics B – Fluids, 10(5):474–488, 1991. A. Bossavit. Edge-element computation of the force-field in deformable-bodies. IEEE Trans. Magn., 28(2):1263–1266, 1992. Alain Bossavit. Computational Electromagnetism: Variational Formulations, Complementarity, Edge Elements. San Diego: Academic Press, 1998. John P. Boyd. Chebyshev and Fourier Spectral Methods. Publisher: Dover Publications, 2001. L. Brillouin and M. Parodi. Wave Propagation in Periodic Structures. Dover, New York, 1953. N. S. Bakhvalov and G. Panasenko. Homogenisation: Averaging Processes in Periodic Media, Mathematical Problems in the Mechanics of Composite Materials. Springer, 1989. M V Berry and S Popescu. Evolution of quantum superoscillations and optical superresolution without evanescent waves. Journal of Physics A: Mathematical and General, 39(22):6965–6977, may 2006. Malcolm M. Bibby and Andrew F. Peterson. Accurate Computation of Mathieu Functions. Morgan & Claypool Publishers, 2013. S. Bassiri, C. H. Papas, and N. Engheta. Electromagnetic wave propagation through a dielectric–chiral interface and through a chiral slab. J. Opt. Soc. Am. A, 5(9):1450– 1459, 1988. E. Becache, P. G. Petropoulos, and S. D. Gedney. On the long-time behavior of unsplit perfectly matched layers. IEEE Transactions on Antennas and Propagation, 52(5):1335–1342, 2004.

References [BPG04b]

[BPX90] [BR78a] [BR78b] [BR79] [BR99] [BR01] [BR03]

[Bra77] [Bra93]

[Bre74] [BRGW82] [Bri60] [Bri92]

[BS72]

[BS79]

[BS92]

[BS94] [BS01]

[BS02] [BS03] [BS05]

653 Eliane Bécache, Peter G. Petropoulos, and Stephen D. Gedney. On the long-time behavior of unsplit perfectly matched layers. IEEE Trans. Antennas and Propagation, 52(5):1335–1342, 2004. J. H. Bramble, J. E. Pasciak, and J. Xu. Parallel multilevel preconditioners. Math. Comp., 55:1–22, 1990. I. Babuška and W.C. Rheinboldt. A-posteriori error estimates for the finite element method. Int. J. for Numer. Meth. in Eng., 12(10):1597–1615, 1978. I. Babuška and W.C. Rheinboldt. Error estimates for adaptive finite element computations. SIAM J. on Numer. Analysis, 15(4):736–754, 1978. I. Babuška and W.C. Rheinboldt. On the reliability and optimality of the finite element method. Computers & Structures, 10:87–94, 1979. G. Binning and H. Rohrer. In touch with atoms. Reviews of Modern Physics, 71:S324–S330, 1999. Roland Becker and Rolf Rannacher. An optimal control approach to a posteriori error estimation in finite element methods. Acta Numerica, 10:1–102, 2001. Wolfgang Bangerth and Rolf Rannacher. Adaptive Finite Element Methods for Differential Equations. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 2003. Achi Brandt. Multi-level adaptive solutions to boundary-value problems. Mathematics of Computation, 31(138):333–390, apr 1977. James H. Bramble. Multigrid Methods. Harlow, Essex, England: Longman Scientific & Technical; New York: Copubished in the U.S. with J. Wiley & Sons, 1993. F. Brezzi. On the existence, uniqueness and approximation of saddle-point problems arising from lagrange multipliers. R.A.I.R.O., 8:129–151, 1974. G. Binning, H. Rohrer, Ch. Gerber, and E. Weibel. Surface studies by scanning tunneling microscopy. Phys. Rev. Lett., 49(1):57–61, 1982. Léon Brillouin. Wave Propagation and Group Velocity. Academic Press, 1960. J. L. Britton, editor. Collected Works of A.M. Turing. Pure Mathematics. With a section on Turing’s statistical work by I. J. Good. Amsterdam, etc.: North-Holland, 1992. B. V. Bokut’ and A. N. Serdyukov. On the phenomenological theory of natural optical activity. Soviet Physics JETP. [Zh. Eksp. Teor. Fiz. 61, 1808–1813, 1971], 34(5):962–964, 1972. R. E. Bank and A. H. Sherman. The use of adaptive grid refinement for badly behaved elliptic partial differential equations. In R. Vichnevetsky and R. S. Stepleman, editors, Advances in Computer Methods for Partial Differential Equations III, pages 33–39. IMACS, New Brunswick, 1979. D. J. Bergman and D. Stroud. Physical properties of macroscopically inhomogeneous media. In Solid State Physics: Advances in Research and Applications, Vol 46, volume 46 of Solid State Physics: Advances in Research and Applications, pages 147–269. Academic Press, 1992. Ivo Babuška and Manil Suri. The p and h-p versions of the Finite Element Method, basic principles and properties. SIAM Review, 36:578–632, 1994. Ivo Babuška and Theofanis Strouboulis. The Finite Element Method and Its Reliability. Oxford, [England] : Clarendon Press; New York: Oxford University Press, 2001. S. C. Brenner and L. R. Scott. The Mathematical Theory of Finite Element Methods. New York: Springer, 2002. João Pedro A. Bastos and Nelson Sadowski. Electromagnetic Modeling by Finite Element Methods. New York: Marcel Dekker, 2003. Pavel A. Belov and Constantin R. Simovski. Homogenization of electromagnetic crystals formed by uniaxial resonant scatterers. Phys. Rev. E, 72:026615, 2005.

654 [BS08] [BS10] [BS19]

[BSB96]

[BSS97]

[BSS+01]

[BSS+17] [BST04]

[BSTV07] [BSU+94]

[BSv+08]

[BSW82] [BT80] [BT05]

[BTT11]

[BTT18]

[Buh12]

[But87] [But03] [BV82] [BV83]


[KMM09]

[KMPS16]

[KMS07]

[KMSS03]

[KMvHW19] [Kny98] [Kny01]

[KNY06] [KO72] [Kom75]

[Kon81]

[Kon86] [Kor94] [Kot85] [KR54] [Kra57] [KS88]

[KS93]

[KSB08]

[KSE+15]

672 [KSTW14]

[KT93] [KTH00]

[Kuc93] [Kum03]

[Kˇri92] [KV95] [KW16] [KWC80] [KWKS08]

[Lad69] [Lak06] [Lam91] [Lam99] [Lam12]

[LDFH05] [LdSB+13]

[LED+06]

[Lek96] [Leo06] [LeV96] [LeV02a] [Lev02b]

References Fritz Kretzschmar, Sascha M. Schnepp, Igor Tsukerman, and Thomas Weiland. Discontinuous Galerkin methods with Trefftz approximations. Journal of Computational and Applied Mathematics, 270:211–222, 2014. Fourth International Conference on Finite Element Methods in Engineering and Sciences (FEMTEC 2013). A. Konrad and I. A. Tsukerman. Comparison of high-frequency and low-frequency electromagnetic-field analysis. Journal de Physique III, 3(3):363–371, 1993. M. Koshiba, Y. Tsuji, and M. Hikari. Time-domain beam propagation method and its application to photonic crystal circuits. IEEE J. of Lightwave Technology, 18(1):102–110, 2000. P. A. Kuchment. Floquet Theory for Partial Differential Equations. Springer, 1993. Manoj Kumar. A new finite difference method for a class of singular two-point boundary value problems. Applied Mathematics and Computation, 143(2-3):551– 557, 2003. M. Kˇrižek. On the maximum angle condition for linear tetrahedral elements. SIAM J. Numer. Analysis, 29(2):513–520, 1992. Uwe Kreibig and Michael Vollmer. Optical Properties of Metal Clusters. Springer, 1995. J. Kaschke and M. Wegener. Optical and infrared helical metamaterials. Nanophotonics, 5:510–523, 2016. M. Kerker, D. S. Wang, and H. Chew. Surface enhanced raman-scattering (sers) by molecules adsorbed at spherical particles. Applied Optics, 19:3373–3388, 1980. D.-H. Kwon, D. H. Werner, A. V. Kildishev, and V. M. Shalaev. Material parameter retrieval procedure for general bi-isotropic metamaterials and its application to optical chiral negative-index metamaterial design. Opt. Express, 16:11822, 2008. O. A. Ladyzhenskaya. The Mathematical Theory of Viscous Incompressible Flows. Gordon and Breach, London, 1969. Akhlesh Lakhtakia. Boundary-value problems and the validity of the post constraint in modern electromagnetism. Optik, 117(4):188–192, 2006. John Denholm Lambert. Numerical methods for Ordinary Differential Systems: the Initial Value Problem. Chichester; New York: Wiley, 1991. S. K. Lamoreaux. Resource letter CF-1: Casimir force. American Journal of Physics, 67(10):850–861, 1999. Steve K. Lamoreaux. The casimir force and related effects: The status of the finite temperature correction and limits on new long-range forces. Annual Review of Nuclear and Particle Science, 62(1):37–56, 2012. B. Lombardet, L. A. Dunbar, R. Ferrini, and R. Houdré. Fourier analysis of Bloch wave propagation in photonic crystals. J. Opt. Soc. Am. B, 22:1179–1190, 2005. Felix J. Lawrence, C. Martijn de Sterke, Lindsay C. Botten, R. C. McPhedran, and Kokou B. Dossou. Modeling photonic crystal interfaces and stacks: impedancebased approaches. Adv. Opt. Photon., 5(4):385–455, Dec 2013. Stefan Linden, Christian Enkrich, Gunnar Dolling, Matthias W. Klein, Jiangfeng Zhou, Thomas Koschny, Costas M. Soukoulis, Sven Burger, Frank Schmidt, and Martin Wegener. Photonic metamaterials: Magnetism at optical frequencies. IEEE J. of Selected Topics in Quantum Electronics, 12(6):1097–1105, 2006. J. Lekner. Optical properties of isotropic chiral media. Pure and Applied Optics: Journal of the European Optical Society, Part A, 5(4):417–443, 1996. Ulf Leonhardt. Optical conformal mapping. Science, 312(5781):1777–1780, 2006. Randall J. LeVeque. Numerical Methods for Conservation Laws. Birkhauser, 1996. Randall J. LeVeque. Finite Volume Methods for Hyperbolic Problems. Cambridge [England]; New York: Cambridge University Press, 2002. Yan Levin. Electrostatic correlations: from plasma to biology. Reports on Progress in Physics, 65(11):1577–1632, 2002.

References [Lew47] [LGG13]

[Lie93a] [Lie93b] [Lif56] [Lim03]

[Liu97] [Liu02] [LJJP02]

[LJZ95] [LLN03]

[LLP84] [LLR18]

[LM03]

[LMLW17]

[LMO13]

[LR67]

[LR80] [LS15]

[LS18]

[LSB03]

673 L. Lewin. The electrical constants of a material loaded with spherical particles. Proc. Inst. Elec. Eng., 94:65–68, 1947. Yan Liu, Sébastien Guenneau, and Boris Gralak. Artificial dispersion via highorder homogenization: magnetoelectric coupling and magnetism from dielectric layers. Proc. R. Soc. A, 469:2013.0240, 2013. A. Liebsch. Surface-plasmon dispersion and size dependence of Mie resonance: Silver versus simple metals. Phys. Rev. B, 48(15):11317–11328, 1993. A. Liebsch. Surface plasmon dispersion of Ag. Phys. Rev. Lett., 71(1):145–148, 1993. E. M. Lifshitz. The theory of molecular attractive forces between solids. Sov. Phys. JETP, 2:73–83, 1956. T. C. Lim. The relationship between Lennard-Jones (12-6) and Morse potential functions. Zeitschrift für Naturforschung A. A Journal of Physical Sciences, 58(11):615–617, Nov 2003. Q. H. Liu. The PSTD algorithm: A time-domain method requiring only two cells per wavelength. Microwave and Optical Technology Letters, 15(3):158–165, 1997. G. R. Liu. Mesh Free Methods: Moving Beyond the Finite Element Method. CRC Press, 2002. See Chapter 7 for Meshless Local Petrov-Galerkin method. Chiyan Luo, Steven G. Johnson, J. D. Joannopoulos, and J. B. Pendry. All-angle negative refraction without negative effective index. Phys. Rev. B, 65(20):201104, May 2002. W. Liu, S. Jun, and Y. Zhang. Reproducing kernel particle methods. Int. J. Numer. Meth. Fluids, 20:1081–1106, 1995. Larry A. Lambe, Richard Luczak, and John W. Nehrbass. A new finite difference method for the Helmholtz equation using symbolic computation. International Journal of Computational Engineering Science, 4(1):121–144, 2003. L. D. Landau, E. M. Lifshitz, and L. P. Pitaevskii. Electrodynamics of Continuous Media. Butterworth-Heinemann; 2 edition, 1984. Hao Li, Pierre Ladeveze, and Herve Riou. On wave based weak Trefftz discontinuous Galerkin approach for medium-frequency heterogeneous Helmholtz problem. Computer Methods in Applied Mechanics and Engineering, 328:201–216, 2018. A. Lakhtakia and G. Mulholland. On two numerical techniques for light scattering by dielectric agglomerated structures. J. of Research of the Nat. Inst. of Standards and Tech., 98(6):699–716, 2003. Xinrui Lei, Lei Mao, Yonghua Lu, and Pei Wang. Revisiting the effective medium approximation in all-dielectric subwavelength multilayers: Breakdown and rebuilding. Phys. Rev. B, 96:035439, 2017. Zhaofeng Li, Mehmet Mutlu, and Ekmel Ozbay. Chiral metamaterials: from optical activity and negative refractive index to asymmetric transmission. Journal of Optics, 15(2):023001, jan 2013. M. L. Levin and S. M. Rytov. Teoriia Ravnovesnykh Teplovykh Fluktuatsii v Elektrodinamike. (Theory of steady-state thermal fluctuations in electrodynamics.). Moskva, Nauka, 1967. R. E. Lynch and J. R. Rice. A high-order difference method for differential equations. Math. Comp., 34:333–372, 1980. Jie Li and Balasubramaniam Shanker. Time-dependent Debye–Mie series solutions for electromagnetic scattering. IEEE Transactions on Antennas and Propagation, 63(8):3644–3653, 2015. R. Lipton and B. Schweizer. Effective Maxwell’s equations for perfectly conducting split ring resonators. Archive for Rational Mechanics and Analysis, 229(3):1197–1221, 2018. Kuiru Li, Mark I. Stockman, and David J. Bergman. Self-similar chain of metal nanospheres as an efficient nanolens. Phys. Rev. Lett., 91(22):227402, 2003.

674 [LSB06] [LSC91] [LSM+08] [LSPV05] [LSSP10]

[LT09]

[LVH04]

[LVV89] [LW95]

[lX95] [LYX06] [LZCS03] [LZK+10]

[LZTZ12] [LZZ17] [Mül78] [MA05]

[Mac91] [Mai07] [Mal99] [Man45] [Man47] [Man50] [Map50] [Mar08]

References Kuiru Li, Mark I. Stockman, and David J. Bergman. Li, Stockman, and Bergman reply. Phys. Rev. Lett., 97(7):079702, 2006. J. F. Lee, D. K. Sun, and Z. J. Cendes. Tangential vector finite-elements for electromagnetic-field computation. IEEE Trans. Magn., 27(5):4032–4035, 1991. N. I. Landy, S. Sajuyigbe, J. J. Mock, D. R. Smith, and W. J. Padilla. Perfect metamaterial absorber. Phys. Rev. Lett., 100:207402, 2008. V. Lucarini, J.J. Saarinen, K.-E. Peiponen, and E.M. Vartiainen. Kramers-Kronig Relations in Optical Materials Research. Springer, 2005. Xianliang Liu, Tatiana Starr, Anthony F. Starr, and Willie J. Padilla. Infrared spatial and frequency selective metamaterial with near-unity absorbance. Phys. Rev. Lett., 104:207403, 2010. Zhili Lin and Lars Thylen. On the accuracy and stability of several widely used FDTD approaches for modeling lorentz dielectrics. IEEE Transactions on Antennas and Propagation, 57(10):3378–3381, 2009. Domenico Lahaye, Stefan Vandewalle, and Kay Hameyer. An algebraic multilevel preconditioner for field-circuit coupled problems. J. Comput. Appl. Math., 168(12):267–275, 2004. Akhlesh Lakhtakia, Vijay K Varadan, and Vasundara V Varadan. Time-Harmonic Electromagnetic Fields in Chiral Media. Springer, 1989. P. T. S. Liu and J. P. Webb. Analysis of 3d microwave cavities using hierarchal vector finite elements. IEE Proceedings - Microwaves, Antennas and Propagation, 142(5):373–378, 1995. Y. l. Xu. Electromagnetic scattering by an aggregate of spheres. Appl. Opt., 34:4573–4588, 1995. Zhipeng Li, Zhilin Yang, and Hongxing Xu. Comment on “Self-similar chain of metal nanospheres as an efficient nanolens”. Phys. Rev. Lett., 97(7):079701, 2006. Jensen Li, Lei Zhou, C. T. Chan, and P. Sheng. Photonic band gap from a stack of positive and negative index materials. Phys. Rev. Lett., 90(8):083901, 2003. Zhaofeng Li, Rongkuo Zhao, Thomas Koschny, Maria Kafesaki, Kamil Boratay Alici, Evrim Colak, Humeyra Caglayan, Ekmel Ozbay, and C. M. Soukoulis. Chiral metamaterials with negative refractive index based on four u split ring resonators. Applied Physics Letters, 97(8):081901, 2010. A. Q. Liu, W. M. Zhu, D. P. Tsai, and N. I. Zheludev. Micromachined tunable metamaterials: a review. Journal of Optics, 14(11, SI), NOV 2012. Guixin Li, Shuang Zhang, and Thomas Zentgraf. Nonlinear photonic metasurfaces. Nature Reviews Materials, 2(5), May 2017. W. Müller. Analytic torsion and R-torsion of Riemannian manifolds. Advances in Mathematics, 28:233–305, 1978. Stefan A. Maier and Harry A. Atwater. Plasmonics: Localization and guiding of electromagnetic energy in metal/dielectric structures. Journal of Applied Physics, 98:011101, 2005. Daniel W. Mackowski. Analysis of radiative scattering for multiple sphere configurations. Proc. Royal Soc. London A, 433:599–614, 1991. Stefan A. Maier. Plasmonics: Fundamentals and Applications. Springer, 2007. Stéphane Mallat. A Wavelet Tour of Signal Processing (Wavelet Analysis & Its Applications). Academic Press, 1999. See p. 28 for the Poisson summation formula. L. I. Mandelshtam. Group velocity in crystalline arrays. Zh. Eksp. Teor. Fiz., 15:475–478, 1945. L. I. Mandelshtam. Polnoe Sobranie Trudov, v. 2. Akademiia Nauk SSSR, 1947. L. I. Mandelshtam. Polnoe Sobranie Trudov, v. 5. Akademiia Nauk SSSR, 1950. C. B. Maple. The Dirichlet problem: bound at a point for the solution and its derivatives. Quart. Appl. Math., 8:213–228, 1950. V. A. Markel. Can the imaginary part of permeability be negative? Phys Rev E, 78:026608, 2008.

References [Mar10] [Mar16a]

[Mar16b]

[Mar16c] [Mar18]

[Mas92] [Mat97]

[Mat00]

[May03] [Maz57] [Maz05] [MB96]

[MB05] [MBW06]

[McC67] [McC89]

[McL64] [MCL18] [MD01] [MDG14]

[MDH+99]

[MEHV02]

675 Vadim A. Markel. On the current-driven model in the classical electrodynamics of continuous media. J Phys: Condensed Matter, 22:485401, 2010. Vadim A. Markel. Introduction to the Maxwell Garnett approximation: tutorial. Journal of the Optical Society of America A – Optics Image Science and Vision, 33(7):1244–1256, 2016. Vadim A. Markel. Maxwell Garnett approximation (advanced topics): tutorial. Journal of the Optical Society of America A – Optics Image Science and Vision, 33(11):2237–2255, 2016. Vadim A. Markel. Private communication, 2016. Vadim A. Markel. External versus induced and free versus bound electric currents and related fundamental questions of the classical electrodynamics of continuous media: discussion. J. Opt. Soc. Am. A, 35(10):1663–1673, 2018. G. Dal Maso. An introduction to -convergence. Birkhauser, 1992. C. Mattiussi. An analysis of finite volume, finite element, and finite difference methods using some concepts from algebraic topology. J. of Comp. Phys., 133(2):289–309, 1997. Claudio Mattiussi. The finite volume, finite element, and finite difference methods as numerical methods for physical field problems. In Peter W. Hawkes, editor, Advances in Imaging and Electron Physics, volume 113, pages 1–146. Elsevier, 2000. I. D. Mayergoyz. Mathematical Models of Hysteresis and Their Applications. Amsterdam; Boston: Elsevier Academic Press, 2003. P. Mazur. On Statistical Mechanics and Electromagnetic Properties of Matter, chapter 10, pages 309–360. John Wiley & Sons, Ltd, 1957. Martial Mazars. Lekner summations and Ewald summations for quasi-twodimensional systems. Molecular Physics, 103(9):1241–1260, 2005. J. M. Melenk and I. Babuška. The partition of unity finite element method: Basic theory and applications. Comput. Methods Appl. Mech. Engrg., 139:289–314, 1996. D. O. S. Melville and R. J. Blaikie. Super-resolution imaging through a planar silver layer. Optics Express, 13(6):2127–2134, 2005. Graeme W Milton, Marc Briane, and John R Willis. On cloaking for elasticity and physical equations with a transformation invariant form. New Journal of Physics, 8(10):248–248, Oct 2006. Charles W. McCutchen. Superresolution in microscopy and the abbe resolution limit. J. Opt. Soc. Am., 57(10):1190–1192, 1967. Stephen Fahrney McCormick. Multilevel Adaptive Methods for Partial Differential Equations. Philadelphia, PA : Society for Industrial and Applied Mathematics, 1989. N. W. McLachlank. Theory and Application of Mathieu Functions. Dover Publications, 1964. Wei Ma, Feng Cheng, and Yongmin Liu. Deep-learning-enabled on-demand design of chiral metamaterials. ACS Nano, 12:6326–6334, 2018. P. Monk and L. Demkowicz. Discrete compactness and the approximation of Maxwell’s equations in R3 . Math. of Comp., 70(234):507–523, 2001. A. Modave, E. Delhez, and C. Geuzaine. Optimizing perfectly matched layers in discrete contexts. International Journal for Numerical Methods in Engineering, 99(6):410–437, 2014. S. Moskow, V. Druskin, T. Habashy, P. Lee, and S. Davydycheva. A finite difference scheme for elliptic equations with rough coefficients using a Cartesian grid nonconforming to interfaces. SIAM J. Numer. Analysis, 36(2):442–464, 1999. Esteban Moreno, Daniel Erni, Christian Hafner, and Rüdiger Vahldieck. Multiple multipole method with automatic multipole setting applied to the simulation of surface plasmons in metallic nanostructures. Opt. Soc. Am. A, 19(1):101–111, 2002.

676 [Mei67] [Mel99] [Mer70] [Mer04] [Mes02] [Meu07] [MF05] [MFZ05a] [MFZ+05b]

[MGKH06]

[MGS01] [MHP11]

[Mic94] [Mic00] [Mik64] [Mik65] [Mil70] [Mil94] [Mil01] [Mil02] [Mil04] [Min03] [Mit89] [Mit92] [MJ19]

References Günter Meinardus. Approximation of Functions: Theory and Numerical Methods. Springer, 1967. J. M. Melenk. Operator adapted spectral element methods I: harmonic and generalized harmonic polynomials. Numer. Math., 84:35–69, 1999. E. Merzbacher. Quantum Mechanics. Wiley, New York, 1970. R. Merlin. Analytical solution of the almost-perfect-lens problem. Applied Physics Letters, 84(8):1290–1292, 2004. René Messina. Image charges in spherical geometry: Application to colloidal systems. The J. of Chem. Phys., 117(24):11062–11074, 2002. Gérard Meunier, editor. The Finite Element Method for Electromagnetic Modeling. ISTE Publishing Company, 2007. Cesar Monzon and D. W. Forester. Negative refraction and focusing of circularly polarized waves in optically active media. Phys. Rev. Lett., 95:123904, Sep 2005. Isaak D. Mayergoyz, Donald R. Fredkin, and Zhenyu Zhang. Electrostatic (plasmon) resonances in nanoparticles. Physical Review B, 72(15):155412, 2005. R. Moussa, S. Foteinopoulou, Lei Zhang, G. Tuttle, K. Guven, E. Ozbay, and C. M. Soukoulis. Negative refraction and superlens behavior in a two-dimensional photonic crystal. Physical Review B (Condensed Matter and Materials Physics), 71(8):085106, 2005. R. Meisels, R. Gajic, F. Kuchar, and K. Hingerl. Negative refraction and flat-lens focusing in a 2d square-lattice photonic crystal at microwave and millimeter wave frequencies. Opt. Express, 14:6766–6777, 2006. J. M. Melenk, K. Gerdes, and C. Schwab. Fully discrete hp-finite elements: fast quadrature. Comput. Methods Appl. Mech. Engrg., 190:4339–4364, 2001. A. Moiola, R. Hiptmair, and I. Perugia. Plane wave approximation of homogeneous helmholtz solutions. Zeitschrift für angewandte Mathematik und Physik, 62(5):809, 2011. Ronald E. Mickens. Nonstandard Finite Difference Models of Differential Equations. Singapore; River Edge, N.J.: World Scientific, 1994. Ronald E. Mickens, editor. Applications of Nonstandard Finite Difference Schemes. Singapore; River Edge, N.J.: World Scientific, 2000. S. G. Mikhlin. Variational Methods in Mathematical Physics. Oxford, New York, Pergamon Press, 1964. S. G. Mikhlin. The Problem of the Minimum of a Quadratic Functional. San Francisco, Holden-Day, 1965. W. E. Milne. Numerical Solution of Differential Equations. New York, Dover Publications, 1970. P. W. Milonni. The Quantum Vacuum: An Introduction to Quantum Electrodynamics. Boston : Academic Press, 1994. K. A. Milton. The Casimir Effect: Physical Manifestations of Zero-point Energy. World Scientific, 2001. Graeme Milton. The Theory of Composites. Cambridge University Press: Cambridge; New York, 2002. P. W. Milonni. Fast Light, Slow Light and Left-Handed Light. Taylor & Francis, 2004. J. R. Minkel. Left-handed materials debate heats up. Phys. Rev. Focus, 9:755–760, 2003. William F. Mitchell. A comparison of adaptive refinement techniques for elliptic problems. ACM Trans. Math. Softw., 15(4):326–347, 1989. William F. Mitchell. Optimal multilevel iterative methods for adaptive grids. SIAM J. on Sci & Stat. Computing, 13(1):146–167, 1992. M Mansuripur and P K Jakobsen. An approach to constructing super oscillatory functions. Journal of Physics A: Mathematical and Theoretical, 52(30):305202, jul 2019.

References

677

[MKS+18]

Matthias Moeferdt, Thomas Kiel, Tobias Sproll, Francesco Intravaia, and Kurt Busch. Plasmonic modes in nanowire dimers: A study based on the hydrodynamic Drude model including nonlocal and nonlinear effects. Physical Review B, 97(7):075431, 2018. Michelle Duval Malinsky, K. Lance Kelly, George C. Schatz, and Richard P. Van Duyne. Nanosphere lithography: effect of substrate on the localized surface plasmon resonance spectrum of silver nanoparticles. J. Phys. Chem. B, 105:2343–2350, 2001. C. Moler and C. Van Loan. Nineteen dubious ways to compute the exponential of a matrix. SIAM Review, 20:801–836, 1978. C. Moler and C. Van Loan. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review, 45(1):3–49, 2003. Tom G Mackay and Akhlesh Lakhtakia. Electromagnetic Anisotropy and Bianisotropy: A Field Guide. World Scientific, 2010. D. Mehtani, N. Lee, R. D. Hartschuh, A. Kisliuk, M. D. Foster, A. P. Sokolov, and J. F. Maguire. Nano-Raman spectroscopy with side-illumination optics. Journal of Raman Spectroscopy, 36(11):1068–1075, 2005. D. Mehtani, N. Lee, R. D. Hartschuh, A. Kisliuk, M. D. Foster, A. P. Sokolov, ˇ F. Cajko, and I. Tsukerman. Optical properties and enhancement factors of the tips for apertureless near-field optics. Journal of Optics A: Pure and Applied Optics, 8:S183–S190, 2006. D. W. Mackowski and M. I. Mishchenko. Calculation of the t matrix and the scattering matrix for ensembles of spheres. J. Optic. Soc. Amer. A, 13:2266–2277, 1996. Agnès Maurel and Jean-Jacques Marigo. Sensitivity of a dielectric layered structure on a scale below the periodicity: A fully local homogenized model. Phys. Rev. B, 98:024306, 2018. Ricardo Marqués, Francisco Medina, and Rachid Rafii-El-Idrissi. Role of bianisotropy in negative permeability and left-handed metamaterials. Phys. Rev. B, 65:144440, 2002. P. Mazur and B. R. A. Nijboer. On the statistical mechanics of matter in an electromagnetic field. I: Derivation of the Maxwell equations from electron theory. Physica, 19(1):971–986, 1953. J. Mahanty and B.W. Ninham. Dispersion Forces. London; New York: Academic Press, 1976. Graeme W. Milton and Nicolae-Alexandru P. Nicorovici. On the cloaking effects associated with anomalous localized resonance. Proc. R. Soc. Lond. A, 2006. Ahmad Mohammadi, Hamid Nadgaran, and Mario Agio. Contour-path effective permittivities for the two-dimensional finite-difference time-domain method. Opt. Express, 13(25):10367–10381, 2005. R. C. McPhedran, N. A. Nicorovici, and L. C. Botten. Neumann series and lattice sums. J. of Math. Phys., 46(8):083509, 2005. Graeme W. Milton, Nicolae-Alexandru P. Nicorovici, Ross C. McPhedran, and Viktor A. Podolskiy. A proof of superlensing in the quasistatic regime, and limitations of superlenses in this regime due to anomalous localized resonance. Proc. R. Soc. Lond. A, 461(2064):3999–4034, 2005. Pedro Morin, Ricardo H. Nochetto, and Kunibert G. Siebert. Convergence of adaptive finite element methods. SIAM Rev., 44(4):631–658, 2002. Peter Monk. Finite Element Methods for Maxwell’s Equations. Oxford: Clarendon Press, 2003, 2003. Alexander Moroz. Metallo-dielectric diamond and zinc-blende photonic crystals. Phys. Rev. B, 66(11):115109, 2002. Andrea Moiola and Ilaria Perugia. A space-time Trefftz discontinuous Galerkin method for the acoustic wave equation in first-order formulation. Numerische Mathematik, 138(2):389–435, Feb 2018.

[MKSD01]

[ML78] [ML03] [ML10] [MLH+05]

[MLH+06]

[MM96]

[MM18]

[MMREI02]

[MN53]

[MN76] [MN06] [MNA05]

[MNB05] [MNMP05]

[MNS02] [Mon03] [Mor02] [MP18]

678 [MPC+94]

[MPG+18]

[MPT16] [MRB+93]

[MRB+97]

[MRIL10]

[MS73] [MS03]

[MS10] [MS12]

[MSW80]

[MT97] [MT98]

[MT13] [MT16]

[MT20] [MTC17]

[MTL02] [MTL06]

References K. K. Mei, R. Pous, Z. Chen, Y. W. Liu, and M. D. Prouty. Measured equation of invariance: A new concept in field computation. IEEE Trans. Antennas Propagat., 42:320–327, 1994. Martin McCall, John B Pendry, Vincenzo Galdi, Yun Lai, S A R Horsley, Jensen Li, Jian Zhu, Rhiannon C Mitchell-Thomas, Oscar Quevedo-Teruel, Philippe Tassin, Vincent Ginis, Enrica Martini, Gabriele Minatti, Stefano Maci, Mahsa Ebrahimpouri, Yang Hao, Paul Kinsler, Jonathan Gratus, Joseph M Lukens, Andrew M Weiner, Ulf Leonhardt, Igor I Smolyaninov, Vera N Smolyaninova, Robert T Thompson, Martin Wegener, Muamer Kadic, and Steven A Cummer. Roadmap on transformation optics. Journal of Optics, 20(6):063001, May 2018. K G Makris, D G Papazoglou, and S Tzortzakis. Invariant superoscillatory electromagnetic fields in 3d-space. Journal of Optics, 19(1):014003, dec 2016. R. D. Meade, A. M. Rappe, K. D. Brommer, J. D. Joannopoulos, and O. L. Alerhand. Accurate theoretical analysis of photonic band-gap materials. Phys. Rev. B, 48(11):8434–8437, 1993. R. D. Meade, A. M. Rappe, K. D. Brommer, J. D. Joannopoulos, and O. L. Alerhand. Erratum: Accurate theoretical analysis of photonic band-gap materials [Phys. Rev. B 48, 8434 (1993)]. Phys. Rev. B, 55(23):15942, Jun 1997. Christoph Menzel, Carsten Rockstuhl, Rumen Iliew, and Falk Lederer. High symmetry versus optical isotropy of a negative-index metamaterial. Physical Review B, 81:195123, 2010. C. B. Moler and G. W. Stewart. An algorithm for generalized matrix eigenvalue problems. SIAM J. Numer. Analysis, 10(2):241–256, 1973. Peter Markoš and C. M. Soukoulis. Transmission properties and effective electromagnetic parameters of double negative metamaterials. Opt. Express, 11(7):649– 661, Apr 2003. V. A. Markel and J. C. Schotland. On the sign of refraction in anisotropic nonmagnetic media. J Optics, 12:015104, 2010. Vadim A. Markel and John C. Schotland. Homogenization of Maxwell’s equations in periodic composites: Boundary effects and dispersion relations. Phys. Rev. E, 85:066603, 2012. Josef Meixner, Friedrich Wilhelm Schäfke, and Gerhard Wolf. Mathieu Functions and Spheroidal Functions and Their Mathematical Foundations: Further Studies. Springer Verlag, 1980. V. M. Mostepanenko and N. N. Trunov. The Casimir Effect and Its Applications. Clarendon Press, 1997. M. I. Mishchenko and L. D. Travis. Capabilities and limitations of a current fortran implementation of the T-matrix method for randomly oriented, rotationally symmetric scatterers. J. Quant. Spectrosc. Radiat. Transfer, 60:309–324, 1998. Vadim A. Markel and Igor Tsukerman. Current-driven homogenization and effective medium parameters for finite samples. Phys. Rev. B, 88:125131, 2013. Vadim A. Markel and Igor Tsukerman. Applicability of effective medium description to photonic crystals in higher bands: Theory and numerical analysis. Phys. Rev. B, 93:224202, 2016. Vadim A. Markel and Igor Tsukerman. Current-driven models and magnetic effects in metamaterials [working title; in preparation]. 2020. Shampy Mansha, Igor Tsukerman, and Yidong Chong. The FLAME-slab method for electromagnetic wave scattering in aperiodic slabs. Optics Express, 25:32602– 32617, 2017. M. I. Mishchenko, L. D. Travis, and A. A. Lacis. Scattering, Absorption, and Emission of Light by Small Particles. Cambridge University Press, 2002. M. I. Mishchenko, L. D. Travis, and A. A. Lacis. Multiple Scattering of Light by Particles: Radiative Transfer and Coherent Backscattering. Cambridge University Press, 2006.

References [MTM96]

[MTT12]

[Mun00] [Mur81]

[Mur98] [MV97]

[MW79] [MW17] [MZ95] [Néd80] [Néd86] [Nam99] [NBF18] [Neh96] [NGG+04] [NGS00] [NGS16]

[Nik16] [Nik17] [NIS] [NMB95a] [NMB95b] [NMM94] [NO99] [NO00]

679 M. I. Mishchenko, L. D. Travis, and D. W. Mackowski. T-matrix computations of light scattering by nonspherical particles: A review. J. Quant. Spectrosc. Radiat. Transfer, 55:535–575, 1996. M. Medvinsky, S. Tsynkov, and E. Turkel. The method of difference potentials for the Helmholtz equation using compact high order schemes. Journal of Scientific Computing, 53(1):150–193, 2012. E. H. Mund. A short survey on preconditioning techniques in spectral calculations. Applied Numer. Math., 33:61–70, 2000. G. Mur. Absorbing boundary conditions for the finite-difference approximation of the time-domain electromagnetic-field equations. IEEE Transactions on Electromagnetic Compatibility, EMC-23(4):377–382, Nov 1981. G. Mur. The fallacy of edge elements. IEEE Trans. Magn., 34(5):3244–3247, 1998. S. Moskow and M. Vogelius. First order corrections to the homogenized eigenvalues of a periodic composite medium: A convergence proof. Proceedings of the Royal Society of Edinburg, 127A:1263–1299, 1997. Wilhelm Magnus and Stanley Winkler. Hill’s Equation. New York: Dover Publications, 1979. See p.28 for the Poisson summation formula. A. A. Maznev and O. B. Wright. Upholding the diffraction limit in the focusing of light and sound. Wave Motion, 68:182 – 189, 2017. S. A. Meguid and Z. H. Zhu. A novel finite element for treating inhomogeneous solids. Int. J. for Numer. Meth. Eng., 38:1579–1592, 1995. Jean-Claude Nédélec. Mixed finite elements in R3 . Numer. Math., 35:315–341, 1980. Jean-Claude Nédélec. A new family of mixed finite elements in R3 . Numer. Math., 50:57–81, 1986. T. Namiki. A new FDTD algorithm based on alternating-direction implicit method. IEEE Trans Microwave Theory and Techniques, 47(10):2003–2007, 1999. Nikita A. Nemkov, Alexey A. Basharin, and Vassili A. Fedotov. Electromagnetic sources beyond common multipoles. Phys. Rev. A, 98:023858, 2018. John W. Nehrbass. Advances in Finite Difference Methods for Electromagnetic Modeling. PhD thesis, Ohio State University, 1996. C. L. Nehl, N. K. Grady, G. P. Goodrich, F. Tam, N. J. Halas, and J. H. Hafner. Scattering spectra of single gold nanoshells. Nano Letters, 4(12):2355–2359, 2004. T. T. Nguyen, A. Yu. Grosberg, and B. I. Shklovskii. Macroions in salty water with multivalent ions: giant inversion of charge. Phys. Rev. Lett., 85:1568–1571, 2000. I. Niyonzima, C. Geuzaine, and S. Schöps. Waveform relaxation for the computational homogenization of multiscale magnetoquasistatic problems. Journal of Computational Physics, 327:416–433, 2016. Hrvoje Nikoli´c. Proof that Casimir force does not originate from vacuum energy. Physics Letters B, 761:197–202, 2016. Hrvoje Nikoli´c. Is zero-point energy physical? A toy model for Casimir-like effect. Annals of Physics, 383:181–195, 2017. Digital library of mathematical functions. dlmf.nist.gov. N. A. Nicorovici, R. C. McPhedran, and L. C. Botten. Photonic band gaps for arrays of perfectly conducting cylinders. Phys Rev E, 52(1):1135–1145, 1995. N. A. Nicorovici, R. C. McPhedran, and L. C. Botten. Photonic band gaps: Noncommuting limits and the ‘acoustic band’. Phys Rev Lett, 75(8):1507–1510, 1995. N. A. Nicorovici, R. C. McPhedran, and G. W. Milton. Optical and dielectric properties of partially resonant composites. Phys. Rev. B, 49(12):8479–8482, 1994. R. R. Netz and H. Orland. Field theory for charged fluids and colloids. Europhysics Letters, 45(6):726–732, 1999. R. R. Netz and H. Orland. Beyond Poisson-Boltzmann: Fluctuation effects and correlation functions. The European Phys. J. E, 1:203–214, 2000.

680 [Not00]

[NP16] [NRM+94]

[NSB16] [NSD+13]

[NSR04] [NVG03] [NZ05]

[NZAG08]

[NZG10] [OBB98] [OH09]

[Ohs94a]

[Ohs94b] [Ohs95]

[OK05]

[OKJ09]

[OP01] [OP02a]

[OP02b]

References M. Notomi. Theory of light propagation in strongly modulated photonic crystals: Refraction like behavior in the vicinity of the photonic band gap. Phys. Rev. B, 62(16):10696–10705, 2000. Mikhail A. Noginov and Viktor A. Podolskiy, editors. Tutorials in Metamaterials. CRC Press, 2016. A. Nicolet, J.-F. Remacle, B. Meys, A. Genon, and W. Legros. Transformation methods in computational electromagnetism. Journal of Applied Physics, 75(10):6036–6038, 1994. Justus C. Ndukaife, Vladimir M. Shalaev, and Alexandra Boltasseva. Plasmonics— turning loss into gain. Science, 351(6271):334–335, 2016. I. Niyonzima, R. V. Sabariego, P. Dular, F. Henrotte, and C. Geuzaine. Computational homogenization for laminated ferromagnetic cores in magnetodynamics. IEEE Transactions on Magnetics, 49(5):2049–2052, 2013. C. C. Neacsu, G. A. Steudle, and M. B. Raschke. Plasmonic light scattering from nanoscopic metal tips. Appl. Phys. B, 80:295–300, 2004. M. Nieto-Vesperinas and N. Garcia. Nieto-Vesperinas and Garcia Reply. Phys. Rev. Lett., 91(9):099702, 2003. Nader Engheta and R. W. Ziolkowski. A positive future for double-negative metamaterials. IEEE Transactions on Microwave Theory and Techniques, 53(4):1535– 1556, April 2005. A. Nicolet, F. Zolla, Y. Ould Agha, and S. Guenneau. Geometrical transformations and equivalent materials in computational electromagnetism. COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, 27(4):806–819, 2008. A. Nicolet, F. Zolla, and C. Geuzaine. Transformation optics, generalized cloaking and superlenses. IEEE Transactions on Magnetics, 46(8):2975–2981, 2010. J. T. Oden, I. Babuška, and C. E. Baumann. A discontinuous hp finite element method for diffusion problems. J. of Comp. Phys., 146:491–519, 1998. Yuri N. Obukhov and Friedrich W. Hehl. On the boundary-value problems and the validity of the post constraint in modern electromagnetism. Optik, 120(9):418–421, 2009. Hiroyuki Ohshima. Electrostatic interaction between a hard-sphere with constant surface-charge density and a soft-sphere – polarization effect of a hard-sphere. J. Colloidal & Interface Sci., 168(1):255–265, 1994. Hiroyuki Ohshima. Electrostatic interaction between two dissimilar spheres: an explicit analytic expression. J. of Colloid & Interface Sci, 162(2):487–495, 1994. Hiroyuki Ohshima. Electrostatic interaction between two dissimilar spheres with constant surface charge density. J. of Colloid & Interface Sci, 170(2):432–439, 1995. C.-C. Oetting and L. Klinkenbusch. Near-to-far-field transformation by a timedomain spherical-multipole analysis. IEEE Transactions on Antennas and Propagation, 53:2054–2063, 2005. Ardavan F. Oskooi, Chris Kottke, and Steven G. Johnson. Accurate finite-difference time-domain simulation of anisotropic media by subpixel smoothing. Opt. Lett., 34(18):2778–2780, 2009. J. Tinsley Oden and S. Prudhomme. Goal-oriented error estimation and adaptivity for the finite element method. Computers & Math. with Appl., 41:735–756, 2001. S. O’Brien and J. B. Pendry. Magnetic activity at infrared frequencies in structured metallic photonic crystals. Journal of Physics: Condensed Matter, 14(25):6383– 6394, jun 2002. Stephen O’Brien and John B Pendry. Photonic band-gap effects and magnetic activity in dielectric composites. Journal of Physics: Condensed Matter, 14(15):4035– 4044, Apr 2002.

References [Ors80] [OS97] [Ott96] [OV18]

[OWM15]

[Paf59] [Pan84] [Par64] [Par80] [Par06]

[Pat80] [PB02] [PDL84] [PE02] [PE03] [PECT17]

[Pel96]

[Pen00] [Pen01a] [Pen01b] [Pen04] [Pet95] [Pet00]

[PFZ09]

681 S. A. Orszag. Spectral methods for problems in complex geometries. J. Comp. Phys., 37(1):70–92, 1980. Walter Oevel and Mark Sofroniou. Symplectic runge–kutta-schemes ii: Classification of symmetric methods, 1997. Hans Christian Ottinger. Stochastic Processes in Polymeric Fluids: Tools and Examples for Developing Simulation Algorithms. Springer, 1996. M. Ohlberger and B. Verfurth. A new heterogeneous multiscale method for the helmholtz equation with high contrast. Multiscale Modeling & Simulation, 16(1):385–411, 2018. Giacomo Oliveri, Douglas H. Werner, and Andrea Massa. Reconfigurable Electromagnetics Through Metamaterials – A Review. Proceedings of the IEEE, 103(7, SI):1034–1056, JUL 2015. V. E. Pafomov. K voprosu o perehodnom izluchenii i izluchenii VavilovaCherenkova. Zh. Eksp. Teor. Fiz., 36(6):1853–1858, 1959. Victor Pan. How can we speed up matrix multiplication? SIAM Rev., 26(3):393– 415, 1984. D. Park. Introduction to the Quantum Theory. New York : McGraw-Hill, 1964. B. N. Parlett. The Symmetric Eigenvalue Problem. Prentice-Hall, Englewood Cliffs, N.J., 1980. Vozken Adrian Parsegian. Van der Waals Forces : A Handbook for Biologists, Chemists, Engineers, and Physicists. New York : Cambridge University Press, 2006. S. V. Patankar. Numerical Heat Transfer and Fluid Flow. John Benjamins Publishing Co., 1980. John A. Pelesko and David H. Bernstein. Modeling MEMS and NEMS. CRC Press, 2002. D. W. Pohl, W. Denk, and M. Lanz. Optical stethoscopy: Image recording with resolution lambda/20. Applied Physics Letters, 44(7):651–653, 1984. A. L. Pokrovsky and A. L. Efros. Sign of refractive index and group velocity in lefthanded media. Solid State Communications, 124:283–287, 2002. A. L. Pokrovsky and A. L. Efros. Diffraction theory and focusing of light by a slab of left-handed material. Physica B: Condensed Matter, 338:333–337, 2003. Adriana Passaseo, Marco Esposito, Massimo Cuscunà, and Vittorianna Tasco. Materials and 3d designs of helix nanostructures for chirality at optical frequencies. Advanced Optical Materials, 5(16):1601079, 2017. J. Peltoniemi. Electromagnetic scattering by irregular grains usingvariational volume integral equation method. J. Quant. Spectrosc. Radiat. Transfer, 55(5):637– 647, 1996. J. B. Pendry. Negative refraction makes a perfect lens. Phys. Rev. Lett., 85(18):3966–3969, 2000. John Pendry. Electromagnetic materials enter the negative age. Physics World, 14(9):47, 2001. John Pendry. Pendry replies:. Phys. Rev. Lett., 87(24):249704, 2001. J. B. Pendry. A chiral route to negative refraction. Science, 306(5700):1353–1355, 2004. Henrik G. Petersen. Accuracy and efficiency of the particle mesh ewald method. The J. of Chem. Phys., 103(9):3668–3679, 1995. P. Petropoulos. Reflectionless sponge layers as absorbing boundary conditions for the numerical solution of Maxwell equations in rectangular, cylindrical, and spherical coordinates. SIAM Journal on Applied Mathematics, 60(3):1037–1058, 2000. E Plum, V A Fedotov, and N I Zheludev. Extrinsic electromagnetic chirality in metamaterials. Journal of Optics A: Pure and Applied Optics, 11(7):074009, May 2009.

682 [PFZ10]

[PGL+03]

[PH67] [PHRS99]

[Pis84] [PLN16]

[PLV+04]

[PM96] [PO02]

[Pos62] [Pos97] [PP62] [PPL+98]

[PR55]

[PR03] [PR04]

[Pra03] [Pra04] [PRM98] [PS04] [PSHT16]

[PSS06]

References E Plum, V A Fedotov, and N I Zheludev. Asymmetric transmission: a generic property of two-dimensional periodic patterns. Journal of Optics, 13(2):024006, nov 2010. C. G. Parazzoli, R. B. Greegor, K. Li, B. E. C. Koltenbah, and M. Tanielian. Experimental verification and simulation of negative index of refraction using Snell’s law. Phys. Rev. Lett., 90(10):107401, 2003. Paul Penfield and H. A. Haus. Electrodynamics of Moving Media. The MIT Press, 1967. J. B. Pendry, A.J. Holden, D. J. Robbins, and W.J. Stewart. Magnetism from conductors and enhanced nonlinear phenomena. IEEE Trans. on Microwave Theory & Tech., 47(11):2075–2084, Nov 1999. Sergio Pissanetzky. Sparse Matrix Technology. London : Academic Press, 1984. Vladislav Popov, Andrei V. Lavrinenko, and Andrey Novitsky. Operator approach to effective medium theory to overcome a breakdown of Maxwell Garnett approximation. Phys. Rev. B, 94:085428, 2016. P. V. Parimi, W. T. Lu, P. Vodo, J. Sokoloff, J. S. Derov, and S. Sridhar. Negative refraction and left-handed electromagnetism in microwave photonic crystals. Phys. Rev. Lett., 92(12):127401, 2004. M. A. Paesler and P. J. Moyer. Near Field Optics: Theory, Instrumentation and Applications. New York: John Wiley & Sons, Inc., 1996. S. Prudhomme and J. Tinsley Oden. Computable error estimators and adaptive techniques for fluid flow problems. In T. Barth and H. Deconinck, editors, Error Estimation and Adaptive Discretization Methods in Computational Fluid Dynamics, Lecture Notes in Computational Science and Engineering, vol. 25, pages 207– 268. Springer-Verlag, Heidelberg, 2002. E. J. Post. Formal Structure of Electromagnetics: General Covariance and Electromagnetics. North-Holland, 1962. E. J. Post. Formal Structure of Electromagnetics: General Covariance and Electromagnetics. Dover, 1997. Paperback. Wolfgang K. H. Panofsky and Melba Phillips. Classical Electricity and Magnetism. Reading, Mass., Addison-Wesley Pub. Co., 1962. B. Palpant, B. Prével, J. Lermé, E. Cottancin, M. Pellarin, M. Treilleux, A. Perez, J. L. Vialle, and M. Broyer. Optical properties of gold clusters in the size range 2–4 nm. Phys. Rev. B, 57(3):1963–1970, 1998. D. W. Peaceman and H. H. Rachford. The numerical solution of parabolic and elliptic differential equations. Journal of the Society for Industrial and Applied Mathematics, 3(1):28–41, 1955. J. B. Pendry and S.A. Ramakrishna. Focussing light using negative refraction. J. Phys.: Condens. Matter, 15:6345–6364, 2003. Richard Pasquetti and Francesca Rapetti. Spectral element methods on triangles and quadrilaterals: comparisons and applications. J. Comp. Phys., 198:349–362, 2004. Paras N. Prasad. Introduction to Biophotonics. Wiley-Interscience, 2003. Paras N. Prasad. Nanophotonics. Wiley-Interscience, 2004. Andrew F. Peterson, Scott L. Ray, and Raj Mittra. Computational Methods for Electromagnetics. Oxford University Press, 1998. J. B. Pendry and D. R. Smith. Reversing light with negative refraction. Phys. Today, 57:37–43, 2004. A. Paganini, L. Scarabosio, R. Hiptmair, and I. Tsukerman. Trefftz approximations: A new framework for nonreflecting boundary conditions. IEEE Transactions on Magnetics, 52(3):7201604, 2016. J. B. Pendry, D. Schurig, and D. R. Smith. Controlling electromagnetic fields. Science, 312(5781):1780–1782, 2006.

References [PT02] [PT12]

[PT17]

[PTFY03]

[PTPT00] [PW09] [PWT07]

[PZD+09]

[QBT17] [QT08] [Qui96]

[Qui99] [QV99] [QZT+18]

[Rüd93] [RAH01]

[Rak72]

[Ram05] [RBWM15]

[RC95] [RCJ11]

683 L. Proekt and I. Tsukerman. Method of overlapping patches for electromagnetic computation. IEEE Trans. Magn., 38(2):741–744, 2002. S. V. Petropavlovsky and S. V. Tsynkov. A non-deteriorating algorithm for computational electromagnetism based on quasi-lacunae of Maxwell’s equations. Journal of Computational Physics, 231(2):558–585, 2012. S. Petropavlovsky and S. Tsynkov. Non-deteriorating time domain numerical algorithms for Maxwell’s electrodynamics. Journal of Computational Physics, 336:1– 35, 2017. A. Plaks, I. Tsukerman, G. Friedman, and B. Yellen. Generalized Finite Element Method for magnetized nanoparticles. IEEE Trans. Magn., 39(3):1436–1439, 2003. A. Plaks, I. Tsukerman, S. Painchaud, and L. Tabarovsky. Multigrid methods for open boundary problems in geophysics. IEEE Trans. Magn., 36(4):633–638, 2000. H. Pinheiro and J. P. Webb. A FLAME molecule for 3-d electromagnetic scattering. IEEE Transactions on Magnetics, 45(3):1120–1123, 2009. H. Pinheiro, J. P. Webb, and I. Tsukerman. Flexible local approximation models for wave scattering in photonic crystal devices. IEEE Trans. Magn., 43(4):1321–1324, 2007. E. Plum, J. Zhou, J. Dong, V. A. Fedotov, T. Koschny, C. M. Soukoulis, and N. I. Zheludev. Metamaterial with negative index due to chirality. Phys. Rev. B, 79:035407, 2009. Fan Qing-Bin and Xu Ting. Research progress of imaging technologies based on electromagnetic metasurfaces. Acta Physica Sinica, 66(14, SI), JUL 20 2017. H. Qasimov and S. Tsynkov. Lacunae based stabilization of PMLs. Journal of Computational Physics, 227(15):7322–7345, 2008. Michael Quinten. Optical constants of gold and silver clusters in the spectral range between 1.5 ev and 4.5 ev. Zeitschrift für Physik B Condensed Matter, 101(2):211– 217, 1996. Michael Quinten. Optical effects associated with aggregates of clusters. J. of Cluster Science, 10(2):319–358, 1999. Alfio Quarteroni and Alberto Valli. Domain Decomposition Methods for Partial Differential Equations. Oxford; New York: Clarendon Press, 1999. Meng Qiu, Lei Zhang, Zhixiang Tang, Wei Jin, Cheng-Wei Qiu, and Dang Yuan Lei. 3d metaphotonic nanostructures with intrinsic chirality. Advanced Functional Materials, 28(45):1803147, 2018. Ulrich Rüde. Fully adaptive multigrid methods. SIAM J. Numer. Anal., 30(1):230– 248, 1993. W. Rocchia, E. Alexov, and B. Honig. Extending the applicability of the nonlinear Poisson–Boltzmann equation: Multiple dielectric constants and multivalent ions. J. Phys. Chem. B, 105(28):6507–6514, 2001. Yu. V. Rakitskii. A methodology for a systematic time step increase in the numerical integration of ordinary differential equations. Doklady Akademii Nauk SSSR (Mathematics). Proceedings of the Academy of Sciences of the USSR. Comptes rendus de l’Académie des sciences de l’URSS, 207(4):793–795, 1972. S. Anantha Ramakrishna. Physics of negative refractive index materials. Rep. Prog. Phys., 68:449–521, 2005. Søren Raza, Sergey I Bozhevolnyi, Martijn Wubs, and N Asger Mortensen. Nonlocal optical response in metallic nanostructures. Journal of Physics: Condensed Matter, 27(18):183204, apr 2015. C. J. Railton and I. J. Craddock. Analysis of general 3-d pec structures using improved cpfdtd algorithm. Electronics Letters, 31(20):1753–1754, 1995. Alejandro W. Rodriguez, Federico Capasso, and Steven G. Johnson. The casimir effect in microstructured geometries. Nature Photonics, 5:211–221, 2011.

684 [RdL04] [RdL10]

[RDS97]

[REG+09]

[Rek80] [Res92] [Res94a] [Res94b] [Res02] [Res10] [Res18] [RG00]

[RG08] [RGK+98]

[RI00] [Ric03] [RII+07]

[RKG88] [RLR+12]

[Rob05] [Rod83]

[Rod93]

References Roger E. Raab and Owen L. de Lange. Multipole Theory in Electromagnetism: Classical, quantum, and symmetry aspects, with applications. OUP Oxford, 2004. R E Raab and O L de Lange. Comment on ‘On the origin dependence of multipole moments in electromagnetism’. Journal of Physics D: Applied Physics, 43(50):508001, dec 2010. Tamar Schlick Robert D. Skeel, Guihua Zhang. A family of symplectic integrators: stability, accuracy, and molecular dynamics applications. SIAM J. Sci. Comput., 18:203–222, 1997. Sahand Jamal Rahi, Thorsten Emig, Noah Graham, Robert L. Jaffe, and Mehran Kardar. Scattering theory approach to electrodynamic casimir forces. Phys. Rev. D, 80:085021, Oct 2009. Karel Rektorys. Variational Methods in Mathematics, Science, and Engineering. Dordrecht ; Boston : D. Reidel, 1980. R Resta. Theory of the electric polarization in crystals. Ferroelectrics, 136(1– 4):51–55, 1992. R Resta. Macroscopic polarization in crystalline dielectrics - the geometric phase approach. Reviews of Modern Physics, 66(3):899–915, 1994. Raffaele Resta. Modern theory of polarization in ferroelectrics. Ferroelectrics, 151(1):49–58, 1994. Raffaele Resta. Why are insulators insulating and metals conducting? Journal of Physics: Condensed Matter, 14(20):R625–R656, 2002. Raffaele Resta. Electrical polarization and orbital magnetization: the modern theories. Journal of Physics-Condensed Matter, 22(12), 2010. Raffaele Resta. Polarization in Kohn-Sham density-functional theory. European Physical Journal B, 91(6), Jun 4 2018. J. A. Roden and S. D. Gedney. Convolutional PML, (CPML): an efficient FDTD implementation of the CFS-PML for arbitrary media. Microw. Opt. Technol. Lett., 27:334–338, 2000. S. Anantha Ramakrishna and Tomasz M. Grzegorczyk. Physics and Applications of Negative Refractive Index Materials. CRC Press, 2008. J. A. Roden, S. D. Gedney, M. P. Kesler, J. G. Maloney, and P. H. Harms. Time-domain analysis of periodic structures at oblique incidence: orthogonal and nonorthogonal FDTD implementations. IEEE Transactions on Microwave Theory and Techniques, 46(4):420–427, 1998. Z. Ren and N. Ida. Solving 3d eddy current problems using second order nodal and edge elements. IEEE Trans. Magn., 36(4):746–750, 2000. David Richards. Near-field microscopy: throwing light on the nanoworld. Phil. Trans. R. Soc. Lond. A, 361(1813):2843–2857, 2003. Alejandro Rodriguez, Mihai Ibanescu, Davide Iannuzzi, J. D. Joannopoulos, and Steven G. Johnson. Virtual photons in imaginary time: Computing exact casimir forces via standard numerical electromagnetism techniques. Phys. Rev. A, 76:032106, Sep 2007. M. O. Robbins, K. Kremer, and G. Grest. Phase diagram and dynamics of Yukawa systems. J. Chem. Phys., 88:3286–3312, 1988. Edward T F Rogers, Jari Lindberg, Tapashree Roy, Salvatore Savo, John E. Chad, Mark R. Dennis, and Nikolay I Zheludev. A super-oscillatory lens optical microscope for subwavelength imaging. Nature Materials, 11(95):432–435, 2012. Sara Robinson. Toward an optimal algorithm for matrix multiplication. SIAM News, 38(9), 2005. D. Rodger. Finite-element method for calculating power frequency 3-dimensional electromagnetic-field distributions. IEE Proceedings-A: Science, Measurement and Technology, 130(5):233–238, 1983. Luigi Rodino. Linear Partial Differential Operators in Gevrey Spaces. World Scientific, 1993.

References [RR90]

[RRWJ09]

[RS76] [RS78] [RS04] [RST96]

[RSY+85]

[RT74]

[RT75]

[RT77]

[RT95]

[RTT01]

[RUC79]

[Rud76] [Rus70] [RV07]

[RWG82]

[RWJ13]

[Ryt53]


Index

A Absorbing boundary conditions, 179, 213, 232, 377, 384, 409 Bayliss–Turkel, 377, 378, 385, 389, 415 Engquist–Majda, 377, 384, 389, 390 Hagstrom–Hariharan, 377 Hagstrom–Warburton, 377 Higdon, 377 Mur, 377, 415 “Trefftz machine”, 385 Adams methods, 24, 29 Adaptive mesh refinement, 89, 175, 178, 190, 505, 522 Adaptive refinement, 98, 141, 142, 178, 190, 302, 303, 505 Algebraic multigrid (AMG) schemes, 152 Alternating Direction Implicit (ADI) method, 374–376 Apertureless SNOM, 520, 521 Aperture-limited SNOM, 519 A posteriori error estimates, 124, 142, 143, 145, 148 Approximation accuracy, 27, 28, 75, 77, 78, 120, 121, 124, 152, 153, 174, 177, 178, 183, 185, 187, 200, 209, 216, 231, 270, 397 singular value condition, 167 analytical, 89, 98, 105, 185, 189, 205, 209, 210, 212, 213, 232, 234 and condition number, 173 finite element, 89, 107, 120, 123, 136, 137, 140, 152–154, 159, 216, 217 local, 89, 98, 143, 147, 175–177, 183– 185, 187–193, 199, 215, 217, 224, 225, 229, 230, 232, 234–239

Atomic Force Microscopy (AFM), 517

B Backward Differentiation Formulae (BDF), 29 Backward waves, 431, 435, 454, 456, 465, 485, 523, 533, 535, 537, 538, 541, 546, 549, 550 group velocity, 431, 432, 434, 435, 454, 458, 459, 526, 527, 536, 537, 541, 544 historical notes, 489, 523 Mandelshtam’s chain, 535 photonic crystals, 476, 480, 490, 541, 544, 551 weakly inhomogeneous regime, 549 Bessel functions, 189, 231, 288, 318, 324 Bjerrum length, 313 Bloch transform, 551, 552 Bloch waves, 439, 440, 442–444, 449, 450, 455–459, 462–465, 527, 533, 536– 538, 551, 578, 599, 605, 613, 624, 638 energy velocity, 435 FLAME, 465, 471, 476–478, 480–485, 500–503 Fourier analysis, 452, 467, 474, 550 group velocity, 431, 432, 434, 435, 454, 458, 459, 526, 527, 536, 537, 541, 544 Boundary value problems, 12, 38, 46, 47, 252, 287, 349 FEM, 287 Bravais lattice, 450, 547 B-splines, 271–274 Butcher tableau, 22, 38



702 C Carpet cloak, 564 Casimir forces, 327, 329, 333 Causality, 405, 535, 585 Céa’s theorem, 86–88, 119–121 Charge-to-grid interpolation, 268 Chirality, 573, 574, 625 Cholesky decomposition, 50, 110, 129, 130, 473, 476 Cholesky factorization, 50, 110, 129, 130, 473, 476 Ciarlet–Raviart theory, 120 Classical effective medium theories, 577, 620, 621, 624 Clausius-Mossotti model, 620, 622, 623 Cloaking, 563, 564, 567, 571 Collocation, 72–75, 88, 207, 231, 499 Colloidal simulation, 199, 316, 344, 646 correlations, 315, 339, 348 Complex coordinate transforms, 379 Correlations, 315, 332, 333, 339, 348 Crank-Nicolson scheme, 16, 17, 27, 32 Curl, generalized, 137, 180, 181, 414 Cylindrical harmonics, 184, 212, 213, 289, 290, 295, 387, 476, 485, 502

D Dark-field microscopy, 520, 521 Debye–Hückel parameter, 314 Debye length, 316, 339, 342 Derjaguin–Landau–Verwey–Overbeek theory (DLVO), 324, 326, 339, 342, 343 Difference schemes, 11–65 consistency, 12–14, 16, 17, 22, 27, 39– 42, 44, 48, 51–53, 55, 57, 59–63 convergence, 12, 14, 44, 59–62 stability, 15, 18, 23–29, 31, 32, 61 Diffraction limit, 507–517, 528, 565 can it be broken?, 507–517 Direct solvers, 50, 132, 133, 555, 560 Direct sum, 259, 260 Dirichlet boundary conditions, 46, 50, 71, 95, 156, 325, 374 Discontinuous Galerkin method, 226 Discrete-Dipole method, 499 Discrete Perfectly Matched Layers, 382 Distributions, 39, 69, 89, 109, 110, 123, 137, 139, 145, 148, 179–181, 190, 222, 223, 232, 244, 247, 250, 253, 258, 279, 286, 287, 295, 311–313, 321, 324, 325, 329, 338–340, 342, 346, 349, 351–355, 409–411, 414, 418,

Index 442, 450, 475–478, 497, 499, 504, 505, 515, 522, 524, 529–533, 538, 545, 594, 615, 618–620 Divergence and curl operators, 179 Divergence, generalized, 179–181, 354 Domain decomposition, 11, 50, 230 Drude model, 491, 492, 497 E Edge elements, 133, 136, 138–141, 144, 154, 161, 175, 178, 234, 427, 470, 489, 505, 639, 640 historical notes, 140 tetrahedral, 118, 119, 136, 139–141, 145, 153, 154, 160–163, 165, 166, 170–172, 174, 175, 178 Edge shape matrix, 152, 154, 163, 164, 167, 169, 170, 174, 175, 178 Eigenvalue analysis, 67, 154, 554 Eigenvalue solvers, 129, 552–560 Electromagnetic wave scattering, 304 Electrostatic energy DLVO theory, 324 Electrostatic forces, 54, 298, 337 long-range, 54, 285 Electrostatic problems, 285–355 Electrostatics of macromolecules, 232 Element shape approximation accuracy, 8, 152 Energy velocity, 435 Entropy, 338, 339, 344–346, 349 Error of solution by collocation, 74 Euler schemes, 14–19, 21, 26, 28 Ewald formulas, 256, 259, 278 Ewald methods, 243–286, 343, 344 direct sum, 244, 259, 260 grid-based, 259, 260, 263, 268, 269, 277, 280 Particle–Particle Particle–Mesh, 266 smooth PME, 269, 271 York–Yang, 275 Ewald sum, 259, 275, 278, 280 Ewald summation, 5, 7, 246, 247, 253, 256, 268, 278, 279, 285, 646 Particle-mesh Ewald, 268 Explicit schemes, 28, 374, 415 F Fast Multipole Method (FMM), 190, 286, 316 FE-Galerkin, 92, 105, 146, 149 Fermi velocity, 497 Field enhancement, 496, 500

Index conical tips, 522, 523 particle cascades, 505, 506 plasmonic, 185, 507, 509 Finite Difference (FD) derivation using constraints, 40 Finite Difference (FD) analysis, 11, 389 Finite difference schemes, 11–65, 357–360 Finite Difference Time Domain (FDTD), 357–423 codes, 213, 416 contour-path FDTD, 393 encyclopedia, 358, 414 father of, 417 frequency dispersion, 381 historical notes, 414 long-term stability, 391 material interfaces, 392–403 near-to-far-field transformation, 409– 414 numerical dispersion, 370 periodic structures, 406 stability, 200, 216, 231, 238, 361, 409, 415 staggered grids, 227, 360, 362, 366, 369, 370, 393, 395, 402, 407, 414 total field/scattered field, 403 Finite element analysis, 7, 70, 103, 107, 125, 140–142, 178, 234, 416 Finite element approximation, 123, 136, 137, 152 Finite element approximation versus interpolation, 107, 157 Finite element mesh, 4, 90, 91, 109, 131, 132, 136, 159, 184, 189 Finite Element Method (FEM), 11, 69–179, 299, 465, 496, 503, 529, 560 a posteriori error estimates, 148 a priori error estimates, 171, 175, 178 approximation, 123, 136, 137, 152, 642 mesh, 90, 91, 109, 132, 159, 184, 189, 470, 529 Finite element shape approximation accuracy, 119, 120, 152 edge shape matrix, 155, 163, 164, 167, 169, 170, 174, 175, 178 maximum eigenvalue condition, 157, 162, 164, 174 singular value condition, 145, 165, 167, 171, 172 Finite Integration Technique (FIT), 371–374 First-order elements, 91, 102, 103, 113, 123, 146

703 FLAME, 11, 12, 18, 19, 33, 44, 50, 53–55, 57, 58, 62, 64, 109, 177, 183–241, 276–278, 287–310, 317–324, 339– 344, 377, 378, 385, 386, 389–391, 393, 396, 399, 404, 465, 471, 476– 478, 480–485, 500–503 band structure, 452, 465, 473, 475, 480, 481, 484 particles, 185, 189, 190, 212, 232, 234, 285–289, 292, 293, 295–306, 308–310, 316–319, 322, 324–327, 329, 332, 335, 339–343, 425, 434, 436, 481, 492, 494– 504, 506, 507, 519 photonic crystals, 185, 234, 480, 490, 541, 544 plasmonic particles, 507 Flux-balance scheme, 42–46, 48, 49, 54, 56, 211, 212, 232, 309 Fréchet derivative, 185, 198, 241, 321, 322 Free energy, 252, 279, 335, 337–339, 345– 348 G Galerkin method, 74, 75, 79, 82, 88, 92, 105, 146, 149, 224, 226, 239, 325, 326, 471, 556, 639 Gaussian elimination, 46, 110, 125–127, 129 Gaussian factorization, 127 Gauss’s theorem, 48, 56 Generalized curl, 137, 180, 181, 414 Generalized divergence, 180, 181, 353–355 Generalized FEM, 175, 178, 185, 188, 190, 193, 222, 224, 225 Goal-oriented error estimation, 147 Gouy-Chapman length, 312 Grid-to-charge interpolation, 266, 274, 280 Group velocity, 431–435, 454, 458, 459, 526, 527, 536, 537, 541, 544 H Hamiltonian systems, 11, 33, 34, 37, 64 Harmonic oscillator, 34, 36, 191, 205, 206 Heisenberg principle, 509 Helmholtz equation, 201, 214, 215, 224, 231, 288, 320, 379, 383, 387, 396, 412, 413, 418, 460, 464 Helmholtz free energy, 339, 345, 348 Hertzian dipole, 506 Hessenberg form (of a matrix), 553 Hierarchical bases, 143 Hill’s equation, 436, 437, 440, 441 HODIE schemes, 59

704 Homogeneous slabs, 596 Homogenization, 223, 224, 226, 227, 469, 549, 565, 567, 572, 577–581, 585– 602, 604–606, 608–610, 613, 620– 625, 631, 638, 639 high frequency, 587 non-asymptotic, 588 wavevector-dependent tensor, 587 Householder reflection, 553

I Immersed surface methodology, 224, 231 Implicit models, 3, 190 Implicit schemes, 374, 375 Impressed current, 638 Integral conservation principle, 49 Integral equation methods, 357, 499, 502 Iterative solvers, 133, 143, 151

J Jacobi–Davidson method, 558 Jamet’s condition, 153, 166, 167, 174

K Kohn–Sham equation, 4, 205, 468 Korringa-Kohn-Rostoker method, 468 Kramers–Kronig relations, 405, 585

L Lacunae method, 392 Ladyzhenskaya–Babuška–Brezzi condition, 88 Lagrange multipliers, 41, 53, 345 Lanczos method, 557 Laplace equation, 52, 53, 58, 59, 64, 185, 191, 192, 195, 201, 203–205, 211, 220, 232, 250, 251, 277, 289, 292, 305, 306, 309, 319, 320, 324, 325, 395, 399, 529, 566, 568, 639 Lax–Milgram theorem, 86 Lax–Richtmyer equivalence theorem, 61, 216, 397 Lax–Richtmyer theorem, 397 LBB condition, 122, 155 LDU factorization, 129 Leapfrog scheme, 38, 408 Lennard–Jones potential, 327, 331 Linear multistep schemes, 24, 25 Lorentz model, 405, 491, 620–623

Index Losses, 435, 463, 471, 494, 512, 533, 542, 564, 565, 585, 626, 637, 643

M Macromolecular simulation, 190, 338, 344 Magnetization, 249, 578, 588–590, 592, 593, 614 Mandelshtam’s chain, 535 Mass matrix, 79, 101, 106, 108, 119, 161, 466, 473, 476, 560 Matrix exponential, 33, 35, 66, 67, 375 Matrix sparsity structure, 125, 140 Maximum angle condition, 153, 161, 171, 174, 175 Maximum eigenvalue condition, 157, 162, 164, 174 Maxwell Garnett formula, 623 Maxwell’s equations, 332, 350, 357–359, 371, 373, 375, 376, 379, 392, 406– 408, 414, 415, 419, 423, 425–429, 456, 459, 460, 488, 497, 533, 535 Green’s functions, 335, 418 Maxwell stress tensor, 293, 333, 339, 341 Maxwell’s equations, 568, 570, 578, 588, 591–595, 597–599, 612, 623, 624, 631, 633, 634 Mean-field theory, 315, 338 Mehrstellen schemes, 59, 64, 205, 235 19-point, 55, 58, 64 2D, 37, 38, 40, 42, 47–59, 64 3D, 12, 40, 42, 54–59, 64 9-point, 51, 52, 58, 59, 64 Mesh-based methods, 5 Mesh generation, 97, 190, 228, 287 Meshless methods, 223, 228, 229 element-free Galerkin, 223 local Petrov–Galerkin, 223, 235 Mesh quality, 145, 173 Metamaterials, 179, 225, 249, 406, 456, 465, 515–517, 543, 546, 547, 549–551, 561–635, 637, 639 absorbers, 564, 572, 573 applications, 562–564, 567, 571, 576, 587, 595, 627, 628 carpet cloak, 564 chiral, 573 cloaking, 563, 564, 567, 571 high-frequency homogenization, 578, 587 homogenization, 565, 567, 572, 577– 581, 585–594, 596–602, 604–606, 609, 610, 613, 620, 621, 623–625, 631

Index lenses, 563, 565, 576 non-asymptotic homogenization, 591, 605, 606, 620 parameter retrieval, 577, 580, 584, 610 reconfigurable, 571 superconducting, 571 tunable, 562, 571, 576 two-parameter homogenization, 578, 586 uncertainty principle, 610–613 Metasurfaces, 576 Mie theory, 476, 505 Minimum degree reordering, 112, 113 Modeling errors, 5 Modes TE, 461, 641 TM, 461, 641 Moment method, 357, 639 Momentum conservation, 151, 265, 266 Multigrid methods, 142, 149, 151, 152 Multiple Multipole Method (MMP), 499 Multiscale modeling, 3 Multivalued approximation, 185, 187, 193, 228, 235, 236

N Nano-focusing, 492, 502, 503, 509 Nano-lens, 507, 509 Near-to-far-field transformation, 409–414 Negative refraction, 431, 435, 456, 465, 515, 517, 523–528, 541–547, 549, 550, 563–565, 574 metamaterials, 456, 465, 515–517, 543, 546, 547, 549, 551 photonic crystals, 476, 480, 490, 541, 544, 551 Nested Dissection, 133 Neumann boundary conditions, 46, 78, 85, 96–98, 378, 410, 621, 623 Newton–Raphson method, 30, 198 Numerical errors for different one-step schemes, 19

O One-Way Dissection (1WD), 131 Optical tips, 178, 499, 522

P Particle–Mesh Ewald method, 268, 271 Particle–Particle Particle–Mesh Ewald methods, 266, 274

705 Particle–Particle Particle–Mesh method, 274 Partition function-sum over states, 348 Perfect lens, 527, 528, 530, 563–565 Perfectly Matched Layers (PML), 378 complex coordinate transforms, 379 discrete, 382 Permeability negative, 515, 523–525 Permittivity bulk, 497, 504, 543, 544 complex, 495, 501, 502, 533 nanoscale, 425, 514, 515 negative, 490–496, 515, 523–525, 528, 542 Phase velocity, 361, 364, 406, 431, 432, 456, 526, 527, 534, 535, 537, 539–541, 543, 546, 549, 589 Photonic band structure, 475, 484 FEM, 484 FLAME, 484 PWE, 484 Photonic bandgap, 227, 448, 449, 454, 462, 466, 467, 469, 471, 473–475, 477, 483, 485, 488–490 Photonic crystals, 178, 185, 234, 406, 442, 462, 463, 474–477, 479, 480, 484, 490, 541, 543, 544, 550, 561, 590, 641 Photonic crystal waveguide, 479 Plane-Wave Expansion (PWE), 8, 246, 452, 465, 467–469, 476, 480, 481, 483, 485, 553, 555, 560 Plasma frequency, 491, 492, 494 Plasmonics, 490–507, 517–523, 600, 637 Elasmobranchii and Teleostii fishes, 492 losses, 564 Plasmon resonances, 495, 497, 500, 509 Poisson equation, 39, 40, 47, 51, 55, 58, 60, 61, 64, 65, 69, 71, 73, 76, 84, 123, 139, 146, 186, 252, 253, 256, 265, 274, 276, 280, 311, 339, 345, 639 1D, 642 colloidal simulation, 3, 199 Ewald formula, 256, 257, 259, 278 FEM, 639, 640, 642 Poisson–Boltzmann equation, 3, 190, 199, 205, 288, 310, 312, 313, 316, 318, 321, 324, 325, 339, 345, 346 Poisson–Boltzmann model, 190, 288, 311 Polarization (dielectric), 243, 249, 252, 285, 286, 491, 492, 499, 578, 588–590, 592, 593, 614–622 modern theory, 620

706 Polarization (waves), 367, 378, 430, 461, 462, 466, 467, 474, 490, 500, 505, 507, 520, 562, 565, 574, 582, 583, 596, 599, 602, 608 Poynting vector, 435, 457–459, 524, 525, 534, 535, 537–541, 543, 544, 548, 606, 610 Bloch waves, 431, 432, 435, 439, 440, 442, 444, 450, 452, 454–456, 458, 459, 462, 463, 465, 467, 471, 474, 476–478, 480–485, 500–503, 527, 533, 536–538, 541, 544, 550, 551 Fourier harmonics, 456–458, 468, 541, 548 group velocity, 431, 432, 434, 435, 458, 459, 536, 537, 541 mechanical, 524, 535, 537 Pseudospectral methods, 185, 223, 230 Pseudospectral time domain methods, 376, 403, 415

Q QR decomposition, 553 QR iterations, 553 Quotient Minimum Degree (QMD) method, 131

R Raman spectroscopy, 520 Reciprocal lattice, 246, 247, 278 Reciprocal space, 253, 255, 276, 488, 632 Reciprocal sum, 260 Recovery-based error estimators, 146 Robin boundary conditions, 85 Runge–Kutta methods, 12, 20–23, 38, 443 Rytov–Lifshitz theory, 333

S Scanning Near-Field Optical Microscopy (SNOM), 494, 514, 517–522 Scanning Tunneling Microscope, 517 Schrödinger equation, 191, 205–208, 224, 232, 329, 469 Second-order elements, 100, 102, 113 Simulation model, 2 Singular value condition, 145, 152, 165, 167, 171, 172 Smooth Particle–Mesh Ewald Methods, 271 Solution by collocation, 73 Source current, 638 Source field, 640

Index Special approximation techniques, 231 Spectral convergence, 134, 139, 639 Spherical harmonics, 189, 234, 288, 304, 306, 307, 309, 318, 323, 324, 476, 477, 493, 498, 499, 506 Split-ring resonators, 527, 562, 625 Spurious modes, 134, 136, 139, 178, 639 Stiff systems, 12, 24, 27, 33, 65 Superconvergence, 124, 643 Superlens, 528 Superoscillations, 516 Surface plasmons, 416 Suris–Sanz-Serna condition, 37, 38 Symmetric Positive Definite (SPD) systems, 129 Symplecticness, 33, 38

T Taylor-based schemes, 18, 44, 54, 205 Tetrahedral elements, 117–119, 136, 140, 145, 153, 154, 161, 166, 174, 175, 178 Thermodynamic potential, 288, 335, 338, 344–346, 348 electrostatics in solvents, 344 T-matrix method, 477, 496, 498, 499 Transformation optics, 381, 566–568, 571, 627 Trefftz machine, 167, 171, 172 Trefftz–FLAME, 44, 183–241, 290, 317, 318, 321, 324, 377, 378, 391, 396, 399, 404, 481, 500 case studies, 185, 201

139, 171, 337,

570,

292, 390,

U Uncertainty principle, Heisenberg, 508, 509, 528 Uncertainty principle, homogenization, 610–613 Units Gaussian, 358, 425 SI, 358, 425

V Van der Waals forces, 329, 331 Variational FLAME, 235, 236, 239 Variational methods, 69, 72, 81, 86 Vector and matrix norms, 65 Veselago medium, 533, 549

Index W Wave analysis, 234, 487, 496, 497, 505, 522, 646 Wave function, 620 Whitney-Nédélec elements, 136 Wigner–Seitz cell, 246

Y Yee scheme, 227, 360–362, 364, 365, 367– 371, 375, 393, 403, 407–409, 414, 415, 417 1D, 358–365, 370, 379

707 2D, 365–368, 369 3D, 368–371 at material interfaces, 392–403 “magic time step, 361, 417 numerical dispersion, 364, 367, 370 order, 359–361, 366, 368, 369, 392–395, 399, 400, 402, 403, 414 stability, 359, 364, 365, 367, 368, 370, 371, 391, 393–395, 397–399, 408, 409, 415 Yukawa potential, 315, 316, 319, 323, 326, 339, 343 linearized PBE, 319