GRADUATE STUDIES IN MATHEMATICS
230
Inverse Problems for Fractional Partial Differential Equations
Barbara Kaltenbacher
William Rundell
EDITORIAL COMMITTEE Matthew Baker Marco Gualtieri Gigliola Staffilani (Chair) Jeff A. Viaclovsky Rachel Ward 2020 Mathematics Subject Classification. Primary 35R30, 35R11, 65M32; Secondary 26A33, 35K57, 35Qxx, 35R25, 47F05, 60K50, 65J20.
For additional information and updates on this book, visit www.ams.org/bookpages/gsm-230
Library of Congress Cataloging-in-Publication Data Names: Kaltenbacher, Barbara, author. | Rundell, William, author. Title: Inverse problems for fractional partial differential equations / Barbara Kaltenbacher, William Rundell. Description: Providence, Rhode Island : American Mathematical Society, [2023] | Series: Graduate studies in mathematics, 1065-7339 ; volume 230 | Includes bibliographical references and index. Identifiers: LCCN 2022046072 | ISBN 9781470472450 | ISBN 9781470472450 (paperback) | ISBN 9781470472764 (ebook) Subjects: LCSH: Inverse problems (Differential equations)–Textbooks. | Fractional differential equations–Textbooks. | AMS: Partial differential equations – Miscellaneous topics – Inverse problems. | Partial differential equations – Miscellaneous topics – Fractional partial differential equations. | Numerical analysis – Partial differential equations, initial value and timedependent initial-boundary value problems – Inverse problems. | Real functions – Functions of one variable – Fractional derivatives and integrals. | Partial differential equations – Parabolic equations and systems – Reaction-diffusion equations. | Partial differential equations – Equations of mathematical physics and other areas of application. | Partial differential equations – Miscellaneous topics – Improperly posed problems. | Operator theory – Partial differential operators – Partial differential operators. | Numerical analysis – Numerical analysis in abstract spaces – Improperly posed problems; regularization. Classification: LCC QA378.5 K35 2023 | DDC 515/.353–dc23/eng20230111 LC record available at https://lccn.loc.gov/2022046072
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected]. c 2023 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America. ∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability. Visit the AMS home page at https://www.ams.org/
To my family, Manfred, Robert, Cornelia, and Katharina
To my wife, Linda, who has encouraged my work for over 50 years
Contents
Preface
Chapter 1. Preamble
§1.1. Diffusion and Brownian motion
§1.2. Early fractional integrals and their inverses
§1.3. A short overview on inverse problems
Chapter 2. Genesis of Fractional Models
§2.1. Continuous time random walk
§2.2. Constitutive modeling
Chapter 3. Special Functions and Tools
§3.1. Basic results in complex analysis
§3.2. Abelian and Tauberian theorems
§3.3. The Gamma function
§3.4. The Mittag-Leffler function
§3.5. The Wright function
Chapter 4. Fractional Calculus
§4.1. Fractional integrals
§4.2. Fractional derivatives
§4.3. Fractional derivatives in Sobolev spaces
§4.4. Notes
Chapter 5. Fractional Ordinary Differential Equations
§5.1. Initial value problems
§5.2. Boundary value problems
Chapter 6. Mathematical Theory of Subdiffusion
§6.1. Fundamental solution
§6.2. Existence and regularity theory
§6.3. Maximum principle
§6.4. Notes
Chapter 7. Analysis of Fractionally Damped Wave Equations
§7.1. Linear damped wave equations
§7.2. Some nonlinear damped wave equations
Chapter 8. Methods for Solving Inverse Problems
§8.1. Regularisation of linear ill-posed problems
§8.2. Methods for solving nonlinear equations
§8.3. The quasi-reversibility method
Chapter 9. Fundamental Inverse Problems for Fractional Order Models
§9.1. Determining fractional order
§9.2. Direct and inverse Sturm–Liouville problems
§9.3. The fractional Sturm–Liouville problem
§9.4. The inverse Sturm–Liouville problem with fractional derivatives
§9.5. Fractional derivative as an inverse solution
Chapter 10. Inverse Problems for Fractional Diffusion
§10.1. Determining the initial condition
§10.2. Sideways subdiffusion and superdiffusion
§10.3. Inverse source problems
§10.4. Coefficient identification from boundary data
§10.5. Coefficient identification from final time data
§10.6. Recovering nonlinear partial differential equation terms
§10.7. An inverse nonlinear boundary coefficient problem
Chapter 11. Inverse Problems for Fractionally Damped Wave Equations
§11.1. Reconstruction of initial conditions
§11.2. Reconstruction of the wave speed
§11.3. Nonlinearity coefficient imaging
Chapter 12. Outlook Beyond Abel
§12.1. Fractional powers of operators
§12.2. The Calderón problem
§12.3. Notes
Appendix A. Mathematical Preliminaries
§A.1. Integral transforms
§A.2. Basics of Sobolev spaces
§A.3. Some useful inequalities
Bibliography
Index
Preface
Since its beginning, the study of calculus has made advances by asking "what if" questions. Leibniz in 1695 asked about the possible meaning of a half derivative. One hundred and forty years later, the nineteenth century saw the first definitions of general fractional derivatives that supported rigorous analysis. By the middle of the twentieth century, fractional calculus was already an almost fully developed field. Physics often progresses in a similar way. In the early twentieth century, Einstein presented a mathematical foundation for Brownian motion in terms of a random walk model. This formulation resulted in a law that stated the mean square deviation from the starting position depends linearly on time. This ansatz held sway for over half a century, but soon thereafter examples were turning up of diffusion processes that were poorly described or led to conflicting consequences with this law. Then the question naturally arose of whether Einstein’s formulation could be generalised to a fractional power (less than unity) of time, and what this would mean. As the story unfolded, the connection between the physical and mathematical fractional questions became apparent. As a consequence, fractional diffusion was now supported by an existing mathematical framework, courtesy of the fractional derivatives, and fractional derivatives now had a physical motivation for further study and transitioned from pure mathematics to applied. The lingua franca of mathematical physics is partial differential equations, and the new modality requires a study of equations containing one or more fractional derivatives. This is one of the primary purposes of this textbook: we ask the usual questions of existence, uniqueness, and regularity of the solutions and try to compare them with the classical counterpart. The short answer is there is much in common, but with some significant
and important differences as well as additional technical challenges. There is considerable emphasis in the book towards an analysis of the forwards or direct problem; we seek to determine the solution under the assumption that we know the exact form of the equation including any coefficients and domain boundaries and are given initial or boundary information. However, our main emphasis is on inverse problems. These occur when we have, say, an undetermined coefficient in the equation that we wish to recover together with the solution itself given additional, often boundary, information. This brings up several questions, but a recurring theme is to answer what differences using the fractional differential equation paradigm bring over the classical one. In some cases these differences can be substantial and can bring in new consequences for the underlying physical model. In others the differences are relatively minor. What we will often find is that the stability of the solutions for integer order and for fractional order derivatives in terms of the given initial/boundary data can differ markedly. Not only does this say something about the effect of the models from a physics standpoint, but it sometimes suggests a way to stabilise a highly ill-conditioned classical inverse problem by approximating it with a “nearby” fractional one. Issues such as these form a major theme and even perhaps the primary focus of the second part of the book. We have taken a broad perspective on the types of inverse problems. The reader will see ones where the underlying model is based on elliptic, parabolic, and hyperbolic equations together with what might be viewed as fractional counterparts to these classical equations. This transition to the fractional version can take several forms, each of which can add different types and levels of complexity not only to the forward problem, but also to inverse problems of recovering internal properties and features. 
There are many excellent books available, especially at the research level, that emphasise one aspect of fractional calculus or fractional diffusion operators. Our intent was to take a broader scope than the more closely focused research monographs and to select material for a text suitable for a graduate class whose students have a background in analysis and the basic theory of partial differential equations. Inevitably there will be background material required for understanding the more specialised topics in this book, which we provide to make the book self-contained. We would like to be able to claim that the book can be read completely in sequence. This was impossible as there are the inevitable uncertainties about the reader’s background as well as our wish to intermingle the applications and the historical record with a logical mathematical thread. However, we have attempted to minimize the need to “look forward” in the book as much as we thought practical.
The book has two logistical blocks: the study of fractional derivatives and operators together with their motivational and analytical background material in the first block, and wide coverage of inverse problems involving fractional operators in the second block. The two blocks are then quite suitable for a two-term course with a logical break. For a one-term course there are several options. Those who have a strong background in analysis and are willing to consult the first block as needed can proceed directly to Chapter 8. An alternative is that the first seven chapters together with the first sections of the final chapter could also provide a solid introduction to the components of fractional calculus and operators for differential equations at the expense of omitting the block on inverse problems. However, the authors are both hopeful and optimistic this is not the selection.
Acknowledgments. The work of the first author was supported by the Austrian Science Fund FWF under the grant DOC 78. The work of the second author was supported in part by the National Science Foundation through award DMS-2111020. Both authors acknowledge the support offered by the Oberwolfach Research Institute for Mathematics during a very productive Research in Pairs opportunity specifically awarded to work on this book. We thank Paul Kogler, Lisa Papitsch, Richard Emil Rauter, David Rackl, Benjamin Rainer, and Celina Strasser (University of Klagenfurt) for their careful reading of parts of a preliminary version of this book.
Barbara Kaltenbacher
William Rundell
Chapter 1
Preamble
The mathematical formulation of fractional calculus and the underpinnings of the physics behind diffusion both have their origins in the first half of the nineteenth century, but it is only in the last half of the twentieth century that these two topics have been seen to be related. The goal in this preamble is to indicate why these two ideas coalesced.
1.1. Diffusion and Brownian motion
One way to understand diffusion processes is provided by the theory of random walks. Perhaps the simplest, certainly the most well-known, example here is the so-called Brownian motion in which the dynamics is governed by an uncorrelated, Markovian, Gaussian stochastic process. In this case, the dynamics can be described at a macroscopic level using the diffusion equation. This connection, one that dates from at least the early 1900s, provides a transparent view of the diffusion model and allows effective solution by means of a well-understood partial differential equation (pde). However, as we shall see, it also points out its limitations. In particular, when the random walk involves correlations, non-Gaussian statistics, or non-Markovian “memory” effects, the diffusion equation cannot describe the macroscopic limiting process. We shall see that one way to extend the classical diffusion paradigm is to extend the idea of a derivative, an idea that in itself has a long mathematical history.
The roots of modern theory of diffusion processes are to be found in the early decades of the nineteenth century. The work of Fourier in formulating his famous heat equation comes to mind. Fourier’s arguments [108] were from a macroscopic viewpoint but from a particle-diffusion perspective.
The first systematic study was by the Scottish chemist, Thomas Graham. Graham was the inventor of dialysis through a method of separation of materials by diffusion through a membrane. His research work on diffusion in gases was performed beginning in 1828 and finally published in 1833 [118]. He observed that, “gases of different nature, when brought into contact, do not arrange themselves according to their density, the heaviest undermost, and the lighter uppermost, but they spontaneously diffuse, mutually and equally, through each other, and so remain in the intimate state of mixture for any length of time.” Graham did not only perform the first quantitative experiments on diffusion, but was able to make reliable measurements of the coefficient of diffusion. At about the same time period, in what turns out to be parallel work, is the celebrated Brownian motion which comes from the investigations of another Scottish scientist, the botanist Robert Brown in 1827 [38]. His initial observations were of the seemingly chaotic movement of pollen grains suspended in water when viewed under a microscope. He also observed a similar motion in particles of inorganic matter ruling out the possibility that the effect was life-related. However, Brown did not provide a mathematical framework for the phenomenon. In fact, amongst the first to address the problem of diffusion from a differential-equation viewpoint was the German physiologist Adolf Fick [105]. Following on from the work of Graham and also aware of Fourier’s heat equation formulation, Fick was investigating the way that fluids and nutrients travel through membranes in living organisms. His eponymous law postulates that the diffusive flux would flow from regions of high concentration to regions of low. Mathematically, this means that if u(x, t) is the concentration at point x at time t, then J, the amount of material per unit volume per unit time, will satisfy J ∝ ∇u or J = −D∇u, where D is the diffusivity. 
Combining this with the conservation law $u_t = -\operatorname{div} J$ (the rate of change of concentration in time is proportional to the net amount of flux leaving the region and where we have chosen the units to obtain the proportionality constant unity) gives the heat equation $u_t = D \operatorname{div}(\nabla u)$, or $u_t = D u_{xx}$ in one space dimension. Of course, this was much less of a first principles approach than it was an analogy with Fourier’s derivation of the heat equation for temperature flow some 50 years earlier. Einstein’s 1905 paper on the topic [92] provided the now generally accepted explanation for Brownian motion and had far-reaching consequences for physics. Indeed, he published four famous papers that year, the others being on special relativity [94], the correlation between matter and energy (the famous $E = mc^2$) [91], and on the photoelectric effect [93]. While the work on the first two is certainly the most celebrated—and the 1921 Nobel
prize was awarded for the photoelectric paper—the most often-cited is the one on Brownian motion. There are two key pieces to this work: first, the assumption that a change in the direction of motion of a particle is random and that the mean-squared displacement over many changes is proportional to time; second, he combined this with the Boltzmann distribution for a system in thermal equilibrium to get a value on the proportionality, the diffusivity $D$. Thus
$\langle x^2 \rangle = 2Dt, \qquad D = \frac{RT}{6 N \pi \eta a},$
where $T$ is the temperature, $R$ is the universal gas constant, $N$ is Avogadro’s number, and $\gamma = 6\pi\eta a$, where $a$ is the particle radius and $\eta$ is the viscosity, is Stokes’s relation for the viscous drag coefficient. With this he was able to predict the properties of the unceasing motion of Brownian particles in terms of collisions with surrounding liquid molecules. The French physicist Jean Baptiste Perrin did a series of experiments beginning in 1908, one of which verified Einstein’s equations and led to acceptance of the atomic and molecular-kinetic theory. Perrin received the Nobel prize for this work in 1926.
In the same year that Einstein was formulating the Brownian motion concept into a mathematical model, the British statistician Karl Pearson, in his letter to Nature [263], was asking the question:
A man starts from the point O and walks $\ell$ yards in a straight line; he then turns through any angle whatever and walks another $\ell$ yards in a second straight line. He repeats this process n times. I require the probability that after n stretches he is at a distance between r and r + δ from his starting point O.
A solution was provided almost immediately by Lord Rayleigh (who had been studying similar questions since 1880): If n be very great, the probability sought is
$\frac{2}{n}\, e^{-r^2/n}\, r\, dr.$
This prompted the retort by Pearson:
The lesson of Lord Rayleigh’s solution is that in open country the most probable place to find a drunken man who is at all capable of keeping on his feet is somewhere near his starting point!
From this one can infer the origins of the term “drunkard’s walk” as an alternative to “random walk”. Similar results have also been verified in numerous experiments including Perrin’s measurements of mean square displacements to determine Avogadro’s number. Thus, classical Brownian motion can be viewed as a random walk in which the dynamics is governed by an uncorrelated, Markovian, Gaussian stochastic process. On the other hand, when the random walk involves correlations, non-Gaussian statistics, or a non-Markovian process (for example, due to memory effects), the diffusion equation will fail to describe the macroscopic limit. Indeed, for an increasing number of processes the overall motion of an object is better described by steps that are not independent, identically distributed copies of each other and that can take different times to perform. In addition, the inclusion of memory effects (thus removing the assumption of a Markov process) can be particularly important. These considerations lead to so-called anomalous diffusion processes and the physically motivated examples are numerous. These include the foraging of animals: think of the random walk that asks one to minimize the time taken to go from a given starting place to an unknown region where food is available; as Rayleigh noted, the search path is definitely not Brownian. Actually, our issue with the fixed paradigms of Brownian motion should not lie only with the underlying assumptions; we should also ask for the outcomes of the model to be satisfactory from a physical perspective. Rayleigh’s observation that in Brownian motion the particles have a high probability of being near their starting position can be seen in terms of the probability density function given by the Gaussian: a relatively slow diffusion initially but a very rapid decay of the plume in space. It has certainly been observed that many processes do not exhibit this effect.
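The two quantitative statements of this section, that the mean square displacement of a random walk grows linearly with the number of steps and that Rayleigh's density describes the distance from the start when n is large, can be checked with a small simulation. The following is only a sketch under stated assumptions: steps have unit length, directions are independent and uniform in the plane, and the names `pearson_walk` and `simulate`, the seed, and the sample sizes are illustrative choices that do not come from the text.

```python
import math
import random

def pearson_walk(n_steps, rng):
    """Distance from the origin after n_steps unit steps taken in
    independent, uniformly random directions in the plane."""
    x = y = 0.0
    for _ in range(n_steps):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += math.cos(theta)
        y += math.sin(theta)
    return math.hypot(x, y)

def simulate(n_steps=100, n_walkers=5000, seed=0):
    """Final distances of n_walkers independent walks (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [pearson_walk(n_steps, rng) for _ in range(n_walkers)]

n = 100
distances = simulate(n_steps=n)

# Mean square displacement: for unit steps its expected value equals n.
msd = sum(r * r for r in distances) / len(distances)

# Rayleigh's density (2/n) r exp(-r^2/n) integrates to 1 - exp(-R^2/n) on [0, R].
R = math.sqrt(n)
empirical = sum(r <= R for r in distances) / len(distances)
rayleigh = 1.0 - math.exp(-R * R / n)

print(f"<r^2>/n = {msd / n:.3f}")
print(f"P(r <= sqrt(n)): simulated {empirical:.3f}, Rayleigh {rayleigh:.3f}")
```

Replacing the independent unit steps by correlated or heavy-tailed steps in `pearson_walk` is one simple way to see the linear law break down, which is the situation the anomalous diffusion models address.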
1.2. Early fractional integrals and their inverses
There is the well-known story of L’Hôpital asking Leibniz about the possibility that $n$ be a fraction in the formula $\frac{d^n}{dx^n}$ and the reply in 1695, “It will lead to a paradox.” But he added prophetically, “From this apparent paradox, one day useful consequences will be drawn” [209]. The idea of such a fractional derivative had already occurred to Leibniz through the observation that the basic formula $\frac{d}{dx} x^n = n x^{n-1}$ makes sense for noninteger values (and specifically in Leibniz’s case with $\alpha = \frac{1}{2}$). There were many attempts at such generalisations most of which were based on replacing formulae for integer order derivatives by a noninteger value. Mathematicians frequently ask if a definition can be extended to a broader set, and one of the earliest successes in the direction of an extended derivative was Euler’s generalisation of the factorial function to the Gamma
function in 1725. For this purpose he used the representation
(1.1) $n! = \prod_{k=1}^{n} k = \int_0^1 \bigl(-\log(x)\bigr)^n \, dx$
to obtain the formula (in modern notation)
(1.2) $\frac{d^\alpha x^\beta}{dx^\alpha} = \frac{\Gamma(\beta+1)}{\Gamma(\beta-\alpha+1)}\, x^{\beta-\alpha},$
valid for all real α and β and in complete analogy to the case when α is an integer. Of course generalising this beyond powers of x was another matter entirely and there turned out to be more complexity and indeed obstacles, not the least of which is the fact that the fractional derivative of a constant will not be zero. Indeed the original work by Euler and others in describing the Gamma function itself really only came to full fruition with the tools of complex analysis. We will see a similar pattern for the generalisation of the derivative to noninteger values.
The next significant mention of a derivative of noninteger order appears in a long textbook by the French mathematician, S. F. Lacroix [206] that in its day had widespread impact. Lacroix also noted the formula
$\frac{d^m x^n}{dx^m} = \frac{n!}{(n-m)!}\, x^{n-m}$
as being valid for integers n and m. Again, using the then available formula for the Gamma function extension of the factorial [208], replacing m by $\frac{1}{2}$ and n by any real number a, he obtained the formal expression
$\frac{d^{1/2}}{dx^{1/2}}\, x^a = \frac{\Gamma(a+1)}{\Gamma(a+\frac{1}{2})}\, x^{a-\frac{1}{2}}.$
He gives the explicit case for n = 1, y = x then as
$\frac{d^{1/2}}{dx^{1/2}}\, x = \frac{2}{\sqrt{\pi}}\, x^{1/2}$
since $\Gamma(\frac{3}{2}) = \frac{1}{2}\Gamma(\frac{1}{2}) = \frac{1}{2}\sqrt{\pi}$. This turns out to agree with the more rigorous formula obtained some years later by a variety of authors.
The key to the next phase included an application, namely the tautochrone problem of finding the curve down which a bead placed anywhere will fall to the bottom in the same amount of time. The solution was originally found by Huygens [154] in a formal way, but in 1823 Abel [1, 2] provided the rigorous mathematical solution to the general tautochrone problem:¹
¹ The text below is taken directly from [273], which is in turn taken from the translations [3].
Suppose that cb is a horizontal line, a is a setpoint, ab is perpendicular to bc, am is a curve with rectangular coordinates ap = x, pm = y. Moreover, ab = a, am = s. It is known that as a body moves along an arc ca, when the initial velocity is zero, that the time T, which is necessary for the passage, depends on the shape of the curve, and on a. One has to find the definition of a curve kca, for which the time T is a given function of a, for example ψ(a).
[Figure: Abel's sketch of the curve KCA, with the points K, L, C, B, M, P, and A marked.]
The picture above appeared in the French translation in [3] and Abel obtains the equation
$\psi(a) = \int_0^a \frac{ds}{\sqrt{a-s}},$
and he then continues:²
Instead of solving this equation, I will show how one can derive s from the more general equation
(1.3) $\psi(a) = \int_0^a \frac{ds}{(a-s)^n}$
where n has to be less than 1 to prevent the integral between the two limits being infinite; ψ(a) is an arbitrary function that is not infinite, when a = 0.
Abel seeks the unknown function s(x) in the form of a power series, and after term-by-term operations using Legendre’s then-recently discovered properties of the Gamma function, he arrives at the solution of equation (1.3)
(1.4) $s(x) = \frac{\sin(n\pi)}{\pi}\, x^n \int_0^1 \frac{\psi(xt)\, dt}{(1-t)^{1-n}}.$
As remarkable as this theorem is, Abel [1] further explores these representations to obtain, in his notation,
(1.5) $s(x) = \frac{1}{\Gamma(1-n)} \int^{n} \psi x \,.\, dx^n = \frac{1}{\Gamma(1-n)}\, \frac{d^{-n}\psi x}{dx^{-n}}.$
Thus Abel understood that he had unified the notions of integration and differentiation and their extension to noninteger orders. Equation (1.3) can be rewritten as
(1.6) $\psi(t) = \int_0^t \frac{s'(x)\, dx}{(t-x)^n}.$
² We added our equation numbers for later reference.
Thus we have the main ingredient: the Abel integral operator,
(1.7) $I^\alpha f = \frac{1}{\Gamma(\alpha)} \int_0^t \frac{f(\tau)\, d\tau}{(t-\tau)^{1-\alpha}}, \quad \alpha > 0.$
The reader will later recognise (1.6) again in Chapter 4 as the key element in the two most commonly used “modern” fractional derivatives for applications, the so-called Riemann–Liouville and Djrbashian–Caputo derivatives. Equally remarkable as the mathematical formulations is the fact that the entire structure was based on the solution of an important application rather than generalisation for its own sake. This aspect would to a certain extent be subordinated to purely mathematical development until the middle of the twentieth century when the use of fractional operators with their inherent nonlocal nature became a building block of a wide range of quite diverse applications.
Liouville published three long memoirs in 1832 and several more in the next two decades. His starting point is the known result for derivatives of integral order $D^m e^{ax} = a^m e^{ax}$ which he attempted to extend in a natural way to $D^\alpha e^{ax} = a^\alpha e^{ax}$. He expanded the function $f(x)$ in the series $f(x) = \sum_{k=0}^{\infty} c_k e^{a_k x}$ and defined the derivative of order α to be
(1.8) $D^\alpha f(x) = \sum_{k=0}^{\infty} c_k\, a_k^{\alpha}\, e^{a_k x}.$
This has the disadvantage that the allowed values of α have to be restricted to those for which the series in (1.8) converges. As a second method applied to negative powers $f(x) = x^{-a}$ where $a > 0$, he started with the Laplace transform formula
(1.9) $x^{-a} = \frac{1}{\Gamma(a)} \int_0^\infty t^{a-1} e^{-xt}\, dt,$
then used equation (1.8) to obtain
(1.10) $D^\alpha x^{-a} = (-1)^\alpha\, \frac{\Gamma(\alpha+a)}{\Gamma(a)}\, x^{-a-\alpha}.$
Clearly, this formula also has limited applicability. However, if it is formally generalised to $\beta = -a$ and disregarding the existence of the resulting integral, one obtains the formula
(1.11) $\frac{d^\alpha x^\beta}{dx^\alpha} = \frac{(-1)^\alpha\, \Gamma(-\beta+\alpha)}{\Gamma(-\beta)}\, x^{\beta-\alpha}.$
This formula is reminiscent of (1.2) and is in fact identical to it for the case of integral α. For nonintegral α things are different: taking $\alpha = \frac{1}{2}$ and $\beta = -\frac{1}{2}$ gives
(1.12) $\frac{\Gamma(\frac{1}{2})}{\Gamma(0)}\, \frac{1}{x} = 0 \neq \frac{(-1)^{1/2}\, \Gamma(1)}{\Gamma(\frac{1}{2})}\, \frac{1}{x} = \frac{i}{\sqrt{\pi}}\, \frac{1}{x}.$
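The agreement of (1.2) and (1.11) for integer α, and their disagreement in (1.12), can be confirmed by evaluating the two coefficients of $x^{\beta-\alpha}$ numerically. This is a minimal sketch with assumptions that are not in the text: the function names are illustrative, $1/\Gamma$ is assigned the value 0 at the poles of Γ, and $(-1)^\alpha$ is taken on the principal branch, $e^{i\pi\alpha}$.

```python
import cmath
import math

def reciprocal_gamma(z):
    """1/Gamma(z) for real z, with the value 0 at the poles z = 0, -1, -2, ..."""
    if z <= 0 and float(z).is_integer():
        return 0.0
    return 1.0 / math.gamma(z)

def euler_coefficient(alpha, beta):
    """Coefficient in (1.2): Gamma(beta + 1) / Gamma(beta - alpha + 1)."""
    return math.gamma(beta + 1) * reciprocal_gamma(beta - alpha + 1)

def liouville_coefficient(alpha, beta):
    """Coefficient in (1.11): (-1)^alpha Gamma(alpha - beta) / Gamma(-beta)."""
    minus_one_to_alpha = cmath.exp(1j * math.pi * alpha)  # principal branch (assumption)
    return minus_one_to_alpha * math.gamma(alpha - beta) * reciprocal_gamma(-beta)

beta = -0.5
for alpha in (1, 2, 3, 0.5):
    print(alpha, euler_coefficient(alpha, beta), liouville_coefficient(alpha, beta))
```

For the integer orders the two columns coincide up to rounding, while for α = 1/2 the first is zero and the second is purely imaginary, as in (1.12).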
Further analysis (see [146]) shows that the definitions of Euler and Liouville differed in their limits of integration. Indeed, the correct limits of integration are central: subsequent definitions recognised that a fractional derivative requires a starting point, as Abel was aware and in fact used. Liouville recast his formula (1.8) as a fractional integral of order α
(1.13) $\int^{\alpha} f(x)\, dx^{\alpha} = \frac{1}{(-1)^\alpha\, \Gamma(\alpha)} \int_0^\infty f(x+y)\, y^{\alpha-1}\, dy$
and then derived a formula for fractional differentiation
(1.14) $\frac{d^\alpha f}{dx^\alpha} = \frac{(-1)^{n-\alpha}}{\Gamma(n-\alpha)} \int_0^\infty \frac{d^n f(x+y)}{dx^n}\, y^{n-\alpha-1}\, dy, \quad n-1 < \alpha < n,$
where f is defined by his exponential formula (1.8) with $\lambda_n > 0$ and $f(-\infty) = 0$. See [220]. He further inserted the representation (1.8) directly into the above to obtain the series representation
(1.15) $\frac{d^\alpha f}{dx^\alpha} = \lim_{h\to 0} \frac{(-1)^\alpha}{h^\alpha} \sum_{k=0}^{\infty} (-1)^k \binom{\alpha}{k} f(x+kh), \quad \text{where } \binom{\alpha}{k} = \frac{\Gamma(\alpha-1)\,\Gamma(k-1)}{\Gamma(\alpha+k-1)}.$
Riemann’s work here stemmed from his student days and was published only posthumously in 1875 [287]. The definition taken was more Abelian in character: the fractional integral of order α was
(1.16) $I^\alpha f(x) = \frac{1}{\Gamma(\alpha)} \int_c^x (x-t)^{\alpha-1} f(t)\, dt + \psi(x).$
Here c is a given constant and ψ(x) represents the constant of integration or complementary function needed in some form. The latter caused considerable confusion, one representation being
(1.17) $\psi(x) = \sum_{k=1}^{\infty} K_k\, \frac{x^{-\alpha-k}}{\Gamma(-k-\alpha+1)},$
where $K_k$ are finite constants. As we will see, the modern resolution of this is to view the fractional integral and the derivative derived from it as being defined for a fixed starting point c which is then part of the definition.
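The role of the starting point c can be made concrete by evaluating the integral in (1.16), without the complementary function ψ, numerically. The sketch below makes several assumptions of its own: the name `abel_integral`, the midpoint rule, and the substitution $u = (x-t)^\alpha$ used to remove the weak singularity at t = x are illustrative choices, and the routine is intended for 0 < α ≤ 1 and reasonably smooth f. For c = 0 and $f(t) = t^\beta$ the result is compared with $\Gamma(\beta+1)\, x^{\beta+\alpha} / \Gamma(\beta+\alpha+1)$, which is (1.2) with α replaced by −α.

```python
import math

def abel_integral(f, alpha, x, c=0.0, n=2000):
    """Approximate (1/Gamma(alpha)) * int_c^x (x - t)^(alpha - 1) f(t) dt.

    With u = (x - t)^alpha the integral becomes
    (1/Gamma(alpha + 1)) * int_0^{(x - c)^alpha} f(x - u^(1/alpha)) du,
    which has no singular kernel and is handled by the composite midpoint rule.
    """
    upper = (x - c) ** alpha
    h = upper / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        total += f(x - u ** (1.0 / alpha))
    return total * h / math.gamma(alpha + 1)

# Power function with starting point c = 0.
x, alpha, beta = 1.5, 0.3, 2.0
numeric = abel_integral(lambda t: t ** beta, alpha, x)
exact = math.gamma(beta + 1) / math.gamma(beta + alpha + 1) * x ** (beta + alpha)
print(numeric, exact)

# The same integrand and endpoint, but two different starting points.
from_zero = abel_integral(lambda t: 1.0, 0.5, 1.0, c=0.0)
from_half = abel_integral(lambda t: 1.0, 0.5, 1.0, c=0.5)
print(from_zero, from_half)
```

The last two values differ even though the integrand and the evaluation point are the same, which illustrates why c has to be regarded as part of the definition.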
With this definition of the fractional integral, Riemann suggested that the fractional derivative of order α be defined by taking the ordinary derivative of the Abel integral of order 1 − α,
(1.18) $[D^\alpha f](x) = \frac{d}{dx}\, [I^{1-\alpha} f](x).$
More modern notation would write this as ${}^{RL}D^\alpha f$ to incorporate the contribution of Liouville in the notation. If the historical record had been more accurate, this might more appropriately be written ${}^{A}D^\alpha$. Despite the above drawbacks in the approaches of both Liouville and Riemann, their names formed the nomenclature from the late nineteenth century: the fractional integral is a weakly singular integral of Abel type and the fractional derivative of order α is the regular derivative of the fractional integral of order 1 − α.
There were other nineteenth century attempts at defining the concept of a fractional derivative that have survived the passage of time. Foremost amongst these is the so-called Grünwald–Letnikov derivative. The idea of Grünwald was to extend the basic difference limit for the derivative to higher orders,
$f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}, \qquad f''(x) = \lim_{h\to 0} \frac{f(x+2h) - 2f(x+h) + f(x)}{h^2}, \quad \ldots;$
the general case uses the binomial theorem to obtain
(1.19) $f^{(n)}(x) = \lim_{h\to 0} \frac{1}{h^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k} f\bigl(x + (n-k)h\bigr).$
Removing the restriction that n be an integer in (1.19) gives
(1.20) $D^\alpha f(x) = \lim_{h\to 0} \frac{1}{h^\alpha} \sum_{k=0}^{\infty} (-1)^k \binom{\alpha}{k} f\bigl(x + (\alpha-k)h\bigr).$
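A truncated sum of Grünwald type can be evaluated directly, which also makes its nonlocal character visible. The sketch below is an illustration rather than formula (1.20) verbatim: it uses the backward shift $f(x - kh)$ that is common in numerical work, fixes 0 as the starting point by choosing $h = x/n$, and stops the sum at $k = n$. The function name and parameters are hypothetical. As a check, the half derivative of $f(x) = x$ is compared with Lacroix's value $2\sqrt{x}/\sqrt{\pi}$.

```python
import math

def grunwald_letnikov(f, alpha, x, n=2000):
    """Approximate the order-alpha derivative of f at x > 0 with starting point 0.

    Evaluates h^(-alpha) * sum_{k=0}^{n} w_k f(x - k h) with h = x / n and
    weights w_k = (-1)^k * binom(alpha, k), generated by a recurrence.
    """
    h = x / n
    weight = 1.0                      # w_0 = 1
    total = weight * f(x)
    for k in range(1, n + 1):
        weight *= 1.0 - (alpha + 1.0) / k   # w_k = w_{k-1} * (1 - (alpha + 1)/k)
        total += weight * f(x - k * h)      # every earlier grid value contributes
    return total / h ** alpha

x = 1.0
half_derivative = grunwald_letnikov(lambda t: t, 0.5, x)
lacroix_value = 2.0 * math.sqrt(x) / math.sqrt(math.pi)
print(half_derivative, lacroix_value)

# For alpha = 1 all weights beyond w_1 vanish and the usual difference quotient remains.
first_derivative = grunwald_letnikov(lambda t: t, 1.0, x)
print(first_derivative)
```

For noninteger α none of the weights vanish, so the loop must visit the whole interval from the starting point to x, whereas for α = 1 only the two most recent values are used.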
This idea of an infinite sum of difference quotients was certainly inspired by the work of Liouville. Note that the computation of this derivative requires knowledge of the entire past history of the system due to the summation operator and thus, as in the case of the Abel formulation, it must be viewed as a nonlocal operator. The above ideas were taken further into what is now known as the Gr¨ unwald–Letnikov derivative. We now take the theme into the twentieth century. The motivation for the work of Weyl on fractional integrals was due to the fact that the Abel integral does not preserve periodicity; f being periodic of, say, period 2π does not guarantee that integrals I α f have the same period or even that they
are periodic at all. This is simply a statement that this fractional derivative is better suited to integration of functions defined by power series than by Fourier series. Functions periodic on the unit circle $f = \sum_{-\infty}^{\infty} c_k e^{2\pi i k}$ for which the integral vanishes ($c_0 = 0$) will, when integrated, be again periodic after choosing the constant of integration appropriately. Thus if $f \in L^{p}(0,2\pi)$ with $1 \le p < \infty$ such that $f$ is periodic of period $2\pi$ and the integral of $f$ vanishes, then the Weyl fractional integral of order $\alpha$ [346] is
$$[{}^{W}I^{\alpha}_{\pm} f](x) = \frac{1}{2\pi}\int_{0}^{2\pi}\Psi^{\alpha}_{\pm}(x-y)\,f(y)\,dy, \qquad \Psi^{\alpha}_{\pm}(z) = \sum_{k=-\infty,\ k\neq 0}^{\infty}\frac{e^{ikz}}{(\pm ik)^{\alpha}}. \tag{1.21}$$
The Weyl fractional integral is identical to the Abel fractional integral when the function $f$ is $2\pi$ periodic and its integral over a period vanishes, that is with $c_0 = 0$ in the above. As in the Riemann–Liouville case, the Weyl fractional derivative of order $\alpha$ is defined as
$$[{}^{W}D^{\alpha}_{\pm} f](x) = \pm\frac{d}{dx}\,[{}^{W}I^{1-\alpha}_{\pm} f](x). \tag{1.22}$$
The Abel, Riemann–Liouville, and Weyl derivatives and integrals are all one-sided as they require a fixed lower endpoint and an upper value to either the left or right of this. The Riesz derivative [288, 290] was designed to be more symmetric and suitable for boundary value as opposed to initial value problems. Riesz defined $I^{\alpha} f$ for $f \in L^{1}_{\mathrm{loc}}(\mathbb{R})$ by
$$[I^{\alpha} f](x) = \frac{1}{2\cos(\alpha\pi/2)}\Bigl([{}^{W}I^{\alpha}_{+} f](x) + [{}^{W}I^{\alpha}_{-} f](x)\Bigr). \tag{1.23}$$
The Riesz derivative is then defined in an analogous way to the Weyl derivative.

Based on the work of Riesz, Hardy and Littlewood investigated the mapping properties of the Riemann–Liouville derivative in Hölder spaces. A function $f$ defined in an open set $\Omega \subset \mathbb{R}^{n}$ is said to be Hölder continuous of order $\beta$, $0 < \beta \le 1$, if $|f(x) - f(y)| \le C\,\|x-y\|^{\beta}$. The Hölder space $C^{k,\beta}(\Omega)$ consists of those functions on $\Omega$ having continuous derivatives up to order $k$ and such that the $k$th partial derivatives are Hölder continuous with exponent $\beta$. The parameter $\beta$ allows an interpolation between the integer derivatives, and one might ask, “Is that not what fractional derivatives do and what is the connection?” That such a connection exists is the result of the Hardy–Littlewood theorem which (roughly) states that for $f \in C^{k,\beta}(\Omega)$ the $\alpha$th fractional derivative of $f$ lies in $C^{k,\beta-\alpha}(\Omega)$ [138, page 587]. There is a generalisation to functions in the Sobolev spaces $H^{m,p}(\Omega)$: the Hardy–Littlewood–Sobolev theorem.
At this stage the concept of a fractional derivative seemed reasonably complete and precise, but there were still many fractional calculus topics to be explored. Such a derivative had the properties originally laid out as essential: it should be a linear operation and defined for as broad a class of functions as possible while for integer values it should equal the usual integer derivative. The Riemann–Liouville version met these criteria but at a cost. The fractional derivative of a constant was nonzero; in fact it was a singular function. From a purely mathematical perspective this doesn’t matter, but for applications it most certainly does. Newton’s original motivation for the subject would no longer make sense for such definitions.

There is a simple solution. Simply reverse the orders of fractional integration and integer derivatives in the Riemann–Liouville derivative. The cost here is to narrow the class of functions to which the derivative then applies. Such a version of the Abel derivative was studied extensively beginning in the 1950s by the Armenian mathematician Mkhitar Djrbashian and culminating in his extensive research monograph on the topic published in 1966 [87], but only available in Russian. The English translation appeared in 1993 [81]. In 1967 Caputo realised that using standard derivatives in modeling many physical systems showed that the decay was exponential irrespective of frequency, and this simply did not correspond to observations. He pioneered the derivative first version for the Abel integral as a superior model, and the work of many others followed suit. For this reason the version is often referred to as the Caputo derivative, although its definition dates back to 1823 and the work of Djrbashian was almost unknown in the west.

The basic function of ordinary differential equations (ode) is the exponential: the solution to $\frac{dy}{dx} = \lambda y$ is the function $y = e^{\lambda x}$. What is the equivalent for fractional derivatives? The answer turns out to be another special function of entire type: the eponymous function
$$E_{\alpha,\beta}(z) = \sum_{k=0}^{\infty}\frac{z^{k}}{\Gamma(\alpha k+\beta)}, \qquad z \in \mathbb{C}, \tag{1.24}$$
was studied in a series of papers by Mittag-Leffler from 1903–1905 [250,251]. It forms a critical component of all of fractional calculus. The exponential function is also the key component of the fundamental solution of the heat equation, but replacing the time derivative $u_t$ by one of fractional type, $D_t^{\alpha} u$, requires another special function of entire type: the Wright function [354–357]. This is originally due to the number theorist E. M. Wright in connection with his investigations in the asymptotic theory of partitions, but it has found many, seemingly unconnected, applications within several areas of mathematics.
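To make the series (1.24) concrete, here is a minimal Python sketch that sums a fixed number of its terms for real arguments. The function name, the restriction to real $z$ with $\alpha, \beta > 0$, and the truncation length are assumptions made for illustration; direct summation of this kind is only reasonable for arguments of moderate size and is not meant as a general-purpose evaluation method. Two classical special cases serve as sanity checks: $E_{1,1}(z) = e^{z}$ and $E_{2,1}(-z^{2}) = \cos z$.

```python
import math


def mittag_leffler(z, alpha, beta=1.0, terms=100):
    """Partial sum of the series (1.24) for real z and alpha, beta > 0.

    Each term z^k / Gamma(alpha*k + beta) is formed in log space so that
    the Gamma function does not overflow for larger k.
    """
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    total = 0.0
    for k in range(terms):
        log_magnitude = k * math.log(abs(z)) - math.lgamma(alpha * k + beta)
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * math.exp(log_magnitude)
    return total


if __name__ == "__main__":
    print(mittag_leffler(1.0, 1.0), math.e)           # E_{1,1}(1) vs e
    print(mittag_leffler(-4.0, 2.0), math.cos(2.0))   # E_{2,1}(-4) vs cos(2)
```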
The notion of fractional derivatives takes on a new complexity when one considers differential operators acting on functions of several variables, for example, the Laplacian $-\Delta$. Certainly, the second power $\Delta^{2}$ leads to the biharmonic equation $\Delta^{2} u = f$ which has wide applicability, but what about fractional powers or, in greater generality, if $-\Delta$ is replaced by a general second order elliptic operator $L$? This introduces a rich area for applications and, over the last ten years, this has become a vibrant area of research.
1.3. A short overview on inverse problems

Not surprisingly, there are many examples of inverse-type problems in mathematics that go far back in history. We are of course most interested in those that involve differential equations, and one of the most quoted, and from Section 1.2 quite relevant to our specific topic, is the tautochrone problem.

The invention of the pendulum clock around 1656 by Christiaan Huygens was inspired by investigations of Galileo Galilei a half century before and resulted in more than a thousandfold increase in accuracy over previous mechanisms. Although a good approximation to a harmonic oscillator, the actual resulting equation is nonlinear and accuracy deteriorates with amplitude of oscillation. Thus further solutions were required. Huygens again proposed that the pendulum be constrained by placing two evolutes on either side, and the question was then what shape they should take to make the pendulum isochronous. His answer, in 1673, was to make the pendulum swing in an arc of a cycloid by choosing the evolute to be an inverted cycloid in shape. Unfortunately, friction against the evolute causes a greater error than that corrected by the cycloidal path.

In the early part of the seventeenth century Kepler published his laws of planetary motion having found them by analyzing the astronomical observations of his mentor Tycho Brahe. They gave remarkable accuracy but were ad hoc and derived essentially by data fitting rather than from firm principles. Newton provided the answer: the gravitational attraction between the planets and the sun coming from a central, radial force. The further question was what precise form this force should take. The choice was the inverse square law that allowed the direct derivation of Kepler’s equations from this assumption alone.
In retrospect this can be rephrased as, “What gravitational force law would give rise to Kepler’s equations?”, and the solution to this inverse problem constituted one of the greatest scientific triumphs of all time.

The solution of Poisson’s equation $-\Delta u = f$ for $u$ given a function $f$ is of course one of the most basic problems in differential equations. The inverse gravimetry problem of recovering $f$ given the measured values of $\nabla u$ on the boundary $\partial\Omega$ was known to Laplace. But the first simple solutions were
obtained only after another hundred years by Stokes and some fifty years later by Herglotz. An excellent exposition of this problem can be found in [160].

Despite these and later triumphs that might be classified as inverse problems, the study of inverse problems for differential equations did not take place until the twentieth century and its systematic study not until the last half. Two early examples are noteworthy. In 1911 Hermann Weyl [345] showed the asymptotic behavior of eigenvalues of the Laplace–Beltrami operator in a domain $\Omega \subset \mathbb{R}^{d}$: the number $N(\lambda)$ of the Dirichlet eigenvalues (counting their multiplicities) less than or equal to $\lambda$ is
$$N(\lambda) = (2\pi)^{-d}\lambda^{d/2}\omega_{d}\,\mathrm{vol}(\Omega) \mp \tfrac{1}{4}(2\pi)^{1-d}\omega_{d-1}\lambda^{(d-1)/2}\,\mathrm{area}(\partial\Omega) + o\bigl(\lambda^{(d-1)/2}\bigr),$$
where $\omega_{d}$ is the volume of the unit ball in $\mathbb{R}^{d}$; see [345]. Based on this, one can recover the volume and surface area of $\Omega$ from spectral data. This led to questions about whether further geometrical quantities of a region $\Omega$ could be determined from such spectral information epitomized by Marc Kac’s famous article, “Can you hear the shape of a drum?” [172].

Following similar spectral question lines, in the late 1920s Ambartsumian in his study of atomic structure and the energy levels arising from the Schrödinger equation asked the question of whether the potential arising from the force field could be determined from spectral measurements which directly correlate to the eigenvalues of the equation. He in fact gave a solution in a particular case, and the complete solution was provided by Borg in 1946 [36]. This showed that the potential $q(x)$ in $-u_{xx} + q(x)u = \lambda_{n} u$ on a finite interval could be uniquely determined from two complete spectral series $\{\lambda_{n}\}_{n=1}^{\infty}$ and $\{\mu_{n}\}_{n=1}^{\infty}$ corresponding to different endpoint boundary conditions. This work was continued in 1951 by Gel’fand and Levitan using a much shorter and ingenious method that subsequently has been much copied and expanded beyond the original scope of the problem; see [111].
It also included a constructive algorithm for $q(x)$. This was later improved to the point where the inverse problem of recovering $q(x)$ from its spectral data is solved significantly faster than computing even a few eigenvalues given $q(x)$, and it also expanded to uniqueness and reconstruction algorithms for more general equations $-(p(x)u')' + q(x)u = \lambda\rho(x)u$ and boundary conditions [57, 300].

Although pdes have a history dating to at least the middle of the eighteenth century and they were extensively used as models to describe physical processes throughout this period, the formulation and solution methods were less rigorous. The early twentieth century saw a considerable shift, not
just in their analysis but in their relationship to physics; the birth of modern physics was accomplished by increased sophistication in mathematics. In all these periods of intense discovery there is a tendency to draw lines, and one famous example is due to Hadamard. Hadamard took the point of view that every mathematical problem corresponding to some physical or technological problem must be well-posed. His definition of this was that the mathematical problem had to have a solution, this solution was unique, and further, it had to depend continuously on the data. The third was the most controversial, and the reasoning behind it was that an arbitrary small change in the data should not lead to large changes in the solution. The most explicit versions of these statements were in his 1923 book [127] on pdes, which was extremely influential, but there were similar statements much earlier [125,126]. Of course several already well-known problems such as the backwards heat equation and the Cauchy problem for Laplace’s equation fell into the ill-posed category, as it became known. Alternative names were equally pejorative: “incorrectly set” or “ill-conditioned” problems. The result was that for several decades such classic inverse problems as noted above or the recovery of parameters within the equation were viewed in mathematically negative terms—despite the fact that there were critical applications to the contrary. The rationale for this negativity was based on the instability arising from even small measurement errors and hence a resulting lack of reliability on interpreting the solution. The 1960s saw the tide turn with a series of methods aimed at quantifying and ameliorating the level of ill-conditioning—regularisation methods that sought to temper the effect. We show some of these in Chapter 8 and indeed they are a fundamental tool in almost all inverse problems involving differential equations. 
The same decade saw the beginning of a systematic study in the determination of unknown coefficients occurring in partial differential operators from extra information measurements. Frequently, the latter consisted of additional boundary measurements. For example, if the direct problem partial differential equation (pde) is of elliptic type and subject to a Dirichlet boundary condition, then one might additionally measure Neumann conditions. If the equation is of parabolic type, one might use the values of the solution $u(x,t)$ for a fixed value of $x$ and $t$ ranging over an interval. We will refer to this as “time trace data” throughout the book.

An alternative situation is when part of the boundary is hidden and one wants to recover solution values $u(x)$ there from additional measurements at accessible parts of the boundary. Examples here include the abovementioned Cauchy problem for Laplace’s equation and the backwards heat equation whereby one is able to measure $u(x,t)$ at a later time $t = T$ and
hopes to recover the initial configuration of temperature $u(x,0)$. These very classic problems were known in their basic form from at least the nineteenth century. They have unique solutions but the dependence of the unknowns on the data is highly unstable.

Another important problem is that of inverse scattering. A common paradigm is to convert the wave equation to frequency domain and obtain the Helmholtz equation $\Delta u + k^{2} u = 0$. In the simplest situation, we have an impenetrable scatterer $D$; a plane wave with direction $d$ is fired at $D$ and the resulting amplitude is measured in all directions on a large sphere (which could have infinite radius), the so-called far-field pattern. The uniqueness question is whether these measurements allow recovery of $\partial D$. Further questions include allowing a penetrable obstacle and the equation $\Delta u + k^{2} n(x) u = 0$, where $n(x)$ is the interior refraction index of $D$, but additionally providing the far-field pattern arising from a complete set of incident directions.

Problems such as these are at the heart of imaging modalities and the determination of material parameters where the only information that can be measured is exterior to the region. The applications are ubiquitous in science and engineering.

In the above models we have classical pdes of all the three main types and in each case the derivatives are the usual ones of integer order. One of the main purposes of this book is to answer the following questions: What changes if some of the integer order derivatives are replaced by ones of fractional type? Do we still retain uniqueness? Are there cases where the integer order situation leads to an underdetermined problem but uniqueness is restored under fractional derivatives? Does the degree of ill-conditioning depend on the fractional order $\alpha$, and if so, to what extent?
There is also the additional question of determining the fractional operator itself, for this might depend on several fractional exponents as well as coupling coefficients.

Let us finally point out that for some classical examples, such as the backwards heat equation, it is immediate what a physically meaningful fractional counterpart should be, namely replacing the first order time derivative by a fractional one. This is much less clear for some others, like inverse scattering, where space derivatives in higher dimensions play a central role. Thus we focus on inherently one-dimensional derivative concepts and predominantly time derivatives. These come with a decisive nonlocality and directionality that in some inverse problems strongly influence the degree of their ill-posedness. Going to higher (space) dimensions also requires mathematical tools that are very different from those used here. For these reasons, some of the recent highly productive areas of inverse problems for fractional pdes, such as those involving fractional versions of the Laplacian, are only touched upon in an outlook in the final chapter.
Chapter 2
Genesis of Fractional Models
After having reviewed the mathematical history of fractional derivatives, we now turn to their physical motivation, which, as already mentioned, has driven much of the mathematical tool development over the last few decades. Here, on one hand, we discuss a derivation of fractional models, in particular models of anomalous diffusion, originating in random walks and therefore based on stochastic principles. On the other hand, we highlight the fact that the phenomenological definition of constitutive laws using fractional derivatives upon their combination with classical physical balance laws leads to fractional pde models as well.
2.1. Continuous time random walk

The classical diffusion equation, $u_t - k\Delta u = 0$, has been used extensively to describe transport phenomena observed in nature. In the context of diffusion, $u$ often denotes the concentration of a substance, $k$ is the diffusion coefficient, and the diffusion equation models how the concentration evolves over time. This equation can be derived from a purely macroscopic argument, based on conservation of mass
$$\frac{\partial u}{\partial t} + \nabla\cdot J = 0,$$
where $J$ is the flux, and Fick’s first law of diffusion, which states that the flux $J$ equals $-k\nabla u$, i.e., $J = -k\nabla u$. Alternatively, as we noted in Chapter 1, following the ground-breaking work of Albert Einstein in 1905 [92], one can also derive it from the underlying stochastic process at a microscopic level, under the assumption that the particle movement follows a Brownian motion in the sense that the probability density function of the particle at position $x$ and time $t$ satisfies the constraint that the mean square path is proportional to time; $\langle x^{2}\rangle \propto t$.

In this chapter, we shall derive fractional diffusion models from the continuous time random walk due to Montroll and Weiss [252], which generalise Brownian motion. We shall show that the resulting probability density function (pdf) $p(x,t)$ of the walker at position $x$ and time $t$ satisfies a pde involving a fractional-order derivative in either time or space. The main tools in the derivation are the Laplace and Fourier transformations; see Section A.1 for a brief review.

2.1.1. Random walk on a lattice. Perhaps the simplest stochastic approach for deriving the diffusion equation is to consider the random walk framework on a lattice, which is also known as a Brownian walk. At every time step (with a time step size $\Delta t$), the walker randomly jumps to one of its four nearest neighbouring sites on a square lattice (with a space step $\Delta x$). In the one-dimensional case, such a process can be modelled by the master equation
$$p_{j}(t+\Delta t) = \tfrac{1}{2}\,p_{j-1}(t) + \tfrac{1}{2}\,p_{j+1}(t),$$
where the index $j$ denotes the position on the underlying one-dimensional lattice (grid). It expresses the pdf of being at position $j$ at time $t+\Delta t$ in terms of the pdf at the two adjacent sites $j \pm 1$ at time $t$. The factor $\tfrac{1}{2}$ expresses the directional isotropy of the jumps: jumps to the left and right are equally likely. A rearrangement gives
$$\frac{p_{j}(t+\Delta t) - p_{j}(t)}{(\Delta x)^{2}} = \frac{1}{2}\,\frac{p_{j-1}(t) - 2p_{j}(t) + p_{j+1}(t)}{(\Delta x)^{2}}.$$
The right-hand side is clearly the central finite difference approximation of the second order derivative in space $\frac{\partial^{2} p(x,t)}{\partial x^{2}}$. In the continuum limits $\Delta t \to 0$ and $\Delta x \to 0$ (and denoting $x = j\Delta x$), provided that the pdf $p(x,t)$ is regular in $x$ and $t$, the Taylor expansions
$$p(x,t+\Delta t) = p(x,t) + \Delta t\,\frac{\partial p(x,t)}{\partial t} + O\bigl((\Delta t)^{2}\bigr),$$
$$p(x\pm\Delta x,t) = p(x,t) \pm \Delta x\,\frac{\partial p(x,t)}{\partial x} + \frac{(\Delta x)^{2}}{2}\,\frac{\partial^{2} p(x,t)}{\partial x^{2}} + O\bigl((\Delta x)^{3}\bigr)$$
lead to the diffusion equation
$$\frac{\partial u(x,t)}{\partial t} = K\,\frac{\partial^{2} u(x,t)}{\partial x^{2}}, \tag{2.1}$$
on discarding the higher order terms in $\Delta t$ and $\Delta x$. The continuum limit is drawn such that the quotient
$$K = \lim_{\Delta x\to 0,\ \Delta t\to 0}\frac{(\Delta x)^{2}}{2\Delta t} \tag{2.2}$$
is a positive constant. $K$ is referred to as the diffusion coefficient: it connects the spatial and time scales.
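The link between the lattice master equation and the diffusion coefficient can be checked directly. The minimal Python sketch below iterates the one-dimensional master equation from a walker that starts at the origin and measures the variance of the resulting pdf; the function names and the particular values of the space and time steps are illustrative assumptions. After $n$ steps the variance is $n(\Delta x)^{2}$, which equals $2Kt$ with $t = n\Delta t$ and $K = (\Delta x)^{2}/(2\Delta t)$.

```python
def evolve_master_equation(n_steps):
    """Iterate p_j(t + dt) = p_{j-1}(t)/2 + p_{j+1}(t)/2 from a point mass.

    The lattice is wide enough that the walker cannot reach its ends
    within n_steps, so no probability is lost at the boundary.
    """
    half_width = n_steps + 1
    p = [0.0] * (2 * half_width + 1)
    p[half_width] = 1.0  # walker starts at lattice site j = 0
    for _ in range(n_steps):
        new_p = [0.0] * len(p)
        for j in range(1, len(p) - 1):
            new_p[j] = 0.5 * p[j - 1] + 0.5 * p[j + 1]
        p = new_p
    return p, half_width


def lattice_variance(p, half_width, dx):
    """Second moment of the lattice pdf, with site j located at x = j * dx."""
    return sum(((j - half_width) * dx) ** 2 * pj for j, pj in enumerate(p))


if __name__ == "__main__":
    dx, dt, n_steps = 0.1, 0.005, 200
    K = dx ** 2 / (2.0 * dt)          # diffusion coefficient, cf. (2.2)
    p, half_width = evolve_master_equation(n_steps)
    print(lattice_variance(p, half_width, dx), 2.0 * K * n_steps * dt)
```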
Diffusion equation (2.1) can also be regarded as a direct consequence of the central limit theorem, which we describe below. Generally, suppose that the jump length $\Delta x$ has a pdf given by $\lambda(x)$ so that $P(a < \Delta x < b)$, with $a < b$, is given by
$$P(a < \Delta x < b) = \int_{a}^{b}\lambda(x)\,dx.$$
Then taking the Fourier transform gives
$$\tilde\lambda(\xi) = \int_{-\infty}^{\infty} e^{-i\xi x}\lambda(x)\,dx = \int_{-\infty}^{\infty}\bigl(1 - i\xi x - \tfrac{1}{2}\xi^{2}x^{2} + \cdots\bigr)\lambda(x)\,dx = 1 - i\xi\mu_{1} - \tfrac{1}{2}\xi^{2}\mu_{2} + \cdots,$$
where $\mu_{j}$ is the $j$th moment
$$\mu_{j} = \int_{-\infty}^{\infty} x^{j}\lambda(x)\,dx,$$
provided that these moments exist, which holds if the pdf $\lambda(x)$ decays sufficiently fast as $x \to \pm\infty$. Further, assume that the pdf $\lambda(x)$ is normalised and even, i.e., $\mu_{1} = 0$, $\mu_{2} = 1$, and $\mu_{3} = 0$. Then
$$\tilde\lambda(\xi) = 1 - \tfrac{1}{2}\xi^{2} + O(\xi^{4}).$$
In the random walk model, the steps $\Delta X_{1}, \Delta X_{2}, \ldots$ are independent events. The sum of the independent and identically distributed (i.i.d.) random variables $\Delta X_{n}$,
$$X_{n} = \Delta X_{1} + \Delta X_{2} + \cdots + \Delta X_{n},$$
gives the position of the walker after $n$ steps. It is also a random variable, and by Theorem 2.1 its density has a Fourier transform $\tilde p_{n}(\xi) = (\tilde\lambda(\xi))^{n}$, and the pdf of the normalised sum $X_{n}/\sqrt{n}$ (with a rescaling according to (2.7) with $\sigma = \sqrt{n}$) has the Fourier transform
$$\tilde p_{n}(\xi/\sqrt{n}) = \Bigl(1 - \frac{1}{2n}\xi^{2} + O(n^{-2})\Bigr)^{n}.$$
Taking the limit $n \to \infty$ gives $\tilde p(\xi) = e^{-\xi^{2}/2}$ and inverting the Fourier transform gives a Gaussian distribution $p(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}$. This is a statement of
the central limit theorem that the long term averaged behaviour of i.i.d. variables has a Gaussian density. One requirement for the whole procedure to work is that the second moment $\mu_{2}$ of $\lambda(x)$ be finite.

Now we interpret $X_{n}$ as the particle position after $n$ time steps, at the actual measured time $t$. To this end, we shall correlate the time step size $\Delta t$ with the variance of $\Delta x$, following the ansatz (2.2). This can be easily achieved by rescaling the variance of $\lambda(x)$ to $2Kt$. Then by the scaling rule for the Fourier transform, the Fourier transform $\tilde p_{n}(\xi)$ is
$$\tilde p_{n}(\xi) = \bigl(1 - n^{-1}Kt\xi^{2} + O(n^{-2})\bigr)^{n},$$
and taking the limit of a large number of steps $n \to \infty$ we arrive at the Fourier transform $\tilde p(\xi) = e^{-\xi^{2}Kt}$, which upon inverse Fourier transformation gives the pdf of being at a certain position $x$ at time $t$. This pdf is governed by diffusion equation (2.1), and it is given by the Gaussian pdf
$$p(x,t) = \frac{1}{\sqrt{4\pi Kt}}\,e^{-x^{2}/4Kt}.$$
This function is the fundamental solution (also known as the propagator in physics), i.e., the solution of (2.1) with the initial condition $p(x,0) = \delta(x)$, the Dirac function concentrated at $x = 0$. We note that at any fixed time $t > 0$, the function $p(x,t)$ is a Gaussian distribution in $x$, with mean zero and variance $2Kt$, i.e.,
$$\int_{-\infty}^{\infty} x^{2}p(x,t)\,dx = 2Kt,$$
which scales linearly with time $t$. Linear scaling with $t$ is one characteristic feature of normal diffusion processes.

2.1.2. Continuous time random walk. There are several stochastic models for deriving differential equations involving a fractional order derivative. We shall only briefly discuss the continuous time random walk framework due to Montroll and Weiss [252].

2.1.3. General strategy. The continuous time random walk (ctrw) generalises the random walk model, in which the length of a given jump, as well as the waiting time elapsing between two successive jumps, follow a given pdf. Below we shall assume that these two random variables are uncorrelated, even though in theory one can allow correlation in order to achieve more flexible modelling. In one spatial dimension, the picture is as follows: a walker moves along the $x$-axis, starting at a position $x_{0}$ at time $t_{0} = 0$. At time $t_{1}$, the walker jumps to $x_{1}$, then at time $t_{2}$ jumps to $x_{2}$, and so on. We assume that the temporal and spatial increments
$$\Delta t_{n} = t_{n} - t_{n-1} \quad\text{and}\quad \Delta x_{n} = x_{n} - x_{n-1}$$
are i.i.d. random variables, following pdfs $\psi(t)$ and $\lambda(x)$, respectively, which are known as the waiting time distribution and jump length distribution, respectively. Namely, the probability of $\Delta t_{n}$ lying in any interval $[a,b] \subset (0,\infty)$ is
$$P(a < \Delta t_{n} < b) = \int_{a}^{b}\psi(t)\,dt,$$
and the probability of $\Delta x_{n}$ lying in any interval $[a,b] \subset \mathbb{R}$ is
$$P(a < \Delta x_{n} < b) = \int_{a}^{b}\lambda(x)\,dx.$$
Now the goal is to determine the probability that the walker lies in a given spatial interval at time $t$. For given pdfs $\psi$ and $\lambda$, the position $x$ of the walker can be regarded as a step function of $t$. We recall a standard result on the sum of two independent random variables.

Theorem 2.1. If $X$ and $Y$ are independent random variables with a pdf given by $f$ and $g$, respectively, then the sum $Z = X + Y$ has a pdf $f * g$.

Example 2.1. Suppose that the waiting time distribution $\psi(t)$ is Poissonian with a parameter $\tau > 0$,
$$\psi(t) = \tau^{-1}e^{-t/\tau}, \qquad 0 < t < \infty,$$
and the jump length distribution $\lambda(x)$ is Gaussian with mean zero and variance $\sigma^{2}$,
$$\lambda(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\,e^{-\frac{x^{2}}{2\sigma^{2}}}.$$
Then the waiting time $\Delta t_{n}$ and jump length $\Delta x_{n}$ have the expected values and variances
$$E[\Delta t_{n}] = \tau, \qquad \mathrm{Var}[\Delta t_{n}] = \tau^{2}, \qquad E[\Delta x_{n}] = 0, \qquad E[\Delta x_{n}^{2}] = \sigma^{2}.$$
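A trajectory of this kind is straightforward to sample. The following minimal Python sketch draws exponential waiting times with mean $\tau$ and Gaussian jumps with standard deviation $\sigma$ and records the jump times and positions; the function name, the fixed seed, and the choice to start at $x_{0} = 0$ are assumptions made for illustration.

```python
import random


def simulate_ctrw(t_max, tau=1.0, sigma=1.0, seed=0):
    """Sample one ctrw path on [0, t_max], starting from x = 0 at t = 0.

    Waiting times are exponential with mean tau and jump lengths are
    Gaussian with mean zero and standard deviation sigma.
    Returns the jump times and the position held from each time onward.
    """
    rng = random.Random(seed)
    t, x = 0.0, 0.0
    times, positions = [0.0], [0.0]
    while True:
        t += rng.expovariate(1.0 / tau)   # waiting time until the next jump
        if t > t_max:
            break
        x += rng.gauss(0.0, sigma)        # jump length
        times.append(t)
        positions.append(x)
    return times, positions


if __name__ == "__main__":
    times, positions = simulate_ctrw(500.0)
    n_jumps = len(times) - 1
    print("number of jumps:", n_jumps)
    print("mean waiting time:", times[-1] / n_jumps)
```

Between consecutive entries of `times` the walker stays at the corresponding entry of `positions`, so the pair of lists describes a step function of $t$.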
The position $x(t)$ of the walker is a step function and one sample trajectory of the ctrw with $\tau = 1$ and $\sigma = 1$ is given in Figure 2.1.

With the help of Theorem 2.1, we can derive the pdf of the total waiting time $t_{n}$ after $n$ steps and the total jump length $x_{n} - x_{0}$ after $n$ steps. We denote by $\psi_{n}(t)$ the pdf of the random variable $t_{n} = \Delta t_{1} + \Delta t_{2} + \cdots + \Delta t_{n}$. By Theorem 2.1, we have
$$\psi_{n}(t) = (\psi_{n-1} * \psi)(t) = \int_{0}^{t}\psi_{n-1}(s)\,\psi(t-s)\,ds = \underbrace{\psi * \psi * \cdots * \psi}_{n\text{-fold}}.$$
Figure 2.1. Brownian motion random walk (sample trajectory; horizontal axis $t$, vertical axis $x$)
Then the characteristic function $\hat\psi_{n}(z)$ of $\psi_{n}(t)$, i.e., its Laplace transform
$$\hat\psi_{n}(z) = \mathcal{L}(\psi_{n})(z) = \int_{0}^{\infty} e^{-zt}\psi_{n}(t)\,dt,$$
by the convolution rule for the Laplace transform (A.2) is given by
$$\hat\psi_{n}(z) = (\hat\psi(z))^{n}.$$
Let $\Psi(t)$ denote the survival probability, i.e., the probability of the walker not jumping within a time $t$, or equivalently, the probability of remaining stationary for at least a duration $t$. Then
$$\Psi(t) = \int_{t}^{\infty}\psi(s)\,ds = 1 - \int_{0}^{t}\psi(s)\,ds, \qquad 0 < t < \infty.$$
The characteristic function for the survival probability $\Psi$ is
$$\hat\Psi(z) = z^{-1} - z^{-1}\hat\psi(z) = \frac{1-\hat\psi(z)}{z}.$$
By the definition of the survival probability, the probability of taking exactly $n$ steps up to time $t$ is given by
$$\chi_{n}(t) = \int_{0}^{t}\psi_{n}(s)\,\Psi(t-s)\,ds = (\psi_{n} * \Psi)(t).$$
Appealing to the convolution rule once again yields
$$\hat\chi_{n}(z) = \hat\psi_{n}(z)\,\hat\Psi(z) = \hat\psi(z)^{n}\,\frac{1-\hat\psi(z)}{z}.$$
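The convolution structure of the probability of taking exactly $n$ steps can be tested numerically for the exponential waiting time density of Example 2.1. The minimal Python sketch below builds $\psi_{n}$ by repeated trapezoid-rule convolution on a uniform grid and then convolves it with the survival probability $\Psi(t) = e^{-t/\tau}$; the function names and the grid size are illustrative assumptions. For this particular $\psi$ the jump times form a Poisson process, so the result should agree with the Poisson probability $(t/\tau)^{n}e^{-t/\tau}/n!$.

```python
import math


def convolve_on_grid(f, g, ds):
    """Trapezoid rule for (f * g)(t_i) = int_0^{t_i} f(s) g(t_i - s) ds."""
    n = len(f)
    out = [0.0] * n
    for i in range(1, n):
        acc = 0.5 * (f[0] * g[i] + f[i] * g[0])
        for j in range(1, i):
            acc += f[j] * g[i - j]
        out[i] = acc * ds
    return out


def jump_count_probability(n, t, tau=1.0, n_grid=400):
    """Approximate chi_n(t) = (psi_n * Psi)(t) for psi(t) = exp(-t/tau)/tau."""
    ds = t / n_grid
    grid = [i * ds for i in range(n_grid + 1)]
    psi = [math.exp(-s / tau) / tau for s in grid]
    survival = [math.exp(-s / tau) for s in grid]
    if n == 0:
        return survival[-1]           # no jump at all up to time t
    psi_n = psi
    for _ in range(n - 1):            # build the n-fold convolution of psi
        psi_n = convolve_on_grid(psi_n, psi, ds)
    return convolve_on_grid(psi_n, survival, ds)[-1]


if __name__ == "__main__":
    t, n = 3.0, 2
    poisson = t ** n * math.exp(-t) / math.factorial(n)
    print(jump_count_probability(n, t), poisson)
```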
Next we derive the pdf of the jump length $x_{n} - x_{0}$ after $n$ steps. We denote by $\lambda_{n}(x)$ the pdf of the random variable $x_{n} - x_{0} = \Delta x_{1} + \Delta x_{2} + \cdots + \Delta x_{n}$. Appealing to Theorem 2.1 again yields
$$\lambda_{n}(x) = (\lambda_{n-1} * \lambda)(x) = \int_{-\infty}^{\infty}\lambda_{n-1}(y)\,\lambda(x-y)\,dy = \underbrace{\lambda * \lambda * \cdots * \lambda}_{n\text{-fold}},$$
and consequently, by the convolution rule for the Fourier transform, its Fourier transform $\tilde\lambda_{n}(\xi)$ is given by
$$\tilde\lambda_{n}(\xi) = (\tilde\lambda(\xi))^{n}, \qquad n \ge 0.$$
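Now denote by $p(x,t)$ the pdf of the walker at the position $x$ at time $t$. Since $\chi_{n}(t)$ is the probability of taking $n$ steps up to time $t$,
$$p(x,t) = \sum_{n=0}^{\infty}\lambda_{n}(x)\,\chi_{n}(t).$$
It remains to derive an analytic formula for $p(x,t)$. To this end, we denote the Fourier–Laplace transform of $p(x,t)$ by
$$\hat{\tilde p}(\xi,z) = \mathcal{LF}[p](\xi,z) = \int_{0}^{\infty} e^{-zt}\int_{-\infty}^{\infty} e^{-i\xi x}p(x,t)\,dx\,dt.$$
Then the Fourier–Laplace transform $\hat{\tilde p}(\xi,z)$ is given by
$$\hat{\tilde p}(\xi,z) = \sum_{n=0}^{\infty}\tilde\lambda_{n}(\xi)\,\hat\chi_{n}(z) = \frac{1-\hat\psi(z)}{z}\sum_{n=0}^{\infty}\bigl[\tilde\lambda(\xi)\,\hat\psi(z)\bigr]^{n}. \tag{2.3}$$
Since $\lambda$ and $\psi$ are probability density functions, we have
$$\tilde\lambda(0) = \int_{-\infty}^{\infty}\lambda(x)\,dx = 1 \quad\text{and}\quad \hat\psi(0) = \int_{0}^{\infty}\psi(t)\,dt = 1. \tag{2.4}$$
If $\xi \neq 0$ or $z > 0$, then $|\tilde\lambda(\xi)\hat\psi(z)| < 1$ so that the infinite sum is absolutely convergent,
$$\sum_{n=0}^{\infty}\bigl[\tilde\lambda(\xi)\,\hat\psi(z)\bigr]^{n} = \frac{1}{1-\tilde\lambda(\xi)\,\hat\psi(z)},$$
and hence we obtain the fundamental relation for the probability density function $p(x,t)$ in the Laplace-Fourier domain,
$$\hat{\tilde p}(\xi,z) = \frac{1-\hat\psi(z)}{z}\,\frac{1}{1-\tilde\lambda(\xi)\,\hat\psi(z)}. \tag{2.5}$$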
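Example 2.2. For the waiting time distribution $\psi(t) = \tau^{-1}e^{-t/\tau}$ and jump length distribution $\lambda(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\,e^{-\frac{x^{2}}{2\sigma^{2}}}$ in Example 2.1, we have
$$\hat\psi(z) = \frac{1}{1+\tau z} \quad\text{and}\quad \tilde\lambda(\xi) = e^{-\frac{\sigma^{2}\xi^{2}}{2}}.$$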
By (2.5), the Laplace-Fourier transform $\hat{\tilde p}(\xi,z)$ of the pdf $p(x,t)$ of the walker at position $x$ (relative to $x_{0}$) at time $t$ is given by
$$\hat{\tilde p}(\xi,z) = \frac{\tau}{1+\tau z - e^{-\frac{\sigma^{2}\xi^{2}}{2}}}.$$

Different types of ctrw processes are categorised by the characteristic waiting time
$$T := E[\Delta t_{n}] = \int_{0}^{\infty} t\,\psi(t)\,dt$$
and the jump length variance
$$\Sigma^{2} := E[(\Delta x_{n})^{2}] = \int_{-\infty}^{\infty} x^{2}\lambda(x)\,dx$$
being finite or diverging. In the rest of this section, we shall discuss the following three scenarios: (i) Finite $T$ and $\Sigma^{2}$, (ii) Diverging $T$ and finite $\Sigma^{2}$, and (iii) Diverging $\Sigma^{2}$ and finite $T$.

2.1.4. Finite characteristic waiting time and jump length variance. If both characteristic waiting time $T$ and jump length variance $\Sigma^{2}$ are finite, the long-time limit corresponds to Brownian motion, as we shall see below, and thus the ctrw does not lead to anything new. To see this, we assume that the pdfs $\psi(t)$ and $\lambda(x)$ are normalised to satisfy
$$\int_{0}^{\infty} t\,\psi(t)\,dt = 1, \qquad \int_{-\infty}^{\infty} x\,\lambda(x)\,dx = 0, \qquad \int_{-\infty}^{\infty} x^{2}\lambda(x)\,dx = 1. \tag{2.6}$$
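These conditions can be satisfied by rescaling if the waiting time pdf $\psi(t)$ has a finite mean, and the jump length pdf $\lambda(x)$ has a finite variance. We recall the relation (2.4): $\hat\psi(0) = 1 = \tilde\lambda(0)$ and since
$$\frac{d^{k}\hat\psi}{dz^{k}} = \int_{0}^{\infty} e^{-zt}(-t)^{k}\psi(t)\,dt,$$
using the normalisation condition (2.6), we have $\hat\psi'(0) = -\int_{0}^{\infty} t\,\psi(t)\,dt = -1$. Similarly, we deduce
$$\tilde\lambda'(0) = -i\int_{-\infty}^{\infty} x\,\lambda(x)\,dx = 0, \qquad \tilde\lambda''(0) = -\int_{-\infty}^{\infty} x^{2}\lambda(x)\,dx = -1.$$
Next, for $\tau > 0$ and $\sigma > 0$, let the random variables $\Delta t_{n}$ and $\Delta x_{n}$ follow the rescaled pdfs
$$\psi_{\tau}(t) = \frac{1}{\tau}\,\psi\Bigl(\frac{t}{\tau}\Bigr) \quad\text{and}\quad \lambda_{\sigma}(x) = \frac{1}{\sigma}\,\lambda\Bigl(\frac{x}{\sigma}\Bigr). \tag{2.7}$$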
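We shall now pass to the diffusion limit: namely the limit of the Fourier–Laplace transform $\hat{\tilde p}(\xi, z; \sigma, \tau)$ as $\tau, \sigma \to 0$. Then a simple computation shows
$$T := E[\Delta t_n] = \tau, \qquad E[\Delta x_n] = 0, \qquad \Sigma^2 := E[\Delta x_n^2] = \sigma^2.$$
By the scaling rules for the Fourier and Laplace transformations, $\hat\psi_\tau(z) = \hat\psi(\tau z)$ and $\tilde\lambda_\sigma(\xi) = \tilde\lambda(\sigma\xi)$, and consequently,
$$\hat{\tilde p}(\xi, z; \sigma, \tau) = \frac{1 - \hat\psi(\tau z)}{z}\,\frac{1}{1 - \hat\psi(\tau z)\tilde\lambda(\sigma\xi)}.$$
We shall study the limit of $\hat{\tilde p}$ as $\tau, \sigma \to 0$. The Taylor expansion of $\hat\psi(z)$ around $z = 0$ yields
$$\hat\psi(z) = \hat\psi(0) + \hat\psi'(0)z + \cdots = 1 - z + O(z^2) \quad\text{as } z \to 0.$$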
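For further simplicity we assume that the pdf $\lambda(x)$ is even, i.e., $\lambda(-x) = \lambda(x)$. Then $\tilde\lambda'(0) = 0$, and the Taylor expansion of $\tilde\lambda(\xi)$ around $\xi = 0$ is given by
$$\tilde\lambda(\xi) = \tilde\lambda(0) + \tilde\lambda'(0)\xi + \tfrac12\tilde\lambda''(0)\xi^2 + \cdots = 1 - \tfrac12\xi^2 + O(\xi^4).$$
Using these two relations, some algebraic manipulation gives
$$\hat{\tilde p}(\xi, z; \sigma, \tau) = \frac{\tau}{\tau z + \frac12\sigma^2\xi^2}\cdot\frac{1 + O(\tau z)}{1 + O(\tau z + \sigma^2\xi^2)}. \tag{2.8}$$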
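Now in formula (2.8), we take $\sigma \to 0$ and $\tau \to 0$ while keeping the relation
$$\lim_{\sigma\to 0,\,\tau\to 0}\frac{\sigma^2}{2\tau} = K$$
for some fixed $K > 0$, and we obtain
$$\hat{\tilde p}(\xi, z) = \lim_{\sigma\to 0,\,\tau\to 0}\frac{\tau}{\tau z + \frac12\sigma^2\xi^2} = \frac{1}{z + K\xi^2}. \tag{2.9}$$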
Here the scaling relation for $K$ is identical to that in the random walk framework, cf. (2.2). Upon inverting the Laplace transform, we find
$$\tilde p(\xi, t) = \frac{1}{2\pi i}\int_{\mathrm{Ha}}\frac{e^{zt}}{z + K\xi^2}\,dz = e^{-K\xi^2 t},$$
and then inverting the Fourier transform recovers the familiar Gaussian pdf $p(x, t)$ in the physical domain
$$p(x, t) = \frac{1}{\sqrt{4\pi K t}}\,e^{-x^2/(4Kt)}.$$
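The passage to the diffusion limit can be checked numerically on the exponential/Gaussian pair of Example 2.2. The following is a minimal sketch, not part of the original derivation: the function names and the sample values of $K$, $\xi$, $z$ are hypothetical choices made for illustration. It evaluates the closed-form transform from (2.5) with $\sigma^2 = 2K\tau$ and compares it with the limit $1/(z + K\xi^2)$ of (2.9).

```python
import math

def ctrw_transform(xi, z, sigma, tau):
    """Fourier-Laplace transform (2.5) for the pdfs of Example 2.2."""
    psi_hat = 1.0 / (1.0 + tau * z)                   # exponential waiting times
    lam_tilde = math.exp(-0.5 * (sigma * xi) ** 2)    # Gaussian jump lengths
    return (1.0 - psi_hat) / z / (1.0 - psi_hat * lam_tilde)

def diffusion_limit(xi, z, K):
    """Right-hand side of (2.9)."""
    return 1.0 / (z + K * xi ** 2)

# Illustrative values (assumptions of this sketch, not taken from the text).
K, xi, z = 0.5, 1.3, 0.7
errors = []
for tau in (1e-2, 1e-4, 1e-6):
    sigma = math.sqrt(2.0 * K * tau)      # keeps sigma^2 / (2 tau) = K fixed
    errors.append(abs(ctrw_transform(xi, z, sigma, tau) - diffusion_limit(xi, z, K)))

print(errors)   # the discrepancy shrinks roughly in proportion to tau
```

Under these assumptions the discrepancy decays linearly in $\tau$, in line with the $O(\tau z + \sigma^2\xi^2)$ correction factor in (2.8).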
We now verify that the pdf $p(x, t)$ satisfies the diffusion equation (2.1):
$$\begin{aligned}\mathcal{LF}[p_t - K p_{xx}](\xi, z) &= z\,\hat{\tilde p}(\xi, z) - \tilde p(\xi, 0) + K\xi^2\,\hat{\tilde p}(\xi, z)\\ &= (z + K\xi^2)\,\hat{\tilde p}(\xi, z) - \tilde p(\xi, 0) = 1 - \tilde p(\xi, 0) = 0,\end{aligned}$$
where the last step follows by $p(x, 0) = \delta(x)$, and so $\tilde p(\xi, 0) = 1$. Therefore, $p(x, t)$, the pdf for the random variable $x(t) - x_0$ at position $x$ (relative to $x_0$) of the walker at time $t$, satisfies diffusion equation (2.1). Hence, the ctrw framework recovers the classical diffusion equation, as long as the waiting time pdf $\psi(t)$ has a finite mean and the jump length pdf $\lambda(x)$ has finite first and second moments. In this case, the displacement variance is given by
$$\mu_2(t) = E_p[(x(t) - x_0)^2] = \int_{-\infty}^{\infty} x^2 p(x, t)\,dx = 2Kt,$$
which grows linearly with the time $t$. Thus the critical observation is that any pair of pdfs with finite characteristic waiting time $T$ and jump length variance $\Sigma^2$ leads to the same result to the lowest order, and hence in the long-time limit. Note that the notion of “long time”, equivalent to the diffusion limit, is only relative with respect to the time scale $\tau$.

2.1.5. Divergent mean waiting time. We consider the situation where the characteristic waiting time $T$ diverges, but the jump length variance $\Sigma^2$ is still kept finite. This occurs for example when the particle might be partially trapped in a certain potential well with the result that it takes a long time to leave the well,
$$\psi(t) \sim \frac{A}{t^{1+\alpha}} \quad\text{as } t \to \infty, \tag{2.10}$$
for some $\alpha \in (0, 1)$, where $A > 0$ is a constant. One example of such a pdf is $\psi(t) = \frac{\alpha}{(1+t)^{1+\alpha}}$. The (asymptotic) power law decay in (2.10) is heavy tailed, and it allows an occasional very large waiting time between consecutive jumps. Again, the specific form of $\psi(t)$ is irrelevant, and only the power law decay at large time matters. The parameter $\alpha$ determines the asymptotic decay of the pdf $\psi(t)$: the closer $\alpha$ is to zero, the slower the decay, and the more likely a long wait between jumps. Now the question is whether such an occasional long waiting time will completely change the dynamics of the stochastic process at large time.

For the power law decay, the mean waiting time is divergent: $\int_0^\infty t\,\psi(t)\,dt = +\infty$, and so one cannot assume the normalising condition on the pdf $\psi$, and the preceding analysis breaks down. However, in this part, the assumption on $\lambda(x)$ remains unchanged, i.e.,
$$\int_{-\infty}^{\infty} x\,\lambda(x)\,dx = 0 \quad\text{and}\quad \int_{-\infty}^{\infty} x^2\lambda(x)\,dx = 1.$$
Figure 2.2. Subdiffusion ctrw (sample trajectories, position x against time t): left, α = 0.75; right, α = 0.5
In Figure 2.2 we show sample trajectories of the ctrw with a power law waiting time pdf $\psi(t) = \frac{\alpha}{(1+t)^{1+\alpha}}$, for the values $\alpha = \frac12, \frac34$, together with the standard spatial Gaussian jump length pdf. Along the trajectory, one clearly observes the occasional but enormously large waiting time, and such a large waiting time appears at different time scales. This is dramatically different from the case of finite mean waiting time; cf. Figure 2.1. The following result gives an estimate on the Laplace transform $\hat\psi(z)$ for $z \to 0$.

Theorem 2.2. Let $\alpha \in (0, 1)$ and $A > 0$. If $\psi(t) = At^{-1-\alpha} + O(t^{-2-\alpha})$ as $t \to \infty$, then with $B_\alpha = A\alpha^{-1}\Gamma(1 - \alpha)$,
$$\hat\psi(z) = 1 - B_\alpha z^\alpha + O(z) \quad\text{as } z \to 0.$$
Proof. We shall only sketch the proof, and leave the remaining details to the reader. By the assumption on $\psi$, there exists a constant $T > 0$ such that
$$|t^{1+\alpha}\psi(t) - A| \le c\,t^{-1} \qquad \forall\, T \le t < \infty. \tag{2.11}$$
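Now consider $0 < z < T^{-1}$ so that $zT < 1$. Since $\hat\psi(0) = 1$, this gives
$$\begin{aligned}1 - \hat\psi(z) &= \int_0^\infty (1 - e^{-zt})\psi(t)\,dt\\ &= \int_0^T (1 - e^{-zt})(\psi(t) - At^{-1-\alpha})\,dt + \int_T^\infty (1 - e^{-zt})(\psi(t) - At^{-1-\alpha})\,dt\\ &\quad + \int_0^\infty (1 - e^{-zt})At^{-1-\alpha}\,dt =: \sum_{i=1}^{3} I_i.\end{aligned}$$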
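Using the elementary inequality $0 \le 1 - e^{-s} \le \min(1, s)$ for all $s \in [0, 1]$, we deduce that the first term $I_1$ satisfies the estimates
$$\begin{aligned}|I_1| &\le z\int_0^T t\,\psi(t)\,dt + z\int_0^T t\,At^{-1-\alpha}\,dt\\ &\le zT\int_0^\infty \psi(t)\,dt + Az\int_0^T t^{-\alpha}\,dt \le c_{\alpha,T}\,z.\end{aligned}$$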
For the second term $I_2$, using (2.11) and the change of variables $s = zt$, we deduce that
$$|I_2| \le c\,z^{1+\alpha}\int_{Tz}^\infty (1 - e^{-s})s^{-2-\alpha}\,ds \le c_{\alpha,T}\,z.$$
Similarly, for the third term $I_3$, by a change of variable $s = zt$, we have
$$I_3 = \int_0^\infty (1 - e^{-zt})At^{-1-\alpha}\,dt = Az^\alpha\int_0^\infty \frac{1 - e^{-s}}{s^{1+\alpha}}\,ds.$$
Integration by parts and the recursion formula for the Gamma function, cf. (3.10), gives
$$\int_0^\infty \frac{1 - e^{-s}}{s^{1+\alpha}}\,ds = \alpha^{-1}\int_0^\infty e^{-s}s^{-\alpha}\,ds = \alpha^{-1}\Gamma(1 - \alpha) = -\Gamma(-\alpha).$$
Combining the preceding estimates yields the desired estimate.
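The statement of Theorem 2.2 can be illustrated on the model pdf $\psi(t) = \alpha(1+t)^{-1-\alpha}$ with $\alpha = \frac12$, for which $A = \alpha$ and $B_\alpha = \Gamma(\frac12) = \sqrt\pi$. The sketch below rests on a closed form that is an assumption of this illustration and not taken from the text: integrating by parts and using the incomplete Gamma function one finds $1 - \hat\psi(z) = \sqrt{\pi z}\,e^{z}\,\mathrm{erfc}(\sqrt z)$ in this special case. The function names are hypothetical.

```python
import math

ALPHA = 0.5
A = ALPHA                                     # psi(t) ~ A t^(-1-alpha) with A = alpha
B_ALPHA = A / ALPHA * math.gamma(1.0 - ALPHA) # B_alpha = A alpha^{-1} Gamma(1 - alpha)

def one_minus_psi_hat(z):
    """1 - psi_hat(z) for psi(t) = (1/2)(1 + t)^(-3/2).

    Closed form used only for this sketch (alpha = 1/2):
    sqrt(pi z) * exp(z) * erfc(sqrt(z)).
    """
    return math.sqrt(math.pi * z) * math.exp(z) * math.erfc(math.sqrt(z))

def leading_ratio(z):
    """(1 - psi_hat(z)) / (B_alpha z^alpha); should tend to 1 as z -> 0."""
    return one_minus_psi_hat(z) / (B_ALPHA * z ** ALPHA)

def scaled_remainder(z):
    """(B_alpha z^alpha - (1 - psi_hat(z))) / z; should stay bounded as z -> 0."""
    return (B_ALPHA * z ** ALPHA - one_minus_psi_hat(z)) / z

for z in (1e-2, 1e-4, 1e-6):
    print(z, leading_ratio(z), scaled_remainder(z))
```

In this example the ratio approaches 1 while the scaled remainder stays bounded, which is what the $O(z)$ term in the theorem asserts.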
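As previously, we now introduce the following rescaled pdfs for the incremental waiting time $\Delta t_n$ and jump length $\Delta x_n$:
$$\psi_\tau(t) = \frac{1}{\tau}\,\psi\!\left(\frac{t}{\tau}\right) \quad\text{and}\quad \lambda_\sigma(x) = \frac{1}{\sigma}\,\lambda\!\left(\frac{x}{\sigma}\right).$$
Accordingly, by (2.5), the Laplace-Fourier transform $\hat{\tilde p}(\xi, z; \sigma, \tau)$ is given by
$$\hat{\tilde p}(\xi, z; \sigma, \tau) = \frac{1 - \hat\psi(\tau z)}{z}\,\frac{1}{1 - \hat\psi(\tau z)\tilde\lambda(\sigma\xi)},$$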
where, under the power law waiting time pdf, by Theorem 2.2 and some computation, we deduce
$$\hat\psi(z) = 1 - B_\alpha z^\alpha + O(z) \quad\text{as } z \to 0, \qquad \tilde\lambda(\xi) = 1 - \tfrac12\xi^2 + O(\xi^4) \quad\text{as } \xi \to 0.$$
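Consequently, further algebraic manipulations give
$$\hat{\tilde p}(\xi, z; \sigma, \tau) = \frac{B_\alpha\tau^\alpha z^{\alpha-1}}{B_\alpha\tau^\alpha z^\alpha + \frac12\sigma^2\xi^2}\cdot\frac{1 + O(\tau^{1-\alpha}z^{1-\alpha})}{1 + O(\tau^{1-\alpha}z^{1-\alpha} + \tau^\alpha z^\alpha + \sigma\xi)}. \tag{2.12}$$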
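In (2.12), we again take the limit to find the Fourier–Laplace transform $\hat{\tilde p}(\xi, z)$ by sending $\sigma \to 0$ and $\tau \to 0$, while keeping
$$\lim_{\sigma\to 0,\,\tau\to 0}\frac{\sigma^2}{2B_\alpha\tau^\alpha} = K_\alpha$$
for some fixed $K_\alpha > 0$. Thus we obtain
$$\hat{\tilde p}(\xi, z) = \lim_{\sigma\to 0,\,\tau\to 0}\hat{\tilde p}(\xi, z; \sigma, \tau) = \lim_{\sigma\to 0,\,\tau\to 0}\frac{B_\alpha\tau^\alpha z^{\alpha-1}}{B_\alpha\tau^\alpha z^\alpha + \frac12\sigma^2\xi^2} = \frac{z^{\alpha-1}}{z^\alpha + K_\alpha\xi^2}. \tag{2.13}$$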
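The limit (2.13) can likewise be observed numerically. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes $\alpha = \frac12$, the power law pdf $\psi(t) = \alpha(1+t)^{-1-\alpha}$ (so that $B_\alpha = \sqrt\pi$), a standard Gaussian jump pdf, and the special-case closed form $1 - \hat\psi(z) = \sqrt{\pi z}\,e^z\,\mathrm{erfc}(\sqrt z)$, which is not stated in the text. All names and sample values are hypothetical.

```python
import math

ALPHA = 0.5
B_ALPHA = math.gamma(1.0 - ALPHA)   # B_alpha for psi(t) = alpha (1 + t)^(-1-alpha), A = alpha

def one_minus_psi_hat(z):
    # Closed form assumed for this sketch (alpha = 1/2 only).
    return math.sqrt(math.pi * z) * math.exp(z) * math.erfc(math.sqrt(z))

def rescaled_transform(xi, z, sigma, tau):
    """Formula (2.5) evaluated for the rescaled pdfs psi_tau and lambda_sigma."""
    a = one_minus_psi_hat(tau * z)                # 1 - psi_hat(tau z)
    lam = math.exp(-0.5 * (sigma * xi) ** 2)      # lambda_tilde(sigma xi), Gaussian jumps
    return a / z / (1.0 - (1.0 - a) * lam)

def subdiffusion_limit(xi, z, k_alpha):
    """Right-hand side of (2.13)."""
    return z ** (ALPHA - 1.0) / (z ** ALPHA + k_alpha * xi ** 2)

k_alpha, xi, z = 1.0, 1.0, 1.0      # illustrative values only
errors = []
for tau in (1e-4, 1e-6, 1e-8):
    sigma = math.sqrt(2.0 * B_ALPHA * tau ** ALPHA * k_alpha)  # sigma^2 / (2 B tau^alpha) = K_alpha
    errors.append(abs(rescaled_transform(xi, z, sigma, tau)
                      - subdiffusion_limit(xi, z, k_alpha)))

print(errors)   # decays roughly like tau^(1 - alpha) = sqrt(tau) in this example
```

The slow decay of the discrepancy reflects the $O(\tau^{1-\alpha}z^{1-\alpha})$ correction in (2.12), in contrast with the $O(\tau)$ correction of the classical case.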
Formally, we recover formula (2.9) by setting $\alpha = 1$ (but the argument has to be changed). Last, we invert the Fourier–Laplace transform $\hat{\tilde p}(\xi, z)$ back into the space-time domain. Using the Laplace transform formula of the Mittag-Leffler function (1.24) $E_\alpha(z) = E_{\alpha,1}(z)$, from Lemma 3.1, we deduce
$$\tilde p(\xi, t) = E_\alpha(-K_\alpha t^\alpha\xi^2),$$
and next applying the fact that it is the Fourier transform of the M-Wright function according to Theorem 3.29, we get the explicit expression of the pdf $p(x, t)$ in the physical domain
$$p(x, t) = \frac{1}{2\sqrt{K_\alpha t^\alpha}}\,M_{\alpha/2}\!\left(\frac{|x|}{\sqrt{K_\alpha t^\alpha}}\right).$$
With $\alpha = 1$, by the relation (3.67), this formula recovers the Gaussian density we have seen earlier. Last, by the Laplace transform formula for the Djrbashian–Caputo fractional derivative in Lemma 4.12, we verify that the pdf $p(x, t)$ satisfies
$$\begin{aligned}\mathcal{LF}\Big[{}^{DC}_{\;\;0}D_t^\alpha p(x, t) - K_\alpha\frac{\partial^2}{\partial x^2}p(x, t)\Big] &= z^\alpha\hat{\tilde p}(\xi, z) - z^{\alpha-1}\tilde p(\xi, 0) + K_\alpha\xi^2\hat{\tilde p}(\xi, z)\\ &= (z^\alpha + K_\alpha\xi^2)\,\hat{\tilde p}(\xi, z) - z^{\alpha-1}\tilde p(\xi, 0) \equiv 0\end{aligned}$$
since $\tilde p(\xi, 0) = \tilde\delta(\xi) = 1$. Thus the pdf $p(x, t)$ satisfies the time fractional diffusion equation,
$${}^{DC}_{\;\;0}D_t^\alpha p(x, t) - K_\alpha\frac{\partial^2 p(x, t)}{\partial x^2} = 0, \qquad 0 < t < \infty, \quad -\infty < x < \infty.$$
It is easy to check using the Laplace transform formula for the Riemann–Liouville fractional derivative in Lemma 4.6 that the equation can be equivalently written using a Riemann–Liouville fractional derivative as
$$\frac{\partial p(x, t)}{\partial t} = K_\alpha\,{}^{RL}_{\;\;0}D_t^{1-\alpha}\frac{\partial^2 p(x, t)}{\partial x^2}.$$
Now let us compute the mean square displacement
$$\mu_2(t) = \int_{-\infty}^{\infty} x^2 p(x, t)\,dx, \qquad t > 0.$$
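To derive an explicit formula, we resort to the Laplace transform
$$\begin{aligned}\hat\mu_2(z) &= \int_{-\infty}^{\infty} x^2\hat p(x, z)\,dx = -\frac{d^2}{d\xi^2}\hat{\tilde p}(\xi, z)\Big|_{\xi=0}\\ &= -\frac{d^2}{d\xi^2}\big(z + K_\alpha z^{1-\alpha}\xi^2\big)^{-1}\Big|_{\xi=0} = 2K_\alpha z^{-1-\alpha},\end{aligned}$$
which upon taking the inverse Laplace transform yields
$$\mu_2(t) = \frac{2K_\alpha}{\Gamma(1 + \alpha)}\,t^\alpha,$$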
and thus the mean square displacement grows only sublinearly with respect to time, which at large time $t$ is slower than in the Gaussian diffusion case. This leads to the terminology “subdiffusion” process. In the limit $\alpha \to 1$, the formula for Gaussian diffusion is recovered.

2.1.6. Diverging jump length variance. Finally, we turn to the case of diverging jump length variance $\Sigma^2$, but finite characteristic waiting time $T$. To again be specific, we assume an exponential waiting time $\psi(t)$, and that the jump length follows a (possibly asymmetric) Lévy distribution. The most common and convenient way to define standard Lévy stable random variables is via their characteristic function (Fourier transform)
$$\ln\tilde\lambda(\xi; \mu, \beta) = \begin{cases} -|\xi|^\mu\big(1 - i\beta\,\mathrm{sign}(\xi)\tan\frac{\mu\pi}{2}\big), & \mu \neq 1,\\[2pt] -|\xi|\big(1 + i\beta\,\mathrm{sign}(\xi)\frac{2}{\pi}\ln|\xi|\big), & \mu = 1,\end{cases}$$
where $\mu \in (0, 2]$, $\beta \in [-1, 1]$. The characteristic parameter $\mu$ lies in the range $(0, 2]$ and determines the rate at which the tails of the pdf taper off. When $\mu = 2$, it recovers a Gaussian distribution, irrespective of the value
of $\beta$, whereas with $\mu = 1$ and $\beta = 0$, the stable density is identical to the Cauchy distribution, that is,
$$\lambda(x) = \frac{1}{\pi(1 + x^2)}.$$
The parameter $\beta$ lies in the range $[-1, 1]$, and determines the degree of asymmetry of the distribution. When $\beta$ is negative (resp., positive), the distribution is skewed to the left (resp., right). In the symmetric case $\beta = 0$, the expression simplifies to
$$\tilde\lambda(\xi; \mu) = e^{-|\xi|^\mu}.$$
Generally, the pdf of the Lévy stable distribution is not available in closed form. However, it is possible to compute the density numerically [259] or to simulate random variables from such stable distributions; see Section 2.1.8 below for details. Nonetheless, following the proof of Theorem 2.2, one can show that in the x − t space, it has the following inverse power law asymptotic
$$\lambda(x) \sim A_{\mu,\beta}|x|^{-1-\mu} \quad\text{as } |x| \to \infty, \tag{2.14}$$
for some constant $A_{\mu,\beta} > 0$, from which one can see that the jump length variance diverges, i.e.,
$$\int_{-\infty}^{\infty} x^2\lambda(x)\,dx = \infty. \tag{2.15}$$
In general, the $p$th moment of a stable random variable is finite if and only if $p < \mu$. For a ctrw, one especially relevant case is $\lambda(x; \mu, \beta)$ with $\mu \in (1, 2]$ and $\beta = 0$; we show a realisation of this as a sample trajectory in Figure 2.3. One sees that very long jumps may occur with a much higher probability than for an exponentially decaying pdf based on the Gaussian distribution.

Figure 2.3. Superdiffusion random walk: μ = 1.5 (position x against time t)

Figure 2.4. 2D Gaussian and anomalous (μ = 1.5) random walks (panels: Gaussian (Brownian) Walk; Anomalous (μ = 1.5) Walk)

The scaling nature of the Lévy jump length pdf leads to the clusters along
the trajectory, i.e., local motion is occasionally interrupted by long sojourns, on all length scales. In Figure 2.4 we show a two dimensional comparison of a Brownian and an anomalous (with $\mu = \frac32$) random walk. Note the significant difference: if the particle were a grazing animal whose motions were based on a Gaussian process, then it would tend to stay within a quite local region; if it used an anomalous process, then the occasional very large step sizes would allow it to traverse a much larger region. In fact, this phenomenon has been observed in practice; see [245].

Below we derive the diffusion limit using the scaling technique, by appealing to the scaled pdfs $\psi_\tau(t)$ and $\lambda_\sigma(x)$ according to (2.7), for some $\tau, \sigma > 0$. Here the pdf $\psi$ is assumed to be normalised, i.e., $\int_0^\infty t\,\psi(t)\,dt = 1$. Using the Taylor expansion of the exponential function, for small $\xi$, we have
$$\tilde\lambda(\xi; \mu, \beta) = 1 - |\xi|^\mu\Big(1 - i\beta\,\mathrm{sign}(\xi)\tan\frac{\mu\pi}{2}\Big) + O(|\xi|^{2\mu}).$$
Hence,
$$\begin{aligned}\hat\psi_\tau(z) &= 1 - \tau z + O(\tau^2z^2) \quad\text{as } \tau \to 0,\\ \tilde\lambda_\sigma(\xi) &= 1 - \sigma^\mu|\xi|^\mu\Big(1 - i\beta\,\mathrm{sign}(\xi)\tan\frac{\mu\pi}{2}\Big) + O(\sigma^{2\mu}|\xi|^{2\mu}) \quad\text{as } \sigma \to 0.\end{aligned}$$
These two identities and the Fourier–Laplace convolution formulae yield
$$\hat{\tilde p}(\xi, z) = \lim_{\sigma\to 0,\,\tau\to 0}\hat{\tilde p}(\xi, z; \sigma, \tau) = \frac{1}{z + K^\mu|\xi|^\mu\big(1 - i\beta\,\mathrm{sign}(\xi)\tan\frac{\mu\pi}{2}\big)}, \tag{2.16}$$
where the limit is taken under the condition
$$K^\mu = \lim_{\sigma\to 0,\,\tau\to 0}\frac{\sigma^\mu}{\tau}$$
for some positive constant $K^\mu > 0$, which represents the diffusion coefficient in the stochastic process. Next we invert the Laplace transform of $\hat{\tilde p}(\xi, z)$,
$$\tilde p(\xi, t) = e^{-K^\mu|\xi|^\mu\left(1 - i\beta\,\mathrm{sign}(\xi)\tan\frac{\mu\pi}{2}\right)t},$$
which is the characteristic function of a Lévy stable distribution. However, an explicit expression is unavailable. This can be regarded as a generalised central limit theorem for stable distributions [243]. With some work, by making both a Fourier and a Laplace inversion, one can verify that the pdf satisfies the differential equation
$$\frac{\partial p(x, t)}{\partial t} = \frac{K^\mu}{2}\Big[(1 - \beta)\,{}^{RL}_{-\infty}D_x^\mu p(x, t) + (1 + \beta)\,{}^{RL}_{\;\;x}D_\infty^\mu p(x, t)\Big],$$
and in the symmetric case $\beta = 0$, it further simplifies to
$$\frac{\partial p(x, t)}{\partial t} = K^\mu\nabla_x^\mu p(x, t),$$
where $\nabla_x^\mu$ denotes the Riesz fractional operator
$$\nabla_x^\mu f = \frac12\Big({}^{RL}_{-\infty}D_x^\mu f + {}^{RL}_{\;\;x}D_\infty^\mu f\Big).$$
The solution in this case can be obtained analytically using the rather unwieldy Fox H-functions (which are generalisations of the Wright function to be discussed). It recovers the Gaussian density as $\mu \to 2$, irrespective of the asymmetry parameter $\beta$. Using (2.14), we deduce the power-law asymptotics as $|x| \to \infty$,
$$p(x, t) \sim \frac{K^\mu t}{|x|^{1+\mu}}, \qquad \mu < 2.$$
This shows that the mean squared displacement diverges: $\int_{-\infty}^{\infty}|x|^2p(x, t)\,dx = \infty$.
The particle following such a random walk model spreads much faster than normal diffusion, and it is commonly known as superdiffusion in the literature. The divergent mean square displacement has caused some controversy in practice, and various modifications have been proposed. Consequently, boundary value problems for superdiffusion are much more involved than for the subdiffusive case, as the long jumps make the very definition of a boundary condition delicate.

2.1.7. Another random walk leading to a fractional operator. Here is another random walk model that more directly leads to a fractional operator, in this case to (a version of) the fractional Laplacian. More details can be found in [40].
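We take the following probability distribution on the integers $\mathbb{Z}$ (the term $k = 0$ is understood to be excluded, since the weight $|k|^{-1-2s}$ is undefined there): If $I$ is a subset of $\mathbb{Z}$, then the probability of $I$ is taken to be
$$P(I) := c_s\sum_{k\in I}\frac{1}{|k|^{1+2s}}, \qquad c_s := \Big(\sum_{k\in\mathbb{Z}}|k|^{-1-2s}\Big)^{-1}, \tag{2.17}$$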
where the normalisation $c_s$ is chosen to ensure $P$ is a probability measure. A particle will move in $\mathbb{R}^d$ according to a probability process that is discrete in time and space, with $\tau$ and $h$ the respective discrete time and space steps, and we will constrain these with the scaling $\tau = h^{2s}$. We denote by $u(x, t)$ the probability of finding the particle at point $x$ and time $t$, and it obeys the following law: At each time step $\tau$ the particle selects both a direction $v$ according to the uniform distribution on the surface of the unit sphere $\partial B$ and an integer $k \in \mathbb{Z}$ according to the distribution in (2.17). Thus the particle moves from $(x, t)$ to $(x + khv, t + \tau)$.

Now the probability $u(x, t + \tau)$ of finding the particle at position $x$ at time $t + \tau$ is the sum of the probabilities of finding the particle, say, at $x + khv$ for some direction $v$ and some integer $k$, times the probability of having selected such a direction and such an integer. This gives
$$u(x, t + \tau) = \frac{c_s}{|\partial B|}\sum_{k\in\mathbb{Z}}\int_{\partial B}\frac{u(x + khv, t)}{|k|^{1+2s}}\,dS^{d-1}(v),$$
and so
$$u(x, t + \tau) - u(x, t) = \frac{c_s}{|\partial B|}\sum_{k\in\mathbb{Z}}\int_{\partial B}\frac{u(x + khv, t) - u(x, t)}{|k|^{1+2s}}\,dS^{d-1}(v).$$
We can also change $v$ to $-v$ in the above, then average the two resulting expressions to obtain
$$u(x, t + \tau) - u(x, t) = \frac{c_s}{2|\partial B|}\sum_{k\in\mathbb{Z}}\int_{\partial B}\frac{u(x + khv, t) + u(x - khv, t) - 2u(x, t)}{|k|^{1+2s}}\,dS^{d-1}(v).$$
Now we divide by $\tau = h^{2s}$ and take the limit as $h \to 0$:
$$\frac{\partial u}{\partial t} = \lim_{h\to 0}\frac{c_s h}{2|\partial B|}\sum_{k\in\mathbb{Z}}\int_{\partial B}\frac{u(x + khv, t) + u(x - khv, t) - 2u(x, t)}{|hk|^{1+2s}}\,dS^{d-1}(v).$$
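The claim that this sum behaves like a fractional Laplacian can be probed on a single Fourier mode. The following is a minimal sketch under assumptions made for illustration only: it works in dimension $d = 1$ with $s = \frac12$, drops the constant prefactor, and uses $u(x) = \cos(\omega x)$, for which $u(x + kh) + u(x - kh) - 2u(x) = -2(1 - \cos(\omega kh))\cos(\omega x)$. A fractional Laplacian of order $2s$ should then act as multiplication by a constant times $|\omega|^{2s}$, so doubling $\omega$ should scale the sum by about $2^{2s} = 2$. The function name and the truncation parameters are hypothetical.

```python
import math

def riemann_symbol(omega, s, h, n_terms):
    """Truncated sum h * sum_{k=1}^{n_terms} (1 - cos(omega k h)) / (k h)^(1 + 2 s).

    Up to the constant prefactor, this is the multiplier with which the
    discrete operator above acts on cos(omega x) in dimension one.
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        r = k * h
        total += (1.0 - math.cos(omega * r)) / r ** (1.0 + 2.0 * s)
    return h * total

s, h, n_terms = 0.5, 0.01, 100_000     # illustrative discretisation parameters
m1 = riemann_symbol(1.0, s, h, n_terms)
m2 = riemann_symbol(2.0, s, h, n_terms)
ratio = m2 / m1                        # expected to be close to 2 ** (2 * s) = 2

print(m1, m2, ratio)
```

The agreement is only approximate, since the sum is truncated and the step $h$ is finite; refining $h$ and enlarging the range brings the ratio closer to $2^{2s}$.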
The key step now is to recognise the summation as a Riemann sum and so formally replace this by its associated integral obtaining
$$\begin{aligned}\frac{\partial u}{\partial t} &= \frac{c_s}{2|\partial B|}\int_0^\infty\int_{\partial B}\frac{u(x + rv, t) + u(x - rv, t) - 2u(x, t)}{r^{1+2s}}\,dS^{d-1}(v)\,dr\\ &= \frac{c_s}{2|\partial B|}\int_{\mathbb{R}^d}\frac{u(x + y, t) + u(x - y, t) - 2u(x, t)}{|y|^{d+2s}}\,dy\\ &= -C_{d,s}(-\Delta)^s u(x, t)\end{aligned}\tag{2.18}$$
for some constant $C_{d,s}$ which acts as the coupling between the space and time scales, that is, a diffusion coefficient. Thus the probabilistic model, formally for small space and time steps, approaches a fractional heat equation based on a version of the fractional Laplacian.

2.1.8. Simulating continuous time random walk. To simulate the ctrw, one needs to generate random numbers with a given distribution, such as power laws and Lévy stable distributions. The task of generating random numbers for an arbitrary pdf is often highly nontrivial. There are a number of possible methods, and we shall describe only the transformation method, which is simple and easy to implement. For more advanced methods, we refer to the monographs [222, 292].

The transformation method requires that we have access to random reals uniformly distributed in the unit interval $0 \le r \le 1$, which can be generated by any standard pseudo-random number generator. Now suppose that $p(x)$ is a continuous pdf from which we wish to draw random variables. The pdfs $p(x)$ and $p(r)$ (the uniform distribution on $[0, 1]$) are related by
$$p(x) = p(r)\frac{dr}{dx} = \frac{dr}{dx},$$
where the second equality follows because $p(r) = 1$ over the interval from 0 to 1. Integrating both sides with respect to $x$, we get the complementary cumulative distribution function (ccdf) $P(x)$ in terms of $r$,
$$P(x) = \int_x^\infty p(x')\,dx' = \int_r^1 dr' = 1 - r,$$
or equivalently $x = P^{-1}(1 - r)$, where $P^{-1}$ denotes the inverse function of the ccdf $P(x)$. For many pdfs, this function can be evaluated in closed form.

Example 2.3. In this example we illustrate the method on the power law density
$$\psi(x) = \frac{\alpha}{(1 + x)^{1+\alpha}}.$$
One can compute directly
$$P(x) = \int_x^\infty \psi(x')\,dx' = \int_x^\infty \frac{\alpha}{(1 + x')^{1+\alpha}}\,dx' = \frac{1}{(1 + x)^\alpha},$$
and the inverse function $P^{-1}(x')$ is given by
$$P^{-1}(x') = (x')^{-\frac{1}{\alpha}} - 1.$$
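The transformation method for this density, and its use in a subdiffusive ctrw, can be sketched as follows. This is an illustrative sketch rather than the code behind the figures: the function names, the seed, and the choice of standard normal jumps are assumptions made here, and only the formulas for $P$ and $P^{-1}$ come from Example 2.3.

```python
import random

def ccdf(x, alpha):
    """P(x) = (1 + x)^(-alpha), the ccdf of the power law density of Example 2.3."""
    return (1.0 + x) ** (-alpha)

def inverse_ccdf(y, alpha):
    """P^{-1}(y) = y^(-1/alpha) - 1 for 0 < y <= 1."""
    return y ** (-1.0 / alpha) - 1.0

def sample_waiting_time(alpha, rng):
    """Draw one waiting time via x = P^{-1}(1 - r) with r uniform on [0, 1)."""
    r = rng.random()
    return inverse_ccdf(1.0 - r, alpha)    # 1 - r lies in (0, 1], so no division by zero

def simulate_ctrw(alpha, n_jumps, seed=0):
    """Return jump times and positions of a ctrw with power law waits and N(0, 1) jumps."""
    rng = random.Random(seed)
    t, x = 0.0, 0.0
    times, positions = [t], [x]
    for _ in range(n_jumps):
        t += sample_waiting_time(alpha, rng)   # wait, then jump
        x += rng.gauss(0.0, 1.0)
        times.append(t)
        positions.append(x)
    return times, positions

times, positions = simulate_ctrw(alpha=0.5, n_jumps=1000)
print(times[-1], positions[-1])
```

Between consecutive entries of `times` the walker rests at the corresponding entry of `positions`, so plotting the pairs as a step function gives a trajectory of the kind described for the subdiffusive case.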
This last formula provides an easy way to generate random variables from the power law density $\psi(t)$, needed for simulating the subdiffusion model.

Generating random variables from the Lévy stable distribution is more delicate. The main challenge lies in the fact that there are no analytic expressions for the inverse of the distribution function, and thus the transformation method is not easy to apply. Two well-known exceptions are Gaussian ($\mu = 2$) and Cauchy ($\mu = 1$, $\beta = 0$). A popular algorithm is from [59] or [343]. For $\mu \in (0, 2]$ and $\beta \in [-1, 1]$, it generates a skewed random variable $X \sim \lambda(x; \mu, \beta)$ as follows.

(1) Generate a random variable $V$ uniformly distributed on $(-\frac{\pi}{2}, \frac{\pi}{2})$ and an independent exponential random variable $W$ with mean 1.

(2) For $\mu \neq 1$, compute
$$X = C_{\mu,\beta}\,\frac{\sin(\mu(V + B_{\mu,\beta}))}{(\cos V)^{\frac{1}{\mu}}}\left(\frac{\cos(V - \mu(V + B_{\mu,\beta}))}{W}\right)^{\frac{1-\mu}{\mu}},$$
where
$$B_{\mu,\beta} = \frac{\arctan(\beta\tan\frac{\mu\pi}{2})}{\mu} \quad\text{and}\quad C_{\mu,\beta} = \Big(1 + \beta^2\tan^2\frac{\mu\pi}{2}\Big)^{\frac{1}{2\mu}}.$$

(3) For $\mu = 1$, compute
$$X = \frac{2}{\pi}\left[\Big(\frac{\pi}{2} + \beta V\Big)\tan V - \beta\log\left(\frac{W\cos V}{\frac{\pi}{2} + \beta V}\right)\right].$$

2.1.9. Bibliographical notes. There are a number of excellent survey articles discussing the role of fractional kinetics in physical models from the perspective of a random walk [245, 246], and an overview on the Lévy stable distribution in fractional kinetics can be found in [359]. We also refer to [196] for further details on applications of random walk models for anomalous diffusion. The random walk framework itself also provides an elegant approach to simulate the related fractional differential equations. Such schemes have close connections with other numerical methods developed from the numerical analysis perspective.
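Returning to steps (1)–(3) of Section 2.1.8 for stable random variables, a direct transcription can be sketched as follows. This is a hedged illustration, not a reference implementation of [59] or [343]: the function names are hypothetical, the formulas are transcribed as they are stated in steps (2) and (3), and no attempt is made to treat parameter values very close to $\mu = 1$ with special care.

```python
import math
import random

def stable_from_vw(mu, beta, v, w):
    """Map V in (-pi/2, pi/2) and W > 0 to X following steps (2) and (3)."""
    if mu != 1.0:
        t = beta * math.tan(0.5 * math.pi * mu)
        b = math.atan(t) / mu                         # B_{mu,beta}
        c = (1.0 + t * t) ** (1.0 / (2.0 * mu))       # C_{mu,beta}
        return (c * math.sin(mu * (v + b)) / math.cos(v) ** (1.0 / mu)
                * (math.cos(v - mu * (v + b)) / w) ** ((1.0 - mu) / mu))
    half_pi = 0.5 * math.pi
    return (2.0 / math.pi) * ((half_pi + beta * v) * math.tan(v)
                              - beta * math.log(w * math.cos(v) / (half_pi + beta * v)))

def sample_stable(mu, beta, rng):
    """Step (1): draw V uniform and W exponential with mean 1, then transform."""
    v = rng.uniform(-0.5 * math.pi, 0.5 * math.pi)
    w = 0.0
    while w == 0.0:                 # guard against the (very unlikely) draw W = 0
        w = rng.expovariate(1.0)
    return stable_from_vw(mu, beta, v, w)

rng = random.Random(1)
jumps = [sample_stable(1.5, 0.0, rng) for _ in range(5)]
print(jumps)
```

Two special cases give a quick consistency check of the transcription: for $\mu = 1$, $\beta = 0$ the map reduces to $X = \tan V$, a Cauchy variable, and for $\mu = 2$, $\beta = 0$ it reduces to $X = 2\sqrt{W}\sin V$, a Gaussian variable whose characteristic function is $e^{-|\xi|^2}$.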
2.2. Constitutive modeling

A large number of fractional models arises from the use of fractional derivatives in constitutive laws, which when combined with classical continuity or balance laws leads to fractional pdes. In this section we will highlight this for two problem classes: The first deals with fractional versions of Fick’s first law of diffusion, which leads to diffusion models including and also going beyond the one derived from the fundamental ctrw approach in Section 2.1. The second comprises viscoelastic material laws in mechanics, which was perhaps the first application of fractional time derivative concepts and was among the driving forces for their development; see Caputo [52] and Caputo and Mainardi [53].

2.2.1. Anomalous diffusive transport. As already pointed out in Section 2.1, diffusive processes are commonly described by Fick’s first and second laws,
$$J = -D\nabla u, \tag{2.19}$$
$$u_t = D\Delta u, \tag{2.20}$$
where $J(x, t)$ is the flux, $u(x, t)$ is the density of some diffusing quantity, and $D$ the diffusion coefficient. The latter arises from the former by inserting it into the continuity equation
$$u_t + \mathrm{div}\,J = 0, \tag{2.21}$$
which results from the assumption that the quantity $u$ is conserved in the following sense. Considered over an arbitrary smooth subset $V$ of the overall spatial domain $\Omega \subseteq \mathbb{R}^d$, the temporal change of $u$ is balanced by the flow through the boundary
$$\frac{d}{dt}\int_V u\,dx = -\int_{\partial V} J\cdot\nu\,ds = -\int_V \mathrm{div}\,J\,dx,$$
where we have used Gauss’s theorem in the second equality. Since this identity holds for arbitrary $V$, it implies (2.21).

We will keep (2.21) as an absolute physical principle while modifying the constitutive law (2.19). To motivate these modifications, we point to the fact that in (2.20), due to its parabolic nature, perturbations propagate at infinite speed. Indeed, the unique solution of (2.20) considered on all of $\mathbb{R}^d$ with initial conditions $u(x, 0) = u_0(x)$ for a nonnegative function $u_0$ supported in a small open neighborhood $U(x_0)$ of some point $x_0$, $u_0(x) = 0$ in $\mathbb{R}^d\setminus U(x_0)$, $u_0(x) > 0$ in $U(x_0)$, is given by
$$u(x, t) = \frac{1}{(4\pi t)^{d/2}}\int_{U(x_0)}\exp\Big(-\frac{|x - y|^2}{4t}\Big)u_0(y)\,dy$$
and therefore is strictly positive for any $x \in \mathbb{R}^d$ and $t > 0$. That is, the initial perturbation localised at $x_0$ propagates throughout the whole space immediately.
To remove this unphysical phenomenon, Cattaneo in 1948 [56] proposed replacing (2.19) by the relaxed version
$$\tau J_t + J = -D\nabla u \tag{2.22}$$
with a certain relaxation time $\tau$. This is also known as the Maxwell–Cattaneo law due to the fact that James Clerk Maxwell suggested a similar relation in mechanics in 1867; see also (2.38) below. Now inserting (2.22) in place of (2.19) into (2.21) results in substituting (2.20) by
$$\tau u_{tt} + u_t = D\Delta u, \tag{2.23}$$
which is a weakly damped wave equation with (finite) wave speed $\sqrt{D/\tau}$.
A drawback of the hyperbolic equation (2.23) is that in the physical context of modeling heat flow it may predict temperatures less than absolute zero; see, for example, [100, 104, 365]. Fractional generalisations of the heat flux law have emerged in the literature as a way of interpolating between the properties of the two flux laws; see, e.g., [14, 68, 99, 282] and the references therein. Following [68], we will now look at several fractional time derivative versions of the constitutive law (2.22) obtained by using the partial (with respect to time) versions $\partial_t^\alpha$, $I_t^\alpha$ of the Riemann–Liouville derivative (1.18) and the Abel integral operator (1.7) with $\alpha \in [0, 1]$. This leads to different versions of the resulting diffusion equation (2.23). Doing so, we follow a purely phenomenological approach. However, as shown in [68, section 3], part of these constitutive laws can be achieved alternatively via a ctrw approach. These time-fractional general flux equations (GFE) are as follows:
$$\text{(GFE)} \qquad \tau^\alpha\partial_t^\alpha J + J = -D\nabla u; \tag{2.24}$$
$$\text{(GFE I)} \qquad \tau^\alpha\partial_t^\alpha J + J = -D\,\partial_t^{1-\alpha}\nabla u; \tag{2.25}$$
$$\text{(GFE II)} \qquad \tau^\alpha\partial_t^\alpha J + J = -D\,I_t^{1-\alpha}\nabla u; \tag{2.26}$$
$$\text{(GFE III)} \qquad \tau\,\partial_t J + J = -D\,I_t^{1-\alpha}\nabla u. \tag{2.27}$$
When inserted into continuity equation (2.21), they lead to the respective modified diffusion equations:
$$\text{(GCE)} \qquad \tau^\alpha\partial_t^{1+\alpha}u + u_t = D\Delta u; \tag{2.28}$$
$$\text{(GCE I)} \qquad \tau^\alpha\partial_t^{1+\alpha}u + u_t = D\,\partial_t^{1-\alpha}\Delta u; \tag{2.29}$$
$$\text{(GCE II)} \qquad \tau^\alpha\partial_t^{1+\alpha}u + u_t = D\,I_t^{1-\alpha}\Delta u; \tag{2.30}$$
$$\text{(GCE III)} \qquad \tau u_{tt} + u_t = D\,I_t^{1-\alpha}\Delta u; \tag{2.31}$$
in place of (2.23). The latter, in its turn, is recovered in the limit $\alpha \to 1$ from each of these models, whereas (up to a constant factor multiplied with the time derivative), we obtain (2.20) by setting $\alpha = 0$. Here for
easier comparison, as in [68] we use the abbreviation GCE for generalised Cattaneo equation.

2.2.2. Viscoelastic models. Here the physical conservation law is the equation of motion (resulting from a balance of forces)
$$\rho u_{tt} = \mathrm{div}\,\sigma + f,$$
where $\rho$ is the mass density, $u(x, t)$ is the vector of displacements, and $\sigma(x, t)$ is the stress tensor. The crucial constitutive equation is now the material law that relates stress and strain, where the latter in case of geometrically linear viscoelasticity is given by the symmetric gradient of $u$, that is,
$$\epsilon = \frac12\big(\nabla u + (\nabla u)^T\big). \tag{2.32}$$
For simplicity and in order to avoid working with tensors, we now restrict our exposition to one space dimension. We can imagine this setting as that of a rod whose longitudinal displacement $u$ is caused by a force $f$ acting in longitudinal direction as well. (It is not really a traction force since this would be modeled by a boundary condition imposed at one of the endpoints of the rod, but rather a force acting on each interior point of the rod. For example it could be a gravitational force if we imagine the rod hanging in a vertical direction.) The exposition here partly follows the books [15], [235], to which we also point for further details. The equation of motion then reads
$$\rho u_{tt} = \sigma_x + f, \tag{2.33}$$
and the kinematic equation (2.32) relating strain and displacement simplifies to
$$\epsilon = u_x. \tag{2.34}$$
The models that we are going to obtain extend in a straightforward manner to the physically relevant setting of higher space dimensions.

A purely elastic material is characterised by Hooke’s model, where the stress-strain relation is given by
$$\sigma = b_0\epsilon \tag{2.35}$$
with $b_0$ playing the role of an elastic modulus. Here the force $\sigma$ is proportional to the extension $\epsilon$, as is the case for an elastic spring. If the force is proportional to the extension rate rather than the extension itself, then according to Newton’s model
$$\sigma = b_1\epsilon_t, \tag{2.36}$$
we deal with pure viscoelasticity. A combination of these leads to Voigt’s model (also called Kelvin–Voigt model)
$$\sigma = b_0\epsilon + b_1\epsilon_t. \tag{2.37}$$
On the other hand, similarly to Cattaneo's law above, one might introduce relaxation by adding a multiple of the stress rate on the left-hand side, which is known as Maxwell's model
\[ \sigma + a_1\sigma_t = b_0\,\epsilon. \tag{2.38} \]
A combination of Voigt's and Maxwell's models yields the so-called Zener model
\[ \sigma + a_1\sigma_t = b_0\,\epsilon + b_1\,\epsilon_t. \tag{2.39} \]
There exists a large variety of models including even higher time derivatives. We will rather focus on replacing the zero and first order integer derivatives appearing in (2.36)–(2.39) by intermediate fractional ones. This is motivated by the behaviour of the loss tangent $\delta(\omega)$, a quantity that summarises the damping ability of a viscoelastic body when exposed to sinusoidal excitation at frequency $\omega$. According to experimental data, $\delta(\omega)$ exhibits a frequency dependence that does not necessarily coincide with the one resulting from one of the above models; see [235, Sections 2.7, 2.8, 3.2.2, 3.2.3]. A general model class is defined by
\[ \sum_{n=0}^{N} a_n\,\partial_t^{\alpha_n}\sigma = \sum_{m=0}^{M} b_m\,\partial_t^{\beta_m}\epsilon \tag{2.40} \]
with the normalisation $a_0 = 1$, where $0 \le \alpha_0 < \alpha_1 \cdots \le \alpha_N$ and $0 \le \beta_0 < \beta_1 \cdots \le \beta_M$; cf., e.g., [15, Section 3.1.1]. One can even make this more general by replacing the weighted sum of derivatives by an integral over all possible values of $\alpha$ with respect to some measure $\mu$. This concept of distributed derivatives can, e.g., be found in [198], and we refer to Section 9.1 as well as the original paper [303] for the inverse problem of determining the model from experimental data, that is, of determining this measure. Some relevant instances particularly related to the above integer order models are the fractional Newton model (also called Scott–Blair model)
\[ \sigma = b_1\,\partial_t^{\beta}\epsilon, \tag{2.41} \]
the fractional Voigt model
\[ \sigma = b_0\,\epsilon + b_1\,\partial_t^{\beta}\epsilon, \tag{2.42} \]
the fractional Maxwell model
\[ \sigma + a_1\,\partial_t^{\alpha}\sigma = b_0\,\epsilon, \tag{2.43} \]
and the fractional Zener model
\[ \sigma + a_1\,\partial_t^{\alpha}\sigma = b_0\,\epsilon + b_1\,\partial_t^{\beta}\epsilon. \tag{2.44} \]
We now proceed to deriving wave equations by combining (2.33) and (2.34) with one of the constitutive models (2.41)–(2.44). We do so by using linear combinations of fractionally time differentiated versions of the equation of motion, so that the right-hand side is modified to
\[ \tilde f = \sum_{n=0}^{N} a_n\,\partial_t^{\alpha_n} f. \tag{2.45} \]
This leads to the fractional Newton wave equation
\[ \rho u_{tt} = b_1\,\partial_t^{\beta} u_{xx} + \tilde f, \tag{2.46} \]
the fractional Kelvin–Voigt wave equation
\[ \rho u_{tt} = b_0 u_{xx} + b_1\,\partial_t^{\beta} u_{xx} + \tilde f, \tag{2.47} \]
the fractional Maxwell wave equation
\[ \rho u_{tt} + a_1\rho\,\partial_t^{2+\alpha} u = b_0 u_{xx} + \tilde f, \tag{2.48} \]
the fractional Zener wave equation
\[ \rho u_{tt} + a_1\rho\,\partial_t^{2+\alpha} u = b_0 u_{xx} + b_1\,\partial_t^{\beta} u_{xx} + \tilde f, \tag{2.49} \]
or, for the general model (2.40), to
\[ \rho \sum_{n=0}^{N} a_n\,\partial_t^{2+\alpha_n} u = \sum_{m=0}^{M} b_m\,\partial_t^{\beta_m} u_{xx} + \tilde f. \tag{2.50} \]
For comparison, note that applying this procedure to (2.35), one arrives at the standard wave equation
\[ \rho u_{tt} = b_0 u_{xx} + f. \tag{2.51} \]
This combine-and-insert procedure also works in the case of space dependent coefficients $\rho$, $b_m$, as often needed to model inhomogeneous material, and gives
\[ \rho \sum_{n=0}^{N} a_n\,\partial_t^{2+\alpha_n} u = \sum_{m=0}^{M} \bigl(b_m\,\partial_t^{\beta_m} u_x\bigr)_x + \tilde f. \tag{2.52} \]
The coefficients $a_1, \dots, a_N$, $b_0, \dots, b_M$ are typically positive, hence the fractional derivative terms on the right-hand side of equations (2.46)–(2.49) act as damping terms, as will also become apparent in the energy estimates of Chapter 7. However, the leading order time derivative terms in the latter two models are higher than two, which makes stability somewhat more delicate, as the elementary example $N = M = 1$, $\alpha_0 = \beta_0 = 0$, $\alpha_1 = \beta_1 = 1$ (that is, the fractional Zener wave equation (2.49) with $\alpha = \beta = 1$) demonstrates. Indeed, as Figure 2.5 shows, the decay of energy, and therewith of wave amplitudes, depends heavily on the interplay of the parameters $a_1$, $b_0$, $b_1$.
[Figure 2.5: curves $y(t)$ plotted against $t$ for $\gamma = 0.5$, $\gamma = 0$, and $\gamma = -1$.]
Figure 2.5. Amplitudes $y(t)$ of solutions to (2.49) with $\alpha = \beta = 1$, on the space interval $(0, 1)$ with $u(x, 0) = \sin(2\pi x)$, $u_t(x, 0) = u_{tt}(x, 0) = 0$, for different values of $\gamma = \frac{1}{a_1} - \frac{b_0}{b_1}$.
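The role of the sign of $\gamma$ can be checked with a small computation. The following is a minimal sketch, not the method used to produce Figure 2.5: it assumes $f = 0$, $\rho = 1$, and a separated solution $u(x,t) = e^{st}\sin(kx)$ with $k = 2\pi$, so that (2.49) with $\alpha = \beta = 1$ reduces to a cubic for the exponent $s$. The function names and the default values $a_1 = b_0 = 1$ are illustrative choices.

```python
import numpy as np

def zener_mode_exponents(a1, b0, b1, rho=1.0, k=2.0 * np.pi):
    """Roots s of  a1*rho*s^3 + rho*s^2 + b1*k^2*s + b0*k^2 = 0.

    The cubic arises from inserting u(x, t) = exp(s*t) * sin(k*x) into (2.49)
    with alpha = beta = 1 and f = 0 (assumptions made for this sketch).
    """
    lam = k ** 2  # eigenvalue of -d^2/dx^2 belonging to sin(k*x)
    return np.roots([a1 * rho, rho, b1 * lam, b0 * lam])

def growth_rate(gamma, a1=1.0, b0=1.0):
    """Largest real part of the exponents for gamma = 1/a1 - b0/b1."""
    b1 = b0 / (1.0 / a1 - gamma)  # solve gamma = 1/a1 - b0/b1 for b1; needs gamma < 1/a1
    return max(zener_mode_exponents(a1, b0, b1).real)

rates = {gamma: growth_rate(gamma) for gamma in (0.5, 0.0, -1.0)}
for gamma, rate in rates.items():
    print(f"gamma = {gamma:5.2f}: max Re(s) = {rate:+.3e}")
```

Under these assumptions a negative rate means the mode decays, a rate of zero means it oscillates without decay, and a positive rate means it grows, which matches the Routh–Hurwitz condition $b_1 > a_1 b_0$ for the cubic.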
More precisely, as has been shown in detail in, e.g., [179, 240, 264], stability of the system crucially depends on the sign of the parameter $\gamma = \frac{1}{a_1} - \frac{b_0}{b_1}$. A necessary condition for thermodynamic consistency, or in more mathematical terms, for long time stability of the wave equation resulting from a combination of (2.33), (2.34) with (2.40), is, according to [15, Lemma 3.1], the condition $\max_n \alpha_n \le \max_m \beta_m$. In the well-posedness analysis of the above equations, which will be carried out in Chapter 7, we will see that this is also crucial in order to be able to bound the PDE solutions in certain norms. However, note that the above case $N = M = 1$, $\alpha_0 = \beta_0 = 0$, $\alpha_1 = \beta_1 = 1$ is a counterexample to this being sufficient, as Figure 2.5 shows.

In higher space dimensions with vector-valued displacement $\mathbf{u}$ instead of $u$, the quantities $a_n$, $b_m$ in (2.40) will in general be fourth order tensors, and deriving a wave equation is not that straightforward any more, especially in the cases of (2.43), (2.44), due to the fact that then in general $\operatorname{div}(a_n\sigma)$ can no longer be expressed in terms of $\operatorname{div}\sigma$ alone (for which we have the equation of motion). Still, in the case of scalar coefficients, we can use the models above to derive variants of the acoustic wave equation for the pressure $p(x, t)$, which is scalar valued. To do so, we make use of the balance of momentum identity
\[ \rho\,\mathbf{u}_{tt} = -\nabla p + \mathbf{f} \tag{2.53} \]
(as compared to mechanics, we make the substitutions $\sigma \to -pI$, $\epsilon(\mathbf{u}) \to \nabla\cdot\mathbf{u}$) and replace (2.40) by
\[ \sum_{n=0}^{N} a_n\,\partial_t^{\alpha_n} p = -\sum_{m=0}^{M} b_m\,\partial_t^{\beta_m}\,\nabla\cdot\mathbf{u} \]
to obtain
\[ \sum_{n=0}^{N} a_n\,\partial_t^{2+\alpha_n} p = \sum_{m=0}^{M} b_m\,\partial_t^{\beta_m}\,\nabla\cdot\Bigl(\frac{1}{\rho}\nabla p\Bigr) + \tilde f, \]
with $\tilde f = -\sum_{m=0}^{M} b_m\,\partial_t^{\beta_m}\,\nabla\cdot\bigl(\frac{1}{\rho}\mathbf{f}\bigr)$, in place of (2.50). This remains valid with space-dependent coefficients $b_m$ and then reads as
\[ \sum_{n=0}^{N} a_n\,\partial_t^{2+\alpha_n} p + \sum_{m=0}^{M} b_m\,\partial_t^{\beta_m} A p = \tilde f, \tag{2.54} \]
where $Ap = -\nabla\cdot\bigl(\frac{1}{\rho}\nabla p\bigr)$. Specifying this to the fractional counterparts of the classical models (2.41)–(2.44), we end up with the following models used in linear and nonlinear acoustics (see, e.g., [45]) that we will revisit in Chapters 7 and 11: the Caputo–Wismer–Kelvin wave equation (corresponding to the fractional Kelvin–Voigt model on the mechanical side)
\[ p_{tt} + b_0 A p + b_1\,\partial_t^{\beta} A p = \tilde f, \tag{2.55} \]
the modified Szabò wave equation (corresponding to the fractional Maxwell model)
\[ p_{tt} + a_1\,\partial_t^{2+\alpha} p + b_0 A p = \tilde f, \tag{2.56} \]
and the fractional Zener wave equation
\[ p_{tt} + a_1\,\partial_t^{2+\alpha} p + b_0 A p + b_1\,\partial_t^{\beta} A p = \tilde f. \tag{2.57} \]
To conclude this section, we mention that the sub- and superdiffusion models derived in Section 2.1 by continuous time random walks can also be based on constitutive modeling, replacing the pointwise-defined Fick's law (2.19) by a convolution equation
\[ J(x, t) = -\int_0^t k(t - \tau)\,\nabla u(x, \tau)\,d\tau. \tag{2.58} \]
The standard Fick's law is then obtained by taking $k(t) = \delta(t)$. For other kernels $k$, (2.58) immediately provides a history effect, and taking $k(t)$ to be a fractional power, $k(t) = c\,t^{\alpha-1}$, $0 < \alpha < 1$, recovers Abel's fractional integral (1.7); see also (4.3). From this point the reduction to a fractional order differential equation is immediate. See [212, 253] for some of the details of this formulation, which has been used for both sub- and superdiffusion cases. We also refer to [124, 260, 281].
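The reduction mentioned above can be sketched in two lines. This is an illustration under stated assumptions, none of which is restated in this section: the normalisation $c = 1/\Gamma(\alpha)$, the Abel integral in (1.7) in the form $I_t^{\alpha} g(t) = \frac{1}{\Gamma(\alpha)}\int_0^t (t-\tau)^{\alpha-1} g(\tau)\,d\tau$, and the conservation law $u_t + \nabla\cdot J = 0$.

```latex
\begin{align*}
J(x,t) &= -\frac{1}{\Gamma(\alpha)}\int_0^t (t-\tau)^{\alpha-1}\,\nabla u(x,\tau)\,d\tau
        = -I_t^{\alpha}\,\nabla u(x,t),\\
u_t &= -\nabla\cdot J = I_t^{\alpha}\,\Delta u .
\end{align*}
```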
Chapter 3
Special Functions and Tools
There is an obvious question in calculus: Is there a function $E(t)$ such that its derivative equals itself? Slightly more generally, and looking ahead to differential equations, we should ask if there is an $E(t)$ such that $\frac{d}{dt}E(t) = \lambda E(t)$ for some given constant $\lambda$. Of course, this shows why $E(t) = e^{\lambda t}$ plays such a central role. Looking a bit further ahead to PDEs and their fundamental solutions, the question then becomes: what is the key function for $u_t - u_{xx} = 0$? The answer of course again involves the exponential function.

If we now replace the integer derivative by one of fractional type, say one based on the Abel fractional integral, what are the answers to the previous questions? For the relaxation equation $D_t^{\alpha} u = \lambda u$, the answer turns out to be a Mittag-Leffler function $E_{\alpha}(z)$, but the argument $z$ will contain a fractional power $t^{\alpha}$. $E_{\alpha}(z)$ is itself an entire function but considerably more complex than the exponential, as the factorial present in the power series for $e^x$ is replaced with a Gamma function. We will also need a two parameter version of this function, $E_{\alpha,\beta}(z)$, adding further complexity. For the fundamental solution of the subdiffusion equation $D_t^{\alpha} u - u_{xx} = 0$, we will need an additional special function, the study of which was originally due to Wright (but introduced for the purposes of number theory).

In fact, there is a great deal of parallelism between the tools for analytic number theory developed over the last 150 years or so and those tools needed for
the study of fractional differential operators, namely complex analysis. It is to this aspect that we now turn in this chapter.
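To make the remark about the power series concrete, here is a minimal sketch of the two parameter Mittag-Leffler function via its standard series $E_{\alpha,\beta}(z) = \sum_{k\ge 0} z^k/\Gamma(\alpha k + \beta)$. The series definition is the usual one and is assumed here rather than quoted from this chapter; the truncation and the function name are illustrative, and the approach is only sensible for moderate $|z|$ with $\alpha, \beta > 0$.

```python
import math

def mittag_leffler(z, alpha, beta=1.0, terms=100):
    """Truncated series  sum_{k < terms} z^k / Gamma(alpha*k + beta),  alpha, beta > 0.

    Naive summation: adequate for moderate |z| only, since the terms of an
    alternating series can become large before they decay.
    """
    total = 0.0
    for k in range(terms):
        arg = alpha * k + beta
        if arg > 170.0:  # math.gamma overflows near 171; such terms are negligible here
            break
        total += z ** k / math.gamma(arg)
    return total

# Special cases with known closed forms.
print(mittag_leffler(-2.0, 1.0), math.exp(-2.0))              # E_1(z) = exp(z)
print(mittag_leffler(-2.25, 2.0), math.cos(1.5))              # E_2(-x^2) = cos(x)
print(mittag_leffler(-1.0, 0.5), math.e * math.erfc(1.0))     # E_{1/2}(z) = exp(z^2) erfc(-z)
```

For large arguments the series suffers from cancellation, which is one reason integral representations are used in practice.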
3.1. Basic results in complex analysis

As noted above, in this chapter we will meet several special functions needed in fractional calculus and in representations of solutions of fractional order differential equations. These have quite complex behaviour that can only be analysed effectively using the tools of complex analysis. Since these will be a major topic throughout this book, we use this section to give some basic results in this direction and to set the groundwork for later sections. There are many excellent books that provide background for this, and for an overall perspective we suggest the graduate texts [69] or [293].

In complex analysis, Cauchy's residue theorem is a powerful tool for evaluating line integrals of analytic functions over closed curves, with applications to the computation of real integrals and infinite series and to the analysis of the special functions just mentioned. We will always assume our curves in $\mathbb{C}$ are rectifiable (in fact there is no need for us to go beyond piecewise smooth). A basic fact is that if $\gamma$ is such a closed curve and $a \notin \gamma$, then $\frac{1}{2\pi i}\int_{\gamma}\frac{dz}{z-a}$ is an integer: the winding number $n(\gamma, a)$. If $a$ lies in the exterior of $\gamma$, then $n(\gamma, a) = 0$. If $a$ lies in the interior of $\gamma$ and $\gamma$ winds around $a$ a total of $k$ times, then $n(\gamma, a) = k$. The basic result follows.

Theorem 3.1. Let $G$ be an open subset of the plane, and let $f : G \to \mathbb{C}$ be analytic. If $\gamma$ is a closed curve in $G$ with $n(\gamma, w) = 0$ for all $w \in \mathbb{C}\setminus G$, then for $a \in G \setminus \gamma$
\[ n(\gamma, a)\,f(a) = \frac{1}{2\pi i}\int_{\gamma}\frac{f(z)}{z-a}\,dz. \]
Alternatively, if $G$ is simply connected, then $\int_{\gamma} f\,dz = 0$ for every closed, rectifiable curve $\gamma$ in $G$ and every analytic function $f$ on $G$.

We will also need the argument principle:

Theorem 3.2. Let $f$ be meromorphic in $G$ with poles $\{p_j\}_{j=1}^{P}$ and zeros $\{z_k\}_{k=1}^{Z}$ counted according to multiplicity. If $\gamma$ is a closed, rectifiable curve enclosing, but not passing through, these poles and zeros, then
\[ \frac{1}{2\pi i}\int_{\gamma}\frac{f'(z)}{f(z)}\,dz = Z - P. \]

Definition 3.1. An entire function $f$ is of finite order if and only if there exist $\tilde\rho$ and $R$ such that $|f(z)| < \exp(|z|^{\tilde\rho})$ whenever $|z| \ge R$. The infimum $\rho = \rho(f)$ of such $\tilde\rho$ is the order of $f$.
In other words, the order of $f(z)$ is the infimum of all $m$ such that $f(z) = O\bigl(e^{|z|^m}\bigr)$ as $z \to \infty$.

Theorem 3.3. Let $f$ be an entire function of finite order. Then $\rho(f)$ satisfies
\[ \rho(f) = \lim_{R\to\infty}\,\sup_{r \ge R}\,\frac{\log\log M(f, r)}{\log(r)}, \qquad\text{where } M(f, r) = \max_{|z| = r}|f(z)|. \]
The proof uses the following. If $f$ is of finite order, then
\[ M(f, r) \le e^{r^{\rho}} \;\Rightarrow\; \log M(f, r) \le r^{\rho} \;\Rightarrow\; \log\log M(f, r) \le \rho\log(r), \]
so
\[ \lim_{R\to\infty}\,\sup_{r \ge R}\,\frac{\log\log M(f, r)}{\log(r)} \le \rho. \]
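The quotient in Theorem 3.3 can be evaluated numerically at a fixed radius. The sketch below is illustrative only: the maximum modulus is approximated by sampling the circle $|z| = r$ at equally spaced points, and the radii are hand-picked to avoid floating point overflow; neither choice comes from the text.

```python
import numpy as np

def max_modulus(f, r, samples=4096):
    """Approximate M(f, r) = max_{|z| = r} |f(z)| by sampling the circle."""
    theta = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    return np.max(np.abs(f(r * np.exp(1j * theta))))

def order_estimate(f, r):
    """The quotient log log M(f, r) / log r from Theorem 3.3 at a single radius."""
    return np.log(np.log(max_modulus(f, r))) / np.log(r)

examples = [
    ("exp(z)", np.exp, 20.0),
    ("exp(-z^2)", lambda z: np.exp(-z ** 2), 20.0),
    ("sinh(sqrt(z))", lambda z: np.sinh(np.sqrt(z)), 1.0e4),
]
for name, f, r in examples:
    print(f"{name:14s} r = {r:8.1f}  quotient = {order_estimate(f, r):.4f}")
```

The quotient equals the order exactly for $e^{z}$ and $e^{-z^2}$, while for $\sinh(\sqrt z)$ it approaches $1/2$ only slowly, because $\log M$ carries the correction $-\log 2$.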
We also say an entire function of order $\rho$ has (exponential) type $\sigma$, where
\[ \sigma = \limsup_{R\to\infty}\, R^{-\rho}\log\max_{|z| \le R}|f(z)|. \]
Given positive numbers $\rho$ and $\sigma$, an example of an entire function of order $\rho$ and type $\sigma$ can be exhibited using the power series
\[ f(z) = \sum_{n=1}^{\infty}\Bigl(\frac{e\rho\sigma}{n}\Bigr)^{n/\rho} z^n. \]
Some specific examples are the following.
• The functions $e^{az}$, $\cosh(az)$, $\sinh(az)$, with $a \ne 0$, have order 1 and type $|a|$.
• The function $\sinh(\sqrt{z})$ has order $\frac{1}{2}$ and $e^{-az^2}$ has order 2.
• Polynomials have order zero and $e^{e^z}$ has infinite order.

Any finite sequence $\{c_n\}$ in the complex plane can form a polynomial $p(z)$ that has zeros precisely at the points of that sequence, $p(z) = \prod_n (z - c_n)$. This is just the fundamental theorem of algebra; every polynomial may be factored into linear factors, $m$ for each root, where $m$ is its multiplicity. The Weierstrass factorisation theorem asserts that every entire function can be represented as a (possibly infinite) product involving its zeros. As such, it may be viewed as an extension of this.

Simply writing the finite product as an infinite one cannot work directly. If the sequence $\{c_n\}$ is not finite, it can never define an entire function because the infinite product would not converge everywhere. A necessary condition for convergence of the infinite product in question is that for each $z$, the factors $(z - c_n)$ must approach unity as $n \to \infty$. Any more complex form of the factors must allow convergence but not allow any more zeros to be involved.

Consider the functions of the form $\exp\bigl(-\frac{z^{n+1}}{n+1}\bigr)$ for integers $n$. At $z = 0$, they evaluate to unity and have a flat slope for orders up to $n$. Right after $z = 1$, they decrease quickly to some small positive value. On the other hand, the function $1 - z$ has no flat slope but is zero at $z = 1$. There is also the useful identity to combine the two types of function; for $|z| < 1$,
\[ (1 - z) = e^{\log(1-z)} = \exp\Bigl(-\frac{z^1}{1} - \frac{z^2}{2} - \frac{z^3}{3} + \cdots\Bigr). \]
This motivates the choice of Weierstrass factors,
\[ E_n(z) = \begin{cases} (1 - z) & \text{if } n = 0,\\[2pt] (1 - z)\exp\Bigl(\dfrac{z^1}{1} + \dfrac{z^2}{2} + \cdots + \dfrac{z^n}{n}\Bigr) & \text{otherwise,} \end{cases} \;=\; \exp\Bigl(-\frac{z^{n+1}}{n+1}\sum_{k=0}^{\infty}\frac{z^k}{1 + k/(n+1)}\Bigr), \quad |z| < 1. \tag{3.1} \]
Thus from the last line one can read off how those properties are enforced. With this we have (see [31])

Theorem 3.4 (Weierstrass factorisation theorem). Let $f$ be an entire function, and let $\{a_n\}$ be the nonzero zeros of $f$ repeated according to multiplicity. Suppose also that $f$ has a zero at $z = 0$ of order $m \ge 0$. Then there exists an entire function $g$ and a sequence of integers $\{p_n\}$ such that
\[ f(z) = z^m e^{g(z)}\prod_{n=1}^{\infty} E_{p_n}\Bigl(\frac{z}{a_n}\Bigr). \tag{3.2} \]
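As a numerical sanity check of the two forms of the Weierstrass factor in (3.1), the following sketch compares the product form with the exponential series form at a point inside the unit disc. The function names and the truncation length are illustrative choices.

```python
import numpy as np

def weierstrass_factor(z, n):
    """E_n(z) = (1 - z) * exp(z + z^2/2 + ... + z^n/n), with E_0(z) = 1 - z."""
    powers = sum(z ** k / k for k in range(1, n + 1))  # empty sum gives 0 when n = 0
    return (1 - z) * np.exp(powers)

def weierstrass_factor_series(z, n, terms=200):
    """Second form in (3.1): exp(-z^(n+1)/(n+1) * sum_k z^k / (1 + k/(n+1))), |z| < 1."""
    k = np.arange(terms)
    tail = np.sum(z ** k / (1.0 + k / (n + 1.0)))
    return np.exp(-z ** (n + 1) / (n + 1.0) * tail)

z = 0.3 + 0.4j  # |z| = 0.5
for n in range(4):
    print(n, weierstrass_factor(z, n), weierstrass_factor_series(z, n))
```

The second form makes visible that $1 - E_n(z)$ vanishes to order $n + 1$ at the origin, which is what drives the convergence of the product in (3.2).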
The important representation theorem due to Hadamard is the following:

Theorem 3.5. Let $f$ be an entire function of finite order $\rho \ge 0$, where $f$ has a zero of order $m$ at $z = 0$. Then $f$ can be written as
\[ f(z) = z^m e^{g(z)}\prod_{j=1}^{\infty} E_{\lfloor\rho\rfloor}(z/a_j), \tag{3.3} \]
where $\{a_j\}$ are the nonzero zeros of $f(z)$ repeated according to their multiplicity and $g(z)$ is a polynomial of degree $\le \rho$.

From the above we have

Corollary 3.1. If $f(z)$ is an entire function of finite order $\rho$ without zeros, then $f(z) = e^{g(z)}$ where $g(z)$ is a polynomial of degree $\rho \in \mathbb{Z}$.

There are several important examples of this infinite product representation, and we note two of them (see [284, Chap. 38]):
\[ \sin(\pi z) = \pi z\prod_{n \ne 0}\Bigl(1 - \frac{z}{n}\Bigr)e^{z/n} = \pi z\prod_{n=1}^{\infty}\Bigl(1 - \frac{z^2}{n^2}\Bigr), \qquad \frac{1}{\Gamma(z)} = z\,e^{\gamma z}\prod_{n=1}^{\infty}\Bigl(1 + \frac{z}{n}\Bigr)e^{-z/n}, \tag{3.4} \]
where $\gamma$ is the Euler–Mascheroni constant.
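Both products in (3.4) converge slowly, with a truncation error of order $z^2/N$ after $N$ factors, which the following sketch makes visible. It is a plain truncation for real $z > -1$, meant as a check of the formulas rather than as a practical way to compute either function; the names and the number of factors are illustrative.

```python
import math
import numpy as np

def sin_product(z, terms=200_000):
    """Truncation of  pi*z * prod_{n>=1} (1 - z^2/n^2)  from (3.4)."""
    n = np.arange(1, terms + 1, dtype=float)
    return math.pi * z * np.prod(1.0 - (z / n) ** 2)

def reciprocal_gamma_product(z, terms=200_000):
    """Truncation of  z * exp(gamma*z) * prod_{n>=1} (1 + z/n) * exp(-z/n),  real z > -1."""
    n = np.arange(1, terms + 1, dtype=float)
    log_product = np.sum(np.log1p(z / n) - z / n)  # sum of logs avoids under/overflow
    return z * math.exp(np.euler_gamma * z + log_product)

for z in (0.7, 2.5):
    print(z, sin_product(z), math.sin(math.pi * z))
    print(z, reciprocal_gamma_product(z), 1.0 / math.gamma(z))
```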
3.1.1. Completely monotone functions.

Definition 3.2. A function $f$ with domain $(0, \infty)$ is said to be completely monotone (c.m.) if it possesses derivatives $f^{(n)}(x)$ for all $n = 0, 1, 2, \dots$ and these satisfy $(-1)^n f^{(n)}(x) \ge 0$ for all $x > 0$.

Some obvious examples of such functions are the following:
\[ e^{-ax}, \qquad e^{a/x}, \qquad \frac{\log(1 + x)}{x}, \qquad \frac{1}{(\lambda + \mu x)^{\nu}} \qquad\text{for } a > 0,\ \lambda \ge 0,\ \mu \ge 0,\ \nu \ge 0. \]
It is easy to see that if $f(x)$ is c.m., then $f^{(2m)}(x)$ and $-f^{(2m+1)}(x)$ are also c.m., and this produces many more examples. The following result is easily proven.

Theorem 3.6. If $f(x)$ and $g(x)$ are c.m., then so is $af(x) + bg(x)$, where $a$ and $b$ are nonnegative constants. The product $f(x)g(x)$ is also c.m.

The proof of the second part follows from Leibniz's formula
\[ \frac{d^n}{dx^n}\bigl[f(x)g(x)\bigr] = \sum_{k=0}^{n}\binom{n}{k} f^{(k)}(x)\,g^{(n-k)}(x). \]
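The step from Leibniz's formula to the product statement can be spelled out as a short worked equation, using only Definition 3.2: distributing the sign $(-1)^n = (-1)^k(-1)^{n-k}$ over the two factors of each term gives

```latex
\[
(-1)^n \frac{d^n}{dx^n}\bigl[f(x)g(x)\bigr]
  = \sum_{k=0}^{n}\binom{n}{k}\,
    \bigl[(-1)^k f^{(k)}(x)\bigr]\,
    \bigl[(-1)^{n-k} g^{(n-k)}(x)\bigr] \;\ge\; 0,
\]
```

since every bracket is nonnegative when $f$ and $g$ are c.m.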
Theorem 3.7. Let $f(x)$ be c.m. and let $g(x)$ be nonnegative with a c.m. derivative. Then $f[g(x)]$ is also c.m.

For the proof of the above, the simplest, if certainly not the most elegant, approach is to use directly the conditions in the derivative formula for a composition of functions
\[ \frac{d^n}{dx^n} f[g(x)] = \sum_{k=1}^{n}\frac{1}{k!}\,f^{(k)}[g(x)]\sum_{j=0}^{k-1}(-1)^j\binom{k}{j}[g(x)]^{j}\,\frac{d^n}{dx^n}[g(x)]^{k-j}. \]
The following results are also easy to prove.

Theorem 3.8. Let $y = f(x)$ be c.m., and let the power series $r(y) = \sum_{k=0}^{\infty} r_k y^k$ converge for all $y$ in the range of $f$. If $r_k \ge 0$ for all $k$, then $r\bigl(f(x)\bigr)$ is c.m.

A natural extension of this is to pass from power series to integral representations.

Theorem 3.9. Suppose $K(x, t)$ is c.m. in $x$ for all $t \in (a, b)$ and $f \ge 0$. Then $F(x) = \int_a^b K(x, t)f(t)\,dt$ is c.m.

It may seem that in the larger scheme the number of distinct c.m. functions is extremely small. The next result shows that this is not the case.
Theorem 3.10. Let $h(t)$ be any strictly positive locally integrable function on $[0, \infty)$. Then $H(x) = \int_0^{\infty} e^{-xt}h(t)\,dt$ is completely monotone on $[\delta, \infty)$ for any $\delta > 0$.

The proof is very straightforward by directly differentiating under the integral sign and noting that $x \ge \delta > 0$. The surprising result is that the above theorem is not merely sufficient but also essentially necessary.

Theorem 3.11. A necessary and sufficient condition that $f(x)$ be c.m. is that
\[ f(x) = \int_0^{\infty} e^{-xt}\,d\mu(t), \tag{3.5} \]
where $\mu(t)$ is nondecreasing and the integral converges for $0 < x < \infty$.

In this form it is known as Bernstein's theorem [26]; see also Section A.1. More generally, a version of this theorem characterises Laplace transforms of positive Borel measures on $[0, \infty)$, and in this form it is known as the Bernstein–Widder theorem; see [347]. For the proofs of these we refer to [348, Chap. IV].

An obvious question is how to reverse this; that is, given a function $f$, how do we determine the function $\mu(t)$? Of course the answer is to be able to invert the Laplace transform. The inverse Laplace transform formula is
\[ f(t) = \frac{1}{2\pi i}\lim_{T\to\infty}\int_{\nu - iT}^{\nu + iT} e^{st}F(s)\,ds. \tag{3.6} \]
Here the integration is taken along the vertical line $\operatorname{Re}(s) = \nu$ in the complex plane such that $\nu$ is greater than the real part of all singularities of $F(s)$ and $F(s)$ is bounded on the line. This vertical line contour is the usual Bromwich path and is quite suitable for meromorphic functions $F$. If $F$ requires a branch cut (for example when there are fractional powers in $F$), then it is usual to deform the Bromwich path into a Hankel path $Ha_{\eta}$ that wraps around the branch cut in addition to any poles, marked as $p_+$ or $p_-$ in the upper or lower half-plane. The contributions along the negative real axis cancel, as do the parallel lines to any possible pole (but we must then pick up the corresponding residues). However, the line integral on the circle around the origin must be included. The path is named after Hermann Hankel for his investigations of the Gamma function. An example is shown in Figure 3.1.
We will use formula (3.6) extensively throughout the book, and the particular choice for the Hankel path will also feature heavily due to our use of fractional powers. The following result states under which conditions a completely monotone function can be recovered from its values at countably many points. It is closely related to a result by Müntz and Szász, Theorem 9.4, that will be used in Section 9.1 to prove uniqueness for the fundamental inverse problem of determining fractional order in a subdiffusion equation. A proof can be found in [102].
[Figure 3.1: contour sketch with poles marked $p_+$ and $p_-$.]
Figure 3.1. The Hankel path $Ha_{\eta}$
Theorem 3.12. A completely monotone function is uniquely determined by its values at a sequence of points $\{x_n\}$ provided $\sum_{n=1}^{\infty} x_n$ diverges. If $\sum_{n=1}^{\infty} x_n$ converges, then there are two different completely monotone functions which agree at the points $\{x_n\}_{n\in\mathbb{N}}$.

3.1.2. Asymptotic analysis and the Stokes phenomenon. The Stokes phenomenon plays a common, some might say pervasive, role in special functions and the mathematical analysis of their asymptotics. We will encounter it (although not delve too deeply into it) in several of the special functions required. As the name indicates, this is a relatively old but often-encountered issue; see [323]. To introduce the idea, we start with a rather trivial example before moving on to perhaps the next simplest, and the one that initially motivated George Stokes. The anti-Stokes lines are roughly where some term in the asymptotic expansion changes from increasing to decreasing (and therefore can exhibit a purely oscillatory behaviour), and the Stokes lines are those lines along which some term approaches infinity or zero fastest.

Consider the function $\sinh(z)$: there are two natural regions, $\operatorname{Re}(z) < 0$, for which it is asymptotic to $-\frac{1}{2}e^{-z}$, and $\operatorname{Re}(z) > 0$, for which it is asymptotic to $\frac{1}{2}e^{z}$. The line $\operatorname{Re}(z) = 0$ is therefore a divide. There are also lines along which the function approaches infinity or zero fastest; in this case these are lines of constant imaginary part. One of these (usually the first) is called the Stokes line. We recommend the article by Meyer [247] as an excellent exposition.

Stokes himself used the example of the Airy function $\mathrm{Ai}(z)$, which can be defined as a solution of the equation $\frac{d^2y}{dz^2} - zy = 0$. The solutions $y(z)$ are approximated for large $|z|$ by linear combinations of
\[ u_{\pm}(z) = z^{-\frac{1}{4}}e^{\pm\xi} \quad\text{with } \xi = \tfrac{2}{3}z^{\frac{3}{2}}. \tag{3.7} \]
These turn out to be quantitatively useful even when the argument is only moderately large. Obviously, $u_+(z)$ and $u_-(z)$ are multivalued functions of the complex variable $z$ requiring a branch cut with a branch point at $z = 0$. In contrast, the solutions $y(z)$ of Airy's equation are entire functions of $z$, since this is a linear ordinary differential equation whose coefficients are entire. Therefore, as we traverse a contour once around the point $z = 0$, $y(z)$ will return to its original value, but $u_+(z)$ and $u_-(z)$ will not. Thus if a particular solution of Airy's equation is approximated at $z \ne 0$ by a linear combination $c_1u_+ + c_2u_-$, then it cannot be approximated by the same linear combination at $ze^{2\pi i}$.

Thus the concept of approximation involved here must be domain-dependent, and this is the basis of the Stokes phenomenon. We will have reason to look at several entire functions as basic building blocks of fractional analysis; in particular, the functions named after Mittag-Leffler and Wright. This means that when approximating them by expansions as was done for (3.7), we essentially have to break our domain into subregions. It also means that we have to have a smoothing process when near the Stokes line, and the usual method of choice is due to Berry; see [27].
3.2. Abelian and Tauberian theorems

We mentioned in Section 1.2 that one of Abel's main interests was in convergence of series and in particular in summability methods. It turns out this work was also fundamental, as the Abelian questions were turned around later in the century to yield converse results that have deep consequences, and they will play a significant role in the special functions developed in this chapter.

Abel's theorem on power series states that if $R > 0$ and $\sum_{n=0}^{\infty} a_nR^n$ is convergent, then $\sum_{n=0}^{\infty} a_nx^n$ is uniformly convergent for $0 \le x \le R$ and $s(x) = \sum_{n=0}^{\infty} a_nx^n \to s(R)$ as $x \nearrow R$.

There is also a version here for complex numbers: Let $z$ be a complex number with $|z| < 1$. Then the series $\sum_{k=0}^{\infty} a_k$ is Abel summable to $S$ if $\lim_{z\to 1,\, z\in\gamma}\sum_{k=0}^{\infty} a_kz^k = S$, where $\gamma$ is any path not tangent to the unit circle.

A summation method $S$ is a method for constructing generalised sums of series, generalised limits of sequences, or values of improper integrals. A popular method is that of Cesàro means, in which $S$ is defined as the limit of the arithmetic means of the first $N$ partial sums as $N$ tends to infinity: if $\{a_j\}_{j=1}^{\infty}$ is a sequence and $S_k := \sum_{j=1}^{k} a_j$ its partial sums, then $\{a_j\}$ is called Cesàro summable, with Cesàro sum $A$, if $\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} S_k = A$. This also works for series, and it is important to note that a divergent series of real numbers may be Cesàro summable. The classic example is $a_j = (-1)^{j+1}$. Abel summability is a stronger method than Cesàro's, so we have

convergent ⇒ Cesàro summable ⇒ Abel summable,

and these arrows cannot be reversed.

While this is interesting, its inverse problem is even more so. In 1897 Tauber gave a sufficient condition under which Abel summability implies the usual convergence, now known as the (original) Tauberian theorem; see [326].

Theorem 3.13. Let $\{a_k\}$ be a sequence of complex numbers, and assume that $A(r) = \sum_{k=0}^{\infty} a_kr^k$ converges for every $r$, $0 \le r < 1$. If $\lim_{r\to 1} A(r) = A$ and $a_k = o\bigl(\frac{1}{k}\bigr)$, then $\sum_{k=0}^{\infty} a_k = A$.

Tauber's result led to various other Tauberian theorems, which are all characterised as follows:
• Suppose one knows the behaviour of $f(x)$ as $x \to X$ (for example $A(r) = \sum_{k=0}^{\infty} a_kr^k$ above).
• Further, one has a condition on the growth of $a_k$ as $k \to \infty$ (as in the theorem). This is the Tauberian condition.
• Then one can conclude something about the behaviour of $f(X)$ (as in the convergence of $\sum_{k=0}^{\infty} a_k$ above).

Over the last 100 years Tauberian theorems have often originated as tools in analytic number theory; in fact, every major proof or sharpening of results such as the prime number theorem has been based on stronger Tauberian theorems, beginning with that of Hardy and Littlewood in 1914; see [137]. In 1930 Karamata [193] gave a shorter proof of the Hardy–Littlewood result while also extending it to include functions slowly varying at infinity. A function $\ell(t)$ is said to be a slowly varying function at infinity if and only if $\ell(\lambda t)/\ell(t) \to 1$ as $t \to \infty$ for every positive $\lambda$. The classic example here is the function $\ell(t) = \log^{\beta}(t)$ for $\beta \in \mathbb{R}$.
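The chain convergent ⇒ Cesàro summable ⇒ Abel summable can be illustrated numerically on the classic example $a_j = (-1)^{j+1}$ mentioned above. The sketch below truncates the series at a finite length and indexes the power series from $j = 1$; both are simplifications chosen for this illustration, and the function names are hypothetical.

```python
import numpy as np

def cesaro_mean(terms):
    """Arithmetic mean of the partial sums S_1, ..., S_n of the given terms."""
    partial_sums = np.cumsum(terms)
    return partial_sums.mean()

def abel_mean(terms, r):
    """Truncated power series  sum_j a_j r^j  for 0 <= r < 1 (a_1 is paired with r^1)."""
    j = np.arange(1, len(terms) + 1)
    return np.sum(terms * r ** j)

n = 100_000
grandi = (-1.0) ** (np.arange(1, n + 1) + 1)  # a_j = (-1)^(j+1): 1, -1, 1, -1, ...

print("last partial sums:", np.cumsum(grandi)[-4:])  # oscillate between 1 and 0
print("Cesaro mean:      ", cesaro_mean(grandi))
for r in (0.9, 0.99, 0.999):
    print(f"Abel mean, r = {r}:", abel_mean(grandi, r))
```

The partial sums do not converge, while both means approach $1/2$; for this series the Abel mean is $r/(1+r)$ in closed form.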
Later, Feller in [101, 103] transformed this into an integral version based on the Laplace transform, which is more typical of the results we will need:

Theorem 3.14. Let $w : [0, \infty) \to \mathbb{R}$ be a monotone function and assume its Laplace–Stieltjes transform $\hat w(s) = \int_0^{\infty} e^{-st}\,dw(t)$ exists for all $s$ in the right-hand complex plane. Then for $\rho \ge 0$,
\[ \hat w(s) \sim \frac{C}{s^{\rho}} \quad\text{as } s \to 0 \]
if and only if
\[ w(t) \sim \frac{C}{\Gamma(\rho + 1)}\,t^{\rho} \quad\text{as } t \to \infty. \]
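A simple worked instance may help to fix the statement. The function $w(t) = t - 1 + e^{-t}$ is chosen here purely for illustration; it is nondecreasing on $[0,\infty)$ since $w'(t) = 1 - e^{-t} \ge 0$.

```latex
\[
\hat w(s) = \int_0^\infty e^{-st}\bigl(1 - e^{-t}\bigr)\,dt
          = \frac{1}{s} - \frac{1}{s+1}
          = \frac{1}{s(s+1)} \sim \frac{1}{s} \quad (s \to 0),
\qquad
w(t) \sim t = \frac{t^{1}}{\Gamma(2)} \quad (t \to \infty),
\]
```

so both sides of the equivalence hold with $\rho = 1$ and $C = 1$.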
Note that here $dw$ is actually a measure, which in the case of an absolutely continuous function $w$ can be written as $dw(t) = w'(t)\,dt$. Using Karamata's theory a slight improvement is possible.

Theorem 3.15. Let $w : [0, \infty) \to \mathbb{R}$ be a monotone function and assume its Laplace–Stieltjes transform $\hat w(s) = \int_0^{\infty} e^{-st}\,dw(t)$ exists for all $s$ in the right-hand complex plane. Further, let $\ell(t)$ be a slowly varying function. Then,
\[ \hat w(s) \sim \frac{C}{s^{\rho}}\,\ell(1/s) \quad\text{as } s \to 0 \]
if and only if
\[ w(t) \sim \frac{C}{\Gamma(\rho + 1)}\,\ell(t)\,t^{\rho} \quad\text{as } t \to \infty. \]

An alternative and equivalent formulation of Theorem 3.14 is the following.

Theorem 3.16. Suppose $f \in L^1(\mathbb{R})$ is such that its Fourier transform has no real zeros and for some $h \in L^{\infty}(\mathbb{R})$, $\lim_{x\to\infty}(f * h)(x) = A\int_{\mathbb{R}} f(x)\,dx$. Then
\[ \lim_{x\to\infty}(g * h)(x) = A\int_{\mathbb{R}} g(x)\,dx \quad\text{for any } g \in L^1. \]
We close this section by giving an earlier result due to Watson [339] that will be used in several places in this book.

Theorem 3.17. Assume that $g(t)$ has an infinite number of derivatives at $t = 0$ and $g(0) \ne 0$. Let $\varphi(t) = t^{\lambda}g(t)$, where $\lambda > -1$, and suppose that $|\varphi(t)| \le Ke^{\beta t}$ for constants $K$, $\beta$, and $t > 0$. Then for any $T > 0$, $\int_0^{T} e^{-xt}\varphi(t)\,dt < \infty$ and
\[ \int_0^{T} e^{-xt}\varphi(t)\,dt \sim \sum_{n=0}^{\infty}\frac{g^{(n)}(0)\,\Gamma(\lambda + n + 1)}{n!\,x^{\lambda + n + 1}} \quad\text{as } x \to \infty. \tag{3.8} \]
3.3. The Gamma function

This function will appear in many formulas in this book since it is a fundamental element of the Mittag-Leffler and Wright functions that replace the exponential function in fractional calculus, and it is a key component of fractional powers of operators in general. In this section we briefly discuss some of its main properties and give an indication of how it might be computed.

The usual way to define the Gamma function is by a method due essentially to Daniel Bernoulli, which is valid for complex numbers with a positive real part,
\[ \Gamma(z) = \int_0^{\infty} t^{z-1}e^{-t}\,dt, \qquad \operatorname{Re}(z) > 0. \tag{3.9} \]
The above notation is due to Legendre and the integral formula is due to Euler [72]. It is analytic in the right half-plane $\operatorname{Re}(z) > 0$ and an integration by parts shows the recursion formula
\[ \Gamma(z + 1) = z\,\Gamma(z). \tag{3.10} \]
Since $\Gamma(1) = 1$, (3.10) implies that for a positive integer $n$, $\Gamma(n + 1) = n!$. The Gamma function $\Gamma(z)$ is then defined as the analytic continuation of this integral to a meromorphic function that is holomorphic in the whole complex plane except zero and the negative integers, where the function has simple poles. This follows from (3.9), which shows that $\lim_{z\to 0;\ \operatorname{Re}(z) > 0}\Gamma(z) = \infty$, and the second part then follows by induction from (3.10).

Euler in fact started from the formula valid for a fixed integer $m$,
\[ \lim_{n\to\infty}\frac{n!\,(n + 1)^m}{(n + m)!} = 1, \]
and then required this equation to hold when $m$ is replaced by an arbitrary complex number $z$,
\[ \lim_{n\to\infty}\frac{n!\,(n + 1)^z}{(n + z)!} = 1. \]
Multiplying both sides by $z!$ gives
\[ \begin{aligned} z! &= \lim_{n\to\infty} n!\,\frac{z!}{(n + z)!}\,(n + 1)^z \\ &= \lim_{n\to\infty}(1\cdots n)\,\frac{1}{(1 + z)\cdots(n + z)}\Bigl(\frac{2}{1}\cdot\frac{3}{2}\cdots\frac{n+1}{n}\Bigr)^{z} \\ &= \prod_{n=1}^{\infty}\frac{1}{1 + \frac{z}{n}}\Bigl(1 + \frac{1}{n}\Bigr)^{z}. \end{aligned} \tag{3.11} \]
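Euler's product (3.11) can be checked numerically against a library Gamma function, using $z! = \Gamma(z + 1)$. The sketch below is a direct truncation for real $z > -1$; the number of factors and the function name are illustrative, and the slow $O(1/N)$ convergence means this is a check of the formula rather than a practical algorithm.

```python
import math
import numpy as np

def factorial_euler_product(z, terms=1_000_000):
    """Truncation of (3.11): z! = prod_{n>=1} (1 + 1/n)^z / (1 + z/n), for real z > -1."""
    n = np.arange(1, terms + 1, dtype=float)
    # Summing logarithms keeps the long product numerically stable.
    log_product = np.sum(z * np.log1p(1.0 / n) - np.log1p(z / n))
    return math.exp(log_product)

for z in (0.5, 3.0, 4.5):
    print(z, factorial_euler_product(z), math.gamma(z + 1.0))
```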
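The reflection formula for $\Gamma(z)$ is important.

Theorem 3.18.
\[ \Gamma(z)\Gamma(1 - z) = \frac{\pi}{\sin(\pi z)}. \tag{3.12} \]

Proof. Since $e^{-t} = \lim_{n\to\infty}\bigl(1 - \frac{t}{n}\bigr)^n$, the Gamma function can be represented as
\[ \Gamma(z) = \lim_{n\to\infty}\int_0^{n} t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^{n}dt. \tag{3.13} \]
Integrating by parts $n$ times yields
\[ \begin{aligned} \Gamma(z) &= \lim_{n\to\infty}\int_0^{n} t^{z-1}\Bigl(1 - \frac{t}{n}\Bigr)^{n}dt = \lim_{n\to\infty}\frac{n!}{n^n}\,n^{z+n}\prod_{k=0}^{n}(z + k)^{-1} \\ &= \lim_{n\to\infty}\frac{n^z}{z}\prod_{k=1}^{n}\frac{k}{z + k} = \lim_{n\to\infty}\frac{n^z}{z}\prod_{k=1}^{n}\frac{1}{1 + \frac{z}{k}}. \end{aligned} \tag{3.14} \]
We can use this to evaluate the left-hand side of the reflection formula:
\[ \Gamma(1 - z)\Gamma(z) = -z\,\Gamma(-z)\Gamma(z) = \lim_{n\to\infty}\frac{1}{z}\prod_{k=1}^{n}\frac{1}{1 - \frac{z^2}{k^2}}. \tag{3.15} \]
We now need another well-known infinite product (see (3.4)),
\[ \sin(\pi z) = \pi z\prod_{k=1}^{\infty}\Bigl(1 - \frac{z^2}{k^2}\Bigr). \tag{3.16} \]
It easily follows that
\[ \frac{\pi}{\sin(\pi z)} = \lim_{n\to\infty}\frac{1}{z}\prod_{k=1}^{n}\frac{1}{1 - \frac{z^2}{k^2}}. \tag{3.17} \]
Combining (3.17) with (3.15) gives the desired result.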
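An allied special function is the Beta function $B(m, n)$ defined for integers $m$ and $n$,
\[ B(m, n) = \int_0^{1} u^{m-1}(1 - u)^{n-1}\,du. \tag{3.18} \]
It easily follows from this that
\[ B(m, n) = \frac{\Gamma(m)\Gamma(n)}{\Gamma(m + n)}, \tag{3.19} \]
and so with the change of variables $u = (1 + x)/2$, so that $du = dx/2$, and $x^2 = v$, we get
\[ \begin{aligned} \frac{\Gamma(z)\Gamma(z)}{\Gamma(2z)} &= \frac{1}{2}\int_{-1}^{1}\Bigl(\frac{1 + x}{2}\Bigr)^{z-1}\Bigl(\frac{1 - x}{2}\Bigr)^{z-1}dx = 2^{2-2z}\int_0^{1}(1 - x^2)^{z-1}\,dx \\ &= 2^{1-2z}\int_0^{1} v^{-1/2}(1 - v)^{z-1}\,dv. \end{aligned} \tag{3.20} \]
Now use the Beta function identity again to obtain
\[ \frac{\Gamma(z)\Gamma(z)}{\Gamma(2z)} = 2^{1-2z}B\bigl(\tfrac{1}{2}, z\bigr) = 2^{1-2z}\,\frac{\Gamma(\frac{1}{2})\Gamma(z)}{\Gamma(z + \frac{1}{2})}. \tag{3.21} \]
Solving for $\Gamma(2z)$ and using $\Gamma(\frac{1}{2}) = \sqrt{\pi}$ then gives Legendre's duplication formula
\[ \Gamma(2z) = \frac{2^{2z-1}}{\sqrt{\pi}}\,\Gamma(z)\,\Gamma\bigl(z + \tfrac{1}{2}\bigr). \tag{3.22} \]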
The Gamma function has no zeros, so the reciprocal Gamma function $1/\Gamma(z)$ is an entire function. This important observation is used in computation: it is often easier to compute $1/\Gamma(z)$ than $\Gamma(z)$ itself, and a simple reciprocal is all that is then needed. See Figure 3.2 for a plot of the Gamma function and its reciprocal.

[Figure 3.2: graphs of $\Gamma(x)$ and $1/\Gamma(x)$ against $\operatorname{Re}(z)$.]
Figure 3.2. Plot of the $\Gamma$ and $1/\Gamma$ functions for real $z$
Both the reflection and the Legendre duplication formula have great practical utility. For example, one only needs to approximate $\Gamma(z)$ for $\mathrm{Re}(z) \ge 1/2$, and use (3.12) for $\mathrm{Re}(z) < 1/2$. It also gives directly that $\Gamma(1/2) = \sqrt{\pi}$.
3. Special Functions and Tools
It is possible to continue the Gamma function analytically into the left half-plane. This is often done by a representation of the reciprocal Gamma function as an infinite product
\[
\frac{1}{\Gamma(z)} = \lim_{n\to\infty} \frac{n^{-z}}{n!}\, z(z+1)\cdots(z+n),
\]
valid for all $z$. Hence, $1/\Gamma(z)$ is an entire function of $z$ with zeros at $z = 0$ and at the negative integers. Thus $\Gamma(z)$ only has poles at $z = 0, -1, -2, \ldots$.

An alternative integral representation for the reciprocal Gamma function is due to Hankel [133] and is useful for deriving integral representations of other special functions that depend on $\Gamma(z)$, such as the Mittag-Leffler and Wright functions. The representation is often derived using the Laplace transform. Substituting $t = su$ in (3.9) yields
\[
\frac{\Gamma(z)}{s^{z}} = \int_0^\infty u^{z-1} e^{-su}\, du,
\]
which can be regarded as the Laplace transform of $u^{z-1}$ for fixed complex $z \in \mathbb{C}$. In turn, $u^{z-1}$ can be interpreted as an inverse Laplace transform
\[
u^{z-1} = \mathcal{L}^{-1}\Bigl[\frac{\Gamma(z)}{s^{z}}\Bigr] = \frac{1}{2\pi i}\int_C e^{su}\,\frac{\Gamma(z)}{s^{z}}\, ds.
\]
The path $C$ is any deformed Bromwich contour such that $C$ winds around the negative real axis in the anticlockwise sense. Now substituting $\zeta = su$ yields
\[
u^{z-1} = \frac{1}{2\pi i}\int_C e^{\zeta}\,\frac{\Gamma(z)\,u^{z}}{\zeta^{z}}\,\frac{d\zeta}{u},
\]
and hence
\[
\frac{1}{\Gamma(z)} = \frac{1}{2\pi i}\int_C \zeta^{-z} e^{\zeta}\, d\zeta. \tag{3.23}
\]
The integrand $\zeta^{-z}$ has a branch cut on the negative real axis but is analytic elsewhere so that (3.23) is independent of the contour $C$ under the usual restrictions. The contour $C$ can be deformed to facilitate the numerical evaluation of the integral, which substantiates the claim that the reciprocal Gamma function is easier to compute than the Gamma function itself [312].

Next we perform the substitution $\zeta = \xi^{1/\alpha}$, $\alpha \in (0, 2)$, and for $1 \le \alpha < 2$, consider only the contours $\Gamma_{\epsilon,\phi}$ for which $\pi/2 < \phi < \pi/\alpha$. Since $\epsilon$ is arbitrary, we obtain the equivalent representation
\[
\frac{1}{\Gamma(z)} = \frac{1}{2\pi\alpha i}\int_{\Gamma_{\epsilon,\mu}} e^{\xi^{1/\alpha}}\, \xi^{\frac{1-z-\alpha}{\alpha}}\, d\xi, \tag{3.24}
\]
where $\alpha \in (0, 2)$, $\pi\alpha/2 < \mu < \min(\pi, \pi\alpha)$.
Let $\varphi(\theta)$ be an analytic function that maps the real line $\mathbb{R}$ onto the contour $C$. The identity (3.23) can be written as
\[
\frac{1}{\Gamma(z)} = I = \frac{1}{2\pi i}\int_{-\infty}^{\infty} [\varphi(\theta)]^{-z}\, e^{\varphi(\theta)}\, \varphi'(\theta)\, d\theta.
\]
Due to the choice of $C$ the term $e^{\varphi(\theta)}$ in the integrand decreases exponentially as $|\theta| \to \infty$ so that only an exponentially small error is obtained by truncating $\mathbb{R}$ to a finite interval. In [312] this interval was chosen as $[-\pi, \pi]$ and discretized by $N$ equally spaced points $\theta_k$. Then a trapezoid approximation to (3.23) becomes
\[
I_N = \frac{1}{iN}\sum_{k=1}^{N} s_k^{-z}\, e^{s_k}\, \omega_k, \qquad s_k = \varphi(\theta_k), \qquad \omega_k = \varphi'(\theta_k).
\]
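The trapezoid idea above can be tried out in a few lines. The following is only a minimal sketch under stated assumptions: it uses a parabolic contour $\varphi(\theta) = \mu(1 + i\theta)^2$ on a truncated real line, which is not necessarily the contour or the interval used in [312], and the function name `reciprocal_gamma_contour` as well as the parameter choices $\mu = \pi n/12$, $h = 3/n$ are illustrative assumptions rather than part of the text.

```python
import cmath
import math

def reciprocal_gamma_contour(z, n=20):
    """Approximate 1/Gamma(z) by the trapezoid rule applied to (3.23).

    Assumption: parabolic contour phi(theta) = mu * (1 + i*theta)**2, which
    winds anticlockwise around the negative real axis; mu and h below are
    illustrative choices, not taken from the text.
    """
    mu = math.pi * n / 12.0      # scale of the parabola (assumed)
    h = 3.0 / n                  # trapezoid step in theta (assumed)
    total = 0.0 + 0.0j
    for k in range(-n, n + 1):   # truncation to theta in [-3, 3]
        theta = k * h
        s = mu * (1.0 + 1j * theta) ** 2     # s_k = phi(theta_k)
        w = 2j * mu * (1.0 + 1j * theta)     # omega_k = phi'(theta_k)
        # s**(-z) * exp(s) with the principal branch of the logarithm;
        # the parabola never touches the cut along the negative real axis.
        total += cmath.exp(s - z * cmath.log(s)) * w
    return h * total / (2j * math.pi)

if __name__ == "__main__":
    for z in (0.5, 1.0, 3.0, -1.0):
        print(z, reciprocal_gamma_contour(z))
```

Since $1/\Gamma$ is entire, the same routine can be called at the poles of $\Gamma$ (for example $z = -1$), where it returns a value close to zero.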
Stirling's formula for $n!$ holds for real $x$; for large argument it yields
\[
\Gamma(x) = \sqrt{\frac{2\pi}{x}}\,\Bigl(\frac{x}{e}\Bigr)^{x}\bigl(1 + O(x^{-1})\bigr) \quad \text{as } x \to \infty, \tag{3.25}
\]
showing the expected exponential increase.
3.4. The Mittag-Leffler function

The exponential function $e^z$ plays an extremely important role in the theory of integer-order differential equations: it is the basic building block for Green's functions of many differential equations. For fractional-order differential equations, its crucial role is subsumed by the Mittag-Leffler function. The two-parameter Mittag-Leffler function $E_{\alpha,\beta}(z)$ is defined by
\[
E_{\alpha,\beta}(z) = \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + \beta)}, \qquad z \in \mathbb{C}, \tag{3.26}
\]
for $\alpha > 0$ and $\beta \in \mathbb{R}$. The function $E_{\alpha,1}(z)$, often denoted by $E_\alpha(z)$, is named after the Swedish mathematician Mittag-Leffler [250], who introduced it in 1903. The two-parameter version was first studied by Wiman in 1905, but the majority of the analysis on this case did not take place for another 50 years in the (independent) work of Agarwal and Humbert [6, 152, 153] and also by Djrbashian [79, 86]. This function plays a central role in the study of fractional diffusion. In this chapter, we discuss its fundamental analytic properties and efficient computation. The more recent monograph by Gorenflo et al. [114] provides a comprehensive treatment of the Mittag-Leffler function.
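For moderate $|z|$ the definition (3.26) can be evaluated by summing the series directly. The sketch below is only an illustration: the names `reciprocal_gamma` and `mittag_leffler_series` are hypothetical, the cut-off at argument 170 is an implementation assumption caused by the overflow of `math.gamma`, and the routine is not suitable for large $|z|$, where cancellation destroys accuracy.

```python
import math

def reciprocal_gamma(x):
    """1/Gamma(x) for real x; equals 0 at the poles x = 0, -1, -2, ..."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def mittag_leffler_series(alpha, beta, z, max_terms=2000):
    """Truncated power series (3.26) for E_{alpha,beta}(z).

    Assumption: |z| is moderate. The sum stops once alpha*k + beta exceeds
    170, because math.gamma overflows slightly above that value.
    """
    total = 0.0
    power = 1.0                      # holds z**k
    for k in range(max_terms):
        arg = alpha * k + beta
        if arg > 170.0:
            break
        total += power * reciprocal_gamma(arg)
        power *= z
    return total

if __name__ == "__main__":
    # Sanity checks against elementary special cases
    print(mittag_leffler_series(1.0, 1.0, 1.5), math.exp(1.5))
    print(mittag_leffler_series(2.0, 1.0, -4.0), math.cos(2.0))
```

The two printed pairs compare $E_{1,1}(z)$ with $e^z$ and $E_{2,1}(-4)$ with $\cos 2$, which follow from the series by inspection.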
3.4.1. Analytical properties. We begin with elementary properties of the Mittag-Leffler function $E_{\alpha,\beta}(z)$. A few chosen values of $\alpha$ and $\beta$ give well-known special cases
\[
E_{1,1}(z) = e^{z}, \qquad E_{1,2}(z) = \frac{e^{z}-1}{z}, \qquad E_{2,1}(z) = \cosh(\sqrt{z}), \qquad E_{2,2}(z) = \frac{\sinh\sqrt{z}}{\sqrt{z}}, \tag{3.27}
\]
where $\sqrt{z}$ means the principal value of the square root of $z$ in the complex plane cut along the negative real semiaxis $\mathbb{R}^-$. Thus $E_{\alpha,\beta}(z)$ generalises the exponential function $e^z$ and recovers the hyperbolic cosine and the hyperbolic sine functions $\cosh z := (e^{z} + e^{-z})/2$ and $\sinh z := (e^{z} - e^{-z})/2$. By setting $z \to iz$, we then also recover the trigonometric variants $\cos(z)$ and $\sin(z)$ as special cases.

Proposition 3.1. For any $\alpha > 0$ and $\beta \in \mathbb{R}$, $E_{\alpha,\beta}(z)$ is an entire function of order $1/\alpha$ and type 1.

Proof. For the first part it suffices to show that the series (3.26) converges in the whole complex plane $\mathbb{C}$. Denote by $c_k = 1/\Gamma(k\alpha + \beta)$ the coefficients in the series (3.26). Using Stirling's formula for the Gamma function (cf. (3.25)), we have
\[
\Gamma(x) = \sqrt{2\pi}\, x^{x-\frac12} e^{-x}\bigl(1 + O(\tfrac1x)\bigr) \quad \text{as } x \to \infty.
\]
Consequently,
\[
\frac{\Gamma((k+1)\alpha + \beta)}{\Gamma(k\alpha + \beta)} \ge (k\alpha + \beta + \alpha)^{\alpha} e^{-\alpha}\bigl(1 - O(k^{-1})\bigr) \quad \text{as } k \to \infty.
\]
Thus, the Cauchy formula for the radius of convergence
\[
R = \limsup_{k\to\infty} \Bigl|\frac{c_k}{c_{k+1}}\Bigr| = \limsup_{k\to\infty} \frac{\Gamma((k+1)\alpha + \beta)}{\Gamma(k\alpha + \beta)} = \infty
\]
shows that series (3.26) converges in the whole complex plane. For the second part we again use the Stirling formula and the result easily follows:
\[
\frac{1}{\rho} = \lim_{k\to\infty} \frac{\log|\Gamma(\alpha k + \beta)|}{k\log k}
= \lim_{k\to\infty} \frac{\log\sqrt{2\pi} - \alpha k - \beta + (\beta + \alpha k - \tfrac12)\log(\alpha k + \beta)}{k \log k} = \alpha.
\]
The Mittag-Leffler function $E_{\alpha,1}(-x)$ for $x > 0$ is also completely monotone for $0 \le \alpha \le 1$: it retains this important property, which holds at the two boundary points $E_{0,1}(-x) = 1/(1 + x)$ and $E_{1,1}(-x) = e^{-x}$, for $0 < \alpha < 1$ as well. We will state and prove this result after a few further preliminaries.
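In general, the Laplace transform of the Mittag-Leffler function $E_{\alpha,\beta}(x)$ is the Wright function $W_{\rho,\mu}(z)$ (defined in Section 3.5). However, the Laplace transform of the function $x^{\beta-1}E_{\alpha,\beta}(\lambda x^{\alpha})$ does take a simple and very useful form.

Lemma 3.1. Let $\alpha > 0$ and $\beta \in \mathbb{R}$. Then
\[
\mathcal{L}\bigl[x^{\beta-1}E_{\alpha,\beta}(\lambda x^{\alpha})\bigr](z) = \frac{z^{\alpha-\beta}}{z^{\alpha} - \lambda}, \qquad \mathrm{Re}(z) > |\lambda|^{\frac{1}{\alpha}}. \tag{3.28}
\]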
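The formula (3.28) can be made plausible by transforming the series (3.26) term by term. The following is only a formal sketch: it assumes $\beta > 0$, so that every term is integrable at the origin, uses the standard transform $\mathcal{L}[x^{p-1}](z) = \Gamma(p)\,z^{-p}$ for $p > 0$, and takes for granted that summation and integration may be interchanged when $|\lambda z^{-\alpha}| < 1$.

```latex
\begin{align*}
\mathcal{L}\bigl[x^{\beta-1}E_{\alpha,\beta}(\lambda x^{\alpha})\bigr](z)
  &= \sum_{k=0}^{\infty} \frac{\lambda^{k}}{\Gamma(\alpha k+\beta)}\,
     \mathcal{L}\bigl[x^{\alpha k+\beta-1}\bigr](z)
   = \sum_{k=0}^{\infty} \frac{\lambda^{k}}{\Gamma(\alpha k+\beta)}\,
     \frac{\Gamma(\alpha k+\beta)}{z^{\alpha k+\beta}} \\
  &= z^{-\beta}\sum_{k=0}^{\infty}\bigl(\lambda z^{-\alpha}\bigr)^{k}
   = \frac{z^{-\beta}}{1-\lambda z^{-\alpha}}
   = \frac{z^{\alpha-\beta}}{z^{\alpha}-\lambda}.
\end{align*}
```

The geometric series in the second line is where a restriction on $z$ of the kind stated in Lemma 3.1 enters.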
3.4.2. Recurrence and integration as well as differentiation formulae. As with most special functions depending on a variable together with multiple parameters, there are many relations between members of the class and different values of the parameters. These formulae are important for a variety of reasons: they not only give rise to analytic results but they often allow computation of the function at a given point based on known values elsewhere or for other values of the parameters. The reference here for most classical special functions is [4]. See also the book by Sneddon [321], which has many examples of importance in the physical sciences.

If these special functions are defined in terms of power series, then recurrence relations are typically formed by matching function and derivative values with the shift in the index of summation. They are frequently easily proved by direct reference to the series representation, and this is the case for the Mittag-Leffler function. In the relations listed below we assume that $\mathrm{Re}(\alpha) > 0$, $\mathrm{Re}(\beta) > 0$, and $m \in \mathbb{N}$:
\[
zE_{\alpha,\alpha+\beta}(z) = E_{\alpha,\beta}(z) - \frac{1}{\Gamma(\beta)}, \qquad
z^{m}E_{\alpha,m\alpha+\beta}(z) = E_{\alpha,\beta}(z) - \sum_{k=0}^{m-1}\frac{z^{k}}{\Gamma(\beta + k\alpha)}. \tag{3.29}
\]
The verification of these identities is straightforward. For example, the first of these follows from
\[
zE_{\alpha,\alpha+\beta}(z) = \sum_{k=0}^{\infty}\frac{z^{k+1}}{\Gamma(\alpha k+\alpha+\beta)} = \sum_{j=1}^{\infty}\frac{z^{j}}{\Gamma(\alpha j+\beta)}
= \sum_{j=0}^{\infty}\frac{z^{j}}{\Gamma(\alpha j+\beta)} - \frac{1}{\Gamma(\beta)} = E_{\alpha,\beta}(z) - \frac{1}{\Gamma(\beta)},
\]
and the second follows by repeating this device in a simple induction argument on $m$.

By direct interchange of differentiation and summation in the power series representation of $E_{\alpha,\beta}$, we have the useful formula
\[
\frac{d^{m}}{dz^{m}}\bigl[z^{\beta-1}E_{\alpha,\beta}(z^{\alpha})\bigr] = z^{\beta-m-1}E_{\alpha,\beta-m}(z^{\alpha}), \qquad \beta > m, \tag{3.30}
\]
and the important special case
\[
\frac{d}{dz}E_{\alpha,1}(-z^{\alpha}) = -z^{\alpha-1}E_{\alpha,\alpha}(-z^{\alpha}). \tag{3.31}
\]
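The recurrence relations (3.29) and the derivative formula (3.31) are easy to check numerically from the series. The sketch below is purely illustrative: the helper `ml_series`, the sample values of $\alpha$, $\beta$, $z$, and the finite-difference step are all assumptions chosen for the demonstration, and the truncated series is only reliable for moderate arguments.

```python
import math

def ml_series(alpha, beta, z, max_terms=2000):
    """Truncated power series (3.26); intended for moderate |z| only."""
    total, power = 0.0, 1.0
    for k in range(max_terms):
        arg = alpha * k + beta
        if arg > 170.0:                  # math.gamma overflows beyond ~171
            break
        # 1/Gamma vanishes at the poles 0, -1, -2, ... of Gamma
        if not (arg <= 0 and arg == math.floor(arg)):
            total += power / math.gamma(arg)
        power *= z
    return total

alpha, beta, z = 0.7, 1.3, -2.5          # illustrative sample values

# First relation in (3.29)
lhs = z * ml_series(alpha, alpha + beta, z)
rhs = ml_series(alpha, beta, z) - 1.0 / math.gamma(beta)

# Second relation in (3.29) with m = 3
m = 3
lhs_m = z**m * ml_series(alpha, m * alpha + beta, z)
rhs_m = ml_series(alpha, beta, z) - sum(
    z**k / math.gamma(beta + k * alpha) for k in range(m))

# (3.31): compare a central difference with the closed-form derivative
def relaxation(x):
    return ml_series(alpha, 1.0, -x**alpha)

x, h = 1.2, 1e-5
finite_difference = (relaxation(x + h) - relaxation(x - h)) / (2.0 * h)
closed_form = -x**(alpha - 1.0) * ml_series(alpha, alpha, -x**alpha)

if __name__ == "__main__":
    print(lhs, rhs)
    print(lhs_m, rhs_m)
    print(finite_difference, closed_form)
```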
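The Abel fractional integrals and Riemann–Liouville fractional derivatives of the Mittag-Leffler function are still Mittag-Leffler functions, for $\gamma > 0$; in fact, as we will see in Example 4.2,
\[
{}_0I_x^{\gamma}\bigl[x^{\beta-1}E_{\alpha,\beta}(\lambda x^{\alpha})\bigr] = x^{\beta+\gamma-1}E_{\alpha,\beta+\gamma}(\lambda x^{\alpha}), \tag{3.32}
\]
\[
{}^{RL}_{\;\;0}D_x^{\gamma}\bigl[x^{\beta-1}E_{\alpha,\beta}(\lambda x^{\alpha})\bigr] = x^{\beta-\gamma-1}E_{\alpha,\beta-\gamma}(\lambda x^{\alpha}). \tag{3.33}
\]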
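As noted earlier, the function $E_\alpha(-\lambda x^{\alpha})$, $\lambda \in \mathbb{R}$, often appears. Recall that we use the notation $E_\alpha = E_{\alpha,1}$. It can be verified directly that for $\lambda > 0$, $\alpha > 0$, and $m \in \mathbb{N}$, we have
\[
\frac{d^{m}}{dx^{m}}E_\alpha(-\lambda x^{\alpha}) = -\lambda x^{\alpha-m}E_{\alpha,\alpha-m+1}(-\lambda x^{\alpha}). \tag{3.34}
\]
Due to Lemma 3.1, the Laplace transform of the function $E_\alpha(-\lambda x^{\alpha})$ is given by
\[
\mathcal{L}\bigl[E_\alpha(-\lambda x^{\alpha})\bigr](z) = \frac{z^{\alpha-1}}{\lambda + z^{\alpha}}, \tag{3.35}
\]
and for $x^{\alpha-1}E_{\alpha,\alpha}(-\lambda x^{\alpha})$, which by (3.34) with $m = 1$ is its derivative up to the factor $-\lambda$, the Laplace transform is given by
\[
\mathcal{L}\bigl[x^{\alpha-1}E_{\alpha,\alpha}(-\lambda x^{\alpha})\bigr](z) = \frac{1}{\lambda + z^{\alpha}}.
\]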
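Let $0 < \alpha < 1$ and consider the following initial value problem for the fractional ordinary differential equation (commonly known as the fractional relaxation equation): Find $u$ satisfying
\[
{}^{DC}_{\;\;0}D_x^{\alpha}u(x) + \lambda u(x) = 0 \quad \text{for } x > 0
\]
with $u(0) = 1$. We claim the solution $u(x)$ is given by
\[
u(x) = E_\alpha(-\lambda x^{\alpha}) = \sum_{k=0}^{\infty}\frac{(-\lambda x^{\alpha})^{k}}{\Gamma(k\alpha + 1)}.
\]
In fact, by taking Laplace transforms and using $\mathcal{L}\bigl[{}^{DC}_{\;\;0}D_x^{\alpha}u\bigr](z) = z^{\alpha}\hat u(z) - z^{\alpha-1}u(0)$ for $0 < \alpha < 1$, we deduce
\[
z^{\alpha}\hat u(z) - z^{\alpha-1} + \lambda\hat u(z) = 0,
\]
and thus
\[
\hat u(z) = \frac{z^{\alpha-1}}{z^{\alpha} + \lambda} = \frac{1}{z + \lambda z^{1-\alpha}} = \mathcal{L}\bigl[E_\alpha(-\lambda x^{\alpha})\bigr](z),
\]
cf. (3.35).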
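The claim can also be checked without Laplace transforms by differentiating the series term by term. The sketch below is formal and rests on an assumption not derived in this section, namely the standard power rule for the Djrbashian–Caputo derivative of order $\alpha \in (0,1)$: constants are annihilated, and $x^{p}$ with $p > 0$ is mapped to $\frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)}x^{p-\alpha}$.

```latex
\begin{align*}
{}^{DC}_{\;\;0}D_x^{\alpha}\,E_\alpha(-\lambda x^{\alpha})
  &= \sum_{k=1}^{\infty}\frac{(-\lambda)^{k}}{\Gamma(\alpha k+1)}\,
     \frac{\Gamma(\alpha k+1)}{\Gamma(\alpha k+1-\alpha)}\,x^{\alpha k-\alpha}
     && \text{(the $k=0$ term is constant)}\\
  &= \sum_{j=0}^{\infty}\frac{(-\lambda)^{j+1}x^{\alpha j}}{\Gamma(\alpha j+1)}
     && \text{(index shift $j=k-1$)}\\
  &= -\lambda\,E_\alpha(-\lambda x^{\alpha}),
\end{align*}
```

and the initial condition holds because only the $k = 0$ term of the series survives at $x = 0$.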
3.4.3. Integral representation and complete monotonicity. We require a contour $\gamma_{\epsilon,\phi}$ similar to that of the standard Hankel type. It consists of two rays together with a circular arc joining them. For $\epsilon > 0$ and fixed $\phi$, $\frac{\pi}{2} < \phi < \pi$, we have the rays $(r, \pm\phi)$ for $\epsilon < r < \infty$ together with the circular arc $(\epsilon, \theta)$ for $-\phi < \theta < \phi$. This divides the complex plane into left and right regions $G^-(\epsilon, \phi)$ and $G^+(\epsilon, \phi)$. This contour is required for the reciprocal of the Gamma function as this occurs in the series representation for the Mittag-Leffler function which will be used directly.

[Figure: the contour $\gamma_{\epsilon,\phi}$ in the complex $z$-plane, with the region $G^-(\epsilon,\phi)$ to its left and $G^+(\epsilon,\phi)$ to its right.]
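We now have the following integral representation due originally to Mittag-Leffler [251]. Again we abbreviate $E_\alpha = E_{\alpha,1}$.

Theorem 3.19. Let $0 < \alpha < 2$. Then for any $\mu$ such that $\frac{\alpha\pi}{2} < \mu < \min\{\pi, \alpha\pi\}$ and for $z \in G^-(\epsilon, \mu)$, we have
\[
E_\alpha(z) = \frac{1}{2\pi\alpha i}\int_{\gamma_{\epsilon,\mu}} \frac{e^{\zeta^{1/\alpha}}}{\zeta - z}\, d\zeta, \tag{3.36}
\]
and for $z \in G^+(\epsilon, \mu)$
\[
E_\alpha(z) = \frac{1}{\alpha}e^{z^{1/\alpha}} + \frac{1}{2\pi\alpha i}\int_{\gamma_{\epsilon,\mu}} \frac{e^{\zeta^{1/\alpha}}}{\zeta - z}\, d\zeta. \tag{3.37}
\]

Proof. […] $\tilde\epsilon > |z|$, we have $z \in G^-(\tilde\epsilon, \mu)$, and by (3.36)
\[
E_\alpha(z) = \frac{1}{2\alpha\pi i}\int_{\gamma_{\tilde\epsilon,\mu}} \frac{e^{\zeta^{1/\alpha}}}{\zeta - z}\, d\zeta.
\]
If $z$ is chosen such that $\epsilon < |z| < \tilde\epsilon$ and $|\arg(z)| < \mu$, then the Cauchy representation theorem (Theorem 3.1) gives
\[
\frac{1}{2\alpha\pi i}\int_{\gamma_{\tilde\epsilon,\mu} - \gamma_{\epsilon,\mu}} \frac{e^{\zeta^{1/\alpha}}}{\zeta - z}\, d\zeta = \frac{1}{\alpha}e^{z^{1/\alpha}}.
\]
From the preceding two identities, the representation (3.37) follows directly.

Remark 3.1. With some more work, this representation theorem could be extended to the general case for all $\beta \in \mathbb{C}$, and we would obtain
\[
E_{\alpha,\beta}(z) =
\begin{cases}
\dfrac{1}{2\alpha\pi i}\displaystyle\int_{\gamma_{\epsilon,\mu}} \dfrac{e^{\zeta^{1/\alpha}}\zeta^{\frac{1-\beta}{\alpha}}}{\zeta - z}\, d\zeta & \text{for } z \in G^-(\epsilon, \mu),\\[2ex]
\dfrac{1}{\alpha}z^{\frac{1-\beta}{\alpha}}e^{z^{1/\alpha}} + \dfrac{1}{2\alpha\pi i}\displaystyle\int_{\gamma_{\epsilon,\mu}} \dfrac{e^{\zeta^{1/\alpha}}\zeta^{\frac{1-\beta}{\alpha}}}{\zeta - z}\, d\zeta & \text{for } z \in G^+(\epsilon, \mu).
\end{cases} \tag{3.39}
\]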
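However, we chose the simpler version in part for historical reasons. The above allows us to prove the important result on complete monotonicity for the case of the original Mittag-Leffler function $E_\alpha(z)$. This was originally shown by Pollard [275], who used the representation in Theorem 3.19 directly, and it is this proof that we will present. Later, it was shown by Miller and Samko [249] that the extension to include the two-parameter Mittag-Leffler function with arbitrary $\beta$ follows easily. The next result gives the complete monotonicity of the function $E_\alpha(-x)$ [275].

Theorem 3.20. For $\alpha \in [0, 1]$, $E_\alpha(-x)$ is completely monotone, and
\[
E_\alpha(-x) = \int_0^\infty e^{-ux}F_\alpha^*(u)\, du
\quad\text{with}\quad
F_\alpha^*(u) = \frac{1}{\alpha\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k!}\sin(\pi\alpha k)\,\Gamma(\alpha k + 1)\,u^{k-1}.
\]

Proof. Since $E_{0,1}(-x) = \frac{1}{1+x}$ and $E_{1,1}(-x) = e^{-x}$, these cases hold trivially, and it suffices to show the intermediate case $0 < \alpha < 1$. By the integral representations (3.36), (3.37)
\[
E_\alpha(-x) = \frac{1}{2\alpha\pi i}\int_{\Gamma_{\epsilon,\mu}} \frac{e^{\zeta^{1/\alpha}}}{\zeta + x}\, d\zeta \tag{3.40}
\]
with $\mu \in (\alpha\pi/2, \alpha\pi)$ and arbitrary but fixed $\epsilon > 0$. Using the identity $(x + \zeta)^{-1} = \int_0^\infty e^{-(\zeta+x)u}\, du$ in (3.40) and the absolute convergence of the double integral, by Fubini's theorem we obtain
\[
E_\alpha(-x) = \int_0^\infty e^{-xu}F_\alpha^*(u)\, du
\quad\text{with}\quad
F_\alpha^*(u) = \frac{1}{2\pi i\alpha}\int_{\Gamma_{\epsilon,\mu}} e^{\zeta^{1/\alpha}}e^{-\zeta u}\, d\zeta.
\]
Now by Bernstein's theorem (Theorem 3.5) it remains to show $F_\alpha^*(u) \ge 0$ for all $u \ge 0$. Integration by parts yields
\[
F_\alpha^*(u) = \frac{1}{2\alpha\pi i u}\int_{\Gamma_{\epsilon,\mu}} e^{-\zeta u}\Bigl(\frac{1}{\alpha}\zeta^{\frac{1}{\alpha}-1}\Bigr)e^{\zeta^{1/\alpha}}\, d\zeta.
\]
With the change of variables $\zeta u = z^{\alpha}$, then
\[
F_\alpha^*(u) = \frac{u^{-1-\frac{1}{\alpha}}}{\alpha}\,\frac{1}{2\pi i}\int_{\Gamma'_{\epsilon,\mu}} e^{-z^{\alpha}}e^{zu^{-1/\alpha}}\, dz,
\]
where $\Gamma'_{\epsilon,\mu}$ is the image of $\Gamma_{\epsilon,\mu}$ under the mapping $\zeta \mapsto z$. Now consider the function
\[
\phi_\alpha(\zeta) = \frac{1}{2\pi i}\int_{\Gamma'_{\epsilon,\mu}} e^{-z^{\alpha}}e^{z\zeta}\, dz.
\]
It is the inverse Laplace transform of
\[
e^{-z^{\alpha}} = \int_0^\infty e^{-z\zeta}\phi_\alpha(\zeta)\, d\zeta,
\]
which is completely monotone [275]. Hence $F_\alpha^*(u) = \frac{1}{\alpha}u^{-1-\frac{1}{\alpha}}\phi_\alpha(u^{-\frac{1}{\alpha}}) \ge 0$, and $E_\alpha(-x)$ is completely monotone. From the explicit series for $\phi_\alpha(\zeta)$ [274], we find also that
\[
F_\alpha^*(u) = \frac{1}{\alpha\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k!}\sin(\pi\alpha k)\,\Gamma(\alpha k + 1)\,u^{k-1}.
\]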
The next result is a consequence of Theorem 3.20 [249, 313], and it shows that the function Eα,β (−x) is completely monotone for α ∈ (0, 1) and β ≥ α. This result was also contained in Djrbashian [81, Theorem 1.3-8] and also in the earlier (Russian) 1966 version of the book. The result does not hold for the case 0 < β < α ≤ 1; see Figure 3.4. Theorem 3.21. For any α ∈ (0, 1) and β ≥ α, Eα,β (−x) is completely monotone on the positive real axis.
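Proof. First we note from (3.29) that $E_{\alpha,\alpha}(-x) = -\alpha\frac{d}{dx}E_\alpha(-x)$, and we now claim that for $\beta > \alpha > 0$ we have
\[
E_{\alpha,\beta}(-x) = \frac{1}{\alpha\Gamma(\beta - \alpha)}\int_0^1 (1 - t^{\frac{1}{\alpha}})^{\beta-\alpha-1}E_{\alpha,\alpha}(-tx)\, dt. \tag{3.41}
\]
Indeed, inserting the series representation of $E_{\alpha,\alpha}$ into the right-hand side and changing the order of integration and summation, we obtain
\[
\frac{1}{\alpha\Gamma(\beta - \alpha)}\sum_{k=0}^{\infty}\frac{(-x)^{k}}{\Gamma(\alpha k + \alpha)}\int_0^1 t^{k}(1 - t^{\frac{1}{\alpha}})^{\beta-\alpha-1}\, dt = E_{\alpha,\beta}(-x),
\]
thus (3.41), from which the assertion follows by Theorem 3.20.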
From Theorem 3.20 it follows that for $x \ge 0$, the function $E_{\alpha,\alpha}(-x)$ is c.m. The composite function $E_{\alpha,1}(-x^{\alpha})$ defined for $x \ge 0$ will arise frequently in applications. It is also c.m.:

Theorem 3.22. For $\alpha \in [0, 1]$, $x \ge 0$, $E_{\alpha,1}(-x^{\alpha})$ is completely monotone.

Proof. This follows directly from the complete monotonicity of $E_{\alpha,1}(-x)$ and Theorem 3.7, as well as the obvious fact that $g(x) := x^{\alpha}$ is a positive function whose derivative is c.m.

Figure 3.4 shows plots of $E_{\alpha,\beta}(-x)$ for the case of $\alpha = \frac12$ and $\beta$ values from 0.1 to 1.0 in increments of 0.1. This demonstrates the monotonicity of $E_{\alpha,\beta}(-x)$ for fixed $\alpha$ and increasing $\beta$ in the range $0 < \beta \le 1$. The rightmost figure is an enlargement of part of the graph to show more detailed behaviour. Note that for those $\beta$ values less than 0.5 (that is $\beta < \alpha$), $E_{\alpha,\beta}(-x)$ is no longer positive and its derivative is no longer negative for all $x$. Thus the complete monotonicity shown for $\beta \ge \alpha$ is in fact sharp.

3.4.4. Asymptotic behaviour. Some of the most interesting and important properties of the function $E_{\alpha,\beta}(z)$ are associated with its asymptotic behaviour as $z \to \infty$ in various sectors of the complex plane $\mathbb{C}$. It was first derived by Djrbashian [79] and subsequently refined by many since then, although the original results will suffice for our purposes.
Figure 3.3. Plots of the Mittag-Leffler function $E_{\alpha,1}(x)$ for various α values (α = 1/4, 1/2, 3/4, 1 and α = 5/4, 3/2, 7/4, 2)

Figure 3.4. Plots of $E_{\frac12,\beta}(-x)$ for β = 0.1, 0.2, . . . , 1.0

First for our purposes we consider the most useful case, namely $E_{\alpha,\beta}(-x)$ for $x \ge 0$, and $1 \ge \beta \ge \alpha$. For reasons that will be immediately evident, we actually consider the monotonically decreasing function $u(t) = t^{\beta-1}E_{\alpha,\beta}(-t^{\alpha})$. Its Laplace transform from (3.28) is $\hat u(s) = \frac{s^{\alpha-\beta}}{1 + s^{\alpha}}$. Then taking $\rho = \beta - \alpha \ge 0$ in one of the Tauberian theorems from Section 3.2, Theorem 3.14, we have $\hat u(s) \sim s^{-\rho}$ as $s \to 0$, and so it follows that
\[
t^{\beta-1}E_{\alpha,\beta}(-t^{\alpha}) \sim \frac{1}{\Gamma(\rho)}t^{\rho-1} \ \text{ as } t \to \infty, \quad\text{thus}\quad
E_{\alpha,\beta}(-t^{\alpha}) \sim \frac{1}{\Gamma(\beta - \alpha)}t^{-\alpha} \ \text{ as } t \to \infty. \tag{3.42}
\]
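Setting $x = t^{\alpha}$, we have an important estimate for the Mittag-Leffler function itself,
\[
E_{\alpha,\beta}(-x) \sim \frac{1}{\Gamma(\beta - \alpha)}\,\frac{1}{x} \quad \text{as } x \to \infty. \tag{3.43}
\]
Note that this behaviour is, up to the constant factor, independent of $\alpha$, $\beta$. A refinement of this is the estimate
\[
E_{\alpha,\beta}(-x) = -\sum_{k=1}^{N}\frac{1}{\Gamma(\beta - \alpha k)}\,\frac{1}{(-x)^{k}} + O\Bigl(\frac{1}{x^{N+1}}\Bigr), \tag{3.44}
\]
and we include below the full estimate for all $z$. We start with the case $\alpha \in (0, 2)$.

Theorem 3.23. Let $\alpha \in (0, 2)$, $\beta \in \mathbb{R}$, $\mu \in (\alpha\pi/2, \min(\pi, \alpha\pi))$, and $N \in \mathbb{N}$. Then for $|\arg(z)| \le \mu$ with $|z| \to \infty$, we obtain
\[
E_{\alpha,\beta}(z) = \frac{1}{\alpha}z^{\frac{1-\beta}{\alpha}}e^{z^{1/\alpha}} - \sum_{k=1}^{N}\frac{1}{\Gamma(\beta - \alpha k)}\,\frac{1}{z^{k}} + O\Bigl(\frac{1}{z^{N+1}}\Bigr), \tag{3.45}
\]
and for $\mu \le |\arg(z)| \le \pi$ with $|z| \to \infty$
\[
E_{\alpha,\beta}(z) = -\sum_{k=1}^{N}\frac{1}{\Gamma(\beta - \alpha k)}\,\frac{1}{z^{k}} + O\Bigl(\frac{1}{z^{N+1}}\Bigr). \tag{3.46}
\]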
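Proof. We sketch the outline of the proof and refer the reader to [81] for more details. To show (3.45), we take $\varphi \in (\mu, \min\{\pi, \pi\alpha\})$. By the identity
\[
\frac{1}{\zeta - z} = -\sum_{k=1}^{N}\frac{\zeta^{k-1}}{z^{k}} + \frac{\zeta^{N}}{z^{N}(\zeta - z)}
\]
and the representation (3.39) with $\epsilon = 1$, for any $z \in G^{(+)}(1, \varphi)$, there holds
\[
E_{\alpha,\beta}(z) = \frac{1}{\alpha}z^{\frac{1-\beta}{\alpha}}e^{z^{1/\alpha}} - \sum_{k=1}^{N}\Bigl(\frac{1}{2\pi\alpha i}\int_{\Gamma_{1,\varphi}} e^{\zeta^{1/\alpha}}\zeta^{\frac{1-\beta}{\alpha}+k-1}\, d\zeta\Bigr)z^{-k} + I_N,
\]
where the integral $I_N$ is defined by
\[
I_N = \frac{1}{2\pi\alpha i z^{N}}\int_{\Gamma_{1,\varphi}} \frac{e^{\zeta^{1/\alpha}}\zeta^{\frac{1-\beta}{\alpha}+N}}{\zeta - z}\, d\zeta. \tag{3.47}
\]
By (3.38), the first integral can be evaluated by
\[
\frac{1}{2\pi\alpha i}\int_{\Gamma_{1,\varphi}} e^{\zeta^{1/\alpha}}\zeta^{\frac{1-\beta}{\alpha}+k-1}\, d\zeta = \frac{1}{\Gamma(\beta - \alpha k)}, \qquad k \ge 1.
\]
It remains to bound the integral $I_N$. For sufficiently large $|z|$ and $|\arg(z)| \le \mu$, we have $\min_{\zeta\in\Gamma_{1,\varphi}}|\zeta - z| = |z|\sin(\varphi - \mu)$, and hence
\[
|I_N(z)| \le \frac{|z|^{-N-1}}{2\pi\alpha\sin(\varphi - \mu)}\int_{\Gamma_{1,\varphi}} |e^{\zeta^{1/\alpha}}|\,|\zeta^{\frac{1-\beta}{\alpha}+N}|\, d\zeta.
\]
For $\zeta \in \Gamma_{1,\varphi}$, $\arg(\zeta) = \pm\varphi$ and $|\zeta| \ge 1$, we have $|e^{\zeta^{1/\alpha}}| = e^{|\zeta|^{1/\alpha}\cos\frac{\varphi}{\alpha}}$ and, due to the choice of $\varphi$, $\cos(\varphi/\alpha) < 0$. Hence the last integral converges. These estimates together yield (3.45).

To show (3.46), we take $\varphi \in (\pi\alpha/2, \mu)$, and the representation (3.39) with $\epsilon = 1$. Then we obtain
\[
E_{\alpha,\beta}(z) = -\sum_{k=1}^{N}\frac{z^{-k}}{\Gamma(\beta - \alpha k)} + I_N(z), \qquad z \in G^{(-)}(1, \varphi),
\]
where $I_N$ is defined in (3.47). For large $|z|$ with $\mu \le |\arg(z)| \le \pi$, $\min_{\zeta\in\Gamma_{1,\varphi}}|\zeta - z| = |z|\sin(\mu - \varphi)$, and consequently
\[
|I_N(z)| \le \frac{|z|^{-1-N}}{2\pi\alpha\sin(\mu - \varphi)}\int_{\Gamma_{1,\varphi}} |e^{\zeta^{1/\alpha}}|\,|\zeta^{\frac{1-\beta}{\alpha}+N}|\, d\zeta,
\]
where the integral converges as before, thereby showing (3.46).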
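The algebraic expansion on the negative real axis can be compared with a case where the Mittag-Leffler function is known in closed form. The sketch below is an illustration only: it relies on the standard identity $E_{1/2,1}(-x) = e^{x^{2}}\operatorname{erfc}(x)$, which is not derived in this section and is taken as an assumption, and the function names are hypothetical.

```python
import math

def reciprocal_gamma(x):
    """1/Gamma(x) for real x; equals 0 at the poles x = 0, -1, -2, ..."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def ml_asymptotic_negative_axis(alpha, beta, x, n_terms):
    """Truncated sum from (3.44)/(3.46) evaluated at z = -x with x > 0."""
    z = -x
    return -sum(reciprocal_gamma(beta - alpha * k) / z**k
                for k in range(1, n_terms + 1))

def ml_half_exact(x):
    """Closed form E_{1/2,1}(-x) = exp(x^2) * erfc(x) (assumed identity)."""
    return math.exp(x * x) * math.erfc(x)

if __name__ == "__main__":
    for x in (5.0, 10.0, 20.0):
        exact = ml_half_exact(x)
        approx = ml_asymptotic_negative_axis(0.5, 1.0, x, 5)
        leading = 1.0 / (math.gamma(0.5) * x)      # leading term, cf. (3.43)
        print(x, exact, approx, leading)
```

For $\alpha = \frac12$, $\beta = 1$ the terms with even $k$ vanish because $1/\Gamma$ is zero at the non-positive integers, so only $k = 1, 3, 5$ contribute to the five-term sum.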
Hence, the function $E_{\alpha,\beta}(z)$, with $\alpha \in (0, 2)$ and $\beta - \alpha \notin \mathbb{Z}^- \cup \{0\}$, decays only linearly (that is, like $1/|z|$) on the negative real axis $\mathbb{R}^-$, which is much slower than the exponential decay for the exponential function $e^z$. However, on the positive real axis $\mathbb{R}^+$, it grows super exponentially, and the growth rate increases with the decrease of $\alpha$. We show some plots of the function $E_\alpha(x) = E_{\alpha,1}(x)$ for several different $\alpha$ values in Figure 3.3. On the positive real axis $\mathbb{R}^+$, the function $E_\alpha(x)$ is monotonically increasing, and for fixed $x$ it is a decreasing function of the parameter $\alpha$. The behaviour of the function on the negative real axis is far more complex. For $\alpha \in (0, 1]$, the function $E_\alpha(x)$ is monotonically decreasing, but the rate of decay differs substantially with the parameter $\alpha$: for $\alpha = 1$, the function decays exponentially, but for any other $\alpha \in (0, 1)$, it decays only linearly, concurring with Theorem 3.23. The monotonicity of the function $E_\alpha(x)$ is a consequence of Theorem 3.20. For $\alpha \in (1, 2]$, the function $E_\alpha(x)$ is no longer positive on the negative real axis, and instead it starts to oscillate significantly. For $\alpha$ away from 2, the function $E_\alpha(x)$ decays as $x \to -\infty$, and the closer $\alpha$ is to 1, the faster is the decay. However, for the limiting case $\alpha = 2$, there is no decay at all: $E_2(x) = \cosh(i\sqrt{-x}) = \cos\sqrt{-x}$, i.e., it recovers the cosine function, and thus Theorem 3.23 does not apply to $\alpha = 2$.

Remark 3.2. In the transition area around the Stokes line $|\arg(z)| \le \alpha\pi \pm \delta$, with $\delta < \alpha\pi/2$, the function $E_{\alpha,\beta}(z)$ exhibits the so-called Stokes phenomenon, and this can be amended by Berry-type smoothing [353]. With the error function complement $\operatorname{erfc}(z)$ defined by
\[
\operatorname{erfc}(z) = \frac{2}{\sqrt{\pi}}\int_z^\infty e^{-s^{2}}\, ds,
\]
it is given by
\[
E_{\alpha,\beta}(z) \sim \frac{1}{2\alpha}z^{\frac{1-\beta}{\alpha}}e^{z^{1/\alpha}}\operatorname{erfc}\Bigl(-c(\theta)\sqrt{\tfrac12|z|^{\frac{1}{\alpha}}}\Bigr) - \sum_{k=1}^{\infty}\frac{z^{-k}}{\Gamma(\beta - \alpha k)} \tag{3.48}
\]
around the lower Stokes line for $-3\pi\alpha/2 < \arg(z) < -\pi\alpha/2$, where the parameter $c(\theta)$ is given by the relation $\frac12 c^{2} = 1 + i\theta - e^{i\theta}$, where $\theta = \arg(z^{\frac{1}{\alpha}}) + \pi$ and the principal branch of $c$ is chosen such that $c \approx \theta + i\frac{\theta^{2}}{6} - \frac{\theta^{3}}{36}$ for small $\theta$. Around the upper Stokes line $\alpha\pi/2 < \arg(z) < 3\alpha\pi/2$, one finds that
\[
E_{\alpha,\beta}(z) \sim \frac{1}{2\alpha}z^{\frac{1-\beta}{\alpha}}e^{z^{1/\alpha}}\operatorname{erfc}\Bigl(c(\theta)\sqrt{\tfrac12|z|^{\frac{1}{\alpha}}}\Bigr) - \sum_{k=1}^{\infty}\frac{z^{-k}}{\Gamma(\beta - \alpha k)} \tag{3.49}
\]
with $c^{2}/2 = 1 + i\theta - e^{i\theta}$, $\theta = \arg(z^{1/\alpha}) - \pi$, and the same condition as before for small $\theta$. This exponentially improved asymptotic series converges very rapidly for most values of $z \in \mathbb{C}$, with $|z| > 1$.

The following useful estimate is a direct corollary of Theorem 3.23.

Corollary 3.2. Let $0 < \alpha < 2$, $\beta \in \mathbb{R}$, and $\pi\alpha/2 < \mu < \min(\pi, \pi\alpha)$. Then the following estimates hold:
\[
|E_{\alpha,\beta}(z)| \le c_1(1 + |z|)^{\frac{1-\beta}{\alpha}}e^{\mathrm{Re}(z^{1/\alpha})} + \frac{c_2}{1 + |z|}, \qquad |\arg(z)| \le \mu,
\]
\[
|E_{\alpha,\beta}(z)| \le \frac{c}{1 + |z|}, \qquad \mu \le |\arg(z)| \le \pi.
\]
Last, we briefly mention the case $\alpha \ge 2$, which can be reduced to the $\alpha \in (0, 2)$ case via the following duplication formula.

Lemma 3.2. For all $\alpha > 0$, $\beta \in \mathbb{R}$, $z \in \mathbb{C}$, and $m = [(\alpha - 1)/2] + 1$, there holds
\[
E_{\alpha,\beta}(z) = \frac{1}{2m+1}\sum_{j=-m}^{m}E_{\frac{\alpha}{2m+1},\beta}\bigl(z^{\frac{1}{2m+1}}e^{\frac{2\pi ij}{2m+1}}\bigr). \tag{3.50}
\]

Proof. This follows directly from the elementary identity
\[
\frac{1}{2m+1}\sum_{j=-m}^{m}e^{\frac{2kj\pi i}{2m+1}} =
\begin{cases}
1, & \text{if } k \equiv 0 \pmod{2m+1},\\
0, & \text{if } k \not\equiv 0 \pmod{2m+1}.
\end{cases}
\]
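The duplication formula (3.50) lends itself to a direct numerical check with the power series. The sketch below is an illustration under assumptions: `ml_series` and `ml_via_duplication` are hypothetical helper names, $[\,\cdot\,]$ is read as the integer part, the principal branch is used for the root of $z$, and the truncated series is only trusted for moderate arguments.

```python
import cmath
import math

def ml_series(alpha, beta, z, max_terms=2000):
    """Truncated power series (3.26); z may be real or complex, |z| moderate."""
    total, power = 0.0, 1.0
    for k in range(max_terms):
        arg = alpha * k + beta
        if arg > 170.0:                  # math.gamma overflows beyond ~171
            break
        # 1/Gamma vanishes at the poles 0, -1, -2, ... of Gamma
        if not (arg <= 0 and arg == math.floor(arg)):
            total += power / math.gamma(arg)
        power *= z
    return total

def ml_via_duplication(alpha, beta, z):
    """Right-hand side of (3.50), with [.] interpreted as the integer part."""
    m = math.floor((alpha - 1.0) / 2.0) + 1
    n = 2 * m + 1
    root = complex(z) ** (1.0 / n)       # principal n-th root (assumed branch)
    total = 0.0
    for idx in range(-m, m + 1):
        rotation = cmath.exp(2j * math.pi * idx / n)
        total += ml_series(alpha / n, beta, root * rotation)
    return total / n

if __name__ == "__main__":
    # E_{2,1}(z) = cosh(sqrt(z)), so z = 2 and z = -4 have closed forms
    print(ml_via_duplication(2.0, 1.0, 2.0), math.cosh(math.sqrt(2.0)))
    print(ml_via_duplication(2.0, 1.0, -4.0), math.cos(2.0))
```

The result is returned as a complex number whose imaginary part should be at the level of rounding error for real $z$.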
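Using Lemma 3.2, the following result is obtained.

Theorem 3.24. Let $\alpha \ge 2$, $\beta \in \mathbb{R}$, and $N \in \mathbb{N}$. The following asymptotics hold:
\[
E_{\alpha,\beta}(z) = \frac{1}{\alpha}\sum_{m}\bigl(z^{\frac{1}{\alpha}}e^{\frac{2m\pi i}{\alpha}}\bigr)^{1-\beta}\exp\bigl(e^{\frac{2m\pi i}{\alpha}}z^{\frac{1}{\alpha}}\bigr) - \sum_{k=1}^{N}\frac{z^{-k}}{\Gamma(\beta - k\alpha)} + O(|z|^{-N-1}),
\]
as $z \to \infty$, where the first sum is taken for integers $m$ satisfying the condition
\[
|\arg(z) + 2m\pi| \le \frac{3\alpha\pi}{4}.
\]

The following uniform estimate on $E_\alpha(-x)$ shows that up to constants the function behaves like $1/(1 + cx)$ over the entire interval $(0, \infty)$.

Theorem 3.25. For every $\alpha \in (0, 1)$, the uniform estimate
\[
\frac{1}{1 + \Gamma(1 - \alpha)x} \le E_\alpha(-x) \le \frac{1}{1 + \Gamma(1 + \alpha)^{-1}x} \tag{3.51}
\]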
is valid in $\mathbb{R}^+$, where the bounding constants are optimal.

Proof. The main idea of the proof will be to convert $E_\alpha(-x)$ by the change of variables $x = t^{\alpha}$ in order to utilise the fact that such a combination is invariant under an Abel fractional integral of order $\alpha$. Set $f(t) = E_{\alpha,1}(-t^{\alpha})$. Then from Theorem 3.20 $f(t)$ is decreasing, in fact $f'(t) = -t^{\alpha-1}E_{\alpha,\alpha}(-t^{\alpha}) \le 0$, and from integration of the relaxation equation ${}^{DC}_{\;\;0}D_t^{\alpha}f(t) + f(t) = 0$ with $f(0) = 1$ we have
\[
f(t) + {}_0I^{\alpha}f(t) = 1 \quad \text{for } t \ge 0. \tag{3.52}
\]
Now
\[
{}_0I^{\alpha}f(t) = \frac{1}{\Gamma(\alpha)}\int_0^t \frac{f(s)}{(t - s)^{1-\alpha}}\, ds \ge \frac{f(t)}{\Gamma(\alpha)}\int_0^t \frac{ds}{(t - s)^{1-\alpha}} = \frac{t^{\alpha}}{\Gamma(\alpha + 1)}f(t), \tag{3.53}
\]
which using (3.52) becomes
\[
1 = f(t) + {}_0I^{\alpha}f(t) \ge \Bigl(1 + \frac{t^{\alpha}}{\Gamma(\alpha + 1)}\Bigr)f(t), \tag{3.54}
\]
and this is the upper bound of (3.51). For the lower bound, again using the relaxation equation and ${}^{DC}_{\;\;0}D_t^{\alpha}f(t) = {}_0I^{1-\alpha}f'(t)$, we have $-{}_0I^{1-\alpha}f'(t) = f(t)$. This gives
\[
f(t) = -\frac{1}{\Gamma(1 - \alpha)}\int_0^t \frac{f'(s)}{(t - s)^{\alpha}}\, ds \ge \frac{-t^{-\alpha}}{\Gamma(1 - \alpha)}\int_0^t f'(s)\, ds = \frac{t^{-\alpha}}{\Gamma(1 - \alpha)}\bigl(1 - f(t)\bigr),
\]
since $-f' \ge 0$ and $s \mapsto s^{-\alpha}$ is decreasing. Solving the above inequality for $f(t)$ then gives the lower bound.

We note that these bounds clearly fail for $\alpha = 1$ as then the Mittag-Leffler function becomes just $e^{-x}$. In fact the estimates are remarkably good for $\alpha$ near zero, since in that case $\Gamma(1 - \alpha) \approx \Gamma(1 + \alpha)$ and both lower and upper bounds are close, but deteriorate as $\alpha$ approaches unity. Note also that for $x$ small we have
\[
\frac{1}{1 + \Gamma(1 + \alpha)^{-1}x} \approx 1 - \frac{x}{\Gamma(1 + \alpha)} \approx E_{\alpha,1}(-x),
\]
while for $x$ large
\[
\frac{1}{1 + \Gamma(1 - \alpha)x} \approx \frac{1}{\Gamma(1 - \alpha)}\,\frac{1}{x} \approx E_{\alpha,1}(-x),
\]
and so the estimates in Theorem 3.25 can be viewed as first order rational approximations to $E_\alpha(-x)$. Of course higher order rational approximations of Padé type can be computed in a standard way, and we indicate some possibilities in this direction in Section 3.4.6.

3.4.5. Distribution of zeros. The distribution of zeros of the function $E_{\alpha,\beta}(z)$ is of independent interest because of its crucial role in the study of Fourier–Laplace type integrals with Mittag-Leffler kernels [79], spectral theory of fractional-order differential equations [258], inverse problems [200], and stochastic processes [13], to name a few. Actually, shortly after the appearance of the work of Mittag-Leffler [250] in 1903, Wiman [349] showed in 1905 that for $\alpha \ge 2$, all zeros of the function $E_{\alpha,1}(z)$ are real, negative, and simple, and later, Pólya [276] reproved this fact for $2 \le \alpha \in \mathbb{N}$ by a different method. It was revisited to much greater depth by Djrbashian [79] and subsequently further refinements were done; see [278] for an overview. A general result for the case $\alpha \in (0, 2)$ and $\beta \in \mathbb{R}$ is as follows (see [278, Theorem 2.1.1] for the technical proof):

Theorem 3.26. Let $\alpha \in (0, 2)$, and let $\beta \in \mathbb{R}$, where $\beta \ne 1, 0, -1, -2, \ldots$ for $\alpha = 1$. Then all sufficiently large (in modulus) zeros $z_n$ of the function
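$E_{\alpha,\beta}(z)$ are simple and the asymptotic formula
\[
z_n^{\frac{1}{\alpha}} = 2\pi in - \alpha\tau_\beta\ln 2\pi in + \ln c_\beta + \frac{d_\beta}{c_\beta(2\pi in)^{\alpha}} + (\alpha\tau_\beta)^{2}\frac{\ln 2\pi in}{2\pi in} - \alpha\tau_\beta\frac{\ln c_\beta}{2\pi in} + r_n
\]
holds as $n \to \pm\infty$, where the constants $c_\beta$, $d_\beta$, and $\tau_\beta$ are defined by
\[
c_\beta = \frac{\alpha}{\Gamma(\beta - \alpha)}, \quad d_\beta = \frac{\alpha}{\Gamma(\beta - 2\alpha)}, \quad \tau_\beta = 1 + \frac{1 - \beta}{\alpha}, \quad \text{if } \beta \ne \alpha - l,
\]
\[
c_\beta = \frac{\alpha}{\Gamma(\beta - 2\alpha)}, \quad d_\beta = \frac{\alpha}{\Gamma(\beta - 3\alpha)}, \quad \tau_\beta = 2 + \frac{1 - \beta}{\alpha}, \quad \text{if } \beta = \alpha - l, \ \alpha \notin \mathbb{N},
\]
where $l \in \mathbb{N}$ and the remainder $r_n$ is given by
\[
r_n = O\Bigl(\frac{\ln|n|}{|n|^{1+\alpha}}\Bigr) + O\Bigl(\frac{1}{|n|^{2\alpha}}\Bigr) + O\Bigl(\frac{\ln^{2}|n|}{n^{2}}\Bigr).
\]

The following result specialises Theorem 3.26. We sketch a short derivation to give the flavour of the complete proof, which can be found in [169].

Proposition 3.2. Let $\alpha \in (0, 2)$ and $\alpha \ne 1$. Then all sufficiently large (in modulus) zeros $z_n$ of the function $E_{\alpha,2}(z)$ are simple, and the asymptotic formula
\[
z_n^{\frac{1}{\alpha}} = 2n\pi i - (\alpha - 1)\Bigl(\ln 2\pi|n| + \operatorname{sign}(n)\frac{\pi}{2}i\Bigr) + \ln\frac{\alpha}{\Gamma(2 - \alpha)} + r_n
\]
holds, where the remainder $r_n$ is $O\bigl(\frac{\ln|n|}{|n|}\bigr)$ as $n \to \infty$.

Proof. Taking $N = 1$ in the asymptotics (3.45) gives
\[
E_{\alpha,2}(z) = \frac{1}{\alpha}z^{-\frac{1}{\alpha}}e^{z^{1/\alpha}} - \frac{1}{\Gamma(2 - \alpha)}\,\frac{1}{z} + O\Bigl(\frac{1}{z^{2}}\Bigr)
\]
for $|z| \to \infty$. Hence at a zero we have
\[
z^{1-\frac{1}{\alpha}}e^{z^{1/\alpha}} = \frac{\alpha}{\Gamma(2 - \alpha)} + O\Bigl(\frac{1}{z}\Bigr).
\]
Next let $\zeta = z^{\frac{1}{\alpha}}$ and $w = \zeta + (\alpha - 1)\ln\zeta$. Then the equation can be rewritten as
\[
e^{w} = \frac{\alpha}{\Gamma(2 - \alpha)} + O\Bigl(\frac{1}{w^{\alpha}}\Bigr).
\]
The solutions $w_n$ to the above equation for all sufficiently large $n$ satisfy
\[
w_n = 2\pi ni + \ln\frac{\alpha}{\Gamma(2 - \alpha)} + O\Bigl(\frac{1}{|n|^{\alpha}}\Bigr),
\]
or equivalently
\[
\zeta_n + (\alpha - 1)\ln\zeta_n = 2\pi ni + \ln\frac{\alpha}{\Gamma(2 - \alpha)} + O\Bigl(\frac{1}{|n|^{\alpha}}\Bigr),
\]
from which we arrive at the desired assertion.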
The strict positivity of $E_{\alpha,\alpha}$ is a simple consequence of Theorem 3.20.

Corollary 3.3. For $0 < \alpha < 1$, we have
$$E_{\alpha,\alpha}(-x) > 0, \qquad x \ge 0.$$

Proof. Assume to the contrary that it vanishes at some $x = x_0 > 0$. By Theorem 3.20, $E_{\alpha,\alpha}(-x)$ is completely monotone, and thus from $E_{\alpha,\alpha}(-x_0) = 0$ we conclude that it vanishes for all $x \ge x_0$. The analyticity of $E_{\alpha,\alpha}(-x)$ in $x$ implies that it vanishes identically over $\mathbb{R}_+$, which contradicts $E_{\alpha,\alpha}(0) = 1/\Gamma(\alpha)$.

A similar strict positivity result holds for $E_\alpha(-\lambda x^\alpha)$ and its derivative. From (3.35), by the inverse Laplace transform, for $x > 0$
$$E_\alpha(-\lambda x^\alpha) = \frac{1}{2\pi i}\int_{Ha} \frac{e^{zx}}{z + \lambda z^{1-\alpha}}\,dz.$$
By collapsing the Hankel contour onto the negative real axis, we get
$$E_\alpha(-\lambda x^\alpha) = \frac{1}{\pi}\int_0^\infty \frac{e^{-rx}\,\lambda r^{\alpha-1}\sin\alpha\pi}{(r^\alpha + \lambda\cos\alpha\pi)^2 + (\lambda\sin\alpha\pi)^2}\,dr, \tag{3.55}$$
which shows that
$$E_\alpha(-\lambda x^\alpha) > 0 \quad\text{and}\quad \frac{d}{dx}E_\alpha(-\lambda x^\alpha) < 0 \qquad \forall x > 0.$$

3.4.6. Numerical algorithms. The computation of the Mittag-Leffler function $E_{\alpha,\beta}(z)$ over the complex plane $\mathbb{C}$ is a nontrivial issue. Naturally, it has received much attention and there are quite comprehensive algorithms in the literature; see, for example, [115, 317]. For the case $\beta = 1$ and $z$ restricted to the negative real axis, the situation is much simpler, but the ideas are representative of techniques used for the complete picture. We will use the case of $E_\alpha(-\lambda x^\alpha)$, as this is in fact the version most directly needed for the subdiffusion problems to be studied later. First, since $E_\alpha$ is an entire function, for small enough values of the argument the power series can be used. If instead $x$ is bounded away from the origin, then the representation (3.55) comes into play. The presence of the exponential term in the integrand shows there is no difficulty with the upper limit if $x > \delta > 0$, and the remaining case of small $x$ has already been taken care of by direct use of the power series. For all $r$ the denominator is bounded below by $\lambda^2$ and there is also no difficulty with the lower limit, so the quadrature can be done in a completely standard way. The plots of the Mittag-Leffler function shown earlier in Figures 3.3 and 3.4 were produced using this algorithm.
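The quadrature step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the integrand of (3.55) carries the factor $\lambda r^{\alpha-1}\sin\alpha\pi$, applies the substitution $r^\alpha = \lambda t/(1-t)$ (our choice, which maps $[0,\infty)$ to $[0,1]$ and absorbs the weak endpoint singularity), and uses composite Simpson's rule. The function name `ml_neg_real` and the default `n` are hypothetical.

```python
import math

def ml_neg_real(alpha, lam, x, n=2000):
    """Approximate E_alpha(-lam * x**alpha) for 0 < alpha < 1, lam > 0, x >= 0."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    s_pi = math.sin(alpha * math.pi)
    c_pi = math.cos(alpha * math.pi)

    def integrand(t):
        # Integrand of (3.55) after r**alpha = lam * t / (1 - t); bounded on [0, 1].
        denom = t * t + 2.0 * t * (1.0 - t) * c_pi + (1.0 - t) ** 2
        if x == 0.0:
            decay = 1.0
        elif t >= 1.0:
            decay = 0.0  # limit of exp(-r x) as r -> infinity
        else:
            decay = math.exp(-x * (lam * t / (1.0 - t)) ** (1.0 / alpha))
        return decay * s_pi / (math.pi * alpha * denom)

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0
```

For $\alpha = \frac12$ the result can be compared with the closed form $E_{1/2}(-z) = e^{z^2}\operatorname{erfc}(z)$. For very small $\alpha$ the power `(...) ** (1.0 / alpha)` may overflow, so the sketch is only intended for moderate parameter values.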
As an alternative method one can seek uniform Padé approximations on the interval $[0, \infty)$. The usual Padé approximation of a function is taken
about a fixed point of the Taylor series and has an interval of effective numerical validity [17], and it is frequently the case that several such intervals are pieced together to form an enlarged range. It is also possible to find a global approximation over an interval by using anchor points at both its ends. Here is one such approach for the function $E_{\alpha,\beta}(-x)$, $x > 0$ [62]. For the case $0 < \alpha < 1$ and $\beta > \alpha$ we have, from (3.26),
$$\begin{aligned}
\Gamma(\beta-\alpha)\,x\,E_{\alpha,\beta}(-x) &= \Gamma(\beta-\alpha)\,x \sum_{k=0}^{m-2} \frac{(-x)^k}{\Gamma(\alpha k+\beta)} + O(x^m) = A(x) + O(x^m),\\
\Gamma(\beta-\alpha)\,x\,E_{\alpha,\beta}(-x) &= -\Gamma(\beta-\alpha)\,x \sum_{k=1}^{n} \frac{(-x)^{-k}}{\Gamma(\beta-\alpha k)} + O(x^{-n}) = B(x^{-1}) + O(x^{-n}).
\end{aligned} \tag{3.56}$$
Here the first formula is taken at $x \to 0$ and the second at $x \to \infty$, and the multiplication by $\Gamma(\beta-\alpha)x$ ensures that the leading coefficient of the asymptotic series in (3.56) is 1. We now look for a rational approximation of the form
$$\Gamma(\beta-\alpha)\,x\,E_{\alpha,\beta}(-x) \approx \frac{p(x)}{q(x)} = \frac{p_0 + p_1 x + \cdots + p_\nu x^\nu}{q_0 + q_1 x + \cdots + q_\nu x^\nu}, \tag{3.57}$$
where $\nu$ is an appropriately chosen integer and the task is to find the coefficients $\{p_i\}$ and $\{q_i\}$ such that (3.57) has the correct expansions at both $x \to 0$ and $x \to \infty$. Since the leading term of (3.57) at $x \to \infty$ is $p_\nu/q_\nu$, we set $p_\nu = q_\nu = 1$. The coefficients $\{p_i\}$ and $\{q_i\}$ are solutions of the system of linear equations
$$\begin{aligned}
p(x) - q(x)\,A(x) &= O(x^m) && \text{at } x \to 0,\\
\frac{p(x)}{x^\nu} - \frac{q(x)}{x^\nu}\,B(x^{-1}) &= O(x^{-n}) && \text{at } x \to \infty.
\end{aligned} \tag{3.58}$$
These two equations form an inhomogeneous linear system of $(m+n-1)$ equations for the $2\nu$ unknowns $\{p_i, q_i\}$, $0 \le i \le \nu-1$. For example, if $m = 3$ and $n = 2$, we look for coefficients such that
$$\Gamma(\beta-\alpha)\,x\,E_{\alpha,\beta}(-x) \approx \frac{p_0 + p_1 x + x^2}{q_0 + q_1 x + x^2}, \tag{3.59}$$
and we obtain
$$A(x) = \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta)}\,x - \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta+\alpha)}\,x^2, \qquad B(x^{-1}) = 1 - \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta-2\alpha)}\,\frac{1}{x}.$$
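Solving for the coefficients, we obtain
$$p_0 = 0, \qquad p_1 - \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta)}\,q_0 = 0, \qquad 1 + \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta+\alpha)}\,q_0 = \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta)}\,q_1, \qquad q_1 - p_1 = \frac{\Gamma(\beta-\alpha)}{\Gamma(\beta-2\alpha)},$$
$$q_0 = C\left(\frac{\Gamma(\beta)\Gamma(\beta+\alpha)}{\Gamma(\beta-\alpha)} - \frac{\Gamma(\beta+\alpha)\Gamma(\beta-\alpha)}{\Gamma(\beta-2\alpha)}\right), \qquad q_1 = C\left(\Gamma(\beta+\alpha) - \frac{\Gamma(\beta)\Gamma(\beta-\alpha)}{\Gamma(\beta-2\alpha)}\right),$$
$$C = \frac{\Gamma(\beta)}{\Gamma(\beta+\alpha)\Gamma(\beta-\alpha) - \Gamma(\beta)^2},$$
so that
$$E_{\alpha,\beta}(-x) \approx \left(\frac{q_0}{\Gamma(\beta)} + \frac{x}{\Gamma(\beta-\alpha)}\right)\bigl(q_0 + q_1 x + x^2\bigr)^{-1}. \tag{3.60}$$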
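The closed form (3.60) is cheap to evaluate, as the following sketch suggests. It is an illustration under stated assumptions rather than a reference implementation: the names `rgamma` and `ml_pade` are ours, the coefficients are those of the $m = 3$, $n = 2$ example, and the reciprocal Gamma function is set to zero at its poles so that cases such as $\beta = 2\alpha$ are handled.

```python
import math

def rgamma(x):
    """Reciprocal Gamma function 1/Gamma(x), taken as 0 at the poles 0, -1, -2, ..."""
    if x <= 0.0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def ml_pade(x, alpha, beta):
    """Approximate E_{alpha,beta}(-x) via (3.60); assumes x >= 0, 0 < alpha < 1, beta > alpha."""
    g_b = math.gamma(beta)
    g_bp = math.gamma(beta + alpha)
    g_bm = math.gamma(beta - alpha)
    rg_b2 = rgamma(beta - 2.0 * alpha)  # 1/Gamma(beta - 2 alpha), may vanish
    c = g_b / (g_bp * g_bm - g_b ** 2)
    q0 = c * (g_b * g_bp / g_bm - g_bp * g_bm * rg_b2)
    q1 = c * (g_bp - g_b * g_bm * rg_b2)
    return (q0 / g_b + x / g_bm) / (q0 + q1 * x + x * x)
```

For $\alpha = \frac12$, $\beta = 1$ the output can be compared with $E_{1/2}(-x) = e^{x^2}\operatorname{erfc}(x)$; at the sample points we checked by hand the relative deviation stays below two percent.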
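The case of $E_{\alpha,\alpha}(-x)$ works using an identical approach [62], giving
$$E_{\alpha,\alpha}(-x) \approx \frac{1}{\Gamma(\alpha)}\left(1 + \frac{2\Gamma(1+\alpha)^2}{\Gamma(1-\alpha)\Gamma(2-\alpha)}\,x + \frac{\Gamma(1-\alpha)}{\Gamma(1+\alpha)}\,x^2\right)^{-1}. \tag{3.61}$$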
Clearly this rational function approximation is extremely fast to evaluate, and in fact it gives an approximation within a few percent for all $x \ge 0$ and $\alpha$ away from unity. As in the uniform bound (3.51), it is much more accurate for smaller $\alpha$. Figure 3.5 shows the bounds from Theorem 3.25 as well as plots of $E_{\alpha,1}(-x)$ for $\alpha = \frac12$, computed using both the standard method (namely the combination of power series, integral representation, and asymptotic behaviour) and the uniform Padé $(1,2)$ method shown above. The function Diff$(x)$ represents the difference of the two numerical schemes. We do not show $\alpha = \frac14$ here since all four plots would essentially overlap at the scale used: the maximum difference between the upper and lower bounds is approximately 0.026. For $\alpha$ greater than about 0.7 the uniform Padé scheme is still very accurate for medium to large values of $|x|$, but it deteriorates considerably as $\alpha$ approaches unity for $x$ near zero. It is unlikely that taking a higher order scheme will be fruitful, but one possible remedy here is to use a standard Padé rational approximation for small values of $x$, for which taking higher order methods will offer improvement.

In different regions of the complex plane $\mathbb{C}$, the function $E_{\alpha,\beta}(z)$ shows considerably different behaviour, and consequently the numerical schemes for complex $z$ have to be adapted to these regions. In fact the power series argument above holds for the general case of $E_{\alpha,\beta}(z)$, and a suggested value for the upper bound of $|z|$ that should be
Figure 3.5. Left: Plots of $E_{\alpha,1}(-x)$ with $\alpha = \frac12$ by two methods and uniform bounds. Right: Difference of methods, Diff$(x)$, on a logarithmic scale.
used is just under unity. Indeed, in [317] it is shown that for any $\varepsilon > 0$ and $|z| < 1$ the choice
$$N \ge \max\left\{ \left\lceil \frac{2-\beta}{\alpha} \right\rceil + 1,\ \left\lceil \frac{\ln(\varepsilon(1-|z|))}{\ln(|z|)} \right\rceil + 1 \right\} \tag{3.62}$$
guarantees that the error involved in using the first $N$ terms of the Taylor series $\sum_{k=0}^{N-1} \frac{z^k}{\Gamma(\alpha k+\beta)}$ is less than $\varepsilon$. For intermediate values of $|z|$ there are again integral representations available for which standard quadrature arguments apply. There are two cases used depending on whether $\beta \le 1$ or $\beta > 1$. For large values of $|z|$ asymptotic expansions are used; specifically (3.45) when $\operatorname{Re}(z) > 0$ and (3.46) when $\operatorname{Re}(z) < 0$. However, there is a catch. The Stokes lines (see Section 3.1.2) here are the radial rays with argument $\pm\alpha\pi$. Thus in a region of width $\delta$ about these lines modification is required in the form of Berry smoothing [317].
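The truncation rule (3.62) translates directly into code. The sketch below is illustrative only: the function names are ours, it assumes $0 < |z| < 1$ and $\beta > 0$ (so that `math.lgamma` can be used for the reciprocal Gamma function), and it makes no attempt to cover the intermediate or large $|z|$ regimes.

```python
import math

def ml_series_terms(z, alpha, beta, eps):
    """Number of Taylor terms N suggested by (3.62); requires 0 < |z| < 1."""
    r = abs(z)
    n1 = math.ceil((2.0 - beta) / alpha) + 1
    n2 = math.ceil(math.log(eps * (1.0 - r)) / math.log(r)) + 1
    return max(n1, n2, 1)

def ml_series(z, alpha, beta, eps=1e-12):
    """Sum the first N terms of the power series of E_{alpha,beta}(z), |z| < 1, beta > 0."""
    if z == 0:
        return 1.0 / math.gamma(beta)
    n_terms = ml_series_terms(z, alpha, beta, eps)
    total = 0.0
    power = 1.0  # running value of z**k
    for k in range(n_terms):
        # 1/Gamma(alpha k + beta) via lgamma, valid because alpha k + beta > 0
        total += power * math.exp(-math.lgamma(alpha * k + beta))
        power *= z
    return total
```

Two convenient sanity checks are $E_{1,1}(z) = e^{z}$ and $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$.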
3.5. The Wright function

In the previous section we introduced the Mittag-Leffler function as the extension of the exponential function when solving the relaxation equation $D^\alpha u + \lambda u = 0$ with a derivative of fractional order, and indeed this function plays the lead role in fractional calculus. The reader will also have noticed the prominent role played by Fourier and Laplace transforms in the solution of ordinary and partial differential equations, and finding pairs of functions coupled by such transforms is a crucial strategy. One might then ask which function plays this role for $E_{\alpha,\beta}(z)$. The answer is the so-called Wright function. The first application of the Wright function was connected with the asymptotic theory of partitions. A partition of a positive integer $n$ is a way
of writing $n$ as a sum of positive integers. Two sums that differ only in the order of their summands are considered equivalent. The function $p(n)$ represents the number of possible partitions of a nonnegative integer $n$, and $p_k(n)$ is the number of partitions of $n$ into exactly $k$ parts, which is equal to the number of partitions of $n$ in which the largest part has size $k$. The function $p_k(n)$ satisfies the recurrence relation $p_k(n) = p_k(n-k) + p_{k-1}(n-1)$. One recovers the function $p(n)$ from this by $p(n) = \sum_{k=0}^{n} p_k(n)$. The asymptotic growth rate for $p(n)$ is given by $\log p(n) \sim \pi\sqrt{2n/3}$ as $n \to \infty$. The more precise asymptotic formula,
$$p(n) \sim \frac{1}{4n\sqrt{3}}\,e^{\pi\sqrt{\frac{2n}{3}}} \quad \text{as } n \to \infty,$$
was obtained by Hardy and Ramanujan in 1918. Extending this work, Wright [354] considered the more general problem, namely to find an asymptotic expansion for the function $p_k(n)$. Following Hardy and Ramanujan, Wright considered the generating function
$$P_k(z) = \sum_{n=1}^{\infty} p_k(n)\,z^n = \prod_{i=1}^{\infty} \frac{1}{1 - z^{ki}},$$
which counts all partitions of all numbers $n$, with weight $z$. The infinite product converges since only a finite number of the factors contribute to any given term. Turning this around, one can write $p_k(n)$ as
$$p_k(n) = \frac{1}{2\pi i}\int_\gamma \frac{P_k(z)}{z^{n+1}}\,dz,$$
where the contour $\gamma$ is a circle centred at the origin and with radius $1 - \frac{1}{n}$. After several more steps, this led Wright to the asymptotic expansion for $p_k(n)$ and along the way to obtaining a power series expansion for $P_k(z)$ in what is now known as the Wright function. For $\mu, \rho \in \mathbb{R}$ with $\rho > -1$, the Wright function $W_{\rho,\mu}(z)$ is defined by
$$W_{\rho,\mu}(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!\,\Gamma(\rho k + \mu)}, \qquad z \in \mathbb{C}. \tag{3.63}$$
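For moderate $|z|$ the defining series (3.63) can be summed directly. The following is a minimal sketch under our own assumptions: a fixed number of terms (`n_terms`, a hypothetical default), no error control, and a reciprocal Gamma helper that returns zero at the poles, which is needed when $\rho k + \mu$ hits a nonpositive integer.

```python
import math

def rgamma(x):
    """Reciprocal Gamma function 1/Gamma(x), taken as 0 at the poles 0, -1, -2, ..."""
    if x <= 0.0 and x == math.floor(x):
        return 0.0
    if x > 0.0:
        return math.exp(-math.lgamma(x))  # avoids overflow of Gamma for large x
    return 1.0 / math.gamma(x)

def wright(z, rho, mu, n_terms=120):
    """Partial sum of the series (3.63) for W_{rho,mu}(z), real z, rho > -1."""
    total = 0.0
    power_over_fact = 1.0  # running value of z**k / k!
    for k in range(n_terms):
        total += power_over_fact * rgamma(rho * k + mu)
        power_over_fact *= z / (k + 1)
    return total
```

Two classical special cases give quick checks: $W_{0,1}(z) = e^z$, and $W_{1,1}(-z^2/4) = J_0(z)$, which should vanish at the first Bessel zero $z \approx 2.404825557695773$. For large $|z|$ the alternating terms cancel badly and a series of this kind is no longer appropriate.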
Wright published this work in a series of notes starting from 1933. Originally, he assumed that $\rho \ge 0$ [354–356], and only in 1940 [357] did he consider the more difficult case $-1 < \rho < 0$. There was further work on the Wright function by Stankovic in 1970 [322], involving operators of Mikusiński type [248]. In a prophetic remark, he noted, “A number of authors have used some functions which were really simply Wright functions or some particular instance of it.”
More recently, the asymptotics of the Wright function was discussed in [230, Theorems 3.1 and 3.2]. The exponential asymptotics can be used to deduce the distribution of its zeros [228]. We also refer the reader to [116, 117] for further details on this function. An excellent tutorial survey of the $M$-Wright function is [237], where many related references and historical remarks can be found. The function $W_{\rho,\mu}(z)$ generalises both the exponential function $W_{0,1}(z) = e^z$ and the Bessel functions $J_\nu$ and $I_\nu$,
$$W_{1,\nu+1}\!\left(-\frac{z^2}{4}\right) = \left(\frac{z}{2}\right)^{-\nu} J_\nu(z), \qquad W_{1,\nu+1}\!\left(\frac{z^2}{4}\right) = \left(\frac{z}{2}\right)^{-\nu} I_\nu(z). \tag{3.64}$$
For this reason it is often called the generalised Bessel function. The next result gives the order of the function $W_{\rho,\mu}(z)$ [116, Theorem 2.4.1]. It is noteworthy that for $\rho \in (-1,0)$, the function is not of exponential order.

Lemma 3.3. For any $\rho > -1$, $\mu \in \mathbb{R}$, the Wright function $W_{\rho,\mu}(z)$ is entire of order $1/(1+\rho)$.

Proof. We sketch the proof only for $\rho \in (-1,0)$, since the case $\rho \ge 0$ is simpler. Using the reflection formula (3.12), we rewrite $W_{\rho,\mu}(z)$ as
$$W_{\rho,\mu}(z) = \frac{1}{\pi}\sum_{k=0}^{\infty} \frac{\Gamma(1-k\rho-\mu)\,\sin(\pi(k\rho+\mu))}{k!}\,z^k.$$
Next we introduce an auxiliary majorising sequence,
$$\frac{1}{\pi}\sum_{k=0}^{\infty} c_k |z|^k \quad \text{with} \quad c_k = \frac{|\Gamma(1-k\rho-\mu)|}{k!},$$
and then by Stirling's formula (3.25) we deduce that
$$\lim_{k\to\infty} \frac{c_k}{c_{k+1}} = \lim_{k\to\infty} \frac{k+1}{|\rho|^{\rho}\,k^{-\rho}} = \infty.$$
Hence the auxiliary sequence and also the series (3.63) converge over the whole complex plane $\mathbb{C}$. The order follows analogously.

The asymptotic behaviour of the function (3.63) was first studied by Wright himself: first in 1935 [355] for the case $\rho > 0$, then in 1940 for the more difficult case of $-1 < \rho < 0$ [357]. It is still an entire function for this range, but it shows certain different features. These estimates were recently refined by Wong and Zhao [351, 352] by including Stokes phenomena. As usual, the asymptotic expansions are based on a suitable integral
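representation
$$W_{\rho,\mu}(z) = \frac{1}{2\pi i}\int_{Ha} e^{\zeta + z\zeta^{-\rho}}\,\zeta^{-\mu}\,d\zeta, \tag{3.65}$$
where $Ha$ is the Hankel contour. It follows from the integral representation of the reciprocal Gamma function (3.23) that
$$W_{\rho,\mu}(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!}\,\frac{1}{2\pi i}\int_{Ha} e^{\zeta}\zeta^{-\rho k-\mu}\,d\zeta = \frac{1}{2\pi i}\int_{Ha} e^{\zeta}\zeta^{-\mu}\sum_{k=0}^{\infty} \frac{(z\zeta^{-\rho})^k}{k!}\,d\zeta = \frac{1}{2\pi i}\int_{Ha} e^{\zeta + z\zeta^{-\rho}}\,\zeta^{-\mu}\,d\zeta.$$
We state only the asymptotic expansions.

Theorem 3.27. The following asymptotic formulae hold.

(i) Let $\rho > 0$, $\arg(z) = \theta$, $|\theta| \le \pi - \epsilon$, $\epsilon > 0$. Then
$$W_{\rho,\mu}(z) = Z^{\frac12-\mu}\,e^{\frac{1+\rho}{\rho}Z}\left\{\sum_{m=0}^{M} \frac{(-1)^m a_m}{Z^m} + O(|Z|^{-M-1})\right\}, \qquad Z \to \infty,$$
where $Z = (\rho|z|)^{\frac{1}{\rho+1}}\,e^{\frac{i\theta}{\rho+1}}$ and the coefficients $a_m$, $m = 0, 1, \ldots$, are defined as the coefficients of $v^{2m}$ in the expansion of
$$\frac{\Gamma(m+\frac12)}{2\pi}\left(\frac{2}{\rho+1}\right)^{m+\frac12}(1-v)^{-\mu}\left\{1 + \frac{\rho+2}{3}\,v + \frac{(\rho+2)(\rho+3)}{3\cdot 4}\,v^2 + \cdots\right\}^{-\frac{2m+1}{2}}.$$

(ii) Let $-1 < \rho < 0$, $y = -z$, $\arg(z) \le \pi$, $-\pi < \arg(y) \le \pi$, $|\arg(y)| \le \min(3\pi(1+\rho)/2, \pi) - \epsilon$, $\epsilon > 0$. Then
$$W_{\rho,\mu}(z) = Y^{\frac12-\mu}\,e^{-Y}\left\{\sum_{m=0}^{M-1} A_m Y^{-m} + O(Y^{-M})\right\}, \qquad Y \to \infty,$$
where $Y = (1+\rho)\bigl((-\rho)^{-\rho}y\bigr)^{\frac{1}{1+\rho}}$ and the coefficients $A_m$, $m = 0, 1, \ldots$, are defined by the asymptotic expansion
$$\frac{\Gamma(1-\mu-\rho t)}{2\pi(-\rho)^{-\rho t}(1+\rho)^{(1+\rho)(t+1)}\,\Gamma(t+1)} = \sum_{m=0}^{M-1} \frac{(-1)^m A_m}{\Gamma((1+\rho)t+\mu+\frac12+m)} + O\!\left(\frac{1}{\Gamma((1+\rho)t+\mu+\frac12+M)}\right),$$
valid for $\arg(t)$, $\arg(-\rho t)$, and $\arg(1-\mu-\rho t)$ all lying between $-\pi$ and $\pi$ and $t$ tending to infinity.

Remark 3.3. For $\rho > 0$, the case $z = -x$, $x > 0$, is not covered by Theorem 3.27. Then the following asymptotic formula holds:
$$W_{\rho,\mu}(-x) = x^{p(\frac12-\mu)}\,e^{\sigma x^p\cos p\pi}\,\cos\bigl((\tfrac12-\mu)p\pi + \sigma x^p\sin p\pi\bigr)\,\{c_1 + O(x^{-p})\},$$
where $p = \frac{1}{1+\rho}$, $\sigma = (1+\rho)\rho^{-\frac{\rho}{1+\rho}}$, and the constant $c_1$ can be evaluated exactly. Similarly, for $-1/3 < \rho < 0$, the asymptotic expansion as $z = x \to +\infty$ is not covered, and it is given by
$$W_{\rho,\mu}(x) = x^{p(\frac12-\mu)}\,e^{-\sigma x^p\cos p\pi}\,\cos\bigl((\tfrac12-\mu)p\pi - \sigma x^p\sin p\pi\bigr)\,\{c_2 + O(x^{-p})\},$$
where $p = \frac{1}{1+\rho}$, $\sigma = (1+\rho)(-\rho)^{-\frac{\rho}{1+\rho}}$, and the constant $c_2$ can be evaluated exactly.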
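One simple consequence is the following asymptotic. If $z \in \mathbb{C}$ and $|\arg(z)| \le \pi - \epsilon$ ($0 < \epsilon < \pi$), then the asymptotic behaviour of $W_{\rho,\mu}(z)$ at infinity is given by
$$W_{\rho,\mu}(z) = (2\pi(1+\rho))^{-\frac12}\,(\rho z)^{\frac{\frac12-\mu}{1+\rho}}\,e^{\frac{1+\rho}{\rho}(\rho z)^{\frac{1}{1+\rho}}}\,\bigl(1 + O(z^{-\frac{1}{1+\rho}})\bigr) \quad \text{as } z \to \infty.$$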
There is an interesting link between Wright and Mittag-Leffler functions: the Laplace transform of a Wright function is a Mittag-Leffler function,
$$\begin{aligned}
\mathcal{L}[W_{\rho,\mu}(-x)](z) &= \int_0^\infty e^{-zx}\,\frac{1}{2\pi i}\int_{Ha} e^{\zeta - x\zeta^{-\rho}}\,\zeta^{-\mu}\,d\zeta\,dx = \frac{1}{2\pi i}\int_{Ha} e^{\zeta}\zeta^{-\mu}\int_0^\infty e^{-(z+\zeta^{-\rho})x}\,dx\,d\zeta\\
&= \frac{1}{2\pi i}\int_{Ha} \frac{e^{\zeta}\,\zeta^{-\mu}}{z + \zeta^{-\rho}}\,d\zeta = \begin{cases} z^{-1}E_{\rho,\mu}(-z^{-1}), & \rho > 0,\\ E_{-\rho,\mu-\rho}(-z), & 0 > \rho > -1. \end{cases}
\end{aligned}$$

3.5.1. $M$-Wright function. One specific case of the Wright function relevant to fractional diffusion is the so-called $M$-Wright function, based on the single parameter $\nu$ when $\rho = -\nu$ and $\mu = 1-\nu$,
$$M_\nu(z) = W_{-\nu,1-\nu}(-z) = \sum_{k=0}^{\infty} \frac{(-1)^k z^k}{k!\,\Gamma(1-\nu(k+1))} = \frac{1}{\pi}\sum_{k=1}^{\infty} \frac{(-z)^{k-1}}{(k-1)!}\,\Gamma(k\nu)\,\sin(k\nu\pi), \tag{3.66}$$
where the reflection formula (3.12) shows that the two series are indeed equal. This function was first introduced by Francesco Mainardi [234] in his studies of fractional diffusion, but he was unaware of the original work of Wright and the prior work of Stankovic in 1970 [322]. Thus it is sometimes called the Mainardi function. It is an entire function of order $1/(1-\nu)$; for $\nu = 1/2$ and $\nu = 1/3$ it recovers the familiar Gaussian and Airy functions,
$$M_{\frac12}(z) = \frac{1}{\sqrt{\pi}}\,e^{-\frac{z^2}{4}}, \qquad M_{\frac13}(z) = 3^{\frac23}\,\mathrm{Ai}\!\left(\frac{z}{3^{1/3}}\right). \tag{3.67}$$
We have the following asymptotic behaviour for the $M$-Wright function [238],
$$M_\nu(x) \sim A\,x^{a}\,e^{-b x^{c}}, \qquad x \to \infty, \tag{3.68}$$
with
$$A = \bigl(2\pi(1-\nu)\,\nu^{\frac{1-2\nu}{1-\nu}}\bigr)^{-1/2}, \qquad a = \frac{2\nu-1}{2-2\nu}, \qquad b = (1-\nu)\,\nu^{\frac{\nu}{1-\nu}}, \qquad c = \frac{1}{1-\nu}.$$

Figure 3.6. The $M$-Wright function $M_\nu(-x)$.
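The second series in (3.66) and the leading-order behaviour (3.68) are both easy to evaluate for moderate arguments. The sketch below is only illustrative: the function names and the fixed term count are our assumptions, the series sum starts at $k = 1$, and no attempt is made to control cancellation for large $x$ or to treat $\nu$ close to 1.

```python
import math

def m_wright_series(x, nu, n_terms=80):
    """M_nu(x) from the reflection-formula series in (3.66), 0 < nu < 1, moderate |x|."""
    total = 0.0
    power_over_fact = 1.0  # running value of (-x)**(k-1) / (k-1)!
    for k in range(1, n_terms + 1):
        total += power_over_fact * math.gamma(k * nu) * math.sin(k * nu * math.pi)
        power_over_fact *= -x / k
    return total / math.pi

def m_wright_asymptotic(x, nu):
    """Leading-order behaviour A x^a exp(-b x^c) from (3.68), x > 0."""
    big_a = (2.0 * math.pi * (1.0 - nu) * nu ** ((1.0 - 2.0 * nu) / (1.0 - nu))) ** -0.5
    a = (2.0 * nu - 1.0) / (2.0 - 2.0 * nu)
    b = (1.0 - nu) * nu ** (nu / (1.0 - nu))
    c = 1.0 / (1.0 - nu)
    return big_a * x ** a * math.exp(-b * x ** c)
```

For $\nu = \frac12$ the constants reduce to $A = \pi^{-1/2}$, $a = 0$, $b = \frac14$, $c = 2$, so (3.68) reproduces the Gaussian in (3.67) exactly, which gives a convenient check of both routines.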
Figure 3.6 shows the plots of $M_\nu(x)$ for selected cases in the range $0 < \nu \le \frac12$ and in the more delicate range $(\frac12, 1)$. As we will see, the lower range corresponds to an $\alpha$ value of the fractional operator in the range $0 < \alpha \le 1$; $\nu \in (\frac12, 1)$ corresponds to $1 < \alpha < 2$. As $\nu \to 1$, $M_\nu(x)$ tends to the Dirac-$\delta$ distribution centred at $x = 1$. Computation of the Wright function for $\nu$ near unity is quite delicate (as is the Wright function in general for complex arguments). Next we derive the Laplace and Fourier transforms of the $M$-Wright function. The Laplace transform of an $M$-Wright function is a Mittag-Leffler function.

Theorem 3.28. For $\nu \in (0,1)$, $\mathcal{L}[M_\nu(x)](z) = E_\nu(-z)$.

Proof. By the integral representation (3.65),
$$\begin{aligned}
\int_0^\infty e^{-zx} M_\nu(x)\,dx &= \frac{1}{2\pi i}\int_0^\infty e^{-zx}\int_{Ha} e^{\zeta - x\zeta^{\nu}}\,\zeta^{\nu-1}\,d\zeta\,dx = \frac{1}{2\pi i}\int_{Ha} \zeta^{\nu-1}e^{\zeta}\int_0^\infty e^{-x(z+\zeta^{\nu})}\,dx\,d\zeta\\
&= \frac{1}{2\pi i}\int_{Ha} \frac{\zeta^{\nu-1}e^{\zeta}}{\zeta^{\nu}+z}\,d\zeta = E_\nu(-z).
\end{aligned}$$

To derive the Fourier transform, we first show an important identity.
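Lemma 3.4. For any $k > -1$ and $0 < \nu < 1$, there holds
$$\int_0^\infty x^k M_\nu(x)\,dx = \frac{\Gamma(k+1)}{\Gamma(\nu k+1)}.$$

Proof. Using the representation (3.65) of the Wright function, we deduce
$$\int_0^\infty x^k M_\nu(x)\,dx = \frac{1}{2\pi i}\int_0^\infty x^k \int_{Ha} e^{z - xz^{\nu}}\,\frac{dz}{z^{1-\nu}}\,dx = \frac{1}{2\pi i}\int_{Ha} z^{\nu-1}e^{z}\int_0^\infty x^k e^{-xz^{\nu}}\,dx\,dz.$$
By the change of variable $s = xz^{\nu}$, the inner integral is given by
$$\int_0^\infty x^k e^{-xz^{\nu}}\,dx = z^{-(k+1)\nu}\int_0^\infty s^k e^{-s}\,ds = z^{-(k+1)\nu}\,\Gamma(k+1).$$
Consequently,
$$\int_0^\infty x^k M_\nu(x)\,dx = \Gamma(k+1)\,\frac{1}{2\pi i}\int_{Ha} z^{-(k\nu+1)}e^{z}\,dz = \frac{\Gamma(k+1)}{\Gamma(k\nu+1)},$$
where we have used the integral representation of the reciprocal Gamma function; cf. (3.23), (3.38).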
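Theorem 3.29. For $\nu \in (0,1)$, the Fourier transform of $M_\nu(|x|)$ is given by
$$\mathcal{F}[M_\nu(|x|)](\xi) = 2E_{2\nu}(-\xi^2).$$

Proof. Using the series expansion $\cos \xi x = \sum_{k=0}^{\infty} (-1)^k \frac{(\xi x)^{2k}}{(2k)!}$, we deduce
$$\mathcal{F}[M_\nu(|x|)](\xi) = \int_{-\infty}^{\infty} e^{-i\xi x} M_\nu(|x|)\,dx = 2\int_0^\infty M_\nu(x)\cos \xi x\,dx = 2\sum_{k=0}^{\infty} (-1)^k \frac{\xi^{2k}}{(2k)!}\int_0^\infty x^{2k} M_\nu(x)\,dx.$$
Now using Lemma 3.4, we have
$$\mathcal{F}[M_\nu(|x|)](\xi) = 2\sum_{k=0}^{\infty} \frac{(-1)^k\,\xi^{2k}}{\Gamma(2k\nu+1)} = 2E_{2\nu}(-\xi^2).$$
This completes the proof of the theorem.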
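As a sanity check (ours, not part of the original argument), the identities of Theorem 3.28, Lemma 3.4, and Theorem 3.29 can be verified by hand for $\nu = \frac12$, where $M_{1/2}$ is the Gaussian from (3.67) and the known closed forms $E_{1/2}(-z) = e^{z^2}\operatorname{erfc}(z)$ and $E_1(-\xi^2) = e^{-\xi^2}$ are available:

```latex
% nu = 1/2:  M_{1/2}(x) = \pi^{-1/2} e^{-x^2/4}
\begin{align*}
% Theorem 3.28 (Laplace transform), completing the square in the exponent
\int_0^\infty e^{-zx}\,\frac{e^{-x^2/4}}{\sqrt{\pi}}\,dx
  &= e^{z^2}\,\frac{1}{\sqrt{\pi}}\int_0^\infty e^{-(x/2+z)^2}\,dx
   = e^{z^2}\operatorname{erfc}(z) = E_{1/2}(-z), \\
% Lemma 3.4 with k = 2
\int_0^\infty x^2\,\frac{e^{-x^2/4}}{\sqrt{\pi}}\,dx
  &= \frac{1}{\sqrt{\pi}}\cdot 2\sqrt{\pi} = 2 = \frac{\Gamma(3)}{\Gamma(2)}, \\
% Theorem 3.29 (Fourier transform of a Gaussian)
\int_{-\infty}^{\infty} e^{-i\xi x}\,\frac{e^{-x^2/4}}{\sqrt{\pi}}\,dx
  &= \frac{1}{\sqrt{\pi}}\,\sqrt{4\pi}\,e^{-\xi^2} = 2e^{-\xi^2} = 2E_{1}(-\xi^2).
\end{align*}
```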
3.5.2. Computing the Wright function. The effective computation of the Wright function Wρ,ν (z) is nontrivial. In theory, as before in the case of the Mittag-Leffler function, it can be computed using power series for small values of the argument and a known asymptotic formula for large values, while for the intermediate case, values are obtained by using an integral representation. However, an algorithm that works on the whole complex plane with rigorous error analysis is still missing. In the case of a real argument, some preliminary analysis has been done [230], which we
describe below. The following representation formulas are derived from the representation (3.65) by suitably deforming the Hankel contour $Ha$. They can be used to compute the Wright function with a real argument, especially the $M$-Wright function $M_\nu$.

Theorem 3.30. The following integral representations hold.

(i) Let $z = -x$, $x > 0$. Then the function $W_{\rho,\nu}(-x)$ is given by
$$W_{\rho,\nu}(-x) = \frac{1}{\pi}\int_0^\infty K(\rho,\nu,-x,r)\,dr,$$
if $-1 < \rho < 0$ and $\nu < 1$, or $0 < \rho < 1/2$, or $\rho = 1/2$ and $\nu < 1+\rho$,
$$W_{\rho,\nu}(-x) = e + \frac{1}{\pi}\int_0^\infty K(\rho,\nu,-x,r)\,dr, \quad \text{if } -1 < \rho < 0 \text{ and } \nu = 1,$$
and
$$W_{\rho,\nu}(-x) = \frac{1}{\pi}\int_1^\infty K(\rho,\nu,-x,r)\,dr + \frac{1}{\pi}\int_0^\pi \tilde{P}(\rho,\nu,-x,\varphi)\,d\varphi$$
in all other cases, with
$$K(\rho,\nu,x,r) = e^{-r + xr^{-\rho}\cos\rho\pi}\,r^{-\nu}\,\sin(xr^{-\rho}\sin\rho\pi + \nu\pi),$$
$$\tilde{P}(\rho,\nu,x,\varphi) = e^{\cos\varphi + x\cos\rho\varphi + \cos(\nu-1)\varphi}\,\cos(\sin\varphi - x\sin\rho\varphi - \sin(\nu-1)\varphi).$$

(ii) Let $z = x > 0$. Then the function $W_{\rho,\mu}(x)$ has the integral representation
$$W_{\rho,\mu}(x) = \frac{1}{\pi}\int_0^\infty K(\rho,\mu,x,r)\,dr, \quad \text{if } -1 < \rho < 0 \text{ and } \mu < 1,$$
$$W_{\rho,\mu}(x) = e + \frac{1}{\pi}\int_0^\infty K(\rho,\mu,x,r)\,dr, \quad \text{if } -1 < \rho < 0 \text{ and } \mu = 1,$$
$$W_{\rho,\mu}(x) = \frac{1}{\pi}\int_1^\infty K(\rho,\mu,x,r)\,dr + \frac{1}{\pi}\int_0^\pi \tilde{P}(\rho,\mu,x,\varphi)\,d\varphi$$
in all other cases, where the functions $K$ and $\tilde{P}$ are the same as before.

The integral representations in Theorem 3.30 give one way to compute the Wright function with a real argument. In principle, these integrals can be evaluated by any quadrature rule. It is worth noting that the kernel $K(\rho,\nu,x,r)$ is singular with a leading order $r^{-\nu}$, with successive singular kernels. Hence, a direct treatment via numerical quadrature can be very inefficient. Depending on the parameter $\rho$, a suitable transformation might be needed. For example, for $\rho < 0$, a more efficient approach is to use the change of variable $s = r^{-\rho}$ (i.e., $r = s^{-1/\rho}$), and the transformed kernel is
$$\tilde{K}(\rho,\mu,x,s) = (-\rho)^{-1}\,s^{(-\rho)^{-1}(-\mu+1)-1}\,e^{-s^{-1/\rho} + x\cos(\pi\rho)s}\,\sin(x\sin(\pi\rho)s + \pi\mu).$$
The case directly relevant to fractional diffusion is $\rho = -\alpha/2 < 0$, $0 < \mu = 1+\rho < 1$ (and $z = -x$, $x > 0$). With the choice $\rho = -\alpha/2$ and $\mu = 1+\rho$, the transformed kernel simplifies to
$$\tilde{K}(\rho,1+\rho,x,s) = (-\rho)^{-1}\,e^{-s^{-1/\rho} + x\cos(\pi\rho)s}\,\sin(x\sin(\pi\rho)s + (1+\rho)\pi).$$
Now the kernel is free from the grave singularity, and the integral can be computed efficiently using Gauss–Jacobi quadrature with the weight function $s^{(-\rho)^{-1}(1-\mu)-1}$.

Figure 3.7. Plots of $E_{\alpha,1}(-\lambda t^\alpha)$ with $\lambda = \pi^2$ and $M_{\alpha/2}$, for $\alpha = 1/4, 1/2, 3/4, 1$.
We close this chapter by showing, in Figure 3.7, some plots of special functions that we have discussed here. On the left, the Mittag-Leffler function $E_{\alpha,1}(-\lambda t^\alpha)$ with $\lambda = \pi^2$, that is, the solution of the subdiffusion equation with the first eigenfunction of the Laplace operator as initial condition. On the right, the $M$-Wright function $M_{\alpha/2}(x)$, that is, the fundamental solution of the subdiffusion equation for fixed $t = 1$. Except for the case $\mu = 1/2$, for which the function is infinitely differentiable, generally it shows a clear “kink” at the origin, indicating the (potential) limited smoothing properties of any related fractional differential equation.
Chapter 4
Fractional Calculus
In this chapter, we discuss fractional-order integrals and derivatives in some depth and this will serve as the main modelling tool in fractional diffusion. As noted in Chapter 1 the history of fractional calculus is as old as calculus itself, but only in recent decades has it found applications in physics and thus received revived interest sufficient to merit comprehensive study.
4.1. Fractional integrals

The Abel fractional integral can be considered a natural generalisation of Cauchy's well-known iterated integral formula. Let $a, b \in \mathbb{R}$ with $a < b$, and denote $D = (a,b)$. Then for any $n \in \mathbb{N}$, the $n$-fold integral ${}_aI_x^n$, based at the left endpoint $x = a$, is defined recursively by
$${}_aI_x^0 f(x) = f(x), \qquad {}_aI_x^n f(x) = \int_a^x {}_aI_s^{n-1} f(s)\,ds, \quad n = 1, 2, \ldots.$$
We claim that the following iterated integral formula holds for $n \ge 1$,
$${}_aI_x^n f(x) = \frac{1}{(n-1)!}\int_a^x (x-s)^{n-1} f(s)\,ds, \tag{4.1}$$
and we verify this by mathematical induction. Clearly, this formula holds for $n = 1$ since $\frac{(x-s)^0}{0!} = 1$. Now assume that formula (4.1) holds for some $n \ge 1$. Then by definition and a change of integration order, we deduce that
$$\begin{aligned}
{}_aI_x^{n+1} f(x) &= \int_a^x {}_aI_s^n f(s)\,ds = \frac{1}{(n-1)!}\int_a^x\!\int_a^s (s-t)^{n-1} f(t)\,dt\,ds\\
&= \frac{1}{(n-1)!}\int_a^x\!\int_t^x (s-t)^{n-1}\,ds\,f(t)\,dt = \frac{1}{n!}\int_a^x (x-s)^n f(s)\,ds,
\end{aligned}$$
where the last identity follows from
$$\frac{1}{(n-1)!}\int_t^x (s-t)^{n-1}\,ds = \frac{1}{n!}(x-t)^n.$$
This shows the desired formula (4.1). Likewise, for the $n$-fold integral based at the right endpoint $x = b$, Cauchy's integral formula also holds:
$${}_xI_b^n f(x) = \frac{1}{(n-1)!}\int_x^b (s-x)^{n-1} f(s)\,ds. \tag{4.2}$$
One way to generalise these formulas from an integer $n$ to a real number $\alpha$ is to use the Gamma function $\Gamma(z)$ defined in (3.9), which generalises the factorial through the identity $(n-1)! = \Gamma(n)$. Then for any $\alpha > 0$, by replacing $n$ in the formulas with $\alpha$, and $(n-1)!$ with $\Gamma(\alpha)$, we arrive at the following definition of Abel fractional integrals.

Definition 4.1. For any $f \in L^1(D)$, the left sided Abel fractional integral of order $\alpha > 0$, denoted by ${}_aI_x^\alpha f$, is defined by
$$({}_aI_x^\alpha f)(x) = \frac{1}{\Gamma(\alpha)}\int_a^x (x-s)^{\alpha-1} f(s)\,ds, \tag{4.3}$$
and the right sided Abel fractional integral of order $\alpha > 0$, denoted by ${}_xI_b^\alpha f$, is defined by
$$({}_xI_b^\alpha f)(x) = \frac{1}{\Gamma(\alpha)}\int_x^b (s-x)^{\alpha-1} f(s)\,ds. \tag{4.4}$$

In view of formulas (4.1) and (4.2), when $\alpha = n$, this definition is consistent with the $n$-fold integrals ${}_aI_x^n$ and ${}_xI_b^n$; thus they indeed generalise the $n$-fold integrals to the fractional case. In case $\alpha = 0$, we adopt the convention ${}_aI_x^0 f(x) = f(x)$, i.e., the identity operator. Further, we define for $x \to a^+$
$$({}_aI_x^\alpha f)(a^+) := \lim_{x \to a^+} ({}_aI_x^\alpha f)(x),$$
if the limit on the right-hand side exists, and likewise
$$({}_xI_b^\alpha f)(b^-) := \lim_{x \to b^-} ({}_xI_b^\alpha f)(x).$$

The Abel integral operators ${}_aI_x^\alpha$ and ${}_xI_b^\alpha$ inherit a number of important properties from integer-order integrals. Consider power functions $(x-a)^\gamma$ with $\gamma > -1$, where the choice $\gamma > -1$ ensures that $(x-a)^\gamma \in L^1(D)$. It can be verified directly that for $\alpha > 0$ and $x > a$,
$${}_aI_x^\alpha (x-a)^\gamma = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+\alpha+1)}\,(x-a)^{\gamma+\alpha}, \tag{4.5}$$
and similarly for $x < b$,
$${}_xI_b^\alpha (b-x)^\gamma = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+\alpha+1)}\,(b-x)^{\gamma+\alpha}.$$
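The identity (4.5) follows by the change of variables $s \mapsto \frac{s-a}{x-a}$:
$$\begin{aligned}
{}_aI_x^\alpha (x-a)^\gamma &= \frac{1}{\Gamma(\alpha)}\int_a^x (x-s)^{\alpha-1}(s-a)^\gamma\,ds = \frac{(x-a)^{\gamma+\alpha}}{\Gamma(\alpha)}\int_0^1 (1-s)^{\alpha-1}s^\gamma\,ds\\
&= \frac{B(\alpha,\gamma+1)}{\Gamma(\alpha)}\,(x-a)^{\gamma+\alpha} = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+\alpha+1)}\,(x-a)^{\gamma+\alpha},
\end{aligned}$$
where $B(\cdot,\cdot)$ is the Beta function defined in (3.19). Here the last line follows from the relation between the Gamma and Beta functions, cf. (3.19). Clearly, for any integer $\alpha \in \mathbb{N}$, these formulas recover the integral counterparts.

Example 4.1. In calculus, the primitive of the exponential function $f(x) = e^{\lambda x}$, $\lambda \in \mathbb{R}$, is still an exponential function. In this example, we compute the left-sided $\alpha$th order Abel fractional integral of $f(x) = e^{\lambda x}$ (based at $a = 0$). By (4.5), we have
$${}_0I_x^\alpha e^{\lambda x} = {}_0I_x^\alpha \sum_{k=0}^{\infty} \frac{(\lambda x)^k}{\Gamma(k+1)} = \sum_{k=0}^{\infty} \lambda^k\,\frac{{}_0I_x^\alpha x^k}{\Gamma(k+1)} = \sum_{k=0}^{\infty} \lambda^k\,\frac{x^{k+\alpha}}{\Gamma(k+\alpha+1)} = x^\alpha \sum_{k=0}^{\infty} \frac{(\lambda x)^k}{\Gamma(k+\alpha+1)}. \tag{4.6}$$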
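Both (4.5) and (4.6) can be checked numerically. The sketch below is one possible approach, not a method taken from the text: it assumes $f$ is continuous on $[a,x]$ and removes the kernel singularity through the substitution $u = (x-s)^\alpha$, which turns (4.3) into $\frac{1}{\Gamma(\alpha+1)}\int_0^{(x-a)^\alpha} f(x - u^{1/\alpha})\,du$, followed by composite Simpson's rule. The name `abel_integral` and the default `n` are hypothetical.

```python
import math

def abel_integral(f, a, x, alpha, n=2000):
    """Left-sided Abel fractional integral (a I_x^alpha f)(x) for alpha > 0, x >= a."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    upper = (x - a) ** alpha
    if upper == 0.0:
        return 0.0
    h = upper / n

    def g(u):
        # Substitution u = (x - s)**alpha; clamp guards against rounding below a.
        s = max(a, x - u ** (1.0 / alpha))
        return f(s)

    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / (3.0 * math.gamma(alpha + 1.0))
```

With $\alpha = \frac12$, $a = 0$, $x = 1$, the routine should reproduce $\Gamma(3)/\Gamma(3.5)$ for $f(s) = s^2$ by (4.5), and $e\operatorname{erf}(1)$ for $f(s) = e^{s}$, which is the value of the series (4.6) at $\lambda = x = 1$. When $1/\alpha$ is not an integer the transformed integrand is less smooth at $u = 0$, so convergence of Simpson's rule can be noticeably slower.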
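Example 4.2. Another consequence of (4.5) is the fact that the Abel fractional integrals of composite Mittag-Leffler functions are again Mittag-Leffler functions; for $\gamma > 0$,
$${}_0I_x^\gamma\, x^{\beta-1}E_{\alpha,\beta}(\lambda x^\alpha) = x^{\beta+\gamma-1}E_{\alpha,\beta+\gamma}(\lambda x^\alpha).$$
Since the function $E_{\alpha,\beta}(z)$ is an entire function (cf. Proposition 3.1), we can integrate termwise and use (4.5) to obtain
$${}_0I_x^\gamma\, x^{\beta-1}E_{\alpha,\beta}(\lambda x^\alpha) = {}_0I_x^\gamma \sum_{k=0}^{\infty} \frac{\lambda^k x^{k\alpha+\beta-1}}{\Gamma(k\alpha+\beta)} = \sum_{k=0}^{\infty} \frac{\lambda^k\,{}_0I_x^\gamma x^{k\alpha+\beta-1}}{\Gamma(k\alpha+\beta)} = \sum_{k=0}^{\infty} \frac{\lambda^k x^{k\alpha+\beta+\gamma-1}}{\Gamma(k\alpha+\beta+\gamma)} = x^{\beta+\gamma-1}E_{\alpha,\beta+\gamma}(\lambda x^\alpha).$$

By definition, the following semigroup property holds for $n$-fold integrals:
$${}_aI_x^n\,{}_aI_x^m f = {}_aI_x^{n+m} f \qquad \forall \text{ integers } m, n \ge 0.$$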
Similarly, we have the following semigroup property (and as a byproduct, also the commutativity) for the Abel fractional integrals.
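Lemma 4.1. For $f \in L^1(D)$, $\alpha, \beta \ge 0$, there holds
$${}_aI_x^\alpha\,{}_aI_x^\beta f = {}_aI_x^\beta\,{}_aI_x^\alpha f = {}_aI_x^{\alpha+\beta} f, \qquad {}_xI_b^\alpha\,{}_xI_b^\beta f = {}_xI_b^\beta\,{}_xI_b^\alpha f = {}_xI_b^{\alpha+\beta} f.$$

Proof. The case $\alpha = 0$ or $\beta = 0$ is trivial, by the convention that ${}_aI_x^0$ is the identity operator. It suffices to consider $\alpha, \beta > 0$ and the identity ${}_aI_x^\alpha\,{}_aI_x^\beta f = {}_aI_x^{\alpha+\beta} f$. By the definition and a change of integration order, we have
$$\begin{aligned}
({}_aI_x^\alpha\,{}_aI_x^\beta f)(x) &= \frac{1}{\Gamma(\alpha)\Gamma(\beta)}\int_a^x (x-s)^{\alpha-1}\int_a^s (s-t)^{\beta-1} f(t)\,dt\,ds\\
&= \frac{1}{\Gamma(\alpha)\Gamma(\beta)}\int_a^x f(t)\left(\int_t^x (x-s)^{\alpha-1}(s-t)^{\beta-1}\,ds\right)dt.
\end{aligned}$$
With a change of variables, the integral in the bracket can be simplified to
$$\int_t^x (x-s)^{\alpha-1}(s-t)^{\beta-1}\,ds = (x-t)^{\alpha+\beta-1}\int_0^1 (1-s)^{\alpha-1}s^{\beta-1}\,ds = B(\alpha,\beta)\,(x-t)^{\alpha+\beta-1},$$
where $B(\cdot,\cdot)$ is the Beta function. Then by (3.19), we have
$$({}_aI_x^\alpha\,{}_aI_x^\beta f)(x) = \frac{B(\alpha,\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_a^x (x-t)^{\alpha+\beta-1} f(t)\,dt = ({}_aI_x^{\alpha+\beta} f)(x).$$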
The Abel fractional integral operators are bounded in $L^p(D)$ spaces for $1 \le p \le \infty$. In particular, this implies that for any $f \in L^1(D)$, the fractional integrals ${}_aI_x^\alpha f$ and ${}_xI_b^\alpha f$ of order $\alpha > 0$ exist almost everywhere, and thus these integrals are well-defined for functions in $L^1(D)$.

Lemma 4.2. Let $\alpha > 0$. Then the fractional integral operators ${}_aI_x^\alpha$ and ${}_xI_b^\alpha$ are bounded on $L^p(D)$ for any $1 \le p \le \infty$.

Proof. By Young's inequality for convolutions, (A.15) in Section A.3, we deduce
$$\|{}_aI_x^\alpha f\|_{L^p(D)} \le \left\|\frac{x^{\alpha-1}}{\Gamma(\alpha)}\right\|_{L^1(D)} \|f\|_{L^p(D)} = \frac{(b-a)^\alpha}{\Gamma(\alpha+1)}\,\|f\|_{L^p(D)},$$
showing the assertion for ${}_aI_x^\alpha$. The case ${}_xI_b^\alpha$ is analogous.
Remark 4.1. With a more careful application of Young’s inequality, one can deduce that for α ∈ (0, 1) and 1 < p < 1/α, a Ixα and x Ibα are also bounded from Lp (D) to Lq (D), with q = p/(1 − αp), which in the literature is commonly known as the Hardy–Littlewood or the Hardy–Littlewood– Sobolev inequality [308].
The following useful result is commonly referred to as the fractional integration by parts formula. In particular, it shows that the left and right sided Abel integral operators are adjoint to each other in $L^2(D)$.

Lemma 4.3. For any $f \in L^p(D)$, $g \in L^q(D)$, p, q ≥ 1 with $p^{-1} + q^{-1} \le 1 + \alpha$ (and p, q ≠ 1 if $p^{-1} + q^{-1} = 1 + \alpha$), the following identity holds:
$$\int_a^b g(x)\,({}_aI_x^\alpha f)(x)\,dx = \int_a^b f(x)\,({}_xI_b^\alpha g)(x)\,dx.$$
Proof. We only sketch the proof for $f, g \in L^2(D)$, and for the general case we refer to [308]. By Lemma 4.2, for $f \in L^2(D)$ we have ${}_aI_x^\alpha f \in L^2(D)$, and hence we deduce immediately
$$\int_a^b |g(x)|\,|({}_aI_x^\alpha f)(x)|\,dx \le c\,\|f\|_{L^2(D)}\|g\|_{L^2(D)}.$$
Now the desired formula follows from Fubini’s theorem.
4.2. Fractional derivatives

Now we turn to fractional derivatives and their analytical properties. Formally, it amounts to replacing α with −α, for α > 0, in the fractional integrals. Clearly, one has to make sure that the resulting integrals do make sense. There are many different definitions of fractional derivatives, and they are generally not equivalent on bounded intervals. We shall only discuss the Riemann–Liouville and Djrbashian–Caputo fractional derivatives, which represent the two most popular choices in practice, and we briefly mention the Grünwald–Letnikov fractional derivative, which is often used in numerical approximation. Throughout, we assume α > 0, and the integer n ∈ N is related to α by n − 1 < α ≤ n, denoted by $n = \lceil\alpha\rceil$, the smallest integer larger than or equal to α. As before, we assume a, b ∈ R, a < b, and set D = (a, b). Further, for n ∈ N, we often denote by $f^{(n)}$ the nth order derivative of the function f.

4.2.1. Riemann–Liouville fractional derivative. We begin with the Riemann–Liouville fractional derivative, since it is mathematically more tractable.

Definition 4.2. For $f \in L^1(D)$, the left sided Riemann–Liouville fractional derivative of order α, denoted by ${}^{RL}_aD_x^\alpha f$ (based at x = a), with $n = \lceil\alpha\rceil$, is defined by
$${}^{RL}_aD_x^\alpha f(x) := \frac{d^n}{dx^n}({}_aI_x^{n-\alpha} f)(x) = \frac{1}{\Gamma(n-\alpha)}\frac{d^n}{dx^n}\int_a^x (x-s)^{n-\alpha-1} f(s)\,ds,$$
if the integral on the right-hand side exists. The right sided Riemann–Liouville fractional derivative of order α, denoted by ${}^{RL}_xD_b^\alpha f$ (based at x = b), is defined by
$${}^{RL}_xD_b^\alpha f(x) := (-1)^n\frac{d^n}{dx^n}({}_xI_b^{n-\alpha} f)(x) = \frac{(-1)^n}{\Gamma(n-\alpha)}\frac{d^n}{dx^n}\int_x^b (s-x)^{n-\alpha-1} f(s)\,ds,$$
if the integral on the right-hand side exists.

Due to the presence of the fractional integral ${}_aI_x^{n-\alpha}$, the Riemann–Liouville fractional derivative ${}^{RL}_aD_x^\alpha f$ is inherently nonlocal: the value of the derivative at x > a depends on the values of the function f from a until x. The nonlocality dramatically influences its analytical properties. This represents one distinct feature of many fractional differential operators.

To illustrate some of the features of fractional derivatives, we will compute the derivatives of the power function. Given any α > 0, we want to determine ${}^{RL}_aD_x^\alpha (x-a)^\gamma$, γ > −1. By (4.5), we have
$${}_aI_x^{n-\alpha}(x-a)^\gamma = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1+n-\alpha)}\,(x-a)^{\gamma+n-\alpha},$$
which, upon differentiation together with the recursion formula for the Gamma function Γ(z + 1) = zΓ(z) (cf. (3.10)), yields
$${}^{RL}_aD_x^\alpha (x-a)^\gamma = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1-\alpha)}\,(x-a)^{\gamma-\alpha}, \qquad x > a. \tag{4.7}$$
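Formula (4.7) is easy to evaluate numerically. The following is a minimal sketch (the function names are hypothetical, and the convention 1/Γ(z) = 0 at the poles z = 0, −1, −2, . . . is implemented by hand because `math.gamma` raises an error there):

```python
import math

def rl_power_coefficient(gamma_exp, alpha):
    """Coefficient Gamma(gamma+1)/Gamma(gamma+1-alpha) from (4.7), for gamma > -1.

    Returns 0.0 when gamma+1-alpha is a pole of the Gamma function.
    """
    arg = gamma_exp + 1.0 - alpha
    if arg <= 0 and abs(arg - round(arg)) < 1e-12:
        return 0.0  # 1/Gamma vanishes at 0, -1, -2, ...
    return math.gamma(gamma_exp + 1.0) / math.gamma(arg)

def rl_power_derivative(gamma_exp, alpha, x, a=0.0):
    """Left sided Riemann-Liouville derivative of (x-a)^gamma at x > a via (4.7)."""
    return rl_power_coefficient(gamma_exp, alpha) * (x - a) ** (gamma_exp - alpha)

if __name__ == "__main__":
    # Half derivative of the constant function 1 at x = 1: 1/sqrt(pi), not zero.
    print(rl_power_derivative(0.0, 0.5, 1.0))
    # Integer order recovers the classical rule: d/dx x^2 at x = 3 is 6.
    print(rl_power_derivative(2.0, 1.0, 3.0))
```

The pole check uses a small tolerance because the argument is a floating-point number; for exact arithmetic one would test for integrality directly.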
In (4.7), the argument for the Gamma function can be negative. Obviously, for any integer α ∈ N, the formula recovers the formula for integer-order derivatives. However, one can say more, and draw a number of interesting observations. First, for α ∉ N, the Riemann–Liouville fractional derivative of order α of the constant function f(x) ≡ 1 is nonzero, since for x > a,
$${}^{RL}_aD_x^\alpha 1 = \frac{1}{\Gamma(1-\alpha)}\,(x-a)^{-\alpha}. \tag{4.8}$$
This last fact is particularly inconvenient for practical applications involving initial/boundary conditions, where the interpretation of (4.8) then becomes unclear. Nonetheless, when α ∈ N, since 0, −1, −2, . . . are the poles of the Gamma function, the right-hand side of (4.8) vanishes, and thus it recovers the identity $\frac{d^n}{dx^n} 1 = 0$. Second, given α > 0, for any γ = α − 1, α − 2, . . . , α − n, the number γ + 1 − α is a negative integer or zero, which are poles of the Gamma function (i.e., 1/Γ(γ + 1 − α) = 0), and thus the right-hand side of (4.7) vanishes identically, i.e.,
$${}^{RL}_aD_x^\alpha (x-a)^{\alpha-j} \equiv 0, \qquad j = 1, \ldots, n. \tag{4.9}$$
In particular, this identity immediately implies that if $f, g \in L^1(D)$, then
$${}^{RL}_aD_x^\alpha f(x) = {}^{RL}_aD_x^\alpha g(x) \quad\Leftrightarrow\quad f(x) = g(x) + \sum_{j=1}^{n} c_j (x-a)^{\alpha-j},$$
where $c_j$, j = 1, 2, . . . , n, are arbitrary constants.

Example 4.3. The integer-order derivative of an exponential function is still an exponential function. In this example, we compute the αth order Riemann–Liouville derivative of the exponential function $f(x) = e^{\lambda x}$, λ ∈ R. By (4.7) and an interchange of the Riemann–Liouville fractional derivative and the infinite sum, we deduce
$${}^{RL}_0D_x^\alpha e^{\lambda x} = {}^{RL}_0D_x^\alpha \sum_{k=0}^\infty \frac{(\lambda x)^k}{\Gamma(k+1)} = \sum_{k=0}^\infty \frac{\lambda^k\,{}^{RL}_0D_x^\alpha x^k}{\Gamma(k+1)} = \sum_{k=0}^\infty \frac{\lambda^k x^{k-\alpha}}{\Gamma(k+1-\alpha)} = x^{-\alpha}\sum_{k=0}^\infty \frac{(\lambda x)^k}{\Gamma(k+1-\alpha)} = x^{-\alpha}E_{1,1-\alpha}(\lambda x). \tag{4.10}$$
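The series in (4.10) can be summed directly for moderate values of λx. The sketch below (hypothetical function names; a truncated series with a fixed number of terms, which is an assumption suitable only for moderate |λx|) evaluates $x^{-\alpha}E_{1,1-\alpha}(\lambda x)$ and can be compared against the classical case α = 1, where the result must be $\lambda e^{\lambda x}$:

```python
import math

def reciprocal_gamma(z):
    """1/Gamma(z), with the value 0 at the poles z = 0, -1, -2, ..."""
    if z <= 0 and abs(z - round(z)) < 1e-12:
        return 0.0
    return 1.0 / math.gamma(z)

def rl_derivative_exp(alpha, lam, x, terms=80):
    """Truncated series x^(-alpha) * sum_k (lam*x)^k / Gamma(k+1-alpha), x > 0.

    This is the right-hand side of (4.10) with base point a = 0.
    """
    total = 0.0
    for k in range(terms):
        total += (lam * x) ** k * reciprocal_gamma(k + 1.0 - alpha)
    return x ** (-alpha) * total

if __name__ == "__main__":
    lam, x = 0.7, 1.3
    print(rl_derivative_exp(1.0, lam, x), lam * math.exp(lam * x))  # classical case
    print(rl_derivative_exp(0.5, lam, x))                           # half derivative
```

For large |λx| the plain power series suffers from cancellation and overflow, and a dedicated Mittag-Leffler routine would be preferable.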
Clearly, the right-hand side is generally no longer an exponential function if α ∉ N. Nonetheless, it can be succinctly expressed as $x^{-\alpha}E_{1,1-\alpha}(\lambda x)$ in terms of the Mittag-Leffler function $E_{\alpha,\beta}(z)$ discussed in Section 3.4. The motivation for Gösta Mittag-Leffler to study the eponymous function in his seminal papers beginning in 1903 was to provide a generalisation of the exponential function, but how intrinsically this was connected to fractional calculus was not appreciated until much later.

Remark 4.2. Example 4.3 for the exponential function can be used to compute both the Riemann–Liouville fractional derivatives ${}^{RL}_0D_x^\alpha \sin(\lambda x)$ and ${}^{RL}_0D_x^\alpha \cos(\lambda x)$. It can be shown directly from the series expansion of the cosine function that ${}^{RL}_0D_x^\alpha \cos(\lambda x) = x^{-\alpha}E_{2,1-\alpha}(-\lambda^2x^2)$. This can also be repeated for sin(λx).

Example 4.4. Given α > 0 and any λ ∈ R, consider the Mittag-Leffler function,
$$f(x) = x^{\alpha-1}E_{\alpha,\alpha}(\lambda x^\alpha) := x^{\alpha-1}\sum_{k=0}^\infty \frac{(\lambda x^\alpha)^k}{\Gamma(k\alpha+\alpha)}.$$
Then the Riemann–Liouville derivative ${}^{RL}_0D_x^\alpha f$ is given by
$${}^{RL}_0D_x^\alpha f(x) = {}^{RL}_0D_x^\alpha\, x^{\alpha-1}\sum_{k=0}^\infty \frac{(\lambda x^\alpha)^k}{\Gamma(k\alpha+\alpha)} = \sum_{k=0}^\infty \frac{\lambda^k\,{}^{RL}_0D_x^\alpha x^{(k+1)\alpha-1}}{\Gamma(k\alpha+\alpha)} = \sum_{k=1}^\infty \frac{\lambda^k x^{k\alpha-1}}{\Gamma(k\alpha)} = \lambda x^{\alpha-1}E_{\alpha,\alpha}(\lambda x^\alpha),$$
where we have used (4.9). That is, the function $x^{\alpha-1}E_{\alpha,\alpha}(\lambda x^\alpha)$ is invariant under the fractional differential operator ${}^{RL}_0D_x^\alpha$, or equivalently it is an eigenfunction of ${}^{RL}_0D_x^\alpha$.

Remark 4.3. It is important to point out that in Examples 4.3 and 4.4 the formulae obtained depend on the starting value a of the derivative, in these cases the origin. As an example, if we take a = −∞, then some modest calculations show that
$${}^{RL}_{-\infty}D_x^\alpha e^{\lambda x} = \lambda^\alpha e^{\lambda x}, \qquad \lambda > 0. \tag{4.11}$$
The Riemann–Liouville derivative with starting point a = −∞ is usually called the Weyl derivative, and in many cases one sees reference to formula (4.11) as being the fractional derivative of the exponential function. It is indeed, but only if it is clearly understood that the derivative is of Riemann–Liouville type and with the endpoint a = −∞.

Next we examine analytical properties of Riemann–Liouville fractional derivatives more closely. In calculus, both integral and differential operators satisfy the commutativity and semigroup property. This is also true for the fractional integral operator, cf. Lemma 4.1. However, in general, the fractional differential operators do not satisfy the semigroup property or even the weaker commutativity, due to their nonlocality, as shown by the following example.

Example 4.5. One can construct functions f such that
$${}^{RL}_aD_x^\alpha\,{}^{RL}_aD_x^\beta f(x) = {}^{RL}_aD_x^\beta\,{}^{RL}_aD_x^\alpha f(x) \ne {}^{RL}_aD_x^{\alpha+\beta} f(x),$$
$${}^{RL}_aD_x^\alpha\,{}^{RL}_aD_x^\beta f(x) \ne {}^{RL}_aD_x^\beta\,{}^{RL}_aD_x^\alpha f(x) = {}^{RL}_aD_x^{\alpha+\beta} f(x).$$
To see the first case, let $f(x) = (x-a)^{-1/2}$ and α = β = 1/2. Then
$${}^{RL}_aD_x^{1/2} f(x) = 0, \qquad {}^{RL}_aD_x^{1/2}\,{}^{RL}_aD_x^{1/2} f(x) = 0,$$
but
$${}^{RL}_aD_x^{1/2+1/2} f(x) = f'(x) = -\tfrac12 (x-a)^{-3/2}.$$
To see the second case, let $f(x) = (x-a)^{1/2}$, α = 1/2 and β = 3/2. Then
$${}^{RL}_aD_x^{1/2} f(x) = \frac{\sqrt{\pi}}{2}, \qquad {}^{RL}_aD_x^{3/2} f(x) = 0.$$
Hence,
$${}^{RL}_aD_x^{1/2}\,{}^{RL}_aD_x^{3/2} f(x) = 0, \qquad {}^{RL}_aD_x^{3/2}\,{}^{RL}_aD_x^{1/2} f(x) = -\frac{(x-a)^{-3/2}}{4},$$
and
$${}^{RL}_aD_x^{1/2+3/2} f(x) = f^{(2)}(x) = -\frac{(x-a)^{-3/2}}{4}.$$
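The computations in Example 4.5 only use the power rule (4.7), so they can be replayed mechanically. The following sketch (a hypothetical helper that represents a monomial $c\,(x-a)^p$ by the pair (c, p), and assumes p > −1 whenever c ≠ 0) reproduces both cases:

```python
import math

def rl_on_monomial(coeff, power, alpha):
    """Apply the RL derivative of order alpha to coeff*(x-a)^power using (4.7).

    Returns the pair (new_coeff, new_power). A zero coefficient is returned
    when the input is zero or when 1/Gamma(power+1-alpha) vanishes.
    """
    arg = power + 1.0 - alpha
    if coeff == 0.0 or (arg <= 0 and abs(arg - round(arg)) < 1e-12):
        return (0.0, power - alpha)
    return (coeff * math.gamma(power + 1.0) / math.gamma(arg), power - alpha)

if __name__ == "__main__":
    # First case: f = (x-a)^(-1/2), alpha = beta = 1/2.
    twice_half = rl_on_monomial(*rl_on_monomial(1.0, -0.5, 0.5), 0.5)
    full_first = rl_on_monomial(1.0, -0.5, 1.0)
    print(twice_half, full_first)  # zero coefficient versus -1/2 (x-a)^(-3/2)

    # Second case: f = (x-a)^(1/2), alpha = 1/2, beta = 3/2.
    a_then_b = rl_on_monomial(*rl_on_monomial(1.0, 0.5, 1.5), 0.5)  # D^(1/2) D^(3/2) f
    b_then_a = rl_on_monomial(*rl_on_monomial(1.0, 0.5, 0.5), 1.5)  # D^(3/2) D^(1/2) f
    second = rl_on_monomial(1.0, 0.5, 2.0)                          # D^2 f
    print(a_then_b, b_then_a, second)
```

The pair representation is a design shortcut for this example only; it does not handle sums of monomials or general functions.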
These examples show that the composition of two fractional differential operators may lead to unexpected results. To further illustrate the delicacy of sequential differentiation, we consider a continuous function f ∈ C(D) and α, β > 0 with α + β = 1. Then we have the following three statements:
$$\begin{cases} {}^{RL}_aD_x^\alpha\,{}^{RL}_aD_x^\beta u(x) = f(x) \;\Rightarrow\; u(x) = {}_0I_x^1 f(x) + a_1 + a_2 (x-a)^{\beta-1},\\[2pt] {}^{RL}_aD_x^\beta\,{}^{RL}_aD_x^\alpha v(x) = f(x) \;\Rightarrow\; v(x) = {}_0I_x^1 f(x) + b_1 + b_2 (x-a)^{\alpha-1},\\[2pt] {}^{RL}_aD_x^1 w(x) = f(x) \;\Rightarrow\; w(x) = {}_0I_x^1 f(x) + c, \end{cases}$$
where $a_1$, $a_2$, $b_1$, $b_2$, and c are arbitrary constants. The third case follows easily. In order to obtain the first case, one first applies the operator ${}_0I_x^\alpha$ to both sides of the equation, and then the operator ${}_0I_x^\beta$. Thus the extra terms have to be included properly. Since ${}^{RL}_aD_x^\gamma x^{\gamma-1} = 0$, ${}_0I_x^\gamma x^{\gamma-1} = \Gamma(\gamma)$. Hence, the first two cases include two arbitrary constants, whereas the last case includes only one. It is also interesting to observe that if we restrict f ∈ C(D), then the extra term $(x-a)^{\alpha-1}$ (or $(x-a)^{\beta-1}$) must disappear, indicating that a composition rule is possible on suitable subspaces.

The composition rule does hold for the Riemann–Liouville fractional derivative on a special function space ${}_aI_x^\gamma(L^p(D))$, γ > 0, defined by
$${}_aI_x^\gamma(L^p(D)) = \{f \in L^p(D) : f = {}_aI_x^\gamma\varphi \text{ for some } \varphi \in L^p(D)\}.$$
Similarly one can define the space ${}_xI_b^\gamma(L^p(D))$. Roughly speaking, a function in these spaces has the property that its function value and a sufficient number of derivatives vanish at the endpoints.

Lemma 4.4. For any α, β ≥ 0, there holds
$${}^{RL}_aD_x^\alpha\,{}^{RL}_aD_x^\beta f = {}^{RL}_aD_x^{\alpha+\beta} f \qquad \forall f \in {}_aI_x^{\alpha+\beta}(L^1(D)).$$
Proof. Since $f \in {}_aI_x^{\alpha+\beta}(L^1(D))$, we have $f = {}_aI_x^{\alpha+\beta}\varphi$ for some $\varphi \in L^1(D)$. Now the desired assertion follows directly from Theorem 4.1(i) below and Lemma 4.1, and hence it is left to the reader.

There is also no useful formula for the derivative of the product of two functions,
$${}^{RL}_aD_x^\alpha (fg) \ne ({}^{RL}_aD_x^\alpha f)\,g + f\,{}^{RL}_aD_x^\alpha g, \tag{4.12}$$
and accordingly one can expect that many useful tools derived directly from this, e.g., the basic integration by parts formula or Green's identities useful for the theory of PDEs, are either invalid or require substantial modification.

The following result gives the sequel to the fundamental theorem of calculus for the Riemann–Liouville fractional derivative. We state the result only for the left sided fractional derivative, and the formulae for right sided
versions are similar. The result indicates that the Riemann–Liouville fractional derivative ${}^{RL}_aD_x^\alpha$ is the left inverse of the Abel fractional integral ${}_aI_x^\alpha$ on $L^1(D)$, but generally is not a right inverse. The definition of the Sobolev space $W^{k,p}$ can be found in Section A.2.

Theorem 4.1. Let α > 0, n − 1 < α ≤ n. Then

(i) For any $f \in L^1(D)$,
$${}^{RL}_aD_x^\alpha\,{}_aI_x^\alpha f = f.$$

(ii) If ${}_aI_x^{n-\alpha} f \in W^{n,1}(D)$, then
$${}_aI_x^\alpha\,{}^{RL}_aD_x^\alpha f(x) = f(x) - \sum_{k=0}^{n-1} {}^{RL}_aD_x^{\alpha-k-1} f(a^+)\,\frac{(x-a)^{\alpha-k-1}}{\Gamma(\alpha-k)}.$$
Proof. If $f \in L^1(D)$, then by Lemma 4.2, ${}_aI_x^\alpha f \in L^1(D)$, and by Lemma 4.1 we obtain ${}_aI_x^{n-\alpha}\,{}_aI_x^\alpha f = {}_aI_x^n f \in W^{n,1}(D)$. Upon differentiating, we obtain assertion (i). If ${}_aI_x^{n-\alpha} f \in W^{n,1}(D)$, by the characterisation of the space $W^{n,1}(D)$, we deduce
$${}_aI_x^{n-\alpha} f = \sum_{k=0}^{n-1} c_k \frac{(x-a)^k}{\Gamma(k+1)} + {}_aI_x^n\varphi_f \tag{4.13}$$
for some $\varphi_f \in L^1(D)$, where we set $c_k = {}^{RL}_aD_x^{k-n+\alpha} f(a^+)$. Consequently,
$${}_aI_x^\alpha\,{}^{RL}_aD_x^\alpha f = {}_aI_x^\alpha\varphi_f. \tag{4.14}$$
Now applying ${}_aI_x^\alpha$ to both sides of (4.13) and using Lemma 4.1, we obtain
$${}_aI_x^n f = \sum_{k=0}^{n-1} c_k \frac{(x-a)^{\alpha+k}}{\Gamma(\alpha+k+1)} + {}_aI_x^{\alpha+n}\varphi_f.$$
Differentiating both sides n times gives
$$f = \sum_{k=0}^{n-1} c_k \frac{(x-a)^{\alpha+k-n}}{\Gamma(\alpha+k-n+1)} + {}_aI_x^\alpha\varphi_f,$$
and this together with (4.14) yields assertion (ii).
From the above we can obtain the following fractional integration by parts formula for the Riemann–Liouville fractional derivative which, as might be expected, turns out to be very useful.

Lemma 4.5. Let α > 0, p, q ≥ 1, and $p^{-1} + q^{-1} \le 1 + \alpha$ (p, q ≠ 1 if $p^{-1} + q^{-1} = 1 + \alpha$). Then if $f \in {}_aI_x^\alpha(L^p(D))$ and $g \in {}_xI_b^\alpha(L^q(D))$, there holds
$$\int_a^b g(x)\,{}^{RL}_aD_x^\alpha f(x)\,dx = \int_a^b f(x)\,{}^{RL}_xD_b^\alpha g(x)\,dx.$$
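Proof. Since $f \in {}_aI_x^\alpha(L^p(D))$, we have $f = {}_aI_x^\alpha\varphi_f$ for some $\varphi_f \in L^p(D)$, and similarly $g = {}_xI_b^\alpha\varphi_g$ for some $\varphi_g \in L^q(D)$. By Theorem 4.1(i), the desired identity is equivalent to
$$\int_a^b \varphi_f(x)\,({}_xI_b^\alpha\varphi_g)(x)\,dx = \int_a^b ({}_aI_x^\alpha\varphi_f)(x)\,\varphi_g(x)\,dx,$$
which holds according to Lemma 4.3.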
The next result gives the Laplace transform of the Abel fractional integral and the Riemann–Liouville derivative.

Lemma 4.6. Let α > 0, $f \in L^1(0,b)$ for any b > 0, and let $|f(x)| \le c\,e^{p_0x}$ for all x > b > 0 hold for some constants c and $p_0 > 0$.

(i) The following relation is valid for any Re(z) > $p_0$:
$$\mathcal{L}[{}_0I_x^\alpha f](z) = z^{-\alpha}\mathcal{L}[f](z).$$

(ii) If $f \in W^{n,1}(0,b)$ for any b > 0 and $\lim_{x\to\infty} {}^{RL}_0D_x^{\alpha+k-n} f(x) = 0$, then for any Re(z) > $p_0$,
$$\mathcal{L}[{}^{RL}_0D_x^\alpha f](z) = z^\alpha\mathcal{L}[f](z) - \sum_{k=0}^{n-1} z^{n-k-1}\,({}^{RL}_0D_x^{\alpha+k-n} f)(0^+).$$
Proof. The growth condition on f ensures that the Laplace transform is well-defined for Re(z) > $p_0$. To see (i), we write the Abel fractional integral in convolution form, ${}_0I_x^\alpha f(x) = (\omega_\alpha * f)(x)$, where $\omega_\alpha(x) = x^{\alpha-1}/\Gamma(\alpha)$. It can be verified that $\mathcal{L}[\omega_\alpha] = z^{-\alpha}$, and part (i) follows from the convolution rule for Laplace transforms. Likewise, we write
$${}^{RL}_0D_x^\alpha f(x) = g^{(n)}(x) \quad\text{and}\quad g(x) = {}_0I_x^{n-\alpha} f(x).$$
Then $\mathcal{L}[g](z) = z^{\alpha-n}\mathcal{L}[f](z)$. Again by the convolution rule, we deduce
$$\mathcal{L}[{}^{RL}_0D_x^\alpha f](z) = \mathcal{L}[g^{(n)}](z) = z^n\mathcal{L}[g](z) - \sum_{k=0}^{n-1} z^{n-1-k} g^{(k)}(0^+) = z^\alpha\mathcal{L}[f](z) - \sum_{k=0}^{n-1} z^{n-1-k}\,{}^{RL}_0D_x^{\alpha+k-n} f(0^+),$$
which shows the desired formula.
The Laplace transform $\mathcal{L}[{}^{RL}_0D_x^\alpha f]$ involves noninteger derivatives ${}^{RL}_0D_x^{\alpha+k-n} f$ at x = 0. This implies that when applying Laplace transform techniques to solve a fractional differential equation, such initial conditions have to be specified. Given the weak formulation under which they have been derived, the physical interpretation of such initial conditions is generally unclear.
Last, we give an alternative representation of the Riemann–Liouville fractional derivative, which is useful in numerical approximation.

Lemma 4.7. If α ∈ (0, 1), then for x > a and $f \in W^{1,1}(D)$,
$${}^{RL}_aD_x^\alpha f(x) = f(x)\,\frac{(x-a)^{-\alpha}}{\Gamma(1-\alpha)} + \frac{1}{\Gamma(-\alpha)}\int_a^x (x-s)^{-\alpha-1}\,(f(s)-f(x))\,ds.$$

Proof. First we recall the identity
$${}_aI_x^{1-\alpha} f(x) = f(x)\,\frac{(x-a)^{1-\alpha}}{\Gamma(2-\alpha)} + \frac{1}{\Gamma(1-\alpha)}\int_a^x (x-s)^{-\alpha}\,(f(s)-f(x))\,ds.$$
Upon differentiation and noting that the derivative of the integral term is given by
$$\frac{1}{\Gamma(-\alpha)}\int_a^x (x-s)^{-\alpha-1}\,(f(s)-f(x))\,ds - f'(x)\,\frac{(x-a)^{1-\alpha}}{\Gamma(2-\alpha)},$$
the desired identity follows.
Remark 4.4. Lemma 4.7 can be extended to the case α ∈ (1, 2).

4.2.2. Djrbashian–Caputo fractional derivative. Now we turn to the so-called Djrbashian–Caputo fractional derivative, more commonly known as the Caputo fractional derivative in the literature. Although the possibility of this alternative version has been known since the nineteenth century, it was rigorously and extensively studied by the Armenian mathematician Mkhitar M. Djrbashian in the late 1950s (see the monograph [79] (in Russian) for a summary, and see also [81]). There was a considerable amount of work on this form of the derivative by others, but mostly only available in the Russian literature. Then the geophysicist Michele Caputo rediscovered this version in 1967 [52] as a tool for understanding seismological phenomena, and later, together with Francesco Mainardi, used it in viscoelasticity, where the memory effect of these derivatives is crucial [53, 54]. Currently, it is the most popular fractional derivative for time-dependent problems, in no small part due to its ability to handle initial conditions in a more physically transparent way than the Riemann–Liouville formulation.

Definition 4.3. For $f \in L^1(D)$, the left sided Djrbashian–Caputo fractional derivative of order α, denoted by ${}^{DC}_aD_x^\alpha f$, is defined by
$${}^{DC}_aD_x^\alpha f(x) := ({}_aI_x^{n-\alpha} f^{(n)})(x) = \frac{1}{\Gamma(n-\alpha)}\int_a^x (x-s)^{n-\alpha-1} f^{(n)}(s)\,ds,$$
if the integral on the right-hand side exists. Likewise, the right sided Djrbashian–Caputo fractional derivative of order α, denoted by ${}^{DC}_xD_b^\alpha f$, is defined by
$${}^{DC}_xD_b^\alpha f(x) := (-1)^n({}_xI_b^{n-\alpha} f^{(n)})(x) = \frac{(-1)^n}{\Gamma(n-\alpha)}\int_x^b (s-x)^{n-\alpha-1} f^{(n)}(s)\,ds,$$
if the integral on the right-hand side exists.
Remark 4.5. It is worth pointing out that the limits of the Djrbashian–Caputo fractional derivative ${}^{DC}_aD_x^\alpha f$ as α approaches $(n-1)^+$ and $n^-$ are not quite as expected. Actually, by integration by parts, for sufficiently regular f, we deduce
$${}^{DC}_aD_x^\alpha f(x) \to ({}_aI_x^1 f^{(n)})(x) = f^{(n-1)}(x) - f^{(n-1)}(a^+) \quad\text{as } \alpha \to (n-1)^+,$$
$${}^{DC}_aD_x^\alpha f(x) \to f^{(n)}(x) \quad\text{as } \alpha \to n^-.$$
Hence one has to be even more cautious when approximating these fractional derivatives. The Djrbashian–Caputo derivative is clearly more restrictive than the Riemann–Liouville version since it requires the nth order classical derivative to be absolutely integrable. When using the Djrbashian–Caputo derivative, one always implicitly assumes that this condition holds. Generally, the Riemann–Liouville and Djrbashian–Caputo derivatives are different, i.e.,
$$({}^{RL}_aD_x^\alpha f)(x) \ne ({}^{DC}_aD_x^\alpha f)(x),$$
even when both derivatives are defined. To see this, consider the constant function f(x) = 1: the Djrbashian–Caputo fractional derivative ${}^{DC}_aD_x^\alpha f(x) = 0$, like the integer-order derivatives, while ${}^{RL}_aD_x^\alpha f(x)$ is nonzero by (4.8). Nonetheless, as must be expected, they are closely related to each other.

Theorem 4.2. Let $f \in W^{n,1}(D)$. Then for the left sided fractional derivatives, there holds
$$({}^{RL}_aD_x^\alpha f)(x) = ({}^{DC}_aD_x^\alpha f)(x) + \sum_{k=0}^{n-1} \frac{(x-a)^{k-\alpha}}{\Gamma(k-\alpha+1)}\,f^{(k)}(a^+), \tag{4.15}$$
and likewise for the right sided derivatives, there holds
$$({}^{RL}_xD_b^\alpha f)(x) = ({}^{DC}_xD_b^\alpha f)(x) + \sum_{k=0}^{n-1} \frac{(-1)^k (b-x)^{k-\alpha}}{\Gamma(k-\alpha+1)}\,f^{(k)}(b^-).$$
To begin the proof we first state a lemma and use the notation that $D_x$ represents the first order derivative and ${}_aI_x$ the first order integral.

Lemma 4.8. For any β > 0 and $f \in W^{1,1}(D)$, there holds
$$(D_x\,{}_aI_x^\beta - {}_aI_x^\beta D_x) f(x) = f(a)\,\frac{(x-a)^{\beta-1}}{\Gamma(\beta)}.$$

Proof. By the fundamental theorem of calculus,
$${}_aI_x D_x f(x) = \int_a^x f'(s)\,ds = f(x) - f(a),$$
i.e., $f(x) = {}_aI_x D_x f(x) + f(a)$. Thus, by applying ${}_aI_x^\beta$ to both sides and using Lemma 4.1, we deduce
$${}_aI_x^\beta f(x) = {}_aI_x^{\beta+1} D_x f(x) + f(a)\,({}_aI_x^\beta 1)(x),$$
which upon differentiation yields the desired identity.
Proof of Theorem 4.2. The case α ∈ (0, 1) is trivial by setting β = 1 − α in Lemma 4.8. Now for α ∈ (1, 2), i.e., n = 2, Lemma 4.8 with β = 2 − α yields
$${}^{RL}_aD_x^\alpha f(x) = D_x^2\,{}_aI_x^{2-\alpha} f(x) = D_x\left({}_aI_x^{2-\alpha} D_x f(x) + f(a)\,\frac{(x-a)^{1-\alpha}}{\Gamma(2-\alpha)}\right) = {}_aI_x^{2-\alpha} D_x^2 f(x) + D_x f(a)\,\frac{(x-a)^{1-\alpha}}{\Gamma(2-\alpha)} + f(a)\,\frac{(x-a)^{-\alpha}}{\Gamma(1-\alpha)},$$
thereby showing (4.15) for n = 2. The general case follows in a similar manner.

Together with the differentiation formula (4.7) for powers, we find immediately
$${}^{DC}_aD_x^\alpha f(x) = {}^{RL}_aD_x^\alpha\left(f(x) - \sum_{k=0}^{n-1} \frac{(x-a)^k}{k!}\,f^{(k)}(a^+)\right),$$
$${}^{DC}_xD_b^\alpha f(x) = {}^{RL}_xD_b^\alpha\left(f(x) - \sum_{k=0}^{n-1} \frac{(-1)^k (b-x)^k}{k!}\,f^{(k)}(b^-)\right).$$
Hence the Djrbashian–Caputo fractional derivative can be regarded as the Riemann–Liouville fractional derivative with an initial correction by its Taylor expansion of order n − 1 at the initial point x = a (or x = b). This can be viewed as a form of regularisation to remove the singular behaviour at the endpoint, cf. (4.8). Very often, these relations are employed as the defining identities for the Djrbashian–Caputo derivatives. The relations also imply the identities
$${}^{DC}_aD_x^\alpha f(x) = {}^{RL}_aD_x^\alpha f(x), \quad\text{if } f^{(j)}(a) = 0,\ j = 0, 1, \ldots, n-1, \tag{4.16}$$
$${}^{DC}_xD_b^\alpha f(x) = {}^{RL}_xD_b^\alpha f(x), \quad\text{if } f^{(j)}(b) = 0,\ j = 0, 1, \ldots, n-1.$$
As previously, we compute ${}^{DC}_aD_x^\alpha (x-a)^\gamma$, γ > n − 1, to gain some insight. The condition γ > n − 1 ensures that the nth derivative inside the integral is indeed integrable, so that the Abel integral operator can be applied. Then a straightforward computation shows
$${}^{DC}_aD_x^\alpha (x-a)^\gamma = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1-\alpha)}\,(x-a)^{\gamma-\alpha}. \tag{4.17}$$
For the case γ ≤ n − 1, the Djrbashian–Caputo fractional derivative is generally undefined, except for γ = 0, 1, . . . , n − 1, for which it vanishes identically, i.e.,
$${}^{DC}_aD_x^\alpha (x-a)^j = 0, \qquad j = 0, 1, \ldots, n-1.$$
This immediately implies
$${}^{DC}_aD_x^\alpha f(x) = {}^{DC}_aD_x^\alpha g(x) \quad\Leftrightarrow\quad f(x) = g(x) + \sum_{j=1}^{n} c_j (x-a)^{n-j}.$$
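Example 4.6. Let us repeat Example 4.3 for the Djrbashian–Caputo fractional derivative with $f(x) = e^{\lambda x}$, λ ∈ R. Then for α > 0 (with $n = \lceil\alpha\rceil$),
$${}^{DC}_0D_x^\alpha f(x) = {}^{DC}_0D_x^\alpha \sum_{k=0}^\infty \frac{\lambda^k x^k}{k!} = \sum_{k=0}^\infty \lambda^k\,\frac{{}^{DC}_0D_x^\alpha x^k}{k!} = \sum_{k=n}^\infty \lambda^k\,\frac{x^{k-\alpha}}{\Gamma(k-\alpha+1)}.$$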
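The series of Example 4.6 differs from (4.10) only in that the terms with k < n are dropped, which is exactly the correction in Theorem 4.2 with $f^{(k)}(0) = \lambda^k$. A minimal sketch of the truncated series (hypothetical function name, fixed number of terms, intended for x > 0 and moderate |λx|):

```python
import math

def caputo_derivative_exp(alpha, lam, x, terms=80):
    """Truncated series sum_{k>=n} lam^k x^(k-alpha) / Gamma(k-alpha+1), n = ceil(alpha).

    Series from Example 4.6 with base point a = 0; assumes alpha > 0 and x > 0.
    """
    n = math.ceil(alpha)
    total = 0.0
    for k in range(n, terms):
        total += lam ** k * x ** (k - alpha) / math.gamma(k - alpha + 1.0)
    return total

if __name__ == "__main__":
    lam, x = 0.7, 1.3
    # Integer orders recover the classical derivatives lam^n * exp(lam*x).
    print(caputo_derivative_exp(1.0, lam, x), lam * math.exp(lam * x))
    # For alpha = 1/2 the Riemann-Liouville value exceeds this one by the
    # k = 0 term x^(-1/2)/Gamma(1/2) of (4.10).
    print(caputo_derivative_exp(0.5, lam, x))
```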
Note that this derivative is completely different from the Riemann–Liouville version.

Remark 4.6. Analogously to Example 4.4 one can show the following identity: given α > 0 and λ ∈ R, we have
$${}^{DC}_0D_x^\alpha E_{\alpha,1}(\lambda x^\alpha) = \lambda E_{\alpha,1}(\lambda x^\alpha),$$
where $E_{\alpha,1}(\lambda x^\alpha) = \sum_{k=0}^\infty \frac{(\lambda x^\alpha)^k}{\Gamma(k\alpha+1)}$ is the Mittag-Leffler function, cf. Section 3.4. Hence the function $E_{\alpha,1}(\lambda x^\alpha)$ is invariant under the Djrbashian–Caputo fractional derivative ${}^{DC}_0D_x^\alpha$.

Just as in the Riemann–Liouville case, neither the composition rule nor the product rule holds for the Djrbashian–Caputo fractional derivative. We leave the details to readers in the following statements.

Lemma 4.9. There exists f such that
$${}^{DC}_aD_x^\alpha\,{}^{DC}_aD_x^\beta f(x) \ne {}^{DC}_aD_x^\beta\,{}^{DC}_aD_x^\alpha f(x) \ne {}^{DC}_aD_x^{\alpha+\beta} f(x).$$

Lemma 4.10. Let $f \in C^2(D)$ and let 0 < α, β < 1, α + β < 1. Then
$${}^{DC}_aD_x^\beta({}^{DC}_aD_x^\alpha f) = {}^{DC}_aD_x^{\alpha+\beta} f.$$

Lemma 4.11. Let $f \in C^2(D)$, α ∈ (1, 2). Then
$$({}^{DC}_aD_x^{\alpha/2})^2 f(x) = {}^{DC}_aD_x^\alpha f(x) - \frac{f'(a)}{\Gamma(2-\alpha)}\,(x-a)^{1-\alpha}.$$
Theorem 4.3 gives the fundamental theorem of calculus for the Djrbashian–Caputo fractional derivative. Part (i) shows that when α ∉ N, the Djrbashian–Caputo fractional differential operator provides a left inverse to the Abel fractional integral operator ${}_aI_x^\alpha$. However, it is not a left inverse when α ∈ N.

Theorem 4.3. Let α > 0, n − 1 < α < n, n ∈ N. Then

(i) If $f \in L^\infty(D)$, then
$${}^{DC}_aD_x^\alpha\,{}_aI_x^\alpha f = f.$$

(ii) If $f \in W^{n,1}(D) \cap C^{n-1}(D)$, then
$${}_aI_x^\alpha\,{}^{DC}_aD_x^\alpha f(x) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!}\,(x-a)^k.$$

Proof. Since $f \in L^\infty(D)$, there holds
$$({}_aI_x^\alpha f)^{(k)}(x) = ({}_aI_x^{\alpha-k} f)(x), \qquad k = 0, 1, \ldots, n-1.$$
Similarly, for almost every x ∈ D, we have
$$|({}_aI_x^{\alpha-k} f)(x)| \le \frac{\|f\|_{L^\infty(D)}}{\Gamma(\alpha-k+1)}\,(x-a)^{\alpha-k}, \qquad k = 0, 1, \ldots, n-1,$$
and hence $({}_aI_x^{\alpha-k} f)(a^+) = 0$ for k = 0, 1, . . . , n − 1. Then by identity (4.15), we deduce ${}^{DC}_aD_x^\alpha\,{}_aI_x^\alpha f = {}^{RL}_aD_x^\alpha\,{}_aI_x^\alpha f = f$, cf. Theorem 4.1(i). This shows assertion (i). Next, if $f \in W^{n,1}(D)$, then by the semigroup property (4.1), we deduce
$$({}_aI_x^\alpha\,{}^{DC}_aD_x^\alpha) f(x) = ({}_aI_x^\alpha\,{}_aI_x^{n-\alpha} f^{(n)})(x) = ({}_aI_x^n f^{(n)})(x),$$
from which part (ii) of the assertion follows.
We note that the conditions in Theorem 4.3 in the Djrbashian–Caputo situation are more restrictive than those for the Riemann–Liouville case; this is to be expected from the more restrictive definition of the former.

Lemma 4.12 provides the Laplace transform of the Caputo fractional derivative ${}^{DC}_0D_x^\alpha$, defined over the positive real line $\mathbb{R}^+$. As in the case of integer order derivatives, it is useful for solving fractional differential equations.

Lemma 4.12. Let α > 0, n − 1 < α ≤ n, n ∈ N, $f \in C^n(\mathbb{R}^+)$, $f \in W^{n,1}(0,b)$ for any b > 0, and let $|f(x)| \le c\,e^{p_0x}$ for all x > b > 0 hold for some constants c and $p_0 > 0$. Assume that the Laplace transforms $\hat f(z)$ and $\widehat{f^{(n)}}(z)$ exist, and $\lim_{x\to\infty} f^{(k)}(x) = 0$ for k = 0, 1, . . . , n − 1. Then the following relation holds:
$$\mathcal{L}[{}^{DC}_0D_x^\alpha f](z) = z^\alpha\hat f(z) - \sum_{k=0}^{n-1} z^{\alpha-k-1} f^{(k)}(0).$$
Proof. For n − 1 < α < n, ${}^{DC}_0D_x^\alpha f(x) = {}_0I_x^{n-\alpha} g(x)$ for $g(x) = f^{(n)}(x)$. Then by the convolution rule for Laplace transforms, we deduce
$$\mathcal{L}[{}^{DC}_0D_x^\alpha f](z) = z^{-(n-\alpha)}\hat g(z) = z^{\alpha-n}\left(z^n\hat f(z) - \sum_{k=0}^{n-1} z^{n-1-k} f^{(k)}(0)\right) = z^\alpha\hat f(z) - \sum_{k=0}^{n-1} z^{\alpha-1-k} f^{(k)}(0),$$
which shows the desired formula.
By definition the αth-order Caputo derivative ${}^{DC}_aD_x^\alpha f$ is defined in terms of the higher-order classical derivative $f^{(n)}(x)$. One can give an alternative representation in terms of the lower-order derivative $f^{(n-1)}(x)$. To this end, we consider the Hölder space $C^{0,\gamma}(D)$ defined by
$$C^{0,\gamma}(D) := \{f \in C(D) : |f(x) - f(y)| \le L_\gamma |x-y|^\gamma \ \ \forall x, y \in D\},$$
where γ ∈ (0, 1) and $L_\gamma$ is a nonnegative constant. Then we have the following alternative representation of the Djrbashian–Caputo fractional derivative of order α ∈ (0, 1), analogous to Lemma 4.7 for the Riemann–Liouville case.

Lemma 4.13. If $f \in C^{0,\gamma}([a,b])$ for some γ ∈ (0, 1], then for any α ∈ (0, γ),
$${}^{DC}_aD_x^\alpha f(x) = \frac{1}{\Gamma(1-\alpha)}\left(\frac{f(x)-f(a)}{(x-a)^\alpha} + \alpha\int_a^x \frac{f(x)-f(s)}{(x-s)^{\alpha+1}}\,ds\right). \tag{4.18}$$

Proof. For α ∈ (0, 1), we have
$$\Gamma(1-\alpha)\,{}^{DC}_aD_x^\alpha f(x) = \int_a^x (x-s)^{-\alpha} f'(s)\,ds = \int_a^x (x-s)^{-\alpha}\frac{d}{ds}(f(s)-f(x))\,ds = \frac{f(x)-f(a)}{(x-a)^\alpha} + \alpha\int_a^x (x-s)^{-\alpha-1}(f(x)-f(s))\,ds,$$
since $f \in C^{0,\gamma}([a,b])$. This shows the desired assertion.
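Representation (4.18) needs only function values of f, which makes it convenient for a quick numerical experiment. The sketch below is one possible discretisation, not a method from the text: it substitutes $s = x - u^2$, so that the integral becomes $2\int_0^{\sqrt{x-a}} (f(x) - f(x-u^2))\,u^{-2\alpha-1}\,du$, and applies a midpoint rule. The function name and the step count are hypothetical choices.

```python
import math

def caputo_via_hoelder_form(f, alpha, x, a=0.0, n=2000):
    """Evaluate the right-hand side of (4.18) for 0 < alpha < 1 and x > a.

    The singular integral is transformed with s = x - u^2 and approximated by
    the midpoint rule on (0, sqrt(x - a)); this is an illustrative quadrature.
    """
    upper = math.sqrt(x - a)
    h = upper / n
    integral = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        integral += 2.0 * (f(x) - f(x - u * u)) * u ** (-2.0 * alpha - 1.0)
    integral *= h
    boundary = (f(x) - f(a)) / (x - a) ** alpha
    return (boundary + alpha * integral) / math.gamma(1.0 - alpha)

if __name__ == "__main__":
    # For f(x) = x^2, a = 0, (4.17) gives Gamma(3)/Gamma(3 - alpha) * x^(2 - alpha).
    approx = caputo_via_hoelder_form(lambda s: s * s, 0.5, 1.0)
    exact = math.gamma(3.0) / math.gamma(2.5)
    print(approx, exact)
```

For smooth f the transformed integrand behaves like $u^{1-2\alpha}$ near u = 0, so the midpoint rule is accurate for α ≤ 1/2 and degrades gradually as α approaches 1.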
The general case of higher order derivatives can be treated analogously and is left as an exercise for the reader.

Lemma 4.14. If $f \in C^{n-1,\gamma}(D)$ with γ + n − 1 > α, then
$${}^{DC}_aD_x^\alpha f(x) = \frac{1}{\Gamma(n-\alpha)}\,\frac{f^{(n-1)}(x) - f^{(n-1)}(a)}{(x-a)^{\alpha+1-n}} + \frac{\alpha+1-n}{\Gamma(n-\alpha)}\int_a^x \frac{f^{(n-1)}(x) - f^{(n-1)}(s)}{(x-s)^{\alpha+2-n}}\,ds.$$
The following result gives an extremal principle for the Djrbashian–Caputo derivative.

Lemma 4.15. Let $f \in C^{0,\gamma}(D)$, 0 < α < γ ≤ 1, with $f(x) \le f(x_0)$ for all $x \in [a, x_0]$ for some $x_0 \in (a, b)$. Then ${}^{DC}_aD_x^\alpha f(x_0) \ge 0$.

Proof. By Lemma 4.13, for $f \in C^{0,\gamma}(D)$, there holds for any x ∈ (a, b]
$${}^{DC}_aD_x^\alpha f(x) = \frac{1}{\Gamma(1-\alpha)}\left(\frac{f(x)-f(a)}{(x-a)^\alpha} + \alpha\int_a^x \frac{1}{(x-s)^{\alpha+1}}\,(f(x)-f(s))\,ds\right).$$
Since $f(x_0) - f(s) \ge 0$ for all $a \le s \le x_0$, the assertion ${}^{DC}_aD_x^\alpha f(x_0) \ge 0$ follows.
However, one cannot determine the local behaviour of the function f near $x = x_0$ by taking ${}^{DC}_aD_x^\alpha f(x_0)$, because ${}^{DC}_aD_x^\alpha$ is nonlocal. The next result gives a partial converse to Lemma 4.15.

Lemma 4.16. Let $f \in C^{0,\gamma}(D)$, 0 < α < γ ≤ 1 and (i) $f(x) \le f(x_0)$ for all $a \le x \le x_0 \le b$, and (ii) ${}^{DC}_aD_x^\alpha f(x_0) = 0$. Then $f(x) = f(x_0)$ for all $a \le x \le x_0$.

Proof. By condition (ii) and the fact that both terms in (4.18) are nonnegative at $x = x_0$ due to (i), we obtain
$$\frac{f(x_0) - f(a)}{(x_0-a)^\alpha} = 0 \quad\text{and}\quad \int_a^{x_0} (x_0-s)^{-\alpha-1}(f(x_0)-f(s))\,ds = 0.$$
Since f is continuous, it follows directly that $f(x_0) - f(s) = 0$ for all $s \in [a, x_0]$.

The extremal property can be further refined as follows [8]:

Theorem 4.4. Let α ∈ (1, 2), and let $f \in C^2[a,b]$ attain its minimum at $x_0 \in [a,b]$. Then
$${}^{DC}_aD_x^\alpha f(x_0) \ge \frac{(x_0-a)^{-\alpha}}{\Gamma(2-\alpha)}\,\big[(\alpha-1)(f(a) - f(x_0)) - x_0 f'(a)\big].$$
4.2.3. Grünwald–Letnikov derivative. In calculus, the derivative f'(x) of a function f : D → R at a point x ∈ D can be defined by
$$f'(x) = \lim_{h\to 0} \frac{\Delta_h f(x)}{h},$$
where $\Delta_h f(x)$ is the backward difference defined by $\Delta_h f(x) = f(x) - f(x-h)$. Similarly one can construct higher order derivatives directly by higher order backward differences. By induction on k, one can verify that
$$\Delta_h^k f(x) = \sum_{j=0}^{k} C(j,k)(-1)^j f(x-jh) \qquad\text{for } k \in \mathbb{N}\cup\{0\},$$
where C(j, k) is the binomial coefficient
$$C(j,k) = \frac{k!}{j!(k-j)!} = \frac{\Gamma(k+1)}{\Gamma(j+1)\Gamma(k-j+1)}.$$
An induction on k shows the identity
$$\Delta_h^k f(x) = \int_0^h \cdots \int_0^h f^{(k)}(x - s_1 - \cdots - s_k)\,ds_1\cdots ds_k, \tag{4.19}$$
and thus by the mean value theorem, we obtain
$$f^{(k)}(x) = \lim_{h\to 0} \frac{\Delta_h^k f(x)}{h^k}. \tag{4.20}$$
It is natural to ask whether we can define a fractional derivative (or integral) in a similar manner, without resorting to integer-order derivatives and integrals. This line of pursuit leads to the Grünwald–Letnikov definition of a fractional derivative. We can indeed follow the approach of the Riemann–Liouville fractional derivative: by replacing the integer k with a real number α, one may define the fractional backward difference formula of order α, denoted by $\Delta_{h,k}^\alpha f$, by
$$\Delta_{h,k}^\alpha f(x) = \sum_{j=0}^{k} C(j,\alpha)(-1)^j f(x-jh)$$
with the binomial coefficient C(j, α) defined analogously,
$$C(j,\alpha) = \frac{\alpha}{1}\cdot\frac{\alpha-1}{2}\cdots\frac{\alpha-j+1}{j} = \frac{\Gamma(\alpha+1)}{\Gamma(j+1)\Gamma(\alpha-j+1)}.$$
In view of the identity (4.20), given x and a, we therefore define
$${}^{GL}_aD_x^\alpha f(x) = \lim_{h\to 0} \frac{\Delta_{h,k}^\alpha f(x)}{h^\alpha},$$
where the limit is obtained by sending k → ∞ and h → 0 while keeping h = (x − a)/k, so that kh = x − a is constant. One can show that if n − 1 < α < n and $f \in C^n(D)$, then
$${}^{GL}_aD_x^\alpha f(x) = \sum_{k=0}^{n-1} f^{(k)}(a)\,\frac{(x-a)^{k-\alpha}}{\Gamma(k+1-\alpha)} + \int_a^x \frac{(x-s)^{n-\alpha-1}}{\Gamma(n-\alpha)}\,f^{(n)}(s)\,ds,$$
which, in light of Theorem 4.2, gives
$${}^{GL}_aD_x^\alpha f(x) = {}^{DC}_aD_x^\alpha f(x) + \sum_{k=0}^{n-1} f^{(k)}(a)\,\frac{(x-a)^{k-\alpha}}{\Gamma(k+1-\alpha)} = {}^{RL}_aD_x^\alpha f(x).$$
Furthermore, for α < 0,
$${}^{GL}_aD_x^\alpha f(x) = {}_aI_x^{-\alpha} f(x).$$
The preceding two identities allow one to use the Grünwald–Letnikov definition as the numerical approximation of the Abel fractional integrals and Riemann–Liouville derivatives. However, the accuracy of the approximation is at best of first order O(h). This is to be expected due to the backward Euler type of construction of the formula.
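A minimal sketch of the Grünwald–Letnikov approximation follows (hypothetical function name; the step count k is a free parameter). The weights $(-1)^jC(j,\alpha)$ are generated by the recurrence $w_0 = 1$, $w_j = w_{j-1}\,(1 - (\alpha+1)/j)$, which follows from the product form of C(j, α) and avoids evaluating Gamma functions at large arguments.

```python
import math

def grunwald_letnikov(f, alpha, x, a=0.0, k=1000):
    """Approximate the Grunwald-Letnikov derivative of order alpha at x > a.

    Uses the finite sum h^(-alpha) * sum_{j=0}^{k} (-1)^j C(j, alpha) f(x - j h)
    with h = (x - a)/k; the limit k -> infinity is not taken.
    """
    h = (x - a) / k
    weight = 1.0          # w_0 = 1
    total = f(x)
    for j in range(1, k + 1):
        weight *= 1.0 - (alpha + 1.0) / j   # w_j = (-1)^j C(j, alpha)
        total += weight * f(x - j * h)
    return total / h ** alpha

if __name__ == "__main__":
    # Half derivative of f(x) = x at x = 1; (4.7) gives 1/Gamma(3/2).
    print(grunwald_letnikov(lambda s: s, 0.5, 1.0), 1.0 / math.gamma(1.5))
    # Constant function: compare with (4.8), 1/Gamma(1/2) at x = 1.
    print(grunwald_letnikov(lambda s: 1.0, 0.5, 1.0), 1.0 / math.gamma(0.5))
```

Consistent with the first-order accuracy noted above, halving h should roughly halve the error in these examples.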
4.3. Fractional derivatives in Sobolev spaces

In this section, following [165], we provide a discussion of the mapping properties of the Abel integral and Riemann–Liouville fractional differential operators in Sobolev spaces. These properties are important for the analysis of related boundary value problems. Throughout, we denote by D = (0, 1) the unit interval, and we will make extensive use of the spaces $\tilde H^\alpha(D)$ and $\tilde H_L^\alpha(D)$, etc., defined in Section A.2.

The following smoothing property of the fractional integral operators ${}_0I_x^\beta$ and ${}_xI_1^\beta$ will play a central role in the construction.

Theorem 4.5. For any s, β ≥ 0, the operators ${}_0I_x^\beta$ and ${}_xI_1^\beta$ are bounded maps from $\tilde H^s(D)$ into $\tilde H_L^{s+\beta}(D)$ and $\tilde H_R^{s+\beta}(D)$, respectively.

Proof. The key idea of the proof is to extend $f \in \tilde H^s(D)$ to a function $f_e \in \tilde H^s(0,2)$ whose moments up to (k − 1)-th order vanish, with k > β − 1/2. To this end, we employ orthogonal polynomials $\{p_0, p_1, \ldots, p_{k-1}\}$ with respect to the inner product ⟨·, ·⟩ defined by
$$\langle u, v\rangle = \int_1^2 ((x-1)(2-x))^l\,u(x)v(x)\,dx,$$
where the integer l satisfies l > s − 1/2 so that
$$((x-1)(2-x))^l p_i \in \tilde H^s(1,2), \qquad i = 0, \ldots, k-1.$$
Then we set $w_j = \gamma_j((x-1)(2-x))^l p_j$ with $\gamma_j$ chosen so that
$$\int_1^2 w_j p_l\,dx = \delta_{j,l}, \qquad j, l = 0, \ldots, k-1.$$
Next we extend both f and $w_j$, j = 0, . . . , k − 1, by zero to (0, 2) by setting
$$f_e = f - \sum_{j=0}^{k-1}\left(\int_0^1 f p_j\,dx\right) w_j.$$
The resulting function $f_e$ has vanishing moments for j = 0, . . . , k − 1, and by construction it is in the space $\tilde H^s(0,2)$. Further, obviously the inequality $\|f_e\|_{L^2(0,2)} \le C\|f\|_{L^2(D)}$ holds, i.e., the extension is bounded in $L^2(D)$.
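Denoting by $\tilde f_e$ the extension of $f_e$ to R by zero, for x ∈ (0, 1), we have $({}_0I_x^\beta f)(x) = ({}_{-\infty}I_x^\beta \tilde f_e)(x)$, where
$${}_{-\infty}I_x^\beta(f) = \frac{1}{\Gamma(\beta)}\int_{-\infty}^x (x-t)^{\beta-1} f(t)\,dt.$$
We have (see [272, pp. 112, eq. (2.272)] or [195, pp. 90, eq. (2.3.27)])
$$\mathcal{F}({}_{-\infty}I_x^\beta \tilde f_e)(\omega) = (-i\omega)^{-\beta}\mathcal{F}(\tilde f_e)(\omega) \tag{4.21}$$
and hence by Plancherel's theorem, Theorem A.8,
$$\|{}_{-\infty}I_x^\beta \tilde f_e\|_{L^2(\mathbb{R})}^2 = \int_{\mathbb{R}} |\omega|^{-2\beta}|\mathcal{F}(\tilde f_e)(\omega)|^2\,d\omega. \tag{4.22}$$
We note that by a Taylor expansion centred at 0, there holds
$$e^{-i\omega x} - 1 - (-i\omega)x - \cdots - \frac{(-i\omega)^{k-1}x^{k-1}}{(k-1)!} = \frac{(-i\omega x)^k}{k!} + \frac{(-i\omega x)^{k+1}}{(k+1)!} + \frac{(-i\omega x)^{k+2}}{(k+2)!} + \cdots = (-i\omega)^k\,{}_0I_x^k(e^{-i\omega x}).$$
Clearly, there holds $|{}_0I_x^k(e^{-i\omega x})| \le |{}_0I_x^k(1)| = \frac{x^k}{k!}$. Since the first k moments of $\tilde f_e$ vanish, multiplying the above identity by $\tilde f_e$ and integrating over R gives
$$\mathcal{F}(\tilde f_e)(\omega) = (-i\omega)^k\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} {}_0I_x^k(e^{-i\omega x})\,\tilde f_e(x)\,dx.$$
Upon noting that $\mathrm{supp}(\tilde f_e) \subset (0, 2)$, we obtain
$$|\mathcal{F}(\tilde f_e)(\omega)| \le \frac{2^k}{\sqrt{\pi}\,k!}\,|\omega|^k\|f_e\|_{L^2(0,2)} \le c\,|\omega|^k\|f\|_{L^2(D)}.$$
We then have
$$\int_{\mathbb{R}} (1+|\omega|^2)^{\beta+s}|\omega|^{-2\beta}|\mathcal{F}(\tilde f_e)(\omega)|^2\,d\omega \le c_1\|f\|_{L^2(D)}^2\int_{|\omega|<1} |\omega|^{-2\beta+2k}\,d\omega + c_2\int_{|\omega|>1} (1+|\omega|^2)^{s}|\mathcal{F}(\tilde f_e)(\omega)|^2\,d\omega \le c\,\|f\|_{\tilde H^s(D)}^2,$$
where the first integral on the right-hand side is finite because k > β − 1/2. The desired assertion follows from this and the trivial inequality
$$\|{}_0I_x^\beta f\|_{H^{\beta+s}(D)}^2 \le \|{}_{-\infty}I_x^\beta \tilde f_e\|_{H^{\beta+s}(\mathbb{R})}^2.$$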
Remark 4.7. By means of the extension in Theorem 4.7, the operator ${}_0I_x^\beta$ is bounded from $\tilde H_L^s(D)$ to $\tilde H_L^{\beta+s}(D)$, and ${}_xI_1^\beta$ is bounded from $\tilde H_R^s(D)$ to $\tilde H_R^{\beta+s}(D)$.
A direct consequence of Theorem 4.5 is the following useful corollary. Corollary 4.1. Let γ be nonnegative. Then the functions xγ and (1 − x)γ β (D) and H β (D), respectively, for any 0 ≤ β < γ + 1/2. belong to H L R Proof. We note the relations xγ = cγ 0 Ixγ (1) and (1 − x)γ = cγ x I1γ (1). The δ (D) and desired result follows from Theorem 4.5 and the fact that 1 ∈ H L δ (D) for any δ ∈ [0, 1/2). 1∈H R
The following result gives the mapping property of the Riemann–Liouville fractional differential operator.
Theorem 4.6. For any $\beta \in (n-1, n)$, the operators ${}^{RL}_{0}D_x^\beta u$ and ${}^{RL}_{x}D_1^\beta u$ defined for $u \in C_0^\infty(D)$ extend continuously to operators (still denoted by ${}^{RL}_{0}D_x^\beta u$ and ${}^{RL}_{x}D_1^\beta u$) from $\tilde H^\beta(D)$ to $L^2(D)$.
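Proof. We first consider the left sided case. For $v \in C_c^\infty(\mathbb{R})$, we define
\[ {}^{RL}_{-\infty}D_x^\beta v = \frac{d^n}{dx^n} \left( \int_{-\infty}^x (x-t)^{\beta-n} v(t)\,dt \right). \tag{4.23} \]
We note the following identity for the Fourier transform (cf. (A.9)),
\[ \mathcal{F}\big({}^{RL}_{-\infty}D_x^\beta v\big)(\omega) = (-i\omega)^\beta \mathcal{F}(v)(\omega), \quad \text{i.e.,} \quad {}^{RL}_{-\infty}D_x^\beta v(x) = \mathcal{F}^{-1}\big((-i\omega)^\beta \mathcal{F}(v)(\omega)\big), \tag{4.24} \]
which holds for $v \in C_c^\infty(\mathbb{R})$ (cf. [272, p. 112, eq. (2.272)] or [195, p. 90, eq. (2.3.27)]). It follows from Plancherel’s theorem that
\[ \|{}^{RL}_{-\infty}D_x^\beta v\|_{L^2(\mathbb{R})} = \|\mathcal{F}({}^{RL}_{-\infty}D_x^\beta v)\|_{L^2(\mathbb{R})} \le c\,\|v\|_{H^\beta(\mathbb{R})}. \]
Thus, we can continuously extend ${}^{RL}_{-\infty}D_x^\beta$ to an operator from $H^\beta(\mathbb{R})$ into $L^2(\mathbb{R})$ by formula (4.24).
We note that for $u \in C_c^\infty(D)$, there holds
\[ {}^{RL}_{0}D_x^\beta u = {}^{RL}_{-\infty}D_x^\beta u\big|_D. \tag{4.25} \]
By definition, $u \in \tilde H^\beta(D)$ implies that $u$ is in $H^\beta(\mathbb{R})$ and hence
\[ \|{}^{RL}_{-\infty}D_x^\beta u\|_{L^2(\mathbb{R})} \le c\,\|u\|_{\tilde H^\beta(D)}. \]
Thus, formula (4.25) provides an extension of the operator ${}^{RL}_{0}D_x^\beta$ defined on $C_c^\infty(D)$ to a bounded operator from the space $\tilde H^\beta(D)$ into $L^2(D)$.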
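The right sided derivative case is essentially identical except for replacing (4.23) and (4.24) with
\[ {}^{RL}_{x}D_\infty^\beta v = (-1)^n \frac{d^n}{dx^n} \left( \int_x^\infty (t-x)^{\beta-n} v(t)\,dt \right) \]
and
\[ \mathcal{F}\big({}^{RL}_{x}D_\infty^\beta v\big)(\omega) = (i\omega)^\beta \mathcal{F}(v)(\omega). \]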
Remark 4.8. We clearly have ${}^{RL}_{-\infty}D_x^\beta v(x) = 0$ for $x < 1$ when $v \in \tilde C_L^\beta(0,2)$ is supported on the interval $[1,2]$. By a density argument this also holds for $v \in \tilde H_L^\beta(0,2)$ supported on $[1,2]$.
The next result slightly relaxes the condition in Theorem 4.6.
Theorem 4.7. For $\beta \in (n-1, n)$, the operator ${}^{RL}_{0}D_x^\beta u$ defined for $u \in \tilde C_L^\beta(D)$ extends continuously to an operator from $\tilde H_L^\beta(D)$ to $L^2(D)$, and the operator ${}^{RL}_{x}D_1^\beta$ defined for $u \in \tilde C_R^\beta(D)$ extends continuously to an operator from $\tilde H_R^\beta(D)$ to $L^2(D)$.
Proof. We only prove the result for the left sided derivative since the proof for the right sided derivative case is identical. For a given $u \in \tilde H_L^\beta(D)$, we let $u_0$ be a bounded extension of $u$ to $\tilde H^\beta(0,2)$ and then set
\[ {}^{RL}_{0}D_x^\beta u = {}^{RL}_{0}D_x^\beta u_0\big|_D. \]
It is a consequence of Remark 4.8 that ${}^{RL}_{0}D_x^\beta u$ is independent of the extension $u_0$ and coincides with the formal definition of ${}^{RL}_{0}D_x^\beta u$ when $u \in \tilde C_L^\beta(D)$. We obviously have
\[ \|{}^{RL}_{0}D_x^\beta u\|_{L^2(D)} \le \|{}^{RL}_{-\infty}D_x^\beta u_0\|_{L^2(\mathbb{R})} \le c\,\|u\|_{\tilde H_L^\beta(D)}. \]
As a counterpart of Theorems 4.6 and 4.7 for the Djrbashian–Caputo derivative, we extract (without detailing the proof) from [203, Theorems 2.1, 2.2, 2.4 and Proposition 2.1] the following norm equivalences.
Theorem 4.8. For any $\alpha \in (0,1)$, there exist constants $0 < \underline{C}(\alpha) < \overline{C}(\alpha)$ such that for any $w \in H^\alpha(0,x)$ (with $w(0) = 0$ if $\alpha \ge \frac12$), the following equivalence estimates hold:
\[ \underline{C}(\alpha)\,\|w\|_{H^\alpha(0,x)} \le \|{}^{DC}_{0}D_x^\alpha w\|_{L^2(0,x)} \le \overline{C}(\alpha)\,\|w\|_{H^\alpha(0,x)}. \]
Likewise, for any $w \in L^2(0,x)$ we have ${}^{DC}_{0}D_x^\alpha w \in H^{-\alpha}(0,x)$ and
\[ \underline{C}(\alpha)\,\|w\|_{L^2(0,x)} \le \|{}^{DC}_{0}D_x^\alpha w\|_{H^{-\alpha}(0,x)} \le \overline{C}(\alpha)\,\|w\|_{L^2(0,x)}. \]
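From this we can conclude the following bound.
Lemma 4.17. For any $\alpha \in (0,2)$ there exists a constant $C(\alpha) > 0$ such that for any $w \in L^2(0,x)$,
\[ \int_0^x w(s)\, {}_0I_x^\alpha w(s)\,ds \le C(\alpha)\, \|w\|^2_{H^{-\alpha/2}(0,x)}. \]
Proof. Using the Cauchy–Schwarz inequality and ${}_0I_x^\alpha = {}_0I_x^{\alpha/2}\, {}_0I_x^{\alpha/2}$, we have
\[ \int_0^x w(s)\, {}_0I_x^\alpha w(s)\,ds \le \|({}_0I_x^{\alpha/2})^* w\|_{L^2(0,x)}\, \|{}_0I_x^{\alpha/2} w\|_{L^2(0,x)}, \]
where the action of the adjoint operator $({}_0I_x^{\alpha/2})^*$ can be estimated by
\[ \|({}_0I_x^{\alpha/2})^* w\|_{L^2(0,x)} = \sup_{\psi \in C_0^\infty(0,x)} \frac{1}{\|\psi\|_{L^2(0,x)}} \int_0^x w(s)\, {}_0I_s^{\alpha/2}\psi(s)\,ds \le \frac{1}{\underline{C}(\alpha/2)} \sup_{\varphi \in C_0^\infty(0,x)} \frac{1}{\|\varphi\|_{H^{\alpha/2}(0,x)}} \int_0^x w(s)\varphi(s)\,ds = \frac{1}{\underline{C}(\alpha/2)}\, \|w\|_{H^{-\alpha/2}(0,x)}. \]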
4.3.1. Coercivity estimates. Coercivity is the property of an operator $A : V \to V^*$ from a normed space into its dual, when it satisfies $\langle Av, v\rangle_{V^*,V} \ge \gamma \|v\|^2$ for some $\gamma > 0$ and all $v \in V$. It is often used in the analysis of variational formulations of differential equations and is one of the key assumptions in the famous Lax–Milgram lemma. We now provide some upper and lower bounds that will be useful for deriving energy estimates for pde solutions. The first one is a pointwise bound from below taken from [11, Lemma 1].
Lemma 4.18. For $\alpha \in (0,1)$, $x > 0$, and any absolutely continuous function $w$,
\[ w(x)\, {}^{DC}_{0}D_x^\alpha w(x) \ge \tfrac12\, ({}^{DC}_{0}D_x^\alpha w^2)(x). \]
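Proof. With ${}^{DC}_{0}D_x^\alpha w = {}_0I_x^{1-\alpha} w' = (g_\alpha * w')(x)$ for $g_\alpha(x) = \frac{1}{\Gamma(1-\alpha)} x^{-\alpha}$, we get
\begin{align*}
w(x)\,(g_\alpha * w')(x) - \tfrac12 (g_\alpha * (w^2)')(x)
&= \int_0^x (w(x) - w(s))\, g_\alpha(x-s)\, w'(s)\,ds \\
&= \int_0^x \int_s^x w'(r)\,dr\; g_\alpha(x-s)\, w'(s)\,ds \\
&= \int_0^x \int_0^r w'(r)\, g_\alpha(x-s)\, w'(s)\,ds\,dr \\
&= \int_0^x \frac{1}{g_\alpha(x-r)}\, g_\alpha(x-r)\, w'(r) \int_0^r g_\alpha(x-s)\, w'(s)\,ds\,dr \\
&= \frac12 \int_0^x \frac{1}{g_\alpha(x-r)}\, \frac{d}{dr} \left( \int_0^r g_\alpha(x-s)\, w'(s)\,ds \right)^2 dr \\
&= -\frac12 \int_0^x \frac{g_\alpha'(x-r)}{g_\alpha^2(x-r)} \left( \int_0^r g_\alpha(x-s)\, w'(s)\,ds \right)^2 dr + \frac12 \left[ \frac{1}{g_\alpha(x-r)} \left( \int_0^r g_\alpha(x-s)\, w'(s)\,ds \right)^2 \right]_{r=0}^{r=x} \\
&\ge 0.
\end{align*}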
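The pointwise bound of Lemma 4.18 can be spot-checked on power functions. The following is a minimal sketch, assuming the standard formula ${}^{DC}_{0}D_x^\alpha x^p = \frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)} x^{p-\alpha}$ for $p > 0$; the helper names `caputo_power_coeff` and `alikhanov_gap` are illustrative and not taken from the text.

```python
import math

def caputo_power_coeff(p, alpha):
    """Coefficient c in D^alpha x^p = c * x^(p - alpha), for p > 0 and 0 < alpha < 1."""
    return math.gamma(p + 1.0) / math.gamma(p + 1.0 - alpha)

def alikhanov_gap(p, alpha, x):
    """w(x) D^alpha w(x) - (1/2) D^alpha (w^2)(x) for the test function w(x) = x^p."""
    lhs = x**p * caputo_power_coeff(p, alpha) * x**(p - alpha)
    rhs = 0.5 * caputo_power_coeff(2.0 * p, alpha) * x**(2.0 * p - alpha)
    return lhs - rhs

# Evaluate the gap on a small grid of exponents, orders and points.
gaps = [alikhanov_gap(p, a, x)
        for p in (0.5, 1.0, 2.0, 3.5)
        for a in (0.1, 0.5, 0.9)
        for x in (0.2, 1.0, 3.0)]
print(min(gaps) >= 0.0)
```

A nonnegative gap on such samples is consistent with the lemma but is of course no substitute for the proof.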
From this we can derive the following coercivity estimate with respect to the $L^2$ inner product; see also [187, Lemma 3.1].
Lemma 4.19. For $\alpha \in (0,1)$, $x > 0$, and $v \in W^{1,\infty}(0,x)$ with $({}^{DC}_{0}D_x^\alpha v)^2 \in W^{1,1}(0,x)$ the following estimate holds:
\[ \int_0^x {}^{DC}_{0}D_s^\alpha v(s)\, v'(s)\,ds \ge \tfrac12\, {}_0I_x^\alpha \big({}^{DC}_{0}D_x^\alpha v\big)^2(x) \ge \frac{1}{2\Gamma(\alpha)x^{1-\alpha}}\, \|{}^{DC}_{0}D_x^\alpha v\|^2_{L^2(0,x)}. \]
Proof. We apply Lemma 4.18 with $\alpha$ replaced by $1-\alpha$ to $w = {}^{DC}_{0}D_x^\alpha v$, using the identities
\[ {}^{DC}_{0}D_x^{1-\alpha} w = {}^{DC}_{0}D_x^{1-\alpha}\, {}^{DC}_{0}D_x^{\alpha} v = {}^{DC}_{0}D_x^{1-\alpha}\, {}_0I_x^{1-\alpha} v' = v', \]
\[ \int_0^x ({}^{DC}_{0}D_s^{1-\alpha} w^2)(s)\,ds = \int_0^x \frac{d}{ds}\, {}_0I_s^{\alpha}[w^2](s)\,ds = {}_0I_x^{\alpha}[w^2](x) \]
that hold for $v' \in L^\infty(0,x)$ (cf. Theorem 4.3), and for $w^2 \in W^{1,1}(0,x)$ with $w(0) = 0$ (cf. Theorem 4.2). Note that for $w = {}^{DC}_{0}D_x^\alpha v$ we automatically have $w(0) = 0$ and ${}_0I_x^\alpha[w^2](0) = 0$. Integrating over $(0,x)$ then gives the first inequality; the second follows from $(x-s)^{\alpha-1} \ge x^{\alpha-1}$ for $s \in (0,x)$.
Since ${}^{DC}_{0}D_x^\alpha v = {}_0I_x^{1-\alpha} v'$, Lemma 4.19 can also be viewed as a coercivity estimate for the Abel integral operator. The following stronger result of this type has been obtained in [89, Lemma 2.3]; see also [337, Theorem 1].
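Lemma 4.20. For $\alpha \in (0,1)$, $x > 0$, and $w \in H^{-\alpha/2}(0,x)$,
\[ \int_0^x w(s)\, {}_0I_s^\alpha w(s)\,ds \ge \cos(\pi\alpha/2)\, \|w\|^2_{H^{-\alpha/2}(0,x)}. \]
Proof. We first assume $w \in L^2(0,x)$ and approximate ${}_0I_x^\alpha w = g_{1-\alpha} * w$ by ${}_0I_x^{\alpha,\varepsilon} w = g_{1-\alpha,\varepsilon} * w$, where $g_{1-\alpha,\varepsilon}(x) = g_{1-\alpha}(x)e^{-\varepsilon x}$ and $g_{1-\alpha}(x) = \frac{1}{\Gamma(\alpha)} x^{\alpha-1}$, for $\varepsilon > 0$, and extend $w \in L^2(0,x)$ to all of $\mathbb{R}$ by zero. Plancherel’s theorem and the convolution theorem then yield
\[ \int_0^x w(s)\, {}_0I_s^{\alpha,\varepsilon} w(s)\,ds = \int_{\mathbb{R}} \overline{\mathcal{F}w(\omega)}\, \mathcal{F}[g_{1-\alpha,\varepsilon} * w](\omega)\,d\omega = \int_{\mathbb{R}} |\mathcal{F}w(\omega)|^2\, \mathcal{F}g_{1-\alpha,\varepsilon}(\omega)\,d\omega. \]
Here both sides have to be real valued, because obviously the left-hand side is, and therefore
\[ \int_0^x w(s)\, {}_0I_s^{\alpha,\varepsilon} w(s)\,ds = \int_{\mathbb{R}} |\mathcal{F}w(\omega)|^2\, \operatorname{Re} \mathcal{F}g_{1-\alpha,\varepsilon}(\omega)\,d\omega, \]
where, with the principal value of the power function,
\[ \operatorname{Re} \mathcal{F}g_{1-\alpha,\varepsilon}(\omega) = \operatorname{Re}(\varepsilon + i\omega)^{-\alpha} > \operatorname{Re}(i\omega)^{-\alpha} = \cos(\pi\alpha/2)\, |\omega|^{-\alpha}. \]
Letting $\varepsilon \to 0$ and recalling that $\|w\|^2_{H^{-\alpha/2}(0,x)} = \int_{\mathbb{R}} |\mathcal{F}w(\omega)|^2 (1+\omega^2)^{-\alpha/2}\,d\omega$, we can conclude inequality (7.8) for $w \in L^2(0,x)$. Now the assertion follows for general $w \in H^{-\alpha/2}(0,x)$ by a density argument.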
4.4. Notes
The encyclopedic and authoritative treatment of fractional calculus is the monograph by Samko, Kilbas, and Marichev [308]. The readers are also referred to [78, 195, 272] for extensive discussions on fractional calculus. Example 4.5 is taken from [78, p. 30]. The alternative definition of the Djrbashian–Caputo derivative and the extremal principle in Section 4.2 are taken from [39]. A slightly weaker version of the extremal principle was first given by Luchko [232]. A detailed discussion on the connections between the Grünwald–Letnikov and Riemann–Liouville definitions can be found in [272, Section 2.2]. A useful reference for many of the other versions of fractional derivatives (and there are many, although of more limited scope than the three we have featured) is [146].
Chapter 5
Fractional Ordinary Differential Equations
Since it is an obvious generalisation of a basic problem in mathematics, it is not surprising that there is early work on initial-value problems for fractional order ordinary differential equations (odes). Equally, one might expect that the approach would be to make the transformation to an integral equation and use the contraction mapping theorem to obtain a unique solution when the right-hand side $f(x,y)$ is Lipschitz in the second variable, as we also do in this chapter. The initial work here goes back to 1938 by Pitcher and Sewell [270, 271] for ${}^{RL}_{a}D_x^\alpha u = f(x,u)$ when $0 < \alpha < 1$. The linear case can be found in the paper by Barrett [24] from 1954, who showed existence and uniqueness by essentially the idea noted above. Djrbashian and Nersesyan [82] gave a generalisation of this to include coefficients coupling each fractional derivative. A more complete history of fractional ordinary differential equations can be found in the book [195] which is encyclopedic in its coverage of this topic. We split the exposition into two obvious cases corresponding to the Riemann–Liouville and the Djrbashian–Caputo derivative versions, as well as into initial and boundary value problems.
5.1. Initial value problems
A ubiquitous problem in mathematics is the initial value problem for an ordinary differential equation, namely, given an integer $n$ and an initial vector of values $\{c_k\}_{k=0}^{n-1}$, for a given $f = f(x,u)$ find $u(x)$ such that
\[ \frac{d^n u}{dx^n} = f(x,u), \quad x \in (a,b), \qquad \frac{d^k u}{dx^k}\Big|_{x=a} = c_k, \quad k = 0, \dots, n-1. \tag{5.1} \]
It is well known [58] that if $f(x,u)$ is continuous on $[a,b]$ with respect to $x$ and Lipschitz continuous in $u$, then for some $X > a$ there exists a unique solution to (5.1) in $[a,X]$. Further, if $f(x,u)$ is uniformly Lipschitz continuous in $u$ ($\|f(\cdot,u_1) - f(\cdot,u_2)\|_\infty \le L|u_1 - u_2|$ for some $L$ independent of $u$), then a unique solution exists for $x \in [a,b]$. Furthermore, this solution can be computed by converting (5.1) to the integral equation
\[ u(x) = u_0(x) + \frac{1}{(n-1)!} \int_a^x (x-t)^{n-1} f(t,u(t))\,dt, \quad \text{where} \quad u_0(x) = \sum_{k=0}^{n-1} \frac{c_k}{k!} (x-a)^k, \]
and solving by successive approximations. Of particular interest is the situation when $f$ is linear in the second variable and we have the Cauchy problem
\[ \frac{d^n u}{dx^n} - \mu u = g(x), \quad x \in (a,b), \qquad \frac{d^k u}{dx^k}\Big|_{x=a} = c_k, \quad k = 0, \dots, n-1, \tag{5.2} \]
where $\mu$ is real. In this case we can use the linearity of the problem to obtain the explicit integral representation of the solution in terms of $g$ and a sum of the fundamental set of solutions $\{v_k\}_0^{n-1}$ of the homogeneous problem. Thus for $n = 2$, $a = 0$, and $\mu < 0$, we have the familiar solution
\[ u(x) = c_0 \cos(\sqrt{-\mu}\,x) + \frac{c_1}{\sqrt{-\mu}} \sin(\sqrt{-\mu}\,x) + \frac{1}{\sqrt{-\mu}} \int_0^x \sin(\sqrt{-\mu}\,(x-t))\, g(t)\,dt. \]
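The $n = 2$ representation can be checked numerically. The sketch below assumes $a = 0$ and writes $\omega = \sqrt{-\mu}$, with the sine term and the convolution integral both carrying the factor $1/\omega$ that the conditions $u(0) = c_0$, $u'(0) = c_1$ and variation of parameters require. The function names and the test data $g(t) = t$ are illustrative choices only.

```python
import math

def u_linear(x, c0, c1, mu, g, n=4000):
    """Evaluate the n = 2, mu < 0 solution formula (with a = 0) by midpoint quadrature."""
    w = math.sqrt(-mu)
    h = x / n
    duhamel = sum(math.sin(w * (x - (i + 0.5) * h)) * g((i + 0.5) * h)
                  for i in range(n)) * h
    return c0 * math.cos(w * x) + (c1 / w) * math.sin(w * x) + duhamel / w

def ode_residual(x, c0, c1, mu, g, d=1e-3):
    """Central-difference estimate of u'' - mu*u - g at the point x."""
    um, u0, up = (u_linear(x + s, c0, c1, mu, g) for s in (-d, 0.0, d))
    return (up - 2.0 * u0 + um) / d**2 - mu * u0 - g(x)

res = ode_residual(0.7, 1.0, 3.0, -4.0, lambda t: t)
print(abs(res) < 1e-3)
```

The residual is only a finite-difference estimate, so a small nonzero value is expected.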
We can also set (5.1) in the space $L^p(a,b)$, seeking a weak solution in the Sobolev space $W^{n,p}(a,b)$. Here, Young’s convolution inequality (A.15) gives the condition required by the function $g$.
5.1.1. The Riemann–Liouville derivative. Consider the Cauchy problem for the Riemann–Liouville differential equation,
\[ ({}^{RL}_{a}D_x^\alpha - \mu)u = g, \qquad ({}^{RL}_{a}D_x^{\alpha-k} u)(a) = c_k, \quad k = 1, \dots, n, \tag{5.3} \]
taking $n = \lceil \alpha \rceil$, $\mu \in \mathbb{R}$, and $g \in C(a,b)$. For the initial value problem (5.3) the following holds.
Theorem 5.1. If $u$ satisfies (5.3), where ${}_aI_x^\alpha(g) \in L^1(a,b)$, then it has the representation
\[ u(x) = \sum_{j=1}^n c_j (x-a)^{\alpha-j} E_{\alpha,\alpha-j+1}(\mu(x-a)^\alpha) + \int_a^x (x-t)^{\alpha-1} E_{\alpha,\alpha}(\mu(x-t)^\alpha)\, g(t)\,dt. \tag{5.4} \]
Proof. There are several ways to show this. We will take an approach that indicates the path to a more general equation. We claim that $u$ must satisfy the equivalent integral equation
\[ u(x) = \sum_{j=1}^n \frac{c_j}{\Gamma(\alpha-j+1)} (x-a)^{\alpha-j} + \frac{1}{\Gamma(\alpha)} \int_a^x (x-t)^{\alpha-1} [\mu u(t) + g(t)]\,dt. \tag{5.5} \]
This follows directly from the observation that inversion of the Riemann–Liouville derivative with zero initial conditions is equivalent to integration by the fractional-power convolution with $x^{\alpha-1}$ as in Theorem 4.1. The verification of the initial terms follows directly from (4.7). In order to solve (5.5), we use the method of successive approximations familiar from ordinary differential equations, that is, with
\[ u_0(x) = \sum_{j=1}^n \frac{c_j}{\Gamma(\alpha-j+1)} (x-a)^{\alpha-j}, \tag{5.6} \]
we set
\[ u_m(x) = u_0(x) + \frac{1}{\Gamma(\alpha)} \int_a^x (x-t)^{\alpha-1} [\mu u_{m-1}(t) + g(t)]\,dt, \]
for $m \in \mathbb{N}$. Due to the linearity of (5.3), it suffices to treat the equation essentially term-by-term. Thus we first look at the case $g = 0$, but with the initial conditions as given. We then follow with the case of nontrivial $g$ but with zero initial conditions. If we take $g = 0$, then we have the initial approximation $u_0$ as in (5.6), and for each $m \ge 1$ we recursively set
\[ u_m(x) = u_0(x) + \frac{\mu}{\Gamma(\alpha)} \int_a^x (x-t)^{\alpha-1} u_{m-1}(t)\,dt. \]
Then the next iteration $u_1(x)$ is $u_1(x) = u_0(x) + \mu({}_aI_x^\alpha u_0)(x)$ or, expanding out the second term and then collecting into powers of $x - a$,
\[ u_1(x) = \sum_{j=1}^n c_j \sum_{k=1}^2 \frac{\mu^{k-1}(x-a)^{\alpha k-j}}{\Gamma(\alpha k-j+1)}. \]
Continuing, since $u_2(x) = u_0(x) + \mu({}_aI_x^\alpha u_1)(x)$, we can again collect terms and compute $u_2$ to be
\begin{align*}
u_2(x) &= \sum_{j=1}^n \frac{c_j (x-a)^{\alpha-j}}{\Gamma(\alpha-j+1)} + \mu \sum_{j=1}^n c_j \sum_{k=1}^2 \frac{\mu^{k-1}}{\Gamma(\alpha k-j+1)}\, {}_aI_x^\alpha (x-a)^{\alpha k-j}(x) \\
&= \sum_{j=1}^n c_j \sum_{k=1}^3 \frac{\mu^{k-1}(x-a)^{\alpha k-j}}{\Gamma(\alpha k-j+1)}, \tag{5.7}
\end{align*}
where we have made an index shift $k \to k-1$. This continues in the same manner to give
\[ u_m(x) = \sum_{j=1}^n c_j \sum_{k=1}^{m+1} \frac{\mu^{k-1}(x-a)^{\alpha k-j}}{\Gamma(\alpha k-j+1)}. \]
We can take the limit as $m \to \infty$ under the summation sign as the series converges uniformly to obtain
\[ u(x) = \sum_{j=1}^n c_j \sum_{k=1}^\infty \frac{\mu^{k-1}(x-a)^{\alpha k-j}}{\Gamma(\alpha k-j+1)} = \sum_{j=1}^n c_j (x-a)^{\alpha-j} E_{\alpha,\alpha-j+1}(\mu(x-a)^\alpha), \tag{5.8} \]
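where we have used the series definition of the Mittag-Leffler function. Now take $c_j = 0$, $j = 1, \dots, n$, so that $u_0 = 0$ and for $g \in C(a,b)$ we have $u_1(x) = ({}_aI_x^\alpha g)(x)$. Then
\begin{align*}
u_2(x) &= \mu({}_aI_x^\alpha u_1)(x) + ({}_aI_x^\alpha g)(x) = \mu({}_aI_x^\alpha\, {}_aI_x^\alpha g)(x) + ({}_aI_x^\alpha g)(x) \\
&= \int_a^x \sum_{k=1}^2 \frac{\mu^{k-1}}{\Gamma(\alpha k)} (x-t)^{\alpha k-1} g(t)\,dt. \tag{5.9}
\end{align*}
Again continuing, we obtain
\[ u_m(x) = \int_a^x \sum_{k=1}^m \frac{\mu^{k-1}}{\Gamma(\alpha k)} (x-t)^{\alpha k-1} g(t)\,dt. \]
Taking the limit as $m \to \infty$ gives the contribution here as
\[ u(x) = \int_a^x \sum_{k=1}^\infty \frac{\mu^{k-1}}{\Gamma(\alpha k)} (x-t)^{\alpha k-1} g(t)\,dt = \int_a^x (x-t)^{\alpha-1} E_{\alpha,\alpha}(\mu(x-t)^\alpha)\, g(t)\,dt. \tag{5.10} \]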
Combining both solutions (5.8) and (5.10) gives the general form, as required.
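The passage from the iterates $u_m$ to the Mittag-Leffler form (5.8) can be illustrated numerically. The following is a minimal sketch for the case $g = 0$, assuming the series definition $E_{\alpha,\beta}(z) = \sum_{k \ge 0} z^k / \Gamma(\alpha k + \beta)$ with $\alpha, \beta > 0$ and moderate $|z|$; the function names are hypothetical and the plain truncated series is not a robust evaluator for large arguments.

```python
import math

def mittag_leffler(alpha, beta, z, terms=80):
    """Truncated series for E_{alpha,beta}(z); assumes alpha, beta > 0 and moderate |z|."""
    total = 0.0
    for k in range(terms):
        arg = alpha * k + beta
        if arg > 170.0:  # math.gamma overflows for arguments beyond roughly 171
            break
        total += z**k / math.gamma(arg)
    return total

def rl_homogeneous(x, a, alpha, mu, c):
    """Closed form (5.8) with g = 0; c = [c_1, ..., c_n]."""
    return sum(cj * (x - a)**(alpha - j)
               * mittag_leffler(alpha, alpha - j + 1, mu * (x - a)**alpha)
               for j, cj in enumerate(c, start=1))

def rl_iterate(x, a, alpha, mu, c, m):
    """Successive approximation u_m with g = 0: the partial sum over k = 1, ..., m + 1."""
    return sum(cj * sum(mu**(k - 1) * (x - a)**(alpha * k - j) / math.gamma(alpha * k - j + 1)
                        for k in range(1, m + 2))
               for j, cj in enumerate(c, start=1))

# The iterates approach the closed form as m grows.
for m in (1, 5, 40):
    print(m, abs(rl_iterate(0.8, 0.0, 0.6, -1.5, [2.0], m)
                 - rl_homogeneous(0.8, 0.0, 0.6, -1.5, [2.0])))
```

For $\alpha = 1$ the closed form reduces to $c_1 e^{\mu(x-a)}$, which gives a convenient sanity check.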
We now turn to the nonlinear equation (5.3) in the case $0 < \alpha \le 1$ and so
\[ {}^{RL}_{a}D_x^\alpha u = f(x,u), \qquad ({}^{RL}_{a}D_x^{\alpha-1} u)(a) = ({}_aI_x^{1-\alpha} u)(a) = c. \tag{5.11} \]
We say that $u$ is a solution to (5.11) if $u \in \widetilde{W}^{\alpha,1}(a,b)$, where
\[ \widetilde{W}^{\alpha,1}(a,b) = \{ v \in L^1(a,b) : {}^{RL}_{a}D_x^\alpha v \in L^1(a,b) \}, \tag{5.12} \]
the equation ${}^{RL}_{a}D_x^\alpha u(x) = f(x,u(x))$ holds for almost every $x \in (a,b)$, and the initial condition $({}_aI_x^{1-\alpha} u)(a) = c$ is satisfied.
From representation (5.5) with $\mu = 0$ and $g(x) = f(x,u(x))$ we obtain
\[ u(x) = \frac{1}{\Gamma(\alpha)} \left( c(x-a)^{\alpha-1} + \int_a^x (x-t)^{\alpha-1} f(t,u(t))\,dt \right). \tag{5.13} \]
We have to show that the solution to (5.13) is equivalent to that of (5.11). This follows by applying the left sided Abel fractional integral ${}_aI_x^\alpha$ defined in equation (4.3) and using Theorem 4.1. The details are left to the reader but can be found in many standard references such as [195, 272]. We assume the function $f : [a,b) \times \mathbb{R} \to \mathbb{R}$ is continuous in $x$ for some $b > a$ and satisfies a Lipschitz condition in $u$, that is, $|f(x,u) - f(x,v)| \le L|u - v|$, where $L$ is independent of $x \in [a,b)$. These are the standard assumptions familiar from odes. Our first task is to show an existence and uniqueness result for (5.11).
Theorem 5.2. Assume $f$ satisfies the above conditions. Then there exists a unique solution $u \in \widetilde{W}^{\alpha,1}(a,b)$ to (5.11) on $[a,b)$.
Proof. The proof closely parallels that for the classical situation despite the fact that the differential operator in the fractional case is nonlocal. The reason is that we choose to use the integral representation (5.13), which differs from the case $\alpha = 1$ only due to the weakly singular kernel in the nonlinear Volterra integral formulation. We will show that the sequence of iterates $\{u_k(x)\}$ defined by
\[ u_0(x) = \frac{c}{\Gamma(\alpha)} (x-a)^{\alpha-1}, \qquad u_{k+1}(x) = T[u_k(x)] := u_0(x) + \frac{1}{\Gamma(\alpha)} \int_a^x (x-t)^{\alpha-1} f(t,u_k(t))\,dt \tag{5.14} \]
converges by applying the contraction mapping theorem to the operator $T$. It follows immediately that $u_0 \in \widetilde{W}^{\alpha,1}(a,b)$ with $D^\alpha u_0 = D^1\, {}_aI_x^{1-\alpha} u_0 = D^1 c = 0$. Due to our assumptions on $f$, we also have that $g := f(\cdot,u_k) \in L^1(a,b)$ and thus the integral in (5.14), which is just ${}_aI_x^\alpha g(x)$, lies in $\widetilde{W}^{\alpha,1}(a,b)$. Hence $T[u_k] \in \widetilde{W}^{\alpha,1}(a,b)$ for each $k$. Now by (A.15), Young’s convolution inequality,
\begin{align*}
\|T[y_1] - T[y_2]\|_{L^1(a,x_1)} &\le \|{}_aI_x^\alpha [f(\cdot,y_1) - f(\cdot,y_2)]\|_{L^1(a,x_1)} \le L\, \|{}_aI_x^\alpha [|y_1 - y_2|]\|_{L^1(a,x_1)} \\
&\le \frac{L(x_1-a)^\alpha}{\Gamma(\alpha+1)}\, \|y_1 - y_2\|_{L^1(a,x_1)}.
\end{align*}
If we take $x_1$ such that $\frac{L(x_1-a)^\alpha}{\Gamma(\alpha+1)} \le \omega < 1$, then we see that $T$ is a contraction on the space $L^1(a,x_1)$. This shows the existence of a unique solution $u$ in the complete space $L^1(a,x_1)$. The next step is to show that in fact $u \in W^{\alpha,1}(a,x_1)$. We have $\|u_m - u\|_{L^1} \to 0$ as $m \to \infty$ and so
\[ \|{}^{RL}_{a}D_x^\alpha u_{m+1} - {}^{RL}_{a}D_x^\alpha u\|_{L^1} = \|f(\cdot,u_m) - f(\cdot,u)\|_{L^1} \le L\, \|u_m - u\|_{L^1}, \]
showing that the sequence $\{u_m\}$ converges to $u$ in $W^{\alpha,1}(a,x_1)$ and by the completeness of this space that $u \in W^{\alpha,1}(a,x_1)$. Finally, we extend the solution of the Volterra integral equation in the usual manner successively to the intervals
\[ [x_1, a + 2(x_1-a)], \quad [a + 2(x_1-a), a + 3(x_1-a)], \quad \dots, \]
using the fact that $L$ is a uniform Lipschitz constant on $(a,b)$.
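The size of the first interval in this argument follows directly from the contraction condition $L(x_1-a)^\alpha/\Gamma(\alpha+1) \le \omega < 1$. As a small illustration, the sketch below solves this condition for the step length and counts how many equal steps cover $(a,b)$; the names `contraction_step` and `intervals_needed` and the default $\omega = 0.5$ are assumptions made for the example.

```python
import math

def contraction_step(L, alpha, omega=0.5):
    """Largest x1 - a satisfying L * (x1 - a)^alpha / Gamma(alpha + 1) <= omega."""
    return (omega * math.gamma(alpha + 1.0) / L) ** (1.0 / alpha)

def intervals_needed(a, b, L, alpha, omega=0.5):
    """Number of successive intervals of that length needed to cover (a, b)."""
    return math.ceil((b - a) / contraction_step(L, alpha, omega))

for alpha in (0.3, 0.7, 1.0):
    print(alpha, contraction_step(2.0, alpha), intervals_needed(0.0, 1.0, 2.0, alpha))
```

Smaller values of $\alpha$ shrink the admissible step markedly whenever $\omega\,\Gamma(\alpha+1)/L < 1$, since the exponent $1/\alpha$ grows.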
Remark 5.1. In the above we assumed that $f(x,u)$ is continuous in the first variable. In fact, Theorem 5.2 can be extended to assuming only that $f$ is integrable in the first variable.
Here is the fractional version of the standard stability result in terms of the initial values from odes. We expand this to include the fractional index of differentiation. Specifically, we assume that $u$ satisfies (5.11) with initial value $c$ and fractional index $\alpha$ while $\tilde u$ satisfies (5.11) with initial value $\tilde c$ and fractional index $\tilde\alpha$.
Theorem 5.3. Assume $0 < \delta \le \tilde\alpha \le \alpha \le 1$ for some $\delta > 0$, and that the function $f : [a,b) \times \mathbb{R} \to \mathbb{R}$ is continuous in $x$ with bound $\sup_{[a,b]} |f| = M$ and satisfies a Lipschitz condition in $u$, that is,
\[ |f(x,u) - f(x,v)| \le L|u - v|, \tag{5.15} \]
where $L$ is independent of $u$, $v$ for $x \in [a,b)$. Let $u$ and $\tilde u$ be solutions of (5.11) with initial values $c$ and $\tilde c$ and fractional indices $\alpha$ and $\tilde\alpha$, $\tilde\alpha \le \alpha$, respectively. Then
\[ |u(x) - \tilde u(x)| \le a(x) + L \int_a^x (x-t)^{\tilde\alpha-1} E_{\tilde\alpha,\tilde\alpha}(L(x-t)^{\tilde\alpha})\, a(t)\,dt, \quad a \le x < b, \tag{5.16} \]
where $a(x) = a_1(x,\alpha,\tilde\alpha)|c - \tilde c| + a_2(x,\alpha,\tilde\alpha,c,\tilde c)|\alpha - \tilde\alpha|$ such that $a_1$ and $a_2$ are $L^1$ functions of $x$ with $L^1$ norm bounded for bounded $c$, $\tilde c$. Thus the solution of (5.11) depends continuously on $\alpha$ and the initial value $c$.
Proof. We have
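\begin{align*}
u(x) &= \frac{1}{\Gamma(\alpha)} \left( c(x-a)^{\alpha-1} + \int_a^x (x-t)^{\alpha-1} f(t,u(t))\,dt \right), \\
\tilde u(x) &= \frac{1}{\Gamma(\tilde\alpha)} \left( \tilde c(x-a)^{\tilde\alpha-1} + \int_a^x (x-t)^{\tilde\alpha-1} f(t,\tilde u(t))\,dt \right).
\end{align*}
The strategy is to look at the difference $u - \tilde u$ and break this up into like terms in $\alpha$ and $\tilde\alpha$ and $u$ and $\tilde u$.
\begin{align*}
|u(x) - \tilde u(x)| &\le \left| \frac{c}{\Gamma(\alpha)} (x-a)^{\alpha-1} - \frac{\tilde c}{\Gamma(\tilde\alpha)} (x-a)^{\tilde\alpha-1} \right| \\
&\quad + \left| \int_a^x \left( \frac{(x-t)^{\alpha-1}}{\Gamma(\alpha)} - \frac{(x-t)^{\tilde\alpha-1}}{\Gamma(\tilde\alpha)} \right) f(t,u(t))\,dt \right| \\
&\quad + \left| \frac{1}{\Gamma(\tilde\alpha)} \int_a^x (x-t)^{\tilde\alpha-1} [f(t,u(t)) - f(t,\tilde u(t))]\,dt \right| \tag{5.17} \\
&=: J_0 + J_1 + J_2.
\end{align*}
Here
\begin{align*}
J_0 &\le |c - \tilde c|\, \frac{(x-a)^{\alpha-1}}{\Gamma(\delta)} + |\tilde c| \left| \frac{(x-a)^{\alpha-1}}{\Gamma(\alpha)} - \frac{(x-a)^{\tilde\alpha-1}}{\Gamma(\tilde\alpha)} \right|, \\
J_1 &\le M \int_a^x \left| \frac{(x-t)^{\alpha-1}}{\Gamma(\alpha)} - \frac{(x-t)^{\tilde\alpha-1}}{\Gamma(\tilde\alpha)} \right| dt, \\
J_2 &\le \frac{L}{\Gamma(\tilde\alpha)} \int_a^x (x-t)^{\tilde\alpha-1} |u(t) - \tilde u(t)|\,dt.
\end{align*}
Now set $a(x) = J_0 + J_1$ and so
\[ a(x) \le a_1(x)|c - \tilde c| + a_2(x)|\alpha - \tilde\alpha|, \tag{5.18} \]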
where $\|a_1\|_{L^1(a,b)}$ and $\|a_2\|_{L^1(a,b)}$ are bounded for bounded $c$, $\tilde c$. Applying Gronwall’s inequality, Lemma A.1, then gives the estimate (5.16).
5.1.2. The Djrbashian–Caputo derivative case. Consider first of all the linear Cauchy problem
\[ ({}^{DC}_{a}D_x^\alpha - \mu)u = g, \qquad \frac{d^k u}{dx^k}(a) = c_k, \quad k = 0, \dots, n-1, \tag{5.19} \]
where $n-1 < \alpha < n$, $n \in \mathbb{Z}$, $\mu \in \mathbb{R}$.
Theorem 5.4. If $u$ satisfies (5.19), where ${}_aI_x^\alpha g \in L^1(a,b)$, then it has the representation
\[ u(x) = \sum_{j=0}^{n-1} c_j (x-a)^j E_{\alpha,j+1}(\mu(x-a)^\alpha) + \int_a^x (x-t)^{\alpha-1} E_{\alpha,\alpha}(\mu(x-t)^\alpha)\, g(t)\,dt. \tag{5.20} \]
Proof. As in the Riemann–Liouville case, (5.19) can be converted to an integral equation and treated by Banach’s contraction principle. We shall use an alternative approach: taking the Laplace transform in $x$. It is easily seen that a simple change of variables allows us to take the endpoint $a$ to be $a = 0$ without loss of generality. Then using Lemma 4.12 we see that (5.19) becomes
\[ (s^\alpha - \mu)\hat u(s) = \sum_{j=0}^{n-1} c_j s^{\alpha-1-j} + \hat g(s), \]
where $s$ is the transform variable and $\hat u = (\mathcal{L}u)(s)$. We now just need the inverse transforms of the functions $s^\gamma/(s^\alpha - \mu)$ with $\gamma$ set to be $\gamma = 0$, $\gamma = \alpha - 1 - j$. These are available directly from (3.28) and together with the convolution theorem for Laplace transforms, they give (5.20).
Remark 5.2. It is clear from the Abel integral equation formulation for both the Riemann–Liouville and Djrbashian–Caputo cases that the regularity of $u$ or $v$ depends on that for $g$ through the convolution with $t^{\alpha-1}$.
Just as in the Riemann–Liouville case this can be extended to a nonlinear equation. We replace (5.19) by
\[ ({}^{DC}_{a}D_x^\alpha - \mu)u = f(x,u), \qquad \frac{d^k u}{dx^k}(a) = c_k, \quad k = 0, \dots, n-1. \tag{5.21} \]
This leads to the nonlinear integral equation
\[ u(x) = u_0(x) + \int_a^x (x-t)^{\alpha-1} E_{\alpha,\alpha}(\mu(x-t)^\alpha)\, f(t,u(t))\,dt =: u_0(x) + \int_a^x K(x-t;\alpha)\, f(t,u(t))\,dt, \tag{5.22} \]
where $u_0(x) = \sum_{j=0}^{n-1} c_j (x-a)^j E_{\alpha,j+1}(\mu(x-a)^\alpha)$. If we assume that $f(x,v)$ lies in $C((a,b) \times \mathbb{R})$ and is Lipschitz with respect to the second variable (5.15), then (5.22) can be solved by successive approximations in the usual manner
\[ u_k(x) = u_0(x) + \int_a^x K(x-t;\alpha)\, f(t,u_{k-1}(t))\,dt. \]
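The successive approximations for (5.22) can be sketched on a grid. The code below is a simplified illustration and not the construction used in the text: it assumes $\mu = 0$, $a = 0$ and $0 < \alpha \le 1$, so that $K(x-t;\alpha) = (x-t)^{\alpha-1}/\Gamma(\alpha)$ and $u_0 = c_0$, freezes $f(t,u(t))$ at the left endpoint of each grid cell, and integrates the kernel exactly over the cell. The function name and all parameters are hypothetical.

```python
import math

def picard_caputo(f, c0, alpha, x_end, n=200, sweeps=60):
    """Successive approximations for u = c0 + I^alpha f(., u) on a uniform grid.

    Assumes mu = 0, a = 0 and 0 < alpha <= 1 (first-order accurate at best).
    """
    h = x_end / n
    xs = [i * h for i in range(n + 1)]
    g = math.gamma(alpha + 1.0)
    # w[m]: exact integral of the kernel over a cell lying m cells to the left of x_i
    w = [(((m + 1) * h) ** alpha - (m * h) ** alpha) / g for m in range(n)]
    u = [c0] * (n + 1)
    for _ in range(sweeps):
        fu = [f(x, v) for x, v in zip(xs, u)]
        u = [c0 + sum(w[i - 1 - j] * fu[j] for j in range(i)) for i in range(n + 1)]
    return xs, u

# For alpha = 1 and f(x, u) = -u the limit should be close to exp(-x).
xs, u = picard_caputo(lambda x, v: -v, 1.0, 1.0, 1.0)
print(u[-1], math.exp(-1.0))
```

Each sweep applies the discrete analogue of the map $u_{k-1} \mapsto u_k$; the number of sweeps needed depends on the Lipschitz constant and the interval length.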
As in the Riemann–Liouville case, the contraction mapping theorem then guarantees the existence of a unique fixed point, first of all on a sufficiently small interval [a, x1 ], but again we can extend to [x1 , a + 2(x1 − a)], [a + 2(x1 − a), a + 3(x1 − a)],. . . , using uniformity of the Lipschitz constant. The tricky part is showing that the solution of the integral equation and the initial value problem are equivalent, but the same ideas hold as in the Riemann–Liouville case. We leave this for the reader noting that the proof can be found in [195, Chap. 3].
5.2. Boundary value problems
The study of boundary value problems is motivated, among other reasons, by the superdiffusion processes derived in Section 2.1. We consider, for example, the following one-dimensional superdiffusion model,
\begin{align*}
\frac{\partial u(x,t)}{\partial t} &= {}_0D_x^\alpha u(x,t) + f(x,t) && \text{in } (x,t) \in (0,1) \times (0,T], \\
u(0,t) &= u(1,t) = 0, && t \in [0,T], \tag{5.23} \\
u(x,0) &= u_0(x), && x \in (0,1),
\end{align*}
where $\alpha \in (1,2)$ is the order of the derivative. The notation ${}_0D_x^\alpha$ refers to either the left sided Djrbashian–Caputo or Riemann–Liouville fractional derivative of order $\alpha$ defined in Chapter 4. The theory relies crucially on the theory for the steady-state problem
\[ -{}_0D_x^\alpha u + qu = f \quad \text{in } D = (0,1), \qquad u(0) = u(1) = 0, \tag{5.24} \]
where f is a function in L2 (D) or other suitable Sobolev space. The potential coefficient q ∈ L∞ (D) is an essentially bounded measurable function. In comparison with the subdiffusion model (6.1) to be studied in Chapter 6, the superdiffusion model (5.23) is far less understood, and there remain many challenges. We here focus on the derivation of Green’s functions for the fractional odes based on each of the two derivative cases and α ∈ (1, 2], since this is the most important setting for applications corresponding to a fractional version of the classical second order equations. For well-posedness and regularity results based on variational formulations, we refer to the detailed exposition in [161, Section 5.2]. Green’s function has been extensively discussed in the literature, notably [16, 364], where the main concern was to show the existence of positive solutions to nonlinear fractional boundary value problems.
5.2.1. Green’s function. First we consider the following boundary value problem
\[ -{}_0D_x^\alpha u(x) = f(x,u(x)) \quad \text{in } D, \qquad u(0) = u(1) = 0, \tag{5.25} \]
where $1 < \alpha \le 2$, $f : D \times \mathbb{R} \to \mathbb{R}$ is continuous. We shall consider the Riemann–Liouville and Djrbashian–Caputo cases separately.
5.2.1.1. Riemann–Liouville case. The starting point of the derivation is the fundamental calculus of the Riemann–Liouville fractional derivative in Theorem 4.1. By the identity ${}^{RL}_{0}D_x^\alpha\, {}_0I_x^\alpha u = u$ for any $u \in C(D) \cap L^1(D)$ from Theorem 4.1 and (4.9), we immediately arrive at the following useful result.
Lemma 5.1. Let $1 < \alpha \le 2$. If $u \in C(D) \cap L^1(D)$ has an $\alpha$th Riemann–Liouville derivative and ${}^{RL}_{0}D_x^\alpha u \in L^1(D)$, then for some constants $c_1$ and $c_2$,
\[ {}_0I_x^\alpha\, {}^{RL}_{0}D_x^\alpha u(x) = u(x) + c_1 x^{\alpha-1} + c_2 x^{\alpha-2}. \]
Lemma 5.2. For $f$ independent of $u$ and $f \in C(D)$, problem (5.25) has a unique solution
\[ u(x) = \int_0^1 G(x,s) f(s)\,ds, \]
where Green’s function $G(x,s)$ is given by (with $c_\alpha = \Gamma(\alpha)^{-1}$)
\[ G(x,s) = \begin{cases} c_\alpha \big[ (x(1-s))^{\alpha-1} - (x-s)^{\alpha-1} \big], & 0 \le s \le x \le 1, \\ c_\alpha (x(1-s))^{\alpha-1}, & 0 \le x \le s \le 1. \end{cases} \]
Proof. By Lemma 5.1, the equation can be equivalently reformulated as
\[ u(x) = -{}_0I_x^\alpha f(x) + c_1 x^{\alpha-1} + c_2 x^{\alpha-2}, \]
where $c_1$, $c_2$ are constants. Hence, the general solution $u(x)$ is given by
\[ u(x) = -c_\alpha \int_0^x (x-s)^{\alpha-1} f(s)\,ds + c_1 x^{\alpha-1} + c_2 x^{\alpha-2}. \]
Using the boundary condition $u(0) = u(1) = 0$, we deduce $c_2 = 0$ and $c_1 = c_\alpha \int_0^1 (1-s)^{\alpha-1} f(s)\,ds$. Hence it has a unique solution given by
\begin{align*}
u(x) &= -c_\alpha \int_0^x (x-s)^{\alpha-1} f(s)\,ds + c_\alpha \int_0^1 (1-s)^{\alpha-1} x^{\alpha-1} f(s)\,ds \\
&= c_\alpha \int_0^x \big[ (x(1-s))^{\alpha-1} - (x-s)^{\alpha-1} \big] f(s)\,ds + c_\alpha \int_x^1 (x(1-s))^{\alpha-1} f(s)\,ds \\
&= \int_0^1 G(x,s) f(s)\,ds.
\end{align*}
This completes the proof of the lemma.
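The Green’s function of Lemma 5.2 is straightforward to evaluate. The following sketch codes $G(x,s)$ as stated and approximates $u(x) = \int_0^1 G(x,s) f(s)\,ds$ with a midpoint rule; the names `green_rl` and `solve_bvp_rl` and the quadrature choice are illustrative assumptions. For $f \equiv 1$ the integral can be done by hand, giving $u(x) = (x^{\alpha-1} - x^\alpha)/\Gamma(\alpha+1)$, which reduces to the classical $x(1-x)/2$ at $\alpha = 2$.

```python
import math

def green_rl(x, s, alpha):
    """Riemann-Liouville Green's function of Lemma 5.2 on the unit interval, 1 < alpha <= 2."""
    c = 1.0 / math.gamma(alpha)
    value = c * (x * (1.0 - s)) ** (alpha - 1.0)
    if s <= x:
        value -= c * (x - s) ** (alpha - 1.0)
    return value

def solve_bvp_rl(f, x, alpha, n=4000):
    """Midpoint-rule approximation of u(x) = int_0^1 G(x, s) f(s) ds."""
    h = 1.0 / n
    return sum(green_rl(x, (i + 0.5) * h, alpha) * f((i + 0.5) * h) for i in range(n)) * h

for alpha in (1.5, 2.0):
    exact = (0.3 ** (alpha - 1.0) - 0.3 ** alpha) / math.gamma(alpha + 1.0)
    print(alpha, solve_bvp_rl(lambda s: 1.0, 0.3, alpha), exact)
```

At $\alpha = 2$ the first branch simplifies to $s(1-x)$, the Green’s function of the classical two point problem.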
Formally, Green’s function $G(x,s)$ satisfies the following fractional differential equation: for any $x \in D$, $G(x,\cdot)$ satisfies
\[ -{}^{RL}_{0}D_s^\alpha G(x,s) = \delta_x(s) \quad \text{in } D, \qquad G(x,0) = G(x,1) = 0, \]
where $\delta_x(s)$ is the Dirac delta function concentrated at $s = x$. Clearly one has to make proper sense of the differential equation, since so far the fractional integral and derivative are only defined for $L^1(D)$ functions. For a rigorous analysis of such problems, one has to extend the fractional operators to distributions, which has been done by a number of researchers. Nonetheless, in the limit $\alpha = 2$, it recovers the classical two point boundary value problems, and accordingly Green’s function also converges.
Green’s function $G(x,s)$ in the Riemann–Liouville case is shown in Figure 5.1 for several $\alpha$ values and in Figure 5.2 for the fractional order $\alpha = 3/2$. For $\alpha$ close to unity, the function has a sharp gradient along the boundary and the line $x = s$, indicating a limited smoothing property of the solution operator. As $\alpha$ tends to two, Green’s function $G(x,s)$ becomes increasingly smooth. It is positive in the interior of the domain $D \times D$, and achieves its maximum along the wedge $x = s$. Lemma 5.3 analytically verifies the experimental observations on Green’s function $G(x,s)$.
Lemma 5.3. Green’s function $G(x,s)$ has the following properties.
(i) $G(x,s) > 0$ for any $x, s \in D$.
(ii) There exists a positive function $\gamma \in C(D)$ such that
\[ \min_{1/4 \le x \le 3/4} G(x,s) \ge \gamma(s) \max_{0 \le x \le 1} G(x,s) = \gamma(s) G(s,s), \quad 0 < s < 1. \]
Figure 5.1. Riemann–Liouville Green’s function G(x, s): left, diagonal values G(x, x); right, antidiagonal values G(x, 1 − x). Curves are shown for α = 1.1, α = 1.5, and α = 1.9.
Figure 5.2. Green’s function G(x, s) in the Riemann–Liouville (left) and Djrbashian–Caputo (right) case, α = 3/2.
Proof. The first assertion is obvious. It is easy to observe that for any $s \in D$, the function $G(x,s)$ is monotonically decreasing in $x$ on $[s,1)$ and is monotonically increasing on $(0,s]$. Let
\[ g_1(x,s) = c_\alpha \big( (x(1-s))^{\alpha-1} - (x-s)^{\alpha-1} \big) \quad \text{and} \quad g_2(x,s) = c_\alpha (x(1-s))^{\alpha-1}. \]
Then there holds
\[ \min_{1/4 \le x \le 3/4} G(x,s) = \begin{cases} g_1(3/4,s), & s \in (0,1/4], \\ \min(g_1(3/4,s), g_2(1/4,s)), & s \in (1/4,3/4), \\ g_2(1/4,s), & s \in [3/4,1), \end{cases} \;=\; \begin{cases} g_1(3/4,s), & s \in (0,r], \\ g_2(1/4,s), & s \in [r,1), \end{cases} \]
where the constant $r \in (1/4,3/4)$ is the unique solution to $g_1(3/4,s) = g_2(1/4,s)$, that is,
\[ \big(\tfrac34(1-s)\big)^{\alpha-1} - \big(\tfrac34 - s\big)^{\alpha-1} = \big(\tfrac14(1-s)\big)^{\alpha-1}. \]
In particular, for $\alpha \to 1$ we have $r \to 3/4$ and for $\alpha \to 2$, $r \to 1/2$. Note that
\[ \max_{0 \le x \le 1} G(x,s) = G(s,s) = c_\alpha (s(1-s))^{\alpha-1}, \quad s \in (0,1). \]
Hence assertion (ii) follows by setting
\[ \gamma(s) = \begin{cases} \dfrac{(3(1-s)/4)^{\alpha-1} - (3/4 - s)^{\alpha-1}}{(s(1-s))^{\alpha-1}}, & s \in (0,r], \\[2mm] \dfrac{1}{(4s)^{\alpha-1}}, & s \in [r,1). \end{cases} \]
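The crossover point $r$ in the proof has no closed form for general $\alpha$, but it is easy to locate numerically. The sketch below uses plain bisection on $(1/4, 3/4)$, relying on the sign change of $g_1(3/4,s) - g_2(1/4,s)$ between the endpoints; the function name `crossover_r` and the tolerance are illustrative. For $\alpha = 3/2$ the equation can be solved by hand after squaring, giving $r = 1 - 1/(2\sqrt{3})$.

```python
import math

def crossover_r(alpha, tol=1e-12):
    """Bisection for the root r in (1/4, 3/4) of g1(3/4, s) = g2(1/4, s), with 1 < alpha < 2."""
    b = alpha - 1.0

    def phi(s):
        return (0.75 * (1.0 - s)) ** b - (0.75 - s) ** b - (0.25 * (1.0 - s)) ** b

    lo, hi = 0.25, 0.75  # phi(lo) < 0 < phi(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for alpha in (1.1, 1.5, 1.9, 1.999):
    print(alpha, crossover_r(alpha))
```

The printed values should move towards $1/2$ as $\alpha$ approaches 2, in line with the limit stated above.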
An important application of Green’s function is to discuss the existence of positive solutions to nonlinear problems, via a suitable fixed point argument. The basic idea is to reformulate the problem into a Fredholm integral equation involving a sufficiently smooth kernel to allow a compact operator, then apply a suitable fixed point theorem. Below we will briefly discuss one such result [16]. Let $E = C(D)$ be endowed with the ordering $u \le v$ if $u(x) \le v(x)$ for all $x \in D$, and the maximum norm $\|u\| = \max_{0 \le x \le 1} |u(x)|$. Define a cone $P \subset E$ by $P = \{u \in E : u(x) \ge 0\}$.
Lemma 5.4. Let $T : P \to E$ be the operator defined by
\[ Tu(x) := \int_0^1 G(x,s) f(s,u(s))\,ds, \]
with $f \in C(D \times \mathbb{R})$, $f \ge 0$. Then $T : P \to P$ is completely continuous.
Proof. Since $G(x,s)$ is nonnegative and integrable and $f(x,u(x))$ is nonnegative and continuous, we deduce that the operator $T : P \to P$ is continuous. Let $\tilde P \subset P$ be bounded; i.e., there exists a positive constant $M > 0$ such that $\|u\|_C \le M$ for all $u \in \tilde P$. Let $K = \max_{0 \le x \le 1,\, 0 \le u \le M} |f(x,u)| + 1$. Then for $u \in \tilde P$, by Lemma 5.3 we have
\[ |Tu(x)| \le \int_0^1 G(x,s) |f(s,u(s))|\,ds \le K \int_0^1 G(s,s)\,ds. \]
Hence $T(\tilde P)$ is bounded. Given $\varepsilon > 0$, let $\delta = \frac12 \big( \frac{\Gamma(\alpha)\varepsilon}{K} \big)^{\frac{1}{\alpha-1}}$. Then for every $u \in \tilde P$, $x_1, x_2 \in D$, $x_1 < x_2$, and $x_2 - x_1 < \delta$, we have $|Tu(x_2) - Tu(x_1)| < \varepsilon$,
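i.e., $T(\tilde P)$ is equicontinuous. In fact, it is easy to deduce that
\begin{align*}
|Tu(x_2) - Tu(x_1)| &\le \frac{K}{\Gamma(\alpha)} \left( \int_0^1 (1-s)^{\alpha-1} (x_2^{\alpha-1} - x_1^{\alpha-1})\,ds - \int_0^{x_1} \big[ (x_2-s)^{\alpha-1} - (x_1-s)^{\alpha-1} \big]\,ds + \int_{x_1}^{x_2} (x_2-s)^{\alpha-1}\,ds \right) \\
&\le \frac{K}{\Gamma(\alpha)} (x_2^{\alpha-1} - x_1^{\alpha-1}). \tag{5.26}
\end{align*}
Now we consider the following two cases separately.
Case 1. $\delta \le x_1 < x_2 < 1$.
\[ |Tu(x_2) - Tu(x_1)| \le \frac{K}{\Gamma(\alpha)} (x_2^{\alpha-1} - x_1^{\alpha-1}) \le \frac{K}{\Gamma(\alpha)} \frac{\alpha-1}{\delta^{2-\alpha}} (x_2 - x_1) \le \frac{K}{\Gamma(\alpha)} (\alpha-1)\delta^{\alpha-1} \le \varepsilon. \]
Case 2. $0 \le x_1 < \delta$, $x_2 < 2\delta$.
\[ |Tu(x_2) - Tu(x_1)| \le \frac{K}{\Gamma(\alpha)} (x_2^{\alpha-1} - x_1^{\alpha-1}) \le \frac{K}{\Gamma(\alpha)}\, x_2^{\alpha-1} \le \varepsilon. \]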
By the Arzel`a-Ascoli theorem, the operator T : P → P is completely continuous. The following fixed point theorem is very useful [202]. Lemma 5.5. Let E be a Banach space, let P ⊂ E be a cone, and let D1 , D2 be two bounded open balls of E centred at the origin with D 1 ⊂ D2 . Suppose that T : P ∩ (D2 \ D1 ) → P is a completely continuous operator such that either (i) T x ≤ x , x ∈ P ∩ ∂D1 , and T x ≥ x , x ∈ P ∩ ∂D2 , or (ii) T x ≥ x , x ∈ P ∩ ∂D1 , and T x ≤ x , x ∈ P ∩ ∂D2 holds. Then T has a fixed point in P ∩ (D2 \ D1 ). Now we can state a result on the existence of a positive solution. Let 3/4 1 M = ( 0 G(s, s) ds)−1 and N = ( 1/4 γ(s)G(s, s) ds)−1 . Theorem 5.5. Let f (x, u) be continuous on [0, 1] × [0, ∞). Suppose that there exist two positive constants r2 > r1 > 0 such that (H1) f (x, u) ≤ M r2 for all (x, u) ∈ [0, 1] × [0, r2 ], (H1) f (x, u) ≥ N r1 for all (x, u) ∈ [0, 1] × [0, r1 ].
Then problem (5.25) has at least one positive solution $u$ such that $r_1 \le \|u\|_C \le r_2$.

Proof. By Lemmas 5.2 and 5.4, the operator $T$ is completely continuous and problem (5.25) has a solution $u = u(x)$ if and only if it solves the operator equation $u = Tu$. To apply the fixed point theorem, we divide the proof into two steps corresponding to (H1), (H2).

Let $P_2 := \{u \in P : \|u\|_C \le r_2\}$. For $u \in \partial P_2$, we have $0 \le u(x) \le r_2$ for all $x \in [0,1]$. It follows from (H1) that for $x \in [0,1]$,
$$\|Tu\| = \max_{0 \le x \le 1} \int_0^1 G(x,s) f(s,u(s))\,ds \le M r_2 \int_0^1 G(s,s)\,ds = r_2 = \|u\|.$$

Let $P_1 := \{u \in P : \|u\|_C < r_1\}$. For $u \in \partial P_1$, we have $0 \le u(x) \le r_1$ for all $x \in [0,1]$. By assumption (H2), for $x \in [1/4, 3/4]$, there holds
$$Tu(x) = \int_0^1 G(x,s) f(s,u(s))\,ds \ge \int_0^1 \gamma(s) G(s,s) f(s,u(s))\,ds \ge N r_1 \int_{1/4}^{3/4} \gamma(s) G(s,s)\,ds = r_1 = \|u\|_C.$$
So $\|Tu\| \ge \|u\|$ for $u \in \partial P_1$, and the assertion follows from Lemma 5.5.

Example 5.1. Consider the following problem,
$$-{}^{RL}_{\;\;0}D_x^{3/2} u = u^2 + \frac{\sin x}{4} + 1, \quad x \in D, \qquad u(0) = u(1) = 0.$$
A simple computation shows $M = 4/\sqrt{\pi} \approx 2.25676$, $N = 13.6649$. Choosing $r_1 = 1/14$ and $r_2 = 1$, we have
$$f(x,u) = 1 + \frac{\sin x}{4} + u^2 \le 2.2107 \le M r_2 \quad \text{for } (x,u) \in [0,1] \times [0,1],$$
$$f(x,u) = 1 + \frac{\sin x}{4} + u^2 \ge 1 \ge N r_1 \quad \text{for } (x,u) \in [0,1] \times [0,1/14].$$
By Theorem 5.5, this problem has at least one solution $u$ such that $1/14 \le \|u\| \le 1$.

5.2.1.2. Djrbashian–Caputo case. This case is more delicate. From Theorem 4.3 we conclude the following.

Lemma 5.6. Let $1 < \alpha \le 2$. If $u \in W^{2,1}(D) \cap C^1(\overline{D})$, then for some constants $c_1$ and $c_2$,
$${}_0I_x^\alpha\, {}^{DC}_{\;\;0}D_x^\alpha u(x) = u(x) + c_1 + c_2 x.$$
We begin again with deriving the associated Green’s function.
Lemma 5.7. Let $1 < \alpha \le 2$. For $f \in C(\overline{D})$, the boundary value problem
$$-{}^{DC}_{\;\;0}D_x^\alpha u = f \ \text{ in } D, \qquad u(0) = u(1) = 0 \tag{5.27}$$
has a unique solution,
$$u(x) = \int_0^1 G(x,s) f(s)\,ds,$$
where Green's function $G(x,s)$ is defined by (with $c_\alpha = 1/\Gamma(\alpha)$)
$$G(x,s) = \begin{cases} c_\alpha\big(x(1-s)^{\alpha-1} - (x-s)^{\alpha-1}\big), & 0 \le s \le x \le 1, \\ c_\alpha\, x(1-s)^{\alpha-1}, & 0 \le x \le s \le 1. \end{cases}$$

Proof. By Lemma 5.6, problem (5.27) can be equivalently reformulated as $u(x) = -{}_0I_x^\alpha f(x) + c_1 + c_2 x$ for some constants $c_1$, $c_2$. Now using the boundary conditions $u(0) = u(1) = 0$, we have $c_1 = 0$ and $c_2 = ({}_0I_x^\alpha f)(1)$. Consequently,
$$u(x) = -{}_0I_x^\alpha f(x) + ({}_0I_x^\alpha f)(1)\,x = -\frac{1}{\Gamma(\alpha)}\int_0^x (x-s)^{\alpha-1} f(s)\,ds + \frac{x}{\Gamma(\alpha)}\int_0^1 (1-s)^{\alpha-1} f(s)\,ds = \int_0^1 G(x,s) f(s)\,ds,$$
which shows the solution representation.
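The representation in Lemma 5.7 can be checked numerically. The following is a minimal sketch, not taken from the text: it evaluates $u(x) = \int_0^1 G(x,s) f(s)\,ds$ by a midpoint rule for $f = 1$ and compares the result with the closed form $u(x) = (x - x^\alpha)/\Gamma(\alpha+1)$, which follows from $u = -{}_0I_x^\alpha 1 + x\,({}_0I_x^\alpha 1)(1)$. The function names, the quadrature rule and the sample points are illustrative assumptions. The last line evaluates $G$ at one point below the diagonal for $\alpha = 1.1$, where the value turns out to be negative.

```python
import math

def green_caputo(x, s, alpha):
    """Green's function of Lemma 5.7 (Djrbashian-Caputo case, Dirichlet data)."""
    c = 1.0 / math.gamma(alpha)
    g = c * x * (1.0 - s) ** (alpha - 1.0)
    if s <= x:
        g -= c * (x - s) ** (alpha - 1.0)
    return g

def solve_dirichlet(f, x, alpha, n=4000):
    """Midpoint-rule approximation of u(x) = int_0^1 G(x, s) f(s) ds."""
    h = 1.0 / n
    return h * sum(green_caputo(x, (k + 0.5) * h, alpha) * f((k + 0.5) * h)
                   for k in range(n))

def exact_f_one(x, alpha):
    """Closed form for f = 1: u(x) = (x - x^alpha) / Gamma(alpha + 1)."""
    return (x - x ** alpha) / math.gamma(alpha + 1.0)

alpha = 1.5
errors = [abs(solve_dirichlet(lambda s: 1.0, x, alpha) - exact_f_one(x, alpha))
          for x in (0.25, 0.5, 0.75)]

# One sample point just below the diagonal, for alpha close to one.
negative_value = green_caputo(0.5, 0.45, 1.1)
```

For $\alpha = 2$ the closed form reduces to $\frac{1}{2}x(1-x)$, the solution of $-u'' = 1$ with homogeneous Dirichlet conditions.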
Figure 5.3. Djrbashian–Caputo Green's function $G(x,s)$ for $\alpha = 1.1$, $1.5$, $1.9$: left, diagonal values $G(x,x)$; right, antidiagonal values $G(x,1-x)$.
Green's function $G(x,s)$ in the Djrbashian–Caputo case is shown in Figure 5.2 for $\alpha = 3/2$, and in Figure 5.3 for several $\alpha$ values. One observes that it is no longer positive in the interior of the domain, which immediately implies that the solution operator is not positivity preserving and that, for example, the comparison principle fails. It is worth noting that the function
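$G(x,s)$ is continuous, but there is a very steep change around the diagonal $x = s$ for $\alpha$ close to unity. The magnitude of the negative part decreases as $\alpha$ approaches two.

Example 5.2. Let $\alpha > 1$ and consider the function $u(x) = x(x-1)(x-b) = x^3 - (1+b)x^2 + bx$ with $b$ yet to be determined. Clearly, $u(0) = u(1) = 0$. From the differentiation formula (4.17) and ${}^{DC}_{\;\;0}D_x^\alpha x = 0$ for $\alpha > 1$, we get
$$-{}^{DC}_{\;\;0}D_x^\alpha u(x) = -\frac{\Gamma(4)}{\Gamma(4-\alpha)}x^{3-\alpha} + (1+b)\frac{\Gamma(3)}{\Gamma(3-\alpha)}x^{2-\alpha} = \Big((1+b) - \frac{3x}{3-\alpha}\Big)\frac{\Gamma(3)}{\Gamma(3-\alpha)}x^{2-\alpha} \ge 0 \quad \text{on } [0,1]$$
for $b = \frac{\alpha}{3-\alpha}$, where we used $\Gamma(4)/\Gamma(4-\alpha) = 3\Gamma(3)/((3-\alpha)\Gamma(3-\alpha))$; this value of $b$ lies in $(0,1)$ when $1 < \alpha < 3/2$. However, $u(x) < 0$ for $x \in (b,1)$, which illustrates the above statements on the lack of preservation of positivity and of a comparison principle.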
Figure 5.4. Solution $u(x)$ of Poisson's equation with Dirichlet conditions and $f = 1$ for $\alpha = 1.25$, $1.5$, $1.75$, $1.95$: left, Riemann–Liouville derivative; right, Djrbashian–Caputo derivative.
In Figure 5.4 we show the solution $u(x,\alpha)$ of the fractional Poisson equation $-{}_0D_x^\alpha u = 1$ with homogeneous Dirichlet conditions for both the Riemann–Liouville and Djrbashian–Caputo derivatives for several values of $\alpha$. Note the effect of the singularity in the Riemann–Liouville case for smaller $\alpha$ and the quite substantial difference in the solution size. In both cases as $\alpha \to 2$, the solution converges to $u(x) = \frac{1}{2}x(1-x)$ and recovers the solution of $-u'' = 1$ as expected. These figures should be compared to the two-sided derivative cases, which we will meet in Section 12.1 and Figure 12.1 for the same example.

Remark 5.3. One possible means of regaining the positivity of Green's function $G(x,s)$ is to use a Robin-type boundary condition at one endpoint, that is, take $u(0) - \gamma u'(0) = 0$, $u(1) = 0$ for some $\gamma > 0$. One can derive Green's function in a similar way, and it is an interesting exercise to determine the range for $\gamma$ that restores the positivity of $G$. Of course, the minimum such value of $\gamma$ will depend on the value of $\alpha$.
Chapter 6
Mathematical Theory of Subdiffusion
The main purpose of this chapter is to provide some basic mathematical theory for the subdiffusion model derived from the continuous time random walk framework in Section 2.1. We will first of all derive solution representations based on fundamental solutions in the spatially one-dimensional setting. Then, we will use separation of variables to obtain a solution representation as well as regularity results in higher space dimensions. Maximum principles for subdiffusion equations will also be addressed, and we end with a brief energy argument for establishing well-posedness and regularity. We here mainly focus on those results that will be needed for analyzing the inverse problems in Chapter 10. For a much more detailed exposition we refer to, e.g., the recent monographs [161, 203] and the references therein.

Let $\Omega \subset \mathbb{R}^d$ be an open bounded domain with a smooth boundary $\partial\Omega$. We consider the following initial boundary value problem for $u(x,t)$:
$$\begin{cases} \partial_t^\alpha u(x,t) = Lu(x,t) + f(x,t), & (x,t) \in \Omega \times (0,T], \\ u(x,t) = 0, & (x,t) \in \partial\Omega \times (0,T], \\ u(x,0) = u_0(x), & x \in \Omega. \end{cases} \tag{6.1}$$
Here $\partial_t^\alpha u$ denotes the Djrbashian–Caputo fractional partial derivative of the function $u$ of order $\alpha \in (0,1)$ with respect to time $t$, that is,
$$\partial_t^\alpha u(t) = \frac{1}{\Gamma(1-\alpha)}\int_0^t (t-s)^{-\alpha}\,\frac{du(s)}{ds}\,ds.$$
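To make the definition of $\partial_t^\alpha u$ concrete, the following sketch approximates the integral numerically. It uses the so-called L1 scheme, a standard discretisation that is not discussed in the text at this point: $u$ is replaced by its piecewise linear interpolant on a uniform grid, and the weakly singular kernel $(t-s)^{-\alpha}$ is integrated exactly on each subinterval. The function name and the test functions are illustrative assumptions. For $u(t) = t^2$ the power rule gives $\partial_t^\alpha t^2 = 2t^{2-\alpha}/\Gamma(3-\alpha)$.

```python
import math

def caputo_l1(u, t, alpha, n=2000):
    """L1 approximation of the Djrbashian-Caputo derivative of order alpha in (0, 1) at time t."""
    h = t / n
    total = 0.0
    for k in range(n):
        left = t - k * h
        right = max(t - (k + 1) * h, 0.0)  # guard against tiny negative round-off
        slope = (u((k + 1) * h) - u(k * h)) / h
        # exact integral of (t - s)^(-alpha) over [k h, (k + 1) h]
        weight = (left ** (1.0 - alpha) - right ** (1.0 - alpha)) / (1.0 - alpha)
        total += slope * weight
    return total / math.gamma(1.0 - alpha)

alpha = 0.5
approx_linear = caputo_l1(lambda s: s, 1.0, alpha)         # exact up to round-off
approx_square = caputo_l1(lambda s: s * s, 1.0, alpha)
exact_linear = 1.0 / math.gamma(2.0 - alpha)
exact_square = 2.0 / math.gamma(3.0 - alpha)
```

The scheme reproduces the derivative of a linear function exactly, since the interpolant then coincides with $u$; for smooth $u$ the error decreases as the grid is refined.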
Moreover, $L$ is a strongly elliptic partial differential operator
$$Lu = \sum_{i,j=1}^d \frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial u}{\partial x_j}\Big) - q(x)u(x), \quad x \in \Omega,$$
where the potential function $q(x)$ is smooth and nonnegative, $a_{ij} = a_{ji}$, $i,j = 1,\ldots,d$, and the matrix-valued function $A = [a_{ij}] : \Omega \to \mathbb{R}^{d \times d}$ is smooth and satisfies the uniform ellipticity
$$c_1|\xi|^2 \le \xi \cdot A(x)\xi \le c_2|\xi|^2 \quad \forall \xi \in \mathbb{R}^d,\ \forall x \in \Omega, \tag{6.2}$$
for some positive constants $c_1, c_2 > 0$. The smoothness assumptions on the source term $f(x,t)$ and the initial condition $u_0(x)$ will be specified later. In the case $\alpha = 1$, model (6.1) recovers the standard parabolic equation.
6.1. Fundamental solution

In this section, we derive the fundamental solution, also known as the free space Green's function, for problem (6.1) in one space dimension $\Omega = \mathbb{R}$. Exactly as with the classical diffusion equation, this is a very useful tool for understanding the behavior of the model. In free space $\Omega = \mathbb{R}$, the problem is given by
$$\partial_t^\alpha u = u_{xx} \ \text{ in } \Omega \times (0,\infty), \qquad u(\cdot,0) = u_0 \ \text{ in } \Omega, \tag{6.3}$$
where the initial data $u_0(x)$ is a given bounded function. We shall assume that $u(x,t)$ is bounded as $x \to \pm\infty$.

To derive the explicit solution representation, we employ the Laplace transform in time and the Fourier transform in space (cf. Section A.1), and we denote by $\tilde{v}(\xi)$ the Fourier transform of $v : \mathbb{R} \to \mathbb{R}$ in the spatial variable $x$, and by $\hat{v}(z)$ the Laplace transform of $v : \mathbb{R}^+ \to \mathbb{R}$ in the time variable $t$. Upon applying the Fourier transform to (6.3), we arrive at
$$\partial_t^\alpha \tilde{u} + \xi^2 \tilde{u} = 0, \quad t > 0,$$
with the initial condition $\tilde{u}(\xi,0) = \tilde{u}_0(\xi)$. Then applying the Laplace transform to the function $\tilde{u}(\xi,t)$ with respect to time $t$ and using the Laplace transform rule for the Djrbashian–Caputo fractional derivative (see Lemma 4.12), we deduce that
$$z^\alpha \hat{\tilde{u}}(\xi,z) + \xi^2 \hat{\tilde{u}}(\xi,z) = z^{\alpha-1}\tilde{u}_0(\xi),$$
which upon simple algebraic manipulations yields
$$\hat{\tilde{u}}(\xi,z) = \frac{z^{\alpha-1}}{z^\alpha + \xi^2}\,\tilde{u}_0(\xi).$$
By Lemma 3.1, the inverse Laplace transform is a Mittag-Leffler function $E_{\alpha,\beta}(z)$ (see Section 3.4 for details), and for any $\lambda > 0$
$$\mathcal{L}^{-1}\Big[\frac{z^{\alpha-1}}{z^\alpha + \lambda}\Big] = E_\alpha(-\lambda t^\alpha).$$
Hence, the solution $\tilde{u}(\xi,t)$ in the Fourier domain is given by
$$\tilde{u}(\xi,t) = E_\alpha(-\xi^2 t^\alpha)\,\tilde{u}_0(\xi),$$
and the convolution rule for the Fourier transform yields
$$u(x,t) = \int_{-\infty}^\infty G_\alpha(x-y,t)\,u_0(y)\,dy, \tag{6.4}$$
where the fundamental solution $G_\alpha(x,t)$ is given by
$$G_\alpha(x,t) = \mathcal{F}^{-1}[E_\alpha(-\xi^2 t^\alpha)] = \frac{1}{2\pi}\int_{-\infty}^\infty e^{i\xi x} E_{\alpha,1}(-\xi^2 t^\alpha)\,d\xi.$$
The inverse Fourier transform may be expressed in terms of an M-Wright function $M_\mu(z)$, defined in equation (3.66), or more generally the Wright function $W_{\rho,\mu}(z)$; cf. Section 3.5. By Theorem 3.29, the fundamental solution $G_\alpha(x,t)$ is given by
$$G_\alpha(x,t) = \frac{1}{2t^{\alpha/2}}\,M_{\alpha/2}\Big(\frac{|x|}{t^{\alpha/2}}\Big). \tag{6.5}$$
Note that for $\alpha \in (0,1)$, for every $t > 0$, the function $x \mapsto G_\alpha(x,t)$ is not differentiable at $x = 0$; cf. Figure 6.1 for an illustration. Nonetheless, in the limiting case $\alpha \to 1$ (see (3.67)) we have
$$M_{1/2}(x) = \frac{1}{\sqrt{\pi}}\,e^{-x^2/4},$$
Figure 6.1. Fundamental solution $G_\alpha(x,1)$ for the subdiffusion equation, for $\alpha = 1/4$, $1/2$, $3/4$, $1$.
and $G_\alpha(x,t) = G(x,t)$ is just the classical heat kernel, that is,
$$G(x,t) = \frac{1}{2\sqrt{\pi t}}\,e^{-\frac{x^2}{4t}}, \tag{6.6}$$
which is $C^\infty$ in space for every $t > 0$. In general, for any $\alpha \in (0,1)$, the behavior of $G_\alpha(x,t)$ for large $x$ may be seen from the asymptotic formula (3.68), according to which $M_{\alpha/2}$ decays faster than linear exponential order but slower than a Gaussian as $x \to \infty$. This property is in line with the anomalous subdiffusion concept. In particular, the decay implies that the integral solution representation (6.4) is convergent for $t > 0$ if $u_0$ is locally integrable and bounded on $\mathbb{R}$.

The fundamental solution $G_\alpha(x,t)$ is a good example to illustrate the role moments play in the subdiffusion process. First we note that the fundamental solution (6.6) can be regarded as a Gaussian probability density function (of the random walker being at position $x$ at time $t$, cf. Section 2.1) and it evolves in time with moments (of even order)
$$\mu_{2n}(t) := \int_{-\infty}^\infty x^{2n} G(x,t)\,dx = \frac{(2n)!}{n!}\,t^n, \quad n = 0,1,\ldots,\ t \ge 0.$$
The mean square displacement or variance $\sigma^2 := \mu_2(t) = 2t$ is thus proportional to the time, following the diffusion law of Einstein as noted in Chapter 1.

Next we turn to the subdiffusion case, by exploiting the properties of the M-function $M_\mu(z)$. Of particular relevance here is the case of a positive argument. By Theorem 3.28, the following Laplace transform pair holds:
$$\mathcal{L}[M_\mu(x)](z) = E_\mu(-z).$$
From Bernstein's theorem, Theorem A.5, this relation implies the positivity of $M_\mu(x)$ for $x > 0$, since for $0 < \mu < 1$, the function $E_\mu(-t)$ is completely monotone on the positive real axis; cf. Theorem 3.20. Hence, $G_\alpha(x,t)$ is indeed a proper probability density function for any fixed $t > 0$, that is, $G_\alpha(x,t) \ge 0$ and
$$\int_{-\infty}^\infty G_\alpha(x,t)\,dx = 1.$$
Using the following integral identity (cf. Lemma 3.4)
$$\int_0^\infty x^k M_\mu(x)\,dx = \frac{\Gamma(k+1)}{\Gamma(k\mu+1)},$$
the moments (of even order) of the fundamental solution $G_\alpha(x,t)$ are given by
$$\mu_{2n}(t) := \int_{-\infty}^\infty x^{2n} G_\alpha(x,t)\,dx = \frac{\Gamma(2n+1)}{\Gamma(\alpha n+1)}\,t^{\alpha n}, \quad n = 0,1,\ldots,\ t \ge 0. \tag{6.7}$$
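The moment formulas can be evaluated directly. The sketch below, with hypothetical function names, implements (6.7) and the Gaussian moments, and checks the integral identity $\int_0^\infty x^k M_\mu(x)\,dx = \Gamma(k+1)/\Gamma(k\mu+1)$ by midpoint quadrature in the one case where the M-Wright function is elementary, $M_{1/2}(x) = e^{-x^2/4}/\sqrt{\pi}$. The truncation of the integral at a finite upper limit is an assumption justified by the Gaussian decay.

```python
import math

def moment_subdiffusion(n, t, alpha):
    """Even moment mu_{2n}(t) of G_alpha(., t) according to (6.7)."""
    return math.gamma(2 * n + 1) / math.gamma(alpha * n + 1) * t ** (alpha * n)

def moment_gaussian(n, t):
    """Even moment mu_{2n}(t) of the heat kernel (6.6)."""
    return math.factorial(2 * n) / math.factorial(n) * t ** n

def m_wright_half_moment(k, upper=40.0, steps=40000):
    """Midpoint quadrature of int_0^upper x^k M_{1/2}(x) dx, M_{1/2}(x) = exp(-x^2/4)/sqrt(pi)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * math.exp(-x * x / 4.0)
    return h * total / math.sqrt(math.pi)

# Variance of the subdiffusion kernel: 2 t^alpha / Gamma(alpha + 1).
variance = moment_subdiffusion(1, 2.0, 0.5)
```

For $\alpha = 1$ the two moment functions agree, and for $\mu = 1/2$ the quadrature reproduces $\Gamma(k+1)/\Gamma(k/2+1)$, for example the value $2$ for $k = 2$.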
It is instructive to compare this expression with the Gaussian case. In particular, the variance $\sigma^2 := \mu_2 = 2t^\alpha/\Gamma(\alpha+1)$ is now proportional to the $\alpha$th power of time $t$, consistent with anomalous slow diffusion for $0 < \alpha < 1$.

Now we consider the problem on the half line $\Omega = (0,\infty)$ with a homogeneous Dirichlet boundary condition
$$\partial_t^\alpha u = u_{xx} \ \text{ in } \Omega \times (0,\infty), \qquad u(0,t) = 0 \ \text{ in } (0,\infty), \qquad u(\cdot,0) = u_0 \ \text{ in } \Omega. \tag{6.8}$$
By taking the odd extension of the initial data $u_0$ to $\mathbb{R}$, that is, $u_0(-x) = -u_0(x)$ for $x > 0$, we obtain the solution $u$ to problem (6.8)
$$u(x,t) = \int_{-\infty}^\infty G_\alpha(x-y,t)\,u_0(y)\,dy = \int_0^\infty \big(G_\alpha(x-y,t) - G_\alpha(x+y,t)\big)u_0(y)\,dy.$$
Clearly, for any integrable and bounded $u_0$, the representation gives a continuous solution for any $t > 0$. The interesting question is what happens as $t \to 0^+$, and it turns out this depends on the compatibility of the initial and the boundary conditions. If the initial data $u_0$ is continuous and is compatible with the boundary condition, that is, $u_0(0) = 0$, then the solution is continuous up to $t = 0$. To gain further insight into the subdiffusion model, we suppose that the initial condition and boundary condition are not compatible, that is, $u_0(0) \ne 0$. Then the solution $u(x,t)$ is discontinuous at $(x,t) = (0,0)$. To determine the nature of the discontinuity, we rewrite the solution as
$$u(x,t) = \int_{-\infty}^x G_\alpha(y,t)\,u_0(x-y)\,dy - \int_x^\infty G_\alpha(y,t)\,u_0(y-x)\,dy.$$
Let $\psi(x,t)$ be the solution in the special case when $u_0(x) = 1$ for all $x > 0$, that is,
$$\psi(x,t) = \int_{-\infty}^x G_\alpha(y,t)\,dy - \int_x^\infty G_\alpha(y,t)\,dy.$$
Since $\int_{-\infty}^\infty G_\alpha(y,t)\,dy = \tilde{G}_\alpha(0,t) = E_\alpha(0) = 1$, we can simplify the formula to
$$\psi(x,t) = 1 - 2\int_x^\infty G_\alpha(y,t)\,dy = 1 - \frac{1}{t^{\alpha/2}}\int_x^\infty M_{\alpha/2}\Big(\frac{|y|}{t^{\alpha/2}}\Big)\,dy,$$
and with the substitution $\xi = |y|/t^{\alpha/2}$ we obtain $\psi$ in the form of a similarity solution
$$\psi(x,t) = \Psi\Big(\frac{x}{t^{\alpha/2}}\Big) \quad \text{with} \quad \Psi(z) = 1 - \int_z^\infty M_{\alpha/2}(\xi)\,d\xi.$$
If we fix $t > 0$ and let $x \to 0$, then $\psi(x,t) \to \Psi(0) = 0$, whereas if we fix $x > 0$ and let $t \to 0$, then $\psi(x,t) \to \Psi(\infty) = 1$. The function $\psi(x,t)$ characterises precisely the singular behaviour of the solution with incompatible
data. With the function $\psi(x,t)$, the solution $u(x,t)$ can be written as
$$u(x,t) = u_0(0)\,\psi(x,t) + w(x,t),$$
where $w(x,t)$ is the solution to problem (6.8) with initial data $u_0(x) - u_0(0)$, and thus continuous at $(0,0)$.

By means of reflection about given spatial points, the fundamental solution $G_\alpha(x,t)$ can also be used to solve boundary value problems on domains of special geometry. We shall consider the unit interval $\Omega = (0,1)$. In the case of the heat equation, this was nicely summarised by Cannon [49], where the main tool is the so-called $\theta$-function. Below we consider the following initial boundary value problem:
$$\begin{aligned} \partial_t^\alpha u &= u_{xx} + f(x,t), && 0 < x < 1,\ 0 < t < T, \\ u(0,t) &= g_0(t), \quad u(1,t) = g_1(t), && 0 < t \le T, \\ u(x,0) &= u_0(x), && 0 < x < 1. \end{aligned} \tag{6.9}$$
To this end, we introduce the (fractional) function $\theta_\alpha(x,t)$ defined by
$$\theta_\alpha(x,t) = \sum_{m=-\infty}^\infty G_\alpha(x+2m,t), \quad 0 < x < 2,\ 0 < t.$$
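For $\alpha = 1$, where $G_\alpha$ is the heat kernel (6.6), the sum over images can be checked against the classical Fourier form of the theta function. The following sketch relies on the Poisson summation identity $\sum_m G(x+2m,t) = \frac{1}{2} + \sum_{n \ge 1} e^{-n^2\pi^2 t}\cos(n\pi x)$, which is a standard fact for the heat kernel and is not stated in the text; the function names and truncation limits are illustrative assumptions. No analogous check is attempted here for $\alpha < 1$, since that would require an implementation of the M-Wright function.

```python
import math

def heat_kernel(x, t):
    """Classical heat kernel G(x, t) from (6.6)."""
    return math.exp(-x * x / (4.0 * t)) / (2.0 * math.sqrt(math.pi * t))

def theta_images(x, t, m_max=50):
    """Truncated image sum: sum over m of G(x + 2m, t)."""
    return sum(heat_kernel(x + 2 * m, t) for m in range(-m_max, m_max + 1))

def theta_fourier(x, t, n_max=200):
    """Fourier form 1/2 + sum_{n >= 1} exp(-n^2 pi^2 t) cos(n pi x)."""
    return 0.5 + sum(math.exp(-(n * math.pi) ** 2 * t) * math.cos(n * math.pi * x)
                     for n in range(1, n_max + 1))

gap = abs(theta_images(0.3, 0.1) - theta_fourier(0.3, 0.1))
```

The image sum converges quickly for small $t$ and the Fourier sum for large $t$; both forms are even and 2-periodic in $x$.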
In view of the identity (6.6), the function $\theta_\alpha$ coincides with the $\theta$ function for the heat equation when $\alpha = 1$. The following lemma collects some important properties of the $\theta_\alpha$ function, which are useful in verifying the solution representation.

Lemma 6.1. The function $\theta_\alpha$ is $C^\infty$ for $x \in (0,2)$ and $t > 0$, and it is an even function with respect to $x$.

Proof. Let $r_m = |x+2m|/t^{\alpha/2}$, $m \in \mathbb{Z}$. By the asymptotics (3.68) of the M-Wright function $M_\mu(z)$, we deduce that
$$|\theta_\alpha(x,t)| \le \sum_{m=-\infty}^\infty t^{-\alpha/2} A\, r_m^a e^{-b r_m^c}, \quad t > 0,$$
where the constants $A$, $a$, $b$, and $c > 1$ depend only on $\alpha$. For a fixed $t$, there exists an $M \in \mathbb{N}$ such that $r_m > 1$ for all $m > M$, and thus $e^{-b r_m^c} < e^{-b r_m}$. It suffices to consider the terms with $m > M$. Let $u_m = t^{-\alpha/2} A\, r_m^a e^{-b r_m}$. Then for any $t \ge t_0 > 0$, the equality $|r_{m+1}| - |r_m| = 2/t^{\alpha/2}$ holds if $m > M$, and hence
$$\lim_{m\to\infty} \frac{u_{m+1}}{u_m} = \exp\Big(-\frac{2b}{t^{\alpha/2}}\Big) < 1,$$
since $b > 0$ and $t \ge t_0 > 0$. Thus the series $\sum_{m=0}^\infty u_m(x,t)$ is uniformly convergent for $x \in \mathbb{R}$ and $t \ge t_0 > 0$, and so is the series for $\theta_\alpha(x,t)$. Similarly, the series for all partial derivatives of $\theta_\alpha$ are uniformly convergent for $x \in (0,2)$ and $t > 0$, and thus $\theta_\alpha$ is in $C^\infty(0,\infty)$ with respect to $t$ and in $C^\infty(0,2)$ with respect to $x$.

Lemma 6.2. The following relations hold:
$$\lim_{x\to 0^+} \mathcal{L}[\partial_x \theta_\alpha(x,t)](z) = -\tfrac{1}{2} z^{\alpha-1}, \tag{6.10}$$
$$\lim_{x\to 0^+} \mathcal{L}\big[\partial_x\, {}^{RL}_{\;\;0}D_t^{1-\alpha} \theta_\alpha(x,t)\big](z) = -\tfrac{1}{2}, \tag{6.11}$$
$$\lim_{t\to 0^+} \theta_\alpha(x,t) = 0, \tag{6.12}$$
$$\lim_{x\to 1^-} \partial_x \theta_\alpha(x,t) = 0 \quad \forall t > 0. \tag{6.13}$$
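Proof. Let $r_m = |x+2m|/t^{\alpha/2}$, $m \in \mathbb{Z}$. By direct computation, for $m = 1, 2, \ldots$, we deduce
$$\partial_x G_\alpha(x \pm 2m, t) = \mp\frac{1}{2t^\alpha}\, W_{-\frac{\alpha}{2},1-\alpha}(-r_{\pm m}).$$
From this equation and the uniform convergence of the series from Lemma 6.1, we conclude that
$$\lim_{x\to 0^+} \partial_x \theta_\alpha(x,t) = \lim_{x\to 0^+} \partial_x G_\alpha(x,t).$$
Combining this last equation with the series representation of $G_\alpha(x,t)$ leads to the relation
$$\lim_{x\to 0^+} \partial_x \theta_\alpha(x,t) = \lim_{x\to 0^+} -\frac{1}{2t^\alpha}\, W_{-\frac{\alpha}{2},1-\alpha}\big(-|x|/t^{\alpha/2}\big).$$
The Laplace transform formula
$$\mathcal{L}\big[t^{-\alpha} W_{-\frac{\alpha}{2},1-\alpha}\big(-|x|/t^{\alpha/2}\big)\big](z) = z^{-(1-\alpha)} e^{-|x| z^{\alpha/2}}$$
can be used to give the first identity (6.10). To show (6.11), we prove
$$\lim_{x\to 0^+} \int_0^t \partial_x\, {}^{RL}_{\;\;0}D_t^{1-\alpha} \theta_\alpha(x,t-s)\,\varphi(s)\,ds = -\tfrac{1}{2}\varphi(t)$$
for all $\varphi(t) \in C_0^\infty(0,\infty)$. Since $\varphi(t) \in C_0^\infty(0,\infty)$, ${}^{RL}_{\;\;0}D_t^{1-\alpha}\varphi(t)$ exists and is continuous, and by Lemma 4.6, $\mathcal{L}[{}^{RL}_{\;\;0}D_t^{1-\alpha}\varphi(t)](z) = z^{1-\alpha}\mathcal{L}[\varphi(t)](z)$. By Lemma 4.5 and the convolution rule for the Laplace transform, we obtain
$$\begin{aligned} \lim_{x\to 0^+} \mathcal{L}\Big[\int_0^t \partial_x\, {}^{RL}_{\;\;0}D_t^{1-\alpha} \theta_\alpha(x,t-s)\,\varphi(s)\,ds\Big](z) &= \lim_{x\to 0^+} \mathcal{L}\Big[\int_0^t \partial_x \theta_\alpha(x,t-s)\, {}^{RL}_{\;\;0}D_s^{1-\alpha}\varphi(s)\,ds\Big](z) \\ &= \lim_{x\to 0^+} \mathcal{L}[\partial_x \theta_\alpha(x,t)](z)\, \mathcal{L}\big[{}^{RL}_{\;\;0}D_t^{1-\alpha}\varphi(t)\big](z) \\ &= -\tfrac{1}{2} z^{\alpha-1} z^{1-\alpha} \mathcal{L}[\varphi(t)](z) = \mathcal{L}[-\tfrac{1}{2}\varphi(t)](z), \end{aligned}$$
from which (6.11) follows. Then (6.12) is a direct consequence of the asymptotics of the Wright function. To see (6.13), we first compute $\partial_x \theta_\alpha(x,t)$ at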
$x = 1$ for $t > 0$, obtaining $\partial_x \theta_\alpha(x,t)|_{x=1} = 0$. The continuity of $\partial_x \theta_\alpha(x,t)$ in $x$ at $x = 1$ gives (6.13).

Now we can state a solution representation for problem (6.9).

Theorem 6.1. For piecewise continuous $f$, $u_0$, $g_0$, and $g_1$, the following representation gives a solution to problem (6.9),
$$u(x,t) = \sum_{i=1}^4 v_i,$$
where the functions $v_i$ are defined by (with $\bar\theta_\alpha(x,t) = {}^{RL}_{\;\;0}D_t^{1-\alpha}\theta_\alpha(x,t)$)
$$\begin{aligned} v_1(x,t) &= \int_0^1 \big(\theta_\alpha(x-y,t) - \theta_\alpha(x+y,t)\big)u_0(y)\,dy, \\ v_2(x,t) &= -2\int_0^t \partial_x \bar\theta_\alpha(x,t-s)\,g_0(s)\,ds, \\ v_3(x,t) &= 2\int_0^t \partial_x \bar\theta_\alpha(x-1,t-s)\,g_1(s)\,ds, \\ v_4(x,t) &= \int_0^t\!\!\int_0^1 \big[\theta_\alpha(x-y,t-s) - \theta_\alpha(x+y,t-s)\big]f(y,s)\,dy\,ds. \end{aligned}$$

Proof. We partly leave the proof as an exercise. It is easy to verify that the first term $v_1$ satisfies the subdiffusion equation. For the second integral, denoted by $v_2$, we deduce
$$\begin{aligned} \partial_t^\alpha v_2 &= -2\,\partial_t^\alpha \int_0^t {}^{RL}\partial_t^{1-\alpha}\frac{\partial}{\partial x}\theta_\alpha(x,t-s)\,g_0(s)\,ds \\ &= -2\,\partial_t^\alpha \int_0^t \frac{\partial}{\partial x}\theta_\alpha(x,t-s)\,{}^{RL}\partial_t^{1-\alpha}g_0(s)\,ds \\ &= -2\int_0^t \frac{\partial}{\partial x}\partial_t^\alpha\theta_\alpha(x,t-s)\,{}^{RL}\partial_t^{1-\alpha}g_0(s)\,ds \\ &= -2\int_0^t \frac{\partial}{\partial x}(\theta_\alpha)_{xx}(x,t-s)\,{}^{RL}\partial_t^{1-\alpha}g_0(s)\,ds \\ &= -2\int_0^t \frac{\partial}{\partial x}\big({}^{RL}\partial_t^{1-\alpha}\theta_\alpha\big)_{xx}(x,t-s)\,g_0(s)\,ds = (v_2)_{xx}(x,t). \end{aligned}$$
The subdiffusion equation for the other terms can be verified similarly. It remains to verify the initial and boundary conditions, which are a simple consequence of Lemma 6.1.

Theorem 6.1 provides one explicit solution representation. The uniqueness of the solution has to be shown independently; see Section 6.2.5 for an approach using an energy estimate.
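Remark 6.1. If we modify the boundary conditions to be of Neumann type, that is, suppose that instead $u$ satisfies
$$\begin{aligned} \partial_t^\alpha u &= u_{xx} + f(x,t), && 0 < x < 1,\ 0 < t < T, \\ u_x(0,t) &= g_0(t), \quad u_x(1,t) = g_1(t), && 0 < t \le T, \\ u(x,0) &= u_0(x), && 0 < x < 1, \end{aligned} \tag{6.14}$$
then in an entirely analogous way, now using the even extension of the data, we obtain the representation
$$\begin{aligned} u(x,t) = {} & \int_0^1 \big[\theta_\alpha(x-y,t) + \theta_\alpha(x+y,t)\big]u_0(y)\,dy \\ & - 2\int_0^t \bar\theta_\alpha(x,t-s)\,g_0(s)\,ds + 2\int_0^t \bar\theta_\alpha(x-1,t-s)\,g_1(s)\,ds \\ & + \int_0^t\!\!\int_0^1 \big[\theta_\alpha(x-y,t-s) + \theta_\alpha(x+y,t-s)\big]f(y,s)\,dy\,ds. \end{aligned} \tag{6.15}$$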
6.2. Existence and regularity theory

In this section, we return to higher space dimensions $\Omega \subseteq \mathbb{R}^d$, with arbitrary $d \in \mathbb{N}$, and derive a solution representation using separation of variables, which will also allow us to develop a regularity theory in Sobolev spaces.

6.2.1. Solution representation. First we derive a solution representation to problem (6.1) using the classical separation of variables. To this end, let $\{(\varphi_j, \lambda_j)\}_{j=1}^\infty$ be the Dirichlet eigenpairs of the elliptic operator $-L$, that is,
$$-L\varphi_j = \lambda_j\varphi_j \ \text{ in } \Omega, \qquad \varphi_j = 0 \ \text{ on } \partial\Omega.$$
Then it is known that the set $\{\varphi_j\}$ forms an orthonormal basis in $L^2(\Omega)$, and an orthogonal basis in $H_0^1(\Omega)$. By multiplying both sides of (6.1) by $\varphi_j$, integrating over the domain $\Omega$, and applying integration by parts, we obtain for the $L^2(\Omega)$ inner products $\langle\cdot,\cdot\rangle$
$$\begin{aligned} \langle\partial_t^\alpha u(\cdot,t),\varphi_j\rangle &= \langle Lu(\cdot,t),\varphi_j\rangle + \langle f(\cdot,t),\varphi_j\rangle = \langle u(\cdot,t),L\varphi_j\rangle + \langle f(\cdot,t),\varphi_j\rangle \\ &= -\lambda_j\langle u(\cdot,t),\varphi_j\rangle + \langle f(\cdot,t),\varphi_j\rangle. \end{aligned}$$
Let $u_j(t) = \langle u(\cdot,t),\varphi_j\rangle$, $f_j(t) = \langle f(\cdot,t),\varphi_j\rangle$, $u_{0j} = \langle u_0,\varphi_j\rangle$. Then we arrive at the following system of fractional ordinary differential equations,
$$\partial_t^\alpha u_j(t) = -\lambda_j u_j(t) + f_j(t), \quad t > 0, \qquad u_j(0) = u_{0j}$$
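for $j = 1, 2, \ldots$. It remains to find $u_j(t)$. To this end, we consider the following fractional ode
$$\partial_t^\alpha u_\lambda(t) = -\lambda u_\lambda(t) + f(t), \quad t > 0, \qquad u_\lambda(0) = 1, \tag{6.16}$$
where the scalar $\lambda > 0$. By means of the Laplace transform (see also Example 4.4 and Remark 4.6), we find that the solution $u_\lambda(t)$ is given by
$$u_\lambda(t) = E_{\alpha,1}(-\lambda t^\alpha) + \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda(t-s)^\alpha)f(s)\,ds, \tag{6.17}$$
where $E_{\alpha,\beta}(z)$ is the Mittag-Leffler function defined in Section 3.4. Hence the solution $u(t)$ to problem (6.1) can be formally represented by
$$u(x,t) = \sum_{j=1}^\infty \langle u_0,\varphi_j\rangle E_{\alpha,1}(-\lambda_j t^\alpha)\,\varphi_j(x) + \sum_{j=1}^\infty \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j(t-s)^\alpha)\langle f(\cdot,s),\varphi_j\rangle\,ds\,\varphi_j(x).$$
This can be succinctly rewritten as
$$u(t) = E(t)u_0 + \int_0^t \overline{E}(t-s)f(s)\,ds, \tag{6.18}$$
where the solution operators $E$ and $\overline{E}$ are defined by
$$E(t)v = \sum_{j=1}^\infty E_\alpha(-\lambda_j t^\alpha)\langle v,\varphi_j\rangle\varphi_j, \quad v \in L^2(\Omega), \tag{6.19}$$
and
$$\overline{E}(t)v = \sum_{j=1}^\infty t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j t^\alpha)\langle v,\varphi_j\rangle\varphi_j, \quad v \in L^2(\Omega), \tag{6.20}$$
respectively. Convergence of these series will be addressed in Sections 6.2.2 and 6.2.3. Since $\frac{d}{dt}E_{\alpha,1}(-\lambda_j t^\alpha) = -\lambda_j t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j t^\alpha)$ (see (3.34)), we have
$$\frac{d}{dt}E(t)v = \overline{E}(t)Lv. \tag{6.21}$$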
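The scalar identity behind (6.21), $\frac{d}{dt}E_{\alpha,1}(-\lambda t^\alpha) = -\lambda t^{\alpha-1}E_{\alpha,\alpha}(-\lambda t^\alpha)$, can be checked numerically. The sketch below evaluates the Mittag-Leffler function by its truncated power series $E_{\alpha,\beta}(z) = \sum_k z^k/\Gamma(\alpha k+\beta)$, which is only reliable for moderate $|z|$ and for $\alpha$ not too small; this is an illustrative choice, not a robust algorithm, and all names and parameter values are assumptions.

```python
import math

def mittag_leffler(z, alpha, beta=1.0, tol=1e-16, max_terms=500):
    """Truncated power series of E_{alpha,beta}(z); intended for moderate |z| only."""
    total = 0.0
    for k in range(max_terms):
        arg = alpha * k + beta
        if arg > 170.0:  # math.gamma overflows for larger arguments
            break
        term = z ** k / math.gamma(arg)
        total += term
        if k > 0 and abs(term) < tol:
            break
    return total

def derivative_fd(lam, t, alpha, h=1e-6):
    """Central difference of t -> E_{alpha,1}(-lam t^alpha)."""
    f = lambda s: mittag_leffler(-lam * s ** alpha, alpha)
    return (f(t + h) - f(t - h)) / (2.0 * h)

def derivative_formula(lam, t, alpha):
    """Right-hand side -lam t^(alpha-1) E_{alpha,alpha}(-lam t^alpha)."""
    return -lam * t ** (alpha - 1.0) * mittag_leffler(-lam * t ** alpha, alpha, alpha)

mismatch = abs(derivative_fd(2.0, 1.0, 0.6) - derivative_formula(2.0, 1.0, 0.6))
```

For large $\lambda_j t^\alpha$ the alternating series suffers from cancellation, so a production code would switch to an integral representation or an asymptotic expansion there.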
The operators $E$ and $\overline{E}$ denote the solution operators for problem (6.1) with $f \equiv 0$ and $u_0 \equiv 0$, respectively. This representation will be used extensively in the discussion of existence and regularity issues below.

In the rest of this section, we show the existence, uniqueness, and regularity of the solution to problem (6.1) using the solution representation (6.18). First we introduce the necessary functional analytic framework. We refer to Section A.2 for preliminaries on Sobolev spaces. For better readability we will denote the $L^2(\Omega)$ inner product by angle brackets. The same notation will be used for dual pairings, if needed. We define an operator $A$ in $L^2(\Omega)$ by
$$(Au)(x) = (-Lu)(x), \quad x \in \Omega,$$
which is equipped with homogeneous Dirichlet boundary conditions with its domain $D(A) = H^2(\Omega) \cap H_0^1(\Omega)$. Since $A$ is a symmetric uniformly elliptic operator, the spectrum of $A$ is entirely composed of eigenvalues, and counting according to the multiplicities, we can set $0 < \lambda_1 \le \lambda_2 \le \cdots$. By $\varphi_j \in H^2(\Omega) \cap H_0^1(\Omega)$, we denote the $L^2(\Omega)$ orthonormal eigenfunctions corresponding to $\lambda_j$. See Section 9.2.1.1 for a more detailed discussion in the one-dimensional case. This allows us to diagonalise $A$ as $Au = \sum_{j=1}^\infty \lambda_j\langle u,\varphi_j\rangle\varphi_j$, and we can define the fractional power $A^\gamma$ for some $\gamma \in \mathbb{R}$ by
$$A^\gamma u = \sum_{j=1}^\infty \lambda_j^\gamma\langle u,\varphi_j\rangle\varphi_j,$$
whenever this sum converges. Its domain for $\gamma = s/2 > 0$ is the space $\dot{H}^s(\Omega)$ defined by
$$\dot{H}^s(\Omega) = \Big\{v \in L^2(\Omega) : \sum_{j=1}^\infty \lambda_j^s|\langle v,\varphi_j\rangle|^2 < \infty\Big\},$$
which is a Hilbert space with the norm $\|v\|_{\dot{H}^s(\Omega)}^2 = \sum_{j=1}^\infty \lambda_j^s|\langle v,\varphi_j\rangle|^2$. By definition, we have the equivalent form,
$$\|v\|_{\dot{H}^s(\Omega)}^2 = \sum_{j=1}^\infty |\langle v,\lambda_j^{s/2}\varphi_j\rangle|^2 = \sum_{j=1}^\infty \langle v,A^{s/2}\varphi_j\rangle^2 = \sum_{j=1}^\infty \langle A^{s/2}v,\varphi_j\rangle^2 = \|A^{s/2}v\|_{L^2(\Omega)}^2.$$
We have $\dot{H}^s(\Omega) \subset H^s(\Omega)$ for $s > 0$. In particular, $\dot{H}^1(\Omega) = H_0^1(\Omega)$. To define $\dot{H}^s(\Omega)$ also for negative $s < 0$, note that since $\dot{H}^{-s}(\Omega) \subset L^2(\Omega)$, identifying the dual $(L^2(\Omega))^*$ with itself, we have $\dot{H}^{-s}(\Omega) \subset L^2(\Omega) \subset (\dot{H}^{-s}(\Omega))^*$. Correspondingly, for $s < 0$, we set $\dot{H}^s(\Omega) = (\dot{H}^{-s}(\Omega))^*$, the space of bounded linear functionals on $\dot{H}^{-s}(\Omega)$.

Due to the fact that $A$ was defined with homogeneous Dirichlet boundary conditions, powers of $A$ come with boundary conditions as well, which has implications for the space $\dot{H}^s(\Omega)$. If the domain $\Omega$ is $C^\infty$, as assumed in this chapter, then $\dot{H}^s(\Omega) = H^s(\Omega)$ for $0 < s < 1/2$, and for $j \in \mathbb{N}$ such that $2j - 3/2 < s < 2j + 1/2$,
$$\dot{H}^s(\Omega) = \big\{v \in H^s(\Omega) : v = Av = \cdots = A^{j-1}v = 0 \text{ on } \partial\Omega\big\}.$$
For the exceptional indices $s = 2j - 3/2$, the condition $A^{j-1}v = 0$ on the boundary $\partial\Omega$ is replaced by
$$A^{j-1}v \in H_{00}^{1/2}(\Omega) = \Big\{v \in H^{1/2}(\Omega) : \int_\Omega \frac{|v(x)|^2}{\operatorname{dist}(x,\partial\Omega)}\,dx < \infty\Big\}.$$
These results can be proved using elliptic regularity theory and interpolation [328, p. 34]. If the domain $\Omega$ is not $C^\infty$, then one must restrict $s$ accordingly. For example, if $\Omega$ is Lipschitz, then the above relations are valid for $s \le 1$, and if $\Omega$ is convex or $C^{1,1}$, then we can allow $s \le 2$.

Now we introduce the concept of a weak solution for problem (6.1).

Definition 6.1. We call $u$ a solution to problem (6.1) if for almost all $t \in (0,T)$, $u(\cdot,t) \in H_0^1(\Omega)$, $\partial_t^\alpha u(\cdot,t) \in H^{-1}(\Omega)$, and (6.1) holds in $H^{-1}(\Omega)$, and, moreover, for some $\gamma \ge 0$ (which may depend on $\alpha$), $u \in C([0,T];\dot{H}^{-\gamma}(\Omega))$ with
$$\lim_{t\to 0}\|u(\cdot,t) - u_0\|_{\dot{H}^{-\gamma}(\Omega)} = 0.$$

6.2.2. Homogeneous problem. We first consider the homogeneous problem with $f \equiv 0$. The regularity results will then follow from a bound on the solution operator $E(t)$ defined in (6.19).
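Lemma 6.3. For the operator $E(t)$ defined in (6.19), we have the estimate
$$\|E(t)v\|_{\dot{H}^p(\Omega)} \le c\,t^{\frac{q-p}{2}\alpha}\|v\|_{\dot{H}^q(\Omega)}, \quad v \in \dot{H}^q(\Omega),$$
where $0 \le q \le 2$ and $q \le p \le q+2$, and the constant $c$ depends only on $\alpha$ and $p-q$.

Proof. By the definition of the operator $E(t)$, we have
$$\|E(t)v\|_{\dot{H}^p(\Omega)}^2 = \sum_{j=1}^\infty \lambda_j^p|E_\alpha(-\lambda_j t^\alpha)|^2\langle v,\varphi_j\rangle^2 = t^{(q-p)\alpha}\sum_{j=1}^\infty \lambda_j^{p-q}t^{(p-q)\alpha}|E_\alpha(-\lambda_j t^\alpha)|^2\,\lambda_j^q\langle v,\varphi_j\rangle^2.$$
By Corollary 3.2, we have
$$|E_\alpha(-\lambda_j t^\alpha)| \le \frac{c_\alpha}{1+\lambda_j t^\alpha},$$
where the constant $c_\alpha$ depends only on $\alpha$. Consequently,
$$\|E(t)v\|_{\dot{H}^p(\Omega)}^2 \le c_\alpha^2\,t^{(q-p)\alpha}\sup_j\frac{\lambda_j^{p-q}t^{(p-q)\alpha}}{(1+\lambda_j t^\alpha)^2}\sum_{j=1}^\infty \lambda_j^q\langle v,\varphi_j\rangle^2 \le c\,t^{(q-p)\alpha}\|v\|_{\dot{H}^q(\Omega)}^2,$$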
where the last inequality follows from the choice of $p$ and $q$ such that $0 \le p-q \le 2$, so that
$$c(\alpha,p-q) := \sup_{\lambda,t>0}\frac{\lambda^{p-q}t^{(p-q)\alpha}}{(1+\lambda t^\alpha)^2} < \infty.$$
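The role of the restriction $0 \le p-q \le 2$ can be seen by writing $s = \lambda t^\alpha$ and $r = p-q$, so that the supremum above becomes $\sup_{s>0} s^r/(1+s)^2$. The following sketch, with hypothetical names and an arbitrary logarithmic grid, evaluates this ratio: for $r = 1$ the maximum $1/4$ is attained at $s = 1$, for $r = 0$ and $r = 2$ the ratio stays below $1$, and for $r = 3$ it grows without bound as $s \to \infty$.

```python
def smoothing_ratio(s, r):
    """s^r / (1 + s)^2 with s = lambda * t^alpha and r = p - q."""
    return s ** r / (1.0 + s) ** 2

def grid_sup(r, exponents=range(-60, 61)):
    """Maximum of the ratio over the logarithmic grid s = 10^(k/10), k in exponents."""
    return max(smoothing_ratio(10.0 ** (k / 10.0), r) for k in exponents)

sup_values = {r: grid_sup(r) for r in (0, 1, 2, 3)}
```

The grid covers $10^{-6} \le s \le 10^{6}$, so the value reported for $r = 3$ is only a lower bound on a quantity that is in fact infinite.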
Remark 6.2. The choice $p \le q+2$ in Lemma 6.3 indicates that the operator $E(t)$ has at best a smoothing property of order 2 in space, which contrasts sharply with the classical parabolic case, for which the following estimate holds [328]:
$$\|E(t)v\|_{\dot{H}^p(\Omega)} \le c\,t^{(q-p)/2}\|v\|_{\dot{H}^q(\Omega)}$$
for $q \ge 0$ and any $p \ge q$. This is due to the exponential decay of $e^{-z}$, instead of the linear decay of $E_{\alpha,1}(-z)$ for $0 < \alpha < 1$.

Now we can state well-posedness and some spatial regularity for the homogeneous problem.

Theorem 6.2. Let $\alpha \in (0,1)$, and $f = 0$.

(i) For $u_0 \in L^2(\Omega)$, there exists a unique weak solution $u \in C([0,T];L^2(\Omega)) \cap C((0,T];\dot{H}^2(\Omega))$ to problem (6.1) such that $\partial_t^\alpha u \in C((0,T];L^2(\Omega))$, and the following estimates hold for some constant $c = c(\alpha,\Omega) > 0$:
$$\|u\|_{C([0,T];L^2(\Omega))} \le c\|u_0\|_{L^2(\Omega)},$$
$$\|u(\cdot,t)\|_{H^2(\Omega)} + \|\partial_t^\alpha u(\cdot,t)\|_{L^2(\Omega)} \le c\,t^{-\alpha}\|u_0\|_{L^2(\Omega)}, \quad t \in (0,T].$$
The representation (6.18), that is, $u(t) = E(t)u_0$, holds in $C([0,T];L^2(\Omega)) \cap C((0,T];\dot{H}^2(\Omega))$. Further, $u : (0,T] \to L^2(\Omega)$ can be analytically extended to a sector $\Sigma := \{z \in \mathbb{C} : z \ne 0,\ |\arg(z)| \le \pi/2\}$.

(ii) For $u_0 \in H_0^1(\Omega)$ and any $p < \frac{2}{\alpha}$, the unique weak solution $u$ further belongs to $L^p(0,T;\dot{H}^2(\Omega))$ with $\partial_t^\alpha u \in L^p(0,T;L^2(\Omega))$, and there exists a constant $c = c(p,\alpha,\Omega) > 0$ such that
$$\|u\|_{L^p(0,T;H^2(\Omega))} + \|\partial_t^\alpha u\|_{L^p(0,T;L^2(\Omega))} \le c\|u_0\|_{H^1(\Omega)}.$$

(iii) For $u_0 \in \dot{H}^2(\Omega)$, the unique weak solution $u$ belongs to $C([0,T];\dot{H}^2(\Omega))$, with $\partial_t^\alpha u \in C([0,T];L^2(\Omega)) \cap C((0,T];H_0^1(\Omega))$, and there exists a constant $c = c(\alpha,\Omega) > 0$ such that
$$\|u\|_{C([0,T];\dot{H}^2(\Omega))} + \|\partial_t^\alpha u\|_{C([0,T];L^2(\Omega))} \le c\|u_0\|_{\dot{H}^2(\Omega)}.$$
Proof. (i) First we will show that the representation $u(t) = E(t)u_0$ gives a weak solution to problem (6.1). By Lemma 6.3 with $p = q = 0$, we have
\[ \|u(\cdot,t)\|_{L^2(\Omega)} = \|E(t)u_0\|_{L^2(\Omega)} \le c\,\|u_0\|_{L^2(\Omega)}, \tag{6.22} \]
and by Lemma 6.3 with $p = 2$ and $q = 0$,
\[ \|u(\cdot,t)\|_{\dot H^2(\Omega)} = \|E(t)u_0\|_{\dot H^2(\Omega)} \le c\,t^{-\alpha}\,\|u_0\|_{L^2(\Omega)}, \qquad t > 0. \tag{6.23} \]
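In (6.22), since the series $\sum_{j=1}^\infty \langle u_0,\varphi_j\rangle E_{\alpha,1}(-\lambda_j t^\alpha)\varphi_j$ is convergent in $L^2(\Omega)$ uniformly in $t \in [0,T]$, we deduce $u \in C([0,T];L^2(\Omega))$. Further, using (6.23), since $\sum_{j=1}^\infty \lambda_j \langle u_0,\varphi_j\rangle E_{\alpha,1}(-\lambda_j t^\alpha)\varphi_j$ is convergent in $L^2(\Omega)$ uniformly in $t \in [\delta,T]$ for any $\delta > 0$, we have $u \in C((0,T];\dot H^2(\Omega))$. Hence, we obtain $u \in C([0,T];L^2(\Omega)) \cap C((0,T];\dot H^2(\Omega))$. Now using the pde, we conclude $\partial_t^\alpha u \in C((0,T];L^2(\Omega))$, and the desired estimate follows.

It remains to show that the representation $u(t) = E(t)u_0$ satisfies the initial condition, that is,
\[ \lim_{t\to 0} \|u(\cdot,t) - u_0\|_{L^2(\Omega)} = 0. \tag{6.24} \]
In fact,
\[ \|u(\cdot,t) - u_0\|^2_{L^2(\Omega)} = \sum_{j=1}^\infty |\langle u_0,\varphi_j\rangle|^2\,\bigl(E_{\alpha,1}(-\lambda_j t^\alpha) - 1\bigr)^2 \]
with $\lim_{t\to 0}(E_{\alpha,1}(-\lambda_j t^\alpha) - 1) = 0$ for each $j \in \mathbb{N}$ and $E_{\alpha,1}((-\infty,0]) \subseteq [0,1]$, hence
\[ \sum_{j=1}^\infty |\langle u_0,\varphi_j\rangle|^2\,|E_{\alpha,1}(-\lambda_j t^\alpha) - 1|^2 \le 4\sum_{j=1}^\infty |\langle u_0,\varphi_j\rangle|^2 = 4\,\|u_0\|^2_{L^2(\Omega)} < \infty \]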
for any $0 \le t \le T$. Thus Lebesgue's dominated convergence theorem yields the limit (6.24), hence the representation $u(t) = E(t)u_0$ is indeed a solution to (6.1), in the sense of Definition 6.1.

We now prove uniqueness of the weak solution in the function space under consideration. Due to linearity of the pde, the difference $u = u_1 - u_2$ between two possible solutions $u_1$, $u_2$ of (6.1) satisfies the same problem (6.1) with zero initial condition $u_0 = 0$ and vanishing right-hand side $f = 0$. We will show that this solution $u$ must be trivial. Taking the inner product of (6.1) with $\varphi_j$ and noting that $u(t) \in L^2(\Omega)$ for almost every $t$, we conclude that $u_j(t) = \langle u(\cdot,t),\varphi_j\rangle$ solves the ode
\[ \partial_t^\alpha u_j(t) = -\lambda_j u_j(t) \qquad \text{for almost all } t \in (0,T). \]
Since $u(\cdot,t) \in L^2(\Omega)$ for almost all $t \in (0,T)$, it follows from (6.24) that $u_j(0) = 0$. Due to uniqueness of a solution to this fractional initial value problem, we deduce $0 = u_j(t) = \langle u(\cdot,t),\varphi_j\rangle$ almost everywhere in time. Since $\{\varphi_j\}$ is a complete orthonormal system in $L^2(\Omega)$, we have $u \equiv 0$ in $\Omega \times (0,T)$ almost everywhere.

Finally, we show the analyticity of $u(\cdot,t)$ in the sector $\Sigma \equiv \{z \in \mathbb{C} : z \ne 0,\ |\arg(z)| < \pi/2\}$. Since $E_{\alpha,1}(-z)$ is an entire function, as shown in Section 3.4, $E_{\alpha,1}(-\lambda_n t^\alpha)$ is analytic in $\Sigma$. Hence, the finite sum
\[ u_N(\cdot,t) = \sum_{j=1}^N \langle u_0,\varphi_j\rangle\,E_{\alpha,1}(-\lambda_j t^\alpha)\,\varphi_j \]
is analytic in $\Sigma$. Further, by Corollary 3.2,
\[ \|u_N(\cdot,z) - u(\cdot,z)\|^2_{L^2(\Omega)} = \sum_{j=N+1}^\infty \langle u_0,\varphi_j\rangle^2\,|E_{\alpha,1}(-\lambda_j z^\alpha)|^2 \le c \sum_{j=N+1}^\infty |\langle u_0,\varphi_j\rangle|^2, \qquad z \in \Sigma, \]
which by $(\langle u_0,\varphi_j\rangle)_{j\in\mathbb{N}} \in \ell^2$ yields $\lim_{N\to\infty} \|u_N(\cdot,z) - u(\cdot,z)\|_{L^\infty(\Sigma;L^2(\Omega))} = 0$. Hence $u$ is also analytic in $\Sigma$. This completes the proof of part (i).

(ii) By Lemma 6.3 with $p = 2$ and $q = 1$, we have
\[ \|u(t)\|^2_{\dot H^2(\Omega)} \le c^2\,\|u_0\|^2_{\dot H^1(\Omega)}\,t^{-\alpha} . \]
By $0 < \alpha < 2/p$, it follows directly that
\[ \|Lu\|_{L^p(0,T;L^2(\Omega))} = \|u\|_{L^p(0,T;\dot H^2(\Omega))} \le c\,\frac{T^{1-\alpha p/2}}{1-\alpha p/2}\,\|u_0\|_{H^1(\Omega)}, \]
that is, $u \in L^p(0,T;\dot H^2(\Omega))$. Now the pde implies that the same bound holds for $\|\partial_t^\alpha u\|_{L^p(0,T;L^2(\Omega))}$, and the proof of part (ii) is complete.

(iii) Let $u_0 \in \dot H^2(\Omega)$. Then for any $t \ge 0$, by Lemma 6.3 with $p = q = 2$
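\[ \|u(\cdot,t)\|^2_{\dot H^2(\Omega)} = \|E(t)u_0\|^2_{\dot H^2(\Omega)} \le c^2\,\|u_0\|^2_{\dot H^2(\Omega)}, \]
that is,
\[ \|Lu\|_{C([0,T];L^2(\Omega))} = \|u\|_{C([0,T];\dot H^2(\Omega))} \le c\,\|u_0\|_{\dot H^2(\Omega)} . \]
Again, using the pde, we get the same bound for $\|\partial_t^\alpha u\|_{C([0,T];L^2(\Omega))}$.

By repeating the proof in Theorem 6.2 and interpolation, we have the following regularity estimates for intermediate cases.

Corollary 6.1. The solution $u(t) = E(t)u_0$ to problem (6.1) with $f \equiv 0$ satisfies
\[ \|(\partial_t^\alpha)^\ell u(\cdot,t)\|_{\dot H^p(\Omega)} \le c\,t^{-\alpha\left(\ell + \frac{p-q}{2}\right)}\,\|u_0\|_{\dot H^q(\Omega)}, \qquad t > 0, \quad \ell \in \{0,1\}, \]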
where for $\ell = 0$, $0 \le q \le p \le 2$, and for $\ell = 1$, $0 \le p \le q \le 2$ and $q \le p + 2$.

In classical diffusion, the $j$th Fourier mode of the initial data $u_0$ is damped by a factor $E_{1,1}(-\lambda_j t) = e^{-\lambda_j t}$, which gives
\[ \|u(t)\|_{\dot H^p(\Omega)} \le c\,t^{-(p-q)/2}\,\|u_0\|_{\dot H^q(\Omega)} \]
for every $p > q$. The weak damping on the high frequency modes in fractional diffusion accounts for the restriction $p \le q + 2$ in Corollary 6.1.

The following result gives the temporal regularity of the solution. Note that the existence and uniqueness result Theorem 6.2 remains valid in the time interval $(0,\infty)$.

Theorem 6.3. Let $0 < \alpha < 1$, $u_0 \in L^2(\Omega)$, and $f = 0$. Then for the unique weak solution $u \in C([0,\infty);L^2(\Omega)) \cap C((0,\infty);\dot H^2(\Omega))$ to problem (6.1), the estimate
\[ \|u(\cdot,t)\|_{L^2(\Omega)} \le \frac{c}{1+\lambda_1 t^\alpha}\,\|u_0\|_{L^2(\Omega)}, \qquad t \ge 0, \]
holds. Moreover, for any $q \ge 0$, $q - 2 \le p \le q$, and $m \in \mathbb{N}$
\[ \|\partial_t^m u(\cdot,t)\|_{\dot H^p(\Omega)} \le c\,t^{-m-\frac{p-q}{2}\alpha}\,\|u_0\|_{\dot H^q(\Omega)}, \qquad t > 0. \]
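Proof. By Corollary 3.2, for any $t \ge 0$,
\[ \|u(\cdot,t)\|^2_{L^2(\Omega)} = \sum_{j=1}^\infty \langle u_0,\varphi_j\rangle^2\,E_{\alpha,1}(-\lambda_j t^\alpha)^2 \le \sum_{j=1}^\infty \langle u_0,\varphi_j\rangle^2 \left(\frac{c}{1+\lambda_j t^\alpha}\right)^2 \le \left(\frac{c}{1+\lambda_1 t^\alpha}\right)^2 \|u_0\|^2_{L^2(\Omega)} . \]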
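Upon differentiation (cf. (3.34)), we have for any $m \in \mathbb{N}$
\[ \partial_t^m u(t) = -\sum_{j=1}^\infty \lambda_j\,t^{\alpha-m}\,\langle u_0,\varphi_j\rangle\,E_{\alpha,\alpha-m+1}(-\lambda_j t^\alpha)\,\varphi_j . \]
Consequently, by Corollary 3.2,
\[ \|\partial_t^m u(\cdot,t)\|^2_{\dot H^p(\Omega)} = t^{-2m}\sum_{j=1}^\infty \lambda_j^{2+p}\,t^{2\alpha}\,\langle u_0,\varphi_j\rangle^2\,\bigl(E_{\alpha,\alpha-m+1}(-\lambda_j t^\alpha)\bigr)^2 \]
\[ \le c^2 \sup_{\lambda>0} \frac{(\lambda t^\alpha)^{2+p-q}}{(1+\lambda t^\alpha)^2}\;t^{-2m-(p-q)\alpha}\sum_{j=1}^\infty \lambda_j^q\,\langle u_0,\varphi_j\rangle^2 = \tilde c\,t^{-2m-(p-q)\alpha}\,\|u_0\|^2_{\dot H^q(\Omega)} . \]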
Example 6.1. Consider problem (6.1) on the unit interval $\Omega = (0,1)$, that is,
\[ \begin{cases} \partial_t^\alpha u = u_{xx} & \text{in } \Omega \times (0,\infty), \\ u(\cdot,t) = 0 & \text{on } \partial\Omega \times (0,\infty), \\ u(\cdot,0) = u_0 & \text{in } \Omega, \end{cases} \]
with $u_0(x) = \sin(\pi x)$. The initial data $u_0$ belongs to $\dot H^q(\Omega)$ for any $q > 0$. Since $\sin(\pi x)$ is the first Dirichlet eigenfunction of the negative Laplacian on the unit interval $\Omega$ (and the corresponding eigenvalue is $\pi^2$), the unique solution $u$ to the problem is given by $u(x,t) = E_{\alpha,1}(-\pi^2 t^\alpha)\sin(\pi x)$. Despite the smoothness of $u_0$, the temporal regularity of the solution $u(x,t)$ is limited, and the estimate in Theorem 6.3 is nearly sharp.

6.2.3. Inhomogeneous problems. Next we turn to the inhomogeneous problem for (6.1) with $u_0 = 0$. We begin with an estimate for the solution operator $E$ defined in (6.20).

Lemma 6.4. For the operator $E$ defined in (6.20), the estimate
\[ \|E(t)v\|_{\dot H^p(\Omega)} \le c\,t^{-1+\alpha(1+(q-p)/2)}\,\|v\|_{\dot H^q(\Omega)}, \qquad v \in \dot H^q(\Omega), \]
holds for any $t > 0$, $q \ge 0$, $q \le p \le q + 2$.

The next result gives well-posedness of the inhomogeneous problem with zero initial data.

Theorem 6.4. Let $u_0 = 0$, and $f \in L^2(0,T;\dot H^q(\Omega))$, $q \ge -1$. Then there exists a unique weak solution $u \in L^2(0,T;\dot H^{2+q}(\Omega))$ to problem (6.1) such that $\partial_t^\alpha u \in L^2(0,T;L^2(\Omega))$. Moreover,
\[ \|u\|_{L^2(0,T;\dot H^{2+q}(\Omega))} + \|\partial_t^\alpha u\|_{L^2(0,T;\dot H^q(\Omega))} \le c\,\|f\|_{L^2(0,T;\dot H^q(\Omega))}, \]
and the representation (6.18), that is, $u(t) = \int_0^t E(t-s)f(s)\,ds$, holds in the space $L^2(0,T;\dot H^q(\Omega))$. Further, if $\alpha > \tfrac12$, then $\lim_{t\to 0}\|u(\cdot,t)\|_{\dot H^q(\Omega)} = 0$.

Proof. By (3.31) and Corollary 3.3, for $t > 0$, we have
\[ \lambda_j \int_0^t |s^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j s^\alpha)|\,ds = \lambda_j \int_0^t s^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j s^\alpha)\,ds = -\int_0^t \frac{d}{ds}E_{\alpha,1}(-\lambda_j s^\alpha)\,ds = 1 - E_{\alpha,1}(-\lambda_j t^\alpha) \le 1 . \]
Thus by Young's inequality for convolution (A.15) with $p = r = 2$, $q = 1$, that is,
\[ \|g * h\|_{L^2(0,T)} \le \|g\|_{L^2(0,T)}\,\|h\|_{L^1(0,T)}, \]
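we obtain
\[ \|u\|^2_{L^2(0,T;\dot H^{2+q}(\Omega))} = \sum_{j=1}^\infty \lambda_j^q \int_0^T \Bigl| \int_0^t \langle f(\cdot,s),\varphi_j\rangle\,\lambda_j (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j (t-s)^\alpha)\,ds \Bigr|^2 dt \]
\[ \le \sum_{j=1}^\infty \lambda_j^q \int_0^T |\langle f(\cdot,t),\varphi_j\rangle|^2\,dt\,\Bigl( \int_0^T |\lambda_j t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j t^\alpha)|\,dt \Bigr)^2 \]
\[ \le \sum_{j=1}^\infty \lambda_j^q \int_0^T |\langle f(\cdot,t),\varphi_j\rangle|^2\,dt = \|f\|^2_{L^2(0,T;\dot H^q(\Omega))} . \]
Using the pde, we also get
\[ \|\partial_t^\alpha u\|^2_{L^2(0,T;\dot H^q(\Omega))} = \|Lu + f\|^2_{L^2(0,T;\dot H^q(\Omega))} \le 2\,\|f\|^2_{L^2(0,T;\dot H^q(\Omega))} . \]
Finally, the fact that the initial condition is satisfied follows from boundedness of $u$ in $H^\alpha(0,T;\dot H^q(\Omega))$ (cf. Theorem 4.8), which by continuity of the embedding $H^\alpha(0,T) \to C[0,T]$ for $\alpha > 1/2$ implies continuity with respect to time. The uniqueness of the solution has already been proven in Theorem 6.2.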
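The key identity in the proof above, $\lambda \int_0^t s^{\alpha-1}E_{\alpha,\alpha}(-\lambda s^\alpha)\,ds = 1 - E_{\alpha,1}(-\lambda t^\alpha)$, can be checked numerically. The following is a minimal sketch under stated assumptions: the Mittag-Leffler function is evaluated by its truncated power series, which is only reliable for moderate arguments, and the parameter values are arbitrary. The helper names are hypothetical.

```python
import math

def mittag_leffler(alpha, beta, z, terms=80):
    """Truncated power series of E_{alpha,beta}(z); only reliable for moderate |z|."""
    return sum(z ** k / math.gamma(alpha * k + beta) for k in range(terms))

def kernel_integral(alpha, lam, t, n=200):
    """lam * int_0^t s^(alpha-1) E_{alpha,alpha}(-lam s^alpha) ds.

    The substitution r = s^alpha removes the weak singularity at s = 0 and
    leaves (lam / alpha) * int_0^{t^alpha} E_{alpha,alpha}(-lam r) dr, which is
    evaluated with the composite Simpson rule (n must be even).
    """
    upper = t ** alpha
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * mittag_leffler(alpha, alpha, -lam * i * h)
    return lam / alpha * total * h / 3.0

alpha, lam, t = 0.7, 3.0, 0.5
lhs = kernel_integral(alpha, lam, t)
rhs = 1.0 - mittag_leffler(alpha, 1.0, -lam * t ** alpha)
print(lhs, rhs)
```

Both numbers agree up to quadrature error and lie in $(0,1)$, which is the uniform bound on the $L^1$ norm of the kernel that enters Young's inequality.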
Remark 6.3. The initial condition can also be made sense of if $\alpha \le 1/2$, provided $f$ is sufficiently regular: if for some $p > 1$, $\tilde q \in \mathbb{R}$, $A^{\tilde q}f \in L^\infty(0,T;L^p(\Omega))$, then for any $\gamma > d(p-1)/(2p) - 2 - 2\tilde q$ the initial condition is satisfied in the sense that $\lim_{t\to 0}\|u(\cdot,t)\|_{\dot H^{-\gamma}(\Omega)} = 0$. This can be seen as follows. By the solution representation and another application of Young's inequality (A.15), now with $p = r = \infty$, $q = 1$, and setting $\eta = (2 + \gamma + 2\tilde q)2p/(p-1)$ for convenience, we have
u(·, t) 2H˙ −γ (Ω) *2 * t ∞ * −γ ** α−1 α = λj * f (·, s), ϕj (t − s) Eα,α (−λj (t − s) ) ds** ≤
j=1 ∞ j=1
0
*
q λ2˜ j
sup 0≤s≤t
* |f (·, s), ϕj |2 λj−γ−2˜q **
≤ Aq˜f 2L∞ (0,T ;L2p (Ω)) ≤ Aq˜f 2L∞ (0,T ;L2p (Ω))
∞ j=1 ∞ j=1
t α−1
s 0
*2 * Eα,α (−λj s ) ds**
α 2 λ−η j (1 − Eα,1 (−λj t ))
−η) 2(p−1)/p
λj
.
α
2(p−1)/p
Due to Weyl's eigenvalue estimate [345] $\lambda_j \ge c\,j^{2/d}$, $j \in \mathbb{N}$, the sum $\sum_{j=1}^\infty \lambda_j^{-\eta}$ is finite for $\gamma > d(p-1)/(2p) - 2 - 2\tilde q$. Then by the limiting behaviour of the Mittag-Leffler function for small argument, $\lim_{t\to 0}(1 - E_{\alpha,1}(-\lambda_j t^\alpha)) = 0$ for each $j \in \mathbb{N}$. Now Lebesgue's dominated convergence theorem implies
\[ \lim_{t\to 0} \|u(\cdot,t)\|^2_{\dot H^{-\gamma}(\Omega)} = 0. \]
Our next result is an $L^\infty$ estimate in time for the solution $u$ to problem (6.1) with zero initial data. The factor $\epsilon^{-1}$ in the estimate reflects the limited smoothing property of the subdiffusion operator.

Theorem 6.5. Let $u_0 = 0$, and $f \in L^\infty(0,T;\dot H^q(\Omega))$, $0 \le q \le 1$. Then for any $0 < \epsilon < 1$, the unique weak solution $u$ to problem (6.1), according to Theorem 6.4, satisfies $u \in L^\infty(0,T;\dot H^{q+2-\epsilon}(\Omega))$ and
\[ \|u(t)\|_{\dot H^{q+2-\epsilon}(\Omega)} \le C\,\epsilon^{-1}\,t^{\epsilon\alpha/2}\,\|f\|_{L^\infty(0,t;\dot H^q(\Omega))} . \tag{6.25} \]

Proof. By Lemma 6.4 we have
\[ \|u(t)\|_{\dot H^{q+2-\epsilon}(\Omega)} = \Bigl\| \int_0^t E(t-s)f(s)\,ds \Bigr\|_{\dot H^{q+2-\epsilon}(\Omega)} \le \int_0^t \|E(t-s)f(s)\|_{\dot H^{q+2-\epsilon}(\Omega)}\,ds \]
\[ \le c \int_0^t (t-s)^{\epsilon\alpha/2-1}\,\|f(s)\|_{\dot H^q(\Omega)}\,ds \le 2c\,(\alpha\epsilon)^{-1}\,t^{\epsilon\alpha/2}\,\|f\|_{L^\infty(0,t;\dot H^q(\Omega))}, \]
which shows the desired estimate. Finally, it follows directly that the representation $u$ also satisfies the initial condition $u(0) = 0$, that is, for any $\epsilon \in (0,1)$, $\lim_{t\to 0^+}\|u(t)\|_{\dot H^{q+2-\epsilon}(\Omega)} = 0$.

Temporal regularity of the inhomogeneous problem, following [242], can be derived by introducing an operator $D$ defined by $Dw(t) = t\,w'(t)$.

Lemma 6.5. There exist constants $a_{mjk}$ such that
\[ D^m(v * w) = \sum_{j+k\le m} a_{mjk}\,(D^j v) * (D^k w). \]
Proof. The assertion follows by induction on $m$. First we observe that
\[ \frac{d}{dt}\int_0^t v(t-s)w(s)\,ds = v(0)w(t) + \int_0^t v'(t-s)w(s)\,ds, \]
which gives
\[ D(v * w)(t) = v(0)\,t\,w(t) + \int_0^t (Dv)(t-s)\,w(s)\,ds + \int_0^t s\,v'(t-s)\,w(s)\,ds. \]
Integration by parts gives
\[ \int_0^t s\,v'(t-s)\,w(s)\,ds = -v(0)\,t\,w(t) + (v * w)(t) + (v * Dw)(t), \]
and thus $D(v * w) = v * w + (Dv) * w + v * (Dw)$. This shows the case of $m = 1$, with $a_{100} = a_{110} = a_{101} = 1$, and the identity
\[ D[(D^j v) * (D^k w)] = (D^j v) * (D^k w) + (D^{j+1} v) * (D^k w) + (D^j v) * (D^{k+1} w), \]
from which the induction step follows immediately.

From this we can conclude the following.

Corollary 6.2. For any $m \in \mathbb{N}$, there exist constants $c_{mjk}$ such that
\[ t^m (u * v)^{(m)} = \sum_{j+k\le m} c_{mjk}\,(t^j u^{(j)}) * (t^k v^{(k)}). \]
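The product rule $D(v * w) = v * w + (Dv) * w + v * (Dw)$ from the proof of Lemma 6.5 can be verified exactly on polynomials. The sketch below is an illustration only: polynomials are stored as dictionaries mapping powers to rational coefficients, the convolution uses $t^a * t^b = \frac{a!\,b!}{(a+b+1)!}\,t^{a+b+1}$, and all names and the sample polynomials are assumptions made for the example.

```python
from fractions import Fraction
from math import factorial

def convolve(v, w):
    """(v * w)(t) = int_0^t v(t - s) w(s) ds for polynomials given as {power: coeff}."""
    out = {}
    for a, ca in v.items():
        for b, cb in w.items():
            beta = Fraction(factorial(a) * factorial(b), factorial(a + b + 1))
            out[a + b + 1] = out.get(a + b + 1, Fraction(0)) + ca * cb * beta
    return out

def D(p):
    """The operator (Dp)(t) = t p'(t), which maps t^n to n t^n."""
    return {n: n * c for n, c in p.items()}

def add(*polys):
    """Sum of several polynomials."""
    out = {}
    for p in polys:
        for n, c in p.items():
            out[n] = out.get(n, Fraction(0)) + c
    return out

def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {n: c for n, c in p.items() if c != 0}

v = {0: Fraction(1), 2: Fraction(3)}       # v(t) = 1 + 3 t^2
w = {1: Fraction(2), 3: Fraction(-1, 2)}   # w(t) = 2 t - t^3 / 2

lhs = clean(D(convolve(v, w)))
rhs = clean(add(convolve(v, w), convolve(D(v), w), convolve(v, D(w))))
print(lhs == rhs)
```

Rational arithmetic avoids any rounding, so the comparison is an exact identity check for the chosen polynomials rather than a numerical approximation.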
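Applying this with $u(t) = t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j t^\alpha)$ yields another estimate on $E(t)$.

Lemma 6.6. For any $t > 0$, we have for any $m \ge 0$,
\[ \|\partial_t^m E(t)v\|_{\dot H^p(\Omega)} \le c\,t^{\alpha-m-1-\frac{p-q}{2}\alpha}\,\|v\|_{\dot H^q(\Omega)}, \]
where $q \ge 0$ and $q \le p \le q + 2$.

Based on this, we can state a regularity result for the inhomogeneous problem.

Theorem 6.6. If $u_0 \equiv 0$ and $f \in W^{m,\infty}(0,T;L^2(\Omega))$, $m \in \mathbb{N}$, then
\[ \|\partial_t^m u(t)\|_{\dot H^p(\Omega)} \le c_T\,t^{\alpha-m}\,\|f\|_{W^{m,\infty}(0,T;L^2(\Omega))}, \qquad 0 \le p \le 2. \]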
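Proof. Using Corollary 6.2 and Lemma 6.6, we derive that for $t \in (0,T]$,
\[ t^m\,\|\partial_t^m u\|_{\dot H^p(\Omega)} \le c \sum_{j+k\le m} \int_0^t \bigl\| (t-s)^j\,\partial_t^j E(t-s)\,\bigl(s^k f^{(k)}(s)\bigr) \bigr\|_{\dot H^p(\Omega)}\,ds \]
\[ \le c \sum_{j+k\le m} \int_0^t (t-s)^{\alpha-1-\frac{p}{2}\alpha}\,s^k\,\|f^{(k)}(s)\|_{L^2(\Omega)}\,ds \le c\,\|f\|_{W^{m,\infty}(0,T;L^2(\Omega))} \sum_{j+k\le m} t^{\frac{2-p}{2}\alpha+k} . \]
For $t \in (0,T]$, $\sum_{j+k\le m} t^{\frac{2-p}{2}\alpha+k-m} \le c_T\,t^{\frac{2-p}{2}\alpha-m}$, and the assertion follows.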
6.2.4. Semilinear problems. Now we briefly discuss a slightly more complex case, namely semilinear problems, using a fixed point argument. Consider the following initial boundary value problem for a semilinear subdiffusion equation
\[ \begin{cases} \partial_t^\alpha u = Lu + f(u) + r & \text{in } \Omega \times (0,T), \\ \partial_\nu u + \gamma u = 0 & \text{on } \partial\Omega \times (0,T), \\ u(\cdot,0) = u_0 & \text{in } \Omega. \end{cases} \tag{6.26} \]
In the model, we assume
\[ f \in C^{0,1}(\mathbb{R}) = W^{1,\infty}(\mathbb{R}) \quad \text{and} \quad r \in W^{1,P}(0,T;L^2(\Omega)) \cap L^\infty(0,T;L^2(\Omega)) \tag{6.27} \]
for some $P > 1/\alpha$; in particular, $f$ is Lipschitz continuous on $\mathbb{R}$. The argument below is based on the operator theoretic approach in $L^2(\Omega)$. For existence and uniqueness, we refer to [166, Theorem 3.1]. We will here provide a regularity result.

Theorem 6.7. Let $u_0 \in \dot H^2(\Omega)$ and (6.27) hold. Then the solution $u$ to problem (6.26) belongs to $C([0,T];\dot H^2(\Omega)) \cap C^1((0,T];L^2(\Omega))$ with $\partial_t^\alpha u \in C([0,T];L^2(\Omega))$ and
\[ \|u_t(t)\|_{L^2(\Omega)} \le C\,t^{\alpha-1}\bigl( \|u_0\|_{\dot H^2(\Omega)} + \|f\|_{L^\infty(\mathbb{R})} + \|r_t\|_{L^P(0,t;L^2(\Omega))} \bigr). \tag{6.28} \]
If additionally $u_0 \in \dot H^{2/\alpha}(\Omega)$, then we even have $u \in C^1([0,T];L^2(\Omega))$ with
\[ \|u_t(t)\|_{L^2(\Omega)} \le C\bigl( \|u_0\|_{\dot H^{2/\alpha}(\Omega)} + \|f\|_{L^\infty(\mathbb{R})} + \|r_t\|_{L^P(0,t;L^2(\Omega))} \bigr). \tag{6.29} \]
Proof. Note that for unique existence of $u$, it suffices to discuss the integral equation (cf. (6.18))
\[ u(t) = E(t)u_0 + \int_0^t E(t-s)f(u(s))\,ds + \int_0^t E(t-s)r(s)\,ds, \qquad 0 < t < T, \]
similarly to the ode setting from Section 5.1.2. We will again use the self-adjoint positive definite operator $A = -L$ with homogeneous Dirichlet boundary conditions. First we estimate $u_t(t)$, which for almost every $t \in (0,T)$ is given by
\[ u_t(t) = \partial_t E(t)u_0 + \partial_t\Bigl( \int_0^t E(s)f(u(t-s))\,ds \Bigr) + \partial_t\Bigl( \int_0^t E(s)r(t-s)\,ds \Bigr) \]
\[ = -E(t)Au_0 + \int_0^t E(s)f'(u(t-s))\,u_t(t-s)\,ds + \int_0^t E(s)r_t(t-s)\,ds, \]
where we have used (6.21). Consequently, for any $0 < t \le T$, by Lemma 6.4 with $p = q = 0$, we get
\[ \|u_t(t)\|_{L^2(\Omega)} \le \int_0^t \|E(s)\|_{L^2(\Omega)\to L^2(\Omega)}\,\|f'(u(t-s))\|_{L^\infty(\mathbb{R})}\,\|u_t(t-s)\|_{L^2(\Omega)}\,ds + \int_0^t \|E(s)\|_{L^2(\Omega)\to L^2(\Omega)}\,\|r_t(t-s)\|_{L^2(\Omega)}\,ds + \|E(t)Au_0\|_{L^2(\Omega)} \]
\[ \le c\int_0^t s^{\alpha-1}\,\|f'\|_{L^\infty(\mathbb{R})}\,\|u_t(t-s)\|_{L^2(\Omega)}\,ds + c\int_0^t s^{\alpha-1}\,\|r_t(t-s)\|_{L^2(\Omega)}\,ds + \|E(t)Au_0\|_{L^2(\Omega)} . \]
For the last term, depending on the regularity of the initial data, by Lemma 6.4 with $p = 0$ and $q = 0$ or $q = 2/\alpha - 2$, we get
\[ \|E(t)Au_0\|_{L^2(\Omega)} \le \begin{cases} c\,t^{\alpha-1}\,\|u_0\|_{\dot H^2(\Omega)} & \text{if } u_0 \in \dot H^2(\Omega), \\ c\,\|u_0\|_{\dot H^{2/\alpha}(\Omega)} & \text{if } u_0 \in \dot H^{2/\alpha}(\Omega). \end{cases} \]
The term containing $r$ can be estimated by Hölder's inequality with $p = P/(P-1)$, $q = P$,
\[ \int_0^t s^{\alpha-1}\,\|r_t(t-s)\|_{L^2(\Omega)}\,ds \le \Bigl( \int_0^t s^{(\alpha-1)P/(P-1)}\,ds \Bigr)^{(P-1)/P}\,\|r_t\|_{L^P(0,T;L^2(\Omega))}, \]
where due to $P > 1/\alpha$, we have $(\alpha-1)P/(P-1) > -1$ and the first integral is finite. By the special version of Gronwall's inequality Lemma A.1, we obtain (6.28) or (6.29), respectively.

To obtain spatial regularity, using again the solution representation (6.18) and (6.21), we write
\[ Au(t) = AE(t)u_0 + \int_0^t AE(t-s)\bigl(f(u(s)) + r(s)\bigr)\,ds \]
\[ = AE(t)u_0 + (E(t) - I)\bigl(f(u(t)) + r(t)\bigr) + \int_0^t AE(t-s)\bigl(f(u(s)) - f(u(t)) + r(s) - r(t)\bigr)\,ds. \]
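Lemma 6.3 with $p = q = 0$ yields
\[ \|AE(t)u_0\|_{L^2(\Omega)} \le \|u_0\|_{\dot H^2(\Omega)}, \]
\[ \|(E(t) - I)(f(u(t)) + r(t))\|_{L^2(\Omega)} \le \|f(u(t))\|_{L^2(\Omega)} + \|r(t)\|_{L^2(\Omega)} \le |\Omega|^{1/2}\,\|f\|_{L^\infty(\mathbb{R})} + \|r\|_{L^\infty(0,T;L^2(\Omega))} . \]
By the mean value theorem, we have
\[ f(u(s)) - f(u(t)) = \int_0^1 f'\bigl(u(t) + \theta(u(s) - u(t))\bigr)\,d\theta\;(u(s) - u(t)) . \]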
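Consequently,
\[ \|f(u(s)) - f(u(t))\|_{L^2(\Omega)} \le \|f'\|_{L^\infty(\mathbb{R})}\,\|u(s) - u(t)\|_{L^2(\Omega)} \le \|f'\|_{L^\infty(\mathbb{R})} \int_s^t \|u_t(\tau)\|_{L^2(\Omega)}\,d\tau \le \tilde C(u_0,f,r) \int_s^t \tau^{\alpha-1}\,d\tau \]
with
\[ \tilde C(u_0,f,r) = C\bigl( \|u_0\|_{\dot H^2(\Omega)} + \|f\|_{L^\infty(\mathbb{R})} + \|r_t\|_{L^P(0,t;L^2(\Omega))} \bigr)\,\|f'\|_{L^\infty(\mathbb{R})} . \]
Hence, by Lemma 6.4 with $p = 2$, $q = 0$, we deduce
\[ \Bigl\| \int_0^t AE(t-s)\bigl(f(u(s)) - f(u(t))\bigr)\,ds \Bigr\|_{L^2(\Omega)} \le c\,\tilde C(u_0,f,r)\,\|f\|_{L^\infty(\mathbb{R})} \int_0^t (t-s)^{-1} \int_s^t \tau^{\alpha-1}\,d\tau\,ds, \]
where
\[ \int_0^t (t-s)^{-1} \int_s^t \tau^{\alpha-1}\,d\tau\,ds = \int_0^t \int_0^\tau (t-s)^{-1}\,ds\;\tau^{\alpha-1}\,d\tau = \int_0^t \bigl(\ln(t) - \ln(t-\tau)\bigr)\,\tau^{\alpha-1}\,d\tau < \infty. \]
Likewise, since $r(t) - r(s) = \int_s^t r_t(\tau)\,d\tau$, using an argument similar to above and Lemma 6.4 with $p = 2$, $q = 0$, we have
\[ \Bigl\| \int_0^t AE(t-s)\bigl(r(s) - r(t)\bigr)\,ds \Bigr\| \le c \int_0^t (t-s)^{-1} \int_s^t \|r_t(\tau)\|_{L^2(\Omega)}\,d\tau\,ds = c \int_0^t \int_0^\tau (t-s)^{-1}\,ds\;\|r_t(\tau)\|_{L^2(\Omega)}\,d\tau \]
\[ = c \int_0^t \bigl(\ln(t) - \ln(t-\tau)\bigr)\,\|r_t(\tau)\|_{L^2(\Omega)}\,d\tau \le \tilde c\,\|r_t\|_{L^P(0,t;L^2(\Omega))} . \]
Therefore, we have $Au \in C([0,T];L^2(\Omega))$, that is, $u \in C([0,T];\dot H^2(\Omega))$, and
\[ \|u\|_{C([0,T];\dot H^2(\Omega))} \le c\bigl( \|u_0\|_{\dot H^2(\Omega)} + \|f\|_{L^\infty(\mathbb{R})} + \|r_t\|_{L^P(0,t;L^2(\Omega))} \bigr). \]
From the pde we obtain the temporal regularity
\[ \|\partial_t^\alpha u(t)\|_{L^2(\Omega)} = \|Lu(t) + f(u(t)) + r(t)\|_{L^2(\Omega)} \le \|u\|_{C([0,T];\dot H^2(\Omega))} + |\Omega|^{1/2}\,\|f\|_{L^\infty(\mathbb{R})} + \|r\|_{L^\infty(0,T;L^2(\Omega))} . \]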
6.2.5. An energy argument. Finally, we return to the linear setting and we briefly mention an energy argument based on the coercivity estimate Lemma 4.18 that is due to Alikhanov [10]. This allows us to prove that the solution of problem (6.1) is unique and continuously depends on the input data.

Theorem 6.8. The solution $u$ to problem (6.1) satisfies the following a priori estimate
\[ \|u\|^2_{L^2(\Omega)} + {}_0I_t^\alpha \|u\|^2_{\dot H^1(\Omega)} \le c\bigl( \|u_0\|^2_{L^2(\Omega)} + {}_0I_t^\alpha \|f\|^2_{\dot H^{-1}(\Omega)} \bigr). \]

Proof. Multiplying the equation by $u$ and integrating over $\Omega$ yields
\[ \int_\Omega u\,\partial_t^\alpha u\,dx + \int_\Omega |A^{1/2}u|^2\,dx = \int_\Omega u f\,dx, \]
where by Young's inequality $\int_\Omega u f\,dx \le \tfrac12 \|A^{1/2}u\|^2_{L^2(\Omega)} + \tfrac12 \|f\|^2_{\dot H^{-1}(\Omega)}$. By Lemma 4.18, we have
\[ \int_\Omega u(x,t)\,\partial_t^\alpha u(x,t)\,dx \ge \tfrac12 \int_\Omega \partial_t^\alpha u^2(x,t)\,dx = \tfrac12\,\partial_t^\alpha \|u\|^2_{L^2(\Omega)}(t), \]
thus, altogether $\partial_t^\alpha \|u\|^2_{L^2(\Omega)} + \|A^{1/2}u\|^2_{L^2(\Omega)} \le \|f\|^2_{\dot H^{-1}(\Omega)}$. Now applying the Abel fractional integral operator ${}_0I_t^\alpha$ to both sides of the inequality yields the desired estimate.
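The pointwise coercivity inequality $u\,\partial_t^\alpha u \ge \tfrac12\,\partial_t^\alpha(u^2)$ behind this argument can be checked in closed form on monomials. The sketch below is an illustration under assumptions, not a proof: it uses the standard formula $\partial_t^\alpha t^k = \frac{\Gamma(k+1)}{\Gamma(k+1-\alpha)}\,t^{k-\alpha}$ for the Djrbashian-Caputo derivative of $t^k$ with $k \ge 1$, and the function names are hypothetical.

```python
import math

def caputo_power(k, alpha, t):
    """Djrbashian-Caputo derivative of order alpha in (0, 1) of t^k, k >= 1."""
    return math.gamma(k + 1) / math.gamma(k + 1 - alpha) * t ** (k - alpha)

def coercivity_gap(k, alpha, t):
    """u * d^alpha u - (1/2) d^alpha (u^2) for u(t) = t^k; expected to be >= 0."""
    u = t ** k
    return u * caputo_power(k, alpha, t) - 0.5 * caputo_power(2 * k, alpha, t)

for alpha in (0.1, 0.5, 0.9):
    print(alpha, [coercivity_gap(k, alpha, 1.0) for k in (1, 2, 3)])
```

The gap shrinks as $\alpha$ approaches 1, consistent with the classical identity $u\,u_t = \tfrac12 (u^2)_t$ in the integer order limit.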
6.3. Maximum principle

For the standard parabolic equation, there are many important qualitative properties, e.g., (weak and strong) maximum principle, and unique continuation property. These properties are less well understood for subdiffusion, and we discuss only the weak maximum principle, following [39, 232]. We do so for the following model:
\[ \begin{cases} \partial_t^\alpha u - \Delta u + qu = f & \text{in } \Omega \times (0,T], \\ u = h & \text{on } \partial\Omega \times (0,T], \\ u(x,0) = u_0 & \text{in } \Omega, \end{cases} \tag{6.30} \]
where u0 and h are given smooth functions on Ω and ∂Ω×[0, T ], respectively, with h(x, 0) = u0 (x) for x ∈ ∂Ω, and f is a given continuous function on Ω × [0, T ]. Now we state the weak maximum principle for the subdiffusion model.
Theorem 6.9. Let $\alpha \in (0,1)$, and let $q$ be a nonnegative smooth function on $\overline\Omega$. Suppose $u$ is a continuous function on $\overline\Omega \times [0,T]$, with $\partial_t^\alpha u$, $\frac{\partial^2 u}{\partial x_i^2}$ in $C(\overline\Omega \times [0,T])$, $i \in \{1,\dots,d\}$.

(i) If $\partial_t^\alpha u - \Delta u + qu \le 0$ on $\Omega \times [0,T]$, then
\[ u(x,t) \le \max\{0, M\} \qquad \forall (x,t) \in \overline\Omega \times [0,T] \]
with $M := \max\{\max_{x\in\overline\Omega} u(x,0),\ \max_{(x,t)\in\partial\Omega\times[0,T]} u(x,t)\}$.

(ii) If $\partial_t^\alpha u - \Delta u + qu \ge 0$ on $\Omega \times [0,T]$, then
\[ \min\{m, 0\} \le u(x,t) \qquad \forall (x,t) \in \overline\Omega \times [0,T] \]
with $m := \min\{\min_{x\in\overline\Omega} u(x,0),\ \min_{(x,t)\in\partial\Omega\times[0,T]} u(x,t)\}$.

Proof. As (ii) can be obtained from (i) by multiplication with $(-1)$, it suffices to show assertion (i). Since $u$ is continuous on the compact domain $\overline\Omega \times [0,T]$, according to Weierstraß's theorem there exists a maximiser, that is, a point $(x_0,t_0) \in \overline\Omega \times [0,T]$ such that
\[ u(x,t) \le u(x_0,t_0) \qquad \text{for all } (x,t) \in \overline\Omega \times [0,T]. \]
If $u(x_0,t_0) \le 0$, the desired inequality follows directly. Hence it suffices to consider $u(x_0,t_0) > 0$, and we distinguish two cases.

(a) If $(x_0,t_0) \in \partial\Omega \times [0,T]$, or $t_0 = 0$, then $M = u(x_0,t_0)$, and the assertion follows.

(b) If $(x_0,t_0) \in \Omega \times (0,T]$, then due to the second order necessary condition for a maximiser, $\Delta u(x_0,t_0) \le 0$; moreover, $qu(x_0,t_0) \ge 0$, and the condition $(\partial_t^\alpha u - \Delta u + qu)(x_0,t_0) \le 0$ implies $\partial_t^\alpha u(x_0,t_0) \le 0$, while Lemma 4.15 implies $\partial_t^\alpha u(x_0,t_0) \ge 0$. Hence we obtain $\partial_t^\alpha u(x_0,t_0) = 0$ and
\[ u(x_0,t) \le u(x_0,t_0) \qquad \text{for all } 0 \le t \le t_0. \]
Now Lemma 4.16 yields $u(x_0,t_0) = u(x_0,0)$, and hence the proof is complete.

The maximum principle in Theorem 6.9 allows us to derive a maximum norm a priori estimate for the solution of problem (6.30).

Theorem 6.10. If $u$ is a classical solution to problem (6.30), then
\[ \max_{x\in\overline\Omega,\,t\in[0,T]} |u(x,t)| \le \max\Bigl\{ \max_{x\in\overline\Omega} |u_0(x)|,\ \max_{(x,t)\in\partial\Omega\times[0,T]} |h(x,t)| \Bigr\} + \frac{T^\alpha}{\Gamma(1+\alpha)} \max_{(x,t)\in\overline\Omega\times[0,T]} |f(x,t)| . \]
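The right-hand side of the estimate in Theorem 6.10 is simple to evaluate once bounds on the data are known. The following sketch is an illustration only; the function name and the sample numbers are assumptions. The factor $T^\alpha/\Gamma(1+\alpha)$ is the value at $t = T$ of $t^\alpha/\Gamma(1+\alpha)$, the fractional integral of order $\alpha$ of the constant function 1, and it reduces to $T$ in the limiting case $\alpha = 1$.

```python
import math

def max_norm_bound(u0_max, h_max, f_max, T, alpha):
    """Right-hand side of the maximum norm estimate for given data bounds."""
    return max(u0_max, h_max) + T ** alpha / math.gamma(1.0 + alpha) * f_max

# Hypothetical data bounds: |u0| <= 1, |h| <= 2, |f| <= 3 on a time horizon T = 0.5.
for alpha in (0.25, 0.5, 0.75):
    print(alpha, max_norm_bound(1.0, 2.0, 3.0, 0.5, alpha))
```

For $T < 1$ the source contribution grows as $\alpha$ decreases, since $T^\alpha$ is then larger than $T$.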
One can strengthen the weak maximum principle to a nearly strong one [223, Corollary 3.1], but the strong maximum principle as known for the heat equation remains elusive.

Theorem 6.11. Let $u$ be the solution to (6.30) with $u_0 \in L^2(\Omega)$, $u_0 \ge 0$ and $u_0 \not\equiv 0$, and $f = 0$. Then for any $x \in \Omega$, the set $E_x := \{t > 0 : u(x,t) \le 0\}$ is at most a finite set.
6.4. Notes The fundamental solution for subdiffusion has been discussed extensively in the literature; see for example [90, 234, 236, 314]. The θ function was discussed in [301], and it is useful for studying certain inverse problems (see Section 10.7), a tradition parallel to the classical heat equation [49]. The mathematical theory of subdiffusion has undergone extensive investigation in recent years. One of the early rigorous mathematical studies is [90], and much of the material in this chapter was taken from [306]. The material in Section 6.2 also follows that of [242, 306]. We particularly refer to the recent monographs [161, 203]. We mostly focused on linear problems in a Hilbert Sobolev space setting. For obtaining results in general Lp spaces and related Sobolev norms, recently techniques related to maximal Lp regularity have been extended to the subdiffusion setting. We refer to, e.g., [25, 167, 341] and the references therein.
Chapter 7
Analysis of Fractionally Damped Wave Equations
We will now return to the wave type equations derived in Section 2.2.2 and analyse their well-posedness when equipped with initial and boundary conditions. In addition to proving existence, uniqueness, and regularity of solutions to linear equations, we will also study nonlinear pdes arising in nonlinear acoustics related to some of the inverse problems to be discussed in Chapter 11.

As in Chapter 6, $\partial_t^\alpha$ denotes the partial Djrbashian–Caputo fractional derivative with left-hand endpoint at zero and $L$ a second order elliptic differential operator. Also, throughout this chapter we will assume $\Omega \subseteq \mathbb{R}^d$, $d \in \{1,2,3\}$, to be a bounded $C^{1,1}$ smooth domain so that we can make use of elliptic regularity. Correspondingly, we will assume that $A = -L$ with homogeneous Dirichlet boundary conditions is self-adjoint and positive definite with bounded inverse $A^{-1} : L^2(\Omega) \to H^2(\Omega)$. As a consequence, $A^{-1} : L^2(\Omega) \to L^2(\Omega)$ is compact and therefore has a complete orthonormal system of eigenfunctions $\varphi_i$ with the reciprocals $\lambda_i$ of the corresponding eigenvalues satisfying $A\varphi_i = \lambda_i\varphi_i$.

In the analysis of the nonlinear wave equations below, we will need the following embedding estimates:
\[ \|v\|_{L^\infty(\Omega)} \le C^\Omega_{H^2,L^\infty}\,\|Av\|_{L^2(\Omega)} \qquad \text{for all } v \in H^2(\Omega) \cap H_0^1(\Omega), \]
\[ \|v\|_{L^6(\Omega)} \le C^\Omega_{H^1,L^6}\,\|\nabla v\|_{L^2(\Omega)} \qquad \text{for all } v \in H_0^1(\Omega). \]
Also the Poincaré–Friedrichs inequality,
\[ \|v\|_{L^2(\Omega)} \le C^\Omega_{PF}\,\|\nabla v\|_{L^2(\Omega)} \qquad \text{for all } v \in H_0^1(\Omega), \]
will be made use of; see Sections A.2, A.3.
7.1. Linear damped wave equations

Referring to (2.54), we start with the general fractional linear acoustic wave model
\[ \sum_{n=0}^N a_n\,\partial_t^{2+\alpha_n}u - \sum_{m=0}^M b_m\,\partial_t^{\beta_m}Lu = r, \tag{7.1} \]
where
\[ 0 \le \alpha_0 < \alpha_1 < \dots < \alpha_N \le 1, \qquad 0 \le \beta_0 < \beta_1 < \dots < \beta_M \le 1, \tag{7.2} \]
and $L$ is a self-adjoint elliptic operator with possibly space dependent coefficients, for example $Lu = \nabla \cdot \bigl(\tfrac{1}{\rho}\nabla u\bigr)$ as in (2.54). For simplicity, we equip the pde with homogeneous Dirichlet boundary conditions, but the analysis below can be extended to other boundary conditions.

The coefficients $a_n$ and $b_m$ may depend on the space variable as well, with a smoothness that will be specified below. In the lowest order regularity regime of Theorem 7.1 it is enough to assume them to be contained in $L^\infty(\Omega)$ and satisfy the nonnegativity, nondegeneracy, and stability assumptions
\[ \begin{aligned} & b_m(x) \ge 0, \ m \in \{m_*,\dots,M\}, \qquad a_n(x) \ge 0, \ n \in \{1,\dots,N\}, \\ & a_N(x) \ge a > 0, \qquad b_M(x) \ge b > 0 \quad \text{for almost all } x \in \Omega, \qquad \alpha_N \le \beta_M, \end{aligned} \tag{7.3} \]
as well as $\nabla b_m \in L^3(\Omega)$ for all $m \in \{1,\dots,M\}$, with $m_*$ to be defined in (7.16) below.

As an elementary example for the fact that the spatial dependencies of coefficients in (7.1) and their position in- and outside the spatial derivatives are practically highly relevant, we point to the classical (nonfractional) acoustic wave equation, which in the case of spatially variable sound speed $c(x)$ and mass density $\rho(x)$ reads as
\[ u_{tt} = c^2(x)\,\rho(x)\,\nabla \cdot \Bigl( \frac{1}{\rho(x)}\nabla u \Bigr); \]
see, e.g., [21], [190]. We mention in passing that we will consider the inverse problem of identifying $c(x)$ within some particular cases of fractional models (7.1) in Section 11.2.

In order to make assertions on the well-posedness of these equations, we will heavily rely on energy estimates that arise from multiplying (formally) (7.1) with $(-L)^k\partial_t^\tau u$ for certain orders $k \in \mathbb{N}$, $\tau \in [0,\infty)$. To demonstrate how this works in principle, let us return to an elementary example of an
integer order pde. Consider the linear wave equation with Kelvin–Voigt damping ($N = 0$, $M = 1$, $\alpha_0 = 0$, $\beta_0 = 0$, $\beta_1 = 1$, $a_0 = 1$, $L = \Delta$) and constant coefficients $b_0$, $b_1$,
\[ u_{tt} - b_0\,\Delta u - b_1\,\Delta u_t = r . \tag{7.4} \]
Multiplying with $u_t$, integrating over $\Omega$, and integrating by parts using the fact that
\[ \nabla u \cdot \nabla u_t = \frac12 \frac{d}{dt}|\nabla u|^2, \qquad u_{tt}\,u_t = \frac12 \frac{d}{dt}(u_t)^2, \tag{7.5} \]
we obtain
\[ \frac12 \frac{d}{dt}\int_\Omega |u_t|^2\,dx + \frac{b_0}{2}\frac{d}{dt}\int_\Omega |\nabla u|^2\,dx + b_1 \int_\Omega |\nabla u_t|^2\,dx = \int_\Omega r\,u_t\,dx, \]
where we have skipped the arguments $(x,t)$ of the integrands for better readability. Integrating over $(0,t)$ and applying Young's inequality to estimate the term containing $r$, we arrive at the estimate
\[ E[u](t) + b_1 \int_0^t \int_\Omega |\nabla u_t|^2\,dx\,ds \le E[u](0) + d \int_0^t E[u](s)\,ds + \frac{1}{2d} \int_0^t \int_\Omega r(s)^2\,dx\,ds \tag{7.6} \]
for the wave energy defined by
\[ E[u](t) = \frac12 \int_\Omega \bigl( |u_t(x,t)|^2 + b_0\,|\nabla u(x,t)|^2 \bigr)\,dx . \]
Inequality (7.6) together with Gronwall's lemma allows us to derive an estimate on $E[u]$ and from this we recover estimates on the $L^2(\Omega)$ norms of $u_t$ and $\nabla u$. Choosing $d < b_1/(C^\Omega_{PF})^2$, we even obtain decay of the energy in case $r = 0$.

For this to work, it was crucial that we had chosen our multiplier as the right order time derivative of $u$ ($u_t$ in this case). To see this, consider the results that we would have obtained by applying this procedure after multiplying with $u$ instead:
\[ -\int_0^t \int_\Omega |u_t|^2\,dx\,ds + b_0 \int_0^t \int_\Omega |\nabla u|^2\,dx\,ds + \frac{b_1}{2} \int_\Omega |\nabla u(t)|^2\,dx \]
\[ = \frac{b_1}{2} \int_\Omega |\nabla u(0)|^2\,dx + \int_\Omega u_t(0)\,u(0)\,dx - \int_\Omega u_t(t)\,u(t)\,dx + \int_0^t \int_\Omega r\,u\,dx\,ds . \]
Here we have integrated by parts with respect to time in the first term. The two positive terms on the left-hand side with coefficients $b_0$ and $b_1$ are not able to compensate for the negative first term since this contains a higher time derivative of $u$. A similar problem arises with the $b_0$ term when multiplying with $u_{tt}$ in case $b_1 = 0$. The tools that we have used for obtaining positive sign energy terms in (7.6) are the identities (7.5) and
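the fact that from the $b_1$-term we got $-b_1 \int_\Omega \Delta u_t\,u_t\,dx = b_1 \int_\Omega |\nabla u_t|^2\,dx$ and therefore another nonnegative contribution to the left-hand side of the energy estimate.

As an extension of these tools to fractional order derivatives, we recall some coercivity and boundedness estimates that will be crucial in the energy estimates below and that hold for $\gamma \in [0,1)$.

• From Lemma 4.18: For any absolutely continuous function $w$,
\[ \partial_t^\gamma w(t)\,w(t) \ge \tfrac12 (\partial_t^\gamma w^2)(t). \tag{7.7} \]

• From Lemma 4.20: For any $w \in H^{-\gamma/2}(0,t)$,
\[ \int_0^t w(s)\,{}_0I_t^\gamma w(s)\,ds \ge \cos(\pi\gamma/2)\,\|w\|^2_{H^{-\gamma/2}(0,t)} . \tag{7.8} \]

• From Lemma 4.19: For any $w \in H^1(0,t)$,
\[ \int_0^t w_t(s)\,\partial_s^\gamma[w](s)\,ds \ge \frac12\,{}_0I_t^\gamma\bigl[(\partial_t^\gamma w)^2\bigr](t) \ge \frac{1}{2\Gamma(\gamma)\,t^{1-\gamma}}\,\|\partial_t^\gamma w\|^2_{L^2(0,t)} . \tag{7.9} \]

• From Lemma 4.17: There exists a constant $C(2\gamma) > 0$ such that for any $w \in L^2(0,t)$,
\[ \int_0^t w(s)\,{}_0I_t^{2\gamma} w(s)\,ds \le C(2\gamma)\,\|w\|^2_{H^{-\gamma}(0,t;V)} . \tag{7.10} \]

• From Theorem 4.8: There exist constants $0 < \underline{C}(\gamma) < \overline{C}(\gamma)$ such that for any $w \in H^\gamma(0,t)$ (with $w(0) = 0$ if $\gamma \ge \tfrac12$), the equivalence estimates
\[ \underline{C}(\gamma)\,\|w\|_{H^\gamma(0,t)} \le \|\partial_t^\gamma w\|_{L^2(0,t)} \le \overline{C}(\gamma)\,\|w\|_{H^\gamma(0,t)} \tag{7.11} \]
hold. Likewise, for any $w \in L^2(0,t)$, we have $\partial_t^\gamma w \in H^{-\gamma}(0,t)$ and
\[ \underline{C}(\gamma)\,\|w\|_{L^2(0,t)} \le \|\partial_t^\gamma w\|_{H^{-\gamma}(0,t)} \le \overline{C}(\gamma)\,\|w\|_{L^2(0,t)} . \tag{7.12} \]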
Note that (7.10) and (7.12) remain valid with $\gamma = 1$.

These provide us with tools for obtaining useful estimates from below and above for products of different order derivatives of $u$, as they will arise when multiplying (7.1) with $\partial_t^\tau u$ for a certain $\tau$. This approach works as long as the difference between the differentiation orders of the two factors is smaller than one, whereas it can give terms with adverse sign otherwise, as the integer order example above shows. As a result we will arrive at estimates of the form
\[ \|\partial_t^{\overline\alpha}u\|_{L^2(0,t;V)} \le \tilde C_0(t) + \tilde C_1\,\|\partial_t^{\underline\alpha}u\|_{L^2(0,t;V)}, \qquad t \in (0,T), \tag{7.13} \]
with $\underline\alpha < \overline\alpha$ and $V = L^2(\Omega)$ or $V = \dot H^1(\Omega)$ (or some other subspace of $L^2(\Omega)$ in the regularity scale induced by powers of $A$). As long as the difference $\gamma$ between $\overline\alpha$ and $\underline\alpha$ is large enough, this allows us to estimate $\|\partial_t^{\underline\alpha}u\|_{L^\infty(0,t;V)}$, which is useful for extracting estimates from (7.13) by means of Gronwall's lemma.

Lemma 7.1. For any Banach space $V$ and any $v \in L^2(0,T;V)$, $\gamma \in (\tfrac12,1]$, the inequality
\[ \|v\|^2_{L^2(0,t;V)} \ge \Gamma(\gamma)^2\,(2\gamma-1)\,t^{1-2\gamma}\,\|{}_0I_t^\gamma v\|^2_{L^\infty(0,t;V)} \]
holds.
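Proof. The estimate is an immediate consequence of the continuity of the embedding $H^\gamma(0,T;V) \to L^\infty(0,T;V)$ for $\gamma > \tfrac12$. More explicitly, using the Cauchy–Schwarz inequality, we get
\[ \|{}_0I_t^\gamma v\|^2_{L^\infty(0,t;V)} = \frac{1}{\Gamma(\gamma)^2} \sup_{s\in(0,t)} \Bigl\| \int_0^s (s-r)^{\gamma-1}\,v(r)\,dr \Bigr\|_V^2 \le \frac{1}{\Gamma(\gamma)^2} \sup_{s\in(0,t)} \int_0^s (s-r)^{2\gamma-2}\,dr \int_0^s \|v(r)\|_V^2\,dr \le \frac{t^{2\gamma-1}}{\Gamma(\gamma)^2\,(2\gamma-1)}\,\|v\|^2_{L^2(0,t;V)} . \]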
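The constant in Lemma 7.1 can be sanity-checked on the simplest test function. The sketch below is an illustration under assumptions: it takes $V = \mathbb{R}$ and $v \equiv 1$, for which $\|v\|^2_{L^2(0,t)} = t$ and $({}_0I_t^\gamma 1)(s) = s^\gamma/\Gamma(\gamma+1)$, so that the quotient of the right-hand side by the left-hand side equals $(2\gamma-1)/\gamma^2 \le 1$ independently of $t$. The function name is hypothetical.

```python
import math

def lemma_7_1_ratio(gamma_, t):
    """Right-hand side divided by left-hand side of Lemma 7.1 for v = 1 on (0, t)."""
    lhs = t                                                # ||v||^2 in L^2(0, t)
    sup_frac_int = t ** gamma_ / math.gamma(gamma_ + 1.0)  # sup over (0, t) of I^gamma 1
    rhs = (math.gamma(gamma_) ** 2 * (2 * gamma_ - 1)
           * t ** (1 - 2 * gamma_) * sup_frac_int ** 2)
    return rhs / lhs

for gamma_ in (0.55, 0.75, 1.0):
    print(gamma_, lemma_7_1_ratio(gamma_, 2.0), (2 * gamma_ - 1) / gamma_ ** 2)
```

The quotient tends to zero as $\gamma \to \tfrac12$, reflecting that the constant degenerates at the borderline of the embedding, and equals 1 at $\gamma = 1$ for this particular $v$.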
Our goal is to state and prove a theorem on well-posedness of an initial value problem,
\[ \begin{cases} \displaystyle \sum_{n=0}^N a_n(x)\,\partial_t^{2+\alpha_n}u(x,t) - \sum_{m=0}^M b_m(x)\,\partial_t^{\beta_m}Lu(x,t) = r(x,t), & (x,t) \in \Omega \times (0,T), \\ u(x,t) = 0, & (x,t) \in \partial\Omega \times (0,T), \\ u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in \Omega, \\ (\text{if } \alpha_N > 0: \ u_{tt}(x,0) = u_2(x), & x \in \Omega), \end{cases} \tag{7.14} \]
for the general linear fractionally damped wave equation (7.1). We first of all do so for the following time integrated and therefore weaker form
\[ \begin{cases} \displaystyle \sum_{n=0}^N a_n(x)\Bigl( \partial_t^{1+\alpha_n}u(x,t) - \frac{t^{1-\alpha_n}}{\Gamma(2-\alpha_n)}\,u_2(x) \Bigr) - \sum_{m=0}^M b_m(x)\Bigl( {}_0I_t^{1-\beta_m}Lu(x,t) - \frac{t^{1-\beta_m}}{\Gamma(2-\beta_m)}\,Lu_0(x) \Bigr) \\ \displaystyle \qquad = \int_0^t r(x,s)\,ds, & (x,t) \in \Omega \times (0,T), \\ u(x,t) = 0, & (x,t) \in \partial\Omega \times (0,T), \\ u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in \Omega. \end{cases} \tag{7.15} \]
The results will be obtained by testing, analogously to (7.4)–(7.6), the pde (7.1) with $\partial_t^\tau u$ for some $\tau > 0$ that needs to be well chosen in order to allow for exploiting the estimates (7.8)–(7.12). Generalizing the common choice $\tau = 1$ in case of the classical second order wave equation $\alpha_0 = \dots = \alpha_N = \beta_0 = \dots = \beta_M = 0$ and in view of the stability condition $\alpha_N \le \beta_M$, we choose $\tau \in [1+\alpha_N, 1+\beta_M]$, actually, $\tau = 1 + \beta_{m_*}$ with $m_*$ according to (7.16)
m∗ ∈ {1, . . . , M } such that βm∗ −1 < αN ≤ βm∗ ,
if β0 < αN and m∗ = 0 otherwise. More precisely, we will assume that (7.17)
(i) β0 < αN , m∗ as in (7.16), or (ii) β0 ≥ αN , m∗ := 0
holds. Additionally, we assume (7.18) βM − 1 ≤ βm∗ ≤ 1 + min{α0 , β0 } and either αN < βM or αN = βM = β0 . The upper and lower bounds on βm∗ in (7.18) are needed to guarantee γ ∈ [0, 1) in several instances of γ used in (7.8)–(7.12) in the proof below. Note that we are considering the case αN = βM only in the specific setting of a single A term. Indeed, when admitting several A terms, well-posedness depends on specific coefficient constellations; see, e.g., [180]. It will turn out in the proof that we therefore have to impose a bound on the magnitude of the lower order term coefficients in terms of the highest order one, (7.19) m ∗ −1 (β −β )/2 C(1+βm∗ −βk )C((1 + βm∗ − βm )/2) |bm | 0 It M m L2 (0,T )→L2 (0,T ) m=0
≤ C((1 + βm∗ − βM )/2) bM cos(π(1 + βm∗ − βM )/2), (β
−β )/2
that due to T dependence of 0 It M m L2 (0,T )→L2 (0,T ) clearly restricts the maximal time T unless all bm with index lower than m∗ vanish; cf. Corollary 7.1.
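The selection rule for $m_*$ in (7.16), (7.17) and the order conditions in (7.18) amount to a short lookup. The following sketch is an illustration only: the function names are hypothetical, the exponents are passed as strictly increasing lists as in (7.2), and the coefficient smallness condition (7.19) is not checked.

```python
def choose_m_star(alpha_N, betas):
    """Index m_* from (7.16)/(7.17); betas must be strictly increasing."""
    if alpha_N > betas[-1]:
        raise ValueError("stability condition alpha_N <= beta_M is violated")
    if betas[0] >= alpha_N:          # case (ii) of (7.17)
        return 0
    for m in range(1, len(betas)):   # case (i): beta_{m-1} < alpha_N <= beta_m
        if betas[m - 1] < alpha_N <= betas[m]:
            return m
    raise AssertionError("unreachable for strictly increasing betas")

def satisfies_7_18(alphas, betas):
    """Check the bounds on beta_{m_*} and the ordering of alpha_N, beta_M in (7.18)."""
    b_star = betas[choose_m_star(alphas[-1], betas)]
    bounds_ok = betas[-1] - 1 <= b_star <= 1 + min(alphas[0], betas[0])
    order_ok = alphas[-1] < betas[-1] or (alphas[-1] == betas[-1] == betas[0])
    return bounds_ok and order_ok

# Kelvin-Voigt damping as in (7.4): alpha_0 = 0, beta_0 = 0, beta_1 = 1.
m_star = choose_m_star(0.0, [0.0, 1.0])
print(m_star, 1 + [0.0, 1.0][m_star])  # index m_* and multiplier order tau = 1 + beta_{m_*}

# A hypothetical fractional example with three beta exponents.
print(choose_m_star(0.3, [0.0, 0.5, 1.0]), satisfies_7_18([0.0, 0.3], [0.0, 0.5, 1.0]))
```

For the Kelvin–Voigt example the rule returns $m_* = 0$ and hence $\tau = 1$, which matches the multiplier $u_t$ used in the integer order computation (7.4)–(7.6).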
In order to carry out integration by parts with respect to the x variable, analogously to (7.4)–(7.6), but with spatially varying coefficients, the following coefficient regularity will be needed: an , bm ∈ L∞ (Ω) , n ∈ {0, . . . , N } , m ∈ {0, . . . , M },
(7.20)
[bm − A−1/2 bm A1/2 ]w L2 (Ω) ≤ Cb w L2 (Ω)
for all w ∈ H˙ 1 (Ω).
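Testing as described above, we will derive estimates for the energy functionals (which are not supposed to represent physical energies but are just used for mathematical purposes),
\[
E_a[u](t) = \sum_{n=0}^{N-1} \cos(\pi(1+\alpha_n-\beta_{m_*})/2)\, \big\|\sqrt{a_n}\, \partial_t^{(3+\alpha_n+\beta_{m_*})/2} u\big\|^2_{L^2(0,t;L^2(\Omega))} + \underline{E}_a[u](t), \tag{7.21}
\]
where
\[
\underline{E}_a[u](t) =
\begin{cases}
\tilde{\underline{a}}\, \big\|\partial_t^{(3+\alpha_N+\beta_{m_*})/2} u\big\|^2_{L^2(0,t;L^2(\Omega))} & \text{if } \alpha_N < \beta_{m_*},\\[2pt]
\dfrac{\underline{a}}{4}\, \big\|\partial_t^{1+\alpha_N} u\big\|^2_{L^\infty(0,t;L^2(\Omega))} & \text{if } \alpha_N = \beta_{m_*},
\end{cases} \tag{7.22}
\]
where $\tilde{\underline{a}} = \underline{a} \cos(\pi(1+\alpha_N-\beta_{m_*})/2)\, C((1+\alpha_N-\beta_{m_*})/2)^2$ with $C$ as in (7.12), and
\[
E_b[u](t) = \frac{1}{4} \big\|\sqrt{b_{m_*}}\, \partial_t^{\beta_{m_*}} A^{1/2} u\big\|^2_{L^\infty(0,t;L^2(\Omega))}
+ \sum_{m=m_*+1}^{M} \cos(\pi(1+\beta_{m_*}-\beta_m)/2)\, \big\|\sqrt{b_m}\, \partial_t^{(1+\beta_{m_*}+\beta_m)/2} A^{1/2} u\big\|^2_{L^2(0,t;L^2(\Omega))}, \tag{7.23}
\]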
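where the sum in (7.23) is void in case $m_* = M$. The solution space induced by these energies is $U_\alpha \cap U_\beta$, where
\[
U_\alpha =
\begin{cases}
H^{(3+\alpha_N+\beta_{m_*})/2}(0,T;L^2(\Omega)) & \text{if } \alpha_N < \beta_{m_*},\\
\{v \in L^2(0,T;L^2(\Omega)) : \partial^{1+\alpha_N} v \in L^\infty(0,T;L^2(\Omega))\} & \text{if } \alpha_N = \beta_{m_*},
\end{cases}
\]
\[
U_\beta =
\begin{cases}
H^{(1+\beta_M+\beta_{m_*})/2}(0,T;\dot{H}^1(\Omega)) & \text{if } M > m_*,\\
\{v \in H^{(1-\epsilon)/2+\beta_M}(0,T;\dot{H}^1(\Omega)) : \partial^{\beta_M} v \in L^\infty(0,T;\dot{H}^1(\Omega))\} & \text{if } M = m_*,
\end{cases} \tag{7.24}
\]
for any $\epsilon \in (0, \beta_M - \max\{\beta_{M-1}, \alpha_N\})$.

Theorem 7.1. Assume that $T \in (0,\infty]$ and the coefficients $\alpha_n$, $\beta_m$, $a_n$, $b_m$ satisfy (7.2), (7.3), (7.17), (7.18), (7.19), and (7.20). Then for any $u_0, u_1 \in \dot{H}^1(\Omega)$ ($u_2 \in L^2(\Omega)$ if $\alpha_N > 0$), $r \in H^{\beta_{m_*}-\alpha_N}(0,T;L^2(\Omega))$, the time integrated initial boundary value problem (7.15) (to be understood in a weak $\dot{H}^{-1}(\Omega)$ sense with respect to space and an $L^2(0,T)$ sense with respect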
7. Analysis of Fractionally Damped Wave Equations
to time) has a unique solution $u \in U_\alpha \cap U_\beta$. This solution satisfies the energy estimate
\[
E_a[u](t) + E_b[u](t) \le C(t)\Big( \|u_0\|^2_{\dot{H}^1(\Omega)} + C_0 + \|r\|^2_{H^{\beta_{m_*}-\alpha_N}(0,t;L^2(\Omega))} \Big), \quad t \in (0,T), \tag{7.25}
\]
for some constant $C(t)$ depending on time, which is bounded as $t \to 0$ but in general grows exponentially as $t \to \infty$, $E_a$, $E_b$ defined as in (7.21), (7.23), and
\[
C_0 =
\begin{cases}
\|u_1\|^2_{\dot{H}^1(\Omega)} + \|u_2\|^2_{L^2(\Omega)} & \text{if } \beta_{m_*} > 0,\\
\|u_1\|^2_{L^2(\Omega)} & \text{if } \beta_{m_*} = 0.
\end{cases} \tag{7.26}
\]

Proof of Theorem 7.1. As in the case of the classical evolution equations (cf., e.g., [98]) we split the proof into the following four steps:
1. Galerkin approximation: We approximate the original initial-boundary value problem by a sequence of finite dimensional problems for each of which unique solvability can be shown.
2. Energy estimates: By means of energy methods, we derive norm estimates that hold uniformly for the sequence of finite dimensional approximations.
3. Limits: Due to this uniform boundedness, we can select a weakly convergent subsequence of these Galerkin approximations; its limit is shown to solve the original infinite dimensional problem.
4. Uniqueness: This is also verified via energy estimates.
Since the first two steps are relatively long, they will be put into separate lemmas (see Lemmas 7.2 and 7.3 below).

We exclude the case $\beta_M = 0$, since this via (7.3) implies that also $\alpha_N = 0$, and we would therefore deal with the conventional second order wave equation whose analysis can be found in textbooks; cf., e.g., [98]. Thus for the remainder of this proof we assume $\beta_M > 0$ to hold.

Step 1. Galerkin discretisation. In order to prove existence and uniqueness of solutions to (7.14), we apply the usual Faedo–Galerkin approach of discretisation in space with eigenfunctions $\varphi_i$ corresponding to eigenvalues $\lambda_i$ of $A$, $u(x,t) \approx u^L(x,t) = \sum_{i=1}^{L} u_i^L(t)\varphi_i(x)$, and testing with $\varphi_j$, that is,
\[
\begin{cases}
\Big\langle \sum_{n=0}^{N} a_n \partial_t^{2+\alpha_n} u^L(t) + \sum_{m=0}^{M} b_m \partial_t^{\beta_m} A u^L(t) - r(t),\, v \Big\rangle_{L^2(\Omega)} = 0, \quad t \in (0,T),\\[4pt]
\langle u^L(0) - u_0, v\rangle = \langle u_t^L(0) - u_1, v\rangle = 0 \quad (\text{if } \alpha_N > 0:\ \langle u_{tt}^L(0) - u_2, v\rangle = 0)
\end{cases} \tag{7.27}
\]
for all $v \in \operatorname{span}(\varphi_1, \ldots, \varphi_L)$.
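Lemma 7.2. For each $L \in \mathbb{N}$, the initial value problem (7.27) has a unique solution $u^L$ on some time interval $(0, T^L)$.

Proof. Due to the fact that the eigenfunctions $\varphi_i$ form an orthonormal basis of $L^2(\Omega)$, we can write the second line of (7.27) as $u^L(0) = u_0^L$, $u_t^L(0) = u_1^L$ ($u_{tt}^L(0) = u_2^L$) with $u_k^L(x) = \sum_{i=1}^{L} \langle u_k, \varphi_i\rangle \varphi_i(x)$. The projected pde initial value problem (7.27) leads to the ode system
\[
\sum_{n=0}^{N} M_{a,n}^L\, \partial_t^{2+\alpha_n} \mathbf{u}^L(t) + \sum_{m=0}^{M} K_{b,m}^L\, \partial_t^{\beta_m} \mathbf{u}^L(t) = \mathbf{r}^L(t), \quad t \in (0,T), \tag{7.28}
\]
with initial conditions
\[
\mathbf{u}^L(0) = \mathbf{u}_0^L, \quad \mathbf{u}_t^L(0) = \mathbf{u}_1^L \quad (\text{if } \alpha_N > 0:\ \mathbf{u}_{tt}^L(0) = \mathbf{u}_2^L). \tag{7.29}
\]
Here the matrices and vectors are defined by
\[
\begin{aligned}
&\mathbf{u}^L(t) = (u_i^L(t))_{i=1,\ldots,L}, \quad \mathbf{u}_k^L = (\langle u_k, \varphi_i\rangle)_{i=1,\ldots,L},\ k \in \{0,1,2\}, \quad \mathbf{r}^L(t) = (\langle r(t), \varphi_i\rangle)_{i=1,\ldots,L},\\
&\Lambda^L = \operatorname{diag}(\lambda_1, \ldots, \lambda_L), \quad M_{a,n}^L = (\langle a_n\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,L}, \quad M_{b,m}^L = (\langle b_m\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,L},\\
&K_{b,m}^L = (\langle \lambda_i b_m\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,L} = \Lambda^L M_{b,m}^L.
\end{aligned}
\]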
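To make the matrices in (7.28) concrete, the following is a minimal sketch for a one-dimensional model problem. It assumes $\Omega = (0,1)$, $A = -\mathrm{d}^2/\mathrm{d}x^2$ with homogeneous Dirichlet conditions (so $\varphi_i(x) = \sqrt{2}\sin(i\pi x)$ and $\lambda_i = (i\pi)^2$), and a midpoint quadrature rule; the function name `galerkin_matrices` and these choices are illustrative assumptions, not part of the text.

```python
import math


def galerkin_matrices(b, L, quad_points=2000):
    """Return (lam, M_b, K_b) for the assumed 1D Dirichlet model problem on (0, 1).

    lam[i]    : eigenvalue lambda_{i+1} = ((i+1)*pi)^2
    M_b[i][j] : <b phi_{i+1}, phi_{j+1}> approximated by the midpoint rule
    K_b[i][j] : lambda_{i+1} * M_b[i][j], i.e. K_b = Lambda M_b
    """
    lam = [(i * math.pi) ** 2 for i in range(1, L + 1)]
    h = 1.0 / quad_points
    xs = [(k + 0.5) * h for k in range(quad_points)]
    # Values of the L^2-normalised eigenfunctions at the quadrature nodes.
    phi = [[math.sqrt(2.0) * math.sin(i * math.pi * x) for x in xs]
           for i in range(1, L + 1)]
    bx = [b(x) for x in xs]
    M_b = [[h * sum(bk * p_i * p_j for bk, p_i, p_j in zip(bx, phi[i], phi[j]))
            for j in range(L)] for i in range(L)]
    K_b = [[lam[i] * M_b[i][j] for j in range(L)] for i in range(L)]
    return lam, M_b, K_b


if __name__ == "__main__":
    lam, M_b, K_b = galerkin_matrices(lambda x: 1.0 + x, L=3)
    for row in M_b:
        print(["%.4f" % entry for entry in row])
```

For a constant coefficient $b \equiv 1$ the matrix `M_b` reduces to the identity up to quadrature error, which reflects the orthonormality of the eigenfunctions used in the proof.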
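Note that due to (7.3) and since we assume the eigenfunctions to be normalised in $L^2(\Omega)$, the matrix $M_{a,N}^L$ is positive definite with eigenvalues bounded away from zero by $\underline{a}$.

To prove existence of a solution $\mathbf{u}^L \in H^{2+\alpha_N}(0,T;\mathbb{R}^L)$ to (7.28) and (7.29), we rewrite it as a system of Volterra integral equations for
\[
\xi^L := \partial_t^{2+\alpha_N} \mathbf{u}^L =
\begin{cases}
{}_0I_t^{1-\alpha_N} \mathbf{u}_{ttt}^L & \text{if } \alpha_N > 0,\\
\mathbf{u}_{tt}^L & \text{if } \alpha_N = 0.
\end{cases}
\]
For this purpose we use the identities (cf. (4.5))
\[
\partial_t^{2+\alpha_n} \mathbf{u}^L = {}_0I_t^{\alpha_N-\alpha_n} \xi^L \ \text{ for } \alpha_n > 0, \qquad
\partial_t^{2+\alpha_0} \mathbf{u}^L = {}_0I_t^{\alpha_N} \xi^L + \mathbf{u}_2^L \ \text{ if } \alpha_0 = 0,
\]
\[
\mathbf{u}_t^L(t) =
\begin{cases}
{}_0I_t^{2} \mathbf{u}_{ttt}^L(t) + t\,\mathbf{u}_2^L + \mathbf{u}_1^L & \text{if } \alpha_N > 0,\\
{}_0I_t^{1} \mathbf{u}_{tt}^L + \mathbf{u}_1^L & \text{if } \alpha_N = 0,
\end{cases}
\]
\[
\partial_t^{\beta_m} \mathbf{u}^L(t) = {}_0I_t^{1-\beta_m} \mathbf{u}_t^L(t) = {}_0I_t^{2+\alpha_N-\beta_m} \xi^L(t) + \frac{t^{2-\beta_m}\, \mathbf{u}_2^L}{\Gamma(3-\beta_m)} + \frac{t^{1-\beta_m}\, \mathbf{u}_1^L}{\Gamma(2-\beta_m)},
\]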
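where the term containing $\mathbf{u}_2^L$ can be skipped in case $\alpha_N = 0$. This results in the following Volterra integral equation for $\xi^L$:
\[
\xi^L(t) + (M_{a,N}^L)^{-1} \Big( \sum_{n=0}^{N-1} M_{a,n}^L\, {}_0I_t^{\alpha_N-\alpha_n} \xi^L(t) + \sum_{m=0}^{M} K_{b,m}^L\, {}_0I_t^{2+\alpha_N-\beta_m} \xi^L(t) \Big) = \tilde{\mathbf{r}}^L(t), \quad t \in (0,T),
\]
where
\[
\tilde{\mathbf{r}}^L(t) = (M_{a,N}^L)^{-1} \Big( \mathbf{r}^L(t) - \sum_{m=0}^{M} K_{b,m}^L \Big( \frac{t^{2-\beta_m}\, \mathbf{u}_2^L}{\Gamma(3-\beta_m)} + \frac{t^{1-\beta_m}\, \mathbf{u}_1^L}{\Gamma(2-\beta_m)} \Big) \Big),
\]
and again the terms containing $\mathbf{u}_2^L$ can be skipped in case $\alpha_N = 0$.

Unique solvability of this system in $L^2(0,T^L)$ follows from [120, Theorem 4.2, p. 241 in §9]. Then from $\partial_t^{\alpha_N} \mathbf{u}_{tt}^L = \xi^L \in L^2(0,T^L)$, $\mathbf{u}_{tt}^L(0) = \mathbf{u}_2^L$ in case $\alpha_N > 0$, we have a unique $\mathbf{u}_{tt}^L \in H^{\alpha_N}(0,T^L)$ (cf. [203, §3.3]). The same trivially holds true in case $\alpha_N = 0$.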
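The power-function terms in the identities used in the proof of Lemma 7.2 come from the Riemann–Liouville integral of monomials, ${}_0I_t^{\gamma}\, t^k = \frac{\Gamma(k+1)}{\Gamma(k+1+\gamma)}\, t^{k+\gamma}$. The following sketch checks the scalar case ${}_0I_t^{1-\beta}(t\,u_2 + u_1) = \frac{t^{2-\beta}u_2}{\Gamma(3-\beta)} + \frac{t^{1-\beta}u_1}{\Gamma(2-\beta)}$ numerically. The function name `rl_integral`, the substitution used to remove the weak singularity, and the midpoint rule are illustrative choices, not taken from the text.

```python
import math


def rl_integral(f, order, t, n=20000):
    """Approximate (0 I_t^order f)(t) = 1/Gamma(order) * int_0^t (t-s)^(order-1) f(s) ds.

    Assumes 0 < order <= 1. The substitution x = (t - s)**order turns the
    weakly singular integral into (1/order) * int_0^{t**order} f(t - x**(1/order)) dx,
    which is then approximated by the composite midpoint rule.
    """
    upper = t ** order
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += f(t - x ** (1.0 / order))
    return total * h / (order * math.gamma(order))


if __name__ == "__main__":
    beta, u1, u2, t = 0.3, 2.0, -1.5, 0.8
    numeric = rl_integral(lambda s: s * u2 + u1, 1.0 - beta, t)
    closed_form = (t ** (2.0 - beta) * u2 / math.gamma(3.0 - beta)
                   + t ** (1.0 - beta) * u1 / math.gamma(2.0 - beta))
    print(numeric, closed_form)
```

The two printed values should agree up to quadrature error, which is the scalar counterpart of the last identity before the Volterra equation.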
Due to the energy estimates below, this solution actually exists for all times $t \in [0,T)$, so the maximal time horizon of existence in Lemma 7.2 is actually $T^L = T$, independently of the discretisation level $L$.

Step 2. Energy estimates.

Lemma 7.3. The solutions $u^L$ of (7.27) according to Lemma 7.2 satisfy the estimate
\[
E_a[u^L](t) + E_b[u^L](t) \le C(t)\Big( \|u_0\|^2_{\dot{H}^1(\Omega)} + C_0 + \|r\|^2_{H^{\beta_{m_*}-\alpha_N}(0,T;L^2(\Omega))} \Big) \tag{7.30}
\]
with $E_a$, $E_b$ as in (7.21), (7.23), and a constant $C(t)$ independent of $L$.

Proof. With $m_*$ defined by (7.16) in case (i) and set to $m_* = 0$ in case (ii), we use $v = \partial_t^{1+\beta_{m_*}} u^L(s)$ as a test function in (7.27) (with $t$ replaced by $s$) and integrate for $s$ from $0$ to $t$. The resulting terms can be estimated as follows. First of all we integrate by parts with respect to space in the $b_m$ terms,
\[
\begin{aligned}
\int_0^t \big\langle b_m \partial_t^{\beta_m} A u^L(s),\, \partial_t^{1+\beta_{m_*}} u^L(s) \big\rangle\, ds
&= \int_0^t \big\langle b_m \partial_t^{\beta_m} A^{1/2} u^L(s),\, \partial_t^{1+\beta_{m_*}} A^{1/2} u^L(s) \big\rangle\, ds\\
&\quad - \int_0^t \big\langle [A^{1/2} b_m - b_m A^{1/2}] \partial_t^{\beta_m} A^{1/2} u^L(s),\, \partial_t^{1+\beta_{m_*}} u^L(s) \big\rangle\, ds,
\end{aligned}
\]
where from (7.20) and Young’s inequality we deduce t 1+β [A1/2 bm − bm A1/2 ]∂tβm A1/2 uL (s), ∂t m∗ uL (s) ds 0 & & & 1+β & ≤ Cb &∂t m∗ A1/2 uL & −(1+β −β )/2 m m∗ H (0,t;L2 (Ω) & & & βm 1/2 L & × &∂t A u & (1+β −β )/2 2 m m ∗
H
≤
C((1 + βm∗ C((1 + βm∗
(0,t;L (Ω)
& − βm )/2) & & (1+βm∗ +βm )/2 1/2 L &2 C b &∂ t A u & 2 . − βm )/2) L (0,t;L2 (Ω)
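Inequality (7.8) yields
\[
\begin{aligned}
\int_0^t \big\langle a_n \partial_t^{2+\alpha_n} u^L(s),\, \partial_t^{1+\beta_{m_*}} u^L(s) \big\rangle\, ds
&= \int_0^t \big\langle a_n \partial_t^{2+\alpha_n} u^L(s),\, {}_0I_t^{1+\alpha_n-\beta_{m_*}} \partial_t^{2+\alpha_n} u^L(s) \big\rangle\, ds\\
&\ge \cos(\pi(1+\alpha_n-\beta_{m_*})/2)\, \big\|\sqrt{a_n}\, \partial_t^{2+\alpha_n} u^L\big\|^2_{H^{-(1+\alpha_n-\beta_{m_*})/2}(0,t;L^2(\Omega))}
\end{aligned}
\]
for $\alpha_n < \beta_{m_*}$ and
\[
\begin{aligned}
\int_0^t \big\langle b_m \partial_t^{\beta_m} A^{1/2} u^L(s),\, \partial_t^{1+\beta_{m_*}} A^{1/2} u^L(s) \big\rangle\, ds
&= \int_0^t \big\langle b_m\, {}_0I_t^{1+\beta_{m_*}-\beta_m} \partial_t^{1+\beta_{m_*}} A^{1/2} u^L(s),\, \partial_t^{1+\beta_{m_*}} A^{1/2} u^L(s) \big\rangle\, ds\\
&\ge \cos(\pi(1+\beta_{m_*}-\beta_m)/2)\, \big\|\sqrt{b_m}\, \partial_t^{1+\beta_{m_*}} A^{1/2} u^L\big\|^2_{H^{-(1+\beta_{m_*}-\beta_m)/2}(0,t;L^2(\Omega))}
\end{aligned}
\]
for $m > m_*$. In particular for $m = M > m_*$ (which implies $1 + \beta_{m_*} - \beta_M < 1$; the case $m_* = M$ will be considered below), by (7.12), noting that $\partial_t^{(1+\beta_{m_*}+\beta_M)/2} A^{1/2} u^L$ vanishes at $t = 0$ and that
\[
\partial_t^{(1+\beta_{m_*}-\beta_M)/2}\, \partial_t^{(1+\beta_{m_*}+\beta_M)/2} A^{1/2} u^L = \partial_t^{1+\beta_{m_*}} A^{1/2} u^L,
\]
we obtain
\[
\int_0^t \big\langle b_M \partial_t^{\beta_M} A^{1/2} u^L(s),\, \partial_t^{1+\beta_{m_*}} A^{1/2} u^L(s) \big\rangle\, ds \ge \underline{C}^{M,\beta_{m_*}}\, \big\|\partial_t^{(1+\beta_{m_*}+\beta_M)/2} A^{1/2} u^L\big\|^2_{L^2(0,t;L^2(\Omega))} \tag{7.31}
\]
with $\underline{C}^{M,\beta_{m_*}} = C((1+\beta_{m_*}-\beta_M)/2)\, \underline{b} \cos(\pi(1+\beta_{m_*}-\beta_M)/2)$.

From the identity $\int_0^t \langle \partial_t v(s), v(s)\rangle\, ds = \frac{1}{2}\|v(t)\|^2 - \frac{1}{2}\|v(0)\|^2$ and Young's inequality in case $\alpha_N = \beta_{m_*}$, depending on whether these are in the interior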
of the interval [0, 1] or on its boundary, we get the estimates: t 1+β aN ∂t2+αN uL (s), ∂t m∗ uL (s) ds 0 t
s−βm∗ 1+β 1+βm∗ L uL aN ∂t ∂t m∗ uL (s) − u (s) ds = 2 , ∂t Γ(1 − βm∗ ) 0 √ 2 t2(1−βm∗ ) aN uL 1 √ 2 L2 (Ω) 1+βm∗ L 2 u )(t) L2 (Ω) − ≥ aN ∂t 2 Γ(2 − βm∗ )2 1 √ 1+β − aN ∂t m∗ uL 2L∞ (0,t);L2 (Ω) if αN = βm∗ ∈ (0, 1), 4 t 1+β aN ∂t2+αN uL (s), ∂t m∗ uL (s) ds 0
1 √ 1 √ 1+β 2 = aN ∂t m∗ uL )(t) 2L2 (Ω) − aN uL 1+βm∗ L2 (Ω) 2 2 if αN = βm∗ ∈ {0, 1} .
Likewise, t 1+β bm∗ ∂tβm∗ A1/2 uL (s), ∂t m∗ A1/2 uL (s) ds 0 t s−βm∗ β βm∗ 1/2 L A1/2 uL bm∗ ∂t ∂t m∗ A1/2 uL (s) − A u (s) ds = 1 , ∂t Γ(1 − βm∗ ) 0 √ 2 t2(1−βm∗ ) bm∗ A1/2 uL 1 1 L2 (Ω) βm∗ 1/2 L 2 ≥ bm∗ ∂t A u )(t) L2 (Ω) − 2 Γ(2 − βm∗ )2 1 β − bm∗ ∂t m∗ A1/2 uL 2L∞ (0,t);L2 (Ω) if βm∗ ∈ (0, 1) , 4 t 1+β bm∗ ∂tβm∗ A1/2 uL (s), ∂t m∗ A1/2 uL (s) ds 0
1 1 β 2 = bm∗ ∂t m∗ A1/2 uL )(t) 2L2 (Ω) − bm∗ A1/2 uL βm∗ L2 (Ω) 2 2 if βm∗ ∈ {0, 1}.
Hence in case m∗ = M , where due to our assumption that βM > 0 at the beginning of the proof, t 1+β bM ∂tβM A1/2 uL (s), ∂t m∗ A1/2 uL (s) ds sup t ∈(0,t) 0
√ 2 t2(1−βM ) bM A1/2 uL b βM 1/2 L 2 1 L2 (Ω) . ≥ ∂t A u L∞ (0,t);L2 (Ω) − 4 Γ(2 − βM )2
Since the difference 1 + βm∗ − βm is larger than one for m < m∗ , we have to employ (7.10) (for this purpose, note that still, according to our
assumptions, γ = (1 + βm∗ − βk )/2 ∈ [0, 1]). From this and (7.12) we obtain (7.32) t
1+β
bm ∂tβm uL (s), ∂t m∗ uL (s) ds 0 t 1+β 1+β = bm 0 It 1+βm∗ −βm ∂t m∗ A1/2 uL (s), ∂t m∗ A1/2 uL (s) ds 0 1+β ≥ −C(1 + βm∗ − βm ) bm ∂t m∗ A1/2 uL 2H −(1+βm∗ −βm )/2 (0,t;L2 (Ω)) ≥ −C(1 + βm∗ − βm )C((1 + βm∗ − βm )/2) (1+βm∗ +βm )/2 1/2 L 2 A u L2 (0,t;L2 (Ω)) × bm ∂t
for m < m∗ .
Note that in case m∗ < M , the potentially negative contributions arising from the m < m∗ terms (7.32) are dominated by means of the leading energy (1+βm∗ +βM )/2 1/2 L 2 A u L2 (0,t;L2 (Ω)) from (7.31), term from (7.23), that is, ∂t w due to (1 + βm∗ + βm )/2 < (1 + βm∗ + βM )/2 for m < m∗ . To achieve this also in case m∗ = M with βM −1 < αN < βM , we additionally test with v = ∂t1+βM − uL (s) in (7.27) where 0 < < βM − max{βM −1 , αN }. Repeating the above estimates with βm∗ replaced by βM − , one sees that the leading energy term then is β
∂t M
+(1−)/2
A1/2 uL 2L2 (0,t;L2w (Ω))
from (7.31). This still dominates the potentially negative contributions arising from the m < M terms according to (7.32) (with βM − in place of βm∗ ), since (1 + βM − + βm )/2 < βM + (1 − )/2 for m < M . Note that in the case αN = βM = β0 , no potentially negative terms (7.32) arise. Finally, the right-hand side term can be estimated by means of (7.12) and Young’s inequality, t 1+β r(s), ∂t m∗ uL (s) ds 0
1+βm∗ L
≤ r H βm∗ −αN (0,T,L2 (Ω)) ∂t
≤ a ∂t1+αN uL (s) 2L2 (0,t,L2 (Ω)) +
t
≤ 0
u (s) H −(βm∗ −αN ) (0,t,L2 (Ω)) 1
4aC(βm∗ − αN )2
E a [uL ](s) ds + C r 2H βm∗ −αN (0,T,L2 (Ω))
with some constants a, C > 0.
w
r 2H βm∗ −αN (0,T,L2 (Ω))
Altogether we arrive at the estimate, t L L (7.33) Ea [u ](t) + Eb [u ](t) ≤ E a [uL ](s) ds + rhs[uL ](t) + rhsL 0r (t), 0
where Ea , E a , Eb are defined as in (7.21), (7.22), (7.23), rhs[u](t) =
M −1
C((1 + βm∗ − βm )/2) & & Cb &∂ (1+βm∗ +βm )/2 A1/2 u 2L2 (0,t;L2 (Ω)) C((1 + βm∗ − βm )/2)
m=0 m ∗ −1
+
C(1 + βm∗ − βm )C((1 + βm∗ − βm )/2)
m=0
×
(1+βm∗ +βm )/2
bm ∂t
rhsL 0,r (t) = 1αN =βm∗ ∈(0,1) + 1βm∗ ∈(0,1) +
A1/2 u 2L2 (0,t;L2 (Ω)) , √ 2 t2(1−βm∗ ) aN uL 2 L2 (Ω)
Γ(2 − βm∗ )2 √ 2 t2(1−βm∗ ) bm∗ A1/2 uL 1 L2 (Ω) Γ(2 − βm∗ )2
1 2˜ aC(βm∗ − αN )2
r 2H βm∗ −αN (0,T,L2 (Ω)) ,
where the boolean variable 1B is equal to one if B is true and vanishes otherwise. We now substitute Ea [u](t), Eb [u](t) by lower bounds containing only an estimate from below of their leading order terms E a [u](t) as defined in (7.22) and (7.34) (1+βm∗ +βM )/2 1/2 A u 2L2 (0,t;L2 (Ω)) if m∗ < M, C M,βm∗ ∂t E b [u](t) := (1−)/2+βM 1/2 C M,βM − ∂t A u 2L2 (0,t;L2 (Ω)) if m∗ = M, to obtain from (7.33) (7.35)
t
E a [u ](t) + E b [u ](t) ≤ L
L
0
+L E a [uL ](s) ds + rhs[uL ](t) + rhsL 0,r (t) + rhs0 (t),
where + L (t) rhs 0
= 1m∗ =M, βM ∈(0,1)
√ 2 t2(1−βM ) bM A1/2 uL 1 L2 (Ω) Γ(2 − βM )2
It is readily checked that rhs[u](t) ≤ cb,β (t)E b [u](t)
.
for
cb,β (t) =
m ∗ −1
1
C(1 + βm∗ − βm )C((1 + m=0 (β −β )/2 × 0 It M m L2 (0,t)→L2 (0,t) . C M,βm∗
βm∗ − βm )/2)|bk |
Therefore, under the smallness assumption (7.19), that is, cb,β (T ) < 1, (7.35) yields Ea [uL ](t) + (1 − cb,β (T ))E b [uL ](t) t ˜ E a [uL ](s) ds + u0 2H˙ 1 (Ω) + C0 + r 2H βm∗ −αN (0,T,L2 (Ω)) ≤C 0
for some C˜ independent of t, L, with C0 as in (7.26). Applying Gronwall’s lemma and reinserting into (7.33), we end up with (7.30). Step 3. Weak limits. As a consequence of (7.30), the Galerkin solutions uL are uniformly bounded in Ua ∩ Ub . Therefore the sequence (uL )L∈N has a weakly(*) convergent subsequence (uLk )k∈N with limit u ∈ Ua ∩ Ub . To see that u satisfies (7.15), we revisit (7.27), integrate it with respect to time and take the L2 (0, T ) product with arbitrary smooth test functions ψ to conclude that
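\[
\begin{aligned}
&\int_0^T \psi(t) \Big( \Big\langle \sum_{n=0}^{N} a_n \Big( \partial_t^{1+\alpha_n} u(t) - \frac{t^{1-\alpha_n}}{\Gamma(2-\alpha_n)}\, u_2 \Big),\, v \Big\rangle_{L^2(\Omega)}\\
&\qquad\quad + \Big\langle \sum_{m=0}^{M} b_m \Big( {}_0I_t^{1-\beta_m} A u(t) - \frac{t^{1-\beta_m}}{\Gamma(2-\beta_m)}\, A u_0 \Big) - r(t),\, v \Big\rangle_{\dot{H}^1(\Omega)^*, \dot{H}^1(\Omega)} \Big)\, dt\\
&= \int_0^T \psi(t) \Big( \Big\langle \sum_{n=0}^{N} a_n \partial_t^{1+\alpha_n} (u - u^{L_k})(t),\, v \Big\rangle_{L^2(\Omega)}
+ \Big\langle \sum_{m=0}^{M} b_m\, {}_0I_t^{1-\beta_m} A (u - u^{L_k})(t),\, v \Big\rangle_{\dot{H}^1(\Omega)^*, \dot{H}^1(\Omega)} \Big)\, dt
\end{aligned}
\]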
for all v ∈ span(ϕ1 , . . . , ϕM ), M ≤ Lk and any ψ ∈ Cc∞ (0, T ). Taking the limit k → ∞ by using the previously mentioned weak(*) limit of uLk , we
conclude
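\[
\begin{aligned}
\int_0^T \psi(t) \Big( &\Big\langle \sum_{n=0}^{N} a_n \Big( \partial_t^{1+\alpha_n} u(t) - \frac{t^{1-\alpha_n}}{\Gamma(2-\alpha_n)}\, u_2 \Big),\, v \Big\rangle_{L^2(\Omega)}\\
&+ \Big\langle \sum_{m=0}^{M} b_m \Big( {}_0I_t^{1-\beta_m} A u(t) - \frac{t^{1-\beta_m}}{\Gamma(2-\beta_m)}\, A u_0 \Big) - r(t),\, v \Big\rangle_{\dot{H}^1(\Omega)^*, \dot{H}^1(\Omega)} \Big)\, dt = 0
\end{aligned}
\]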
for all $v \in \operatorname{span}(\varphi_1, \ldots, \varphi_M)$ with arbitrary $M \in \mathbb{N}$, and any $\psi \in C_c^\infty(0,T)$. Therefore the weak limit $u$ indeed satisfies the time integrated pde, and we have proven the existence part of the theorem.

To verify the initial conditions, we use the fact that due to our assumption $\beta_M > 0$, $U_b$ continuously embeds into $C([0,T];\dot{H}^1(\Omega))$ and therefore $u(0) = u_0$ is attained in an $\dot{H}^1(\Omega)$ sense. Also, $U_a$ continuously embeds into $C^1([0,T];L^2(\Omega))$ in case $\alpha_N < \beta_{m_*}$ or $\alpha_N > 0$. In the remaining case $\alpha_N = \beta_{m_*} = 0$, attainment of $u_t(0) = u_1$ in an $L^2(\Omega)$ sense can be shown analogously to the conventional second order wave equation; cf., e.g., [98, Theorem 3 in Section 7.2]. Moreover, taking weak limits in the energy estimate (7.30), together with weak lower semicontinuity of the norms contained in the definitions of $E_a$, $E_b$, implies (7.25).

Step 4. Uniqueness. The manipulations carried out in Step 2 of the proof are also feasible with the Galerkin approximation $u^L$ replaced by a solution $u$ of the pde itself and lead to the energy estimate (7.25) independently of the Galerkin approximation procedure. From this we conclude that the pde with vanishing right-hand side and initial data only has the zero solution, which due to linearity of the pde implies uniqueness. With this, the proof of Theorem 7.1 is finished.
Remark 7.1. In the practically relevant case of $Lu = \nabla \cdot \big(\frac{1}{\rho}\nabla u\big)$ for some $\rho \in L^\infty(\Omega)$, $0 < \underline{\rho} \le \rho(x) \le \overline{\rho}$, the selfadjointness and positivity assumptions on $A$ as well as compactness of $A^{-1}$ are satisfied. We can substitute $A^{1/2}$ in the proof by $\rho^{-1/2}\nabla$ and so the second condition in (7.20) becomes
\[
\|\rho^{-1/2} w \nabla b_m\|_{L^2(\Omega)} \le C_b \|\rho^{-1/2} \nabla w\|_{L^2(\Omega)} \quad \text{for all } w \in H_0^1(\Omega).
\]
Due to the estimate
\[
\|\rho^{-1/2} w \nabla b_m\|_{L^2(\Omega)} \le \|\rho^{-1/2}\|_{L^\infty(\Omega)} \|w\|_{L^6(\Omega)} \|\nabla b_m\|_{L^3(\Omega)} \le \underline{\rho}^{-1/2}\, C^{\Omega}_{H^1,L^6}\, \|\nabla w\|_{L^2(\Omega)} \|\nabla b_m\|_{L^3(\Omega)}
\]
we can set $C_b = \sqrt{\overline{\rho}/\underline{\rho}}\; C^{\Omega}_{H^1,L^6} \max_{m \in \{0,\ldots,M\}} \|\nabla b_m\|_{L^3(\Omega)}$ provided all coefficients $b_m$ are contained in $W^{1,3}(\Omega)$.

Remark 7.2. Using the so-called multinomial (or multivariate) Mittag-Leffler functions and separation of variables in principle enables a representation analogous to (6.18), where $E_{\alpha,1}$ and $E_{\alpha,\alpha}$ in (6.19), (6.20) are replaced by appropriate instances of their multinomial counterparts; cf. [229, Theorem 4.1]. However, proving convergence of the infinite sums in this representation would require extensive estimates of these Mittag-Leffler functions. In fact, estimates similar to those in Step 2 of the proof of Theorem 7.1 would be required. Thus, in order to keep the exposition transparent without having to introduce too much additional machinery, we chose to remain with the Galerkin approximation argument in the proof above.

An inspection of the proof of Theorem 7.1 shows that in particular coefficient settings, certain time-dependent terms vanish and therefore a zero right-hand side leads to global boundedness of the energy.

Corollary 7.1. Under the assumptions of Theorem 7.1 with constant coefficients $b_m$, $m = 0, \ldots, M$ and $b_m = 0$, $m = 0, \ldots, m_* - 1$, the assertion extends to hold globally in time, that is, for all $t \in (0,\infty)$. If additionally $r = 0$, then the energy is globally bounded,
\[
E_a[u](t) + E_b[u](t) \le C\Big( \|u_0\|^2_{\dot{H}^1(\Omega)} + C_0 \Big), \quad t \in (0,\infty),
\]
with some constant C > 0 independent of time. In order to establish sufficient regularity of the highest order term so that also the original equation (7.14) holds, we make use of the pde itself. For this purpose we need a condition similar to (7.20) on the an coefficients.
∂t2+αN uL
Corollary 7.2. Under the conditions of Theorem 7.1 with additionally
\[
\big\|[a_m - A^{-1/2} a_m A^{1/2}]w\big\|_{L^2(\Omega)} \le C_a \|w\|_{L^2(\Omega)} \quad \text{for all } w \in \dot{H}^1(\Omega), \tag{7.36}
\]
the solution u of (7.15) satisfies ∂t2+αN u ∈ L2 (0, T ; H˙ 1 (Ω)∗ ) and is therefore also a solution to the original pde (2.54) in an L2 (0, T ; H˙ 1 (Ω)∗ ) sense. Proof. From the Galerkin approximation (7.27) of (7.14), due to invariance of the space span(ϕ1 , . . . , ϕL ), we can substitute v by A−1/2 v there and move
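$A^{-1/2}$ to the left-hand side of the inner product by adjoining to arrive at
\[
\Big\langle \sum_{n=0}^{N} \partial_t^{2+\alpha_n} A^{-1/2}[a_n u^L(t)] + \sum_{m=0}^{M} \partial_t^{\beta_m} A^{-1/2}[b_m A u^L(t)] - A^{-1/2} r(t),\, v \Big\rangle_{L^2(\Omega)} = 0, \quad t \in (0,T), \tag{7.37}
\]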
for all $v \in \operatorname{span}(\varphi_1, \ldots, \varphi_L)$. From Theorem 7.1 we know that $\tilde{r} = -\sum_{m=0}^{M} \partial_t^{\beta_m} A^{-1/2}[b_m A u(t)] + A^{-1/2} r(t)$ is contained in $L^2(0,T;L^2(\Omega))$. Thus (7.37) can be viewed as a Galerkin discretisation of a Volterra integral equation for $\zeta = \partial_t^{2+\alpha_N} A^{-1/2} u(t)$ with $\tilde{r}$ as its right-hand side. The corresponding coefficient vector $\boldsymbol{\zeta}^L$ of the solution $\zeta^L$ is therefore bounded by
\[
\|\boldsymbol{\zeta}^L\|_{\ell^2(\mathbb{R}^L)} \le C\big( \|\tilde{r}\|_{L^2(0,T;L^2(\Omega))} + \|\zeta(0)\|_{L^2(\Omega)} \big)
\]
with $C$ independent of $L$. From this we deduce
\[
\|\zeta^L\|_{L^2(0,T;L^2(\Omega))} \le C\big( \|\tilde{r}\|_{L^2(0,T;L^2(\Omega))} + \|\zeta(0)\|_{L^2(\Omega)} \big),
\]
and thus, by taking the limit $L \to \infty$, the same bound for $\|\zeta\|_{L^2(0,T;L^2(\Omega))}$ itself.

Remark 7.3. Higher regularity can be obtained by testing with the function $v = \partial_t^{1+\beta_{m_*}} A^{\mu} u^L(s)$ for some $\mu > 0$. This clearly leads to higher differentiability requirements on the coefficients and to higher order versions of conditions (7.20), (7.36); see, e.g., Theorems 7.2 and 7.8.

We now revisit the three time fractional acoustic models outlined in Section 2.2, namely the Caputo–Wismer–Kelvin wave equation
\[
\frac{1}{c^2\rho}\, u_{tt} + A u + b_1 \partial_t^{\beta} A u = r, \tag{7.38}
\]
the modified Szabò model
\[
\frac{1}{c^2\rho}\, u_{tt} + a_1 \partial_t^{2+\alpha} u + A u = r, \tag{7.39}
\]
and the fractional Zener wave equation
\[
\frac{1}{c^2\rho}\, u_{tt} + a_1 \partial_t^{2+\alpha} u + A u + b_1 \partial_t^{\beta} A u = r, \tag{7.40}
\]
with $Au = -\nabla \cdot \big(\frac{1}{\rho}\nabla u\big)$ and $u$ physically playing the role of an acoustic pressure. In particular we take into account space-dependence of the sound speed $c$, the mass density $\rho$, and the damping parameters $a_1$, $b_1$. Spatial variability of $c$ is actually the most relevant among these, e.g., in medical or geophysical imaging applications. In view of the fact that inclusions on
a homogeneous background lead to only piecewise smooth but not globally differentiable sound speed, we aim at keeping the regularity assumptions on $c$ low. In fact, the low regularity well-posedness result from Theorem 7.1 can be applied with only $\frac{1}{c} \in L^\infty(\Omega)$ (which theoretically even allows $c$ to take $+\infty$ as a value).

Equations (7.38), (7.39), and (7.40) obviously take the form (7.14) with $a_0 = \frac{1}{c^2\rho}$, $b_0 = 1$, $\alpha_0 = 0$, $\alpha_1 = \alpha$, $\beta_0 = 0$, $\beta_1 = \beta$. For (7.38) we can invoke case (ii) with $N = 0$, $1 = M > m_* = 0$ of Theorem 7.1, and for (7.40) we make use of case (i) with $N = M = m_* = 1$ and assume $\alpha \le \beta$ to satisfy the stability condition $\alpha_N \le \beta_M$ in (7.3). The latter is violated in equation (7.39) with $N = 1$, $M = 0$ though, hence we do not obtain well-posedness for the modified Szabò model.

Corollary 7.3. Let $T \in (0,\infty]$, $a_1, b_1, \frac{1}{c}, \rho, \frac{1}{\rho} \in L^\infty(\Omega)$, $b_1 \in W^{1,3}(\Omega)$ with $a_1(x) \ge \underline{a}_1 > 0$, $b_1(x) \ge \underline{b}_1 > 0$ for almost every $x \in \Omega$, and let $1 \ge \beta \ge \alpha > 0$. Then for any $u_0, u_1 \in \dot{H}^1(\Omega)$, $r \in L^2(0,T;L^2(\Omega))$ the time integrated version (7.15) of the Caputo–Wismer–Kelvin wave equation (7.38) has a unique solution $u \in \{v \in H^{(1+\beta)/2}(0,T;\dot{H}^1(\Omega)) : v_t \in L^\infty(0,T;L^2(\Omega))\}$. For any $u_0, u_1 \in \dot{H}^1(\Omega)$, $r \in H^{\beta-\alpha}(0,T;L^2(\Omega))$, the time integrated version of the fractional Zener wave equation (7.40) has a unique solution $u \in \{v \in H^{(3+\alpha+\beta)/2}(0,T;L^2(\Omega)) : \partial^{\beta} v \in L^\infty(0,T;\dot{H}^1(\Omega))\}$ if $\alpha < \beta$. If $\alpha = \beta$, then $u \in \{v \in L^2(0,T;L^2(\Omega)) : \partial^{1+\alpha} v \in L^\infty(0,T;L^2(\Omega)),\ \partial^{\alpha} v \in L^\infty(0,T;\dot{H}^1(\Omega))\}$. If additionally $a_1, \frac{1}{c} \in W^{1,3}(\Omega)$, then these solutions also solve the original pdes (7.38), (7.40) in an $L^2(0,T;\dot{H}^1(\Omega)^*)$ sense with $u_{tt} \in L^2(0,T;\dot{H}^1(\Omega)^*)$ for (7.38) and $\partial_t^{2+\alpha} u \in L^2(0,T;\dot{H}^1(\Omega)^*)$ for (7.40).

We have only considered scalar equations here, as is typical for acoustics applications. The vectorial setting of linear viscoelasticity is treated, e.g., in [261] for the case of the fractional Zener model with $\alpha = \beta$.

An analysis of several different linear and nonlinear fractional acoustic wave equations as well as a derivation of these models and justification of their limits as $\alpha = \beta \nearrow 1$ can be found in [180].
7.2. Some nonlinear damped wave equations

In this section we study nonlinear versions of the fractional wave equations (7.38) and (7.40). This is motivated by the extension of the Westervelt equation
\[
u_{tt} - c^2 \Delta u - b \Delta u_t = \kappa (u^2)_{tt} \tag{7.41}
\]
(sometimes also written as $u_{tt} - c^2 \Delta u - \tilde{b} u_{ttt} = \kappa (u^2)_{tt}$, by substituting $c^2 \Delta u$ with $u_{tt}$ in the damping term), which is a classical model of nonlinear acoustics, to the situation of fractional derivative attenuation. The underlying application field is high intensity ultrasound, which is widely used in medical imaging and therapy, but also in industrial processes such as ultrasound cleaning or welding. For a derivation of this and other models of nonlinear acoustics, see, e.g., [71, 129, 344]. These equations contain a quadratic nonlinearity with a coefficient $\kappa$ incorporating the parameter of nonlinearity $B/A$, which is the quantity often used in the nonlinear acoustics literature. Since in Section 11.3 we will consider the problem of identifying this coefficient as a function of the space variable, here we will also allow for $\kappa = \kappa(x)$. From an analysis point of view, the quadratic nonlinearity leads to potential degeneracy and the blowup of solutions (cf., e.g., [83, 173]) due to the fact that (7.41) can be rewritten as $(1 - 2\kappa u) u_{tt} - c^2 \Delta u - b \Delta u_t = 2\kappa (u_t)^2$. Fractional order terms are here required to model fractional power frequency dependence of damping as is typical for ultrasound propagation; see, e.g., [45, 61, 325, 331, 350] and the references therein.

Thus, with a self-adjoint positive definite operator $A$ to be specified below, we consider
\[
u_{tt} + c^2 A u + D u = \kappa(x) (u^2)_{tt} + r, \quad t \in (0,T), \qquad u(0) = 0, \quad u_t(0) = 0 \tag{7.42}
\]
with damping of Caputo–Wismer–Kelvin–Chen–Holm
\[
\text{(ch)} \qquad D = b A^{\tilde{\beta}} \partial_t^{\beta} \quad \text{with } \beta \in [0,1],\ \tilde{\beta} \in [0,1],\ b \ge 0, \tag{7.43}
\]
or fractional Zener
\[
\text{(fz)} \qquad D = a_1 \partial_t^{2+\alpha} + b_1 A \partial_t^{\beta} \quad \text{with } a_1 > 0,\ b_1 \ge a_1 c^2,\ 1 \ge \beta \ge \alpha > 0, \tag{7.44}
\]
type. Note that in (7.43) we slightly extend the damping term from (7.38) to contain a fractional power of $A$, which is simply defined by replacing the eigenvalues $\lambda_j$ of $A$ in the spectral decomposition by their fractional powers, $A^{\tilde{\beta}} v = \sum_{j=1}^{\infty} \lambda_j^{\tilde{\beta}} \langle v, \varphi_j\rangle \varphi_j$. In (7.42), $c > 0$ is the constant mean wave speed, $r = r(x,t)$ is a known source term modelling excitation of the acoustic wave, e.g., by some
transducer array; see [187]. Throughout this section we assume $\Omega \subseteq \mathbb{R}^d$, $d \in \{1,2,3\}$, to be a bounded domain with $C^{1,1}$ boundary. Here we allow for spatially varying sound speed $c_0(x)$ for which we only require
\[
c_0 \in L^\infty(\Omega) \quad \text{and} \quad c_0(x) \ge \underline{c} > 0, \tag{7.45}
\]
unless otherwise stated, in the same way as it is done in, e.g., [7], by setting
\[
A = -\frac{c_0^2}{c^2}\, \Delta, \tag{7.46}
\]
where $-\Delta$ is the Laplace operator equipped with homogeneous Dirichlet boundary conditions. We denote by $(\lambda_j, \varphi_j)$ an eigensystem of the operator $A$ with domain $\dot{H}^2(\Omega) := D(A)$, which is self-adjoint and positive definite with respect to the weighted $L^2$ space $\dot{L}^2(\Omega) := L^2_{c^2/c_0^2}(\Omega)$. Note that by these assumptions the operator $A^{-1} : \dot{L}^2(\Omega) \to \dot{L}^2(\Omega)$ is compact (based on the fact that $\Omega$ is bounded; for some comments on more general domain and boundary settings, we point to [187]), so that the eigensystem exists and is complete with $\lambda_j \to \infty$ as $j \to \infty$. Like in the setting of Chapter 6 and Section 7.1, this defines a scale of Hilbert spaces $\dot{H}^s(\Omega) := D(A^{s/2})$, $s \in \mathbb{R}$, whose norm can be defined via the eigensystem as $\|v\|_{\dot{H}^s(\Omega)} = \big( \sum_{j=1}^{\infty} \lambda_j^s |\langle v, \varphi_j\rangle|^2 \big)^{1/2}$ in case $s \ge 0$ and as the dual norm of $\dot{H}^{-s}(\Omega)$ in case $s < 0$.
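The spectral definitions of the $\dot{H}^s(\Omega)$ norm and of fractional powers of $A$ act only on expansion coefficients, which the following sketch makes explicit. It works with a truncated list of coefficients $\langle v, \varphi_j\rangle$ and eigenvalues $\lambda_j$; the function names `hdot_norm` and `apply_fractional_power`, the truncation, and the example eigenvalues $\lambda_j = (j\pi)^2$ (a one-dimensional model with $c_0 \equiv c$) are illustrative assumptions.

```python
import math


def hdot_norm(coeffs, eigenvalues, s):
    """Truncated norm (sum_j lambda_j^s |<v, phi_j>|^2)^(1/2) for s >= 0."""
    if s < 0:
        raise ValueError("this sketch covers s >= 0 only")
    return math.sqrt(sum(lam ** s * c * c for lam, c in zip(eigenvalues, coeffs)))


def apply_fractional_power(coeffs, eigenvalues, power):
    """Coefficients of A^power v: each coefficient is scaled by lambda_j^power."""
    return [lam ** power * c for lam, c in zip(eigenvalues, coeffs)]


if __name__ == "__main__":
    eigenvalues = [(j * math.pi) ** 2 for j in range(1, 4)]
    coeffs = [1.0, 0.5, 0.25]
    half = apply_fractional_power(coeffs, eigenvalues, 0.5)
    # The Hdot^1 norm of v coincides with the Hdot^0 norm of A^{1/2} v.
    print(hdot_norm(coeffs, eigenvalues, 1.0), hdot_norm(half, eigenvalues, 0.0))
```

The same coefficient scaling is what defines $A^{\tilde{\beta}}$ in the ch damping term (7.43).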
We will denote by $\langle \cdot, \cdot \rangle$ the $\dot{L}^2$ inner product (that is, the weighted one) on $\Omega$, whereas the use of the ordinary $L^2$ inner product will be indicated by a subscript, $\langle \cdot, \cdot \rangle_{L^2}$. Moreover, we use the abbreviations $\|u\|_{L_t^p(L^q)} = \|u\|_{L^p(0,t;L^q(\Omega))}$, $\|u\|_{L^p(L^q)} = \|u\|_{L^p(0,T;L^q(\Omega))}$ for space-time norms.

With the above-mentioned inverse problem in mind, we study well-posedness of the initial value problem for the parameter-to-state map $G : \kappa \mapsto u$, where $u$ solves (7.42), and its linearisation $z = G'(\kappa)\delta\kappa$ solving
\[
(1 - 2\kappa u) z_{tt} + c^2 A z + D z - 4\kappa u_t z_t - 2\kappa u_{tt} z = 2\delta\kappa (u\, u_{tt} + u_t^2), \quad t \in (0,T), \qquad z(0) = 0, \quad z_t(0) = 0 \tag{7.47}
\]
for given $\kappa$ and $\delta\kappa$, respectively, in the context of the two damping models $D$ defined according to the ch and the fz cases. In order to prove Fréchet differentiability of $G$, we will also have to consider the difference $v = G(\tilde{\kappa}) - G(\kappa) = \tilde{u} - u$, which solves
\[
(1 - 2\kappa u) v_{tt} + c^2 A v + D v - 2\kappa (\tilde{u}_t + u_t) v_t - 2\kappa \tilde{u}_{tt} v = 2(\tilde{\kappa} - \kappa)(\tilde{u}\tilde{u}_{tt} + \tilde{u}_t^2), \quad t \in (0,T), \qquad v(0) = 0, \quad v_t(0) = 0, \tag{7.48}
\]
as well as the first order Taylor remainder $w = G(\tilde{\kappa}) - G(\kappa) - G'(\kappa)(\tilde{\kappa} - \kappa)$, which satisfies
\[
(1 - 2\kappa u) w_{tt} + c^2 A w + D w - 4\kappa u_t w_t - 2\kappa u_{tt} w = 2\delta\kappa \big( v \tilde{u}_{tt} + u v_{tt} + (\tilde{u}_t + u_t) v_t \big) + 2\kappa (v v_{tt} + v_t^2), \quad t \in (0,T), \qquad w(0) = 0, \quad w_t(0) = 0, \tag{7.49}
\]
with $\delta\kappa = \tilde{\kappa} - \kappa$.

In order to prove well-posedness of the nonlinear equation (7.42) (with initial conditions) needed for defining the forward operator as well as the linear equations (7.47) and (7.48) required for establishing Fréchet differentiability, we will proceed similarly in both damping model cases: First of all, we analyse a related linear equation with general coefficients that allows us to formulate the nonlinear equation as a fixed point problem and to apply Banach's fixed point theorem for proving its well-posedness. Then we apply the same general coefficient linear result to the two linear problems (7.47) and (7.48) in order to prove differentiability. It will turn out that in the ch case we need three different regularity levels in the solution spaces, whereas the analysis is somewhat simpler for fz and allows us to work on the same solution space for all purposes pointed out above.

7.2.1. Caputo–Wismer–Kelvin–Chen–Holm damping. We initially consider the ch model (7.43) and first of all consider the initial boundary value problem for the general linear pde,
\[
(1 - \sigma) u_{tt} + c^2 A u + b A^{\tilde{\beta}} \partial_t^{\beta} u + \mu u_t + \rho u = h, \tag{7.50}
\]
\[
u(0) = u_0, \quad u_t(0) = u_1, \tag{7.51}
\]
with constants $b, c > 0$ and given space- and time-dependent functions $\sigma$, $\mu$, $\rho$, $h$, where $\sigma$ satisfies the nondegeneracy condition
\[
\sigma(x,t) \le \overline{\sigma} < 1 \quad \text{for all } x \in \Omega,\ t \in (0,T). \tag{7.52}
\]
We will provide assertions on the solutions of (7.50) and (7.51) at three different levels of regularity, corresponding to the solution spaces
\[
\begin{aligned}
U &= H^2(0,T;\dot{L}^2(\Omega)) \cap W^{1,\infty}(0,T;\dot{H}^2(\Omega)) \cap L^\infty(0,T;\dot{H}^3(\Omega)),\\
U_{lo} &= H^2(0,T;\dot{L}^2(\Omega)) \cap W^{1,\infty}(0,T;H^1(\Omega)) \cap H^{\beta}(0,T;\dot{H}^{1+\tilde{\beta}}(\Omega)) \cap L^\infty(0,T;\dot{H}^2(\Omega)),\\
U_{vl} &= W^{1,\infty}(0,T;\dot{L}^2(\Omega)) \cap H^{\beta}(0,T;\dot{H}^{\tilde{\beta}}(\Omega)) \cap L^\infty(0,T;H^1(\Omega)),
\end{aligned}
\]
with the subscripts "lo" and "vl" abbreviating low and very low.
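We will also assume that the function $\sigma$ does not vary too much in space, in the sense that
\[
\|\nabla\sigma\|_{L^\infty(L^4)} < \frac{1 - \overline{\sigma}}{C_{H^1 \to L^4}} \tag{7.53}
\]
holds. The required regularity on $\sigma$, $\mu$, $\rho$, $h$, $u_0$, and $u_1$ is, besides (7.52) and (7.53),
\[
\begin{aligned}
&\sigma \in H^1(0,T;L^\infty(\Omega)) \cap L^\infty(0,T;W^{2,4}(\Omega) \cap W^{1,\infty}(\Omega)),\\
&\mu \in L^2(0,T;H^2(\Omega)), \quad \rho \in L^2(0,T;H^2(\Omega)),\\
&h \in L^2(0,T;\dot{H}^2(\Omega)), \quad u_0 \in \dot{H}^3(\Omega), \quad u_1 \in \dot{H}^2(\Omega).
\end{aligned} \tag{7.54}
\]
Moreover, we require
\[
c_0 \in W^{1,\infty}(\Omega). \tag{7.55}
\]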
Theorem 7.2. Under conditions (7.52), (7.53), (7.54), and (7.55) there exists a unique solution (7.56) u ∈ U := H 2 (0, T ; L˙ 2 (Ω)) ∩ W 1,∞ (0, T ; H˙ 2 (Ω)) ∩ L∞ (0, T ; H˙ 3 (Ω)) to the initial boundary value problem (7.50) and (7.51). This solution satisfies the estimate (7.57) ˜
u 2U := ∇utt 2L2 (L2 ) + Aut 2L∞ (L˙ 2 ) + A1+β/2 ∂tβ u 2L2 (L˙ 2 ) + ∇Au 2L∞ (L2 ) t t t
2 2 2 ≤ C(T ) Aut (0) L˙ 2 (Ω) + ∇Au(0) L2 (Ω) + Ah L2 (L˙ 2 ) . t
Proof of Theorem 7.2. In order to prove existence and uniqueness of solutions to (7.50) and (7.51), we take in principle the same steps as in the proof of Theorem 7.1, that is, 1. Galerkin approximation; 2. Energy estimates; 3. Limits; 4. Uniqueness. However, in Step 2 we will test with different multipliers to end up with somewhat different energy estimates. Moreover, we also have to take time dependence of the coefficients into account here. To make this more transparent, we will formulate the Galerkin discretisation in a slightly different manner as compared to Step 1 of the proof of Theorem 7.1.

Step 1. Galerkin discretisation. We apply the usual Faedo–Galerkin approach of discretisation in space with eigenfunctions $\varphi_i$ of $A$ to obtain the approximation $u(x,t) \approx u^n(x,t) = \sum_{i=1}^{n} u_i^n(t)\varphi_i(x)$, and testing with $\varphi_j$, that is,
\[
\big\langle (1 - \sigma) u_{tt}^n + c^2 A u^n + b A^{\tilde{\beta}} \partial_t^{\beta} u^n + \mu u_t^n + \rho u^n - h,\, v \big\rangle = 0, \quad v \in \operatorname{span}(\varphi_1, \ldots, \varphi_n). \tag{7.58}
\]
This leads to the ode system
\[
(I - S^n(t))\, {\mathbf{u}^n}''(t) + b\, (\Lambda^n)^{\tilde{\beta}} (\partial_t^{\beta} \mathbf{u}^n)(t) + M^n(t)\, {\mathbf{u}^n}'(t) + \big( c^2 \Lambda^n + R^n(t) \big) \mathbf{u}^n(t) = \mathbf{h}^n(t) \tag{7.59}
\]
with matrices and vectors defined by
\[
\begin{aligned}
&\mathbf{u}^n(t) = (u_i^n(t))_{i=1,\ldots,n}, \quad \mathbf{h}^n(t) = (\langle h(t), \varphi_i\rangle)_{i=1,\ldots,n}, \quad \Lambda^n = \operatorname{diag}(\lambda_1, \ldots, \lambda_n),\\
&S^n(t) = (\langle \sigma(t)\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,n}, \quad M^n(t) = (\langle \mu(t)\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,n}, \quad R^n(t) = (\langle \rho(t)\varphi_i, \varphi_j\rangle)_{i,j=1,\ldots,n}.
\end{aligned} \tag{7.60}
\]
Lemma 7.4. For each $n \in \mathbb{N}$, the initial value problem (7.59) has a unique solution $u^n$ on some time interval $(0, T^n)$.

Proof. Existence of a unique solution $u^n \in C^2([0,T];\mathbb{R}^n)$ to (7.59) follows from standard ODE theory (Picard–Lindelöf theorem and Gronwall's inequality), as long as $\sigma$, $\mu$, and $\rho$ are in $C([0,T];\dot H^s(\Omega))$ for some $s \in \mathbb{R}$ (noting that the eigenfunctions $\varphi_j$ are contained in $\dot H^k(\Omega)$ for any $k \in \mathbb{N}$ and therefore the vector and matrix functions $h^n$, $S^n$, $M^n$, $R^n$ in (7.60) are well-defined and contained in $C([0,T];\mathbb{R}^n)$ and $C([0,T];\mathbb{R}^{n\times n})$, respectively). Moreover, due to (7.52), the symmetric matrix $I - S^n(t)$ is positive definite with its smallest eigenvalue bounded away from zero by $1 - \overline{\sigma}$; cf. (7.52).

Step 2. Energy estimates. Like in the proof of Theorem 7.1, this central step of the proof is achieved by testing with some derivative of the Galerkin solution itself. Here, notation-wise, we stay with the ODE system (7.59) (rather than the equivalent projected PDE (7.58)). This is in order to keep the time dependence of quantities more transparent and also in order to point to this alternative way of writing things.

Lemma 7.5. The solutions $u^n$ of (7.58) according to Lemma 7.4 and (7.60) satisfy the estimate

(7.61)
$$
\begin{aligned}
&\|\nabla u^n_{tt}\|^2_{L^2_t(L^2)} + \|A u^n_t\|^2_{L^\infty_t(\dot L^2)} + \|A^{1+\tilde\beta/2}\partial_t^\beta u^n\|^2_{L^2_t(\dot L^2)} + \|\nabla A u^n\|^2_{L^\infty_t(L^2)} \\
&\qquad \le C(T)\Big(\|A u^n_t(0)\|^2_{\dot L^2(\Omega)} + \|\nabla A u^n(0)\|^2_{L^2(\Omega)} + \|A h\|^2_{L^2_t(\dot L^2)}\Big).
\end{aligned}
$$
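Proof. Testing (7.58) with $A^2 u^n_t$ is equivalent to multiplying (7.59) with the vector $(\Lambda^n)^2 {u^n}'(t)$. Considering, one after another, the terms in (7.59), we obtain the following.

We integrate with respect to time, using the identity

$$
\begin{aligned}
\big(S^n(t){u^n}''(t)\big)^T(\Lambda^n)^2{u^n}'(t)
&= \sum_{i=1}^n\sum_{j=1}^n \langle\sigma(t)\varphi_i,\varphi_j\rangle\,\lambda_j^2\,{u_i^n}''(t)\,{u_j^n}'(t) \\
&= \sum_{i=1}^n\sum_{j=1}^n \langle\sigma(t)\varphi_i,\varphi_j\rangle\,\lambda_i{u_i^n}''(t)\,\lambda_j{u_j^n}'(t)
 + \sum_{i=1}^n\sum_{j=1}^n \langle\sigma(t)\varphi_i,\varphi_j\rangle(\lambda_j-\lambda_i)\,{u_i^n}''(t)\,\lambda_j{u_j^n}'(t) \\
&= \frac12\frac{d}{dt}\sum_{i=1}^n\sum_{j=1}^n \langle\sigma(t)\varphi_i,\varphi_j\rangle\,\lambda_i{u_i^n}'(t)\,\lambda_j{u_j^n}'(t)
 - \frac12\sum_{i=1}^n\sum_{j=1}^n \langle\sigma_t(t)\varphi_i,\varphi_j\rangle\,\lambda_i{u_i^n}'(t)\,\lambda_j{u_j^n}'(t) \\
&\quad + \sum_{i=1}^n\sum_{j=1}^n \big(\langle\sigma(t)\varphi_i,A\varphi_j\rangle - \langle\sigma(t)\varphi_j,A\varphi_i\rangle\big)\,{u_i^n}''(t)\,\lambda_j{u_j^n}'(t),
\end{aligned}
$$

where

$$
\begin{aligned}
\langle\sigma(t)\varphi_i,A\varphi_j\rangle - \langle\sigma(t)\varphi_j,A\varphi_i\rangle
&= \langle A[\sigma(t)\varphi_i] - \sigma(t)A\varphi_i,\varphi_j\rangle
 = \langle -\Delta[\sigma(t)\varphi_i] + \sigma(t)\Delta\varphi_i,\varphi_j\rangle_{L^2} \\
&= -\langle (c_0^2/c^2)(\Delta\sigma(t)\,\varphi_i + \nabla\sigma(t)\cdot\nabla\varphi_i),\varphi_j\rangle,
\end{aligned}
$$

provided $\sigma(t)\varphi_i \in \dot H^s(\Omega)$ for some $s \in \mathbb{R}$. (Note that the latter identity also holds true in case of spatially varying $c_0$ since we then use the $c^2/c_0^2$ weighted $L^2$ inner product.) Thus for the first term in (7.59) we obtain

$$
\begin{aligned}
&\int_0^t \big((I - S^n(s)){u^n}''(s)\big)^T(\Lambda^n)^2{u^n}'(s)\,ds \\
&\quad= \int_0^t \Big\{\frac12\frac{d}{dt}\|\sqrt{1-\sigma}\,Au^n_t\|^2_{\dot L^2(\Omega)}(s) + \frac12\langle\sigma_t(s)Au^n_t(s),Au^n_t(s)\rangle \\
&\qquad\qquad + \langle (c_0^2/c^2)(\Delta\sigma(s)\,u^n_{tt}(s) + \nabla\sigma(s)\cdot\nabla u^n_{tt}(s)),Au^n_t(s)\rangle\Big\}\,ds \\
&\quad= \frac12\|\sqrt{1-\sigma(t)}\,Au^n_t(t)\|^2_{\dot L^2(\Omega)} - \frac12\|\sqrt{1-\sigma(0)}\,Au^n_t(0)\|^2_{\dot L^2(\Omega)} \\
&\qquad + \int_0^t \Big\{\frac12\langle\sigma_t(s)Au^n_t(s),Au^n_t(s)\rangle + \langle (c_0^2/c^2)(\Delta\sigma(s)\,u^n_{tt}(s) + \nabla\sigma(s)\cdot\nabla u^n_{tt}(s)),Au^n_t(s)\rangle\Big\}\,ds .
\end{aligned}
$$

Similarly, for the third term in (7.59) we have

$$
\big(M^n(t){u^n}'(t)\big)^T(\Lambda^n)^2{u^n}'(t) = \langle\mu(t)Au^n_t(t),Au^n_t(t)\rangle - \langle (c_0^2/c^2)(\Delta\mu(t)\,u^n_t(t) + \nabla\mu(t)\cdot\nabla u^n_t(t)),Au^n_t(t)\rangle,
$$

and for the fifth one

$$
\big(R^n(t)u^n(t)\big)^T(\Lambda^n)^2{u^n}'(t) = \langle\rho(t)Au^n(t),Au^n_t(t)\rangle - \langle (c_0^2/c^2)(\Delta\rho(t)\,u^n(t) + \nabla\rho(t)\cdot\nabla u^n(t)),Au^n_t(t)\rangle .
$$

Finally, the two terms containing $\Lambda^n$ in (7.59) yield

$$
\begin{aligned}
\int_0^t \big((\Lambda^n)^{\tilde\beta}(\partial_t^\beta u^n)(s)\big)^T(\Lambda^n)^2{u^n}'(s)\,ds
&= \sum_{j=1}^n \lambda_j^{2+\tilde\beta}\int_0^t (\partial_t^\beta u_j^n)(s)\,{u_j^n}'(s)\,ds \\
&\ge \frac{1}{2\Gamma(\beta)t^{1-\beta}}\sum_{j=1}^n \lambda_j^{2+\tilde\beta}\int_0^t \big|\partial_t^\beta u_j^n(s)\big|^2\,ds
 = \frac{1}{2\Gamma(\beta)t^{1-\beta}}\|A^{1+\tilde\beta/2}\partial_t^\beta u^n\|^2_{L^2_t(\dot L^2)}
\end{aligned}
$$

and

$$
\begin{aligned}
\int_0^t \big(\Lambda^n u^n(s)\big)^T(\Lambda^n)^2{u^n}'(s)\,ds
&= \sum_{j=1}^n \lambda_j^3\int_0^t u_j^n(s)\,{u_j^n}'(s)\,ds
 = \frac12\sum_{j=1}^n \lambda_j^3\int_0^t \frac{d}{dt}\big|u_j^n\big|^2(s)\,ds \\
&= \frac12\|\nabla Au^n(t)\|^2_{L^2(\Omega)} - \frac12\|\nabla Au^n(0)\|^2_{L^2(\Omega)} .
\end{aligned}
$$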
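Collecting all these terms and applying Young's inequality on the right-hand side yields the energy estimate

(7.62)
$$
\begin{aligned}
&\frac12\|\sqrt{1-\sigma(t)}\,Au^n_t(t)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla Au^n(t)\|^2_{L^2(\Omega)} + \frac{b}{2\Gamma(\beta)t^{1-\beta}}\|A^{1+\tilde\beta/2}\partial_t^\beta u^n\|^2_{L^2_t(\dot L^2)} \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,Au^n_t(0)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla Au^n(0)\|^2_{L^2(\Omega)} + \frac12\|Au^n_t\|^2_{L^2_t(\dot L^2)} \\
&\quad + \frac12\Big\| -\tfrac12\sigma_t Au^n_t - \mu Au^n_t - \rho Au^n + Ah \\
&\qquad\quad + (c_0^2/c^2)\big(-\Delta\sigma\,u^n_{tt} - \nabla\sigma\cdot\nabla u^n_{tt} + \Delta\mu\,u^n_t + \nabla\mu\cdot\nabla u^n_t + \Delta\rho\,u^n + \nabla\rho\cdot\nabla u^n\big)\Big\|^2_{L^2_t(\dot L^2)} \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,Au^n_t(0)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla Au^n(0)\|^2_{L^2(\Omega)} + \frac12\|Au^n_t\|^2_{L^2_t(\dot L^2)} \\
&\quad + \frac12\Big(\tfrac12\|\sigma_t\|_{L^2_t(L^\infty)}\|Au^n_t\|_{L^\infty_t(L^2)} + \|\mu\|_{L^2_t(L^\infty)}\|Au^n_t\|_{L^\infty_t(L^2)} + \|\rho\|_{L^2_t(L^\infty)}\|Au^n\|_{L^\infty_t(L^2)} + \|Ah\|_{L^2_t(L^2)} \\
&\qquad\quad + \Big\|\frac{c_0}{c}\Big\|_{L^\infty(\Omega)}\Big(\|\Delta\sigma\|_{L^\infty_t(L^4)}\|u^n_{tt}\|_{L^2_t(L^4)} + \|\nabla\sigma\|_{L^\infty_t(L^\infty)}\|\nabla u^n_{tt}\|_{L^2_t(L^2)} \\
&\qquad\qquad + \|\Delta\mu\|_{L^2_t(L^2)}\|u^n_t\|_{L^\infty_t(L^\infty)} + \|\nabla\mu\|_{L^2_t(L^4)}\|\nabla u^n_t\|_{L^\infty_t(L^4)} \\
&\qquad\qquad + \|\Delta\rho\|_{L^2_t(L^2)}\|u^n\|_{L^\infty_t(L^\infty)} + \|\nabla\rho\|_{L^2_t(L^4)}\|\nabla u^n\|_{L^\infty_t(L^4)}\Big)\Big)^2 .
\end{aligned}
$$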
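Here we can make use of the fact that

$$\|u^n\|_{L^\infty(0,t;Z)} \le \|u^n(0)\|_Z + \sqrt{T}\,\|u^n_t\|_{L^2(0,t;Z)}$$

and the embedding estimates

(7.63)
$$
\begin{aligned}
\|v\|_{L^4(\Omega)} &\le C_{H^1,L^4}\|\nabla v\|_{L^2(\Omega)}, && v \in H_0^1(\Omega), \\
\|v\|_{L^\infty(\Omega)} &\le C_{H^2,L^\infty}\|Av\|_{\dot L^2(\Omega)}, && v \in H_0^1(\Omega)\cap H^2(\Omega)
\end{aligned}
$$

(with the latter resulting from elliptic regularity and continuity of the embedding $H^2(\Omega) \to L^\infty(\Omega)$), in order to further estimate the terms containing $u$ and its derivatives on the right-hand side of (7.62):

$$
\begin{aligned}
\|u^n_{tt}\|_{L^2_t(L^4)} &\le C_{H^1,L^4}\|\nabla u^n_{tt}\|_{L^2_t(L^2)}, \\
\|u^n_t\|_{L^\infty_t(L^\infty)} &\le C_{H^2,L^\infty}\|Au^n_t\|_{L^\infty_t(\dot L^2)}, \\
\|u^n\|_{L^\infty_t(L^\infty)} &\le C_{H^2,L^\infty}\|Au^n\|_{L^\infty_t(\dot L^2)}, \\
\|\nabla u^n\|_{L^\infty_t(L^\infty)} &\le C_{H^2,L^\infty}\|\nabla Au^n\|_{L^\infty_t(L^2)} .
\end{aligned}
$$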
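For the last inequality to hold, we need more smoothness than the globally assumed $L^\infty$ boundedness on the variable wave speed $c_0$ contained in $A$ and its reciprocal. Namely, bearing in mind the identity $\nabla Au^n = \big[-(c_0(x)^2/c^2)\Delta - \nabla[(c_0(x)^2/c^2)]\,\nabla\cdot\big]\nabla u^n$, we also need a bound on the gradient of $c_0$, (7.55).

Now we proceed with estimating $\|\nabla u^n_{tt}\|^2_{L^2}$ by multiplying the ODE (7.59) with $\Lambda^n {u^n}''(t)$, that is, testing (7.58) with $v = Au^n_{tt}(t)$, and using integration by parts (note that all terms in $(1-\sigma)u^n_{tt} + c^2Au^n + bA^{\tilde\beta}\partial_t^\beta u^n + \mu u^n_t + \rho u^n - h$ vanish on $\partial\Omega$), as well as Young's inequality:

$$
\begin{aligned}
0 &= \langle (1-\sigma)u^n_{tt} + c^2Au^n + bA^{\tilde\beta}\partial_t^\beta u^n + \mu u^n_t + \rho u^n - h,\, Au^n_{tt}\rangle \\
&= \|\sqrt{1-\sigma}\,\nabla u^n_{tt}\|^2_{L^2(\Omega)} + \big\langle -\nabla\sigma\,u^n_{tt} + \nabla\big[c^2Au^n + bA^{\tilde\beta}\partial_t^\beta u^n + \mu u^n_t + \rho u^n - h\big],\, \nabla u^n_{tt}\big\rangle_{L^2(\Omega)} \\
&\ge \|\sqrt{1-\sigma}\,\nabla u^n_{tt}\|^2_{L^2(\Omega)} - \frac12(1-\overline{\sigma})\|\nabla u^n_{tt}\|^2_{L^2(\Omega)} \\
&\quad - \frac{1}{2(1-\overline{\sigma})}\big\| -\nabla\sigma\,u^n_{tt} + c^2\nabla Au^n + b\nabla A^{\tilde\beta}\partial_t^\beta u^n + \nabla\mu\,u^n_t + \mu\nabla u^n_t + \nabla\rho\,u^n + \rho\nabla u^n - \nabla h\big\|^2_{L^2(\Omega)} .
\end{aligned}
$$

Resolving this for $\|\nabla u^n_{tt}\|^2_{L^2(\Omega)}$ and using nondegeneracy (7.52) yields

$$
\begin{aligned}
\|\nabla u^n_{tt}\|_{L^2_t(L^2)} \le \frac{1}{1-\overline{\sigma}}\Big(
&\|\nabla\sigma\|_{L^\infty_t(L^4)}\|u^n_{tt}\|_{L^2_t(L^4)} + c^2\|\nabla Au^n\|_{L^2_t(L^2)} + b\|A^{1/2+\tilde\beta}\partial_t^\beta u^n\|_{L^2_t(\dot L^2)} \\
&+ \|\nabla\mu\|_{L^2_t(L^4)}\|u^n_t\|_{L^\infty_t(L^4)} + \|\mu\|_{L^2_t(L^\infty)}\|\nabla u^n_t\|_{L^\infty_t(L^2)} \\
&+ \|\nabla\rho\|_{L^2_t(L^2)}\|u^n\|_{L^\infty_t(L^\infty)} + \|\rho\|_{L^2_t(L^\infty)}\|\nabla u^n\|_{L^\infty_t(L^2)}\Big),
\end{aligned}
$$

where we can again employ the embedding estimates (7.63) and assume smallness of $\nabla\sigma$ (7.53) (more precisely, the norm of the multiplication operator $\nabla\sigma(t) : H_0^1(\Omega) \to L^2(\Omega)^d$ is uniformly bounded by a constant that is smaller than $1-\overline{\sigma}$) in order to extract an estimate of the form

(7.64)
$$
\|\nabla u^n_{tt}\|_{L^2_t(L^2)} \le C\Big(\|\nabla Au^n\|_{L^2_t(L^2)} + \|A^{1/2+\tilde\beta}\partial_t^\beta u^n\|_{L^2_t(\dot L^2)} + \|\nabla u^n_t\|_{L^\infty_t(L^2)} + \|\nabla u^n\|_{L^\infty_t(L^2)}\Big).
$$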
Adding a multiple (with factor ( σ 2L∞ (L4 ) (CH 1 ,L4 )2 + ∇σ 2L∞ (L∞ ) )) t of the square of (7.64) to (7.62), making small enough (so that C 2 < b ) and all terms containing L∞ (0, T ) norms of un on the right-hand 2Γ(β)t1−β side of (7.62) can be dominated by left-hand side terms) and using the fact ˜ for β˜ ∈ [0, 1] and Gronwall’s inequality, we end up that 1/2 + β˜ ≤ 1 + β/2 with an estimate of the form (7.61). Step 3. Weak limits. Via weak limits, this shows the existence of a solution to the homogeneous initial boundary value problem for (7.50) and the estimate (7.61) transfers to u as (7.57). Step 4. Uniqueness. Uniqueness of a solution follows from an energy estimate obtained in a lower regularity regime (see Theorem 7.4). With this, the proof of Theorem 7.2 is finished.
The energy estimate leading to the results in Theorem 7.2 has been obtained by basically "multiplying (7.50) with $A^2u_t$", that is, taking the $\dot L^2(\Omega)$ inner product of the PDE with $A^2u_t$ and using self-adjointness of $A$ in $\dot L^2(\Omega)$. Later on, we will also need less regular solutions along with estimates on them. Since the proofs are actually somewhat simpler then, we skip the details on Galerkin approximation and only provide the energy estimates. The next result is obtained by multiplication with $Au_t$ under the following regularity assumptions on the variable coefficients:

(7.65)
$$
\begin{aligned}
&\sigma \in H^1(0,T;L^\infty(\Omega)) \cap L^\infty(0,T;W^{1,\infty}(\Omega)), \\
&\mu \in L^2(0,T;L^\infty(\Omega)\cap W^{1,4}(\Omega)), \quad \rho \in L^2(0,T;H^1(\Omega)), \quad h \in L^2(0,T;H^1(\Omega)), \\
&u_0 \in \dot H^2(\Omega), \quad u_1 \in \dot H^1(\Omega).
\end{aligned}
$$

Theorem 7.3. Under the conditions (7.52) and (7.65), there exists a unique solution $u \in U_{lo} \subseteq H^2(0,T;\dot L^2(\Omega)) \cap W^{1,\infty}(0,T;\dot H^1(\Omega)) \cap L^\infty(0,T;\dot H^2(\Omega))$ to the initial boundary value problem (7.50) and (7.51), and this solution satisfies the estimate

(7.66)
$$
\begin{aligned}
\|u\|^2_{U_{lo}} &:= \|u_{tt}\|^2_{L^2_t(\dot L^2)} + \|\nabla u_t\|^2_{L^\infty_t(\dot L^2)} + \|A^{(1+\tilde\beta)/2}\partial_t^\beta u\|^2_{L^2_t(\dot L^2)} + \|Au\|^2_{L^\infty_t(\dot L^2)} \\
&\le C(T)\Big(\|\nabla u_t(0)\|^2_{L^2(\Omega)} + \|Au(0)\|^2_{\dot L^2(\Omega)} + \|\nabla h\|^2_{L^2_t(L^2)}\Big).
\end{aligned}
$$
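Proof. Since Steps 1, 3, and 4 are very much the same as in the previous proof, we concentrate on Step 2, the energy estimate, here. Multiplying (7.50) with $Au_t$ we obtain

(7.67)
$$
\begin{aligned}
&\frac12\|\sqrt{1-\sigma(t)}\,\nabla u_t(t)\|^2_{L^2(\Omega)} + \frac{c^2}{2}\|Au(t)\|^2_{\dot L^2(\Omega)} + \frac{b}{2\Gamma(\beta)t^{1-\beta}}\|A^{(1+\tilde\beta)/2}\partial_t^\beta u\|^2_{L^2_t(\dot L^2)} \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,\nabla u_t(0)\|^2_{L^2(\Omega)} + \frac{c^2}{2}\|Au(0)\|^2_{\dot L^2(\Omega)} \\
&\quad - \int_0^t \Big\langle \tfrac12\sigma_t(s)\nabla u_t(s) + \nabla\sigma(s)\,u_{tt}(s) + \nabla\mu\,u_t + \mu\nabla u_t + \nabla\rho\,u + \rho\nabla u - \nabla h,\ \nabla u_t(s)\Big\rangle_{L^2(\Omega)}\,ds \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,\nabla u_t(0)\|^2_{L^2(\Omega)} + \frac{c^2}{2}\|Au(0)\|^2_{\dot L^2(\Omega)} + \frac12\|\nabla u_t\|^2_{L^2_t(L^2)} \\
&\quad + \frac12\Big(\tfrac12\|\sigma_t\|_{L^2_t(L^\infty)}\|\nabla u_t\|_{L^\infty_t(L^2)} + \|\nabla\sigma\|_{L^\infty_t(L^\infty)}\|u_{tt}\|_{L^2_t(L^2)} \\
&\qquad\quad + \|\nabla\mu\|_{L^2_t(L^4)}\|u_t\|_{L^\infty_t(L^4)} + \|\mu\|_{L^2_t(L^\infty)}\|\nabla u_t\|_{L^\infty_t(L^2)} \\
&\qquad\quad + \|\nabla\rho\|_{L^2_t(L^2)}\|u\|_{L^\infty_t(L^\infty)} + \|\rho\|_{L^2_t(L^4)}\|\nabla u\|_{L^\infty_t(L^4)} + \|\nabla h\|_{L^2_t(L^2)}\Big)^2 .
\end{aligned}
$$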
The pde provides us with an estimate for utt ,
1 2 ˜ c Au + bAβ ∂tβ u + μut + ρu − h L2 (L2 ) 1−σ ≤ C( μ L2t (L∞ ) + ρ L2 (L4 ) ) c2 × sup 1−σ(t)∇ut (t) 2L2 (Ω) + Au(t) 2L˙ 2 (Ω) 2 t∈(0,T )
1 2 β˜ β +
A ∂t u L2 (L˙ 2 ) , t 2Γ(β)t1−β
utt L2t (L2 ) = −
which, due to the relation $\tilde\beta \le (1+\tilde\beta)/2$, allows us to dominate the $u_{tt}$ term on the right-hand side of (7.67). Thus, using Gronwall's inequality, we get an estimate of the form (7.66) provided (7.65) holds. Again, uniqueness of a solution follows from an energy estimate in a lower regularity regime (see Theorem 7.4).
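Uniqueness and an even lower regularity estimate can be obtained by multiplication of (7.50) with $u_t$. Here it suffices to assume

(7.68)
$$
\begin{aligned}
&\sigma \in H^1(0,T;L^\infty(\Omega)), \quad \mu \in L^2(0,T;L^\infty(\Omega)), \quad \rho \in L^2(0,T;L^4(\Omega)), \\
&h \in L^2(0,T;\dot L^2(\Omega)), \quad u_0 \in \dot H^1(\Omega), \quad u_1 \in \dot L^2(\Omega).
\end{aligned}
$$

Theorem 7.4. Under conditions (7.52) and (7.68), any solution $u$ lying in the space $U_{lo}$ and satisfying the initial boundary value problem (7.50) and (7.51) satisfies the estimate

(7.69)
$$
\begin{aligned}
\|u\|^2_{U_{vl}} &:= \|u_t\|^2_{L^\infty_t(\dot L^2)} + \|A^{\tilde\beta/2}\partial_t^\beta u\|^2_{L^2_t(L^2)} + \|\nabla u\|^2_{L^\infty_t(L^2)} \\
&\le C(T)\Big(\|u_t(0)\|^2_{\dot L^2(\Omega)} + \|\nabla u(0)\|^2_{L^2(\Omega)} + \|h\|^2_{L^2_t(\dot L^2)}\Big).
\end{aligned}
$$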
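Proof. Note that testing with $u_t$ is admissible for $u \in U_{lo}$ (cf. (7.66); so we do not move to Galerkin discretizations here), and yields

(7.70)
$$
\begin{aligned}
&\frac12\|\sqrt{1-\sigma(t)}\,u_t(t)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla u(t)\|^2_{L^2(\Omega)} + \frac{b}{2\Gamma(\beta)t^{1-\beta}}\|A^{\tilde\beta/2}\partial_t^\beta u\|^2_{L^2_t(\dot L^2)} \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,u_t(0)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla u(0)\|^2_{L^2(\Omega)} - \int_0^t \Big\langle \tfrac12\sigma_t(s)u_t(s) + \mu u_t + \rho u - h,\ u_t(s)\Big\rangle\,ds \\
&\le \frac12\|\sqrt{1-\sigma(0)}\,u_t(0)\|^2_{\dot L^2(\Omega)} + \frac{c^2}{2}\|\nabla u(0)\|^2_{L^2(\Omega)} + \frac12\|u_t\|^2_{L^2_t(\dot L^2)} \\
&\quad + \frac12\Big(\tfrac12\|\sigma_t\|_{L^2_t(L^\infty)}\|u_t\|_{L^\infty_t(\dot L^2)} + \|\mu\|_{L^2_t(L^\infty)}\|u_t\|_{L^\infty_t(\dot L^2)} + \|\rho\|_{L^2_t(L^4)}\|u\|_{L^\infty_t(L^4)} + \|h\|_{L^2_t(\dot L^2)}\Big)^2
\end{aligned}
$$

(where we have used the fact that $\|c/c_0\|_{L^\infty(\Omega)} \le 1$), hence an estimate of the form (7.69).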
The estimate (7.69) also yields uniqueness of higher regularity solutions (see Theorems 7.2 and 7.3), since in the linear setting we are considering here, for this purpose it suffices to prove that any solution with zero right-hand side and initial data $h = 0$, $u_0 = 0$, $u_1 = 0$ needs to vanish. Note that Theorem 7.4 does not state the existence of very low regularity solutions but only a very low order energy estimate for low order solutions (that exist according to Theorem 7.3). This will actually be enough for our purposes of proving Fréchet differentiability of the parameter-to-state map $G$, and neither existence nor uniqueness is really needed for that; see the comment preceding Theorem 7.7.

We proceed to proving well-posedness of the nonlinear problem (7.42) with $D = bA^{\tilde\beta}\partial_t^\beta$ by applying a fixed point argument to the operator $T$ that maps $v$ to the solution of

(7.71)
$$
\begin{aligned}
&(1-2\kappa v)u_{tt} + c^2Au + bA^{\tilde\beta}\partial_t^\beta u - 2\kappa v_t u_t = r \quad \text{in } \Omega\times(0,T), \\
&u(0) = 0, \quad u_t(0) = 0 \quad \text{in } \Omega,
\end{aligned}
$$

that is, of (7.50) and (7.51), where $\sigma(x,t) = 2\kappa(x)v(x,t)$, $\mu(x,t) = -2\kappa(x)v_t(x,t)$, $\rho(x,t) = 0$, and $h(x,t) = r(x,t)$.

Theorem 7.5. Assume that (7.55) holds. For any $\beta \in (0,1)$, $T > 0$, $\kappa \in W^{2,4}(\Omega)\cap W^{1,\infty}(\Omega)$ there exists $R_0 > 0$ such that for any data $u_0 \in \dot H^3(\Omega)$, $u_1 \in \dot H^2(\Omega)$, $r \in L^2(0,T;\dot H^2(\Omega))$ satisfying

(7.72)  $\|Au_1\|^2_{\dot L^2(\Omega)} + \|\nabla Au_0\|^2_{L^2(\Omega)} + \|Ar\|^2_{L^2_t(\dot L^2)} \le R_0^2,$

there exists a unique solution $u \in U$ of

(7.73)
$$
\begin{aligned}
&(1-2\kappa u)u_{tt} + c^2Au + bA^{\tilde\beta}\partial_t^\beta u = 2\kappa(u_t)^2 + r \quad \text{in } \Omega\times(0,T), \\
&u(0) = u_0, \quad u_t(0) = u_1 \quad \text{in } \Omega.
\end{aligned}
$$
Proof. We first consider the self-mapping of T and, for this purpose, employ one of the linear existence theorems above. Even in case of a constant κ, the regularity requirements on σ(x, t) = 2κ(x)v(x, t), μ(x, t) = −2κ(x)vt (x, t) that induce sufficient regularity of v and likewise of u, force us into the high regularity scenario of Theorem 7.2. For the spatially variable κ, due to the estimates
μ L2 (L2 ) = 2 (κvt ) L2 (L2 ) ≤ 2 κ L2 (Ω) vt L2 (0,T ;L∞ (Ω)
+ 2 ∇κ L4 (Ω) ∇vt L2 (L4 ) + κ L∞ (Ω) vt L2 (L2 ) ,
σt L2t (L∞ ) ≤ 2CH 2 ,L∞ (κvt ) L2 (L2 ) ,
σ L∞ (L4 ) = 2 (κv) L∞ (L4 ) ≤ 2 κ L4 (Ω) v L∞ (L∞ )
+ 2 ∇κ L4 (Ω) ∇v L∞ (L∞ ) + κ L∞ (Ω) v L∞ (L4 ) ,
∞ = 2 ∇(κv) L∞ (L∞ )
∇σ L∞ t (L ) t
≤ 2 ∇κ L∞ (Ω) v L∞ (L∞ ) + κ L∞ (Ω) ∇v L∞ (L∞ ) ,
the regularity κ ∈ W 2,4 (Ω) ∩ W 1,∞ (Ω) as assumed here is sufficient for obtaining the coefficient regularity (7.54) for any v ∈ U . To achieve the nondegeneracy and smallness conditions (7.52) and (7.53), we use the estimates (7.74)
∞ ≤ 2 κ L∞ (Ω) v L∞ (L∞ ) ,
σ L∞ t (L ) t
∇σ L∞ (L4 ) ≤ 2 ∇κ L4 (Ω) v L∞ (L∞ ) + 2 κ L∞ (Ω) ∇v L∞ (L4 ) ,
and additionally to the assumed regularity of κ, we require smallness of v. Theorem 7.2 yields that T is a self-mapping on BR = {v ∈ U : v U ≤ R} provided the initial and right-hand side data are sufficiently small so that
(7.75) C(T ) Au1 2L˙ 2 (Ω) + ∇Au0 2L2 (Ω) + Ar 2L2 (L˙ 2 ) ≤ R2 t
with C(T ) as in Theorem 7.2. In view of (7.52), (7.53), (7.74), we choose R such that (7.76)
1 CH 1 →L4 ∇κ L4 (Ω) CH 2 ,L∞ + κ L∞ (Ω) CH 1 ,L4 + κ L∞ (Ω) CH 2 ,L∞ R < . 2 Contractivity of T can be shown by taking v (1) and v (2) in BR and considering u(1) = T v (1) and u(2) = T v (2) , whose differences u = u(1) − u(2) and v = v (1) − v (2) solve ˜
(1)
(2)
(2)
(7.77) (1 − 2κv (1) )utt + c2 Au + bAβ ∂tβ ut − 2κvt ut = 2κv t ut + 2κvutt
with homogeneous initial conditions. Similarly to above, with σ = 2κv (1) , (1) (2) (2) μ = −2κvt , ρ(x, t) = 0, h = 2κv t ut + 2κvutt , since v (1) , u(2) ∈ BR (the latter due to the already shown self-mapping property of T ), we satisfy the conditions (7.52), (7.53), (7.54) on σ and μ. However h in general fails to (2) be contained in L2 (0, T ; H˙ 2 (Ω)) (in particular the term 2κvutt ), hence we move to the lower order regularity regime from Theorem 7.3. To this end, we estimate
(2) (2) ∞ ) + v L∞ (L∞ ) utt L2 (L4 ) .
∇h L2 (L2 ) ≤ 2 κ L4 (Ω) v t L2 (L4 ) ut L∞ (L t t Thus imposing the additional smallness condition √ √ θ := 2 κ L4 (Ω) CCH 1 ,L4 CH 2 ,L∞ ( T + 1)R < 1 on R and employing Theorem 7.3, we obtain contractivity
T v (1) − T v (2) Ulo = u Ulo ≤ θ v Ulo = θ v (1) − v (2) Ulo .
Existence of the linearisation of G requires well-posedness of (7.47) with D = bA∂tβ , that is, (7.50) and (7.51) with σ = 2κu, μ = −4κut , ρ = −2κutt , h = 2δκ(u utt + u2t ). Due to the appearance of a utt term, we are in a similar situation to the contractivity proof above and therefore the lower regularity Theorem 7.3 is the right framework for analysing the linearisation of the forward problem. Theorem 7.6. Under the assumptions of Theorem 7.5 for any δκ ∈ W 1,∞ (Ω), there exists a unique solution z ∈ Ulo of ˜
(1−2κu)ztt + c2 Az + bAβ ∂tβ z (7.78)
−4κut zt −2κutt z = 2δκ(u utt +u2t ) in Ω × (0, T ), z(0) = 0,
where u ∈ U solves (7.73).
zt (0) = 0 in Ω ,
In order to prove Fréchet differentiability, we also need to bound the solution $w$ of (7.49) with (7.43), that is, (7.50) and (7.51) with $\sigma = 2\kappa u$, $\mu = -4\kappa u_t$, $\rho = -2\kappa u_{tt}$, $h = 2\delta\kappa(v\tilde u_{tt} + uv_{tt} + (\tilde u_t + u_t)v_t) + 2\kappa(vv_{tt} + v_t^2)$, where $v$ can be bounded analogously to $z$ by Theorem 7.3; in particular we can only expect to have $v_{tt} \in L^2(0,T;L^2(\Omega))$, so $h \in L^2(0,T;H^1(\Omega))$ is out of reach, and we show Fréchet differentiability in the very low regularity regime of Theorem 7.4. Note that we actually only need the energy estimate from Theorem 7.4 in order to bound $w$. Existence is already guaranteed by existence of the three terms which compose $w = G(\tilde\kappa) - G(\kappa) - G'(\kappa)(\tilde\kappa - \kappa)$, namely $u = G(\kappa)$ (solving (7.73)), $\tilde u = G(\tilde\kappa)$ (solving (7.73) with $\kappa$ replaced by $\tilde\kappa$), and $z = G'(\kappa)(\tilde\kappa - \kappa)$ (solving (7.78)). Clearly, uniqueness is not needed either.

Theorem 7.7. For any $\beta \in (0,1)$, $T > 0$, $\bar R > 0$, there exists $R_0 > 0$ such that for any data $u_0 \in \dot H^3(\Omega)$, $u_1 \in \dot H^2(\Omega)$, $r \in L^2(0,T;\dot H^2(\Omega))$ satisfying (7.72), the parameter-to-state map $G : B_{\bar R}(0) \to U$ is well-defined according to Theorem 7.5. Moreover, it is Fréchet differentiable as an operator $G : B_{\bar R}(0) \to U_{vl}$. Here $B_{\bar R}(0) = \{\kappa \in W^{2,4}(\Omega)\cap W^{1,\infty}(\Omega) : \|\kappa\|_{W^{2,4}(\Omega)\cap W^{1,\infty}(\Omega)} \le \bar R\}$.

7.2.2. Fractional Zener damping. Now we consider the damping model (7.44), where based on the analysis in Section 7.1, we expect to get well-posedness only in case $\alpha \le \beta$. It will turn out that in order to tackle not only space- but also time-dependence of the coefficients (as needed for dealing with the nonlinear problem), we have to restrict ourselves to the case $\beta = 1$. We will point to this difficulty more explicitly, basically by particularizing the energy estimates from Section 7.1 in the borderline case $\beta = \alpha$, and establish a well-posedness result at least on the equation linearized at $\kappa = 0$ for that case.
Starting with the case $\beta = 1$, we proceed as in the previous subsection by first of all considering the initial boundary value problem for the general linear PDE,

(7.79)  $(1-\sigma)u_{tt} + c^2Au + b_1Au_t + a_1\partial_t^{\alpha+2}u + \mu u_t + \rho u = h,$

(7.80)  $u(0) = u_0, \quad u_t(0) = u_1 \quad (u_{tt}(0) = u_2 \text{ in case } \alpha > \tfrac12)$

with given space- and time-dependent functions $\sigma$, $\mu$, $\rho$, $h$.
For this purpose we assume the nondegeneracy and smallness of gradient conditions

(7.81)  $\sigma(x,t) \le \overline{\sigma} < \dfrac{a_1}{\Gamma(1-\alpha)T^\alpha} + 1 \quad \text{for all } x \in \Omega,\ t \in (0,T),$

(7.82)  $\|\nabla\sigma\|_{L^\infty(L^4)} < \dfrac{\frac{a_1}{\Gamma(1-\alpha)T^\alpha} + 1 - \overline{\sigma}}{C_{H^1\to L^4}},$

as well as the following regularity on $\sigma$, $\mu$, $\rho$, $h$, $u_0$, $u_1$,

(7.83)  $\mu \in L^\infty(0,T;H^1(\Omega)), \quad \rho \in L^2(0,T;H^1(\Omega)), \quad h \in L^2(0,T;\dot H^1(\Omega)), \quad u_0, u_1 \in \dot H^2(\Omega), \quad u_2 = 0.$

Theorem 7.8. Under conditions (7.81), (7.82), (7.83), there exists a unique solution

(7.84)  $u \in U := H^2(0,T;\dot H^1(\Omega)) \cap W^{1,\infty}(0,T;\dot H^2(\Omega))$

to the initial boundary value problem (7.79) and (7.80), and this solution satisfies the estimate

(7.85)  $\|\nabla u_{tt}\|^2_{L^2_t(L^2)} + \|Au_t\|^2_{L^\infty_t(\dot L^2)} \le C\Big(\|Au_t(0)\|^2_{\dot L^2(\Omega)} + \|\nabla h\|^2_{L^2_t(\dot L^2)}\Big).$
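Proof. Again we skip the details about the Faedo–Galerkin approach and the discretisation index $n$ and only provide the crucial energy estimate. We multiply (7.79) with $Au_{tt}$ and integrate with respect to time, using the inequalities and identities

$$
\int_0^t \langle Au(s),Au_{tt}(s)\rangle\,ds = \langle Au(t),Au_t(t)\rangle - \langle Au(0),Au_t(0)\rangle - \int_0^t \|Au_t(s)\|^2_{\dot L^2(\Omega)}\,ds
$$

and

$$
\begin{aligned}
\int_0^t \langle\partial_t^{\alpha+2}[u](s),Au_{tt}(s)\rangle\,ds
&= \int_0^t \langle\partial_t^\alpha[\nabla u_{tt}](s),\nabla u_{tt}(s)\rangle\,ds \\
&\ge \frac12\int_0^t \partial_t^\alpha\|\nabla u_{tt}\|^2_{L^2(\Omega)}(s)\,ds
 = \frac12 I_t^{1-\alpha}\|\nabla u_{tt}\|^2_{L^2(\Omega)}(t) \\
&\ge \frac{1}{2\Gamma(1-\alpha)t^\alpha}\|\nabla u_{tt}\|^2_{L^2_t(L^2)} .
\end{aligned}
$$

The latter equality holds provided $u_{tt}(0) = 0$, and we have applied (7.7) (actually to the Fourier components of the Galerkin discretisation, $w = \lambda_j^{1/2}{u_j^n}''$) with $\gamma = \alpha$.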
This yields the energy estimate
a1 b1 + 1 − σ
∇utt 2L2 (L2 ) + Aut (t) 2L˙ 2 (Ω) t 2Γ(1 − α)tα 2 b1 ≤ Aut (0) 2L2 (L˙ 2 ) + c2 Au(0), Aut (0) t 2 − c2 Au(t), Aut (t) + c2 Aut 2L2 (L˙ 2 ) t t + ∇σutt + ∇(μut + ρu − h)(s), ∇utt (s) ds 0
b1 ≤ Aut (0) 2L2 (L˙ 2 ) + c2 Au(0), Aut (0) t 2 b1 c4 + Aut (t) 2L˙ 2 (Ω) + Au(t) 2L˙ 2 (Ω) + c2 Aut 2L2 (L˙ 2 ) t 4 b1 + ∇σ L∞ (L4 ) utt L2 (L4 ) ∇utt L2t (L2 ) + ∇utt 2L2 (L2 ) t 2 1
∇μ L∞ + 2 ut L2 (0,t;L∞ (Ω) + μ L∞ (L4 ) ∇ut L2 (L4 ) t (L ) 2 ∞ + ρ L2 (L4 ) ∇u L∞ (L4 ) + ∇h L2 (L2 ) + ∇ρ L2t (L2 ) u L∞ t (L ) t
2
Here we assume nondegeneracy (7.81) and smallness of ∇σ (7.82) and choose a1 < Γ(1−α)T α +1−σ−CH 1→L4 ∇σ L∞ (L4 ) to obtain, using Gronwall’s lemma, the energy estimate (7.85). Analogously to Theorem 7.5, using a fixed point argument, we can prove well-posedness of the nonlinear problem. Theorem 7.9. For any α ∈ (0, 1), T > 0, κ ∈ W 1,4 (Ω), there exists R0 > 0 such that for any data u0 , u1 ∈ H˙ 2 (Ω), r ∈ L2 (0, T ; H˙ 1 (Ω)) satisfying (7.86)
Au1 2L˙ 2 (Ω) + Au0 2L˙ 2 (Ω) + ∇r 2L2 (L2 ) ≤ R02
there exists a unique solution u ∈ U of (7.87) a1 ∂tα+2 u + (1 − 2κu)utt + c2 Au + b1 Aut = 2κ(ut )2 + r in Ω × (0, T ), u(0) = u0 ,
ut (0) = u1 ,
utt = 0 in Ω .
To prove well-posedness of the linearisation (7.47) in the fz case, we apply Theorem 7.8 with h = 2δκ(u utt + u2t ), which is contained in L2 (0, T ; H˙ 1 (Ω)) for u ∈ U provided δκ ∈ L3 (Ω).
Theorem 7.10. Under the assumptions of Theorem 7.9, for any δκ ∈ L3 (Ω) there exists a unique solution z ∈ U of a1 ∂tα+2 z + (1−2κu)ztt +c2 Az +b1 Azt −4κut zt −2κutt z = 2δκ(u utt + u2t ) in Ω × (0, T ),
(7.88) z(0) = 0,
zt (0) = 0,
ztt = 0 in Ω,
where u ∈ U solves (7.87). Now Fr´echet differentiability of the mapping G follows from application ut +ut )vt )+2κ(vvtt +vt2 ) and of Theorem 7.8 with h = 2(˜ κ −κ)(v˜ utt +uvtt +(˜ where v = G(˜ κ) − G(κ) ∈ U and u = G(κ) ∈ U . It then follows that for h to ˜ −κ ∈ L3 (Ω). be contained in L2 (0, T ; H˙ 1 (Ω)), it is again enough to assume κ ¯ > 0 there exists R0 > 0 such Theorem 7.11. For any α ∈ (0, 1), T > 0, R 2 2 ˙ that for any data u0 , u1 ∈ H (Ω), r ∈ L (0, T ; H˙ 2 (Ω)) satisfying (7.86), the parameter-to-state map G : BR¯ (0) → U is well-defined according to Theorem 7.9. Moreover, it is Fr´echet differentiable as an operator G : BR¯ (0) → U . ¯ Here BR¯ (0) = {κ ∈ W 1,4 (Ω) : κ W 1,4 (Ω) ≤ R}. We now move to the case β = α, where it turns out that we can only do the case κ = 0. To see this, consider the linear problem (7.89)
(1 − σ)utt + c2 Au + b1 A∂tα u + a1 ∂tα+2 u + μut + ρu = h.
We also assume that the coeffcient b1 ≥ a1 c2 and write it as b1 = a1 c2 +δ with δ ≥ 0, which is actually the physically meaningful setting (cf., [151, Section iii.b]), so that the differential operator can partially be factorised as (1 − σ)∂tt + c2 A + (a1 c2 + δ)A∂tα + a1 ∂tα+2 + μ∂t + ρid
= ∂tt + c2 A a1 ∂tα + id − σ∂tt + δA∂tα + μ∂t + ρid. Thus, up to the perturbation terms containing σ, δ, μ, and ρ, the auxiliary function u ˜ = a1 ∂tα u + u satisfies a wave equation u ˜tt + c2 A˜ u ≈ h. Motivated by this fact, we multiply (7.89) with A˜ ut to obtain the energy identity c2 1
∇(a1 ∂tα u + u)t (t) 2L2 (Ω) + A(a1 ∂tα u + u)(t) 2L˙ 2 (Ω) 2 2 t +δ ∇∂tα u(s), ∇(a1 ∂tα u + u)t (s) ds (7.90)
0
c2 1 = ∇(a1 ∂tα u + u)t (0) 2L2 (Ω) + A(a1 ∂tα u + u)(0) 2L˙ 2 (Ω) 2 2 t + ∇(h + σutt − μut − ρu)(s), ∇(a1 ∂tα u + u)t (s) ds . 0
In (7.90), the term containing δ can be nicely tackled by means of Lemma 4.19, t ∇∂tα u(s), ∇(a1 ∂tα u + u)t (s) ds 0
a1
∇∂tα u(t) 2L2 (Ω) 2 t a1 α 2 − ∇∂t u(0) L2 (Ω) + ∇∂tα u(s), ∇ut (s) ds 2 0 1 a1
∇∂tα u 2L2 (L2 ) ≥ ∇∂tα u(t) 2L2 (Ω) + t 2 2Γ(α)t1−α =
(7.91)
(which reflects the physical fact that δ is the diffusivity of sound and therefore the corresponding term models damping). However, in the term containing σ in (7.90), this is inhibited by the time-dependence of σ. Thus in case β = α < 1, we have to restrict ourselves to the then linear forward problem (7.42), (7.44) at κ = 0 (where also μ = 0, ρ = 0). For well-posedness of the parameter-to-state map G we could simply use Corollary 7.3. However, for establishing existence of a linearzation of G at κ = 0 in Theorem 7.13 below, we need more regularity of u = G(0). To this end, we prove the following. Theorem 7.12. For any α ∈ (0, 1), T > 0, and for any data u0 ∈ H˙ 2 (Ω), u2 ∈ L˙ 2 (Ω), r ∈ L1 (0, T ; H01 (Ω)) ∩ L∞ (0, T ; L˙ 2 (Ω)), there exists a unique solution u = G(0) ∈ U := W 2+α,∞ (0, T ; L˙ 2 (Ω)) ∩ W 1+α,∞ (0, T ; H˙ 1 (Ω)) (7.92) ∩ W α,∞ (0, T ; H˙ 2 (Ω)) of a1 ∂tα+2 u + utt + c2 Au + b1 A∂tα u = r in Ω × (0, T ), u(0) = u0 ,
ut (0) = 0,
utt = u2 in Ω .
Proof. From (7.90) and (7.91) with σ = 0, μ = 0, ρ = 0, h = r, together with Young’s inequality, we obtain the energy estimate 1
∇(a1 ∂tα u + u)t 2L∞ (L2 ) + c2 A(a1 ∂tα u + u) 2L∞ (L˙ 2 ) t t 2 δ + δa1 ∇∂tα u(t) 2L∞ (L2 ) +
∇∂tα u 2L2 (L2 ) t t (7.93) Γ(α)t1−α ≤ ∇(a1 ∂tα u + u)t (0) 2L2 (Ω) + c2 A(a1 ∂tα u + u)(0) 2L˙ 2 (Ω) + 2 ∇r 2L1 (L2 ) t
(∂tα u)t (0)
1 limt 0 Γ(1−α) t−α ut (0),
= we need to assume Due to the identity ut (0) = 0 here, in order for the right-hand side to be finite.
The pde yields an estimate of u ˜tt as follows: (7.94)
2 α 2
(a1 ∂tα u + u)tt L∞ 2 = r − c A(a1 ∂t u + u) ∞ ˙ 2 . L (L ) t (L ) t
a1 ∂tα u + u
for a1 > 0, To extract temporal reguarity of u from regularity of we make use of regularity results of the time fractional odes: a1 ∂tα u + u = u ˜ ∈ W k,∞ (0, T ; Z) implies u ∈ W k+α,∞ (0, T ; Z), due to the fact that Iα maps L∞ (0, T ) to C 0,α (0, T ); see [308, Corollary 2, p. 56]. We now consider the linearization of G at κ = 0, which is defined by (7.47). To this end, we derive a weaker energy estimate by multiplying (7.89) with u ˜t = (a1 ∂tα u + u)t . Theorem 7.13. Under the assumptions of Theorem 7.12, for any δκ ∈ L∞ (Ω) there exists a unique solution ˇ (0)δκ ∈ Ulo = W 2+α,∞ (0, T ; H˙ −1 (Ω)) z=G ∩ W 1+α,∞ (0, T ; L˙ 2 (Ω)) ∩ W α,∞ (0, T ; H˙ 1 (Ω)) of a1 ∂tα+2 z + ztt + c2 Az + b1 A∂tα z = 2δκ(u utt + u2t ) in Ω × (0, T ), z(0) = 0,
zt (0) = 0,
ztt = 0 in Ω .
Proof. By testing (7.89) with u ˜t = (a1 ∂tα u + u)t , we obtain the estimate (7.95) 1
(a1 ∂tα u + u)t 2L∞ (L˙ 2 ) + c2 ∇(a1 ∂tα u + u) 2L∞ (L2 ) t t 2 δ + δa1 ∂tα u(t) 2L∞ (L˙ 2 ) +
∂ α u 2 2 t Γ(α)t1−α t Lt (L˙ 2 ) ≤ (a1 ∂tα u + u)t (0) 2L˙ 2 (Ω) + c2 ∇(a1 ∂tα u + u)(0) 2L2 (Ω) + 2 h 2L1 (L2 ) . t
The required for u = G(0)
regularity of δκ to guarantee h = 2δκ(u utt +u2t ) ∈ L1 (0, t; L2 (Ω)) ∈ U is obviously δκ ∈ L∞ (Ω).
Since we cannot establish well-definedness of G(˜ κ) for κ ˜ = 0, we cannot prove Fr´echet (actually not even directional) differentiability of G at κ = 0, ˇ (0)δκ is only a formal linearization of G at κ = 0 into the though; so z = G direction δκ.
Chapter 8
Methods for Solving Inverse Problems
The next three chapters will be devoted to inverse problems involving the models from Chapters 6 (subdiffusion) and 7 (damped wave equations). As pointed out in Section 1.3, these tend to be ill-posed in the sense that small perturbations in the data lead to large deviations in the reconstructions. Therefore, in this chapter we give a brief introduction to regularisation methods for linear and nonlinear inverse problems, as they will be crucial for the numerical solution of the inverse problems to be discussed in Chapters 9, 10, and 11.
8.1. Regularisation of linear ill-posed problems

The basic setting is that we have a compact operator $A : X \to Y$, where $X$ and $Y$ are complete normed linear spaces, and we wish to solve the equation

(8.1)  $Ax = y.$

Whenever the range of $A$ is infinite dimensional, the inverse of $A$ is unbounded, and so small perturbations in the value for $y$ due to data error will be magnified, often making the solution obtained from the straightforward approach at inverting $A$ completely useless. We will restrict ourselves to Hilbert spaces and point to, e.g., [311, 315] for possible extensions to Banach spaces. The norms will be used without subscripts denoting the spaces throughout most of this section, since it should be clear from the context whether we refer to the norm on $X$ or on $Y$. Moreover, $A^*$ denotes the adjoint of $A$, characterised by the identity $\langle Ax, y\rangle = \langle x, A^*y\rangle$, and $\mathcal{R}(A)$, $\mathcal{N}(A)$ denote the range and nullspace of $A$.
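The loss of stability described above can be observed on a simple discretised example. The sketch below is an illustration under assumptions that are not part of the text: it takes the integration operator $(Ax)(s) = \int_0^s x(t)\,dt$ on $L^2(0,1)$ as a model compact operator, discretises it by a rectangle rule on `N` cells, and perturbs the data by a vector of Euclidean norm `delta`. All names (`N`, `delta`, `x_true`) are hypothetical.

```python
import numpy as np

N = 200
h = 1.0 / N

# Rectangle-rule discretisation of (Ax)(s) = int_0^s x(t) dt on (0, 1).
A = h * np.tril(np.ones((N, N)))

# Singular values decay; the condition number grows as N is increased.
s = np.linalg.svd(A, compute_uv=False)
cond = s[0] / s[-1]

t = (np.arange(N) + 0.5) * h
x_true = np.sin(2 * np.pi * t)
y = A @ x_true

# Perturb the exact data by noise of Euclidean norm delta.
rng = np.random.default_rng(0)
noise = rng.standard_normal(N)
delta = 1e-3
y_delta = y + delta * noise / np.linalg.norm(noise)

# Naive inversion: the data error is amplified in the reconstruction.
x_naive = np.linalg.solve(A, y_delta)
err = np.linalg.norm(x_naive - x_true)
amplification = err / delta

print(f"condition number   : {cond:.1f}")
print(f"data error         : {delta:.1e}")
print(f"reconstruction err : {err:.3e}")
print(f"amplification      : {amplification:.1f}")
```

Refining the grid (larger `N`) makes the amplification worse rather than better, which is the discrete footprint of the unbounded inverse.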
The next theorem generalises the familiar singular value decomposition (SVD) of matrices to the Hilbert space setting. For this purpose, the assumption of compactness of the operator $A$ is critical. A generalisation of this decomposition to bounded but not necessarily compact operators can be deduced from the spectral theorem for bounded selfadjoint operators [128], replacing the sums by integrals with respect to appropriate measures.

Theorem 8.1. Let $X$, $Y$ be Hilbert spaces, and let $A : X \to Y$ be a compact operator. Then the spaces $X$ and $Y$ have the decompositions

(8.2)
$$
\begin{aligned}
X &= \mathcal{N}(A) \oplus \mathcal{N}(A)^\perp = \mathcal{N}(A) \oplus \overline{\mathcal{R}(A^*)}, \\
Y &= \overline{\mathcal{R}(A)} \oplus \mathcal{R}(A)^\perp = \overline{\mathcal{R}(A)} \oplus \mathcal{N}(A^*).
\end{aligned}
$$

There exist orthonormal sets of vectors $\{v_n\}_{n\in\mathbb{N}} \subset X$, $\{u_n\}_{n\in\mathbb{N}} \subset Y$ and a sequence $\{\sigma_n\}_{n\in\mathbb{N}}$ of positive real numbers, $\sigma_n \searrow 0$, such that

(8.3)  $\overline{\mathrm{span}\{v_n\}} = \mathcal{N}(A)^\perp, \qquad \overline{\mathrm{span}\{u_n\}} = \overline{\mathcal{R}(A)},$

and the operator $A$ has the representation

(8.4)  $Ax = \sum_{n=1}^\infty \sigma_n\langle x, v_n\rangle u_n .$

Further, the equation $Ax = y$ has a solution if and only if

$$
y = \sum_{n=1}^\infty \langle y, u_n\rangle u_n \quad\text{with}\quad \sum_{n=1}^\infty \frac{1}{\sigma_n^2}|\langle y, u_n\rangle|^2 < \infty .
$$

In this case solutions are of the form

(8.5)  $x = x_0 + \sum_{n=1}^\infty \frac{1}{\sigma_n}\langle y, u_n\rangle v_n$

for any $x_0 \in \mathcal{N}(A)$.

The proof is a direct consequence of the spectral theorem for self-adjoint compact operators, applied to the operator $A^*A$ and setting $\sigma_n^2$ and $v_n$ equal to the eigenvalues and eigenfunctions of $A^*A$, as well as $u_n = Av_n/\|Av_n\|$. Note that for any $x \in X$, $\langle A^*Ax, x\rangle = \langle Ax, Ax\rangle \ge 0$, and so $A^*A$ is also a positive semidefinite operator and thus its eigenvalues are nonnegative. The system $\{u_n, v_n, \sigma_n\}_{n\in\mathbb{N}}$ is called the singular system of the operator $A$ and the representation in (8.4) is the singular value decomposition (SVD) of $A$. If $\mathcal{N}(A) = \{0\}$, then $x_0 = 0$ and the equation $Ax = y$ has at most one solution, while for the existence of a solution it is necessary and sufficient that $y \in \mathcal{R}(A)$.

What we would like to achieve is to construct an operator $\tilde A$ that has a bounded inverse and therefore permits a stable inversion of $\tilde Az = y$ and is such that the operators $A$ and $\tilde A$ are sufficiently close so that the difference $\|x - z\|$ in the two solutions $x$ and $z$ is small. If we select a parameter $\alpha$ as some measure of closeness of the operators $\tilde A := A_\alpha$ and $A$, then we would require that $A_\alpha^{-1}Ax \to x$ for all $x \in X$. The difficulty is that if $\dim Y = \infty$, then the operators $A_\alpha^{-1}$ cannot be uniformly bounded; there must be a sequence $\{\alpha_k\}$, $\alpha_k \to 0$, such that $\|A_{\alpha_k}^{-1}\| \to \infty$ as $k \to \infty$. Also, the operator $A_\alpha^{-1}A$ cannot converge to the identity in the operator norm. The reason is that the operators $A_\alpha^{-1}A$, being the product of a bounded operator $A_\alpha^{-1}$ and a compact operator $A$, must be compact, and so if they converged to $I$ in the operator norm, then $I$ would be compact, which is impossible unless $X$ is finite dimensional. However, we might attempt to implement a weaker condition which would give an acceptable strategy:

Definition 8.1. A regularisation strategy for an operator $A : X \to Y$ is a family of linear, bounded operators

(8.6)  $R_\alpha : Y \to X, \quad \alpha > 0,$

such that

$$\lim_{\alpha\to 0^+} R_\alpha Ax = x \quad \text{for all } x \in X,$$

that is, the operators $R_\alpha A$ converge pointwise to the identity.

We will assume that $y$ represents the exact data and $x$ the solution to $Ax = y$ (or one possible solution if $\mathcal{N}(A)$ is nontrivial). More precisely, given $y \in \mathcal{D}(A^\dagger) := \mathcal{R}(A) \oplus \mathcal{R}(A)^\perp \subseteq Y$, we define $x^\dagger$ as the minimum norm solution of the normal equation

(8.7)  $A^*Ax = A^*y,$

which is often called the best approximate solution. Note that here we generalise the solution in the sense of nontriviality of both $\mathcal{R}(A)^\perp$ and $\mathcal{N}(A)$; the operator $A^\dagger$ that maps $y$ to $x^\dagger$ is called the (Moore–Penrose) generalised inverse of $A$ and satisfies the Moore–Penrose equations

$$
\begin{aligned}
A^\dagger AA^\dagger &= A^\dagger \ \text{ on } \mathcal{D}(A^\dagger), & AA^\dagger A &= A \ \text{ on } X, \\
AA^\dagger &= P_{\overline{\mathcal{R}(A)}} \ \text{ on } \mathcal{D}(A^\dagger), & A^\dagger A &= P_{\mathcal{N}(A)^\perp} \ \text{ on } X .
\end{aligned}
$$
y − y δ ≤ δ.
Clearly, incorporating statistical information on the noise leads to refined assertions, but for simplicity of exposition we remain with a deterministic noise bound here. It is now clear that in order to allow convergence of the regularised solution as α = α(δ) → 0, we must also assume that the maximum error δ also tends to zero. We make this precise:
200
8. Methods for Solving Inverse Problems
Definition 8.2. A regularisation strategy for an operator A : X → Y is said to be admissible if for all x, y δ with Ax − y δ ≤ δ, (8.9)
α(δ) → 0
and Rα(δ) y δ − x → 0
as δ → 0.
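Definitions 8.1 and 8.2 can be illustrated on a diagonal model problem. The sketch below rests on assumptions that are not from the text: a finite section of the diagonal operator with singular values $\sigma_n = 1/n$, a spectral cut-off family `R(alpha, .)` that inverts only the components with $\sigma_n \ge \alpha$, and the parameter choice $\alpha(\delta) = \sqrt{\delta}$, for which $\delta\,\|R_{\alpha(\delta)}\| \le \sqrt{\delta} \to 0$. It shows the reconstruction error decreasing as $\delta \to 0$ while the operator norms of $R_\alpha$ grow.

```python
import numpy as np

N = 10_000                       # finite section of the sequence space
n = np.arange(1, N + 1)
sigma = 1.0 / n                  # singular values of the diagonal model operator
x = 1.0 / n                      # "true" solution (square summable)
y = sigma * x                    # exact data y = A x


def R(alpha, data):
    """Spectral cut-off: invert only components with sigma_n >= alpha."""
    keep = sigma >= alpha
    out = np.zeros_like(data)
    out[keep] = data[keep] / sigma[keep]
    return out


rng = np.random.default_rng(0)
direction = rng.standard_normal(N)
direction /= np.linalg.norm(direction)

errors, op_norms = [], []
for delta in (1e-2, 1e-4, 1e-6):
    alpha = np.sqrt(delta)                    # assumed choice alpha(delta)
    y_delta = y + delta * direction           # ||y - y_delta|| = delta
    errors.append(np.linalg.norm(R(alpha, y_delta) - x))
    op_norms.append(1.0 / sigma[sigma >= alpha].min())   # norm of R_alpha
    print(f"delta={delta:.0e}  alpha={alpha:.0e}  "
          f"||R_alpha||={op_norms[-1]:.0f}  error={errors[-1]:.4f}")
```

With a fixed $\alpha$ the noise contribution would not vanish, and with $\alpha$ tending to zero too quickly relative to $\delta$ the bound $\delta\,\|R_\alpha\|$ would no longer tend to zero; the coupling $\alpha = \alpha(\delta)$ is what the admissibility condition (8.9) encodes.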
8.1.1. The truncated singular value decomposition. The representation of the solution of $Ax = y$ and the conditions on $y$ described in Theorem 8.1 can disguise the difficulties. Certainly, since we need the condition $y = \sum_{n=1}^\infty \langle y, u_n\rangle u_n$, we see that $y$ must lie in the closure of the range of $A$. This part is relatively easily resolved. If we let $P$ denote the orthogonal projection from $Y$ onto $\overline{R(A)}$, then it follows that
\[ Py = \sum_{n=1}^\infty \langle y, u_n\rangle u_n, \]
and for any $x \in X$
\[ \|Ax - y\|^2 = \|(Ax - Py) + (P - I)y\|^2 = \|Ax - Py\|^2 + \|(I - P)y\|^2 \ge \|(I - P)y\|^2. \]
This means that if $y$ does contain a component in the subspace orthogonal to $R(A)$, then the equation $Ax = y$ cannot be satisfied, and the best we can do is to solve only for the component that does lie in $R(A)$. We can do this by solving the projected equation
\[ Ax = PAx = Py, \tag{8.10} \]
or equivalently, the normal equation (8.7).

However, there is a further difficulty that is much harder to resolve. We also require that
\[ \sum_{n=1}^\infty \frac{1}{\sigma_n^2}\,|\langle y, u_n\rangle|^2 < \infty, \]
and while the Fourier coefficients $\langle y, u_n\rangle$ of $y$ in the basis $\{u_n\}$ may indeed tend to zero as $n \to \infty$, there is no guarantee that they do so rapidly enough to offset the term $1/\sigma_n^2$, which certainly tends to infinity with $n$. Indeed, if $y$ contains any noise, while part of this can be filtered out by the projection $P$, that part lying in $R(A)$ is unlikely to have rapidly decaying Fourier coefficients. Random noise is rarely so well behaved. If we define the finite dimensional orthogonal projection $P_k$ by
\[ P_k : Y \to \operatorname{span}\{u_1, \ldots, u_k\}, \]
then clearly $P_k y \in R(A)$ for all integers $k$ and $P_k y = \sum_{n=1}^k \langle y, u_n\rangle u_n \to Py$ as $k \to \infty$. This suggests that we in fact consider the projected equation
\[ Ax = P_k y, \tag{8.11} \]
8.1. Regularisation of linear ill-posed problems
which is always solvable and in fact the solution is
\[ x_k = x_0 + \sum_{n=1}^k \frac{1}{\sigma_n}\langle y, u_n\rangle v_n, \tag{8.12} \]
where $x_0 \in N(A)$. We can choose the solution of minimum norm by taking $x_0 = 0$. Since $\|Ax_k - Py\| = \|(P_k - P)y\| \to 0$ as $k \to \infty$, the residual of the projected equation (8.11) can be made as small as desired simply by increasing $k$. This approach is called the truncated singular value decomposition (tsvd), and the solution (8.12) with $x_0 = 0$, taken in place of the best approximate solution, is the tsvd solution. While the residual $\|Ax_k - y\| \to \|(I - P)y\|$ and hence can be made as small as possible, we cannot simply take $k$ as large as we please, as the best approximate solution then becomes very sensitive to errors in $y$. Thus while the tsvd approach gives an extremely elegant solution concept, there remains the thorny issue of how large, or small, to choose $k$. The larger $k$ is chosen, the more basis vectors in $X$ can be involved in forming $x_k$, and so this dictates choosing $k$ as large as possible. On the other hand, any error in $y$ will result in components, especially those with large index, adding pollution of the solution since we know that $1/\sigma_n \to \infty$ as $n \to \infty$.

We will use the following notation. Let $x^\dagger = A^\dagger y$ denote the best approximate solution from the exact $y$. Let $x_k$ be the tsvd approximation to $x^\dagger$, and let $\tilde x_k$ denote the tsvd approximation obtained from the measured data $y^\delta$. Then
\[ \|x_k - \tilde x_k\|^2 = \Bigl\| \sum_{j=1}^k \frac{1}{\sigma_j}\langle y - y^\delta, u_j\rangle v_j \Bigr\|^2 = \sum_{j=1}^k \frac{1}{\sigma_j^2}\,|\langle y - y^\delta, u_j\rangle|^2 \le \frac{1}{\sigma_k^2}\sum_{j=1}^k |\langle y - y^\delta, u_j\rangle|^2 \le \frac{\delta^2}{\sigma_k^2}. \tag{8.13} \]
Therefore,
\[ \|\tilde x_k - x^\dagger\| \le \|x_k - x^\dagger\| + \frac{\delta}{\sigma_k}, \tag{8.14} \]
and consequently if $k = k(\delta)$ is chosen so that $k(\delta) \to \infty$ and $\delta\,\sigma_{k(\delta)}^{-1} \to 0$ as $\delta \to 0$, then $\tilde x_k \to x^\dagger = A^\dagger y$, and so for this truncation choice the tsvd is a regularisation method.

Another way to view (8.14) is the following. If we have converted the operator $A$ into a possibly large matrix (assume for convenience this is square), then we have the solution $x = V D^{-1} U^T y^\delta$ where $y^\delta$ is the measured data and $D^{-1}$ is the diagonal matrix with (truncated) entries
$\{1/\sigma_1, \ldots, 1/\sigma_k, 0, \ldots, 0\}$. Neither of the matrices $U$ nor $V$ changes the scale, as they are orthogonal and represent rotations of the space, so the only contributions to scaling come from the multiplication of the entries of the vector $U^T y^\delta$ by the values $\{1/\sigma_j\}_{j=1}^k$. Since the entries in the vector $U^T y^\delta$ have errors of size $\le \delta$, we want to choose $k$ so that $\delta/\sigma_k$ is of order one at most. Anything larger would be simply increasing the noise in the components of the vector $U^T y$ by a factor greater than that of the signal.

While tsvd is optimal both with respect to stability and with respect to approximation, this comes at a high computational cost if the singular system is not known analytically but has to be determined numerically, which is the case in most realistic scenarios.

8.1.2. Tikhonov regularisation. The tsvd approach to regularising the compact operator $A$ can be thought of as replacing it with one that has the same singular values $\{\sigma_j\}_{j=1}^k$ (and same component vectors $\{u_j, v_j\}_{j=1}^k$) but neglecting all contributions for $j > k$. It is thus a finite dimensional approximation that agrees exactly with $A$ in those components that the error level allows us to maintain. If now $A$ were a nonnegative and self-adjoint operator, then its spectrum $\{\lambda_n\}_{n\in\mathbb{N}} \cup \{0\}$ would lie in the interval $[0, \|A\|]$, and we could define an approximation operator $A_\alpha$ for $\alpha > 0$ that would be guaranteed to be invertible simply by taking $A_\alpha$ to have the spectrum $\alpha + \lambda_n$. Of course there is no reason that our $A$ will have these self-adjointness and nonnegativity properties, but the operator $A^*A$ will. This suggests that we consider the operator $\alpha I + A^*A$ as a possible regulariser for $A^*A$. In this case the equation $Ax = y$ is replaced by
\[ (\alpha I + A^*A)x = A^*y. \tag{8.15} \]
This is a regularised version of the normal equation (8.7). It can also be interpreted as the first order optimality condition for a quadratic minimisation problem.

Definition 8.3. Let $A : X \to Y$ be a compact operator, and let $\alpha > 0$ be a given constant. The Tikhonov regularised solution $x_\alpha \in X$ is the minimiser of the functional
\[ J_\alpha(x) := \|Ax - y\|^2 + \alpha\|x\|^2, \tag{8.16} \]
provided that this exists. The constant $\alpha$ is the Tikhonov regularisation parameter.

This variational formulation opens the door to extending Tikhonov regularisation in many directions: to the use of Banach in place of Hilbert spaces, to employing general convex functionals as regularisation terms, to incorporating information on the type of noise into the definition of the data misfit term, and last but not least, to extending the scope to nonlinear problems.
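As a small illustration of how the regularised normal equation (8.15) and the functional (8.16) fit together, the following sketch solves (8.15) for a 2×2 matrix and evaluates (8.16) at the result. The matrix, the data, the value of `alpha`, and the helper names `tikhonov_2x2` and `J` are illustrative assumptions, not taken from the text.

```python
# Illustrative sketch only: matrix, data and alpha are made-up values.

def tikhonov_2x2(A, y, alpha):
    """Solve (alpha*I + A^T A) x = A^T y for a real 2x2 matrix A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix M = alpha*I + A^T A.
    m11 = alpha + a * a + c * c
    m12 = a * b + c * d
    m22 = alpha + b * b + d * d
    # Right-hand side A^T y.
    r1 = a * y[0] + c * y[1]
    r2 = b * y[0] + d * y[1]
    det = m11 * m22 - m12 * m12  # positive for alpha > 0
    return [(m22 * r1 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det]


def J(A, y, alpha, x):
    """Tikhonov functional ||Ax - y||^2 + alpha * ||x||^2."""
    res = [A[i][0] * x[0] + A[i][1] * x[1] - y[i] for i in range(2)]
    return res[0] ** 2 + res[1] ** 2 + alpha * (x[0] ** 2 + x[1] ** 2)


A = [[1.0, 1.0], [1.0, 1.0001]]  # nearly singular matrix (assumed example)
y = [2.0, 2.0001]                # exact data for x = (1, 1)
alpha = 1e-3
x_alpha = tikhonov_2x2(A, y, alpha)
print(x_alpha, J(A, y, alpha, x_alpha))
```

Perturbing `x_alpha` in any direction should increase `J`, which is a quick numerical check that the solution of (8.15) is indeed the minimiser of (8.16) in this toy setting.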
The following theorem shows the connection between this definition, the operator $A_\alpha$, and the svd of $A$.

Theorem 8.2. Let $A : X \to Y$ be a compact operator with the singular system $\{\sigma_n, u_n, v_n\}$. Then the Tikhonov regularised solution exists and is given by the unique solution to the regularised normal equation (8.15), with the representation
\[ x_\alpha = (\alpha I + A^*A)^{-1}A^*y = \sum_{n=1}^\infty \frac{\sigma_n}{\sigma_n^2 + \alpha}\,\langle y, u_n\rangle v_n. \tag{8.17} \]

Proof. The operator $\alpha I + A^*A$ satisfies
\[ \langle(\alpha I + A^*A)x, x\rangle = \langle Ax, Ax\rangle + \alpha\langle x, x\rangle \ge \alpha\|x\|^2, \]
and so from the Riesz representation theorem the inverse exists and satisfies
\[ \|(\alpha I + A^*A)^{-1}\| \le \frac{1}{\alpha}. \]
This shows that $x_\alpha$ is well-defined and unique. If we now write the equation $(\alpha I + A^*A)x = A^*y$ in terms of the singular system of $A$, then we obtain
\[ \sum_{m=1}^\infty (\alpha + \sigma_m^2)\langle x, v_m\rangle v_m + \alpha P_0 x = \sum_{m=1}^\infty \sigma_m\langle y, u_m\rangle v_m, \]
where $P_0 : X \to N(A)$ is the orthogonal projection onto the nullspace of $A$. If we take the inner product of the above with $v_n$, then since $\langle P_0 x, v_n\rangle = \langle x, P_0 v_n\rangle = 0$, we obtain $(\alpha + \sigma_n^2)\langle x, v_n\rangle = \sigma_n\langle y, u_n\rangle$ and the representation for $x_\alpha$ in (8.17) follows from (8.5). For any $x \in X$ write $x = x_\alpha + w$. Then
\[ \begin{aligned} J_\alpha(x_\alpha + w) &= \langle Ax_\alpha - y + Aw, Ax_\alpha - y + Aw\rangle + \alpha\langle x_\alpha + w, x_\alpha + w\rangle \\ &= J_\alpha(x_\alpha) + 2\langle w, (\alpha I + A^*A)x_\alpha - A^*y\rangle + \alpha\|w\|^2 + \|Aw\|^2 \\ &= J_\alpha(x_\alpha) + \alpha\|w\|^2 + \|Aw\|^2. \end{aligned} \tag{8.18} \]
This gives $J_\alpha(x_\alpha + w) \ge J_\alpha(x_\alpha)$ with equality if and only if $w = 0$, showing that $x_\alpha$ is a minimiser of $J_\alpha$, as required.

We can also view Tikhonov regularisation as using the representation (8.5) for $x$ except that we have multiplied the $n$th mode by the filter factor $\sigma_n^2/(\alpha + \sigma_n^2)$. Once again there is the balancing: To keep our regularised solution $x_\alpha$ close to $x$, we should choose $\alpha$ small. However, if we look at the case where $A$ is a square matrix, then the condition number of $A$ is the ratio of the maximum to minimum singular values, and thus $A_\alpha$ will have condition number $\sigma_\alpha = (\alpha + \sigma_1^2)/(\alpha + \sigma_n^2)$. This clearly goes to infinity as $n \to \infty$ if $\alpha = 0$ in the case where $A$ approximates a compact operator since then $\sigma_n \to 0$. In fact, if for any given $\alpha > 0$ we take $n$ large enough so that $\sigma_n^2 \le \alpha$, then $\sigma_\alpha \ge \sigma_1^2/(2\alpha)$. This shows that such an $A_\alpha$ will have unbounded
condition number as $\alpha \to 0$ and hence the error in the solution of the normal equation will increase as $\alpha \to 0$, the more so, the larger the expected noise level in the data. We must therefore find a compromise value for $\alpha$, and we should expect this to depend strongly on the level of data noise.

We can make this situation more precise as follows. Assume that the actual data is measured instead as $y^\delta$, where $\|y - y^\delta\| \le \delta$. Then while $x^\dagger = A^\dagger y$ is the true best approximate solution to $Ax = y$, the Tikhonov approximation with perturbed data $y^\delta$ is denoted by $\tilde x_\alpha$, where $(\alpha I + A^*A)\tilde x_\alpha = A^*y^\delta$. We have
\[ \tilde x_\alpha - x^\dagger = (\alpha I + A^*A)^{-1}A^*(y^\delta - y) + (\alpha I + A^*A)^{-1}A^*y - A^\dagger y \tag{8.19} \]
and therefore, using $A^*y = A^*Ax^\dagger$ and $A^\dagger y = x^\dagger$,
\[ \|\tilde x_\alpha - x^\dagger\| \le \|(\alpha I + A^*A)^{-1}A^*(y^\delta - y)\| + \alpha\|(\alpha I + A^*A)^{-1}x^\dagger\| =: E_1 + E_2. \tag{8.20} \]
The first term, corresponding to the propagated noise, can be written, using $(\alpha I + A^*A)^{-1}A^* = A^*(\alpha I + AA^*)^{-1}$, as
\[ \begin{aligned} E_1^2 &= \langle A^*(\alpha I + AA^*)^{-1}(y - y^\delta), A^*(\alpha I + AA^*)^{-1}(y - y^\delta)\rangle \\ &= \langle(\alpha I + AA^*)^{-1}(y - y^\delta), AA^*(\alpha I + AA^*)^{-1}(y - y^\delta)\rangle. \end{aligned} \tag{8.21} \]
It is readily checked (e.g., by means of singular value expansion) that $\|AA^*(\alpha I + AA^*)^{-1}\| \le 1$ and $\|(\alpha I + AA^*)^{-1}\| \le 1/\alpha$ and hence
\[ E_1 \le \frac{\delta}{\sqrt{\alpha}}. \tag{8.22} \]
The second term $E_2$ represents the approximation error in replacing the solution of $Ax = y$ by the solution of the regularised normal equation with parameter $\alpha$.

[Figure: the error contributions $E_1$ and $E_2$ plotted against $\alpha$.]

As we will prove below, $E_2 \to 0$ as $\alpha \to 0$, while the first term can be estimated by $E_1 \le \delta/\sqrt{\alpha}$, and for fixed $\delta > 0$ the right-hand side here is unbounded as $\alpha \to 0$. We thus have the picture shown in the figure. The total error increases both for small and large values of $\alpha$, and this is exactly the situation we also found for the tsvd. It is the classic dilemma of inverse problem regularisation.
Clearly, the value of $\alpha$ used must now depend on the assumed error level $\delta$, and we write $\alpha(\delta)$. Then (8.22) shows that we must choose $\alpha$ such that $\alpha \to 0$ and $\delta^2/\alpha \to 0$ as $\delta \to 0$ to ensure that Tikhonov's method gives a regularisation algorithm for $Ax = y$.
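The trade-off between approximation error and propagated noise can be observed numerically. The sketch below works directly in the singular basis of a diagonal model operator, so that tsvd and Tikhonov regularisation act coefficient by coefficient. The singular values `1/n**2`, the exact coefficients `1/n`, the noise pattern, and all function names are assumptions chosen purely for illustration.

```python
import math

# Assumed model problem in the singular basis (not from the text).
N = 50
sigma = [1.0 / n**2 for n in range(1, N + 1)]   # singular values
x_true = [1.0 / n for n in range(1, N + 1)]     # coefficients of the exact solution
delta = 1e-3
# Deterministic "noise" with norm delta (alternating signs, equal size).
noise = [delta / math.sqrt(N) * (-1) ** n for n in range(N)]
y_delta = [s * x + e for s, x, e in zip(sigma, x_true, noise)]


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def tsvd(data, k):
    """Keep the first k modes, divide by sigma_n, drop the rest."""
    return [d / s if n < k else 0.0 for n, (d, s) in enumerate(zip(data, sigma))]


def tikhonov(data, alpha):
    """Apply sigma_n / (sigma_n^2 + alpha) to each data coefficient."""
    return [s / (s * s + alpha) * d for d, s in zip(data, sigma)]


def err(x):
    return norm([a - b for a, b in zip(x, x_true)])


ks = list(range(1, N + 1))
tsvd_err = [err(tsvd(y_delta, k)) for k in ks]
alphas = [10.0 ** (-j) for j in range(13)]
tik_err = [err(tikhonov(y_delta, a)) for a in alphas]

best_k = ks[tsvd_err.index(min(tsvd_err))]
best_alpha = alphas[tik_err.index(min(tik_err))]
print("best truncation index:", best_k, " best alpha on grid:", best_alpha)
```

In this toy setting the smallest total error occurs at an intermediate truncation index and an intermediate value of `alpha`, in line with the picture described above; the propagated noise also stays below the bounds $\delta/\sigma_k$ from (8.13) and $\delta/\sqrt{\alpha}$ from (8.22).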
8.1.3. Landweber iteration. As an example of an iterative regularisation method (probably the simplest one) we consider the gradient descent iteration for the data misfit function $\frac12\|Ax - y^\delta\|^2$, which upon fixing the step size to unity leads to the iteration
\[ \tilde x_{k+1} = \tilde x_k - A^*(A\tilde x_k - y^\delta). \]
Here without loss of generality we also assume the equation to be properly scaled such that $\|A\| \le 1$. To assess the total error, we resolve the recursion for the error:
\[ \tilde x_k - x^\dagger = (I - A^*A)(\tilde x_{k-1} - x^\dagger) + A^*(y^\delta - y) = (I - A^*A)^k(x_0 - x^\dagger) + \sum_{j=0}^{k-1}(I - A^*A)^j A^*(y^\delta - y). \]
The first term $(I - A^*A)^k(x_0 - x^\dagger)$, representing the approximation error, tends to zero as $k \to \infty$, as we will prove below. The second term, incorporating the propagated noise, can be estimated by means of the singular value decomposition
\[ \|(I - A^*A)^j A^*(y^\delta - y)\|^2 = \sum_{n=1}^\infty \sigma_n^2(1 - \sigma_n^2)^{2j}\langle y^\delta - y, u_n\rangle^2 \le \frac{1}{2j+1}\|y^\delta - y\|^2, \]
where we have used the simple fact that $\lambda(1 - \lambda)^\ell \le \frac{1}{\ell+1}$ for all $\lambda \in (0, 1)$, $\ell \in \mathbb{N}$. Thus
\[ \Bigl\|\sum_{j=0}^{k-1}(I - A^*A)^j A^*(y^\delta - y)\Bigr\| \le \sum_{j=0}^{k-1}\frac{1}{\sqrt{2j+1}}\|y^\delta - y\| \le (\sqrt{2k} + 1)\|y^\delta - y\|, \]
and we encounter the same pattern as with the other two regularisation methods discussed above: The approximation error tends to zero, whereas the propagated noise potentially explodes as $k$ tends to infinity. Thus, the iteration needs to be stopped early enough and the stopping index $k = k_*$ acts as a regularisation parameter, which needs to be chosen in dependence of the noise level.
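The need for early stopping can be seen in a few lines of code. The following sketch runs the Landweber iteration with unit step size on a diagonal model operator (so that $A^* = A$ acts coefficientwise) and records the error against the exact solution at every step. The singular values `1/n`, the exact coefficients, the noise, and the helper name `landweber_errors` are illustrative assumptions.

```python
import math

# Assumed diagonal model problem with ||A|| = 1 (not from the text).
N = 20
sigma = [1.0 / n for n in range(1, N + 1)]
x_true = [1.0 / n for n in range(1, N + 1)]
delta = 0.1
noise = [delta / math.sqrt(N) * (-1) ** n for n in range(N)]  # norm delta
y_delta = [s * x + e for s, x, e in zip(sigma, x_true, noise)]


def err(x):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_true)))


def landweber_errors(num_iter):
    """Run x <- x - A^*(A x - y_delta) from x = 0 and return all errors."""
    x = [0.0] * N
    errors = [err(x)]
    for _ in range(num_iter):
        x = [xi - s * (s * xi - d) for xi, s, d in zip(x, sigma, y_delta)]
        errors.append(err(x))
    return errors


K = 2000
errors = landweber_errors(K)
k_best = errors.index(min(errors))
print("smallest error at iteration", k_best, "of", K)
```

In this example the error first decreases and later grows again as the noise is amplified, so the best iterate lies strictly between the start and the final step.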
8.1.4. Convergence analysis of linear regularisation methods. In fact all the three methods we have just discussed (truncated svd, Tikhonov regularisation, and Landweber iteration, as well as many more) are special cases of a more general situation. From (8.5) the problem with the solution $x^\dagger = \sum_{n=1}^\infty \frac{1}{\sigma_n}\langle y, u_n\rangle v_n$ is that the reciprocals of very small singular values, corresponding to $n$ large (or higher frequency modes), give rise to amplitudes that overwhelm the information contained in the larger singular values corresponding to small $n$ (or lower frequency modes). Regularisation
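attempts can be thought of as a means of modifying this behaviour. Indeed we can view a regularised solution as
\[ x_\alpha = \sum_{n=1}^\infty \omega(\sigma_n, \alpha)\,\frac{1}{\sigma_n}\langle y, u_n\rangle v_n, \tag{8.23} \]
where $\omega(\sigma, \alpha)$ is a filter or dampening function. The case when $\omega(\sigma, \alpha)$ is identically one corresponds to the un-regularised solution. Truncated svd, Tikhonov regularisation, and Landweber iteration result from the respective filter functions
\[ \omega(\sigma, \alpha) = \begin{cases} 1 & \text{if } \sigma \ge \alpha, \\ 0 & \text{if } \sigma < \alpha, \end{cases} \qquad \omega(\sigma, \alpha) = \frac{\sigma^2}{\alpha + \sigma^2}, \qquad \omega(\sigma, \alpha) = \sum_{j=0}^{k-1}(1 - \sigma^2)^j\sigma^2 = 1 - (1 - \sigma^2)^k \tag{8.24} \]
with $k = 1/\alpha$. Here we should view $\omega(\sigma, \alpha)$ as being a function $\omega : (0, \|A\|] \times (0, \bar\alpha] \to \mathbb{R}$ for some maximal regularisation parameter $\bar\alpha$ (note that we are only interested in small values of $\alpha$). We shall assume that $\omega$ satisfies the following conditions, which can easily be verified for the above examples:
\[ \omega(\sigma, \alpha) \le 1 \ \text{for all } \alpha > 0, \qquad \omega(\sigma, \alpha) \le C(\alpha)\sigma, \qquad \lim_{\alpha\to 0^+}(\omega(\sigma, \alpha) - 1) = 0 \tag{8.25} \]
for all $\sigma \in (0, \|A\|]$. Note that the latter condition typically does not hold at $\sigma = 0$. Under these assumptions we have:

Theorem 8.3. Define $R_\alpha : Y \to X$ by $R_\alpha y = x_\alpha$ according to (8.23), where $\omega$ satisfies (8.25). Then $R_\alpha$ provides a regularisation strategy with $\|R_\alpha\| \le C(\alpha)$. The parameter choice $\alpha(\delta)$ is admissible if $\alpha(\delta) \to 0$ and $\delta\,C(\alpha(\delta)) \to 0$ as $\delta \to 0$.

Proof. We have from (8.25)
\[ \|R_\alpha y\|^2 = \sum_{n=1}^\infty \omega(\sigma_n, \alpha)^2\frac{1}{\sigma_n^2}\langle y, u_n\rangle^2 \le C(\alpha)^2\sum_{n=1}^\infty\langle y, u_n\rangle^2 \le C(\alpha)^2\|y\|^2, \]
from which it follows that $R_\alpha$ is bounded and $\|R_\alpha\| \le C(\alpha)$. Now
\[ R_\alpha Ax - x = \sum_{n=1}^\infty \frac{1}{\sigma_n}\omega(\sigma_n, \alpha)\langle Ax, u_n\rangle v_n - \sum_{n=1}^\infty\langle x, v_n\rangle v_n, \]
and since $\langle Ax, u_n\rangle = \langle x, A^*u_n\rangle = \sigma_n\langle x, v_n\rangle$, we have
\[ \|R_\alpha Ax - x\|^2 = \sum_{n=1}^\infty \bigl(\omega(\sigma_n, \alpha) - 1\bigr)^2\langle x, v_n\rangle^2. \tag{8.26} \]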
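To prove convergence, we fix $\varepsilon > 0$ arbitrarily. Now since $\{v_n\}_{n\in\mathbb{N}}$ is a basis for $X$, there must exist an $N$ such that
\[ \sum_{n=N+1}^\infty \langle x, v_n\rangle^2 < \tfrac12\varepsilon^2. \]
Since the family of functions $(\omega(\cdot, \alpha))_{\alpha\in(0,\bar\alpha]}$ converges to one uniformly on the compact interval $[\sigma_N, \|A\|]$, there also exists $\alpha_0 \in (0, \bar\alpha]$ such that
\[ |\omega(\sigma_n, \alpha) - 1| < \frac{1}{\sqrt{2}}\,\frac{\varepsilon}{\|x\|} \quad\text{for all } 0 < \alpha < \alpha_0,\ n \le N. \]
Then (8.26) gives for all $0 < \alpha < \alpha_0$
\[ \|R_\alpha Ax - x\|^2 = \sum_{n=1}^N \bigl(\omega(\sigma_n, \alpha) - 1\bigr)^2\langle x, v_n\rangle^2 + \sum_{n=N+1}^\infty \bigl(\omega(\sigma_n, \alpha) - 1\bigr)^2\langle x, v_n\rangle^2 \tag{8.27} \]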
0 and 0 < σ ≤ A
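for some $\tilde C > 0$ and assume that $x \in R((A^*A)^\mu)$, that is, for some $v \in X$
\[ x = (A^*A)^\mu v, \tag{8.29} \]
then
\[ \|R_\alpha Ax - x\| \le \tilde C\alpha^\mu\|v\| \tag{8.30} \]
and with the a priori parameter choice rule $\alpha(\delta) \sim \delta^{2/(2\mu+1)}$
\[ \|R_\alpha y^\delta - x\| \le C\delta^{2\mu/(2\mu+1)} \tag{8.31} \]
for all $y^\delta \in Y$ with $\|y^\delta - Ax\| \le \delta$.

Proof. In terms of the svd, using the identity $\langle x, v_n\rangle = \langle v, (A^*A)^\mu v_n\rangle = \sigma_n^{2\mu}\langle v, v_n\rangle$, we have from (8.26)
\[ \|R_\alpha Ax - x\|^2 = \sum_{n=1}^\infty \bigl((\omega(\sigma_n, \alpha) - 1)\,\sigma_n^{2\mu}\bigr)^2\langle v, v_n\rangle^2 \le \tilde C^2\alpha^{2\mu}\|v\|^2. \]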
It is readily checked that the condition (8.28) is satisfied by the examples (8.24) of tsvd, Tikhonov regularisation, and Landweber iteration. However, in the case of Tikhonov regularisation, this only holds true for $\mu \le 1$, and higher regularity of $x$ beyond this does not yield better convergence; this is known as the saturation phenomenon and does not arise for tsvd or Landweber iteration. Condition (8.29) is often referred to as a source-wise representation condition and, in terms of the svd, it can be written as
\[ \|v\|^2 = \sum_{n=1}^\infty \sigma_n^{-4\mu}\langle x, v_n\rangle^2 < \infty. \]
This means that the coefficients $\langle x, v_n\rangle$ not only have to be square summable, as would be the case for an ordinary element of $X$, but they need to decay sufficiently fast to compensate for the unboundedly increasing factor $\sigma_n^{-4\mu}$. Obviously, the higher $\mu$ is, the stronger this condition becomes. In view of the fact that the operators $A$, $A^*$, $A^*A$ are typically smoothing, (8.29) can also be interpreted as a condition on the regularity of $x$. Convergence rates have also been established under generalised source conditions $x = f(A^*A)v$, corresponding to $\|v\|^2 = \sum_{n=1}^\infty \frac{1}{f(\sigma_n^2)^2}\langle x, v_n\rangle^2 < \infty$ [149]. This is of particular interest for severely ill-posed problems, where the singular values decay exponentially and therefore (8.29) would be far too strong for any $\mu > 0$, whereas its generalisation can still make sense with finite Sobolev regularity of $x$ when choosing $f$ as a function that decays only logarithmically at zero.

8.1.5. Regularisation parameter choice. A central issue is how to select the value of $\alpha$. Note that the a priori choice given in Corollary 8.1 requires knowledge of the smoothness parameter $\mu$ of the exact solution, which is hardly ever available in reality. There have been numerous papers and book chapters written on this topic and various strategies proposed [97, 122, 123, 134, 136, 329]. While many of these do lead to workable approaches, there remains no such thing as a procedure that works better than others for all operators $A$. In fact, for any of these methods it is possible to construct an $A$ such that the reconstruction behaves relatively poorly.

That being said, there are several methods that lead to satisfactory results for a wide class of operators $A$. One of the oldest and best established approaches is the Morozov discrepancy principle, which once again says we should not try to find a fit with a residual much smaller than the estimated maximum noise level.
If we let $r : \mathbb{R}^+ \to \mathbb{R}^+$, $r(\alpha) = \|Ax_\alpha - y\|$, then this is the residual (or data discrepancy) function in terms of the regularisation parameter $\alpha$. The Morozov principle says that we should take this equal to the maximum estimate of the noise,
\[ r(\alpha) = \|Ax_\alpha - y\| = \delta. \tag{8.32} \]
While being very intuitive, the discrepancy principle is also theoretically well founded. In fact, the convergence results above remain valid with the discrepancy principle as an a posteriori rule for choosing $\alpha$, provided (8.28) holds with $\mu$ replaced by $\mu + \frac12$. Computationally, it amounts to solving a scalar equation $r(\alpha) = \delta$ for $\alpha$ which, depending on how smooth the dependence of $x_\alpha$ on $\alpha$ is, can be done by means of Newton's method or the bisection method. The most costly part of these types of iterations is the evaluation of $x_\alpha$ inside $r(\alpha)$ in each step of such an iterative procedure. That is, the regularised solution needs to be computed for several tentative values of the regularisation parameter in order to determine the latter in an appropriate way.

A practical issue about the choice of $\alpha$ according to the discrepancy principle or an a priori rule, such as the one given in Corollary 8.1, is due to the fact that the value of $\delta$ is often unknown in practice. In that case, so-called $\delta$-free parameter choice strategies can be employed, the most well known being the so-called L-curve method developed by Hansen and coworkers in the early 1990s [134, 136]. See also [135] for a matlab package designed to implement this approach. Theory says that $\delta$-free methods cannot lead to convergence in the sense of Definition 8.2, a fact that is known as Bakushinskii's veto. Still, under certain structural conditions on the noise, convergence assertions can also be made for these approaches.
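The scalar equation $r(\alpha) = \delta$ can be solved by bisection, as the following sketch indicates for Tikhonov regularisation on a diagonal model operator. The residual is evaluated with the measured data `y_delta`; the model problem, the bracketing interval, the bisection in $\log\alpha$, and the function names are all assumptions made for this illustration.

```python
import math

# Assumed diagonal model problem in the singular basis (not from the text).
N = 50
sigma = [1.0 / n**2 for n in range(1, N + 1)]
x_true = [1.0 / n for n in range(1, N + 1)]
delta = 1e-3
noise = [delta / math.sqrt(N) * (-1) ** n for n in range(N)]  # norm delta
y_delta = [s * x + e for s, x, e in zip(sigma, x_true, noise)]


def residual(alpha):
    """r(alpha) = ||A x_alpha - y_delta|| for the Tikhonov solution x_alpha.

    In the singular basis the n-th residual coefficient equals
    -alpha / (sigma_n^2 + alpha) * y_delta_n, so r is increasing in alpha.
    """
    return math.sqrt(sum((alpha / (s * s + alpha) * d) ** 2
                         for s, d in zip(sigma, y_delta)))


def discrepancy_alpha(target, lo=1e-16, hi=1e2, steps=200):
    """Bisection on a logarithmic scale for residual(alpha) = target."""
    if not residual(lo) < target < residual(hi):
        raise ValueError("target residual is not bracketed by [lo, hi]")
    for _ in range(steps):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if residual(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


alpha_dp = discrepancy_alpha(delta)
print("alpha from the discrepancy principle:", alpha_dp)
```

Here each evaluation of `residual` is cheap because the singular system is known explicitly; for a general operator every bisection step would require computing a new regularised solution, which is the cost mentioned above.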
8.2. Methods for solving nonlinear equations

We now extend the scope to nonlinear operator equations,
\[ F(x) = y, \tag{8.33} \]
where $F : D(F)(\subseteq X) \to Y$ is a nonlinear operator mapping between Hilbert spaces $X$ and $Y$ and we only have a noisy version $y^\delta$ of the data $y$ available, whose noise level $\delta$ according to
\[ \|y - y^\delta\| \le \delta \tag{8.34} \]
we assume known. The exact solution of (8.33) is assumed to exist (and in most of what follows, also to be unique) and is denoted by $x^\dagger$. We briefly introduce nonlinear Tikhonov regularisation, as it is certainly the most well-known and most widely used method for nonlinear inverse problems, but we will then mainly focus on iterative solution methods.
8.2.1. Tikhonov regularisation. The principle of defining $x_\alpha$ as a minimizer of the Tikhonov functional (8.16) extends in a straightforward manner to nonlinear inverse problems (8.33) and even to a more general Banach space setting by setting $x^\delta_\alpha$ to a minimizer of
\[ \min_{x\in D(F)} J_\alpha(x) = \min_{x\in D(F)} \|F(x) - y^\delta\|^2 + \alpha\|x - x_0\|^2, \tag{8.35} \]
where $x_0$ is some initial guess of $x^\dagger$, which may incorporate a priori knowledge on the exact solution, and $\alpha$ is the regularisation parameter. We refer to, e.g., [96, 316] for nonlinear Tikhonov regularisation in Hilbert space and, e.g., [41, 106, 148, 280, 285, 342] for an extension to Banach spaces including more general regularisation and data misfit terms. Existence of minimizers is no longer as trivial as in the linear case, but it can be established using the direct method of the calculus of variations provided $F$ is weakly sequentially closed, that is,
\[ \bigl((x_n)_{n\in\mathbb{N}} \subset D(F) \text{ and } x_n \rightharpoonup x \text{ and } F(x_n) \rightharpoonup f\bigr) \implies \bigl(x \in D(F) \text{ and } F(x) = f\bigr). \tag{8.36} \]
Theorem 8.4. Let $\alpha > 0$ and assume that $F$ is weakly closed (8.36) and continuous. Then the Tikhonov functional (8.35) has a global minimizer.

This property of $F$ also implies stability with respect to noise for the strictly positive regularisation parameter $\alpha$.

Theorem 8.5. Let $\alpha > 0$ and assume that $F$ is weakly closed (8.36) and continuous. For any sequence $y^k \to y^\delta$ as $k \to \infty$, the corresponding minimizers $x^k_\alpha$ of (8.35) (with $y^k$ in place of $y^\delta$) converge to $x^\delta_\alpha$.

We are mainly interested in convergence as the noise level tends to zero, which establishes the regularising property of the scheme. This can be achieved by either an a priori or a posteriori choice of the regularisation parameter.

Theorem 8.6. Assume that $F$ is weakly closed (8.36) and continuous, and that there exists a unique solution $x^\dagger$ of (8.33) in $D(F)$. Let $\alpha = \alpha(\delta)$ be chosen such that
\[ \alpha(\delta) \to 0 \quad\text{and}\quad \delta^2/\alpha(\delta) \to 0 \quad\text{as } \delta \to 0. \tag{8.37} \]
For any family $(y^\delta)_{\delta\in(0,\bar\delta]}$ of noisy data satisfying $\|y^\delta - y\| \le \delta$ and for $x^\delta_\alpha$ denoting a minimizer of (8.35) with $\alpha = \alpha(\delta)$, we have $\|x^\delta_\alpha - x^\dagger\| \to 0$ as $\delta \to 0$.
The same result holds for $\alpha$ chosen by the discrepancy principle $\underline{\tau}\delta \le \|F(x^\delta_\alpha) - y^\delta\| \le \bar\tau\delta$ with fixed $1 < \underline{\tau} < \bar\tau$ in place of the a priori choice (8.37).
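The a priori rule (8.37) can be tried out on a scalar toy problem. The sketch below minimises the functional (8.35) by a simple grid search for the hypothetical forward map `F(x) = x + x**3` with the choice `alpha = delta`, so that both $\alpha \to 0$ and $\delta^2/\alpha \to 0$. The forward map, the search interval, the grid size, and the noise realisation `y + delta` are assumptions for illustration; a grid search is used only because it needs no derivative information.

```python
def F(x):
    """Hypothetical scalar forward map (an assumption for illustration)."""
    return x + x**3


x_true = 1.0
y = F(x_true)   # exact data
x0 = 0.0        # initial guess entering the penalty term


def tikhonov_minimiser(y_delta, alpha, lo=0.0, hi=2.0, n=20001):
    """Grid search for the minimiser of (F(x) - y_delta)^2 + alpha*(x - x0)^2."""
    best_x, best_J = lo, float("inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        value = (F(x) - y_delta) ** 2 + alpha * (x - x0) ** 2
        if value < best_J:
            best_x, best_J = x, value
    return best_x


deltas = [1e-1, 1e-2, 1e-3]
# A priori choice alpha(delta) = delta.
errors = [abs(tikhonov_minimiser(y + d, alpha=d) - x_true) for d in deltas]
print(list(zip(deltas, errors)))
```

In this example the reconstruction error shrinks as the noise level decreases, which is the behaviour that Theorem 8.6 describes.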
Proof. Minimality of $x^\delta_\alpha$ for the Tikhonov functional implies
\[ \|F(x^\delta_\alpha) - y^\delta\|^2 + \alpha\|x^\delta_\alpha - x_0\|^2 \le \|F(x^\dagger) - y^\delta\|^2 + \alpha\|x^\dagger - x_0\|^2 \le \delta^2 + \alpha\|x^\dagger - x_0\|^2. \tag{8.38} \]
Therefore with $\alpha = \alpha(\delta)$ according to (8.37), we have
\[ \limsup_{\delta\to0}\|F(x^\delta_{\alpha(\delta)}) - y^\delta\| = 0, \qquad \limsup_{\delta\to0}\|x^\delta_{\alpha(\delta)} - x_0\|^2 \le \limsup_{\delta\to0}\Bigl(\frac{\delta^2}{\alpha(\delta)} + \|x^\dagger - x_0\|^2\Bigr) \le \|x^\dagger - x_0\|^2. \tag{8.39} \]
Hence both the minimizers $x^\delta_\alpha$ and their images under $F$ are bounded and thus there exists a subsequence $\delta_n \to 0$ such that for $x_n := x^{\delta_n}_{\alpha(\delta_n)}$
\[ (x_n)_{n\in\mathbb{N}} \subset D(F) \text{ and } x_n \rightharpoonup x \text{ and } F(x_n) \rightharpoonup y \tag{8.40} \]
holds. For any such subsequence, by (8.36) we know that the weak limit $x$ lies in the domain of $F$ and solves (8.33). Since we have assumed the solution to be unique, this implies $x = x^\dagger$. We even have norm convergence by the following Hilbert space trick:
\[ \begin{aligned} 0 \le \limsup_{n\to\infty}\|x_n - x^\dagger\|^2 &= \limsup_{n\to\infty}\bigl(\|x_n - x_0\|^2 + \|x_0 - x^\dagger\|^2 + 2\langle x_n - x_0, x_0 - x^\dagger\rangle\bigr) \\ &\le 2\|x_0 - x^\dagger\|^2 + 2\langle x^\dagger - x_0, x_0 - x^\dagger\rangle = 0, \end{aligned} \]
where we have used (8.39) and weak convergence of $x_n$ to $x^\dagger$. Thus a subsequence-subsequence argument yields the assertion.

Also convergence rates analogous to the linear case (cf. Corollary 8.1) can be obtained under source conditions.

8.2.2. Landweber iteration. The principle of Landweber iteration as a gradient descent method for the data misfit functional $\frac12\|F(x) - y^\delta\|^2$ carries over to the nonlinear setting in a straightforward manner. Assuming that $F$ has a continuous Fréchet derivative $F'(\cdot)$, we can define the nonlinear Landweber iteration by
\[ x^\delta_{k+1} = x^\delta_k + F'(x^\delta_k)^*(y^\delta - F(x^\delta_k)), \qquad k \in \mathbb{N}_0, \tag{8.41} \]
where $y^\delta$ are noisy data satisfying estimate (8.34), and $x^\delta_0 = x_0$ is an initial guess. By $x_k$ we will denote the Landweber iterates for exact data, $y^\delta = y = F(x^\dagger)$. As a stopping rule, we use the discrepancy principle, that is, we stop the iteration as soon as the residual norm falls below the noise level (times
some fixed safety factor $\tau$). The stopping index is then $k_* = k_*(\delta, y^\delta)$ such that
\[ \|y^\delta - F(x^\delta_{k_*})\| \le \tau\delta < \|y^\delta - F(x^\delta_k)\|, \qquad 0 \le k < k_*, \tag{8.42} \]
where $\tau > 1$ is appropriately chosen; more on this will be said below. As usual for nonlinear problems, iterative approximation schemes can in general only be shown to converge locally. We thus consider the solution as well as iterates in a (closed) ball $B_{2\rho}(x_0)$. The following two assumptions will be crucial for convergence. First of all we need a proper scaling that allows us to set the step size in the descent method to unity:
\[ \|F'(x)\| \le 1, \qquad x \in B_{2\rho}(x_0) \subset D(F). \tag{8.43} \]
Second we need a restriction on the nonlinearity of $F$, which comes as a first order Taylor remainder estimate:
\[ \|F(x) - F(\tilde x) - F'(x)(x - \tilde x)\| \le \eta\,\|F(x) - F(\tilde x)\|, \qquad x, \tilde x \in B_{2\rho}(x_0) \subset D(F). \tag{8.44} \]
η
2
(8.47)
1+η δ, 1 − 2η
then
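\[ \|x^\delta_{k+1} - x_*\| < \|x^\delta_k - x_*\| \]
and $x^\delta_k, x^\delta_{k+1} \in B_\rho(x_*) \subset B_{2\rho}(x_0)$.

Proof. Assume that $x^\delta_k \in B_\rho(x_*)$, which is a subset of $B_{2\rho}(x_0)$ by the triangle inequality, so that (8.43) and (8.44) are applicable. Thus we can estimate as follows:
\[ \begin{aligned} \|x^\delta_{k+1} - x_*\|^2 - \|x^\delta_k - x_*\|^2 &= 2\langle x^\delta_{k+1} - x^\delta_k, x^\delta_k - x_*\rangle + \|x^\delta_{k+1} - x^\delta_k\|^2 \\ &= 2\langle y^\delta - F(x^\delta_k), F'(x^\delta_k)(x^\delta_k - x_*)\rangle + \|F'(x^\delta_k)^*(y^\delta - F(x^\delta_k))\|^2 \\ &\le 2\langle y^\delta - F(x^\delta_k), y^\delta - F(x^\delta_k) - F'(x^\delta_k)(x_* - x^\delta_k)\rangle - \|y^\delta - F(x^\delta_k)\|^2 \\ &\le \|y^\delta - F(x^\delta_k)\|\bigl(2\delta + 2\eta\|y - F(x^\delta_k)\| - \|y^\delta - F(x^\delta_k)\|\bigr) \\ &\le \|y^\delta - F(x^\delta_k)\|\bigl(2(1 + \eta)\delta - (1 - 2\eta)\|y^\delta - F(x^\delta_k)\|\bigr). \end{aligned} \tag{8.48} \]
The assertions now follow from (8.47).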
Estimate (8.47) suggests choosing $\tau$ in the stopping rule (8.42) such that
\[ \tau > 2\,\frac{1 + \eta}{1 - 2\eta} > 2. \tag{8.49} \]
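To see the iteration (8.41) and the stopping rule (8.42) at work, the following sketch applies them to a hypothetical two-component forward map whose derivative is diagonal (and hence self-adjoint). The map `F`, the scaling factors `C`, the noise, and the value `tau = 3.0` are assumptions for illustration only; in particular, the constant $\eta$ in (8.44) has not been computed for this map, so `tau` is not claimed to satisfy (8.49).

```python
import math

C = (0.5, 0.1)  # assumed scaling, chosen so that F'(x) has norm about 1 or less


def F(x):
    """Hypothetical forward map F_i(x) = c_i * (x_i + x_i^3 / 3)."""
    return [c * (t + t**3 / 3.0) for c, t in zip(C, x)]


def F_prime_adjoint(x, r):
    """Apply F'(x)^* to r; F'(x) is diagonal with entries c_i * (1 + x_i^2)."""
    return [c * (1.0 + t * t) * ri for c, t, ri in zip(C, x, r)]


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def landweber(y_delta, x0, delta, tau, max_iter=10000):
    """Nonlinear Landweber iteration stopped by the discrepancy principle."""
    x = list(x0)
    for k in range(max_iter):
        r = [yd - fx for yd, fx in zip(y_delta, F(x))]
        if norm(r) <= tau * delta:
            return x, k  # k is the stopping index
        x = [xi + si for xi, si in zip(x, F_prime_adjoint(x, r))]
    raise RuntimeError("discrepancy principle not reached within max_iter")


x_true = [1.0, 0.5]
y = F(x_true)
delta = 1e-3
y_delta = [y[0] + delta / math.sqrt(2), y[1] - delta / math.sqrt(2)]
x_start = [0.0, 0.0]
tau = 3.0

x_stop, k_stop = landweber(y_delta, x_start, delta, tau)
err_start = norm([a - b for a, b in zip(x_start, x_true)])
err_stop = norm([a - b for a, b in zip(x_stop, x_true)])
print("stopped after", k_stop, "steps; error", err_start, "->", err_stop)
```

The weakly observed second component (small factor in `C`) is what makes the iteration slow here, which mirrors the role of small singular values in the linear case.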
As a consequence of the estimate we obtained in the proof of Proposition 8.7, we obtain square summability of the residual norms. As a byproduct, we also get an $O(1/\delta^2)$ estimate on the stopping index.

Corollary 8.2. Let the assumptions of Proposition 8.7 hold, and let $k_*$ be chosen according to the stopping rule (8.42), (8.49). Then (8.50) $k_*(\tau\delta)^2$
0 such that
• $F$ is Gâteaux differentiable on $B_\rho(x^\dagger)$,
• $F'(x) : X \to Y$ is invertible on $R(F'(\tilde x))$ for any $x, \tilde x \in B_\rho(x^\dagger)$,
• there exists a constant $L_1$ such that for all $x, \tilde x \in B_\rho(x^\dagger)$,
\[ \|F'(x)^{-1}\bigl(F'(\tilde x) - F'(x)\bigr)(\tilde x - x)\| \le L_1\|\tilde x - x\|^2. \tag{8.57} \]
In particular this means that the domain $D(F)$ contains $B_\rho(x^\dagger)$ and therefore has nonempty interior. Condition (8.57), although containing the inverse of $F'(x)$, does not necessarily involve its boundedness, which would imply well-posedness of (8.33) via the inverse function theorem. Indeed, it has been verified to hold for several examples of nonlinear inverse problems and has been used in their analysis in, e.g., [75, 175]. This condition is also known as the affine covariant Lipschitz condition [75–77] and is closely related to range invariance of $F'$.

Theorem 8.9. For $F$ satisfying (8.57) and any solution $x^\dagger$ of (8.33), there exists $\rho_0 > 0$ such that for any starting value $x_0$ with $\|x_0 - x^\dagger\| \le \rho_0 \le \rho$ the iterates defined by (8.52) stay in $B_{\rho_0}(x^\dagger)$ and converge quadratically to $x^\dagger$; that is, there exists a constant $C > 0$ such that for all $k \in \mathbb{N}$,
\[ \|x_{k+1} - x^\dagger\| \le C\|x_k - x^\dagger\|^2. \tag{8.58} \]
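The difference between quadratic and linear convergence is easy to observe in a scalar example. The sketch below uses the Newton step in the form $x_{k+1} = x_k - F'(x_k)^{-1}(F(x_k) - y)$ and, for comparison, a frozen variant in which the derivative is evaluated once at the starting value. The forward map `F(x) = x + x**3`, the starting value, and the function names are assumptions for illustration.

```python
def F(x):
    """Hypothetical scalar forward map (an assumption for illustration)."""
    return x + x**3


def dF(x):
    return 1.0 + 3.0 * x**2


x_true = 1.0
y = F(x_true)
x0 = 1.5


def newton_errors(steps):
    """Errors |x_k - x_true| for Newton's method with exact data."""
    x, errs = x0, [abs(x0 - x_true)]
    for _ in range(steps):
        x = x - (F(x) - y) / dF(x)
        errs.append(abs(x - x_true))
    return errs


def frozen_newton_errors(steps):
    """Same iteration, but with the derivative frozen at the starting value."""
    x, errs = x0, [abs(x0 - x_true)]
    d0 = dF(x0)
    for _ in range(steps):
        x = x - (F(x) - y) / d0
        errs.append(abs(x - x_true))
    return errs


newton = newton_errors(6)
frozen = frozen_newton_errors(20)
for k in range(6):
    print(k, newton[k], frozen[k])
```

In this example the Newton error is roughly squared in each step until rounding error dominates, whereas the frozen variant only reduces the error by a roughly constant factor per step.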
Upon setting $c = C\rho_0$ and possibly decreasing $\rho_0$, quadratic convergence obviously implies linear convergence
\[ \|x_{k+1} - x^\dagger\| \le c\,\|x_k - x^\dagger\| \tag{8.59} \]
for some $c \in (0, 1)$.

Proof. We assume that $x_k \in B_{\rho_0}(x^\dagger) \subseteq D(F)$ and denote the error by $e_k = x_k - x^\dagger$. For any $f \in X^*$ (with the dual pairing $\langle f, \cdot\rangle_{X^*,X}$) and the
fundamental theorem of calculus applied to the real function $\varphi^f : [0, 1] \to \mathbb{R}$, $r \mapsto \varphi^f(r) := \langle f, F'(x_k)^{-1}F(x_k - re_k)\rangle_{X^*,X}$, we have
\[ \bigl\langle f, F'(x_k)^{-1}\bigl(F(x_k) - F(x^\dagger)\bigr)\bigr\rangle_{X^*,X} = \varphi^f(0) - \varphi^f(1) = -\int_0^1 (\varphi^f)'(r)\,dr, \]
hence
\[ \begin{aligned} \langle f, e_{k+1}\rangle_{X^*,X} &= \langle f, e_k - F'(x_k)^{-1}(F(x_k) - F(x^\dagger))\rangle_{X^*,X} \\ &= -\bigl\langle f, F'(x_k)^{-1}\bigl(F(x_k) - F(x^\dagger) - F'(x_k)e_k\bigr)\bigr\rangle_{X^*,X} \\ &= -\int_0^1 \bigl\langle f, F'(x_k)^{-1}\bigl(F'(x_k - re_k) - F'(x_k)\bigr)e_k\bigr\rangle_{X^*,X}\,dr \\ &\le \tfrac12\|f\|_{X^*}L_1\|e_k\|_X^2, \end{aligned} \]
where we have used the fact that $\int_0^1 dr = 1$, $\int_0^1 r\,dr = \frac12$. Choosing $f := e^*_{k+1}$ as a normalising functional (according to the Hahn–Banach theorem) of $e_{k+1}$, that is, $e^*_{k+1} \in X^*$ such that $\langle e^*_{k+1}, e_{k+1}\rangle_{X^*,X} = \|e_{k+1}\|$ and $\|e^*_{k+1}\| = 1$, we conclude that $\|e_{k+1}\|_X \le \frac12 L_1\|e_k\|_X^2$. For $\rho_0 \le \min\{\rho, \frac{2}{L_1}\}$ this implies that $x_{k+1} \in B_{\rho_0}(x^\dagger)$ and an induction argument yields that the whole sequence remains in $B_{\rho_0}(x^\dagger)$ and satisfies the quadratic convergence estimate (8.58) with $C = \frac12 L_1$.

Halley's method obviously requires higher smoothness of $F$, not only for defining the iterates but also for analysing the method. The following conditions on the forward operator are sufficient for cubic convergence:
• $F$ is twice Gâteaux differentiable on $B_\rho(x^\dagger)$,
• $F'(x) : X \to Y$ is invertible on R(F (x)) + R(F (x)) + R(F (x̃)) for any $x, \tilde x \in B_\rho(x^\dagger)$,
• there exist constants $L_0, L_1, L_2, C_2$ such that for all $x, \tilde x \in B_\rho(x^\dagger)$, $p, \tilde p \in X$,
\[ \|F'(x)^{-1}\bigl(F(\tilde x) - F(x)\bigr)\| \le L_0\|\tilde x - x\|, \tag{8.60} \]
\[ \|F'(x)^{-1}\bigl(F'(\tilde x) - F'(x)\bigr)\| \le L_1\|\tilde x - x\|, \tag{8.61} \]
\[ \|F'(x)^{-1}\bigl(F''(\tilde x) - F''(x)\bigr)[\tilde x - x, \tilde x - x]\| \le L_2\|\tilde x - x\|^3, \tag{8.62} \]
\[ \|F'(x)^{-1}F''(x)[p, \tilde p]\| \le C_2\|p\|\,\|\tilde p\|. \tag{8.63} \]
These conditions are obviously satisfied if $F'(x)$ is bounded invertible for every $x$ and $F''$ is Lipschitz continuous. Note however, that we here assume that these estimates only hold when applying the inverse $F'(x)^{-1}$ to values of $F'(\tilde x)$, which is typically as smoothing an operator as $F'(x)$, so that
these conditions can also hold for ill-posed problems, as the examples in Section 10.5.3 show.

Theorem 8.10. For $F$ satisfying (8.60)–(8.63) and any solution $x^\dagger$ of (8.33), there exists $\rho_0 > 0$ such that for any starting value $x_0$ with $\|x_0 - x^\dagger\| \le \rho_0 \le \rho$ the iterates defined by (8.55) stay in $B_{\rho_0}(x^\dagger)$ and converge cubically to $x^\dagger$; that is, there exists a constant $C > 0$ such that for all $k \in \mathbb{N}$,
\[ \|x_{k+1} - x^\dagger\| \le C\|x_k - x^\dagger\|^3. \tag{8.64} \]
Also here it is easy to see that cubic convergence implies quadratic and linear convergence on a sufficiently small ball.

Proof. Again, we assume that $x_k \in B_{\rho_0}(x^\dagger) \subseteq D(F)$ and denote the error by $e_k = x_k - x^\dagger$. Using the abbreviations $B_k = F'(x_k)^{-1}$, $T_k = F'(x_k) + \frac12 F''(x_k)[x_{k+\frac12} - x_k, \cdot] = F'(x_k) + \frac12 F''(x_k)[F'(x_k)^{-1}(y - F(x_k)), \cdot] =: F'(x_k) + D_k$, so that $T_k^{-1} = (I + B_kD_k)^{-1}B_k$, we can write the recursion for the error $e_k = x_k - x^\dagger$ as
\[ \begin{aligned} e_{k+1} &= e_k + T_k^{-1}(F(x^\dagger) - F(x_k)) \\ &= (I + B_kD_k)^{-1}\Bigl(B_k\bigl(F(x^\dagger) - F(x_k) + F'(x_k)e_k\bigr) + \tfrac12 B_kF''(x_k)[B_k(F(x^\dagger) - F(x_k)), e_k]\Bigr). \end{aligned} \]
Here by (8.63) and (8.60)
\[ \begin{aligned} \|B_kD_k\| &= \sup_{p\in X,\,\|p\|=1}\bigl\|\tfrac12 F'(x_k)^{-1}F''(x_k)[F'(x_k)^{-1}(F(x^\dagger) - F(x_k)), p]\bigr\| \\ &\le C_2\|F'(x_k)^{-1}(F(x^\dagger) - F(x_k))\| \le C_2L_0\|e_k\| \le C_2L_0\rho_0 < 1. \end{aligned} \tag{8.65} \]
Hence $(I + B_kD_k)^{-1}$ exists and is uniformly bounded (by $\frac{1}{1 - C_2L_0\rho_0}$). Thus, for any $f \in X^*$, setting $g_k := ((I + B_kD_k)^{-1})^*f$, we can write
\[ \langle f, e_{k+1}\rangle_{X^*,X} = \bigl\langle g_k, B_k\bigl(F(x^\dagger) - F(x_k) + F'(x_k)e_k\bigr) + \tfrac12 B_kF''(x_k)[B_k(F(x^\dagger) - F(x_k)), e_k]\bigr\rangle_{X^*,X}, \]
where
\[ \begin{aligned} \bigl\langle g_k, B_k\bigl(F(x^\dagger) - F(x_k) + F'(x_k)e_k\bigr)\bigr\rangle_{X^*,X} &= \int_0^1\bigl\langle g_k, B_k\bigl(F'(x_k) - F'(x_k - re_k)\bigr)e_k\bigr\rangle_{X^*,X}\,dr \\ &= \int_0^1\!\!\int_0^1 r\,\bigl\langle g_k, B_kF''(x_k - rse_k)[e_k, e_k]\bigr\rangle_{X^*,X}\,ds\,dr, \end{aligned} \]
and by $\int_0^1 r\,dr = \frac12$,
\[ \bigl\langle g_k, \tfrac12 B_kF''(x_k)[B_k(F(x^\dagger) - F(x_k)), e_k]\bigr\rangle_{X^*,X} = -\int_0^1\!\!\int_0^1 r\,\bigl\langle g_k, B_kF''(x_k)[B_kF'(x_k - se_k)e_k, e_k]\bigr\rangle_{X^*,X}\,ds\,dr. \]
Hence we have
\[ \begin{aligned} \langle f, e_{k+1}\rangle_{X^*,X} &= \int_0^1\!\!\int_0^1 r\,\bigl\langle g_k, B_k\bigl(F''(x_k - rse_k)[e_k, e_k] - F''(x_k)[B_kF'(x_k - se_k)e_k, e_k]\bigr)\bigr\rangle_{X^*,X}\,ds\,dr \\ &= \int_0^1\!\!\int_0^1 r\,\bigl\langle g_k, B_k\bigl(F''(x_k - rse_k) - F''(x_k)\bigr)[e_k, e_k] + B_kF''(x_k)\bigl[B_k\bigl(F'(x_k) - F'(x_k - se_k)\bigr)e_k, e_k\bigr]\bigr\rangle_{X^*,X}\,ds\,dr \\ &\le \|g_k\|_{X^*}\Bigl(L_2\int_0^1 r^2\,dr\int_0^1 s\,ds\,\|e_k\|^3 + C_2L_1\int_0^1 r\,dr\int_0^1 s\,ds\,\|e_k\|^3\Bigr) \\ &\le \frac{1}{1 - C_2L_0\rho_0}\,\|f\|_{X^*}\Bigl(\frac{L_2}{6} + \frac{C_2L_1}{4}\Bigr)\|e_k\|^3, \end{aligned} \]
where we have estimated the first term by (8.62) with $x = x_k$, $\tilde x = x_k + r(1 - s)e_k$ and the second term by (8.63) with $x = x_k$, $\tilde p = e_k$, $p = B_k\bigl(F'(x_k) - F'(x_k - se_k)\bigr)e_k$, which in its turn has been bounded by means of (8.61) with $x = x_k$, $\tilde x = x_k + (1 - s)e_k$. Moreover, in order to estimate $\|g_k\|_{X^*} = \|((I + B_kD_k)^{-1})^*f\|_{X^*}$, we have used the bound (8.65). The rest of the proof is analogous to that of Theorem 8.9.
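The cubic rate can be checked on a scalar example. The sketch below uses the step structure that appears in the proof above: a Newton predictor $x_{k+\frac12} - x_k = F'(x_k)^{-1}(y - F(x_k))$, the corrected derivative $T_k = F'(x_k) + \frac12 F''(x_k)[x_{k+\frac12} - x_k, \cdot]$, and the update $x_{k+1} = x_k + T_k^{-1}(y - F(x_k))$. The forward map `F(x) = x + x**3`, the starting value, and the function names are assumptions for illustration.

```python
def F(x):
    """Hypothetical scalar forward map (an assumption for illustration)."""
    return x + x**3


def dF(x):
    return 1.0 + 3.0 * x**2


def d2F(x):
    return 6.0 * x


x_true = 1.0
y = F(x_true)
x0 = 1.5


def halley_errors(steps):
    """Errors |x_k - x_true| for the scalar Halley iteration with exact data."""
    x, errs = x0, [abs(x0 - x_true)]
    for _ in range(steps):
        r = y - F(x)
        newton_step = r / dF(x)                  # x_{k+1/2} - x_k
        T = dF(x) + 0.5 * d2F(x) * newton_step   # scalar version of T_k
        x = x + r / T
        errs.append(abs(x - x_true))
    return errs


halley = halley_errors(4)
for k, e in enumerate(halley):
    print(k, e)
```

In this example the error is roughly cubed in each step, so that machine precision is reached after three iterations from the chosen starting value.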
For frozen Halley, the conditions somewhat simplify to the following:
• $F$ is twice Gâteaux differentiable on $B_\rho(x^\dagger)$,
• $F'(x_0) : X \to Y$ is invertible on R(F (x)) + R(F (x0)) for all $x \in B_\rho(x^\dagger)$,
• there exist constants $L_0, C_2$ such that for all $x, \tilde x \in B_\rho(x^\dagger)$, $p, \tilde p \in X$,
\[ \|F'(x_0)^{-1}\bigl(F(\tilde x) - F(x)\bigr)\| \le L_0\|\tilde x - x\|, \tag{8.66} \]
\[ \|F'(x_0)^{-1}\bigl(F'(x) - F'(x_0)\bigr)\| \le c < 1, \tag{8.67} \]
\[ \|F'(x_0)^{-1}F''(x_0)[p, \tilde p]\| \le C_2\|p\|\,\|\tilde p\|. \tag{8.68} \]
Indeed, as can be seen from the proof below (see also [182]), stronger assumptions on the smoothness of $F$ in general are not able to improve the convergence of frozen Halley to quadratic. For frozen Newton, we assume
• $F$ is Gâteaux differentiable on $B_\rho(x^\dagger)$,
• there exists a constant $c \in (0, 1)$ such that for all $x \in B_\rho(x^\dagger)$,
\[ \|F'(x_0)^{-1}(F'(x) - F'(x_0))\| \le c < 1 . \tag{8.69} \]

Theorem 8.11. For $F$ satisfying (8.66)–(8.68) and any solution $x^\dagger$ of (8.33), there exists $\rho_0 > 0$ such that for any starting value $x_0$ with $\|x_0 - x^\dagger\| \le \rho_0 \le \rho$ the iterates defined by frozen Halley (8.56) stay in $B_\rho(x^\dagger)$ and converge linearly to $x^\dagger$; that is, there exists a constant $c \in (0, 1)$ such that for all $k \in \mathbb{N}$, (8.59) holds. The same holds true for frozen Newton (8.53) under condition (8.69).

Proof. We follow the proof of Theorem 8.10 with the replacements $B_k \leftrightarrow B = F'(x_0)^{-1}$, $D_k \leftrightarrow \tfrac12 F''(x_0)[F'(x_0)^{-1}(y - F(x_k)), \cdot]$, that is, basically replacing $x_k$ by $x_0$ whenever it appears as an argument of $F'$ or $F''$. This allows us to write
\[
e_{k+1} = (I + B D_k)^{-1}\Bigl\{ B\bigl(F(x^\dagger) - F(x_k) + F'(x_0)e_k\bigr) + \tfrac12 B F''(x_0)\bigl[B(F(x^\dagger) - F(x_k)), e_k\bigr] \Bigr\}, \tag{8.70}
\]
where
\[
\|B D_k\| = \sup_{p\in X,\ \|p\|=1} \bigl\|\tfrac12 F'(x_0)^{-1} F''(x_0)\bigl[F'(x_0)^{-1}(F(x^\dagger) - F(x_k)), p\bigr]\bigr\|
\le C_2 \bigl\|F'(x_0)^{-1}(F(x^\dagger) - F(x_k))\bigr\| \le C_2 L_0 \|e_k\| \le C_2 L_0 \rho_0 < 1 .
\]
The bottleneck for the convergence rate lies in the first term of (8.70), which when tested with some $f \in X^*$ and using the abbreviation $g_k = ((I + B D_k)^{-1})^* f$ only gives a linear estimate,
\[
\begin{aligned}
\bigl\langle g_k, B\bigl(F(x^\dagger) - F(x_k) + F'(x_0)e_k\bigr)\bigr\rangle_{X^*,X}
&= \int_0^1 \bigl\langle g_k, B\bigl(F'(x_0) - F'(x_k - r e_k)\bigr) e_k\bigr\rangle_{X^*,X}\, dr \\
&\le \|g_k\|_{X^*}\, c\, \|e_k\| \le \|f\|_{X^*} \frac{c}{1 - C_2 L_0 \rho_0} \|e_k\|
\end{aligned}
\]
due to (8.67), while the estimate for the second term in (8.70) is quadratic:
\[
\bigl\langle g_k, \tfrac12 B F''(x_0)\bigl[B(F(x^\dagger) - F(x_k)), e_k\bigr]\bigr\rangle_{X^*,X}
\le \|g_k\|_{X^*}\, C_2\, \|B(F(x^\dagger) - F(x_k))\|\, \|e_k\| \le \|f\|_{X^*} \frac{C_2 L_0}{1 - C_2 L_0 \rho_0} \|e_k\|^2 .
\]
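The estimate for frozen Newton is simply
\[
\begin{aligned}
\langle f, e_{k+1}\rangle_{X^*,X}
&= -\bigl\langle f, F'(x_0)^{-1}\bigl(F(x_k) - F(x^\dagger) - F'(x_0)e_k\bigr)\bigr\rangle_{X^*,X} \\
&= -\int_0^1 \bigl\langle f, F'(x_0)^{-1}\bigl(F'(x_k - r e_k) - F'(x_0)\bigr) e_k\bigr\rangle_{X^*,X}\, dr \\
&\le \|f\|_{X^*}\, c\, \|e_k\| .
\end{aligned}
\]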
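The linear rate in Theorem 8.11 can be observed on a scalar toy problem. The following is a minimal sketch, not taken from the text: the test function $F(x) = x^3$, the data $y = 8$ and the starting value $2.2$ are illustrative assumptions, and the two update rules are the scalar forms of frozen Newton and frozen Halley with $B = F'(x_0)^{-1}$ and $D_k = \tfrac12 F''(x_0) B (y - F(x_k))$.

```python
def frozen_newton(F, dF0, y, x0, n_iter):
    """Frozen Newton: the derivative is evaluated once, at x0."""
    xs = [x0]
    for _ in range(n_iter):
        xs.append(xs[-1] + (y - F(xs[-1])) / dF0)
    return xs


def frozen_halley(F, dF0, d2F0, y, x0, n_iter):
    """Frozen Halley: F' and F'' are evaluated once, at x0."""
    B = 1.0 / dF0
    xs = [x0]
    for _ in range(n_iter):
        res = y - F(xs[-1])
        D = 0.5 * d2F0 * B * res          # scalar version of D_k
        xs.append(xs[-1] + B * res / (1.0 + B * D))
    return xs


# Hypothetical scalar test problem: F(x) = x^3, y = 8, exact solution 2.
F = lambda x: x ** 3
x_true, x0, n_iter = 2.0, 2.2, 8
dF0, d2F0 = 3 * x0 ** 2, 6 * x0

newton_err = [abs(x - x_true) for x in frozen_newton(F, dF0, 8.0, x0, n_iter)]
halley_err = [abs(x - x_true) for x in frozen_halley(F, dF0, d2F0, 8.0, x0, n_iter)]

# Successive error ratios; both settle near 1 - F'(x_true)/F'(x0).
newton_ratios = [newton_err[k + 1] / newton_err[k] for k in range(n_iter)]
halley_ratios = [halley_err[k + 1] / halley_err[k] for k in range(n_iter)]
expected_ratio = 1.0 - 3 * x_true ** 2 / dF0
```

In this example the first frozen Halley step gains considerably more than the first frozen Newton step, but the asymptotic contraction factor is the same for both, in line with the remark that the first term of (8.70) is the bottleneck.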
8.2.3.2. Data smoothing and noise propagation. In real computations we only have noisy data $y^\delta$ and sometimes also some information on the statistics of the noise or, in the deterministic setting considered here, on the noise level $\delta$ with respect to the $Y$ norm according to (8.34). For ill-posed problems this is often too weak to establish well-definedness of unregularised iterative schemes like those in Section 8.2.3.1 above or Section 8.2.4 below. Therefore we lift the setting to an $X$–$\hat{Y}$ pairing in which linear convergence in the sense of (8.59) can be shown (for which, as we have seen, local quadratic or cubic convergence is sufficient). For this purpose we also need data in $\hat{Y}$ and, since the space $\hat{Y}$ is typically one of more regular functions than those in $Y$, have to smooth the data $y^\delta$. This can be achieved by any of the schemes from Section 8.1, applied to the embedding operator $A : \hat{Y} \to Y$. Under additional regularity of the exact solution, this yields a bound in the higher order norm of $\hat{Y}$, according to the convergence rates results from Section 8.1.4,
\[ \|y - \hat{y}\|_{\hat{Y}} \le \hat{\delta} . \tag{8.71} \]
We now study propagation of this noise through the iteration schemes. It is readily checked that for each of the linearly, quadratically, and cubically convergent schemes from the previous section, due to continuity of the involved operators on the bounded set $B_\rho(x^\dagger)$ there exist constants $c \in (0, 1)$ and $M > 0$ such that the error recursion
\[ \|x_{k+1} - x^\dagger\| \le c\, \|x_k - x^\dagger\| + M \hat{\delta} \tag{8.72} \]
holds. An easy induction argument shows that this implies
\[ \|x_k - x^\dagger\| \le c^k \|x_0 - x^\dagger\| + \sum_{\ell=0}^{k-1} c^\ell M \hat{\delta} \le c^k \|x_0 - x^\dagger\| + \frac{M}{1 - c}\, \hat{\delta} ; \]
that is, propagation of noise is uniformly bounded with respect to the iteration index and therefore no early stopping is needed.
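The bound obtained from (8.72) can be illustrated by iterating the worst case of the recursion, that is, with equality instead of inequality. This is only a sketch: the numerical values of $c$, $M$, $\hat\delta$ and the initial error are arbitrary illustrative choices.

```python
def propagate(c, M, delta_hat, e0, n_iter):
    """Worst case of (8.72): take equality in the error recursion."""
    errors = [e0]
    for _ in range(n_iter):
        errors.append(c * errors[-1] + M * delta_hat)
    return errors


c, M, delta_hat, e0, n_iter = 0.5, 2.0, 1e-3, 1.0, 60   # illustrative values only
errors = propagate(c, M, delta_hat, e0, n_iter)

# Closed-form bound c^k e0 + M delta_hat / (1 - c) for every iteration index k.
bounds = [c ** k * e0 + M * delta_hat / (1.0 - c) for k in range(n_iter + 1)]

# The error does not blow up with k; it levels off at the noise floor.
noise_floor = M * delta_hat / (1.0 - c)
```

The error sequence decreases geometrically until it reaches the floor $M\hat\delta/(1-c)$ and stays there, which is the reason no early stopping is needed in this setting.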
8.2.3.3. Regularised versions of Newton's and Halley's methods. Applying Tikhonov regularisation to the Newton step results in a minimisation problem similar to (8.35) but linearised under the norm,
\[ \min_{x \in D(F)} \|F(x_k) + F'(x_k)(x - x_k) - y^\delta\|^2 + \alpha_k \|x - x_0\|^2 , \tag{8.73} \]
and is known as the iteratively regularised Gauss–Newton method (cf., e.g., [18, 19, 150, 178, 191]); in this variational form it also extends to general Banach spaces. Alternatively one might centre the regularisation term at $x_k$ in place of $x_0$ for Newton and arrive at the Levenberg–Marquardt method,
\[ \min_{x \in D(F)} \|F(x_k) + F'(x_k)(x - x_k) - y^\delta\|^2 + \alpha_k \|x - x_k\|^2 ; \tag{8.74} \]
cf., e.g., [130, 131, 286].

Levenberg–Marquardt method. Replacing the quadratic minimisation problem by its first order optimality condition allows us to write the Levenberg–Marquardt method as
\[ x_{k+1}^\delta = x_k^\delta + \bigl(F'(x_k^\delta)^* F'(x_k^\delta) + \alpha_k I\bigr)^{-1} F'(x_k^\delta)^* \bigl(y^\delta - F(x_k^\delta)\bigr) . \tag{8.75} \]
Special care has to be taken in the choice of $\alpha_k$. It turns out that the following a posteriori rule, which can be interpreted as a discrepancy principle with the noise level replaced by a multiple of the (old) residual,
\[ \|y^\delta - F(x_k^\delta) - F'(x_k^\delta)(x_{k+1}^\delta(\alpha_k) - x_k^\delta)\| = q\, \|y^\delta - F(x_k^\delta)\| \tag{8.76} \]
for some $q \in (0, 1)$, leads to linear convergence of the residuals. This also has an interpretation as an inexact Newton method, since the linearised residual on the left-hand side is not put to zero but the Newton step equation is only solved up to finite precision. Solvability of (8.76) for $\alpha_k$ can be guaranteed if, for some $\gamma > 1$,
\[ \|y^\delta - F(x_k^\delta) - F'(x_k^\delta)(x^\dagger - x_k^\delta)\| \le \frac{q}{\gamma}\, \|y^\delta - F(x_k^\delta)\| \tag{8.77} \]
holds, which can be achieved under a tangential cone type condition on $F$, assuming that for all $x, \tilde{x} \in B_{2\rho}(x_0) \subseteq D(F)$,
\[ \|F(x) - F(\tilde{x}) - F'(x)(x - \tilde{x})\| \le c\, \|x - \tilde{x}\|\, \|F(x) - F(\tilde{x})\| . \tag{8.78} \]
Also the stopping index $k_*$ needs to be chosen. The result below suggests doing so again via the discrepancy principle (8.42). A favourable feature that Levenberg–Marquardt has in common with Landweber iteration is monotonicity of the error. As a matter of fact, the proof of this is to some extent similar.
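To make the interplay of (8.75), (8.76) and the discrepancy principle concrete, here is a minimal finite-dimensional sketch. The map `F`, its Jacobian `dF`, the data and all parameter values are hypothetical choices for illustration, and solving (8.76) for $\alpha_k$ by bisection on a logarithmic scale is one simple option, not a method prescribed by the text. It uses the identity $y^\delta - F(x_k^\delta) - K_k(x_{k+1}^\delta - x_k^\delta) = \alpha_k (K_k K_k^* + \alpha_k I)^{-1}(y^\delta - F(x_k^\delta))$, whose norm is increasing in $\alpha_k$.

```python
import numpy as np


def F(x):
    # Hypothetical mildly nonlinear forward map R^2 -> R^2.
    return np.array([x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * x[0] ** 2])


def dF(x):
    # Jacobian of F.
    return np.array([[1.0, 0.2 * x[1]], [0.2 * x[0], 1.0]])


def lin_residual_norm(K, res, alpha):
    # Norm of the linearised residual after an LM step with parameter alpha.
    m = K.shape[0]
    return np.linalg.norm(alpha * np.linalg.solve(K @ K.T + alpha * np.eye(m), res))


def choose_alpha(K, res, q, lo=1e-12, hi=1e12, n_bisect=200):
    # Solve (8.76) for alpha; the left-hand side is increasing in alpha.
    target = q * np.linalg.norm(res)
    for _ in range(n_bisect):
        mid = np.sqrt(lo * hi)
        if lin_residual_norm(K, res, mid) < target:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)


def levenberg_marquardt(y_delta, x0, delta, q=0.5, tau=2.5, max_iter=50):
    x = np.array(x0, dtype=float)
    history = [np.linalg.norm(y_delta - F(x))]
    # Discrepancy principle: stop once the residual drops below tau * delta.
    while history[-1] > tau * delta and len(history) <= max_iter:
        K, res = dF(x), y_delta - F(x)
        alpha = choose_alpha(K, res, q)
        x = x + np.linalg.solve(K.T @ K + alpha * np.eye(x.size), K.T @ res)
        history.append(np.linalg.norm(y_delta - F(x)))
    return x, history


x_true = np.array([1.0, 0.5])
delta = 1e-3
y_delta = F(x_true) + delta * np.array([0.6, -0.8])   # noise of norm delta
x_rec, history = levenberg_marquardt(y_delta, [0.8, 0.3], delta)
```

Here `tau = 2.5` respects the requirement $\tau > 1/q$ of Theorem 8.13 for `q = 0.5`, and `history` records the residual norms, which in this example shrink by roughly the factor $q$ per step.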
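Theorem 8.12. Let $0 < q < 1 < \gamma$ and assume that $F(x) = y$ has a solution and that (8.77) holds so that $\alpha_k$ can be defined via (8.76). Then
\[
\|x_{k+1}^\delta - x^\dagger\|^2 - \|x_k^\delta - x^\dagger\|^2
\le -\|x_{k+1}^\delta - x_k^\delta\|^2 - \frac{2(\gamma - 1)}{\gamma \alpha_k}\, \|y^\delta - F(x_k^\delta) - F'(x_k^\delta)(x_{k+1}^\delta - x_k^\delta)\|^2 . \tag{8.79}
\]

Proof. We start similarly to the monotonicity proof of Landweber:
\[
\begin{aligned}
\|x_{k+1}^\delta - x^\dagger\|^2 - \|x_k^\delta - x^\dagger\|^2
&= 2\langle x_{k+1}^\delta - x_k^\delta, x_k^\delta - x^\dagger\rangle + \|x_{k+1}^\delta - x_k^\delta\|^2 \\
&= 2\langle x_{k+1}^\delta - x_k^\delta, x_{k+1}^\delta - x^\dagger\rangle - \|x_{k+1}^\delta - x_k^\delta\|^2 .
\end{aligned}
\]
With
\[ K_k := F'(x_k^\delta), \quad r = y^\delta - F(x_k^\delta) - K_k(x_{k+1}^\delta - x_k^\delta), \quad \tilde{r} = y^\delta - F(x_k^\delta) - K_k(x^\dagger - x_k^\delta), \]
and the identity $\alpha_k (K_k K_k^* + \alpha_k I)^{-1}(y^\delta - F(x_k^\delta)) = y^\delta - F(x_k^\delta) - K_k(x_{k+1}^\delta - x_k^\delta) = r$, we can write
\[ x_{k+1}^\delta - x_k^\delta = K_k^* (K_k K_k^* + \alpha_k I)^{-1}(y^\delta - F(x_k^\delta)) = \frac{1}{\alpha_k} K_k^* r, \qquad K_k(x_{k+1}^\delta - x^\dagger) = \tilde{r} - r, \]
so that
\[ \langle x_{k+1}^\delta - x_k^\delta, x_{k+1}^\delta - x^\dagger\rangle = \frac{1}{\alpha_k} \langle r, \tilde{r} - r\rangle \le -\frac{1}{\alpha_k} \|r\| \bigl(\|r\| - \|\tilde{r}\|\bigr), \]
where by (8.76) and (8.77),
\[ \gamma \|\tilde{r}\| = \gamma \|y^\delta - F(x_k^\delta) - K_k(x^\dagger - x_k^\delta)\| \le \|y^\delta - F(x_k^\delta) - K_k(x_{k+1}^\delta - x_k^\delta)\| = \|r\| . \]

Using this monotonicity, again similarly to Landweber iteration, one can establish convergence in the sense of a regularisation method. However, Levenberg–Marquardt is much faster, as the logarithmic estimate for the stopping index shows.

Theorem 8.13. Let $0 < q < 1$ and assume that (8.33) is solvable in $B_\rho(x_0)$, that $F'$ is uniformly bounded in $B_\rho(x^\dagger)$, and that the Taylor remainder of $F$ satisfies (8.78) for some $c > 0$. Additionally let $k_* = k_*(\delta, y^\delta)$ be chosen according to the stopping rule (8.42) with $\tau > 1/q$, and let $x_0$ be sufficiently close to some solution $x^*$ of $F(x) = y$. Then
\[ k_*(\delta, y^\delta) = O(1 + |\ln \delta|), \]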
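and the Levenberg–Marquardt iterates $\{x_{k_*}^\delta\}$ converge to a solution of (8.33) as $\delta \to 0$.

Establishing convergence rates under source conditions is challenging here; we refer to [131].

Gauss–Newton method. The iteratively regularised Gauss–Newton method (irgnm)
\[ x_{k+1}^\delta = x_k^\delta + \bigl(F'(x_k^\delta)^* F'(x_k^\delta) + \alpha_k I\bigr)^{-1} \bigl(F'(x_k^\delta)^* (y^\delta - F(x_k^\delta)) + \alpha_k (x_0 - x_k^\delta)\bigr) \tag{8.80} \]
works with an easy to implement a priori choice of $\alpha_k$:
\[ \alpha_k > 0 , \qquad 1 \le \frac{\alpha_k}{\alpha_{k+1}} \le r , \qquad \lim_{k \to \infty} \alpha_k = 0 , \tag{8.81} \]
for some $r > 1$. The choice of the stopping index can be done a priori as well, or again according to the discrepancy principle (8.42). Proving convergence including rates under a source condition
\[ x^\dagger - x_0 = \bigl(F'(x^\dagger)^* F'(x^\dagger)\bigr)^\mu v , \qquad v \in N(F'(x^\dagger))^\perp , \tag{8.82} \]
is more straightforward here. The key idea for this purpose is to see that
\[ \|x_{k+1}^\delta - x^\dagger\| \approx \alpha_k^\mu\, w_k(\mu) \]
with $w_k(s)$ as in the following lemma (cf. Corollary 8.1).

Lemma 8.2. Let $K \in L(X, Y)$, $s \in [0, 1]$, and let $\{\alpha_k\}$ be a sequence satisfying $\alpha_k > 0$ and $\alpha_k \to 0$ as $k \to \infty$. Then it holds that
\[ w_k(s) := \alpha_k^{1-s} \|(K^* K + \alpha_k I)^{-1} (K^* K)^s v\| \le s^s (1 - s)^{1-s} \|v\| \le \|v\| , \tag{8.83} \]
and that
\[ \lim_{k \to \infty} w_k(s) = \begin{cases} 0 , & 0 \le s < 1 , \\ \|v\| , & s = 1 , \end{cases} \]
for any $v \in N(K)^\perp$.

Indeed, in the linear and noiseless case ($F(x) = Kx$, $\delta = 0$) we get from (8.80) using $Kx^\dagger = y$ and (8.82),
\[
\begin{aligned}
x_{k+1} - x^\dagger &= x_k - x^\dagger + (K^* K + \alpha_k I)^{-1} \bigl(K^* K (x^\dagger - x_k) + \alpha_k (x_0 - x^\dagger + x^\dagger - x_k)\bigr) \\
&= -\alpha_k (K^* K + \alpha_k I)^{-1} (K^* K)^\mu v .
\end{aligned}
\]
Taking into account noisy data and nonlinearity, with $K_k := F'(x_k^\delta)$, $K := F'(x^\dagger)$ and setting $K_\alpha := (K^* K + \alpha_k I)^{-1}$, we can write the error in (8.80) as
\[
\begin{aligned}
x_{k+1}^\delta - x^\dagger = {} & -\alpha_k (K_k^* K_k + \alpha_k I)^{-1} \bigl(K^* K - K_k^* K_k\bigr) K_\alpha (K^* K)^\mu v - \alpha_k K_\alpha (K^* K)^\mu v \\
& + (K_k^* K_k + \alpha_k I)^{-1} K_k^* \bigl(y^\delta - F(x_k^\delta) + K_k (x_k^\delta - x^\dagger)\bigr) ,
\end{aligned}
\]
where the resolvent in the first and last terms is the one built from $K_k$, as it arises directly from (8.80).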
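The linear, noiseless identity $x_{k+1} - x^\dagger = -\alpha_k (K^*K + \alpha_k I)^{-1}(x^\dagger - x_0)$ derived above is easy to confirm numerically. The sketch below assumes a small random matrix `K` as a stand-in for the forward operator and a geometric sequence $\alpha_k = 2^{-k}$, which satisfies (8.81) with $r = 2$; neither choice comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
K = rng.standard_normal((6, 4))          # hypothetical linear forward operator
x_dagger = rng.standard_normal(4)        # exact solution
x0 = rng.standard_normal(4)              # initial guess and regularisation centre
y = K @ x_dagger                         # exact data (delta = 0)


def irgnm_step(x_k, alpha_k):
    """One step of (8.80) for the linear map F(x) = Kx."""
    lhs = K.T @ K + alpha_k * np.eye(4)
    rhs = K.T @ (y - K @ x_k) + alpha_k * (x0 - x_k)
    return x_k + np.linalg.solve(lhs, rhs)


alphas = [0.5 ** k for k in range(20)]   # alpha_k / alpha_{k+1} = 2
x = x0.copy()
errors, mismatches = [], []
for alpha in alphas:
    x = irgnm_step(x, alpha)
    # Closed-form error predicted by the identity in the text.
    predicted = -alpha * np.linalg.solve(K.T @ K + alpha * np.eye(4), x_dagger - x0)
    mismatches.append(np.linalg.norm((x - x_dagger) - predicted))
    errors.append(np.linalg.norm(x - x_dagger))
```

Note that the new error depends only on $\alpha_k$ and $x^\dagger - x_0$, not on the previous iterate, so in the linear case the iteration is just Tikhonov regularisation with a decreasing parameter sequence.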
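Based on this, one can prove convergence and convergence rates under source conditions.

Theorem 8.14. Let $B_{2\rho}(x_0) \subseteq D(F)$ for some $\rho > 0$, (8.81),
\[
F'(\tilde{x}) = R(\tilde{x}, x) F'(x) + Q(\tilde{x}, x), \qquad
\|I - R(\tilde{x}, x)\| \le c_R , \qquad
\|Q(\tilde{x}, x)\| \le c_Q \|F'(x^\dagger)(\tilde{x} - x)\| , \tag{8.84}
\]
and (8.82) for some $0 \le \mu \le 1/2$, and let $k_* = k_*(\delta)$ be chosen according to the discrepancy principle (8.42) with $\tau > 1$. Moreover, we assume that $\|x_0 - x^\dagger\|$, $\|v\|$, $1/\tau$, $\rho$, and $c_R$ are sufficiently small. Then we obtain the rates
\[
\|x_{k_*}^\delta - x^\dagger\| = \begin{cases} o\bigl(\delta^{\frac{2\mu}{2\mu+1}}\bigr) , & 0 \le \mu < \tfrac12 , \\ O(\sqrt{\delta}) , & \mu = \tfrac12 . \end{cases}
\]

It is readily checked that (8.84) implies the tangential cone condition (8.44). In fact, for proving convergence without rates, it suffices to assume (8.44) to hold; cf. [181]. The same convergence rates result can be shown with the a priori stopping rule
\[ k_* \to \infty \quad \text{and} \quad \frac{\delta}{\sqrt{\alpha_{k_*}}} \to 0 \quad \text{as } \delta \to 0 , \tag{8.85} \]
for $\mu = 0$ and
\[ \eta\, \alpha_{k_*}^{\mu + \frac12} \le \delta < \eta\, \alpha_k^{\mu + \frac12} , \qquad 0 \le k < k_* , \tag{8.86} \]
for $0 < \mu \le 1$. With this choice of $k_*$, the convergence result remains valid under a range invariance condition on $F$ similar to those we encountered in the previous section,
\[ F'(\tilde{x}) = F'(x) R(\tilde{x}, x) \quad \text{and} \quad \|I - R(\tilde{x}, x)\| \le c_R \|\tilde{x} - x\| \tag{8.87} \]
for $x, \tilde{x} \in B_{2\rho}(x_0)$ and some positive constant $c_R$.

Regularised Halley. This reads as
\[
\begin{aligned}
S_k &= F'(x_k^\delta), \qquad r_k = F(x_k^\delta) - y^\delta , \\
x_{k+\frac12}^\delta &= x_k^\delta - (S_k^* S_k + \beta_k I)^{-1} \{ S_k^* r_k + \beta_k \sigma (x_k^\delta - x_0) \} , \\
T_k &= S_k + \tfrac12 F''(x_k^\delta)(x_{k+\frac12}^\delta - x_k^\delta, \cdot) , \\
x_{k+1}^\delta &= x_k^\delta - (T_k^* T_k + \alpha_k I)^{-1} \{ T_k^* r_k + \alpha_k \sigma (x_k^\delta - x_0) \}
\end{aligned}
\]
with a priori fixed sequences of regularisation parameters $(\alpha_k)_{k \in \mathbb{N}}$, $(\beta_k)_{k \in \mathbb{N}}$ satisfying
\[ \alpha_k \searrow 0 , \quad \beta_k \searrow 0 , \quad 1 \le \frac{\alpha_k}{\alpha_{k+1}} \le q , \quad 1 \le \frac{\beta_k}{\beta_{k+1}} \le q . \]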
Here $\sigma \in \{0, 1\}$ is a switch between a Levenberg–Marquardt type ($\sigma = 0$) and an irgnm type ($\sigma = 1$) version. For more details on the choice of the regularisation parameters $\alpha_k$, $\beta_k$ and the stopping index $k_*$, as well as the analysis of these methods, we refer to [145, 176, 177]. Also here, convergence rates can be established under source conditions.

8.2.4. Fixed point schemes. If the inverse problem can be written as a fixed point equation
\[ x = T(x) \tag{8.88} \]
in place of (8.33), we can employ the Picard iteration
\[ x_{k+1} = T(x_k) . \tag{8.89} \]
The situation of an inverse problem being formulated as (8.88) often arises naturally when projecting the pde model onto the observation manifold, provided the resulting equation can be resolved for the searched for quantity $x$. We will give examples for this in Sections 10.5 and 10.6. Convergence of this scheme can, e.g., be shown if $T$ is a contractive self-mapping, according to Banach's fixed point theorem.

Theorem 8.15. Let $X$ be a Banach space and let $T : M (\subseteq X) \to X$ be a self-mapping on the closed set $M$, that is, $T(x) \in M$ for all $x \in M$, and contractive,
\[ \|T(x) - T(\tilde{x})\| \le c\, \|x - \tilde{x}\| \tag{8.90} \]
for some $c \in (0, 1)$. Then there exists a unique fixed point $x^\dagger$ of (8.88). Moreover, the fixed point iteration (8.89) converges linearly,
\[ \|x_{k+1} - x^\dagger\| \le c\, \|x_k - x^\dagger\| \tag{8.91} \]
with $c \in (0, 1)$ as in (8.90).

Proof. First of all, the self-mapping property of $T$ renders the Picard sequence well-defined. We prove that it is a Cauchy sequence by first of all estimating the difference between subsequent iterates,
\[ \|x_{k+1} - x_k\| = \|T(x_k) - T(x_{k-1})\| \le c\, \|x_k - x_{k-1}\| \le \cdots \le c^k \|x_1 - x_0\| , \]
using this to bound the distance of some iterate from the starting value,
\[ \|x_{k+1} - x_0\| = \Bigl\| \sum_{j=0}^{k} (x_{j+1} - x_j) \Bigr\| \le \sum_{j=0}^{k} c^j \|x_1 - x_0\| \le \frac{1}{1 - c} \|x_1 - x_0\| , \]
and finally estimating the difference between arbitrary iterates (without loss of generality $\ell \le k$),
\[ \|x_k - x_\ell\| = \|T(x_{k-1}) - T(x_{\ell-1})\| \le c\, \|x_{k-1} - x_{\ell-1}\| \le \cdots \le c^\ell \|x_{k-\ell} - x_0\| \le c^\ell\, \frac{\|x_1 - x_0\|}{1 - c} , \]
which tends to zero as $k, \ell \to \infty$. Hence, since $X$ is a Banach space, the sequence $\{x_k\}_{k \in \mathbb{N}} \subseteq M$ converges to some $x$, which due to closedness lies in $M$. Taking limits on both sides of the Picard recursion $x_{k+1} = T(x_k)$ and using (Lipschitz) continuity of $T$ yields $x = T(x)$, that is, $x$ is a fixed point.

Uniqueness follows by taking the difference between two potential solutions $x$, $\tilde{x}$, which by (8.90) satisfies $\|x - \tilde{x}\| = \|T(x) - T(\tilde{x})\| \le c\, \|x - \tilde{x}\|$, that is, $(1 - c) \|x - \tilde{x}\| = 0$ and therefore $\tilde{x} = x$.

There exist many other fixed point theorems, some of them also constructive in the sense that they provide assertions on convergence of the fixed point iteration (8.89). We will here provide another one on existence of fixed points only that works under weaker conditions in the sense that instead of contractivity, only a certain continuity property of $T$ is required. It is a generalisation of Schauder's fixed point theorem to locally convex linear topological spaces, and it is also known as the Tikhonov (Tychonoff) fixed point theorem; cf. [334].

Theorem 8.16. Let $X$ be a locally convex topological vector space. For any nonempty compact convex set $M \subseteq X$, any continuous function $T : M \to M$ has a fixed point.

In particular this applies to bounded convex sets $M$ in a reflexive Banach space $X$ with the weak topology, that is, $x_n \rightharpoonup x$ if and only if for all $\phi \in X^*$ (the dual space of $X$), $\phi(x_n) \to \phi(x)$. Likewise it works for bounded convex sets $M$ in the dual $X = Z^*$ of a separable Banach space $Z$ with the weak* topology, that is, $x_n \stackrel{*}{\rightharpoonup} x$ if and only if for all $z \in Z$, $x_n(z) \to x(z)$.

So far, we have been considering exact data only. Since (8.91) implies linear convergence of the iterates, the comments from Section 8.2.3.2 on noise propagation apply here as well. This also implies that inverse problems that can be written as fixed point equations with contractive operators are necessarily well-posed in the sense of stable dependence of the solution on the data, which for mildly ill-posed problems may still be achievable by data smoothing and an appropriate choice of norms; cf. Section 8.2.3.2. However, we do not expect severely ill-posed problems to fit into this scheme. Also note that contractivity obviously implies uniqueness (on the set $M$), a fact
that we are going to exploit in some of the inverse problems to be considered below.
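As a small illustration of Theorem 8.15, consider the Picard iteration (8.89) for the scalar map $T(x) = \cos x$ on $M = [0, 1]$. This example is an assumption made here for illustration only: $T$ maps $M$ into itself and is contractive there with constant $c = \sin 1 < 1$. The sketch compares the observed errors with the linear rate (8.91) and with the a priori bound $c^k \|x_1 - x_0\| / (1 - c)$ that follows from the Cauchy estimate in the proof.

```python
import math


def picard(T, x0, n_iter):
    """Plain fixed point iteration x_{k+1} = T(x_k)."""
    xs = [x0]
    for _ in range(n_iter):
        xs.append(T(xs[-1]))
    return xs


T = math.cos
c = math.sin(1.0)                 # contraction constant of cos on [0, 1]
xs = picard(T, 0.5, 100)
x_fix = xs[-1]                    # numerically converged fixed point

# Only the first 40 iterates are compared, so errors stay well above round-off.
errors = [abs(x - x_fix) for x in xs[:40]]
ratios = [errors[k + 1] / errors[k] for k in range(39)]
a_priori = [c ** k / (1.0 - c) * abs(xs[1] - xs[0]) for k in range(40)]
```

The observed ratios are smaller than $c$ because the local contraction factor near the fixed point, $\sin(x^\dagger)$, is smaller than the global constant on $M$.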
8.3. The quasi-reversibility method

A particular class of linear inverse problems is concerned with the reconstruction of the initial status of a diffusive evolution from measurements taken at a later time instance. Such problems arise in numerous applications ranging from image deblurring via the identification of airborne contaminants to imaging with acoustic or elastic waves in the presence of strong attenuation. Considering an abstract evolution
\[ u' + Au = 0 , \quad t > 0 , \qquad u(0) = u_0 , \tag{8.92} \]
with a self-adjoint possibly unbounded operator $A$ mapping a subset $D(A)$ of some Hilbert space $X$ into $X$, we seek to recover the initial data $u_0$ from final time measurements
\[ g = u(T) . \tag{8.93} \]
Formally, we can write
\[ g = \exp(-AT) u_0 , \qquad u_0 = \exp(AT) g , \tag{8.94} \]
with the operators $\exp(-At)$, $t > 0$, symbolizing the semigroup associated to the evolution (8.92); cf. Section 12.1.1. In the case of a matrix $A$ acting on $X = \mathbb{R}^n$, it is simply the matrix exponential function $\exp(-At) = \sum_{n=0}^{\infty} \frac{(-t)^n}{n!} A^n$. However, if $A$ is an elliptic spatial differential operator, then $A$ is unbounded and $\exp(AT)$ is even much more so. Thus its action (8.94) on $g$ is only well-defined for very specific instances of $g$ and is highly unstable.

A prototypical example for this is the backwards heat problem on some spatial domain $\Omega$,
\[
\begin{cases}
u_t(x, t) - \Delta u(x, t) = 0 , & (x, t) \in \Omega \times (0, T), \\
u(x, t) = 0 , & (x, t) \in \partial\Omega \times (0, T), \\
u(x, 0) = u_0(x), \quad u(x, T) = g(x), & x \in \Omega ,
\end{cases} \tag{8.95}
\]
which has a natural setting of $A = -\Delta$ with homogeneous Dirichlet boundary conditions, $D(A) = H^2(\Omega) \cap H_0^1(\Omega)$, $X = L^2(\Omega)$. We will revisit this example, along with its fractional counterpart, in Section 10.1.

On the more abstract side, formally taking $\exp(-AT)$ as the linear forward operator and making the replacements $x = u_0$, $y = g$, we can cast this into the form (8.1) and apply methods from Section 8.1. A class of regularisation methods that is much more adapted to the structure of the problem arises from modifying
the ill-posed backwards evolution $u' + Au = 0$, $u(T) = g$ to arrive at a family of well-posed backwards problems. In fact, dating from the late 1960s, the initial attack on the inverse diffusion problem (8.95) was by the method of quasi-reversibility, whereby the parabolic operator was replaced by a nearby differential operator for which the time reversal was well posed; the approach was popularised in the book by Lattes and Lions [207]. Some examples suggested were adding a term $\epsilon u''$ so that the equation (8.92) was of hyperbolic type or a fourth order operator term $\epsilon A^2$ (thus converting the heat equation into the beam equation with lower order terms). The difficulty with both these perturbations is that the new operators require either further initial or further boundary conditions that are not transparently available. It should also be noted that the idea of adding a small, artificial term to a differential operator in order to improve the ill-conditioning of a numerical scheme, such as adding artificial viscosity to control the behaviour of shocks, is even older.

The quasi-reversibility approach by Showalter [318–320] was to instead use the pseudo-parabolic equation $(I + \epsilon A) u_\epsilon' + A u_\epsilon = 0$, $t \in (0, T)$, subject to the single initial condition $u_\epsilon(0) = u_0$. There is an interesting history to this equation. It occurs independently in numerous applications, such as a two-temperature theory of thermodynamics and flow in porous media [22, 60, 67], and it is known in the Russian literature as an equation of Sobolev type. Of course, in these applications the additional term $\epsilon L u_t$ was part of the extended model and not added merely for a stabilising effect.

The operator $B_\epsilon := (I + \epsilon A)^{-1} A$ is a bounded operator on $H^2(\Omega) \cap H_0^1(\Omega)$ for $\epsilon > 0$. Thus the full group of operators $\exp(t B_\epsilon)$ is easily defined by the power series $\sum_{n=0}^{\infty} \frac{t^n}{n!} (B_\epsilon)^n$ under conditions on the resolvent $R(\lambda, L)$ which are satisfied by any strongly elliptic operator $A$.

Under such conditions, $\exp(t B_\epsilon)$ converges to $\exp(tA)$ in the strong topology as $\epsilon \to 0$, and this is the basis of Yosida's proof of the Hille–Phillips–Yosida theorem which shows the existence of semigroups of differential operators. There are known error estimates on the rate of this convergence. The quasi-reversibility step is to recover an approximation to $u_0$ by computing $u_\epsilon(0) = \exp(T B_\epsilon) g(x)$. Here $\epsilon > 0$ obviously plays the role of a regularisation parameter and has to be chosen depending on the expected noise level $\delta$ in $g$. Thus replacing the heat equation by the pseudo-parabolic equation is a regularising method for solving the backwards heat problem and is well studied in the literature.

Of course, there were other approaches. For example, a blending of the quasi-reversibility ideas with those of logarithmic convexity led Showalter to
suggest that retaining the heat equation but introducing the quasi-boundary value
\[ \epsilon\, u_\epsilon(0) + u_\epsilon(T) = g \tag{8.96} \]
in place of the final value (8.93) gives superior reconstructions. Several authors have followed this idea, for example, [65]. A summary on some of this earlier work can be found in [12] and a more comprehensive discussion in [160, Section 3.1].

We will here provide a concise convergence analysis in the sense of regularisation from Section 8.1 for the examples mentioned above:
• biharmonic regularisation
\[ u_\epsilon' + A u_\epsilon - \epsilon A^2 u_\epsilon = 0 , \qquad u_\epsilon(T) = g ; \tag{8.97} \]
• pseudo-parabolic regularisation
\[ u_\epsilon' + \epsilon A u_\epsilon' + A u_\epsilon = 0 , \qquad u_\epsilon(T) = g ; \tag{8.98} \]
• quasi-boundary value regularisation
\[ u_\epsilon' + A u_\epsilon = 0 , \qquad u_\epsilon(T) + \epsilon\, u_\epsilon(0) = g . \tag{8.99} \]
In Section 10.1 we will add the fractional derivative–based method
\[ \partial_t^{1-\epsilon} u_\epsilon + A u_\epsilon = 0 , \qquad u_\epsilon(T) = g , \]
and variants of it.

Having the backwards heat problem in the back of our mind but in an actually much more general framework, we assume that the positive definite selfadjoint operator $A$ admits a decomposition via eigenvalues and eigenfunctions $\lambda_j$, $\varphi_j$ so that for all $j \in \mathbb{N}$ the identity $A \varphi_j = \lambda_j \varphi_j$ holds and $(\varphi_j)_{j \in \mathbb{N}}$ forms an orthonormal basis of $X$. This allows us to write the solutions of the above problems (8.97)–(8.99) as
\[ u_\epsilon(t) = \sum_{j=1}^{\infty} u_{\epsilon,j}(t)\, \varphi_j , \]
where the sum converges in $X$ if and only if $(u_{\epsilon,j}(t))_{j \in \mathbb{N}}$ is square summable and the coefficient functions $u_{\epsilon,j}$ satisfy the following final value problems for scalar odes:
\[
\begin{cases}
u_{\epsilon,j}' + \lambda_j u_{\epsilon,j} - \epsilon \lambda_j^2 u_{\epsilon,j} = 0 , \quad u_{\epsilon,j}(T) = g_j , & \text{for (8.97)}, \\
u_{\epsilon,j}' + \epsilon \lambda_j u_{\epsilon,j}' + \lambda_j u_{\epsilon,j} = 0 , \quad u_{\epsilon,j}(T) = g_j , & \text{for (8.98)}, \\
u_{\epsilon,j}' + \lambda_j u_{\epsilon,j} = 0 , \quad u_{\epsilon,j}(T) + \epsilon\, u_{\epsilon,j}(0) = g_j , & \text{for (8.99)}.
\end{cases}
\]
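Here $g_j = \langle g, \varphi_j\rangle$ denotes the coefficient of the final time data $g = \sum_{j=1}^{\infty} g_j \varphi_j$. These odes can be easily resolved, leading to explicit expressions for their solutions whose evaluation at $t = 0$ yields $u_{\epsilon,j}(0) = q_\epsilon(\lambda_j, T)\, g_j$ with
\[
q_\epsilon(\lambda, T) = \begin{cases}
\exp(\lambda (1 - \epsilon\lambda) T) & \text{for (8.97)}, \\
\exp\bigl(\frac{\lambda}{1 + \epsilon\lambda} T\bigr) & \text{for (8.98)}, \\
\frac{1}{\epsilon + \exp(-\lambda T)} & \text{for (8.99)},
\end{cases}
\]
as compared to $u_j(0) = e^{\lambda_j T} g_j$ for the exact solution to the inverse problem (8.92) and (8.93). This leads us to stating a convergence result, in which we apply the above method in the general framework of multiplying the coefficients $g_j$ with factors $q_\epsilon(\lambda_j, T)$. We here also take into account noise in the data $g^\delta$ with noise level $\delta \ge \|g^\delta - g\|$.

Theorem 8.17. Define an approximation
\[ u_\epsilon^\delta(0) = \sum_{j=1}^{\infty} q_\epsilon(\lambda_j, T) \langle g^\delta, \varphi_j\rangle \varphi_j \tag{8.100} \]
to $u_0$ with the family of functions $q_\epsilon$ satisfying the approximation and boundedness conditions,
\[
\sup_{j \in \mathbb{N}} q_\epsilon(\lambda_j, T) \le B(\epsilon) < \infty , \qquad
\sup_{\epsilon \in (0, \bar{\epsilon})} \sup_{j \in \mathbb{N}} q_\epsilon(\lambda_j, T) e^{-\lambda_j T} \le \bar{B} < \infty , \qquad
\lim_{\epsilon \to 0} q_\epsilon(\lambda_j, T) e^{-\lambda_j T} = 1 \ \text{ for all } j \in \mathbb{N} . \tag{8.101}
\]
Then by choosing the regularisation parameter $\epsilon = \epsilon(\delta)$ such that $\epsilon(\delta) \to 0$ and $\delta B(\epsilon(\delta)) \to 0$, we achieve convergence $\|u_\epsilon^\delta(0) - u_0\| \to 0$ as $\delta \to 0$.

Proof. The assumed bounds provide us with the estimate
\[
\begin{aligned}
\|u_\epsilon^\delta(0) - u_0\|
&= \Bigl( \sum_{j=1}^{\infty} \bigl( q_\epsilon(\lambda_j, T) \langle g^\delta, \varphi_j\rangle - e^{\lambda_j T} \langle g, \varphi_j\rangle \bigr)^2 \Bigr)^{1/2} \\
&\le \Bigl( \sum_{j=1}^{\infty} q_\epsilon(\lambda_j, T)^2 \langle g^\delta - g, \varphi_j\rangle^2 \Bigr)^{1/2}
+ \Bigl( \sum_{j=1}^{\infty} \bigl( (q_\epsilon(\lambda_j, T) e^{-\lambda_j T} - 1)\, e^{\lambda_j T} \langle g, \varphi_j\rangle \bigr)^2 \Bigr)^{1/2} \\
&\le B(\epsilon)\, \delta + d_\epsilon ,
\end{aligned}
\]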
where the approximation error $d_\epsilon = \bigl( \sum_{j=1}^{\infty} \bigl( (q_\epsilon(\lambda_j, T) e^{-\lambda_j T} - 1) \langle u_0, \varphi_j\rangle \bigr)^2 \bigr)^{1/2}$ can be shown to tend to zero as $\epsilon \to 0$, by means of Lebesgue's dominated convergence theorem. To this end note that by our assumption, each of the summands $\bigl( (q_\epsilon(\lambda_j, T) e^{-\lambda_j T} - 1) \langle u_0, \varphi_j\rangle \bigr)^2$ tends to zero as $\epsilon \to 0$ and is bounded by $\bar{B}^2 \langle u_0, \varphi_j\rangle^2$, where the latter are summable due to $\sum_{j=1}^{\infty} \langle u_0, \varphi_j\rangle^2 = \|u_0\|^2 < \infty$.

This result obviously applies to the methods (8.97), (8.98), (8.99) above. In principle, one could choose arbitrary functions $q_\epsilon$ satisfying the bounds (8.101) for defining $u_\epsilon(0)$ according to (8.100). Note however, that this requires knowledge of the eigenfunctions $\varphi_j$ of $A$, which in the concrete setting of elliptic differential operators is available explicitly only for very special domains and with constant coefficients. Thus, in the spirit of quasi-reversibility we here and in Section 10.1 focus on methods that have a meaningful interpretation via an evolution equation, for which efficient numerical solution methods are available.
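The estimate $\|u_\epsilon^\delta(0) - u_0\| \le B(\epsilon)\delta + d_\epsilon$ from the proof of Theorem 8.17 can be checked directly in coefficient space. The sketch below makes several assumptions that are not in the text: eigenvalues $\lambda_j = (j\pi)^2$ (Dirichlet Laplacian on the unit interval), a truncation to 20 modes, initial coefficients $1/j^2$, and a random perturbation scaled to norm $\delta$. It works only with expansion coefficients, so no eigenfunctions are evaluated.

```python
import numpy as np


def q_biharmonic(lam, T, eps):        # filter factor for (8.97)
    return np.exp(lam * (1.0 - eps * lam) * T)


def q_pseudo_parabolic(lam, T, eps):  # filter factor for (8.98)
    return np.exp(lam * T / (1.0 + eps * lam))


def q_quasi_boundary(lam, T, eps):    # filter factor for (8.99)
    return 1.0 / (eps + np.exp(-lam * T))


FILTERS = {
    "biharmonic": q_biharmonic,
    "pseudo-parabolic": q_pseudo_parabolic,
    "quasi-boundary": q_quasi_boundary,
}

J, T, delta = 20, 0.05, 1e-4
j = np.arange(1, J + 1)
lam = (np.pi * j) ** 2                     # assumed eigenvalues
u0 = 1.0 / j ** 2                          # hypothetical initial coefficients
g = np.exp(-lam * T) * u0                  # exact final-time coefficients

rng = np.random.default_rng(1)
noise = rng.standard_normal(J)
g_delta = g + delta * noise / np.linalg.norm(noise)   # ||g_delta - g|| = delta

# Unregularised inversion multiplies the noise by exp(lam * T).
naive_err = np.linalg.norm(np.exp(lam * T) * g_delta - u0)

results = {}
for name, q in FILTERS.items():
    for eps in (1e-1, 1e-2, 1e-3):
        qj = q(lam, T, eps)
        B_eps = qj.max()                                   # noise amplification
        d_eps = np.linalg.norm((qj * np.exp(-lam * T) - 1.0) * u0)
        err = np.linalg.norm(qj * g_delta - u0)            # error of (8.100)
        results[(name, eps)] = (err, B_eps * delta + d_eps)
```

Each entry of `results` pairs the actual reconstruction error with the bound $B(\epsilon)\delta + d_\epsilon$. The trade-off between the two contributions is visible when $\epsilon$ is varied: a smaller $\epsilon$ reduces $d_\epsilon$ but increases $B(\epsilon)$.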
Chapter 9
Fundamental Inverse Problems for Fractional Order Models
After having looked at fractional models in Chapters 6 and 7, we now turn to inverse problems involving these models. This and the following three chapters will deal with inverse problems arising in the context of pdes containing fractional order derivatives. The history of inverse problems for fractional differential equations is considerably more recent than that on analysis of fractional forward problems. Much of the work has been done for fractional derivative versions of classical inverse problems for a parabolic equation, that is, a subdiffusion model where the time derivative $\frac{\partial u}{\partial t}$ is replaced by $\partial_t^\alpha u$ with $0 < \alpha < 1$ and is usually of Djrbashian–Caputo type to accommodate an initial value $u(x, 0)$. As a consequence, the most extensive of the inverse problems chapters will be the one on inverse problems for subdiffusion, Chapter 10. Further references concerning this part can be found in the tutorial paper [170]. Inverse problems for fractionally damped wave equations are studied in Chapter 11. These are mainly motivated by problems in ultrasound imaging. There is also a short outlook on inverse problems in the context of the fractional Laplacian (focusing on the fractional Calderón problem as the most prominent instance) in Chapter 12. This would actually be a topic for a book on its own, in part because of the quite different mathematical tools and concepts to be used there.
Coming back to the content of this chapter, we start with the probably most fundamental question, namely the one of how to determine the order of the fractional derivative. Then we provide some facts on direct and inverse Sturm–Liouville problems, which play a crucial role in the identification of coefficients from time trace data. These are extended to the fractional case.
9.1. Determining fractional order

This is probably the most obvious of all inverse problems for fractional operators, as while we might have a model that is known to be of fractional type, the actual order is likely to be unknown unless the specifics of the subdiffusion model can be determined by the microscopic description. This is unlikely to hold for most practical applications, and thus it has to be recovered from experimental data. The reader will have noticed that the Green's functions in Chapter 5 and the fundamental solution of the subdiffusion operator in Chapter 6 have a very clear, strong dependence on the fractional exponent $\alpha$. Therefore determining other items without knowing $\alpha$ would be doomed to failure. Much work has been done here, mostly looking for a least squares fit from experimental data. One of the earliest works here is from 1975 [310] and in part it was based on the random walk model [252] discussed in Section 2.1; see also [143]. It is certainly the case that the exponent $\alpha$ is deeply embedded within the fractional operator, and we must expect its reconstruction to be nonlinear. In many cases of the subdiffusion operator the asymptotic behaviour of the solution $u$ as a function of time can be used to determine $\alpha$. The first paper with a rigorous existence and uniqueness analysis is [144], from which the first example in Section 9.1.1 is taken. All the work mentioned above has been done in the context of the subdiffusion equation and variants thereof. We will first consider such models and later turn to much more recent work done in the context of fractionally damped wave equations.

9.1.1. Subdiffusion type equations.

9.1.1.1. Determining a single exponent. Let us begin with the example of the following subdiffusion problem. Let $\Omega \subset \mathbb{R}^d$ ($d = 1, 2, 3$) be an open bounded domain with a smooth boundary $\partial\Omega$. We consider the subdiffusion problem,
\[
\begin{aligned}
\partial_t^\alpha u &= Lu && \text{in } \Omega \times (0, T], \\
u(x, t) &= 0 && \text{on } \partial\Omega \times (0, T], \\
u(\cdot, 0) &= u_0 && \text{in } \Omega ,
\end{aligned} \tag{9.1}
\]
where $Lu$ is a strongly elliptic operator in divergence form,
\[ Lu = \sum_{i,j=1}^{d} \frac{\partial}{\partial x_i} \Bigl( a_{ij}(x) \frac{\partial u}{\partial x_j} \Bigr) - c(x) u , \]
with the quadratic form generated by $a_{ij}$ uniformly positive definite and $c(x) \ge 0$. Thus the eigenvalues $\{\lambda_n\}$ of the operator $A = -L$ (equipped with homogeneous Dirichlet boundary conditions) are strictly positive. The initial condition $u_0$ is known (and possibly can be chosen at will). Suppose also that our additional data is the time trace of the flux of the solution at a fixed point $x_0$ on the boundary $\partial\Omega$,
\[ \frac{\partial u}{\partial \nu}(x_0, t) = h(t) , \quad 0 < t \le T , \qquad x_0 \in \partial\Omega , \tag{9.2} \]
for some fixed measurement time $T$.
The solution to (9.1) has a representation as an infinite series indexed by the sequencing of the eigenvalues $\{\lambda_n\}$ (and in dimensions $d > 1$ these may have multiplicity greater than one); cf. (6.18), (6.19), (6.20). After we take the normal derivative at $x_0$, this becomes
\[ h(t) = \frac{\partial u}{\partial \nu}(x_0, t) = \sum_{n=1}^{\infty} E_{\alpha,1}(-\lambda_n t^\alpha) \langle u_0, \varphi_n\rangle \frac{\partial \varphi_n}{\partial \nu}(x_0) =: \sum_{n=1}^{\infty} b_n E_{\alpha,1}(-\lambda_n t^\alpha) , \tag{9.3} \]
where $\{b_n\}$ is independent of $\alpha$. Uniqueness follows quickly: Take the Laplace transform of the above to obtain
\[ \hat{h}(s) = \sum_{n=1}^{\infty} b_n \frac{s^{\alpha - 1}}{s^\alpha + \lambda_n} \]
using equation (3.28). Now if $\alpha_1$ and $\alpha_2$ were two solutions giving the same time trace data $h(t)$, then for all $s$
\[ \sum_{n=1}^{\infty} b_n \frac{s^{\alpha_1 - 1}}{s^{\alpha_1} + \lambda_n} = \sum_{n=1}^{\infty} b_n \frac{s^{\alpha_2 - 1}}{s^{\alpha_2} + \lambda_n} , \tag{9.4} \]
where not all of the terms $\{b_n\}$ can vanish. The function of $s$ represented by the left-hand term has poles where $s^{\alpha_1} = -\lambda_n$ for each $n$, whereas the right-hand term has poles at $s^{\alpha_2} = -\lambda_n$. Since they are the same analytic function, these poles must have the same location, and so $\alpha_1 = \alpha_2$. A complete proof for this can be found in [214, 217].

The above proof sketch gives very little insight as to how $\alpha$ might be recovered. Here is a possibility: Suppose we are able to select the initial value $u_0$ at will. Then the simplest choice is to take it to be one of the eigenfunctions, say $\varphi_j$ for some fixed $j$, and to ensure that the measurement point $x_0$ is not a zero of $\frac{\partial \varphi_j}{\partial \nu}$. In this case the determining equation becomes
$h(t) = b_j E_{\alpha,1}(-\lambda_j t^\alpha)$, where $b_j$ is known. From its power series expansion $E_{\alpha,1}(-\lambda_j t^\alpha) = 1 - \frac{\lambda_j}{\Gamma(1+\alpha)} t^\alpha + O(t^{2\alpha})$, and so
\[ h(t) = b_j - \frac{\lambda_j b_j}{\Gamma(1 + \alpha)} t^\alpha + O(t^{2\alpha}) = h(0) + C t^\alpha + O(t^{2\alpha}) \quad \text{as } t \to 0 . \tag{9.5} \]
Thus the time trace of the flux is the sum of a constant (the value of the flux at $t = 0$) plus a term $C t^\alpha$ (where $C$ is unknown due to the presence of the $\Gamma(1 + \alpha)$ term) together with terms that decay at a faster rate. Thus a simple least squares fit to recover $C$ and $\alpha$ can be performed. Actually, since $C = -\frac{\lambda_j b_j}{\Gamma(1+\alpha)}$, the only unknown part is also a function of $\alpha$, and this gives further opportunities to couple these and solve the nonlinear equation for the single unknown $\alpha$.

If we are unable to measure for sufficiently small time values in order for the asymptotic behaviour to be effective, what now? Then there is the possibility of taking $T$ large and looking to extract $\alpha$ from large values of $t$. We know that for large arguments $E_\alpha(-\tau)$ behaves like a constant multiple of $1/\tau$, and so each of the eigenvalue indices will contribute a term $c(\lambda_j)/t^\alpha$. Thus, summing up, the total contribution will have asymptotic behaviour $h(t) \sim \tilde{C} t^{-\alpha}$. The value of the constant $\tilde{C}$ is unimportant, and again a least squares fit will recover $\alpha$.

The brief analysis above would seem to indicate that knowing the eigenvalues and eigenvectors, and hence in turn the coefficients in the equation and the properties of the underlying medium, is necessary. This is not actually the case. If the coefficients are only spatially dependent, then the time trace of the data, in this case the flux (but it could be alternatives), is governed by a Fourier expansion whose time-dependent factors are Mittag-Leffler functions of order $\alpha$ with argument based on $t^\alpha$. This Mittag-Leffler function is analytic and $t^\alpha$ then persists in any time trace. Further, from the power series of $E_{\alpha,1}$ we see that the lowest such power occurring is in fact $t^\alpha$. Thus the recovery of the exponent boils down to looking for this singularity in an otherwise analytic function, and this is best achieved by having information on the measured data available for very small or very large times.
The core of this argument from a uniqueness perspective can be seen from the problem brought into the complex plane through (9.4), and this needs no information on the underlying elliptic operator (and hence on the medium parameters other than the anomalous behaviour). Significant generalisations are therefore possible, but there is inevitable complexity; see, for example, [162]. There is a slightly alternative formulation that was in fact one of the original results in this direction [144] and which we now describe. The authors used the setup in (9.1) but imposed the more general impedance
9.1. Determining fractional order
boundary conditions $\frac{\partial u}{\partial \nu} + \gamma u = 0$ on $\partial\Omega$ with $0 < \gamma < \infty$, and now the overposed (measurement) data becomes the value of the solution at a fixed point $x_0 \in \Omega$,
$$ u(x_0,t) = h(t), \qquad 0 < t < T, \tag{9.6} $$
instead of (9.2). We then have the following result by Hatano, Nakagawa, Wang, and Yamamoto [144].

Theorem 9.1. Suppose $u_0 \in C_0^\infty(\Omega)$ and $Au_0(x_0) \neq 0$. Then the order $\alpha$ can be recovered by the formula
$$ \alpha = \lim_{t\to 0} \frac{t\,h'(t)}{h(t) - u_0(x_0)}. \tag{9.7} $$
If instead $A^{-1}u_0(x_0) \neq 0$ on $\Omega$, then
$$ \alpha = -\lim_{t\to\infty} \frac{t\,h'(t)}{h(t)}. \tag{9.8} $$
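Proof. Step 0. Representations of $h(t) = u(x_0,t)$ and $h'(t) = u_t(x_0,t)$. Since $u_0 \in C_0^\infty(\Omega)$, for any $\ell \in \mathbb{Z}$ there is a constant $C_\ell > 0$ such that $|\langle u_0,\varphi_n\rangle| \le C_\ell |\lambda_n|^{-\ell}$ and
$$ u_0(x_0) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0), \qquad Au_0(x_0) = \sum_{n=1}^\infty \lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0). \tag{9.9} $$
Also $\|A^m\varphi_n\| = |\lambda_n|^m$ for any $m \in \mathbb{Z}$, and standard elliptic regularity shows that $\|\varphi_n\|_{H^{2m}(\Omega)} \le C_1\big(\|A^m\varphi_n\| + \|\varphi_n\|\big)$. The Sobolev embedding theorem shows that if $m > \frac{d}{4}$, then there is a constant $C_2 = C_2(m)$ such that
$$ \max_{x\in\overline{\Omega}} |\varphi_n(x)| \le C_2 \|\varphi_n\|_{H^{2m}(\Omega)} \le C_1 C_2 (|\lambda_n|^m + 1) \qquad \forall n \in \mathbb{N}, $$
and in turn constants $C_3 > 0$, $C_4 > 0$ such that
$$ |\varphi_n(x_0)| \le C_3 |\lambda_n|^m, \qquad |\lambda_n| \le C_4 n^{2/d} \qquad \forall n \in \mathbb{N}, \tag{9.10} $$
where the last inequality is the Weyl estimate for the eigenvalues [345]. These above estimates now combine to show that
$$ u(x_0,t) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) E_{\alpha,1}(-\lambda_n t^\alpha), \qquad 0 < t < T, \tag{9.11} $$
where the series converges in $C[0,T]$. Therefore, for $0 < t < T$ we have
$$ u_t(x_0,t) = -\sum_{n=1}^\infty \lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0)\, t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha), \tag{9.12} $$
where we have used equation (3.34) with $m = 1$.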
9. Fundamental Inverse Problems for Fractional Order Models
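Proof of case $t \to 0$.
Step 1. Show that $t^{1-\alpha} h'(t) = -\frac{Au_0(x_0)}{\Gamma(\alpha)} + O(t^\alpha)$. Now
$$ E_{\alpha,\alpha}(-\lambda_n t^\alpha) = \frac{1}{\Gamma(\alpha)} + t^\alpha\, \frac{E_{\alpha,\alpha}(-\lambda_n t^\alpha) - \Gamma(\alpha)^{-1}}{t^\alpha} = \frac{1}{\Gamma(\alpha)} + t^\alpha r_n(t), $$
where $r_n(t)$ is continuous at $t = 0$, and so $\lim_{t\to0} r_n(t)$ exists. We can now split equation (9.12) into two parts
$$ u_t(x_0,t) = -\sum_{n=1}^\infty \lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0) \frac{t^{\alpha-1}}{\Gamma(\alpha)} - \sum_{n=1}^\infty \lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0)\, r_n(t)\, t^{2\alpha-1}. $$
We can estimate the remainder term for $t \ge 0$
$$ |r_n(t)| = \Big| \sum_{k=1}^\infty \frac{(-\lambda_n)^k t^{\alpha(k-1)}}{\Gamma((k+1)\alpha)} \Big| = |\lambda_n| \Big| \sum_{k=1}^\infty \frac{(-\lambda_n t^\alpha)^{k-1}}{\Gamma((k+1)\alpha)} \Big| = |\lambda_n|\, |E_{\alpha,2\alpha}(-\lambda_n t^\alpha)| \le |\lambda_n|, \tag{9.13} $$
and so by (9.9)
$$ \lim_{t\to0} t^{1-\alpha} u_t(x_0,t) = -\frac{1}{\Gamma(\alpha)} Au_0(x_0) + \lim_{t\to0} t^\alpha \Big( \sum_{n=1}^\infty -\lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0) r_n(t) \Big). \tag{9.14} $$
Here by (9.13)
$$ \Big| \sum_{n=1}^\infty -\lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0) r_n(t) \Big| \le \sum_{n=1}^\infty |\lambda_n|^2 \big|\langle u_0,\varphi_n\rangle \varphi_n(x_0)\big| \le \sum_{n=1}^\infty |\lambda_n|^2 \frac{C_\ell}{|\lambda_n|^\ell} C_3 |\lambda_n|^m. $$
From equation (9.10) we can choose $\ell$ sufficiently large in order that $\max_{0\le t\le T} \big| \sum_{n=1}^\infty \lambda_n \langle u_0,\varphi_n\rangle \varphi_n(x_0) r_n(t) \big| < \infty$, and hence from (9.13) and (9.14) we obtain
$$ \lim_{t\to0} t^{1-\alpha} u_t(x_0,t) = -\frac{Au_0(x_0)}{\Gamma(\alpha)}. \tag{9.15} $$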
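Proof of case $t \to 0$.
Step 2. Show that $t^{-\alpha}(h(t) - u_0(x_0)) = -\frac{Au_0(x_0)}{\Gamma(\alpha+1)} + O(t^\alpha)$. We also have
$$ E_{\alpha,1}(-\lambda_n t^\alpha) = 1 - \frac{\lambda_n t^\alpha}{\Gamma(\alpha+1)} + t^{2\alpha} \sum_{k=2}^\infty \frac{(-\lambda_n)^k t^{\alpha(k-2)}}{\Gamma(\alpha k+1)} = 1 - \frac{\lambda_n t^\alpha}{\Gamma(\alpha+1)} + t^{2\alpha} \lambda_n^2 E_{\alpha,2\alpha+1}(-\lambda_n t^\alpha), \tag{9.16} $$
and so
$$ u(x_0,t) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) + t^\alpha \sum_{n=1}^\infty \frac{(-\lambda_n)\langle u_0,\varphi_n\rangle \varphi_n(x_0)}{\Gamma(\alpha+1)} + t^{2\alpha} \sum_{n=1}^\infty \lambda_n^2 E_{\alpha,2\alpha+1}(-\lambda_n t^\alpha) \langle u_0,\varphi_n\rangle \varphi_n(x_0) $$
$$ = u_0(x_0) - t^\alpha \frac{Au_0(x_0)}{\Gamma(\alpha+1)} + t^{2\alpha} \tilde r(t), $$
where $\sup_{0\le t\le T} |\tilde r(t)| < \infty$. Thus
$$ \lim_{t\to0} t^{-\alpha}\big(u(x_0,t) - u_0(x_0)\big) = -\frac{Au_0(x_0)}{\Gamma(\alpha+1)}. \tag{9.17} $$
Proof of case $t \to 0$. Step 3. Conclude (9.7). Now using the assumption that $Au_0(x_0) \neq 0$ and $\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)$, it follows by combining equations (9.15) and (9.17) that
$$ \lim_{t\to0} \frac{t\,h'(t)}{h(t) - u_0(x_0)} = \frac{\lim_{t\to0} t^{1-\alpha} u_t(x_0,t)}{\lim_{t\to0} t^{-\alpha}(u(x_0,t) - u_0(x_0))} = \frac{Au_0(x_0)}{\Gamma(\alpha)} \Big/ \frac{Au_0(x_0)}{\Gamma(\alpha+1)} = \alpha, $$
showing the conclusion (9.7) for small times.

Proof of case $t \to \infty$: The second part of the result (9.8) relies on the asymptotic estimates (cf. (3.44))
$$ E_{\alpha,1}(-y) = \frac{y^{-1}}{\Gamma(1-\alpha)} + O(y^{-2}), \qquad E_{\alpha,\alpha}(-y) = \frac{-y^{-2}}{\Gamma(-\alpha)} + O(y^{-3}) \qquad \text{as } y \to \infty. $$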
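In turn, via (9.11) and (9.12), this translates to
$$ u(x_0,t) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) \frac{1}{\Gamma(1-\alpha)\lambda_n t^\alpha} + O(t^{-2\alpha}) \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) \frac{1}{\lambda_n^2} = t^{-\alpha} \frac{A^{-1}u_0(x_0)}{\Gamma(1-\alpha)} + O(t^{-2\alpha})\, A^{-2}u_0(x_0), \tag{9.18} $$
$$ u_t(x_0,t) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) \frac{t^{\alpha-1}}{\Gamma(-\alpha)\lambda_n t^{2\alpha}} + O(t^{-2\alpha-1}) \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle \varphi_n(x_0) \frac{1}{\lambda_n^2} = t^{-\alpha-1} \frac{A^{-1}u_0(x_0)}{\Gamma(-\alpha)} + O(t^{-2\alpha-1})\, A^{-2}u_0(x_0), $$
where the sign of the leading term of $u_t$ follows from inserting $E_{\alpha,\alpha}(-y) = -y^{-2}/\Gamma(-\alpha) + O(y^{-3})$ into (9.12). Thus, using $\Gamma(1-\alpha) = -\alpha\,\Gamma(-\alpha)$, we obtain
$$ \frac{t\,h'(t)}{h(t)} = \frac{t\,u_t(x_0,t)}{u(x_0,t)} \longrightarrow \frac{\Gamma(1-\alpha)}{\Gamma(-\alpha)} = -\alpha \qquad \text{as } t \to \infty, \tag{9.19} $$
which is (9.8).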
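The small time formula (9.7) can be checked numerically in the simplest possible setting. The sketch below is an illustration under assumptions that are not in the text: a single eigenmode with $\varphi(x_0) = 1$, so that $h(t) = E_{\alpha,1}(-\lambda t^\alpha)$ and $u_0(x_0) = 1$ (a single eigenfunction is not in $C_0^\infty(\Omega)$, so this illustrates the limit rather than the theorem's hypotheses). The helper `mittag_leffler` is a hypothetical truncated power series that is only adequate for small arguments.

```python
import math

def mittag_leffler(alpha, beta, z, terms=60):
    """Truncated power series for E_{alpha,beta}(z); adequate for small |z| only."""
    return sum(z**k / math.gamma(alpha * k + beta) for k in range(terms))

def small_time_ratio(alpha, lam, t):
    """Evaluate t*h'(t) / (h(t) - u0(x0)) for the single-mode trace h = E_{alpha,1}."""
    z = -lam * t**alpha
    h = mittag_leffler(alpha, 1.0, z)
    # Single-mode version of (9.12): h'(t) = -lam * t^(alpha-1) * E_{alpha,alpha}(z)
    dh = -lam * t**(alpha - 1.0) * mittag_leffler(alpha, alpha, z)
    return t * dh / (h - 1.0)

alpha_true, lam = 0.7, 2.0  # illustrative values
estimates = [small_time_ratio(alpha_true, lam, t) for t in (1e-2, 1e-4, 1e-6)]
print(estimates)
```

The ratio approaches $\alpha$ with an error of order $t^\alpha$, in line with the $O(t^\alpha)$ remainders in Steps 1 and 2. For the large time formula (9.8) the power series is no longer usable and a dedicated Mittag-Leffler evaluator would be needed.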
This type of problem has been generalised in several directions, and we address one of these in the next subsections. The first observation from the above is the fact that the solution $u(x,t)$ can divulge information about $\alpha$ from its asymptotic behaviour at either small or large time values. As the above results make clear we only need the measured data over a very small time segment due to the analytic behaviour in time of the solution operator, and hence it is amenable to being combined with inverse problems of fractional type with overposed data designed to recover a specific coefficient, say the potential term $c(x)$ in $L$, and also to obtain the fractional order. Problems such as these can be found in [213].

9.1.1.2. Determining multiple fractional orders. But is nature so kind as to only require a single value for $\alpha$? The next obvious step up in the model sophistication would replace the single value fractional derivative by a finite sum $\sum_{j=1}^N d_j \partial_t^{\alpha_j} u$, where a linear combination of $N$ fractional powers has been taken. The equation is then
$$ \sum_{j=1}^N d_j\, \partial_t^{\alpha_j} u(x,t) - Lu(x,t) = 0, \tag{9.20} $$
where $L$ is a strongly elliptic, symmetric operator defined on a bounded, open set $\Omega$ and subject to (for example) Dirichlet boundary conditions on
$\partial\Omega$ with initial condition $u(x,0) = u_0(x)$ on $\Omega$. Physically this represents a fractional diffusion model that assumes diffusion takes place in a medium in which there is no single scaling exponent, for example, a medium in which there are memory effects over multiple time scales. This seemingly simple device leads to considerable complications. We have to use the so-called multi-index Mittag-Leffler function $E_{\alpha_1,\dots,\alpha_N,\beta_1,\dots,\beta_N}(z)$ in place of the two parameter $E_{\alpha,\beta}(z)$; see [233]. This adds complexity, not only to the notation, but also in proving required regularity results for the basic forward problem of knowing $\Omega$, $L$, $u_0$ and determining $u(x,t)$; see [214, 218]. Note that here one must recover the number $N$ of terms in the multiterm derivative as well as its components: the exponents $\{\alpha_j\}_{j=1}^N$ and the weights $\{d_j\}_{j=1}^N$. The uniqueness argument above is based on the fact that after taking the Laplace transform $u \to \hat u$, the exponent appears as the location of a pole in $\hat u(s)$, and the same situation is the case in the multiterm configuration. One has to be careful to give the correct setting for this to work. In the paper [216] the setting was $\Omega = [0,L]$ and the solution was subject to Neumann boundary conditions, $u_x(0,t) = u_x(L,t) = 0$, with the initial state being the unit impulse $u(x,0) = \delta(x - x_0)$ for $x_0 \in (0,L)$. Uniqueness of the fractional operator was obtained from the time trace data $u(0,t)$. As a second example in [216], recovery of $\{\alpha_j, d_j\}_{j=1}^n$ was obtained from Dirichlet boundary conditions and a nonvanishing initial value $u_0(x) = u(x,0)$, where the measurement was the time trace $u(x_0,t)$ for $x_0 \in \Omega \subseteq \mathbb{R}^d$, $d \ge 1$. In [162] it was shown that the impact of the various orders on the temporal asymptotics of the solution $u$ is strong enough to even allow recovery of multiple orders without knowing the operator $A$.
We will revisit the problem of determining multiple orders in more detail in the context of fractionally damped wave equations in Section 9.1.2.

9.1.1.3. Determining a distributed derivative. A much more challenging time fractional operator is one of distributed type (see for example, [198, 215, 231, 255]), where the discrete sum derivative
$$ \mu(\alpha) = \sum_{j=1}^m d_j\, \delta(\alpha - \alpha_j) $$
is replaced by an integral and the finite sequence by a function $\mu(\alpha)$ (or more generally a measure on $[0,1]$),
$$ \partial_t^{(\mu)} u(t) = \int_0^1 \mu(\alpha)\, \partial_t^\alpha u(t)\, d\alpha. \tag{9.21} $$
We describe the work in [303], where a suitable inverse problem was set up that allowed the recovery of the function μ(α).
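The derivative type used will be the Djrbashian–Caputo version for $D^{(\mu)}$, that is $D^{(\mu)}u = \int_0^1 \mu(\alpha)\,\partial_t^\alpha u\, d\alpha$ with $\partial_t^\alpha u(x,t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t (t-\tau)^{-\alpha} \frac{d}{d\tau} u(x,\tau)\, d\tau$, and so
$$ D^{(\mu)}u = \int_0^t \Big[ \int_0^1 \frac{\mu(\alpha)}{\Gamma(1-\alpha)} (t-\tau)^{-\alpha}\, d\alpha \Big] \frac{d}{d\tau} u(x,\tau)\, d\tau := \int_0^t \eta(t-\tau) \frac{d}{d\tau} u(x,\tau)\, d\tau, \tag{9.22} $$
where
$$ \eta(s) = \int_0^1 \frac{\mu(\alpha)}{\Gamma(1-\alpha)} s^{-\alpha}\, d\alpha. \tag{9.23} $$
The assumptions on $\mu$ that will be needed in the proof of the unique recovery of $\mu(\alpha)$ are summarised in the admissible set,
$$ \mathcal{M} = \{\mu \in C^1[0,1] : \mu \ge 0,\ \mu(1) \neq 0,\ \mu(\alpha) \ge C_{\mathcal{M}} \text{ on } (\underline{\alpha}, \overline{\alpha})\} \tag{9.24} $$
for some $0 < \underline{\alpha} < \overline{\alpha} < 1$, $C_{\mathcal{M}} > 0$. With this background, our distributed pde (dde) model is
$$ \begin{aligned} D^{(\mu)}u(x,t) - Lu(x,t) &= f(x,t), & x \in \Omega,\ & t \in (0,T), \\ u(x,t) &= 0, & x \in \partial\Omega,\ & t \in (0,T), \\ u(x,0) &= u_0(x), & x \in \Omega. & \end{aligned} \tag{9.25} $$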
In what follows we first describe some representation theorems for the solution to (9.25) and hence show existence of weak solutions. This is done by first analysing the ordinary distributed fractional order equation
$$ D^{(\mu)}v(t) = -\lambda v(t), \qquad v(0) = 1, \qquad t \in (0,T), \tag{9.26} $$
and showing that there exists a unique solution. The main idea now is to determine the integral operator that serves as the inverse for $D^{(\mu)}$ in analogy with the Riemann–Liouville derivative being inverted by the Abel operator. If we now take the Laplace transform of $\eta$ in (9.23), then we have
$$ (\mathcal{L}\eta)(z) = \frac{\Phi(z)}{z}, \qquad \text{where } \Phi(z) = \int_0^1 \mu(\alpha) z^\alpha\, d\alpha. \tag{9.27} $$
Lemma 9.1 ([198, Proposition 3.2]). Define the operator $I^{(\mu)}$ by
$$ I^{(\mu)}\phi(t) = \int_0^t \kappa(t-s)\phi(s)\, ds, \qquad \text{where } \kappa(t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{zt}}{\Phi(z)}\, dz. $$
Then the following conclusions hold:
(1) $D^{(\mu)} I^{(\mu)}\phi(t) = \phi(t)$, $I^{(\mu)} D^{(\mu)}\phi(t) = \phi(t) - \phi(0)$ for $\phi \in L^1(0,T)$;
(2) $\kappa \in C^\infty(0,\infty)$, and for sufficiently small $t > 0$, $\kappa(t) = |\kappa(t)| \le C \ln\frac{1}{t}$.
With this, we can convert the ordinary distributed fractional order equation (9.26) to a Volterra integral equation of the second kind $v + \lambda I^{(\mu)} v = 1$, and conclude from, e.g., [120, Theorems 3.1, 3.5] the following well-posedness result.

Lemma 9.2. For each $\lambda > 0$ there exists a unique solution $v$ of (9.26).

As a corollary of Lemma 9.2 one can obtain the following result on well-posedness, solution representation, and regularity for the solution to the dde.

Theorem 9.2. There exists a unique weak (in the spatial $H_0^1(\Omega)$ sense) solution $u$ of the dde (9.25), and the representation formula
$$ u(x,t) = \sum_{n=1}^\infty \Big[ \langle u_0,\psi_n\rangle u_n(t) + \langle f(\cdot,0),\psi_n\rangle I^{(\mu)}u_n(t) + \int_0^t \langle f_t(\cdot,\tau),\psi_n\rangle I^{(\mu)}u_n(t-\tau)\, d\tau \Big] \psi_n(x) \tag{9.28} $$
holds, where $u_n$ is the unique solution of the distributed ode (9.26) with $\lambda = \lambda_n$. Moreover the regularity estimate
$$ \|u\|_{C(0,T;H^2(\Omega))} + \|D^{(\mu)}u\|_{C(0,T;L^2(\Omega))} \le C\big( \|u_0\|_{H^2(\Omega)} + T^{1/2}\|f\|_{H^1(0,T;H^2(\Omega))} + \|f\|_{C(0,T;H^2(\Omega))} \big) $$
holds, where $C > 0$ depends only on $\mu$, $L$, and $\Omega$.

We now consider the problem in one spatial variable, the case $\Omega = (0,1)$, $Lu = u_{xx}$ of (9.25),
$$ \begin{cases} D^{(\mu)}u - u_{xx} = f(x,t), & 0 < x < 1,\ 0 < t < \infty, \\ u(x,0) = u_0(x), & 0 < x < 1, \\ u(0,t) = g_0(t), & 0 \le t < \infty, \\ u(1,t) = g_1(t), & 0 \le t < \infty, \end{cases} \tag{9.29} $$
where $g_0, g_1 \in L^2(0,\infty)$ and $f(x,\cdot) \in L^1(0,\infty)$ for each $x \in (0,1)$. This allows for a representation analogous to that in Theorem 6.1, by means of the fundamental solution defined by
$$ G^{(\mu)}(x,t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Phi^{1/2}(z)}{2z}\, e^{zt - \Phi^{1/2}(z)|x|}\, dz, $$
with $\Phi$ as in (9.27), and the functions
$$ \theta^{(\mu)}(x,t) = \sum_{m=-\infty}^\infty G^{(\mu)}(x+2m,t), \qquad \bar\theta^{(\mu)} = I^{(\mu)} \theta^{(\mu)}_{xt}. $$
Theorem 9.3. For piecewise continuous $f$, $u_0$, $g_0$, and $g_1$, the representation
$$ u(x,t) = \sum_{i=1}^4 v_i $$
gives a solution to problem (9.29), where the functions $v_i$ are defined by
$$ \begin{aligned} v_1(x,t) &= \int_0^1 \big(\theta^{(\mu)}(x-y,t) - \theta^{(\mu)}(x+y,t)\big) u_0(y)\, dy, \\ v_2(x,t) &= -2 \int_0^t \partial_x \bar\theta^{(\mu)}(x,t-s)\, g_0(s)\, ds, \\ v_3(x,t) &= 2 \int_0^t \partial_x \bar\theta^{(\mu)}(x-1,t-s)\, g_1(s)\, ds, \\ v_4(x,t) &= \int_0^t \int_0^1 \big[\theta^{(\mu)}(x-y,t-s) - \theta^{(\mu)}(x+y,t-s)\big] f(y,s)\, dy\, ds. \end{aligned} $$
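The proof of the uniqueness result to be formulated below relies on some technical lemmas which we here state without proofs; these can be found in [303].

Lemma 9.3.
$$ \big(\mathcal{L}\bar\theta^{(\mu)}\big)(x,z) = \frac{e^{(x-2)\Phi^{1/2}(z)} - e^{-x\Phi^{1/2}(z)}}{2\big(1 - e^{-2\Phi^{1/2}(z)}\big)}, \qquad \mathcal{L}\Big( \frac{\partial\bar\theta^{(\mu)}}{\partial x}(x,t) \Big) = \frac{\Phi^{1/2}(z)e^{(x-2)\Phi^{1/2}(z)} + \Phi^{1/2}(z)e^{-x\Phi^{1/2}(z)}}{2\big(1 - e^{-2\Phi^{1/2}(z)}\big)}. $$
Lemma 9.4. For any $x_0 \in (0,1)$, the function $F$ defined by
$$ F(y;x_0) = \frac{e^{(x_0-2)y} - e^{-x_0 y}}{2(1 - e^{-2y})} $$
is strictly increasing on the interval $\big(\frac{\ln(2-x_0) - \ln x_0}{2(1-x_0)}, \infty\big)$.

Lemma 9.5. For any $x^* \in (0,1]$, the function $F_f$ defined by
$$ F_f(y;x^*) = \frac{y e^{(x^*-2)y} + y e^{-x^* y}}{2(1 - e^{-2y})} $$
is strictly decreasing on the interval $\big(\frac{1}{x^*}, \infty\big)$. (The plus sign in the numerator matches the second formula of Lemma 9.3, which is the one used for flux data.)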
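The monotonicity statements of Lemmas 9.4 and 9.5 are easy to probe numerically on a grid. The following sketch is only a sanity check, not a proof, and it assumes that $F_f$ is the flux counterpart of $F$ obtained from the second formula of Lemma 9.3, with a plus sign in the numerator. The sample points $x_0 = 0.3$, $x^* = 0.5$ and the grid length are arbitrary illustrative choices.

```python
import numpy as np

def F(y, x0):
    """F(y; x0) from Lemma 9.4."""
    return (np.exp((x0 - 2.0) * y) - np.exp(-x0 * y)) / (2.0 * (1.0 - np.exp(-2.0 * y)))

def F_flux(y, x_star):
    """F_f(y; x*) from Lemma 9.5, assuming a plus sign in the numerator."""
    return y * (np.exp((x_star - 2.0) * y) + np.exp(-x_star * y)) / (2.0 * (1.0 - np.exp(-2.0 * y)))

x0, x_star = 0.3, 0.5
# Left endpoints of the intervals named in the two lemmas.
y_lo = (np.log(2.0 - x0) - np.log(x0)) / (2.0 * (1.0 - x0))
y = np.linspace(y_lo, y_lo + 20.0, 2001)[1:]
y_flux = np.linspace(1.0 / x_star, 1.0 / x_star + 20.0, 2001)[1:]

F_is_increasing = bool(np.all(np.diff(F(y, x0)) > 0.0))
F_flux_is_decreasing = bool(np.all(np.diff(F_flux(y_flux, x_star)) < 0.0))
print(F_is_increasing, F_flux_is_decreasing)
```

Injectivity of $F(\cdot\,;x_0)$ on the stated interval is what allows the equality (9.32) in the uniqueness proof to be converted into an equality of the arguments $\Phi_1^{1/2}$ and $\Phi_2^{1/2}$.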
The uniqueness proof in the end relies on a density argument requiring a generalisation of the Weierstrass approximation theorem giving a condition under which one can thin out the polynomials and still maintain a dense set; see [37, 107].

Theorem 9.4 (Müntz–Szász). Let $\{k_i\}_{i=1}^\infty$ be a sequence of real positive numbers. Then the span of $\{1, x^{k_1}, x^{k_2}, \dots\}$ is dense in $C[a,b]$ if and only if $\sum_{i=1}^\infty \frac{1}{k_i} = \infty$.

As a corollary of this result one can prove the following.

Lemma 9.6. For any $N \in \mathbb{N}$, the vector space spanned by the functions $\{x \mapsto (nN)^x : n \in \mathbb{N}\}$ is dense in $L^2(0,1)$.

Proof. Assume that for some $h \in C[0,1]$, $\int_0^1 (nN)^x h(x)\, dx = 0$ for all $n \in \mathbb{N}$. We wish to prove that this implies $h \equiv 0$. With the change of variables $y = e^x$ and the identity $(nN)^x = e^{x\ln(nN)}$, we can conclude $0 = \int_1^e y^{\ln(nN)}\, h(\ln y)\, \frac{1}{y}\, dy$ for all $n \in \mathbb{N}$. Since $\sum_{n=1}^\infty \frac{1}{\ln(nN)}$ diverges, Theorem 9.4 implies that $\frac{h(\ln y)}{y} = 0$ for all $y \in [1,e]$, and therefore $h \equiv 0$ on $[0,1]$.
We are now able to state and prove two uniqueness results for the recovery of the distributed derivative $\mu$. First, by measuring the solution along a time trace from a fixed location $x_0$, one can use this data to uniquely recover $\mu(\alpha)$. This time trace can be one where the sampling point is located within the interior of $\Omega = (0,1)$ and we measure $u(x_0,t)$, or we measure the flux at $x^*$, that is, $u_x(x^*,t)$ for $0 \le t \le T$ where $0 < x^* \le 1$. This latter case therefore includes measuring the flux on the right-hand boundary $x = 1$.

Theorem 9.5. In the dde (9.29), set $u_0 = g_1 = f = 0$ and let $(\mathcal{L}g_0)(z) \neq 0$ for all $z \in (0,+\infty)$. Given $\mu_1, \mu_2 \in \mathcal{M}$, denote the two weak solutions with respect to $\mu_1$ and $\mu_2$ by $u(x,t;\mu_1)$ and $u(x,t;\mu_2)$, respectively. Then for any $x_0 \in (0,1)$ and $x^* \in (0,1]$, either
$$ u(x_0,t;\mu_1) = u(x_0,t;\mu_2) \tag{9.30} $$
or
$$ u_x(x^*,t;\mu_1) = u_x(x^*,t;\mu_2), \qquad t \in (0,T), \tag{9.31} $$
implies $\mu_1 = \mu_2$ on $[0,1]$.

Proof. For the first case of $u(x_0,t;\mu_1) = u(x_0,t;\mu_2)$, Theorem 9.3 yields
$$ u(x_0,t;\mu_j) = -2 \int_0^t \bar\theta^{(\mu_j)}(x_0,t-s)\, g_0(s)\, ds, \qquad j = 1,2, $$
which implies
$$ \int_0^t \bar\theta^{(\mu_1)}(x_0,t-s)\, g_0(s)\, ds = \int_0^t \bar\theta^{(\mu_2)}(x_0,t-s)\, g_0(s)\, ds. $$
Taking the Laplace transform in $t$ on both sides of the above equality together with the fact that $(\mathcal{L}g_0)(z) \neq 0$ on $(0,\infty)$, yields
$$ \mathcal{L}\big(\bar\theta^{(\mu_1)}(x_0,\cdot)\big)(z) = \mathcal{L}\big(\bar\theta^{(\mu_2)}(x_0,\cdot)\big)(z) \qquad \text{for } z \in (0,\infty). $$
This result and Lemma 9.3 then give
$$ \frac{e^{(x_0-2)\Phi_1^{1/2}(z)} - e^{-x_0\Phi_1^{1/2}(z)}}{2\big(1 - e^{-2\Phi_1^{1/2}(z)}\big)} = \frac{e^{(x_0-2)\Phi_2^{1/2}(z)} - e^{-x_0\Phi_2^{1/2}(z)}}{2\big(1 - e^{-2\Phi_2^{1/2}(z)}\big)}, \qquad z \in (0,\infty), $$
where $\Phi_j$ is defined as in (9.27) with $\mu$ replaced by $\mu_j$, $j = 1,2$. The definition of $\mathcal{M}$ and the fact $z \in (0,\infty)$ yield $\Phi_j^{1/2}(z) \in (0,\infty)$, and hence we can rewrite the above equality as
$$ F\big(\Phi_1^{1/2}(z);x_0\big) = F\big(\Phi_2^{1/2}(z);x_0\big), \qquad z \in (0,\infty), \tag{9.32} $$
where $F$ is defined as in Lemma 9.4. It is obvious that $\frac{\ln(2-x_0) - \ln x_0}{2(1-x_0)} > 0$ since $x_0 \in (0,1)$. Then we can pick a large $N^* \in \mathbb{N}_+$ such that
$$ \int_{\underline{\alpha}}^{\overline{\alpha}} C_{\mathcal{M}} \cdot (N^*)^\alpha\, d\alpha > \Big( \frac{\ln(2-x_0) - \ln x_0}{2(1-x_0)} \Big)^2, $$
which together with the definition of $\mathcal{M}$ gives that for each $z \in (0,\infty)$ with $z \ge N^*$, $\Phi_j(z) \in (0,\infty)$ and $\Phi_j^{1/2}(z) > \frac{\ln(2-x_0) - \ln x_0}{2(1-x_0)}$, $j = 1,2$, hence
$$ \Phi_j^{1/2}(nN^*) > \frac{\ln(2-x_0) - \ln x_0}{2(1-x_0)}, \qquad j = 1,2 \quad \forall n \in \mathbb{N}. \tag{9.33} $$
We can thus apply Lemma 9.4 to conclude from (9.32) and (9.33) that $\Phi_1^{1/2}(nN^*) = \Phi_2^{1/2}(nN^*)$ for all $n \in \mathbb{N}$, that is,
$$ \Phi_1(nN^*) = \Phi_2(nN^*) \qquad \forall n \in \mathbb{N}. $$
Consequently, we have, cf. (9.27),
$$ \int_0^1 \big(\mu_1(\alpha) - \mu_2(\alpha)\big)(nN^*)^\alpha\, d\alpha = 0 \qquad \forall n \in \mathbb{N}, $$
which by the density result Lemma 9.6 and the continuity of $\mu_1$, $\mu_2$ implies $\mu_1 = \mu_2$ on $[0,1]$.

For the case of $\frac{\partial u}{\partial x}(x^*,t;\mu_1) = \frac{\partial u}{\partial x}(x^*,t;\mu_2)$, we proceed analogously using the second part of Lemma 9.3 as well as Lemma 9.5 in place of Lemma 9.4.
9.1.2. Fractionally damped wave equations. We now revisit the general damped wave equation (7.1) derived in Section 2.2.2 and analysed in Section 7.1:
$$ u_{tt} + c^2 Au + \sum_{j=1}^J d_j\, \partial_t^{2+\gamma_j} u + \sum_{k=1}^N b_k\, \partial_t^{\alpha_k} A^{\beta_k} u = r. \tag{9.34} $$
This is actually a somewhat more general version, since it also contains a fractional power of the elliptic differential operator such as the Caputo–Wismer–Kelvin–Chen–Holm model (7.43). Here $A = -L$ on $\Omega \subseteq \mathbb{R}^d$, equipped with homogeneous Dirichlet, Neumann, or impedance boundary conditions, for an elliptic differential operator $L$ and $\alpha_k \in (0,1]$, $\beta_k \in (\frac{1}{2},1]$. Also note that we have made a slight change of notation as compared to (7.1) in order to more easily distinguish the differentiation orders notationally. We assume that the differentiation orders with respect to time are distinct, that is
$$ 0 < \alpha_1 < \alpha_2 < \cdots < \alpha_N \le 1, \qquad 0 < \gamma_1 < \gamma_2 < \cdots < \gamma_J \le 1, \tag{9.35} $$
two properties that are crucial for distinguishing the different asymptotic terms in the solution $u$ and its Laplace transform. Typically, in acoustics we just have $L = \Delta$, and then the operator $A$ is known. To take into account a (possibly unknown) spatially varying speed of sound $c(x)$, one can instead consider
$$ u_{tt} + A_c u + \sum_{j=1}^J d_j\, \partial_t^{2+\gamma_j} u + \sum_{k=1}^N b_k\, \partial_t^{\alpha_k} A_c^{\beta_k} u = r \tag{9.36} $$
with $A_c = -c(x)^2 \Delta$; in order to get a self-adjoint operator $A_c$, we then use the weighted $L^2$ inner product with weight function $\frac{1}{c^2}$. The pde model (9.34) or (9.36) will be considered on a time interval $t \in (0,T)$ and driven by initial conditions and/or a separable source term,
$$ \begin{aligned} & r(x,t) = \sigma(t) f(x), \qquad x \in \Omega,\ t \in (0,T), \\ & u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), \quad \big(u_{tt}(x,0) = u_2(x) \text{ if } \gamma_J > 0\big), \qquad x \in \Omega. \end{aligned} \tag{9.37} $$
Note that boundary conditions are already incorporated into the operator $A$ or $A_c$, respectively. We assume there are time trace observations for driving the solution by multiple initial and/or source data,
$$ h_i(t) = (B_i u_i)(t), \qquad t \in (0,T), \tag{9.38} $$
where $u_i$ solves (9.34), (9.37) with $(u_0, u_1, f) = (u_{0,i}, u_{1,i}, f_i)$, $i = 1,\dots,I$. For example, (a) $B_i v = v(x_i)$ or (b) $B_i v = \int_{\Sigma_i} \eta_i(x) v(x)\, dx$ for some points $x_i \in \Omega$ or some weight functions $\eta_i \in L^\infty(\Sigma_i)$ on $\Sigma_i \subseteq \partial\Omega$. We have to make sure that these evaluations are well-defined, that is, $u_i(t) \in C(\Omega)$ in case (a), or $u_i(t) \in L^1(\omega)$ in case (b). According to Theorem 7.1, this can be guaranteed if $\Omega \subseteq \mathbb{R}^d$ with $d = 1$ for (a) or $d \in \{1,2,3\}$ for (b).
The coefficients that we aim to recover are the constants
$$ N,\ J,\ c,\ b_k,\ d_j,\ \alpha_k,\ \beta_k,\ \gamma_j, \qquad k = 1,\dots,N,\ j = 1,\dots,J, $$
in (9.34). In addition to that, one can consider the problem of identifying, besides the differentiation orders, some space dependent quantities, such as the speed of sound $c = c(x)$ in ultrasound tomography, or the initial data $u_0 = u_0(x)$ while $u_1 = 0$ (equivalently the source term $f = f(x)$, while $u_0 = 0$, $u_1 = 0$) in photoacoustic tomography (PAT). We will focus on the reconstruction of differentiation orders and refer to [188] for details on the reconstruction of additional pde coefficients, as well as to Sections 11.1 and 11.2 for the reconstruction of $u_0(x)$ or $c(x)$ in the case of known differentiation orders.

Separation of variables based on the eigensystem of $A$ yields a solution representation,
$$ u_i(x,t) = \sum_{n=1}^\infty \Big[ (\sigma * w_{fn})(t) \langle f_i,\varphi_n\rangle + w_{2n}(t) \langle u_{2,i},\varphi_n\rangle + w_{1n}(t) \langle u_{1,i},\varphi_n\rangle + w_{0n}(t) \langle u_{0,i},\varphi_n\rangle \Big] \varphi_n(x), $$
and convergence of the sums is guaranteed by the energy estimates from Theorem 7.1. Here $w_{fn}$, $w_{2n}$, $w_{1n}$, $w_{0n}$ are the solutions of the resolvent equation
$$ w_{n,tt} + c^2\lambda_n w_n + \sum_{j=1}^J d_j D_t^{2+\gamma_j} w_n + \sum_{k=1}^N b_k \lambda_n^{\beta_k} D_t^{\alpha_k} w_n = \begin{cases} 1 & \text{for } w_{fn}, \\ 0 & \text{else}, \end{cases} $$
$$ w_n(0) = \begin{cases} 1 & \text{for } w_{0n}, \\ 0 & \text{else}, \end{cases} \qquad w_{n,t}(0) = \begin{cases} 1 & \text{for } w_{1n}, \\ 0 & \text{else}, \end{cases} \qquad w_{n,tt}(0) = \begin{cases} 1 & \text{for } w_{2n}, \\ 0 & \text{else}. \end{cases} $$
From the Laplace transformed resolvent equations we obtain the resolvent solutions
$$ \hat w_{0n}(s) = \frac{s + \sum_{j=1}^J d_j s^{\gamma_j+1} + \sum_{k=1}^N b_k \lambda_n^{\beta_k} s^{\alpha_k-1}}{\omega(s;\lambda_n)} = \frac{\omega(s;\lambda_n) - c^2\lambda_n}{s\,\omega(s;\lambda_n)}, $$
$$ \hat w_{1n}(s) = \frac{1 + \sum_{j=1}^J d_j s^{\gamma_j}}{\omega(s;\lambda_n)}, \qquad \hat w_{2n}(s) = \frac{\sum_{j=1}^J d_j s^{\gamma_j-1}}{\omega(s;\lambda_n)}, \qquad \hat w_{fn}(s) = \frac{1}{\omega(s;\lambda_n)}, $$
with
$$ \omega(s;\lambda_n) = s^2 + c^2\lambda_n + \sum_{j=1}^J d_j s^{\gamma_j+2} + \sum_{k=1}^N s^{\alpha_k} b_k \lambda_n^{\beta_k}, $$
where $\hat{\cdot}$ denotes the Laplace transform.

Formally, we can write
$$ \hat u_i(x,s) = \sum_{n=1}^\infty \Big[ \hat\sigma(s)\hat w_{fn}(s) \langle f_i,\varphi_n\rangle + \sum_{j=0}^2 \hat w_{jn}(s) \langle u_{j,i},\varphi_n\rangle \Big] \varphi_n(x), $$
and therefore
$$ \widehat{B_i u_i}(s) = \sum_{n=1}^\infty \Big[ \hat\sigma(s)\hat w_{fn}(s) \langle f_i,\varphi_n\rangle + \sum_{j=0}^2 \hat w_{jn}(s) \langle u_{j,i},\varphi_n\rangle \Big] B_i\varphi_n. $$
Choose, for simplicity, the excitations $u_{0i} = 0$, $u_{1i} = 0$, $u_{2i} = 0$, $f_i \neq 0$, and assume that for two models given by
$$ N, J, c, A, b_k, \alpha_k, \beta_k, d_j, \gamma_j, B_i, \qquad \tilde N, \tilde J, \tilde c, \tilde A, \tilde b_k, \tilde\alpha_k, \tilde\beta_k, \tilde d_j, \tilde\gamma_j, \tilde B_i, \tag{9.39} $$
we have equality of the observations
$$ B_i u_i(t) = \tilde B_i \tilde u_i(t), \qquad t \in (0,T), $$
from two possibly different space parts of the excitation $f_i$, $\tilde f_i$, and possibly using two different test specimens characterised by $A$, $\tilde A$, while the temporal part $\sigma$ of the excitation is assumed to be the same in both experiments, and its Laplace transform to vanish nowhere, $\hat\sigma(s) \neq 0$ for all $s$. By analyticity (due to the multinomial Mittag-Leffler functions representation; cf. [229, Theorem 4.1]), we get equality for all $t > 0$, thus equality of the Laplace transforms,
$$ \sum_{n=1}^\infty \frac{\langle f_i,\varphi_n\rangle B_i\varphi_n}{s^2 + c^2\lambda_n + \sum_{j=1}^J d_j s^{\gamma_j+2} + \sum_{k=1}^N s^{\alpha_k} b_k (\lambda_n)^{\beta_k}} = \sum_{n=1}^\infty \frac{\langle \tilde f_i,\tilde\varphi_n\rangle \tilde B_i\tilde\varphi_n}{s^2 + \tilde c^2\tilde\lambda_n + \sum_{j=1}^{\tilde J} \tilde d_j s^{\tilde\gamma_j+2} + \sum_{k=1}^{\tilde N} s^{\tilde\alpha_k} \tilde b_k (\tilde\lambda_n)^{\tilde\beta_k}}, \qquad i = 1,\dots,I,\ s \in \mathbb{C}_\theta, \tag{9.40} $$
in some sector $\mathbb{C}_\theta$ of the complex plane, due to our assumption that $\hat\sigma(s) = \hat{\tilde\sigma}(s)$ and $\hat\sigma(s) \neq 0$ for all $s$. We have to disentangle the sum over $n$ with the asymptotics for $s \to 0$ or $s \to \infty$ in order to recover the unknown quantities in (9.39) or at least some of them. The simplest way to do so is to assume that we know at least some of the eigenfunctions and can use them as excitations $f_i = \varphi_{n_i}$,
$\tilde f_i = \tilde\varphi_{\tilde n_i}$, $i = 1,\dots,I$, so that (9.40) becomes
$$ \frac{B_i\varphi_{n_i}}{s^2 + c^2\lambda_{n_i} + \sum_{j=1}^J d_j s^{\gamma_j+2} + \sum_{k=1}^N s^{\alpha_k} b_k (\lambda_{n_i})^{\beta_k}} = \frac{\tilde B_i\tilde\varphi_{\tilde n_i}}{s^2 + \tilde c^2\tilde\lambda_{\tilde n_i} + \sum_{j=1}^{\tilde J} \tilde d_j s^{\tilde\gamma_j+2} + \sum_{k=1}^{\tilde N} s^{\tilde\alpha_k} \tilde b_k (\tilde\lambda_{\tilde n_i})^{\tilde\beta_k}}, \qquad i = 1,\dots,I,\ s \in \mathbb{C}_\theta. \tag{9.41} $$
Alternatively to this single mode excitation, there is also the option to assume the excitation to be sufficiently smooth and use this to justify approximation of the sum over $n$ of fractions in (9.40) by a single fraction. We refer to [188] for details and to [162] for the subdiffusion case. In case of rational powers
$$ \alpha_k = \frac{p_k}{q_k}, \quad \tilde\alpha_k = \frac{\tilde p_k}{\tilde q_k}, \quad \gamma_j = \frac{n_j}{m_j}, \quad \tilde\gamma_j = \frac{\tilde n_j}{\tilde m_j}, \qquad \bar q = \operatorname{lcm}(q_1,\dots,q_N, \tilde q_1,\dots,\tilde q_{\tilde N}, m_1,\dots,m_J, \tilde m_1,\dots,\tilde m_{\tilde J}) \tag{9.42} $$
(where lcm stands for least common multiple), setting $z = s^{1/\bar q}$ we read (9.41) as an equality between two rational functions of $z$. Their coefficients and powers therefore need to coincide (provided $\{s^{1/\bar q} : s \in \mathbb{C}_\theta\} \supseteq [\underline{z},\infty)$ for some $\underline{z} \ge 0$). In the general case of real powers one can use an induction proof (see [162, Theorem 1.1] for the subdiffusion case) to obtain the same uniqueness. This yields the following:
(I) $N = \tilde N$, $J = \tilde J$;
(II) $c^2\lambda_{n_i} = \tilde c^2\tilde\lambda_{\tilde n_i}$, $i = 1,\dots,I$;
(III) $\alpha_k = \tilde\alpha_k$, $\gamma_j = \tilde\gamma_j$, $k = 1,\dots,N$, $j = 1,\dots,J$;
(IV) $b_k(\lambda_{n_i})^{\beta_k} = \tilde b_k(\tilde\lambda_{\tilde n_i})^{\tilde\beta_k}$, $d_j = \tilde d_j$ for $i = 1,\dots,I$, $k = 1,\dots,N$, $j = 1,\dots,J$.

To extract $b_k$ and $\beta_k$, which can be done separately for each $k$ (skipping the subscript $k$), we use $I = 2$ excitations, assume that the corresponding eigenvalues of $A$ and $\tilde A$ are known and equal, and define $F(b,\beta) = (b(\lambda_{n_1})^\beta, b(\lambda_{n_2})^\beta)^T$. One easily sees that the $2\times2$ matrix $F'(b,\beta)$ is regular for $\lambda_{n_1} \neq \lambda_{n_2}$ and $b \neq 0$, and thus by the inverse function theorem $F$ is injective. Hence, from (I)–(IV) we obtain all the constants in (9.39).

Theorem 9.6. Let $f_1 = \varphi_{n_1}$, $f_2 = \varphi_{n_2}$, $\tilde f_1 = \tilde\varphi_{\tilde n_1}$, $\tilde f_2 = \tilde\varphi_{\tilde n_2}$ for some $n_1 \neq n_2$, $\tilde n_1 \neq \tilde n_2 \in \mathbb{N}$. Then
$$ B_i u_i(t) = \tilde B_i \tilde u_i(t), \qquad t \in (0,T), \quad i = 1,2, $$
for the solutions $u_i$, $\tilde u_i$ of (9.34) with vanishing initial data, known and equal $\lambda_i = \tilde\lambda_i$, $i = 1,2$, and possibly unknown (rest of) $A$, $\tilde A$ implies
$$ N = \tilde N,\ J = \tilde J,\ c = \tilde c,\ b_k = \tilde b_k,\ \alpha_k = \tilde\alpha_k,\ \beta_k = \tilde\beta_k,\ d_j = \tilde d_j,\ \gamma_j = \tilde\gamma_j, \qquad k \in \{1,\dots,N\},\ j \in \{1,\dots,J\}. $$
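The map $F(b,\beta) = (b\lambda_{n_1}^\beta, b\lambda_{n_2}^\beta)^T$ used before Theorem 9.6 can in fact be inverted in closed form, which makes the injectivity concrete. The sketch below is an illustration under additional assumptions not stated in the text: both components are positive (so $b > 0$) and the two eigenvalues are positive and distinct. The function name `recover_b_beta` and the numerical values are hypothetical.

```python
import math

def recover_b_beta(v1, v2, lam1, lam2):
    """Invert F(b, beta) = (b*lam1**beta, b*lam2**beta), assuming v1, v2 > 0 and lam1 != lam2."""
    # The ratio v1/v2 = (lam1/lam2)**beta eliminates b.
    beta = math.log(v1 / v2) / math.log(lam1 / lam2)
    b = v1 / lam1**beta
    return b, beta

# Illustrative values: two known eigenvalues and the products recovered from (IV).
b_true, beta_true, lam1, lam2 = 0.3, 0.8, 2.0, 5.0
v1, v2 = b_true * lam1**beta_true, b_true * lam2**beta_true
b_est, beta_est = recover_b_beta(v1, v2, lam1, lam2)
print(b_est, beta_est)
```

The division by $\ln(\lambda_{n_1}/\lambda_{n_2})$ shows why distinct eigenvalues are required, and eigenvalues that are close together make the recovery of $\beta$ ill-conditioned.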
To discuss some reconstruction approaches, we will focus on the second order model obtained by setting $d_1 = \cdots = d_J = 0$. The key tool here will be Tauberian theorems, which link the time asymptotics to those in a Laplace domain; see Section 3.2. It will turn out that one excitation suffices for this purpose, so we will set $I = \{1\}$ and skip the $i$ subscripts.

9.1.2.1. Large time asymptotics. In case $u_0 = \varphi$, $u_1 = 0$, $f = 0$, we have the Laplace transformed observations according to (9.38),
$$ \hat h(s) = \frac{s + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k-1}}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda}\, B\varphi \sim \frac{B\varphi}{c^2} \sum_{k=1}^N b_k\lambda^{\beta_k-1} s^{\alpha_k-1} \qquad \text{as } s \to 0. \tag{9.43} $$
Thus by a Tauberian theorem, Theorem 3.14 (for $s \to 0$, $t \to \infty$; see also [103, Theorems 2, 3 in Chapter XIII.5]), we get
$$ h(t) = \frac{B\varphi}{c^2} \sum_{k=1}^N \frac{b_k\lambda^{\beta_k-1}}{\Gamma(1-\alpha_k)}\, t^{-\alpha_k} + O(t^{-2\alpha_1}) \qquad \text{as } t \to \infty. \tag{9.44} $$
Upon redefining $b_k \leftarrow b_k B\varphi$, without loss of generality we can assume that $B\varphi = 1$. From (9.44) we get an asymptotic formula for the smallest order, analogous to the one in Theorem 9.1 of Section 9.1.1 in the subdiffusion case,
$$ \alpha_1 = -\lim_{t\to\infty} \frac{\log h(t)}{\log t}. \tag{9.45} $$
By l'Hôpital's rule, which also removes the constant, we can instead use
$$ \alpha_1 = -\lim_{t\to\infty} \frac{\frac{d}{dt}(\log h(t))}{\frac{d}{dt}(\log t)} = -\lim_{t\to\infty} \frac{t\,h'(t)}{h(t)}. $$
After having determined $\alpha_1$ this way, we can also compute $b_1$ as a limit
$$ b_1 = \lim_{t\to\infty} \big(h(t)\, t^{\alpha_1}\big)\, c^2 \lambda^{1-\beta_1}\, \Gamma(1-\alpha_1). \tag{9.46} $$
So we can successively (by the above procedure and by subtracting the recovered terms one after another) reconstruct those terms $k$ for which $\alpha_k < 2\alpha_1$; see the remainder term in (9.44). However, if there are terms with $\alpha_k \ge 2\alpha_1$, they seem to get masked by the $O(t^{-2\alpha_1})$ remainder.
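The two limits for $\alpha_1$ and $b_1$ can be tried out on synthetic data. The sketch below is a rough illustration, not a reconstruction method from the text: the trace $h$ consists only of the leading terms of (9.44) with $B\varphi = 1$ (the $O(t^{-2\alpha_1})$ remainder is dropped), the derivative is evaluated analytically rather than from measurements, and all parameter values and function names are hypothetical.

```python
import math

def leading_coeffs(alphas, bs, betas, c, lam):
    """Coefficients of t**(-alpha_k) in (9.44), with B*phi normalised to 1."""
    return [b * lam**(be - 1.0) / (c**2 * math.gamma(1.0 - a))
            for a, b, be in zip(alphas, bs, betas)]

def h(t, alphas, coeffs):
    return sum(ck * t**(-a) for a, ck in zip(alphas, coeffs))

def dh(t, alphas, coeffs):
    return sum(-a * ck * t**(-a - 1.0) for a, ck in zip(alphas, coeffs))

# Two damping terms; illustrative values only.
alphas, bs, betas, c, lam = (0.25, 0.6), (0.1, 0.1), (1.0, 1.0), 1.0, 1.0
coeffs = leading_coeffs(alphas, bs, betas, c, lam)

t = 1.0e12  # a single very late observation time
alpha1_est = -t * dh(t, alphas, coeffs) / h(t, alphas, coeffs)
b1_est = (h(t, alphas, coeffs) * t**alpha1_est
          * c**2 * lam**(1.0 - betas[0]) * math.gamma(1.0 - alpha1_est))
print(alpha1_est, b1_est)
```

Even in this idealised setting the error in $\alpha_1$ decays only like $t^{-(\alpha_2-\alpha_1)}$, and the error in $b_1$ is amplified by the factor $\ln t$ through $t^{\alpha_1}$, which is consistent with the observation that coefficients are harder to resolve than exponents.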
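The same holds true if we do the excitation by $u_0 = 0$, $u_1 = \varphi$, $f = 0$, or $u_0 = 0$, $u_1 = 0$, $f = \varphi$, where (9.43) changes to
$$ \hat h(s) = \frac{1}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda}\, B\varphi. $$
In order to avoid the restriction $\alpha_k < 2\alpha_1$, we thus refine the expansion of $\hat h(s)$ in terms of powers of $s$ and retain the singular ones, that is, those with negative powers, since we are looking at the limiting case $s \to 0$. Using the geometric series formula and the multinomial theorem, with the abbreviations $\Sigma = \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k}$, $q = \frac{s^2+\Sigma}{c^2\lambda}$, yields
$$ \hat h(s) = \frac{s + \Sigma/s}{s^2 + \Sigma + c^2\lambda} = \frac{1}{s}\,\frac{q}{q+1} = -\frac{1}{s} \sum_{m=1}^\infty (-q)^m $$
$$ = -\frac{1}{s} \sum_{m=1}^\infty \Big(-\frac{1}{c^2\lambda}\Big)^m \sum_{\ell=0}^m \binom{m}{\ell} s^{2(m-\ell)} \Sigma^\ell $$
$$ = -\sum_{m=1}^\infty \Big(-\frac{1}{c^2\lambda}\Big)^m \sum_{\ell=0}^m \binom{m}{\ell} s^{2(m-\ell)-1} \times \sum_{i_1+\cdots+i_N=\ell} \binom{\ell}{i_1,\dots,i_N} \prod_{j=1}^N b_j^{i_j} \lambda^{\beta_j i_j}\, s^{\sum_{j=1}^N \alpha_j i_j}. $$
The terms corresponding to $\ell < m$ are obviously $O(s)$, so to extract the singularities it suffices to consider $\ell = m$. The set of indices leading to singular terms is then
$$ I_m = I_m(\alpha_1,\dots,\alpha_N) = \Big\{ i = (i_1,\dots,i_N) : \sum_{j=1}^N i_j = m,\ \sum_{j=1}^N \alpha_j i_j < 1 \Big\}, \qquad m \le m_{\max}, $$
where $m_{\max} = [\frac{1}{\alpha_1}]$, and the singular part of $\hat h$ can be written as
$$ \hat h_{\mathrm{sing}}(s) = -\sum_{m=1}^{m_{\max}} \Big(-\frac{1}{c^2\lambda}\Big)^m \sum_{i\in I_m} \tilde b_{m,i}\, s^{\sum_{j=1}^N \alpha_j i_j - 1} \qquad \text{with } \tilde b_{m,i} = \binom{m}{i_1,\dots,i_N} \prod_{j=1}^N b_j^{i_j} \lambda^{\beta_j i_j}. \tag{9.47} $$
In case $N = 2$ with $i = i_1$, $i_2 = m - i$, this reads as
$$ \hat h_{\mathrm{sing}}(s) = -\sum_{m=1}^{m_{\max}} \Big(-\frac{1}{c^2\lambda}\Big)^m \sum_{i=i_{m,\min}}^m \tilde b_{m,i}\, s^{\alpha_1 i + \alpha_2(m-i) - 1} \qquad \text{with } \tilde b_{m,i} = \binom{m}{i} b_1^i b_2^{m-i} \lambda^{\beta_1 i + \beta_2(m-i)}, $$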
since it is readily checked that then $I_m = \{i \in \{0,\dots,m\} : \alpha_1 i + \alpha_2(m-i) < 1\} = \{i_{m,\min},\dots,m\}$, with $i_{m,\min}$ the smallest integer in $\{0,\dots,m\}$ exceeding $\frac{m\alpha_2 - 1}{\alpha_2 - \alpha_1}$. For (9.47), by Theorem 3.14
$$ h_{\mathrm{sing}}(t) = -\sum_{m=1}^{m_{\max}} \Big(-\frac{1}{c^2\lambda}\Big)^m \sum_{i\in I_m} \frac{\tilde b_{m,i}}{\Gamma\big(1 - \sum_{j=1}^N \alpha_j i_j\big)}\, t^{-\sum_{j=1}^N \alpha_j i_j}, $$
which confirms (9.45) and (9.46). For the first damping term, we also get an asymptotic formula for the smallest order in the Laplace domain, the small $s$ counterpart to (9.45). Due to
$$ \log \hat h(s) \approx \log b_1 + (\alpha_1 - 1)\log s - \log(c^2\lambda^{1-\beta_1}) \qquad \text{as } s \to 0, \tag{9.48} $$
we have
$$ 1 - \alpha_1 = -\lim_{s\to0} \frac{\log \hat h(s)}{\log s}. \tag{9.49} $$
One might realise this limit by extrapolation: Fit the values $\frac{\log \hat h(s_m)}{\log s_m}$ at $M$ sample points $s_1,\dots,s_M$ to a regression line or low order polynomial $r(s)$, and set $\alpha_1 = 1 + r(0)$. Next, recover $b_1$ from
$$ \log b_1 \approx \log \hat h(s) + \log(c^2\lambda^{1-\beta_1}) - (\alpha_1 - 1)\log s. $$
Formula (9.48) also shows why recovering $b_1$ and $\alpha_1$ simultaneously appears to be so hard: the factor multiplied with $\log b_1$ is unity while the factor multiplied with $1 - \alpha_1$ tends to minus infinity as $s \to 0$.

9.1.2.2. Small time asymptotics. In case $u_0 = \varphi$, $u_1 = 0$, $f = 0$, where the Laplace transformed observations are given by
$$ \hat h(s) := \frac{s + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k-1}}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda}, \tag{9.50} $$
large $s$/small $t$ asymptotics cannot be exploited for the following reason. As $s \to \infty$ then $\hat h(s) \to 0$, and in fact $s\hat h(s) \to 1$, due to
$$ s\hat h(s) - 1 = \frac{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k}}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda} - 1 = -\frac{c^2\lambda}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda}. $$
Thus clearly even $s^\gamma[s\hat h(s) - 1] \to 0$ for any $\gamma \in [0,2)$. This shows that excitation by $u_0$ is not appropriate for extracting the fractional orders from small $t$/large $s$ asymptotics. However, in case $u_0 = 0$, $u_1 = \varphi$, $f = 0$, the situation is different since then
$$ \hat h(s) := \frac{1}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda} \tag{9.51} $$
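and therefore
$$ 1 - (s^2 + c^2\lambda)\hat h(s) = \frac{s^2}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda} \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k-2}, \tag{9.52} $$
where the factor $\frac{s^2}{s^2 + \sum_{k=1}^N b_k\lambda^{\beta_k} s^{\alpha_k} + c^2\lambda}$ tends to unity as $s \to \infty$. Via a Tauberian theorem (for $s \to \infty$, $t \to 0$, see [103, Theorems 2, 3 in Chapter XIII.5]), this yields
$$ -c^2\lambda h(t) - h''(t) \sim \sum_{k=1}^N \frac{b_k\lambda^{\beta_k}}{\Gamma(2-\alpha_k)}\, t^{1-\alpha_k} \qquad \text{as } t \to 0, $$
where we have used the fact that $h(0) = 0$, $h'(0) = B\varphi = 1$, because of $u_0 = 0$, $u_1 = \varphi$. This provides the asymptotic formulas
$$ 1 - \alpha_N = \lim_{t\to0} \frac{\ln(-c^2\lambda h(t) - h''(t))}{\ln(t)} = \lim_{t\to0} \Big| \frac{t\,(c^2\lambda h'(t) + h'''(t))}{c^2\lambda h(t) + h''(t)} \Big|, $$
where we have used l'Hôpital's rule, and
$$ b_N = -\lim_{t\to0} \big( (c^2\lambda h(t) + h''(t))\, t^{\alpha_N-1} \big)\, \lambda^{-\beta_N}\, \Gamma(2-\alpha_N). $$
Note that the order is now reversed as compared to the large time case, in the sense that we first recover the largest $\alpha$ exponent. Again, in order to also recover $\alpha_{N-1},\dots,\alpha_1$, one may also take into account mixed terms in the expansion of (9.52) analogously to the large $t$ asymptotics case. In case of two damping terms, this results in
$$ \begin{aligned} -c^2\lambda h(t) - h''(t) ={}& \frac{b_1\lambda^{\beta_1}}{\Gamma(2-\alpha_1)}\, t^{1-\alpha_1} + \frac{b_2\lambda^{\beta_2}}{\Gamma(2-\alpha_2)}\, t^{1-\alpha_2} + \frac{b_1\lambda^{\beta_1} c^2\lambda}{\Gamma(4-\alpha_1)}\, t^{3-\alpha_1} \\ &+ \frac{b_2\lambda^{\beta_2} c^2\lambda}{\Gamma(4-\alpha_2)}\, t^{3-\alpha_2} + \frac{b_1^2\lambda^{2\beta_1}}{\Gamma(4-2\alpha_1)}\, t^{3-2\alpha_1} + \frac{b_1 b_2\lambda^{\beta_1+\beta_2}}{\Gamma(4-\alpha_1-\alpha_2)}\, t^{3-\alpha_1-\alpha_2} \\ &+ \frac{b_2^2\lambda^{2\beta_2}}{\Gamma(4-2\alpha_2)}\, t^{3-2\alpha_2} + O(t^{5-3\alpha_2}) \qquad \text{as } t \to 0. \end{aligned} \tag{9.53} $$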
We will now show some numerical reconstructions for both the small and the large time regimes. We will confine ourselves to the case βi = 1, that is, without space fractional derivatives involved. 9.1.2.3. Large time measurements. Here we are trying to simulate the asymptotic values of the constituent powers of t occurring in the data function h(t). This is achieved by using a sample of points between tmin and tmax .
We apply Newton's method to recover the constants $\{\alpha_i, c_{k,\ell}\}$ in
$$ \begin{aligned} h(t) ={}& c_{1,1}t^{-\alpha_1} + c_{1,2}t^{-\alpha_2} + c_{1,3}t^{-\alpha_3} + c_{2,1}t^{-2\alpha_1} + c_{2,2}t^{-2\alpha_2} + c_{2,3}t^{-2\alpha_3} \\ &+ c_{2,4}t^{-\alpha_1-\alpha_2} + c_{2,5}t^{-\alpha_2-\alpha_3} + c_{2,6}t^{-\alpha_3-\alpha_1} + c_{3,1}t^{-3\alpha_1} + c_{3,2}t^{-3\alpha_2} \\ &+ c_{3,3}t^{-2\alpha_1-\alpha_2} + c_{3,4}t^{-\alpha_1-2\alpha_2} + c_{3,5}t^{-2\alpha_1-\alpha_3} + c_{3,6}t^{-\alpha_1-2\alpha_3} \\ &+ c_{3,7}t^{-2\alpha_2-\alpha_3} + c_{3,8}t^{-\alpha_2-2\alpha_3} + O(t^{-3\alpha_3}), \end{aligned} $$
which results from the expansion (9.47) in case of three terms, and then we recover $b_i = c_{1,i}\,\Gamma(1-\alpha_i)$. In the above we have neglected the term $t^{-3\alpha_3}$ as this would not arise from the Tauberian theorem in the case where the largest power $\alpha_3 \ge \frac{1}{3}$. We may also have to exclude other terms such as $t^{-2\alpha_1-\alpha_2}$ if $2\alpha_1 + \alpha_2 \ge 1$. In practice, during the iteration process, terms should be included or excluded in the code depending on this criterion: we did so by checking if the argument passed to the $\Gamma$ function would be negative, in which case the term is deleted from use for that iteration step.

Values for Figure 9.1 shown below are $t_{\min} = 5 \times 10^4$ and $t_{\max} = 2 \times 10^5$. As a general rule, terms with small $\alpha$ values can be resolved with a smaller value of $t_{\max}$, but for, say, the recovery of a pair of damping terms with $\alpha_i > 0.8$, a larger value of $t_{\max}$ with commensurate increased accuracy will be needed. The case of three damping terms $\{b_i\partial_t^{\alpha_i}\}_{i=1}^3$ with $\alpha = \{\frac{1}{4}, \frac{1}{3}, \frac{2}{3}\}$ and $b_i = 0.1$ is shown in Figure 9.1. The starting values were taken to be within 10–30% from the actual.
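The inclusion criterion for terms in the expansion can be made explicit by enumerating the multi-indices with $\sum_j \alpha_j i_j < 1$, which is the condition defining the index sets $I_m$ in (9.47). The following sketch is only an illustration of that bookkeeping, not the authors' implementation; the function name `singular_exponents` is hypothetical, and exact rational arithmetic is an added choice made here to avoid rounding problems in borderline cases such as $\alpha_2 + \alpha_3 = 1$.

```python
from fractions import Fraction
from itertools import product

def singular_exponents(alphas, max_order):
    """List (multi-index, exponent) pairs with 1 <= |i| <= max_order and sum(alpha_j * i_j) < 1."""
    terms = []
    for idx in product(range(max_order + 1), repeat=len(alphas)):
        if 1 <= sum(idx) <= max_order:
            exponent = sum(a * i for a, i in zip(alphas, idx))
            if exponent < 1:  # otherwise Gamma(1 - exponent) has a non-positive argument
                terms.append((idx, exponent))
    return sorted(terms, key=lambda term: term[1])

# The three-term example alpha = {1/4, 1/3, 2/3}, expanded up to third order.
alphas = (Fraction(1, 4), Fraction(1, 3), Fraction(2, 3))
terms = singular_exponents(alphas, 3)
for idx, exponent in terms:
    print(idx, exponent)
```

For these exponents only ten of the seventeen candidate powers survive; for instance $t^{-\alpha_2-\alpha_3}$ is excluded because $\alpha_2 + \alpha_3 = 1$. In an iterative scheme the list would have to be regenerated whenever the current iterate for the $\alpha_i$ changes.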
[Figure 9.1: two panels showing the $\alpha_i$ values and the $b_i$ values plotted against the iteration number.]

Figure 9.1. Convergence of $\alpha_i$ and $b_i$ in the large time case, where $i = 1$ is shown as bullets, $i = 2$ as circles, $i = 3$ as stars.
There are features here that are typical of such reconstructions. The method resolves the lowest fractional power α1 and its coefficient b1 quickly as this term is the most persistent one for large times: essential numerical convergence for {α1 , b1 } is obtained by the third iteration. The next lowest power and coefficient lags behind; here {α2 , b2 } is already at the stated accuracy by the fifth iteration. In each case the power is resolved faster and more accurately than its coefficient. The third term also illustrates this; the
power is essentially resolved by iteration 8, but in fact its coefficient $b_3$ is not resolved to the third decimal place until iteration number 30. This is seen quite clearly in the singular values of the Jacobian: the largest singular values correspond to the lowest $\alpha$-values and the smallest to the coefficients of the largest $\alpha$-powers. As might be expected, resolving terms whose powers are quite close is in general more difficult. This is relatively insignificant for low $\alpha$ values. For example, with $\alpha_1 = 0.2$ and $0.22 < \alpha_2 < 0.25$, say, correct resolution will be obtained although the coefficients will take longer to resolve than indicated in Figure 9.1. On the other hand if, say, $\alpha_1 = 0.25$, $\alpha_2 = 0.85$, and $\alpha_3 = 0.9$, then with the indicated range of time values used the code will fail to recover this last pair. If this is sensed and now only a single second power is requested, this will give a good estimate for $\alpha_2$ but its coefficient will be overestimated. Also, as indicated previously, $\alpha$ values close to one require an extended time measurement range to stay closer to the asymptotic regime of $u(t)$.
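The persistence of the lowest power at large times can be made concrete with a small sketch. It keeps only the three first-order terms of $h(t)$ (an assumption made here for illustration), uses $c_{1,i} = b_i/\Gamma(1-\alpha_i)$, and evaluates the local log-log decay rate $-\,d\log h/d\log t$, which is a weighted average of the $\alpha_i$. The name `effective_exponent` is hypothetical.

```python
import math


def effective_exponent(t, alphas, bs):
    """Local decay rate -d(log h)/d(log t) of h(t) = sum_i c_i t^(-alpha_i)."""
    # first-order coefficients, assuming c_i = b_i / Gamma(1 - alpha_i)
    cs = [b / math.gamma(1.0 - a) for a, b in zip(alphas, bs)]
    weights = [c * t ** (-a) for a, c in zip(alphas, cs)]
    return sum(a * w for a, w in zip(alphas, weights)) / sum(weights)


alphas = [1 / 4, 1 / 3, 2 / 3]
bs = [0.1, 0.1, 0.1]
for t in (5e4, 2e5, 1e8):
    print(t, effective_exponent(t, alphas, bs))
```

The printed rate stays above $\alpha_1$ and drifts towards it only slowly as $t$ grows, which is consistent with the lowest power being resolved first while the higher powers contribute progressively less signal.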
A few words are in order about an approach that, from the above discussion, might seem a good or even better alternative. Since each damping term $b_i\partial_t^{\alpha_i}$ contributes a time trace term with large time behaviour $c_i t^{-\alpha_i}$, it is feasible to take $T$ sufficiently large so that $c_i t^{-\alpha_i} \ll c_1 t^{-\alpha_1}$ for $i \ge 2$ and $t > T$. That is, all but the smallest damping power is negligible, and this can then be recovered. In successive steps we subtract this from the data $h$ to get $h_1(t) = h(t) - c_1 t^{-\alpha_1}$, and we now seek to recover the next lowest $\alpha$ power from the large time values of $h_1(t)$ in a range $(\delta T, T)$ for $0 < \delta < 1$. Then these steps can be repeated until there is no discernible signal remaining in the sample interval $(t_{\min}, t_{\max})$. This indeed works well under the right circumstances for recovering two $\alpha$ values but the coefficients $\{b_i\}$ are less well resolved. It also requires a delicate splitting of the time interval and gives a much poorer resolution of the two terms in the case where, say, $\alpha_1 = 0.2$ and $\alpha_2 = 0.25$ than that recovered from a Newton scheme. For the recovery of three damping terms this was in general quite ineffective. Every time an $\alpha_i$ has been recovered, the remaining signal is significantly smaller than the previous, leading to an equally significant drop in effective accuracy. Also, even if we just make a small error in the coefficient $b_i$, the relative error that is caused by this becomes completely dominant for large times. In short, this is an elegant and seemingly constructive approach to showing uniqueness for a finite number of damping terms. However, it has limited
value from a numerical recovery perspective when used under a wide range of parameter values.

9.1.3. Small time measurements. In this case we are simulating measurements taken over a very limited initial time range; in fact we take the measurement interval to be $t \in [0, 0.1)$. The line of attack is to use the known form of $\hat h(s)$ for large values of $s$ and convert the powers of $s$ appearing into powers of $t$ for small times using the Tauberian theorem. In the case of two damping terms this gives
$$
-c^2\lambda h_{\rm small}(t) - h''_{\rm small}(t) = c_{1,1}t^{1-\alpha_1} + c_{1,2}t^{1-\alpha_2} + c_{2,1}t^{3-\alpha_1} + c_{2,2}t^{3-\alpha_2} + c_{2,3}t^{3-2\alpha_1} + c_{2,4}t^{3-\alpha_1-\alpha_2} + c_{2,5}t^{3-2\alpha_2},
$$
where each term $c_{k,\ell}$ is computed in terms of $\{\alpha_i\}$, $\{b_i\}$, and $\lambda$; cf. (9.53). The values of $\{\alpha_i, c_{k,\ell}\}$ are then computed from the data by a Newton scheme and finally converted back to the derived values of $\{b_i\}$. For the results in Figure 9.2 the exact values chosen were $\alpha = \{0.25, 0.2\}$, $b = \{0.1, 0.1\}$ and the starting guesses were $\alpha = \{0.3, 0.16\}$, $b = \{0.08, 0.12\}$. We show the progression of the iteration in Figure 9.2.
[Figure 9.2: two panels showing the $b_i$ values and the $\alpha_i$ values plotted against the iteration number.]

Figure 9.2. Convergence of $\alpha_i$ and $b_i$ in the small time case, where $i = 1$ is shown as bullets, $i = 2$ as circles.
While theory predicts reconstructibility of an arbitrary number of terms in both cases, there is a clear difference in the ability to reconstruct terms between the small time and the large time asymptotics. First of all, the method we described effectively recovers only two terms with small time measurements, as compared to three in the large time measurement case. The $b_i$ coefficients, which are always harder to obtain than the $\alpha_i$ exponents, are recovered much less accurately in the small time than in the large time regime. This is partly explained by the higher degree of ill-posedness due to the necessity of differentiating the data twice.
9.2. Direct and inverse Sturm–Liouville problems

9.2.1. Sturm–Liouville problems. Sturm–Liouville theory is the study of the solutions of the equation
$$
-\frac{d}{dt}\Bigl(p(t)\frac{dy}{dt}\Bigr) + q(t)y = \lambda r(t)y, \qquad t \in (a,b), \tag{9.54}
$$
where $p(t)$, $q(t)$, and $r(t)$ are real-valued continuous functions with $p(t)$ and $r(t)$ strictly positive on the closed interval $[a,b]$ and where $y$ is subject to the boundary conditions
$$
y'(a) - \gamma_a y(a) = 0, \qquad y'(b) + \gamma_b y(b) = 0. \tag{9.55}
$$
The numbers $\gamma_a$, $\gamma_b$ are the impedance constants and the conditions (9.55) are impedance boundary conditions. For mathematical reasons to be explained shortly, we make the assumption that $\gamma_a \ge 0$ and $\gamma_b \ge 0$, and this also corresponds to a natural physical assumption when boundary problems such as (9.54) and (9.55) arise from separation of variables in pdes. Two special cases are of importance. If $\gamma_a = 0$ ($y'(a) = 0$), (9.55) is a Neumann boundary condition; if $\gamma_a = \infty$ ($y(a) = 0$), it is a Dirichlet boundary condition. The case of $0 < \gamma_a < \infty$ (or $0 < \gamma_b < \infty$) allows for the intermediate or impedance situation.

While the most general second order eigenvalue problem is of the form (9.54), we can apply a transformation that allows us to map the function $y$ to $u$, where $u$ satisfies a similar equation but with $p = r = 1$. If we assume that $(pr)''$ and $p'$ are continuous on the interval $[a,b]$, then the equation (9.54) is transformed by the Liouville transformation $y(t) \to u(x)$,
$$
x = \frac1L\int_a^t \sigma(s)\,ds, \qquad u = f(t)\,y, \tag{9.56}
$$
where $L$, $f$, and $\sigma$ are defined by
$$
L = \int_a^b \sigma(s)\,ds, \qquad \sigma(t) = \Bigl(\frac{r(t)}{p(t)}\Bigr)^{1/2}, \qquad f(t) = \bigl(p(t)r(t)\bigr)^{1/4}, \tag{9.57}
$$
into
$$
\mathbb{L}u := \frac{d^2u}{dx^2} - Q(x)u = -\mu^2 u, \qquad Q(x) = \frac{f''(x)}{f(x)} + L^2\frac{q(x)}{r(x)}, \qquad \mu^2 = L^2\lambda, \tag{9.58}
$$
and the new independent variable $x$ varies over the unit interval $0 \le x \le 1$. Note that the mapping from $\{p, q, r\}$ to $Q$ is highly nonlinear as well as requiring two derivatives on $p$, $r$.

It is easily checked that the boundary conditions $y'(a) - \gamma_a y(a) = 0$, $y'(b) + \gamma_b y(b) = 0$, with $\gamma_a, \gamma_b \ge 0$ are mapped by the Liouville transformation into $u'(0) - hu(0) = 0$, $u'(1) + Hu(1) = 0$, $h \ge 0$, $H \ge 0$.
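As a quick sanity check of the Liouville transformation, consider the simplest case of constant coefficients $p = p_0 > 0$, $r = r_0 > 0$, $q = q_0$ (these constants are introduced here purely for illustration). Then $f$ is constant and the change of variables reduces to a rescaling of the interval:

```latex
\sigma = \sqrt{r_0/p_0}, \qquad L = (b-a)\sqrt{r_0/p_0}, \qquad
x = \frac{t-a}{b-a}, \qquad f = (p_0 r_0)^{1/4},
\\[4pt]
Q = L^2\,\frac{q_0}{r_0} = (b-a)^2\,\frac{q_0}{p_0}, \qquad
\mu^2 = L^2\lambda = (b-a)^2\,\frac{r_0}{p_0}\,\lambda .
```

The same result follows by direct substitution: putting $t = a + (b-a)x$ into $-p_0 y'' + q_0 y = \lambda r_0 y$ and dividing by $p_0/(b-a)^2$ gives $-u'' + (b-a)^2 (q_0/p_0)\,u = (b-a)^2 (r_0/p_0)\,\lambda\,u$ on $(0,1)$.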
The point is, if we assume that the coefficients possess the regularity required above, then we can, without loss of generality, work with the equation $-u''(x) + Q(x)u(x) = \mu^2 u$. We shall most often write $q(x)$ instead of $Q(x)$, use $\lambda$ in place of $\mu^2$, and call this the canonical form of the Sturm–Liouville equation
$$
-u''(x) + q(x)u(x) = \lambda u, \quad x \in (0,1), \qquad u'(0) - hu(0) = 0, \qquad u'(1) + Hu(1) = 0. \tag{9.59}
$$
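For orientation, the canonical problem can be solved in closed form when $q = 0$ and $h, H < \infty$. The following short derivation is a worked illustration (not taken from the text) for eigenvalues $\lambda = \mu^2 > 0$, with the solution normalised by $u(0) = 1$, $u'(0) = h$:

```latex
u(x) = \cos\mu x + \frac{h}{\mu}\sin\mu x ,
\\[4pt]
u'(1) + H u(1) = (h + H)\cos\mu - \Bigl(\mu - \frac{hH}{\mu}\Bigr)\sin\mu = 0
\quad\Longleftrightarrow\quad
\tan\mu = \frac{\mu\,(h+H)}{\mu^2 - hH}.
```

For $h = H = 0$ this reduces to $\sin\mu = 0$, that is $\mu = n\pi$, and for large $\mu$ the right-hand side behaves like $(h+H)/\mu$, so the roots lie close to $n\pi + (h+H)/(n\pi)$.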
We must make some assumptions on the function q. If the potential is continuous on [0, 1], then it can be shown using Green’s function representation [330, 358] that the solutions u to the boundary value problem have a continuous second derivative. However, for many applications we cannot assume continuity or even boundedness of the potential. The assumption q ∈ L2 [0, 1] is actually sufficient for the results we are about to present.
9.2.1.1. Properties of the eigenvalues and eigenfunctions. The operator $-\mathbb{L} = -\frac{d^2}{dx^2} + q(x)$ with impedance boundary conditions as defined in (9.59) for $h \ge 0$, $H \ge 0$ (which we assume to hold throughout the remainder of this chapter) has an associated Green’s function $G(x,t)$ so that if $-\mathbb{L}[u] = f$ and $u$ satisfies impedance boundary conditions, then we can write
$$
u(x) = \bigl((-\mathbb{L})^{-1}f\bigr)(x) = \int_0^1 G(x,t)f(t)\,dt, \tag{9.60}
$$
where $G$ is continuous on $[0,1]\times[0,1]$. Thus the operator $(-\mathbb{L})^{-1}$ is compact on the space $L^2(0,1)$ (or $C[0,1]$). Hence the eigenvalues of $(-\mathbb{L})^{-1}$, and therefore also of $\mathbb{L}$, form a countable set; in particular, since those of $(-\mathbb{L})^{-1}$ accumulate at zero, those of the inverse $-\mathbb{L}$ must grow unboundedly. In fact we will shortly compute the exact rate. We will now state and prove some properties of the eigenvalues and associated eigenfunctions, that is, nontrivial solutions to (9.59).

Theorem 9.7. The operator $-\mathbb{L}$ equipped with impedance boundary conditions is self-adjoint on $L^2(0,1)$. The eigenvalues are real and the eigenfunctions corresponding to different eigenvalues are orthogonal.
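A small numerical sketch may help to see why the eigenvalues of the inverse operator accumulate at zero. It uses the Dirichlet case with $q = 0$ (an assumption made here so that everything is explicit), for which the Green’s function of $-u'' = f$, $u(0) = u(1) = 0$ is $G(x,t) = x(1-t)$ for $x \le t$ and $t(1-x)$ for $t \le x$. The helper names are hypothetical.

```python
import math


def green_dirichlet(x, t):
    """Green's function of -u'' = f with u(0) = u(1) = 0 (the case q = 0)."""
    return x * (1.0 - t) if x <= t else t * (1.0 - x)


def apply_inverse(f, x, n=2000):
    """Midpoint-rule approximation of the integral of G(x, t) f(t) over (0, 1)."""
    h = 1.0 / n
    return h * sum(green_dirichlet(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n))


x = 0.3
for k in (1, 2, 3):
    f = lambda t, k=k: math.sin(k * math.pi * t)
    # ratio approximates the eigenvalue 1 / (k pi)^2 of the inverse operator
    print(k, apply_inverse(f, x) / f(x), 1.0 / (k * math.pi) ** 2)
```

Applying the integral operator to $\sin k\pi t$ reproduces the same function scaled by $1/(k\pi)^2$, so the eigenvalues of the inverse tend to zero while those of the differential operator grow like $k^2\pi^2$.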
Proof. This is easily seen using integration by parts on two functions $u$, $v$ that lie in $H^2(0,1)$ and satisfy the impedance boundary conditions (9.59):
$$
\begin{aligned}
\langle -\mathbb{L}[u], v\rangle &= \int_0^1 (-u'' + qu)\,v\,dx \\
&= u'(0)v(0) - u'(1)v(1) + \int_0^1 (u'v' + quv)\,dx \\
&= u'(0)v(0) - u'(1)v(1) + u(1)v'(1) - u(0)v'(0) + \int_0^1 u\,(qv - v'')\,dx \\
&= hu(0)v(0) + Hu(1)v(1) - Hu(1)v(1) - hu(0)v(0) + \int_0^1 u\,(qv - v'')\,dx \\
&= -\langle u, \mathbb{L}[v]\rangle .
\end{aligned}
$$
That the eigenvalues must be real and the eigenfunctions orthogonal can very quickly be seen: if $\lambda_k$ and $\phi_k$ are the $k$th eigenvalues and eigenfunctions, respectively, then we can integrate the identity $-\phi_n''\phi_m + \phi_m''\phi_n = (\lambda_n - \lambda_m)\phi_n\phi_m$ to obtain $(\lambda_n - \lambda_m)\int_0^1 \phi_n\phi_m\,dx = 0$, from which the result follows.

Theorem 9.8. If $q(x), h, H \ge 0$, then the eigenvalues are nonnegative.

To see this, multiplying the differential equation by $\phi_n$ and integrating between $x = 0$ and $x = 1$, we obtain
$$
-\int_0^1 \phi_n''(x)\phi_n(x)\,dx + \int_0^1 q(x)\phi_n^2(x)\,dx = \lambda_n\int_0^1 \phi_n^2(x)\,dx .
$$
Now integrating by parts and rearranging terms, we obtain the assertion
$$
\lambda_n = \frac{H\phi_n^2(1) + h\phi_n^2(0) + \int_0^1 \bigl[(\phi_n'(x))^2 + q(x)\phi_n^2(x)\bigr]\,dx}{\int_0^1 \phi_n^2(x)\,dx}. \tag{9.61}
$$

Theorem 9.9. The eigenvalues of $\mathbb{L}$ are simple.

Proof. Suppose that $u$ and $v$ are eigenfunctions of $-\mathbb{L}$ corresponding to the same eigenvalue $\lambda$. Then simplifying the expression $v\mathbb{L}[u] - u\mathbb{L}[v] = 0$ gives $u''v - uv'' = 0$. Thus $(u'v - uv')' = 0$ and so $u'v - uv' = C$ for some constant $C$. Now evaluating this expression at $x = 0$ shows that $C = 0$, and so we have $(u/v)' = 0$. Thus $u = cv$ for some constant $c$, showing that the two functions $u$ and $v$ are linearly dependent.

Some remarks are in order.
First, not only the form of the equation but also the boundary conditions imposed are important for the operator $\mathbb{L}$ to satisfy the above theorems. For example, if instead we take the periodic boundary conditions $u(0) = u(1)$, $u'(0) = u'(1)$ (with $q = 0$), then the lowest eigenvalue is indeed simple ($\lambda = 0$ with eigenfunction $\phi(x) = 1$), but all others have multiplicity 2: $\lambda_n = 4n^2\pi^2$ with eigenfunctions $\{\cos 2n\pi x, \sin 2n\pi x\}$. In addition, Theorem 9.9 is no longer true in higher space dimensions for the operator $-\Delta + q$ even with $q = 0$. To see this, note that with Dirichlet boundary conditions on the unit square of the equation $-(u_{xx} + u_{yy}) = \lambda u$, both $\psi(x,y) = \sin(7\pi x)\sin(\pi y)$ and $\tilde\psi(x,y) = \sin(5\pi x)\sin(5\pi y)$ are eigenfunctions corresponding to the eigenvalue $\lambda = 50\pi^2$.

Second, we must be very careful in our assumptions on the functions $p$, $q$, and $r$. From the Liouville transform, we see that there are difficulties if for example $p$ vanishes somewhere in the interval $[0,1]$; the equation becomes singular and the above analysis may not hold. The following example shows what might happen. Consider the eigenvalue problem
$$
-(x^2u')' = \lambda u \quad \text{for } 0 < x < 1, \qquad u(1) = 0, \qquad u \text{ bounded}. \tag{9.62}
$$
It is easily verified that Green’s function is $G(x,s) = 1/x$ if $s \le x$ and $G(x,s) = 1/s$ if $x \le s$. Note that $\int_0^1\!\int_0^1 G^2(x,s)\,ds\,dx = \infty$, so the inverse operator $(-\mathbb{L})^{-1}$ is not even bounded from $L^2(0,1)$ into itself, and we cannot apply spectral theorems. The solutions that vanish at $x = 1$ are $u(x) = x^{-1/2}\sin(\beta\log x)$ with $\beta = \sqrt{\lambda - \tfrac14}$. However, these are not in $L^2(0,1)$ and so cannot be eigenfunctions. Indeed, this particular operator has no eigenvalues or eigenfunctions.
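Returning to the first remark, the loss of simplicity on the unit square is easy to explore by enumeration. The sketch below (with a hypothetical helper `dirichlet_square_modes`) groups the Dirichlet modes $\sin(m\pi x)\sin(n\pi y)$ of $-\Delta$ by the integer $m^2 + n^2 = \lambda/\pi^2$.

```python
from collections import defaultdict


def dirichlet_square_modes(max_index):
    """Group modes sin(m pi x) sin(n pi y) by lambda / pi^2 = m^2 + n^2."""
    groups = defaultdict(list)
    for m in range(1, max_index + 1):
        for n in range(1, max_index + 1):
            groups[m * m + n * n].append((m, n))
    return groups


modes = dirichlet_square_modes(10)
print(modes[50])  # [(1, 7), (5, 5), (7, 1)]
print(sorted(s for s, pairs in modes.items() if len(pairs) == 1)[:5])
```

The value $m^2 + n^2 = 50$ is attained by three index pairs, so $\lambda = 50\pi^2$ has a three-dimensional eigenspace, whereas only the diagonal modes with $m = n$ can be simple.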
We can view the eigenvalues and eigenvectors of $\mathbb{L}$ from a slightly different angle. Suppose for the moment that $h = H = \infty$. Let $H^2_{\rm Dir}(0,1) = H^2(0,1)\cap H^1_0(0,1)$ be the space of those functions in $H^2(0,1)$ satisfying the boundary conditions $u(0) = u(1) = 0$. If we let $\varphi$ be any function in $H^2_{\rm Dir}(0,1)$, then we can write this in terms of a Fourier series in the eigenfunctions $\phi_n$:
$$
\varphi(x) = \sum_{n=1}^\infty c_n\phi_n(x) = \sum_{n=1}^\infty \Bigl(\int_0^1 \phi_n^2\,dt\Bigr)^{-1}\langle\varphi, \phi_n\rangle\,\phi_n(x).
$$
If we repeat this for the function $-\varphi'' + q\varphi$, then from
$$
\int_0^1 [-\varphi'' + q\varphi]\phi_n\,dt = \bigl[\varphi\phi_n' - \phi_n\varphi'\bigr]_0^1 + \int_0^1 [-\phi_n'' + q\phi_n]\varphi\,dt = \lambda_n\int_0^1 \phi_n\varphi\,dt, \tag{9.63}
$$
where we have used the boundary conditions to remove the terms at $x = 0, 1$, we obtain $-\varphi'' + q\varphi = \sum_{n=1}^\infty \lambda_n c_n\phi_n$. Since the eigenfunctions are complete,
Parseval’s equality now gives
$$
\int_0^1 \varphi^2\,dt = \sum_{n=1}^\infty c_n^2\int_0^1 \phi_n^2\,dt, \qquad
\int_0^1 [\varphi'^2 + q\varphi^2]\,dt = \sum_{n=1}^\infty \lambda_n c_n^2\int_0^1 \phi_n^2\,dt .
$$
Since $\lambda_1 \le \lambda_2 \le \cdots$ this shows that $\int_0^1 [\varphi'^2 + q\varphi^2]\,dt \ge \lambda_1\int_0^1 \varphi^2\,dt$ with equality if and only if $c_2 = c_3 = \cdots = 0$; that is, if and only if $\varphi = c_1\phi_1$, that is, $\varphi$ itself is an eigenfunction of $\mathbb{L}$. If we consider the term
$$
R_0(\varphi) = \frac{\int_0^1 [\varphi'^2 + q\varphi^2]\,dt}{\int_0^1 \varphi^2\,dt}, \tag{9.64}
$$
then amongst all functions $\varphi \in H^2_{\rm Dir}(0,1)$ this quotient has minimum value equal to $\lambda_1$. Suppose a minimum $\lambda$ is obtained for a function $\phi \in H^2_{\rm Dir}(0,1)$. Then for any function $\varphi \in H^2_{\rm Dir}(0,1)$ and any constant $s$,
$$
R_0(\phi + s\varphi) = \frac{\int_0^1 [(\phi' + s\varphi')^2 + q(\phi + s\varphi)^2]\,dt}{\int_0^1 (\phi + s\varphi)^2\,dt} \ge \frac{\int_0^1 [\phi'^2 + q\phi^2]\,dt}{\int_0^1 \phi^2\,dt} = R_0(\phi) = \lambda .
$$
The left-hand side is a continuously differentiable function of the parameter $s$ which attains its minimum at $s = 0$, and so its derivative with respect to $s$ must vanish at $s = 0$:
$$
\frac{2\int_0^1 [\phi'\varphi' + q\phi\varphi]\,dt}{\int_0^1 \phi^2\,dt} - \frac{2\int_0^1 \phi\varphi\,dt\,\int_0^1 [\phi'^2 + q\phi^2]\,dt}{\bigl(\int_0^1 \phi^2\,dt\bigr)^2} = 0,
$$
which simplifies to $\int_0^1 [\phi'\varphi' + q\phi\varphi - \lambda\phi\varphi]\,dt = 0$. Integration by parts then gives $\int_0^1 \varphi\,[-\phi'' + q\phi - \lambda\phi]\,dt = 0$. Since this is true for every admissible function $\varphi$, it then follows that $\phi$ must satisfy the equations
$$
-\phi''(x) + q(x)\phi(x) = \lambda\phi(x), \qquad \phi(0) = 0, \qquad \phi(1) = 0,
$$
and so $\lambda$ must be an eigenvalue and $\phi$ the corresponding eigenfunction. If indeed $\lambda$ is the minimum, then it must be equal to $\lambda_1$. Thus we can characterise the lowest eigenvalue by the minimisation of the Rayleigh quotient
$$
\lambda_1 = \min_{\varphi \in H^2_{\rm Dir}(0,1)} \frac{\int_0^1 [\varphi'^2 + q\varphi^2]\,dt}{\int_0^1 \varphi^2\,dt}, \tag{9.65}
$$
where the minimising function is a multiple of the first eigenfunction $\phi_1$. We have not shown that the minimum actually exists, and this requires more detailed analysis which can be found in standard texts such as [70].

What about other eigenvalues and eigenfunctions? We simply have to modify our minimisation of the Rayleigh quotient to be taken over functions that exclude the first eigenfunction, and we achieve this by using functions in the orthogonal complement of the space spanned by $\phi_1$, that is, functions
satisfying $\langle\varphi, \phi_1\rangle = \int_0^1 \varphi\phi_1\,dt = 0$. In this case we again characterise $\lambda$ as an eigenvalue and the minimising function as an eigenvector. By design this eigenvector must be orthogonal to $\phi_1$, and so $\lambda$ must now equal $\lambda_2$ and the minimiser must be a multiple of the second eigenfunction $\phi_2$. Indeed we can continue in this manner at each stage adding a further constraint of orthogonality with the previously determined eigenfunction to obtain
$$
\lambda_n = \min_{\substack{\varphi \in H^2_{\rm Dir}(0,1) \\ \langle\varphi,\phi_1\rangle = 0 \\ \cdots \\ \langle\varphi,\phi_{n-1}\rangle = 0}} \frac{\int_0^1 [\varphi'^2 + q\varphi^2]\,dt}{\int_0^1 \varphi^2\,dt}, \tag{9.66}
$$
where the minimising function is a multiple of $\phi_n$. Extending this characterisation of the eigenvalues to the setting of general impedance boundary conditions, we can easily show

Theorem 9.10. Increasing $q$, $h$, or $H$ increases all the eigenvalues of $\mathbb{L}$.

The first eigenfunction $\phi_1$ is at least piecewise continuously differentiable, and since $|\phi_1|^2 = \phi_1^2$ and $(|\phi_1|')^2 = (\phi_1')^2$ almost everywhere, the function $|\phi_1|$ also minimises the Rayleigh quotient, and so it must be a multiple of the eigenfunction $\phi_1$. This means that either $\phi_1 = |\phi_1|$ or $\phi_1 = -|\phi_1|$, and in either case this means that $\phi_1$ cannot change sign in $(0,1)$. Thus if $\phi_1(c) = 0$ for $0 < c < 1$, then $\phi_1'(c)$ must also be zero. This would mean that $\phi_1$, being the solution of a homogeneous second order linear differential equation with zero Cauchy data at $c$, would vanish identically in $(0,1)$, and this is impossible for an eigenfunction. Thus, without loss of generality fixing the sign to be positive, we have $\phi_1(x) > 0$ in $(0,1)$.

Now since all other eigenfunctions are orthogonal to $\phi_1$, they must have at least one zero in $(0,1)$. In fact, if $\{\lambda_n, \phi_n\}$ and $\{\lambda_m, \phi_m\}$ are eigenvalue-eigenfunction pairs with $n > m$, then multiplying equation (9.59) with $\lambda = \lambda_n$ by the function $\phi_m$, equation (9.59) with $\lambda = \lambda_m$ by the function $\phi_n$, and subtracting gives $(\lambda_n - \lambda_m)\phi_n\phi_m = \phi_n\phi_m'' - \phi_m\phi_n''$. Suppose now that $\phi_m$ has consecutive zeros at the points $c$ and $d$ with $0 \le c < d \le 1$, and without loss of generality we take $\phi_m > 0$ on $(c,d)$. This means that $\phi_m'(c) > 0$ and $\phi_m'(d) < 0$. Then integrating the previous expression over $(c,d)$ and using integration by parts gives
$$
(\lambda_n - \lambda_m)\int_c^d \phi_n\phi_m\,dt = \bigl(\phi_n\phi_m' - \phi_m\phi_n'\bigr)\Big|_c^d = \phi_m'(d)\phi_n(d) - \phi_m'(c)\phi_n(c). \tag{9.67}
$$
If now $\phi_n$ does not vanish in $(c,d)$ (say again $\phi_n > 0$ there), then the left-hand side of (9.67) is positive while the right-hand side is negative. Thus
$\phi_n$ must have a zero in $(c,d)$, and so between every pair of zeros of $\phi_m$ any eigenfunction with a higher index must vanish at least once.

Theorem 9.11 (Separation theorem). Let $u, \lambda$ and $\tilde u, \tilde\lambda$ be solutions of (9.59) with $\lambda < \tilde\lambda$. Then if $\alpha$ and $\beta$ are consecutive zeros of $u$, there must be at least one zero of $\tilde u$ in $(\alpha, \beta)$.

In short, this means that for each $n$ the eigenfunction $\phi_{n+1}$ must have at least one more zero on $(0,1)$ than $\phi_n$. One can also show that it must have exactly one more zero.

Theorem 9.12 (Oscillation theorem). The $n$th eigenfunction of $\mathbb{L}$ has exactly $n-1$ zeros in the open interval $(0,1)$.

We finally return to the general setting and provide a monotonicity result in terms of several parameters appearing in (9.54) and (9.55).

Theorem 9.13. Reducing the size of the interval $(a,b)$, increasing $p$ or $q$ or decreasing $r$, increasing $\gamma_a$ or $\gamma_b$, increases all the eigenvalues of the problem (9.54) and (9.55).

9.2.1.2. Asymptotic expansion of the eigenvalues and eigenfunctions. For convenience and knowing that $\lambda \ge 0$, we will rewrite the canonical form of the equation (9.59) as
$$
u_{xx} + \mu^2 u = q(x)u \tag{9.68}
$$
and assume that the boundary conditions are
$$
u'(0) - hu(0) = 0, \qquad u'(1) + Hu(1) = 0, \tag{9.69}
$$
where we will treat separately the cases $h = \infty$ and $H = \infty$, this terminology being simply a convention to denote the Dirichlet boundary conditions $u(0) = 0$ and $u(1) = 0$. The techniques described in this section follow closely those outlined in the book by Yosida [358].

We first consider the situation where $h < \infty$ and $H \le \infty$. Note that we can scale the function $u$ by any constant factor, and it will still satisfy (9.68) and (9.69). Thus we can normalise the solution by a particular choice of the left-hand boundary conditions:
$$
[u(0) = 1,\ u'(0) = h] \ \text{ if } h < \infty \qquad\text{and}\qquad [u(0) = 0,\ u'(0) = 1] \ \text{ if } h = \infty . \tag{9.70}
$$
For a given continuous function $v(x)$, it is easily seen by direct substitution that the equation $u'' + \mu^2 u = v$ has a particular solution $u(x) = \frac1\mu\int_0^x [\sin\mu(x-t)]\,v(t)\,dt$. Hence the general solution of (9.68) can be written
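in the form $u(x) = A\cos\mu x + B\sin\mu x + \frac1\mu\int_0^x [\sin\mu(x-t)]\,q(t)u(t)\,dt$, where $A$, $B$ are integration constants. Now using the initial conditions gives
$$
u(x) = \sigma_0\cos\mu x + \frac{\sigma_h}{\mu}\sin\mu x + \frac1\mu\int_0^x \sin\mu(x-t)\,q(t)u(t)\,dt \tag{9.71}
$$
with
$$
\sigma_0 = \begin{cases} 1 & \text{if } h < \infty, \\ 0 & \text{if } h = \infty, \end{cases} \qquad
\sigma_h = \begin{cases} h & \text{if } h < \infty, \\ 1 & \text{if } h = \infty. \end{cases}
$$
Since $|\cos(\mu x) + \frac h\mu\sin\mu x| \le \bigl(1 + \frac{h^2}{\mu^2}\bigr)^{1/2}$, it follows from Gronwall’s inequality and equation (9.71) that $u$ is bounded on the interval $[0,1]$,
$$
|u(x)| \le \Bigl(1 + \frac{h^2}{\mu^2}\Bigr)^{1/2} e^{\frac1\mu\int_0^1 |q(t)|\,dt}. \tag{9.72}
$$
Equation (9.71) can be differentiated to give
$$
u'(x) = -\sigma_0\mu\sin\mu x + \sigma_h\cos\mu x + \int_0^x [\cos\mu(x-t)]\,q(t)u(t)\,dt, \tag{9.73}
$$
and the above two equations together with trigonometric addition theorems yield the expressions
$$
\begin{aligned}
u(1) &= \sigma_0\cos\mu + \frac{\sigma_h}{\mu}\sin\mu + \frac{\sin\mu}{\mu}\int_0^1 \cos(\mu t)q(t)u(t)\,dt - \frac{\cos\mu}{\mu}\int_0^1 \sin(\mu t)q(t)u(t)\,dt \\
&= \cos\mu\,\Bigl[\sigma_0 - \frac1\mu\int_0^1 \sin(\mu t)q(t)u(t)\,dt\Bigr] + \frac{\sin\mu}{\mu}\Bigl[\sigma_h + \int_0^1 \cos(\mu t)q(t)u(t)\,dt\Bigr]
\end{aligned}
$$
and
$$
\begin{aligned}
u'(1) &= \sigma_h\cos\mu - \sigma_0\mu\sin\mu + \cos\mu\int_0^1 \cos(\mu t)q(t)u(t)\,dt + \sin\mu\int_0^1 \sin(\mu t)q(t)u(t)\,dt \\
&= \cos\mu\,\Bigl[\sigma_h + \int_0^1 \cos(\mu t)q(t)u(t)\,dt\Bigr] + \sin\mu\,\Bigl[-\sigma_0\mu + \int_0^1 \sin(\mu t)q(t)u(t)\,dt\Bigr].
\end{aligned}
$$
We will treat the somewhat simpler Dirichlet case $h = H = \infty$ and leave the other case to the reader. Then the endpoint condition $u(1) = 0$ must be used to determine the eigenvalues and eigenfunctions, and this gives
$$
\sin\mu + \sin\mu\int_0^1 \cos(\mu t)q(t)u(t)\,dt - \cos\mu\int_0^1 \sin(\mu t)q(t)u(t)\,dt = 0
$$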
from which it follows that
$$
\tan\mu = \frac{\int_0^1 \sin(\mu t)q(t)u(t)\,dt}{1 + \int_0^1 \cos(\mu t)q(t)u(t)\,dt}. \tag{9.74}
$$
Now $q \in L^2(0,1)$ and $u$ is bounded there, so both the integrals in (9.74) tend to zero as $\mu \to \infty$ and so $\lim_{\mu\to\infty}\tan\mu = 0$, which implies that $\mu_n = n\pi + \alpha_n$, $\lim_{n\to\infty}\alpha_n = 0$, $n = 1, 2, \ldots$. Since $\tan(n\pi + \alpha_n) = \tan\alpha_n$, then for $n$ sufficiently large, $\tan\mu_n = \tan\alpha_n \approx \alpha_n$, and so
$$
\mu_n = n\pi + \alpha_n, \qquad \alpha_n \approx \frac{\int_0^1 \sin(\mu_n t)q(t)u(t)\,dt}{1 + \int_0^1 \cos(\mu_n t)q(t)u(t)\,dt}. \tag{9.75}
$$
Now if we use (9.71) to replace $u$ inside the integral in the above, then
$$
\begin{aligned}
\mu_n &= n\pi + \frac{1}{2\mu_n}\int_0^1 \bigl(1 - \cos(2\mu_n t)\bigr)q(t)\,dt\,\Bigl[1 - \frac{1}{2\mu_n}\int_0^1 \sin(2\mu_n t)q(t)\,dt\Bigr] + o(n^{-2}) \\
&= n\pi + \frac{1}{2n\pi}\int_0^1 q(t)\,dt - \frac{1}{2n\pi}\int_0^1 \cos(2n\pi t)q(t)\,dt + o(n^{-2}).
\end{aligned} \tag{9.76}
$$
Notice that the third term $\eta_n = \frac{1}{2n\pi}\int_0^1 \cos(2n\pi t)q(t)\,dt$, whose significance will be apparent later, decreases to zero faster than $1/n$, and the rate at which it does depends on $q$, specifically on the cosine Fourier coefficients of this function. In particular, the sequence $\{\eta_n\}$ lies in $\ell^2$, but we cannot say more about its rate of decay unless we know more about $q$. If $q' \in L^2(0,1)$, then we can integrate both integral terms in (9.76) by parts and use the Dirichlet boundary conditions to improve the estimate to
$$
\mu_n = n\pi + \frac{1}{2n\pi}\int_0^1 q(t)\,dt + o\bigl(n^{-3}\bigr). \tag{9.77}
$$
We can continue this process: by assuming more smoothness on the potential function $q(x)$, we obtain a more rapid decay of the remainder term in (9.77). Since $\lambda_n = \mu_n^2$, (9.76) becomes
$$
\lambda_n = n^2\pi^2 + \int_0^1 q(t)\,dt + e_n, \qquad \{e_n\} \in \ell^2, \tag{9.78}
$$
with a more rapid decay of the sequence $\{e_n\}$ if more regularity is imposed on $q$. This is illustrated in Figure 9.3 which is taken directly from [300]. For the smooth (infinitely differentiable) function $q_1$ vanishing at both $x = 0, 1$, the decay of the corresponding eigenvalues is evident; other than the value of the mean $\bar q = \int_0^1 q(t)\,dt$, there is only information of order $10^{-3}$ about $q(x)$ in $\{\lambda_n\}$ for $n \ge 4$. The decay in the sequence $\{e_n\}$ is much less rapid for the function $q_2$ which has a discontinuous first derivative but lies in $H^1(0,1)$ and correspondingly less so for the discontinuous function $q_3$. This feature will have important consequences from an inverse problems perspective, namely the recovery of $q$ from spectral sequences.

[Figure 9.3: left column, the potentials $q_1$, $q_2$, $q_3$ plotted as $q(x)$ on $[0,1]$; right column, the corresponding values $c_n = \lambda_n - n^2\pi^2 - \int_0^1 q$ for $4 \le n \le 30$.]

Figure 9.3. Eigenvalue asymptotics for various potentials

Returning now to the eigenfunctions $\phi_n$, we have from (9.71) that $\phi_n$ satisfies a Volterra integral equation of the second kind with kernel depending on the function $q$. This can be solved by successive approximations and so if we use the first two terms, again applying trigonometric addition theorems, we obtain
$$
\begin{aligned}
\phi_n(x) &\approx \frac{1}{\mu_n}\sin\mu_n x + \frac{1}{\mu_n^2}\int_0^x \sin\mu_n(x-t)\,\sin(\mu_n t)\,q(t)\,dt \\
&= \frac{1}{\mu_n}\sin\mu_n x + \frac{1}{2\mu_n^2}\Bigl\{\sin(\mu_n x)\int_0^x \sin(2\mu_n t)\,q(t)\,dt - \cos(\mu_n x)\int_0^x \bigl(1 - \cos(2\mu_n t)\bigr)q(t)\,dt\Bigr\}.
\end{aligned}
$$
Using the asymptotic estimate for $\{\mu_n\}$ gives
$$
\phi_n(x) = \frac{1}{n\pi}\sin n\pi x + O\Bigl(\frac{1}{n^2}\Bigr), \tag{9.79}
$$
again with increasing decay of the error term with increasing regularity on the function $q$. Note that we have chosen to normalise the eigenfunctions by imposing a condition at the left-hand endpoint, (9.70), and a natural question is what this implies for the $L^2$ norm of $\phi_n$ since these are in some sense interchangeable. From (9.79) we easily compute
$$
\rho_n := \|\phi_n\|^2 = \frac{1}{2n^2\pi^2} + O\Bigl(\frac{1}{n^3}\Bigr). \tag{9.80}
$$
In fact, if we write $\varphi_n(x) = \phi_n(x)/\|\phi_n\|_{L^2}$, then we have $\varphi_n(x) = \sqrt2\sin n\pi x + \alpha_n$, where $\{\alpha_n\} \in \ell^2$ since $q \in L^2$, and this leads to
$$
\sum_{n=1}^\infty \bigl\|\varphi_n(x) - \sqrt2\sin n\pi x\bigr\|^2 < \infty .
$$
It follows from the Hilbert-Schmidt theorem on self-adjoint, compact operators on a Hilbert space that

Theorem 9.14. The eigenfunctions of $\mathbb{L}$ with impedance boundary conditions according to (9.59) form a complete family in $L^2(0,1)$.

Alternatively, the following lemma coupled with our asymptotic values of the eigenvalues/eigenfunctions can be used to give a more direct proof; see [29].

Lemma 9.7. Let $\{u_n(x)\}_{n=1}^\infty$ be a complete orthonormal sequence in a Hilbert space, and let $\{\psi_n(x)\}_{n=1}^\infty$ be an orthonormal sequence such that
$$
\sum_{n=1}^\infty \|u_n - \psi_n\|_{L^2}^2 < \infty .
$$
Then $\{\psi_n(x)\}_{n=1}^\infty$ is also complete.

Since we know the sequence $\{\sqrt2\sin n\pi x\}$ is orthonormal and complete in $L^2(0,1)$, then from the above lemma the same is true for the sequence $\{\varphi_n\}$. This provides an independent proof of the completeness of the eigenfunctions of the Sturm–Liouville problem, at least in the Dirichlet case. However, we also obtain a similar estimate for the case with $h < \infty$. Indeed, from (9.71) $\phi_n(x) = \cos n\pi x + \frac{h}{n\pi}\sin n\pi x + O(n^{-2})$, and so we obtain the asymptotic estimate
$$
\rho_n := \|\phi_n\|^2 = \frac12 + O\Bigl(\frac{1}{n^2}\Bigr). \tag{9.81}
$$
Of course if we incorporated more information about $q$, then a more precise form of the asymptotic error could be obtained, but (9.81) is sufficient to prove that the eigenfunctions are complete.
For the case of more general boundary conditions, an almost identical analysis shows that
$$
\mu_n = n\pi + \frac{h + H + \frac12\int_0^1 q(t)\,dt}{n\pi} + O\bigl(n^{-2}\bigr). \tag{9.82}
$$
From this it follows that
$$
\phi_n(x) = \cos n\pi x + \frac{E}{n\pi}\sin n\pi x + O\bigl(n^{-2}\bigr), \quad\text{with}\quad
E = h + \frac12\int_0^x q(t)\,dt - x\Bigl(h + H + \frac12\int_0^1 q(t)\,dt\Bigr). \tag{9.83}
$$

9.2.1.3. Estimates on the eigenvalues and eigenfunctions. We will need some estimates on the eigenvalues and eigenfunctions. According to the classical Sturm–Liouville theory [57, 279], we have the following useful estimates for the eigenvalues $\{\lambda_n(q)\}$ and eigenfunctions $\{\varphi_n(x; q)\}$. These estimates depend on the smoothness of the potential as is to be expected, and as this is frequently a consideration, we give estimates for the spaces $L^2$, $L^\infty$ and the Sobolev spaces $H^m$ for $m \ge 1$. To keep an already complex topic as simple as possible, we do this only for DD eigenvalues (Dirichlet conditions at both ends). We denote the corresponding spectrum by $\{\lambda_n\}$ and assume the interval is $[0,1]$. A simple scaling can be done in the case of the interval $[a,b]$. These lemmas are taken from a variety of sources including [57, 279, 309]. The first of these is in part a recap of what we have derived above.

Lemma 9.8. If $q \in L^2(0,1)$, then
$$
\lambda_n = n^2\pi^2 + \int_0^1 q(s)\,ds + \eta_n, \qquad \{\eta_n\} \in \ell^2 .
$$
If $q \in L^\infty(0,1)$, then
$$
\lambda_n = n^2\pi^2 + \int_0^1 q(s)\,ds + O\Bigl(\frac1n\Bigr).
$$
If q ∈ H m (0, 1) for a positive integer m, then a2 a1 ηn + 3 + · · · + 2m/2+1 , λn = nπ + {ηn } ∈ 2 . n n n Remark 9.1. Lemma 9.8 shows that if q exhibits additional smoothness, then the remainder term decays at a guaranteed faster rate than if q were just in L2 . This pattern continues with increasing regularity and shows that for smooth functions the error after the leading terms (n2 π 2 + q¯) decays more rapidly. Refer to Figure 9.3 for three illustrative examples of potential functions q(x) with varying levels of regularity and the corresponding decay of the error terms. This shows quite clearly that one needs far more spectral data to recover a rough function than a smooth one.
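The leading-order behaviour $\lambda_n \approx n^2\pi^2 + \int_0^1 q$ is easy to observe numerically. The following is a minimal shooting sketch for the Dirichlet problem, not the method used for Figure 9.3: it integrates $u'' = (q - \lambda)u$, $u(0) = 0$, $u'(0) = 1$ with a classical Runge–Kutta scheme and bisects on $u(1)$. The function names, the step counts, and the test potential $q(x) = x$ are assumptions for illustration; the bracket relies on the monotonicity of the eigenvalues in $q$, so that $n^2\pi^2 + \min q \le \lambda_n \le n^2\pi^2 + \max q$.

```python
import math


def shoot(lam, q, n=400):
    """RK4 solution of u'' = (q(x) - lam) u, u(0) = 0, u'(0) = 1; returns u(1)."""
    h = 1.0 / n
    x, u, v = 0.0, 0.0, 1.0
    for _ in range(n):
        k1u, k1v = v, (q(x) - lam) * u
        k2u, k2v = v + 0.5 * h * k1v, (q(x + 0.5 * h) - lam) * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, (q(x + 0.5 * h) - lam) * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, (q(x + h) - lam) * (u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        x += h
    return u


def dirichlet_eigenvalue(n, q, q_min, q_max):
    """Bisection for the n-th Dirichlet eigenvalue, given bounds q_min <= q <= q_max."""
    lo = (n * math.pi) ** 2 + q_min - 0.5
    hi = (n * math.pi) ** 2 + q_max + 0.5
    f_lo = shoot(lo, q)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, q)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


q = lambda x: x  # hypothetical test potential with mean 1/2
for n in range(1, 5):
    lam = dirichlet_eigenvalue(n, q, 0.0, 1.0)
    print(n, lam, lam - (n * math.pi) ** 2 - 0.5)
```

The last printed column is the remainder after subtracting $n^2\pi^2$ and the mean of $q$; for this smooth potential it is already small for the first few eigenvalues.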
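Lemma 9.9. For any $q \in M_\omega$, the set of nonnegative functions in $L^\infty$ with essential bound $\omega$, the eigenvalues $\{\lambda_n(q)\}$ and eigenfunctions $\{\varphi_n(x; q)\}$ to the Sturm–Liouville problem (9.59) with $h = H = \infty$ are entire functions of $q \in L^2$, and they satisfy the estimates
$$
\begin{gathered}
\lambda_n(q) \ge n^2\pi^2, \qquad |\lambda_n(q_1) - \lambda_n(q_2)| \le C\,\|q_1 - q_2\|_{L^2}, \\
\|\varphi_n(q_1) - \varphi_n(q_2)\|_{L^2} \le \frac{C}{n}\,\|q_2 - q_1\|_{L^2}, \qquad |\varphi_n'(0; q_1) - \varphi_n'(0; q_2)| \le C\,\|q_2 - q_1\|_{L^2}, \\
\|\varphi_n(x; q) - \sqrt2\sin n\pi x\|_{L^\infty(0,1)} \le \frac{C}{n}, \qquad \|\varphi_n'(x; q) - \sqrt2\,n\pi\cos n\pi x\|_{L^\infty(0,1)} \le C,
\end{gathered}
$$
where the constant $C$ is independent of $n$, and the estimates hold uniformly on $M_\omega$.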
Lemma 9.10. Let $z_1(x)$ and $z_2(x)$ denote the fundamental set of solutions for $-y'' + q(x)y = \lambda y$, that is with $z_1(0) = 1$, $z_1'(0) = 0$ and $z_2(0) = 0$, $z_2'(0) = 1$. Then
$$
\|z_1(x) - 1\|_{L^\infty(0,1)} \le \sqrt2\,\|q\|_{L^2}\,e^{\|q\|_{L^2}^2} \qquad\text{and}\qquad \|z_2(x) - x\|_{L^\infty(0,1)} \le \|q\|_{L^2}\,e^{\|q\|_{L^2}^2}. \tag{9.84}
$$

9.2.2. The inverse Sturm–Liouville problem. We now turn our attention to the opposite question: instead of asking what the spectral values obtained from a given effective potential $q(x)$ are, we ask what spectral information will allow a recovery of the potential that generated it. In particular, will knowledge of the eigenvalues for a given set of boundary values be sufficient to recover $q$? The answer turns out to be negative; in general a single eigenvalue sequence is insufficient to determine the potential and there are easy counterexamples. Suppose that $-u'' + q(x)u = \lambda u$, $u(0) = u(1) = 0$ for an arbitrary $q \in L^2(0,1)$. Let $\tilde q(x) = q(1-x)$, that is, the reflection about the midpoint of the potential $q$. Then by putting $x \to 1 - x$ in this equation, we see that both $q$ and $\tilde q$ must have the same eigenvalues, and hence it is impossible to recover a unique potential from a single measurement of eigenvalues. This example is important not only to settle the possibility of being able to recover a potential from only its spectrum, but it indicates a way in which one might construct a positive uniqueness result. The counterexample fails if the chosen $q$ was originally symmetric about the midpoint and, as we will see, in this case with these boundary conditions uniqueness in fact holds.

The definitive result is due to the Swedish mathematician Borg in 1946 [36] who proved that if $\{\lambda_n\}_{n=1}^\infty$ is the spectrum of (9.68) corresponding to the boundary conditions (9.69) with values $h$, $H$, and $\{\tilde\lambda_n\}_{n=1}^\infty$ is the spectrum from values $h$ and $\tilde H \ne H$, then the pair $\{\lambda_n, \tilde\lambda_n\}_{n=1}^\infty$ uniquely determines the potential $q(x)$. He also proved that if $q$ is symmetric about
the midline, $q(1 - x) = q(x)$, and if $h = H$, then a single spectrum $\{\lambda_n\}_{n=1}^\infty$ uniquely determines the potential.

A few years later, Levinson [211] considerably shortened the proofs using complex analysis techniques, and Marchenko [241] showed that the two spectra are sufficient to determine the constants $h$, $H$, and $\tilde H$ in the boundary conditions as well as the potential.

The celebrated paper of Gel’fand and Levitan [111] appeared in 1951 and showed that the potential could be recovered from a spectrum $\{\lambda_n\}_{n=1}^\infty$ and a set of norming constants $\{\|\phi_n\|_{L^2}^2\}_{n=1}^\infty$ (remember, the implicit normalisation with the value of either $u$ or $u'$ at $x = 0$ is assumed). This showed that other data could be used to recover the potential, but that two data sequences were again required. There are other possible combinations of data that can be used to recover $q$, and we refer to [57].

Our immediate purpose will be to provide proofs of some of these results, but we will also answer the question of constructibility of the potential. In particular, if we are given only finite spectral data, can a reasonable approximation to the potential be found? As we will see, there is a unifying approach to these problems that not only lets us settle the uniqueness issues but also leads to a numerically efficient algorithm for the reconstruction of $q$; it is based on the references [299, 300].

9.2.2.1. An important integral operator. A crucial tool for uniqueness and reconstruction is the Gel’fand–Levitan equation that relates eigenfunctions corresponding to different potentials. Let us suppose that $\phi(x)$ and $\psi(x)$ are solutions of
$$-\phi'' + q(x)\phi = \lambda\phi, \qquad -\psi'' + p(x)\psi = \lambda\psi \tag{9.85}$$
that satisfy the same initial conditions at $x = 0$. We seek a representation that maps solutions $\psi$ onto solutions $\phi$ of equation (9.85). Here is the transformation operator
$$\phi(x) = \psi(x) + \int_0^x K(x,t)\psi(t)\,dt. \tag{9.86}$$

[Figure 9.4. The characteristic initial value problem for $K(x,t;q)$ in the Dirichlet case: $K_{tt} - K_{xx} + q(x)K = 0$ in the region of the $(x,t)$ plane between the characteristics $t = x$ and $t = -x$, with $K(x,x) = \frac12\int_0^x q(s)\,ds$ on $t = x$ and $K(x,-x) = -\frac12\int_0^x q(s)\,ds$ on $t = -x$.]

The function $K = K(x,t;q,p)$ depends on the potentials $q$ and $p$ but not on the parameter $\lambda$. This fact will be crucial for further developments.
We will sometimes write $K(x,t;q)$ in the special case when the potential $p = 0$. Differentiating (9.86) twice and using the identities
$$q(x)\phi(x) - p(x)\psi(x) = (q - p)(x)\psi(x) + \int_0^x q(x)K(x,t)\psi(t)\,dt,$$
$$\lambda(\phi - \psi)(x) = \int_0^x K(x,t)\lambda\psi(t)\,dt = \int_0^x K(x,t)\bigl(-\psi''(t) + p(t)\psi(t)\bigr)\,dt$$
shows that $K(x,t)$ must satisfy the variational equation
$$-\frac{d^2}{dx^2}\int_0^x K(x,t)\psi(t)\,dt + \int_0^x K(x,t)\psi''(t)\,dt + \int_0^x \bigl(q(x) - p(t)\bigr)K(x,t)\psi(t)\,dt = -(q - p)(x)\psi(x) \tag{9.87}$$
for all eigenfunctions $\psi$ of $-\frac{d^2}{dx^2} + p$ and all $x \in (0,1)$. Since these eigenfunctions are complete, this together with two integrations by parts implies that $K$ satisfies the hyperbolic equation
$$K_{tt} - K_{xx} + \bigl(q(x) - p(t)\bigr)K = 0, \qquad 0 \le t \le x \le 1, \tag{9.88}$$
subject to (in the case of Dirichlet or Neumann boundary conditions)
$$K(x,x) = \frac12\int_0^x \bigl(q(s) - p(s)\bigr)\,ds. \tag{9.89}$$
For the case of Neumann conditions at the left boundary, $\psi'(0) = 0$, it additionally implies $K_t(x,0) = 0$. For the Dirichlet case we obtain $K(x,0) = 0$. Thus, in the Neumann case we extend $K$ as an even function of $t$, so that (9.89) extends to $K(x,\pm x) = \frac12\int_0^x \bigl(q(s) - p(s)\bigr)\,ds$, and in the Dirichlet case as an odd function of $t$, with $K(x,\pm x) = \pm\frac12\int_0^x \bigl(q(s) - p(s)\bigr)\,ds$. See Figure 9.4 for an illustration of the latter case.

Equation (9.88) is a hyperbolic pde with characteristics given by the lines $x = \pm t$. The boundary values $K(1,t)$ and $K_x(1,t)$ form a Goursat problem for $K$, and (9.89) gives a characteristic initial value problem for $K$. Both of these conditions lead to existence and uniqueness of solutions [110].

9.2.2.2. Uniqueness proofs. As mentioned above, for recovering a general potential $q$, two sets of data are needed. We consider three possible combinations.

Two Spectra. Let us denote by $\Lambda(q,h,H)$ and $\Phi(q,h,H)$ the sequences of eigenvalues and eigenfunctions corresponding to the potential $q(x)$ and the boundary conditions $u'(0) - hu(0) = 0$ and $u'(1) + Hu(1) = 0$; that is, $\{\lambda_n\}_{n=1}^\infty = \Lambda(q,h,H)$ and $\{\phi_n\}_{n=1}^\infty = \Phi(q,h,H)$, where $-\phi_n'' + q\phi_n = \lambda_n\phi_n$ subject to the boundary conditions just mentioned with $h$ and $H$ as impedance parameters.
Theorem 9.15. Let $\Lambda(q_1,h_1,H_1) = \Lambda(q_2,h_2,H_2)$ and $\Lambda(q_1,h_1,\tilde H_1) = \Lambda(q_2,h_2,\tilde H_2)$, where $\tilde H_i \ne H_i$. Then $q_1 = q_2$, $h_1 = h_2$, $H_1 = H_2$, and $\tilde H_1 = \tilde H_2$.

Proof. Let $\{\phi_n\}_{n=1}^\infty = \Phi(q_1,h_1,H_1)$ and $\{\psi_n\}_{n=1}^\infty = \Phi(q_2,h_2,H_2)$. The asymptotic formula according to Lemma 9.8 is sufficient to determine the mean value of the potential. Thus it follows that $\int_0^1 (q_1 - q_2)\,ds = 0$. Let us assume for the moment that neither boundary condition at $x = 1$ is of Dirichlet type. From the Gel’fand–Levitan equation $\phi_n(x) = \psi_n(x) + \int_0^x K(x,t)\psi_n(t)\,dt$, we calculate
$$\begin{aligned}
\phi_n'(1) + H_1\phi_n(1) &= \psi_n'(1) + H_1\psi_n(1) + K(1,1)\psi_n(1) + \int_0^1 \bigl[K_x(1,t) + H_1K(1,t)\bigr]\psi_n(t)\,dt \\
&= (H_1 - H_2)\psi_n(1) + \int_0^1 \bigl[K_x(1,t) + H_1K(1,t)\bigr]\psi_n(t)\,dt,
\end{aligned}$$
since $\psi_n'(1) = -H_2\psi_n(1)$ and since equation (9.89) and the already found identity $\int_0^1 (q_1 - q_2)\,ds = 0$ imply that $K(1,1;q_1,q_2) = 0$. Using the boundary condition for $\phi_n$ at $x = 1$ gives
$$(H_1 - H_2)\psi_n(1) + \int_0^1 \bigl[K_x(1,t) + H_1K(1,t)\bigr]\psi_n(t)\,dt = 0. \tag{9.90}$$
Now $\psi_n(1)$ cannot be zero, for then the boundary condition at $x = 1$ would imply that $\psi_n'(1) = 0$, and hence, from the uniqueness theorem for odes, that $\psi_n(x)$ was identically zero. This would contradict the fact that $\psi_n$ was an eigenfunction. Indeed, using an asymptotic formula, one can even conclude that the sequence $\{\psi_n(1)\}_{n\in\mathbb{N}}$ is bounded away from zero. Since $K(1,\cdot)$ and $K_x(1,\cdot)$ are in $L^2(0,1)$, it follows that
$$\lim_{n\to\infty}\int_0^1 \bigl[K_x(1,t) + H_1K(1,t)\bigr]\psi_n(t)\,dt = 0,$$
and so, by (9.90), we must have $H_1 = H_2$. Also, $\{\psi_n\}_{n=1}^\infty$ is complete in $L^2(0,1)$, so that (9.90) shows that
$$K_x(1,t) + H_1K(1,t) = 0. \tag{9.91}$$
Now use the second spectrum corresponding to the boundary condition $u'(1) + \tilde H_i u(1) = 0$ to see that $\tilde H_1 = \tilde H_2$ and
$$K_x(1,t) + \tilde H_1K(1,t) = 0. \tag{9.92}$$
Due to $H_1 \ne \tilde H_1$ the identities (9.91) and (9.92) give the conditions
$$K_x(1,t) = K(1,t) = 0. \tag{9.93}$$
From uniqueness for the Goursat problem and (9.93) it follows that $K(x,t) = 0$. Generalizing (9.89) to impedance boundary conditions, we obtain from (9.87) that
$$K(x,x) = h_1 - h_2 + \frac12\int_0^x (q_1 - q_2)\,ds.$$
Setting $x = 0$ in the above shows that $h_1 = h_2$, and differentiating shows that $q_1 = q_2$. The modification for Dirichlet conditions at $x = 1$ is straightforward.

In many applications it is not possible to modify the boundary conditions in order to provide two distinct eigenvalue sequences, so we must seek alternatives. Some of these are listed below.

Endpoint Data. We are given the Dirichlet eigenvalues $\{\lambda_j\}_{j=1}^\infty$ plus the sequence of numbers $\{\phi_j'(1)/\phi_j'(0)\}_{j=1}^\infty$. From this and the assumed normalisation $\phi_j'(0) = 1$, we obtain $\{\phi_j'(1)\}$. With $p = 0$, (9.86) becomes
$$\phi_j(x) = \frac{\sin\sqrt{\lambda_j}\,x}{\sqrt{\lambda_j}} + \int_0^x K(x,t)\,\frac{\sin\sqrt{\lambda_j}\,t}{\sqrt{\lambda_j}}\,dt. \tag{9.94}$$
The eigenvalues $\{\lambda_j\}$ correspond to $H = \infty$, and so the boundary condition at $x = 1$ gives
$$\int_0^1 K(1,t)\sin\sqrt{\lambda_j}\,t\,dt = -\sin\sqrt{\lambda_j},$$
from which we can recover $K(1,t)$, as before. By differentiating (9.94) and using (9.89) at $x = 1$, we get
$$\phi_j'(1) = \cos\sqrt{\lambda_j} + \frac12\,\frac{\sin\sqrt{\lambda_j}}{\sqrt{\lambda_j}}\int_0^1 q(s)\,ds + \int_0^1 K_x(1,t)\,\frac{\sin\sqrt{\lambda_j}\,t}{\sqrt{\lambda_j}}\,dt,$$
that is,
$$\int_0^1 K_x(1,t)\sin\sqrt{\lambda_j}\,t\,dt = \sqrt{\lambda_j}\,\bigl(\phi_j'(1) - \cos\sqrt{\lambda_j}\bigr) - \frac12\sin\sqrt{\lambda_j}\int_0^1 q(s)\,ds,$$
from which we also recover $K_x(1,t)$. Note that the mean value of $q$ is known from the asymptotic formula in Lemma 9.8. We thus have both $K(1,t)$ and $K_x(1,t)$ from the spectral data, and everything else follows as before.

Norming constant data. In this problem we are given $\{\lambda_n\}_{n=1}^\infty$ and $\{\rho_n\}_{n=1}^\infty$, where $\rho_n = \int_0^1 \phi_n^2(x)\,dx$ (remember we used the condition $\phi_n'(0) = 1$ to normalise the eigenfunctions). If we differentiate the equation $-y'' + qy = \lambda y$ with respect to $\lambda$, we obtain
$$-\dot y'' + q\dot y = \lambda\dot y + y,$$
where $\dot y$ denotes $\partial y/\partial\lambda$. Multiplying the above by $y$, the original equation by $\dot y$, and subtracting gives
$$y^2 = y''\dot y - \dot y'' y.$$
Integrating between $x = 0$ and $x = 1$ and setting $\lambda = \lambda_j$ (so $y$ becomes $\phi_j(x)$), we get
$$\int_0^1 \phi_j^2\,dx = \dot\phi_j(1)\,\phi_j'(1),$$
where the boundary terms at $x = 0$ vanish because $y(0) = 0$ and $y'(0) = 1$ for every $\lambda$, and the remaining term at $x = 1$ vanishes because $\phi_j(1) = 0$. Therefore $\rho_j = \dot\phi_j(1)\,\phi_j'(1)$ or
$$\phi_j'(1) = \frac{\rho_j}{\dot\phi_j(1)}. \tag{9.95}$$
We want to convert the data $\{\rho_j\}$ into endpoint data $\{\phi_j'(1)\}$, and so we need an expression for $\dot\phi_j(1)$. If we differentiate the product of (9.94) with $\sqrt{\lambda_j}$ with respect to $\lambda_j$, we obtain
$$\frac{1}{2\sqrt{\lambda_j}}\,\phi_j(x) + \sqrt{\lambda_j}\,\dot\phi_j(x) = \frac{1}{2\sqrt{\lambda_j}}\Bigl[x\cos\bigl(\sqrt{\lambda_j}\,x\bigr) + \int_0^x tK(x,t)\cos\bigl(\sqrt{\lambda_j}\,t\bigr)\,dt\Bigr].$$
Since $\phi_j(1) = 0$, setting $x = 1$ we get
$$\dot\phi_j(1) = \frac{1}{2\lambda_j}\Bigl[\cos\sqrt{\lambda_j} + \int_0^1 tK(1,t)\cos\bigl(\sqrt{\lambda_j}\,t\bigr)\,dt\Bigr],$$
and so we obtain
$$\phi_j'(1) = \frac{2\lambda_j\rho_j}{\cos\sqrt{\lambda_j} + \int_0^1 tK(1,t)\cos\bigl(\sqrt{\lambda_j}\,t\bigr)\,dt}.$$
Everything on the right-hand side is known, and we have reduced the problem to the endpoint data case.

There is a remarkably short proof of the uniqueness theorem of Gel’fand and Levitan, which states that a single spectral sequence and the corresponding sequence of norming constants is sufficient to determine the potential. This particular proof requires no knowledge of hyperbolic equations and relies only on an elementary level of functional analysis. Its elegance is only diminished by the fact that, unlike the technique presented above for the two-spectrum case, the ideas it develops do not seem to lead to a constructive algorithm.

Theorem 9.16. Let the operators with potentials $p$ and $q$ have the same spectral values $\{\lambda_n\}_{n=1}^\infty$ and corresponding norming constants $\{\rho_n\}_{n=1}^\infty$, the $L^2$ norm of the eigenfunctions. Then $p = q$.

Proof. For definiteness let us take the case $h = \infty$. Denote by $\phi_n = \phi_n(x,q)$ and $\psi_n = \psi_n(x,p)$ the eigenfunctions corresponding to potentials $q$ and $p$
and the same eigenvalues $\lambda_n$. Then we can write (9.86) in the form $\psi_n = (I + \mathcal{K})\phi_n$, where
$$\mathcal{K}f = \int_0^x K(x,t)f(t)\,dt \tag{9.96}$$
is a Volterra operator on $L^2[0,1]$. Now it is known that each of the sequences $\{\phi_n\}_{n=1}^\infty$ and $\{\psi_n\}_{n=1}^\infty$ is a complete orthogonal set in $L^2[0,1]$, and by dividing (9.86) by the norming constants $\rho_n$, we see that the resulting eigenfunctions are also orthonormal. Thus the operator $I + \mathcal{K}$ maps orthonormal sets into orthonormal sets and hence must be unitary on $L^2[0,1]$. Thus
$$(I + \mathcal{K})(I + \mathcal{K}^*) = (I + \mathcal{K}^*)(I + \mathcal{K}) = I. \tag{9.97}$$
Equation (9.97) simplifies to $\mathcal{K}\mathcal{K}^* = \mathcal{K}^*\mathcal{K}$, showing that $\mathcal{K}$ is a normal operator on $L^2[0,1]$. From spectral theory for normal operators it follows that there is an eigenvalue of $\mathcal{K}$ lying on the circle of radius $\|\mathcal{K}\|$. On the other hand, being a Volterra integral operator, $\mathcal{K}$ has no nonzero eigenvalue [23]. This means that $\mathcal{K} = 0$, which in turn shows that $K(x,t) = 0$ for all $0 \le t \le x \le 1$, and consequently $\psi_n = \phi_n$ for all $n \in \mathbb{N}$. This implies that the elliptic operators whose eigensystems are $\{\lambda_n, \psi_n\}$ and $\{\lambda_n, \phi_n\}$ are identical, and therefore $p = q$.

We conclude this section with an extension of the endpoint uniqueness result to higher space dimensions. The result on multiple coefficients is due to Canuto and Kavian [51] and goes back to earlier work on single coefficient identification [50, 256]. To this end, consider the eigenvalues $\lambda_n$ and eigenfunctions $\varphi_n$ of the elliptic operator defined by $L = \frac{1}{r(x)}\bigl(\nabla\cdot(p(x)\nabla\,\cdot\,) - q(x)\,\cdot\,\bigr)$, that is,
$$\begin{aligned}
-\nabla\cdot\bigl(p(x)\nabla\varphi_n\bigr) + q(x)\varphi_n &= \lambda_n r(x)\,\varphi_n && \text{in } \Omega, \\
\varphi_n &= 0 && \text{on } \partial\Omega, \\
\int_\Omega |\varphi_n(x)|^2\, r(x)\,dx &= 1.
\end{aligned}$$

Theorem 9.17. Assume that $p, r \in C^1(\Omega)$, $q \in L^\infty(\Omega)$, $p(x) \ge \underline{p} > 0$, $r(x) \ge \underline{r} > 0$.
• Given $q$ in $\Omega$ and $p$, $\nabla p$ on $\partial\Omega$, the spectral data $\{\lambda_n\}_{n\in\mathbb{N}}$, $\{p\,\partial_\nu\varphi_n|_{\partial\Omega}\}_{n\in\mathbb{N}}$ uniquely determines $p$ and $r$ in $\Omega$.
• Given $p$ in $\Omega$, the spectral data $\{\lambda_n\}_{n\in\mathbb{N}}$, $\{p\,\partial_\nu\varphi_n|_{\partial\Omega}\}_{n\in\mathbb{N}}$ uniquely determines $q$ and $r$ in $\Omega$.
9.2.2.3. Ill-posedness. The inverse Sturm–Liouville problem is only mildly ill-posed in the usual sense of the scales of the norms connecting the domain and range spaces or the rate of decay of the singular values of the linearised map. In fact there is an important converse to Lemma 9.9, which we now state [57].

Lemma 9.11. Let $q \in L^2$ and assume $q$ is bounded in some ball $B$. Suppose we have the eigenvalue sequences $\{\lambda_n^k\}_{n=1}^\infty$ and $\{\tilde\lambda_n^k\}_{n=1}^\infty$ that correspond to the spectrum of $q_k(x)$, $k = 1, 2$, with respect to two different impedance values $H$, $\tilde H$. Then
$$\|q_1 - q_2\|_{L^2} \le C\sum_{n=1}^\infty n\bigl(|\lambda_n^1 - \lambda_n^2| + |\tilde\lambda_n^1 - \tilde\lambda_n^2|\bigr). \tag{9.98}$$
Thus if we control the eigenvalue sequences $\{n\lambda_n\}$ and $\{n\tilde\lambda_n\}$ in $\ell^2$, then we control $q(x) \in L^2$. However this gives no indication of the value of the coupling constant. The estimate (9.98) follows directly from the Gel’fand–Levitan formulation and the values $K_x(1,t)$ and $K(1,t)$; see [300].

Remark 9.2. The following must be taken into account in any interpretation of Lemma 9.11. From (9.78) we know that for $q \in L^2$ the eigenvalues behave as $\lambda_n = n^2\pi^2 + \bar q + e_n$ with $\{e_n\} \in \ell^2$. If $q$ is smoother, then $\{e_n\}$ will converge to zero at a faster rate, as we saw in Lemma 9.8. This means the information content in $\{\lambda_n\}$, namely the sequence $\{e_n\}$, is being masked by a term growing quadratically in $n$. For even a relatively smooth $q$, say just in $C^2$, it is frequently the case that the ratio of $\lambda_{10}$ and the value of $e_{10}$ is $10^6$ or larger, which clearly limits our ability to recover the tenth Fourier mode of $q$ unless the spectral sequence is provided to enormously high accuracy. Thus when boundary data contaminated by noise is converted into a spectral sequence for use in undetermined coefficient recovery as in, for example, Section 10.4, this aspect must be taken into account.

Remark 9.3. Given the existence of the Liouville transform, one might be tempted into concluding that exactly the same situation occurs if we have, say, the case of the equation $-(a(x)u_n')' = \lambda_n u_n$. This is not quite true. The leading term in the asymptotic expansion of the eigenvalues on the unit interval in this case is $\lambda_n = n^2\pi^2/L^2$, where $L = \int_0^1 [a(s)]^{-1/2}\,ds$, and so unlike the potential problem, information about the coefficient is contained in the leading order term. This can also translate into a significant difference in the case of recovery of $a(x)$ from an initial value problem for the subdiffusion operator.
Indeed, we will return to this observation in Section 10.5 where recovery of both $a(x)$ and $q(x)$ is sought in the problem $\partial_t^\alpha u - (a(x)u_x)_x + q(x)u = f$ from two sets of initial/boundary value data measuring each solution at $t = T$. The singular values of the linearised
inversion for the matrix arising from $a$ are between one and two orders of magnitude greater than those from the matrix arising from $q$, indicating that reconstruction of $a$ is much more stable than reconstruction of $q$.

9.2.2.4. A constructive algorithm for q. The uniqueness proofs presented in this section can be used as the basis of a constructive algorithm for obtaining the potential $q(x)$ from the spectral data. For simplicity we take the case $h = \infty$ and the two-spectrum data case with $\{\lambda_n\}_{n=1}^\infty$ and $\{\tilde\lambda_n\}_{n=1}^\infty$. This is again taken from the paper [300], and the details we have omitted can be found there. Note that since we are able to convert each of the versions of the inverse spectral problem to the two-spectrum case by a specific formula, the constructibility algorithm and its proof to be shown serve for the two-spectrum, endpoint data, and norming constant data versions.

We know the eigenfunctions must satisfy the representation (9.94), where $K$ satisfies (9.88) and (9.89) with $p$ set to zero, together with a boundary condition at $x = 1$, say, $u'(1) + Hu(1) = 0$. Proceeding as above, we use the asymptotic form of the eigenvalues to obtain an estimate for $\int_0^1 q(s)\,ds$, the mean of $q(x)$, a quantity we will denote by $\bar q$. Now the eigenvalue equation is $-y'' + q(x)y = \lambda y$, and we can write this as $-y'' + (q(x) - \bar q)y = (\lambda - \bar q)y$, working instead with the potential $q^* = q(x) - \bar q$ (which has mean zero) and a new eigenvalue sequence $\lambda_n^* = \lambda_n - \bar q$. So by this device we may assume without any loss of generality that $\bar q = 0$. This means that $K(1,\pm 1) = 0$. Applying the boundary condition at $x = 1$ to the function $\phi_n(x)$ gives
$$\cos\sqrt{\lambda_n} + \frac{H}{\sqrt{\lambda_n}}\sin\sqrt{\lambda_n} = -\int_0^1 \bigl[K_x(1,t) + HK(1,t)\bigr]\frac{\sin\sqrt{\lambda_n}\,t}{\sqrt{\lambda_n}}\,dt. \tag{9.99}$$
Now $\{\sin\sqrt{\lambda_n}\,t\}$ is complete so that we can solve (9.99) uniquely to obtain the value of $K_x(1,t) + HK(1,t)$. If we let $g(t) = K_x(1,t) + HK(1,t)$, then (9.99) is
$$\int_0^1 g(t)\sin\sqrt{\lambda_n}\,t\,dt = -\sqrt{\lambda_n}\cos\sqrt{\lambda_n} - H\sin\sqrt{\lambda_n}.$$
Assuming we have obtained only the first $N$ eigenvalues $\{\lambda_n\}_{n=1}^N$, set $g(t) = \sum_{j=1}^N g_j\sin j\pi t$, and we get the matrix system $A[g_1 \cdots g_N]^T = [b_1 \cdots b_N]^T$, where $b_k = -\sqrt{\lambda_k}\cos\sqrt{\lambda_k} - H\sin\sqrt{\lambda_k}$ and $A_{kj} = \int_0^1 \sin\bigl(\sqrt{\lambda_k}\,t\bigr)\sin j\pi t\,dt$. Note that we have already shown that the matrix $A$ is invertible. Thus the first $N$ eigenvalues corresponding to the boundary condition $u'(1) + Hu(1) = 0$ allow us to compute the first $N$ Fourier coefficients of $g(t) = K_x(1,t) + HK(1,t)$.
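The linear system just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [300]: the function names (`sine_overlap`, `solve_linear`, `fourier_coefficients_of_g`) are hypothetical, the entries of $A$ are evaluated with the elementary closed form for $\int_0^1 \sin(at)\sin(bt)\,dt$, and a plain Gaussian elimination stands in for whatever solver one would use in practice.

```python
import math

def sine_overlap(a, b):
    """Closed form of the integral of sin(a t) sin(b t) over [0, 1]."""
    if abs(a - b) < 1e-12:
        return 0.5 - math.sin(2.0 * a) / (4.0 * a)
    return math.sin(a - b) / (2.0 * (a - b)) - math.sin(a + b) / (2.0 * (a + b))

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fourier_coefficients_of_g(eigenvalues, H):
    """Coefficients g_1..g_N of g(t) ~ sum_j g_j sin(j pi t), obtained from
    the moment equations that follow (9.99) (hypothetical helper)."""
    N = len(eigenvalues)
    mu = [math.sqrt(lam) for lam in eigenvalues]
    # Row k pairs sin(sqrt(lambda_k) t) with the basis functions sin(j pi t).
    A = [[sine_overlap(mu[k], (j + 1) * math.pi) for j in range(N)]
         for k in range(N)]
    b = [-m * math.cos(m) - H * math.sin(m) for m in mu]
    return solve_linear(A, b)
```

As a consistency check, for $q = 0$ and $H = 0$ the eigenvalues are $\lambda_k = ((k - \tfrac12)\pi)^2$, the right-hand side vanishes up to rounding, and the computed coefficients are numerically zero, in agreement with $K \equiv 0$.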
Repeat this for the second spectrum to obtain the first $N$ Fourier coefficients of $K_x(1,t) + \tilde H K(1,t)$ from the $N$ eigenvalues $\{\tilde\lambda_n\}_{n=1}^N$. We thus obtain the first $N$ Fourier coefficient approximation to both $K(1,t)$ and $K_x(1,t)$.

We know $K(x,t)$ must solve the hyperbolic equation (9.88), with the boundary conditions on the characteristics $x = \pm t$, or alternatively the Goursat data consisting of the values $K(1,t)$ and $K_x(1,t)$ that we have now obtained from the spectral values. If we actually knew the potential $q(x)$, then this would lead to an overdetermined problem: indeed either the data $\{K(x,0), K(x,x)\}$ or the data $\{K(1,t), K_x(1,t)\}$ would be sufficient to determine $K$ if $q$ were known. This suggests that we look at two separate hyperbolic boundary value problems, one using Goursat data and the other using Cauchy data. Let $u(x,t;q)$ and $v(x,t;q)$ denote the solutions of
$$\begin{aligned}
u_{tt} - u_{xx} + q(x)u &= 0, & v_{tt} - v_{xx} + q(x)v &= 0, \\
u(1,t) &= K(1,t), & v(x,x) &= \tfrac12\int_0^x q(s)\,ds, \\
u_x(1,t) &= K_x(1,t), & v(x,0) &= 0.
\end{aligned} \tag{9.100}$$
Here $K(1,t)$ and $K_x(1,t)$ are given from the eigenvalue sequences by means of solving equations of the form (9.99). Then in order to recover the function $q(x)$, we need to solve either of the (nonlinear) equations
$$T[q](x) := 2\frac{d}{dx}u(x,x;q) = q(x) \quad\text{or}\quad S[q] := \begin{pmatrix} v(1,t;q) \\ v_x(1,t;q) \end{pmatrix} - \begin{pmatrix} K(1,t) \\ K_x(1,t) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{9.101}$$
In the first case we can elect to solve the leftmost equation in (9.101). This seeks a fixed point of the mapping $T$ and means that, at each iteration, we have to solve a Cauchy problem for the hyperbolic equation $u_{tt} - u_{xx} + q_n(x)u = 0$ with data given on the boundary $x = 1$. We then solve on the diagonal $t = x$ for $u(x,x)$. This is followed by updating the value of $q_n$ to $q_{n+1}$ by solving the trivial equation $q_{n+1}(x) = 2[u(x,x)]'$. This iteration scheme is then easily shown to converge.

Theorem 9.18. $T$ has a unique fixed point in $L^\infty(0,1)$.

Proof. For a fixed $M > 0$, define $C_M = \{q \in L^2(0,1) : |q(x)| \le M \text{ a.e.}\}$. Let $P_M$ denote the operator of projection onto $C_M$, that is
$$P_M h(x) = \begin{cases} h(x) & \text{if } |h(x)| \le M, \\ \pm M & \text{if } \pm h(x) \ge M. \end{cases} \tag{9.102}$$
Suppose $q$ and $\hat q$ are fixed points of $T$, and choose $M$ so that $|q|, |\hat q| < M$. Thus $q$ and $\hat q$ are also fixed points of $P_M T$, and we are done if we show that
$q \mapsto P_M T(q)$ is a contraction on $C_M$ in the weighted norm
$$\|q\|_{L^2_\kappa}^2 := \int_0^1 q^2(x)\,e^{2\kappa(x-1)}\,dx$$
for some sufficiently large $\kappa$. We clearly have
$$\|P_M T(q) - P_M T(\hat q)\|_{L^2_\kappa} \le \|T(q) - T(\hat q)\|_{L^2_\kappa}, \tag{9.103}$$
and using the equation defining $T$, it is easy to show that
$$T(q)(x) - T(\hat q)(x) = 2\int_x^1 \bigl(\hat q(y) - q(y)\bigr)u(y, 2x - y; q)\,dy + 2\int_x^1 \hat q(y)\bigl[u(y, 2x - y; \hat q) - u(y, 2x - y; q)\bigr]\,dy. \tag{9.104}$$
By introducing the Riemann function for $u_{tt} - u_{xx} + q(x)u$, the second term on the right may be rewritten in the form
$$\int_x^1 Q(x,y)\bigl(\hat q(y) - q(y)\bigr)\,dy$$
with a bounded kernel $Q$, depending on $q$ and $\hat q$. We therefore see that
$$|T(q)(x) - T(\hat q)(x)| \le C\int_x^1 |q(y) - \hat q(y)|\,dy,$$
and so by a standard calculation it follows that
$$\|T(q) - T(\hat q)\|_{L^2_\kappa} \le \frac{C}{\sqrt{\kappa}}\,\|q - \hat q\|_{L^2_\kappa}.$$
Thus by choosing $\kappa$ sufficiently large, we obtain contractivity of $P_M T$.
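For readers who want the "standard calculation" in the proof above spelled out, one possible route is the following. It uses only the Cauchy–Schwarz inequality and the definition of the weighted norm, and it in fact yields the slightly better constant $C/\sqrt{2\kappa}$. Inserting the weight and its inverse under the integral gives, for $x \in (0,1)$,
$$\int_x^1 |q(y) - \hat q(y)|\,dy = \int_x^1 |q(y) - \hat q(y)|\,e^{\kappa(y-1)}\,e^{-\kappa(y-1)}\,dy \le \|q - \hat q\|_{L^2_\kappa}\Bigl(\int_x^1 e^{-2\kappa(y-1)}\,dy\Bigr)^{1/2} \le \frac{e^{\kappa(1-x)}}{\sqrt{2\kappa}}\,\|q - \hat q\|_{L^2_\kappa}.$$
Squaring the pointwise bound on $|T(q)(x) - T(\hat q)(x)|$, multiplying by the weight $e^{2\kappa(x-1)}$, and integrating over $(0,1)$ then yields
$$\|T(q) - T(\hat q)\|_{L^2_\kappa}^2 \le C^2\int_0^1 e^{2\kappa(x-1)}\,\frac{e^{2\kappa(1-x)}}{2\kappa}\,dx\;\|q - \hat q\|_{L^2_\kappa}^2 = \frac{C^2}{2\kappa}\,\|q - \hat q\|_{L^2_\kappa}^2 .$$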
The scheme also converges exceedingly rapidly. For the three functions shown in Figure 9.3, effective numerical convergence is achieved by the second or third iteration. As an example, for $q^{(1)}(x)$ and $N = 10$, the value of the relative maximum norm error in the reconstruction, that is $\|q_n - q\|_\infty / \|q_0 - q\|_\infty$, drops to 0.0176 after one iteration and equals 0.0036 for $n = 2$. No further improvement is gained by continuing the scheme. For larger magnitudes of $q$, the scheme will require further iterations.

In the case of the first test function it is clear that the function can be effectively reconstructed from the first five eigenvalues of the two spectra. This should also be clear from Figure 9.3, which shows that the values of $e_n = \lambda_n - n^2\pi^2 - \int_0^1 q(s)\,ds$ are exceedingly small for $n > 5$. In fact, $e_1 = 0.096$, $e_2 = 0.0737$, $e_3 = 0.0139$, and there is certainly information to be extracted from these values. However, $|e_n| < 10^{-3}$ for $n \ge 4$ and less than $10^{-4}$ for $n > 10$. What this says is that there is little relative information in the higher eigenvalues for this particular potential.
Alternatively, choosing the rightmost equation in (9.101), for a given approximation $q_n(x)$ we can solve the Goursat problem for $v$ in the region $0 \le t \le x \le 1$, where the data is $v(x,0) = 0$ and $v(x,x) = \frac12\int_0^x q_n(t)\,dt$, and then evaluate $v(1,t)$ and $v_x(1,t)$ to match up with the spectral data on $x = 1$ by, for example, Newton’s method. We also remark that the convergence rate of Halley’s method is truly spectacular here. For a $q$ of simple to medium complexity a single iteration suffices, and even for a complex $q$ two iterations suffice.

Note that in each case we have a well-posed problem to solve for either $u(x,t)$ or $v(x,t)$, but we have to perform a differentiation to make the update to $q_{n+1}$. This shows that from this perspective the inverse Sturm–Liouville problem is very mildly ill-posed. The more significant issue is the fact that any error in the measurement of $\{\lambda_n\}$ is immediately incorporated into the recovery of $q$ only through the term $e_n = \lambda_n - n^2\pi^2$ (for the Dirichlet case). Thus an error of 1% in, say, $\lambda_{10}$ will contribute to a significant error in $e_n$ due to the $n^2\pi^2$ term.
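The size of this effect is easy to quantify. The short calculation below is only an illustration: it takes the leading term $n^2\pi^2$ as a stand-in for $\lambda_{10}$ and compares a 1% error with the bound $|e_n| < 10^{-3}$ quoted earlier for the first test potential. The variable names are our own.

```python
import math

n = 10
lam_leading = (n * math.pi) ** 2   # leading term n^2 pi^2 of lambda_n
relative_noise = 0.01              # the 1% measurement error from the text
abs_error = relative_noise * lam_leading
e_n_size = 1e-3                    # bound on |e_n| quoted for n >= 4

print(f"n^2 pi^2            = {lam_leading:.2f}")
print(f"absolute error      = {abs_error:.2f}")
print(f"error / |e_n| bound = {abs_error / e_n_size:.1e}")
```

The absolute error is about 9.87, roughly four orders of magnitude larger than the quantity $e_{10}$ that carries the information about $q$.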
9.3. The fractional Sturm–Liouville problem

In the Djrbashian–Caputo case our eigenvalue problem is
$$-{}^{DC}_{\;\,a}D^{\alpha}_{x} u + q(x)u = \lambda u, \tag{9.105}$$
where $\lambda$ is a (possibly complex-valued) constant and we assume that $1 < \alpha \le 2$. Suppose first that we have Dirichlet boundary conditions,
$$u(a) = u(b) = 0. \tag{9.106}$$
Any solution to (9.105), (9.106) is defined at most up to an arbitrary constant factor, and we have a choice of normalisation of the resulting eigenfunctions $u$. One possibility is to do so by scaling $u$ so that $\|u\|_{L^2} = 1$, but we also can choose to take $u'(a) = 1$. In the case of an ode this is a unique determination for a given value of $q$ and $\lambda$. The same holds in the fractional case, since the initial approximation is $u^0(x) = xE_{\alpha,2}(\lambda x^\alpha)$, and it uniquely determines each subsequent iterate of the Picard iteration defined by ${}^{DC}_{\;\,a}D^{\alpha}_{x} u^{k+1} + \lambda u^{k+1} = q(x)u^k$. Indeed, in the case where $q(x) = 0$ the eigenfunction is $\phi(x) = (x - a)E_{\alpha,2}(\lambda(x - a)^\alpha)$ with $\lambda$ characterised by $\lambda = \xi(b - a)^{-\alpha}$, where $\xi$ is a zero of the Mittag-Leffler function $E_{\alpha,2}(z)$, so that both endpoint conditions are satisfied.

If we had to modify the boundary conditions to $u'(a) = 0$ and use the normalisation $u(a) = 1$, then the corresponding eigenfunction would be $\phi(x) = E_{\alpha,1}(\lambda(x - a)^\alpha)$, where now $\lambda = \xi(b - a)^{-\alpha}$ and $\xi$ is a zero of the Mittag-Leffler function $E_{\alpha,1}(z)$. The zeros of the Mittag-Leffler functions $E_{\alpha,1}$ and $E_{\alpha,2}$ can be computed, although this is not without its complications. These are in general complex-valued as we have seen in Theorems 3.26 and 3.2.
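To make the role of the Mittag-Leffler zeros concrete, here is a minimal sketch that evaluates $E_{\alpha,\beta}(z) = \sum_{k\ge 0} z^k/\Gamma(\alpha k + \beta)$ by its truncated power series. The function name `mittag_leffler` and the cut-off `max_arg` are our own choices, and the series is only a reasonable tool for moderate $|z|$; for large arguments cancellation sets in, and the asymptotic expansions are needed instead, which is one of the complications alluded to above.

```python
import math

def mittag_leffler(alpha, beta, z, max_arg=170.0):
    """Truncated power series for E_{alpha,beta}(z).

    Terms are added while alpha*k + beta stays below max_arg, because
    math.gamma overflows for arguments slightly above 171.
    """
    total = 0.0
    power = 1.0          # holds z**k, updated incrementally
    k = 0
    while alpha * k + beta < max_arg:
        total += power / math.gamma(alpha * k + beta)
        power *= z
        k += 1
    return total

# Classical check: for alpha = 2, E_{2,2}(-mu^2) = sin(mu)/mu,
# so z = -pi^2 is a zero of E_{2,2}.
print(mittag_leffler(2.0, 2.0, -math.pi ** 2))
```

For $\alpha = 2$ the zeros of $E_{2,2}$ are therefore $z = -n^2\pi^2$, which recovers the classical Dirichlet spectrum; for $\alpha < 2$ the same routine accepts complex `z`, but locating the complex zeros then requires a two-dimensional root search.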
For nonzero $q(x)$ one can in principle rely on the integral equation representation, Theorem 5.4. In the classical case with second derivatives one uses the eigenfunction for $q = 0$ (a sine or cosine function) and the boundary condition at $x = b$ to get an initial approximation to the eigenvalues and hence to the eigenfunctions. Each successive iterate of the integral equation gives a transcendental equation for a correction to the previous set of eigenvalues, and the process is repeated. This can be effective due to the fact that the asymptotic behaviour of the eigenvalues (for Dirichlet boundary conditions on $(a,b) = (0,1)$) in the case $\alpha = 2$ is $\lambda_n = n^2\pi^2 + \int_0^1 q(s)\,ds + e_n$, where $e_n$ tends to zero for large $n$, and quite rapidly for smooth $q$; cf. Section 9.2.1.

One implementation of this (again for simplicity we only consider the Dirichlet case $\phi(a) = \phi(b) = 0$) is to use Newton’s method to solve the equation $\phi(b;\lambda) = 0$ for the eigenvalues. The $n$th iterate solves $-\phi'' + q(x)\phi = \lambda_n\phi$ using the current value of $\lambda_n$ on $(a,b)$ subject to the initial conditions $\phi(a) = 0$, $\phi'(a) = 1$. We let $F(\lambda) = \phi(b,\lambda)$ and use a Newton (or a frozen Newton) scheme to update the value of $\lambda_n \to \lambda_{n+1}$ for each $n$. Even in its most basic form this works very well. One can also replace the Newton iteration by a bisection scheme. This gives only a slight loss of convergence rate at a gain of not requiring the derivative of the map $F$ with respect to $\lambda$. The keys to success here are the facts that we have an extremely good initial approximation $\lambda_0$ for the (real-valued) spectrum and that the nonlinearity of the equation $F(\lambda) = 0$ is quite mild.

For the fractional case things are very different. First, the map $F$ is now complex valued, and bisection has to give way to the Newton version of the scheme. The computation of $F(\lambda)$ in the classical case can take advantage of high order solvers, such as Runge–Kutta–Fehlberg, or stiff solvers for large values of $\lambda$.
The corresponding solvers available for the fractional case are much lower order and hence much more computationally intensive. Perhaps most important is the fact that the decay of the error terms is much slower, leading to poorer initial approximations to $\lambda$, and this can cause considerable difficulty with the Newton schemes. The suggested method here is to use a finite element scheme in space and compute the eigenvalues from the resulting stiffness matrix. The analysis is delicate and would take us beyond our scope; we refer to the papers [164, 165].
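For the classical case $\alpha = 2$ on $(0,1)$, the shooting approach described above can be sketched as follows. This is a minimal illustration, not the authors' code: it uses a fixed-step classical Runge–Kutta integrator in place of an adaptive solver, the bisection variant rather than Newton's method, and the hypothetical names `shoot` and `dirichlet_eigenvalue`. The bracket is centred on the asymptotic guess $n^2\pi^2 + \bar q$, and its half-width is an assumption that may need adjusting for large potentials.

```python
import math

def shoot(q, lam, steps=400):
    """Integrate -phi'' + q(x) phi = lam phi on [0, 1] with phi(0) = 0,
    phi'(0) = 1 by classical RK4 and return F(lam) = phi(1; lam)."""
    h = 1.0 / steps
    x, phi, dphi = 0.0, 0.0, 1.0

    def rhs(x, phi, dphi):
        # First-order system: phi' = dphi, dphi' = (q(x) - lam) * phi.
        return dphi, (q(x) - lam) * phi

    for _ in range(steps):
        k1 = rhs(x, phi, dphi)
        k2 = rhs(x + h / 2, phi + h / 2 * k1[0], dphi + h / 2 * k1[1])
        k3 = rhs(x + h / 2, phi + h / 2 * k2[0], dphi + h / 2 * k2[1])
        k4 = rhs(x + h, phi + h * k3[0], dphi + h * k3[1])
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dphi += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += h
    return phi

def dirichlet_eigenvalue(q, n, q_mean, half_width=5.0, tol=1e-10):
    """Bisection for the n-th Dirichlet eigenvalue, bracketed around the
    asymptotic guess n^2 pi^2 + mean(q)."""
    guess = (n * math.pi) ** 2 + q_mean
    lo, hi = guess - half_width, guess + half_width
    f_lo, f_hi = shoot(q, lo), shoot(q, hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in bracket; adjust half_width")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(q, mid)
        if f_lo * f_mid <= 0:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

With `q = lambda x: 0.0` the routine returns $\pi^2$ to within the discretisation error. It can also be used to check the reflection counterexample of Section 9.2.2 numerically: the potentials $q(x) = 10x$ and $q(1 - x) = 10(1 - x)$, both with mean 5, yield the same first Dirichlet eigenvalue.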
From a theoretical standpoint there is much known about Sturm–Liouville theory for classical derivatives, and indeed the picture was almost complete by the middle of the nineteenth century; cf. Section 9.2.1 for a brief overview. The eigenvalues are real and simple, and eigenfunctions corresponding to distinct eigenvalues are mutually orthogonal. The zeros of eigenfunctions arising from successive eigenvalues strictly interlace, and
there are various monotonicity theorems relating the spectrum to the coefficients and boundary conditions, easily seen from the Rayleigh quotient (see for example [340, Chapters 35, 36]), which itself relies on the orthogonality of the eigenfunctions. None of these results in the above generality are known in the case of fractional derivatives as in (9.105), and in many instances the conclusions mentioned above would not be allowed.

If $\lambda \in \mathbb{C}$ is an eigenvalue of $-{}^{DC}_{\;\,0}D^{\alpha}_{x}$, then its conjugate $\bar\lambda$ is also an eigenvalue. Thus the eigenfunctions will come in complex conjugate pairs. There are only finitely many real eigenvalues for any $\alpha < 2$. In fact, the existence of real eigenvalues is only guaranteed for $\alpha$ sufficiently close to 2. In the Djrbashian–Caputo case it has been shown [277] that there exist real eigenvalues provided $\alpha > 5/3$. Careful numerical experiments indicate that the first real zeros appear for $\alpha \in (1.5991152, 1.5991153)$ and they occur in pairs [169]. For the Riemann–Liouville derivative there is always at least one real eigenvalue [165].
In the case q = 0, we have from Proposition 3.2 that
1 α ln |n| π α +O . zn = 2nπi − (α − 1) ln 2π|n| + sign(n) i + ln 2 Γ(2 − α) |n| This leads to asymptotic values for the magnitude and phase of (9.107)
α 2 2 2 2 α ) π + ((1−α) ln 2πn + ln ) ∼ (2πn)α, |λn | ∼ (2n + 1−α 2 Γ(2−α)
(2 − α)π 2πn + (1 − α) π2 . ∼ arg(λn ) ∼ π − α atan α (α − 1) ln 2πn − ln Γ(2−α) 2 The magnitude prediction in (9.107) is accurate except for the first few eigenvalues, but the phase is accurate only for very large eigenvalue numbers, and this is particularly evident as α approaches 2. This can be attributed
250
Im (λ )
.. .....
α = 7/4
200
. .....
. .....
150 100 .. ..... .. ..... •
50 .. .....• .• .....
.. .....•
. .....
.. ....•
.. .....•
.. .....
.. .....
.. .....
.. ..... •
. ..... •
•
Im (λ )
α = 5/4
•
. ..... •
400
•
. ..... •
•
200
.. ..... •
Re (λ ) 25
600
50
75
100
.. .....
. ..... ... • ..••• .•
. .....
•
.. .....
. .....
.. .....
.. .....
. .....
.. .....
.. .....
. .....
.. .....
. .....
.. .....
.. .....
. .....
•
.. .....
• •
• •
• •
•
•
Re (λ ) 500
1000
1500
α Figure 9.5. Dirichlet eigenvalues of the operator −DC 0 Dx on (0, 1) for 5 7 α = 4 and α = 4
2000
284
0.3
0.2
0.1
0.0
-0.1
-0.2
9. Fundamental Inverse Problems for Fractional Order Models
Figure 9.6. Dirichlet eigenfunctions of the operator $-{}^{DC}_{0}D^{\alpha}_{x}$ on $(0, 1)$ for $\alpha = \frac{5}{4}, \frac{3}{2}, \frac{7}{4}$. The figure on the left shows the lowest frequency eigenfunction, while that on the right shows the second eigenfunction. The solid lines indicate the real part of $\phi_n$, and the dotted lines correspond to the imaginary part.
to the relatively crude approximation made in deriving these asymptotic formulae, in which only a limited number of terms of the Mittag-Leffler functions expanded at infinity are taken, but also to the slow convergence of the arctangent function for large argument.

This pattern appears to continue, thus leading to the conjecture that there is always an even number of real eigenvalues. In the case of $\alpha = 1.75$ numerical evidence suggests there are four of these, as indicated in Figure 9.5. However, these pairs can be quite close to each other and so there is no possible interval condition, as in the classical case of $\alpha = 2$ where there is exactly one (Dirichlet) eigenvalue in each interval $(n\pi, (n+1)\pi)$ if $\int_0^1 q(s)\,ds = 0$ [58]. For example, with $\alpha = 1.5991153$, the two smallest eigenvalues in magnitude are real and simple with approximate values 14.0024 and 14.0150. The eigenfunction of the lower eigenvalue has no zeros in $(0, 1)$, while that of the larger has a single zero, and so they are linearly independent. However, this zero occurs at $x \approx 0.9994$ and the supremum norm difference of the two eigenfunctions is less than $1.84 \times 10^{-4}$. Thus for all practical purposes, neglecting one member of such a pair will have virtually no effect on most numerical computations.

The eigenfunctions $\phi_n$ for $-{}^{DC}_{0}D^{\alpha}_{x}$ are given by $\phi_n(x) = x E_{\alpha,2}(-\lambda_n x^{\alpha})$. These functions are sinusoidal in nature but are significantly attenuated near $x = 1$, with the degree of attenuation strongly depending on the fractional order $\alpha$. Figure 9.6 shows the first two eigenfunctions for $\alpha = 5/4, 3/2, 7/4$. While it is usual to normalise eigenfunctions by scaling so the $L^2$ norm is set to unity, in one dimension an equivalent and frequently more useful approach is to specify an endpoint condition: for Dirichlet conditions normalise by
9.3. The fractional Sturm–Liouville problem
taking $\phi_n'(0) = 1$; otherwise set $\phi_n(0) = 1$. We have followed this approach here, taking $\phi_n'(0) = 1$. One consequence of this is that the eigenfunction amplitudes decrease with increasing index. In the case $\alpha = 2$ and $q = 0$, this normalisation on the Dirichlet eigenfunctions gives $\phi_n(x) = \sin(n\pi x)/n\pi$, whereas the $L^2$ normalisation to unity gives $\phi_n(x) = \sqrt{2}\sin(n\pi x)$. With complex-valued eigenfunctions one can always multiply by any complex number, so that multiplying by $i$ interchanges the previous real and imaginary parts. The choice of $\phi_n'(0) = 1$ sets a consistent choice for selecting the real part of the eigenfunction.

Figure 9.7 shows the location of the zeros of both the real and the imaginary parts of the first six eigenfunctions for the case $\alpha = \frac{3}{2}$, where the index count neglects the eigenvalue-eigenfunction pairs coming from the eigenvalues with negative imaginary part. It shows that the number of interior zeros of both the real and imaginary parts increases by two with consecutive eigenfunctions. Also, the number of interior zeros of the real and imaginary parts of the respective eigenfunction $\phi_n$ always differs by one. It should be stressed that this pattern is based on numerical evidence; there is, at the moment, no formal proof of this situation. Thus there is no strict interlacing of the zeros of the eigenfunctions as there is in the case $\alpha = 2$. This pattern seems to be typical for $\alpha$ less than the critical value $\alpha_{\mathrm{crit}}$ when all eigenvalues are complex. For $\alpha > \alpha_{\mathrm{crit}}$ the first eigenfunction is real and has no zeros. When $\alpha$ is just less than $\alpha_{\mathrm{crit}}$, a zero appears near $x = 1$ and with further decrease in $\alpha$ the location of this zero shifts towards $x = 0$. With $\alpha = 1.599025 < \alpha_{\mathrm{crit}}$, there are no real eigenvalues and the two smallest eigenvalues in magnitude are the complex conjugate pair $14.0062 \pm 0.1955i$. As we noted above, the imaginary part of the corresponding eigenfunction has no zeros in $(0, 1)$, but the real part does. This is at a point $x_0 \in (0.9997, 0.9998)$ and the maximum value of the real part on $(x_0, 1)$ is less than $5.25 \times 10^{-9}$. Thus in a practical sense this zero plays no computational role. In the case $\alpha > \alpha_{\mathrm{crit}}$ the number of interior zeros for consecutive real eigenvalues differs only by one, which is exactly the situation for the classical Sturm–Liouville case. After the eigenvalues again become complex, the situation reverts to the case of $\alpha < \alpha_{\mathrm{crit}}$. These examples highlight the difficulty in obtaining analytic results or estimates on the properties of the eigenvalues or eigenfunctions.

We now consider the case when the derivative is of Riemann–Liouville type. There are many similarities with the Djrbashian–Caputo derivative, but as we shall see, there are also some significant differences. This will be especially true when we turn to inverse Sturm–Liouville problems in Section 9.4.
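As a small numerical companion to the discussion of the Djrbashian–Caputo eigenfunctions, the following Python sketch evaluates $xE_{\alpha,2}(-\lambda x^{\alpha})$ through a truncated power series and returns its value at $x = 1$, which vanishes exactly when $\lambda$ is a Dirichlet eigenvalue. It is an illustration only: the function names are hypothetical, the truncated series is an assumption that is adequate only for moderate $|\lambda|$ (it suffers from cancellation for large arguments), and it is not claimed to be the method used to produce the figures or the eigenvalue approximations quoted in this section.

```python
import math


def mittag_leffler(alpha, beta, z, terms=80):
    """Truncated series E_{alpha,beta}(z) = sum_k z^k / Gamma(alpha*k + beta).

    Assumption: moderate |z| and alpha*terms + beta < 171, so that
    math.gamma does not overflow. z may be real or complex.
    """
    return sum(z ** k / math.gamma(alpha * k + beta) for k in range(terms))


def caputo_dirichlet_eigenfunction(alpha, lam, x):
    """phi(x) = x E_{alpha,2}(-lam x^alpha), so phi(0) = 0 and phi'(0) = 1."""
    return x * mittag_leffler(alpha, 2.0, -lam * x ** alpha)


def dirichlet_residual(alpha, lam):
    """Value phi(1); lam is a Dirichlet eigenvalue when this is zero."""
    return caputo_dirichlet_eigenfunction(alpha, lam, 1.0)
```

In the classical limit $\alpha = 2$ the series reduces to $\sin(\sqrt{\lambda}x)/\sqrt{\lambda}$, so `dirichlet_residual(2.0, math.pi ** 2)` vanishes up to rounding and the eigenfunction coincides with $\sin(\pi x)/\pi$. For $1 < \alpha < 2$ one would pass a complex `lam` and look for zeros of the residual in the complex plane.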
Figure 9.7. Location of the zeros of the first six Dirichlet eigenfunctions for the operator $-{}^{DC}_{0}D^{\alpha}_{x}$ on $(0, 1)$ with $\alpha = \frac{3}{2}$. The left figure is for the real part of $\phi_n$; the right for the imaginary part.

The solutions of the equation $-{}^{RL}_{0}D^{\alpha}_{x} u = \lambda u$ with $u(0) = 0$ are given by scalar multiples of
\[
u(x) = x^{\alpha-1} E_{\alpha,\alpha}(-\lambda x^{\alpha}). \tag{9.108}
\]
Thus the Dirichlet eigenvalues $\{\lambda_n\}$ on the interval $(0, 1)$ are determined by the zeros $z = -\lambda_n$ of $E_{\alpha,\alpha}(z)$. Our proof of the asymptotic behaviour for the Djrbashian–Caputo eigenvalues goes over without significant change. However, the eigenfunctions $\{\psi_n\}$ are now given by $\psi_n(x) = x^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n x^{\alpha})$. Since $\lim_{x\to 0} E_{\alpha,\alpha}(-\lambda x^{\alpha}) = 1/\Gamma(\alpha) \neq 0$, it follows that $\psi_n(x) \sim x^{\alpha-1}/\Gamma(\alpha)$ as $x \to 0$. Thus for $1 < \alpha < 2$, $\psi_n(x)$ behaves like a multiple of $x^{\alpha-1}$ near the origin and so cannot even be Lipschitz continuous at $x = 0$. There is therefore no possibility of normalising by setting the derivative at $x = 0$ to be unity, although we can modify this by setting a fractional derivative to be unity.
Figure 9.8. The leftmost figure shows the lowest Dirichlet eigenfunctions of the operator $-{}^{RL}_{0}D^{\alpha}_{x}$ on $(0, 1)$. The rightmost shows the number of real Dirichlet eigenvalues of $-{}^{RL}_{0}D^{\alpha}_{x}$ on $(0, 1)$ as a function of $\alpha$.

Figure 9.9. The Dirichlet eigenfunctions $\phi_2$, $\phi_5$, $\phi_{10}$ for $-{}^{RL}_{0}D^{\alpha}_{x}$ on $(0, 1)$ with $\alpha = \frac{5}{4}$. The real parts are shown on the left, the imaginary on the right.
We will prove below that there is always at least one real eigenvalue and that the associated eigenfunction is strictly positive in $(0, 1)$. Figure 9.8 shows the first eigenfunctions for our usual three $\alpha$ values, and it can be seen that they become more nearly symmetric about $x = 1/2$ as $\alpha$ increases. This is to be expected since as $\alpha \to 2$ the eigenfunction of $-u''$, $\sin n\pi x$, must be recovered. For $\alpha$ sufficiently near 1 there is only a single real eigenvalue and corresponding eigenfunction. This situation actually holds for $\alpha$ less than approximately 1.34, after which there are three real eigenvalues with linearly independent eigenfunctions. However, just after this critical value, the two additional eigenfunctions are almost indistinguishable from any practical viewpoint. After $\alpha$ is approximately 1.49, there is then a further bifurcation into five real eigenvalues with independent eigenfunctions. This process continues, but at a greatly accelerated rate, so that by $\alpha = 1.9$ there are 65 real eigenvalues. This is illustrated in the rightmost figure of Figure 9.8.

With increasing $\alpha$ from $\alpha = 1$, each subsequent real eigenvalue added has an eigenfunction with one more zero. When subsequent added eigenvalues become complex these occur in complex conjugate pairs, as in the Djrbashian–Caputo case (although there is no proof excluding that a real eigenvalue or eigenvalue pair might appear instead), and then the added eigenfunctions have two more zeros on $(0, 1)$ than the previous one. Figure 9.9 shows a few higher eigenfunctions for the case $\alpha = \frac{4}{3}$ and the behaviour here is typical of other values. One should note both the singular behaviour at the origin and the sinusoidal one with rapidly decreasing amplitude as $x \to 1$.
However, these are merely observations based on computations with known and provable error bounds and should not be considered a formal proof. As in the Djrbashian–Caputo case it is unclear mathematically whether the eigenvalue(s) at the bifurcation point is geometrically/algebraically simple. Naturally, the bifurcation is not stable under the perturbation of a potential term and then the eigenfunctions can be noticeably different; see [165].

Since most of what we have had to say here has been observational and based on computations, we will end with the previously promised analytical result.

Theorem 9.19. The lowest Dirichlet eigenvalue in the case of a Riemann–Liouville derivative with a zero potential is always real and positive. The associated eigenfunction is strictly positive in $(0, 1)$.

Proof. We consider the operator $T : C_0[0, 1] \to C_0[0, 1]$, $f \mapsto Tf$, with $Tf$ defined by
\[
Tf(x) = ({}_{0}I^{\alpha}_{x} f)(1)\, x^{\alpha-1} - {}_{0}I^{\alpha}_{x} f(x).
\]
It is readily checked that $\lambda$ is an eigenvalue of $-{}^{RL}_{0}D^{\alpha}_{x}$ iff $\frac{1}{\lambda}$ is an eigenvalue of $T$, with the same eigenfunction. Clearly, $T$ is linear and it can be easily shown that the operator $T : C_0[0, 1] \to C_0[0, 1]$ is compact. Let $K$ be the cone of nonnegative functions in $C_0[0, 1]$. We will show that the operator $T$ is positive on $K$. Let $f \in C_0[0, 1]$ and $f \ge 0$. Then
\[
\begin{aligned}
Tf(x) &= \frac{1}{\Gamma(\alpha)} \Bigl( \int_0^1 (1-t)^{\alpha-1} f(t)\,dt\; x^{\alpha-1} - \int_0^x (x-t)^{\alpha-1} f(t)\,dt \Bigr) \\
&= \frac{1}{\Gamma(\alpha)} \Bigl( \int_x^1 (1-t)^{\alpha-1} f(t)\,dt\; x^{\alpha-1} + \int_0^x \bigl((x - xt)^{\alpha-1} - (x-t)^{\alpha-1}\bigr) f(t)\,dt \Bigr).
\end{aligned} \tag{9.109}
\]
For any $x \in (0, 1)$, the first integral is nonnegative. Similarly, $(x - xt)^{\alpha-1} > (x - t)^{\alpha-1}$ holds for all $t \in (0, x)$, which shows the second integral is also nonnegative. Hence, $Tf \in K$, i.e., the operator $T$ is positive. Now it follows directly from the Krein–Rutman theorem [73, Theorem 19.2] that the spectral radius of $T$ is an eigenvalue of $T$, and an eigenfunction $u$ lies in $K \setminus \{0\}$.

Most authors have focused on retrieving as much as possible from well-known results of the classical case, in particular seeking conditions on the types of fractional operators at each boundary that will give real eigenvalues and mutually orthogonal eigenfunctions.
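The argument in the proof of Theorem 9.19 can be mimicked numerically. The sketch below discretises the integral operator $T$ by the midpoint rule, using the kernel read off from (9.109), and applies power iteration; the positivity of the kernel keeps every iterate positive, and the reciprocal of the limiting spectral radius approximates the lowest eigenvalue. The midpoint (Nyström) discretisation, the grid size, the iteration count and all function names are assumptions made for illustration and are not taken from the text.

```python
import math


def green_kernel(alpha, x, t):
    """Kernel G(x, t) with (Tf)(x) = int_0^1 G(x, t) f(t) dt, cf. (9.109)."""
    value = (x * (1.0 - t)) ** (alpha - 1.0)   # x^(alpha-1) (1-t)^(alpha-1)
    if t < x:
        value -= (x - t) ** (alpha - 1.0)      # term present only for t < x
    return value / math.gamma(alpha)


def lowest_rl_eigenpair(alpha, n=100, iterations=100):
    """Power iteration on a midpoint-rule discretisation of T.

    Returns (lam, u): lam approximates 1 / rho(T), and u is the
    corresponding discrete eigenvector scaled to have maximum 1.
    """
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    matrix = [[h * green_kernel(alpha, x, t) for t in nodes] for x in nodes]
    u = [1.0] * n
    rho = 1.0
    for _ in range(iterations):
        v = [sum(a * b for a, b in zip(row, u)) for row in matrix]
        rho = max(v)               # u has maximum 1, so max(T u) estimates rho(T)
        u = [vi / rho for vi in v]
    return 1.0 / rho, u
```

For $\alpha = 2$ the operator $T$ is the Green's operator of $-u''$ with Dirichlet conditions, so the routine should return a value close to $\pi^2$; for $1 < \alpha < 2$ it gives an approximation to the real, positive lowest eigenvalue together with a positive discrete eigenvector.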
Another possibility is to combine both types of derivatives in either an additive or a multiplicative way. In the first category there is the Riesz derivative, $\theta\,{}^{RL}_{a}D^{\alpha}_{x} + (1-\theta)\,{}^{RL}_{x}D^{\alpha}_{b}$, with the case $\theta = \frac{1}{2}$ giving the symmetric form. We will study this in more detail in Section 12.1.2.6.
The use of product combinations is motivated by the fact that they are much more amenable to an integration by parts formula than a combination appearing as a sum. The exact physical motivation for these combinations is to some extent still unclear. The Dirichlet boundary condition situation for this case has been considered by several authors, but the extension to fractional impedance conditions is straightforward, although a distinction then must be made between the type of fractional operator as the Riemann–Liouville and Djrbashian–Caputo derivatives are no longer identical; see, for example, [291], [360]. We now provide an example of such a two-sided derivative that comes as a product and has real eigenvalues.

A two sided fractional derivative. So far in this section we have considered only one-sided derivatives, those taken from the left endpoint. It is also possible to view this from a more symmetric perspective and consider an operator involving derivatives starting at both $x = a$ and $x = b$. We can make such a combination in several ways, either by sums or by products. The Riesz derivative (cf. Section 12.1.2.6) can be formulated in terms of weighted averages of left and right derivatives. However, from a mathematical perspective it turns out one of the simplest forms to analyse is to take a particular product combination of left and right Riemann–Liouville derivatives. We shall look at this case now; cf. [197]. For $\frac{1}{2} < \beta < 1$, let $L^{\beta}_{\pm}$ be defined on $(a, b)$ by
\[
L^{\beta}_{\pm} := -{}^{RL}_{x}D^{\beta}_{b}\, p(x)\, {}^{RL}_{a}D^{\beta}_{x} + q(x). \tag{9.110}
\]
Note this is composed of a left-sided Riemann–Liouville derivative followed by a right-sided Riemann–Liouville derivative at the other endpoint. We take $p(x) > 0$ and $q(x)$ to be continuous functions on $[a, b]$. Boundary conditions can be of the fractional impedance type, namely a linear combination of a fractional order derivative and the identity, ${}^{RL}_{a}D^{\alpha-1}_{x} u(a) - hu(a) = 0$, with a corresponding version at the right-hand endpoint. However, for simplicity we assume zero Dirichlet boundary conditions at both endpoints. With this assumption the Riemann–Liouville derivatives are also Djrbashian–Caputo derivatives.
For a positive and continuous weight function $r(x)$, we seek $\lambda$ and $\phi \in L^2(a, b)$ such that
\[
L^{\beta}_{\pm} \phi = \lambda r(x) \phi. \tag{9.111}
\]
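Lemma 9.12 is the analogue of the usual integration by parts formula for second order operators in divergence form and shows that $L^{\beta}_{\pm}$ is self-adjoint.

Lemma 9.12. If $f \in {}_{b}I^{\beta}_{x}(L^2(a, b))$ and $p(x)\,{}^{RL}_{a}D^{\beta}_{x} g \in {}_{a}I^{\beta}_{x}(L^2(a, b))$, then
\[
\int_a^b f(x) L^{\beta}_{\pm}[g(x)]\,dx = \int_a^b g(x) L^{\beta}_{\pm}[f(x)]\,dx.
\]

Proof. Set $I_{qfg} := \int_a^b q(x) f(x) g(x)\,dx$. Then we have
\[
\begin{aligned}
\int_a^b f(x) L^{\beta}_{\pm}[g(x)]\,dx &= -\int_a^b f(x)\,{}^{RL}_{x}D^{\beta}_{b}\bigl(p(x)\,{}^{RL}_{a}D^{\beta}_{x} g(x)\bigr)\,dx + I_{qfg} \\
&= -\int_a^b p(x)\,{}^{RL}_{a}D^{\beta}_{x} g(x)\,{}^{RL}_{a}D^{\beta}_{x} f(x)\,dx + I_{qfg} \\
&= -\int_a^b g(x)\,{}^{RL}_{x}D^{\beta}_{b}\bigl(p(x)\,{}^{RL}_{a}D^{\beta}_{x} f(x)\bigr)\,dx + I_{qfg} \\
&= \int_a^b g(x) L^{\beta}_{\pm}[f(x)]\,dx.
\end{aligned}
\]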
Theorem 9.20. All the eigenvalues of $L^{\beta}_{\pm}$ are real and eigenfunctions corresponding to distinct $\lambda$ are orthogonal with respect to the weighted $L^2$ inner product with weight function $r(x)$.

Proof. Taking complex conjugates, we have the pair $L^{\beta}_{\pm}\phi = \lambda r(x)\phi$, $L^{\beta}_{\pm}\bar{\phi} = \bar{\lambda} r(x)\bar{\phi}$, and using the above lemma, we obtain
\[
0 = \int_a^b [\bar{\phi} L^{\beta}_{\pm}\phi - \phi L^{\beta}_{\pm}\bar{\phi}]\,dx = (\lambda - \bar{\lambda}) \int_a^b r(x)|\phi(x)|^2\,dx.
\]
Since an eigenfunction must have a nonzero $L^2$ norm, it must follow that $\lambda = \bar{\lambda}$.

Suppose $\lambda_1$ and $\lambda_2$ are distinct eigenvalues corresponding to eigenfunctions $\phi_1(x)$ and $\phi_2(x)$, that is, $L^{\beta}_{\pm}\phi_1 = \lambda_1 r(x)\phi_1$ and $L^{\beta}_{\pm}\phi_2 = \lambda_2 r(x)\phi_2$. Now multiplying the first equation by $\phi_2$, the second by $\phi_1$, subtracting and integrating over $[a, b]$, we obtain
\[
(\lambda_1 - \lambda_2) \int_a^b r(x)\phi_1(x)\phi_2(x)\,dx = \int_a^b [\phi_2(x) L^{\beta}_{\pm}\phi_1(x) - \phi_1(x) L^{\beta}_{\pm}\phi_2(x)]\,dx = 0,
\]
where the last step follows from Lemma 9.12. Since we assumed $\lambda_1 \neq \lambda_2$, we obtain the second conclusion of the theorem.
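A discrete analogue of Lemma 9.12 may help to see why the product form is so convenient. In the sketch below the left-sided derivative is replaced by a lower triangular Grünwald–Letnikov matrix $B$ and the right-sided derivative by its transpose, so that the operator in (9.110) becomes $A = -B^{T} P B + Q$, which is symmetric by construction. The Grünwald–Letnikov discretisation, the interval $(0, 1)$, and all names are assumptions chosen for illustration; the text itself works only at the continuous level.

```python
def grunwald_weights(beta, n):
    """Weights w_k = (-1)^k binom(beta, k), computed by the usual recurrence."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (1.0 - (beta + 1.0) / k))
    return w


def two_sided_matrix(beta, p, q, n):
    """Discrete analogue A = -B^T P B + Q of the operator in (9.110) on (0, 1).

    B approximates the left-sided derivative on n interior nodes; its
    transpose plays the role of the right-sided derivative.
    """
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    w = grunwald_weights(beta, n)
    scale = h ** (-beta)
    B = [[scale * w[i - j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = sum(B[k][i] * p(x[k]) * B[k][j] for k in range(n))
            A[i][j] = -s + (q(x[i]) if i == j else 0.0)
    return A


def bilinear_form(f, A, g):
    """Discrete counterpart of int f(x) L[g](x) dx (up to the mesh factor)."""
    n = len(f)
    return sum(f[i] * A[i][j] * g[j] for i in range(n) for j in range(n))
```

Up to rounding, `bilinear_form(f, A, g)` and `bilinear_form(g, A, f)` agree for any vectors `f` and `g`, which is the matrix version of the identity in the lemma; a real symmetric matrix then has real eigenvalues and orthogonal eigenvectors, in line with Theorem 9.20 for the weight $r = 1$.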
It may seem that the existence of complex eigenvalues/eigenfunctions and lack of orthogonality of the eigenfunctions is due solely to the one-sided nature of the derivatives and that normal service is restored by looking at a two-sided operator. This is not the situation, and the example taken in this section should be viewed as a specific case rather than the general rule for two-sided eigenvalue problems.

Much of the discussion in this section on one-sided derivatives is taken from [169]. This case offers an opportunity to obtain the asymptotic behaviour of both the eigenfunctions and eigenvalues as a perturbation from the eigenfunctions arising from $q = 0$, as these are known Mittag-Leffler functions. What remains to be done is to obtain further terms in the expansion in the case of a nontrivial potential $q$. Indeed, the issue of obtaining such expansions in all cases, either with one- or two-sided derivatives, remains an open question.

We have seen in this section that there appears to be an interlacing property of the eigenfunctions when both real and imaginary parts are considered. It would be useful to be able to state and prove a precise formulation, not just for the one-sided, but also for the various combinations of two-sided derivatives.

Sturm–Liouville theory plays an important role in the solution of many pdes. Separation of variables and the fact that the solutions in each direction (that is, the eigenfunctions) are complete allows one to construct a corresponding complete family for the entire pde. There are many ways to see completeness of the eigenfunctions for the regular Sturm–Liouville problem but these rely on properties that may not be true for the fractional counterpart. The difficulties here were fully appreciated by Djrbashian: “Questions of completeness of eigenfunctions, or a more delicate question whether they are a basis in $L^2(0, 1)$ are undoubtedly interesting. But their solutions are apparently faced with significant analytical difficulties” [80]. There has been some more recent work on these questions with constant coefficients (so the Mittag-Leffler function can be used directly). We mention here the work of Malamud and of Aleroev, especially in the case where a fractional derivative appears in a lower order term; see for example [9]. In fact, this topic remains full of open questions and it is only in special cases that completeness has been shown. An example here is the two-sided Riemann–Liouville/Djrbashian–Caputo combination in [360].
9.4. The inverse Sturm–Liouville problem with fractional derivatives

We recall from Section 9.2.1 that the classical inverse Sturm–Liouville problem in potential form is to be given the operator $Lu := u'' - qu$, where $u$ is
subject to boundary conditions at the interval endpoints, which we will take to be $x = 0, 1$. Suppose that $u'(0) - hu(0) = 0$ and $u'(1) + Hu(1) = 0$ for fixed $h$, $H$, and let $\phi_n(x; q, h, H)$ be the eigenfunction for $-u'' + q(x)u = \lambda_n u$ with the above conditions. Then it is well known that the set $\{\phi_n(x; q, h, H)\}_{n=1}^{\infty}$ is complete in $L^2(0, 1)$; cf. Section 9.2.1. It is usual to normalise these eigenfunctions by the condition $u(0) = 1$ in the case that $h < \infty$ and take $u'(0) = 1$ in the Dirichlet case $h = \infty$ corresponding to $u(0) = 0$. Clearly, for the potential $q = 0$, the eigenfunctions of $-u'' = \lambda u$ with $u(0) = 0$ and $u(1) = 0$ are $\phi(x) = \sin(\sqrt{\lambda}x)/\sqrt{\lambda}$ while the eigenvalues are $\lambda_n = n^2\pi^2$. It is easy to see that $\{\phi_n^2(x)\}$ spans the set of even functions in $L^2(0, 1)$. If we now repeat for the boundary condition $u'(1) = 0$, then the resulting squared eigenfunctions span the set of odd functions on $L^2(0, 1)$. For $q = q_1, q_2$ we denote the respective eigenfunctions by $\phi_n(x; q_1)$ and $\phi_n(x; q_2)$, and integration by parts shows that
\[
\int_0^1 (q_1 - q_2)\,\phi_n(x; q_1)\,\phi_n(x; q_2)\,dx = 0 \quad \text{for all } n. \tag{9.112}
\]
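The role of the squared eigenfunctions in (9.112) can be made concrete with a few lines of Python. With both eigenfunctions frozen at $q = 0$, that is $\phi_n(x; 0) = \sin(n\pi x)/n\pi$, the integral in (9.112) annihilates every perturbation that is odd about the midpoint $x = \frac{1}{2}$ but not, in general, an even one. The midpoint quadrature and the function names below are assumptions made for this illustration.

```python
import math


def midpoint_integral(f, n=2000):
    """Composite midpoint rule on (0, 1) with n cells."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def squared_eigenfunction(n, x):
    """phi_n(x; 0)^2 for the Dirichlet eigenfunction sin(n pi x) / (n pi)."""
    return (math.sin(n * math.pi * x) / (n * math.pi)) ** 2


def linearised_moment(delta_q, n):
    """Integral in (9.112) with both eigenfunctions evaluated at q = 0."""
    return midpoint_integral(lambda x: delta_q(x) * squared_eigenfunction(n, x))
```

For example, `linearised_moment(lambda x: x - 0.5, n)` is zero up to rounding for every `n`, whereas `linearised_moment(lambda x: math.cos(2 * math.pi * x), 1)` equals $-1/(4\pi^2)$. This is the sense in which a single Dirichlet spectrum carries no linearised information on the odd part of the potential.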
If in the above one replaces the eigenfunctions by their value at $q = 0$, that is, $\phi_n(x; 0)$, the spanning argument shows that if $q_1$ and $q_2$ are even, then (9.112) proves that $q_1 = q_2$. This shows uniqueness for the linearised problem around a constant potential. What can be proven in general is that one needs two spectral sequences in order to uniquely recover $q$. Recall from Section 9.2.2 that examples here include the following.
• The eigenvalues $\{\lambda_{n,1}\}_{n=1}^{\infty}$ and $\{\lambda_{n,2}\}_{n=1}^{\infty}$ are given for the boundary parameters $h$, $H_1$ and for $h$, $H_2$, respectively. Then the pair $\{\lambda_{n,1}, \lambda_{n,2}\}_{n=1}^{\infty}$ uniquely determines the potential $q$. This is the two-spectrum problem.
• The eigenvalues $\{\lambda_n\}_{n=1}^{\infty}$ are given for fixed values of $h$, $H$ together with the $L^2$ norms of the eigenfunctions $\{\|\phi_n\|_{L^2}\}_{n=1}^{\infty}$. This is the norming constant problem, and the data also uniquely determines $q$.
• If it is known that the potential is symmetric about the midpoint $x = \frac{1}{2}$, that is $q(1 - x) = q(x)$, and if $h = H$, then the eigenvalues $\{\lambda_n\}_{n=1}^{\infty}$ uniquely determine $q$.

Now we turn to the inverse Sturm–Liouville problem with a Djrbashian–Caputo fractional derivative in the leading term. We take the boundary conditions to be $\phi(0) = \phi(1) = 0$. Given a spectral sequence $\{\lambda_n\}$, determine
the potential $q(x)$ (or more precisely, determine what we can about $q(x)$) in
\[
\begin{aligned}
-{}^{DC}_{0}D^{\alpha}_{x}\phi_n + q\phi_n &= \lambda_n\phi_n \quad \text{in } (0, 1), \\
\phi_n(0) = \phi_n(1) &= 0, \\
\phi_n'(0) &= 1,
\end{aligned} \tag{9.113}
\]
where the condition $\phi_n'(0) = 1$ serves as the normalisation condition of the eigenfunction $\phi_n$. The eigenvalues will be complex except for possibly the lowest few and the eigenfunctions will also be complex valued. When $q = 0$, the eigenfunctions are given by $xE_{\alpha,2}(-\lambda x^{\alpha})$, but now $\lambda \in \mathbb{C}$. We can compute the map $F_n : q \mapsto \phi(1)$ of the initial value problem for $\phi = \phi(q, \lambda_n)$,
\[
\begin{aligned}
-{}^{DC}_{0}D^{\alpha}_{x}\phi + q\phi &= \lambda_n\phi \quad \text{in } (0, 1), \\
\phi(0) = 0, \quad \phi'(0) &= 1,
\end{aligned} \tag{9.114}
\]
and the composite map $F = (F_1, F_2, \ldots)$ as well as its linearisation. The derivative $F_n'(q)[\delta q] = w(1)$ is obtained from the solution $w$ of the sensitivity equation
\[
\begin{aligned}
-{}^{DC}_{0}D^{\alpha}_{x} w + qw &= \lambda_n w - \phi(q, \lambda_n)\,\delta q \quad \text{in } (0, 1), \\
w(0) = 0, \quad w'(0) &= 0,
\end{aligned}
\]
where $\phi(q, \lambda_n)$ solves (9.114). Hence, the linearised problem $F'(0)[\delta q] = 0$ at $q = 0$ for a given eigenvalue $\lambda_n$ implies that with the solution representation from Theorem 5.4,
\[
0 = \int_0^1 K(\lambda_n, s)\,\delta q(s)\,ds
\]
with the kernel $K(\lambda_n, s) = (1 - s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n(1 - s)^{\alpha})\, sE_{\alpha,2}(-\lambda_n s^{\alpha})$. In case of $\alpha = 2$, the kernel $K(\lambda_n, s)$ is symmetric with respect to $s = 1/2$, i.e., $K(\lambda_n, s) = K(\lambda_n, 1 - s)$, and thus any odd function lies in the nullspace of the integral operator. However, for any $1 < \alpha < 2$, the kernel $K$ no longer satisfies $K(\lambda_n, 1 - s) = K(\lambda_n, s)$. Thus not all of the odd part of the function $\delta q$ needs to be in the nullspace of $F'(0)$. This, however, gives no indication of how much of the odd part of $\delta q$ is captured, nor of how much of the even part. Nevertheless, it seems that this single Dirichlet (complex-valued) spectrum is able to recover the entire $q$. See [169], where it was found that excellent reconstructions were possible for $\alpha$ less than about 3/2; the smaller the $\alpha$, the better the condition number and the behaviour of the singular values. In this regime all the eigenvalues occur in complex conjugate pairs; see (9.107) and Figure 9.5. As $\alpha$ increases towards $\alpha = 7/4$, the problem clearly shows strong ill-conditioning and by $\alpha = 1.9$ only a very few of the lowest frequency modes can be recovered (and surely only under relatively low noise). This is
expected, for as $\alpha \to 2$ we are attempting to recover the potential in the classical case from a single spectrum. We have no analytical proof or even solid heuristic reason why this works out to be the case, although there are other known inverse Sturm–Liouville problems with complex spectrum, the real and imaginary parts of which act as a pair of sequences carrying in a sense mutually exclusive information and allowing full recovery of a general potential. An example along these lines is in [297].

Figure 9.10 shows the singular values of the matrix $F'(0)$ when $\delta q$ is represented in the basis $\{\cos 2k\pi x, \sin 2k\pi x\}_{k=1}^{10}$. That on the left shows the case for smaller $\alpha$ values; that on the right illustrates what happens as $\alpha \to 2$. From this we see the conditioning of $F'(0)$ worsens with increasing $\alpha$, but the degree of ill-conditioning is relatively mild for $\alpha$ near unity, and even at $\alpha = 1.75$ we should expect to retain a significant number of the basis vectors unless the errors in the spectra are high. If one looks at the singular values for the cosine basis (the even functions), then we find these are only slightly smaller than for the corresponding sine basis (the odd functions) although this distinction does increase with increasing $\alpha$. Thus in the regime indicated by the leftmost figure, the distinction between our ability to recover the even and odd parts of $\delta q$ is quite small. However for larger $\alpha$ this is no longer true.
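The symmetry statement for the kernel $K(\lambda_n, s)$ that underlies this discussion can be probed numerically. The sketch below evaluates the kernel with a truncated Mittag-Leffler series and measures $\max_s |K(\lambda, s) - K(\lambda, 1 - s)|$ at a few sample points. The series evaluation, the sample points and the function names are assumptions made for illustration, and the value of $\lambda$ used in any experiment is merely a sample argument: the symmetry for $\alpha = 2$ holds for every $\lambda$, not only at eigenvalues.

```python
import math


def mittag_leffler(alpha, beta, z, terms=60):
    """Truncated series for E_{alpha,beta}(z); adequate for moderate |z| only."""
    return sum(z ** k / math.gamma(alpha * k + beta) for k in range(terms))


def kernel(alpha, lam, s):
    """K(lam, s) = (1-s)^(a-1) E_{a,a}(-lam (1-s)^a) * s E_{a,2}(-lam s^a)."""
    left = (1.0 - s) ** (alpha - 1.0) * mittag_leffler(
        alpha, alpha, -lam * (1.0 - s) ** alpha)
    right = s * mittag_leffler(alpha, 2.0, -lam * s ** alpha)
    return left * right


def asymmetry(alpha, lam, samples=9):
    """max |K(lam, s) - K(lam, 1 - s)| over equally spaced interior points."""
    pts = [(i + 1) / (samples + 1) for i in range(samples)]
    return max(abs(kernel(alpha, lam, s) - kernel(alpha, lam, 1.0 - s))
               for s in pts)
```

For $\alpha = 2$ the kernel reduces to $\sin(\sqrt{\lambda}(1 - s))\sin(\sqrt{\lambda}s)/\lambda$ and `asymmetry(2.0, lam)` is at rounding level, while for instance `asymmetry(1.5, 1.0)` is clearly nonzero, so odd perturbations are no longer automatically annihilated.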
Figure 9.10. The singular value spectrum of $F'(0)$ when $\delta q$ is represented in the basis $\{\cos 2k\pi x, \sin 2k\pi x\}_{k=1}^{10}$.
The right side of the figure shows a very definite break between the first 10 singular values and the last 10; the larger singular values correspond to the even basis functions, the smaller to the odd basis functions. This distinction increases with α. Note also as α increases towards the value 2, the singular values corresponding to the even basis functions actually decay at a slower rate, but on the other hand, the singular values for the odd basis functions decay faster as α → 2. This is exactly what one should expect as the problem converges to the classical case; we recover the very mildly
ill-conditioned problem of reconstructing the even part of $q$ (equal to a single derivative loss or $\sigma_n \sim 1/n$), but we have no information about the odd part of $q$.

Example 9.1. Consider the inverse Sturm–Liouville problem with a Riemann–Liouville fractional derivative,
\[
-{}^{RL}_{0}D^{\alpha}_{x}\psi + q\psi = \lambda\psi \quad \text{in } (0, 1), \qquad \psi(0) = \psi(1) = 0,
\]
with the normalisation condition $({}^{RL}_{0}D^{\alpha-1}_{x}\psi)(0) = 1$. Then the following facts can be shown.
(i) The eigenfunctions at $q = 0$ are given by $x^{\alpha-1}E_{\alpha,\alpha}(-\lambda x^{\alpha})$.
(ii) Let $F : q \mapsto \psi(1)$, where $\psi$ solves the initial value problem
\[
-{}^{RL}_{0}D^{\alpha}_{x}\psi + q\psi = \lambda_n\psi \quad \text{in } (0, 1), \qquad \psi(0) = 0, \quad ({}^{RL}_{0}D^{\alpha-1}_{x}\psi)(0) = 1.
\]
The linearised forward map $F'(0)[\delta q]$ at $q = 0$ with an eigenvalue $\lambda_n$ in the direction of $\delta q$ is given by
\[
F'(0)[\delta q] = \int_0^1 (1 - s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n(1 - s)^{\alpha})\, s^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n s^{\alpha})\,\delta q(s)\,ds = 0.
\]
(iii) Any odd function lies in the nullspace of the linearised forward map.

The inverse Sturm–Liouville problem, for all its extensive study in the classical case and its importance as a building block for many undetermined coefficient problems involving a spatially dependent unknown, remains much of an enigma when transported to fractional derivatives. In Section 9.3 we have seen the difficulty of obtaining information about the forwards problem, that is, being given the potential $q(x)$ and looking to recover information about the spectral properties of the operator, namely those of the eigenvalues and eigenvectors. In the case of the classical inverse Sturm–Liouville problem an important tool is the Gel’fand–Levitan transformation. If $\phi(x, \lambda)$ and $\psi(x, \lambda)$ are eigenfunctions corresponding to two different potentials $q_1$ and $q_2$ having the same spectrum $\{\lambda_n\}$ and initial conditions at $x = 0$, then they must be related by
\[
\psi(x) = \phi(x) + \int_0^x K(x, t)\,\phi(t)\,dt, \tag{9.115}
\]
where $K(x, t)$ satisfies the hyperbolic equation
\[
K_{tt} - K_{xx} + (q_1(x) - q_2(t))K = 0, \qquad K(x, \pm x) = h \pm \frac{1}{2}\int_0^x [q_1(s) - q_2(s)]\,ds \tag{9.116}
\]
involving $q_1$ and $q_2$ but independent of $\lambda$. Using the completeness of the eigenfunctions $\{\phi_n\}$ it is possible to show the uniqueness of most of the standard inverse eigenvalue problems, including those listed above from the analysis in Section 9.2; see also [300].

No such transformation seems to exist in the fractional case, and any attempt to emulate the steps in its derivation would certainly lead to replacing the kernel $K$ by one involving fractional derivatives in both $x$ and $t$ together with boundary terms that simply vanish in the classical case but could not be expected to do so in the fractional.

While it is of course possible to prove uniqueness for many of the second order inverse spectral problems, other properties of the second order differential equations come into play. An example here is the basic technique used by Borg [36] in his groundbreaking solution of the inverse spectral problem for two different boundary conditions. It is well known that the set $\{\phi_n(x; q, h, H)\}_{n=1}^{\infty}$ is complete in $L^2(0, 1)$, but Borg proved that the set of the squares $\{\phi_n^2(x; q, h, H)\}$ spans “half of $L^2$” in the following senses. First, if $q(x) = q(1 - x)$ and $H = h$ so that the problem is symmetric about the line $x = \frac{1}{2}$, then $\{\phi_n^2\}$ spans the set of even functions in $L^2(0, 1)$, and this is the key ingredient in the proof that the resulting spectrum $\{\lambda_{n,h,H}\}_{n=1}^{\infty}$ determines $q$ uniquely. Second, if $H_1 \neq H_2$, then the set $\{\phi_n^2(x; q, h, H_1), \phi_n^2(x; q, h, H_2)\}$ spans $L^2(0, 1)$, and this quickly leads to uniqueness of the potential $q$ from the two spectral sequences $\{\lambda_{n,h,H_1}, \lambda_{n,h,H_2}\}_{n=1}^{\infty}$. As we have seen in Section 9.3 even showing completeness of the eigenfunctions can be difficult in some cases; to determine the span of the squares seems a formidable challenge. Even if this were accomplished, the steps following in Borg’s method to show uniqueness of $q$ would have to be modified due to the lack of a simple integration by parts formula.
9.5. Fractional derivative as an inverse solution

One of the very first undetermined coefficient problems for pdes was studied in the paper by Jones [171], based on his 1961 PhD thesis. This is to determine the coefficient $a(t)$ from
\[
\begin{aligned}
u_t = a(t)u_{xx}, \quad u(x, 0) = 0, &\qquad 0 < x < \infty, \quad t > 0, \\
-a(t)u_x(0, t) = g(t), &\qquad 0 < t < T,
\end{aligned} \tag{9.117}
\]
under the overposed condition of measuring the temperature at $x = 0$,
\[
u(0, t) = \psi(t). \tag{9.118}
\]
The paper [171] completely analyses the problem, giving necessary and sufficient conditions for a unique solution and determining the exact level of
ill-conditioning. The key steps in the analysis are a change of variables and conversion of the problem to an equivalent integral equation formulation. Perhaps surprisingly, this approach involves an Abel fractional integral and subsequently a fractional derivative, as we now show.

The assumptions are that $g$ is continuous and positive and $\psi$ is continuously differentiable with $\psi(0) = 0$ and $\psi' > 0$ on $(0, T)$. In addition, the function $h(t)$ defined by
\[
h(t) = \frac{\sqrt{\pi}\, g(t)}{\int_0^t \frac{\psi'(\tau)}{(t - \tau)^{1/2}}\,d\tau} \tag{9.119}
\]
satisfies $\lim_{t\to 0} h(t) = h_0 > 0$. Note that $h$ is the ratio of the two data functions: the flux $g$ and the Djrbashian–Caputo derivative of order 1/2 of $\psi$. It was shown that any $a \in G := \{a \in C[0, T) : \inf_{t\in(0,T)} h(t)^2 \le a(t) \le \sup_{t\in(0,T)} h(t)^2\}$ that satisfies (9.117), (9.118) must also solve the integral equation
\[
a(t) = \frac{\sqrt{\pi}\, g(t)}{\int_0^t \frac{\psi'(\tau)\,d\tau}{[\int_\tau^t a(s)\,ds]^{1/2}}} =: Ta \tag{9.120}
\]
and vice-versa. The main result in [171] is that $T$ has a unique fixed point on $G$ and indeed $T$ is monotone in the sense of preserving the partial order on $G$; if $a_1 \le a_2$, then $Ta_1 \le Ta_2$.

Given this, it might seem that a parallel construction for the fractional diffusion counterpart of (9.117), $D^{\alpha}_t u = a(t)u_{xx}$, would be relatively straightforward by exactly the same method, but this seems not to be so. The basic steps for the parabolic version require items that just aren’t true in the fractional case, such as the product rule, and without these the above structure cannot be replicated, or at least not without some further ingenuity.
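To see the formulas (9.119) and (9.120) at work, consider the special case of a constant coefficient. Then $\int_\tau^t a\,ds = a(t - \tau)$, so that (9.120) reduces to $Ta = \sqrt{a}\,h(t)$, and the fixed point is $a = h^2$. The Python sketch below generates synthetic data for a constant coefficient, evaluates $h$ by a quadrature that removes the Abel singularity through the substitution $\tau = t - s^2$, and runs the fixed-point iteration restricted to constants. The restriction to constant $a$, the choice $\psi(t) = t + t^2$, the quadrature and all names are assumptions made for illustration; this is not the general algorithm of [171].

```python
import math


def abel_integral(dpsi, t, n=2000):
    """int_0^t dpsi(tau) / sqrt(t - tau) dtau via tau = t - s^2 (midpoint rule)."""
    upper = math.sqrt(t)
    ds = upper / n
    return ds * sum(2.0 * dpsi(t - ((i + 0.5) * ds) ** 2) for i in range(n))


def ratio_h(g, dpsi, t):
    """The function h(t) of (9.119)."""
    return math.sqrt(math.pi) * g(t) / abel_integral(dpsi, t)


def example_data(a_true):
    """Synthetic data for constant a: psi(t) = t + t^2 and g = sqrt(a) D^{1/2} psi."""
    def dpsi(t):
        return 1.0 + 2.0 * t

    def g(t):
        half_derivative = (math.sqrt(t) / math.gamma(1.5)
                           + 2.0 * t ** 1.5 / math.gamma(2.5))
        return math.sqrt(a_true) * half_derivative

    return g, dpsi


def iterate_constant_coefficient(h_value, a0, steps=40):
    """Iterate a <- T a for constant a, where T a = sqrt(a) * h."""
    a = a0
    for _ in range(steps):
        a = math.sqrt(a) * h_value
    return a
```

With `g, dpsi = example_data(2.0)` the value `ratio_h(g, dpsi, 0.7) ** 2` reproduces the coefficient 2 up to quadrature error, and `iterate_constant_coefficient` converges to the same value from any positive starting guess. The iteration is also visibly order preserving, in line with the monotonicity of $T$.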
Chapter 10
Inverse Problems for Fractional Diffusion
In this chapter, we discuss a selection of inverse problems for fractional diffusion, mostly the subdiffusion model (6.1). Given partial information about the solution to the equation, we are asked to find the unknown problem data, e.g., boundary condition, initial condition, or unknown coefficient in the differential operator. Such problems arise in many physical and engineering contexts. Mathematically, they exhibit dramatically different features and require different solution techniques. The aim of this chapter is to give a flavour of the above. In particular, we will study the influence of the fractional differentiation order $\alpha \in (0, 1)$ on the degree of ill-posedness and contrast the behaviour to the classical case $\alpha = 1$. The difference will be most pronounced in those instances in which the nonlocal nature of the fractional derivative and the resulting memory effect enhances the transport of information from the measured data to the searched-for quantity. An example of this is the backwards subdiffusion problem, where one seeks to identify the initial condition from final observations. A similar effect arises in sideways superdiffusion, where the fractional derivative is a spatial one and the flow of information heavily depends on the directionality that this derivative gets in the fractional case. As soon as the inverse problem is nonlinear, as in the various coefficient identification problems to be studied here, the situation becomes less transparent though, and there the emphasis will be more on exploring how much of the analysis and methodology available in the classical diffusion setting still carries over to anomalous diffusion.
10.1. Determining the initial condition
10.1.1. Backwards diffusion. The classical forward heat problem is as follows. Given the initial temperature $u_0(x)$ of a body (with known boundary conditions), can one uniquely determine the temperature $u(x,t)$ within the interior at a later time? Suppose that $\Omega \subset \mathbb{R}^d$ ($d = 1, 2, 3$) is an open, bounded, simply connected domain with a smooth boundary $\partial\Omega$. Then the temperature field $u$ satisfies
$$
\begin{aligned}
u_t &= \Delta u && \text{in } \Omega \times (0,T),\\
u &= 0 && \text{on } \partial\Omega \times (0,T),\\
u(\cdot,0) &= u_0 && \text{in } \Omega.
\end{aligned}
\tag{10.1}
$$
According to the standard theory for parabolic problems, the answer to the uniqueness question is yes and, additionally, there is a stability result. For example, the parabolic maximum principle implies immediately that
$$\sup_{(x,t)\in\Omega\times[0,T]} |u(x,t)| \le \sup_{x\in\Omega} |u_0(x)|,$$
provided that the solution $u(x,t)$ is continuous, and hence the function $g(x) := u(x,T)$ satisfies
$$\|g\|_{L^\infty(\Omega)} \le \|u_0\|_{L^\infty(\Omega)}.$$
Similarly, by a standard energy argument, one can get an estimate in $L^2(\Omega)$: multiplying (10.1) with $u$ and integrating with respect to space and time, using integration by parts in space, $\int_\Omega u(x,t)\Delta u(x,t)\,dx = -\int_\Omega |\nabla u(x,t)|^2\,dx$, and the identity $u u_t = \frac12 \frac{d}{dt}(u^2)$, we arrive at
$$\frac12 \int_\Omega (u(x,T))^2\,dx + \int_0^T\!\!\int_\Omega |\nabla u(x,t)|^2\,dx\,dt = \frac12 \int_\Omega (u(x,0))^2\,dx,$$
which implies
$$\|g\|_{L^2(\Omega)} \le \|u_0\|_{L^2(\Omega)}. \tag{10.2}$$
The classical backwards heat problem is as follows. Given the final temperature $g(x) := u(x,T)$ in the domain $\Omega$, for some time $T > 0$, can the initial distribution $u_0(x)$ be uniquely determined? It is well known that the answer to this question is again yes, but the stability is a very different issue. It is a severely ill-conditioned problem: even for a very small perturbation to the measured temperature $g$, which is inevitable in practice, the corresponding initial condition $u_0$ can be subject to arbitrarily large change. We can see this very quickly by applying the technique of separation of variables. Let $\{(\lambda_n, \varphi_n)\}_{n=1}^\infty$ be the Dirichlet eigenpairs of the negative Laplacian $-\Delta$ on the domain $\Omega \subseteq \mathbb{R}^d$, with normalised eigenfunctions. The eigenfunctions form an orthonormal basis in $L^2(\Omega)$, while the eigenvalues $\{\lambda_n\}$ (counting multiplicity) increase asymptotically at a rate $O(n^{2/d})$ as $n \to \infty$. Then the solution to (10.1) is given by
$$u(x,t) = \sum_{n=1}^\infty e^{-\lambda_n t} \langle u_0, \varphi_n\rangle \varphi_n(x). \tag{10.3}$$
Evaluating at $t = T$ to provide the data $g(x) = u(x,T)$ and writing both this and the unknown initial value in terms of the eigenfunction basis, $g_n = \langle g, \varphi_n\rangle$, $a_n = \langle u_0, \varphi_n\rangle$, gives the condition
$$a_n = e^{\lambda_n T} g_n, \tag{10.4}$$
from which a unique recovery of the Fourier modes $a_n$, and hence of $u_0(x)$, is clear. This simple formula also makes it obvious that any data error in $g(x)$, which is reflected in its Fourier modes, will be multiplied by an exponential factor rapidly increasing with the index $n$. This lack of stability is characteristic of many inverse problems, and the backwards heat problem shows this in dramatic fashion. The forwards operator, or the direct problem, here is to be given the initial data $u_0(x)$ and use it to determine the final value $u(x,T)$. As we see from (10.2) this represents a bounded operator. In fact, by elliptic regularity, assuming the domain $\Omega$ to be sufficiently smooth, we get that for some constant $\tilde C_p$,
$$
\begin{aligned}
\|g\|_{H^p(\Omega)}^2 &\le \tilde C_p^2 \|(-\Delta)^{p/2} g\|_{L^2(\Omega)}^2 = \tilde C_p^2 \sum_{n=1}^\infty \langle (-\Delta)^{p/2} g, \varphi_n\rangle^2 \\
&= \tilde C_p^2 \sum_{n=1}^\infty \lambda_n^p g_n^2 = \tilde C_p^2 \sum_{n=1}^\infty e^{-2\lambda_n T} \lambda_n^p a_n^2 \le C_p^2 \sum_{n=1}^\infty a_n^2,
\end{aligned}
$$
where $C_p := \tilde C_p \sup_{\lambda \ge 0} e^{-\lambda T}\lambda^{p/2} = \tilde C_p \bigl(\tfrac{p}{2eT}\bigr)^{p/2}$. Hence the estimate (10.2) can be replaced by
$$\|g\|_{H^p(\Omega)} \le C_p \|u_0\|_{L^2(\Omega)} \quad \text{for any } p, \tag{10.5}$$
showing that the forwards operator is a compact operator in L2 (Ω). Of course now its inverse cannot be a bounded operator, and this represents the classic dilemma of inverse problems: the inverse problem for a smoothing forwards operator will always be ill-posed. The technique for finding a stable approximation to an ill-posed problem is to replace the unbounded operator A representing the inverse map by a suitable bounded operator approximating it. This is the idea of regularisation, which is central to the study of inverse problems; cf. Chapter 8. In Section 8.3 we have discussed the method of quasi-reversibility for this purpose, and in fact Section 10.1.3 is devoted
to the use of a fractional time derivative for regularising the backwards heat problem. Before going there, we will study the problem of reconstructing the initial data as an inverse problem for the subdiffusion model itself.
10.1.2. Backwards subdiffusion. The subdiffusion model represents the fractional counterpart of the heat problem (with $\alpha \in (0,1)$),
$$
\begin{aligned}
\partial_t^\alpha u &= \Delta u && \text{in } \Omega \times (0,T),\\
u &= 0 && \text{on } \partial\Omega \times (0,T),\\
u(\cdot,0) &= u_0 && \text{in } \Omega.
\end{aligned}
\tag{10.6}
$$
From the solution theory in Chapter 6, the forward problem (10.6) is well-posed, with a unique solution that depends stably on the problem data. The corresponding backwards subdiffusion problem reads as follows. Given the final time data $g = u(T)$, can one recover the initial condition $u_0$? Now using the Mittag-Leffler function $E_{\alpha,\beta}(z)$ defined in Chapter 3, the solution $u$ to equation (10.6) can be expressed as
$$u(x,t) = \sum_{n=1}^\infty E_{\alpha,1}(-\lambda_n t^\alpha) \langle u_0, \varphi_n\rangle \varphi_n(x). \tag{10.7}$$
Hence, the final time data $g = u(T)$ is given by
$$g(x) = \sum_{n=1}^\infty a_n E_{\alpha,1}(-\lambda_n T^\alpha) \varphi_n(x),$$
where $a_n = \langle u_0, \varphi_n\rangle$ are the Fourier coefficients of the initial condition $u_0$ in the eigenfunction basis $\{\varphi_n\}$. By taking the inner product with $\varphi_n$, we deduce that the Fourier coefficients of $u_0$ and $g$ are related by
$$a_n = \frac{1}{E_{\alpha,1}(-\lambda_n T^\alpha)}\, g_n, \quad n = 1, 2, \ldots, \tag{10.8}$$
with $g_n = \langle g, \varphi_n\rangle$. Formally, it follows that the initial data $u_0$ is given by
$$u_0 = \sum_{n=1}^\infty \frac{\langle g, \varphi_n\rangle}{E_{\alpha,1}(-\lambda_n T^\alpha)}\, \varphi_n,$$
provided that the series on the right-hand side converges in a suitable norm. The denominator $E_{\alpha,1}(-\lambda_n T^\alpha)$ does not vanish (see Corollary 3.3), as a consequence of the complete monotonicity of the function $E_{\alpha,1}(-t)$ on $\mathbb{R}^+$. Formula (10.8) is very telling. For $\alpha = 1$, it of course reduces to the familiar expression $a_n = e^{\lambda_n T} g_n$, which as we have seen clearly shows the severely ill-posed nature of the backward heat problem: the perturbation in the $n$th Fourier mode $g_n$ of the (inevitably noisy) data $g$ is amplified by an exponentially growing factor $e^{\lambda_n T}$ in the corresponding Fourier mode $a_n$ of the initial data $u_0$. Even for a small index $n$, which corresponds to low frequency information, $a_n$ can be astronomically large if the terminal time $T$ is not exceedingly small. This in turn implies a very bad (numerical) stability, and theoretically one can only expect a logarithmic-type stability result. In the fractional case $\alpha \in (0,1)$, by Theorem 3.23, the Mittag-Leffler function $E_{\alpha,1}(-t)$ decays only linearly on $\mathbb{R}^+$ (that is, like $1/t$), and thus the multiplier term $1/E_{\alpha,1}(-\lambda_n T^\alpha)$ grows only linearly in $\lambda_n$, that is, $1/E_{\alpha,1}(-\lambda_n T^\alpha) \sim \lambda_n$, which is very mild compared to $e^{\lambda_n T}$ for $\alpha = 1$, indicating a much better-behaved inverse problem. Thus $|\langle u_0, \varphi_n\rangle| \le c\lambda_n |\langle g, \varphi_n\rangle|$, and so $|\langle u_0, \varphi_n\rangle| \le c|\langle g, \Delta\varphi_n\rangle|$, and integrating by parts twice using the Dirichlet boundary conditions gives $|\langle u_0, \varphi_n\rangle| \le c|\langle \Delta g, \varphi_n\rangle|$ if the data $g \in \dot H^2(\Omega)$. Hence, roughly, the inverse problem amounts to a loss of two spatial derivatives. More precisely, we have the following stability estimate [306, Theorem 4.1].
Theorem 10.1. Let $T > 0$ be fixed, and let $\alpha \in (0,1)$. For any $g \in \dot H^2(\Omega)$, there exists a unique $u_0 \in L^2(\Omega)$ and a weak solution $u \in C([0,T]; L^2(\Omega)) \cap C((0,T]; \dot H^2(\Omega))$ to problem (10.6) such that $u(\cdot,T) = g$. Moreover, there holds
$$c_1 \|u_0\|_{L^2(\Omega)} \le \|u(\cdot,T)\|_{H^2(\Omega)} \le c_2 \|u_0\|_{L^2(\Omega)}.$$
Proof. For any $u_0 \in L^2(\Omega)$, by the solution expansion (10.7), there holds
$$u(x,T) = \sum_{n=1}^\infty \langle u_0, \varphi_n\rangle E_{\alpha,1}(-\lambda_n T^\alpha) \varphi_n(x).$$
Hence, $u(\cdot,T) \in \dot H^2(\Omega)$ if and only if
$$\sum_{n=1}^\infty \langle u_0, \varphi_n\rangle^2 \lambda_n^2 E_{\alpha,1}(-\lambda_n T^\alpha)^2 < \infty.$$
For $g \in \dot H^2(\Omega)$, we have $c_1 \|g\|_{H^2(\Omega)}^2 \le \sum_{n=1}^\infty \lambda_n^2 \langle g, \varphi_n\rangle^2 \le c_2 \|g\|_{H^2(\Omega)}^2$. By Corollary 3.3, $E_{\alpha,1}(-\lambda_n t^\alpha)$ does not vanish on the positive real axis $\mathbb{R}^+$. Thus we may let $a_n = \langle g, \varphi_n\rangle / E_{\alpha,1}(-\lambda_n T^\alpha)$. By Theorem 3.25,
$$\sum_{n=1}^\infty a_n^2 \le \sum_{n=1}^\infty \langle g, \varphi_n\rangle^2 \bigl(1 + \Gamma(1-\alpha)\lambda_n T^\alpha\bigr)^2 \le c\,T^{2\alpha} \sum_{n=1}^\infty \lambda_n^2 \langle g, \varphi_n\rangle^2.$$
Letting $u_0 = \sum_{n=1}^\infty a_n \varphi_n$ and denoting by $u(x,t)$ the solution to (10.6) with this initial condition $u_0$, we have $g = u(\cdot,T)$ and $\|u_0\|_{L^2(\Omega)} \le c\,\|u(\cdot,T)\|_{H^2(\Omega)}$. The second inequality is already proved in Theorem 6.2.
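The contrast between the two amplification factors can be sketched numerically. The following is a minimal illustration, not taken from the text, under these assumptions: Ω = (0, 1), so that λ_n = (nπ)²; the order is fixed at α = 1/2, where the closed form E_{1/2,1}(−x) = exp(x²) erfc(x) is available; and the helper names `ml_half`, `amplification_heat`, and `amplification_subdiffusion_half` are hypothetical.

```python
import math

def ml_half(x):
    """E_{1/2,1}(-x) = exp(x^2) * erfc(x) for x >= 0 (closed form for alpha = 1/2)."""
    if x < 20.0:
        return math.exp(x * x) * math.erfc(x)
    # Large-x asymptotic expansion, used to avoid overflow of exp(x^2).
    y = 1.0 / (2.0 * x * x)
    return (1.0 - y + 3.0 * y * y - 15.0 * y ** 3) / (x * math.sqrt(math.pi))

def amplification_heat(lam, T):
    """Factor exp(lam * T) multiplying the n-th data mode when alpha = 1."""
    return math.exp(lam * T)

def amplification_subdiffusion_half(lam, T):
    """Factor 1 / E_{1/2,1}(-lam * T^{1/2}) from (10.8) with alpha = 1/2."""
    return 1.0 / ml_half(lam * math.sqrt(T))

T = 0.01
for n in (1, 5, 10, 20):
    lam = (n * math.pi) ** 2  # Dirichlet eigenvalues of -d^2/dx^2 on (0, 1)
    print(n, amplification_heat(lam, T), amplification_subdiffusion_half(lam, T))
```

In this setting the subdiffusion factor is the larger of the two for the lowest mode but stays below 1 + Γ(1/2) λ_n T^{1/2}, the bound used in the proof, while the parabolic factor grows exponentially in λ_n.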
Intuitively, the history mechanism of anomalous subdiffusion retains the complete dynamics of the physical process all the way to time t = 0, including the initial data. The parabolic case, being based on a Markov process, has no such memory effect and the coupling between the current and previous states is hence very weak. The dramatic difference between the α < 1 and α = 1 cases might lead to a belief that “inverse problems for fractional diffusion equations are always less ill conditioned than their classical counterparts”, in which case models based on the fractional derivative paradigm could escape the curse of being strongly ill-posed. However, this statement turns out to be quite false in some of the subsequent sections.
The condition number of the (discrete) forwards map F is a useful broad description of the degree of ill-posedness. The condition number at T = 0.01, 0.1, and 1 in Figure 10.1 (left) stays mostly around $O(10^4)$ for a fairly broad range of α values. This reflects the fact that for any α ∈ (0, 1), backwards subdiffusion amounts to a loss of two spatial derivatives, and we are just picking up this feature. As α → 1, the condition number eventually increases rapidly, reflecting the severely ill-posed nature of the classical backwards diffusion problem. In such situations, when the condition number is large, a more revealing picture can be provided by looking at the singular value spectrum of the map; see Figure 10.1 (right) for the first 50 singular values. It shows the nature of the backwards problem very well. The singular values with smaller indices (those less than about 25) are larger for the heat equation than for the fractional cases, and this becomes more pronounced as α decreases; hence, at this value of T, the heat equation should allow superior reconstructions (under conditions on the noise in the data g). The situation reverses when singular values with larger indices are considered, and it quickly shows the numerical infeasibility of recovering high frequency information from the backwards heat problem, whereas in the fractional case many more Fourier modes might be attainable (depending of course on the noise in the data).
Figure 10.1. Left: the condition number vs. the fractional order α. Right: the singular value spectrum at T = 0.001 for the backwards fractional diffusion. We only display the first 50 singular values.
Figure 10.2 shows the singular value spectra for α = 1 and α = 1/2. In the former case, the singular values decay exponentially, but there is considerable difference with the change of time scale. For T = 1 there is likely at most one usable singular value, and still only a handful at T = 0.1. All this is commensurate with the expected exponential decay. In the fractional case, due to the very different decay rate, there is a considerable number of singular values available even for larger times. However, there is still the initially larger decay in the fractional case.
[Figure 10.2 plot: log10(singular value) against the index k, 0 ≤ k ≤ 100, with one curve for each of T = 0.001, 0.01, 0.1, 1 in both panels.]
Figure 10.2. The singular value spectrum of the forward map F from the initial data to the final time data, for (left) α = 1/2 and (right) α = 1, at four different times for the backwards subdiffusion.
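The qualitative behaviour described for Figure 10.2 can be reproduced in an idealised setting. The sketch below is not the discrete forward map F used for the figures; it assumes Ω = (0, 1) with λ_n = (nπ)², in which case the map u0 → u(·, T) is diagonal in the eigenbasis and its singular values are E_{α,1}(−λ_n T^α). Only α = 1 and α = 1/2 are handled, the latter via E_{1/2,1}(−x) = exp(x²) erfc(x); the names `singular_values` and `count_above` and the usability threshold are hypothetical.

```python
import math

def ml_half(x):
    """E_{1/2,1}(-x) = exp(x^2) * erfc(x) for x >= 0."""
    if x < 20.0:
        return math.exp(x * x) * math.erfc(x)
    # Large-x asymptotic expansion, used to avoid overflow of exp(x^2).
    y = 1.0 / (2.0 * x * x)
    return (1.0 - y + 3.0 * y * y - 15.0 * y ** 3) / (x * math.sqrt(math.pi))

def singular_values(alpha, T, n_modes=100):
    """Singular values E_{alpha,1}(-lam_n T^alpha) of u0 -> u(., T) on (0, 1)."""
    values = []
    for n in range(1, n_modes + 1):
        lam = (n * math.pi) ** 2
        if alpha == 1.0:
            values.append(math.exp(-lam * T))
        elif alpha == 0.5:
            values.append(ml_half(lam * math.sqrt(T)))
        else:
            raise ValueError("this sketch only covers alpha = 1 and alpha = 1/2")
    return values

def count_above(values, threshold):
    """Number of singular values exceeding a (hypothetical) usability threshold."""
    return sum(1 for v in values if v > threshold)

for T in (0.001, 0.01, 0.1, 1.0):
    print(T,
          count_above(singular_values(1.0, T), 1e-6),
          count_above(singular_values(0.5, T), 1e-6))
```

With a threshold of 1e-6 standing in for the noise level, the parabolic case keeps very few modes at T = 0.1 and T = 1, whereas for α = 1/2 all of the first 100 modes remain above the threshold.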
We must however put these figures into physical perspective. The equations have been stated with a unit diffusion coefficient. In a physical model there would be a diffusion coefficient σ in the elliptic operator, ∇ · (σ∇u), and this could be of the order of $10^{-5}$ [55]. For the heat equation, this just amounts to rescaling time by the same factor. This is not quite true for the fractional case as the time variable appears raised to the power α. Mathematically, this does not change the ill-posed nature, but it does modify the time scales indicated in our rescaled model.
10.1.3. Regularisation of backwards diffusion by backwards subdiffusion. We slightly generalise the setting by using, in place of $\Delta$, an elliptic differential operator $L$ and setting $A = -L$ with homogeneous Dirichlet boundary conditions. By $\lambda_k$ we denote the eigenvalues of $A$ with corresponding eigenfunctions $\varphi_k$ forming an orthonormal basis of $L^2(\Omega)$, which also easily allows us to define space-fractional derivative operators as powers of $A$ according to
$$A^\beta v = \sum_{k=1}^\infty \lambda_k^\beta \langle v, \varphi_k\rangle \varphi_k. \tag{10.9}$$
We consider the backwards diffusion problem,
$$
\begin{aligned}
u_t - Lu &= 0, && (x,t) \in \Omega \times (0,T),\\
u(x,t) &= 0, && (x,t) \in \partial\Omega \times (0,T),\\
u(x,0) &= u_0, && x \in \Omega,
\end{aligned}
$$
that is
$$u_t + Au = 0, \qquad u(0) = u_0, \tag{10.10}$$
where $u_0$ is unknown and has to be determined from the final time value
$$u(x,T) = g(x), \qquad x \in \Omega, \tag{10.11}$$
for some $T > 0$ and a measured function $g(x)$ taken over the domain $\Omega$. We regularise (10.10) in the spirit of quasi-reversibility by replacing the first time derivative by a fractional one,
$$\partial_t^\alpha u + Au = 0, \qquad u(0) = u_0. \tag{10.12}$$
This regularisation is enabled by the stability estimate from Theorem 10.1. The constants $c_1$, $c_2$ there depend on $\alpha$, and in particular $c_2$ blows up as $\alpha$ tends to one. We will now quantify this dependence for later use in the convergence analysis. From Theorem 3.25 we obtain the following stability estimate.
Lemma 10.1.
$$\frac{1}{\lambda E_{\alpha,1}(-\lambda T^\alpha)} \le \frac{1}{\lambda} + \Gamma(1-\alpha) T^\alpha \le \bar C\, \frac{1}{1-\alpha} \tag{10.13}$$
for $\bar C = \frac{1}{4\lambda_1} + \frac{\sqrt{2}\,\pi}{4} \max\{T^{3/4}, T\}$ and all $\lambda \ge \lambda_1$, $\alpha \in [\alpha_0, 1)$, $\alpha_0 > 0$.
Proof. The first inequality in (10.13) is an immediate consequence of the lower bound in Theorem 3.25. The second inequality can be obtained by using the reflection formula $\Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin(\pi z)}$, which allows us to estimate
$$\Gamma(1-\alpha) = \frac{1}{\Gamma(\alpha)}\, \frac{\pi(1-\alpha)}{\sin(\pi(1-\alpha))}\, \frac{1}{1-\alpha} \le \frac{1}{\Gamma(1)} \max_{x \in [0, \pi(1-\alpha_0)]} \frac{x}{\sin(x)}\, \frac{1}{1-\alpha}$$
for $\alpha \in [\tfrac34, 1)$.
This lemma, when taken together with filtering of the data using a function $f_\gamma$ with $\lambda f_\gamma(\lambda) \le C_\gamma$, implies a bound on the noise propagation in time for the reconstruction of $u_0$ in the case of a fractional operator. The fact that the noise amplification grows only linearly with $\frac{1}{1-\alpha}$ as $\alpha \to 1$ is one of the key facts that renders fractional backwards diffusion an attractive regularising method.
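The second inequality in (10.13) involves only the Gamma function, so it can be spot-checked directly. The sketch below assumes the constant C̄ = 1/(4λ1) + (√2 π/4) max{T^{3/4}, T}, the range α ∈ [3/4, 1) used in the proof, and λ1 = π² (the first Dirichlet eigenvalue on Ω = (0, 1)); the function names `c_bar` and `middle_term` are hypothetical, and a finite sample of parameter values is of course no substitute for the proof.

```python
import math

def c_bar(lam1, T):
    """Constant C-bar of Lemma 10.1, as read from (10.13)."""
    return 1.0 / (4.0 * lam1) + math.sqrt(2.0) * math.pi / 4.0 * max(T ** 0.75, T)

def middle_term(lam, alpha, T):
    """Middle expression of (10.13): 1/lam + Gamma(1 - alpha) * T^alpha."""
    return 1.0 / lam + math.gamma(1.0 - alpha) * T ** alpha

lam1 = math.pi ** 2  # assumption: first Dirichlet eigenvalue on (0, 1)
for alpha in (0.75, 0.9, 0.99, 0.999):
    for T in (0.01, 1.0, 10.0):
        lhs = middle_term(lam1, alpha, T)
        rhs = c_bar(lam1, T) / (1.0 - alpha)
        print(alpha, T, lhs, rhs, lhs <= rhs)
```

The printed right-hand sides grow like 1/(1 − α), which is the linear growth in the noise amplification referred to above.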
Convergence of $E_{\alpha,1}(-\lambda T^\alpha)$ to $\exp(-\lambda T)$ as $\alpha \to 1$ is clear, but to prove convergence of the backwards subdiffusion regularisation $u_0^\delta(\cdot, \alpha)$ to $u_0$ we require rate estimates in terms of $1 - \alpha$.
Lemma 10.2. For any $\alpha_0 \in (0,1)$ and $p \in [1, \frac{1}{1-\alpha_0})$, there exists $C = C(\alpha_0, p) > 0$ such that for all $\lambda \ge \lambda_1$, $\alpha \in [\alpha_0, 1)$,
$$|E_{\alpha,1}(-\lambda T^\alpha) - \exp(-\lambda T)| \le C \lambda^{1/p} (1 - \alpha). \tag{10.14}$$
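Proof. To prove (10.14), we employ an energy estimate for the ODE satisfied by $v(t) := E_{\alpha,1}(-\lambda t^\alpha) - \exp(-\lambda t) = u_{\alpha,\lambda}(t) - u_{1,\lambda}(t)$ (see Theorem 5.4 with $n = 1$),
$$\partial_t v + \lambda v = -(\partial_t^\alpha - \partial_t) u_{\alpha,\lambda} =: w.$$
Multiplying with $|v(\tau)|^{p-1}\,\mathrm{sign}(v(\tau))$, integrating from $0$ to $t$, and then applying Young's inequality yields
$$
\begin{aligned}
\frac{1}{p} |v(t)|^p + \lambda \int_0^t |v(\tau)|^p\,d\tau &= \int_0^t w(\tau) |v(\tau)|^{p-1}\,\mathrm{sign}(v(\tau))\,d\tau \\
&\le \frac{1}{p\lambda^{p-1}} \int_0^t |w(\tau)|^p\,d\tau + \frac{(p-1)\lambda}{p} \int_0^t |v(\tau)|^p\,d\tau,
\end{aligned}
$$
i.e., after multiplication with $p$,
$$|v(t)|^p + \lambda \int_0^t |v(\tau)|^p\,d\tau \le \frac{1}{\lambda^{p-1}} \int_0^t |w(\tau)|^p\,d\tau. \tag{10.15}$$
We proceed by deriving an estimate of the $L^p$ norm of $w$ of the form
$$\Bigl(\int_0^t |w(\tau)|^p\,d\tau\Bigr)^{1/p} \le C \lambda (1 - \alpha)$$
with $C > 0$ independent of $\alpha$ and $\lambda$. We do so using its Laplace transform and the fact that
$$w = -(\partial_t^\alpha - \partial_t) u_{\alpha,\lambda} = \partial_t E_{\alpha,1}(-\lambda t^\alpha) - h_\alpha * \partial_t E_{\alpha,1}(-\lambda t^\alpha),$$
where
$$h_\alpha(t) = \frac{1}{\Gamma(1-\alpha)}\, t^{-\alpha} \quad \text{with} \quad (\mathcal{L} h_\alpha)(\xi) = \xi^{\alpha-1},$$
due to $\mathcal{L}(t^p)(\xi) = \Gamma(1+p)\,\xi^{-(1+p)}$ for $p > -1$. Using the identity
$$\mathcal{L}\bigl(\partial_t E_{\alpha,1}(-\lambda t^\alpha)\bigr)(\xi) = -\frac{\lambda}{\lambda + \xi^\alpha}$$
(that follows from Theorem 5.4 with $n = 1$) together with the convolution theorem, we have
$$(\mathcal{L} w)(\xi) = \lambda\, \frac{\xi^{\alpha-1} - 1}{\lambda + \xi^\alpha} = \lambda\, A(\xi; \alpha)\, B(\xi; \alpha).$$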
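Here we have split $\frac{1}{\lambda}(\mathcal{L} w)(\xi)$ into two factors using a fixed exponent $\rho > 0$ to be chosen in dependence of $\alpha_0$ but independently of $\alpha \in [\alpha_0, 1)$,
$$
\begin{aligned}
A(\xi; \alpha) &= \frac{\xi^\rho}{\lambda + \xi^\alpha} = \mathcal{L}(a(t; \alpha)),\\
B(\xi; \alpha) &= \xi^{\alpha-1-\rho} - \xi^{-\rho} = \mathcal{L}(b(t; \alpha)) = \mathcal{L}(\theta(t; \alpha) - \theta(t; 1))
\end{aligned}
$$
with $\theta(t, \alpha) = \frac{1}{\Gamma(1-\alpha+\rho)}\, t^{\rho-\alpha}$. We now choose $q \in [1, \infty]$ such that for $q^* = \frac{q}{q-1}$ the norm $\|A(\cdot; \alpha)\|_{L^{q^*}(\mathbb{R})}$ is finite and therefore,
$$\|a(\cdot; \alpha)\|_{L^q(0,T)} \le C \|A(\cdot; \alpha)\|_{L^{q^*}(\mathbb{R})} \le C_1 < \infty.$$
This can be achieved by imposing
$$q^* > \frac{1}{\alpha - \rho}. \tag{10.16}$$
For the $B$ factor, we employ the mean value theorem to conclude that for any $t > 0$ there exists $\tilde\alpha = \tilde\alpha(t) \in [\alpha, 1]$ such that
$$
\begin{aligned}
\theta(t, \alpha) - \theta(t, 1) &= \frac{d\theta}{d\alpha}(t, \tilde\alpha)\,(\alpha - 1) \\
&= \frac{1}{\Gamma(1-\tilde\alpha+\rho)}\, t^{\rho-\tilde\alpha} \Bigl(-\frac{\Gamma'(1-\tilde\alpha+\rho)}{\Gamma(1-\tilde\alpha+\rho)} + \log(t)\Bigr)(1 - \alpha).
\end{aligned}
$$
Now we choose $r \in [1, \infty]$ such that
$$\|b(\cdot, \alpha)\|_{L^r(0,T)} = \|\theta(\cdot, \alpha) - \theta(\cdot, 1)\|_{L^r(0,T)} \le C_2 (1 - \alpha),$$
which is achieved by imposing
$$r < \frac{1}{1 - \rho}. \tag{10.17}$$
Altogether we have, by Young's convolution inequality (A.15), that
$$\|w\|_{L^p(0,T)} = \lambda \|a(\cdot, \alpha) * b(\cdot; \alpha)\|_{L^p(0,T)} \le \lambda \|a(\cdot, \alpha)\|_{L^q(0,T)} \|b(\cdot; \alpha)\|_{L^r(0,T)} \le C_1 C_2 \lambda (1 - \alpha),$$
provided $\frac{1}{q} + \frac{1}{r} \le 1 + \frac{1}{p}$. Together with (10.16) and (10.17) and uniformity with respect to $\alpha \in [\alpha_0, 1)$ this leads to the condition $p < \frac{1}{1-\alpha_0}$.
Lemma 10.2 together with the stability estimate (10.13) yields the following bound which will be crucial for our convergence analysis in Section 10.1.3.3.
Lemma 10.3. For any $\alpha_0 \in (0,1)$ and $p \in [1, \frac{1}{1-\alpha_0})$, there exists $\tilde C = \tilde C(\alpha_0, p) > 0$ such that for all $\lambda \ge \lambda_1$, $\alpha \in [\alpha_0, 1)$,
$$\Bigl| \frac{\exp(-\lambda T)}{E_{\alpha,1}(-\lambda T^\alpha)} - 1 \Bigr| \le \tilde C \lambda^{1+1/p}. \tag{10.18}$$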
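The first-order rate in 1 − α asserted in (10.14) can be observed numerically for moderate arguments. The sketch below evaluates E_{α,1}(z) by its defining power series, which is only reliable for small |z| because of cancellation, and compares it with exp(−λT) for λ = T = 1. The function names `mittag_leffler_series` and `gap`, the number of terms, and the parameter choices are assumptions made for illustration.

```python
import math

def mittag_leffler_series(alpha, z, n_terms=80):
    """Partial sum of E_{alpha,1}(z) = sum_k z^k / Gamma(alpha k + 1); moderate |z| only."""
    return sum(z ** k / math.gamma(alpha * k + 1.0) for k in range(n_terms))

def gap(alpha, lam, T):
    """|E_{alpha,1}(-lam T^alpha) - exp(-lam T)|, the left-hand side of (10.14)."""
    return abs(mittag_leffler_series(alpha, -lam * T ** alpha) - math.exp(-lam * T))

for alpha in (0.9, 0.95, 0.99):
    g = gap(alpha, 1.0, 1.0)
    print(alpha, g, g / (1.0 - alpha))  # the ratio stays bounded as alpha -> 1
```

For larger λT^α a more robust evaluation of the Mittag-Leffler function would be needed; the series is used here only because it keeps the example self-contained.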
10.1.3.1. Regularisation strategies. In Section 8.3, we have outlined several possible candidates for a quasi-reversible regulariser for the backwards heat equation. In this section we provide an overall strategy involving fractional derivatives and look at how individual regularising equations fit in. The ultimate idea will be to split the problem into distinct frequency bands and then combine them to recover the value of $u_0(x)$. This is feasible since the mapping $u_0 \mapsto g(x)$ is linear. Such a strategy is of course not new for this problem, but the key is to recognise that each quasi-reversible component that we have described will perform differently over each frequency band, and the problem is how to make the most effective combination. Throughout, we assume that the final value $g(x)$ has been measured subject to a noise level, the magnitude of which we know,
$$\|g - g^\delta\|_{L^2(\Omega)} \le \delta \tag{10.19}$$
with given $\delta > 0$.
(a) Using a subdiffusion regularisation. Perhaps the simplest possibility of regularisation by a subdiffusion process is to replace the time derivative in (10.10) by one of fractional order $\partial_t^\alpha$, relying on the stability estimate from Lemma 10.1, that is, simply define $u_\alpha$ by (10.11), (10.12). However, there are two issues that need to be taken into account. First, from Theorem 6.2(i) there is still some smoothing of the subdiffusion operator, and the actual final value at $t = T$ will lie in $\dot H^2(\Omega)$. The subdiffusion equation still decays to zero for large $T$, and indeed the amplification factor $A_{\mathrm{frac}}(k, \alpha)$ connecting the Fourier coefficients
$$A_{\mathrm{frac}}(k, \alpha) \langle g, \varphi_k\rangle = \langle u(\cdot, 0; \alpha), \varphi_k\rangle \tag{10.20}$$
is
$$A_{\mathrm{frac}}(k, \alpha) = 1 / E_{\alpha,1}(-\lambda_k T^\alpha), \tag{10.21}$$
and thus grows linearly in $\lambda_k$. Thus we must form some approximation $\tilde g^\delta(x)$ of the data $g^\delta(x)$ in $\dot H^2(\Omega)$, in order that the amplification factors remain bounded for all $\lambda_k$. This is easily accomplished by some conventional regularisation method; cf. Sections 8.1, 8.2.3.2. Tikhonov regularisation (or some iterated version of it) is not appropriate for this purpose, since due to its saturation at a finite smoothness level, it would not be able to optimally exploit the fact that we deal with infinitely smooth exact data $g$. Thus we employ Landweber iteration for this purpose,
$$w^{(i+1)} = w^{(i)} - \mu A^{-2} (w^{(i)} - g^\delta), \qquad w^{(0)} = 0, \tag{10.22}$$
which by setting $w^{(i)} = A^{-1} p^{(i)}$ can be interpreted as a gradient descent method for the minimisation problem
$$\min_{p \in L^2(\Omega)} \frac12 \|A^{-1} p - g^\delta\|_{L^2(\Omega)}^2,$$
and set $\tilde g^\delta = w^{(i_*)}$ for some appropriately chosen index $i_*$. In practice we use the discrepancy principle for this purpose, while the convergence result in Lemma 10.4 employs an a priori choice of $i_*$. In (10.22), the step size $\mu$ is assumed to satisfy the condition $\mu \in \bigl(0, \frac{1}{\|A^{-2}\|_{L^2 \to L^2}}\bigr)$.
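Because $A$ is diagonal in the eigenbasis, the Landweber iteration (10.22) decouples into scalar recursions for the Fourier coefficients. The following sketch carries this out and compares with the closed-form filter factor $1 - (1 - \mu\lambda_k^{-2})^i$. It assumes Ω = (0, 1) with λ_k = (kπ)², so that $\|A^{-2}\| = \lambda_1^{-2}$; the function names `landweber_coefficients` and `landweber_filter`, the step size 0.9 λ_1², and the iteration count are hypothetical choices.

```python
import math

def landweber_coefficients(g_coeffs, lam, mu, n_iter):
    """Run (10.22) mode by mode: w_k <- w_k - mu * lam_k^{-2} * (w_k - g_k), w_k(0) = 0."""
    w = [0.0] * len(g_coeffs)
    for _ in range(n_iter):
        w = [wk - mu * (wk - gk) / (lk * lk) for wk, gk, lk in zip(w, g_coeffs, lam)]
    return w

def landweber_filter(lam_k, mu, n_iter):
    """Closed-form filter factor 1 - (1 - mu / lam_k^2)^n_iter acting on mode k."""
    return 1.0 - (1.0 - mu / (lam_k * lam_k)) ** n_iter

lam = [(k * math.pi) ** 2 for k in range(1, 21)]
mu = 0.9 * lam[0] ** 2          # satisfies mu < 1 / ||A^{-2}|| = lam_1^2
n_iter = 50
g = [1.0] * len(lam)            # flat (rough) data coefficients, for illustration
w = landweber_coefficients(g, lam, mu, n_iter)
for k in (1, 5, 10, 20):
    print(k, w[k - 1], lam[k - 1] * w[k - 1])
```

The product λ_k · filter stays below sqrt(μ i), which is the kind of bound λ f_γ(λ) ≤ C_γ needed to keep the linearly growing factors (10.21) under control.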
Second, one has to check that none of the amplification coefficients in (10.21) exceeds that for the heat equation itself. However, as shown in [170], for any value of $T$ there exists an $N$ such that all amplification factors $A_{\mathrm{frac}}(k, \alpha)$ in (10.21) exceed those of the parabolic problem, $A_{\mathrm{par}}(k) := e^{\lambda_k T}$, for $k \le N$. In this sense the low frequencies are more difficult to recover by means of the regularising subdiffusion equation than by the parabolic equation itself. Of course for large values of $\lambda$ the situation reverses, as the Mittag-Leffler function decays only linearly for large argument. Figure 10.3 shows the plots of $A_{\mathrm{frac}}(k, \alpha)$ for $\alpha = 0.5, 0.9, 1$.
[Figure 10.3 plot: log10(A) against λ for 0 ≤ λ ≤ 600, with curves for α = 0.5, α = 0.9, and α = 1.]
Figure 10.3. Amplification factor A(λk , α)
Thus we modify the reconstruction scheme and there are several possibilities, of which we describe three.
(b) Using split-frequencies. The first approach is to modify the reconstruction scheme as follows: for frequencies $k \le K$, we recover the Fourier coefficients of $u_0$ by simply inverting the parabolic equation as is, using $A_{\mathrm{par}}(k)$, and for frequencies $k > K$ we use $A_{\mathrm{frac}}(k, \alpha)$ defined in (10.21), i.e.,
$$
\langle u_0^\delta(\cdot; \alpha, K), \varphi_k\rangle =
\begin{cases}
A_{\mathrm{par}}(k) \langle g^\delta, \varphi_k\rangle & \text{for } k \le K,\\
A_{\mathrm{frac}}(k, \alpha) \langle \tilde g^\delta, \varphi_k\rangle & \text{for } k \ge K + 1.
\end{cases}
$$
The question remains how to pick $K$ and $\alpha$. For this purpose, we use the discrepancy principle: in both cases we use the assumption on the noise level in $g^\delta$ and its smoothed version $\tilde g^\delta$, respectively. More precisely, we first of all apply the discrepancy principle to find $K$, which (according to existing results on truncated singular value expansion; see, for example, [97]) gives an order-optimal (with respect to the $L^2$ norm) low frequency reconstruction $u_{0,\mathrm{lf}}^\delta(\cdot, K)$. Then we aim at improving this reconstruction by adding higher frequency components that cannot be recovered by the pure backwards heat equation, which is enabled by a subdiffusion regularisation acting only on these frequencies. The exponent $\alpha$ acts as a regularisation parameter that is again chosen by the discrepancy principle. We refer to Section 10.1.3.3 for details on this procedure. This works remarkably well for a wide range of functions $u_0$. It works less well if the initial value contains a significant amount of midlevel frequencies as well as those of low and high order. In this case the split-frequency idea can be adapted as follows. As above we determine the value of $K_1 = K$ using the discrepancy principle; this is the largest frequency mode that can be inverted using the parabolic amplification $A_{\mathrm{par}}(k)$ given the noise level $\delta$. We then estimate $K_2$, which will be the boundary between the mid- and high frequencies. With this estimate we again use the discrepancy principle to determine the optimal $\alpha_2$, where we will use $A_{\mathrm{frac}}(k, \alpha_2)$ for those frequencies above $K_2$ to recover the Fourier coefficients of $u_0$ for $k > K_2$. In practice we set a maximum frequency value (denoted $K_3$ in the formula below). By taking various values of $K_2$, we perform the above to obtain the overall best fit in the above scheme.
To regularise the midfrequency range, we again use the subdiffusion equation with $\alpha = \alpha_1$ and choose this parameter by again using the discrepancy principle; i.e., we set
$$
\langle u_0^\delta(\cdot; \alpha_1, \alpha_2, K_1, K_2), \varphi_k\rangle =
\begin{cases}
A_{\mathrm{par}}(k) \langle g^\delta, \varphi_k\rangle & \text{for } k \le K_1,\\
A_{\mathrm{frac}}(k, \alpha_1) \langle \tilde g^\delta, \varphi_k\rangle & \text{for } K_1 + 1 \le k \le K_2,\\
A_{\mathrm{frac}}(k, \alpha_2) \langle \tilde g^\delta, \varphi_k\rangle & \text{for } K_2 + 1 \le k \le K_3,\\
0 \text{ or } A_{\mathrm{frac}}(k, \alpha_3) \langle \tilde g^\delta, \varphi_k\rangle & \text{for } k \ge K_3 + 1.
\end{cases}
$$
Thus we solve the backwards diffusion equation in three frequency ranges, $(1, K_1)$, $(K_1, K_2)$, and $(K_2, K_3)$, using (10.21) with $\alpha = 1$, $\alpha = \alpha_1$, and $\alpha_2$.
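The single-split formula can be sketched in terms of Fourier coefficients. This is only an illustration under simplifying assumptions: Ω = (0, 1) with λ_k = (kπ)², the fractional order is fixed at α = 1/2 (so that E_{1/2,1}(−x) = exp(x²) erfc(x) can be used), K is prescribed rather than chosen by the discrepancy principle, and exact data are passed in place of both g^δ and its smoothed version. The function name `split_frequency_reconstruction` and all parameter values are hypothetical.

```python
import math

def ml_half(x):
    """E_{1/2,1}(-x) = exp(x^2) * erfc(x) for x >= 0."""
    if x < 20.0:
        return math.exp(x * x) * math.erfc(x)
    # Large-x asymptotic expansion, used to avoid overflow of exp(x^2).
    y = 1.0 / (2.0 * x * x)
    return (1.0 - y + 3.0 * y * y - 15.0 * y ** 3) / (x * math.sqrt(math.pi))

def split_frequency_reconstruction(g_coeffs, g_smooth_coeffs, lam, T, K):
    """Coefficients of u0: A_par(k) * g_k for k <= K, A_frac(k, 1/2) * g~_k for k > K."""
    coeffs = []
    for k, lam_k in enumerate(lam, start=1):
        if k <= K:
            coeffs.append(math.exp(lam_k * T) * g_coeffs[k - 1])
        else:
            coeffs.append(g_smooth_coeffs[k - 1] / ml_half(lam_k * math.sqrt(T)))
    return coeffs

lam = [(k * math.pi) ** 2 for k in range(1, 31)]
T, K = 0.02, 5
a = [1.0 / k for k in range(1, 31)]                       # coefficients of a sample u0
g = [math.exp(-l * T) * ak for l, ak in zip(lam, a)]      # exact heat-equation data
rec = split_frequency_reconstruction(g, g, lam, T, K)
print(rec[:8])
```

With exact data the low modes are recovered exactly, while the modes above K are damped rather than amplified exponentially; this is what makes the high-frequency part stable in the presence of noise.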
We remark that this process could be extended, whereby we split the frequencies into $[1, K_1]$, $(K_1, K_2]$, . . ., $(K_{\ell-1}, K_\ell)$ and use the discrepancy principle to obtain a sequence of values $\alpha = 1, \alpha_1, \ldots, \alpha_\ell$. We found that in general the values of $\alpha_i$ decreased with increasing frequency. This is to be expected: although the asymptotic order of $E_{\alpha,1}$ is the same for all $\alpha < 1$, the associated constant is not. Larger values of $\alpha$ correspond to a larger constant and give higher fidelity with the heat equation, as Lemmas 10.1 and 10.2 show.
(c) Adding a fractional time derivative to the diffusion equation. Another option is to take a multiterm fractional derivative, replacing (10.12) by
$$u_t + \epsilon\,\partial_t^\alpha u + Au = 0. \tag{10.23}$$
We must show that the solution $u$ to (10.23), subject to the same initial condition, converges in $L^2(\Omega \times (0,T))$ to the solution of (10.10) as $\epsilon \to 0$. Equation (10.23) is a specific case of the more general multiterm fractional diffusion operator
$$\sum_{j=1}^M q_j\, \partial_t^{\alpha_j} u + Au = 0. \tag{10.24}$$
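In this case the Mittag-Leffler function must be replaced by the multiterm version [214, 229], with considerable additional complications, although the theory is now well understood. While one can use (10.24), the complexity here arises from the $2M$ coefficients $\{q_j, \alpha_j\}$ that would have to be determined as part of the regularisation process. Thus we restrict our attention to equation (10.23). We can calculate the fundamental solution to (10.23) as follows. First we consider the relaxation equation
$$w' + \epsilon\,\partial_t^\alpha w + \lambda w = 0, \qquad w(0) = 1. \tag{10.25}$$
Taking Laplace transforms $\{t \to s\}$, we obtain
$$\hat w(s) = \frac{1 + \epsilon s^{\alpha-1}}{s + \epsilon s^\alpha + \lambda}. \tag{10.26}$$
Now the imaginary part of $s + \epsilon s^\alpha + \lambda$ does not vanish if $s$ is not real and positive, so that the inversion of the Laplace transform can be accomplished by deforming the original vertical Bromwich path into a Hankel path $\mathrm{Ha}_\eta$ surrounding the branch cut on the negative real axis and a small circle of radius $\eta$ centred at the origin; see [113, Chapter 4]. This gives
$$w(t) = \frac{1}{2\pi i} \int_{\mathrm{Ha}_\eta} e^{st}\, \frac{1 + \epsilon s^{\alpha-1}}{s + \epsilon s^\alpha + \lambda}\, ds,$$
and as $\eta \to 0$ we obtain $H(r) = -\frac{1}{\pi} \mathrm{Im}\Bigl\{\frac{1 + \epsilon s^{\alpha-1}}{s + \epsilon s^\alpha + \lambda}\Bigr\}\Big|_{s = re^{i\pi}}$. Multiplying both numerator and denominator by the complex conjugate of $s + \epsilon s^\alpha + \lambda$ gives
$$
\begin{aligned}
\Bigl\{\frac{1 + \epsilon s^{\alpha-1}}{s + \epsilon s^\alpha + \lambda}\Bigr\}\Big|_{s = re^{i\pi}}
&= \frac{1 + \epsilon r^{\alpha-1} e^{(\alpha-1)\pi i}}{(\lambda - r) + \epsilon r^\alpha e^{\alpha\pi i}} \\
&= \frac{\bigl(1 + \epsilon r^{\alpha-1} e^{(\alpha-1)\pi i}\bigr)\bigl((\lambda - r) + \epsilon r^\alpha e^{-\alpha\pi i}\bigr)}{(\lambda - r)^2 + 2\epsilon(\lambda - r) r^\alpha \cos(\alpha\pi) + \epsilon^2 r^{2\alpha}} \\
&= \frac{(\lambda - r) - \epsilon^2 r^{2\alpha-1} + 2\epsilon r^\alpha \cos(\alpha\pi) - \epsilon\lambda r^{\alpha-1} e^{\alpha\pi i}}{(\lambda - r)^2 + 2\epsilon(\lambda - r) r^\alpha \cos(\alpha\pi) + \epsilon^2 r^{2\alpha}}.
\end{aligned}
$$
The first three terms in the numerator are real, so taking the imaginary part yields
$$\mathrm{Im}\Bigl\{\frac{1 + \epsilon s^{\alpha-1}}{s + \epsilon s^\alpha + \lambda}\Bigr\}\Big|_{s = re^{i\pi}} = -\frac{\epsilon\lambda r^{\alpha-1} \sin(\alpha\pi)}{(\lambda - r)^2 + 2\epsilon(\lambda - r) r^\alpha \cos(\alpha\pi) + \epsilon^2 r^{2\alpha}}.$$
Thus
$$w(t; \alpha, \epsilon, \lambda) = \int_0^\infty e^{-rt} H(r; \alpha, \epsilon, \lambda)\, dr, \tag{10.27}$$
where the spectral function $H$ satisfies
$$H(r; \alpha, \epsilon, \lambda) = \frac{\epsilon}{\pi}\, \frac{\lambda r^{\alpha-1} \sin(\alpha\pi)}{(\lambda - r)^2 + \epsilon^2 r^{2\alpha} + 2\epsilon(\lambda - r) r^\alpha \cos(\alpha\pi)}. \tag{10.28}$$
For $\epsilon > 0$ and $0 < \alpha < 1$, $H$ is strictly positive, showing that the fundamental solution of the initial value problem (10.25) is also a completely monotone function. We can use the above to obtain the solution representation for equation (10.23),
$$u(x,t) = \sum_{k=1}^\infty \langle u_0, \varphi_k\rangle\, w(t, \alpha, \epsilon, \lambda_k)\, \varphi_k(x), \tag{10.29}$$
which becomes a potential regulariser for the backwards heat problem. Tauberian results for the Laplace transform $\hat f(s)$ of a sufficiently smooth function $f(\tau)$ show that $\lim_{\tau \to \infty} f(\tau) = \lim_{s \to 0^+} s\hat f(s)$; cf. Section 3.2. If in (10.28) we make the change of variables $r \to \lambda\rho$, $H(r, \cdot) \to \tilde H(\rho, \cdot)$, and then take the limit $\lim_{\rho \to 0^+} \tilde H(\rho)$, we obtain $\tilde H(\rho) \sim \epsilon\, \frac{\sin(\alpha\pi)}{\pi}\, \lambda^{\alpha-1} \rho^{\alpha-1}$. Now hold $t = T$ fixed in (10.27) and we see that
$$w(T; \alpha, \epsilon, \lambda) \sim \epsilon\, \frac{\sin(\alpha\pi)}{\pi}\, \lambda^{\alpha-1}\, \frac{(\lambda T)^{-\alpha}}{\Gamma(1-\alpha)} \sim C(\alpha)\, \frac{\epsilon}{\lambda}\, T^{-\alpha} \quad \text{as } \lambda \to \infty. \tag{10.30}$$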
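The representation (10.27) with the spectral function (10.28) lends itself to direct numerical evaluation. The sketch below applies the trapezoidal rule after the substitution r = exp(y), which removes the integrable singularity of H at r = 0. The step size, the truncation limits, the parameter values, and the function names `spectral_H` and `relaxation_w` are assumptions made for illustration; no claim is made that this is how the text's computations were done.

```python
import math

def spectral_H(r, alpha, eps, lam):
    """Spectral function H(r; alpha, eps, lam) of (10.28)."""
    s = math.sin(alpha * math.pi)
    c = math.cos(alpha * math.pi)
    ra = r ** alpha
    denom = (lam - r) ** 2 + (eps * ra) ** 2 + 2.0 * eps * (lam - r) * ra * c
    return eps * lam * r ** (alpha - 1.0) * s / (math.pi * denom)

def relaxation_w(t, alpha, eps, lam, h=0.01, y_min=-80.0, y_max=40.0):
    """w(t) = int_0^inf exp(-r t) H(r) dr, via the trapezoidal rule in y = log r."""
    n = int(round((y_max - y_min) / h))
    total = 0.0
    for j in range(n + 1):
        r = math.exp(y_min + j * h)
        weight = 0.5 if j in (0, n) else 1.0
        # dr = r dy, hence the extra factor r in the integrand
        total += weight * math.exp(-r * t) * spectral_H(r, alpha, eps, lam) * r
    return h * total

alpha, eps, lam = 0.5, 0.5, 2.0
for t in (0.0, 0.1, 1.0):
    print(t, relaxation_w(t, alpha, eps, lam))
```

A useful consistency check is the value at t = 0, where the integral of H should reproduce the initial condition w(0) = 1 of (10.25); positivity of H then gives the monotone decay in t.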
This indicates that the combined asymptotic behaviour of the two fractional terms in (10.23) defers to that of the lower fractional index, here $\alpha$. This is in fact known even for the general multiterm case (10.24); see [214]. Thus, given that we have made the prior regularisation of the data by mapping it into $\dot H^2(\Omega)$, equation (10.23) will be a regularisation method for the diffusion equation for $\epsilon > 0$. The question then becomes how effectively it performs. The answer is: quite poorly. Equation (10.30) shows why. If $\epsilon$ is very small, then the asymptotic decay of the singular values of the map $F : u_0 \to g$ is again too great and the combination of the two derivatives is insufficient to control the high frequencies. This is particularly true the closer $\alpha$ is to unity. On the other hand, for lower frequency values of $\lambda$, the fractional derivative term plays a more dominant role, and this is greater with increasing $\epsilon$ and with decreasing $\alpha$. Thus one is forced to select regularising constants $\alpha$ and $\epsilon$ that will either decrease fidelity at the lower frequencies or fail to adequately control the high frequencies. There is a partial solution to the above situation by taking instead of (10.23) the balanced version
$$(1 - \epsilon) u_t + \epsilon\,\partial_t^\alpha u + Au = 0. \tag{10.31}$$
This ameliorates to some degree the concern at lower frequencies but has little effect at the higher frequencies. We will not dwell on this version or its above modification as there are superior alternatives, as we will see in the next subsection. However, the lessons learned in the previous two versions show the way to achieve both goals: low frequency fidelity and high frequency control.
(d) Using space fractional regularisation. Replacing the spatial differential operator $A = -L$ in the pseudo-parabolic regularisation (8.98) by its fractional power defined in the sense of (10.9), we obtain
$$(I + \epsilon A^\beta) u_t + Au = 0. \tag{10.32}$$
This β-pseudo-parabolic equation is no longer a regulariser for the backwards parabolic equation $u_t + Au = 0$ if $\beta < 1$, but we expect it to have partial regularising properties; this will be explored in the next sections. Alternatively, we can combine both space and time fractional derivatives to obtain
$$(I + \epsilon A^\beta)\,\partial_t^\alpha u + Au = 0. \tag{10.33}$$
10.1. Determining the initial condition
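Define $\mu_n$ by
\[
\mu_n = \frac{\lambda_n}{1+\varepsilon\lambda_n^{\beta}}.
\]
Then the solution to (10.32) has the representation
\[
u(x,t;\beta,\varepsilon) = \sum_{n=1}^{\infty} \langle u_0,\varphi_n\rangle\, e^{-\mu_n t}\,\varphi_n(x), \tag{10.34}
\]
while the solution to equation (10.33) has the representation
\[
u(x,t;\alpha,\beta,\varepsilon) = \sum_{n=1}^{\infty} \langle u_0,\varphi_n\rangle\, E_{\alpha,1}(-\mu_n t^{\alpha})\,\varphi_n(x). \tag{10.35}
\]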
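The effect of the exponent β on the damping in (10.34) can be checked numerically. The following is a minimal sketch, assuming the one-dimensional model A = −d²/dx² on (0, 1) with Dirichlet conditions, so that $\lambda_n = (n\pi)^2$; the function names and the value of ε are illustrative choices, not taken from the text. For β = 1 the exponent $\mu_n$ stays below 1/ε, whereas for β < 1 it still grows with n, only more slowly than $\lambda_n$.

```python
import math

def dirichlet_eigenvalues(n_modes):
    """Eigenvalues (n*pi)^2 of -d^2/dx^2 on (0, 1), Dirichlet conditions (assumed model)."""
    return [(n * math.pi) ** 2 for n in range(1, n_modes + 1)]

def mu(lam, beta, eps):
    """Damped exponent mu_n = lam_n / (1 + eps * lam_n**beta)."""
    return lam / (1.0 + eps * lam ** beta)

def amplification(lam, beta, eps, T):
    """Factor exp(mu_n * T) applied to mode n when (10.34) is inverted at time T."""
    return math.exp(mu(lam, beta, eps) * T)

if __name__ == "__main__":
    T, eps = 0.02, 1e-3  # eps is an illustrative value
    print("   lambda   parabolic    beta=0.5    beta=1.0")
    for lam in dirichlet_eigenvalues(30)[4::5]:
        print(f"{lam:9.1f}  {math.exp(lam * T):10.3e}  "
              f"{amplification(lam, 0.5, eps, T):10.3e}  "
              f"{amplification(lam, 1.0, eps, T):10.3e}")
```

Running the script prints the amplification factors of the unregularised parabolic inversion next to those of the two regularised versions for every fifth mode.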
The split frequency idea of the previous paragraph can be carried over to fractional operators in space. Once again we look for frequency cut-off values and we illustrate with three levels, so we have K1 and K2 as above. The regularising equation will be the β-pseudo-parabolic equation as in (10.32). For the lowest frequency interval we choose ε = 0 so that we are simply again inverting the parabolic equation; for the midrange k ∈ (K1, K2], we take β = 0.5, and for the high frequencies k ∈ (K2, K], we use β = 1 so that we have the usual pseudo-parabolic equation,
\[
\langle u_0^{\delta}(\cdot;\beta_1,\beta_2,\varepsilon_1,\varepsilon_2,K_1,K_2),\varphi_k\rangle =
\begin{cases}
A_{\mathrm{par}}(k)\,\langle g^{\delta},\varphi_k\rangle & \text{for } k \le K_1,\\
A_{\beta\mathrm{ps}}(k,\beta_1,\varepsilon_1)\,\langle g^{\delta},\varphi_k\rangle & \text{for } K_1+1 \le k \le K_2,\\
A_{\beta\mathrm{ps}}(k,\beta_2,\varepsilon_2)\,\langle g^{\delta},\varphi_k\rangle & \text{for } K_2+1 \le k \le K,\\
0 & \text{for } k \ge K+1,
\end{cases}
\]
with
\[
A_{\beta\mathrm{ps}}(k,\beta,\varepsilon) = \exp\Big(\frac{\lambda_k}{1+\varepsilon\lambda_k^{\beta}}\,T\Big).
\]
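The piecewise definition above acts on each Fourier coefficient separately, so it can be sketched as a simple filter. The code below is an illustration only: it assumes that $A_{\mathrm{par}}(k) = \exp(\lambda_k T)$ (plain inversion of the parabolic equation), and the function and argument names are hypothetical.

```python
import math

def beta_ps_factor(lam, beta, eps, T):
    """A_bps(k, beta, eps) = exp(lam_k * T / (1 + eps * lam_k**beta))."""
    return math.exp(lam / (1.0 + eps * lam ** beta) * T)

def split_frequency_coeffs(g_coeffs, lams, T, K1, K2, K,
                           beta1, eps1, beta2, eps2):
    """Map data coefficients <g^delta, phi_k> to coefficients of u_0^delta.

    g_coeffs[k-1] and lams[k-1] belong to mode k = 1, 2, ...
    """
    u0 = []
    for k, (g, lam) in enumerate(zip(g_coeffs, lams), start=1):
        if k <= K1:
            u0.append(math.exp(lam * T) * g)  # assumed A_par(k) = exp(lam_k T)
        elif k <= K2:
            u0.append(beta_ps_factor(lam, beta1, eps1, T) * g)
        elif k <= K:
            u0.append(beta_ps_factor(lam, beta2, eps2, T) * g)
        else:
            u0.append(0.0)  # modes above K are discarded
    return u0
```

With noise-free data the lowest band is recovered exactly, the two regularised bands are recovered with a damping factor below one, and everything above K is set to zero.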
This is a regulariser in $L^2(\Omega)$, and so there is no need for the preliminary mapping of the data into $\dot H^2(\Omega)$. In each interval we compute the value of ε from the discrepancy principle and invert the corresponding amplification factors to recover u0 from (10.34). Variations are possible, in particular reserving the β-pseudo-parabolic equation for midrange frequencies and using a subdiffusion equation for the regularisation of the high frequencies, as in the split frequency approach described above.

10.1.3.2. Reconstructions. In this section we will show a few illustrative examples for L = Δ in one space dimension based on inversion using the split-frequency model incorporating fractional diffusion operators, since overall these gave the best reconstructions of the initial data. Comparison between different methods is always subject to the possibility that, given almost any inversion method, one can construct an initial function u0 that
10. Inverse Problems for Fractional Diffusion
will reconstruct well for that method. As noted in the previous section, we did not find (10.23) or its modification (10.31) to be competitive. The β-pseudo-parabolic equation (10.32), when used only for midrange frequencies and with a subdiffusion operator for the high frequencies, can give comparable results to the double-split fractional. However, the difficulty lies in determining the pair of constants β and ε. It turns out that the optimal reconstruction using the discrepancy principle is not sensitive to β in the range 1/2 ≤ β ≤ 3/4 or even beyond, but it is sensitive to the choice of ε. We have taken two noise levels δ on the data g(x) = u(x, T) at which to show recovery of u0: δ = 1% and δ = 0.1%. These may seem low noise levels, but one must understand the high degree of ill-posedness of the problem and the fact that high Fourier modes very quickly become damped beyond any reasonable measurement level. One is reminded here of the quote by Lanczos, “lack of information cannot be remedied by any mathematical trickery”. We have also taken the final time T to be T = 0.02. As noted in the introduction concerning equation scaling, we in reality have the combination σT for the parabolic equation in (10.10), where the diffusion coefficient σ is typically quite small. Since we have set σ = 1 here, our choice of final time is actually rather long. A decrease in our T by a factor of ten would result in much superior reconstructions for the same level of data noise.

Our first example is of a smooth function except for a discontinuity in its derivative near the rightmost endpoint, so that recovery of relatively high frequency information is required in order to resolve this feature. Figure 10.4 shows the actual function u0 together with reconstructions from both the single split-frequency method and with a double splitting. One sees the slight but significant resolution increase for the latter method.
In the single split method the discrepancy principle chose K1 = 4 and α = 0.92; for the double split K1 = 4, K2 = 10 and α1 = 0.999, α2 = 0.92.

[Plot of u0(x) against x ∈ [0, 1] for T = 0.02, δ = 0.001; curves: actual u0, single split-frequency, double split-frequency.]
Figure 10.4. Reconstructions from single and double split frequency method
[Plot of u0(x) against x ∈ [0, 1] for T = 0.02, δ = 0.01; curves: actual u0, SVD, double split-frequency.]
Figure 10.5. Reconstructions from svd and double split frequency method
It is worth noting that if we were to increase the noise to δ = 0.01, then not only would the reconstruction degrade, but it would do so more clearly near the singularity in the derivative. However, of more interest is the fact that the reconstructions from both methods would be identical. The discrepancy principle detects that there is insufficient information for a second splitting, so that K2 is taken to be equal to K1.

As a second example we chose a function made up by setting its Fourier coefficients and choosing these so that the first seven are all around unity, as are those in the range from 10 to 15. The reconstructions shown in Figure 10.5 use the triple interval split frequency and, as a comparison, a truncated singular value decomposition from the parabolic equation with the parameters chosen again by the discrepancy principle. As the figure shows, the svd reconstruction can only approximate the low frequency information in the initial state whereas the split frequency model manages to capture significantly more. Note that the reconstruction here is better at those places where u0 has larger magnitude. If we were to reduce the noise level to δ = 0.001 or reduce the value of T, this difference would be even more apparent. Including the single split frequency reconstruction would show a significant improvement over the svd, but a clearly poorer result than the split into three bands. Indeed, a similar instance of u0 benefits from a further splitting of frequency bands beyond the three level; see Figure 10.6.

[Plot of u0(x) against x ∈ [0, 1] for T = 0.02, δ = 0.001; curves: actual u0, SVD, double-split freq, triple-split freq.]
Figure 10.6. Reconstructions from svd as well as double and triple split frequency method

Remark 10.1. In higher space dimensions, for special geometries, the eigenfunctions and eigenvalues of −Δ can be computed analytically and everything would proceed as described. For more complex geometries one can rely on a numerical solver, and there are many possibilities here, which would even include the multiterm fractional order derivative and more general elliptic operators L (see [163] and references within), or the combined subdiffusion and fractional operator in space (10.33), [34]. The linearity of the problem with respect to u0 would still allow a decomposition into frequency bands as described for the case in this section. Note that only a very limited number of eigenfunctions and eigenvalues is required for the reconstruction, since the high frequency part can be tackled directly via the fractional pde of temporal order α.

10.1.3.3. Convergence analysis. The ultimate goal of this section is to provide a convergence analysis in the sense of regularisation (i.e., as the noise level δ tends to zero), for the split frequency approach described above. Here the order of time differentiation α acts as a regularisation parameter. As a first result we will show convergence of the subdiffusion regularisation (10.12), both with an a priori choice of α and with the discrepancy principle for choosing α. For this method we will also establish convergence rates under additional smoothness assumptions on u0.

(a) Simple subdiffusion regularisation. We first of all consider approximate reconstruction of u0 as
\[
u_0^{\delta}(\cdot;\alpha) = \sum_{k=1}^{\infty} \frac{1}{E_{\alpha,1}(-\lambda_k T^{\alpha})}\,\tilde c_k^{\delta}\,\varphi_k, \tag{10.36}
\]
that is, the initial data of the solution u to the subdiffusion equation of order α in equation (10.12) with final data $\tilde g^{\delta}$, an $\dot H^2(\Omega)$ smoothed version of the given final data, and where $\tilde c_k^{\delta}$ are its Fourier coefficients. We will show that $u_0^{\delta}(\cdot;\alpha) \to u_0$ in $L^2(\Omega)$ as δ → 0 provided α = α(δ) is appropriately chosen. Note that the relation $c_k = e^{-\lambda_k T}a_k$ holds, which corresponds to the identity g = exp(−AT)u0.
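Since most of the proofs here will be set in Fourier space, we recall the notation
\[
u_0 = \sum_{n=1}^{\infty} a_n\varphi_n, \qquad
g = \sum_{n=1}^{\infty} c_n\varphi_n, \qquad
g^{\delta} = \sum_{n=1}^{\infty} c_n^{\delta}\varphi_n, \qquad
\tilde g^{\delta} = \sum_{n=1}^{\infty} \tilde c_n^{\delta}\varphi_n,
\]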
where $\{\varphi_n\}$ are eigenfunctions of A = −L on Ω with homogeneous Dirichlet boundary conditions and $\{\lambda_n\}$ are the corresponding eigenvalues enumerated according to increasing value.

As a preliminary step, we provide a result on $\dot H^2(\Omega)$ (more generally, $\dot H^{2s}(\Omega)$) smoothing of $L^2$ data, which, in view of $(H^2 - L^2)$ well-posedness of time fractional backwards diffusion, is obviously a crucial ingredient of regularisation by backwards subdiffusion. Recall the above mentioned Landweber iteration for defining $\tilde g^{\delta} = w^{(i_*)}$,
\[
w^{(i+1)} = w^{(i)} - \mathcal{A}\,(w^{(i)} - g^{\delta}), \qquad w^{(0)} = 0, \tag{10.37}
\]
where
\[
\mathcal{A} = \mu A^{-2s} \tag{10.38}
\]
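with s ≥ 1 and μ > 0 chosen so that $\|\mathcal{A}\|_{L^2\to L^2} \le 1$.

Lemma 10.4. A choice of
\[
i_* \sim T^{-2}\log\Big(\frac{\|u_0\|_{L^2(\Omega)}}{\delta}\Big) \tag{10.39}
\]
yields
\[
\|g - \tilde g^{\delta}\|_{L^2(\Omega)} \le C_1\delta, \qquad
\|A^{s}(g - \tilde g^{\delta})\|_{L^2(\Omega)} \le C_2\,T^{-1}\,\delta\,\sqrt{\log\Big(\frac{\|u_0\|_{L^2(\Omega)}}{\delta}\Big)} =: \tilde\delta \tag{10.40}
\]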
for some $C_1, C_2 > 0$ independent of T and δ.

The proof can be found in the appendix of [185]. Standard results on convergence of Landweber iteration do not apply here due to the infinite order smoothness of the function we are smoothing; more precisely, it satisfies a source condition with an exponentially decaying index function.

We are now in a position to prove convergence of $u_0^{\delta}(\cdot;\alpha)$ in the sense of a regularisation method, first of all with an a priori choice of α.

Theorem 10.2. Let $u_0 \in \dot H^{2(1+1/p)}(\Omega)$ for some p ∈ (1, ∞), and let $u_0^{\delta}(\cdot;\alpha)$ be defined by (10.36) with $\tilde g^{\delta} = w^{(i_*)}$ according to (10.37), (10.38), (10.39), with $s \ge 1 + \frac{1}{p}$, and assume that $\alpha = \alpha(\tilde\delta)$ is chosen such that
\[
\alpha(\tilde\delta) \nearrow 1 \quad\text{and}\quad \frac{\tilde\delta}{1-\alpha(\tilde\delta)} \to 0, \qquad \text{as } \tilde\delta \to 0. \tag{10.41}
\]
Then
\[
\|u_0^{\delta}(\cdot;\alpha(\tilde\delta)) - u_0\|_{L^2(\Omega)} \to 0, \qquad \text{as } \tilde\delta \to 0.
\]
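Since A is diagonal in the eigenbasis, the smoothing step (10.37), (10.38) that produces $\tilde g^{\delta}$ decouples into scalar recursions, one per Fourier mode. The following minimal sketch illustrates this; it assumes that the eigenvalues are available as a list, takes μ as the largest value allowed by the norm condition, and uses proportionality constant 1 in (10.39). All function names are illustrative.

```python
import math

def landweber_smooth(c_delta, lams, s, n_iter, mu=None):
    """Run (10.37) mode by mode; c_delta[k] are the Fourier coefficients of g^delta."""
    if mu is None:
        mu = min(lams) ** (2 * s)  # largest mu with ||mu * A^(-2s)|| <= 1
    w = [0.0] * len(c_delta)       # w^(0) = 0
    for _ in range(n_iter):
        w = [wk - mu * lam ** (-2 * s) * (wk - ck)
             for wk, ck, lam in zip(w, c_delta, lams)]
    return w

def landweber_filter(lam, s, n_iter, mu):
    """Closed form of the recursion: w_k = (1 - (1 - mu*lam^(-2s))^n_iter) * c_k."""
    return 1.0 - (1.0 - mu * lam ** (-2 * s)) ** n_iter

def stopping_index(T, norm_u0, delta):
    """i_* ~ T^(-2) log(||u0|| / delta) as in (10.39); the constant 1 is an assumption."""
    return max(1, math.ceil(T ** (-2) * math.log(norm_u0 / delta)))
```

The closed-form filter factor makes the smoothing visible: it is close to one for low modes and decays like $\lambda_k^{-2s}$ for high modes, the more slowly the larger the iteration count.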
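Proof. In terms of Fourier coefficients, the $L^2$ error can be written as
\[
\begin{aligned}
\|u_0^{\delta}(\cdot;\alpha(\tilde\delta)) - u_0\|_{L^2(\Omega)}
&= \Big(\sum_{k=1}^{\infty}\Big(\frac{1}{E_{\alpha,1}(-\lambda_k T^{\alpha})}\,\tilde c_k^{\delta} - a_k\Big)^2\Big)^{1/2}\\
&= \Big(\sum_{k=1}^{\infty}\Big(\frac{1}{E_{\alpha,1}(-\lambda_k T^{\alpha})}\,(\tilde c_k^{\delta} - c_k) + \Big(\frac{e^{-\lambda_k T}}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)a_k\Big)^2\Big)^{1/2}\\
&\le \bar C\,\frac{1}{1-\alpha}\Big(\sum_{k=1}^{\infty}\lambda_k^2\,(\tilde c_k^{\delta} - c_k)^2\Big)^{1/2} + \Big(\sum_{k=1}^{\infty} w_k(\alpha)^2 a_k^2\Big)^{1/2},
\end{aligned}
\tag{10.42}
\]
where
\[
w_k(\alpha) = \frac{e^{-\lambda_k T}}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1, \tag{10.43}
\]
and we have used the triangle inequality as well as (10.13). The first term on the right-hand side is bounded by $\bar C\,\frac{1}{1-\alpha}\,\tilde\delta$, which tends to zero as $\tilde\delta \to 0$ under condition (10.41). The second term on the right-hand side tends to zero as α ↗ 1, since we have, due to (10.18),
\[
w_k(\alpha) \to 0 \ \text{ as } \alpha \nearrow 1 \quad\text{and}\quad |w_k(\alpha)| \le \tilde C\lambda_k^{1+1/p} \quad \text{for all } k \in \mathbb{N}.
\]
From the fact that $\big(\sum_{k=1}^{\infty}\lambda_k^{2(1+1/p)}a_k^2\big)^{1/2} = \|u_0\|_{\dot H^{2(1+1/p)}(\Omega)} < \infty$ and from Lebesgue’s dominated convergence theorem, we have convergence of the infinite series $\sum_{k=1}^{\infty} w_k(\alpha)^2 a_k^2$ to zero as α ↗ 1.

We now consider an a posteriori choice of α according to the discrepancy principle, applied to the smoothed data
\[
\underline{\tau}\,\tilde\delta \le \|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\| \le \overline{\tau}\,\tilde\delta \tag{10.44}
\]
for some fixed constants $1 < \underline{\tau} < \overline{\tau}$ independent of $\tilde\delta$. The fact that this regularisation parameter choice is well-defined, that is, existence of an α such that (10.44) holds, can be proven under the assumption
\[
\|\tilde g^{\delta}\|_{L^2} > \hat\tau\,\tilde\delta \tag{10.45}
\]
with $\hat\tau = \underline{\tau}/(1 - (1+\lambda_1 T)\exp(-\lambda_1 T))$ (please note that the factor $1 - \exp(-\lambda_1 T)(1+\lambda_1 T)$ is always positive). Namely, from Theorem 3.25 with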
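$\lim_{x\to 1}\Gamma(x) = 1$, we conclude that $w_k(\alpha)$ as defined in (10.43) satisfies
\[
\lim_{\alpha\to 0} w_k(\alpha) = e^{-\lambda_k T}(1+\lambda_k T) - 1 \quad\text{and}\quad |w_k(\alpha)| \le \tilde C\lambda_k^{1+1/p}
\]
with
\[
\Big(\sum_{k=1}^{\infty}\big(\lambda_k^{1+1/p}\,\tilde c_k^{\delta}\big)^2\Big)^{1/2} = \|\tilde g^{\delta}\|_{\dot H^{2(1+1/p)}(\Omega)} < \infty,
\]
thus by Lebesgue’s dominated convergence theorem,
\[
\begin{aligned}
\lim_{\alpha\to 0}\|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\|_{L^2}
&= \lim_{\alpha\to 0}\Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)\,\tilde c_k^{\delta}\big)^2\Big)^{1/2}\\
&= \Big(\sum_{k=1}^{\infty}\big(e^{-\lambda_k T}(1+\lambda_k T) - 1\big)^2(\tilde c_k^{\delta})^2\Big)^{1/2}\\
&\ge \big(1 - e^{-\lambda_1 T}(1+\lambda_1 T)\big)\Big(\sum_{k=1}^{\infty}(\tilde c_k^{\delta})^2\Big)^{1/2}\\
&= \big(1 - e^{-\lambda_1 T}(1+\lambda_1 T)\big)\,\|\tilde g^{\delta}\|_{L^2} > \underline{\tau}\,\tilde\delta.
\end{aligned}
\]
On the other hand,
\[
\lim_{\alpha\to 1}\|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\|_{L^2} = 0 \le \underline{\tau}\,\tilde\delta.
\]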
Hence, from continuity of the mapping $\alpha \mapsto \|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\|_{L^2}$ on the interval (0, 1) (which would actually not hold on [0, 1)!) and the intermediate value theorem, we conclude existence of α ∈ (0, 1) such that (10.44) holds. Note that the case of condition (10.45) being violated for all δ > 0 sufficiently small is trivial in the sense that then obviously u0 = 0 holds.

Theorem 10.3. Let $u_0 \in \dot H^{2(1+2/p)}(\Omega)$ for some p ∈ (1, ∞), and let $u_0^{\delta}(\cdot;\alpha)$ be defined by (10.36) with $\tilde g^{\delta} = w^{(i_*)}$ according to (10.37), (10.38), (10.39), with $s \ge 1 + \frac{1}{p}$, and assume that $\alpha = \alpha(\tilde g^{\delta},\tilde\delta)$ is chosen according to (10.44). Then
\[
u_0^{\delta}(\cdot;\alpha(\tilde g^{\delta},\tilde\delta)) \rightharpoonup u_0 \ \text{ in } L^2(\Omega), \qquad \text{as } \delta \to 0.
\]
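The intermediate value argument behind the well-definedness of (10.44) suggests a simple numerical procedure: evaluate the discrepancy $\big(\sum_k (w_k(\alpha)\tilde c_k^{\delta})^2\big)^{1/2}$ and bisect in α. The sketch below is an illustration under strong assumptions and not the authors' implementation: the Mittag-Leffler function is evaluated by its truncated power series, which is only reliable for moderate arguments (few modes, α not far from 1), and the bracket [0.8, 0.999] as well as all names are hypothetical choices.

```python
import math

def mittag_leffler(alpha, beta, z, n_terms=300):
    """Truncated series sum_k z^k / Gamma(alpha*k + beta); moderate |z|, beta > 0 only."""
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    log_abs_z = math.log(abs(z))
    total = 0.0
    for k in range(n_terms):
        sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
        total += sign * math.exp(k * log_abs_z - math.lgamma(alpha * k + beta))
    return total

def w(alpha, lam, T):
    """w_k(alpha) from (10.43)."""
    return math.exp(-lam * T) / mittag_leffler(alpha, 1.0, -lam * T ** alpha) - 1.0

def discrepancy(alpha, c_tilde, lams, T):
    """(sum_k (w_k(alpha) * c~_k)^2)^(1/2), the residual norm in (10.44)."""
    return math.sqrt(sum((w(alpha, lam, T) * c) ** 2
                         for c, lam in zip(c_tilde, lams)))

def choose_alpha(c_tilde, lams, T, target, lo=0.8, hi=0.999, n_bisect=60):
    """Bisect for discrepancy(alpha) = target, assuming a sign change on [lo, hi]."""
    f_lo = discrepancy(lo, c_tilde, lams, T) - target
    f_hi = discrepancy(hi, c_tilde, lams, T) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed on [lo, hi]")
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        f_mid = discrepancy(mid, c_tilde, lams, T) - target
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

In practice `target` would be chosen between the lower and upper bounds in (10.44). For high modes a more robust Mittag-Leffler evaluation than the plain series would be required.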
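Proof. In view of the representation
\[
\|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\|
= \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)\,\tilde c_k^{\delta}\big)^2\Big)^{1/2}
\le \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)\,e^{-\lambda_k T}a_k\big)^2\Big)^{1/2}
+ \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)(\tilde c_k^{\delta} - c_k)\big)^2\Big)^{1/2}
\]
with $w_k(\alpha)$ as defined in (10.43) and likewise
\[
\|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha) - \tilde g^{\delta}\|
\ge \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)\,e^{-\lambda_k T}a_k\big)^2\Big)^{1/2}
- \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)(\tilde c_k^{\delta} - c_k)\big)^2\Big)^{1/2}
\]
as well as (10.18), which yields
\[
\Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)(\tilde c_k^{\delta} - c_k)\big)^2\Big)^{1/2} \le \tilde C\,\|\tilde g^{\delta} - g\|_{\dot H^{2(1+1/p)}(\Omega)}, \tag{10.46}
\]
the discrepancy principle (10.44) yields
\[
(\underline{\tau} - \tilde C)\,\tilde\delta \le \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)\,e^{-\lambda_k T}a_k\big)^2\Big)^{1/2} \le (\overline{\tau} + \tilde C)\,\tilde\delta. \tag{10.47}
\]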
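From the error decomposition (10.42) we therefore conclude
\[
\begin{aligned}
\|u_0^{\delta}(\cdot;\alpha) - u_0\|_{L^2(\Omega)}
&\le \bar C\,\frac{\tilde\delta}{1-\alpha} + \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)a_k\big)^2\Big)^{1/2}\\
&\le \frac{\bar C}{\underline{\tau} - \tilde C}\Big(\sum_{k=1}^{\infty}\Big(\frac{w_k(\alpha)\,e^{-\lambda_k T}}{1-\alpha}\,a_k\Big)^2\Big)^{1/2} + \Big(\sum_{k=1}^{\infty}\big(w_k(\alpha)a_k\big)^2\Big)^{1/2},
\end{aligned}
\]
where due to (10.14) and (10.18) we have
\[
\begin{aligned}
\Big|\frac{w_k(\alpha)\,e^{-\lambda_k T}}{1-\alpha}\Big|
&= \Big|\frac{e^{-\lambda_k T}}{1-\alpha}\Big(\frac{e^{-\lambda_k T}}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)\Big|\\
&= \frac{e^{-\lambda_k T}}{E_{\alpha,1}(-\lambda_k T^{\alpha})}\,\frac{1}{1-\alpha}\,\big|e^{-\lambda_k T} - E_{\alpha,1}(-\lambda_k T^{\alpha})\big|\\
&\le C\lambda_k^{1/p}\big(1 + \tilde C\lambda_k^{1+1/p}\big).
\end{aligned}
\]
Taking into account the assumption $u_0 \in \dot H^{2(1+2/p)}(\Omega)$, we get that $u_0^{\delta}(\cdot;\alpha)$ is uniformly bounded in $L^2(\Omega)$ and thus has a weakly convergent subsequence whose limit, due to the upper estimate in (10.47), has to coincide with u0. A subsequence-subsequence argument therefore yields weak $L^2$ convergence of $u_0^{\delta}(\cdot;\alpha)$ to u0.

Concerning convergence rates, first of all observe that the rate and stability estimates (10.14), (10.18) yield the following convergence rate for the time fractional reconstruction in case of very smooth data u0 and noise free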
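data:
\[
\|u_0^{0}(\cdot;\alpha) - u_0\|_{L^2(\Omega)}
\le \sqrt{\sum_{k=1}^{\infty}\Big(\Big(\frac{\exp(-\lambda_k T)}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)a_k\Big)^2}
\le C(1-\alpha)\sqrt{\sum_{k=1}^{\infty}\Big(\exp(\lambda_k T)\big(1 + \tilde C\lambda_k^{1+1/p}\big)\lambda_k^{1/p}\,a_k\Big)^2},
\]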
where $a_k$ are the Fourier coefficients of the initial data and hence the right-hand side is a very strong norm of u0. Convergence rates under weaker norm bounds on u0 and with noisy data can be obtained similarly to [150] by means of Jensen’s inequality and an appropriate choice of α.

Theorem 10.4. Let $u_0 \in \dot H^{2(1+1/p+\max\{1/p,q\})}(\Omega)$ for some p ∈ (1, ∞), q > 0, and let $u_0^{\delta}(\cdot;\alpha)$ be defined by (10.36) with $\tilde g^{\delta} = w^{(i_*)}$ according to (10.37), (10.38), (10.39), with s ≥ 1, and assume that $\alpha = \alpha(\tilde\delta)$ is chosen such that
\[
1 - \alpha(\tilde\delta) \sim \sqrt{\tilde\delta}, \qquad \text{as } \tilde\delta \to 0. \tag{10.48}
\]
Then
\[
\|u_0^{\delta}(\cdot;\alpha(\tilde\delta)) - u_0\|_{L^2(\Omega)} = O\Big(\log\big(\tfrac{1}{\delta}\big)^{-2q}\Big), \qquad \text{as } \delta \to 0. \tag{10.49}
\]
In the noise free case we have
\[
\|u_0^{0}(\cdot;\alpha) - u_0\|_{L^2(\Omega)} = O\Big(\log\big(\tfrac{1}{1-\alpha}\big)^{-2q}\Big), \qquad \text{as } \alpha \nearrow 1. \tag{10.50}
\]
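Proof. For $c = 1 - \lambda_1 T$ and some q > 0 set
\[
f(x) := \big(c - \log(x)\big)^{-q} \quad \text{for } x \in (0,\exp(-\lambda_1 T)], \qquad
\theta(\xi) := \xi\exp\Big(2\big(c - \xi^{-\frac{1}{2q}}\big)\Big) \quad \text{for } \xi \in (0,1],
\]
so that
\[
f(x) \in (0,1] \quad\text{and}\quad \theta(f^2(x)) = x^2 f^2(x) \quad \text{for } x \in (0,\exp(-\lambda_1 T)].
\]
It is readily checked that θ is convex and strictly monotonically increasing, and that the values of its inverse can be estimated as
\[
b\,\theta^{-1}\big(\tfrac{a}{b}\big) \le \Big(\frac{2c + \log(\tfrac{1}{a})}{2B^{\frac{1}{2q}} + C_q}\Big)^{-2q} \quad \text{for } \tfrac{a}{b} \in (0, e^{2\lambda_1 T}],\ b \in (0,B], \tag{10.51}
\]
where $C_q > 0$ is chosen such that
\[
\log(z) \le C_q z^{\frac{1}{2q}} \quad \text{for } z \ge \tfrac{1}{B}.
\]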
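The defining identity $\theta(f^2(x)) = x^2 f^2(x)$ and the convexity of θ are easy to check numerically. The sketch below does so on a small grid; the value $\lambda_1 T = 0.2$ and the exponents q are illustrative assumptions, and the midpoint test is a sanity check rather than a proof of convexity.

```python
import math

def f(x, c, q):
    """Index function f(x) = (c - log x)^(-q) for x in (0, exp(-lam_1 T)]."""
    return (c - math.log(x)) ** (-q)

def theta(xi, c, q):
    """theta(xi) = xi * exp(2 * (c - xi^(-1/(2q)))) for xi in (0, 1]."""
    return xi * math.exp(2.0 * (c - xi ** (-1.0 / (2.0 * q))))

if __name__ == "__main__":
    lam1_T = 0.2          # illustrative value of lambda_1 * T
    c = 1.0 - lam1_T
    for q in (0.5, 1.0):
        for x in (1e-8, 1e-3, 0.1, math.exp(-lam1_T)):
            lhs = theta(f(x, c, q) ** 2, c, q)
            rhs = x ** 2 * f(x, c, q) ** 2
            print(f"q={q}  x={x:.3e}  theta(f^2)={lhs:.6e}  x^2 f^2={rhs:.6e}")
```

Both printed columns should agree up to rounding for every x in the admissible interval.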
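Indeed, estimate (10.51) can be verified by the following chain of implications and estimates
\[
\begin{aligned}
\xi = \theta^{-1}\big(\tfrac{a}{b}\big)
&\;\Leftrightarrow\; \tfrac{a}{b} = \theta(\xi) = \xi\exp\Big(2\big(c - \xi^{-\frac{1}{2q}}\big)\Big)\\
&\;\Leftrightarrow\; \log(a) = \log(b\xi) + 2c - 2b^{\frac{1}{2q}}(b\xi)^{-\frac{1}{2q}}\\
&\;\Leftrightarrow\; 2c + \log\big(\tfrac{1}{a}\big) = \log\big(\tfrac{1}{b\xi}\big) + 2b^{\frac{1}{2q}}(b\xi)^{-\frac{1}{2q}} \le \big(C_q + 2B^{\frac{1}{2q}}\big)(b\xi)^{-\frac{1}{2q}}\\
&\;\Rightarrow\; b\xi \le \Big(\frac{2c + \log(\tfrac{1}{a})}{2B^{\frac{1}{2q}} + C_q}\Big)^{-2q}.
\end{aligned}
\]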
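Therefore, Jensen’s inequality yields, for any two sequences $(\sigma_k)_{k\in\mathbb{N}} \subseteq (0,\exp(-\lambda_1 T)]$ and $(\omega_k)_{k\in\mathbb{N}} \in \ell^2$,
\[
\theta\Big(\frac{\sum_{k=1}^{\infty} f(\sigma_k)^2\omega_k^2}{\sum_{k=1}^{\infty}\omega_k^2}\Big)
\le \frac{\sum_{k=1}^{\infty}\theta\big(f(\sigma_k)^2\big)\omega_k^2}{\sum_{k=1}^{\infty}\omega_k^2}
= \frac{\sum_{k=1}^{\infty} f(\sigma_k)^2\sigma_k^2\omega_k^2}{\sum_{k=1}^{\infty}\omega_k^2}.
\]
Hence, applying $\theta^{-1}$ to both sides and using (10.51), we obtain
\[
\sum_{k=1}^{\infty}\big(f(\sigma_k)\omega_k\big)^2
\le \Big(\sum_{k=1}^{\infty}\omega_k^2\Big)\,\theta^{-1}\Big(\frac{\sum_{k=1}^{\infty}\big(f(\sigma_k)\sigma_k\omega_k\big)^2}{\sum_{k=1}^{\infty}\omega_k^2}\Big)
\le \Big(\frac{2c + \log(\tfrac{1}{a})}{2B^{\frac{1}{2q}} + C_q}\Big)^{-2q} \tag{10.52}
\]
for
\[
a = \sum_{k=1}^{\infty}\big(f(\sigma_k)\sigma_k\omega_k\big)^2, \qquad b = \sum_{k=1}^{\infty}\omega_k^2.
\]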
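Now we choose the pair of sequences $(\sigma_k)_{k\in\mathbb{N}}$ and $(\omega_k)_{k\in\mathbb{N}}$ in such a manner that $\sum_{k=1}^{\infty}\big(f(\sigma_k)\omega_k\big)^2 = \|u_0^{0}(\cdot;\alpha) - u_0\|_{L^2(\Omega)}^2$:
\[
\sigma_k := \exp(-\lambda_k T), \qquad
\omega_k := \frac{1}{f(\sigma_k)}\Big(\frac{\exp(-\lambda_k T)}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)a_k
= \Big(\frac{\exp(-\lambda_k T)}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)(c + \lambda_k T)^{q}\,a_k.
\]
By (10.14), (10.18) we obtain
\[
\begin{aligned}
a &= \sum_{k=1}^{\infty}\big(f(\sigma_k)\sigma_k\omega_k\big)^2
= \sum_{k=1}^{\infty}\Big(\Big(\frac{\exp(-\lambda_k T)}{E_{\alpha,1}(-\lambda_k T^{\alpha})} - 1\Big)\exp(-\lambda_k T)\,a_k\Big)^2
\le C^2(1-\alpha)^2\sum_{k=1}^{\infty}\Big(\big(1 + \tilde C\lambda_k^{1+1/p}\big)\lambda_k^{1/p}\,a_k\Big)^2,\\
b &= \sum_{k=1}^{\infty}\omega_k^2 \le \tilde C^2\sum_{k=1}^{\infty}\Big(\lambda_k^{1+1/p}(c + \lambda_k T)^{q}\,a_k\Big)^2 =: B.
\end{aligned}
\]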
Now we can deduce the rate (10.50) directly from (10.52). The rate (10.49) with noisy data follows from the error decomposition (10.42) using the fact that the second term in (10.42) just coincides with $\|u_0^{0}(\cdot;\alpha) - u_0\|_{L^2(\Omega)}$, for which we can make use of (10.50),
\[
\|u_0^{\delta}(\cdot;\alpha) - u_0\|_{L^2(\Omega)} \le \bar C\,\frac{\tilde\delta}{1-\alpha} + O\Big(\log\big(\tfrac{1}{1-\alpha}\big)^{-2q}\Big),
\]
together with the parameter choice (10.48).

(b) Split frequency subdiffusion regularisation.
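Our goal is to establish convergence in the sense of a regularisation method of the split frequency subdiffusion reconstruction
\[
u_0^{\delta}(\cdot;\alpha,K) = u_{0,\mathrm{lf}}^{\delta}(\cdot;K) + u_{0,\mathrm{hf}}^{\delta}(\cdot;\alpha,K)
= \sum_{k=1}^{K}\exp(\lambda_k T)\,c_k^{\delta}\,\varphi_k + \sum_{k=K+1}^{\infty} E_{\alpha,1}(-\lambda_k T^{\alpha})^{-1}\,\tilde c_k^{\delta}\,\varphi_k. \tag{10.53}
\]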
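Initially K is determined by the discrepancy principle
\[
K = \min\{k \in \mathbb{N} : \|\exp(-AT)\,u_{0,\mathrm{lf}}^{\delta} - g^{\delta}\| \le \tau\delta\} \tag{10.54}
\]
for some fixed τ > 1. This determines the low frequency part $u_{0,\mathrm{lf}}^{\delta}(\cdot;K)$. After this is done, $u_{0,\mathrm{hf}}^{\delta}(\cdot;\alpha,K)$ is computed with α calibrated according to the discrepancy principle,
\[
\underline{\tau}\,\tilde\delta \le \|\exp(-AT)\,u_0^{\delta}(\cdot;\alpha,K) - \tilde g^{\delta}\| \le \overline{\tau}\,\tilde\delta. \tag{10.55}
\]
Theorem 10.5. Let $u_0 \in \dot H^{2(1+2/p)}(\Omega)$ for some p ∈ (1, ∞), and let $u_0^{\delta}(\cdot;\alpha,K)$ be defined by (10.53) with $\tilde g^{\delta} = w^{(i_*)}$ according to (10.37), (10.38), (10.39), with $s \ge 1 + \frac{1}{p}$, and assume that $K = K(g^{\delta},\delta)$ and $\alpha = \alpha(\tilde g^{\delta},\tilde\delta)$ are chosen according to (10.54) and (10.55). Then
\[
u_0^{\delta}(\cdot;\alpha(\tilde g^{\delta},\tilde\delta),K(g^{\delta},\delta)) \rightharpoonup u_0 \ \text{ in } L^2(\Omega), \qquad \text{as } \delta \to 0.
\]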
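Proof. The discrepancy principle (10.54) for K in terms of Fourier coefficients reads as
\[
\Big(\sum_{k=K+1}^{\infty}(c_k^{\delta})^2\Big)^{1/2} \le \tau\delta \le \Big(\sum_{k=K}^{\infty}(c_k^{\delta})^2\Big)^{1/2},
\]
which, due to the fact that $c_k^{\delta} = c_k^{\delta} - c_k + e^{-\lambda_k T}a_k$ and the triangle inequality as well as (10.19), implies
\[
\Big(\sum_{k=K}^{\infty}\big(e^{-\lambda_k T}a_k\big)^2\Big)^{1/2} \ge (\tau - 1)\delta \quad\text{and}\quad
\Big(\sum_{k=K+1}^{\infty}\big(e^{-\lambda_k T}a_k\big)^2\Big)^{1/2} \le (\tau + 1)\delta. \tag{10.56}
\]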
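From the discrepancy principle (10.55) for α we conclude
\[
\underline{\tau}\,\tilde\delta \le \Big(\sum_{k=1}^{K}(c_k^{\delta} - \tilde c_k^{\delta})^2 + \sum_{k=K+1}^{\infty}\big(w_k(\alpha)\,\tilde c_k^{\delta}\big)^2\Big)^{1/2} \le \overline{\tau}\,\tilde\delta,
\]
where again we can use $\tilde c_k^{\delta} = \tilde c_k^{\delta} - c_k + e^{-\lambda_k T}a_k$ and the triangle inequality, as well as (10.19), (10.40), and (10.18) (cf. (10.46)) to conclude
\[
(\underline{\tau} - \tilde C)\,\tilde\delta - (1 + C_1)\delta \le \Big(\sum_{k=K+1}^{\infty}\big(w_k(\alpha)\,e^{-\lambda_k T}a_k\big)^2\Big)^{1/2} \le (\overline{\tau} + \tilde C)\,\tilde\delta + (1 + C_1)\delta,
\]
where $(1 + C_1)\delta \le \tilde C_1\tilde\delta$. For the error in the initial data this yields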
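\[
\begin{aligned}
\|u_0^{\delta}(\cdot;\alpha,K) - u_0\|_{L^2(\Omega)}
&= \Big(\sum_{k=1}^{K}\big(e^{\lambda_k T}c_k^{\delta} - a_k\big)^2 + \sum_{k=K+1}^{\infty}\Big(\frac{1}{E_{\alpha,1}(-\lambda_k T^{\alpha})}\,\tilde c_k^{\delta} - a_k\Big)^2\Big)^{1/2}\\
&= \Big(\sum_{k=1}^{K}\big(e^{\lambda_k T}(c_k^{\delta} - c_k)\big)^2 + \sum_{k=K+1}^{\infty}\Big(\frac{1}{\lambda_k E_{\alpha,1}(-\lambda_k T^{\alpha})}\,\lambda_k(\tilde c_k^{\delta} - c_k) + w_k(\alpha)a_k\Big)^2\Big)^{1/2}\\
&\le e^{\lambda_K T}\delta + \frac{\bar C}{1-\alpha}\,\tilde\delta + \Big(\sum_{k=K+1}^{\infty}\big(w_k(\alpha)a_k\big)^2\Big)^{1/2}\\
&\le \frac{1}{\tau - 1}\Big(\sum_{k=K}^{\infty}\big(e^{(\lambda_K - \lambda_k)T}a_k\big)^2\Big)^{1/2}
+ \frac{\bar C}{\underline{\tau} - \tilde C - \tilde C_1}\Big(\sum_{k=K+1}^{\infty}\Big(\frac{w_k(\alpha)\,e^{-\lambda_k T}}{1-\alpha}\,a_k\Big)^2\Big)^{1/2}
+ \Big(\sum_{k=K+1}^{\infty}\big(w_k(\alpha)a_k\big)^2\Big)^{1/2}.
\end{aligned}
\]
Since the right-hand side estimate in (10.56) implies the convergence
\[
\sum_{k=K}^{\infty}\big(e^{(\lambda_K - \lambda_k)T}a_k\big)^2 \le \sum_{k=K}^{\infty}(a_k)^2 \to 0 \quad \text{as } \delta \to 0
\]
of the first term, the rest of the proof goes analogously to that of Theorem 10.3.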
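In Fourier space the rule (10.54) reduces to finding the smallest K whose tail sum of data coefficients falls below τδ, after which (10.53) is assembled coefficient by coefficient. The sketch below illustrates this under stated assumptions: the data are given as finitely many coefficients, and the Mittag-Leffler factor is supplied by the caller as a hypothetical callable `ml(lam)` returning $E_{\alpha,1}(-\lambda T^{\alpha})$, since a robust evaluation for large arguments is outside the scope of this example.

```python
import math

def choose_K(c_delta, tau, delta):
    """Smallest K >= 1 with (sum_{k>K} (c_k^delta)^2)^(1/2) <= tau*delta, cf. (10.54)."""
    for K in range(1, len(c_delta) + 1):
        tail = math.sqrt(sum(c * c for c in c_delta[K:]))
        if tail <= tau * delta:
            return K
    return len(c_delta)

def split_frequency_reconstruction(c_delta, c_tilde, lams, T, K, ml):
    """Coefficients of (10.53).

    Modes k <= K: exp(lam_k T) * c_k^delta (plain parabolic inversion).
    Modes k >  K: c~_k^delta / ml(lam_k), with ml(lam) = E_{alpha,1}(-lam T^alpha)
    supplied by the caller (hypothetical interface).
    """
    out = []
    for k, (c, ct, lam) in enumerate(zip(c_delta, c_tilde, lams), start=1):
        out.append(math.exp(lam * T) * c if k <= K else ct / ml(lam))
    return out
```

As a consistency check, passing `ml = lambda lam: math.exp(-lam * T)` (the limiting case α = 1) with noise-free data returns the original coefficients of u0.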
10.2. Sideways subdiffusion and superdiffusion

Another classical inverse problem for the heat equation is the sideways problem, where one seeks to recover information on values of the pde solution on some inaccessible part of the boundary from overposed data at some other boundary part. We will again focus on the spatially one-dimensional case and consider either time or space fractional derivatives leading to sub- or superdiffusion, respectively.

10.2.1. Sideways subdiffusion. There are several different versions of this depending upon whether the direct problem from which it arises is in a quarter plane in space-time or on a finite interval in space.

10.2.1.1. Quarter-plane sideways problem. The quarter-plane problem is defined for x > 0 and t > 0:
\[
\begin{aligned}
u_t &= u_{xx}, && x > 0,\ t > 0,\\
u(0,t) &= f(t), && t > 0,\\
u(x,0) &= u_0(x), && x > 0,
\end{aligned}
\]
assuming a suitable boundedness condition such as $|u(x,t)| \le c_1 e^{c_2 x^2}$, $c_1, c_2 > 0$. Given smooth initial and left-hand data, the problem is known to be well-posed. The sideways problem arises when f(t) is unknown, but we are able to measure a time trace of u at a point x = L, i.e., h(t) = u(L, t), and wish to recover f. This classical problem is known to be severely ill-posed. The fractional sideways problem in a quarter plane reads:
\[
\begin{aligned}
\partial_t^{\alpha}u &= u_{xx}, && x > 0,\ t > 0,\\
u(0,t) &= f(t), && t > 0,\\
u(x,0) &= u_0(x), && x > 0,
\end{aligned}
\tag{10.57}
\]
where u(x, t) is assumed to be suitably bounded as x → ∞. We want to determine f given the possibly noisy data h(t) = u(L, t). First, we note that the solution u to the direct problem (10.57) is given by
\[
u(x,t) = \int_0^t G_{\alpha,x}(x,t-s)\,f(s)\,ds, \tag{10.58}
\]
where $G_{\alpha,x}(x,t)$ is the space derivative of the fundamental solution $G_{\alpha}(x,t)$ to the subdiffusion model given by (6.5). The representation (10.58) was already derived in Section 6.1 and can also be obtained by appealing to Duhamel’s principle. It is well known in the case of the heat equation [49, 55]. Setting u(L, t) = h(t) gives the solution f(t) of the quarter-plane sideways problem in terms of a Volterra equation of the first kind
\[
h(t) = \int_0^t R_{\alpha}(t-s)\,f(s)\,ds, \tag{10.59}
\]
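with $R_{\alpha}(s) = G_{\alpha,x}(L,s)$. The kernel can be explicitly written as a Wright function (cf. Section 3.5),
\[
R_{\alpha}(s) = \frac{1}{2s^{\alpha}}\,W_{-\frac{\alpha}{2},\,2-\frac{\alpha}{2}}\big(-Ls^{-\alpha/2}\big)
= \frac{1}{2}\sum_{k=0}^{\infty}\frac{(-L)^k}{k!\,\Gamma\big(-\frac{\alpha}{2}k + 2 - \frac{\alpha}{2}\big)}\,s^{-k\frac{\alpha}{2}-\alpha}. \tag{10.60}
\]
In case of α = 1 (i.e., classical diffusion), the kernel R(s) is given explicitly by
\[
R(s) = \frac{L}{2\sqrt{\pi}}\,s^{-\frac{3}{2}}\,e^{-\frac{L^2}{4s}}.
\]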
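For the classical case α = 1 the flatness of the kernel at s = 0, which drives the severe ill-posedness of the Volterra equation (10.59), can be seen directly by evaluating R(s). The snippet below is a small illustration with L = 1 as an assumed default; the function name is hypothetical.

```python
import math

def heat_sideways_kernel(s, L=1.0):
    """R(s) = L / (2 sqrt(pi)) * s^(-3/2) * exp(-L^2 / (4 s)) for s > 0."""
    return L / (2.0 * math.sqrt(math.pi)) * s ** (-1.5) * math.exp(-L * L / (4.0 * s))

if __name__ == "__main__":
    for s in (1e-3, 1e-2, 1e-1, 1.0 / 6.0, 0.5, 1.0):
        # R(s) / s^5 stays tiny near s = 0: the kernel vanishes faster than any power
        print(f"s={s:8.4f}  R(s)={heat_sideways_kernel(s):.3e}  "
              f"R(s)/s^5={heat_sideways_kernel(s) / s ** 5:.3e}")
```

Dividing by any fixed power of s does not prevent the values from tending to zero as s → 0, which reflects the fact that all derivatives of R vanish there. The kernel attains its maximum at s = L²/6.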
For any α ≤ 1, the kernel is infinitely smooth, $R_{\alpha} \in C^{\infty}(0,\infty)$. This is just a statement that the solution of the forwards problem lies in $C^{\infty}$ in t for any L > 0, even if f is only in $L^1(0,\infty)$. The classical way to analyse the degree of ill-posedness of a Volterra integral equation of the first kind is to differentiate it sufficiently often, such that for some m, the mth derivative at s = 0 is nonzero, thus converting it into a Volterra equation of the second kind, which will have a unique and stable solution (although we must then take into account the fact that we have differentiated both sides of the equation, including the data term, m times). However, this fails in our case since all derivatives of $R_{\alpha}$ vanish at s = 0, and this implies that the problem is extremely ill conditioned. This concurs with an earlier observation: the mapping F from unknown f to data h takes $L^1$ functions onto $C^{\infty}$ functions, and the inversion cannot be anything but very ill conditioned. This is well known for the heat equation, and also holds for the fractional case. Indeed, by Theorem 3.27, $W_{-\frac{\alpha}{2},\,2-\frac{\alpha}{2}}(-z)$ decays exponentially to zero as z → ∞. Thus the inverse problem has the same qualitative ill-conditioning for all α, 0 < α ≤ 1, but quantitatively, there are still considerable differences due to the subtle difference in the exponential decay.

10.2.1.2. Finite-interval sideways problem. Now we consider the lateral Cauchy problem in the unit interval Ω = (0, 1),
\[
\begin{aligned}
u_t &= u_{xx}, && 0 < x < 1,\ t > 0,\\
u_x(0,t) &= g(t), && t > 0,\\
u(1,t) &= f(t), && t > 0,\\
u(x,0) &= u_0(x), && 0 < x < 1,
\end{aligned}
\tag{10.61}
\]
which in the subdiffusion case is replaced by
\[
\begin{aligned}
\partial_t^{\alpha}u &= u_{xx}, && 0 < x < 1,\ t > 0,\\
u_x(0,t) &= g(t), && t > 0,\\
u(1,t) &= f(t), && t > 0,\\
u(x,0) &= u_0(x), && 0 < x < 1.
\end{aligned}
\]
The inverse problem is to determine f(t) = u(1, t), t > 0, for given u0(x), x ∈ (0, 1), as well as g(t) and h(t) = u(0, t), t > 0.

We shall take the Laplace transform (denoted by a hat) of the direct problem. The Laplace transform of the Djrbashian–Caputo derivative is $\widehat{\partial_t^{\alpha}u}(z) = z^{\alpha}\hat u(z) - z^{\alpha-1}u_0$, and with $u_0(x) = 0$ the governing equation becomes
\[
z^{\alpha}\hat u(x,z) - \hat u_{xx}(x,z) = 0, \quad\text{with}\quad \hat u(0,z) = \hat h(z), \quad \hat u_x(0,z) = \hat g(z).
\]
The general solution of the transformed equation is therefore
\[
\hat u(x,z) = \hat h(z)\cosh(z^{\alpha/2}x) + \frac{\hat g(z)}{z^{\alpha/2}}\sinh(z^{\alpha/2}x),
\]
and setting x = 1 gives an explicit formula for the Laplace transform $\hat f(z)$ of f,
\[
\hat f(z) = \hat h(z)\cosh(z^{\alpha/2}) + \frac{\hat g(z)}{z^{\alpha/2}}\sinh(z^{\alpha/2}). \tag{10.62}
\]
This shows that we have a unique solution, but the existence of the exponentially growing factors $\cosh(z^{\alpha/2})$ and $\sinh(z^{\alpha/2})/z^{\alpha/2}$ also indicates the level of the severe ill-conditioning. In theory, the solution in the physical domain can be obtained by computing the inverse Laplace transform
\[
f(t) = \frac{1}{2\pi i}\int_{Br}\hat f(z)\,e^{zt}\,dz,
\]
where Br is the Bromwich path.

The growth of the multipliers, which for both terms are asymptotically of order $e^{z^{\alpha/2}}$, indicates the degree of ill-conditioning. Actually the forward map F is infinitely smoothing, showing that the inverse map is exponentially ill-posed. The larger the value of α, the greater the growth factor. Hence, while the lateral sideways problem is severely ill conditioned, the ill-conditioning will be stronger as α → 1−, but we might expect to see a significant decrease as α → 0+.

Figure 10.7 shows the first 100 singular values for the discrete problem at T = 1 (actually we plot only every second value for the clarity of the symbols in the figure) for α ∈ {1/4, 1/2, 3/4, 1}. This shows that the singular values lie approximately along a straight line with the vertical axis having equally spaced powers of 10. Thus in this sense the singular values decay exponentially and the problem is severely ill conditioned.
However, for α less than about 1/2, this ill-conditioning is extremely mild and one could effectively use an even larger number of singular values. For α > 1/2 this changes, and at some point the values tend to zero at a much faster rate, the greater as α approaches 1. However, for all α including unity, truncating these values at a modest level will still lead to a reasonable inversion. In fact for every α away from 1 the condition numbers indicate a reasonable inversion might be possible, and even when α = 1 the situation is easily rendered harmless by an effective regularisation strategy.

[Semi-logarithmic plot of the singular values (vertical axis from 1 down to 10^−5) against the index from 1 to 100, for α = 1/4, 1/2, 3/4, 1.]
Figure 10.7. Singular values for the sideways problem
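The dependence of the growth factor in (10.62) on α can be illustrated by evaluating the multiplier of $\hat h$ on the imaginary axis. This is a minimal sketch: taking z = iω as a stand-in for a point on the Bromwich path and the particular frequency are illustrative assumptions, and the function name is hypothetical.

```python
import cmath

def sideways_multiplier(omega, alpha):
    """|cosh(z^(alpha/2))| at z = i*omega, the factor multiplying h-hat in (10.62)."""
    z = complex(0.0, omega)
    return abs(cmath.cosh(z ** (alpha / 2.0)))  # principal branch of the power

if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75, 1.0):
        print(f"alpha={alpha:4.2f}  multiplier at omega=1e4: "
              f"{sideways_multiplier(1e4, alpha):.3e}")
```

At the same frequency the amplification increases by many orders of magnitude as α moves from 1/4 to 1, in line with the behaviour of the singular values described above.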
10.2.2. Sideways superdiffusion. We now replace the space (instead of the time) derivative in the classical sideways diffusion problem (10.61) by a fractional one of order β ∈ (1, 2),
\[
u_t - {}^{DC}_{\ \ 0}D_x^{\beta}u = 0, \qquad 0 < x < 1,\ t > 0,
\]
where ${}^{DC}_{\ \ 0}D_x^{\beta}$ denotes the Djrbashian–Caputo derivative and Ω = (0, 1). Again, we will study the setting of Cauchy data being given at one boundary point, and the value of u is supposed to be determined at the opposite point, so either
\[
\begin{aligned}
u_t &= {}^{DC}_{\ \ 0}D_x^{\beta}u, && 0 < x < 1,\ t > 0,\\
u_x(0,t) &= g(t), && t > 0,\\
u(0,t) &= h(t), && t > 0,\\
u(x,0) &= u_0(x), && 0 < x < 1,
\end{aligned}
\tag{10.63}
\]
and f(t) = u(1, t), or
\[
\begin{aligned}
u_t &= {}^{DC}_{\ \ 0}D_x^{\beta}u, && 0 < x < 1,\ t > 0,\\
u_x(1,t) &= g(t), && t > 0,\\
u(1,t) &= h(t), && t > 0,\\
u(x,0) &= u_0(x), && 0 < x < 1,
\end{aligned}
\tag{10.64}
\]
and f(t) = u(0, t). Obviously the directionality of this derivative must play a role for identifiability and ill-posedness in this problem. This becomes completely transparent in the extremal case β = 1, where the solution u can be easily determined as u(x, t) = u(0, x + t) = u(1, x + t − 1) = u(x + t, 0), which implies that the information provided by g is redundant (as a matter of fact, g needs to satisfy the compatibility condition h′ = g). In the left-to-right transport case (10.63), there is an additional consistency condition on the initial data u0(x) = h(x), and for the searched-for boundary values we get the identity f(t) = u(1, t) = h(t + 1), so that measurements on
an interval [0, T ] only provide f (t) on the interval [0, T − 1]—the values f (t) for t ∈ (t − 1, T ] are undetermined. In the right-to-left transport case (10.64), u0 only needs to be compatible at x = 1, u0 (1) = h(0) and we have f (t) = u(0, t) = h(t − 1) for t ≥ 1 and f (t) = u(t, 0) = u0 (t) for t ∈ [0, 1]. That is, in this case, the problem of recovering f from u0 , h is even wellβ posed. Although DC 0 Dx is not continuous from the left as β 1, we will recover some of this behaviour in the fractional case. To this end, we again take the Laplace transform in time. For simplicity of exposition assume u0 ≡ 0, β (x, z) = 0, z u(x, z) − DC 0 Dx u
(0, z), c1 = u x (0, z) and use Theorem 5.4 with n = 2, μ = z, g = 0, c0 = u to obtain u (x, z) = u (0, z)Eβ,1 (zxβ ) + u x (0, z)xEβ,2 (zxβ ) . In the left-to-right case (10.63), with u (0, z) = h(z), u x (0, z) = g(z), u (1, z) = f (z), this gives (10.65)
f(z) = h(z)Eβ,1 (z) + g(z)Eβ,2 (z).
The derivation is slightly more complicated in the right-to-left case (10.64): d γ−1 x Eβ,γ (zxβ ) = zxγ−2 Eβ,γ−1 (zxβ ), cf. using the differentiation formula dx (3.34), we get the linear system x (0, z)Eβ,2 (z) = u (1, z)(z), u (0, z)Eβ,1 (z) + u x (0, z)Eβ,1 (z) = z −1 u x (1, z), u (0, z)Eβ,0 (z) + u (0, z) = whose solution, taking into account u (1, z) = h(z), u x (1, z) = g(z), u f (z) yields (10.66)
h(z) − z −1 Eβ,2 (z) g (z) Eβ,1 (z) . f(z) = Eβ,1 (z)2 − Eβ,0 (z)Eβ,2 (z)
In both cases (10.65) and (10.66), the time domain version $f(t)$ can be recovered by an inverse Laplace transform
\[ f(t) = \frac{1}{2\pi i}\int_{Br} e^{zt}\hat f(z)\,dz, \]
where the integral is taken over the Bromwich path. In the case where $\beta = 2$, this gives $\cosh\sqrt{z}$ and $\frac{1}{\sqrt{z}}\sinh\sqrt{z}$ as multipliers to the data $\hat h(z)$ and $\hat g(z)$, resulting in the exponential ill-conditioning of the sideways heat problem.
In the case where $\beta \in (1,2)$, the exponential asymptotics in Theorem 3.23,
\[ E_{\beta,1}(z) \sim \frac{1}{\beta}\,e^{z^{1/\beta}}, \qquad E_{\beta,2}(z) \sim \frac{1}{\beta}\,z^{-1/\beta}e^{z^{1/\beta}}, \]
indicate that the left-to-right problem (10.65) still suffers from exponentially growing multipliers to the data, and thus the problem is still severely ill-conditioned. Also the multipliers are asymptotically larger for the fractional order $\beta$ closer to unity. In other words, anomalous diffusion in space does not mitigate the ill-conditioned nature of the sideways problem, but in fact worsens the conditioning severely, if the transport takes place in the opposite direction of the memory of the fractional differential operator, which is actually quite intuitive.

In fact the situation is very different in the opposite case (10.66), when the memory of the differential operator supports the transport of information. From Theorem 3.23 and the fact that the Bromwich path lies in the sector $|\arg z| \le \pi/2$, we deduce that for large $|z|$, we have
\[ \begin{aligned} E_{\beta,1}(z)^2 &\sim \frac{1}{\beta^2}\,e^{2z^{1/\beta}} - \frac{2}{\beta\,\Gamma(1-\beta)}\,z^{-1}e^{z^{1/\beta}}, \\ E_{\beta,0}(z)\,E_{\beta,2}(z) &\sim \frac{1}{\beta^2}\,e^{2z^{1/\beta}} - \frac{1}{\beta\,\Gamma(2-\beta)}\,z^{1/\beta-1}e^{z^{1/\beta}}, \end{aligned} \]
and thus the denominator $E_{\beta,1}(z)^2 - E_{\beta,0}(z)E_{\beta,2}(z)$ behaves for large $|z|$ like
\[ E_{\beta,1}(z)^2 - E_{\beta,0}(z)\,E_{\beta,2}(z) \sim \frac{1}{\beta\,\Gamma(2-\beta)}\,z^{1/\beta-1}e^{z^{1/\beta}} \quad\text{as } |z| \to \infty. \]
This together with the exponential asymptotics of $E_{\beta,1}(z)$ and $E_{\beta,2}(z)$ from Theorem 3.23 indicates that the multipliers of $\hat h$ and $\hat g$ in (10.66) are growing at most at a very low-order polynomial rate for large $z$. Hence, the high-frequency components of the data noise are only mildly amplified (at most polynomially instead of exponentially). This analysis indicates that the sideways problem with the lateral Cauchy data specified at the point $x = 1$ is nearly well-posed, as long as the fractional order $\beta$ is bounded away from two, for which it reverts to the classical ill-posed sideways problem for the heat equation. As $\beta \to 1^+$ the information at $x = 0$ is transported to $x = 1$, almost free from distortion. In summary, depending on the location of the over-specified data, anomalous superdiffusion can either help or aggravate the conditioning of the sideways problem with a fractional space derivative. The reader is referred to [170] for the results of numerical simulations for various values of the exponent $\beta$ and the directional effect.
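The closed forms for $\beta = 2$ and the leading exponential asymptotics for $\beta \in (1,2)$ quoted above can be checked numerically. The following is only a minimal sketch: the function names `mittag_leffler` and `leading_asymptotic` are hypothetical, and the truncated power series is an assumption that is adequate for moderate real $z > 0$ only (it is not a general-purpose Mittag-Leffler evaluator).

```python
import math


def mittag_leffler(beta: float, gamma: float, z: float, terms: int = 200) -> float:
    """Truncated series E_{beta,gamma}(z) = sum_k z^k / Gamma(beta*k + gamma).

    Sketch for real z > 0 and gamma > 0 only; each term is formed in log
    space so that large factorial-type denominators do not overflow.
    """
    if z <= 0.0:
        raise ValueError("this sketch only handles real z > 0")
    log_z = math.log(z)
    return sum(
        math.exp(k * log_z - math.lgamma(beta * k + gamma)) for k in range(terms)
    )


def leading_asymptotic(beta: float, gamma: float, z: float) -> float:
    """Leading term (1/beta) * z^{(1-gamma)/beta} * exp(z^{1/beta})."""
    root = z ** (1.0 / beta)
    return root ** (1.0 - gamma) * math.exp(root) / beta


if __name__ == "__main__":
    # beta = 2: multipliers of the classical sideways heat problem.
    for z in (25.0, 100.0, 400.0):
        print(z, mittag_leffler(2.0, 1.0, z), math.cosh(math.sqrt(z)))
    # beta in (1, 2): ratio of the series value to the leading asymptotic term.
    for beta in (1.2, 1.5, 1.8):
        z = 50.0
        print(beta, mittag_leffler(beta, 1.0, z) / leading_asymptotic(beta, 1.0, z))
```

For fixed $z$ the printed values grow as $\beta$ decreases towards one, in line with the remark that the left-to-right multipliers are asymptotically larger for $\beta$ closer to unity.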
10.3. Inverse source problems

In this section, we focus on the basic model (with $\alpha \in (0,1)$)
\[ \partial_t^\alpha u - \Delta u = f \quad\text{in } \Omega \times (0,T], \tag{10.67} \]
on a $C^{1,1}$ domain $\Omega \subseteq \mathbb{R}^d$, $d \ge 1$, with suitable initial condition $u(0) = u_0$ and boundary conditions. The inverse problem of interest here is to recover the source term $f$, given additional data. There are many well-known examples for the heat equation [49]. Inverse source problems for the classical diffusion equation have been extensively studied; see, e.g., [47, 48, 158, 295]. Those for fractional diffusion equations have also received some attention, and we shall look at a few of them in this section. We shall not attempt recovery of a general source term $f$. In a later section we will consider the important nonlinear case of dependence on only the dependent variable, $f(u)$, as such reaction-diffusion equations have many physical applications. Our measurements will consist of either final time data or lateral boundary (or possibly domain interior) data, and in this section we look for a component of the source term that depends only on space or only on time; that is, we will take $f = f(x)$ or $f = f(t)$. By the linearity of problem (10.67), we may assume without loss of generality that $u_0 = 0$. With this restriction the inverse problem is a linear one.

There is a folklore theorem for the heat equation: inverse problems where the data is not aligned in the same direction as the unknown are almost surely severely ill-posed, but usually only mildly so when these directions do align. Of course, the “usually” statement is important for there are exceptions; the backwards parabolic problem being a notable example, although here as soon as we take the fractional diffusion case with $\alpha < 1$, the folklore theorem actually holds true. We will see that these ideas are followed by and large in the fractional case but with modifications that are worthy of note.

10.3.1. Recovering a spatially dependent source from final time data. First we consider the recovery of $f = f(x)$ in the equation
\[ \begin{aligned} \partial_t^\alpha u - \Delta u &= f(x) && \text{in } \Omega \times (0,T], \\ \frac{\partial u}{\partial\nu} + \gamma u &= 0 && \text{on } \partial\Omega \times (0,T], \\ u(\cdot,0) &= 0 && \text{in } \Omega, \end{aligned} \tag{10.68} \]
where γ is a given positive function. The given data is the final time data g(x) = u(x, T ), x ∈ Ω, for some T > 0. Using the eigenpairs {(λn , ϕn )} of
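the negative Laplacian with a Robin boundary condition, we have
\[ u(x,t) = \sum_{n=1}^\infty u_n(t)\varphi_n(x), \]
where the function $u_n(t)$ satisfies $\partial_t^\alpha u_n(t) + \lambda_n u_n(t) = \langle f,\varphi_n\rangle$, with $u_n(0) = 0$. A simple computation based on (6.18), (6.20) gives the following solution representation:
\[ \begin{aligned} u(x,t) &= \sum_{n=1}^\infty \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n(t-s)^\alpha)\,ds\,\langle f,\varphi_n\rangle\varphi_n(x) \\ &= \sum_{n=1}^\infty \lambda_n^{-1}\bigl(1 - E_{\alpha,1}(-\lambda_n t^\alpha)\bigr)\langle f,\varphi_n\rangle\varphi_n(x). \end{aligned} \tag{10.69} \]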
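The solution $u$ can be split into a steady-state component $u_s(x)$ and a dynamic or transient component $u_d(x,t)$: $u(x,t) = u_s(x) + u_d(x,t)$, with
\[ u_s(x) = \sum_{n=1}^\infty \lambda_n^{-1}\langle f,\varphi_n\rangle\varphi_n(x) \]
and
\[ u_d(x,t) = -\sum_{n=1}^\infty \lambda_n^{-1}E_{\alpha,1}(-\lambda_n t^\alpha)\langle f,\varphi_n\rangle\varphi_n(x). \]
It can be verified that the functions $u_s$ and $u_d$ satisfy, respectively,
\[ \begin{cases} -\Delta u_s = f & \text{in } \Omega, \\ \dfrac{\partial u_s}{\partial\nu} + \gamma u_s = 0 & \text{on } \partial\Omega \end{cases} \]
and
\[ \begin{cases} \partial_t^\alpha u_d - \Delta u_d = 0 & \text{in } \Omega \times (0,T], \\ \dfrac{\partial u_d}{\partial\nu} + \gamma u_d = 0 & \text{on } \partial\Omega \times (0,T], \\ u_d(0) = -\displaystyle\sum_{n=1}^\infty \lambda_n^{-1}\langle f,\varphi_n\rangle\varphi_n & \text{in } \Omega. \end{cases} \]
Hence the map from the source term $f(x)$ to the final data $g(x)$ is given by
\[ g(x) = \sum_{n=1}^\infty \lambda_n^{-1}\bigl(1 - E_{\alpha,1}(-\lambda_n T^\alpha)\bigr)\langle f,\varphi_n\rangle\varphi_n(x). \]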
Upon taking the inner product of both sides with ϕn , we get a direct relationship between the Fourier coefficients of the unknown f and the data g
as well as the inversion of the map connecting them:
\[ \langle f,\varphi_n\rangle = \frac{\lambda_n\langle g,\varphi_n\rangle}{1 - E_{\alpha,1}(-\lambda_n T^\alpha)}. \tag{10.70} \]
By the complete monotonicity of the Mittag-Leffler function for negative argument, $1 = E_{\alpha,1}(0) > E_{\alpha,1}(-\lambda_1 T^\alpha) \ge E_{\alpha,1}(-\lambda_2 T^\alpha) > \cdots$, and thus formula (10.70) is well-defined for any $T > 0$, and it gives the precise condition for the existence of a source term and the estimate
\[ |\langle f,\varphi_n\rangle| \le \bigl(1 - E_{\alpha,1}(-\lambda_1 T^\alpha)\bigr)^{-1}\lambda_n|\langle g,\varphi_n\rangle|. \]
Upon summing over all frequencies, we obtain the same stability estimate as in the backward subdiffusion case,
\[ \|f\|_{L^2(\Omega)} \le c\,\|g\|_{H^2(\Omega)}, \]
but with the important addition that this holds for all $\alpha \in (0,1]$, including the heat equation case $\alpha = 1$. This should not be surprising, since for large terminal time, the steady state $u_s$ dominates, and this is precisely a two derivative loss. The transient term $u_d$ does depend on $\alpha$ but not in a significant way. Looking at the singular value spectrum in [170, Fig. 7] confirms the analysis. The decay of the singular values is algebraic; for a final time of $T = 1$, $\sigma_{50}$ is approximately $10^{-4}$ and the differences between the spectra are almost independent of $\alpha$. When $T$ is small, these differences become more pronounced as the effect of the term $T^\alpha$ has more significance, and in fact the condition number for the problem actually increases with decreasing $\alpha$.

This analysis extends to the setting
\[ \begin{aligned} \partial_t^\alpha u - \Delta u &= f(x)r(t) && \text{in } \Omega \times (0,T], \\ \frac{\partial u}{\partial\nu} + \gamma u &= 0 && \text{on } \partial\Omega \times (0,T], \\ u(\cdot,0) &= 0 && \text{in } \Omega, \end{aligned} \tag{10.71} \]
with a known time-dependent function,
\[ r \in L^\infty(0,T) \quad\text{with } r(t) \ge \underline{r} > 0, \quad t \in (0,T). \tag{10.72} \]
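Taking the inner product with $\varphi_n$, one sees that the functions defined by $u_n(t) = \langle u(\cdot,t),\varphi_n\rangle$ satisfy $\partial_t^\alpha u_n + \lambda_n u_n = \langle f,\varphi_n\rangle r(t)$ in $(0,T]$ with zero initial condition. Therefore using the functions
\[ \mathcal{E}_{\alpha,n}(t) := \frac{1}{\lambda_n}E_{\alpha,1}(-\lambda_n t^\alpha) \tag{10.73} \]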
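(see also (10.103) below), which by Theorem 3.22 are completely monotone and by (3.34) satisfy $\mathcal{E}_{\alpha,n}'(t) = -t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n t^\alpha)$, we have the representation
\[ u_n(t) = -\int_0^t \mathcal{E}_{\alpha,n}'(t-s)\,r(s)\,ds\,\langle f,\varphi_n\rangle, \]
and therefore $\langle g,\varphi_n\rangle = u_n(T) = -\int_0^T \mathcal{E}_{\alpha,n}'(T-s)\,r(s)\,ds\,\langle f,\varphi_n\rangle$, leading to the explicit formula
\[ \langle f,\varphi_n\rangle = \frac{\lambda_n\langle g,\varphi_n\rangle}{-\lambda_n\int_0^T \mathcal{E}_{\alpha,n}'(T-s)\,r(s)\,ds}, \tag{10.74} \]
cf. (10.70). The denominator in (10.74) is uniformly bounded away from zero under our assumptions (10.72) on $r$, since for any $t \in (0,T]$, and in particular for $t = T$,
\[ \begin{aligned} -\lambda_n\int_0^t \mathcal{E}_{\alpha,n}'(t-s)\,r(s)\,ds &= \underline{r}\bigl(1 - E_{\alpha,1}(-\lambda_n t^\alpha)\bigr) + \lambda_n\int_0^t \underbrace{\bigl(-\mathcal{E}_{\alpha,n}'(t-s)\bigr)}_{\ge 0}\,\underbrace{\bigl(r(s) - \underline{r}\bigr)}_{\ge 0}\,ds \\ &\ge \underline{r}\bigl(1 - E_{\alpha,1}(-\lambda_1 t^\alpha)\bigr). \end{aligned} \]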
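To make the two-derivative loss in (10.70) concrete, the multipliers $\lambda_n / (1 - E_{\alpha,1}(-\lambda_n T^\alpha))$ can be tabulated. The sketch below rests on several assumptions that are not part of the text: the model eigenvalues $\lambda_n = (n\pi)^2$ of the Dirichlet Laplacian on $(0,1)$ stand in for the Robin eigenvalues, and only $\alpha = 1$ and $\alpha = 1/2$ are covered, since for these the Mittag-Leffler function has the elementary closed forms $E_{1,1}(-x) = e^{-x}$ and $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$. All function names are hypothetical.

```python
import math


def erfcx(x: float) -> float:
    """Scaled complementary error function exp(x^2) * erfc(x) for x >= 0."""
    if x < 20.0:
        return math.exp(x * x) * math.erfc(x)
    # Large-x asymptotic expansion, used to avoid overflow in exp(x^2).
    inv = 1.0 / (2.0 * x * x)
    return (1.0 - inv + 3.0 * inv**2 - 15.0 * inv**3) / (x * math.sqrt(math.pi))


def ml_neg(alpha: float, x: float) -> float:
    """E_{alpha,1}(-x) for x >= 0; closed forms for alpha = 1 and alpha = 1/2 only."""
    if alpha == 1.0:
        return math.exp(-x)
    if alpha == 0.5:
        return erfcx(x)
    raise ValueError("this sketch only covers alpha = 1 and alpha = 1/2")


def source_multipliers(alpha: float, T: float, n_modes: int) -> list:
    """Multipliers lambda_n / (1 - E_{alpha,1}(-lambda_n T^alpha)) from (10.70).

    Assumes the model eigenvalues lambda_n = (n*pi)^2 (Dirichlet, unit interval).
    """
    result = []
    for n in range(1, n_modes + 1):
        lam = (n * math.pi) ** 2
        result.append(lam / (1.0 - ml_neg(alpha, lam * T**alpha)))
    return result


if __name__ == "__main__":
    for alpha in (1.0, 0.5):
        mults = source_multipliers(alpha, T=1.0, n_modes=5)
        print(alpha, [round(m, 3) for m in mults])
```

In both cases the multipliers stay within a fixed factor of $\lambda_n$, which is the algebraic (two-derivative) growth behind the $H^2$ stability estimate.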
10.3.2. Determining a time dependent component from lateral boundary data. Now we turn to the case of recovering the component $f(t)$ in
\[ \partial_t^\alpha u - \Delta u = r(x)f(t) \quad\text{in } \Omega \times (0,T] \]
from the values of the function $u$ (or of the flux) at a fixed point $x_0$ on the boundary $\partial\Omega$ or in the domain $\Omega$, where we assume $r(x)$ is known. The solution to the direct problem is readily available from (6.18), (6.20):
\[ h(t) = u(x_0,t) = \sum_{n=1}^\infty \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n(t-s)^\alpha)f(s)\,ds\,\langle r,\varphi_n\rangle\varphi_n(x_0). \]
In particular, if $r = \varphi_n$ for some $n$, then the problem becomes
\[ h(t) = \int_0^t R_{\alpha,n}(t-s)f(s)\,ds \quad\text{with}\quad R_{\alpha,n}(t) = t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n t^\alpha)\langle r,\varphi_n\rangle\varphi_n(x_0). \]
This is a Volterra integral equation of the first kind for $f(t)$, and the analysis hinges around the behaviour of the kernel $R_{\alpha,n}$ near $t = 0$. Since $E_{\alpha,\alpha}(-\lambda_n t^\alpha) \to 1/\Gamma(\alpha)$ as $t \to 0$, the kernel is weakly singular. Taking a fractional derivative $\partial_t^\alpha$ of order $\alpha$ of both sides gives
\[ \partial_t^\alpha h(t) = \varphi_n(x_0)f(t) - \lambda_n\int_0^t R_{\alpha,n}(t-s)f(s)\,ds, \]
which is a Volterra integral equation of the second kind with a unique solution $f$ that depends continuously on $R_{\alpha,n}$ and $\partial_t^\alpha h$. Hence, the inverse problem is very mildly ill-posed, requiring only an $\alpha$-derivative loss on the data $h$, as can be seen from the estimate
\[ \|f\|_{L^\infty(0,T)} \le C\,\|\partial_t^\alpha h\|_{L^\infty(0,T)}. \]
The assertion holds for the general case [306, Theorem 4.4], and we will here prove it under slightly weaker conditions on $r$. Consider the initial boundary value problem,
\[ \begin{cases} \partial_t^\alpha u(x,t) = \Delta u(x,t) + r(x)f(t), & (x,t) \in \Omega \times (0,T), \\ u(x,t) = 0, & (x,t) \in \partial\Omega \times (0,T), \\ u(x,0) = 0, & x \in \Omega. \end{cases} \tag{10.75} \]
We consider the following inverse source problem: given $r(x)$, determine $f(t)$, $0 < t < T$, from $h(t) = u(x_0,t)$, $0 < t < T$, for $x_0 \in \Omega$.

Theorem 10.6. Let $r \in \dot H^{2\beta}(\Omega)$, with $\beta > d/2$, $r(x_0) \ne 0$. Then the solution $u$ to problem (10.75) with $f \in C[0,T]$ satisfies
\[ c\,\|\partial_t^\alpha u(x_0,\cdot)\|_{L^\infty(0,T)} \le \|f\|_{L^\infty(0,T)} \le C\,\|\partial_t^\alpha u(x_0,\cdot)\|_{L^\infty(0,T)} \tag{10.76} \]
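for some constants $c, C > 0$ independent of $f$.

Proof. Since $f \in C[0,T]$ and $r \in \dot H^{2\beta}(\Omega)$, with $\mathcal{E}_{\alpha,n}$ as in (10.73), we obtain
\[ u(x,t) = -\sum_{n=1}^\infty \int_0^t \mathcal{E}_{\alpha,n}'(t-s)f(s)\,ds\,\langle r,\varphi_n\rangle\varphi_n(x) \]
in $L^2(0,T;H^2(\Omega))$ and
\[ \partial_t^\alpha u(x_0,t) = f(t)r(x_0) + \sum_{n=1}^\infty \lambda_n\int_0^t \mathcal{E}_{\alpha,n}'(t-s)f(s)\,ds\,\langle r,\varphi_n\rangle\varphi_n(x_0) = f(t)r(x_0) + d(t) \tag{10.77} \]
in $L^2(\Omega \times (0,T))$. We have for any $t \in (0,T]$,
\[ |d(t)| \le \sum_{n=1}^\infty \lambda_n\,\|\mathcal{E}_{\alpha,n}'\|_{L^1(0,T)}\,\|f\|_{L^\infty(0,T)}\,|\langle r,\varphi_n\rangle|\,|\varphi_n(x_0)|, \]
where by complete monotonicity of $\mathcal{E}_{\alpha,n}$,
\[ \|\mathcal{E}_{\alpha,n}'\|_{L^1(0,T)} = \int_0^T \bigl(-\mathcal{E}_{\alpha,n}'\bigr)(t)\,dt = \mathcal{E}_{\alpha,n}(0) - \mathcal{E}_{\alpha,n}(T) = \frac{1 - E_{\alpha,1}(-\lambda_n T^\alpha)}{\lambda_n} \le \frac{1}{\lambda_n} \]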
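and $|\varphi_n(x_0)| \le \|\varphi_n\|_{L^\infty(\Omega)} \le C\,\|\varphi_n\|_{\dot H^s(\Omega)} = C\lambda_n^{s/2}$ for $s > d/2$. Thus using the Cauchy–Schwarz inequality and the fact that $r \in \dot H^{2\beta}(\Omega)$, we obtain
\[ \begin{aligned} |d(t)| &\le C\,\|f\|_{L^\infty(0,T)}\sum_{n=1}^\infty \lambda_n^{s/2}|\langle r,\varphi_n\rangle| \\ &\le C\,\|f\|_{L^\infty(0,T)}\Bigl(\sum_{n=1}^\infty \lambda_n^{s-2\beta}\Bigr)^{1/2}\Bigl(\sum_{n=1}^\infty \lambda_n^{2\beta}|\langle r,\varphi_n\rangle|^2\Bigr)^{1/2} \\ &= C\,\|f\|_{L^\infty(0,T)}\Bigl(\sum_{n=1}^\infty \lambda_n^{s-2\beta}\Bigr)^{1/2}\|r\|_{\dot H^{2\beta}(\Omega)}, \end{aligned} \]
where due to the Weyl eigenvalue estimate, $\lambda_n \sim n^{2/d}$, the sum converges if and only if $(s - 2\beta)\,2/d < -1$, that is, $s < 2\beta - d/2$, which is compatible with the above condition $s > d/2$ since $\beta > d/2$. Inserting this into (10.77) yields the lower bound in (10.76). Setting
\[ R_\alpha(t) = -\sum_{n=1}^\infty \lambda_n\langle r,\varphi_n\rangle\,t^{\alpha-1}E_{\alpha,\alpha}(-\lambda_n t^\alpha)\,\varphi_n(x_0) \]
and proceeding as above, we can show that $\|R_\alpha\|_{L^1(0,T)} \le C\,\|r\|_{\dot H^{2\beta}(\Omega)}$. With this as a convolution kernel, $f$ satisfies the second kind Volterra integral equation
\[ f(t) = \frac{1}{r(x_0)}\Bigl(\partial_t^\alpha u(x_0,t) - \int_0^t R_\alpha(t-s)f(s)\,ds\Bigr), \quad 0 < t < T, \tag{10.78} \]
by $r(x_0) \ne 0$. Applying [120, Theorems 3.1, 3.5], we get
\[ |f(t)| \le \frac{1}{|r(x_0)|}\,\|\partial_t^\alpha u(x_0,\cdot)\|_{L^\infty(0,T)}\,e^{\|R_\alpha\|_{L^1}/|r(x_0)|} \quad \forall t \in [0,T], \]
from which the upper bound in (10.76) follows.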
Figure 10.8 shows the singular value spectrum for the forward map. There is a slight increase in ill-posedness as $\alpha$ increases, and thus fractional diffusion can mitigate the degree of ill-posedness in the inverse problem, which agrees with the stability estimate. For $\alpha$ close to zero it effectively behaves as if it were well-posed.

10.3.3. Recovering the space component from lateral boundary data. Next we look at the case of recovering a space-dependent source $f(x)$ in (10.68) from the time trace of the solution at a fixed point on the boundary $\partial\Omega$. The solution to the direct problem is given by
\[ u(x,t) = \sum_{n=1}^\infty \lambda_n^{-1}\bigl(1 - E_{\alpha,1}(-\lambda_n t^\alpha)\bigr)\langle f,\varphi_n\rangle\varphi_n(x), \]
[Figure 10.8. Numerical results for the inverse source problem with flux data at $x = 0$ and source term $xf(t)$, where $f(t)$ is unknown. (a) Condition number of the forward map, $\sigma_1/\sigma_{50}$, as a function of $\alpha$, for $T = 1$, $T = 0.1$, $T = 0.01$. (b) Singular values $\sigma_n$ against $n$ at $T = 1$, for $\alpha = 1$, $3/4$, $1/2$, $1/4$.]
hence for $x_0 \in \partial\Omega$,
\[ h(t) = u(x_0,t) = \sum_{n=1}^\infty \frac{\varphi_n(x_0)}{\lambda_n}\bigl(1 - E_{\alpha,1}(-\lambda_n t^\alpha)\bigr)\langle f,\varphi_n\rangle. \]
This formula is very informative. First, the choice of the measurement point $x_0$ has to be strategic in the sense that it should satisfy $\varphi_n(x_0) \ne 0$ for all $n \in \mathbb{N}$. Should $\varphi_n(x_0) = 0$, then the $n$th mode could not be recovered. Note that the condition $\varphi_n(x_0) \ne 0$ for all $n$ will be almost impossible to arrange in practice. For example, if $\Omega = (0,1)$, then the Dirichlet eigenfunctions are $\varphi_n(x) = \sin n\pi x$, and this is zero for some $n$ at every point $x$ such that $x = m/n$, where $m$ is any integer. Thus all rational points must be excluded for the point $x_0$ to satisfy the condition. Second, upon some simple algebraic operations, the inverse problem is equivalent to expressing a function $h(t)$ in terms of the basis $\{E_{\alpha,1}(-\lambda_n t^\alpha)\}$. These functions all decay to zero polynomially, at the rate $t^{-\alpha}$, and are thus almost linearly dependent, indicating the ill-posed nature of the problem. This shows a strong contrast in the degree of ill-conditioning between recovering a time-dependent and a space-dependent unknown from the same lateral data. This is very much in keeping with the folklore theorem. It also shows that while the ill-conditioning is of the same general order, fractional diffusion can be both more and less ill-conditioned than the classical diffusion case.

To analyse the inverse problem, we consider a one-dimensional subdiffusion equation with Neumann boundary conditions,
\[ \begin{cases} \partial_t^\alpha u(x,t) = \partial_{xx}u(x,t) + f(x), & (x,t) \in \Omega \times (0,T), \\ u(x,0) = u_0(x), & x \in \Omega, \\ \partial_x u(0,t) = \partial_x u(1,t) = 0, & t > 0, \end{cases} \tag{10.79} \]
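where $\Omega = (0,1)$, $T > 0$, and $0 < \alpha < 1$. Given initial data $u_0(x) \in L^2(\Omega)$ and source term $f(x) \in L^2(\Omega)$, there exists a unique solution $u \in C([0,T];L^2(\Omega)) \cap C((0,T];\dot H^2(\Omega))$ to problem (10.79), and by noting $\lambda_0 = 0$ (with $\varphi_0(x) = 1$), the solution $u$ has the representation
\[ u(x,t) = \sum_{n=0}^\infty \langle u_0,\varphi_n\rangle E_{\alpha,1}(-\lambda_n t^\alpha)\varphi_n(x) + \frac{t^\alpha}{\Gamma(\alpha+1)}\langle f,\varphi_0\rangle + \sum_{n=1}^\infty \lambda_n^{-1}\langle f,\varphi_n\rangle\bigl(1 - E_{\alpha,1}(-\lambda_n t^\alpha)\bigr)\varphi_n(x). \tag{10.80} \]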
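Returning to the remark on the choice of measurement point for the Dirichlet example $\varphi_n(x) = \sin n\pi x$ on $(0,1)$: the mode $\varphi_n$ vanishes at $x_0$ exactly when $n x_0$ is an integer, so every rational $x_0$ loses infinitely many modes. The short sketch below (the helper name `vanishing_modes` is hypothetical) lists the lost modes using exact rational arithmetic.

```python
from fractions import Fraction


def vanishing_modes(x0: Fraction, n_max: int) -> list:
    """Indices n <= n_max with sin(n*pi*x0) = 0, i.e. with n*x0 an integer."""
    return [n for n in range(1, n_max + 1) if (n * x0).denominator == 1]


if __name__ == "__main__":
    for x0 in (Fraction(1, 2), Fraction(1, 3), Fraction(2, 5)):
        print(x0, vanishing_modes(x0, 12))
```

For $x_0 = p/q$ in lowest terms the lost modes are precisely the multiples of $q$, so a measurement point with a large denominator only loses high modes, although in finite precision nearby modes are still poorly observed.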
The inverse problem is to determine the unknown source term $f(x)$ from the measured data $h(t) = u(0,t)$ for $0 \le t \le T$. The following result by Zhang and Xu [366] gives an affirmative answer to the uniqueness question.

Theorem 10.7 ([366, Theorem 1]). The data $h(t)$ uniquely determines the source term $f(x)$.

Uniqueness follows directly from Lemma 10.5 and the fact that for $\tilde u$ satisfying (10.79) with $f$ replaced by $\tilde f$ (but the same initial data), the function $\hat w = \partial_t^\alpha(u - \tilde u)$ satisfies
\[ \begin{cases} \partial_t^\alpha \hat w(x,t) = \partial_{xx}\hat w(x,t), & (x,t) \in \Omega \times (0,T), \\ \hat w(x,0) = f(x) - \tilde f(x), & x \in \Omega, \\ \partial_x \hat w(0,t) = \partial_x \hat w(1,t) = 0, & t > 0, \end{cases} \]
where we have used the fact that by the pde, $\partial_t^\alpha(u - \tilde u)(x,0) = (u - \tilde u)_{xx}(x,0) + (f - \tilde f)(x)$. Moreover, if $\tilde u(0,t) = h(t) = u(0,t)$, then $\hat w(0,t) = 0$.

Lemma 10.5. Let $w(x,t), \tilde w(x,t) \in C([0,T];L^2(\Omega)) \cap C((0,T];\dot H^2(\Omega))$ be solutions of (10.79) with $f = 0$ and the initial conditions $u_0(x)$ and $\tilde u_0(x)$, respectively. Then $w(0,t) = \tilde w(0,t)$ for $0 \le t \le T$ implies $u_0(x) = \tilde u_0(x)$ in $L^2(\Omega)$.

Proof. This lemma is taken from [366, Lemma 2]. Since the eigenfunctions of the Neumann Laplacian form a complete orthogonal basis in $L^2(\Omega)$, the initial conditions can be represented by
\[ u_0(x) = \sum_{n=0}^\infty a_n\varphi_n(x) \quad\text{and}\quad \tilde u_0(x) = \sum_{n=0}^\infty \tilde a_n\varphi_n(x). \]
It suffices to show $a_n = \tilde a_n$, $n = 0, 1, 2, \ldots$, given the data $w(0,t) = \tilde w(0,t)$. Since $w(0,t) = \tilde w(0,t)$ for $0 \le t \le T$, we have
\[ \sum_{n=0}^\infty a_n E_{\alpha,1}(-\lambda_n t^\alpha) = \sum_{n=0}^\infty \tilde a_n E_{\alpha,1}(-\lambda_n t^\alpha), \quad 0 \le t \le T. \]
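Moreover, by the Riemann–Lebesgue lemma and the analyticity of the Mittag-Leffler function, both sides are analytic in $t > 0$. By the unique continuation for real analytic functions, we have
\[ \sum_{n=0}^\infty a_n E_{\alpha,1}(-\lambda_n t^\alpha) = \sum_{n=0}^\infty \tilde a_n E_{\alpha,1}(-\lambda_n t^\alpha), \quad t \ge 0. \]
Since $a_n \to 0$, by Theorem 3.23,
\[ \begin{aligned} \Bigl|e^{-t\operatorname{Re}z}\sum_{n=0}^\infty a_n E_{\alpha,1}(-\lambda_n t^\alpha)\Bigr| &\le e^{-t\operatorname{Re}z}\Bigl(|a_0| + \sum_{n=1}^\infty \frac{|a_n|}{\lambda_n t^\alpha\,\Gamma(1-\alpha)}\Bigr) \\ &\le c\,e^{-t\operatorname{Re}z}\Bigl(1 + t^{-\alpha}\sum_{n=1}^\infty \lambda_n^{-1}\Bigr) \le c\,(1 + t^{-\alpha})\,e^{-t\operatorname{Re}z}, \end{aligned} \]
and the function $(1 + t^{-\alpha})e^{-t\operatorname{Re}z}$ is integrable for $t \in (0,\infty)$ for fixed $z$ with $\operatorname{Re}z > 0$. By the Lebesgue dominated convergence theorem and the Laplace transform relation for the Mittag-Leffler function, $\int_0^\infty e^{-zt}E_{\alpha,1}(-\lambda_n t^\alpha)\,dt = z^{\alpha-1}/(z^\alpha + \lambda_n)$ for $\operatorname{Re}z > 0$ (cf. Lemma 3.1), we deduce
\[ \int_0^\infty e^{-zt}\sum_{n=0}^\infty a_n E_{\alpha,1}(-\lambda_n t^\alpha)\,dt = \sum_{n=0}^\infty a_n\frac{z^{\alpha-1}}{z^\alpha + \lambda_n}, \]
which implies
\[ \sum_{n=0}^\infty \frac{b_n}{\eta + \lambda_n} = 0, \quad \operatorname{Re}\eta > 0, \tag{10.81} \]
where $\eta = z^\alpha$ and $b_n = a_n - \tilde a_n$. Since $a_n - \tilde a_n \to 0$ as $n \to \infty$, we can analytically continue in $\eta$, so that the identity (10.81) holds for $\eta \in \mathbb{C} \setminus \{-\lambda_n\}_{n \ge 0}$. Last, we deduce $b_0 = 0$ from the identity by taking a suitable disk which includes $0$ and does not include $\{-\lambda_n\}_{n \ge 1}$. By the Cauchy integral formula, integrating the identity along the boundary of the disk gives $2\pi i\,b_0 = 0$, i.e., $a_0 = \tilde a_0$. Upon repeating the argument, we obtain $a_n = \tilde a_n$, $n = 1, 2, \ldots$, which completes the proof of the lemma.

The proof clearly indicates that the data-to-unknown relation in the inverse source problem amounts to an analytic continuation process, which is well known to be a severely ill-posed procedure.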
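The Laplace transform relation $\int_0^\infty e^{-zt}E_{\alpha,1}(-\lambda t^\alpha)\,dt = z^{\alpha-1}/(z^\alpha + \lambda)$ used in the proof can be sanity-checked numerically in the one case where the Mittag-Leffler function is elementary. The sketch below assumes $\alpha = 1/2$, for which $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$, and uses the substitution $t = s^2$ together with a composite Simpson rule on a truncated interval; the function names and the truncation parameters are illustrative choices, not taken from the text.

```python
import math


def ml_half(x: float) -> float:
    """E_{1/2,1}(-x) = exp(x^2) * erfc(x); safe here only for 0 <= x < about 26."""
    return math.exp(x * x) * math.erfc(x)


def laplace_numeric(z: float, lam: float, s_max: float = 8.0, n: int = 4000) -> float:
    """Simpson approximation of int_0^inf exp(-z t) E_{1/2,1}(-lam sqrt(t)) dt.

    Uses t = s^2, so the integrand becomes 2 s exp(-z s^2) E_{1/2,1}(-lam s),
    truncated at s = s_max (n must be even; assumes real z > 0 large enough
    that exp(-z s_max^2) is negligible).
    """
    def integrand(s: float) -> float:
        return 2.0 * s * math.exp(-z * s * s) * ml_half(lam * s)

    h = s_max / n
    total = integrand(0.0) + integrand(s_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def laplace_exact(z: float, lam: float, alpha: float = 0.5) -> float:
    """Right-hand side z^(alpha-1) / (z^alpha + lam) of the transform relation."""
    return z ** (alpha - 1.0) / (z**alpha + lam)


if __name__ == "__main__":
    for z, lam in ((1.0, 2.0), (2.0, 1.0), (4.0, 3.0)):
        print(z, lam, laplace_numeric(z, lam), laplace_exact(z, lam))
```

The rational dependence on $\eta = z^\alpha$ visible in `laplace_exact` is what turns the data into the pole expansion (10.81).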
10.4. Coefficient identification from boundary data

So far we have only discussed linear inverse problems. Now we turn to nonlinear ones. A standard such problem is to recover a coefficient in an elliptic operator $L$ in the diffusion equation $u_t - Lu = 0$ from either overspecified boundary data or interior observations at a fixed time. We discuss the latter case in the next section.
In this section we will look at a specific formulation based on determining a space-dependent potential $q(x)$ in
\[ \partial_t^\alpha u = u_{xx} - q(x)u \quad\text{in } (0,1) \times (0,T], \tag{10.82} \]
subject to known initial and boundary conditions from measurements $u(x_0,t) = h(t)$ at some boundary point $x_0$. In the following three subsections, we study three different scenarios of driving the solution $u(x,t)$ from which we seek to recover the unknown potential $q(x)$.

We make two remarks concerning the single space dimension setting and the choice of the potential as the unknown coefficient. Throughout, we are going to rely on the eigenvalue/eigenfunction expansion of the operator $-Lu := -u_{xx} + q(x)u$, and many of the results needed are only available on the basis of Sturm–Liouville theory, both direct and inverse, thereby restricting us to $\mathbb{R}^1$. The Liouville transform allows a more generic second order operator in one space dimension to be mapped in a transparent way into the canonical form of a potential-only equation. Hence the choice taken in (10.82). Some of the approaches we show here were taken in [265] for the case of the parabolic operator, and we will rely heavily on this work. We will also require some background in inverse Sturm–Liouville theory; notes for this topic are included in Section 9.2.

10.4.1. Excitation by initial conditions. In this subsection we look at the subdiffusion problem on the domain $\Omega = (0,1)$,
\[ \begin{aligned} \partial_t^\alpha u &= u_{xx} - qu && \text{in } \Omega \times (0,T], \\ u_x(0,t) &= u_x(1,t) = 0, && t \in (0,T], \\ u(\cdot,0) &= u_0 && \text{in } \Omega. \end{aligned} \tag{10.83} \]
For the inverse problem, we consider the additional information
\[ h(t) = u(0,t), \quad 0 \le t \le T, \]
and aim at recovering the potential $q(x)$. To this end, we first of all consider the Sturm–Liouville problem with a Neumann boundary condition for given $q \in L^2(\Omega)$,
\[ -\varphi'' + q\varphi = \lambda\varphi \quad\text{with } \varphi'(0) = \varphi'(1) = 0. \]
It is known that the eigenfunctions {ϕn } form a complete orthogonal set in L2 (Ω). The normalisation of the eigenfunctions is at our disposal, and we shall choose ϕn L2 (Ω) = 1 so that the eigenfunctions are orthonormal. Note that for Sturm–Liouville problems it is more usual to normalise by giving
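the value at one endpoint, e.g., $\varphi_n(0) = 1$. The eigenvalues $\{\lambda_n\}$ have the asymptotic behaviour
\[ \lambda_n = n^2\pi^2 + \int_0^1 q(s)\,ds + \epsilon_n \tag{10.84} \]
(cf. (9.78)), where the residual sequence $\{\epsilon_n\} \in \ell^2$ for $q \in L^2(\Omega)$. For sufficiently smooth $u_0(x)$, the solution to (10.83) is given by
\[ u(x,t) = \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle E_{\alpha,1}(-\lambda_n t^\alpha)\varphi_n(x), \]
and thus the given data $h(t)$ can be expressed by
\[ \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle\varphi_n(0)E_{\alpha,1}(-\lambda_n t^\alpha) = h(t). \tag{10.85} \]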
The question is: can the potential $q(x)$ be uniquely recovered from this? We can continue formally and take the Laplace transform to obtain (cf. Lemma 3.1)
\[ z^{\alpha-1}\sum_{n=1}^\infty \langle u_0,\varphi_n\rangle\varphi_n(0)\frac{1}{z^\alpha + \lambda_n} = \hat h(z). \tag{10.86} \]
If we are able to recover the complete set of eigenvalues $\{\lambda_n\}$ plus norming constants, such as $\{\varphi_n(0)\}$, then there is a unique potential $q(x)$ to the inverse Sturm–Liouville problem; see Section 9.2. Note that $\varphi_n(0) \ne 0$ for any $n$, since $\varphi_n(0) = 0$ coupled with the left boundary condition $\varphi_n'(0) = 0$ would imply that $\varphi_n$ is identically zero. Inverting (10.86) directly is nontrivial. The main reason is that while $u_0$ is given, we are unable to compute $\langle u_0,\varphi_n\rangle$, since we do not know the eigenfunctions $\{\varphi_n\}$ (depending on $q$). Thus we concentrate on the uniqueness question.

Suppose that there are two potentials $q(x)$ and $p(x)$ with eigenvalues/eigenfunctions $\{(\lambda_n,\varphi_n)\}$ and $\{(\mu_n,\psi_n)\}$, respectively. Then from (10.86),
\[ \sum_{n=1}^\infty \langle u_0,\varphi_n\rangle\varphi_n(0)\frac{1}{z^\alpha + \lambda_n} = \sum_{n=1}^\infty \langle u_0,\psi_n\rangle\psi_n(0)\frac{1}{z^\alpha + \mu_n} \quad \forall z > 0. \tag{10.87} \]
Thus both sides represent the same analytic function of $z^\alpha$: that on the left with poles at $\{-\lambda_n\}$, that on the right with poles at $\{-\mu_n\}$. Also, the residues at these poles must agree, so
\[ \langle u_0,\varphi_n\rangle\varphi_n(0) = \langle u_0,\psi_n\rangle\psi_n(0) \quad\text{for all } n \in \mathbb{N}. \tag{10.88} \]
If $\langle u_0,\varphi_n\rangle$ does not vanish for any $n$, we can conclude from equality of the poles in (10.87) that $\lambda_n = \mu_n$ for all $n \in \mathbb{N}$. Then the question is whether
the residue condition (10.88) could provide information on $\varphi_n(0)$ and $\psi_n(0)$. If so, this would imply the desired uniqueness [300]. These issues remain even in the case of $\alpha = 1$, and thus the questions become what initial conditions $u_0$ will have all their Fourier coefficients nonzero in the orthogonal basis $\{\varphi_n\}$, and then how do we translate this into a condition that will provide norming constants. This seems an impossible condition to verify given that we have no knowledge of $q$ and hence of $\varphi_n$, but there is a way out: choosing $u_0(x) = \delta(x)$, the Dirac delta measure centred at $x = 0$. With this choice $\langle u_0,\varphi_n\rangle = \varphi_n(0)$ and $\langle u_0,\psi_n\rangle = \psi_n(0)$, and (10.87), (10.88) become
\[ \sum_{n=1}^\infty \varphi_n^2(0)\frac{1}{z^\alpha + \lambda_n} = \sum_{n=1}^\infty \psi_n^2(0)\frac{1}{z^\alpha + \mu_n} \quad\text{and}\quad \varphi_n(0)^2 = \psi_n(0)^2 \quad\text{for all } n \in \mathbb{N}. \]
So since $\{\varphi_n^2(0)\}$ and $\{\psi_n^2(0)\}$ cannot vanish for any $n$, by equality of the poles we have
\[ \lambda_n = \mu_n, \quad |\varphi_n(0)| = |\psi_n(0)| \quad\text{for all } n \in \mathbb{N}. \]
Thus by Theorem 9.16 (noting that there we had fixed the left-hand boundary value of the eigenfunctions to one, whereas here we fix their $L^2$ norm to one), $q = p$ in $L^2(\Omega)$, and there is a unique solution to the weak form of the inverse problem. We shall explain the caveat “weak” below.

There is a difficulty we have overlooked. With this $u_0$, do we know that the solution of (10.83) is sufficiently regular to justify the solution representation? By the theory in Chapter 6, we have sufficient regularity if $u_0(x) \in L^2(0,1)$, but we are far from this assumption with $u_0 = \delta$. The technicality however can be overcome [63]. This paper represents one of the first undetermined coefficient problems involving fractional derivatives. It was set with the unknown parameter to be a conductivity $p(x)$ in $\partial_t^\alpha u(x,t) - (p(x)u_x)_x(x,t) = 0$, but the arguments are almost identical and the two coefficients are interchangeable using the Liouville transform (cf. Section 9.2), provided that the coefficients are reasonably smooth.

The difficulty of lack of regularity of $u_0 = \delta$ can also be overcome by setting $u_0 = \sum_{n=1}^\infty a_n\varphi_n^0$ with $\varphi_n^0$ the eigenfunctions of the Laplacian, that is, corresponding to the potential $q = 0$, together with a perturbation argument. Indeed, setting $a_k = k^{-1/2-\epsilon}$, $C = \bigl(\sum_{k=1}^\infty k^{-1-2\epsilon}\bigr)^{1/2}$, by using the Cauchy–Schwarz inequality and Lemma 9.9 we have, for any $k$,
\[ \begin{aligned} |\langle u_0,\varphi_k\rangle| &= \Bigl|\sum_{n=1}^\infty a_n\langle\varphi_k,\varphi_n^0\rangle\Bigr| = \Bigl|a_k + \sum_{n=1}^\infty a_n\langle\varphi_k - \varphi_k^0,\varphi_n^0\rangle\Bigr| \ge |a_k| - C\,\|\varphi_k - \varphi_k^0\|_{L^2} \\ &\ge k^{-1/2-\epsilon}\bigl(1 - C\tilde C\,\|q\|_{L^2(\Omega)}\,k^{-1/2+\epsilon}\bigr) \ge \tfrac{1}{2}k^{-1/2-\epsilon} \end{aligned} \]
for $\|q\|_{L^2(\Omega)}$ sufficiently small.
In theory, the uniqueness proof lends itself to a constructive algorithm, which consists of two steps. First, we extract from (10.85) the spectral data $\{\lambda_j,\varphi_j(0)\}_{j=1}^N$ (for some $N$). Then we reconstruct the potential $q(x)$ from the spectral data, which is only mildly ill-conditioned. The first step is very delicate. In (10.84), for smoother $q$, the decay of $\epsilon_n$ in $n$ becomes more rapid; essentially a factor of $n^{-1}$ for every additional derivative in $q$. Thus the value of an eigenvalue is a sum of an increasing sequence masking the information sequence $\{\epsilon_n\}$ that is (possibly rapidly) decreasing. Hence, we have to recover the spectral information very accurately in order for it to be useful in recovering $q$, indicating the computational challenge with this approach.

10.4.2. Excitation by boundary conditions. We return to the previous undetermined coefficient problem, but now with a modified initial/boundary condition setup. After the preliminary analysis showing uniqueness, our goal here will be to bring out the differences between the classical parabolic formulation and the time fractional version. Specifically, we shall consider the following problem: suppose $u(x,t)$ satisfies
\[ \begin{aligned} \partial_t^\alpha u(x,t) - u_{xx}(x,t) + q(x)u(x,t) &= 0, && 0 < x < 1,\ t > 0, \\ u_x(0,t) = 0, \quad u_x(1,t) &= a(t), && t > 0, \\ u(x,0) &= 0, && 0 \le x \le 1, \end{aligned} \tag{10.89} \]
where again $\partial_t^\alpha = {}^{DC}_{\;0}D_t^\alpha$ denotes the Djrbashian–Caputo fractional derivative of order $\alpha$, $0 < \alpha \le 1$. The potential $q(x)$ is assumed to be unknown, and we take $q(x) \in L^\infty$ to utilise regularity results for (10.89). Weaker conditions, for example, $q \in L^2$, would suffice if we only consider the question of uniqueness.
We have imposed a zero value for the initial condition to avoid some of the issues noted in the previous section as this aspect is really tangential to the main theme, and we are viewing a system at a ground state that will be driven by the input of energy at one boundary while insulated at the
other. We shall restrict the input function $a(t)$ to be integrable and have compact support on the interval $[0,T]$ for some fixed $T > 0$. To ensure compatibility at $t = 0$ we must have $a(0) = 0$. This models the physically reasonable situation of heat being added to the system through one part of the boundary for a fixed time period and then shut off. In order to gain information on the unknown potential, we measure the temperature on one boundary, which we will take to be the right, active boundary (the arguments would be almost identical if the left, insulated boundary were the measurement position),
\[ u(1,t) = h(t), \quad t > T, \tag{10.90} \]
and from the pair $\{a(t),h(t)\}$ seek to determine the unknown potential $q(x)$.

We use the eigenvalues/eigenfunctions from the last subsection, but we impose the normalisation $\varphi_n(1) = 1$ (as opposed to $\|\varphi_n\|_{L^2} = 1$ in the previous subsection). Then we let $\rho_n = \|\varphi_n\|_{L^2}^{-2}$. We can use the common trick of converting the problem to one with homogeneous boundary conditions by the transformation $v(x,t) = u(x,t) - a(t)\frac{x^2}{2}$. We obtain $\partial_t^\alpha v(x,t) - v_{xx}(x,t) + q(x)v(x,t) = f(x,t)$, where $f(x,t) = -\bigl(\partial_t^\alpha a(t) + q(x)a(t)\bigr)\frac{x^2}{2} + a(t)$, and use (6.18), (6.19), (6.20) to arrive, after some computations, at the solution representation
\[ u(x,t) = \sum_{j=1}^\infty \rho_j\int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j(t-s)^\alpha)\,a(s)\,ds\,\varphi_j(x) \tag{10.91} \]
(cf. [302, Proposition 3]). Evaluating at the right-hand boundary and using the normalisation $\varphi_j(1) = 1$, we have
\[ u(1,t) = \sum_{j=1}^\infty \rho_j\int_0^t s^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j s^\alpha)\,a(t-s)\,ds, \quad 0 < t < T. \tag{10.92} \]
An integration by parts using the assumption $a(0) = 0$ and (3.34) yields
\[ \begin{aligned} \int_0^t s^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j s^\alpha)\,a(t-s)\,ds &= -\frac{1}{\lambda_j}\int_0^t \frac{d}{ds}\bigl(E_{\alpha,1}(-\lambda_j s^\alpha)\bigr)\,a(t-s)\,ds \\ &= \frac{1}{\lambda_j}a(t) - \frac{1}{\lambda_j}\int_0^t E_{\alpha,1}(-\lambda_j s^\alpha)\,a'(t-s)\,ds, \end{aligned} \tag{10.93} \]
and this gives
\[ u(1,t) = \sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}\,a(t) - \sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}\int_0^t E_{\alpha,1}(-\lambda_j s^\alpha)\,a'(t-s)\,ds, \quad 0 < t < T. \tag{10.94} \]
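The integration by parts in (10.93) can be checked numerically in the classical case $\alpha = 1$, where $s^{\alpha-1}E_{\alpha,\alpha}(-\lambda s^\alpha)$ and $E_{\alpha,1}(-\lambda s^\alpha)$ both reduce to $e^{-\lambda s}$. The following sketch makes these assumptions explicit: $\alpha = 1$, the illustrative input $a(t) = \sin t$ (so that $a(0) = 0$), and a composite Simpson rule; the names `simpson` and `check_identity` are hypothetical.

```python
import math


def simpson(func, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def check_identity(lam: float, t: float):
    """Both sides of (10.93) for alpha = 1 and the sample input a(t) = sin(t)."""
    a, da = math.sin, math.cos  # a(0) = 0 as required for compatibility
    lhs = simpson(lambda s: math.exp(-lam * s) * a(t - s), 0.0, t)
    rhs = a(t) / lam - simpson(lambda s: math.exp(-lam * s) * da(t - s), 0.0, t) / lam
    return lhs, rhs


if __name__ == "__main__":
    for lam in (1.0, 3.0, 10.0):
        print(lam, check_identity(lam, 1.5))
```

The boundary term at $s = t$ drops out only because $a(0) = 0$; for an input with $a(0) \ne 0$ an extra term $-\lambda_j^{-1}E_{\alpha,1}(-\lambda_j t^\alpha)a(0)$ would appear on the right-hand side.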
From Lemma 9.9 we know that $\rho_n = c_0 + o(1)$ as $n \to \infty$, and so $\sum_{j=1}^\infty \bigl|\frac{\rho_j}{\lambda_j}\bigr| < \infty$. Therefore, setting
\[ b = \sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}, \qquad A(t) = -\sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}E_{\alpha,1}(-\lambda_j t^\alpha), \]
0 and the fact that a = 0 yields b + A(t) = c + B(t),
0 < t < ∞.
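Taking the Laplace transform then shows that
\[ \frac{b}{z} - \sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}\frac{z^{\alpha-1}}{z^\alpha + \lambda_j} = \frac{c}{z} - \sum_{j=1}^\infty \frac{\sigma_j}{\mu_j}\frac{z^{\alpha-1}}{z^\alpha + \mu_j} \]
for $\operatorname{Re}z > 0$. Multiplying with $z^{1-\alpha}$ and setting $\eta = z^\alpha$, we obtain
\[ \frac{b}{\eta} - \sum_{j=1}^\infty \frac{\rho_j}{\lambda_j}\frac{1}{\eta + \lambda_j} = \frac{c}{\eta} - \sum_{j=1}^\infty \frac{\sigma_j}{\mu_j}\frac{1}{\eta + \mu_j} \tag{10.96} \]
for $\operatorname{Re}\eta > 0$. By analyticity with respect to $\eta$, we see that the two representations in (10.96) must agree, and so both the pole locations and their residues must be identical. This gives
\[ b = c, \quad \lambda_j = \mu_j, \quad \rho_j = \sigma_j \quad\text{for all } j \in \mathbb{N}. \]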
The Gel’fand–Levitan theory for the potential-form inverse Sturm–Liouville problem (cf. Section 9.2) now yields the uniqueness result for recovering q given overposed values on the boundary x = 1.
To make these arguments rigorous, one needs to assume that a is nonzero and lies in the space ⎧ α = 0}, 12 0, ⎪ ⎨ t u(0, t) = u(1, t) = 0, t > 0, (10.99) ⎪ ⎩ u(x, 0) = 0, 0 < x < 1. If we use the terminology of u = u(x, t; q) representing temperature at position x in a unit rod and time t, then the experimental setup consists of supposing we are able to interact with the rod by being able to input a complete family of spatially varying sources {fj (x)} while holding the temperature at the ends fixed. We then measure flux data of either a fixed time value t¯ at the left end of the rod uj,x (0, t¯; q), or the net flux leaving the bar at t¯; namely uj,x (1, t¯; q) − uj,x (0, t¯; q). From this we wish to recover the potential q. This inverse problem was first considered in [227] for the parabolic case of α = 1 and subsequently in [168] for the fractional. The following exposition is based on this latter paper. Thus we have the overposed data conditions: either (10.100)
$$u_{j,x}(0,\bar t;q) = h_j$$
or (10.101)
$$u_{j,x}(1,\bar t;q) - u_{j,x}(0,\bar t;q) = h_j.$$
Our questions will be the usual ones: Can $q(x)$ be uniquely recovered? If so, what is the degree of ill-conditioning? How does this depend on $\bar t$ and, in particular, on the fractional order $\alpha$? Perhaps not surprisingly, inverse Sturm–Liouville ideas will again play a role. Thus $\{\lambda_n, \varphi_n\}_1^\infty$ will denote the (of course as yet unknown) eigenvalues and orthonormal eigenfunctions corresponding to the potential $q$ in the operator $-L(q) := -\frac{d^2}{dx^2} + q$ subject to Dirichlet boundary conditions on $(0,1)$. We consider only nonnegative potentials in $L^\infty$ and define, for a fixed value of $\omega$, the set
$$M_\omega = \bigl\{ q \in L^\infty(0,1) : q \ge 0,\ \|q\|_{L^2(0,1)} \le \omega \bigr\}.$$
Green's function for $L(q)$ is
$$G(x,y;q) = \sum_{n=1}^\infty \frac{\varphi_n(x;q)\,\varphi_n(y;q)}{\lambda_n(q)}. \tag{10.102}$$
Note that the solution of the equation $-v'' + qv = f$, $v(0) = v_0$, $v(1) = v_1$, is given by
$$v(x) = \int_0^1 G(x,y;q)\,f(y)\,dy + v_0\,G_x(0,x;q) - v_1\,G_x(1,x;q).$$
This can be deduced from the following lemma.

Lemma 10.6. Green's function $G(x,y;q)$ has the following properties:
(a) The function $v(y) = G_x(0,y;q)$ solves $L(q)v(y) = 0$ on $(0,1)$ and $v(0) = 1$ and $v(1) = 0$.
(b) The function $v(y) = G_x(1,y;q)$ solves $L(q)v(y) = 0$ on $(0,1)$ and $v(0) = 0$ and $v(1) = -1$.

We shall also need the following two basic estimates on Green's function $G(x,y;q)$.

Lemma 10.7. For any $q \in M_\omega$, Green's function $G(x,y;q)$ satisfies the following.
• For any $\delta > 0$, $\inf_{y \in [0,1-\delta]} G_x(0,y;q) \ge \delta - \pi^{-1}\omega$.
• There holds $|G_x(0,y;q) - G_x(1,y;q)| \ge 1 - \pi^{-1}\omega$.

Proof. By Lemma 10.6, the modified function $w(y) := G_x(0,y;q) - 1 + y$ satisfies $-L(q)w = (y-1)q$ and $w(0) = w(1) = 0$. Therefore, multiplying by $w$ and integrating by parts, we get $\int_0^1 \bigl(|w'|^2 + qw^2\bigr)\,dy = \int_0^1 (y-1)\,q\,w\,dy$. Since the potential $q \ge 0$, we deduce
$$\|w'\|_{L^2(0,1)}^2 \le \|q\|_{L^2(0,1)}\,\|w\|_{L^2(0,1)},$$
and since $w \in H_0^1(0,1)$, we get $\|w\|_{L^\infty(0,1)} \le \|w'\|_{L^2(0,1)}$ and $\|w\|_{L^2(0,1)} \le \pi^{-1}\|w'\|_{L^2(0,1)}$. Combining these, we see that $\|w\|_{L^\infty(0,1)} \le \pi^{-1}\|q\|_{L^2(0,1)}$. Then the definition of $w(y)$ and the triangle inequality give the desired estimate for the first part. To show the second part of Lemma 10.7, we consider the function $w(y) := G_x(0,y;q) - G_x(1,y;q) - 1$, which solves $-L(q)w = -q$ and $w(0) = w(1) = 0$. This yields $\|w\|_{L^\infty(0,1)} \le \pi^{-1}\|q\|_{L^2(0,1)} \le \pi^{-1}\omega$, and from this the result follows in an identical manner.

It will be convenient to define a commonly occurring quantity,
$$E_{\alpha,n}(t) := \frac{1}{\lambda_n(q)}\,E_{\alpha,1}(-\lambda_n(q)\,t^\alpha). \tag{10.103}$$
The unique solution $u(x,t)$ to the fractional diffusion equation with time-independent right-hand side (10.99) can, due to (6.18), be represented by
$$\begin{aligned} u(x,t) &= \sum_{n=1}^\infty \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n(q)(t-s)^\alpha)\,ds\,\langle f, \varphi_n(\cdot;q)\rangle\,\varphi_n(x;q) \\ &= \sum_{n=1}^\infty \frac{1}{\lambda_n(q)}\bigl(1 - E_{\alpha,1}(-\lambda_n(q)\,t^\alpha)\bigr)\,\varphi_n(x;q)\,\langle f, \varphi_n(\cdot;q)\rangle \\ &= \int_0^1 G(x,y;q)\,f(y)\,dy - \sum_{n=1}^\infty E_{\alpha,n}(t)\,\varphi_n(x;q)\,\langle f, \varphi_n(\cdot;q)\rangle, \end{aligned} \tag{10.104}$$
where the second line follows from (3.34) and the third line from the definition of Green's function $G(x,y;q)$. For $f \in L^2(\Omega)$, the solution exhibits the regularity $u(x,t) \in C([0,T]; H^2(\Omega) \cap H_0^1(\Omega))$ by the results from Chapter 6.

For $q_1, q_2 \in M_\omega$, let $u_j(x,t;q_i)$ be the solution of (10.99) with $f = f_j$. We also define flux conditions $F_{\mathrm{left}}$ and $F_{\mathrm{diff}}$ for the statement of equal overposed data,
$$F_{\mathrm{left}}:\ \{u_{j,x}(0,\bar t;q_1) = u_{j,x}(0,\bar t;q_2)\}_{j=1}^\infty,$$
$$F_{\mathrm{diff}}:\ \{u_{j,x}(1,\bar t;q_1) - u_{j,x}(0,\bar t;q_1) = u_{j,x}(1,\bar t;q_2) - u_{j,x}(0,\bar t;q_2)\}_{j=1}^\infty.$$
We introduce functions that arise in the representation of flux data,
$$E_0(y,t;q) = \sum_{n=1}^\infty E_{\alpha,n}(t)\,\varphi_n'(0;q)\,\varphi_n(y;q), \qquad E_1(y,t;q) = \sum_{n=1}^\infty E_{\alpha,n}(t)\,\varphi_n'(1;q)\,\varphi_n(y;q),$$
and will relate Green's function to $E_0$ and $E_1$ in Lemma 10.8. In each of the two cases corresponding to the two types of flux data, the first statement relates Green's function $G_x(x,y;q)$ with the transient part $E_i(y,\bar t;q)$, while the second provides an $O(\bar t^{-\alpha})$ decay estimate of the transient part.

Lemma 10.8. Let $\{f_j\}$ be complete in $L^2(0,1)$ and let $q_1, q_2 \in M_\omega$. Then in the case of overposed data (10.100), assuming condition $F_{\mathrm{left}}$, we have
$$G_x(0,y;q_1) - G_x(0,y;q_2) = E_0(y,\bar t;q_1) - E_0(y,\bar t;q_2),$$
$$\|G_x(0,y;q_2)(q_2 - q_1)\|_{L^2(0,1)} \le C\,\bar t^{-\alpha}\,\|q_2 - q_1\|_{L^2(0,1)},$$
and in the case of (10.101) together with $F_{\mathrm{diff}}$,
$$[G_x(1,y;q_1) - G_x(0,y;q_1)] - [G_x(1,y;q_2) - G_x(0,y;q_2)] = E_0(y,\bar t;q_2) - E_0(y,\bar t;q_1) + E_1(y,\bar t;q_1) - E_1(y,\bar t;q_2),$$
$$\|(G_x(0,y;q_2) - G_x(1,y;q_2))(q_2 - q_1)\|_{L^2(0,1)} \le C\,\bar t^{-\alpha}\,\|q_2 - q_1\|_{L^2(0,1)}.$$

Proof. It follows from the representation (10.104) that
$$u_{j,x}(0,\bar t;q) = \int_0^1 \bigl[G_x(0,y;q) - E_0(y,\bar t;q)\bigr]\,f_j(y)\,dy.$$
This together with the flux condition $F_{\mathrm{left}}$ and the completeness of the set $\{f_j\}$ in $L^2(0,1)$ yields the first assertion directly. The analogous situation also holds for $F_{\mathrm{diff}}$. Proving Lipschitz estimates for $G_x$ in each case takes more work and can be found in the appendix of [168].

We are now in a position to state uniqueness results for the inverse problem.

Theorem 10.9. Let $u_j(x,t;q_i)$ be the solution of (10.99) with $q = q_i \in M_\omega$ for $i = 1, 2$, and suppose that the set $\{f_j\}$ forms a complete basis in $L^2(0,1)$.
(a) If, further, $q_1 = q_2$ on the interval $[1-\delta, 1]$ for some $\delta \in (0,1)$ and $\omega < \delta\pi$, then there exists a time $\tilde t > 0$ such that if the flux condition $F_{\mathrm{left}}$ holds for any $\bar t > \tilde t$, then $q_1 = q_2$ on the interval $[0,1]$.
(b) If $\omega < \pi$, then there exists a time $\tilde t > 0$ such that the flux condition $F_{\mathrm{diff}}$ for any $\bar t > \tilde t$ implies $q_1 = q_2$ on the interval $[0,1]$.

Proof. By Lemmas 10.7 and 10.8, we have
$$(\delta - \pi^{-1}\omega)\,\|q_2 - q_1\|_{L^2(0,1-\delta)} \le \|G_x(0,\xi;q_2)(q_2 - q_1)\|_{L^2(0,1-\delta)} \le C\,\bar t^{-\alpha}\,\|q_2 - q_1\|_{L^2(0,1-\delta)},$$
and consequently
$$\|q_2 - q_1\|_{L^2(0,1-\delta)} \le C\,\bar t^{-\alpha}\,(\delta - \pi^{-1}\omega)^{-1}\,\|q_2 - q_1\|_{L^2(0,1-\delta)}.$$
For large $\bar t$, $C\,\bar t^{-\alpha}(\delta - \pi^{-1}\omega)^{-1} < 1$ holds true, and hence $\|q_2 - q_1\|_{L^2(0,1-\delta)} = 0$. This shows assertion (a). The proof of assertion (b) is virtually identical.
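The smallness condition used in the last step of the proof of part (a) can be made explicit. The following is only a sketch, assuming $C$ denotes the constant from Lemma 10.8 (its dependence on $\omega$ and $\alpha$ is not tracked here) and using $\omega < \delta\pi$, so that $\delta - \pi^{-1}\omega > 0$:

```latex
C\,\bar t^{-\alpha}\,(\delta-\pi^{-1}\omega)^{-1} < 1
\quad\Longleftrightarrow\quad
\bar t > \tilde t := \Bigl(\frac{C}{\delta-\pi^{-1}\omega}\Bigr)^{1/\alpha},
\qquad\text{and then}\qquad
\bigl(1 - C\,\bar t^{-\alpha}(\delta-\pi^{-1}\omega)^{-1}\bigr)\,
\|q_2-q_1\|_{L^2(0,1-\delta)} \le 0 .
```

Since the prefactor in the last inequality is positive for $\bar t > \tilde t$, the norm must vanish. For a fixed constant $C$, the threshold grows as $\omega$ approaches $\delta\pi$.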
Perhaps the next question is how to use this to reconstruct the potential from the flux data $\{h_j\}$. One approach (and this was taken in the initial literature references [168, 227]) is to consider the mapping from a given potential to the data, and use Newton's method. In doing so there are several observations that will facilitate the computation of the corresponding derivative map and allow us to prove its injectivity. The first is to expand the potential $q$ as a Fourier series in terms of the eigenfunctions of the operator $-L = -\frac{d^2}{dx^2} + q$, and then to select a particular set of forcing functions, again consisting of these eigenfunctions. Of course the potential $q$, and hence these eigenfunctions, are unknown, so we instead choose those corresponding to a fixed $q(x)$, which we take to be $q = 0$. Thus we also choose $f_j(x) = \sin j\pi x$. This means that we are actually using a quasi-Newton method with the derivative map fixed at the potential $q = 0$. From inverse Sturm–Liouville results this method is known to perform exceedingly well [57], and we intend to mimic this approach. It will also make some rather tedious formulas below much less so, and of course any known basis set $\{f_j\}_1^\infty$ can be mapped one-to-one onto this more convenient orthogonal choice of basis.

The strategy will be as follows. We first derive the asymptotic behaviour of the flux data as a function of $t$ and of the index $j$ from the effect of the prescribed forcing source functions $\{f_j\}$. As usual, we need some preliminary lemmas. The first is for the steady-state solution.

Lemma 10.9. Let $q \in M_\omega$ for $\omega$ sufficiently small, and let $S_j(x)$ solve $-L(q)S_j = \sin j\pi x$ with $S_j(0) = S_j(1) = 0$. Then there exist sequences $\{a_j\}, \{b_j\} \in \ell^2$ such that
$$S_j'(0) = \frac{1}{j\pi} + \frac{a_j(q)}{j^2\pi^2}, \qquad S_j'(1) - S_j'(0) = \frac{(-1)^j - 1}{j\pi} + \frac{b_j(q)}{j^2\pi^2}.$$
Proof. Let $z_1(x)$ and $z_2(x)$ be the fundamental solutions of $L(q)z = 0$ with the boundary conditions $z_2(0) = z_1'(0) = 0$ and $z_1(0) = z_2'(0) = 1$, respectively. We know from Sturm–Liouville theory (9.84) that these functions obey the bounds
$$\|z_1(x) - 1\|_{L^\infty(0,1)} \le \sqrt{2}\,\|q\|_{L^2(0,1)}\,e^{\|q\|^2_{L^2(0,1)}}, \qquad \|z_2(x) - x\|_{L^\infty(0,1)} \le \|q\|_{L^2(0,1)}\,e^{\|q\|^2_{L^2(0,1)}}. \tag{10.105}$$
Then an easy calculation in ordinary differential equations shows that
$$S_j(x) = \Bigl(S_j'(0) - \frac{1}{j\pi}\Bigr) z_2(x) + \frac{1}{j^2\pi^2}\Bigl[\sin j\pi x + \int_0^x [z_1(t)z_2(x) - z_1(x)z_2(t)]\,q(t)\sin j\pi t\,dt\Bigr]. \tag{10.106}$$
Now $S_j(1) = 0$ but $z_2(1) \neq 0$, and so
$$S_j'(0) - \frac{1}{j\pi} = -\frac{1}{j^2\pi^2}\,\frac{1}{z_2(1)}\int_0^1 [z_1(t)z_2(1) - z_1(1)z_2(t)]\,q(t)\sin j\pi t\,dt.$$
The first estimate follows directly from this, Parseval's identity, and (10.105). To obtain the second estimate, we integrate the differential equation $-L(q)S_j = \sin j\pi x$ to get $S_j'(1) - S_j'(0) = \frac{(-1)^j - 1}{j\pi} + \int_0^1 qS_j\,dx$, which together with the formula (10.106) for $S_j(x)$ yields
$$\begin{aligned} S_j'(1) - S_j'(0) = {} & \frac{(-1)^j - 1}{j\pi} + \frac{1}{j^2\pi^2}\int_0^1 q(x)\sin j\pi x\,dx + \Bigl(S_j'(0) - \frac{1}{j\pi}\Bigr)\int_0^1 qz_2\,dx \\ & + \frac{1}{j^2\pi^2}\int_0^1 q(x)z_2(x)\int_0^x z_1(t)q(t)\sin j\pi t\,dt\,dx \\ & - \frac{1}{j^2\pi^2}\int_0^1 q(x)z_1(x)\int_0^x z_2(t)q(t)\sin j\pi t\,dt\,dx. \end{aligned}$$
Changing the order of integration in the last two terms then shows
$$S_j'(1) - S_j'(0) = \frac{(-1)^j - 1}{j\pi} + \frac{b_j(q)}{j^2\pi^2},$$
where
$$\begin{aligned} b_j(q) = {} & \int_0^1 q(x)\sin j\pi x\,dx - \frac{1}{z_2(1)}\int_0^1 [z_1(t)z_2(1) - z_1(1)z_2(t)]\,q(t)\sin j\pi t\,dt \int_0^1 qz_2\,dx \\ & + \int_0^1 \Bigl( z_1(x)q(x)\int_x^1 q(t)z_2(t)\,dt - z_2(x)q(x)\int_x^1 q(t)z_1(t)\,dt \Bigr)\sin j\pi x\,dx. \end{aligned}$$
The second estimate in the lemma follows from $\{b_j(q)\} \in \ell^2$, which in turn follows from Parseval's identity and the estimate
$$|z_2(1)| \ge 1 - \sqrt{2}\,\|q\|_{L^2(0,1)}\,e^{\|q\|^2_{L^2(0,1)}}.$$

The second lemma provides an estimate for the transient solution.

Lemma 10.10. Let $S_j$ be defined as in Lemma 10.9, and let $v^j$ solve
$$\partial_t^\alpha v^j - v^j_{xx} + qv^j = 0, \qquad v^j(0,t) = v^j(1,t) = 0, \qquad v^j(x,0) = -S_j(x).$$
Then
$$v^j_x(0,t) = \frac{a_j}{j^2\pi^2} \qquad\text{and}\qquad v^j_x(1,t) - v^j_x(0,t) = \frac{b_j}{j^2\pi^2},$$
where $\{a_j\} \in \ell^2$ and $\{b_j\} \in \ell^2$.
Proof. The solution $v^j$ is given by
$$v^j(x,t) = \sum_{n=1}^\infty \langle -S_j(s), \varphi_n(s;q)\rangle\,E_{\alpha,1}(-\lambda_n(q)\,t^\alpha)\,\varphi_n(x;q),$$
and it follows from the definitions of $S_j$, $\varphi_n(x;q)$, and integration by parts that $\langle \varphi_n(x;q), S_j(x)\rangle = \lambda_n^{-1}(q)\,\langle \varphi_n(x;q), \sin j\pi x\rangle$. Hence
$$(\varphi_n(x;q), S_j(x)) = \lambda_n^{-1}(q)\,(\varphi_n(x;q), \sin j\pi x) = -(j^2\pi^2\lambda_n(q))^{-1}\,(\varphi_n''(x;q), \sin j\pi x),$$
and inserting these into the eigenfunction expansion gives
$$\begin{aligned} v^j(x,t) &= \frac{1}{j^2\pi^2}\sum_{n=1}^\infty \frac{1}{\lambda_n(q)}\int_0^1 \varphi_n''(s;q)\sin j\pi s\,ds\;E_{\alpha,1}(-\lambda_n(q)\,t^\alpha)\,\varphi_n(x;q) \\ &= \frac{1}{j^2\pi^2}\int_0^1 \sum_{n=1}^\infty \frac{q(s) - \lambda_n(q)}{\lambda_n(q)}\,\varphi_n(s;q)\,E_{\alpha,1}(-\lambda_n(q)\,t^\alpha)\,\varphi_n(x;q)\,\sin j\pi s\,ds. \end{aligned}$$
Consequently,
$$v^j_x(1,t) - v^j_x(0,t) = \frac{b_j(q)}{j^2\pi^2} = \frac{1}{j^2\pi^2}\int_0^1 b(q;s)\sin j\pi s\,ds,$$
where
$$b(q;s) = \sum_{n=1}^\infty \frac{q(s) - \lambda_n(q)}{\lambda_n(q)}\,\varphi_n(s;q)\,[\varphi_n'(1;q) - \varphi_n'(0;q)]\,E_{\alpha,1}(-\lambda_n(q)\,t^\alpha).$$
Since the set $\{\varphi_n(s;q)\}_{n=1}^\infty$ forms a complete orthonormal basis in $L^2(0,1)$ and the potential $q$ is uniformly bounded, we deduce from Lemmas 9.9 and 3.25 that
$$\|b(q)\|^2_{L^2(0,1)} \le C\sum_{n=1}^\infty |\varphi_n'(1;q) - \varphi_n'(0;q)|^2\,E_{\alpha,1}(-\lambda_n(q)\,t^\alpha)^2 \le C\sum_{n=1}^\infty \frac{n^2}{\lambda_n^2(q)\,t^{2\alpha}} \le \frac{C}{t^{2\alpha}}.$$
Parseval's identity now yields $\{b_j(q)\} \in \ell^2$. Furthermore, $\|b_j(q)\|_{\ell^2} \le C$ holds uniformly for all $q$ with sufficiently small $L^2$ norm.

These two lemmas now allow us to prove the following.

Theorem 10.10. Let $f_j(x) = \sin j\pi x$, and let $u_j(x,t;q)$ be the solution to (10.99). For each fixed $t > 0$ and for all $q \in M_\omega$ with $\omega$ sufficiently small, there exist sequences $\{c_j(q)\}, \{\tilde c_j(q)\} \in \ell^2$ such that
$$u_{j,x}(0,t) = \frac{1}{j\pi} + \frac{c_j(q)}{j^2\pi^2}, \qquad u_{j,x}(1,t) - u_{j,x}(0,t) = \frac{(-1)^j - 1}{j\pi} + \frac{\tilde c_j(q)}{j^2\pi^2}.$$
Proof. We decompose the solution $u_j(x,t)$ for the input source $f_j = \sin j\pi x$ into a steady-state part $S_j$ and a transient part $v^j$ as follows:
$$-L(q)S_j = \sin j\pi x \quad\text{with}\quad S_j(0) = S_j(1) = 0,$$
and
$$\partial_t^\alpha v^j - v^j_{xx} + qv^j = 0, \qquad v^j(0,t) = v^j(1,t) = 0, \qquad v^j(x,0) = -S_j(x).$$
Then we apply the estimates in Lemmas 10.9 and 10.10.
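The leading-order terms of the steady-state part can be checked numerically. The sketch below is an illustration only, not the method of [168]: it assumes a standard second-order finite-difference discretisation of $-S'' + qS = \sin j\pi x$ with $S(0) = S(1) = 0$, solved by the Thomas algorithm, and one-sided second-order differences for the boundary derivatives. The function name `steady_state_fluxes` and the grid size `n` are hypothetical choices made for this example.

```python
import math

def steady_state_fluxes(q, j, n=2000):
    """Approximate S'(0) and S'(1) for -S'' + q(x) S = sin(j*pi*x), S(0)=S(1)=0.

    Hypothetical helper: second-order finite differences on a uniform grid
    with n subintervals, solved with the Thomas (tridiagonal) algorithm.
    """
    h = 1.0 / n
    xs = [i * h for i in range(1, n)]              # interior grid points
    diag = [2.0 / h**2 + q(x) for x in xs]         # main diagonal of -d^2/dx^2 + q
    off = -1.0 / h**2                              # constant off-diagonal entry
    rhs = [math.sin(j * math.pi * x) for x in xs]
    m = len(xs)
    # Forward elimination
    for i in range(1, m):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # Back substitution
    sol = [0.0] * m
    sol[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        sol[i] = (rhs[i] - off * sol[i + 1]) / diag[i]
    S = [0.0] + sol + [0.0]                        # add the Dirichlet boundary values
    # One-sided second-order difference quotients for the boundary derivatives
    d0 = (-3.0 * S[0] + 4.0 * S[1] - S[2]) / (2.0 * h)
    d1 = (3.0 * S[n] - 4.0 * S[n - 1] + S[n - 2]) / (2.0 * h)
    return d0, d1

if __name__ == "__main__":
    for j in (1, 2, 3):
        d0, d1 = steady_state_fluxes(lambda x: 0.0, j)
        # For q = 0 the exact values are 1/(j*pi) and ((-1)**j - 1)/(j*pi).
        print(j, d0, 1.0 / (j * math.pi), d1 - d0, ((-1) ** j - 1) / (j * math.pi))
```

Passing a small nonnegative potential instead of `lambda x: 0.0` and printing `j**2 * math.pi**2 * (d0 - 1/(j*math.pi))` gives a numerical impression of the correction sequence $a_j(q)$ from Lemma 10.9.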
The results in Theorem 10.10 may be interpreted as follows: The net flux data $u_{j,x}(1,\bar t;q) - u_{j,x}(0,\bar t;q)$ has a leading-order term $(j\pi)^{-1}((-1)^j - 1)$, which corresponds to the exact data for $q = 0$, plus a correction term $(j\pi)^{-2}c_j(q)$. Therefore, the term $c_j(q)$ can be thought of as containing all the information about the potential $q$. The leading term $(j\pi)^{-1}((-1)^j - 1)$ stems purely from the steady-state piece $S_j$ (cf. Lemma 10.9), and thus it is independent of the time $\bar t$. In contrast, the informative piece $c_j(q)$ depends crucially on the time $\bar t$, and the larger the time $\bar t$ is, the closer $c_j(q)$ is to the steady-state part. A similar interpretation applies to the flux data on the left end, i.e., $u_{j,x}(0,\bar t;q)$. This observation motivates us to look at the map $F$ from a given $q$ to the sequence $\{c_j(q)\}_{j=1}^\infty$ defined as
$$F : L^2(0,1) \to \ell^2, \qquad F(q) := \{c_j(q)\}_{j=1}^\infty,$$
and the next result shows that the linearisation $F'(q)$ at $q = 0$ is invertible.

Theorem 10.11. There exists a time $\tilde t$ such that for each fixed $t > \tilde t$, the map $F$ satisfies $\|F'(0)p\|_{\ell^2} \ge \frac{1}{2\sqrt{2}}\,\|p\|_{L^2(0,1)}$.

Proof. We give an outline here, and the few omitted steps are easily filled in. The solution $u_j(x,t;0)$ of (10.99) with $q = 0$ and $f(x) = \sin j\pi x$ is given by $u_j(x,t;0) = \lambda_j^{-1}\bigl(1 - E_{\alpha,1}(-\lambda_j t^\alpha)\bigr)\sin j\pi x$, where $\lambda_j = j^2\pi^2$. Next let
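$w := Du_j(\cdot,\cdot;0)[\delta q]$, the derivative of $u_j$ with respect to $q$ at $q = 0$ in the direction $\delta q$. Then $w$ solves
$$\begin{cases} \partial_t^\alpha w - w_{xx} = -u_j(x,t;0)\,\delta q(x), & (x,t) \in [0,1] \times (0,\infty), \\ w(0,t) = w(1,t) = 0, & t > 0, \\ w(x,0) = 0, & x \in [0,1]. \end{cases}$$
Computing a representation for $w$ is now a standard calculation that we have seen several times previously and gives
$$\begin{aligned} DF_j[0]\delta q &= j^2\pi^2\,[w_x(1,t) - w_x(0,t)] \\ &= \int_0^1 \delta q(s)\sin j\pi s\,ds + \int_0^1 \xi_t(s)\,\delta q(s)\sin j\pi s\,ds \\ &\quad + 2\sum_{n=1}^\infty n\pi\bigl((-1)^n - 1\bigr)a_{n,j}(t)\int_0^1 \sin j\pi s\,\sin n\pi s\;\delta q(s)\,ds, \end{aligned} \tag{10.107}$$
where
$$a_{n,j} = \int_0^t (t-\tau)^{\alpha-1}E_{\alpha,\alpha}(-\mu_n(t-\tau)^\alpha)\,E_{\alpha,1}(-\mu_j\tau^\alpha)\,d\tau, \qquad \xi_t(s) = 2\sum_{n=1}^\infty \frac{(-1)^n - 1}{n\pi}\,E_{\alpha,1}(-\mu_n t^\alpha)\sin n\pi s. \tag{10.108}$$
It is easy to show that
$$\|\xi_t\|_{L^\infty(0,1)} \le \min\Bigl\{1,\ \frac{4}{\pi}\sum_{n=1}^\infty n^{-1}E_{\alpha,1}(-\mu_n t^\alpha)\Bigr\}.$$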
The Cauchy–Schwarz inequality and Parseval's identity now show that the last term $A_j$ in (10.107) satisfies the bound
$$|A_j|^2 \le 2\sum_{n=1}^\infty \bigl[n\pi((-1)^n - 1)a_{n,j}(t)\bigr]^2 \sum_{n=1}^\infty \Bigl(\int_0^1 \sin j\pi s\,\sin n\pi s\;\delta q(s)\,ds\Bigr)^2 = \|\delta q(s)\sin j\pi s\|^2_{L^2(0,1)}\sum_{n=1}^\infty \bigl[n\pi((-1)^n - 1)a_{n,j}(t)\bigr]^2.$$
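Standard estimates for the Mittag-Leffler function now give the series of inequalities
$$a_{n,j}^2 \le a_{n,1}^2 \le C(\mu)\,\lambda_1^{-2}\,\lambda_n^{-2\mu}\,t^{-2\mu\alpha}$$
for any $\mu \in (\tfrac{3}{4}, 1)$, and therefore
$$\begin{aligned} \|A_j\|^2_{\ell^2} &\le \max_{j \in \mathbb{N}} \sum_{n=1}^\infty \bigl[n\pi((-1)^n - 1)a_{n,j}(t)\bigr]^2 \sum_{j=1}^\infty \|\delta q(s)\sin j\pi s\|^2_{L^2(0,1)} \\ &= \frac{1}{2}\,\|\delta q\|^2_{L^2(0,1)}\,\max_{j \in \mathbb{N}} \sum_{n=1}^\infty \bigl[n\pi((-1)^n - 1)a_{n,j}(t)\bigr]^2 \\ &\le \|\delta q\|^2_{L^2(0,1)} \sum_{n=1}^\infty (n\pi)^2\bigl((-1)^n - 1\bigr)^2\,\frac{C(\mu)}{\lambda_1^2\,\lambda_n^{2\mu}\,t^{2\mu\alpha}} \\ &\le \|\delta q\|^2_{L^2(0,1)}\,\frac{C(\mu)}{t^{2\mu\alpha}}. \end{aligned}$$
Combining the above estimates gives, for $t$ sufficiently large,
$$\begin{aligned} \|F'(0)\delta q\|_{\ell^2} &\ge \Bigl\| \int_0^1 \delta q(x)\sin j\pi x\,dx \Bigr\|_{\ell^2} - \Bigl\| \int_0^1 \xi_t(x)\,\delta q(x)\sin j\pi x\,dx \Bigr\|_{\ell^2} - \|A_j\|_{\ell^2} \\ &\ge \frac{1}{\sqrt{2}}\Bigl(1 - \frac{4}{\pi}\sum_{n=1}^\infty \frac{1}{n}\,E_{\alpha,1}(-\lambda_n t^\alpha) - \frac{C(\mu)}{t^{\mu\alpha}}\Bigr)\|\delta q\|_{L^2(0,1)} \\ &\ge \frac{1}{2\sqrt{2}}\,\|\delta q\|_{L^2(0,1)}. \end{aligned}$$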
The importance of Theorem 10.11 lies in the fact that it gives local uniqueness around $q = 0$ for any fixed time $\bar t > \tilde t$ when using the entire $\ell^2$ sequence of net flux data. It also gives continuity at $q = 0$ for smooth perturbations. We are also mapping Fourier coefficients of $q$ into those directly obtained from the data, and the analogy here with the inverse Sturm–Liouville problem suggests that this local linearisation gives excellent reconstructions [57]. However, in a practical situation we would not have the entire set of input sources and the resulting data at our disposal, but only those from a finite dimensional subset. We now turn to looking at this question.

For each integer $N \in \mathbb{N}$ and time $t$, we define the maps $G_t^N : \mathbb{R}^N \to \mathbb{R}^N$ for the net flux case (10.101) and $H_t^N : \mathbb{R}^N \to \mathbb{R}^N$ for the single flux case (10.100) given by
$$G_t^N(\mathbf{q}) = \begin{bmatrix} \pi^2\,(u_{1,x}(1,t;q) - u_{1,x}(0,t;q)) \\ 4\pi^2\,(u_{2,x}(1,t;q) - u_{2,x}(0,t;q)) \\ \vdots \\ N^2\pi^2\,(u_{N,x}(1,t;q) - u_{N,x}(0,t;q)) \end{bmatrix}, \qquad H_t^N(\mathbf{q}) = \begin{bmatrix} \pi^2\,u_{1,x}(0,t;q) \\ 4\pi^2\,u_{2,x}(0,t;q) \\ \vdots \\ N^2\pi^2\,u_{N,x}(0,t;q) \end{bmatrix},$$
where $q(x) = \sum_{i=1}^N q_i \sin i\pi x$ and $\mathbf{q} = (q_1, \ldots, q_N)$. With the quantities $a_{n,j}$ defined in (10.108), we also set
$$b_{n,j,k} = \begin{cases} \dfrac{2jnk\,[(-1)^{n+j+k} - 1]}{\pi\,((n-k)^2 - j^2)\,((n+k)^2 - j^2)} & \text{if } n \notin \{j-k,\ k-j,\ k+j\}, \\ 0 & \text{otherwise}, \end{cases}$$
$$e_{n,j,k} = \frac{1}{n\pi}\,E_{\alpha,1}(-n^2\pi^2 t^\alpha) + n\pi\,a_{n,j}(t)\,b_{n,j,k},$$
$$(S_N)_{j,k} = \begin{cases} \dfrac{(-1)^{k+j} - 1}{2\pi^2}\Bigl(\dfrac{1}{(k-j)^2} - \dfrac{1}{(k+j)^2}\Bigr) & \text{if } j \neq k, \\ -\dfrac{1}{4} & \text{if } j = k, \end{cases}$$
$$(P_N(t))_{j,k} = 2\sum_{n=1}^\infty \bigl((-1)^n - 1\bigr)\bigl[(n\pi)^{-1}E_{\alpha,1}(-n^2\pi^2 t^\alpha) + n\pi\,a_{n,j}(t)\bigr]\,e_{n,j,k},$$
$$(\tilde P_N(t))_{j,k} = 2\sum_{n=1}^\infty \bigl[(n\pi)^{-1}E_{\alpha,1}(-n^2\pi^2 t^\alpha) + n\pi\,a_{n,j}(t)\bigr]\,e_{n,j,k}.$$
With these definitions we then have the following.

Theorem 10.12. The Jacobian matrices of $G_t^N$ and $H_t^N$ at $q = 0$ are given by
$$DG_t^N[0] = \tfrac{1}{2}I_N + P_N(t), \qquad DH_t^N[0] = S_N + \tilde P_N(t),$$
where $I_N$ is the $N \times N$ identity matrix. In particular, for each fixed $N$,
$$\lim_{t \to \infty} DG_t^N[0] = \tfrac{1}{2}I_N, \qquad \lim_{t \to \infty} DH_t^N[0] = S_N,$$
where $S_N$ is invertible for each fixed integer $N$.

The above theorem can be used, along with the finite-dimensional implicit function theorem, to show that, for each fixed $N$, the frozen Newton schemes, defined by
$$q_{n+1} = q_n - (DG_t^N[0])^{-1}\,[G_t^N(q_n) - \gamma_N], \qquad q_0 = 0,$$
$$q_{n+1} = q_n - (DH_t^N[0])^{-1}\,[H_t^N(q_n) - \gamma_N], \qquad q_0 = 0,$$
are well-defined and the former (using $G$) converges, provided that $\|q_{\mathrm{act}}\|_{L^2(0,1)}$ is sufficiently small and $\bar t$ is sufficiently large.

Note that the implicit function theorem gives local uniqueness about $q = 0$ at a fixed time $\bar t > \tilde t$ for some $\tilde t$ when using the entire $\ell^2$ sequence of net flux data. The approach also gives local continuity at $q = 0$ for smooth perturbations. For the case of single-point flux data, the steady-state contribution $S_N$ to the Jacobian matrix given in Theorem 10.12 does not have a bounded inverse as $N \to \infty$. However, it is true that the finite-dimensional mapping will, for each fixed $N$ and $\bar t$ sufficiently large, have a unique inverse. Given the form of the Jacobian matrices, it can be expected, particularly in the case of flux measured at one single point, that the value of $\bar t$ should depend on $N$. In the finite-dimensional case, the condition that $q(x)$ be known in a neighbourhood of $x = 1$ as required by Theorem 10.9 is not necessary, provided that $q$ is sufficiently small.

These schemes are stable and converge rapidly over a broad range of potentials and initial approximations. They are also very easy to program, and the results are entirely in line with the analysis developed and the discussion below. Figure 10.9 shows the reconstruction errors as a function of $\alpha$ and of the measurement time $\bar t$ for both the single and net flux cases. Figure 10.10 shows the reconstruction of both a smooth and a piecewise linear function for a range of data error levels with $\alpha = 0.5$. What do this analysis and the results of such numerical experiments tell us about the effect of the fractional exponent $\alpha$ and, in particular, about the fractional case as against the classical one with $\alpha = 1$?
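The structure of a frozen Newton iteration can be sketched in a few lines. The example below is illustrative only: it does not evaluate the true maps $G_t^N$ or $H_t^N$ (which would require a solver for (10.99)), but instead uses a hypothetical toy map `toy_G` whose Jacobian at $q = 0$ is $\tfrac12 I_N$, mimicking the large-time limit in Theorem 10.12. The names `frozen_newton`, `apply_J0_inv`, and `toy_G` are assumptions made for this sketch.

```python
def frozen_newton(G, apply_J0_inv, gamma, q0, tol=1e-12, max_iter=100):
    """Iterate q_{n+1} = q_n - J0^{-1} [G(q_n) - gamma] with a fixed Jacobian J0.

    G            : callable mapping a list of coefficients to a list of data values
    apply_J0_inv : callable applying the inverse of the frozen Jacobian to a vector
    gamma        : list of measured data
    q0           : initial guess (q0 = 0 in the schemes above)
    Returns the final iterate and the number of iterations used.
    """
    q = list(q0)
    for k in range(1, max_iter + 1):
        residual = [g - d for g, d in zip(G(q), gamma)]
        step = apply_J0_inv(residual)
        q = [qi - si for qi, si in zip(q, step)]
        if max(abs(s) for s in step) < tol:
            return q, k
    return q, max_iter

def toy_G(q):
    """Hypothetical forward map with Jacobian (1/2) I at q = 0 (not the true G_t^N)."""
    return [0.5 * qi + 0.1 * qi * qi for qi in q]

if __name__ == "__main__":
    q_act = [0.3, -0.2, 0.1]                  # "actual" coefficients, small in norm
    gamma = toy_G(q_act)                      # noise-free synthetic data
    q_rec, iters = frozen_newton(
        toy_G,
        lambda r: [2.0 * ri for ri in r],     # inverse of the frozen Jacobian (1/2) I
        gamma,
        [0.0, 0.0, 0.0],
    )
    print(q_rec, iters)
```

Because the Jacobian is never updated, each step costs one forward evaluation and one application of a fixed inverse; the price is linear rather than quadratic convergence, with a rate governed by how far the true Jacobian at the solution is from the frozen one.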
Figure 10.9. Reconstruction errors for a range of α values. [Two panels: error on a logarithmic scale from $10^{-3}$ to $10^{-1}$ plotted against $\alpha \in [0, 1]$, with curves for $\bar t = 0.05$, $\bar t = 0.01$, and $\bar t = 0.005$.]
Figure 10.10. Reconstruction of both a smooth and a rough function: α = 0.5 for different data error levels. [Two panels: exact potential and reconstructions for $\delta = 3\%$, $5\%$, and $10\%$, plotted against $x \in [0, 1]$.]
First we must recognise that the time of measurement $\bar t$ is an important element. If $\bar t$ is very small, then we are far from the steady state situation and the transient solution has not had time to diffuse to the boundary, so we should expect poorer reconstructions. Indeed, we only have uniqueness for $\bar t$ sufficiently large. In this time range the solution arising from the fractional derivative will diffuse faster and hence should be more effective. As an example, if $\bar t = 0.005$ and we have $N = 10$ input sources, then the condition number of the matrix $DG^N_{\bar t}$ drops from $6.5 \times 10^4$ to $56$ as $\alpha$ decreases from 1 to 0.8. The single flux case shows even more dramatic figures, decreasing from $2.6 \times 10^{13}$ to $8.2 \times 10^4$ over the same range and $\bar t$ value. It was the case throughout, but especially for smaller $\bar t$, that the condition number of $DG^N_{\bar t}$ was smaller than that of $DH^N_{\bar t}$, showing the greater efficiency of the net flux over the single point flux data measurements.

By $\bar t = 0.05$ the difference over $\bar t = 0.005$ has become dramatic. The reconstruction error from noise-free data due to truncation at $N$ modes has become the dominant factor, and the differences with respect to $\alpha$ remain, but at an almost imperceptible level compared with those at $\bar t = 0.005$. In addition, the condition numbers of $DG^N_{\bar t}$ and $DH^N_{\bar t}$ are much closer, showing that for larger times both types of overposed data give similar results. The influence of the order $\alpha$ on the reconstruction accuracy diminishes considerably as the time $\bar t$ increases from 0.005 to 0.05, at which time the system has almost settled to its steady state for any $\alpha \in (0, 1]$. In all three cases, the smallest reconstruction errors are close to each other for either data type and are essentially the truncation error ($2.608 \times 10^{-3}$) incurred by using only the first ten Fourier modes.

Finally we mention again the fact that the time scales mentioned here are not absolute. The model equation (10.99) has all physical constants set to unity. In particular, this applies to the diffusion coefficient $k$, which couples the time and space scales, appears as $\partial_t^\alpha u - ku_{xx} + q(x)u = f$, and is itself a combination of the specific heat and thermal conductivity. For example, for many materials $k \approx 10^{-5}$, and so one must adjust the above $\bar t$ values upwards by such an amount to convert into physical time.
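The role of $\bar t$ and $\alpha$ can be illustrated in the simplest case $q = 0$, where the solution for $f_j = \sin j\pi x$ is $u_j(x,t;0) = \lambda_j^{-1}(1 - E_{\alpha,1}(-\lambda_j t^\alpha))\sin j\pi x$ with $\lambda_j = j^2\pi^2$, so that $E_{\alpha,1}(-\lambda_j t^\alpha)$ measures the relative distance from the steady state. The sketch below covers only the two orders for which a closed form is available in the standard library, namely $E_{1,1}(-x) = e^{-x}$ and $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$; a general-purpose Mittag-Leffler routine is not attempted. The function names are hypothetical, and the large-argument branch uses only the first two terms of the asymptotic expansion, so it is approximate.

```python
import math

def ml_one(x):
    """E_{1,1}(-x) = exp(-x) for x >= 0 (classical case alpha = 1)."""
    return math.exp(-x)

def ml_half(x):
    """E_{1/2,1}(-x) = exp(x^2) * erfc(x) for x >= 0 (case alpha = 1/2)."""
    if x < 20.0:
        return math.exp(x * x) * math.erfc(x)
    # Approximate large-x behaviour: leading terms of the asymptotic expansion
    return (1.0 - 0.5 / (x * x)) / (x * math.sqrt(math.pi))

def transient_fraction(alpha, j, t):
    """E_{alpha,1}(-(j*pi)^2 t^alpha) for alpha in {1, 1/2}: distance from steady state."""
    lam = (j * math.pi) ** 2
    if alpha == 1.0:
        return ml_one(lam * t)
    if alpha == 0.5:
        return ml_half(lam * math.sqrt(t))
    raise ValueError("only alpha = 1 and alpha = 1/2 are covered by this sketch")

def left_flux_q0(alpha, j, t):
    """u_{j,x}(0, t; 0) = (1 - E_{alpha,1}(-(j*pi)^2 t^alpha)) / (j*pi)."""
    return (1.0 - transient_fraction(alpha, j, t)) / (j * math.pi)

if __name__ == "__main__":
    for t in (0.005, 0.01, 0.05):
        print(t, transient_fraction(1.0, 1, t), transient_fraction(0.5, 1, t))
```

At the small measurement times considered above, the transient fraction of the lowest mode is smaller for $\alpha = 1/2$ than for $\alpha = 1$, in line with the remark that the fractional solution is closer to its steady state in this time range; for large times the ordering reverses because of the algebraic decay of the Mittag-Leffler function.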
10.5. Coefficient identification from final time data

We now consider identification of the potential term $q = q(x)$ from observations in $\Omega$ at some fixed $T > 0$,
$$u(x,T) = g(x), \tag{10.109}$$
instead of data on the lateral boundary (10.90). This allows us to extend the spatial setting to higher dimensions. Thus we consider
$$\begin{aligned} \partial_t^\alpha u - \Delta u + q\,u &= r, && (x,t) \in \Omega \times (0,T), \\ \partial_\nu u(x,t) + \gamma(x)u(x,t) &= 0, && (x,t) \in \partial\Omega \times (0,T), \\ u(x,0) &= u_0, && x \in \Omega, \end{aligned} \tag{10.110}$$
for some $0 < \alpha \le 1$, where $r(x,t)$ is a known source term, and we assume throughout this section
$$\gamma \in L^\infty(\partial\Omega), \qquad \gamma \ge \underline{\gamma} > 0 \ \text{ a.e. on } \partial\Omega. \tag{10.111}$$
Here $\Omega \subset \mathbb{R}^d$ is a bounded, simply connected domain with $C^{1,1}$ boundary $\partial\Omega$. To allow for applicability of certain Sobolev embedding results, the analysis will be restricted to dimensions $d \in \{1, 2, 3\}$. In (10.110), $-\Delta$ may be replaced by a uniformly elliptic, second order PDE operator defined in $\Omega$ with known and sufficiently smooth coefficients.

An often encountered situation is the one in which, besides the potential, another space-dependent coefficient is to be determined, for example the diffusion coefficient $a = a(x)$ in
$$\partial_t^\alpha u - \nabla \cdot (a\nabla u) + qu = r, \qquad (x,t) \in \Omega \times (0,T). \tag{10.112}$$
Clearly, more observations are needed for this purpose, and we will use a second set of final time data triggered by an alternative choice of the forcing term $r$. Additionally, we may also incorporate a nonlinear reaction term of the form $q(x)f(u)$, where $f$ is known, so that our model becomes
$$\partial_t^\alpha u - \Delta u + qf(u) = r, \qquad (x,t) \in \Omega \times (0,T). \tag{10.113}$$
Of course this requires assumptions on the nonlinear function $f(u)$ even for existence of a bounded solution up to a given time $T$. This can be guaranteed under various conditions such as $f > 0$ or $f$ uniformly Lipschitz; see Section 6.2.4 for the setting without a spatially varying potential $q(x)$ and Section 10.5.3.3 including this case.
There is again a long history for this version of the inverse potential problem. For example, in [159, 296] conditions were given for a unique recovery of $q$ as well as reconstruction algorithms in the parabolic case $\alpha = 1$. The differential operator was evaluated on the boundary where data was measured to obtain the fixed point equation,
$$q(x) = \mathbb{T}(q)(x) := \frac{\Delta g(x) - u_t(x,T;q) + r(x,T)}{g(x)}, \qquad x \in \Omega. \tag{10.114}$$
This was extended to the fractional case $\alpha \in (0,1)$ in the one-dimensional setting in [367].

In this section we will consider methods for reconstructing coefficients (mainly $q$) in the above described settings (10.110), (10.112), (10.113) from final time data (10.109). In Section 10.5.1, following [182, 184], we will provide an analysis of a fixed point scheme of the type (10.114) also in the fractional case and higher space dimensions. Moreover, in Section 10.5.2 we discuss extension of the scheme to the problem of identifying two space-dependent coefficients in (10.112). An alternative to this fixed point scheme is found in Section 10.5.3, where we study the application of Newton's and Halley's methods to the inverse problem of recovering $q$, formulated as an operator equation $F(q) = g$. There we will also comment on the semilinear case (10.113).

10.5.1. A fixed point scheme. Consider the problem of identifying the potential $q(x)$ in (10.110) from observations $g(x) = u(x,T)$, $x \in \Omega$, and define a fixed point operator $\mathbb{T}$ by
$$\mathbb{T}(q)(x) = q^+(x) = \frac{\Delta g(x) - \partial_t^\alpha u(x,T;q) + r(x,T)}{g(x)}, \qquad x \in \Omega, \tag{10.115}$$
where $u(x,t) := u(x,t;q)$ solves (10.110).
We first of all show that the parameter-to-state map $G : q \mapsto u$ is well-defined as an operator from a subset of $L^2(\Omega)$ to $C([0,T]; H^2(\Omega))$, which via the PDE implies $\partial_t^\alpha u = \Delta u - q\,u \in C([0,T]; L^2(\Omega))$; thus $\mathbb{T}$ is well-defined. We would like to work with the Hilbert space $X = L^2(\Omega)$ as a parameter space. As a domain of $G$ we therefore use the following subset of $L^2(\Omega)$, which will later on, in the context of Newton's method, also be used as the domain of the forward operator $F$:
$$D(F) := \bigl\{ \tilde q \in L^2(\Omega) : \exists\, \bar q > 0 : \|\tilde q - \bar q\|_{L^2} < \mu \bigr\} = \bigcup_{\bar q > 0} B_\mu^{L^2}(\bar q). \tag{10.116}$$
Here $\mu$ needs to be chosen sufficiently small; more precisely
$$\mu = \min\Bigl\{ \frac{\min\{1, \underline{\gamma}\}}{(C_F^\Omega\,C^\Omega_{H^1 \to L^4})^2},\ \frac{1}{\|(-\Delta)^{-1}\|_{L^2(\Omega) \to H^2(\Omega)}\,C^\Omega_{H^2 \to L^\infty}} \Bigr\},$$
where $C_F^\Omega$ is the constant in the Friedrichs inequality
$$\|v\|_{H^1} \le C_F^\Omega\bigl( \|\nabla v\|^2_{L^2} + \|\mathrm{tr}_{\partial\Omega} v\|^2_{L^2(\partial\Omega)} \bigr)^{1/2},$$
and $C^\Omega_{H^1 \to L^4}$, $C^\Omega_{H^2 \to L^\infty}$ are the $H^1(\Omega) \to L^4(\Omega)$ and $H^2(\Omega) \to L^\infty(\Omega)$ embedding constants; see [210].
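This choice of $D(F)$ in (10.116), and in particular of $\mu$, allows us to establish the following preparatory result on the elliptic operator $L^q := -\Delta + q\,\cdot$ induced by an $L^2(\Omega)$ potential $q \in D(F)$.

Theorem 10.13. For $q \in D(F)$, the operator $L^q : H^2(\Omega) \to L^2(\Omega)$, defined by $\langle L^q u, v\rangle_{H^1(\Omega)^*, H^1(\Omega)} := \int_\Omega (\nabla u \cdot \nabla v + quv)\,dx + \int_{\partial\Omega} \gamma uv\,ds$, is an isomorphism.

Proof. For $q \in D(F)$, the operator $L^q : H^1(\Omega) \to (H^1(\Omega))^*$ is an isomorphism due to the Lax–Milgram lemma (see, e.g., [305, Section 16.1]) based on boundedness
$$\begin{aligned} \int_\Omega (\nabla u \cdot \nabla v + quv)\,dx + \int_{\partial\Omega} \gamma uv\,ds &\le \|\nabla u\|_{L^2}\|\nabla v\|_{L^2} + \|q\|_{L^2}\|u\|_{L^4}\|v\|_{L^4} + \|\gamma\|_{L^\infty}\|u\|_{L^2(\partial\Omega)}\|v\|_{L^2(\partial\Omega)} \\ &\le \bigl(1 + \|q\|_{L^2}(C^\Omega_{H^1 \to L^4})^2 + \|\gamma\|_{L^\infty}\|\mathrm{tr}_{\partial\Omega}\|^2_{H^1(\Omega) \to L^2(\partial\Omega)}\bigr)\,\|u\|_{H^1}\|v\|_{H^1} \end{aligned}$$
and $H^1$ ellipticity
$$\begin{aligned} \int_\Omega (|\nabla u|^2 + qu^2)\,dx + \int_{\partial\Omega} \gamma u^2\,ds &\ge \|\nabla u\|^2_{L^2} - \|\min\{0, q\}\|_{L^2}\|u\|^2_{L^4} + \underline{\gamma}\,\|\mathrm{tr}_{\partial\Omega} u\|^2_{L^2(\partial\Omega)} \\ &\ge \Bigl( \frac{\min\{1, \underline{\gamma}\}}{(C_F^\Omega)^2} - \|\min\{0, q\}\|_{L^2}(C^\Omega_{H^1 \to L^4})^2 \Bigr)\|u\|^2_{H^1}. \end{aligned}$$
For $q$ in $D(F)$ as in (10.116), by a fixed point argument, we also have $H^2$ regularity in the sense that $(L^q)^{-1} : L^2(\Omega) \to H^2(\Omega)$ is bounded. This can be seen as follows. For arbitrary $\bar q \in L^\infty(\Omega; \mathbb{R}^+)$ and $f \in L^2(\Omega)$, we have $L^q u = f$ if and only if $u = T_{\bar q}u := (L^{\bar q})^{-1}[(\bar q - q)u + f]$, where $T_{\bar q}$ is a contraction on $H^2(\Omega)$ provided the multiplication operator $M_{q - \bar q} : v \mapsto (q - \bar q)v$ satisfies $\|(L^{\bar q})^{-1}M_{q - \bar q}\|_{H^2 \to H^2} < 1$. Thus, the solution $u$ of $L^q u = f$ is in $H^2(\Omega)$, with
$$\|u\|_{H^2} \le C^q\,\|f\|_{L^2}$$
for some constant $C^q := \|(L^q)^{-1}\|_{L^2 \to H^2}$ independent of $f$, provided
$$\inf_{\bar q \in L^\infty(\Omega; \mathbb{R}^+)} \|(L^{\bar q})^{-1}M_{q - \bar q}\|_{H^2 \to H^2} < 1.$$
The latter is satisfied if $q \in D(F)$ defined as in (10.116) with $\mu > 0$ sufficiently small, since for any constant function $\bar q \equiv \mathrm{const} \in L^\infty(\Omega; \mathbb{R}^+)$, any $f \in L^2(\Omega)$, and $v = (L^{\bar q})^{-1}f \in H^2(\Omega)$, the estimate
$$\|{-\Delta v}\|^2_{L^2} \le \|{-\Delta v}\|^2_{L^2} + \bar q\int_\Omega |\nabla v|^2\,dx = \langle L^{\bar q}v, (-\Delta v)\rangle_{L^2} - \bar q\int_{\partial\Omega} \gamma v^2\,ds \le \tfrac{1}{2}\|f\|^2_{L^2} + \tfrac{1}{2}\|{-\Delta v}\|^2_{L^2}$$
holds. Hence $\|(-\Delta)(L^{\bar q})^{-1}\|_{L^2 \to L^2} \le 1$ and therefore
$$\|(L^{\bar q})^{-1}M_{q - \bar q}\|_{H^2 \to H^2} \le \|(-\Delta)^{-1}\|_{L^2 \to H^2}\,\|M_{q - \bar q}\|_{H^2 \to L^2} \le \|(-\Delta)^{-1}\|_{L^2 \to H^2}\,C^\Omega_{H^2 \to L^\infty}\,\|q - \bar q\|_{L^2}.$$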
As a consequence, the inverse operator $(L^q)^{-1} : L^2(\Omega) \to L^2(\Omega)$ is selfadjoint and compact, and therefore $L^q$ has a complete orthonormal system of eigenfunctions $\varphi^q_\ell$ and a corresponding sequence of eigenvalues $\lambda^q_\ell$ tending to infinity. Moreover, for the spaces
$$\dot H^s_q(\Omega) := \Bigl\{ v \in L^2(\Omega) : \|v\|^2_{\dot H^s_q} := \sum_{\ell=1}^\infty (\lambda^q_\ell)^s\,\langle v, \varphi^q_\ell\rangle^2 < \infty \Bigr\} \tag{10.117}$$
we have that $\dot H^s_q$ and $H^s$ coincide (up to boundary conditions, see Section A.2) with equivalent norms as long as $s \le 2$, where the constants in the equivalence estimates are uniform for $q$ in an $L^2$ neighborhood $B_\mu^{L^2}(\bar q)$ of some positive and constant potential $\bar q$. Indeed this is actually true for $q \in L^2(\Omega)$ and $s \le 2$, since we can estimate $\|u_0\|_{\dot H^2_q} = \|L^q u_0\|_{L^2}$ by $(1 + C^\Omega_{H^2 \to L^\infty}\|q\|_{L^2})\,\|u_0\|_{H^2}$. As soon as $s > 2$, higher smoothness of $q$ is needed, as can be seen from the example
$$\|u_0\|_{\dot H^4_q} = \|(L^q)^2 u_0\|_{L^2} = \|(-\Delta)^2 u_0 + q(-\Delta)u_0 + (-\Delta)[qu_0] + q^2 u_0\|_{L^2},$$
which makes sense only for $q \in H^2(\Omega)$. We will work with $\dot H^\sigma_q(\Omega)$ as a solution space, with $\sigma$ having to be well chosen to balance between a sufficiently high required regularity to allow embedding into $L^\infty(\Omega)$ and a sufficiently low power of the eigenvalues to still enable decay of the contraction constant $\Phi^\sigma(T)$; see Proposition 10.1.

By means of separation of variables, Theorem 10.13 allows us to prove well-posedness of the forward problem (10.144).

Theorem 10.14. For any $q \in D(F)$ according to (10.116) and any $u_0 \in L^2(\Omega)$, the initial boundary value problem (10.144) has a unique solution
$u \in C^\infty((0,T]; H^2(\Omega)) \cap C([0,T]; H^2(\Omega))$ and for any $t > 0$ the estimate
\[ \|u(\cdot,t)\|_{H^2} \le C\,\Gamma(\alpha+1)\, t^{-\alpha}\, \|u_0\|_{L^2} \]
holds for some constant $C$ depending only on $q$ (but not on $t$, $\alpha$, or $u_0$) and which can be chosen uniformly for all $q$ in an $L^2(\Omega)$ neighborhood of some constant $\bar q$.

Proof. Due to Theorem 10.13, the solution $u$ of (10.144) can be written by means of the separation of variables formula in terms of the Fourier coefficients with respect to the basis $\varphi^q_\ell$,
\[ u(x,t) = \sum_{\ell=1}^\infty E_{\alpha,1}(-\lambda^q_\ell t^\alpha)\, \langle u_0, \varphi^q_\ell\rangle\, \varphi^q_\ell(x) , \tag{10.118} \]
hence $u \in C^\infty((0,T], H^2(\Omega))$, and using Theorem 3.25, we obtain for any $t \in (0,T]$,
\[ \|u(\cdot,t)\|_{H^2} \le C_{\dot H^2_q\to H^2}\, \|u(\cdot,t)\|_{\dot H^2_q} \le C_{\dot H^2_q\to H^2}\, \Gamma(\alpha+1)\, t^{-\alpha}\, \|u_0\|_{L^2} . \]
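We will now prove contractivity of the operator $\mathbb{T}$ in (10.115) for sufficiently large final time $T$, hence convergence of the corresponding fixed point iteration and uniqueness of a fixed point, and therewith of a solution to the inverse problem. For two different potentials $q$, $\tilde q$ (with corresponding solutions $u = G(q)$, $\tilde u := G(\tilde q)$), the difference
\[ \delta q^+ := \mathbb{T}(q) - \mathbb{T}(\tilde q) = \frac{-\partial_t^\alpha u(T) + \partial_t^\alpha \tilde u(T)}{g} = -\frac{\partial_t^\alpha \hat u(T)}{g} , \tag{10.119} \]
where $\hat u = u - \tilde u$, solves
\[ \partial_t^\alpha \hat u - \Delta \hat u + q \hat u = -\delta q\, \tilde u , \quad t \in (0,T) , \qquad \hat u(0) = 0 , \tag{10.120} \]
with $\delta q = q - \tilde q$.

Proposition 10.1. For any $\sigma > \frac d2$, the solution to (10.120) satisfies
\[ \|\partial_t^\alpha \hat u(T)\|_{L^2(\Omega)} \le E_{\alpha,1}(-\lambda_1 T^\alpha)\, \|\delta q\, u_0\|_{L^2(\Omega)} + C_0\, \Phi^\sigma(T)\, \|\delta q\|_{L^2(\Omega)} , \tag{10.121} \]
where $C_0 = C^\Omega_{\dot H^\sigma_{\tilde q}\to L^\infty} \|\tilde L u_0\|_{L^2(\Omega)}$ and
\[ \Phi^\sigma(T) = \sup_{\lambda\ge\lambda_1}\, \sup_{\mu\ge\tilde\lambda_1} \int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha)\, \mu^{\sigma/2} E_{\alpha,1}(-\mu(T-s)^\alpha)\, ds . \tag{10.122} \]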
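Proof. We use eigensystems $(\lambda_j, \varphi_j)$, $(\tilde\lambda_j, \tilde\varphi_j)$ of the operators defined by $Lw = -\Delta w + q w$ and $\tilde L w = -\Delta w + \tilde q w$, to obtain the representations
\begin{align*}
\hat u(x,t) &= \sum_{j=1}^\infty \int_0^t s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_j s^\alpha)\, \langle -\delta q\, \tilde u(t-s), \varphi_j\rangle\, ds\; \varphi_j(x), \\
\tilde u(x,t) &= \sum_{j=1}^\infty E_{\alpha,1}(-\tilde\lambda_j t^\alpha)\, \langle u_0, \tilde\varphi_j\rangle\, \tilde\varphi_j(x), \\
\partial_t^\alpha \hat u(x,T) &= \sum_{j=1}^\infty \Bigl( E_{\alpha,1}(-\lambda_j T^\alpha)\, \langle -\delta q\, u_0, \varphi_j\rangle + \int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_j s^\alpha)\, \langle -\delta q\, \partial_t^\alpha \tilde u(T-s), \varphi_j\rangle\, ds \Bigr) \varphi_j(x), \\
\partial_t^\alpha \tilde u(x,t) &= -\sum_{j=1}^\infty \tilde\lambda_j E_{\alpha,1}(-\tilde\lambda_j t^\alpha)\, \langle u_0, \tilde\varphi_j\rangle\, \tilde\varphi_j(x) .
\end{align*}
Here
\[ \|\partial_t^\alpha \tilde u(t)\|_{L^\infty(\Omega)} \le C^\Omega_{\dot H^\sigma_{\tilde q}\to L^\infty}\, \|\partial_t^\alpha \tilde u(t)\|_{\dot H^\sigma_{\tilde q}(\Omega)} \le C_0 \sup_{\mu\ge\tilde\lambda_1} \mu^{\sigma/2} E_{\alpha,1}(-\mu t^\alpha) . \]
Hence
\begin{align*}
\|\partial_t^\alpha \hat u(T)\|_{L^2(\Omega)} &\le \Bigl( \sum_{j=1}^\infty \bigl( E_{\alpha,1}(-\lambda_j T^\alpha)\, \langle \delta q\, u_0, \varphi_j\rangle \bigr)^2 \Bigr)^{1/2} + \Bigl( \sum_{j=1}^\infty \Bigl( \int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_j s^\alpha)\, \langle \delta q\, \partial_t^\alpha \tilde u(T-s), \varphi_j\rangle\, ds \Bigr)^2 \Bigr)^{1/2} \\
&\le E_{\alpha,1}(-\lambda_1 T^\alpha)\, \|\delta q\, u_0\|_{L^2(\Omega)} + C_0\, \|\delta q\|_{L^2(\Omega)} \sup_{\lambda\ge\lambda_1}\, \sup_{\mu\ge\tilde\lambda_1} \int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha)\, \mu^{\sigma/2} E_{\alpha,1}(-\mu(T-s)^\alpha)\, ds .
\end{align*}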
In view of the estimate from Proposition 10.1 and the identity (10.119), it is crucial for contractivity to prove that $\Phi^\sigma(T)$ as defined in (10.122) tends to zero for increasing $T$. For this purpose, the exponent $\sigma$ must not be too large. In fact we have the following result, which gives convergence to zero for $\sigma = 2$ and hence for any $\sigma \le 2$.

Lemma 10.11. For $\Phi(T)$ defined by
\[ \Phi(T) = \sup_{\lambda\ge\lambda_1}\, \sup_{\mu\ge\tilde\lambda_1} \int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha)\, \max\{1,\mu\}\, E_{\alpha,1}(-\mu(T-s)^\alpha)\, ds , \tag{10.123} \]
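we have $\Phi(T) \to 0$ as $T \to \infty$.

Proof. Using the identities
\[ \lambda s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha) = \frac{d}{ds} E_{\alpha,1}(-\lambda s^\alpha) , \qquad E_{\alpha,1}(0) = 1 , \qquad E_{\alpha,\alpha}(0) = \frac{1}{\Gamma(\alpha)} , \]
and the bound
\[ E_{\alpha,1}(-x) \le \frac{1}{1 + \Gamma(1+\alpha)^{-1} x} , \]
that hold for every $\alpha \in (0,1)$ (cf. Theorem 3.25) and all $s \in \mathbb{R}$ and all $x \in \mathbb{R}^+$, as well as the complete monotonicity of the function $x \mapsto E_{\alpha,1}(-x)$ on $\mathbb{R}^+$, we can estimate as follows:
\begin{align*}
&\int_0^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha)\, \mu E_{\alpha,1}(-\mu(T-s)^\alpha)\, ds \\
&\quad\le \Gamma(1+\alpha) \Bigl( \int_0^{T/2} s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha) (T-s)^{-\alpha}\, ds + \int_{T/2}^T s^{\alpha-1} E_{\alpha,\alpha}(-\lambda s^\alpha) (T-s)^{-\alpha}\, ds \Bigr) \\
&\quad= \Gamma(1+\alpha) \Bigl( \frac{\alpha}{\lambda} \int_0^{T/2} E_{\alpha,1}(-\lambda s^\alpha) (T-s)^{-\alpha-1}\, ds - \frac{\alpha}{\lambda} T^{-\alpha} + \frac{\alpha}{\lambda} E_{\alpha,1}\bigl(-\lambda (\tfrac T2)^\alpha\bigr) (\tfrac T2)^{-\alpha} + E_{\alpha,\alpha}\bigl(-\lambda (\tfrac T2)^\alpha\bigr) \int_{T/2}^T s^{\alpha-1} (T-s)^{-\alpha}\, ds \Bigr) \\
&\quad\le \Gamma(1+\alpha) \Bigl( \frac{\alpha}{\lambda} \int_0^{T/2} (T-s)^{-\alpha-1}\, ds + \frac{\alpha}{\lambda} (\tfrac T2)^{-\alpha} + E_{\alpha,\alpha}\bigl(-\lambda (\tfrac T2)^\alpha\bigr) \int_{1/2}^1 r^{\alpha-1} (1-r)^{-\alpha}\, dr \Bigr) ,
\end{align*}
where we have used the substitution $s = Tr$. Hence we get
\[ \Phi(T) \le \frac{\Gamma(1+\alpha)}{\min\{1, \tilde\lambda_1\}} \Bigl( \frac{1+\alpha}{\lambda_1} (\tfrac T2)^{-\alpha} + E_{\alpha,\alpha}\bigl(-\lambda_1 (\tfrac T2)^\alpha\bigr) \int_{1/2}^1 r^{\alpha-1} (1-r)^{-\alpha}\, dr \Bigr) \to 0 \quad\text{as } T \to \infty . \]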
Thus for $\sigma \le 2$, by Lemma 10.11 and $\Phi^\sigma(T) \le C\Phi(T)$, the right-hand side in (10.121) tends to zero as $T \to \infty$. On the other hand, for $\sigma > \frac32$, $H^\sigma(\Omega)$ continuously embeds into $L^\infty(\Omega)$ for a bounded domain $\Omega \subseteq \mathbb{R}^d$, $d \le 3$, as needed in the proof of Proposition 10.1. Hence this works in up to three space dimensions and gives contractivity for $T$ large enough.
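The rational bound on $E_{\alpha,1}(-x)$ used in the proof of Lemma 10.11 can be checked numerically in a special case. The sketch below is only an illustration and not part of the argument: it assumes $\alpha = \frac12$, where the classical closed form $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$ is available, and the function names and sample grid are hypothetical choices.

```python
import math

def ml_half(x):
    """E_{1/2,1}(-x) for x >= 0 via the closed form exp(x^2) * erfc(x)."""
    return math.exp(x * x) * math.erfc(x)

def rational_bound(x, alpha=0.5):
    """Upper bound 1 / (1 + x / Gamma(1 + alpha)) quoted in the proof."""
    return 1.0 / (1.0 + x / math.gamma(1.0 + alpha))

def worst_ratio(xs):
    """Largest value of E_{1/2,1}(-x) / bound(x) over the sample points."""
    return max(ml_half(x) / rational_bound(x) for x in xs)

xs = [0.05 * k for k in range(0, 401)]  # sample points in [0, 20]
ratio = worst_ratio(xs)
# Complete monotonicity implies in particular that the function is decreasing.
monotone = all(ml_half(a) >= ml_half(b) for a, b in zip(xs, xs[1:]))
print(f"max E/bound = {ratio:.4f}, monotone decreasing: {monotone}")
```

A ratio not exceeding one on the sample grid is consistent with the bound; it is of course no substitute for Theorem 3.25.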
Theorem 10.15. Let $\Omega \subseteq \mathbb{R}^3$ be a bounded $C^{1,1}$ domain, $r = 0$, $\alpha \in (0,1)$, and let $g \in H^2(\Omega)$ satisfy $\bigl|\frac{1}{g(x)}\bigr| \le C_0$. Then for $T > 0$ sufficiently large, the operator $\mathbb{T}$ defined by (10.115) is a contraction with respect to the $L^2$ norm.

In one space dimension, the analysis of the operator $\mathbb{T}$ can utilise Sturm–Liouville theory and becomes less technical for this reason. We will here briefly collect some of the results from [367]. We impose impedance conditions at both endpoints (we want to avoid Dirichlet) and zero initial value
\begin{align*}
\partial_t^\alpha u - u_{xx} + q(x) u &= r(x), && 0 < x < 1,\ t > 0, \\
-u_x(0,t) + \gamma u(0,t) = u_x(1,t) + \gamma u(1,t) &= 0, && t > 0, \tag{10.124} \\
u(x,0) &= 0, && 0 < x < 1,
\end{align*}
with $0 < \gamma < \infty$ and a time independent known source term $r$. As in the higher dimensional setting, we assume that we are able to measure the value of $u$ at some later time $T > 0$,
\[ u(x,T) = g(x), \qquad 0 \le x \le 1 . \tag{10.125} \]
We will require that $g(x)$ obeys certain conditions (which means that the measured data will have to be projected into a space satisfying these conditions), namely that $g$ is positive, bounded away from zero, and possesses two uniformly bounded derivatives
\[ 0 < \underline{g} \le g(x) , \quad x \in [0,1] , \qquad g \in W^{2,\infty}(0,1) . \tag{10.126} \]
In (10.124), instead of incorporating a forcing term, we could drive the system by nonhomogeneous conditions on the boundary but the exposition is simpler with the current conditions. The data $g(x)$ could, in a heat conduction scenario, be measured by, for example, a thermal camera at a fixed time. The exposition here follows that of [367], which is in turn based on earlier work for the parabolic problem, [159].

We assume that $q \in L^\infty$ and $r \in H^1(0,1)$ so that $r$ is in fact (Hölder) continuous and is positive. With these assumptions the solution $u(x,t)$ of the direct problem is strictly positive in $[0,1] \times (0,T]$. The solution of the direct problem (10.125) we have now met several times,
\[ u(x,t) = \sum_{n=1}^\infty E_{\alpha,n}(t)\, \varphi_n(x;q)\, \langle f, \varphi_n(\cdot;q)\rangle , \tag{10.127} \]
where $E_{\alpha,n}(t)$ is defined in (10.103) and $\{\lambda_n\}$, $\{\varphi_n\}$ are the eigenparameters of the operator $-u_{xx} + qu$ subject to impedance boundary conditions with impedance value $\gamma$. From Lemma 9.9 for $q_1, q_2 \in L^\infty(0,1)$ we have the estimates
\[ |\lambda_n^{q_1} - \lambda_n^{q_2}| \le C \|q_1 - q_2\|_{L^2(0,1)} , \qquad \|\varphi_n^{q_1} - \varphi_n^{q_2}\|_{L^2(0,1)} \le \frac{C}{n} \|q_2 - q_1\|_{L^2(0,1)} . \tag{10.128} \]
The fixed point equation reads as
\[ q(x) = \frac{r(x) + g''(x) - \partial_t^\alpha u(x,T;q)}{g(x)} . \tag{10.129} \]
This is valid as our data $g(x)$ should be strictly positive (but we will return to this point later). It is possible to just set $q \in L^2(0,1)$ as in this case the eigenfunctions $\varphi_n(x)$ are well-defined and hence so also is our solution representation, but using the scheme (10.129) will be facilitated if we in fact restrict $q \in L^\infty(0,1)$. In fact we shall define the set $M$ by
\[ M := \{ p \in L^\infty(0,1) : 0 \le p(x) \le (r(x) + g''(x))/g(x) ,\ x \in (0,1) \} , \tag{10.130} \]
and the map $\mathbb{T}(q) : L^\infty \to L^\infty$ by the value of the term on the right-hand side of (10.129),
\[ \mathbb{T}(q)(x) = \frac{r(x) + g''(x) - \partial_t^\alpha u(x,T;q)}{g(x)} , \qquad x \in (0,1) . \tag{10.131} \]
Given our construction of the set $M$ and Lemma 10.12, we see that $\mathbb{T} : M \to M$, that is, $\mathbb{T}$ is self-mapping.

Lemma 10.12 ([367, Corollary 2.1]). Under conditions (10.126) with $r \ge 0$, for any $q \in M$ the solution of (10.124) satisfies $w(x,t) = \partial_t^\alpha u(x,t;q) \ge 0$ for $(x,t) \in [0,1] \times [0,T]$ and $r(x) + g''(x) \ge 0$ for $x \in [0,1]$.

Proof. Note that $\lim_{t\to 0^+} \partial_t^\alpha u = r(x) > 0$. If we take the $\alpha$th derivative in the defining equations for $u$, we obtain with $w = \partial_t^\alpha u$
\begin{align*}
\partial_t^\alpha w - w_{xx} + q(x) w &= 0, && 0 < x < 1,\ t > 0, \\
-w_x(0,t) + \gamma w(0,t) = w_x(1,t) + \gamma w(1,t) &= 0, && t > 0, \tag{10.132} \\
w(x,0) &> 0, && 0 < x < 1,
\end{align*}
then using the maximum principle from Section 6.3, we obtain $w = \partial_t^\alpha u \ge 0$. Using this for $q = q_{act}$ with $g(x) = u(x,T;q_{act})$, we get $r(x) + g''(x) = \partial_t^\alpha u(x,T;q_{act}) + q(x) g(x) \ge 0$.

The goal is to show that $\mathbb{T}$ has a unique fixed point $p$ and this can be recovered by the iteration scheme $q_{n+1} = \mathbb{T} q_n$. This will be achieved by showing that $\mathbb{T}$ is a contraction mapping with respect to the $L^2$ norm, provided the measurement time $T$ is sufficiently large. Note that $M$ is closed
with respect to the $L^2$ norm, so Banach's contraction principle, Theorem 8.15, can be invoked. It will be convenient to let $E_q$ denote the evolution operator acting on $r$ defined by
\[ (E_q r)(t) = \sum_{n=1}^\infty \frac{1}{\lambda_n(q)} E_{\alpha,1}(-\lambda_n(q) t^\alpha)\, \langle r, \varphi_n(\cdot;q)\rangle\, \varphi_n(x;q) . \]
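Lemma 10.13 ([367, Lemma 3.1]). Under conditions (10.126) with $r \ge 0$, for any $q_1, q_2 \in M$,
\[ \|\partial_t^\alpha (E_{q_1} - E_{q_2})(T)\|_{L^2(0,1)} \le C T^{-\alpha} \|q_1 - q_2\|_{L^2(0,1)} . \]

Proof. By the fact that $\partial_t^\alpha E_{\alpha,1}(-\lambda_n(q) t^\alpha) = \lambda_n(q) E_{\alpha,1}(-\lambda_n(q) t^\alpha)$ and an obvious splitting into three terms, we obtain
\begin{align*}
\partial_t^\alpha (E_{q_1}(T) - E_{q_2}(T)) = \sum_{n=1}^\infty \Bigl[ &\bigl( E_{\alpha,1}(-\lambda_n^{q_1} t^\alpha) - E_{\alpha,1}(-\lambda_n^{q_2} t^\alpha) \bigr) \langle r, \varphi_n(\cdot;q_1)\rangle\, \varphi_n(x;q_1) \\
&+ E_{\alpha,1}(-\lambda_n^{q_2} t^\alpha)\, \langle r, \varphi_n(\cdot;q_1) - \varphi_n(\cdot;q_2)\rangle\, \varphi_n(x;q_1) \\
&+ E_{\alpha,1}(-\lambda_n^{q_2} t^\alpha)\, \langle r, \varphi_n(\cdot;q_2)\rangle \bigl( \varphi_n(x;q_1) - \varphi_n(x;q_2) \bigr) \Bigr] = S_1 + S_2 + S_3 ,
\end{align*}
where
\[ \|S_1\|_{L^2(0,1)}^2 = \sum_{n=1}^\infty \bigl| E_{\alpha,1}(-\lambda_n^{q_1} T^\alpha) - E_{\alpha,1}(-\lambda_n^{q_2} T^\alpha) \bigr|^2 \langle r, \varphi_n(\cdot;q_1)\rangle^2 \le C T^{-2\alpha} \sum_{n=1}^\infty |\lambda_n^{q_1} - \lambda_n^{q_2}|^2\, \frac{1}{\bar\lambda_n^4}\, \langle r, \varphi_n(\cdot;q_1)\rangle^2 , \]
where $\bar\lambda_n$ lies between $\lambda_n^{q_1}$ and $\lambda_n^{q_2}$. The functions $q_i$ are nonnegative and the sequence $\{\lambda_n\}$ is monotonically increasing with $\lambda_n(q) > \lambda_1(q) \ge \lambda_1(0)$. The sequence $\lambda_n(q)$ is also Lipschitz continuous with respect to $q$, see Lemma 9.9. From these facts we then have
\[ \|S_1\|_{L^2(0,1)}^2 \le C T^{-2\alpha} \|r\|_{L^2(0,1)}^2 \|q_1 - q_2\|_{L^2(0,1)}^2 . \]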
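For $S_2$ using equation (10.129) and again Lemma 9.9, possibly changing values of the generic constant $C$, we obtain
\begin{align*}
\|S_2\|_{L^2(0,1)}^2 &= \sum_{n=1}^\infty |E_{\alpha,1}(-\lambda_n^{q_2} T^\alpha)|^2\, \langle r, \varphi_n(\cdot;q_1) - \varphi_n(\cdot;q_2)\rangle^2 \\
&\le C T^{-2\alpha} \sum_{n=1}^\infty (\lambda_n^{q_2})^{-2}\, \langle r, \varphi_n(\cdot;q_1) - \varphi_n(\cdot;q_2)\rangle^2 \\
&\le C T^{-2\alpha} \|r\|_{L^2(0,1)}^2 \sum_{n=1}^\infty \frac{1}{n^2} \|q_1 - q_2\|_{L^2(0,1)}^2 \\
&\le C T^{-2\alpha} \|q_1 - q_2\|_{L^2(0,1)}^2 .
\end{align*}
An almost identical argument for $S_3$ gives the estimate
\[ \|S_3\|_{L^2(0,1)}^2 \le C T^{-\alpha} \|r\|_{L^2(0,1)}^2 \|q_1 - q_2\|_{L^2(0,1)}^2 . \]
Combining all three estimates shows the required conclusion.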
Using Lemma 10.13 and the definition of $\mathbb{T}$, we obtain

Theorem 10.16. Under conditions (10.126) with $r \ge 0$, for any $q_1, q_2 \in M$,
\[ \|\mathbb{T}(q_1) - \mathbb{T}(q_2)\|_{L^2(0,1)} \le C T^{-\alpha} \|q_1 - q_2\|_{L^2(0,1)} \]
where the constant $C$ is independent of $T$. In particular, for $T$ large enough, this implies the existence and uniqueness of a fixed point of the operator $\mathbb{T}$ in the set $M$.

In fact the map $\mathbb{T}$ has another important property, that of monotonicity with respect to the natural ordering in $M$: $q_1 \preceq q_2$ if and only if $q_1(x) \le q_2(x)$ for all $x \in (0,1)$. The following result is an immediate consequence of the definition of $\mathbb{T}$ and Lemma 10.12.

Theorem 10.17 ([367, Theorem 3.2]). Under conditions (10.126), with $r \ge 0$, for any $q_1, q_2 \in M$, if $q_1 \preceq q_2$, then $\mathbb{T}(q_1) \preceq \mathbb{T}(q_2)$.

Thus we have shown that if data $g(x)$ is given that corresponds exactly to $u(x,T;q_{act})$ for a solution $q_{act}$ of the inverse problem, then this $q_{act}$ can be recovered by the iteration scheme $q_{n+1} = \mathbb{T} q_n$. Furthermore, if we take $q_0 = 0$, then the sequence of iterates will be increasing with respect to the order relation $\preceq$. However, an easy argument shows that the initial approximation $q_0(x) = (g'' + r)/g$ is a superior choice to $q_0(x) = 0$. Of course real data is subject to noise, and we refer to Section 8.2.3.2 for a discussion on data smoothing and noise propagation; in our case, we additionally have to make sure that the smoothed data satisfies the positivity assumption (10.126).
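The iteration $q_{n+1} = \mathbb{T} q_n$ can be sketched in a few lines for the parabolic case. The following is a minimal illustration under stated assumptions, not the implementation of [367]: it takes $\alpha = 1$, a finite difference grid with ghost points for the impedance conditions, implicit Euler in time, noise-free synthetic data, and hypothetical choices of $q_{act}$, $r$, $\gamma$, $T$ and grid sizes. The discrete $g''$ and $u_t$ are taken from the same scheme, so that $q_{act}$ is an exact fixed point of the discrete map.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def laplacian(u, h, gamma):
    """Second difference; impedance conditions imposed via ghost points."""
    inner = [(u[i - 1] - 2 * u[i] + u[i + 1]) / h**2 for i in range(1, len(u) - 1)]
    left = (2 * u[1] - (2 + 2 * h * gamma) * u[0]) / h**2
    right = (2 * u[-2] - (2 + 2 * h * gamma) * u[-1]) / h**2
    return [left] + inner + [right]

def forward(q, r, h, gamma, T, steps):
    """Implicit Euler for u_t - u_xx + q u = r, u(x,0) = 0; returns u(T), u_t(T)."""
    n, dt = len(q), T / steps
    lower, upper = [-1 / h**2] * n, [-1 / h**2] * n
    upper[0] = lower[-1] = -2 / h**2
    upper[-1] = lower[0] = 0.0
    diag = [1 / dt + 2 / h**2 + qi for qi in q]
    diag[0] += 2 * gamma / h
    diag[-1] += 2 * gamma / h
    u = prev = [0.0] * n
    for _ in range(steps):
        prev = u
        rhs = [prev[i] / dt + r[i] for i in range(n)]
        u = solve_tridiagonal(lower, diag, upper, rhs)
    return u, [(u[i] - prev[i]) / dt for i in range(n)]

def fixed_point_map(q, g, lap_g, r, h, gamma, T, steps):
    """One step q -> (r + g'' - u_t(., T; q)) / g, cf. (10.131) with alpha = 1."""
    _, ut = forward(q, r, h, gamma, T, steps)
    return [(r[i] + lap_g[i] - ut[i]) / g[i] for i in range(len(q))]

n, gamma, T, steps = 41, 1.0, 2.0, 200           # hypothetical discretisation
h = 1.0 / (n - 1)
x = [i * h for i in range(n)]
r = [1.0] * n
q_act = [1.0 + math.sin(math.pi * xi) for xi in x]  # hypothetical test potential
g, _ = forward(q_act, r, h, gamma, T, steps)        # synthetic noise-free data
lap_g = laplacian(g, h, gamma)

q = [0.0] * n                                       # initial guess q_0 = 0
errors = []
for _ in range(10):
    q = fixed_point_map(q, g, lap_g, r, h, gamma, T, steps)
    errors.append(max(abs(q[i] - q_act[i]) for i in range(n)))
print(["%.1e" % e for e in errors])
```

With noisy data the second difference of $g$ would amplify the noise, which is why the text refers to Section 8.2.3.2 for data smoothing; this sketch deliberately avoids that issue.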
If $T < 1$ (and remember there is a rescaling incorporated into this), then one sees a monotone rate of convergence in $\alpha$ (a smaller $\alpha$ gives a better rate) in the numerical results. For a fixed value of $\alpha$, the larger the chosen $T$ the better the rate. This is consistent with the contraction constant in Theorem 10.16.

When working in one space dimension, one is able to take advantage of specific properties of the associated eigenvalues and eigenfunctions. One of these advantages includes the ability to transform equations with unknown coefficients not in potential form to an equivalent potential $Q(x)$ through the Liouville transform (see (9.56), (9.57)), then use the methods of this section for its recovery, and then use the inverse Liouville transform to recover the original coefficient.

However, the above suggestion must be taken with the following in mind. As Remark 9.2 points out, conversion to spectral sequences has a limitation under any noise conditions on the data. In particular, converting to a potential is particularly susceptible to this ill-conditioning as the information content of the unknown potential lies very low in the spectral sequence. This is not true for the case of a term $a(x)$ in the eigenvalue problem $-(a(x) u_n')' = \lambda_n u_n$ as the leading term in the spectral sequence incorporates information about $a(x)$. This means in particular if one is trying to recover $a(x)$ in
\begin{align*}
\partial_t^\alpha u - (a(x) u_x)_x &= r(x), && 0 < x < 1,\ t > 0, \\
-u_x(0,t) + \gamma_0 u(0,t) = u_x(1,t) + \gamma_1 u(1,t) &= 0, && t > 0, \tag{10.133} \\
u(x,0) &= u_0(x), && 0 < x < 1,
\end{align*}
from values of $u$ at a later time $T$, then conversion using a Liouville transform is a very poor idea. Instead, one should modify the mapping $\mathbb{T}$ in equation (10.131) to
\[ \mathbb{T}(a)(x) = \frac{\int_0^x \bigl( \partial_t^\alpha u(s,T;a) - f(s) \bigr)\, ds}{g'(x)} \tag{10.134} \]
and choose the boundary/initial parameters $\gamma_0$, $\gamma_1$, $u_0$ to avoid solutions with maxima or minima in the interior. Again we refer to Section 10.5.2, where a composite iteration scheme is used to recover both $a(x)$ and a potential $q(x)$. It will be noted there that the speed of convergence of the iterates $\{a_n\}$ is far greater than that of the $\{q_n\}$, which is commensurate with Remark 9.2 and the above parallel situation for the case of $a(x)$.

On the other hand, this transformation approach prohibits the recovery of more than one coefficient occurring in the spatial operator $L$. This does not mean of course that all methods to recover multiple coefficients are
doomed to failure but that a different pathway is required. Of course, more data will be needed and this might occur through two simulations with different initial/boundary/forcing conditions in each case where the value of the solution at $t = T$ is being measured. See Section 10.5.2 for examples of this situation.

10.5.2. Identification of two coefficients.

The fixed point scheme and, to some extent, also its analysis can be extended to the identification of two space dependent coefficients $a = a(x)$, $q = q(x)$ in (10.112). While the contractivity proof requires a restriction to one space dimension again (see [184]), the principle of projecting the PDE to the observation manifold $\Sigma = \Omega \times \{T\}$ and inserting the measured data also works in higher space dimensions as follows. Recall that we seek to determine both $a(x)$, $q(x)$ in
\begin{align*}
\partial_t^\alpha u - \nabla\cdot(a\nabla u) + q f(u) &= r_u , && t \in (0,T) , \\
\partial_t^\alpha v - \nabla\cdot(a\nabla v) + q f(v) &= r_v , && t \in (0,T) , \tag{10.135}
\end{align*}
with prescribed impedance boundary and initial conditions
\begin{align*}
a\,\partial_\nu u + \gamma_u u &= b_u(x,t), & a\,\partial_\nu v + \gamma_v v &= b_v(x,t), && x \in \partial\Omega, \\
u(x,0) &= u_0 , & v(x,0) &= v_0 , && x \in \Omega, \tag{10.136}
\end{align*}
and known forcing functions $r_u = r_u(x,t,u)$, $r_v = r_v(x,t,v)$, $\gamma_u$, $\gamma_v$, $b_u$, $b_v$, and reaction term $f(u)$ from observations
\[ g_u(x) := u(x,T) , \qquad g_v(x) := v(x,T) , \qquad x \in \Omega . \]
For a given $a$, $q$ we can evaluate (10.135) on the manifold $\Omega \times \{T\}$ to obtain
\begin{align*}
-\nabla\cdot(a\nabla g_u) + q f(g_u) &= r_u(T,g_u) - \partial_t^\alpha u(x,T;a,q), \\
-\nabla\cdot(a\nabla g_v) + q f(g_v) &= r_v(T,g_v) - \partial_t^\alpha v(x,T;a,q) . \tag{10.137}
\end{align*}
The system (10.137) can be written as
\[ \mathbb{M} \begin{pmatrix} a \\ q \end{pmatrix} = \mathbb{F}(a,q), \tag{10.138} \]
where $\mathbb{M}$ is a linear operator depending only upon the data functions $g_u$, $g_v$ and their derivatives, and $\mathbb{F}$ is a nonlinear operator on $(a,q)$. The analysis strategy is to provide conditions under which $\mathbb{M}$ is invertible and the combined nonlinear operator $\mathbb{T}(a,q) := \mathbb{M}^{-1}\mathbb{F}(a,q)$ is contractive. To establish an iterative reconstruction scheme from (10.137), we therefore let $\mathbb{T}(a,q) = (a^+, q^+)$, where
\begin{align*}
-\nabla\cdot(a^+\nabla g_u) + q^+ f(g_u) &= r_u(T,g_u) - \partial_t^\alpha u(\cdot,T;a,q), \\
-\nabla\cdot(a^+\nabla g_v) + q^+ f(g_v) &= r_v(T,g_v) - \partial_t^\alpha v(\cdot,T;a,q), \tag{10.139}
\end{align*}
and $u(x,t) := u(x,t;a,q)$, $v(x,t) := v(x,t;a,q)$ solve equation (10.135) with initial and boundary conditions (10.136).

In the case $f(u) = u$ there is an obvious approach to the above: multiply the first equation in (10.137) by $g_v$, the second by $g_u$, and subtract, thereby eliminating $q$ from the left-hand side. This gives
\[ \nabla\cdot(aW) = a\,\nabla\cdot W + \nabla a\cdot W = \phi, \tag{10.140} \]
where
\[ W = g_v\nabla g_u - g_u\nabla g_v , \qquad \phi = g_v\bigl(\partial_t^\alpha u(T) - r_u(T,g_u)\bigr) - g_u\bigl(\partial_t^\alpha v(T) - r_v(T,g_v)\bigr) . \tag{10.141} \]
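For the reader who wants the elimination step spelled out, the following worked computation is a routine verification under the stated assumption $f(u) = u$, using only the product rule; it multiplies the first equation of (10.137) by $g_v$, the second by $g_u$, and subtracts.

```latex
\begin{align*}
 g_v\bigl[-\nabla\cdot(a\nabla g_u) + q\,g_u\bigr]
   - g_u\bigl[-\nabla\cdot(a\nabla g_v) + q\,g_v\bigr]
 &= -\bigl[\nabla\cdot(a\,g_v\nabla g_u) - a\nabla g_v\cdot\nabla g_u\bigr]
    + \bigl[\nabla\cdot(a\,g_u\nabla g_v) - a\nabla g_u\cdot\nabla g_v\bigr] \\
 &= -\nabla\cdot\bigl(a\,(g_v\nabla g_u - g_u\nabla g_v)\bigr)
  = -\nabla\cdot(aW), \\
 g_v\bigl[r_u(T,g_u) - \partial_t^\alpha u(T)\bigr]
   - g_u\bigl[r_v(T,g_v) - \partial_t^\alpha v(T)\bigr]
 &= -\phi .
\end{align*}
```

The terms containing $q$ cancel on the left, and equating the two lines gives $\nabla\cdot(aW) = \phi$.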
The value of $aW$ on $\partial\Omega$ is known from the boundary conditions imposed on the system so that (10.140) gives an update for $a(x)$ in terms of $W$ and $\phi$ by solving the elementary transport equation (10.140) for $a$. This shows that the scheme (10.139) can be inverted for $a$ and then by substitution, also for $q$ provided that $W$ does not vanish in any subset of $\Omega$ with nonzero measure. If now, for example, we are in one space dimension and $u$ and $v$ share the same boundary conditions at the left endpoint, then
\[ a(x) W(x) = \int_0^x \phi(s)\, ds, \tag{10.142} \]
where we used the fact that $W(0) = 0$. The above will be the basis of one implementation of our reconstruction process below; namely the eliminate $q$ version. This also works the other way around so an alternative is to multiply the first equation by $\nabla g_v$ and the second by $\nabla g_u$ and then subtract giving the eliminate $a$ version (more precisely it is eliminate $\nabla a$ only),
\[ q W(x) = a\nabla W + \psi, \qquad \psi = \bigl(r_u(T,g_u) - \partial_t^\alpha u(T)\bigr)\nabla g_v - \bigl(r_v(T,g_v) - \partial_t^\alpha v(T)\bigr)\nabla g_u , \tag{10.143} \]
where $\psi$ and $a$ have already been computed from the previous iteration.

There is a seeming symmetry between this uncoupling of $a$ and $q$ but this first impression could be misleading. In (10.142) we obtain an updated $a$ directly from previous iteration values of both $a$ and $q$, and this involves only the function $W$. The inversion of $a$ will go smoothly if $W$ does not vanish in $\Omega$ and even zero sets of vanishing measure can be handled, as we will see in the numerical reconstructions below. In (10.143) we obtain an updated $q$ that depends not only on $W$ but also on $\nabla W$ and $a$. As we shall see, this makes the uncoupling of $q$ less stable than the other way around. However, the important point is that the above shows that the linear operator $\mathbb{M}$ can be inverted by eliminating either of $q$ or $a$. Given this, we could also use (10.138) directly by inverting the linear operator $\mathbb{M}$ and solving simultaneously for $a$ and $q$ after representing these functions in a basis set. An implementation of this approach will also be shown below and in general turns out to be the most effective approach, more clearly avoiding some of the difficulties noted above by using a least squares setting.

10.5.2.1. Reconstructions of q and a by fixed point schemes.

We will show the results of numerical experiments with the three versions of the basic iterative scheme: compute $a$, $q$ in parallel; eliminate $q$ and recover $a$; eliminate $a$ and recover $q$. In the reconstructions to be shown, we used the values $\alpha = 1$, $T = 0.5$, and a noise level (uniformly distributed) of 1% as a basis for discussion. At the end of the section we will indicate the effective dependence of the reconstruction process on these quantities. The reconstructions we show will be in $\mathbb{R}$ as the graphical illustration is then more transparent and there is little to be gained technically or visually from higher dimensions. Dependence on the choice of the final time $T$ will be considered below and reconstructions of $q$ with $\alpha < 1$ will be shown in Section 10.5.3.4. We take the following actual functions to be reconstructed:
\[ a_{act}(x) = 1 + 4x^2(1-x) + 0.5\sin(4\pi x), \qquad q_{act}(x) = 8x\, e^{-3x} . \]
As data we took two differing initial values, $u_0(x)$ and $v_0(x)$, and as boundary conditions we used (nonhomogeneous) Dirichlet at the left endpoint and Neumann at the right; these are typically different for each of $u$ and $v$. One such data set is shown in Figure 10.11.
Figure 10.11. Initial values $u_0(x)$, $v_0(x)$, and data $g_u = u(x,T)$, $g_v = v(x,T)$
We now compare the performance of the three schemes. In the parallel scheme we need to represent $q$ and $a$ by means of an appropriate basis set. The linear operator $\mathbb{M}$ in equation (10.138) then takes the matrix form
\[ M = \begin{pmatrix} A_1 & Q_1 \\ A_2 & Q_2 \end{pmatrix}, \]
where $A_1$ denotes the representation of $a(x)$ using the values $g_u$, and $Q_2$ denotes the representation of $q(x)$ using the values $g_v$ (cf. (10.137)). Since we make no constraints on the form of the unknown functions other than sufficient regularity, we do not choose a basis with built-in restrictions as would be obtained from an eigenfunction expansion. Instead we used a radial basis of shifted Gaussian functions $b_j(x) := e^{-(x-x_j)^2/\sigma}$ centred at nodal points $\{x_j\}$ and with width specified by the parameter $\sigma$. The sequential schemes are based on eliminating one of $a(x)$ or $q(x)$ and having $\mathbb{M}$ represented through pointwise values of the functions $W$ and $W'$. The singular values of the component matrices $A_1$ and $Q_1$ are shown in the leftmost figure in Figure 10.12 and the functions $W$ and $W'$ in the rightmost figure.
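To make the shifted Gaussian basis $b_j(x) = e^{-(x-x_j)^2/\sigma}$ concrete, the sketch below represents the test coefficient $a_{act}$ from this section in such a basis. It is a simplified illustration only: it uses plain interpolation at the nodes rather than the least squares setting mentioned in the text, and the node spacing, the width $\sigma$ and all function names are hypothetical choices.

```python
import math

def gaussian_basis(x, nodes, sigma):
    """Row of basis values b_j(x) = exp(-(x - x_j)^2 / sigma)."""
    return [math.exp(-((x - xj) ** 2) / sigma) for xj in nodes]

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting on copies of A and b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def a_act(x):
    """Test coefficient a_act used in the reconstructions of this section."""
    return 1 + 4 * x**2 * (1 - x) + 0.5 * math.sin(4 * math.pi * x)

nodes = [j / 20 for j in range(21)]   # hypothetical nodal points x_j
sigma = 0.0025                        # hypothetical width parameter
B = [gaussian_basis(xi, nodes, sigma) for xi in nodes]
coeffs = solve_dense(B, [a_act(xi) for xi in nodes])

def a_fit(x):
    """Evaluate the basis expansion sum_j c_j b_j(x)."""
    return sum(c * b for c, b in zip(coeffs, gaussian_basis(x, nodes, sigma)))

nodal_err = max(abs(a_fit(xi) - a_act(xi)) for xi in nodes)
print(f"max nodal error: {nodal_err:.2e}")
```

With this spacing and width the collocation matrix is strictly diagonally dominant, so the solve is well conditioned; wider Gaussians give smoother expansions but a worse conditioned system.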
Figure 10.12. Left: Singular values of the matrices $A$, $Q$; Right: The functions $W$ and $W'$
Even before seeing the resulting reconstructions, it is clear that the far superior conditioning of the $A$ matrix over that of $Q$ (a factor of over 10 in the larger singular values and of 100 in the lower ones) is going to significantly favour the reconstruction of the $a(x)$ coefficient. This is also borne out from the rightmost figure here: while the values of the function $W$ are modest, there is a much larger range in the values of $W'$. Note that from equation (10.142) the reconstruction of $a(x)$ requires only $W$, but that of (10.143) which updates $q(x)$ requires both $W$ and $W'$. Indeed this turns out to be the situation in both cases.

Using the parallel scheme, reconstructions of $a$ and $q$ under 1% random uniform noise are shown in Figure 10.13. The initial approximations were $a(x) = 1$ and $q(x) = 0$. The first iteration resulted in an already near perfect reconstruction of the $a(x)$ coefficient but that of $q(x)$ lagged significantly behind, and in the end the error in the data measurements was predominantly in this coefficient.

Figure 10.13. Reconstructions of $a$ and $q$ using parallel algorithm

In using the eliminate $q$ and update $a$ strategy, some care must be taken here as the function $W(x)$ has two zeros: an interior one around $\tilde x = 0.17$ and at the endpoint $x = 1$. Thus a straightforward division by $W$ to recover $a$ from (10.142) is not feasible. In theory the right-hand side term $\Phi(x) = \int_0^x \phi(s)\, ds$ should also vanish at such points including the interior one $\tilde x$, but with data noise this is not going to be the case and some interpolation type procedure for numerically implementing this zero-by-zero division needs to be implemented; see [182] for details. A new direct solve then is used to recover the next iteration of $q(x)$. As Figure 10.14 shows, the results are comparable to the previous reconstruction. It is worth observing from Figures 10.11 and 10.14 that those regions where the reconstructions of $q$ are poorest coincide with regions of smaller values of $g_u(x) = u(x,T)$ and $g_v(x) = v(x,T)$, namely near the left-hand point of the interval. This is in keeping with the fact that both $W$ and $W'$ are smaller in magnitude at these points.
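The update (10.142) and the zero-by-zero division it entails can be illustrated on a closed-form test case. The sketch below is an assumption-laden toy, not the procedure of [182]: the data $g_u = 1 + x + x^2$, $g_v = 1 + x$ and the coefficient $a = 1 + x^2$ are hypothetical, $\phi$ is computed analytically as $(aW)'$ rather than from a forward solve, and the zero of $W$ at $x = 0$ is handled by the limit $\phi/W'$ (l'Hôpital), which is one simple possibility in the noise-free case.

```python
def cumulative_trapezoid(values, h):
    """Running trapezoidal integral from the left endpoint."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out

def recover_a(phi, W, dW, h, tol=1e-12):
    """a = (int_0^x phi ds) / W, replaced by phi / W' where W vanishes."""
    Phi = cumulative_trapezoid(phi, h)
    return [Phi[i] / W[i] if abs(W[i]) > tol else phi[i] / dW[i]
            for i in range(len(W))]

n = 201
h = 1.0 / (n - 1)
x = [i * h for i in range(n)]

# Hypothetical test case: g_u = 1 + x + x^2, g_v = 1 + x, a = 1 + x^2.
W = [2 * s + s**2 for s in x]                        # g_v g_u' - g_u g_v'
dW = [2 + 2 * s for s in x]                          # W'
phi = [2 + 2 * s + 6 * s**2 + 4 * s**3 for s in x]   # (a W)'

a_rec = recover_a(phi, W, dW, h)
err = max(abs(a_rec[i] - (1 + x[i] ** 2)) for i in range(n))
print(f"max error in recovered a: {err:.2e}")
```

Under data noise $\Phi$ no longer vanishes exactly at the zeros of $W$, so a pointwise rule of this kind is not sufficient there; that is the situation the interpolation procedure referred to in the text addresses.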
Figure 10.14. Reconstructions of $a$ and $q$ using eliminate $a$, recover $q$ algorithm
We do not show a reconstruction for the version based on the update equation (10.143). While a somewhat satisfactory reconstruction of the coefficient $a(x)$ was obtained, the scheme failed to converge for the coefficient $q(x)$. This fact alone shows how dominant a role the diffusion coefficient plays in the process at the expense of the much weaker potential term. It is only under significantly less data noise that an effective reconstruction of the latter was possible.

The relative error history of all three versions is shown in Figure 10.15. Note the very different scales between the figures especially between the $a(x)$ and $q(x)$ norms, but also between the three schemes. These clearly show that not all of the three methods are equally effective and also emphasise the ability to recover $a(x)$ much better than $q(x)$.
Figure 10.15. Relative errors of the three schemes for $a(x)$ and $q(x)$: $\bullet$ $\|\{a,q\}_n - \{a,q\}_{act}\|_{L^\infty} / \|\{a,q\}_{act}\|_{L^\infty}$; $\ast$ $\|\{a,q\}_n - \{a,q\}_{act}\|_{L^2} / \|\{a,q\}_{act}\|_{L^2}$. Left: Parallel scheme; Centre: Eliminate $q$; Right: Eliminate $a$.
The difference in the iteration counts between the eliminate q scheme, and the other two in the figures is due to the use of a discrepancy principle as a stopping rule, which terminates the iteration as soon as the residual drops below the noise level. Now we study the influence of changing the final time. The above reconstructions were set with the final time taken to be T = 0.5. The question is how the schemes would progress for different T . In particular, we are interested in this as one should expect the contraction constant (if indeed there is a contraction) to be smaller with increasing T .
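The discrepancy principle used as a stopping rule above can be shown in isolation on a toy problem. The sketch below is a generic illustration and makes no claim about the schemes of this section: it applies a Landweber iteration to a hypothetical $2\times 2$ linear system with a fixed perturbation of norm $\delta$, and stops as soon as the residual drops below $\tau\delta$; the matrix, the step size $\omega$ and the safety factor $\tau$ are assumptions chosen for the example.

```python
import math

def residual(A, x, y):
    """Residual vector A x - y."""
    return [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def landweber_discrepancy(A, y_delta, delta, tau=1.5, omega=0.5, max_iter=500):
    """Landweber iteration stopped once ||A x - y_delta|| <= tau * delta."""
    x = [0.0] * len(A[0])
    for k in range(max_iter):
        res = residual(A, x, y_delta)
        if norm(res) <= tau * delta:
            return x, k
        # Gradient step x <- x - omega * A^T (A x - y_delta).
        x = [x[j] - omega * sum(A[i][j] * res[i] for i in range(len(A)))
             for j in range(len(x))]
    return x, max_iter

A = [[1.0, 0.5], [0.5, 1.0]]           # hypothetical forward operator
x_true = [1.0, 2.0]
delta = 0.01                           # noise level
noise = [delta / math.sqrt(2), -delta / math.sqrt(2)]   # ||noise|| = delta
y_delta = [sum(A[i][j] * x_true[j] for j in range(2)) + noise[i] for i in range(2)]

x_rec, stop_index = landweber_discrepancy(A, y_delta, delta)
final_residual = norm(residual(A, x_rec, y_delta))
print(stop_index, final_residual, x_rec)
```

Because the stopping index depends on the noise level and on how fast each scheme reduces the residual, different schemes run on the same data will in general terminate after different numbers of iterations.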
The answer is much as expected: the contraction constant varies with $T$, and more generally, so does the strength of both of the nonlinear contribution terms $u_t(x,T;a,q)$ and $v_t(x,T;a,q)$. We would thus expect more iterations to be required as $T$ was reduced, and indeed this is the case. As an illustration of this effect, Figure 10.16 shows the first iterate $a_1(x)$ for the values $T = 0.1$ and $T = 0.05$. Recall from Figure 10.13 that even the first iteration $a_1$ was sufficiently close to the actual when $T = 0.5$ so that it was barely distinguishable from the actual. We do not show the reconstructions of $q(x)$ here, as in neither the case of $T = 0.1$ nor of $T = 0.05$ did these converge. Indeed for these smaller values of $T$ the iterations oscillated widely without any sense of convergence, it being quite clear that we were in a region of nonconvergence.

Figure 10.16. Variation of the first iterate $a_1$ with $T$ (curves for $T = 0.1$ and $T = 0.05$)
The variation of the schemes with noise level is now predictable. With an error much in excess of 1%, the schemes degrade rapidly. Even for errors less than this value, the schemes based on pointwise evaluation and requiring $W'$ did quite poorly, especially for smaller values of the time measurement $T$ and the nonparabolic case.

For a discussion on the possibility (or impossibility) of reconstructing $a$ and $q$ from measurements of a single state at two different times (rather than measurements of two states generated by different excitations), we refer to [184].

10.5.3. Newton's and Halley's methods.

We return to identification of the potential $q$ alone in
\begin{align*}
\partial_t^\alpha u - \Delta u + q u &= 0, && (x,t) \in \Omega \times (0,T), \\
\partial_\nu u(x,t) + \gamma(x) u(x,t) &= 0, && (x,t) \in \partial\Omega \times (0,T), \tag{10.144} \\
u(x,0) &= u_0 , && x \in \Omega,
\end{align*}
from final time data (10.109).
The inverse problem under consideration can be written as an operator equation $F(q) = g$, where
\[ F : X \to Y, \qquad q \mapsto u(\cdot,T), \quad\text{where } u = G(q) \text{ solves (10.144).} \tag{10.145} \]
Again, we would like to work with the space $X = L^2(\Omega)$. As a domain of $F$ we therefore use the same ball in $L^2(\Omega)$ that we had already used for defining $G$; cf. (10.116). The plan of this section, first of all, is to formulate the inverse problem with $L^2(\Omega)$ as parameter space and $H^2(\Omega)$ as data space, to prove that it is well-posed in this setting, and therefore invoke the results from Section 8.2.3 to conclude convergence of Newton's and Halley's method for its solution. First, we do so for the linear model (10.144). In Section 10.5.3.3 we will discuss extension of the results to the semilinear equation (10.113) under appropriate assumptions on the (given) function $f(u)$.

It follows from Theorem 10.14 that the forward operator $F$ is well-defined, i.e., for any $q$ in an $L^2$ neighborhood of some initial guess, the linear (sub)diffusion equation has a unique solution $u$ that is regular enough to allow for final values $u(\cdot,T) \in H^2(\Omega)$.

Corollary 10.1. The operator $F$ defined by (10.145) is well-defined as a mapping from $D(F) \subseteq L^2(\Omega)$ into $H^2(\Omega)$ (cf. (10.116)) for any $u_0 \in L^2(\Omega)$.

Note that $D(F)$ has nonempty interior in $L^2(\Omega)$, as required for the analysis of Newton's and Halley's method; cf. Section 8.2.3. Based on a linearised injectivity result (Theorem 10.18 below), we next establish local well-posedness of the linearised inverse problem in this $L^2(\Omega)$–$H^2(\Omega)$ setting, i.e., boundedness of the inverse $\|F'(q)^{-1}\|_{H^2(\Omega)\to L^2(\Omega)}$; see Section 10.5.3.1. On one hand this implies local well-posedness of the nonlinear inverse problem $F(q) = g$, due to the inverse function theorem. On the other hand, together with some further smoothness properties of $F'$ and $F''$ it will allow us to invoke results from Section 8.2.3 on Newton's and Halley's methods to conclude their convergence including rates, which we will do in Section 10.5.3.2.
Since a realistic setting requires a norm on the data space that does not contain derivatives, we refer to Section 8.2.3.2 for a discussion of ways of filtering given data $g^\delta \in L^2(\Omega)$ such that the filtered version $\tilde g^\delta$ is in $H^2(\Omega)$; the propagation of noise through the algorithms is also investigated there.

Throughout this section we will assume that the space dimension is $d \in \{1, 2, 3\}$, since we will repeatedly make use of the fact that the product of an $L^2(\Omega)$ and an $H^2(\Omega)$ function is again in $L^2(\Omega)$, which is not valid in space dimensions four and higher; see Section A.2. Moreover, for simplicity of notation, we will drop the identifier $\Omega$ in the norms on $L^p(\Omega)$, $H^s(\Omega)$, and abbreviate the norm in the Bochner space $L^2(0, T; L^p(\Omega))$ by $\|\cdot\|_{L^2(L^p)}$.

In the proofs below, we will require some additional estimates on Mittag-Leffler functions.

Lemma 10.14. For any $\alpha \in (0, 1)$, $\lambda > 0$, the estimate
\[
\int_0^T E_{\alpha,1}(-\lambda\tau^\alpha)^2 \, d\tau \le \frac{T^{1-\alpha}\,\Gamma(\alpha+1)}{\lambda} \tag{10.146}
\]
holds. Moreover, for any $\alpha \in (\tfrac12, 1]$ and any $\theta \in (0, 2 - \tfrac1\alpha)$, there exists $\tilde C_\alpha$ such that for all $\lambda > 0$,
\[
\int_0^T t^{2(\alpha-1)} E_{\alpha,\alpha}(-\lambda t^\alpha)^2 \, dt \le \tilde C_\alpha^2 \lambda^{-\theta}. \tag{10.147}
\]
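Proof. Using Theorem 3.25 we get
\[
\begin{aligned}
\int_0^T \lambda E_{\alpha,1}(-\lambda\tau^\alpha)^2 \, d\tau
&\le \int_0^T \frac{\lambda}{(1 + \Gamma(\alpha+1)^{-1}\lambda\tau^\alpha)^2} \, d\tau
= \int_0^1 \frac{\lambda T}{(1 + \Gamma(\alpha+1)^{-1}\lambda T^\alpha \sigma^\alpha)^2} \, d\sigma \\
&\le \int_0^1 \frac{\lambda T}{(1 + \Gamma(\alpha+1)^{-1}\lambda T^\alpha \sigma)^2} \, d\sigma
= T^{1-\alpha}\Gamma(\alpha+1)\Bigl(1 - \frac{1}{1 + \Gamma(\alpha+1)^{-1}\lambda T^\alpha}\Bigr) \\
&\le T^{1-\alpha}\Gamma(\alpha+1),
\end{aligned}
\]
i.e., (10.146). To obtain (10.147), we use the abbreviation $e(t) := t^{\alpha-1}E_{\alpha,\alpha}(-\lambda t^\alpha) \ge 0$ and Hölder's inequality with $\theta \in (0, 2 - \tfrac1\alpha)$,
\[
\int_0^T e(t)^2 \, dt = \int_0^T e(t)^\theta e(t)^{2-\theta} \, dt
\le \Bigl(\int_0^T e(t) \, dt\Bigr)^{\theta} \Bigl(\int_0^T e(t)^{\frac{2-\theta}{1-\theta}} \, dt\Bigr)^{1-\theta},
\]
where
\[
0 \le \int_0^T e(t) \, dt = \frac{1}{\lambda}\bigl(1 - E_{\alpha,1}(-\lambda T^\alpha)\bigr) \le \frac{1}{\lambda}.
\]
Moreover, for $\alpha \in (0, 1]$ the mapping $x \mapsto E_{\alpha,\alpha}(-x)$ is (completely) monotone for $x \ge 0$ (cf. Theorem 3.20), which implies $|E_{\alpha,\alpha}(-x)| \le E_{\alpha,\alpha}(0) = \frac{1}{\Gamma(\alpha)} \le 1$, hence
\[
\int_0^T e(t)^{\frac{2-\theta}{1-\theta}} \, dt \le \int_0^T t^{(\alpha-1)\frac{2-\theta}{1-\theta}} \, dt < \infty.
\]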
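The finiteness of the last integral in the proof is what restricts the admissible range of $\theta$. As a short worked check (a sketch that uses only the exponents appearing in the proof and the fact that $1 - \theta > 0$), the integrand $t^{(\alpha-1)\frac{2-\theta}{1-\theta}}$ is integrable at $t = 0$ precisely when its exponent exceeds $-1$:

```latex
\[
(1-\alpha)\,\frac{2-\theta}{1-\theta} < 1
\;\Longleftrightarrow\;
(1-\alpha)(2-\theta) < 1-\theta
\;\Longleftrightarrow\;
\alpha\theta < 2\alpha - 1
\;\Longleftrightarrow\;
\theta < 2 - \frac{1}{\alpha}.
\]
```

In particular, the range $(0, 2 - \tfrac1\alpha)$ is nonempty only for $\alpha > \tfrac12$, which matches the assumption of the lemma. Combining this with $\int_0^T e(t)\,dt \le \lambda^{-1}$ gives the factor $\lambda^{-\theta}$ in (10.147).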
10.5.3.1. Well-posedness of the linearised inverse problem in an $L^2(\Omega)$–$H^2(\Omega)$ setting. The Fréchet derivative of the forward operator defined by (10.145) is given by $F'(q)p = v(\cdot, T) = \mathrm{tr}_{\Omega\times\{T\}} G'(q)p$, where $v = G'(q)p$ solves
\[
\partial_t^\alpha v - \Delta v + qv = -p\,u, \qquad (x, t) \in \Omega \times (0, T), \tag{10.148}
\]
with homogeneous initial and impedance boundary conditions and a solution $u$ to (10.144). We first of all prove injectivity of $F'(q)$ for any $q$ such that $\frac{1}{F(q)} < \infty$.

Theorem 10.18. The linear operator $F'(q) : L^2(\Omega) \to H^2(\Omega)$ is injective.

Proof. We conclude contractivity of the fixed point mapping
\[
\mathbb{T} : p \mapsto -\frac{(\partial_t^\alpha G'(q)p)(T)}{F(q)} \tag{10.149}
\]
analogously to the proof of Proposition 10.1 with the replacements $\hat u \leftrightarrow G'(q)p$, $\delta q \leftrightarrow p$, $\tilde u \leftrightarrow G(q)$. From $v(T) = F'(q)p = 0$, which, when inserted into (10.148), implies $(\partial_t^\alpha G'(q)p)(T) = -pF(q)$, we conclude that $p = \mathbb{T}(p)$. Hence by contractivity of the linear fixed point operator $\mathbb{T}$, we have $\|p\| \le c\|p\|$ for some $c < 1$, that is, $p = 0$.

Theorem 10.19. For any $u_0 \in H^2(\Omega)$ such that $u(\cdot, T)$ defined by (10.144) is bounded away from zero and for any $q \in D(F)$, the linear operator $F'(q)^{-1} : H^2(\Omega) \to L^2(\Omega)$ is well-defined and bounded.

Proof. The linearised inverse problem of recovering $p$ in (10.148) from $j = v(\cdot, T) = (G'(q)p)(\cdot, T)$ can also be rewritten as a fixed point equation with $\mathbb{T}$ as in (10.149),
\[
p = \mathbb{T}p - \frac{-\Delta j + qj}{F(q)} \tag{10.150}
\]
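in $L^2(\Omega)$. The main step of the proof is to show that $\mathbb{T}$ is compact. This follows from the fact that $p \mapsto (\partial_t^\alpha G'(q)p)(T)$ even maps into $H^{2\varepsilon}(\Omega)$ for some $\varepsilon > 0$, together with compactness of the embedding $H^{2\varepsilon}(\Omega) \to L^2(\Omega)$ (note that we have assumed boundedness of $\Omega$). To see this, we need to bound the $H^{2\varepsilon}(\Omega)$ norm of $(\partial_t^\alpha G'(q)p)(T) = \partial_t^\alpha v(T)$ in terms of $\|p\|_{L^2}$. We use the identity
\[
\begin{aligned}
\partial_t^\alpha v(x, t) &= -p(x)u(x, t) - L_q v(x, t) \\
&= \sum_{\ell=1}^\infty \Bigl( -\langle p\,u(\cdot, t), \varphi_\ell^q \rangle + \lambda_\ell^q \int_0^t \tau^{\alpha-1} E_{\alpha,\alpha}(-\lambda_\ell^q \tau^\alpha) \langle p\,u(\cdot, t-\tau), \varphi_\ell^q \rangle \, d\tau \Bigr) \varphi_\ell^q(x) \\
&= -\sum_{\ell=1}^\infty E_{\alpha,1}(-\lambda_\ell^q t^\alpha) \langle p\,u_0, \varphi_\ell^q \rangle \varphi_\ell^q(x)
 - \sum_{\ell=1}^\infty \int_0^t E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha) \langle p\,u_t(\cdot, t-\tau), \varphi_\ell^q \rangle \, d\tau \, \varphi_\ell^q(x),
\end{aligned}
\tag{10.151}
\]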
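where we have applied integration by parts and the fact that $E_{\alpha,1}(0) = 1$ as well as
\[
\frac{d}{d\tau} E_{\alpha,1}(-\lambda_n \tau^\alpha) = -\lambda_n \tau^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n \tau^\alpha); \tag{10.152}
\]
cf. (3.34). Thus by Theorem 3.25 and (10.146), we get for an arbitrary $\varepsilon \in (0, 1/4)$ that
\[
\begin{aligned}
\|\partial_t^\alpha v(\cdot, T)\|_{\dot H_q^{2\varepsilon}}
&= \Bigl( \sum_{\ell=1}^\infty \Bigl( (\lambda_\ell^q)^\varepsilon \Bigl[ E_{\alpha,1}(-\lambda_\ell^q T^\alpha) \langle p\,u_0, \varphi_\ell^q \rangle
 + \int_0^T E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha) \langle p\,u_t(\cdot, T-\tau), \varphi_\ell^q \rangle \, d\tau \Bigr] \Bigr)^2 \Bigr)^{1/2} \\
&\le C_\alpha \Bigl( \|(L_q)^{-1+\varepsilon}[p\,u_0]\|_{L^2} + \|(L_q)^{-1/2+\varepsilon}[p\,u_t]\|_{L^2(L^2)} \Bigr).
\end{aligned}
\tag{10.153}
\]
The two terms on the right-hand side of (10.153) can be estimated by
\[
\begin{aligned}
\|(L_q)^{-1+\varepsilon}[p\,u_0]\|_{L^2}
&= \sup_{\tilde\psi \in L^2(\Omega)\setminus\{0\}} \frac{\int_\Omega p\,u_0\,(L_q)^{-1+\varepsilon}\tilde\psi \, dx}{\|\tilde\psi\|_{L^2}}
\le \|p\|_{L^2} \sup_{\tilde\psi \in L^2(\Omega)\setminus\{0\}} \frac{\|u_0\,(L_q)^{-1+\varepsilon}\tilde\psi\|_{L^2}}{\|\tilde\psi\|_{L^2}} \\
&= \|p\|_{L^2} \sup_{\psi \in \dot H_q^{2-2\varepsilon}(\Omega)\setminus\{0\}} \frac{\|u_0\,\psi\|_{L^2}}{\|(L_q)^{1-\varepsilon}\psi\|_{L^2}}
\le C^\Omega_{\dot H_q^{2-2\varepsilon}\to L^\infty} \|p\|_{L^2} \|u_0\|_{L^2},
\end{aligned}
\tag{10.154}
\]
and analogously, using Hölder's inequality,
\[
\|(L_q)^{-1/2+\varepsilon}[p\,u_t(\cdot, t)]\|_{L^2} \le C^\Omega_{\dot H_q^{1-2\varepsilon}\to L^{r^*}} \|p\|_{L^2} \|u_t(\cdot, t)\|_{L^{r^{**}}(\Omega)} \tag{10.155}
\]
for $r^{**} = \frac{2r^*}{r^*-2}$ and $r^* = \infty$ in case $d = 1$, $r^* < \infty$ in case $d = 2$, and $r^* \le \frac{2d}{d-2+4\varepsilon}$ in case $d \ge 3$. With the two estimates (10.154), (10.155) at hand and a possibly modified constant $C_\alpha$, we can continue (10.153) as
\[
\|\partial_t^\alpha v(\cdot, T)\|_{\dot H_q^{2\varepsilon}} \le C_\alpha \|p\|_{L^2} \bigl( \|u_0\|_{L^2} + \|u_t\|_{L^2(L^{r^{**}})} \bigr). \tag{10.156}
\]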
This is the desired estimate on $\partial_t^\alpha v(\cdot, T)$, but we still need to prove that the factor multiplying $\|p\|_{L^2}$ on the right-hand side is finite. To establish the required $L^2(0, T; L^{r^{**}}(\Omega))$ regularity of $u_t$, we just use the solution formula
\[
u(\cdot, t) = \sum_{\ell=1}^\infty E_{\alpha,1}(-\lambda_\ell^q t^\alpha) \langle u_0, \varphi_\ell^q \rangle \varphi_\ell^q(x),
\]
which we differentiate with respect to $t$ to get (cf. (10.152))
\[
u_t(\cdot, t) = -\sum_{\ell=1}^\infty \lambda_\ell^q t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_\ell^q t^\alpha) \langle u_0, \varphi_\ell^q \rangle \varphi_\ell^q(x). \tag{10.157}
\]
Hence
\[
\begin{aligned}
\|u_t\|_{L^2(L^{r^{**}})} &\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \|u_t\|_{L^2(\dot H_q^s)} \\
&= C^\Omega_{\dot H_q^s\to L^{r^{**}}} \Bigl( \sum_{\ell=1}^\infty \int_0^T t^{2(\alpha-1)} E_{\alpha,\alpha}(-\lambda_\ell^q t^\alpha)^2 \, dt \; (\lambda_\ell^q)^{s+2} \langle u_0, \varphi_\ell^q \rangle^2 \Bigr)^{1/2}
\end{aligned}
\tag{10.158}
\]
for $s > 0$ if $d = 1$ and $s \ge \frac{d}{r^*} \ge \frac{d-2+4\varepsilon}{2}$ if $d \ge 2$. The time integral can be further estimated, using (10.152) and (10.147). Thus we can continue the estimate (10.158) as
\[
\begin{aligned}
\|u_t\|_{L^2(L^{r^{**}})} &\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \tilde C_\alpha \Bigl( \sum_{\ell=1}^\infty (\lambda_\ell^q)^{s+2-\theta} \langle u_0, \varphi_\ell^q \rangle^2 \Bigr)^{1/2} \\
&\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \tilde C_\alpha \|u_0\|_{\dot H_q^{2+\sigma}},
\end{aligned}
\tag{10.159}
\]
where, according to the requirements $\varepsilon > 0$, $\theta < 2 - \frac1\alpha$ and $s \ge \frac{d-2+4\varepsilon}{2}$ if $d \ge 2$ (and $s > 0$ if $d = 1$), we can choose $\sigma = s - \theta = 0$ in case $d \in \{1, 2\}$, $\alpha > \frac12$, and in case $d = 3$, $\alpha > \frac23$.

Boundedness away from zero of $u(\cdot, T)$ implies boundedness of the multiplication operator $w \mapsto \frac{1}{u(\cdot, T)} w$ as an operator from $L^2(\Omega)$ into itself. Thus altogether we get compactness of the operator $\mathbb{T}$, which, by means of the injectivity result Theorem 10.18 and Fredholm's alternative, applied to the fixed point equation (10.150), yields invertibility of $F'(q) : L^2(\Omega) \to H^2(\Omega)$. Since the second Riesz theorem implies closedness of the range of $F'(q)$, its inverse is bounded.

Remark 10.2. Strict positivity $u(x, T) \ge \underline{u} > 0$ for all $x \in \Omega$ with a positive bound $\underline{u}$ can be concluded from boundedness away from zero of $u_0$ by the same bound, $u_0(x) \ge \underline{u} > 0$, by means of a maximum principle (e.g., [283, Theorems 6, 7 in Section 3, Chapter 3] for the case $\alpha = 1$ and Theorem 6.11 in the fractional diffusion case $\alpha < 1$), provided $q \in L^\infty(\Omega)$. Note that the sign of $q$ is of no influence here, as can be seen from the fact that the change of variables $v(x, t) = e^{ct}u(x, t)$ leaves the initial data unchanged, while transforming the equation to one with a potential $q(x) - c$, where the constant $c$ can be chosen with either sign. One can argue similarly in the subdiffusion case. However, $q \in L^\infty(\Omega)$ appears to be essential. At this point, dependence of the stability constant on $T$ or $\alpha$ comes into play: Larger $T$ or $\alpha$ closer to 1 implies that the values of $u$ at final time, by
which we divide, are smaller. This is clearly visible in the numerical results of Section 10.5.3.4; see Figure 10.17 there.

10.5.3.2. Convergence of Newton's and Halley's methods. We now consider the following derivative-based iterative reconstruction methods; cf. Section 8.2.3.
• Newton's method:
\[
q_{k+1} = q_k + F'(q_k)^{-1}(g - F(q_k)); \tag{10.160}
\]
• frozen Newton:
\[
q_{k+1} = q_k + F'(q_0)^{-1}(g - F(q_k)); \tag{10.161}
\]
• Halley's method:
\[
\begin{aligned}
q_{k+\frac12} &= q_k + F'(q_k)^{-1}(g - F(q_k)), \\
q_{k+1} &= q_k + \bigl(F'(q_k) + \tfrac12 F''(q_k)[q_{k+\frac12} - q_k, \cdot]\bigr)^{-1}(g - F(q_k));
\end{aligned}
\tag{10.162}
\]
• frozen Halley:
\[
\begin{aligned}
q_{k+\frac12} &= q_k + F'(q_0)^{-1}(g - F(q_k)), \\
q_{k+1} &= q_k + \bigl(F'(q_0) + \tfrac12 F''(q_0)[q_{k+\frac12} - q_k, \cdot]\bigr)^{-1}(g - F(q_k)).
\end{aligned}
\tag{10.163}
\]
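To make the four schemes concrete, the following is a minimal finite-dimensional sketch in Python. It is not the solver used for the reconstructions in this chapter: the forward map `F`, its Jacobian `dF`, and the second-derivative helper `d2F` are hypothetical toy choices on $\mathbb{R}^n$ (a componentwise cubic perturbation of the identity), chosen only so that $F'$ and $F''$ are available in closed form.

```python
import numpy as np

# Hypothetical toy forward map R^n -> R^n (an assumption, not the PDE map F).
def F(q):
    return q + 0.1 * q**3

def dF(q):
    # Jacobian F'(q) of the toy map.
    return np.diag(1.0 + 0.3 * q**2)

def d2F(q, h):
    # Matrix representing the linear map p -> F''(q)[h, p] for the toy map.
    return np.diag(0.6 * q * h)

def newton(g, q0, steps, frozen=False):
    """Newton (10.160) or frozen Newton (10.161)."""
    q = q0.copy()
    J0 = dF(q0)
    for _ in range(steps):
        J = J0 if frozen else dF(q)
        q = q + np.linalg.solve(J, g - F(q))
    return q

def halley(g, q0, steps, frozen=False):
    """Halley (10.162) or frozen Halley (10.163)."""
    q = q0.copy()
    for _ in range(steps):
        q_ref = q0 if frozen else q          # point where derivatives are taken
        J = dF(q_ref)
        res = g - F(q)
        q_half = q + np.linalg.solve(J, res)  # predictor (Newton-type step)
        M = J + 0.5 * d2F(q_ref, q_half - q)  # corrected operator
        q = q + np.linalg.solve(M, res)
    return q

if __name__ == "__main__":
    q_act = np.array([0.5, -0.3, 0.8])
    g = F(q_act)
    q0 = np.full(3, 0.2)
    for name, q in [("Newton", newton(g, q0, 8)),
                    ("frozen Newton", newton(g, q0, 40, frozen=True)),
                    ("Halley", halley(g, q0, 6)),
                    ("frozen Halley", halley(g, q0, 40, frozen=True))]:
        print(name, np.linalg.norm(q - q_act))
```

In the frozen variants the derivatives are evaluated once at `q0`, so each step is cheaper, at the price of only linear convergence; this mirrors the trade-off described for (10.161) and (10.163).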
Our goal is to verify the convergence conditions on Newton's and Halley's methods from Section 8.2.3. We do so by establishing (a) boundedness of $F'(q)^{-1}$; (b) boundedness of $F'(q)$; (c) boundedness of $F''(q)$. Under conditions (a), (b) and with continuity of $F'$, Newton's method converges locally; additionally, convergence is quadratic if (c) holds; cf. Theorem 8.9. From (a), (b), (c) we get local quadratic convergence of Halley's method. If additionally $F''$ is Lipschitz continuous, convergence is cubic; cf. Theorem 8.10. Also linear convergence of the frozen versions (10.161), (10.163) follows from (a), (b), (c); cf. Theorem 8.11.

Well-posedness of the linearised problem (a) has already been established in the previous section, so it remains to prove (b) and (c). Lipschitz continuity of $F''$ could in principle be proven as well. However, it would need higher than $L^2$ regularity of $q$, so it would not be applicable in a full Halley scheme, where the iterates $q_k$ can only be expected to be in $L^2(\Omega)$. In a frozen Halley scheme, where $q_0$ might be chosen to have higher regularity, Lipschitz continuity of $F''$ cannot be shown to enhance the convergence speed (see Theorem 8.11), so it would not pay off to prove it.
We start with boundedness of $F'(q)$ in the $L^2(\Omega)$–$H^2(\Omega)$ setting.

Theorem 10.20. For any $q \in D(F) \subseteq L^2(\Omega)$, $u_0 \in H^2(\Omega)$, the operator $F'(q) : L^2(\Omega) \to H^2(\Omega)$ is bounded and satisfies
\[
\|F'(q)p\|_{H^2} \le C_1 \|u_0\|_{H^2} \|p\|_{L^2} \tag{10.164}
\]
for a constant $C_1 > 0$ and any $p \in L^2(\Omega)$.
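Proof. From (10.151), (10.156) and using integration by parts, (10.152), as well as the solution formula for (10.148), we obtain, for $v(x, T) = F'(q)p$,
\[
\begin{aligned}
v(x, T) &= -\sum_{\ell=1}^\infty \int_0^T \tau^{\alpha-1} E_{\alpha,\alpha}(-\lambda_\ell^q \tau^\alpha) \langle p\,u(T-\tau), \varphi_\ell^q \rangle \, d\tau \, \varphi_\ell^q(x) \\
&= \sum_{\ell=1}^\infty \frac{1}{\lambda_\ell^q} \Bigl( \int_0^T E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha) \langle p\,u_t(T-\tau), \varphi_\ell^q \rangle \, d\tau
 + E_{\alpha,1}(-\lambda_\ell^q T^\alpha) \langle p\,u_0, \varphi_\ell^q \rangle - \langle p\,u(\cdot, T), \varphi_\ell^q \rangle \Bigr) \varphi_\ell^q(x).
\end{aligned}
\tag{10.165}
\]
Hence, using Theorem 3.25 and (10.146), we obtain $\int_0^T E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha)^2 \, d\tau = O(\frac{1}{\lambda_\ell^q})$ and $E_{\alpha,1}(-\lambda_\ell^q T^\alpha)^2 = O(\frac{1}{(\lambda_\ell^q)^2})$, and we use the Cauchy–Schwarz inequality as well as an estimate similar to (10.154) to get
\[
\begin{aligned}
\|v(\cdot, T)\|_{H^2} &\le C_{\dot H_q^2\to H^2} \|L_q v(\cdot, T)\|_{L^2} \\
&\le C_{\dot H_q^2\to H^2} \Bigl\{ \Bigl( \sum_{\ell=1}^\infty \int_0^T E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha)^2 \, d\tau \int_0^T \langle p\,u_t(\tau), \varphi_\ell^q \rangle^2 \, d\tau \Bigr)^{1/2} \\
&\qquad + \Bigl( \sum_{\ell=1}^\infty E_{\alpha,1}(-\lambda_\ell^q T^\alpha)^2 \langle p\,u_0, \varphi_\ell^q \rangle^2 \Bigr)^{1/2}
 + \Bigl( \sum_{\ell=1}^\infty \langle p\,u(\cdot, T), \varphi_\ell^q \rangle^2 \Bigr)^{1/2} \Bigr\} \\
&\le C_\alpha \Bigl( \|(L_q)^{-1/2}[p\,u_t]\|_{L^2(L^2)} + \|(L_q)^{-1}(p\,u_0)\|_{L^2} + \|p\,u(\cdot, T)\|_{L^2} \Bigr) \\
&\le C_\alpha \Bigl( C^\Omega_{\dot H_q^1\to L^{r^*}} \|p\,u_t\|_{L^2(L^r)} + C^\Omega_{\dot H_q^2\to L^\infty} \|p\,u_0\|_{L^1(\Omega)} + \|p\,u(\cdot, T)\|_{L^2} \Bigr) \\
&\le C_\alpha \Bigl( C^\Omega_{\dot H_q^1\to L^{r^*}} \|u_t\|_{L^2(L^{r^{**}})} + C^\Omega_{\dot H_q^2\to L^\infty} \|u_0\|_{L^2} + \|u(\cdot, T)\|_{L^\infty} \Bigr) \|p\|_{L^2},
\end{aligned}
\]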
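where $r^* = \frac{r}{r-1}$, $r^{**} = \frac{2r}{2-r} = \frac{2r^*}{r^*-2}$, and $\|u_t\|_{L^2(L^{r^{**}})}$ can be estimated as in (10.159), using (10.157) and (10.147),
\[
\begin{aligned}
\|u_t\|_{L^2(L^{r^{**}})} &\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \|u_t\|_{L^2(0,T;\dot H_q^s(\Omega))} \\
&= C^\Omega_{\dot H_q^s\to L^{r^{**}}} \Bigl( \sum_{\ell=1}^\infty \int_0^T t^{2(\alpha-1)} E_{\alpha,\alpha}(-\lambda_\ell^q t^\alpha)^2 (\lambda_\ell^q)^{s+2} \langle u_0, \varphi_\ell^q \rangle^2 \, dt \Bigr)^{1/2} \\
&\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \tilde C_\alpha \Bigl( \sum_{\ell=1}^\infty (\lambda_\ell^q)^{s+2-\theta} \langle u_0, \varphi_\ell^q \rangle^2 \Bigr)^{1/2} \\
&\le C^\Omega_{\dot H_q^s\to L^{r^{**}}} \tilde C_\alpha \|u_0\|_{\dot H_q^{s+2-\theta}}
\end{aligned}
\tag{10.166}
\]
for $\theta < 2 - \frac1\alpha$, $s \ge \frac{d}{r^*}$ such that $H^1(\Omega)$ continuously embeds into $L^{r^*}(\Omega)$, i.e., $r^* = \infty$ in case $d = 1$, $r^* < \infty$ in case $d = 2$, and $r^* \le \frac{2d}{d-2}$ in case $d \ge 3$. This is exactly the same situation as in the proof of (10.159) (setting $\varepsilon = 0$ there), and hence allows us to choose $\theta \ge s$ in space dimensions three and less, which by equivalence of the $\dot H_q^2$ and the $H^2$ norm yields
\[
\|u_t\|_{L^2(L^{r^{**}})} \le C \|u_0\|_{H^2}. \tag{10.167}
\]
Moreover, $\|u(\cdot, T)\|_{L^\infty} \le C^\Omega_{H^2\to L^\infty} C_{\dot H_q^2\to H^2} \|L_q u(\cdot, T)\|_{L^2}$, which using (10.157) we can further estimate by
\[
\|L_q u(\cdot, T)\|_{L^2} = \Bigl( \sum_{\ell=1}^\infty \bigl( \lambda_\ell^q E_{\alpha,1}(-\lambda_\ell^q T^\alpha) \langle u_0, \varphi_\ell^q \rangle \bigr)^2 \Bigr)^{1/2} \le \frac{\Gamma(\alpha+1)}{T^\alpha} \|u_0\|_{L^2}.
\]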
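The final bound rests on a uniform estimate of $\lambda E_{\alpha,1}(-\lambda T^\alpha)$ in $\lambda$. As a short worked step, assuming that Theorem 3.25 supplies the upper bound $E_{\alpha,1}(-x) \le (1 + \Gamma(\alpha+1)^{-1}x)^{-1}$ in the form already used in the proof of Lemma 10.14, one obtains:

```latex
\[
\lambda\,E_{\alpha,1}(-\lambda T^\alpha)
\;\le\; \frac{\lambda}{1+\Gamma(\alpha+1)^{-1}\lambda T^\alpha}
\;=\; \frac{\Gamma(\alpha+1)\,\lambda}{\Gamma(\alpha+1)+\lambda T^\alpha}
\;\le\; \frac{\Gamma(\alpha+1)}{T^\alpha}
\qquad \text{for all } \lambda > 0 .
\]
```

Squaring, multiplying by $\langle u_0, \varphi_\ell^q \rangle^2$, and summing over $\ell$ then gives the stated estimate by Parseval's identity. The factor $T^{-\alpha}$ also makes the deterioration of the constant for small $T$ explicit.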
Next, we state boundedness of $F''(q) : L^2(\Omega)^2 \to H^2(\Omega)$ for $q \in L^2(\Omega)$, $u_0 \in H^2(\Omega)$. The second derivative of $F$ is given by $F''(q)[p, \tilde p] = w(\cdot, T)$, where $w$ solves
\[
\partial_t^\alpha w - \Delta w + qw = -\tilde p\,v - p\,\tilde v, \qquad (x, t) \in \Omega \times (0, T), \tag{10.168}
\]
and $v$, $\tilde v$ solve (10.148) and
\[
\partial_t^\alpha \tilde v - \Delta \tilde v + q\tilde v = -\tilde p\,u, \qquad (x, t) \in \Omega \times (0, T),
\]
respectively, both with homogeneous initial and impedance boundary conditions.

Theorem 10.21. For any $q \in D(F) \subseteq L^2(\Omega)$, $u_0 \in H^2(\Omega)$, the operator $F''(q) : (L^2(\Omega))^2 \to H^2(\Omega)$ is bounded and satisfies
\[
\|F''(q)[p, \tilde p]\|_{H^2} \le C_2 \|u_0\|_{H^2} \|p\|_{L^2} \|\tilde p\|_{L^2} \tag{10.169}
\]
for a constant $C_2 > 0$ and any $p, \tilde p \in L^2(\Omega)$.
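The form of the equation for $w$ can be checked by a formal differentiation of (10.148) with respect to $q$ in the direction $\tilde p$. The following is only a sketch of this computation; it uses the notation $v = G'(q)p$, $\tilde v = G'(q)\tilde p$, $w = G''(q)[p, \tilde p]$ and ignores the functional-analytic justification:

```latex
\[
\frac{d}{dq}\Bigl[\partial_t^\alpha v - \Delta v + q\,v\Bigr][\tilde p]
 = \partial_t^\alpha w - \Delta w + q\,w + \tilde p\,v ,
\qquad
\frac{d}{dq}\bigl[-p\,u\bigr][\tilde p] = -p\,\tilde v ,
\]
\[
\text{hence}\qquad
\partial_t^\alpha w - \Delta w + q\,w = -\tilde p\,v - p\,\tilde v .
\]
```

The right-hand side is symmetric under exchanging $(p, v)$ with $(\tilde p, \tilde v)$, as it should be for a second derivative.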
The proof is similar to that of Theorem 10.20, in principle. However, due to the appearance of bilinear terms on the right-hand side of (10.168), where the factors $v$, $\tilde v$ in turn solve pdes with bilinear source terms (cf. (10.148)), one has to be even more careful in choosing several exponents in Hölder estimates. The proof therefore quickly becomes quite technical without providing much additional insight, and we just refer to [182] for the full details.

Based on these results, we can state the following on convergence of Newton's and Halley's methods as well as their respective frozen versions.

Corollary 10.2. Assume that $q_0 \in L^\infty(\Omega)$ with $\|q_0 - q_{act}\|_{L^2}$ sufficiently small and that $u_0 \in H^2(\Omega)$ with $u_0(x) \ge \underline{u}$ for some positive constant $\underline{u}$ and all $x \in \Omega$. Then the iterates defined by frozen Newton (10.161) or frozen Halley (10.163), respectively, converge locally and linearly to the exact solution $q_{act}$ of the inverse problem. The same holds true for the respective full versions of Newton (10.160) and Halley (10.162), with even quadratic convergence, as long as all iterates $q_k$ stay in $L^\infty(\Omega)$.
Proof. See Theorems 8.9, 8.10, 8.11.
The requirement of $q_0$ or $q_k$ being in $L^\infty(\Omega)$ results from the use of a maximum principle in order to guarantee that $u(x, t; q_0)$ (or $u(x, t; q_k)$) takes its minimum at time $t = 0$; see Remark 10.2. In the case of the frozen versions of Newton or Halley, this is only a matter of choosing the starting value; in the respective full versions it can be achieved by projecting $q_k$ into an interval pointwise after each step. Obviously this projection will not increase the iteration error if the values of the exact solution $q$ are in this interval as well, so the convergence results remain valid.

10.5.3.3. Extension to a semilinear model. Consider, instead of the linear pde (10.110), the semilinear one (10.113), i.e., the problem of determining $q(x)$ in
\[
\begin{aligned}
\partial_t^\alpha u - \Delta u + q(x)f(u) &= 0, & (x, t) &\in \Omega \times (0, T), \\
\partial_\nu u(x, t) + \gamma(x)u(x, t) &= 0, & (x, t) &\in \partial\Omega \times (0, T), \\
u(x, 0) &= u_0(x), & x &\in \Omega,
\end{aligned}
\tag{10.170}
\]
from final time data (10.109), where $f$ is a given smooth function. First of all, well-posedness of the initial boundary value problem (10.170) with $q \in L^2(\Omega)$ can also be proven by a fixed point argument, considering the fixed point equation $\partial_t^\alpha u - \Delta u + cq(x)u = q(x)(cu - f(u))$ and assuming
\[
f(0) = 0, \qquad \sup_{z\in\mathbb{R}} |f'(z) - c| \le c_1, \qquad \sup_{z\in\mathbb{R}} |f''(z)| \le c_2, \qquad \sup_{z\in\mathbb{R}} |f'''(z)| \le c_3 \tag{10.171}
\]
with $c_1$, $c_2$ small enough. We also redefine $L_{cq} = -\Delta + cq$ (equipped with the impedance boundary conditions from (10.170)) in the proofs of Theorems 10.14, 10.19 and of boundedness of $F'$, $F''$.

Theorem 10.22. For any $q \in D(F)$ according to (10.116) and any $u_0 \in L^2(\Omega)$, the initial boundary value problem (10.170) has a unique solution $u$ in the space $C^\infty((0, T]; H^2(\Omega)) \cap C([0, T]; H^2(\Omega))$ and the estimate
\[
\max\{ \|u\|_{C(0,T;H^2(\Omega))}, \|u_t\|_{L^2(0,T;H^1(\Omega))} \} \le \hat C_\alpha \|u_0\|_{L^2(\Omega)}
\]
holds for some constant $\hat C_\alpha$. As a consequence, the operator $F$ defined by (10.145) (with (10.170) instead of (10.144)) is well-defined as a mapping from $D(F) \subseteq L^2(\Omega)$ into $H^2(\Omega)$, for any $u_0 \in L^2(\Omega)$.

Proof. We prove that the fixed point operator $\mathbb{T} : v \mapsto u$, where $u$ solves
\[
\begin{aligned}
\partial_t^\alpha u - \Delta u + cq(x)u &= q(x)(cv - f(v)), & (x, t) &\in \Omega \times (0, T), \\
\partial_\nu u(x, t) + \gamma(x)u(x, t) &= 0, & (x, t) &\in \partial\Omega \times (0, T), \\
u(x, 0) &= u_0(x), & x &\in \Omega
\end{aligned}
\]
(which is well-defined, as can be seen analogously to Theorem 10.20), is a self-mapping and a contraction on a ball $B_R^X(\bar u)$ with respect to the norm defined by
\[
\|u\|_X = \max\{ \|u\|_{C(0,T;H^2(\Omega))}, \|u_t\|_{L^2(0,T;H^s(\Omega))} \}
\]
with $s$ chosen such that for some $r \in [1, \infty]$ and $r^* = \frac{r}{r-1}$, $r^{**} = \frac{2r}{2-r} = \frac{2r^*}{r^*-2}$, the embeddings $H^1(\Omega) \to L^{r^*}(\Omega)$ and $H^s(\Omega) \to L^{r^{**}}(\Omega)$ are continuous, i.e., $s \ge \frac{d}{r^*} \ge \frac{d-2}{2}$. Here $\bar u$ solves the homogeneous linear initial value problem
\[
\begin{aligned}
\partial_t^\alpha \bar u - \Delta \bar u + cq(x)\bar u &= 0, & (x, t) &\in \Omega \times (0, T), \\
\partial_\nu \bar u(x, t) + \gamma(x)\bar u(x, t) &= 0, & (x, t) &\in \partial\Omega \times (0, T), \\
\bar u(x, 0) &= u_0(x), & x &\in \Omega.
\end{aligned}
\]
Well-posedness of this problem follows from Theorem 10.14.

To prove contractivity of $\mathbb{T}$, we estimate the $X$ norm of the difference $\hat u := \mathbb{T}v - \mathbb{T}\tilde v$ of the images in terms of the difference $\hat v := v - \tilde v$ of the arguments, using the fact that it satisfies $\partial_t^\alpha \hat u + L_{cq}\hat u = q(x)(c - \bar f'_v)\hat v$ with homogeneous initial conditions, where $\bar f'_v = \int_0^1 f'(v + \sigma\hat v) \, d\sigma$. This allows us to represent $\hat u$ (analogously to (10.165)) as
\[
\hat u(x, t) = \sum_{\ell=1}^\infty \frac{1}{\lambda_\ell^q} \Bigl( \langle q[(c - \bar f'_v)\hat v](\cdot, t), \varphi_\ell^q \rangle - \int_0^t E_{\alpha,1}(-\lambda_\ell^q \tau^\alpha) \langle q\,\partial_t[(c - \bar f'_v)\hat v](t - \tau), \varphi_\ell^q \rangle \, d\tau \Bigr) \varphi_\ell^q(x), \tag{10.172}
\]
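where we have used the fact that $\hat v(\cdot, 0) = 0$. Thus, we get, analogously to the estimate after (10.165),
\[
\|\hat u(\cdot, t)\|_{H^2(\Omega)} \le C_\alpha \Bigl( C^\Omega_{\dot H_q^1\to L^{r^*}} \|\partial_t[(c - \bar f'_v)\hat v]\|_{L^2(L^{r^{**}})} + \|[(c - \bar f'_v)\hat v](\cdot, t)\|_{L^\infty} \Bigr) \|q\|_{L^2(\Omega)},
\]
where
\[
\begin{aligned}
\bigl| \partial_t[(c - \bar f'_v)\hat v] \bigr|
&= \Bigl| -\int_0^1 f''(v + \sigma\hat v)(v_t + \sigma\hat v_t) \, d\sigma \, \hat v + \int_0^1 (c - f'(v + \sigma\hat v)) \, d\sigma \, \hat v_t \Bigr| \\
&\le c_2 (|v_t| + |\hat v_t|)|\hat v| + c_1 |\hat v_t|,
\end{aligned}
\]
hence
\[
\begin{aligned}
\|\partial_t[(c - \bar f'_v)\hat v]\|_{L^2(L^{r^{**}})}
&\le c_2 C^\Omega_{H^2\to L^\infty} \|\hat v\|_{L^\infty(H^2)} \|v_t\|_{L^2(L^{r^{**}})}
 + \bigl( c_2 C^\Omega_{H^2\to L^\infty} \|\hat v\|_{L^\infty(H^2)} + c_1 \bigr) \|\hat v_t\|_{L^2(L^{r^{**}})} \\
&\le C \bigl( (c_1 + c_2 \|v\|_X) \|\hat v\|_X + c_2 \|\hat v\|_X^2 \bigr)
\end{aligned}
\tag{10.173}
\]
and
\[
\|[(c - \bar f'_v)\hat v](\cdot, t)\|_{L^\infty} \le c_1 C^\Omega_{H^2\to L^\infty} \|\hat v\|_{L^\infty(H^2)} \le c_1 C^\Omega_{H^2\to L^\infty} \|\hat v\|_X.
\]
For the second part of the $X$ norm of $\hat u$, we estimate $\hat u_t$ according to
\[
\|\hat u_t\|_{L^2(\dot H^s_{cq})} \le C_{H^{2-s}\to L^{r^*}} \|\partial_t[(c - \bar f'_v)\hat v]\|_{L^2(L^{r^{**}})} \|q\|_{L^2(\Omega)}
\le C \bigl( (c_1 + c_2 \|v\|_X) \|\hat v\|_X + c_2 \|\hat v\|_X^2 \bigr)
\]
by (10.173). The embeddings $H^1(\Omega) \to L^{r^*}(\Omega)$, $H^s(\Omega) \to L^{r^{**}}(\Omega)$, $H^{2-s}(\Omega) \to L^{r^*}(\Omega)$ needed here work out with $s = 1$ and $r^* = 4 = r^{**}$, $r = \frac43$ for $d \in \{1, \ldots, 4\}$.

Concerning the derivatives of $F$, the presence of a semilinear term leads to the following changes. The first derivative is given by $F'(q)p = v(\cdot, T)$, where $v$ solves
\[
\partial_t^\alpha v - \Delta v + qf'(u)v = -pf(u), \qquad (x, t) \in \Omega \times (0, T), \tag{10.174}
\]
with homogeneous initial and impedance boundary conditions and $u$ is a solution to (10.170). For the second derivative we have $F''(q)[p, \tilde p] = w(\cdot, T)$, where $w$ solves
\[
\partial_t^\alpha w - \Delta w + qf'(u)w = -\tilde p f'(u)v - p f'(u)\tilde v - q f''(u)v\tilde v, \qquad (x, t) \in \Omega \times (0, T), \tag{10.175}
\]
and $v$, $\tilde v$ solve (10.174) and
\[
\partial_t^\alpha \tilde v - \Delta \tilde v + qf'(u)\tilde v = -\tilde p f(u), \qquad (x, t) \in \Omega \times (0, T),
\]
respectively, both with homogeneous initial and impedance boundary conditions. Also here we work with fixed point formulations that replace $qf'(u)$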
on the left-hand side by $cq$ and add corresponding terms $q(c - f'(u))v$, $q(c - f'(u))w$ on the respective right-hand sides. The $\|\cdot\|_{C(0,T;H^2(\Omega))}$ and $\|\partial_t \cdot\|_{L^2(0,T;H^s(\Omega))}$ norms of these terms can be estimated analogously to the proof of Theorem 10.21. It only remains to estimate the additional term $qf''(u)v\tilde v$, which we do as follows:
\[
\begin{aligned}
&\|L_q^{-1/2}[q\,\partial_t[f''(u)v\tilde v]]\|_{L^2(L^2)}
 = \|L_q^{-1/2}[qf'''(u)u_t v\tilde v + qf''(u)v_t\tilde v + qf''(u)v\tilde v_t]\|_{L^2(L^2)} \\
&\quad \le C^\Omega_{\dot H_q^1\to L^{r^*}} \Bigl( C^\Omega_{H^2\to L^\infty}\, c_3\, C^\Omega_{H^2\to L^\infty} \|v\|_{L^\infty(0,T;L^\infty(\Omega))} \|\tilde v\|_{L^\infty(H^2)} \|q u_t\|_{L^2(L^r)} \\
&\qquad + c_2 \|\tilde v\|_{L^\infty(0,T;L^\infty(\Omega))} \|q v_t\|_{L^2(L^r)} + c_2 \|v\|_{L^\infty(0,T;L^\infty(\Omega))} \|q \tilde v_t\|_{L^2(L^r)} \Bigr)
\end{aligned}
\]
and
\[
\|qf''(u(T))v(T)\tilde v(T)\|_{L^2} \le c_2 (C^\Omega_{H^2\to L^\infty})^2 \|q\|_{L^2} \|v(T)\|_{H^2(\Omega)} \|\tilde v(T)\|_{H^2(\Omega)},
\]
where the terms on the right-hand side can be further estimated according to Theorem 10.20. So again (10.171) with $c_1$ and $c_2$ sufficiently small allows us to prove contractivity of the respective fixed point operators on $B_R^X(0)$ and thus carry over all results from Section 10.5.3.2 to the semilinear setting.

10.5.3.4. Reconstructions with Newton's and Halley's methods. In this section we show some reconstructions of the potential function $q(x)$ and illustrate many of the features discussed with the iteration schemes presented. In particular we will bring out the dependence on $\alpha$ (and by necessity coupling this to the value of $T$). As in the previous sections involving backwards subdiffusion (see Section 10.1.3), we will see the definite pattern affecting the ill-conditioning and hence the ability to effectively reconstruct from final time data. In terms of the condition number of the inverse problem there is an interplay between the time $T$ and the fractional exponent, and we shall see this effect quite clearly in our ability to recover $q$. For details on numerical solvers for the direct problem we refer to [182].

In Figure 10.17 we show the reconstructions of $q(x)$ at effective numerical convergence for different $T$ and $\alpha$ values for the linear model (10.110) using the frozen Newton scheme. The actual potential taken was $q_{act}(x) = \sin(\pi x) + 2e^{-2x}\sin(3\pi x)$ and the initial data was $u_0(x) = 20x(1-x)e^{-2x}$. Throughout, $q_0 = 0$ was used as an initial guess and as a reference potential. Since the equation here was the linear model (10.110), effective numerical convergence was achieved within two or three iterations for $T = 0.1$, but slightly more were required with increased $T$ and noise level $\delta$ for the fractional case. Considerably more steps were needed for the parabolic equation (10.110) with $\alpha = 1$, in which convergence actually cannot be proven any more, since the key bounded invertibility result Theorem 10.19 gets lost.
The reconstruction shown for $T = 1$ and $\delta = 1\%$ was for the fourth iteration in the case of $\alpha = 0.9$ but the thirtieth for $\alpha = 1$, although in the latter case the tenth iteration was already close to the one shown. In the top row of plots the noise level was taken to be 0.1% and in the bottom row it was increased to 1%. In each case we show the outcome for $T$ values of 0.1, 0.5, and 1. Two values of $\alpha$ are shown: the fractional case with $\alpha = 0.9$ and the parabolic with $\alpha = 1$. The reconstruction for $\alpha = 0.9$ is in dashed lines while that for $\alpha = 1$ is in dotted lines. These figures clearly show the expected degradation in resolution that is inevitable for the parabolic case with increasing values of $T$. On the other hand, for the fractional case there is very little loss, and the reconstructions hold for much larger $T$ than shown. The situation is not at all strongly dependent on $\alpha$ as long as $\alpha$ is taken to be less than unity.

When the noise level in the measured data is increased from 0.1% to 1%, we see the expected further loss of resolution in the reconstructions. It is noteworthy that in the fractional case the reconstructions, while poorer at the higher noise level, hold almost independently of $T$ in the ranges considered. For the parabolic case the reconstructions possible are again poorer at the higher noise level, and by $T = 1$ the reconstruction is quite poor. The above is exactly in line with the recovery of initial data in the backwards diffusion problem in Section 10.1.3 (see also [170, 185]) and indicates that a feasible regulariser for the parabolic problem is to use instead the fractional operator with $\alpha$ near to, but less than, one. In fact, a more sophisticated version is possible here along the lines discussed in Section 10.1.3.

[Figure 10.17. Reconstructions of $q(x)$ in equation (10.110) for $\alpha = 1$ and $\alpha = 0.9$. Panels: $T = 0.1$, $T = 0.5$, $T = 1$; top row noise level 0.1%, bottom row noise level 1%.]
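For readers who wish to set up a comparable experiment, the test functions quoted above can be written down directly. The sketch below is an illustration under stated assumptions, not the authors' code: the domain $\Omega = (0, 1)$, the uniform grid, the Gaussian noise model scaled to a prescribed relative level, and the helper names `add_relative_noise` and `relative_l2_error` are all hypothetical choices.

```python
import numpy as np

def q_act(x):
    # Actual potential quoted in the text.
    return np.sin(np.pi * x) + 2.0 * np.exp(-2.0 * x) * np.sin(3.0 * np.pi * x)

def u0(x):
    # Initial data quoted in the text.
    return 20.0 * x * (1.0 - x) * np.exp(-2.0 * x)

def add_relative_noise(g, delta, rng):
    # Assumed noise model: Gaussian noise rescaled so that
    # ||g_noisy - g|| / ||g|| equals delta in the discrete l2 norm.
    noise = rng.standard_normal(g.shape)
    noise *= delta * np.linalg.norm(g) / np.linalg.norm(noise)
    return g + noise

def relative_l2_error(q, q_ref):
    # Discrete analogue of ||q - q_ref||_{L^2} / ||q_ref||_{L^2} on a uniform grid.
    return np.linalg.norm(q - q_ref) / np.linalg.norm(q_ref)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 201)     # assumed domain (0, 1)
    rng = np.random.default_rng(0)
    g = u0(x)                          # stand-in for final time data
    g_noisy = add_relative_noise(g, 0.01, rng)
    print(relative_l2_error(g_noisy, g))
    print(relative_l2_error(np.zeros_like(x), q_act(x)))  # error of q0 = 0
```

Here `u0(x)` merely stands in for measured data; in an actual experiment `g` would be the final time value $u(\cdot, T)$ computed by a forward solver.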
[Figure 10.18. Relative error $\|q_n - q_{act}\|_{L^2(\Omega)} / \|q_{act}\|_{L^2(\Omega)}$ for equation (10.113) at iteration $n$. Legend: frozen Newton, Newton, frozen Halley, Halley.]
We now show some numerical results for the semilinear reaction-diffusion model, equation (10.113). The nonlinearity was taken to be $f(u) = 10u^3$. In Figure 10.18 we show the relative $L^2$ error of each iterate for all four methods. Since the equation and the consequent inversion now have an additional degree of nonlinearity, we should expect a slower convergence rate, and this is indeed the case. As expected, the full implementations converge faster than those frozen at the initial approximation (here $q_0 = 0$), and Halley's method outperforms Newton's method based on the number of iterations. Of course, there is a considerable saving in not having to recompute the derivative map at each iteration, as in the frozen schemes, and Halley's scheme requires computing a corrector step in addition to the predictor that is used in Newton's scheme. Thus the actual running time of the algorithms is a considerably more complex question than a mere iteration count and will depend strongly on the nonlinear term $f(u)$ (among other things). In Figure 10.19 we show the reconstructions after one, three, and ten iterations of the frozen Newton scheme (leftmost graphic) and the full Halley scheme (rightmost graphic) for recovering $q$ in (10.113), where $f(u) = u^3$.
[Figure 10.19. Reconstructions of $q(x)$ in equation (10.113) using frozen Newton and Halley schemes. Each panel shows the actual $q$ together with iterations 1, 3, and 10.]
As shown in Figure 10.18 the converged state was not yet reached by the frozen Newton scheme (this took approximately 30 iterations) but the Halley method had already achieved convergence essentially at iteration 8.
10.6. Recovering nonlinear partial differential equation terms

Reaction-diffusion equations have a rich history in the building of mathematical models for physical processes. They are descendants of nonlinear odes in time with an added spatial component making for a pde of parabolic type. These early models, dating from the first decades of the twentieth century, include that of Fisher, who considered the Verhulst logistic equation together with a spatial diffusion, $u_t - ku_{xx} = f(u) = bu(1 - cu)$, to take into account migration of species, and that of Kolmogorov, Petrovskii, and Piskunov in similar models, which are now collectively referred to as the Fisher-KPP theory of population modelling; see [254]. There is also work in combustion theory due to Zeldovich and Frank-Kamenetskii that utilises higher order polynomials in the state variable $u$ and where the diffusion term acts as a balance to the chemical reactions [119]. The use of systems of reaction-diffusion models followed quickly, adding a spatial component to traditional population dynamic models such as predator-prey and competitive species as well as the interaction of multiple species or chemicals. By the early 1950s it was recognised by Alan Turing that solutions to such equations can, under the correct balance of terms, be used to simulate natural pattern formations such as stripes and spots that may arise naturally out of a homogeneous, uniform state [333]. This theory, which can be called a reaction-diffusion theory of morphogenesis, has been a major recurrent theme across many application areas. These models use the underlying physics to infer assumptions about the specific form of the reaction term $f(u)$. The few constants appearing, if not exactly known, are easily determined in a straightforward way by a least-squares fit to data measurements.
We envision a more complex situation where the function f (u) (or multiple such functions as we will be considering systems of equations) cannot be assumed to have a specific known form, or to be analytic so that knowing it over a limited range gives a global extension, and therefore must be treated as an undetermined coefficient problem for a nonlinear pde. In this section we will focus on the single equation case, and then final time observations in Section 10.6.1. In Section 10.6.2 we will extend the scope to reaction-diffusion equations and also discuss time trace observations.
10. Inverse Problems for Fractional Diffusion
10.6.1. Single reaction-diffusion equations. Let Ω be a bounded, simply connected region in $\mathbb{R}^d$ with smooth boundary ∂Ω, and let L be a uniformly elliptic operator of second order with $L^\infty$ coefficients. Consider
\[
\partial_t^\alpha u(x,t) - Lu(x,t) = f(u(x,t)) + r(x,t), \qquad (x,t) \in \Omega \times (0,T), \tag{10.176}
\]
subject to the initial and boundary conditions
\[
\begin{aligned}
\partial_\nu u(x,t) + \gamma(x)u(x,t) &= b(x,t), && (x,t) \in \partial\Omega \times (0,T), \\
u(x,0) &= u_0, && x \in \Omega.
\end{aligned} \tag{10.177}
\]
In (10.176), (10.177) the usual, direct problem is to know the diffusion operator L, the nonhomogeneous linear term r, the reaction term f(u), as well as the initial and boundary data $u_0$, γ, b, and to determine the evolution of the state u. However, even in the earliest applications it was appreciated that the diffusion term $-k\Delta u$ might have the diffusion coefficient k as an unknown, and while the specific form of f(u) was assumed known from the underlying model, specific parameters may not be. In our case we assume that f is unknown (save that it be a function of u alone). Thus we are interested in the inverse problem of recovering f from additional data that would be overposed if in fact f were known. In the parabolic case α = 1 using time trace data $u(x_0,t) = h(t)$, t ∈ (0, T), for some fixed $x_0 \in \Omega$, uniqueness results and the convergence of reconstruction algorithms were shown in [84, 266, 267, 269] for the recovery of the unknown term f(u). We will take a different situation here by assuming that one is able to measure in the spatial direction by taking a snapshot (census data) at a fixed later time T, and so our overposed data will be
\[
u(x,T) = g(x), \qquad x \in \omega \subseteq \Omega. \tag{10.178}
\]
We also note that in the case of such data, if the reaction term is of the form q(x)f (u) where q is unknown but the actual form of the nonlinearity f (u) is known, then it is possible to recover the spatial component in a unique way; see Section 10.5. In (10.178), the observation domain can be restricted to a subdomain ω of Ω. In view of the fact that the functions to be discovered are univariate, even an appropriately chosen curve in Ω could possibly suffice. The essential requirement here is that the range of the final time data covers the range of the solution values for all times; see (10.180). The reconstruction methods that we consider here are based on a projection of the pde on the observation manifold, which naturally leads to a fixed point iteration for the unknown terms. This is similar in spirit to the methods we have considered in Sections 10.5.1 and 10.5.2. As compared to Newton’s method, which is a more general approach, this projection principle requires an appropriate form of the given data; in particular,
the observation manifold cannot be orthogonal to the dependence direction of the unknown function, i.e., if we aim at reconstructing an x dependent coefficient, then time trace observations cannot be used in such a projected approach (and will hardly give good reconstruction results with any other method either). Since here the unknown function does not depend on x or t, this problem does not appear. Moreover, a convergence proof of regularised Newton type methods can only be carried out under certain conditions on the forward operator. One of them is the tangential cone condition (see Section 8.2.3.3), which is most probably not satisfied here, due to the nonlinearity. Also the alternative option of proving conditional stability, as we did in Section 10.5.3, is inhibited by the pde nonlinearity here.

10.6.1.1. A fixed point iteration scheme to recover f. A natural scheme to recover f is to evaluate equation (10.176) on the overposed boundary t = T. That is, we define a map $\mathbb{T} : f \mapsto u(x,T;f)$ by
\[
\mathbb{T}f(g) = \partial_t^\alpha u(x,T;f) - Lg(x) - r(x,T), \qquad x \in \omega, \tag{10.179}
\]
where g is the given data and define a sequence of approximations by the fixed point iteration $f_{k+1} = \mathbb{T}(f_k)$. We start with the idealised situation of exact data and refer to Section 8.2.3.2 for treatment of the realistic setting of noisy data. Before we can utilise the map (10.179) we must obtain conditions on the data that guarantee it is well-defined. Specifically, the range of g(x) must contain all values of the solution $u_{act} = u(x,t;f_{act})$ for t ≤ T:
\[
I = \bigl[\min_{x\in\omega} g(x)\,,\ \max_{x\in\omega} g(x)\bigr] = g(\omega) \supseteq u_{act}(\Omega \times [0,T)). \tag{10.180}
\]
Here $u_{act}$ is the state corresponding to the actual nonlinearity $f_{act}$. The following example illustrates the fact that (10.180) imposes a true constraint on the class of problems that can be expected to exhibit unique identifiability.

Example 10.1. Let ϕ be an eigenfunction of −L with corresponding eigenvalue λ > 0. Take f(u) = cu as well as $u_0 = \phi$, r = 0, then $u(x,t;f) = e^{-(\lambda - c)t}\phi(x)$. If ϕ stays positive (which is, e.g., the case where λ is the smallest eigenvalue), and c < λ, then, with $\underline{\phi} = \min_{x\in\bar\Omega}\phi(x) \ge 0$, $\overline{\phi} = \max_{x\in\bar\Omega}\phi(x) > 0$, we get for the range of u over all of Ω,
\[
\min_{(x,t)\in\bar\Omega\times[0,T]} u(x,t;f) = \underline{\phi}\,e^{-(\lambda-c)T}, \qquad
\max_{(x,t)\in\bar\Omega\times[0,T]} u(x,t;f) = \overline{\phi},
\]
whereas for the final time data we have
\[
\min_{x\in\bar\Omega} u(x,T;f) = \underline{\phi}\,e^{-(\lambda-c)T}, \qquad
\max_{x\in\bar\Omega} u(x,T;f) = \overline{\phi}\,e^{-(\lambda-c)T},
\]
and therefore the range condition will be violated.
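The failure of the range condition in Example 10.1 can be checked numerically. The following is a minimal sketch for the parabolic case α = 1; the values of `lam`, `c`, `T` and the positive profile `phi` are illustrative assumptions (the profile is only a stand-in for a positive eigenfunction, since the argument uses nothing but the separable form of the solution).

```python
import numpy as np

# Assumed illustrative values (not from the text): eigenvalue lam, rate c < lam,
# final time T, and a strictly positive profile phi sampled on a grid.
lam, c, T = 2.0, 0.5, 1.0
x = np.linspace(0.0, 1.0, 201)
phi = 1.0 + 0.5 * np.cos(np.pi * x)   # stand-in for a positive eigenfunction


def u(t):
    """Explicit solution u(x, t) = exp(-(lam - c) t) * phi(x) of Example 10.1."""
    return np.exp(-(lam - c) * t) * phi


times = np.linspace(0.0, T, 101)
all_values = np.array([u(t) for t in times])

solution_range = (all_values.min(), all_values.max())   # range over [0, T]
data_range = (u(T).min(), u(T).max())                   # range of g = u(., T)

# Range condition (10.180): the data interval must contain the solution range.
range_condition_holds = (data_range[0] <= solution_range[0]
                         and data_range[1] >= solution_range[1])

print("solution range:", solution_range)
print("data range:    ", data_range)
print("range condition (10.180) holds:", range_condition_holds)
```

Because the solution decays monotonically in time, the largest values are attained at t = 0 and are no longer visible in the final time data, so the check fails at the upper end of the interval.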
Below we state a self-mapping property of $\mathbb{T}$ on a sufficiently small ball in $W^{1,\infty}(J)$. In the parabolic case α = 1, one can even show contractivity for T large enough provided f is strictly monotonically decreasing, which implies exponential decay of the corresponding solution or actually its time derivative $u_t$, as long as r vanishes or has exponentially decaying time derivative. Such a dissipative setting is indeed crucial for proving that the Lipschitz constant of $\mathbb{T}$ decreases with increasing final time T, as the following counterexample shows.

Example 10.2. Again, let ϕ(x) be an eigenfunction of −L with corresponding eigenvalue λ > 0. Take $f^{(1)}(u) = c_1 u$, $f^{(2)}(u) = c_2 u$ as well as $u_0 = \phi$, r = 0, and set $u^{(i)}(x,t) = u(x,t;f^{(i)})$ which can be computed explicitly as
\[
u^{(i)}(x,t) = e^{(c_i - \lambda)t}\phi(x), \qquad u_t^{(i)}(x,t) = (c_i - \lambda)e^{(c_i-\lambda)t}\phi(x),
\]
which yields
\[
\mathbb{T}f^{(1)}(g(x)) - \mathbb{T}f^{(2)}(g(x)) = u_t^{(1)}(x,T) - u_t^{(2)}(x,T)
= \bigl((c_1 - \lambda)e^{c_1 T} - (c_2 - \lambda)e^{c_2 T}\bigr)e^{-\lambda T}\phi(x).
\]
Thus, for any combination of the norms $\|\cdot\|_X$ and $\|\cdot\|_Z$, the contraction factor
\[
\frac{\|\mathbb{T}f^{(1)}(g) - \mathbb{T}f^{(2)}(g)\|_Z}{\|f^{(1)} - f^{(2)}\|_X}
\]
is determined by the function
\[
m(T) = \frac{\bigl|(c_1-\lambda)e^{(c_1-\lambda)T} - (c_2-\lambda)e^{(c_2-\lambda)T}\bigr|}{|c_1 - c_2|}.
\]
Now take $c_1 > c_2 > \lambda$; then m(0) = 1 and
\[
m'(T) = \frac{(c_1-\lambda)^2 e^{(c_1-\lambda)T} - (c_2-\lambda)^2 e^{(c_2-\lambda)T}}{c_1 - c_2} > 0.
\]
This makes a contraction for finite time T impossible unless $\|\phi\|_Z / \|f^0\|_X$ (with $f^0(u) = u$) is sufficiently small.
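The growth of the factor m(T) in Example 10.2 is easy to tabulate. The sketch below simply evaluates the formula for m(T) given in the example; the values `lam`, `c1`, `c2` are assumptions chosen to satisfy c1 > c2 > λ.

```python
import numpy as np

# Assumed illustrative values with c1 > c2 > lam (non-dissipative regime).
lam, c1, c2 = 1.0, 2.0, 1.5


def m(T):
    """Factor m(T) from Example 10.2."""
    a1, a2 = c1 - lam, c2 - lam
    return abs(a1 * np.exp(a1 * T) - a2 * np.exp(a2 * T)) / abs(c1 - c2)


T_values = np.linspace(0.0, 3.0, 31)
m_values = np.array([m(T) for T in T_values])

for T, val in zip(T_values[::10], m_values[::10]):
    print(f"T = {T:.1f}   m(T) = {val:.4f}")
```

For these values m starts at 1 and grows with T, in line with the sign of m'(T) computed in the example, so a longer observation time makes matters worse rather than better when the reaction term is not dissipative.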
The two examples above clearly show that the two aims (a) range condition and (b) dissipativity (for contractivity) are conflicting, at least as long as we set r = 0 as it was done here. Thus, in order to achieve (a) we need to drive the system by means of r (or alternatively, by inhomogeneous boundary conditions). Luckily, nonvanishing r does not impact case (b) since these inhomogeneities basically cancel out when taking differences between fixed point iterates for establishing contractivity estimates. Thus it is indeed possible to have range condition and contractivity together. However, for this it is crucial not only to use arbitrarily found data, but to be able to design the experiment in such a way that the data exhibits the desired properties.
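To make the interplay of driving term, range condition and contractivity concrete, here is a minimal sketch of the iteration $f_{k+1} = \mathbb{T}(f_k)$ from (10.179) on a designed experiment. Everything specific in it is an assumption made for illustration and not taken from the text: the parabolic case α = 1 with L = d²/dx² on (0, 1), Dirichlet boundary values instead of the impedance condition, an explicit Euler time stepper, a manufactured solution $u_{act}(x,t) = x(1 - e^{-t})$ whose final time data g is linear (so Lg = 0) and covers the whole solution range, and the decreasing test nonlinearity `f_act`. The state is clamped to the data interval $[g_{\min}, g_{\max}]$ before f is evaluated, and f is stored through its values on the data, f(g(x_j)).

```python
import numpy as np


def f_act(u):
    # "Unknown" reaction term, used only to manufacture r and to measure errors
    # (assumed, monotonically decreasing).
    return -u - u**3


T, nx = 3.0, 21
x = np.linspace(0.0, 1.0, nx)
dx = x[1] - x[0]
nt = int(np.ceil(T / (0.4 * dx**2)))     # explicit Euler needs dt <= dx^2 / 2
dt = T / nt


def s(t):
    return 1.0 - np.exp(-t)


def r(t):
    # Driving term designed so that u_act(x, t) = x * s(t) solves the equation.
    return x * np.exp(-t) - f_act(x * s(t))


g = x * s(T)                              # final time data, strictly increasing
g_min, g_max = g[0], g[-1]                # I = [g_min, g_max] covers u_act


def apply_T(f_vals):
    """One fixed point step; f_vals[j] approximates f(g[j])."""
    u = np.zeros(nx)                      # initial state u0 = 0
    for n in range(nt):
        t = n * dt
        lap = np.zeros(nx)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        Pu = np.clip(u, g_min, g_max)     # clamp the state to the data interval
        rhs = lap + np.interp(Pu, g, f_vals) + r(t)
        u_prev, u = u, u + dt * rhs
        u[0], u[-1] = 0.0, s(t + dt)      # boundary values of u_act (simplification)
    ut_T = (u - u_prev) / dt              # backward difference for u_t(x, T)
    return ut_T - r(T)                    # L g = g'' = 0 since g is linear in x


f_vals = np.zeros(nx)                     # initial guess f_0 = 0
errors = [np.max(np.abs(f_vals - f_act(g)))]
for k in range(4):
    f_vals = apply_T(f_vals)
    errors.append(np.max(np.abs(f_vals - f_act(g))))

print(["%.2e" % e for e in errors])
```

In this setup the driving term produces data whose range contains all earlier solution values, while the decreasing nonlinearity and the decaying time derivative of r provide the dissipativity; the list `errors` records the sup-norm error of each iterate on the data interval.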
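The operator $\mathbb{T}$ defined by (10.179) is a concatenation $\mathbb{T} = \mathbb{P} \circ \mathbb{S}$ of the projection operator $\mathbb{P} : Z \to X$ and the solution operator $\mathbb{S} : X \to Z$, defined by
\[
\begin{aligned}
&(\mathbb{S}f)(x) = \partial_t^\alpha u(x,T;f) - Lg - r(x,T), && x \in \Omega, \\
&\mathbb{P}y \ \text{ such that } \ (\mathbb{P}y)(g(x)) = y(x), && x \in \omega.
\end{aligned}
\]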
More generally, in order to cover the case that g : ω → I is not invertible (note the possible difference in dimensionality of the preimage and image space), we define the projection operator by
\[
\mathbb{P}y \in \operatorname{argmin}\bigl\{ \|\tilde f(g) - y\|_Z \, : \, \tilde f \in X \ \text{ and } \ \|\tilde f(u_0) + r(\cdot,0) - Lu_0\|_{\dot H^\sigma(\Omega)} \le \rho_0 \bigr\}, \tag{10.181}
\]
where we choose $\rho_0 \ge \|f_{act}(u_0) + r(\cdot,T) - Lu_0\|_{\dot H^\sigma(\Omega)}$ to make sure that a fixed point of $\mathbb{T}$ solves the original inverse problem (10.184), (10.178). We use a bounded interval $I = [g_{\min}, g_{\max}]$ with $g_{\min} = \min\{g(x) : x \in \Omega\}$, $g_{\max} = \max\{g(x) : x \in \Omega\}$, in order to be able to make use of embedding theorems. Then we work with the function space setting
\[
X = \{f \in W^{1,\infty}(I) \, : \, f(u_0) \in \dot H^\sigma(\Omega)\}, \qquad Z = \dot H^\sigma(\omega), \tag{10.182}
\]
with σ such that $\dot H^\sigma(\omega)$ continuously embeds into $W^{1,\infty}(\omega)$ and with the norm
\[
\|f\|_X = \|f\|_{W^{1,\infty}(I)} + \|f(u_0)\|_{\dot H^\sigma(\Omega)}. \tag{10.183}
\]
Moreover, we employ the projection $P(z) = \max\{\min\{z, g_{\max}\}, g_{\min}\}$ on I to define u(x, t; f) as solution to
\[
\begin{aligned}
\partial_t^\alpha u(x,t) - Lu(x,t) &= f(Pu(x,t)) + r(x,t), && (x,t) \in \Omega \times (0,T), \\
\partial_\nu u(x,t) + \gamma(x)u(x,t) &= 0, && (x,t) \in \partial\Omega \times (0,T), \\
u(x,0) &= u_0, && x \in \Omega.
\end{aligned} \tag{10.184}
\]
Hence, based on the range condition (10.180), which we need to assume to hold for the actual nonlinearity fact only, and the fact that u(x, T ; fact ) solves (10.176) and due to (10.180) coincides with its projection, we can replace the original model (10.176) by the equation containing the projection (10.184). Since the operator P plays a crucial role in the definition of the method and more generally in fixed point methods defined by projection on the observation manifold, we will take a close look at its well-definedness and boundedness. To this end, it is crucial that spatial regularity of the concatenation f ◦ g as used in the fixed point scheme transfers in an almost lossless
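way to one-dimensional regularity of f and vice versa. We thus assume, additionally to (10.180), that
\[
g \in \dot H^\sigma(\omega) \subseteq W^{1,\infty}(\omega), \qquad \overline{g} \ge |\nabla g(x)| \ge \underline{g} > 0, \quad x \in \omega, \qquad Lg \in Z, \tag{10.185}
\]
for some $0 < \underline{g} < \overline{g}$, so that there exist c(g), C(g) > 0 such that
\[
c(g)\,\|f(g)\|_{W^{1,\infty}(\omega)} \le \|f\|_X \le C(g)\,\|f(g)\|_Z \qquad \text{for all } f \in X. \tag{10.186}
\]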
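Indeed, we have in case σ = 2
\[
\begin{aligned}
\|f\|_{H^2(I)} &= \Bigl(\int_I \bigl(|f|^2 + |f'|^2 + |f''|^2\bigr)\,dz\Bigr)^{1/2} \\
&\le \tilde C(g)\Bigl(\int_\omega |f(g)|^2 + \bigl|f''(g)|\nabla g|^2 + f'(g(x))\Delta g\bigr|^2\,dx\Bigr)^{1/2} \\
&= \tilde C(g)\Bigl(\int_\omega |(f\circ g)(x)|^2 + |\Delta(f\circ g)(x)|^2\,dx\Bigr)^{1/2} \\
&\le \tilde C(g)\,\|(-\Delta + \mathrm{id})^{-1}\|_{L^2(\omega)\to H^2(\omega)}\,\|f(g)\|_{H^2(\omega)},
\end{aligned}
\]
as well as in case σ = 1
\[
\begin{aligned}
\|f\|_{H^1(I)} &= \Bigl(\int_I \bigl(|f|^2 + |f'|^2\bigr)\,dz\Bigr)^{1/2} \\
&\le \tilde C(g)\Bigl(\int_\omega |f(g)|^2 + |f'(g)\nabla g|^2\,dx\Bigr)^{1/2} \\
&= \tilde C(g)\Bigl(\int_\omega |(f\circ g)(x)|^2 + |\nabla(f\circ g)(x)|^2\,dx\Bigr)^{1/2} \\
&\le \tilde C(g)\,\|f(g)\|_{H^1(\omega)}.
\end{aligned}
\]
From this, the left-hand inequality in (10.186) in the general case σ ∈ [1, 2] follows by interpolation. For the right-hand inequality in (10.186), consider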
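\[
\|f(g)\|_{W^{1,\infty}(\omega)} = \sup_{x\in\omega}|f(g(x))| + \sup_{x\in\omega}|f'(g(x))|\,|\nabla g(x)|
\ \begin{cases} \le \max\{\overline{g}, 1\}\,\|f\|_{W^{1,\infty}(I)}, \\ \ge \min\{\underline{g}, 1\}\,\|f\|_{W^{1,\infty}(I)}. \end{cases}
\]
Lemma 10.15. Under condition (10.185), the operator $\mathbb{P} : Z \to X$ is well-defined by (10.181) between the spaces defined by (10.182), and satisfies
\[
\|\mathbb{P}y\|_X \le 2C(g)\,\|y\|_Z \qquad \text{for all } y \in Z. \tag{10.187}
\]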
Proof. Existence of a minimiser follows from the fact that the admissible set $\{\|\tilde f(g) - y\|_Z : \tilde f \in X \text{ and } \|\tilde f(u_0)\|_{\dot H^\sigma(\Omega)} \le \rho_0\}$ is a nonempty closed convex (hence weakly* closed) subset of the space X which is the dual of a separable space. Moreover, from (10.186) and the triangle inequality as well as by minimality of $\mathbb{P}y$, comparing with $\tilde f = 0$, we get
\[
\begin{aligned}
\|\mathbb{P}y\|_X &\le C(g)\,\|(\mathbb{P}y)(g)\|_Z \le C(g)\bigl(\|(\mathbb{P}y)(g) - y\|_Z + \|y\|_Z\bigr) \\
&\le C(g)\bigl(\|0 - y\|_Z + \|y\|_Z\bigr) = 2C(g)\,\|y\|_Z.
\end{aligned}
\]
Remark 10.3. For f ∈ X, r(·, T) ∈ Z, under the attainability condition
\[
\|f(u_0) + r(\cdot,0) - Lu_0\|_{\dot H^\sigma(\Omega)} < \rho_0 \quad \text{and} \quad
y := f(u(\cdot,T;f)) + L(u(\cdot,T;f) - g) \in Z_g := \{\tilde f(g) : \tilde f \in X\}, \tag{10.188}
\]
any minimiser $f^+$ of (10.181) satisfies $f^+(g) = y$ and is therefore unique, due to the fact that g(ω) = I. The second condition in (10.188) can be shown to be satisfied by the strict monotonicity of g according to (10.185), if $\omega = \Omega = (0, L) \subseteq \mathbb{R}^1$ and $u_0 \in W^{1,\infty}(\Omega) \cap H^\sigma(\Omega)$ is strictly monotone. This can be seen by using the identity $f(u(\cdot,T;f)) + L(u(\cdot,T;f) - g) = \partial_t^\alpha u(\cdot,T;f) - r(\cdot,T) - Lg \in Z$ (this regularity of $\partial_t^\alpha u(\cdot,T;f)$ will be stated below) and using the assumed regularity (10.185) of g, which transfers to its inverse and gives $f^+ = y \circ g^{-1} \in W^{1,\infty}(I) \cap H^\sigma(I) \subseteq X$, as an easy verification of the chain rule in $W^{1,\infty}(a,b) \cap H^\sigma(a,b)$ for (a, b) = Ω and (a, b) = I shows.

Well-definedness of the operator $\mathbb{S} : X \to L^2(\Omega)$ follows from Theorem 6.7. Actually we need higher regularity $\mathbb{S}f \in \dot H^\sigma(\Omega)$. To prove that $\partial_t^\alpha u(x,T;f) \in \dot H^\sigma(\Omega)$ (cf. (10.182)) with σ such that $\dot H^\sigma(\Omega)$ continuously embeds into $W^{1,\infty}(\Omega)$, we must achieve sufficiently high regularity on the solution u satisfying (10.184) without having to assume too much regularity on f as it is contained in the right-hand side of (10.184). Thus, in view of the regularity results from Chapter 6, we need α to be large enough. More precisely, as detailed in [183, Section 3.1] with a series of technical estimates, the requirement is $\alpha \in (\tfrac{4}{5}, 1]$. Together with Lemma 10.15, this allows us to prove that the fixed point operator $\mathbb{T}$ is a self-mapping on the set
\[
B = \bigl\{f \in X \, : \, \|f\|_{W^{1,\infty}(I)} \le \rho, \ \|f(u_0) + r(\cdot,0) - Lu_0\|_{\dot H^\sigma(\Omega)} \le \rho_0 \bigr\}, \tag{10.189}
\]
which is the first part of the following theorem.

Theorem 10.23. Let $\alpha \in (\tfrac{4}{5}, 1]$, let $\Omega \subseteq \mathbb{R}^1$ be an open bounded interval, $\sigma \in (\tfrac{3}{2}, 2]$, and assume that g is strictly monotone and satisfies (10.185) with (10.182), and that $\rho_0$ as well as $\|r_t\|_{L^{Q^*}(0,T;L^2(\Omega))}$ are sufficiently small. Then for large enough ρ > 0, the operator $\mathbb{T}$ is a self-mapping on the bounded, closed and convex set B as defined in (10.189) and $\mathbb{T}$ is weakly* continuous in X as defined in equation (10.182). Hence $\mathbb{T}$ has a fixed point
f ∈ B. If this fixed point f is a monotonically decreasing function and satisfies (10.180), then f solves the inverse problem (10.176), (10.177), (10.178).

Proof. We will not provide the (rather technical) proof of $\mathbb{T}$ being a self-mapping here, but mainly focus on showing weak* continuity. To start with, note that the set B is by definition weakly* compact and convex in the Banach space X with norm $\|f\|_X = \|f\|_{W^{1,\infty}(I)} + \|f(u_0)\|_{\dot H^\sigma(\Omega)}$. For any sequence $(f_n)_{n\in\mathbb{N}} \subseteq B$ converging weakly* in X to f ∈ X, we have that the sequence of images under $\mathbb{T}$ (i.e., $\mathbb{T}(f_n)$) is contained in the weakly* compact set B. Using this and compactness of the embeddings $X \to L^\infty(I)$ (where boundedness of the interval I is crucial), we can extract a subsequence with indices $(n_k)_{k\in\mathbb{N}}$ and an element $f^+ \in B$ such that
\[
\mathbb{T}(f_{n_k}) \overset{*}{\rightharpoonup} f^+ \ \text{ in } X, \qquad
\mathbb{T}(f_{n_k}) \to f^+ \ \text{ in } L^\infty(I), \qquad
f_{n_k} \to f \ \text{ in } L^\infty(I). \tag{10.190}
\]
It remains to prove that $f^+ = \mathbb{T}(f)$. For this purpose, we use the fact that the difference $\hat u_n := u(x,t;f_n) - u(x,t;f)$ of solutions to (10.184) solves
\[
\partial_t^\alpha \hat u_n - L\hat u_n - q_n \hat u_n = \hat f_n(Pu), \tag{10.191}
\]
with homogeneous initial and impedance boundary conditions, where
\[
q_n(x,t) = \int_0^1 f_n'\bigl(P(u(x,t;f) + \sigma \hat u_n(x,t))\bigr)\,d\sigma\,P',
\]
$\hat f_n = f_n - f$, and u = u(x, t; f). From the representations (6.18), (6.19), (6.20) we have
\[
\hat u_n(x,t) = \sum_{j=1}^\infty \hat u_n^j(t)\,\phi_j(x), \tag{10.192}
\]
where $\hat u_n^j(t) = \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j(t-s)^\alpha)\,\bigl((q_n\hat u_n + \hat f_n(Pu))(\cdot,s), \phi_j\bigr)\,ds$. Young's inequality (A.15) and (10.147), as well as the fact that $f_n \in B$, allow us to obtain the crude estimate
\[
\begin{aligned}
\|\hat u_n(\cdot,t)\|_{L^2(\Omega)} &\le \tilde C_\alpha\,\|q_n\hat u_n + \hat f_n(Pu)\|_{L^2(0,t;\dot H^{-\theta}(\Omega))} \\
&\le \tilde C_\alpha \lambda_1^{-\theta}\Bigl(\|f_n'\|_{L^\infty(I)}\,\|\hat u_n\|_{L^2(0,t;L^2(\Omega))} + \|\hat f_n(Pu)\|_{L^2(0,t;L^2(\Omega))}\Bigr) \\
&\le \tilde C_\alpha \lambda_1^{-\theta}\Bigl(\rho\,\|\hat u_n\|_{L^2(0,t;L^2(\Omega))} + \sqrt{T|\Omega|}\,\|\hat f_n\|_{L^\infty(I)}\Bigr),
\end{aligned}
\]
which by Gronwall's inequality (A.19) yields
\[
\|\hat u_n\|_{L^\infty(0,T;L^2(\Omega))} \le C(\rho, T, |\Omega|)\,\|\hat f_n\|_{L^\infty(I)}.
\]
Using the fact that $\hat u_n^j$ as defined in (10.192) satisfies the fractional ode $\partial_t^\alpha \hat u_n^j(t) + \lambda_j \hat u_n^j(t) = \bigl((q_n\hat u_n + \hat f_n(Pu))(\cdot,t), \phi_j\bigr)$, we obtain
\[
\partial_t^\alpha \hat u_n(x,t) = \sum_{j=1}^\infty \Bigl\{ \bigl((q_n\hat u_n + \hat f_n(Pu))(\cdot,t), \phi_j\bigr)
- \lambda_j \int_0^t (t-s)^{\alpha-1}E_{\alpha,\alpha}(-\lambda_j(t-s)^\alpha)\,\bigl((q_n\hat u_n + \hat f_n(Pu))(\cdot,s), \phi_j\bigr)\,ds \Bigr\}\,\phi_j(x).
\]
From Young's inequality (A.15) and (10.147) we therefore obtain an estimate of $\partial_t^\alpha \hat u_n$ in a rather weak norm
\[
\begin{aligned}
\|\partial_t^\alpha \hat u_n(\cdot,t)\|_{\dot H^{-(2-\theta)}(\Omega)}
&\le \|(q_n\hat u_n + \hat f_n(Pu))(\cdot,t)\|_{\dot H^{-(2-\theta)}(\Omega)} + \|q_n\hat u_n + \hat f_n(Pu)\|_{L^2(0,T;L^2(\Omega))} \\
&\le \bigl(C^\Omega_{\dot H^{2-\theta},L^2} + \sqrt{T}\bigr)\,\|q_n\hat u_n + \hat f_n(Pu)\|_{L^\infty(0,T;L^2(\Omega))} \\
&\le \bigl(C^\Omega_{\dot H^{2-\theta},L^2} + \sqrt{T}\bigr)\Bigl(\|f_n'\|_{L^\infty(I)}\,\|\hat u_n\|_{L^\infty(0,T;L^2(\Omega))} + \|\hat f_n(Pu)\|_{L^\infty(0,T;L^2(\Omega))}\Bigr) \\
&\le \bigl(C^\Omega_{\dot H^{2-\theta},L^2} + \sqrt{T}\bigr)\Bigl(\rho\,C(\rho,T,|\Omega|) + \sqrt{|\Omega|}\Bigr)\,\|\hat f_n\|_{L^\infty(I)}.
\end{aligned} \tag{10.193}
\]
Thus, under the attainability condition (10.188)
\[
\|\mathbb{T}(f_{n_k})(g) - \mathbb{T}(f)(g)\|_{\dot H^{-(2-\theta)}(\Omega)} = \|\partial_t^\alpha \hat u_{n_k}(\cdot,T)\|_{\dot H^{-(2-\theta)}(\Omega)} \to 0 \quad \text{as } k \to \infty
\]
by (10.190). On the other hand, (10.190) also implies
\[
\|\mathbb{T}(f_{n_k})(g) - f^+(g)\|_{L^\infty(\Omega)} \le \|\mathbb{T}(f_{n_k}) - f^+\|_{L^\infty(I)} \to 0 \quad \text{as } k \to \infty.
\]
Hence, the two limits need to coincide, $\mathbb{T}(f)(g) = f^+(g) \in Z \subseteq C(\Omega)$, and thus $\mathbb{T}(f) = f^+$. A subsequence-subsequence argument yields weak* convergence in X of the whole sequence $\mathbb{T}(f_n)$ to $\mathbb{T}(f)$. Now invoking Tikhonov's fixed point theorem (cf. Theorem 8.16) with the weak* topology on X, we get existence of a fixed point of $\mathbb{T}$.

It only remains to prove that under conditions (10.180) and (10.188), a fixed point f solves (10.177), (10.178), (10.176). By Remark 10.3 we have $f(g) = y = f(Pu(\cdot,T;f)) + L(u(\cdot,T;f) - g)$, hence the difference w = u(·, T; f) − g satisfies the elliptic pde
\[
Lw + \bar y\,w = 0
\]
with $\bar y = \int_0^1 f'(g + \theta Pw)\,P'\,d\theta \le 0$ and homogeneous boundary conditions, and therefore has to vanish.

In the parabolic case, $\mathbb{T}$ can also be shown to be contractive [183, Section 3.3], therefore the fixed point is unique and the Picard iteration converges to
it. The key observation for the contractivity proof is that for $f^{(1)}, f^{(2)} \in X$, the difference $\mathbb{T}(f^{(1)}) - \mathbb{T}(f^{(2)}) = (u^{(1)} - u^{(2)})_t =: z$ solves
\[
\begin{aligned}
z_t - Lz - f^{(1)\prime}(Pu^{(1)})P'z &= \tilde y\,u_t^{(2)}, && (x,t) \in \Omega \times (0,T), \\
\partial_\nu z(x,t) + \gamma(x)z(x,t) &= 0, && (x,t) \in \partial\Omega \times (0,T), \\
z(x,0) &= f^{(1)}(u_0) - f^{(2)}(u_0), && x \in \Omega,
\end{aligned}
\]
with
\[
\begin{aligned}
\tilde y &= \bigl(f^{(1)\prime}(Pu^{(1)}) - f^{(2)\prime}(Pu^{(2)})\bigr)P' \\
&= \int_0^1 f^{(1)\prime\prime}\bigl(P(u^{(1)} + \theta(u^{(2)} - u^{(1)}))\bigr)\,d\theta\,P' + (f^{(1)} - f^{(2)})'(Pu^{(2)})\,P'.
\end{aligned}
\]
The factor $u_t^{(2)}$ appearing in the right-hand side of this pde decays exponentially for vanishing or exponentially decaying driving term r; cf. [183]. As a consequence, the Lipschitz constant of $\mathbb{T}$ tends to zero for increasing T, and so contractivity can be achieved by choosing T large enough, with a contractivity constant that improves for larger T.

10.6.1.2. Reconstructions. For a description of the implementation of the fixed point scheme (10.179) as well as the Newton-type methods to which we will compare it below, we refer to [183, Sections 4 and 5]. Figure 10.20 shows the reconstruction of the reaction term $f^{(a)}(u) = 2u(1-u)(u-a)$ with a = 0.75 by means of the fixed point scheme (10.179). This corresponds to a particular choice of parameters in the Zeldovich model. The initial approximation was f(u) = 0, and we show iterations 1, 2, 5. The latter represented effective numerical convergence, and it is clear that the second iteration was already very good. The figure on the left shows the situation with 1% added noise and that on the right with 5% noise. Note that the iteration scheme itself was without regularisation; all of this was contained in the initial smoothing of the data g (and also $g_{xx}$). The parameters for this were chosen by the discrepancy principle; cf. Section 8.2. However, it is clear that in the reconstruction from 5% noise this resulted in an under-smoothing. In all numerical runs based on the iteration scheme (10.179) the initial approximation for f was taken to be the zero function. Of course there are more challenging possible reaction terms, and Figure 10.21 shows the reconstruction of the Lipschitz function given by
\[
f^{(b)}(u) = \begin{cases} 8u^2, & u \le \tfrac{1}{2}, \\ \bigl(1 + \cos(5(u - \tfrac{1}{2}))\bigr)\,e^{-(u - \frac{1}{2})}, & u > \tfrac{1}{2}, \end{cases}
\]
Figure 10.20. Reconstructions of $f^{(a)}(u) = 2u(1-u)(u-a)$ with a = 0.75 from 1% (left) and 5% (right) noise
again using (10.179). The data g(x) had 1% added noise and again effective numerical convergence was obtained within five iterations and even iteration 2 was almost as close to the original.

Figure 10.21. Reconstruction of $f^{(b)}(u)$ from 1% noise
Figure 10.22 shows reconstructions of $f(u) = f^{(a)}(u)$ and $f(u) = f^{(b)}(u)$ using Newton iteration. In both cases the data g was subject to 1% noise. The initial guess was taken to be of similar form: $f^{(a)}_{init}(u) = u(1-u)(u - \tfrac{1}{4})$ and $f^{(b)}_{init} = 1 + \sin(4u)$. However, the norms of f and $f_{init}$ were quite far apart in both cases. The figures show the tenth iteration although the rate clearly slowed down by the fourth or fifth iteration which were already close to the one shown. The use of frozen Newton gave almost similar results, with only a slight lag in the convergence rate being noticed, as is to be expected from the convergence results in Section 8.2.3, although these have not been proven to apply here.
Figure 10.22. Reconstructions of $f^{(a)}(u)$, $f^{(b)}(u)$ from 1% noise using Newton iteration
We will now study dependence on the final time T and on α, comparing the fixed point method with the frozen Newton method for the Verhulst model $f^{(c)}(u) = u(1-u)$. The leftmost figure in Figure 10.23 shows the dependence on the choice of T and of α on the convergence rate as measured by the improvement of the first iteration over the initial approximation, namely
\[
\|f_{iter1} - f^{(c)}\|_{L^\infty} / \|f_{init} - f^{(c)}\|_{L^\infty}.
\]
[Figure 10.23: two panels with horizontal axis T; data series for α = 0.25, α = 0.5, α = 0.9, α = 1.0.]

Figure 10.23. Variation of $\|f_{iter1} - f^{(c)}\|_{L^\infty} / \|f_{init} - f^{(c)}\|_{L^\infty}$ with T and α: Left, fixed point scheme; Right, Newton's method
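The different decay behaviour behind the dependence on T can be illustrated on a single eigenmode: for α = 1 a mode relaxes like $e^{-\lambda T}$, whereas for α < 1 it relaxes like the Mittag-Leffler function $E_\alpha(-\lambda T^\alpha)$, which decays only algebraically. The sketch below uses the special case α = 1/2, where the standard closed form $E_{1/2}(-x) = e^{x^2}\operatorname{erfc}(x)$ is available, and compares it with the leading asymptotic term $1/(\sqrt{\pi}\,x)$. The eigenvalue `lam` and the sample times are assumptions for illustration only.

```python
import math


def mittag_leffler_half(x):
    """E_{1/2}(-x) for moderate x >= 0 via the closed form exp(x^2) * erfc(x)."""
    return math.exp(x * x) * math.erfc(x)


lam = 1.0   # assumed eigenvalue, for illustration only
rows = []
for T in (1.0, 4.0, 9.0, 16.0, 25.0):
    x = lam * math.sqrt(T)
    fractional_mode = mittag_leffler_half(x)     # alpha = 1/2 relaxation
    parabolic_mode = math.exp(-lam * T)          # alpha = 1 relaxation
    asymptote = 1.0 / (math.sqrt(math.pi) * x)   # leading term for large x
    rows.append((T, fractional_mode, parabolic_mode, asymptote))
    print(f"T = {T:5.1f}  E_1/2 = {fractional_mode:.4e}  "
          f"exp = {parabolic_mode:.4e}  1/(sqrt(pi) x) = {asymptote:.4e}")
```

The closed form is only suitable for moderate arguments, since `exp(x * x)` overflows for large x; a scaled complementary error function would be the safer choice there.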
In the parabolic case the results are as predicted by the contractivity result mentioned above: the number of steps required for a given accuracy decreases as T increases. What is perhaps surprising given the only linear asymptotic behaviour of the Mittag-Leffler function is the fact that the convergence for large values of T is at least as good in the fractional case and, indeed, it is considerably better for small values of the final time T. The rightmost figure in Figure 10.23 shows the corresponding results for Newton's method. Here we again only look at the improvement of the $L^\infty$ error in the first step. Also here, in the parabolic case there is a definite
improvement in the convergence rate as T increases. However, this feature diminishes with decreasing α and by α = 0.25 is almost imperceptible. Note that the improvements obtained by the first approximations shown in the above figures should not be directly compared as the iteration scheme (10.179) started from an initial approximation of $f_{init} = 0$ whereas the Newton scheme started from $f_{init}(u) = u - 0.8\,u^2$, so it is relatively close to the exact solution $f^{(c)}(u) = u - u^2$. This necessity of choosing a close starting guess is typical of Newton type methods.

10.6.2. Systems of reaction-diffusion equations. We will now briefly turn to systems and start with a prototypical two-by-two setting, that we later extend to a general number N of equations and species. Let Ω be a bounded, simply connected region in $\mathbb{R}^d$ with smooth boundary ∂Ω, and let $L := -\nabla\cdot(a(x)\nabla\cdot) + q(x)$ be a uniformly elliptic operator of second order with $L^\infty$ coefficients. Consider
\[
\begin{aligned}
u_t(x,t) - Lu(x,t) &= f_1(u) + \phi_1(w) + r_1(x,t,u,v), \\
v_t(x,t) - Lv(x,t) &= f_2(v) + \phi_2(w) + r_2(x,t,u,v), \\
w &= w(u,v),
\end{aligned}
\qquad (x,t) \in \Omega \times (0,T), \tag{10.194}
\]
for some fixed time T and subject to the prescribed initial and boundary conditions analogous to (10.177). In equation (10.194) we assume $r_1(x,t,u,v)$ and $r_2(x,t,u,v)$ are known and the interaction variable w = w(u, v) is also known but either the pair $\{f_1, f_2\}$ or the pair $\{\phi_1, \phi_2\}$ is unknown and the inverse problems posed are to determine these quantities. Thus, there are two distinct inverse problems. The first is when we assume the interaction coupling $\phi_i(w)$ between u and v is known, but both $f_1(u)$ and $f_2(v)$ have to be determined. The second is when we assume the growth rate couplings $f_i$ for both u and v are known, but the interaction terms $\phi_1(w)$ and $\phi_2(w)$ have to be determined. In order to perform these recoveries, we must prescribe additional data and we shall consider two possibilities: the values of u(x, T), v(x, T) taken at a later fixed time T
\[
u(x,T) = g_u(x), \qquad v(x,T) = g_v(x), \qquad x \in \Omega; \tag{10.195}
\]
or the time traces $u(x_0,t)$, $v(x_0,t)$ measured at a fixed point $x_0 \in \partial\Omega$ for all t ∈ (0, T)
\[
u(x_0,t) = h_u(t), \qquad v(x_0,t) = h_v(t), \qquad x_0 \in \partial\Omega. \tag{10.196}
\]
Note that final time data corresponds to census data taken at a fixed time T for the species involved. The time trace data involves monitoring the populations (or chemical concentrations) at a fixed spatial point as a function of
time. Both of these data measurements are quite standard in applications. In (10.195), the observation domain can again be restricted to a subdomain ω of Ω. The interaction coupling w of u and v, which we assume known, can take on several forms. The near universal choice in ecological modelling is to take w = uv. This is also common in other applications, but other more complex possibilities are in use. For example, the Gray-Scott model of reaction diffusion, [262], takes $w = u^2 v$ and is a coupling term that often leads to pattern formation. This coupling occurs, for example, in molecular realisations where there is an activator u and an inhibitor v. The antagonistic effect here occurs from the relative depletion of v that is consumed during the production of u. The so-called Brusselator equation (the Walgraef and Aifantis equation), which also leads to the generation of sustained oscillations, instead takes $w = bu - cu^2 v$ and occurs, amongst many other situations, in the dislocation dynamics in materials subjected to cyclic loading, [338]. Other possibilities include $w = \sqrt{u^2 + v^2}$ or the nonlocal situation $w = \int_\Omega (u^2 + v^2)\,dx$ as well as combinations of all of the above. A small collection of examples of 2 × 2 systems arising in systems biology can be found in [186, Section 3.1.1].

A generalisation to systems of an arbitrary number N of possibly interacting states can be achieved by replacing the unknown function f in (10.176) by a componentwise defined vector valued unknown nonlinearity $f = (f_1, \ldots, f_N)$, whose action is not only defined on a single state but on a possible combination of these, via known functions $w = (w_1, \ldots, w_N)$. Possible additional known interaction and reaction terms are encapsulated in a set of (now potentially also nonlinear) functions $r = (r_1, \ldots, r_N)$. We thus consider systems of reaction-(sub)diffusion equations of the form
\[
\partial_t^\alpha u_i(x,t) + (Lu)_i(x,t) = f_i(w_i(u(x,t))) + r_i(x,t,u(x,t)), \qquad (x,t) \in \Omega \times (0,T), \quad i \in \{1, \ldots, N\}, \tag{10.197}
\]
for $u = (u_1, \ldots, u_N)$ subject to the boundary and initial conditions
\[
\frac{\partial u_i}{\partial \nu} + \gamma_i u_i = 0 \quad \text{on } \partial\Omega \times (0,T), \quad i \in \{1, \ldots, N\}, \tag{10.198}
\]
and
\[
u(x,0) = u_0(x), \qquad x \in \Omega. \tag{10.199}
\]
Well-posedness of this forward problem with Lipschitz continuous nonlinearities fi , ri is a straightforward extension of the results from Section 6.2.4; see also [166, Section 3].
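Given the self-adjoint uniformly elliptic operator −L, e.g.,
\[
-L = \operatorname{diag}\bigl(-\nabla\cdot(A\nabla\cdot), \ldots, -\nabla\cdot(A\nabla\cdot)\bigr) + Q(x) \tag{10.200}
\]
with $Q : \Omega \to \mathbb{R}^{N\times N}$, $A \in \mathbb{R}^{N\times N}$ symmetric (uniformly) positive definite, the functions
\[
w : I := I_1 \times \cdots \times I_N \to J := J_1 \times \cdots \times J_N, \qquad r : \Omega \times (0,T) \times I \to \mathbb{R}^N,
\]
and the data $u_0$, $\gamma_i$, as well as measurements on a subset ω of the domain Ω
\[
g_i(x) \equiv u_i(x,T), \qquad x \in \omega, \quad i \in \{1, \ldots, N\}, \tag{10.201}
\]
we wish to determine the unknown functions
\[
f_i : J_i \to \mathbb{R}, \qquad i \in \{1, \ldots, N\}.
\]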
This includes both cases of identifying the reaction terms $f_i$ (by setting $w_i(\xi_1, \xi_2) = \xi_i$) and of identifying the interaction terms $\phi_i$ in (10.194). We will abbreviate the collection of unknown functions by $f = (f_1, \ldots, f_N)$, noting that each individual $f_i$ might have a different domain of definition $J_i$. Note that this setting allows for linear and nonlinear coupling among the individual states $u_i$ via the known differential operator L (often referred to as cross-diffusion) and the known functions $w_i$ as well as $r_i$. Throughout the analysis we will impose the range condition
\[
I_i = [\underline{g}_i, \overline{g}_i] = \bigl[\min_{x\in\omega} g_i(x)\,,\ \max_{x\in\omega} g_i(x)\bigr] = g_i(\omega) \supseteq u_{act,i}(\Omega \times [0,T)), \qquad i \in \{1, \ldots, N\} \tag{10.202}
\]
(cf. (10.180)), where we assume ω to be compact to guarantee (via Weierstrass's theorem and continuity of $g_i$) that $I_i$ is indeed a compact interval. Here $u_{act}$ is the state part of a solution $(f_{act}, u_{act})$ of the inverse problem. Consider now the spatially one-dimensional setting of $\Omega \subseteq \mathbb{R}^1$ being an open interval (0, L), and make the invertibility and smoothness assumptions
\[
\begin{aligned}
&g \in H^2(\omega; \mathbb{R}^N), \qquad w \in H^2(I_1 \times \cdots \times I_N; \mathbb{R}^N), \\
&\Bigl|\sum_{j=1}^N \frac{\partial w_i}{\partial \xi_j}(g(x))\,g_j'(x)\Bigr| \ge \beta > 0 \quad \text{for all } x \in \omega, \ i \in \{1, \ldots, N\},
\end{aligned} \tag{10.203}
\]
that by the inverse function theorem imply that $w_i \circ g : \omega \to J_i$ is bijective and its inverse is in $H^2(J_i; \Omega)$. Thus we can define the fixed point operator $\mathbb{T} : X \to X$, where
\[
X := X_1 \times \cdots \times X_N, \qquad X_i = \{f_i \in W^{1,\infty}(J_i) \, : \, f_i(w_i(u_0)) \in \dot H^2(\Omega)\}, \tag{10.204}
\]
by (10.205)
(Tf)i (wi (g (x)) = ∂tα ui (x, T ; f) − (Lg )i (x) − ri (x, T, g (x)), x ∈ ω,
i ∈ {1, . . . N } ,
where for any = (1 , . . . , N ) with i : Ji → R, the function u(x, t) = u(x, t; ) solves ∂tα ui (x, t) + (Lu)i (x, t) = i (wi (P u(x, t))) + ri (x, t, P u(x, t)), (x, t) ∈ Ω × (0, T ) , (10.206)
i ∈ {1, . . . N } ,
∂ui + γi ui = 0 on ∂Ω × (0, T ) , i ∈ {1, . . . N }, ∂ν x ∈ Ω, u(x, 0) = u0 (x),
with the projection P : RN → I on the compact cuboid I, which is defined by Pi ξ = max{g i , min{g i , ξ}}. The range condition (10.202) guarantees that any solution (fact , uact ) of the inverse problem (10.197), (10.198), (10.199), (10.201) is a fixed point of T. Besides the reconstruction problem (10.197), (10.201) with final time data, an analogous inverse problem of recovering f = (f1 , . . . , fN ) in (10.197) arises from time trace data h(t) ≡ u(x0 , t), t ∈ (0, T ) , (10.207) for some x0 ∈ ∂Ω, (see [267, 269] for the scalar parabolic case). Under the invertibility condition h ∈ C 1,1 (0, T ; RN ) , w ∈ C 2 (I1 × · · · × IN ; RN ), * * * *N (10.208) * * ∂wi * ≥ β > 0 for all t ∈ (0, T ) , i ∈ {1, . . . N } , * ( h(t)) h (t) j * * ∂ξj * * j=1
we can, analogously to [267, 269], define a fixed point operator $T : X := X_1\times\cdots\times X_N \to X_1\times\cdots\times X_N$, where $X_i = C^{0,1}(J_i)$, by
\[
(Tf)_i(w_i(h(t))) = \partial_t^\alpha h_i(t) - (Lu)_i(x_0,t;f) - r_i(x_0,t,h(t)), \qquad t\in(0,T),\ i\in\{1,\ldots,N\}.
\tag{10.209}
\]
As in [267, 269], we expect self-mapping and contraction properties of $T$ on a ball in $X$ to be achievable also in the fractional case. The crucial estimates of $\|(Lu)_i(x_0,t;f)\|_{C^{0,1}(\Omega)}$ required for establishing this could, in principle, be based as there on the implicit representation
\[
u(x,t) = \psi(x,t) + \int_0^t\!\!\int_\Omega G(x,y,t-\tau)\bigl(f(w(u(y,\tau))) + r(y,\tau,u(y,\tau))\bigr)\,dy\,d\tau
\]
of $u$ by means of the Green's function $G$. Here, $\psi(x,t) = u(x,t;0)$ solves the linear problem obtained by setting $f \equiv 0$ and replacing $r(x,t,u(x,t))$ by $r(x,t,0)$. For this to go through, regularity estimates on the Green's function $G$ would be needed. We point to, e.g., [109, Theorem 1, Chapter 9] for Green's functions for systems and their regularity in the parabolic case $\alpha = 1$. In the subdiffusion case $\alpha < 1$, the Green's function is defined by the Fox H-functions; cf., e.g., [90].

Back to the final time data setting (10.201), a vectorisation of the results from Section 10.6.1 yields the following. Analogously to the proof of Theorem 10.23 one can establish $T$ as a weakly* continuous self-mapping on a sufficiently large ball in $X$.

Theorem 10.24. Let $\alpha \in (\tfrac45, 1]$, $\sigma \in (\tfrac32, 2)$, let $\Omega \subseteq \mathbb{R}^1$ be an open bounded interval, let (10.203) hold, and assume that $\rho_0$ as well as
\[
\bar\rho := \sup_{\zeta\in I}\|D_t r(\cdot,\cdot,\zeta)\|_{L^{Q^*}(0,T;L^2(\Omega))}
\]
are sufficiently small. Then for large enough $\rho > 0$ the operator $T$ defined by (10.205) is a self-mapping on the bounded, closed and convex set
\[
B = \{h \in X : \|h_i\|_{W^{1,\infty}(J_i)} \le \rho,\ \|h_i(w_i(u_0)) + r(\cdot,0,u_0) - Lu_0\|_{\dot H^\sigma(\Omega)} \le \rho_0\}
\]
and $T$ is weakly* continuous in $X$, as defined in (10.204). Thus $T$ has a fixed point in $B$.

Moreover, in the parabolic case $\alpha = 1$, contractivity of $T$ for sufficiently large final time $T$ follows as in the previous subsection from the fact that for $f^{(1)}, f^{(2)} \in X$, the difference $T(f^{(1)}) - T(f^{(2)}) = (u^{(1)} - u^{(2)})_t = z$, where
\[
\begin{aligned}
z_{i,t} - (Lz)_i &= \sum_{j=1}^N \Bigl[\Bigl(f_i^{(1)\prime}(w_i(u^{(1)}))\frac{\partial w_i}{\partial\xi_j}(u^{(1)}) + \frac{\partial r_i}{\partial\xi_j}(u^{(1)})\Bigr) z_j \\
&\qquad + \Bigl(f_i^{(1)\prime}(w_i(u^{(1)}))\frac{\partial w_i}{\partial\xi_j}(u^{(1)}) + \frac{\partial r_i}{\partial\xi_j}(u^{(1)}) - \Bigl(f_i^{(2)\prime}(w_i(u^{(2)}))\frac{\partial w_i}{\partial\xi_j}(u^{(2)}) + \frac{\partial r_i}{\partial\xi_j}(u^{(2)})\Bigr)\Bigr) u_{j,t}^{(2)}\Bigr] && \text{in } \Omega\times(0,T),\\
\frac{\partial z_i}{\partial\nu} + \gamma_i z_i &= 0 && \text{on } \partial\Omega\times(0,T),\\
z_i(x,0) &= f_i^{(1)}(w_i(u_0(x))) - f_i^{(2)}(w_i(u_0(x))), && x\in\Omega,
\end{aligned}
\]
$i \in \{1,\ldots,N\}$. The factor $u_t^{(2)}$ appearing in the right-hand side of this pde decays exponentially under certain conditions; see [186].
Additionally, in the parabolic case $\alpha = 1$, the availability of regularity results in Schauder spaces $C^{k,\beta}(\Omega\times(0,T))$ (cf., e.g., [109]) allows us to work in higher space dimensions $\Omega \subseteq \mathbb{R}^d$, $d > 1$; see [186]. Note that the Schauder space setting has already been used in [267, 269] for the same nonlinearity identification problems in the case of time trace (instead of final time) observations. We point to [186] for reconstruction results for the scenarios described in and after (10.194) in the case $\alpha = 1$.
10.7. An inverse nonlinear boundary coefficient problem

In this section, we briefly consider the following inverse problem of recovering a nonlinearity in the boundary condition. Let $\Omega = (0,1)$, and let $T > 0$ be fixed. Consider the following one-dimensional model:
\[
\begin{aligned}
\partial_t^\alpha u - u_{xx} &= 0 && \text{in } \Omega\times(0,T),\\
\frac{\partial u}{\partial\nu} &= -f(u) && \text{on } \partial\Omega\times(0,T),\\
u(\cdot,0) &= u_0 && \text{in } \Omega,
\end{aligned}
\tag{10.210}
\]
where the initial data $u_0$ is known. The function $f$ models the nonlinear dependence between the heat flux and temperature. Examples include Newton's law of cooling, which involves a linear dependence $f(u) = cu$, or Stefan-Boltzmann cooling, where $f(u) = cu^4$. The aim is to recover the nonlinearity function $f$ from additional data. As overposed data we again consider the time trace of the temperature at a boundary point
\[
u(0,t) = h(t).
\tag{10.211}
\]
In the standard parabolic case, $\alpha = 1$, this inverse problem, under suitable assumptions on the given data, has a unique solution and is only mildly ill-conditioned [268]. The fractional case was studied in [301]. The main idea is to use a fixed point argument based on the $\theta_\alpha$ function introduced in Chapter 6. We use the representation (6.15) with $g_0(t) = f(u(0,t))$ and with $g_1(t) = -f(u(1,t))$ to obtain
\[
u(x,t) = \tilde v(x,t) - 2\int_0^t \bar\theta_\alpha(x,t-s)\,f(u(0,s))\,ds - 2\int_0^t \bar\theta_\alpha(x-1,t-s)\,f(u(1,s))\,ds,
\tag{10.212}
\]
where $\tilde v(x,t) = \int_0^1 \bigl(\theta_\alpha(x-y,t) - \theta_\alpha(x+y,t)\bigr)u_0(y)\,dy$ is a known function. Setting $x = 0$ and inserting the measured data (10.211), we obtain
\[
h(t) = \tilde v(0,t) - 2\int_0^t \bar\theta_\alpha(0,t-s)\,f(h(s))\,ds - 2\int_0^t \bar\theta_\alpha(-1,t-s)\,f(u(1,s))\,ds.
\tag{10.213}
\]
This is a nonlinear integral equation for the unknown $f$ and can be solved by an iterative process. Note that $u(1,s)$ appearing as an argument of $f$ in the last integral depends on $f$ itself since it is supposed to solve (10.210). Thus in order to establish an iterative scheme for reconstructing $f$, we also need well-posedness of the forward problem. Before going through the main steps of the argument, we state the assumptions to be made on the data as well as the nonlinearity $f$ and comment on them.
• The function $f$ is nonnegative and Lipschitz continuous.
• $u_0 \in C[0,1]$ is nonnegative and decreasing.
• $h \in C^1[0,T]$ is strictly monotonically decreasing.
• The compatibility condition $u_0(0) = h(0)$ is satisfied.
• For all $t$, $u(0,t) \ge u(1,t)$.

The positivity assumption on $f$ corresponds to a cooling condition at the boundary and is thus compatible with the monotone decrease of the boundary temperature $h$. The condition on $u_0$ implies that the left side of the region $\Omega$ is hotter than the right one at time $t = 0$. However, this initial configuration may not persist for all $t > 0$, and the requisite sufficient condition on $f$ for achieving the last condition is unclear, as is the condition ensuring the monotonicity of $h(t)$.

The solution of the direct problem, namely given $u_0$ and the function $f$ to determine $u$, relies on the representation (10.212) that we have derived from (6.15) and leads to a nonlinear integral equation for the solution $u(x,t)$. This can be solved pointwise in space by the successive approximation method for the fixed point operator $T$ defined by $T(u) = u_+$ with
\[
u_+(x,t) = \tilde v(x,t) - 2\int_0^t \bar\theta_\alpha(x,t-s)\,f(u(0,s))\,ds - 2\int_0^t \bar\theta_\alpha(x-1,t-s)\,f(u(1,s))\,ds
\tag{10.214}
\]
by taking (for example)
\[
u^{(0)}(x,t) = \tilde v(x,t) \quad\text{and then iterating}\quad u^{(k+1)} = T(u^{(k)}).
\tag{10.215}
\]
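The successive approximation idea can be tried out on a much simpler model problem. The following Python sketch is not the scheme for (10.214) itself: it replaces the kernel $\bar\theta_\alpha$ and the boundary nonlinearity by a hypothetical scalar Volterra equation $y(t) = g(t) - \int_0^t k(t-s)F(y(s))\,ds$, discretised with the trapezoidal rule. The function name `picard_volterra` and all of its parameters are illustrative assumptions.

```python
import math


def picard_volterra(g, kernel, nonlinearity, t_final, n_steps, n_iter):
    """Successive approximation for y(t) = g(t) - int_0^t kernel(t-s) F(y(s)) ds.

    Toy stand-in for an iteration of the type u^(k+1) = T(u^(k)); the integral
    is approximated by the trapezoidal rule on a uniform grid.
    """
    h = t_final / n_steps
    grid = [i * h for i in range(n_steps + 1)]
    y = [g(t) for t in grid]                 # starting guess y^(0) = g
    for _ in range(n_iter):
        fy = [nonlinearity(v) for v in y]    # F applied to the current iterate
        y_next = []
        for i, t in enumerate(grid):
            integral = 0.0
            if i > 0:
                for j in range(i + 1):
                    weight = 0.5 if j in (0, i) else 1.0   # trapezoidal weights
                    integral += weight * kernel(t - grid[j]) * fy[j]
                integral *= h
            y_next.append(g(t) - integral)
        y = y_next
    return grid, y


if __name__ == "__main__":
    # Linear test case y(t) = 1 - int_0^t y(s) ds, whose solution is exp(-t).
    grid, y = picard_volterra(lambda t: 1.0, lambda t: 1.0, lambda v: v,
                              1.0, 100, 30)
    print(y[-1], math.exp(-1.0))
```

For a Lipschitz nonlinearity the iterates of such a Volterra map converge on any finite time interval, which is the mechanism behind the Picard–Lindelöf type argument; a weakly singular kernel would need a quadrature rule adapted to the singularity instead of the plain trapezoidal rule assumed here.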
To prove convergence of this scheme, we can regard (10.214) as a system of infinitely many coupled integral equations with respect to time for the unknown functions $u_x = u(x,\cdot)$, $x \in [0,1]$, and employ the proof of the Picard–Lindelöf theorem (e.g., [327, Theorem 2.5]).

Concerning the reconstruction of $f$, we start from the representation (10.213) and reformulate it to a Volterra integral equation of the second kind for $f$. An essential ingredient is the following decomposition of $\theta_\alpha$ (hence, of $\bar\theta_\alpha = {}^{RL}_{\;\,0}D_t^{1-\alpha}\theta_\alpha$), whose proof can be found in [272].

Lemma 10.16. $\theta_\alpha(0,t) = c_0(\alpha)t^{-\alpha/2} + H(t)$, $\theta_\alpha(1,t) = \theta_\alpha(-1,t) =: G(t)$, where $H(t)$ and $G(t)$ are $C^\infty$ on $[0,\infty)$ with all finite order derivatives vanishing at $t = 0$, i.e., $H^{(m)}(0) = G^{(m)}(0) = 0$, $m \in \mathbb{N}$, and $c_0(\alpha) = \frac{1}{2\Gamma(1-\alpha/2)}$, with $(4\pi)^{-1/2} < c_0(\alpha) < 1/2$, is a constant which only depends on $\alpha$.

Using the abbreviations
\[
g_0(t) = f(h(t)), \qquad g_1(t) = -f(u(1,t)), \qquad H^{1-\alpha}(t) = {}^{RL}_{\;\,0}D_t^{1-\alpha}H(t),
\]
we can therefore write (10.213) as
\[
c_1(\alpha)\int_0^t \frac{g_0(\tau)}{(t-\tau)^{1-\alpha/2}}\,d\tau + \int_0^t H^{1-\alpha}(t-\tau)\,g_0(\tau)\,d\tau
= \frac12\bigl(\tilde v(0,t) - h(t)\bigr) + \int_0^t \bar\theta_\alpha(-1,t-s)\,g_1(s)\,ds,
\]
with $c_1(\alpha) = \frac{c_0(\alpha)\Gamma(1-\alpha/2)}{\Gamma(\alpha/2)} = \frac{1}{2\Gamma(\alpha/2)}$. The first term is just one-half times the fractional integral ${}_0I_t^{\alpha/2}g_0$, so applying the $\alpha/2$ order Djrbashian–Caputo fractional derivative and using Theorem 4.3, we arrive at
\[
\begin{aligned}
g_0(t) = f(h(t)) = d(t) &- c_2(\alpha)\int_0^t \frac{1}{(t-\tau)^{\alpha/2}}\int_0^\tau G^{2-\alpha}(\tau-s)\,f(u(1,s))\,ds\,d\tau\\
&- c_2(\alpha)\int_0^t \frac{1}{(t-\tau)^{\alpha/2}}\int_0^\tau H^{2-\alpha}(\tau-s)\,f(h(s))\,ds\,d\tau,
\end{aligned}
\tag{10.216}
\]
where $c_2(\alpha) = \sin((1-\alpha/2)\pi)(c_1(\alpha)\pi)^{-1}$ and $d(t) = c_2(\alpha)\int_0^t \frac{\tilde v_t(0,\tau) - h'(\tau)}{2(t-\tau)^{\alpha/2}}\,d\tau$ is a known function. Moreover, the functions $G^{2-\alpha}$, $H^{2-\alpha}$, which are defined by $G^{2-\alpha}(t) = \bigl({}^{RL}_{\;\,0}D_t^{1-\alpha}G(t)\bigr)_t$, $H^{2-\alpha}(t) = \bigl({}^{RL}_{\;\,0}D_t^{1-\alpha}H(t)\bigr)_t$, are in $C^\infty([0,\infty))$ with derivatives of all orders vanishing at $t = 0$.
Based on our assumption that $h(t)$ is strictly monotonically decreasing, we make a change of variables and set $\tilde f(t) = f(h(t))$. We then have the final fixed point form $\tilde f = T\tilde f =: \tilde f_+$ with
\[
\begin{aligned}
\tilde f_+(t) = d(t) &- c_2(\alpha)\int_0^t \frac{1}{(t-\tau)^{\alpha/2}}\int_0^\tau G^{2-\alpha}(\tau-s)\,\tilde f\bigl(h^{-1}(u(1,s;\tilde f))\bigr)\,ds\,d\tau\\
&- c_2(\alpha)\int_0^t \frac{1}{(t-\tau)^{\alpha/2}}\int_0^\tau H^{2-\alpha}(\tau-s)\,\tilde f(s)\,ds\,d\tau.
\end{aligned}
\]
Solving this Volterra integral equation of the second kind now follows the standard pattern of solution by successive approximation.

Two key observations are in order. The term $\tilde f(h^{-1}(u(1,s)))$ is well defined due to the requirement that $u(1,s) \le u(0,s) = h(s)$. This implies that the range of $u$ is covered by the values of the measurements and corresponds to the range conditions (10.180), (10.202) that we have imposed in Section 10.6. Moreover, for sufficiently small final time $T$, the operator $T$ is a contraction mapping on $C[0,T]$, showing uniqueness of a solution $\tilde f$ to the above Volterra integral equation on this domain and thus of $f$ in $C[h(T),h(0)]$. The uniform Lipschitz condition on $f$ together with the fact that we assume cooling then shows that we can march the solution forward in fixed increments to any given $T$. Numerical examples show this procedure to be very effective; see [301].
Chapter 11
Inverse Problems for Fractionally Damped Wave Equations
In this chapter we discuss inverse problems for a different class of pde models. As opposed to Chapter 10, where the subdiffusion model was predominant, we now turn to wave phenomena, where fractional derivatives come in via damping terms; see Section 2.2. An important area of application is ultrasound imaging, and here we will study three related problems: recovery of the initial condition $u_0(x)$, of the spatially varying wave speed $c_0(x)$, and of a coefficient $\kappa(x)$ appearing in the case of nonlinear ultrasound propagation. The overposed data considered here are also motivated by these applications and consist of boundary data over time throughout this chapter, whereas in the subdiffusion case of the previous chapter final time data inside the domain was also a realistic alternative. Dependence of the ill-posedness of these inverse problems on the fractional order of differentiation will follow a similar pattern as in the subdiffusion case: the most ill-posed case is the one of classical strong damping $\beta = 1$, which leads to exponential attenuation and correspondingly to exponential ill-posedness of the reconstruction problem. As soon as the damping is weaker (corresponding to a differentiation order $\beta < 1$ in the damping term), the inverse problems tend to become mildly ill-posed. This is most pronounced in the recovery of the initial condition $u_0$, a problem that shows a certain analogy to backwards subdiffusion studied in Section 10.1.
11.1. Reconstruction of initial conditions

The inverse problem of recovering the initial state from observations at later times arises in a wide range of applications, and we have discussed it in the context of final time measurements and the subdiffusion model in Section 10.1. Now we look at the problem of recovering the initial data from time trace measurements on the boundary in some damped wave equation, which is motivated by a class of medical imaging techniques.

11.1.1. The inverse problem of pat and tat. Both photoacoustic and thermoacoustic tomography, which are essentially mathematically equivalent, rely on the excitation of an acoustic wave by means of localised thermal expansion due to an illumination with electromagnetic waves. The acoustic wave then propagates through the medium, and finally the acoustic pressure is measured at transducers arranged on a surface $\Sigma$ enclosing the object to be imaged.

In photoacoustic experiments, the medium is exposed to a short pulse of low frequency electromagnetic radiation and absorbs a fraction of the electromagnetic energy. As a result it heats up and expands, and this induces acoustic waves. These can be measured on the surface $\Sigma$ remote from the object and thereby may be used to determine various unknowns in the system.

We denote by $u(x,t)$ the acoustic pressure, by $I(x,t)$ the illuminating pulse (multiplied by some physical constant), by $a(x)$ the spatially varying absorption coefficient, and by $c_0(x)$ the sound speed. Ignoring attenuation at first, one arrives at the model
\[
\frac{1}{c_0(x)^2}u_{tt} - \Delta u = a(x)I_t(x,t),
\tag{11.1}
\]
where the medium is assumed to be at rest initially. In the typically assumed special setting $I(x,t) = I_0\,\delta(t)$, equation (11.1) is equivalent to
\[
u_{tt} - c_0(x)^2\Delta u = 0 \ \text{ in } \Omega\times(0,T), \qquad u(0) = u_0, \quad u_t(0) = 0 \ \text{ in } \Omega
\tag{11.2}
\]
with $u_0(x) = c_0(x)^2a(x)I_0$; see, e.g., [66, eq. (2.5)]. The importance of this must be stressed from an inversion perspective: it converts an unknown source term into an initial condition. Here we will mainly study fractionally damped versions of the initial value version (11.2) and return to the source version (11.1) in Remark 11.3.

Measurements of the sound pressure level at a distance from the object to be imaged can be phrased as overposed data
\[
h(x,t) = u(x,t), \qquad x\in\Sigma,\ t\in(0,T),
\tag{11.3}
\]
where $\Sigma \subset \Omega$ typically consists of a surface or a collection of discrete points or even just a single point. Some comments on how rich $\Sigma$ needs to be in order to allow for unique recovery of the initial data can be found in Remark 11.2. Note that we do not make any smoothness assumption on $\Sigma$. The unknown quantity of interest is the spatially varying absorption coefficient $a(x)$ that is to be reconstructed from these observations.

In (11.2), the sound speed $c_0$ is assumed to be positive and bounded away from zero as well as constant outside a bounded region $B$,
\[
c_0(x) \ge c > 0 \ \text{ for all } x\in\Omega, \qquad c_0(x) = c \ \text{ for all } x\in\Omega\setminus B.
\tag{11.4}
\]
Often the observation surface $\Sigma$ is assumed to enclose this region $B = \operatorname{interior}(\Sigma)$, but we do not need this to hold here.

Here the domain on which the wave equation holds is typically $\Omega = \mathbb{R}^d$, $d \in \{2,3\}$, equipped with certain radiation conditions. One might as well use a bounded hold-all domain with absorbing boundary conditions. The key property that we demand of the domain plus boundary/radiation conditions is that the operator $A = -(c_0(x)^2/c^2)\Delta$ on $\Omega$ equipped with these conditions has a self-adjoint compact inverse $A^{-1}$ from some weighted $L^2$ space over $\Omega$ into itself. Note that in the case of spatially varying sound speed, $A$ is not given in divergence form and therefore is not self-adjoint with respect to the usual $L^2(\Omega)$ inner product. However, in that case, as in Section 7.2 we simply use the weighted $L^2$ space $L^2_{c_0^{-2}}(\Omega)$ with weight $c_0(x)^{-2}$, which restores self-adjointness of $A$. As far as smoothness is concerned, $c_0^2 \in L^\infty(\Omega)\cap L^2(\Omega)$ is sufficient for our purpose. With this setting, we have existence of a complete eigensystem $\{\lambda_j, \varphi_j\}_{j\in\mathbb{N}}$ of $A$. This will allow us to represent $u$ by means of a separation of variables formula; see Section 11.1.2.
For more details on the pat/tat model, we refer, e.g., to the review paper [204] and the references therein.

While the classical wave equation mentioned above holds in lossless media, attenuation should be taken into account for realistic reconstructions. It has been observed that the propagation of ultrasound is subject to frequency dependent damping, which, when transformed to the time domain, leads to terms involving fractional derivatives in the pde, as was discussed in Section 2.2. The inverse pat/tat problem in lossy media therefore amounts to identification of the initial data $u_0$ in the initial value problem for some attenuated wave equation
\[
u_{tt} + c^2Au + Du = 0 \ \text{ in } \Omega\times(0,T), \qquad u(0) = u_0, \quad u_t(0) = 0 \ \text{ in } \Omega,
\tag{11.5}
\]
from observations (11.3). Here $D$ is a differential operator containing space and/or time derivatives. Classically, $D$ will consist of integer derivatives, typical examples being $D = -\Delta\partial_t$ or $D = \partial_t$, often referred to as strong (viscous) and weak damping, respectively. More recently, in order to account for power law frequency dependence of the attenuation, as has been observed in measurements, fractional damping models have been put forward and we have discussed some of them in Chapter 7. There, the operator $D$ contains fractional time and/or space derivatives. We will revisit two of the time fractional models from Chapter 7, namely the Caputo–Wismer–Kelvin model ((7.43) for $\tilde\beta = 1$)
\[
D = bA\partial_t^\beta \quad \text{with } \beta\in[0,1],\ b\ge0,
\tag{11.6}
\]
the fractional Zener (fz) model (7.44)
\[
D = a\partial_t^{2+\alpha} + bA\partial_t^\beta \quad \text{with } a>0,\ b\ge ac^2,\ 1\ge\beta\ge\alpha>0,
\tag{11.7}
\]
as well as a space fractional one ((7.43) for $\beta = 1$) known as the Chen–Holm model (cf., e.g., [45])
\[
D = bA^{\tilde\beta}\partial_t \quad \text{with } \tilde\beta\in[0,1],\ b\ge0,
\tag{11.8}
\]
and we discuss them with respect to the inverse pat/tat problem in subsequent sections. As in Section 7.2, we will sometimes combine the $\beta$-dependent Caputo–Wismer–Kelvin and the $\tilde\beta$-dependent Chen–Holm models to a two-parameter setting denoted by ch (for Caputo–Wismer–Kelvin–Chen–Holm). Note that here we have simplified notation $a_1 \to a$, $b_1 \to b$ as compared to Chapter 7, in order to be able to later denote the Fourier coefficients of $u_0$ by $a_j$ without causing notational confusion.

It is clear that the Lipschitz stability of pat/tat in the undamped wave setting (cf., e.g., [204, Section 19.3.2]) can no longer hold in the presence of Kelvin–Voigt damping, due to the fact that the evolution (11.5) with $D = A\partial_t$ gives rise to an analytic semigroup, and therefore in some sense behaves like a parabolic pde. Thus the inverse problem must be expected to be exponentially unstable, similarly to the situation with the backwards heat equation from Section 10.1. In this section we will investigate the dependence of the degree of instability on the time fractional orders $\alpha$, $\beta$ in (11.6), (11.7) and to some extent also on the space fractional order $\tilde\beta$ in (11.8). We refer to [95, 201] for frequency domain considerations on the degree of ill-posedness of the inverse problem.

11.1.2. The relaxation equation. The importance of the relaxation equation for the inverse problem is due to the fact that the solution to (11.5) can be found by combining the eigenfunction expansion with the solution of
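the associated relaxation equation
\[
D_\lambda w_\lambda = 0 \ \text{ on } (0,T), \qquad w_\lambda(0) = 1, \quad w_\lambda'(0) = 0 \quad (\text{if } \alpha>0:\ w_\lambda''(0) = 0),
\tag{11.9}
\]
with
\[
D_\lambda = \partial_t^2 + c^2\lambda +
\begin{cases}
b\lambda\partial_t^\beta & \text{for (11.6)},\\
a\partial_t^{2+\alpha} + b\lambda\partial_t^\beta & \text{for (11.7)},\\
b\lambda^{\tilde\beta}\partial_t & \text{for (11.8)}.
\end{cases}
\tag{11.10}
\]
We will call these solutions $w_\lambda$ relaxation functions and obtain
\[
u(x,t) = \sum_{j=1}^\infty u_j(t)\varphi_j(x),
\tag{11.11}
\]
where for each $j \in \mathbb{N}$, $u_j$ solves
\[
D_{\lambda_j}u_j = 0 \ \text{ on } (0,T), \qquad u_j(0) = \langle u_0, \varphi_j\rangle =: a_j, \quad u_j'(0) = 0.
\]
Thus with $w_j = w_{\lambda_j}$,
\[
u(x,t) = \sum_{j=1}^\infty w_j(t)\,a_j\,\varphi_j(x).
\tag{11.12}
\]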
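The separation of variables formula (11.12) is easy to try out in the simplest setting. The following Python sketch rests on assumptions that are not made in the text: $\Omega = (0,1)$ with homogeneous Dirichlet conditions and constant sound speed $c_0 = c$, so that $\lambda_j = (j\pi)^2$ and $\varphi_j(x) = \sqrt2\sin(j\pi x)$ with the unweighted $L^2$ inner product. The names `synthesize_solution`, `undamped` and their parameters are hypothetical, and the relaxation function is passed in as a callable.

```python
import math


def synthesize_solution(u0, relaxation, x, t, n_modes=20, n_quad=400):
    """Evaluate the truncated series u(x,t) = sum_j w_j(t) a_j phi_j(x).

    Assumed setting (illustrative only): Omega = (0,1), Dirichlet conditions,
    lambda_j = (j*pi)^2, phi_j(x) = sqrt(2) sin(j*pi*x).
    `relaxation(lam, t)` must return the relaxation function w_lambda(t).
    """
    h = 1.0 / n_quad
    nodes = [(k + 0.5) * h for k in range(n_quad)]    # midpoint rule nodes
    total = 0.0
    for j in range(1, n_modes + 1):
        lam = (j * math.pi) ** 2

        def phi(y, j=j):
            return math.sqrt(2.0) * math.sin(j * math.pi * y)

        a_j = h * sum(u0(y) * phi(y) for y in nodes)  # a_j = <u0, phi_j>
        total += relaxation(lam, t) * a_j * phi(x)
    return total


def undamped(lam, t, c=1.0):
    """Relaxation function without damping: w'' + c^2 lam w = 0, w(0)=1, w'(0)=0."""
    return math.cos(c * math.sqrt(lam) * t)


if __name__ == "__main__":
    # u0 = sin(pi x) excites only the first mode, so u(x,t) = cos(pi t) sin(pi x).
    print(synthesize_solution(lambda y: math.sin(math.pi * y), undamped, 0.5, 1.0))
```

Any of the damped relaxation functions can be substituted for `undamped` without touching the rest of the routine, which is the practical content of (11.12): all model dependence sits in the scalar functions $w_j$.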
Note that this generalised Fourier series corresponds to the Galerkin approximation series; thus convergence of this sequence holds weakly* in the solution spaces from Section 7.1. The energy estimates from Section 7.1 also imply that $u_j \in L^\infty(0,\infty)$, so that taking its Laplace transform is justified, a fact that transfers to $w_j$ via the identity $u_j(t) = w_j(t)a_j$ that we have used above.

In this section we consider the models (11.6), (11.7), (11.8) and obtain their corresponding relaxation equations. We mainly focus on the Caputo–Wismer–Kelvin model (11.6) and briefly discuss the fractional Zener (11.7) and the Chen–Holm (11.8) cases.

11.1.3. Asymptotics of solutions to the relaxation equation. We here focus our analysis on the time-fractional Caputo–Wismer–Kelvin model and then for comparison also briefly discuss a space-fractional model, the Chen–Holm one.

11.1.3.1. Time-fractional damping: The Caputo–Wismer–Kelvin model. We consider the initial-value problem for the associated relaxation equation subject to initial conditions
\[
w'' + \lambda b\,\partial_t^\beta w + \lambda c^2 w =: w'' + A\,\partial_t^\beta w + Bw = 0, \qquad w(0) = 1, \quad w'(0) = 0,
\tag{11.13}
\]
where $A = \lambda b > 0$ and $B = \lambda c^2 > 0$ follows since we assume that the wave speed $c$, the damping coefficient $b$, and the spectrum $\{\lambda_j\}$ of $A$ are all positive. We also refer to [113, Chapter 4] and [28] for more details on the analysis of this relaxation equation and its solution.

Taking Laplace transforms (which is justified for $w \in L^\infty(0,\infty)$), we obtain
\[
\hat w(s) = \frac{s + As^{\beta-1}}{s^2 + As^\beta + B}.
\tag{11.14}
\]
The denominator $\omega(s) := s^2 + As^\beta + B$ turns out to have no zeros in the right half-plane but two complex-conjugate zeros $\{p_+, p_-\}$, $p_\pm = \rho e^{\pm i\gamma}$ with $\rho > 0$ and $\frac\pi2 < \gamma < \pi$, whose real part is thus negative.

Lemma 11.1. For $A > 0$, $B > 0$ the function $\omega(s)$ has precisely two complex-conjugate zeros which lie in the left-hand complex plane.

Proof. For $0 < \beta < 1$ it is easy to see there can be no roots whose real part is positive by taking $R \in (0,1)$ so that a ball around any such root of radius $R$ remains in the right-hand plane. Then letting $f(s) = s^2 + B$ so that $\omega(s)$ and $f(s)$ satisfy $|f(s)| < |\omega(s)|$, we conclude from Rouché's theorem that $\omega$ and $f$ have the same number of roots in this ball. Since $f$ has none, the same is true for $\omega$.

Let $D = \{s : \frac\pi2 < \arg(s) < \pi;\ \frac1R < |s| < R\}$ for some $R > 1$, and let $f(s) = -s^2 - i$ be a comparison function. There is no zero of $f$ on $\partial D$ but one simple zero inside $D$. On $|s| = R$ we have that $\omega(s) + f(s) = As^\beta + B - i$, and we obtain $|\omega(s) + f(s)| < AR^\beta + B + 1 < 2R^2 - AR^\beta - B - 1 < |\omega(s)| + |f(s)|$ if $R$ is sufficiently large. Therefore $\omega$ also has a single zero in $D$ by Rouché's theorem. A similar argument holds for the region $\pi < \arg s < \frac{3\pi}2$, and inserting these roots $p_+$ and $p_-$ into $\omega(s) = 0$ shows they must be complex conjugate.

The transformation $z = 1/s$ applied to $\omega(s)$ shows that $\omega(1/z) = Bz^{-2}(z^2 + \tilde Az^{\tilde\alpha} + \tilde B)$ with $\tilde A = A/B$, $\tilde B = 1/B$ and $1 < \tilde\alpha = 2 - \beta < 2$. Thus showing the result for $0 < \beta < 1$ also shows it for $1 < \beta < 2$. See [28] for details.

The same result is actually easily shown by the following alternative argument; see Figure 11.1. Suppose $s$ is a root in the first quadrant. Then the vector $s$ (the line joining the origin to the point $s$) can be split into a component in the direction of the positive real axis and one in the direction of the positive imaginary axis. Then since $\beta \le 1$, $s^\beta$ has components in the same directions. Similarly, the vector $s^2$ has a component parallel to the real axis and again one in the direction of the positive imaginary axis. Since $\lambda \ge 0$, the same is true of the vector $b\lambda^{\tilde\beta}s^\beta$. The third vector, representing $c^2\lambda$, points along the real axis. However, the sum of these three vectors cannot add to zero, contradicting the claim that the root $s$ lay in the first quadrant. An identical argument shows $s$ cannot lie in the fourth quadrant and hence cannot lie in the right half-plane.

The inversion of the Laplace transform of $w(t)$ can be accomplished by deforming the original vertical Bromwich path connecting $\sigma - i\infty$ to $\sigma + i\infty$, $\sigma > 0$, into a Hankel path $Ha_\eta$, as in Figure 3.1. Consider the Hankel path $Ha_\eta$ surrounding the branch cut on the negative real axis and a small circle of radius $\eta$ centred at the origin together with a vertical cut to the poles $\{p_+, p_-\}$ and similar circles of radius $\eta$ and centres at the poles as shown in Figure 3.1. According to the residue theorem, there are thus two contributions to the integral,
\[
w(t) = \int_{Ha_\eta} e^{st}\,\hat w(s)\,ds = \int_{Ha_\eta} e^{st}\,\frac{s + As^{\beta-1}}{\omega(s)}\,ds = R_\beta(t) + S_\beta(t).
\tag{11.15}
\]
[Figure 11.1. Proof of Lemma 11.1: the vectors $s$, $s^2$, $b\lambda^{\tilde\beta}s^\beta$ and $c^2\lambda$ in the $(x,y)$-plane.]
These are the evaluation $S_\beta(t)$ of the integral over the circle with centre the origin and radius $\eta$ as well as the attached horizontal lines in the limit as $\eta \to 0$, and the contribution $R_\beta$ from the poles. For the latter, since $\omega(s) = (s-p_+)(s-p_-)$, the residue $\operatorname{Res}(\hat w, p_\pm) = \lim_{s\to p_\pm}(s-p_\pm)\hat w(s)$ is easy to compute using l'Hôpital's rule and gives
\[
R_\beta(t) = \operatorname{Res}\bigl(\hat w(s)e^{st}; s = p_+\bigr) + \operatorname{Res}\bigl(\hat w(s)e^{st}; s = p_-\bigr)
= 2\operatorname{Re}\Bigl\{\frac{p_+ + Ap_+^{\beta-1}}{2p_+ + A\beta p_+^{\beta-1}}\,e^{p_+t}\Bigr\}.
\tag{11.16}
\]
Since $p_+ = \operatorname{Re}(p_+) + i\operatorname{Im}(p_+)$, $p_- = \operatorname{Re}(p_+) - i\operatorname{Im}(p_+)$, with $\operatorname{Re}(p_+) < 0$, it follows that the contribution from the poles gives an oscillatory term with exponentially decreasing amplitude governed by $e^{-|\operatorname{Re}(p_+)|t}$. The location of these poles as a function of $\beta$ and the damping constant $b$ (where we have taken $c = 1$ and $\lambda_n = n^2\pi^2$, corresponding to $\Omega = (0,1)$) is shown in Figure 11.2.

[Figure 11.2. Location of the poles $p_+$, $c = 1$, $\lambda_n = n^2\pi^2$, as a function of $\beta$ and $b$; plotted for $\beta \in \{\tfrac34, \tfrac12, \tfrac14\}$ and $b \in \{4, 2, 0.5, 0.1\}$.]

The contribution from the line integrals gives, in addition, the term $S_\beta(t) = \int_0^\infty e^{-rt}H_\beta(r)\,dr$ where, as $\eta \to 0$, we obtain the spectral function $H_\beta(r)$ from multiplying both numerator and denominator of $\hat w(s)$ by the complex conjugate of $s^2 + As^\beta + B$,
\[
H_\beta(r) = -\frac1\pi\operatorname{Im}\Bigl\{\frac{s + As^{\beta-1}}{\omega(s)}\Bigr\}\Big|_{s = re^{i\pi}}
= \frac1\pi\,\frac{B\,r^{\beta-1}A\sin(\beta\pi)}{(r^2 + Ar^\beta\cos(\beta\pi) + B)^2 + (Ar^\beta\sin(\beta\pi))^2}.
\tag{11.17}
\]
Thus, overall we have
\[
w(t) = R_\beta(t) + \int_0^\infty e^{-rt}H_\beta(r)\,dr.
\tag{11.18}
\]
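The decomposition (11.18) can be checked numerically, since at $t = 0$ the two contributions have to add up to $w(0) = 1$. The Python sketch below is an illustration under stated assumptions and not the computation used for the figures: the pole $p_+$ is found by Newton's method started at the undamped root $i\sqrt B$ (a choice made here for simplicity), and the spectral integral is evaluated after the substitution $r = e^y$ with the trapezoidal rule. The function names and the truncation parameters are hypothetical.

```python
import cmath
import math


def pole_p_plus(A, B, beta, tol=1e-12, max_iter=100):
    """Newton iteration for the upper half-plane root of s^2 + A s^beta + B.

    Start value: the undamped root i*sqrt(B) (an assumption of this sketch).
    Complex powers use the principal branch, i.e. the cut along the negative
    real axis.
    """
    s = complex(0.0, math.sqrt(B))
    for _ in range(max_iter):
        omega = s * s + A * s ** beta + B
        d_omega = 2.0 * s + A * beta * s ** (beta - 1.0)
        step = omega / d_omega
        s -= step
        if abs(step) <= tol * abs(s):
            return s
    raise RuntimeError("Newton iteration did not converge")


def spectral_function(r, A, B, beta):
    """Spectral function H_beta(r) as given in (11.17)."""
    sin_bp = math.sin(beta * math.pi)
    cos_bp = math.cos(beta * math.pi)
    denom = (r * r + A * r ** beta * cos_bp + B) ** 2 + (A * r ** beta * sin_bp) ** 2
    return A * B * sin_bp * r ** (beta - 1.0) / (math.pi * denom)


def relaxation_cwk(t, A, B, beta, n=8000, y_max=20.0):
    """w(t) = R_beta(t) + int_0^inf exp(-r t) H_beta(r) dr, cf. (11.16)-(11.18)."""
    p = pole_p_plus(A, B, beta)
    residue = (p + A * p ** (beta - 1.0)) / (2.0 * p + A * beta * p ** (beta - 1.0))
    pole_part = 2.0 * (residue * cmath.exp(p * t)).real

    # Substituting r = exp(y) removes the integrable singularity r^(beta-1) at 0;
    # the lower truncation point is scaled with 1/beta.
    y_min = -30.0 / beta
    h = (y_max - y_min) / n
    integral = 0.0
    for k in range(n + 1):
        r = math.exp(y_min + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        integral += weight * math.exp(-r * t) * spectral_function(r, A, B, beta) * r
    return pole_part + h * integral


if __name__ == "__main__":
    print(pole_p_plus(1.0, 4.0, 0.5))          # pole in the second quadrant
    print(relaxation_cwk(0.0, 1.0, 4.0, 0.5))  # should be close to w(0) = 1
```

For larger $t$ the pole part becomes negligible and the returned value is dominated by the spectral integral, which is the regime in which the power law tail of $w$ becomes visible.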
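It is important to note that equations (11.17) and (11.18) depend on the eigenvalue $\lambda_j$ through the dependence on $A$ and $B$ so that we should really write
\[
H_\beta(r,\lambda) = \frac1\pi\,\frac{c^2\lambda\, r^{\beta-1}\, b\lambda\sin(\beta\pi)}{\bigl(r^2 + \lambda(br^\beta\cos(\beta\pi) + c^2)\bigr)^2 + \lambda^2(br^\beta\sin(\beta\pi))^2}
\sim \frac{b\sin(\beta\pi)}{\pi c^2}\,r^{\beta-1} \quad \text{as } r\to0.
\tag{11.19}
\]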
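The behaviour of $H_\beta(r,\lambda)$ for small $r$ is what determines the large-time decay of the second component, so it is worth checking it numerically. The short Python sketch below evaluates (11.19) directly and compares $r^{1-\beta}H_\beta(r,\lambda)$ at a small value of $r$ with the constant $b\sin(\beta\pi)/(\pi c^2)$, which is the coefficient appearing (multiplied by $\Gamma(\beta)$) in Lemma 11.2. The function names and the sample parameter values are assumptions made for illustration.

```python
import math


def spectral_function(r, lam, b, c, beta):
    """H_beta(r, lambda) as in (11.19), with A = lam*b and B = lam*c^2."""
    sin_bp = math.sin(beta * math.pi)
    cos_bp = math.cos(beta * math.pi)
    denom = ((r * r + lam * (b * r ** beta * cos_bp + c * c)) ** 2
             + (lam * b * r ** beta * sin_bp) ** 2)
    return c * c * lam * r ** (beta - 1.0) * b * lam * sin_bp / (math.pi * denom)


def small_r_coefficient(b, c, beta):
    """Leading coefficient of H_beta(r, lambda) ~ coefficient * r^(beta-1) as r -> 0."""
    return b * math.sin(beta * math.pi) / (math.pi * c * c)


if __name__ == "__main__":
    b, c, beta, r = 0.1, 1.0, 0.5, 1e-10
    for n in (1, 10):
        lam = (n * math.pi) ** 2
        ratio = spectral_function(r, lam, b, c, beta) * r ** (1.0 - beta)
        print(n, ratio / small_r_coefficient(b, c, beta))
```

The ratio is close to one for both eigenvalues, reflecting that the leading small-$r$ coefficient does not depend on $\lambda$.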
For b > 0, c > 0, λ > 0, and 0 < β < 1, Hβ is strictly positive showing that this component of the solution of the initial value problem (11.13) is also a completely monotone function; cf. Section 3.1.1. We now look at the time decay of the second component and require a Tauberian theorem for the Laplace transform—Theorem 3.17. Since here g ≡ 1, we just get the leading order term n = 0 in (3.8) with = β − 1 to obtain the following.
Lemma 11.2. The second component in (11.18) has the asymptotic behaviour
\[
\frac{b\sin(\beta\pi)\Gamma(\beta)}{c^2\pi}\,\frac{1}{t^\beta} \quad \text{as } t\to\infty.
\tag{11.20}
\]
In summary we see from (11.18) that $w(t)$ has two components. The first according to (11.16) decays exponentially to zero and has sinusoidal character. The second is strictly positive if $0 < \beta < 1$ (strictly negative if $1 < \beta < 2$) and decays to zero as a power law determined by the fractional index $\beta$, $\beta \ne 1$. It is this last term that must be utilised for any recovery of initial information. In Section 11.1.7 we shall show some graphics that illustrate the above estimates and demonstrate their strong dependence on $\beta$.

11.1.3.2. Space-fractional damping: The Chen–Holm model. As an illustration of a model that uses damping by means of the fractional Laplacian, we compute the corresponding situation for the Chen–Holm case (11.8) where the initial-value problem for the relaxation equation becomes
\[
w'' + \lambda^{\tilde\beta}b\,w' + \lambda c^2w =: w'' + Aw' + Bw = 0, \qquad w(0) = 1, \quad w'(0) = 0,
\tag{11.21}
\]
with $A = \lambda^{\tilde\beta}b > 0$ and $B = \lambda c^2 > 0$. It can be easily solved by computing the poles $p_\pm = \frac{-A\pm\sqrt{A^2-4B}}{2}$ and corresponding residues (cf. (11.16))
\[
w(t) =
\begin{cases}
\dfrac{p_+ + A}{2p_+ + A}\,e^{p_+t} + \dfrac{p_- + A}{2p_- + A}\,e^{p_-t} & \text{if } 4B < A^2,\\[2ex]
2\operatorname{Re}\Bigl\{\dfrac{p_+ + A}{2p_+ + A}\,e^{p_+t}\Bigr\} & \text{if } 4B > A^2,\\[2ex]
\bigl(1 + (p_+ + A)t\bigr)e^{p_+t} & \text{if } 4B = A^2.
\end{cases}
\tag{11.22}
\]
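The three cases of the closed form (11.22) translate directly into code. The following Python sketch evaluates $w(t)$ for given $A$ and $B$; the function name `relaxation_chen_holm` is a hypothetical choice, and the exact floating point comparison of the discriminant with zero is a simplification that a robust implementation would replace by a tolerance.

```python
import cmath
import math


def relaxation_chen_holm(t, A, B):
    """Solution of w'' + A w' + B w = 0, w(0) = 1, w'(0) = 0, following (11.22)."""
    disc = A * A - 4.0 * B
    if disc > 0.0:
        # two distinct real negative poles
        root = math.sqrt(disc)
        p_plus = (-A + root) / 2.0
        p_minus = (-A - root) / 2.0
        return ((p_plus + A) / (2.0 * p_plus + A) * math.exp(p_plus * t)
                + (p_minus + A) / (2.0 * p_minus + A) * math.exp(p_minus * t))
    if disc < 0.0:
        # pair of complex conjugate poles
        p_plus = complex(-A / 2.0, math.sqrt(-disc) / 2.0)
        return 2.0 * ((p_plus + A) / (2.0 * p_plus + A) * cmath.exp(p_plus * t)).real
    # double pole at -A/2
    p = -A / 2.0
    return (1.0 + (p + A) * t) * math.exp(p * t)


if __name__ == "__main__":
    # Chen-Holm coefficients A = lam**beta_tilde * b, B = lam * c**2 for one mode
    lam, b, c, beta_tilde = math.pi ** 2, 0.1, 1.0, 0.75
    A, B = lam ** beta_tilde * b, lam * c * c
    print([relaxation_chen_holm(t, A, B) for t in (0.0, 0.5, 1.0)])
```

In each branch the value at $t = 0$ is one, and all three expressions decay exponentially, in line with the discussion that follows.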
Considering the monotonically increasing positive sequence $(\lambda_j)_{j\in\mathbb{N}}$ of eigenvalues of $A$ tending to infinity as $j \to \infty$, we see the following asymptotic behaviour. For this purpose we denote by $j_0 \in \mathbb{N}$ the unique index such that
\[
\lambda_{j_0-1} < \Bigl(\frac{2c}{b}\Bigr)^{2/(2\tilde\beta-1)} \le \lambda_{j_0} \quad \text{if } \tilde\beta \ne \tfrac12, \qquad j_0 = 0 \quad \text{if } \tilde\beta = \tfrac12,
\]
in particular $j_0 = 1$ if $\lambda_j \ge \bigl(\frac{2c}{b}\bigr)^{2/(2\tilde\beta-1)}$ for all $j \in \mathbb{N}$.
• $0 < \tilde\beta < \frac12$: For all $j \le j_0$, there are two real negative poles, and for all $j > j_0$, there is a pair of two complex conjugate poles with the asymptotics
\[
\lim_{j\to\infty}\frac{\operatorname{Re}(p_\pm(\lambda_j))}{\lambda_j^{\tilde\beta}} = -\frac b2, \qquad \lim_{j\to\infty}\frac{\operatorname{Im}(p_\pm(\lambda_j))}{\sqrt{\lambda_j}} = \pm c.
\]
• $\frac12 < \tilde\beta \le 1$: For all $j < j_0$, there is a pair of complex conjugate poles with negative real part $\operatorname{Re}(p_\pm(\lambda_j)) = -\frac b2\lambda_j^{\tilde\beta}$, and for all $j \ge j_0$, there are two real negative poles with the asymptotics
\[
\lim_{j\to\infty}\frac{p_+(\lambda_j)}{\lambda_j^{1-\tilde\beta}} = -\frac{c^2}{b}, \qquad \lim_{j\to\infty}\frac{p_-(\lambda_j)}{\lambda_j^{\tilde\beta}} = -b.
\]
• $\tilde\beta = \frac12$ and $b < 2c$: For all $j > j_0 = 0$, there is a pair of two complex conjugate poles with $\operatorname{Re}(p_\pm(\lambda_j)) = -\frac b2\sqrt{\lambda_j}$, $\operatorname{Im}(p_\pm(\lambda_j)) = \pm\sqrt{c^2 - \frac{b^2}{4}}\,\sqrt{\lambda_j}$.
• $\tilde\beta = \frac12$ and $b \ge 2c$: For all $j > j_0 = 0$, there are only two real and negative poles, $p_+(\lambda_j) = -\frac{b+\sqrt{b^2-4c^2}}{2}\sqrt{\lambda_j}$, $p_-(\lambda_j) = -\frac{2c^2}{b+\sqrt{b^2-4c^2}}\sqrt{\lambda_j}$.

In all these cases $w_j(t) = w_{\lambda_j}(t)$ decays exponentially with time, with an exponent factor that grows like a positive power of $\lambda_j$. Thus the second component in (11.18) that has power law decay does not appear. Avoiding this infinite range integral of $H_\beta$ makes those models with a full time derivative for damping computationally simpler, but the exponential-in-time loss of information in the observations will have a negative impact on inverse problems using time trace data for recovery.

11.1.4. Location of the poles of the relaxation functions. Motivated by their role for the degree of ill-posedness of the inverse problem, we develop some further results for each of the models under consideration and also provide some computational results with plots of these poles for several parameter configurations.

11.1.4.1. The Caputo–Wismer–Kelvin–Chen–Holm model. Poles of the ch model are the roots of the function
\[
\omega^{ch}(s) := s^2 + b\lambda^{\tilde\beta}s^\beta + c^2\lambda = 0.
\]
We recall that according to Lemma 11.1, whose proof relies on Rouché's theorem, they lie in the left-hand complex plane. For $b = 0$ the poles lie on the imaginary axis at $\pm ic\sqrt{\lambda_n}$, so their spacing is dictated by the eigenvalue sequence $\{\lambda_n\}$ and the wave speed $c$. As $b$ increases, so does the (negative) real component of the poles, which follow a curve whose rough slope is determined by the ratio of $b$ and $c^2$. The powers $\beta$ and $\tilde\beta$ also influence the skewness of the curve along which the poles align. The magnitudes of the real and imaginary parts show the relative strengths of the damping and oscillation effects, respectively, in the equation.
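The qualitative statements about the pole locations can be explored with a few lines of code. The Python sketch below computes the upper half-plane root of $\omega^{ch}$ for the first eigenvalues $\lambda_n = n^2\pi^2$ by Newton's method started at the undamped pole $ic\sqrt{\lambda_n}$. This start value and the function name `ch_pole` are assumptions of the sketch; for strong damping a continuation in $b$ may be needed to stay on the intended root.

```python
import math


def ch_pole(lam, b, c, beta, beta_tilde, tol=1e-12, max_iter=200):
    """Root of s^2 + b lam^beta_tilde s^beta + c^2 lam near i*c*sqrt(lam).

    Newton's method with principal-branch complex powers; the undamped pole
    i*c*sqrt(lam) serves as the initial guess (illustrative choice).
    """
    A = b * lam ** beta_tilde
    B = c * c * lam
    s = complex(0.0, c * math.sqrt(lam))
    for _ in range(max_iter):
        omega = s * s + A * s ** beta + B
        d_omega = 2.0 * s + A * beta * s ** (beta - 1.0)
        step = omega / d_omega
        s -= step
        if abs(step) <= tol * abs(s):
            return s
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    for b in (0.0, 0.1, 0.5):
        poles = [ch_pole((n * math.pi) ** 2, b, 1.0, 0.5, 1.0) for n in (1, 2, 3)]
        print(b, [(round(p.real, 4), round(p.imag, 4)) for p in poles])
```

For $b = 0$ the routine returns the purely imaginary values $ic\sqrt{\lambda_n}$, and for the small positive values of $b$ used here the computed real parts are negative and move further to the left as $b$ grows.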
[Figure 11.3. Roots of $\omega^{ch}(s)$ for various $\beta$, $\tilde\beta$, $c$ values; panels: $\beta = \tfrac12$, $c = 1$; $\beta = \tfrac9{10}$, $c = 1$; $\beta = \tfrac9{10}$, $c = 5$; each with $\tilde\beta \in \{0.5, 0.75, 1\}$.]
The roots of $\omega^{ch}(s)$ are shown in Figure 11.3 with $b = 0.1$, $\lambda_n = n^2\pi^2$, $\tilde\beta = 1$, and for both $c = 1$ and $c = 5$, as well as $\beta = \frac12$ and $\beta = \frac9{10}$, which illustrates the above point.

Here are some notes on how these poles were computed. For rational $\beta = p/q$, $\omega^{ch}(s)$ can be written as $z^{2q} + Bz^p + C$ with $B = b\lambda^{\tilde\beta}$, $C = c^2\lambda$ and where $s = z^q$. Now the $2q$-th degree polynomial can be represented as the characteristic polynomial of a $2q\times2q$ matrix. Then the roots of this polynomial are calculated by computing the eigenvalues of the companion matrix. This gives a good approximation even for reasonably large $q$ values, but additional care must be taken; see, for example, [88]. Given now the values of $\{z_n\}$ for $\lambda \in \{\lambda_n\}$, one can recover $\{s_n\}$ from $s_n = z_n^q$. This is subject to considerable round-off error for even modest values of $q$. However it is usually sufficient as an initial approximation for Newton's method to then compute a more exact value of the roots of $\omega$ to the desired accuracy. This is also successful for real $\beta$ by first taking a rational approximation $\beta \approx p/q$ for the initial approximation of the roots and then proceeding as above.

11.1.4.2. The fractional Zener model. Poles of the fz model are the roots of the function
\[
\omega^{fz}(s) := as^{2+\alpha} + s^2 + b\lambda s^\beta + c^2\lambda = 0.
\]
There is a more complicated relationship here and more constants whose values can affect the outcome. In the case that $\beta = \alpha$, we can rewrite this as

$\omega^{fz}(s) := (as^{\beta} + 1)(s^2 + c^2\lambda) + \delta\lambda s^{\beta} = 0$, where $\delta := b - c^2 a$,

and $\delta$ needs to be nonnegative; cf. [151, Section iii.b]. If $\delta = 0$, then $\omega^{fz}(s)$ factors. There will be two roots at $\pm i\sqrt{\lambda}\,c$ on the imaginary axis and a potential root coming from $s^{\beta} + 1/a = 0$. The latter only exists in the case where $\beta = 1$, for otherwise writing $s = re^{i\theta}$ with $\theta \in (-\pi, \pi]$, we have that $\beta\theta \in (-\pi, \pi)$, and therefore $\mathrm{Im}(s^{\beta} + 1/a) = r^{\beta}\sin(\beta\theta) = 0$ implies $\beta\theta = 0$, hence $\mathrm{Re}(s^{\beta} + 1/a) = r^{\beta}\cos(\beta\theta) + 1/a > 0$. In the case where $\beta = 1$, we obviously have a root at $-1/a$, whose modulus, notably, does not increase with $\lambda$, as opposed to the two other complex conjugate roots of $\omega^{fz}$.

Clearly, physical reasoning leads us to the conclusion that in the case of a nonnegative diffusivity of sound $\delta \ge 0$, all poles need to have nonpositive real part. However, the complex analysis argument from the proof of Lemma 11.1 via Rouché's theorem, using as a bounding function the dominant power part $f(z) = az^{2+\alpha} + c^2\lambda$, does not seem to carry over directly to the fz case. This is basically due to the fact that we cannot say anything about the number of roots of the nonpolynomial function $f$. Additionally, asymptotics in terms of powers of $s$ will be much less effective here since, for small $a$ and/or $\alpha$, the term $s^2$ will be de facto dominant even for relatively large magnitudes of $s$. Therefore we have to take a different path to conclude that also in the fz case, the poles lie in the left-hand complex plane. We do so by means of energy estimates similar to those in Section 7.2.2, which basically correspond to the mentioned physical argument. As a (partial) counterpart to Lemma 11.1 in the ch case, we state the following.

Lemma 11.3. The roots of $\omega^{fz}(s)$ with $\beta = \alpha$ and $\delta := b - c^2 a \ge 0$ lie in the left-hand complex plane.

Proof. We consider the following initial value problem for the relaxation equation

(11.23) $a\partial_t^{2+\alpha} w + w'' + b\lambda\partial_t^{\beta} w + c^2\lambda w = 0$, $w(0) = 0$, $w'(0) = 1$, $w''(0) = 0$.

The Laplace transform $\hat w$ of its solution satisfies

$(as^{2+\alpha} + s^2 + b\lambda s^{\beta} + c^2\lambda)\,\hat w(s) = as^{\alpha} + 1,$
and therefore $\hat w(s) = \frac{as^{\alpha} + 1}{\omega^{fz}(s)}$. Now if $\beta = \alpha$ and $\delta := b - c^2 a \ge 0$, analogously to the proof of Theorem 7.12, we obtain an energy estimate for $\tilde w := a\partial_t^{\alpha} w + w$ by multiplying (11.23) with $\tilde w_t$ and integrating with respect to time:

$$\frac12|\tilde w_t(t)|^2 + \frac{c^2\lambda}{2}|\tilde w(t)|^2 + \delta\lambda\Bigl(\frac{a}{2}|\partial_t^{\alpha} w(t)|^2 + \frac{1}{2\Gamma(\alpha)t^{1-\alpha}}\int_0^t |\partial_t^{\alpha} w(\tau)|^2\,d\tau\Bigr) \le \frac12|\tilde w_t(0)|^2 + \frac{c^2\lambda}{2}|\tilde w(0)|^2$$

for all $t \ge 0$. This implies uniform boundedness

$$|\tilde w(t)| \le \sqrt{(c^2\lambda_{\min})^{-1}|\tilde w_t(0)|^2 + |\tilde w(0)|^2} =: C$$

by a constant independent of $\lambda$. Taking Laplace transforms,

$$|\hat{\tilde w}(s)| = \Bigl|\int_0^\infty e^{-st}\,\tilde w(t)\,dt\Bigr| \le \int_0^\infty e^{-\mathrm{Re}(s)t}\,dt\; C = \frac{C}{\mathrm{Re}(s)}$$

for $\mathrm{Re}(s) > 0$, we see that $\hat{\tilde w}(s)$ cannot have any poles in the right half-plane. Due to the identity

$$\hat{\tilde w}(s) = (as^{\alpha} + 1)\hat w(s) - as^{\alpha-1}w(0) = (as^{\alpha} + 1)\hat w(s) = \frac{(as^{\alpha} + 1)^2}{\omega^{fz}(s)}$$

(where the numerator has no zeros in case $\alpha \in (0, 1)$), the assertion follows.

The effect of $\delta$ on the poles in the fz model can also be assessed by means of the implicit function theorem, applied to the function

$$f(r, \theta; \delta) = \begin{pmatrix} ar^{2+\alpha}\cos((2+\alpha)\theta) + r^2\cos(2\theta) + b\lambda r^{\beta}\cos(\beta\theta) + c^2\lambda \\ ar^{2+\alpha}\sin((2+\alpha)\theta) + r^2\sin(2\theta) + b\lambda r^{\beta}\sin(\beta\theta) \end{pmatrix},$$

whose zeros are the magnitudes and arguments of the roots $s = re^{i\theta}$ of $\omega^{fz}$; see [189] for details. As a result one sees that increasing $\delta$ tends to move the poles into the left-hand complex plane, which is intuitive in view of its physical role as a diffusivity of sound.
Figure 11.4. Roots of $\omega^{fz}(s)$ for various $\delta$ values ($\delta = 0.1, 0.5, 1.0, 2.0$) with (left) $\alpha = \beta = \frac12$, and (right) $\alpha = \beta = \frac{9}{10}$.
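The root computation described in Section 11.1.4.1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the figures: the function name `poles` and the representation of $\omega$ as a list of `(coef, m)` pairs standing for terms $\mathrm{coef}\cdot s^{m/q}$ are hypothetical choices made for this example, and the parameter values are illustrative. The sketch substitutes $s = z^{q}$, takes starting values from `numpy.roots` (which computes the eigenvalues of the companion matrix), keeps only those $z$ with $|\arg z| < \pi/q$ so that $s^{m/q} = z^{m}$ holds on the principal branch, and then polishes the result with Newton's method.

```python
import numpy as np

def poles(terms, q, newton_steps=20):
    """Roots of omega(s) = sum(coef * s**(m/q)) on the principal branch.

    terms: list of (coef, m) with integer m >= 0 (hypothetical input format).
    Returns the refined roots (sorted by imaginary part) and omega itself.
    """
    degree = max(m for _, m in terms)
    coeffs = np.zeros(degree + 1, dtype=complex)
    for coef, m in terms:
        coeffs[degree - m] += coef            # np.roots expects highest power first
    z = np.roots(coeffs)                      # eigenvalues of the companion matrix
    z = z[np.abs(np.angle(z)) < np.pi / q]    # sector in which (z**q)**(m/q) == z**m
    s = z**q                                  # initial approximation of the poles

    def omega(s):
        return sum(coef * s**(m / q) for coef, m in terms)

    def domega(s):
        return sum(coef * (m / q) * s**(m / q - 1) for coef, m in terms if m > 0)

    for _ in range(newton_steps):             # Newton refinement of every root
        s = s - omega(s) / domega(s)
    return s[np.argsort(s.imag)], omega

# Illustrative parameters: beta = alpha = p/q = 1/2, lambda = pi^2, c = 1.
lam, c, q, p = np.pi**2, 1.0, 2, 1
b = 0.1
ch_terms = [(1.0, 2 * q), (b * lam, p), (c**2 * lam, 0)]      # beta_tilde = 1
a, delta = 0.1, 0.5
fz_terms = [(a, 2 * q + p), (1.0, 2 * q),
            ((delta + c**2 * a) * lam, p), (c**2 * lam, 0)]   # b = delta + c^2 a
s_ch, omega_ch = poles(ch_terms, q)
s_fz, omega_fz = poles(fz_terms, q)
```

The sector filter matters: roots $z$ of the polynomial outside $|\arg z| < \pi/q$ correspond to other branches of $s^{\beta}$ and are not poles of the relaxation function.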
Also here we have employed the method for numerically computing roots as described above. In particular, we use this in order to illustrate the influence of $\delta > 0$ on the behaviour of the roots; see Figure 11.4.

11.1.5. Uniqueness of eigenvalues from poles of relaxation functions. We will now show that all eigenvalues of the operator $A$ can be uniquely recovered from knowledge of the poles of all solutions to the relaxation equations.

From Lemma 11.1 we conclude that in the case of Caputo–Wismer–Kelvin or Chen–Holm, the denominator of the relaxation function $\omega_\lambda^{ch}(s) = s^2 + b\lambda^{\tilde\beta}s^{\beta} + c^2\lambda$ has precisely two complex-conjugate zeros $p_+^{ch}(\lambda)$, $p_-^{ch}(\lambda)$, which lie in the left-hand complex plane. For fz, we first consider the particular parameter configuration (corresponding to vanishing diffusivity of sound)

(11.24) $b = ac^2$ and $\beta = \alpha$,

in which we can factorise the denominator of the relaxation function $\omega^{fz}(\lambda, s) = (as^{\alpha} + 1)(s^2 + c^2\lambda)$ and get the roots

$p_0^{fz} = -\frac{1}{a}$ (only in case $\beta = \alpha = 1$), $\quad p_\pm^{fz}(\lambda) = \pm ic\sqrt{\lambda}$.

Note that $p_0^{fz}$ is independent of $\lambda$, but $p_\pm^{fz}(\lambda)$ obviously allows us to distinguish between different $\lambda$'s. This distinction is possible in general. As an additional result, which is not needed for the uniqueness proof but might be convenient for the computation of poles and residues, we state that the poles are single in certain cases.
Lemma 11.4. The poles of $\hat w^{ch}$ and of $\hat w^{fz}$ (except for $p_0^{fz}$ in case $\beta = \alpha = 1$) differ for different $\lambda$. Moreover, in the case ch and in the case fz with (11.24) the poles are single.

Proof. For ch, let $f(z) = z^2 + c^2\lambda$, $g(z) = b\lambda^{\tilde\beta}z^{\beta}$. Then for a sufficiently large $R > c^2\lambda$, let $C_R$ be the circle of radius $R$, centre at the origin. Then $|g(z)| < |f(z)|$ on $C_R$ and so Rouché's theorem shows that $f(z)$ and $(f+g)(z)$ have the same number of roots, counted with multiplicity, within $C_R$. For $f$ these are only at $z = \pm i\sqrt{\lambda}\,c$, so the same must be true of $f + g$, and so $\omega^{ch}$ has precisely one single root in the second and in the third quadrant, respectively.

Suppose now that both $\hat w_\lambda^{ch}$ and $\hat w_{\check\lambda}^{ch}$ have a pole at $re^{i\theta}$, where $\pi/2 < \theta < \pi$. Then for $s = re^{i\theta}$

$s^2 + b\lambda^{\tilde\beta}s^{\beta} + c^2\lambda = 0, \qquad s^2 + b\check\lambda^{\tilde\beta}s^{\beta} + c^2\check\lambda = 0,$

so that

$$\frac{c^2(\lambda - \check\lambda)}{b(\lambda^{\tilde\beta} - \check\lambda^{\tilde\beta})} = -s^{\beta}.$$

Now if $\lambda \ne \check\lambda$, then the left-hand side is positive and real and so $\beta\theta = \pi$. This means that $\theta > \pi$, a contradiction.

In case of fz, assuming that $p$ is a pole of both $\hat w_\lambda^{fz}$ and $\hat w_{\check\lambda}^{fz}$, we have

$0 = \omega_\lambda(p) - \omega_{\check\lambda}(p) = (\lambda - \check\lambda)[bp^{\beta} + c^2],$
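where due to $\omega_\lambda(p) = 0$, the term in brackets satisfies $bp^{\beta} + c^2 = -\frac{p^2}{\lambda}(ap^{\alpha} + 1) \ne 0$, hence $\lambda = \check\lambda$. In the factorisable case (11.24) of fz, obviously all roots are single.

Finally, as another ingredient for the uniqueness proof in Section 11.1.6, we state that all residues are nonzero.

Lemma 11.5. The residues of the poles of $\hat w^{ch}$ and of $\hat w^{fz}$ (except for $p_0^{fz}$) do not vanish.

Proof. Recall that

$$\hat w_\lambda^{ch}(s) = \frac{s + b\lambda^{\tilde\beta}s^{\beta-1}}{s^2 + b\lambda^{\tilde\beta}s^{\beta} + c^2\lambda} \quad\text{and}\quad \hat w_\lambda^{fz}(s) = \frac{as^{1+\alpha} + s + b\lambda s^{\beta-1}}{as^{2+\alpha} + s^2 + b\lambda s^{\beta} + c^2\lambda},$$

so in both cases the Laplace transform of the relaxation function is of the form $\hat w_\lambda(s) = \frac{d(s)}{\omega(s)}$, where $d(s) = \frac{\omega(s) - c^2\lambda}{s}$. By l'Hôpital's rule and due to the fact that the poles are single, we get

$$\mathrm{Res}(\hat w; p) = \lim_{s\to p}\frac{(s-p)\,d(s)}{\omega(s)} = \lim_{s\to p}\frac{d(s) + (s-p)\,d'(s)}{\omega'(s)} = \frac{d(p)}{\omega'(p)} = \frac{\omega(p) - c^2\lambda}{p\,\omega'(p)} = -\frac{c^2\lambda}{p\,\omega'(p)} \ne 0.$$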
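The residue formula at the end of this proof can be checked numerically in the ch case. The following is a small sketch under stated assumptions: the parameter values ($b = 0.1$, $c = 1$, $\lambda = \pi^2$, $\beta = \frac12$, $\tilde\beta = 1$) and the step size `h` are illustrative choices, and the pole is located by a plain Newton iteration started at the undamped pole $ic\sqrt{\lambda}$. It compares $-c^2\lambda/(p\,\omega'(p))$ with the difference quotient $(s - p)\hat w(s)$ evaluated at $s = p + h$.

```python
import numpy as np

# Illustrative parameter values (assumptions for this sketch), beta_tilde = 1.
b, c, lam, beta = 0.1, 1.0, np.pi**2, 0.5

def omega(s):
    return s**2 + b * lam * s**beta + c**2 * lam

def domega(s):
    return 2 * s + beta * b * lam * s**(beta - 1)

def w_hat(s):
    # Laplace transform of the ch relaxation function, d(s) / omega(s)
    return (s + b * lam * s**(beta - 1)) / omega(s)

# Newton iteration for the pole in the upper half-plane,
# started at the undamped pole i*c*sqrt(lambda).
p = 1j * c * np.sqrt(lam)
for _ in range(50):
    p = p - omega(p) / domega(p)

residue_formula = -c**2 * lam / (p * domega(p))
h = 1e-6
residue_limit = h * w_hat(p + h)     # (s - p) * w_hat(s) evaluated at s = p + h
```

Both quantities agree up to an error of order `h`, and neither vanishes, in line with Lemma 11.5.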
11.1.6. Unique determination of the initial condition. The uniqueness result we show here is valid for all three models (11.6), (11.7), (11.8) and, based on the analysis we have carried out in Section 11.1.5, will be done in a unified manner for all of them.

Suppose now we have obtained time-trace data according to (11.3). Further, we assume that $c_0^2$, $c^2$, and $b$ are known. In order to extract from this information all Fourier coefficients $(a_j)_{j\in\mathbb{N}}$ of the initial state $u_0$, in one space dimension with $\Sigma = \{x_0\}$, we assume that no eigenfunction vanishes at $x_0$. In the higher dimensional setting with eigenvalues $\lambda$ of multiplicity $\#K^{\lambda}$ possibly larger than one, we make an assumption of linear independence of eigenfunctions. For each eigenvalue $\lambda$ of $A$ with eigenfunctions $(\varphi_k)_{k\in K^{\lambda}}$, the restrictions of the eigenfunctions to the observation manifold are supposed to be linearly independent, that is, the following implication holds for any coefficient set $(b_k)_{k\in K^{\lambda}}$:

(11.25) $\Bigl(\sum_{k\in K^{\lambda}} b_k\varphi_k(x) = 0 \text{ for all } x \in \Sigma\Bigr) \implies \Bigl(b_k = 0 \text{ for all } k \in K^{\lambda}\Bigr).$
We will also encounter this condition in the context of other reconstruction tasks from time trace data in Sections 11.2 and 11.3.

Theorem 11.1. Suppose the domain $\Omega$ and the operator $A$ are known. Then provided condition (11.25) holds, we can uniquely recover the initial value $u_0(x)$ from time trace measurements (11.3) on $\Sigma$.

Proof. Since the inverse problem under consideration is linear, it suffices to prove that vanishing observations necessarily result in zero initial conditions. We apply the Laplace transform to the data and consider the pole locations $p_{+,\ell}$ and residues, making use of the fact that by Lemma 11.4, $\lim_{s\to p_{+,\ell}}(s - p_{+,\ell})\hat w_{\lambda_j}(s) = 0$ for $j \ne \ell$. This yields

$$0 = \mathrm{Res}(\hat h, p_{+,\ell}) = \lim_{s\to p_{+,\ell}}(s - p_{+,\ell})\sum_{j=1}^{\infty}\hat w_j(s)\,a_j\varphi_j(x_0) = \mathrm{Res}(\hat w_\ell, p_{+,\ell})\sum_{k\in K^{\lambda_\ell}} a_k\varphi_k(x_0),$$

for all $x_0 \in \Sigma$, which by condition (11.25) and Lemma 11.5 implies $a_j = 0$ for all $j$ and hence $u_0 = 0$.

Remark 11.1. The proof shows that we recover not only the initial data but also the eigenvalues of $A$, which gives a possibility of additionally retrieving information on the coefficients of $A$; see Section 11.2.1.

Remark 11.2. Condition (11.25) is in particular satisfied in the case of one space dimension, $\Sigma = \{x_0\}$, where all eigenvalues of $A$ are simple (cf. Theorem 9.9), that is, $\#K^{\lambda_m} = 1$ for all $m$, provided none of the eigenfunctions vanish at $x_0$; e.g., this can be achieved by taking $x_0$ on the boundary and letting $\varphi_j$ be subject to non-Dirichlet conditions.

To give an example in higher space dimensions, let us consider the unit disc $\Omega$, where the eigenvalues and eigenfunctions of the negative Laplacian $A$ with homogeneous Dirichlet boundary values are given in terms of the Bessel functions $J_\ell$ and their positive roots $\mu_{\ell,n}$ in polar coordinates $x = (r\cos\theta, r\sin\theta)$, $r \in [0, 1]$, $\theta \in [0, 2\pi)$ as follows:

$\lambda_{\ell,n} = \mu_{\ell,n}^2, \quad \varphi_{\ell,n,1}(r, \theta) = J_\ell(\mu_{\ell,n}r)\cos(\ell\theta), \quad \varphi_{\ell,n,2}(r, \theta) = J_\ell(\mu_{\ell,n}r)\sin(\ell\theta).$
For a fixed eigenvalue $\lambda$ of $A$, the index set $K^{\lambda}$ is therefore given by

(11.26) $K^{\lambda} = \{(\ell, n) \in \mathbb{N} : \mu_{\ell,n}^2 = \lambda\} = \{(\ell, n(\ell)) : \ell \in M^{\lambda}\}.$

The set $M^{\lambda}$ is finite, since if $\lambda = \mu_{\ell,n}^2$ is an eigenvalue of $A$ for any $\ell \in \mathbb{N}$, then the corresponding number $n(\ell)$ of the Bessel function root is clearly unique since these roots are single. To satisfy (11.25), we select $r_* \in (0, 1)$ in such a way that $\sqrt{\lambda}\,r_*$ is not a root of any Bessel function (which is actually the generic case) and choose the circle $\Sigma = \{(r_*\cos\theta, r_*\sin\theta),\ \theta \in [0, 2\pi)\}$ as an observation surface. Indeed this can easily be seen to satisfy (11.25) as follows. The premise is written, in terms of the index set $K^{\lambda}$ according to (11.26) specific to our setup, whose indexing also applies to the arbitrary coefficients $b_k$, as

$$\sum_{\ell\in M^{\lambda}} c_\ell\bigl(b_{\ell,n(\ell),1}\cos(\ell\theta) + b_{\ell,n(\ell),2}\sin(\ell\theta)\bigr) = 0 \quad\text{for all } \theta \in [0, 2\pi),$$

where $c_\ell = J_\ell(\sqrt{\lambda}\,r_*) \ne 0$ for all $\ell \in \mathbb{N}_0$. Taking the $L^2(0, 2\pi)$ inner product with $\cos(j\theta)$ and $\sin(j\theta)$ (or simply using linear independence of the functions $\cos(\ell\theta)$, $\sin(\ell\theta)$, $\ell \in M^{\lambda}$), we conclude that $c_\ell b_{\ell,n(\ell),1} = 0$ and $c_\ell b_{\ell,n(\ell),2} = 0$ for all $\ell \in M^{\lambda}$, that is, due to the fact that $c_\ell \ne 0$, all the coefficients $b_{\ell,n(\ell),i}$, $\ell \in M^{\lambda}$, $i \in \{1, 2\}$ vanish. Alternatively, one could choose $\Sigma$ to be a diameter of the disc at an angle $\theta$ avoiding the zeros of $\cos(\ell\theta)$, $\sin(\ell\theta)$ and make use of the fact that the Bessel functions are linearly independent.

This construction principle carries over to other geometries and higher space dimensions whenever the eigenfunctions allow for a separation of variables. Obvious examples for this are spheres or cuboids, where the eigenfunctions are composed of spherical harmonics and/or trigonometric functions. To see this, recall that (11.25) simply says that the eigenfunctions, when restricted to the observation surface, are linearly independent. So if $\Sigma$ is oriented along one of the directions of separability, one can make use of the linear independence of the eigenfunction factors in the other direction. For an investigation on how to choose the observation location in a related problem using separable eigenfunctions, as well as numerical reconstruction results, see [304].

A sufficient condition for (11.25) to hold is that there exist points $x_{0,\lambda,1}, \dots, x_{0,\lambda,N^{\lambda}} \in \Sigma$, $N^{\lambda} \ge \#K^{\lambda}$, such that the matrix $\bigl(\varphi_k(x_{0,\lambda,i})\bigr)_{k\in K^{\lambda},\, i\in\{1,\dots,N^{\lambda}\}}$ has full rank $\#K^{\lambda}$. Accumulating this over the sequence of eigenvalues $\lambda$ of $A$ and bearing in mind the fact that in higher space dimensions their multiplicity can become arbitrarily large (e.g., in the cuboid example above), we conclude that the cardinality of $\Sigma$ will typically have to be infinite. Still this can allow $\Sigma$ to consist of a countable discrete sequence of points.
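The full-rank criterion can be tried out on a simple separable geometry. The sketch below is an illustration under stated assumptions, not taken from the text: it uses the unit square with Dirichlet eigenfunctions $2\sin(m\pi x)\sin(n\pi y)$ and eigenvalues $\pi^2(m^2 + n^2)$, takes a horizontal line $y = y_*$ as a hypothetical observation set $\Sigma$, and the helper names `index_set` and `rank_on_line`, the sample points, and the rank tolerance are all choices made for this example. For the triple eigenvalue $\lambda = 50\pi^2$ it compares a suitable line with one lying in the zero set of one of the eigenfunctions.

```python
import numpy as np

def index_set(n2, max_index=50):
    """All (m, n) with m^2 + n^2 == n2, i.e. the eigenvalue lambda = pi^2 * n2."""
    return [(m, n) for m in range(1, max_index) for n in range(1, max_index)
            if m * m + n * n == n2]

def rank_on_line(n2, y_star, num_points=40):
    """Rank of the matrix (phi_k(x_i, y_star)) for sample points on y = y_star."""
    K = index_set(n2)
    x = np.linspace(0.05, 0.95, num_points)
    # One column per eigenfunction, one row per observation point.
    Phi = np.array([2 * np.sin(m * np.pi * x) * np.sin(n * np.pi * y_star)
                    for m, n in K]).T
    return np.linalg.matrix_rank(Phi, tol=1e-10), len(K)

good = rank_on_line(50, 0.3)   # no factor sin(n*pi*y_star) vanishes
bad = rank_on_line(50, 0.2)    # sin(5*pi*0.2) = 0: mode (5, 5) is invisible
```

On the line $y_* = 0.3$ the three restricted eigenfunctions are independent, whereas on $y_* = 0.2$ the mode $(5, 5)$ vanishes identically and the rank drops by one, so the corresponding component of $u_0$ could not be recovered from data on that line.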
The linear combinations on the left-hand side of (11.25) are all the eigenfunctions corresponding to $\lambda$. Thus, another way to view condition (11.25) is via zero sets of eigenfunctions; see, e.g., [224, 225, 361]. The observation surface $\Sigma$ should be chosen such that no eigenfunction of $A$ vanishes identically on it, at least for those components of $u_0$ that we want to recover. (In particular, due to noise in the data, one will aim at recovering finitely many components $a_j = \langle u_0, \varphi_j\rangle$ only. Then it suffices to guarantee (11.25) for the eigenvalues and eigenfunctions corresponding to these components.) Since zero level sets of eigenfunctions of the Laplacian (or any elliptic operator with analytic coefficients) are analytic, it should help to choose $\Sigma$ as a rough manifold.

Remark 11.3. These uniqueness results easily carry over to the original formulation of the imaging task as an inverse source problem for (11.1) with homogeneous initial conditions $u(x, 0) = 0$ and $u_t(x, 0) = 0$, where we replace $\frac{\partial^2}{\partial t^2} - c_0^2\Delta = \frac{\partial^2}{\partial t^2} + A$ by some damped wave operator $D$ in (11.1). We assume that the illumination profile only depends on time, thus $I_t(x, t) = f(t)$, and we rename $a(x)c_0^2(x)$ to $a(x)$. We can make use of the following formalism for all the attenuation models (11.6), (11.7), and (11.8).

The initial value problem for the pde is

(11.27) $Du = a(x)f(t), \quad u(x, 0) = 0, \quad u_t(x, 0) = 0$

(and $u_{tt}(x, 0) = 0$ if time derivatives of strictly higher order than two appear in $D$). With $u(x, t) = \sum_{j=1}^{\infty} u_j(t)\varphi_j(x)$, $a(x) = \sum_{j=1}^{\infty} a_j\varphi_j(x)$ this leads to the relaxation equations

$D_j u_j = a_j f, \quad u_j(0) = 0, \quad u_j'(0) = 0,$

with the Laplace transform (using the fact that all initial conditions are homogeneous)

$\omega_j(s)\hat u_j(s) = a_j\hat f(s),$

that is, $\hat u_j(s) = a_j\hat f(s)\hat w_j(s)$, where $\hat w_j(s) = \frac{1}{\omega_j(s)}$. Inserting the overposed data

$h(x_0, t) = u(x_0, t), \quad t > 0, \quad x_0 \in \Sigma,$

we get

$$\hat h(x_0, s) = \sum_{j=1}^{\infty}\hat u_j(s)\varphi_j(x_0) = \sum_{j=1}^{\infty} a_j\hat f(s)\hat w_j(s)\varphi_j(x_0).$$

Now considering a sequence of poles $\{p_\ell\}$ of $\hat w_\ell$ and using the fact that they are distinct for different $\lambda_\ell$'s according to Lemma 11.4, taking the residues yields

$$\mathrm{Res}(\hat h, p_\ell) = \lim_{s\to p_\ell}\sum_{j=1}^{\infty} a_j(s - p_\ell)\hat f(s)\hat w_j(s)\varphi_j(x_0) = \mathrm{Res}(\hat f\hat w_\ell, p_\ell)\sum_{k\in K^{\lambda_\ell}} a_k\varphi_k(x_0).$$

Hence we obtain uniqueness of $(a_j)_{j=1}^{\infty}$ from time trace data provided that for all $\ell \in \mathbb{N}$, $\lambda = \lambda_\ell$ satisfies (11.25) and

(11.28) $\mathrm{Res}(\hat f\hat w_\ell, p_\ell) \ne 0.$
An interesting question in this context is how to choose the illumination function $f$ such that a good reconstruction of $a$ is achieved. A minimal condition on $f$ is (11.28), but it would be important to determine broader conditions analytically and/or numerically.

11.1.7. Recovery algorithm for the initial condition. In this section we confine our exposition and numerical tests to the Caputo–Wismer–Kelvin model. The assumption is that we have measured the solution profile $h(t) := u(x_0, t)$ at a fixed point $x_0 \in [0, 1]$, leading to the problem of recovering $a_j := \langle u_0, \varphi_j\rangle$, where $\{\varphi_j\}$ are the complete eigenfunctions of $L$ and these are assumed known, as are the spectrum $\{\lambda_j\}$ and the (constant) wave speed $c$. Taking Laplace transforms gives

(11.29) $$\hat h(s) = \sum_{k=1}^{\infty} a_k\hat w_k(s)\varphi_k(x_0) = \sum_{k=1}^{\infty}\frac{s + b\lambda_k s^{\beta-1}}{s^2 + b\lambda_k s^{\beta} + c^2\lambda_k}\,\varphi_k(x_0)\,a_k.$$
The sum in equation (11.29) can be limited to a value $N$, and given a sequence $\{s_j\}_{j=1}^{M}$ of (possibly complex valued) numbers, this leads to an $M \times N$ matrix $\mathbf{M}$ and a subsequent equation to recover $\{\varphi_k(x_0)a_k\}_{k=1}^{N}$. Hence, provided we can bound $\varphi_k(x_0)$ away from zero, this can be used to recover $\{a_j\}$, and hence the first $N$ components of $u_0(x)$. We know that for fixed $N$ and sufficiently large $M$, the matrix $\mathbf{M}$ will be invertible, but the question is how its singular values behave, which in turn will determine the degree of ill-conditioning of the overall problem.

Another strategy, and the one we will present here, is to use the representations (11.19), (11.16) for computing $w_j(t)$ according to (11.18) and combine it with (11.12) to get a solution of the direct problem involving $u_0(x)$. Then setting $x = x_0$, we have a direct representation for $h(t) = u(x_0, t)$, which we assume is available as measured data. The algorithm proceeds as follows:
• Since we know the spectrum $\{\lambda_j\}$, we can compute the values of the two components of $w_j(t)$ in equation (11.18): the first is direct, the second requires the quadrature of the infinite integral, but this can be handled by standard means.
• We thus can form a $T \times N$ matrix $W = (w_j(t_i))_{i=1,\dots,T,\ j=1,\dots,N}$, where $T$ is the number of sample points taken in $\{h(t_i)\}_{i=1}^{T}$ and $N$ is the maximum frequency $\{\lambda_j\}_{j=1}^{N}$ allowed in the representation of $u_0$.
• To perform this calculation, we must integrate the function $H_\beta(r)$ multiplied by $e^{-rt}$, and also we must recover the pole locations $p_+$ for each value of $j$. Finding the zeros of $\omega_{\lambda_j}(s)$ is simplified if we take $\beta$ to be rational, $\beta = p/q$. In this situation the roots are $q$th powers of the roots of a polynomial, in which case there are fast algorithms for their recovery. Otherwise a Newton or other iterative method would be required.
• Now it is a straightforward matrix singular value decomposition inversion of $W$ in solving $h = W\tilde a$, where $\tilde a_j = \varphi_j(x_0)a_j$, and then a division to recover the Fourier coefficients $\{a_j\}$ of $u_0(x)$.
• The choice of point $x_0$ is of course critical in this last division. It needs to be chosen such that none of the eigenfunctions vanishes there. In the one dimensional Laplacian with Dirichlet conditions, the zeros of the combined eigenfunctions are not equally spaced, and for modest values of $N$ there are sufficiently large gaps to make the process tractable. In our reconstructions we used $N = 10$ and chose $x_0 = 0.95$.

In the case of two or more spatial dimensions, multiple eigenvalues may occur, but on the other hand we will also have observations on a surface $\Sigma$ instead of just a single point $x_0$. All eigenfunctions corresponding to the same value of $\lambda$ will give the same $w_\lambda(t)$. Condition (11.25) still gives a means to discern between contributions in $u_0$ corresponding to different eigenfunctions of the same $\lambda$.
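The last two steps of the algorithm (singular value decomposition inversion of $W$, then division by $\varphi_j(x_0)$) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation behind the reconstructions reported here: to stay self-contained it replaces the fractional relaxation functions of (11.18) by the closed-form solution for the special case $\beta = 1$, whose Laplace transform agrees with the summand in (11.29) for $\beta = 1$, and it uses a uniform time grid, noise-free synthetic data, and hypothetical helper names `relaxation_beta1` and `tsvd_solve`. The values $N = 5$, $b = 0.01$, and the coefficient vector `a_true` are illustrative.

```python
import numpy as np

def relaxation_beta1(t, lam, b, c):
    """Solution of w'' + b*lam*w' + c^2*lam*w = 0, w(0) = 1, w'(0) = 0.

    Stand-in for w_j(t) in the special case beta = 1 (assumes gamma^2 != c^2*lam).
    """
    gamma = b * lam / 2
    mu = np.sqrt(complex(gamma**2 - c**2 * lam))
    s_plus, s_minus = -gamma + mu, -gamma - mu      # the two poles
    w = (s_plus * np.exp(s_minus * t) - s_minus * np.exp(s_plus * t)) / (s_plus - s_minus)
    return w.real

def tsvd_solve(W, h, k):
    """Least-squares solution of W x = h using only the k largest singular values."""
    U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ h) / sigma[:k])

N, T, x0, b, c = 5, 200, 0.95, 0.01, 1.0
t = np.linspace(0.0, 10.0, T)
j = np.arange(1, N + 1)
lam = (j * np.pi)**2                                  # Dirichlet eigenvalues on (0, 1)
W = np.column_stack([relaxation_beta1(t, l, b, c) for l in lam])   # T x N matrix
phi_x0 = np.sqrt(2) * np.sin(j * np.pi * x0)          # eigenfunction values at x0
a_true = np.array([1.0, -0.5, 0.3, 0.2, -0.1])        # illustrative coefficients
h = W @ (phi_x0 * a_true)                             # synthetic noise-free data
a_rec = tsvd_solve(W, h, k=N) / phi_x0                # invert, then divide
```

With noisy data one would lower `k` so that only singular values above the noise level are inverted; the division by `phi_x0` is the step that makes the choice of $x_0$ critical.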
Figure 11.5 shows the values of $h(t) = h_\beta(t)$ for $\beta = 0.1, 0.5, 0.9$ when $u_0 = \sin(\pi x)$, that is, $u_0$ is an exact Fourier mode, and for a more complex $u_0$, band-limited to $N = 10$ but with all frequencies nontrivial. One easily sees the difference in the profiles with $\beta = 0.1$ when the number of oscillations present clearly contains information on $u_0$, as opposed to the case of $\beta = 0.9$ when these have been much diluted and quickly washed out for larger values of $t$ due to the more rapid decay of $h_\beta(t)$.

Figure 11.5. Profiles of $h_\beta(t)$ for different $\beta$ ($\beta = 0.9$, $\beta = 0.5$, $\beta = 0.1$); left-hand figure with smooth $u_0$, right-hand figure with rough $u_0$.

The conditioning of the matrix $W$ depends on the three parameters $b$, $c$, and $\beta$ (and of course on the domain size which is reflected in the eigenvalues $\{\lambda_n\}$). Figure 11.6 indicates the variation with respect to the exponent $\beta$ by showing singular values for the same three $\beta$ values as in Figure 11.5. For small $\beta$ the matrix $W$ is numerically invertible for quite large values of $N$, but this situation changes strongly as $\beta$ increases toward unity. Even for $\beta = \frac12$ and with data subject to very small errors, it is unlikely that one would be able to use more than about seven singular values in a reconstruction, and this drops to four or five by $\beta = 0.9$.

Figure 11.6. Singular values of $W$ with different values of $\beta$.

The computation of $w_j(t)$ was done over the time interval $[0, 100]$ on a graded mesh of 500 points so that at $t = 0$ the stepsize was $10^{-3}$ and at $t = 100$ the stepsize was 5. The small stepsize initially was essential for $\beta$ near unity, while the large final time value was important for the smaller values of $\beta$. The values of $b$ and $c$ (here both were taken to be unity) also affect these figures, and we show the dependence on these parameters in Figure 11.8.

Figure 11.7 shows the reconstructions of a $u_0$ with ten significant modes for $\beta = 0.1, 0.5, 0.9$, and under both 0.1% and 1% uniformly distributed random noise in g. For $\beta = 0.1$, the reconstruction is indistinguishable from the actual function $u_0$, and these reconstructions are very much in keeping with what one would expect given the decay of the singular values shown in Figure 11.6. Even at the higher noise level, the discrepancy between the reconstruction and the actual $u_0$ is barely visible. When $\beta = 0.9$, there are only four significant singular values given either level of noise, and so there is no possibility of seeing fine features. When $\beta = 0.5$, a few more of these become evident as about two more singular values are relevant at the lower noise level and one at the higher noise level.
Figure 11.7. Recovery of $u_0(x)$ from $h_\beta(t)$ (actual $u_0$ and reconstructions for $\beta = 0.9$, $\beta = 0.5$, $\beta = 0.1$): left-hand side, with 1% noise, and right-hand side, with 0.1% noise.
Similar information can be gleaned from the profiles of $h_\beta$ shown in Figure 11.5. The high frequency information evident in the case of $\beta = 0.1$ is not discernible at the higher values of $\beta$, showing that reconstruction of these will not be feasible even under very small noise levels.

In the above numerical simulations we took fixed unity values of the damping coefficient $b$ and the sound speed $c$. We now look at the effect that different values of these important parameters have on the reconstruction process. It is clear that a decrease in $b$ will lessen the damping effect and bring the equation nearer to the wave equation itself, thus toward less ill-conditioning. But the question remains as to the magnitude of this effect and how it is expressed. In the left-hand graph of Figure 11.8 we show the change in the condition number of the matrix $W$ when restricted to frequencies $1 \le n \le N = 10$; that is, we take $\sigma = \mathrm{svd}(W)$ and evaluate $\mathrm{cond}(W) = \sigma_1/\sigma_N$ (in fact we take this quantity on a logarithmic scale). The graphic shows the approximately exponential increase with $b$. Given the fact that our reconstruction of $u_0$ depends critically on an approximate inversion of $W$, this condition number of a truncated svd is a good measure of how such a reconstruction would perform.

For the dependence on wave speed $c$, one might expect that higher values would offset the damping effect. This is indeed the case as shown in the right-hand graph in Figure 11.8, again using the condition number of $W$ as a measure of how well one might be able to invert $W$ and reconstruct $u_0(x)$. The expected decrease in these quantities is clearly evident.
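The quantity $\log_{10}(\sigma_1/\sigma_N)$ used as a conditioning measure can be computed as in the following sketch. It rests on several assumptions that differ from the setting of Figure 11.8: the columns of $W$ are built from the closed-form relaxation function of the special case $\beta = 1$ rather than from the fractional $w_j(t)$ of (11.18), the time grid is uniform on $[0, 10]$, and $N = 5$; the helper names `w_beta1` and `log10_cond` are hypothetical. The numbers it produces are therefore not those of the figure and only illustrate the qualitative growth of the condition number with $b$.

```python
import numpy as np

def w_beta1(t, lam, b, c):
    """Relaxation function for beta = 1: w'' + b*lam*w' + c^2*lam*w = 0,
    w(0) = 1, w'(0) = 0 (assumes b*lam/2 != c*sqrt(lam))."""
    gamma = b * lam / 2
    mu = np.sqrt(complex(gamma**2 - c**2 * lam))
    s_plus, s_minus = -gamma + mu, -gamma - mu
    w = (s_plus * np.exp(s_minus * t) - s_minus * np.exp(s_plus * t)) / (s_plus - s_minus)
    return w.real

def log10_cond(b, c, N=5, T=500, t_end=10.0):
    """log10 of sigma_1 / sigma_N for the T x N matrix W = (w_j(t_i))."""
    t = np.linspace(0.0, t_end, T)
    lam = (np.arange(1, N + 1) * np.pi)**2
    W = np.column_stack([w_beta1(t, l, b, c) for l in lam])
    sigma = np.linalg.svd(W, compute_uv=False)        # singular values, descending
    return np.log10(sigma[0] / sigma[-1])

weak_damping = log10_cond(b=0.01, c=1.0)
strong_damping = log10_cond(b=2.0, c=1.0)
```

In this stand-in model the strongly damped columns all decay at nearly the same slow rate $-c^2/b$, which is one way to see why they become almost linearly dependent as $b$ grows.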
Figure 11.8. Condition number of $W$ with respect to $b$ and $c$ (left: $\log_{10}(\mathrm{cond}(W))$ against $b$; right: $\log_{10}(\mathrm{cond}(W))$ against $\log_{10}(c)$ for $\beta = \frac14$ and $\beta = \frac12$).
11.2. Reconstruction of the wave speed

We have seen that knowledge of the poles allows us to extract not only the initial condition but also information on the spectrum of the operator $A$. This in its turn can be used to reconstruct some of its coefficients. In ultrasound applications, it is often the speed of sound $c_0(x)$ that one is most interested in recovering. We focus our exposition on the Caputo–Wismer–Kelvin model.

11.2.1. Unique determination of the spatially varying wave speed $c_0(x)$. Our model equation is (11.5) with (11.6), but where $A$ is given by a more general elliptic operator $L$ whose spectrum is again denoted by $\{\lambda_j\}$, that is,

(11.30) $u_{tt} - Lu - bL\partial_t^{\beta}u = 0.$

In one space variable we are then able to recover information on $L$ from a chosen time-trace of the solution

$h(x_0, t) = u(x_0, t), \quad t > 0, \quad x_0 \in \Sigma,$

by making use of inverse Sturm–Liouville theory; see Section 9.2.2. One possibility is to recover the spatially varying wave speed $c_0(x)$. We first discuss the spatially one dimensional case $\Omega = (0, L)$ and later on point out how uniqueness extends to higher space dimensions. For this purpose we assume that $c_0$ is twice differentiable with $c_0''(x) \in L^2(\Omega)$. The reason for this will become apparent from the use of the Liouville transform below. We take the spatial interval to be $(0, L)$ and assume the boundary conditions in $L$ to be

(11.31) $\frac{\partial u}{\partial x}\Big|_{x=0} = 0, \qquad \Bigl(\frac{\partial u}{\partial x} + \gamma u\Bigr)\Big|_{x=L} = 0$

with an impedance parameter $\gamma$, $0 \le \gamma \le \infty$. An initial value $u(x, 0) = u_0(x)$ is given (we will say more on this shortly) and the initial velocity $u_t(x, 0) = 0$.
As overposed data, we measure the time trace at the left boundary

(11.32) $u(0, t) = h(t), \quad t > 0.$

As in the proof of Theorem 11.1 this setup allows the unique recovery of the spectrum of the operator $L$: From identity (11.29) and the fact that the poles of $w_j$ are distinct for different $j$, we conclude that we can recover the poles of $\hat h$, and therewith, according to Lemma 11.4, the eigenvalues of $A$, provided that no generalised Fourier mode of the initial data vanishes. This requirement is hard to verify in general. But in reality a randomly chosen $x_0$ for the measurement point is unlikely to effectively exclude the use of the first $N$ eigenvalues if $N$ is of modest value, and for smooth coefficients in the elliptic operator $L$, such a value is likely to suffice as measurement error in the data will become a greater factor for larger $N$.

There are several ways around this obstacle, beginning with the paper by Pierce [265], where the use of nonhomogeneous boundary values and taking the initial data to be zero were used in a parabolic setting. More recently, in [64], the authors took a subdiffusion model with $u_0(x)$ a Dirac delta function at one endpoint. In both these cases the problem is converted to an eigenvalue plus another data sequence for an inverse Sturm–Liouville problem, and then standard techniques can be used; cf. Section 9.2.

We take the latter line mentioned above, choosing $u_0(x) = \delta(x)$, and we thus obtain the solution representation for the direct problem as in (11.12),

(11.33) $h(t) = u(0, t) = \sum_{j=1}^{\infty} w_j(t)\,\varphi_j(0)^2.$
Now $\varphi_j(0)$ can never be zero, for otherwise the imposed Neumann condition at x = 0 would force the eigenfunction to be trivial. Thus after taking Laplace transforms, (11.33) shows that the time trace data h(t), t > 0, allows recovery of the poles and residues of $\hat w_j$. According to Lemma 11.4 and as in the proof of Theorem 11.1, this in turn provides the spectrum $\{\lambda_j\}$ as well as the left endpoint values $\{\varphi_j(0)\}$. We can use this to renormalise the eigenfunctions $\varphi_j(x)$ to $\phi_j(x)$ by dividing by the values $\varphi_j(0)$ so that now $\phi_j(0) = 1$, the traditional normalisation in Sturm–Liouville theory. Now we have the $L^2$ norms of the rescaled eigenfunctions as the norming constant data: $\rho_j := \|\phi_j\|_{L^2(0,1)} = |\varphi_j(0)|^{-1}$. The above information is sufficient to recover a single coefficient in the operator L. In the classical form this is usually taken as the potential q(x) when in Schrödinger form, $-u'' + q(x)u = \lambda u$. But using the Liouville transform, one can also adapt this to recover instead either p or r as in $-(pu')' = \lambda u$ or $-u'' = \lambda r u$: the conductivity and mass-density cases. The usual mapping between these is the Liouville transform (9.56), (9.57). If we assume that $(pr)''$ and $p'$ are continuous on the interval [a, b], then the full
11.2. Reconstruction of the wave speed
equation is transformed by $u(t) \to \tilde u(x)$ given by $x = \frac{1}{L}\int_a^t \sigma(s)\,ds$, $\tilde u = f u$, where L, σ, and f are defined by $L = \int_a^b \sigma(s)\,ds$, $\sigma(t) = \big(r(t)/p(t)\big)^{1/2}$, $f(t) = \big(p(t)r(t)\big)^{1/4}$, and the new independent variable x varies over the interval 0 ≤ x ≤ 1. It is easily checked that the boundary conditions of impedance type are also mapped into impedance conditions. If we assume that the coefficients possess the regularity required above, then we can, without loss of generality, work with the equation $-u'' + Q(x)u = \lambda u$ defined on the unit interval. Q(x) is the equivalent potential for the original problem. A standard strategy is to map the given eigenvalues/norming constants for the original problem into those for the potential form, then recover Q(x) from this data and transform back to recover p(x) or r(x). However there is a more convenient form of this transformation that dispenses with the various derivatives appearing above. Instead of mapping to potential form, one transforms to the impedance canonical form: $-(au')' = \lambda a u$. If we set $y = L^{-1}\int_0^x r(s)\,ds$ with $L = \int_0^1 r(s)\,ds$, then $-u'' = \lambda r u$ becomes $(av')' + (\lambda/L^2)\,av = 0$ on [0, 1]. The recovery of the impedance a can now be done using an iterative scheme that is quite analogous to the one based on the Gel'fand–Levitan mapping (I + K)f (see (9.96)) for $f(x) = \sin(\sqrt{\lambda}\,x)$ used for the recovery of the potential term
Q; see [299]. In fact, defining $M(x, t) = \frac{1}{\sqrt{a(x)}}\Big(1 + \int_0^t K(x, s)\,ds\Big)$, one easily checks that M satisfies
(11.34) $a(x)M_{tt} + \big(a(x)M_x\big)_x = 0, \quad 0 \le t \le x \le 1; \qquad M(x, 0) = \frac{1}{\sqrt{a(x)}}, \quad M_t(x, 0) = 0, \quad 0 \le x \le 1,$
and the theory goes through completely analogously to the potential case based on the operator K defined in (9.96). Note that this transformation does not involve the awkward derivatives appearing in the standard Liouville transform and is the preferred method for recovering either p(x) or r(x). See [299, 300] for details including those of reconstruction methods for each situation. We note that in this setting the equation $c_0^2(x)u_{xx} = \lambda u$ can be obtained from the mass-density case by taking $c_0^2(x) = 1/r(x)$, thus allowing recovery of the wave speed $c_0(x)$. In summary we have proven:
Theorem 11.2. In equation (11.30) if we take the initial data as $u_0(x) = \delta(x)$ and the boundary conditions as in (11.31), where the operator L has a single unknown, spatially dependent coefficient, then a time trace measurement h(t) = u(0, t) for t > 0 is sufficient to determine the unknown coefficient in Ω = (0, L).
As a consequence of the results in [50, 51, 256] (see Theorem 9.17), this extends to higher space dimensions.
Theorem 11.3. Theorem 11.2 remains valid in a bounded and sufficiently smooth domain $\Omega \subseteq \mathbb{R}^d$, d ≥ 2, if equation (11.30) is equipped with homogeneous Neumann boundary conditions.
In particular we have the following.
Corollary 11.1. With equation (11.30) and $L = c_0(x)^2\Delta$ under the assumptions of Theorem 11.2 or 11.3, the time trace data $h(x_0, t) = u(x_0, t)$, $x_0 \in \partial\Omega$, t > 0, is sufficient to uniquely determine the wave speed $c_0(x)$ in Ω.
11.2.2. Recovering the wave speed $c_0(x)$. Here we outline the steps that are required to computationally reconstruct $c_0(x)$ from time trace data. We assume that the equation is (11.30), where L contains the wave speed $c_0(x)$ as its leading term and is subject to the initial-boundary conditions (11.31) with $u_0 = \delta$. Thus, according to (11.33), the first step consists of recovering the spectrum $\{\lambda_j\}_{j=1}^{\infty}$ and endpoint values $c_j = \varphi_j(0)$ from the time trace data h(t) through
(11.35) $h(t) = \sum_{j=1}^{\infty} c_j^2\, w_j(t),$
where $w_j$ satisfies (11.13) with $\lambda = \lambda_j$. Note that for simplicity, we have set the constant scaling factor $c^2$ in the wave speed to unity. As before, we take the Laplace transform of (11.13) and also of the data to obtain
(11.36) $\hat h(s) = \sum_{j=1}^{\infty} c_j^2\, \dfrac{s + b\lambda_j s^{\beta-1}}{s^2 + b\lambda_j s^{\beta} + \lambda_j} =: F\big(\{\lambda_j, c_j\}_{j=1}^{\infty}\big).$
• Equation (11.36) is a nonlinear operator equation mapping $\{\lambda_j, c_j\}$ to $\hat h(s)$, and one way to invert it and recover $\{\lambda_j, c_j\}_{j=1}^{N}$ is to use Newton's method. Given the severe ill-conditioning of this recovery problem, we should not expect to take N very large; the exact value will be determined by the SVD solution of the linearisation, and the cut-off values will depend strongly on the noise level of the measured h(t).
• This recovery scheme requires an initial guess for the unknowns, but we are aided here by the known asymptotic form of both $\{\lambda_j\}$ and $\{\varphi_j(0)\}$, and for smooth $c_0(x)$ these asymptotic values are quickly achieved. We should expect that if, for example, γ = 0 in (11.31),
11.3. Nonlinearity coefficient imaging
then
(11.37) $\lambda_j \sim C\big(j - \tfrac12\big)^2\pi^2, \qquad \varphi_j(0) \sim \dfrac{C}{j - \tfrac12} \quad \text{as } j \to \infty, \quad \text{where } C = \int_0^1 c_0(t)\,dt.$
These sequences can be used in (11.36) initially to fit the single parameter C in
$\hat h(s) \sim C^2 \sum_{j=1}^{\infty} \big(j - \tfrac12\big)^{-2}\, \dfrac{s + C\big(j - \tfrac12\big)^2\pi^2\, b s^{\beta-1}}{s^2 + C\big(j - \tfrac12\big)^2\pi^2\,\big(b s^{\beta} + 1\big)}.$
With this estimate of C, the values in (11.37) can serve as initial approximations for $\{\lambda_j, c_j\}_{j=1}^{N}$ in Newton's method.
The second step is to recover $c_0(x)$ from this spectral data, and this path is well trodden in the literature. A further difficulty is the inherent ill-conditioning of this step which is based less on a norm of the mapped spaces but more on the value of the coupling constant. For example, we know from (11.37) that the leading and known term, which grows quadratically in j, will entirely dominate the remainder term, which carries the bulk of the information about $c_0(x)$ and which for $c_0 \in H^m(0, 1)$ will decay roughly as $o(j^{-m})$. Thus even if one has obtained very accurate spectral information, this ill-conditioning will be a major restriction on the number of components of $c_0(x)$ that one can reasonably recover. A concrete implementation of this second step consists of approximating $c_0(x)$ by a finite linear combination of known basis functions with coefficients $\{b_k\}_{k=1}^{K}$ and considering the map $F : \{b_k\} \to \{\lambda_j, c_j\}$. This is again amenable to a Newton-type iteration scheme to recover $\{b_k\}$; see, e.g., [226, 298]. Alternatively, of course one could also apply Newton's method directly to the map $F : \{b_k\} \to \{h(t_i)\}$, with a number T of sampling points $t_i$. The forward map F will then contain the numerical solution of (11.30) with given initial conditions and $L = c_0^2(x)\Delta$, where $c_0^2(x)$ is determined by the coefficients $\{b_k\}$.
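The first step described above (evaluating the series (11.36) and fitting the single parameter C to obtain an initial guess) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `h_hat`, `asymptotic_guess`, and `fit_C` are hypothetical, the asymptotic forms are taken as $\lambda_j \approx C(j-\frac12)^2\pi^2$ and $c_j \approx C/(j-\frac12)$, the data are synthetic, and a coarse scan over candidate values of C stands in for whatever fitting procedure one would actually use.

```python
import math

def h_hat(s, lams, cs, b, beta):
    """Truncated series (11.36): sum_j c_j^2 (s + b*lam_j*s^(beta-1)) / (s^2 + b*lam_j*s^beta + lam_j)."""
    total = 0.0
    for lam, c in zip(lams, cs):
        total += c**2 * (s + b * lam * s**(beta - 1)) / (s**2 + b * lam * s**beta + lam)
    return total

def asymptotic_guess(C, N):
    """Assumed asymptotic sequences: lam_j ~ C (j-1/2)^2 pi^2 and c_j ~ C / (j-1/2)."""
    lams = [C * (j - 0.5)**2 * math.pi**2 for j in range(1, N + 1)]
    cs = [C / (j - 0.5) for j in range(1, N + 1)]
    return lams, cs

def fit_C(s_samples, data, b, beta, N, candidates):
    """Pick the candidate C whose asymptotic model best matches the Laplace-domain data."""
    def misfit(C):
        lams, cs = asymptotic_guess(C, N)
        return sum(abs(h_hat(s, lams, cs, b, beta) - d)**2
                   for s, d in zip(s_samples, data))
    return min(candidates, key=misfit)

# Synthetic illustration (all numbers are made up for the example).
b, beta, N = 0.1, 0.5, 20
s_samples = [0.5, 1.0, 2.0, 4.0]
lams_true, cs_true = asymptotic_guess(1.3, N)
data = [h_hat(s, lams_true, cs_true, b, beta) for s in s_samples]

candidates = [0.5 + 0.1 * k for k in range(21)]
C_est = fit_C(s_samples, data, b, beta, N, candidates)
lams0, cs0 = asymptotic_guess(C_est, N)   # initial guess for a subsequent Newton iteration
```

The resulting `lams0`, `cs0` would then serve as the starting point for Newton's method applied to (11.36); the Newton step itself, with its SVD-based truncation, is not shown here.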
11.3. Nonlinearity coefficient imaging
The use of ultrasound is a well-established protocol in the imaging of human tissue and, besides the classical sonography methodology, there exist several novel imaging principles, such as harmonic imaging or nonlinearity imaging. The latter relies on tissue dependence, and hence spatial variation, of the parameter of nonlinearity B/A that is contained in κ. The problem of nonlinear B/A parameter imaging with ultrasound [30, 42, 46, 155, 336, 362, 363] in lossy media thus amounts to identification of the space-dependent coefficient
κ(x) for the attenuated Westervelt equation in pressure form
(11.38) $\big(u - \kappa(x)u^2\big)_{tt} - c_0^2\Delta u + Du = r$ in Ω × (0, T), $u = 0$ on ∂Ω × (0, T), $u(0) = 0$, $u_t(0) = 0$ in Ω
(cf. Section 7.2), from boundary observations. Here $c_0 > 0$ is the wave speed (possibly space dependent as well) and Du is again a damping term. For simplicity, we impose homogeneous Dirichlet boundary conditions here, but the ideas in this section extend to more realistic boundary conditions, such as absorbing boundary conditions for avoiding spurious reflections and/or inhomogeneous Neumann boundary conditions for modelling excitation via, e.g., some transducer array. Note that the excitation here is modelled by an interior source r, in view of the fact that the transducer array lies in the interior of the computational domain. Indeed, denoting by Γ ⊆ Ω the surface on which the transducer array lies, we can consider r as an approximation of a source $\tilde r \cdot \delta_\Gamma$ concentrated on Γ, in view of the fact that formally, for any $f \in H^{-1}(\Omega)$,
$\begin{cases} -\Delta u = f & \text{in } \Omega \\ \partial_\nu u = 0 & \text{on } \partial\Omega \\ [[\partial_\nu u]] = \tilde r & \text{on } \Gamma \end{cases}$
$\Leftrightarrow \quad \int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega f v\,dx + \int_\Gamma \tilde g v\,ds \quad \text{for all } v \in H^1(\Omega)$
$\Leftrightarrow \quad \begin{cases} -\Delta u = f + \tilde g\,\delta_\Gamma & \text{in } \Omega \\ \partial_\nu u = 0 & \text{on } \partial\Omega, \end{cases}$
where $[[\partial_\nu u]]$ denotes the jump of the normal derivative of u over the interface Γ. The measurements for the inverse problem to recover κ will again be taken to be
(11.39) $h(x, t) = u(x, t), \quad x \in \Sigma, \quad t \in (0, T),$
either at a single point Σ = {x0} or, in the spatially higher dimensional case, on some surface Σ contained in Ω. The inverse problem represented by equations (11.38) and (11.39) is challenging on at least three counts. First, the underlying model equation is nonlinear and in fact the nonlinearity occurs in the highest order term. Second, the unknown coefficient κ(x) is directly coupled to this term. Third, it is spatially varying whereas the data h(t) is in the orthogonal time direction, and this is well known to lead to severe ill-conditioning of the inversion of the map from data to unknown. Written in a slightly more abstract form, the inverse problem under consideration is to recover the space dependent coefficient κ(x) in the attenuated
Westervelt equation
(11.40) $u_{tt} + c^2Au + Du = \kappa(x)(u^2)_{tt} + r$ in Ω × (0, T), $u(0) = 0$, $u_t(0) = 0$ in Ω,
from observations (11.39). Here, c > 0 is the constant mean wave speed, and $A = -(c_0(x)^2/c^2)\Delta$ contains the possibly spatially varying coefficient $c_0(x) > 0$ and is equipped, for simplicity, with homogeneous boundary conditions. Throughout this section we assume $\Omega \subseteq \mathbb{R}^d$, d ∈ {1, 2, 3}, to be a bounded domain with $C^{1,1}$ boundary and the coefficient $c_0(x)$ contained in A to be bounded away from zero and infinity. Again we consider the two damping models
(11.41) $D = bA^{\tilde\beta}\partial_t^{\beta}$ (ch)
and
(11.42) $D = bA\partial_t^{\beta} + a\partial_t^{\alpha+2}$ (fz).
11.3.1. Injectivity of the linearised forward operator. The forward map is defined by $F(\kappa) = \mathrm{tr}_\Sigma u$, where $\mathrm{tr}_\Sigma v$ denotes the time trace of the space and time dependent function v : (0, T) × Ω at the observation surface Σ (which may also just be a single point Σ = {x0}) and u solves (11.40). Its linearisation at κ = 0 in direction δκ is $F'(0)\delta\kappa = \mathrm{tr}_\Sigma z_0$, where $z_0$ solves
(11.43) $z_{tt} + c^2Az + Dz = \delta\kappa\,(u_0^2)_{tt},$
where
(11.44) $u_{0,tt} + c^2Au_0 + Du_0 = r.$
Both pdes (11.43), (11.44) come with homogeneous initial conditions. As in Section 11.1, the Laplace-transformed solutions to the corresponding resolvent equation
(11.45) $\hat w_\lambda(s) = \dfrac{1}{\omega_\lambda(s)}$ with $\omega_\lambda(s) = \begin{cases} s^2 + b\lambda^{\tilde\beta}s^{\beta} + c^2\lambda & \text{for ch} \\ as^{2+\alpha} + s^2 + b\lambda s^{\beta} + c^2\lambda & \text{for fz} \end{cases}$
will play a crucial role in the proofs below. We assume that r has the form
(11.46) $r(x, t) = f(x)\chi''(t) + c^2Af(x)\chi(t) + D[f(x)\chi(t)]$
with some function f in the domain of A vanishing only on a set of measure zero and some twice differentiable function χ of time. With (11.46), the
solution $u_0$ of equation (11.44) is clearly given by $u_0(x, t) = f(x)\chi(t)$, so that $d\kappa\,(u_0^2)_{tt}$ can be written in the form
(11.47) $\big(d\kappa(x)\,u_0^2(x, t)\big)_{tt} = \sum_{j=1}^{\infty} a_j\varphi_j(x)\psi(t), \qquad \psi(t) = (\chi^2)''(t), \qquad a_j = \langle d\kappa \cdot f, \varphi_j\rangle.$
The uniqueness arguments will again rely on considering poles and residues. To extract the coefficients $a_k$ we also assume that the Laplace transform of $\psi = (\chi^2)''$ does not vanish at the poles $p_m$ of the data $\hat h$:
(11.48) $\hat\psi(p_m) \neq 0.$
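The claim that $u_0 = f\chi$ solves (11.44) when r has the form (11.46) can be checked by direct substitution. The short computation below is a sketch of that step; it uses only that A and D are linear and act as in (11.40), and it additionally assumes $\chi(0) = \chi'(0) = 0$, which is what the homogeneous initial conditions require of this particular $u_0$.

```latex
\begin{align*}
u_{0,tt} + c^2 A u_0 + D u_0
  &= \partial_t^2\big(f(x)\chi(t)\big) + c^2 A\big(f(x)\chi(t)\big) + D[f(x)\chi(t)] \\
  &= f(x)\chi''(t) + c^2 (Af)(x)\,\chi(t) + D[f(x)\chi(t)] \\
  &= r(x,t) \qquad \text{by (11.46)}, \\
u_0(x,0) &= f(x)\chi(0) = 0, \qquad u_{0,t}(x,0) = f(x)\chi'(0) = 0 .
\end{align*}
```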
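Theorem 11.4. Under the assumptions (11.47), (11.48), (11.25) for all eigenvalues λ of A, the derivative of the forward operator F at κ = 0, $F'(0)$, is injective.
Proof. We can write the solution of equation (11.43) as
$z_0(x, t) = \sum_{j=1}^{\infty} z_j(t)\varphi_j(x),$
where for each $j \in \mathbb{N}$, $z_j$ solves
(11.49) $z_j''(t) + c^2\lambda_j z_j(t) + D_j z_j = a_j\psi(t), \quad t > 0, \qquad z_j(0) = 0, \quad z_j'(0) = 0,$
with
$D_j = \begin{cases} b\lambda_j^{\tilde\beta}\partial_t^{\beta} & \text{for ch} \\ a\partial_t^{2+\alpha} + b\lambda_j\partial_t^{\beta} & \text{for fz.} \end{cases}$
Applying the Laplace transform to both sides of (11.49) yields
(11.50) $\hat z_j(s) = \hat w_{\lambda_j}(s)\,a_j\,\hat\psi(s), \quad \text{where } \hat w_{\lambda_j}(s) = \dfrac{1}{\omega_{\lambda_j}(s)}, \quad s \in \mathbb{C},$
and we have used homogeneity of the initial conditions. Thus, the assumption $F'(0)d\kappa = \mathrm{tr}_\Sigma z_0 = 0$ implies that
$0 = \hat z_0(x_0, s) = \sum_{j=1}^{\infty} a_j\varphi_j(x_0)\,\hat w_{\lambda_j}(s)\,\hat\psi(s) \quad \text{for all } s \in \mathbb{C},\ x_0 \in \Sigma.$
Considering the residue at some pole $p_m$ corresponding to the eigenvalue $\lambda_m$ and using the fact that by Lemma 11.4, $\lim_{s\to p_m}(s - p_m)\hat w_{\lambda_j}(s) = 0$ for $j \neq m$, yields
$0 = \mathrm{Res}(\hat z_0(x_0); p_m) = \sum_{j=1}^{\infty} a_j\varphi_j(x_0)\lim_{s\to p_m}(s - p_m)\hat w_{\lambda_j}(s)\hat\psi(s) = \mathrm{Res}(\hat w_{\lambda_m}; p_m)\,\hat\psi(p_m)\sum_{k\in K_m} a_k\varphi_k(x_0).$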
Here $K_m \subseteq \mathbb{N}$ is an enumeration of the eigenspace basis $(\varphi_k)_{k\in K_m}$ corresponding to the eigenvalue $\lambda_m$. By (11.48) and (11.25) this implies $a_k = 0$ for all $k \in K_m$. Now since f only vanishes on a set of measure zero, from $d\kappa \cdot f = \sum_{j=1}^{\infty} a_j\varphi_j = 0$, we can conclude that dκ = 0 almost everywhere.
11.3.2. Ill-posedness of the linearised inverse problem. As in the injectivity section, we consider the linear(ised at κ = 0) problem of recovering δκ(x) from time trace observations $h(t) = \mathrm{tr}_\Sigma z_0 = z_0(x_0, t)$, $x_0 \in \Sigma$, where $z_0$ solves (11.43) with $u_0$ solving (11.44), both with homogeneous initial conditions. Again, we assume that the excitation r has been chosen such that $u_0$ takes the form $u_0(x, t) = f(x)\chi(t)$ and employ the shorthand notation $\psi = (\chi^2)''$. Using the eigensystem of A we can then write
$z_0(x, t) = \sum_{j=1}^{\infty} \varphi_j(x)z_j(t), \qquad \hat z_j(s) = \hat w_{\lambda_j}(s)\,\langle\delta\kappa \cdot f, \varphi_j\rangle\,\hat\psi(s),$
with $\hat w_{\lambda_j}(s)$ according to (11.45). As in the injectivity section we obtain (for simplicity in the one-dimensional case where all eigenvalues are single)
(11.51) $\mathrm{Res}(\hat h; p_m) = \mathrm{Res}(\hat w_m; p_m)\,\hat\psi(p_m)\,\langle\delta\kappa \cdot f, \varphi_m\rangle\,\varphi_m(x_0),$
that is, the mth component of δκ · f can be computed from the data as
(11.52) $\langle\delta\kappa \cdot f, \varphi_m\rangle = \mathrm{Res}(\hat h; p_m)\,\big(\mathrm{Res}(\hat w_m; p_m)\,\hat\psi(p_m)\,\varphi_m(x_0)\big)^{-1}.$
By l'Hôpital's rule we have
$\mathrm{Res}(\hat w_m; p_m) = \lim_{s\to p_m}\dfrac{s - p_m}{\omega(\lambda, s)} = \lim_{s\to p_m}\dfrac{1}{\omega'(\lambda, s)} = \begin{cases} \dfrac{1}{2p_m + \beta b\lambda_m^{\tilde\beta}p_m^{\beta-1}} & \text{for ch} \\[2ex] \dfrac{1}{(2+\alpha)ap_m^{1+\alpha} + 2p_m + \beta b\lambda_m p_m^{\beta-1}} & \text{for fz.} \end{cases}$
Thus, the factor multiplied with $\mathrm{Res}(\hat h; p_m)$ in (11.52) only grows mildly with $p_m$.
Remark 11.4. In higher space dimensions, to resolve the equations
$\mathrm{Res}(\hat h; p_m) = \mathrm{Res}(\hat w_m; p_m)\,\hat\psi(p_m)\sum_{k\in K_m}\langle\delta\kappa \cdot f, \varphi_k\rangle\,\varphi_k(x_0), \quad x_0 \in \Sigma,$
that replace (11.51), condition (11.25) is obviously again crucial. Referring to Remark 11.2, in the separable eigenfunction setting of, e.g., discs, balls, cuboids, etc., the fact that the eigenfunction factors along Σ are mutually orthogonal clearly aids numerical implementation and stability.
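To make the role of the poles and residues concrete, the following sketch locates a pole of $\hat w_\lambda$ for the ch model by a complex Newton iteration on $\omega_\lambda$ and evaluates the residue via $1/\omega_\lambda'(p)$, as in the l'Hôpital computation above. It is a minimal illustration under assumptions: the function names are hypothetical, the parameter values are arbitrary, the starting point is the undamped pole $i c\sqrt{\lambda}$, and Python's principal branch is used for the fractional powers.

```python
import cmath

def omega(s, lam, b, c, beta, beta_t):
    """omega_lambda(s) for the ch model: s^2 + b*lam^beta_t*s^beta + c^2*lam."""
    return s**2 + b * lam**beta_t * s**beta + c**2 * lam

def d_omega(s, lam, b, c, beta, beta_t):
    """Derivative of omega_lambda with respect to s."""
    return 2 * s + beta * b * lam**beta_t * s**(beta - 1)

def find_pole(s0, lam, b, c, beta, beta_t, tol=1e-13, max_iter=100):
    """Newton iteration in the complex plane for a root of omega_lambda."""
    s = complex(s0)
    for _ in range(max_iter):
        step = omega(s, lam, b, c, beta, beta_t) / d_omega(s, lam, b, c, beta, beta_t)
        s -= step
        if abs(step) < tol:
            break
    return s

def residue(p, lam, b, c, beta, beta_t):
    """Residue of 1/omega_lambda at a simple pole p, i.e. 1/omega_lambda'(p)."""
    return 1.0 / d_omega(p, lam, b, c, beta, beta_t)

# Illustrative parameters (arbitrary choices for the example).
lam, b, c, beta, beta_t = 4.0, 0.1, 1.0, 1.0, 1.0
start = 1j * c * cmath.sqrt(lam)          # pole of the undamped problem
p = find_pole(start, lam, b, c, beta, beta_t)
res = residue(p, lam, b, c, beta, beta_t)
```

For β = 1 the function $\omega_\lambda$ is a quadratic in s, so the computed pole can be compared with the closed-form root; for fractional β no such formula is available and the Newton iterate depends on the chosen branch and starting point.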
The major ill-posedness in computing (11.52) seems to lie in the evaluation of the residue of the observations $\mathrm{Res}(\hat h; p_m)$ at the poles $p_m$, from knowledge of h(t) for t > 0, that is, from $\hat h(s) = \int_0^{\infty} e^{-st}h(t)\,dt$ for s with nonnegative real part (so that the integral defining the Laplace transform is well-defined). If these poles lie on the imaginary axis (the undamped wave equation case), this is still well-posed. The further left the poles lie, the more ill-posed this problem is. For some illustration on the location of these poles in dependence of the parameters b, c, α, β, $\tilde\beta$, we refer to Section 11.1.4.
11.3.3. Reconstructions of κ. In this section, we will provide some reconstruction results of κ by means of a frozen Newton scheme. To this end, we first of all discuss well-definedness of the method for both model cases ch and fz.
11.3.3.1. Newton type methods for recovering κ. From Theorems 7.5 and 7.9 we obtain well-definedness of the forward operator F, given by $F = \mathrm{tr}_\Sigma \circ G$ with the parameter-to-state map $G : \kappa \mapsto u$, where u solves (11.40) with either of the ch or fz damping models (11.41) or (11.42) (the latter with β = 1). Here $\mathrm{tr}_\Sigma v$ denotes the time trace at the observation surface Σ (which in one space dimension may also just be a single point Σ = {x0}). Hence the inverse problem under consideration can be stated as F(κ) = h, where h is the measured data (11.39), and we consider F as an operator F : D(⊆ X) → Y. Here the domain D of F is defined as a ball with fixed radius in $X = W^{2,4}(\Omega) \cap W^{1,\infty}(\Omega)$ (ch) or $X = W^{1,4}(\Omega)$ (fz) and, additionally, the initial data and driving term are supposed to satisfy certain regularity and smallness conditions. For drawing this conclusion on the composite operator, in addition to the mentioned theorems on G, we use the fact that the trace operator $\mathrm{tr}_\Sigma$ is linear and well-defined on the spaces Ulo or U according to (7.66) or (7.84), respectively, and maps into Y ⊆ C([0, T]; C(Σ)).
Typically we will have $Y = L^p(0, T; \mathbb{R}^N)$, $N \in \mathbb{N} \cup \{\infty\}$, in case of Σ being a discrete set or $Y = L^p(0, T; L^q(\Sigma))$ in case of Σ being a surface. From Theorems 7.7 and 7.11 we additionally conclude Fréchet differentiability of F on D. Thus we are in the position to formulate Newton's method, which defines the iterate $\kappa_{k+1}$ implicitly by the linearised problem
$F'(\kappa_k)(\kappa_{k+1} - \kappa_k) = h - F(\kappa_k)$
or its frozen version
$F'(\kappa_0)(\kappa_{k+1} - \kappa_k) = h - F(\kappa_k).$
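The difference between the two iterations just stated can be seen on a toy problem. The sketch below applies both to a scalar equation F(κ) = h; the map `F` used here is a hypothetical stand-in chosen only for illustration and has nothing to do with the Westervelt forward operator, and no regularisation is included.

```python
def newton(F, dF, h, kappa0, steps, frozen=False):
    """Newton iteration for F(kappa) = h; with frozen=True the derivative is kept at kappa0."""
    kappa = kappa0
    slope0 = dF(kappa0)              # evaluated once, reused by the frozen variant
    history = [kappa]
    for _ in range(steps):
        slope = slope0 if frozen else dF(kappa)
        kappa = kappa + (h - F(kappa)) / slope
        history.append(kappa)
    return history

# Hypothetical scalar forward map and its derivative.
def F(kappa):
    return kappa + 0.5 * kappa**3

def dF(kappa):
    return 1.0 + 1.5 * kappa**2

kappa_true = 0.4
h = F(kappa_true)                    # exact, noise-free data

full = newton(F, dF, h, 0.0, 30)
froz = newton(F, dF, h, 0.0, 30, frozen=True)

err_full = [abs(k - kappa_true) for k in full]
err_froz = [abs(k - kappa_true) for k in froz]
```

In this example both variants converge to the same value, but the frozen errors shrink by a roughly constant factor per step, whereas the full Newton errors shrink much faster once the iterate is close; the frozen variant needs only one derivative evaluation in total.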
The frozen version is known to save the computational effort of evaluating the derivative in each step, at the cost of yielding only linear convergence. This is the approach we are going to take in the numerical reconstructions to be shown in this section. Concerning solvability of the linearisation, we have commented on injectivity of $F'(\kappa_0)$ in case $\kappa_0 = 0$ in Section 11.3.1. As pointed out in Section 11.3.2, inversion must be expected to be ill-posed, though, and also surjectivity of $F'(\kappa_0)$ is not likely to hold on the relatively large space Y. Therefore, we will rely on a regularized least squares variant
(11.53) $\kappa_{k+1} = \operatorname{argmin}_{\tilde\kappa \in D}\ \|F(\kappa_k) + F'(\kappa_0)(\tilde\kappa - \kappa_k) - h\|_Y + \gamma\,\|\tilde\kappa - \kappa_0\|_X$
of the frozen Newton method. A minimizer exists, since the cost function is weakly lower semicontinuous and has weakly compact sublevel sets in the space X, which in both ch and fz cases is the dual of a separable space; see, e.g., [332, Chapter 2]. Uniqueness of this minimizer follows easily from strict convexity of the cost function for γ > 0. In the injective setting of Section 11.3.1 this remains valid also with γ = 0. Dispensing with uniqueness (that is, replacing "= argmin" by "⊆ argmin" above), we can also choose to completely skip the regularization term, since stabilization is already achieved by imposing the constraint $\tilde\kappa \in D$. The formulation (11.53) also allows the inclusion of noisy data; cf., e.g., [130, 174, 178, 286].
11.3.3.2. Reconstruction results. We now provide reconstruction results for κ in the Caputo–Wismer–Kelvin model, that is, ch with $\tilde\beta = 1$. The boundary conditions for our test cases were homogeneous Dirichlet at x = 0 and homogeneous Neumann at x = 1 with a nonhomogeneous driving term r(x, t) that takes larger values near x = 1. Thus the solution was small in the region near x = 0 in comparison to near x = 1, where the data h(t) = u(1, t) was measured.
The consequence of this was that κ(x) for x small was multiplied by terms that were also small in comparison to that at the rightmost endpoint, which resulted in greater ill-conditioning of the inversion near x = 0. This will be apparent in each of the reconstructions to be shown below. The data h(t) was computed by applying a direct solver for (11.40) and evaluating at the endpoint x = 1. A sample at 50 time points was taken to which uniformly distributed random noise was added as representing the actual data measurements that formed the overposed data. This was then filtered by a smoothing routine based on penalising the H 2 norm with regularisation parameter based on the estimate of the noise and then up-resolved to the working size for the inverse solver. Discretisation of κ was done by means of a fixed set of 40 chapeau basis functions, and we applied a regularised frozen Newton iteration, stopped
[Figure 11.9: four panels plotting κ(x) against x on [0, 1], for β = 1.0, β = 0.9, β = 0.5, and β = 0.25.]
Figure 11.9. Reconstructions of κ(x) for various β values. Noise = 0.1%.
by the discrepancy principle, for numerically solving the discretised inverse problem. Figure 11.9 shows the reconstruction of a piecewise linear κ for the values β = 1, β = 0.9, β = 0.5, and β = 0.25. In each case the damping coefficient b was kept at b = 0.1 and the wave speed at c = 1. The ($L^\infty$, $L^2$) norm differences between the final reconstruction and the actual κ were (0.315, 0.191), (0.184, 0.126), (0.116, 0.084), (0.109, 0.078), respectively, and they show the increase in resolution possible with a decrease in β. Note that the reconstructions of κ are clearly superior at the right-hand endpoint, where the measurements were taken, as the wave is essentially transmitting information primarily from right to left but the amplitude is damped as it travels. The smaller the fractional damping the lesser is this effect, which is also apparent from these figures. The reconstructions naturally worsen with increasing noise levels as Figure 11.10 shows. Figure 11.11 shows the singular values of the Jacobian matrix used in the (frozen) Newton method. The results confirm our findings from Section 11.1.4 (cf. Figure 11.3), namely that the further the poles lie to the left of the imaginary axis (that is, the more negative their real part), the more ill-posed the inverse problem of recovering information from them. This was pointed out at the end of Section 11.3.2. Note that if the function κ can be well represented by a small number of basis functions, then the dependence with
[Figure 11.10: two panels plotting κ(x) against x on [0, 1], for β = 0.9 and β = 0.25.]
Figure 11.10. Reconstructions of κ(x) for β = 0.25, 0.9. Noise = 0.5% (blue) and 1% (green).
respect to β will be fairly weak. On the other hand, if a large number of basis functions is needed for κ to be represented, then the dependence on β becomes much stronger, although by this point the condition number of the Jacobian is already extremely high for all β and relatively few singular values are likely to be usable in any reconstruction with data subject to extremely small noise levels.
[Figure 11.11: two panels plotting $\log_{10}(\sigma_n)$ against n for β = 0.25, 0.5, 0.9, and 1.0; left panel c = 1, right panel c = 5.]
Figure 11.11. Singular values for various β values: left, c = 1; right, c = 5.
The effect of damping is to directly contribute to the ill-conditioning and thus it is clear that for fixed β and c this will increase as the coefficient b increases. The degree of ill-conditioning as a function of the wave speed c is less clear. Figure 11.11 shows the singular values $\{\sigma_n\}$ of the Jacobian matrix for both c = 1 and c = 5. This illustrates the decay of the singular values, and hence the level of ill-conditioning does depend on c but certainly not uniformly for all values of the fractional exponent β. For β near unity, where damping approaches the classical Kelvin–Voigt paradigm, there is a considerable increase in the smaller, high index singular values indicating the problem is much less ill-posed for larger wave speeds c. For the smaller index $\sigma_n$ the ratio $\sigma_n/\sigma_1$ is almost the same, indicating at most a weak effect due to the wave speed. Thus for a function κ(x) requiring only a small number of basis functions, the effect of wave speed is marginal, but this changes quite dramatically if a larger number of singular values is required. For β less than about one-half, the condition number $\sigma_n/\sigma_1$ becomes relatively independent of c, at least in the range indicated.
Chapter 12
Outlook Beyond Abel
To this point nearly all the fractional derivatives in this book have the Abel integral operator as a key component. As such they are inherently one dimensional. Riesz in the 1930s extended the idea to Rd , although this was still based on fractional integrals of functions. In the middle of the twentieth century there began an intense interest in seeking ways to define fractional operators on function spaces. Applying this to the (negative) Laplace operator led to definitions of fractional derivatives in higher space dimensions. The impact in inverse problems has been an explosion of results starting from the second decade of this century. As inverse problems using the Abel integral have been shown to have distinctly different properties from the classical case, similar results are being developed for these fractional operator based derivatives with comparable consequences. What we show below is meant as a very brief introductory survey, not doing justice to what is a strongly expanding area. Doing so would require a substantial addition to the mathematical background that is beyond the scope of this book.
12.1. Fractional powers of operators
12.1.1. Semigroups of operators. This short section contains a quick survey of relevant results on semigroup theory, which had originally been developed as a tool for studying parabolic pdes based on the fundamental theorem of Hille–Phillips and Yosida [147, 358]. Only a few years after this result, it was noted by Balakrishnan [20] that the approach could be used to define fractional powers of operators, in particular of elliptic operators.
Also Bochner's approach to defining fractional powers of operators [32] relies on semigroups. We will take this as the beginning pathway towards looking at a more operator-theoretic definition of fractional derivatives. Let X be a Banach space and (for the moment) let A be a bounded operator on X. Then we can form the power series $\sum_{k=0}^{\infty} \frac{(-t)^k}{k!}A^k$ and denote its sum by exp(−tA). It easily follows that exp(−(t + s)A) = exp(−tA) exp(−sA). Also, with $A_h x := \frac{1}{h}[\exp(-hA)x - x]$, for h > 0, the limit $\lim_{h\to 0^+} A_h x$ recovers Ax for any x ∈ X, and exp(−0A) = I holds trivially. This motivates the following definition.
Definition 12.1. We say that a family {T(t), 0 ≤ t < ∞} of bounded linear operators in X is a strongly continuous semigroup if for each x ∈ X, T(t)x is continuous in t for t ∈ [0, ∞) and
$T(t + s) = T(t)T(s) \quad \text{for all } s, t \ge 0, \qquad T(0) = I.$
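For a bounded operator, the two defining properties can be checked numerically from the power series itself. The sketch below does this for a 2 × 2 matrix using only the standard library; the helper names (`mat_mul`, `identity`, `semigroup`, `max_diff`) and the matrix `A` are illustrative choices, and the series is simply truncated after a fixed number of terms.

```python
import math

def mat_mul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def semigroup(A, t, terms=60):
    """Partial sum of sum_{k>=0} (-t)^k / k! * A^k, approximating exp(-tA)."""
    n = len(A)
    total = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term_k = term_{k-1} * A * (-t / k) = (-t)^k A^k / k!
        term = [[(-t / k) * v for v in row] for row in mat_mul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_diff(P, Q):
    """Largest entrywise difference between two matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[2.0, 1.0], [0.0, 3.0]]        # a bounded (matrix) operator, chosen arbitrarily

# Semigroup property T(t + s) = T(t) T(s) and T(0) = I.
gap = max_diff(semigroup(A, 0.8), mat_mul(semigroup(A, 0.3), semigroup(A, 0.5)))
gap0 = max_diff(semigroup(A, 0.0), identity(2))

# For this triangular A the (0, 0) entry of exp(-tA) is exp(-2t).
entry_err = abs(semigroup(A, 1.0)[0][0] - math.exp(-2.0))
```

For unbounded A the series is no longer available, which is why the definition is phrased in terms of the family T(t) rather than the exponential.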
In fact for a bounded operator A, T(t) := exp(−tA) forms a full group with the group action valid for −∞ < t < ∞. We would like to extend this to the case where A is not bounded, for example, to the case where A represents a differential operator and, in particular, to a strongly elliptic operator typified by −Δ. We can no longer expect a full group action but a semigroup valid for t ≥ 0 will suffice. Turning the above around we have
Definition 12.2. If {T(t), 0 ≤ t < ∞} is a strongly continuous semigroup, then set $A_h x := \frac{1}{h}[T(h)x - x]$, and we denote by $D_A$ the set of all x ∈ X for which $\lim_{h\to 0^+} A_h x$ exists. Define the operator A with domain $D_A$ as $Ax = \lim_{h\to 0^+} A_h x$. We say that A generates the semigroup {T(t), 0 ≤ t < ∞} and write $T(t) = e^{-tA}$.
Then the question becomes "What is the condition on A for this to occur?" To this end, we define the resolvent set ρ(A) of an operator A as the set of those (complex-valued) λ for which $R(\lambda, A) = (\lambda I - A)^{-1}$ exists as a bounded operator with domain X. The answer is given by the famous Hille–Phillips–Yosida theorem.
Theorem 12.1. A necessary and sufficient condition for a closed linear operator A with dense domain $D_A$ to be the generator of a strongly continuous semigroup {T(t), 0 ≤ t < ∞} is that there exist real numbers M and ω such that the interval (ω, ∞) is contained in ρ(A) and
(12.1) $\|R(\lambda, A)^n\| \le \dfrac{M}{(\lambda - \omega)^n}, \quad n \in \mathbb{N}, \quad \omega < \lambda < \infty.$
In addition, the integral representation $R(\lambda, A)x = \int_0^{\infty} e^{-\lambda t}\,T(t)x\,dt$ holds for all x ∈ X.
Then the question becomes which useful operators $A$ satisfy the above. The following answer is highly relevant for the solution theory of parabolic pdes, in view of the fact that $u(t) = T(t)x$ solves
\[ \frac{d}{dt}u(t) + Au(t) = 0, \quad t > 0, \qquad u(0) = x. \]
Independently of this property, semigroup tools will be crucial for us in some of the definitions of fractional powers of the Laplacian to be considered below.

Theorem 12.2. Let $\Omega$ be a bounded domain with $\partial\Omega \in C^2$ and let $L$ be a uniformly strongly elliptic operator in $\Omega$ with coefficients in $C^1(\Omega)$. Set $A = -L$ equipped with Dirichlet, Neumann, or impedance boundary conditions. Then the resolvent set $\rho(A)$ contains the positive real axis $(0, \infty)$ and there exists a constant $C > 0$ such that the estimate
\[ \|R(\lambda, A)\| \le \frac{C}{\lambda}, \qquad \lambda \in (0, \infty), \tag{12.2} \]
holds.

An obvious member of the above class of elliptic operators is $L = \Delta$, equipped with the boundary conditions mentioned in the theorem, and this has been by far the most-used case. But there are clearly many other useful applications. References for the above include [85, 98, 109] and also, from the original authors' sources, [147, 358].

12.1.2. Definitions of fractional powers of elliptic operators. In this section we collect various definitions of fractional powers of elliptic differential operators, some of them applying to a more general class of operators $A$, some of them specific to the Laplacian $-\Delta$. We will be far from exhaustive, but instead we will concentrate on those cases where there will be utility in modelling and in particular for inverse problems involving them.

12.1.2.1. Fractional power via resolvent operators. Let $A$ be a closed operator obeying (12.1), so that $A$ generates a strongly continuous semigroup $e^{-tA}$. According to Theorem 12.2, this includes uniformly strongly elliptic operators on bounded domains. Then for any $\alpha > 0$, we can define the fractional power $A^\alpha$ by
\[ A^\alpha = \frac{\sin(\alpha\pi)}{\pi} \int_0^\infty (sI - A)^{-1} A\, s^{\alpha-1}\, ds, \tag{12.3} \]
an idea due to Balakrishnan in 1960 [20]. The mathematical justification for this can also be found in standard references such as [109]; see also [205] for the case of $A$ being the Laplacian. As a matter of fact, the Abel integral operator from Section 4.1 can also be interpreted as the fractional
power of an operator, namely the common integral operator $A$ defined by $(Af)(t) = \int_0^t f(s)\, ds$; see, e.g., [203, Lemma 2.4].

12.1.2.2. Fractional power via semigroups. The following identity is easily proven as a basic Laplace transform, analogously to Example A.1:
\[ \lambda^s = \frac{1}{\Gamma(-s)} \int_0^\infty \left(e^{-t\lambda} - 1\right) \frac{dt}{t^{1+s}}, \qquad \lambda \ge 0,\ 0 < s < 1. \tag{12.4} \]
In the spirit of the generalisation that we have become accustomed to, this suggests we define the fractional power of an operator $A$, for example, a strongly elliptic operator $A = -L$ equipped with Dirichlet boundary conditions, by the formula
\[ A^s f := \frac{1}{\Gamma(-s)} \int_0^\infty \left(e^{-tA} - I\right) f\, \frac{dt}{t^{1+s}}, \qquad 0 < s < 1, \tag{12.5} \]
and we clearly can use $A = -L = -\Delta$ for a further version of the fractional Laplacian. This formula goes back to Bochner in 1949 [32]; see also [205] for the fractional Laplacian case.

12.1.2.3. Fractional power via spectral definition. While the two definitions above apply to general closed operators $A$ satisfying the resolvent condition (12.1), a definition more specific to self-adjoint positive definite operators is the one relying on the spectral decomposition by means of the eigenvalue/eigenfunction pairs $\{\lambda_n, \varphi_n\}$: for $Au = \sum_{n=1}^{\infty} \lambda_n \langle \varphi_n, u\rangle \varphi_n$, we define
\[ A^\alpha u = \sum_{n=1}^{\infty} \lambda_n^\alpha \langle \varphi_n, u\rangle \varphi_n. \tag{12.6} \]
This applies, e.g., to strongly elliptic operators $L$ by setting $A = -L$, equipped with homogeneous Dirichlet, Neumann, or impedance boundary conditions, and has been used several times in the previous chapters of this book.

12.1.2.4. Fractional power via pseudo-differential calculus. Equation (A.12) on the interaction of the Fourier transform with differentiation immediately suggests a way to define fractional powers $\Delta^\alpha$ of the Laplacian by the now-familiar approach of replacing an integer in a formula by a real number and then justifying the result in terms of, for example, convergence of the resulting integral or series,
\[ \widehat{\Delta^\alpha f}(\xi) = -|\xi|^{2\alpha} \hat{f}(\xi), \tag{12.7} \]
where we are using the shorthand $\hat{f}$ to denote the Fourier transform $\mathcal{F}f$. Thus with this viewpoint the fractional Laplacian of order $\alpha$ is that pseudodifferential operator with symbol $-|\xi|^{2\alpha}$. In more general terms one can extend this association in the reverse order: given a symbol as a function
of $\xi$, define $L$ to be that operator whose Fourier transform is precisely that symbol. We will see more of this in this section.

12.1.2.5. Fractional power via finite differences. A commonly used alternative definition of the fractional Laplacian of a function $u : \mathbb{R}^d \to \mathbb{R}$ that is sufficiently smooth and well behaved at infinity is
\[ (-\Delta)^s u(x) = C_{d,s} \int_{\mathbb{R}^d} \frac{2u(x) - u(x+y) - u(x-y)}{|y|^{d+2s}}\, dy. \tag{12.8} \]
Here $C_{d,s}$ is a constant, and its value will be obtained through Theorem 12.3. The relationship to the definition taken in (12.7) is given by the following equivalence result on the Schwartz space $\mathcal{S}(\mathbb{R}^d) = \{f \in C^\infty(\mathbb{R}^d) : \forall \alpha, \beta \in \mathbb{N}_0^d : \sup_{x\in\mathbb{R}^d} |x^\alpha \partial^\beta f(x)| < \infty\}$.

Theorem 12.3 ([40, Lemma 2.1]). There exists a constant $C_{d,s} \in \mathbb{R}$ such that the operator $(-\Delta)^s$ defined by (12.8) satisfies, for all $u \in \mathcal{S}(\mathbb{R}^d)$,
\[ (-\Delta)^s u(x) = \mathcal{F}^{-1}\left(|\xi|^{2s}\, \hat{u}(\xi)\right). \tag{12.9} \]

Proof. Take Fourier transforms in (12.8) and use $(\mathcal{F}u(\cdot \pm y))(\xi) = (\mathcal{F}u)(\xi)\, e^{\pm i\xi\cdot y}$ to obtain
\[ \begin{aligned} \mathcal{F}\left[(-\Delta)^s u\right](\xi) &= C_{d,s}\, \hat{u}(\xi) \int_{\mathbb{R}^d} \frac{2 - e^{i\xi\cdot y} - e^{-i\xi\cdot y}}{|y|^{d+2s}}\, dy \\ &= 2 C_{d,s}\, \hat{u}(\xi) \int_{\mathbb{R}^d} \frac{1 - \cos(\xi\cdot y)}{|y|^{d+2s}}\, dy \\ &= 2|\xi|^{2s}\, \hat{u}(\xi)\, C_{d,s} \int_{\mathbb{R}^d} \frac{1 - \cos\left(\frac{\xi}{|\xi|}\cdot z\right)}{|z|^{d+2s}}\, dz \\ &= 2|\xi|^{2s}\, \hat{u}(\xi)\, C_{d,s}\, I(\xi, d, s), \end{aligned} \tag{12.10} \]
where we have used the change of variables $z = |\xi| y$ in the penultimate line. The last step is to recognise that the integral $I(\xi, d, s)$ is rotationally invariant and so is independent of $\xi$ (although its value as a function of $d$, $s$ is complicated). Now the constant $C_{d,s}$ can be chosen so that $2 C_{d,s} I(d, s) = 1$, and we have
\[ \mathcal{F}\left[(-\Delta)^s u\right](\xi) = |\xi|^{2s}\, \hat{u}(\xi), \]
which gives the conclusion of the theorem.
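In one space dimension the normalisation $2C_{1,s} I(1,s) = 1$ can be checked numerically. The following is a rough sketch, not a production quadrature: the splitting of the integral (a Taylor series on $[0,1]$, Simpson's rule on $[1,L]$, an asymptotic tail beyond $L$) and the function name `integral_I_1d` are choices made here for illustration. The closed form used for comparison, $I(1,s) = \pi / (\Gamma(1+2s)\sin(\pi s))$, is a known one-dimensional identity that is quoted here rather than taken from the text.

```python
import math


def integral_I_1d(s, periods=200, steps_per_period=256):
    """Approximate I(1, s) = int_R (1 - cos z) / |z|^(1 + 2s) dz for 0 < s < 1."""
    p = 1.0 + 2.0 * s
    # [0, 1]: integrate the Taylor series of (1 - cos z) / z^p term by term.
    near = sum((-1) ** (k + 1) / (math.factorial(2 * k) * (2 * k - 2 * s))
               for k in range(1, 12))
    # [1, L] with L a multiple of 2*pi: composite Simpson rule.
    L = 2.0 * math.pi * periods
    n = periods * steps_per_period  # even number of subintervals

    def f(z):
        return (1.0 - math.cos(z)) / z ** p

    h = (L - 1.0) / n
    acc = f(1.0) + f(L)
    for i in range(1, n):
        acc += (4.0 if i % 2 else 2.0) * f(1.0 + i * h)
    middle = acc * h / 3.0
    # [L, inf): exact integral of z^(-p) minus the leading term of the
    # cosine part, obtained by two integrations by parts (sin L = 0, cos L = 1).
    tail = L ** (-2.0 * s) / (2.0 * s) - p * L ** (-p - 1.0)
    return 2.0 * (near + middle + tail)  # even integrand: double the half line


for s in (0.25, 0.5, 0.75):
    numeric = integral_I_1d(s)
    closed = math.pi / (math.gamma(1.0 + 2.0 * s) * math.sin(math.pi * s))
    print(s, numeric, closed, "C_{1,s} =", 1.0 / (2.0 * numeric))
```

For $s = \tfrac12$ this gives $I = \pi$ and hence $C_{1,1/2} = 1/(2\pi)$.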
For further details of this version and some applications, see [40], where the following results can also be found. One of the classical results for harmonic functions is the maximum principle, and there is a fractional counterpart. We give a version and its proof
for the finite difference definition (12.8) of $(-\Delta)^s$. Let $B_r^d$ denote the ball of radius $r$ in $\mathbb{R}^d$.

Theorem 12.4 ([40, Theorem 2.3.2]). If $u$ is continuous with $(-\Delta)^s u \ge 0$ in $B_r^d$ and $u \ge 0$ in $\mathbb{R}^d \setminus B_r^d$, then $u \ge 0$ in $B_r^d$.

Proof. We argue by contradiction. Suppose that for $x^* \in \operatorname{argmin}\{u(x) : x \in B_r^d\}$ the value is negative, $u(x^*) < 0$. Then since $u \ge 0$ in $\mathbb{R}^d \setminus B_r^d$, $u(x^*)$ is a minimum in all of $\mathbb{R}^d$. Thus for any $y \in \mathbb{R}^d$, we have $2u(x^*) - u(x^* - y) - u(x^* + y) \le 0$. However, for $y \in \mathbb{R}^d \setminus B_{2r}^d$ the inverse triangle inequality yields $x^* \pm y \in \mathbb{R}^d \setminus B_r^d$ and so $u(x^* \pm y) \ge 0$. Putting all of this together gives
\[ \begin{aligned} 0 \le (-\Delta)^s u(x^*) &= \int_{\mathbb{R}^d} \frac{2u(x^*) - u(x^* - y) - u(x^* + y)}{|y|^{d+2s}}\, dy \\ &\le \int_{\mathbb{R}^d \setminus B_{2r}^d} \frac{2u(x^*) - u(x^* - y) - u(x^* + y)}{|y|^{d+2s}}\, dy \le \int_{\mathbb{R}^d \setminus B_{2r}^d} \frac{2u(x^*)}{|y|^{d+2s}}\, dy < 0, \end{aligned} \]
which gives the required contradiction.
The Strong Maximum Principle also holds.

Theorem 12.5 ([40, Theorem 2.3.3]). If $u$ is continuous with $(-\Delta)^s u \ge 0$ in $B_r^d$ and $u \ge 0$ in $\mathbb{R}^d \setminus B_r^d$, then $u > 0$ in $B_r^d$ unless $u$ is identically zero.

Proof. From the above weak Maximum Principle we know that $u \ge 0$ in all of $\mathbb{R}^d$. Hence if $u$ is not strictly positive, then there must exist an $x^* \in B_r^d$ with $u(x^*) = 0$. In this case we have
\[ 0 \le \int_{\mathbb{R}^d} \frac{2u(x^*) - u(x^* - y) - u(x^* + y)}{|y|^{d+2s}}\, dy = -\int_{\mathbb{R}^d} \frac{u(x^* - y) + u(x^* + y)}{|y|^{d+2s}}\, dy. \]
Now since both $u(x^* - y)$ and $u(x^* + y)$ are nonnegative, the above inequality implies that both terms in the integrand must vanish identically. This shows that $u$ also vanishes identically.

Harmonic functions are quite uncompromising in their behaviour and therefore are not amenable to being used as basis functions. For example, if a function $f$ has a strict maximum in the interior of a domain $\Omega$, then it cannot be arbitrarily approximated by harmonic functions due to the maximum principle. The nonlocal situation is quite different, and solutions of the fractional Laplacian can locally approximate any given function without any geometric constraints. Here is such an approximation result.
Theorem 12.6 ([40, Theorem 2.5.1]). Let $k \in \mathbb{N}_0$ and $f \in C^k(B_1^d)$. Then for any $\epsilon > 0$ there exist an $R > 0$ and a function $u \in H^s(\mathbb{R}^d) \cap C^s(\mathbb{R}^d)$ satisfying $(-\Delta)^s u(x) = 0$ in $B_1^d$ with $u = 0$ in $\mathbb{R}^d \setminus B_R^d$ and such that $\|f - u\|_{C^k(B_1^d)} < \epsilon$.
While there is often an effective equivalence between these various definitions of fractional powers when applied to the same elliptic operator (such as the Laplacian) when the domain is $\Omega = \mathbb{R}^d$, the same is not necessarily true for a bounded domain. The next subsection will explore this with an example comparing the Riesz derivative and the fractional Laplacian when $\Omega$ is an interval of the real line.

12.1.2.6. The Riesz derivative on a finite interval. The Riesz derivative on a bounded interval is the sum of a left- and a right-sided fractional derivative. We take the interval to be $[a, b]$; then we have the Abel integral pairs of order $\gamma \in (0, 1]$ as defined in equations (4.3) and (4.4),
\[ \begin{aligned} {}_aI_x^\gamma u(x) &:= \frac{1}{\Gamma(\gamma)} \int_a^x \frac{u(t)}{(x-t)^{1-\gamma}}\, dt, \quad \text{for } x > a, \\ {}_xI_b^\gamma u(x) &:= \frac{1}{\Gamma(\gamma)} \int_x^b \frac{u(t)}{(t-x)^{1-\gamma}}\, dt, \quad \text{for } x < b. \end{aligned} \tag{12.11} \]
Then for real $2\alpha \in [k-1, k)$ with $k \in \mathbb{N}$, the left- and right-sided Riemann–Liouville derivatives of order $2\alpha > 0$ are as defined in Section 4.2.1 (see Definition 4.2) by
\[ {}^{RL}_{\ a}D_x^{2\alpha} f(x) := \frac{d^k}{dx^k}\left({}_aI_x^{k-2\alpha} f\right)(x) = \frac{1}{\Gamma(k-2\alpha)} \frac{d^k}{dx^k} \int_a^x (x-s)^{k-2\alpha-1} f(s)\, ds \]
and
\[ {}^{RL}_{\ x}D_b^{2\alpha} f(x) := (-1)^k \frac{d^k}{dx^k}\left({}_xI_b^{k-2\alpha} f\right)(x) = \frac{(-1)^k}{\Gamma(k-2\alpha)} \frac{d^k}{dx^k} \int_x^b (s-x)^{k-2\alpha-1} f(s)\, ds, \]
respectively. We cannot simply add the left and right Riemann–Liouville derivatives, as there is a sign change between left and right, so this would make a difference between derivatives taken of odd and even order. Thus, for $\gamma \in [0, 1)$
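the Riesz fractional integrals are defined by
\[ \begin{aligned} I_1^\gamma u(x) &= \frac{1}{2\Gamma(\gamma)\sin(\gamma\pi/2)} \int_a^b \frac{\operatorname{sign}(x-t)\, u(t)}{|x-t|^{1-\gamma}}\, dt = \frac{1}{2\sin(\gamma\pi/2)} \left({}_aI_x^\gamma - {}_xI_b^\gamma\right) u(x), \\ I_2^\gamma u(x) &= \frac{1}{2\Gamma(\gamma)\cos(\gamma\pi/2)} \int_a^b \frac{u(t)}{|x-t|^{1-\gamma}}\, dt = \frac{1}{2\cos(\gamma\pi/2)} \left({}_aI_x^\gamma + {}_xI_b^\gamma\right) u(x). \end{aligned} \tag{12.12} \]
Then for $2\alpha \in [k-1, k)$, we define the Riesz fractional derivative of order $2\alpha$ by
\[ D^{2\alpha} u(x) := \begin{cases} \dfrac{d^k}{dx^k} I_1^{k-2\alpha} u(x) & \text{if } k \text{ is odd}, \\[2mm] \dfrac{d^k}{dx^k} I_2^{k-2\alpha} u(x) & \text{if } k \text{ is even}. \end{cases} \tag{12.13} \]

Jacobi polynomials. For illustrative purposes we shall take the interval to be $[-1, 1]$ and seek suitable basis functions. One example is given by the Jacobi polynomials modified by multiplication with a weight function $\omega(x)$ so as to vanish at both endpoints, and which are mutually orthogonal with respect to $\omega$. These have considerable utility in the context of the Riesz derivative and associated fractional ordinary differential equations.

For real $\alpha, \beta > -1$, let $P_n^{\alpha,\beta}$ be the classical Jacobi polynomials with respect to the weight function $\omega^{\alpha,\beta}(x) = (1-x)^\alpha (1+x)^\beta$ over $[-1, 1]$. These are such that
\[ \int_{-1}^{1} P_n^{\alpha,\beta}(x)\, P_m^{\alpha,\beta}(x)\, \omega^{\alpha,\beta}(x)\, dx = \gamma_n^{\alpha,\beta}\, \delta_{mn}, \tag{12.14} \]
where
\[ \gamma_n^{\alpha,\beta} = \left\|P_n^{\alpha,\beta}\right\|^2_{L^2_{\omega^{\alpha,\beta}}} = \frac{2^{\alpha+\beta+1}\, \Gamma(n+\alpha+1)\, \Gamma(n+\beta+1)}{(2n+\alpha+\beta+1)\, n!\, \Gamma(n+\alpha+\beta+1)}. \]
The Jacobi polynomials satisfy a three term recursion scheme and in fact are a multiple of the hypergeometric function ${}_2F_1\left(-n,\, n+\alpha+\beta+1,\, \beta+1;\, \tfrac12(1+x)\right)$, but for our purposes we shall only require the case where $\alpha = \beta$, and this will considerably simplify the formulae. We let $P_n^\alpha(x) = P_n^{\alpha,\alpha}(x)$. Then we have
\[ \begin{cases} P_0^\alpha(x) = 1, \\ P_1^\alpha(x) = (\alpha+1)x, \\ \quad\vdots \\ P_{n+1}^\alpha(x) = A_n^\alpha\, x P_n^\alpha(x) - C_n^\alpha\, P_{n-1}^\alpha(x), \end{cases} \tag{12.15} \]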
where
\[ A_n^\alpha = \frac{(2n+2\alpha+1)(n+\alpha+1)}{(n+1)(n+2\alpha+1)}, \qquad C_n^\alpha = \frac{(n+\alpha)(n+\alpha+1)}{(n+1)(n+2\alpha+1)}. \tag{12.16} \]
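The recursion (12.15) is easy to exercise numerically. The following is a minimal sketch under stated assumptions: $C_n^\alpha$ is taken as $(n+\alpha)(n+\alpha+1)/((n+1)(n+2\alpha+1))$, the value that follows from the classical Jacobi three-term recurrence with $\alpha = \beta$, and the function names are illustrative. For $\alpha = 0$ the recursion should reproduce the Legendre polynomials, and for $\alpha = \tfrac12$ the weighted inner products should match (12.14). The substitution $x = \cos\theta$ is used so that a midpoint rule integrates the resulting trigonometric polynomial accurately.

```python
import math


def jacobi_symmetric(n_max, alpha, x):
    """Return [P_0^alpha(x), ..., P_{n_max}^alpha(x)] via the three-term recursion."""
    values = [1.0, (alpha + 1.0) * x]
    for n in range(1, n_max):
        A = (2 * n + 2 * alpha + 1) * (n + alpha + 1) / ((n + 1) * (n + 2 * alpha + 1))
        # Assumed form of C_n^alpha (classical Jacobi recurrence with alpha = beta).
        C = (n + alpha) * (n + alpha + 1) / ((n + 1) * (n + 2 * alpha + 1))
        values.append(A * x * values[n] - C * values[n - 1])
    return values[: n_max + 1]


def weighted_inner(n, m, alpha, nodes=400):
    """Midpoint rule for int_{-1}^{1} P_n P_m (1 - x^2)^alpha dx with x = cos(theta)."""
    h = math.pi / nodes
    total = 0.0
    for i in range(nodes):
        theta = (i + 0.5) * h
        p = jacobi_symmetric(max(n, m, 1), alpha, math.cos(theta))
        total += p[n] * p[m] * math.sin(theta) ** (2.0 * alpha + 1.0)
    return total * h


def gamma_n(n, alpha):
    """Normalisation constant from (12.14) in the symmetric case alpha = beta."""
    return (2.0 ** (2 * alpha + 1) * math.gamma(n + alpha + 1) ** 2
            / ((2 * n + 2 * alpha + 1) * math.factorial(n) * math.gamma(n + 2 * alpha + 1)))


legendre = jacobi_symmetric(3, 0.0, 0.5)   # alpha = 0: Legendre polynomials at x = 0.5
print(legendre[2], legendre[3])            # compare with (3x^2 - 1)/2 and (5x^3 - 3x)/2
print(weighted_inner(1, 3, 0.5))           # orthogonality for alpha = 1/2
print(weighted_inner(2, 2, 0.5), gamma_n(2, 0.5))
```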
Again following standard notation we have

Definition 12.3. $J_n^{-\alpha}(x) := (1-x^2)^\alpha P_n^\alpha(x)$, $\alpha > -1$.

Then it can be verified that $J_n^{-\alpha}(-x) = (-1)^n J_n^{-\alpha}(x)$ and $\frac{d^k}{dx^k} J_n^{-\alpha}(\pm 1) = 0$ for $k = 0, 1, \ldots, \lceil\alpha\rceil - 1$, and it satisfies a three term recursion scheme analogous to (12.15) as well as the orthogonality condition
\[ \int_{-1}^{1} J_n^{-\alpha}(x)\, J_m^{-\alpha}(x)\, \omega^{-\alpha,-\alpha}(x)\, dx = \gamma_n^{\alpha,\alpha}\, \delta_{mn}. \tag{12.17} \]
More information on these functions, including their numerical analysis, can be found in [239].

Solution of the fractional Riesz Poisson equation. We now consider the solution to the inhomogeneous problem
\[ (-1)^\ell D^{2\alpha} u(x) = f(x), \quad -1 < x < 1, \qquad u^{(\ell)}(\pm 1) = 0, \tag{12.18} \]
where $2\alpha \in [2\ell - 1, 2\ell)$ and $f \in L^2_{\omega^\alpha}$. The classical solution can be found by expanding $f(x)$ in the polynomials $\{P_n^\alpha(x)\}$ and seeking a solution $u(x)$ in an expansion with the functions $\{J_n^{-\alpha}(x)\}$:
\[ f(x) = \sum_{j=0}^{\infty} f_j P_j^\alpha(x), \qquad u(x) = \sum_{j=0}^{\infty} u_j J_j^{-\alpha}(x). \tag{12.19} \]
The function $u(x)$ satisfies the boundary conditions and
\[ D^{2\alpha} u(x) = \sum_{n=0}^{\infty} u_n D^{2\alpha} J_n^{-\alpha}(x) = (-1)^\ell \sum_{n=0}^{\infty} \frac{\Gamma(n+1+2\alpha)}{n!}\, u_n P_n^\alpha(x). \]
Now substituting the above into (12.19) and (12.18) and using the fact that $\{P_j^\alpha\}$ are orthogonal shows that the basis coefficients are related by
\[ u_n = \frac{n!\, f_n}{\Gamma(n+1+2\alpha)}, \qquad n = 0, 1, \ldots. \tag{12.20} \]
To illustrate the above method and the dependence of the solution on $\alpha$, we take the case where $1 < 2\alpha \le 2$, and in order not to let aspects of $f$ paint a perhaps misleading perspective, we take the intrinsically featureless function $f = 1$. We will also compare this Abel integral-based fractional derivative with the solution generated by the fractional (one-dimensional) Laplacian according to (12.6) for the same $f$ and boundary conditions at $x = \pm 1$. Here the relevant basis (eigen)functions for the interval $[-1, 1]$ are $\{\sin(\tfrac12 n\pi(x+1))\}$ and the eigenvalues are $\lambda_n = \left(\tfrac{n\pi}{2}\right)^{2\alpha}$; see Figure 12.1.
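The comparison just described can be reproduced in a few lines. This is a sketch under stated assumptions rather than the computation behind Figure 12.1. For $f = 1 = P_0^\alpha$ only the $n = 0$ coefficient is nonzero, and balancing the two factors $(-1)^\ell$ when the expansion is substituted into (12.18) gives $u_0 = 1/\Gamma(1+2\alpha)$, so the Riesz solution is $u(x) = (1-x^2)^\alpha/\Gamma(1+2\alpha)$. The spectral solution is summed as a truncated sine series. The function names and the truncation level are illustrative choices.

```python
import math


def riesz_solution(x, alpha):
    """Riesz-Poisson solution for f = 1: the single term u_0 * J_0^{-alpha}(x)."""
    return (1.0 - x * x) ** alpha / math.gamma(1.0 + 2.0 * alpha)


def spectral_solution(x, alpha, terms=2000):
    """Truncated eigenfunction series for A^alpha u = 1 on [-1, 1], u(+-1) = 0."""
    total = 0.0
    for j in range(terms):
        n = 2 * j + 1                       # even modes have zero coefficient for f = 1
        f_n = 4.0 / (n * math.pi)           # <1, sin(n pi (x + 1) / 2)> on [-1, 1]
        lam = (n * math.pi / 2.0) ** (2.0 * alpha)
        total += f_n / lam * math.sin(0.5 * n * math.pi * (x + 1.0))
    return total


for two_alpha in (1.1, 1.5, 1.75, 1.9, 2.0):
    a = two_alpha / 2.0
    print(two_alpha, riesz_solution(0.0, a), spectral_solution(0.0, a))
```

At $2\alpha = 2$ both expressions reduce to $\tfrac12(1 - x^2)$; for smaller orders the midpoint value of the Riesz solution is the larger of the two.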
[Figure 12.1: two panels showing $u(x)$, each with curves for $2\alpha = 1.1$, $1.5$, $1.75$, $1.9$.]

Figure 12.1. Solution to Poisson's equation for selected $\alpha$ values. Left, Riesz derivative; right, fractional Laplacian.
One sees that the solutions, although similar, have definite differences. Note the scaling between the Riesz and fractional Laplacian cases and also the much steeper slopes at the endpoints in the Riesz case. If we were to extend these graphs to $\alpha$-values less than unity, then due to the $(1-x^2)^\alpha$ term inherent in the computation of the Riesz derivative, the slopes at the endpoints would increase as expected, and as $\alpha \to 0^+$ the solution would tend to a constant on $(-1, 1)$ but with $u(\pm 1) = 0$. In both the Riesz and fractional Laplacian cases, the solutions converge to $\tfrac12(1 - x^2)$ as $\alpha \to 1^-$. This is, of course, the solution obtained from the boundary value problem $-u'' = 1$, $u(-1) = u(1) = 0$.

Remark 12.1. The reader might want to compare the solutions in Figure 12.1 to those for the one-sided Abel integral-based derivatives in Figure 5.4.

12.1.3. Extension theorems. Suppose $\Omega$ is an open set in $\mathbb{R}^d$, and we ask if there is a nontrivial function $u \in H^2(\mathbb{R}^d)$ that satisfies $u = 0$ and hence $\Delta u = 0$ in $\Omega$. The answer is, of course, yes: the function and the operator are locally defined, and what happens in two disjoint sets is totally uncorrelated. Is the same result true with $(-\Delta)^s u$ for $0 < s < 1$? Now the nonlocality plays a significant role, and the answer is more complex. In particular, we certainly should expect that any nontrivial values in an open set $\Omega$ should pervade all of $\mathbb{R}^d$. In fact, we have

Theorem 12.7. Let $0 < s < 1$. If $u \in H^{-r}(\mathbb{R}^d)$ for some $r \in \mathbb{R}$, and if both $u$ and $(-\Delta)^s u$ vanish in some open set, then $u \equiv 0$.

See, for example, [112], although this result dates back to Riesz [289]. The ramifications of this are considerable. In a typical inverse problem there are terms such as unknown coefficients that are defined in the context of the partial differential operator $L$ in the domain $\Omega$, and measurements
are made in the exterior of $\Omega$, typically on $\partial\Omega$, in order to recover this interior information. The ability to achieve this depends strongly on $L$, and there are specific rather than general results obtained. In the case of fractional operators and nonlocality, there is much more opportunity for exterior information to be transferred into the interior of $\Omega$ for the recovery of unknowns. Even if this is possible in the classical situation, one must expect that the degree of ill-conditioning involved will be much greater than with a nonlocal operator. We shall see an example of this in Section 12.2.

One of the remarkable results for nonlocal operators of a certain class defined in subsets of $\mathbb{R}^d$ is that they have a purely local analogue in one dimension higher, $\mathbb{R}^{d+1}$. There are several consequences of this result in addition to the fact that it allows classical tools to be employed. Of note here is that some inverse problems such as the Calderón problem are much less ill-conditioned in higher space dimensions, and so lifting the fractional counterpart up one further dimension by an extension theorem can lead to a lesser degree of ill-conditioning. Turning this around, it also can show that the fractional Calderón problem is less ill-conditioned than its classical counterpart. We will return to this in a later section. The ability to use well-understood classical methods is of course extremely important: for one, it allows standard numerical solvers for the resulting extension to be used, and this can be a considerable advantage despite the increase in dimension.

Extension results for operators based on fractional powers of (in particular elliptic) operators have been known for some time, but the period since 2007 has shown a remarkable increase in number and sophistication.
This has much to do with the paper by Caffarelli and Silvestre [43], and we will give a brief overview of their extension of the fractional Laplacian based in $\mathbb{R}^d$ and its equivalent classical extension lying in $\mathbb{R}^{d+1}$. However, many other works are worthy of consultation in this regard, and we mention [40], which also has physical science applications. See in addition [44, 112].

Let us suppose we have a smooth, bounded function $u : \mathbb{R}^d \to \mathbb{R}$. We extend it harmonically to $\mathbb{R}^d \times \mathbb{R}^+$ by $U : \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}$ satisfying
\[ \Delta_{x,y} U(x, y) = 0 \quad \text{in } \mathbb{R}^d \times \mathbb{R}^+, \qquad \text{with } U(x, 0) = u(x). \tag{12.21} \]
The Dirichlet to Neumann map is then defined by
\[ \mathrm{DtN} : u \mapsto -\frac{\partial U}{\partial y}(\cdot, 0). \tag{12.22} \]
The operator DtN has the following properties:

• $(\mathrm{DtN})^2 = -\Delta$. This follows since $-\Delta_{x,y} U = -\Delta_x U - \frac{\partial^2 U}{\partial y^2} = 0$ and then $(\mathrm{DtN})^2 u = \frac{\partial}{\partial y}\frac{\partial U}{\partial y}(\cdot, 0) = -\Delta_x u$.
• DtN is a positive operator. Indeed, since $U$ is harmonic, an integration by parts shows
\[ 0 = -\int_{\mathbb{R}^d \times \mathbb{R}^+} U\, \Delta_{x,y} U\, dx\, dy = \int_{\mathbb{R}^d \times \mathbb{R}^+} |\nabla_{x,y} U|^2\, dx\, dy + \int_{\mathbb{R}^d} \left(U \frac{\partial U}{\partial y}\right)(x, 0)\, dx, \]
and therefore
\[ \int_{\mathbb{R}^d} u\, (\mathrm{DtN}\, u)\, dx = -\int_{\mathbb{R}^d} \left(U \frac{\partial U}{\partial y}\right)(x, 0)\, dx \ge 0. \]

The above allows us to define $(-\Delta_x)^{1/2} := \mathrm{DtN}$, that is, $(-\Delta_x)^{1/2} u := -\frac{\partial U}{\partial y}(\cdot, 0)$, which is the Neumann trace on the surface $\mathbb{R}^d \times \{0\}$, and hence the term Dirichlet to Neumann operator. Of course the above only defines the square root of the negative Laplacian, and the next step will be to extend this to arbitrary powers $0 < s < 1$. To achieve this, we consider the following equation and initial/boundary conditions. For $u : \mathbb{R}^d \to \mathbb{R}$, we take the boundary value problem for an elliptic equation (with a coefficient that exhibits a singularity),
\[ \Delta_x U + \frac{a}{y} U_y + U_{yy} = 0, \qquad U(x, 0) = u(x). \tag{12.23} \]
This is identical to the previous case when $a = 0$. Note that the pde in (12.23) can be written as
\[ \operatorname{div}_{x,y}\left(y^a \nabla_{x,y} U\right) = 0, \tag{12.24} \]
and this can be identified with the Euler–Lagrange equation for the functional
\[ J(U) = \int_{y>0} y^a |\nabla U|^2\, dX. \tag{12.25} \]
The goal below is to show that
\[ C (-\Delta)^s u = -\lim_{y\to 0^+} y^a U_y = -\frac{1}{1-a} \lim_{y\to 0} \frac{U(x, y) - U(x, 0)}{y^{1-a}}, \tag{12.26} \]
where $s = \tfrac12(1-a)$ and the constant $C = C_{d,s}$. If we now make the change of variables $z = \left(\frac{y}{1-a}\right)^{1-a}$, then we obtain equation (12.23) in nondivergence form,
\[ \Delta_x U + z^\beta U_{zz} = 0, \qquad \beta = \frac{-2a}{1-a}. \tag{12.27} \]
Since $U_z = y^a U_y$, the claim becomes that (up to a multiplicative constant) we have $(-\Delta)^s u(x) = -\lim_{y\to 0^+} y^a U_y(x, y) = -U_z(x, 0)$, showing the connection to the Neumann map.
By taking Fourier transforms with respect to $x$ in equation (12.27), it becomes $-|\xi|^2\, \hat{U}(\xi, z) + z^\beta\, \hat{U}_{zz}(\xi, z) = 0$, and we thus obtain an ordinary differential equation for each value of $\xi$. If now $\varphi : [0, \infty) \to \mathbb{R}$ solves
\[ -z^\beta \varphi''(z) + \varphi(z) = 0, \qquad \varphi(0) = 1, \qquad \lim_{z\to\infty} \varphi(z) = 0, \tag{12.28} \]
then a scaling shows that $\hat{U}(\xi, z) = \hat{U}(\xi, 0)\, \varphi\left(|\xi|^{\frac{2}{2-\beta}} z\right)$, and so
\[ \hat{U}_z(\xi, 0) = \hat{U}(\xi, 0)\, |\xi|^{\frac{2}{2-\beta}}\, \varphi'(0) = C_a |\xi|^{1-a}\, \hat{U}(\xi, 0) = C_a |\xi|^{1-a}\, \hat{u}(\xi). \tag{12.29} \]
Provided we can show that such a $\varphi$ exists, we have proven (12.26). We note that $\varphi = \varphi_\beta(z)$ is very similar to a Bessel function, and its existence follows from standard expansion in series arguments. We remark that the above is taken from the paper [43], which gives no fewer than four alternative versions of this proof!
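The scaling step can be spelled out as follows. This is a sketch of the exponent bookkeeping only, assuming a solution $\varphi$ of (12.28) exists; the symbol $\lambda$ for the scaling factor is introduced here purely for illustration.

```latex
% Ansatz: \hat U(\xi,z) = \hat U(\xi,0)\,\varphi(\lambda z), with \lambda = \lambda(\xi) > 0 to be determined.
% Using (12.28) in the form w^{\beta}\varphi''(w) = \varphi(w) at w = \lambda z:
\begin{align*}
  z^{\beta}\,\hat U_{zz}(\xi,z)
    &= \hat U(\xi,0)\,\lambda^{2}\, z^{\beta}\,\varphi''(\lambda z)
     = \hat U(\xi,0)\,\lambda^{2-\beta}\,(\lambda z)^{\beta}\,\varphi''(\lambda z) \\
    &= \lambda^{2-\beta}\,\hat U(\xi,0)\,\varphi(\lambda z)
     = \lambda^{2-\beta}\,\hat U(\xi,z).
\end{align*}
% Hence -|\xi|^{2}\hat U + z^{\beta}\hat U_{zz} = 0 holds precisely when
% \lambda^{2-\beta} = |\xi|^{2}, i.e. \lambda = |\xi|^{2/(2-\beta)}.
% With \beta = -2a/(1-a):
\begin{align*}
  2-\beta = 2 + \frac{2a}{1-a} = \frac{2}{1-a}
  \quad\Longrightarrow\quad
  \frac{2}{2-\beta} = 1-a = 2s .
\end{align*}
```

The exponent of $|\xi|$ in (12.29) is therefore $2s$, which matches the symbol $|\xi|^{2s}$ in (12.9).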
12.2. The Calderón problem

We now turn to inverse problems involving the fractional Laplacian, more precisely to just a single example, which appears to be the most prominent and best-studied one. Also its classical (integer derivative) counterpart is certainly among the most well-known inverse problems overall, probably even the most well-known one.

12.2.1. Electrical impedance tomography and the classical Calderón problem. A classical inverse problem is the following: Let $\Omega$ be an open, bounded, simply connected subset of $\mathbb{R}^d$ with smooth boundary $\partial\Omega$. The electrical conductivity of $\Omega$ is $\gamma(x)$, which is assumed to be positive and bounded. The question is whether $\gamma(x)$ can be determined by making a series of measurements of voltage and current pairs at the boundary $\partial\Omega$. This is known as the Electrical Impedance Tomography (eit) method.

The physical background is the following. By Ohm's law the electric current density is given by $-\gamma\nabla u$, where $-\nabla u$ is the electric field, and in the absence of an additional impressed current, by Ampère's law, the governing equation for the electric potential $u$ is $\nabla\cdot(\gamma(x)\nabla u) = 0$. Thus given an induced voltage $f$ on $\partial\Omega$, the potential $u$ solves the Dirichlet boundary value problem
\[ \nabla\cdot(\gamma(x)\nabla u) = 0 \quad \text{in } \Omega, \qquad u = f \quad \text{on } \partial\Omega. \tag{12.30} \]
The aim is to recover $\gamma(x)$ and thereby infer information about the medium within $\Omega$. The observations are defined as follows: for each value of the input voltage $f$, measure the associated boundary current $\gamma\frac{\partial u}{\partial\nu}$. Thus the problem is to recover $\gamma$ from the Dirichlet to Neumann map
\[ \Lambda_\gamma : f \mapsto \left.\gamma(x)\frac{\partial u}{\partial\nu}\right|_{\partial\Omega}. \tag{12.31} \]
Calderón was interested in this problem from the perspective of oil recovery when he worked as an engineer for the state oil company of Argentina in the 1940s, but since then the number of applications has grown considerably. Perhaps of particular note is medical imaging or nondestructive testing, where a series of different current patterns is set up at electrodes surrounding the area to be imaged and the resulting voltage patterns are measured.

For $\gamma \in C^2(\overline{\Omega})$, a change of variables, setting $\check{u} = \sqrt{\gamma}\, u$, $q(x) = (\Delta\sqrt{\gamma})/\sqrt{\gamma}$, allows the problem to be reduced to the Schrödinger form
\[ -\Delta u + q(x) u = 0 \quad \text{in } \Omega, \tag{12.32} \]
and this has been the preferred form in most situations. The first uniqueness result was due to Kohn and Vogelius, first for the case of analytic $q(x)$, then in 1984 for the case of piecewise analytic conductivities when $\Omega \subset \mathbb{R}^2$ [199]. This was followed by Sylvester and Uhlmann in 1987 for the case $\Omega \subset \mathbb{R}^d$ with $d \ge 3$ and $q \in C^\infty(\Omega)$ [324]. A year later Nachman solved the $\mathbb{R}^2$ case for $q \in L^\infty(\Omega)$ [257].

This problem is severely ill-posed in all dimensions (it is trivially nonunique for $d = 1$), but a dimension count on the available information from the Dirichlet to Neumann map is enlightening. We seek to recover a function of $d$ variables from a total of $(d-1)+(d-1)$-dimensional available boundary measurements. For $d \ge 3$ the problem is actually overposed on this basis, and the more overposed, the greater the value of $d$; for $d = 2$ the dimension count matches exactly.

Despite the huge effort this problem has received, there are many remaining open questions. As an example, can we obtain uniqueness if data is measured only on part of the boundary $\partial\Omega$? In the case of $\mathbb{R}^2$ an affirmative answer was given in [157]: one needs measurements on a nonempty open subset $\Gamma$ of $\partial\Omega$. The case of $\mathbb{R}^3$ is more complex. It was shown that in some sense measurements over half of $\partial\Omega$ suffice and, later in [194], that this can be reduced to an arbitrary open subset of the boundary, provided $\Omega$ is convex.

The techniques used in the above vary between the cases $d = 2$ and $d \ge 3$. In the latter case the standard method used is that of complex geometrical optics solutions. Excellent surveys of results for the Calderón problem can be found in [35, 335].

12.2.2. The fractional Calderón problem. First we pose the question of whether in the classical problem one might replace the data measured on $\partial\Omega$ by data measured on a set $W$ where $W \cap \Omega = \emptyset$. A moment's thought will quickly dispense with this idea, as there is absolutely no longer any correlation between the measured values and the information on $q$ (and
hence $\gamma$) inside $\Omega$, as noted in Section 12.1. However, in the fractional case this is no longer true; the values of the solution of a fractional operator equation depend not simply on values in a single domain $\Omega$ but also on those in $\mathbb{R}^d \setminus \Omega$. This gives hope that a version of the Calderón problem cast with a fractional operator may have much better behaviour in terms of uniqueness and ill-conditioning. In fact this turns out to be the case.

For the setting, let $\Omega \subset \mathbb{R}^d$, $d \ge 1$, be a bounded domain with Lipschitz boundary $\partial\Omega$ and $\Omega_e = \mathbb{R}^d \setminus \Omega$. Take $\alpha \in (0, 1)$ and define $(-\Delta)^\alpha$ to be the fractional Laplacian based on the Fourier transform, as in equation (12.7). Solutions $u$ will be taken in the Sobolev space $H^\alpha(\mathbb{R}^d)$. Let $q \in L^\infty(\Omega)$. We assume that $\lambda = 0$ is not a Dirichlet eigenvalue for the operator $(-\Delta)^\alpha + q$ on $\Omega$; that is, if $u \in H^\alpha(\mathbb{R}^d)$ solves $((-\Delta)^\alpha + q)u = 0$ in $\Omega$ and $u|_{\Omega_e} = 0$, then $u \equiv 0$. This holds if, for example, $q \ge 0$.

Then we have the following forwards problem for the fractional Schrödinger equation:
\[ \left((-\Delta)^\alpha + q\right) u = 0 \quad \text{in } \Omega, \qquad u = f \quad \text{in } \Omega_e. \tag{12.33} \]
It can be shown that there is, for any given $q$, a unique solution for any $f \in H^\alpha(\Omega_e)$ [112]. The assumption is that we have access to measurements of solutions $u$ outside of $\Omega$, and the inverse problem is to determine $q$ from these. Specifically, these will be encoded in the exterior Dirichlet to Neumann map
\[ \Lambda_q : H^\alpha(\Omega_e) \to H^\alpha(\Omega_e)^* \quad \text{with} \quad \Lambda_q f = \left.(-\Delta)^\alpha u\right|_{\Omega_e}, \tag{12.34} \]
where $H^\alpha(\Omega_e)^*$ is the dual space. The following result shows that exterior measurements, even on arbitrary and possibly disjoint subsets of $\Omega_e$, uniquely determine the potential $q$ in $\Omega$; see [112]. The summary notes by Salo [307] are worth consulting here, and the proof below is based on these notes.

Theorem 12.8 ([112, Theorem 1.1]). Let $\Omega$ be bounded and open, assume $q_1, q_2 \in L^\infty(\Omega)$, and let $W_1, W_2 \subset \Omega_e$ be open sets. If the DtN maps (12.34) for the equation (12.33) in $\Omega$ satisfy
\[ \left.\Lambda_{q_1} f\right|_{W_2} = \left.\Lambda_{q_2} f\right|_{W_2} \quad \text{for any } f \in C_c^\infty(W_1), \tag{12.35} \]
then $q_1 = q_2$ in $\Omega$.

Proof. We sketch the main idea, and in fact after obtaining the same integral identity below as appears in most formulations of the classical proof, the key steps we need rely only on Theorems 12.7 and 12.6.
With $u_i$ denoting the solution of (12.33) with $q = q_i$, $f = f_i$, $i = 1, 2$, the following integral identity holds:
\[ \left\langle (\Lambda_{q_1} - \Lambda_{q_2}) f_1, f_2 \right\rangle_{\Omega_e} = \int_\Omega (q_1 - q_2)\, u_1 u_2\, dx. \tag{12.36} \]
This is in fact simply an integration by parts formula arising from the definition of the DtN map. Now if $\left.\Lambda_{q_1} f\right|_{W_2} = \left.\Lambda_{q_2} f\right|_{W_2}$ for all $f \in C_c^\infty(W_1)$, then (12.36) implies that
\[ \int_\Omega (q_1 - q_2)\, u_1 u_2\, dx = 0 \tag{12.37} \]
for all $u_j \in H^\alpha(\mathbb{R}^d)$ solving $\left((-\Delta)^\alpha + q_j\right) u_j = 0$ in $\Omega$ with $u_j|_{\Omega_e} \in C_c^\infty(W_j)$ (the set of compactly supported $C^\infty$ functions).

It is therefore enough to show that the products $\{u_1 u_2|_\Omega\}$ of such solutions form a complete set in $L^1(\Omega)$. One can fix any $v \in L^2(\Omega)$ and, by an analogue of Theorem 12.6 for the fractional Schrödinger equation, choose solutions $u_j^{(k)}$ satisfying $u_j^{(k)}|_{\Omega_e} \in C_c^\infty(W_j)$ such that
\[ u_1^{(k)} \to v, \qquad u_2^{(k)} \to 1 \qquad \text{in } L^2(\Omega) \text{ as } k \to \infty. \]
Inserting these solutions into (12.37) and letting $k \to \infty$ gives
\[ \int_\Omega (q_1 - q_2)\, v\, dx = 0, \tag{12.38} \]
and since $v$ was arbitrary, we have $q_1 = q_2$.
Remark 12.2. The same method proves Theorem 12.8 in all dimensions $d \ge 1$. This is in contrast to the classical Calderón problem, where different methods are usually needed for $d = 2$ and $d \ge 3$ (and of course uniqueness fails for $d = 1$). Note also that uniqueness holds with measurements made in arbitrarily small, possibly disjoint sets in $\Omega_e$. This is again in contrast to the classical problem. Still, the severe instability of the classical Calderón problem seems to remain valid in the fractional case [294].

The Dirichlet to Neumann map for the classical Calderón problem is known to have certain monotonicity properties [156, 192]. This can be effectively used in numerical reconstruction schemes [142]. The monotonicity result carries over to the fractional case as follows; see [140, 141]. To this end, we introduce an ordering between operators $A, B : H^\alpha(\Omega_e) \to H^\alpha(\Omega_e)^*$ by
\[ A \preceq B \iff \langle (B - A) f, f \rangle \ge 0 \quad \text{for all } f \in H^\alpha(\Omega_e). \]
Theorem 12.9. For any $q_1, q_2 \in L^\infty(\Omega)$, the relation $0 \le q_1 \le q_2$ implies $\Lambda_{q_1} \preceq \Lambda_{q_2}$.

Of course, alternatively, the fractional Calderón problem can be numerically treated by any of the regularization methods from Section 8.2. This requires forward computation of solutions to the fractional Schrödinger equation (12.33), and we point to the vast and still growing literature on numerical methods for fractional differential equations in higher space dimensions; see, e.g., [33, 74, 139] and the references therein.
12.3. Notes
As noted earlier in this chapter, there are many different ways to define fractional operators, even for one as basic as the negative Laplacian. In the case of the domain being the whole of $\mathbb{R}^n$, many of these definitions lead to equivalent operators. In fact there is a well-referenced paper “Ten Equivalent Definitions of the Fractional Laplace Operator” [205] showing that calculations such as those in Section 12.1 can be performed for many other versions. However, there is a fundamental difference when the domain $\Omega$ of the operator is not all of $\mathbb{R}^n$, for in this case the boundary conditions on $\partial\Omega$ can play a significant role. One option, for example for homogeneous Dirichlet conditions, is to extend the solution by zero in the exterior $\mathbb{R}^n \setminus \Omega$. There is a definite context for this in terms of the random walk model: the Brownian random walk is replaced by a so-called killed isotropic α-stable Lévy process; see [244]. This gives behaviour similar to Dirichlet boundary conditions for the classical integer order versions of the operators. As such, there is a near parallel for many classical inverse problems that have space-dependent unknowns such as coefficients that arise within the operators, and there are many interesting (and difficult) questions about whether these can be recovered uniquely, and if so, what the regularity of the forward operator is, and in turn what the relative degree of ill-conditioning of the associated inverse problems is.
There is also the possibility of allowing the solution to extend outside of the domain $\Omega$, as we saw in the fractional Calderón problem. If the physical model enables this, then nonlocality allows measurements to be taken exterior to the domain $\Omega$ in order to recover information about the operator in its interior. This can substantially change the ability to recover such information; recovery from exterior measurements of this kind is of course impossible in the classical setting.
Thus from an inverse problem perspective, there are two open spheres of largely unsolved problems with genuine physical applications.
The first class consists of revisiting classical undetermined coefficient/term problems, where the change is the replacement of $-\Delta$ (or a more general elliptic operator) with a version of one of its fractional counterparts, as the physical model and Lévy process dictate. This group includes some of the basic building blocks that have appeared extensively in this book such as the inverse Sturm–Liouville problem. The second group occurs when the nonlocality of the operator extends outside of the domain $\Omega$ and allows for measurements to be made potentially far from the object. This has no counterpart in the classical case, and again it opens up a wide range of challenging mathematical problems. An excellent survey of some of the possibilities that are based on different Lévy processes and their connection to physics (although not focused on inverse problems) can be found in the article [221].
Appendix A
Mathematical Preliminaries
A.1. Integral transforms
In this part, we recall the Laplace and Fourier transforms. In one dimension and with complex arguments, the former can be viewed as a restricted (one-sided) version of the latter, but the intrinsic importance of the Laplace transform to fractional calculus and the special functions arising from this make it essential to study on its own. Both these transforms play an important role in deriving the fractional diffusion equation and in analysing the solution properties. They also play a role in the development and analysis of several numerical schemes.
A.1.1. Laplace transform. The Laplace transform of a function $f : \mathbb{R}^+ \to \mathbb{R}$, denoted by $\hat f$ or $\mathcal{L}[f]$, is defined by
$\hat f(z) = \mathcal{L}[f](z) = \int_0^\infty e^{-zt} f(t)\,dt$
for those values of $z$ for which this integral exists. Suppose $f$ is locally integrable on $[0, \infty)$, and there exist some $c > 0$, $\lambda \in \mathbb{R}$ such that
$|f(t)| \le c\,e^{\lambda t}$ for large $t$.
In other words, the function $f(t)$ must not grow faster than a certain exponential function as $t \to \infty$. Then $\hat f(z)$ exists and is analytic for $\mathrm{Re}(z) > \lambda$.
The Laplace transform of the convolution
(A.1) $(f * g)(t) = \int_0^t f(t - s)\, g(s)\, ds = \int_0^t f(s)\, g(t - s)\, ds$
satisfies the convolution rule
(A.2) $\widehat{f * g}(z) = \hat f(z)\, \hat g(z)$,
provided that both $\hat f(z)$ and $\hat g(z)$ exist. This rule is very useful for evaluating the Laplace transform of fractional derivatives.
Example A.1. In this example, we compute the Laplace transform of the function $\omega_\alpha(t) = t^{\alpha-1}$. For $\alpha > 0$, the substitution $s = tz$ gives
$\mathcal{L}[\omega_\alpha](z) = \int_0^\infty e^{-zt} t^{\alpha-1}\, dt = \int_0^\infty e^{-s} \left(\frac{s}{z}\right)^{\alpha-1} \frac{ds}{z} = z^{-\alpha} \int_0^\infty e^{-s} s^{\alpha-1}\, ds = \Gamma(\alpha)\, z^{-\alpha}$,
where $\Gamma(\alpha)$ is the Gamma function, which we discuss in Section 3.3. We use this formula in defining fractional powers of an operator in Section 12.1.
Example A.2. The Abel fractional integral ${}_0I_t^\alpha f(t)$ is a convolution with $\omega_\alpha$, i.e., ${}_0I_t^\alpha f(t) = \frac{1}{\Gamma(\alpha)} (\omega_\alpha * f)(t)$. Hence, by the convolution rule, the Laplace transform of the fractional integral ${}_0I_t^\alpha f$ is given by
$\mathcal{L}[{}_0I_t^\alpha f](z) = z^{-\alpha} \hat f(z)$.
The Laplace transform of the $n$th order derivative $f^{(n)}$ is given by
$\mathcal{L}[f^{(n)}](z) = z^n \hat f(z) - \sum_{k=0}^{n-1} z^{n-k-1} f^{(k)}(0^+)$,
which follows from straightforward integration by parts in the definition, provided that all the involved integrals make sense.
We also have the standard inversion formula for the Laplace transform
(A.3) $f(t) = \frac{1}{2\pi i} \int_{a - i\infty}^{a + i\infty} e^{zt} \hat f(z)\, dz$ for real $a$,
where the integral is interpreted as taken over a semicircular contour with straight edge along $\mathrm{Re}\, z = a$ and infinite radius, enclosing any poles of the meromorphic function $\hat f$. In practice one uses the semicircular contour with radius $R$ and takes the limit as $R \to \infty$. If the function $\hat f(z)$ entails a branch cut, then the contour must be deformed further into a Hankel path as can be seen in Figure 3.1. The direct evaluation of the inversion formula is often inconvenient, but it sometimes gives very useful information on the behaviour of the unknown $f(t)$, especially the asymptotics as $t \to 0^+$ and $t \to \infty$ (commonly known as a Karamata-Feller type Tauberian theorem) [101]; see Section 3.2. Numerically, it can also be turned into an efficient algorithm by suitably deforming the integration contour.
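As a numerical sanity check of Example A.1, the following minimal sketch compares a midpoint-rule approximation of $\int_0^\infty e^{-zt} t^{\alpha-1}\,dt$ with the closed form $\Gamma(\alpha) z^{-\alpha}$. The function name `laplace_power`, the truncation point `upper`, and the sample values of $\alpha$ and $z$ are illustrative assumptions, not taken from the text; the quadrature is chosen for simplicity, not efficiency.

```python
import math

def laplace_power(alpha, z, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-z*t) * t**(alpha-1) over [0, upper].

    The infinite range is truncated at `upper` (an assumption that is harmless
    here because the integrand decays like exp(-z*t) for real z > 0).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += math.exp(-z * t) * t ** (alpha - 1.0)
    return total * h

alpha, z = 2.5, 1.3                # illustrative values with alpha > 1
numeric = laplace_power(alpha, z)
exact = math.gamma(alpha) * z ** (-alpha)   # Gamma(alpha) * z^(-alpha) from Example A.1
print(numeric, exact)
```

For $0 < \alpha < 1$ the integrand is singular at $t = 0$ and this plain midpoint rule converges slowly; a substitution or a weighted quadrature would be preferable in that range.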
There are many results that go under the name of Bernstein’s theorem, but the one needed here is concerned with completely monotonic functions. A function $f : [0, \infty) \to \mathbb{R}$ is said to be completely monotonic if $f$ has derivatives of all orders that satisfy
(A.4) $(-1)^n f^{(n)}(x) \ge 0$ for all $n = 0, 1, \ldots$.
Bernstein’s theorem states that $f$ is completely monotonic if and only if it has the representation
(A.5) $f(x) = \int_0^\infty e^{-xt}\, d\mu(t)$,
where $\mu$ is a nonnegative measure on $[0, \infty)$ such that the integral converges for all $x > 0$. In short, $f$ is completely monotonic if and only if it is the Laplace transform of a nonnegative measure. It is easy to show that if $f_1$ and $f_2$ are completely monotonic, then so also are $c f_1$, $f_1 + f_2$, and $f_1 f_2$ for $c > 0$. A proof of the theorem as well as many of the important properties of such functions can be found in [348, Chap. IV].
A.1.2. Fourier transform. The Fourier transform of a function $f : \mathbb{R}^d \to \mathbb{R}$ can be defined by
(A.6) $\hat f(\xi) = \mathcal{F}[f](\xi) = \int_{\mathbb{R}^d} e^{-i\xi\cdot x} f(x)\, dx$, $\xi \in \mathbb{R}^d$,
where $\xi \cdot x$ is the Euclidean inner product of the vectors $(\xi_1, \ldots, \xi_d)$ and $(x_1, \ldots, x_d)$. If $f \in L^1(\mathbb{R}^d)$, then $\hat f$ is continuous on $\mathbb{R}^d$, and $\hat f(\xi) \to 0$ as $|\xi| \to \infty$. The operator $\mathcal{F} : L^1(\mathbb{R}^d) \to L^\infty(\mathbb{R}^d)$ can be extended to a bounded mapping from $L^2(\mathbb{R}^d)$ into itself. The inversion formula is given by
(A.7) $f(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{i\xi\cdot x} \hat f(\xi)\, d\xi$, $x \in \mathbb{R}^d$.
We remark that there are alternative definitions, all associated with the attempt at symmetry between the transform formula and that of its inversion. The above version is common in the physics literature. Alternative definitions are
$\hat f(\xi) = \mathcal{F}[f](\xi) = \int_{\mathbb{R}^d} e^{-2\pi i \xi\cdot x} f(x)\, dx$
with the inversion formula
$f(x) = \int_{\mathbb{R}^d} e^{2\pi i \xi\cdot x} \mathcal{F}[f](\xi)\, d\xi$,
or
$\hat f(\xi) = \mathcal{F}[f](\xi) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{-i\xi\cdot x} f(x)\, dx$
with the inversion formula
$f(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{i\xi\cdot x} \mathcal{F}[f](\xi)\, d\xi$.
Plancherel’s theorem for the Fourier transform (A.6) corresponds to Parseval’s theorem for Fourier series and is often written in terms of the moduli of $f$ and $\hat f(\xi) = \mathcal{F}f(\xi)$,
(A.8) $\int_{\mathbb{R}^d} |f(x)|^2\, dx = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} |\hat f(\xi)|^2\, d\xi$.
Roughly speaking, it states that the integral of a function’s squared modulus is equal to the integral of the squared modulus of its frequency spectrum, up to a factor. Up to the same factor, the Fourier transform also preserves the $L^2$ inner product. For suitable $f$, it is easy to see using integration by parts that
(A.9) $\widehat{\partial_{x_i} f}(\xi) = i\xi_i \hat f(\xi)$.
With $f$ being the plane wave $f(x) = e^{ix\cdot\xi}$, there is a simple relationship: $\Delta e^{ix\cdot\xi} = -|\xi|^2 e^{ix\cdot\xi}$. In other words, the plane wave appears as an eigenfunction of the Laplacian $\Delta$ on $\mathbb{R}^d$ with eigenvalue $-|\xi|^2$. We must be cautious as $e^{ix\cdot\xi}$ fails to be in $L^2(\mathbb{R}^d)$, so it is more accurately called a generalised eigenfunction. Now since the Fourier transform lets us write every function $f$ as a superposition of plane waves, we see that
(A.10) $\Delta f(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} -|\xi|^2 e^{ix\cdot\xi} \hat f(\xi)\, d\xi$.
This can be justified for $f$ suitably smooth and decaying at infinity. Since $\Delta f$ can be represented as $\frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \widehat{\Delta f}(\xi)\, e^{ix\cdot\xi}\, d\xi$ (cf. (A.7)), we obtain for the Laplacian acting on $f$ that
(A.11) $\widehat{\Delta f}(\xi) = -|\xi|^2 \hat f(\xi)$,
which actually also follows directly from (A.9). Equation (A.11) shows that the Fourier transform diagonalizes the Laplacian: the operation of taking the Laplacian, when viewed through the Fourier transform, is simply a multiplication operator by the function $-|\xi|^2$. This quantity, $-|\xi|^2$, is often referred to as the symbol of the Laplacian. In fact the above can be easily extended to higher powers:
(A.12) $\widehat{\Delta^n f}(\xi) = (-|\xi|^2)^n \hat f(\xi)$ for $n = 0, 1, \ldots$.
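The diagonalization property (A.11) can be checked numerically in one dimension, where $\Delta = d^2/dx^2$. The sketch below uses the convention (A.6) and a Gaussian test function; the helper names `fourier_transform`, `gaussian`, and `gaussian_second_derivative`, the truncation interval, and the sample frequency are all illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

def fourier_transform(func, xi, half_width=12.0, steps=24_000):
    """Midpoint-rule approximation of the integral of exp(-i*xi*x) * func(x) over
    [-half_width, half_width], i.e. convention (A.6) in dimension d = 1."""
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += cmath.exp(-1j * xi * x) * func(x)
    return total * h

def gaussian(x):
    # Test function f(x) = exp(-x^2/2); it decays fast enough to truncate safely.
    return math.exp(-0.5 * x * x)

def gaussian_second_derivative(x):
    # f''(x) = (x^2 - 1) exp(-x^2/2), the 1D Laplacian of the test function.
    return (x * x - 1.0) * math.exp(-0.5 * x * x)

xi = 1.7                                               # illustrative frequency
lhs = fourier_transform(gaussian_second_derivative, xi)  # transform of Laplacian f
rhs = -(xi ** 2) * fourier_transform(gaussian, xi)       # symbol times transform of f
print(abs(lhs - rhs))
```

The same helper also reproduces the known transform $\hat f(\xi) = \sqrt{2\pi}\, e^{-\xi^2/2}$ of the Gaussian under this convention, which is a convenient way to confirm which normalisation a given code base uses.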
A.2. Basics of Sobolev spaces
Let $\Omega \subset \mathbb{R}^d$ be an open set, and let $C_c^\infty(\Omega)$ denote the space of infinitely differentiable functions $\phi : \Omega \to \mathbb{R}$ with compact support in $\Omega$. By $L^p(\Omega)$ we denote the Lebesgue space of functions whose $L^p$ norm,
$\|f\|_{L^p} = \left( \int_\Omega |f|^p\, dx \right)^{1/p}$,
is finite. In fact, for $1 \le p < \infty$ the space $L^p(\Omega)$ can be regarded as the closure of $C_c^\infty(\Omega)$ with respect to this norm. (More precisely, it is a set of classes of almost-everywhere equal functions.) In case $p = 2$, we can define an inner product by $\langle f, g \rangle = \int_\Omega f g\, dx$, and $L^2(\Omega)$ is a Hilbert space. The space of locally integrable functions $L^1_{loc}(\Omega)$ consists of those functions $f$ such that $\int_\omega |f|\, dx$ is finite for every compact subset $\omega \subset \Omega$.
If $u, v \in L^1_{loc}(\Omega)$ and $\gamma \in \mathbb{N}_0^d$ is a multi-index $\gamma = (\gamma_1, \ldots, \gamma_d)$, then we say that $v$ is the $\gamma$th weak partial derivative of $u$, $v = D^\gamma u$, provided
(A.13) $\int_\Omega u\, D^\gamma \phi\, dx = (-1)^{|\gamma|} \int_\Omega v\, \phi\, dx$
for all test functions $\phi \in C_c^\infty(\Omega)$, where $|\gamma| = \gamma_1 + \cdots + \gamma_d$.
Note that a weak partial derivative, if it exists, is uniquely defined up to a set of measure zero. This follows since if $v$ and $\tilde v \in L^1_{loc}(\Omega)$ satisfy, for any $\phi \in C_c^\infty(\Omega)$,
$\int_\Omega u\, D^\gamma \phi\, dx = (-1)^{|\gamma|} \int_\Omega v\, \phi\, dx = (-1)^{|\gamma|} \int_\Omega \tilde v\, \phi\, dx$,
then $\int_\Omega (v - \tilde v)\, \phi\, dx = 0$, showing that $v = \tilde v$ a.e.
Definition A.1. Fix $p$, $1 \le p \le \infty$, and let $k$ be a nonnegative integer. Then the Sobolev space $W^{k,p}(\Omega)$ consists of all locally integrable functions $u : \Omega \to \mathbb{R}$ such that for each multi-index $\gamma$ with $|\gamma| \le k$, $D^\gamma u$ exists in the weak sense and lies in $L^p(\Omega)$. We define the norm on this space by
(A.14) $\|u\|_{W^{k,p}(\Omega)} = \begin{cases} \left( \sum_{|\gamma| \le k} \int_\Omega |D^\gamma u|^p\, dx \right)^{1/p} & \text{for } 1 \le p < \infty, \\ \sum_{|\gamma| \le k} \operatorname{ess\,sup}_\Omega |D^\gamma u| & \text{for } p = \infty. \end{cases}$
With this norm the space $W^{k,p}$, for integer $k$ and $1 \le p \le \infty$, is a Banach space; see, for example, [98]. The case of $p = 2$ allows one to retain Hilbert space structure, and in fact $W^{k,2}$ is a Hilbert space, and the special notation $H^k(\Omega) := W^{k,2}(\Omega)$ is commonly used. Moreover, we denote by $W_0^{k,p}(\Omega)$ (or $H_0^k(\Omega)$) the closure of $C_c^\infty(\Omega)$ with respect to the $W^{k,p}$ (or $H^k$) norm.
It is important to remember that the dimension $d$ of the underlying space plays a critical role. If $d = 1$ and $\Omega$ is an open interval, then $u \in W^{1,p}(\Omega)$ if and only if $u$ is almost everywhere equal to an absolutely continuous function whose ordinary derivative, which exists almost everywhere, belongs to $L^p(\Omega)$. In higher dimensions quite singular functions can lie in a Sobolev space. The example below is taken from [98].
If we take $\Omega$ to be the unit ball in $\mathbb{R}^d$ and $u(x) = |x|^{-\gamma}$, $x \in \Omega$, $x \ne 0$, then it can be shown by direct computation that $\int_\Omega u\, \phi_{x_i}\, dx = -\int_\Omega u_{x_i}\, \phi\, dx$ provided that $0 \le \gamma < d - 1$. The bound on $\gamma$ follows from the requirement that the boundary term appearing in the integration by parts tends to zero as the ball $B(0, \varepsilon)$ shrinks to zero. Further, $|\nabla u(x)| = |\gamma|\, |x|^{-\gamma-1}$, and so $\nabla u \in L^p(\Omega)$ if and only if $(1 + \gamma) p < d$. In summary, $u \in W^{1,p}(\Omega)$ if and only if $\gamma < (d - p)/p$, and in particular, for $\gamma > 0$, $u \notin W^{1,p}(\Omega)$ if $p \ge d$.
It is important to know, if say $u \in W^{k,p}(\Omega)$, in what other spaces $u$ might also lie and, further, whether its norm in that other space can be bounded by the Sobolev norm (A.14). In more succinct terminology, in which spaces is $W^{k,p}(\Omega)$ continuously embedded? Here are two such results:
Theorem A.1 (Gagliardo–Nirenberg–Sobolev inequality). Assume $1 \le p < d$. Then there is a constant $C = C(p, d)$ such that for all $u \in C_c^1(\mathbb{R}^d)$
$\|u\|_{L^{\bar p}(\mathbb{R}^d)} \le C\, \|\nabla u\|_{L^p(\mathbb{R}^d)}$,
where $\bar p := dp/(d - p) > p$. The example $u = 1$ shows that compact support is needed here.
Theorem A.2 (Sobolev inequality). Let $\Omega$ be a bounded open set in $\mathbb{R}^d$ with a $C^1$ boundary. Assume $u \in W^{k,p}(\Omega)$.
(1) If $k < \frac{d}{p}$, then $u \in L^q(\Omega)$, where $\frac{1}{q} = \frac{1}{p} - \frac{k}{d}$.
(2) If $k > \frac{d}{p}$, then $u$ lies in the Hölder space $C^{\bar k, \gamma}(\bar\Omega)$, where $\bar k$ is the integer $\bar k = k - \lfloor \frac{d}{p} \rfloor - 1$ and $\gamma = \lfloor \frac{d}{p} \rfloor + 1 - \frac{d}{p}$ if $\frac{d}{p}$ is not an integer, and $\gamma$ can be taken arbitrarily in $(0, 1)$ otherwise.
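The bookkeeping in Theorem A.2 is easy to get wrong by hand, so here is a small sketch that evaluates the target exponents with exact rational arithmetic. The function name `sobolev_target` and the tuple-based return format are hypothetical choices made for this illustration, and the borderline case $k = d/p$, which the theorem as stated does not cover, is simply flagged.

```python
from fractions import Fraction
import math

def sobolev_target(k, p, d):
    """Target space of W^{k,p}(Omega) in R^d according to Theorem A.2.

    k, p, d are positive integers here (an assumption made for exact arithmetic).
    Returns ("L^q", q), ("Holder", k_bar, gamma) with gamma=None meaning
    'any exponent in (0, 1)', or ("borderline", None) when k == d/p.
    """
    ratio = Fraction(d, p)
    if k < ratio:
        # Case (1): 1/q = 1/p - k/d.
        q = 1 / (Fraction(1, p) - Fraction(k, d))
        return ("L^q", q)
    if k > ratio:
        # Case (2): k_bar = k - floor(d/p) - 1, gamma = floor(d/p) + 1 - d/p.
        floor_ratio = math.floor(ratio)
        k_bar = k - floor_ratio - 1
        gamma = floor_ratio + 1 - ratio if ratio.denominator != 1 else None
        return ("Holder", k_bar, gamma)
    return ("borderline", None)

print(sobolev_target(1, 2, 3))   # H^1 in three dimensions
print(sobolev_target(2, 2, 3))   # H^2 in three dimensions
print(sobolev_target(1, 2, 1))   # H^1 on an interval
```

For instance, $H^1$ in three dimensions lands in $L^6$, while $H^2$ in three dimensions and $H^1$ on an interval both land in $C^{0,1/2}$.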
A.2.1. Fractional Sobolev spaces. We also need to look at the case of fractional derivatives, that is, taking $k$ to have noninteger values, in which case the space is often named Sobolev-Slobodeckij space. We do this here only briefly, and we suggest the references [5, 121, 210] for further reading. We define, for $s = k + \sigma$, $k \in \mathbb{N} \cup \{0\}$, $\sigma \in [0, 1)$,
$W^{s,p}(\Omega) = \{ u \in W^{k,p}(\Omega) \mid \|u\|_{W^{s,p}(\Omega)} < \infty \}$,
where
$\|u\|_{W^{s,p}(\Omega)} := \left( \|u\|_{W^{k,p}(\Omega)}^p + \sum_{|\gamma| = k} \int_\Omega \int_\Omega \frac{|D^\gamma u(x) - D^\gamma u(y)|^p}{|x - y|^{d + \sigma p}}\, dx\, dy \right)^{1/p}$.
For any $\beta \ge 0$, we denote by $H^\beta(0, 1)$ the Sobolev space of order $\beta$ on the unit interval $D = (0, 1)$, and we denote by $\tilde H^\beta(D)$ the set of functions in $H^\beta(0, 1)$ whose extensions by zero to $\mathbb{R}$ are in $H^\beta(\mathbb{R})$. For example, it is known that for $\beta \in (0, 1)$, the space $\tilde H^\beta(D)$ coincides with the interpolation space $[L^2(D), H_0^1(D)]_\beta$. It is important for further study to note that $\phi \in \tilde H^\beta(D)$, $\beta > 3/2$, satisfies the boundary conditions $\phi(0) = \phi'(0) = 0$ and $\phi(1) = \phi'(1) = 0$. Analogously, we define $\tilde H_L^\beta(D)$ (resp., $\tilde H_R^\beta(D)$) to be the set of functions $u$ whose extension by zero $\tilde u$ is in $H^\beta(-\infty, 1)$ (resp., $H^\beta(0, \infty)$). Here for $u \in \tilde H_L^\beta(D)$, we set $\|u\|_{\tilde H_L^\beta(D)} := \|\tilde u\|_{H^\beta(-\infty, 1)}$ with the analogous definition for the norm in $\tilde H_R^\beta(D)$. Let $\tilde C_L^\beta[0, 1]$ (resp., $\tilde C_R^\beta[0, 1]$) denote the set of functions $v \in C^\infty[0, 1]$ satisfying $v(0) = v'(0) = \cdots = v^{(k)}(0) = 0$ (resp., $v(1) = v'(1) = \cdots = v^{(k)}(1) = 0$) for any nonnegative integer $k < \beta - 1/2$. It is not hard to see that the spaces $\tilde C_L^\beta[0, 1]$ and $\tilde C_R^\beta[0, 1]$ are dense in the spaces $\tilde H_L^\beta(D)$ and $\tilde H_R^\beta(D)$, respectively. Moreover, these are Banach spaces as well.
A.2.2. Spaces of time dependent functions (Bochner spaces). For the case of time dependent differential equations, we need spaces taking the special role of time into account in an appropriate manner.
Definition A.2. Let $T \in [0, \infty]$, let $X$ be a real Banach space with norm $\|\cdot\|_X$, and denote by $\lambda$ the Lebesgue measure on $(0, T)$. A function $s : (0, T) \to X$ is called simple if and only if it only takes finitely many values $\xi_1, \ldots, \xi_m$ with Lebesgue measurable sets $s^{-1}(\{\xi_i\}) = \{t \in (0, T) : s(t) = \xi_i\}$. A mapping $u : (0, T) \to X$ is called Bochner integrable if and only if there exists a sequence of simple functions $(s_n)_{n \in \mathbb{N}}$ that is a Cauchy sequence with respect to the norm $\|s\|_{L^1(0,T;X)} = \sum_{\xi \in s((0,T))} \lambda(s^{-1}(\{\xi\}))\, \|\xi\|_X$, such that $\|s_n(t) - u(t)\|_X \to 0$ for $\lambda$-a.e. $t \in (0, T)$. Note that for each $n \in \mathbb{N}$, the sum $\sum_{\xi \in s_n((0,T))}$ is finite. The Bochner integral of $u$ is then defined as $\lim_{n \to \infty} \sum_{\xi \in s_n((0,T))} \lambda(s_n^{-1}(\{\xi\}))\, \xi$. In particular its value is independent of the choice of the approximating sequence.
We define the Lebesgue type and Schauder type spaces with values in $X$ via their norms:
$L^p(0, T; X) = \{ u : (0, T) \to X \text{ Bochner integrable} : \|u\|_{L^p(0,T;X)} < \infty \}$;
$\|u\|_{L^p(0,T;X)} := \begin{cases} \left( \int_0^T \|u(t)\|_X^p\, dt \right)^{1/p} & 1 \le p < \infty, \\ \sup_{t \in (0,T)} \|u(t)\|_X & p = \infty; \end{cases}$
$C([0, T], X) = \{ u : [0, T] \to X \text{ continuous} \}$;
$\|u\|_{C([0,T],X)} := \max_{t \in [0,T]} \|u(t)\|_X$.
Also weak derivatives are defined analogously to the special case $X = \mathbb{R}$ above.
Definition A.3. Let $u \in L^1(0, T; X)$. Then $v \in L^1(0, T; X)$ is the weak derivative of $u$, $v = u'$, if and only if
$\int_0^T \phi'(t)\, u(t)\, dt = -\int_0^T \phi(t)\, v(t)\, dt \quad \forall \phi \in C_c^\infty(0, T)$.
Note that here the test functions $\phi$ are real valued, whereas the values of the integrals to the left and right of the equality sign are elements of $X$. Therewith we can define Sobolev spaces with values in $X$.
Definition A.4.
$W^{1,p}(0, T; X) = \{ u \in L^p(0, T; X) : u' \text{ (in the weak sense)} \in L^p(0, T; X) \}$,
$\|u\|_{W^{1,p}(0,T;X)} := \begin{cases} \left( \int_0^T \left( \|u(t)\|_X^p + \|u'(t)\|_X^p \right) dt \right)^{1/p} & 1 \le p < \infty, \\ \sup_{t \in [0,T]} \left( \|u(t)\|_X + \|u'(t)\|_X \right) & p = \infty. \end{cases}$
Analogously to Theorem A.2, we have the following continuous embedding.
Theorem A.3. $W^{1,p}(0, T; X) \subseteq C([0, T], X)$, and with $C$ only depending on $T$,
$\forall u \in W^{1,p}(0, T; X) : \ \|u\|_{C([0,T],X)} \le C\, \|u\|_{W^{1,p}(0,T;X)}$,
$u(t) = u(s) + \int_s^t u'(\tau)\, d\tau \quad \forall\, 0 \le s \le t \le T$.
This can be refined to a result allowing for weaker spaces.
Theorem A.4. Let $u \in L^2(0, T; H_0^1(\Omega))$ with $u' \in L^2(0, T; H^{-1}(\Omega))$. Then $u \in C([0, T], L^2(\Omega))$ and (with $C$ depending only on $T$)
$\|u\|_{C([0,T],L^2(\Omega))} \le C \left( \|u\|_{L^2(0,T;H_0^1(\Omega))} + \|u'\|_{L^2(0,T;H^{-1}(\Omega))} \right)$.
The map $t \mapsto \|u(t)\|_{L^2(\Omega)}^2$ is absolutely continuous with
$\frac{d}{dt} \|u(t)\|_{L^2(\Omega)}^2 = 2\,(u'(t), u(t)) \in L^1(0, T)$,
where $(\cdot, \cdot)$ is the dual pairing between elements in $H^{-1}(\Omega) = (H_0^1(\Omega))^*$ and $H_0^1(\Omega)$.
A.3. Some useful inequalities
The well-known Hölder inequality is an indispensable tool for the study of $L^p$ spaces. It states that if $p, q \in [1, \infty]$ with $1/p + 1/q = 1$ (in which case we say that $p, q$ are conjugate Hölder exponents), then for all measurable functions $f$ and $g$ on a set $\Omega$,
$\|f g\|_{L^1} \le \|f\|_{L^p}\, \|g\|_{L^q}$.
For $1 < p < \infty$, equality holds if and only if $|f|^p$ and $|g|^q$ are linearly dependent. There are several related results. One such is Young's inequality for products: if $p, q \in (1, \infty)$ are conjugate Hölder exponents and if $a$ and $b$ are nonnegative real numbers, then $ab \le \frac{a^p}{p} + \frac{b^q}{q}$, with equality if and only if $a^p = b^q$. This inequality can be proven by means of the concavity of the logarithm, and Hölder's inequality is commonly derived from it. The version of Young's inequality most useful for our purposes relates the $L^r$ norm of the convolution of two functions to norms of the functions themselves.
Theorem A.5. Suppose $f \in L^p(\mathbb{R}^d)$ and $g \in L^q(\mathbb{R}^d)$. If $\frac{1}{p} + \frac{1}{q} = \frac{1}{r} + 1$ where $1 \le p, q, r \le \infty$, then
(A.15) $\|f * g\|_{L^r} \le \|f\|_{L^p}\, \|g\|_{L^q}$.
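Theorem A.5 is stated on $\mathbb{R}^d$, but the same inequality (with constant one) holds for finitely supported sequences on $\mathbb{Z}$, which makes it easy to experiment with. The sketch below checks this discrete analogue for one pair of sequences; it is an illustration under that stated substitution, not a verification of the continuous theorem, and the names `lp_norm` and `convolve` as well as the sample data are made up for the example.

```python
def lp_norm(seq, p):
    """l^p norm of a finite sequence; p = float('inf') gives the maximum norm."""
    if p == float("inf"):
        return max(abs(v) for v in seq)
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

def convolve(f, g):
    """Full discrete convolution (f * g)[n] = sum_i f[i] * g[n - i]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

# Illustrative data and exponents with 1/p + 1/q = 1/r + 1.
f = [1.0, -2.0, 0.5, 3.0]
g = [0.5, 1.5, -1.0]
p, q = 1.5, 1.2
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)   # here r is (up to rounding) equal to 2

lhs = lp_norm(convolve(f, g), r)
rhs = lp_norm(f, p) * lp_norm(g, q)
print(r, lhs, rhs)
```

Choosing $q = 1$ forces $r = p$, which is the case most often used for estimating convolutions with integrable kernels.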
A related inequality we use in Chapter 4 is the Hardy–Littlewood–Sobolev inequality. This was a result first proved by Sobolev, based on the Hardy–Littlewood inequality [138], and used in his proof of the Sobolev embedding theorem. Roughly speaking, it bounds an $L^s$ norm of the convolution of a function with the fractional potential $|x|^{-\alpha}$ by the $L^p$ norm of the function. More precisely, we have [219]
Theorem A.6. Let $f \in L^p(\mathbb{R}^d)$, $g \in L^q(\mathbb{R}^d)$, and $0 < \alpha < d$, where $1 < p, q < \infty$ and $1/p + 1/q + \alpha/d = 2$. Then for some constant $N = N_{p,d,\alpha}$,
(A.16) $\left| \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |x - y|^{-\alpha} f(x)\, g(y)\, dx\, dy \right| \le N\, \|f\|_{L^p}\, \|g\|_{L^q}$.
The Poincaré inequality bounds the $L^p$ norm of a function on an open bounded set $\Omega \subset \mathbb{R}^d$ by the $L^p$ norm of its gradient. More precisely, let $1 \le p < \infty$. Then there exists a constant $C$ such that for every function $f \in W_0^{1,p}(\Omega)$ we have
(A.17) $\|f\|_{L^p(\Omega)} \le C\, \|\nabla f\|_{L^p(\Omega)}$.
There is a generalisation of this inequality due to Friedrichs which bounds a function in terms of its highest derivatives:
(A.18) $\|f\|_{L^p(\Omega)} \le \rho^k \left( \sum_{|\mu| = k} \|D^\mu f\|_{L^p(\Omega)}^p \right)^{1/p}$ for $f \in W_0^{k,p}(\Omega)$,
where $\rho$ is the diameter of $\Omega$ and $D^\mu$ is the mixed partial derivative
$D^\mu = \frac{\partial^{|\mu|}}{\partial x_1^{\mu_1} \cdots \partial x_d^{\mu_d}}$ with $|\mu| = \mu_1 + \cdots + \mu_d$.
Gronwall’s inequalities are a basic tool for differential equations. Here is the classical integral formulation that does not require interior differentiability of the function $u(t)$.
Theorem A.7. Let $\beta$ and $u$ be real-valued continuous functions defined on the interval $[0, T]$ while $\alpha$ is integrable on every subinterval of $(0, T)$. If $\beta(t) \ge 0$ and if
$u(t) \le \alpha(t) + \int_0^t \beta(s)\, u(s)\, ds$ for all $t \in [0, T]$,
then
(A.19) $u(t) \le \alpha(t) + \int_0^t \alpha(s)\, \beta(s)\, e^{\int_s^t \beta(r)\, dr}\, ds$ for all $t \in [0, T]$.
If, in addition, the function $\alpha$ is nondecreasing, then
(A.20) $u(t) \le \alpha(t)\, e^{\int_0^t \beta(s)\, ds}$ for all $t \in [0, T]$.
We need the corresponding results for fractional order equations.
Theorem A.8. Let $\beta > 0$. Suppose that $a(t)$ is a nonnegative, locally integrable function on $(0, T)$ for some $T > 0$, while $g(t)$ is a nonnegative, nondecreasing continuous function on $(0, T)$. If $u(t)$ is nonnegative and locally integrable on $(0, T)$ such that
(A.21) $u(t) \le a(t) + g(t) \int_0^t (t - \tau)^{\beta-1} u(\tau)\, d\tau$, $0 < t < T$,
then
(A.22) $u(t) \le a(t) + \int_0^t \left[ \sum_{n=1}^\infty \frac{(g(t)\Gamma(\beta))^n}{\Gamma(n\beta)} (t - \tau)^{n\beta-1} a(\tau) \right] d\tau$.
In addition, if $a(t)$ is nondecreasing on $[0, T)$, then
(A.23) $u(t) \le a(t)\, E_{\beta,1}(\Gamma(\beta)\, t^\beta g(t))$, $0 < t < T$.
Proof. For locally integrable functions $\phi(t)$ on $[0, T)$ define the operator $B$ by $B\phi(t) = g(t) \int_0^t (t - \tau)^{\beta-1} \phi(\tau)\, d\tau$. Then for each integer $n$, by successively applying (A.21), we obtain
(A.24) $u(t) \le \sum_{k=0}^{n} B^k a(t) + B^n u(t)$.
We claim that
(A.25) $B^n u(t) \le [g(t)]^n \int_0^t \frac{\Gamma(\beta)^n}{\Gamma(\beta n)} (t - \tau)^{n\beta-1} u(\tau)\, d\tau$.
The proof of this is by induction, and the inequality (A.25) is clearly true for $n = 1$. Assume now it is true for $n = j$. Then we have
(A.26) $B^{j+1} u(t) = B(B^j u(t))$
$\le g(t) \int_0^t (t - \tau)^{\beta-1} \int_0^\tau \frac{(g(\tau)\Gamma(\beta))^j}{\Gamma(j\beta)} (\tau - s)^{j\beta-1} u(s)\, ds\, d\tau$
$\le (g(t))^{j+1} \int_0^t (t - \tau)^{\beta-1} \int_0^\tau \frac{\Gamma(\beta)^j}{\Gamma(j\beta)} (\tau - s)^{j\beta-1} u(s)\, ds\, d\tau$
$= (g(t))^{j+1} \frac{\Gamma(\beta)^j}{\Gamma(j\beta)} \int_0^t \left[ \int_s^t (t - \tau)^{\beta-1} (\tau - s)^{j\beta-1}\, d\tau \right] u(s)\, ds$
$= (g(t))^{j+1} \frac{\Gamma(\beta)^{j+1}}{\Gamma((j+1)\beta)} \int_0^t (t - \tau)^{(j+1)\beta-1} u(\tau)\, d\tau$,
where the second inequality follows from the first by the fact that $g(t)$ is nondecreasing, and the subsequent equality follows by interchanging the order of integration. Writing the inner integral as a Beta function and then invoking the relation between the Beta and Gamma functions yields the identity
$\int_s^t (t - \tau)^{\beta-1} (\tau - s)^{j\beta-1}\, d\tau = \frac{\Gamma(j\beta)\Gamma(\beta)}{\Gamma((j+1)\beta)} (t - s)^{(j+1)\beta-1}$.
This now shows (A.25). With $M = \sup_{[0,T]} g(t)$, we obtain
(A.27) $B^n u(t) \le \int_0^t \frac{(M\Gamma(\beta))^n}{\Gamma(n\beta)} (t - \tau)^{n\beta-1} u(\tau)\, d\tau$,
and it is easily seen that $B^n u(t) \to 0$ as $n \to \infty$. From the above we now obtain (A.22). If $a(t)$ is nondecreasing on $[0, T)$, then we can write (A.22) as
(A.28) $u(t) \le a(t) \left[ 1 + \int_0^t \sum_{n=1}^\infty \frac{(g(t)\Gamma(\beta))^n}{\Gamma(n\beta)} (t - \tau)^{n\beta-1}\, d\tau \right]$
(A.29) $\le a(t) \sum_{n=0}^\infty \frac{(g(t)\Gamma(\beta)\, t^\beta)^n}{\Gamma(n\beta + 1)} = a(t)\, E_{\beta,1}(\Gamma(\beta)\, t^\beta g(t))$,
where we have used the fact that
$\int_0^t (t - \tau)^{n\beta-1}\, d\tau = \int_0^t s^{n\beta-1}\, ds = \frac{\Gamma(n\beta)}{\Gamma(n\beta + 1)}\, t^{n\beta}$.
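The Beta-type identity used in the induction step can be confirmed numerically. The following sketch compares a midpoint-rule value of $\int_s^t (t-\tau)^{\beta-1}(\tau-s)^{j\beta-1}\,d\tau$ with the closed form $\frac{\Gamma(j\beta)\Gamma(\beta)}{\Gamma((j+1)\beta)}(t-s)^{(j+1)\beta-1}$. The function names and the parameter values are illustrative assumptions; $\beta \ge 1$ is chosen so that the integrand is bounded and a plain quadrature suffices.

```python
import math

def beta_type_integral(beta, j, s, t, steps=100_000):
    """Midpoint-rule value of the integral over [s, t] of
    (t - tau)**(beta - 1) * (tau - s)**(j*beta - 1)."""
    h = (t - s) / steps
    total = 0.0
    for i in range(steps):
        tau = s + (i + 0.5) * h
        total += (t - tau) ** (beta - 1.0) * (tau - s) ** (j * beta - 1.0)
    return total * h

def closed_form(beta, j, s, t):
    """Right-hand side of the identity: a Beta function value times a power of (t - s)."""
    return (math.gamma(j * beta) * math.gamma(beta) / math.gamma((j + 1) * beta)
            * (t - s) ** ((j + 1) * beta - 1.0))

beta, j, s, t = 1.5, 2, 0.3, 1.1   # illustrative values with beta >= 1
numeric = beta_type_integral(beta, j, s, t)
exact = closed_form(beta, j, s, t)
print(numeric, exact)
```

For $0 < \beta < 1$ the integrand has integrable endpoint singularities, and the identity still holds, but the midpoint rule would then need far more points or a singularity-aware quadrature.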
An important particular case occurs when $g$ is constant; the expression on the right-hand side of the estimate can then be written in terms of a Mittag-Leffler function $E_{\beta,\beta}$; cf. (1.24).
Lemma A.1. Suppose $b \ge 0$, $\beta > 0$, and $a(t)$ is a nonnegative function locally integrable on $0 \le t < T$ (for some $T < +\infty$), and suppose $u(t)$ is nonnegative and locally integrable on $0 \le t < T$ with
$u(t) \le a(t) + \frac{b}{\Gamma(\beta)} \int_0^t (t - s)^{\beta-1} u(s)\, ds$, $0 \le t < T$.
Then
$u(t) \le a(t) + b \int_0^t (t - s)^{\beta-1} E_{\beta,\beta}(b (t - s)^\beta)\, a(s)\, ds$, $0 \le t < T$.
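To make the bounds (A.23) and Lemma A.1 concrete, one needs values of the two-parameter Mittag-Leffler function $E_{\alpha,\beta}(z) = \sum_{k \ge 0} z^k / \Gamma(\alpha k + \beta)$. The sketch below evaluates the truncated power series, which is adequate for moderate real $|z|$ only; the function names, the truncation length, and the helper `gronwall_bound_constant_a` (the special case of constant $a(t) = a_0$ and constant $g = b/\Gamma(\beta)$, where (A.23) reads $u(t) \le a_0 E_{\beta,1}(b t^\beta)$) are assumptions introduced for illustration.

```python
import math

def mittag_leffler(alpha, beta, z, terms=200):
    """Truncated series for E_{alpha,beta}(z) with real z and alpha, beta > 0.

    Terms are formed in logarithmic form so that Gamma(alpha*k + beta) does not
    overflow. Suitable for moderate |z|; large |z| needs asymptotic or
    contour-integral methods instead.
    """
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    log_abs_z = math.log(abs(z))
    total = 0.0
    for k in range(terms):
        magnitude = math.exp(k * log_abs_z - math.lgamma(alpha * k + beta))
        sign = -1.0 if (z < 0.0 and k % 2 == 1) else 1.0
        total += sign * magnitude
    return total

def gronwall_bound_constant_a(a0, b, beta, t):
    """Bound a0 * E_{beta,1}(b * t**beta) for constant a(t) = a0 and g = b / Gamma(beta)."""
    return a0 * mittag_leffler(beta, 1.0, b * t ** beta)

print(mittag_leffler(1.0, 1.0, 2.0), math.exp(2.0))            # E_{1,1}(z) = exp(z)
print(mittag_leffler(2.0, 1.0, 2.0), math.cosh(math.sqrt(2)))  # E_{2,1}(z) = cosh(sqrt(z))
print(gronwall_bound_constant_a(1.0, 0.8, 0.5, 1.0))
```

For $\beta = 1$ the bound reduces to $a_0 e^{bt}$, which is the classical estimate (A.20) with constant coefficients.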
Bibliography
[1] Niels Henrik Abel, Opløsning af et par opgaver ved hjælp integraler, Magazin for Naturvidenskaberne, 2 (1823), 55–68, 205–215.
[2] Niels Henrik Abel, Auflösung einer mechanischen Aufgabe (German), J. Reine Angew. Math. 1 (1826), 153–157, DOI 10.1515/crll.1826.1.153. MR1577605
[3] Niels Henrik Abel, Œuvres complètes. Tome I (French), Éditions Jacques Gabay, Sceaux, 1992. Edited and with a preface by L. Sylow and S. Lie; Reprint of the second (1881) edition. MR1191901
[4] Milton Abramowitz and Irene A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover, New York, 1965.
[5] Robert A. Adams and John J. F. Fournier, Sobolev spaces, 2nd ed., Pure and Applied Mathematics (Amsterdam), vol. 140, Elsevier/Academic Press, Amsterdam, 2003. MR2424078
[6] Ratan Prakash Agarwal, A propos d’une note de M. Pierre Humbert (French), C. R. Acad. Sci. Paris 236 (1953), 2031–2032. MR55502
[7] Mark Agranovsky and Peter Kuchment, Uniqueness of reconstruction and an inversion procedure for thermoacoustic and photoacoustic tomography with variable sound speed, Inverse Problems 23 (2007), no. 5, 2089–2102, DOI 10.1088/0266-5611/23/5/016. MR2353329
[8] Mohammed Al-Refai, On the fractional derivatives at extreme points, Electron. J. Qual. Theory Differ. Equ. (2012), No. 55, 5. MR2959045
[9] T. S. Aleroev, On the completeness of a system of eigenfunctions of a fractional-order differential operator (Russian, with Russian summary), Differ. Uravn. 36 (2000), no. 6, 829–830, 863, DOI 10.1007/BF02754416; English transl., Differ. Equ. 36 (2000), no. 6, 918–919. MR1819468
[10] A. A. Alikhanov, A priori estimates for solutions of boundary value problems for equations of fractional order (Russian, with Russian summary), Differ. Uravn. 46 (2010), no. 5, 658–664, DOI 10.1134/S0012266110050058; English transl., Differ. Equ. 46 (2010), no. 5, 660–666. MR2797545
[11] A. A. Alikhanov, A priori estimates for solutions of boundary value problems for equations of fractional order (Russian, with Russian summary), Differ. Uravn. 46 (2010), no. 5, 658–664, DOI 10.1134/S0012266110050058; English transl., Differ. Equ. 46 (2010), no. 5, 660–666. MR2797545
[12] Karen A. Ames, Gordon W. Clark, James F. Epperson, and Seth F. Oppenheimer, A comparison of regularizations for an ill-posed problem, Math. Comp. 67 (1998), no. 224, 1451–1471, DOI 10.1090/S0025-5718-98-01014-X. MR1609682
[13] V. V. Anh and N. N. Leonenko, Spectral analysis of fractional kinetic equations with random data, J. Statist. Phys. 104 (2001), no. 5-6, 1349–1387, DOI 10.1023/A:1010474332598. MR1859007
[14] Teodor Atanacković, Sanja Konjik, Ljubica Oparnica, and Dušan Zorica, The Cattaneo type space-time fractional heat conduction equation, Contin. Mech. Thermodyn. 24 (2012), no. 4-6, 293–311, DOI 10.1007/s00161-011-0199-4. MR2992836
[15] Teodor M. Atanacković, Stevan Pilipović, Bogoljub Stanković, and Dušan Zorica, Fractional calculus with applications in mechanics: Wave propagation, impact and variational principles, Mechanical Engineering and Solid Mechanics Series, ISTE, London; John Wiley & Sons, Inc., Hoboken, NJ, 2014. MR3242674
[16] Zhanbing Bai and Haishen Lü, Positive solutions for boundary value problem of nonlinear fractional differential equation, J. Math. Anal. Appl. 311 (2005), no. 2, 495–505, DOI 10.1016/j.jmaa.2005.02.052. MR2168413
[17] G. Baker and P. Graves-Morris, Padé approximation, Cambridge University Press, 1996.
[18] A. B. Bakushinskiĭ, On a convergence problem of the iterative-regularized Gauss-Newton method (Russian, with Russian summary), Zh. Vychisl. Mat. i Mat. Fiz. 32 (1992), no. 9, 1503–1509; English transl., Comput. Math. Math. Phys. 32 (1992), no. 9, 1353–1359 (1993). MR1185952
[19] A. B. Bakushinsky and M. Yu. Kokurin, Iterative methods for approximate solution of inverse problems, Mathematics and Its Applications (New York), vol. 577, Springer, Dordrecht, 2004. MR2133802
[20] A. V. Balakrishnan, Fractional powers of closed operators and the semigroups generated by them, Pacific J. Math. 10 (1960), 419–437. MR115096
[21] Alain Bamberger, Roland Glowinski, and Quang Huy Tran, A domain decomposition method for the acoustic wave equation with discontinuous coefficients and grid change, SIAM J. Numer. Anal. 34 (1997), no. 2, 603–639, DOI 10.1137/S0036142994261518. MR1442931
[22] P. Barenblatt, G. I. Zheltov, and I. N. Kochina, Basic concepts in the theory of seepage of homogeneous liquids in fissured rocks (strata), PMM24, Transl. of Priklad. Mat. Mekh., 24 (1960), 1286–1303.
[23] Bruce A. Barnes, Spectral properties of linear Volterra operators, J. Operator Theory 24 (1990), no. 2, 365–382. MR1150626
[24] J. H. Barrett, Differential equations of non-integer order, Canad. J. Math. 6 (1954), 529–541, DOI 10.4153/cjm-1954-058-2. MR64936
[25] Emilia Grigorova Bajlekova, Fractional evolution equations in Banach spaces, Eindhoven University of Technology, Eindhoven, 2001. Dissertation, Technische Universiteit Eindhoven, Eindhoven, 2001. MR1868564
[26] Serge Bernstein, Sur les fonctions absolument monotones (French), Acta Math. 52 (1929), no. 1, 1–66, DOI 10.1007/BF02547400. MR1555269
[27] M. V. Berry, Uniform asymptotic smoothing of Stokes’s discontinuities, Proc. Roy. Soc. London Ser. A 422 (1989), no. 1862, 7–21. MR990851
[28] H. Beyer and S. Kempfle, Definition of physically consistent damping laws with fractional derivatives (English, with English and German summaries), Z. Angew. Math. Mech. 75 (1995), no. 8, 623–635, DOI 10.1002/zamm.19950750820. MR1347683
[29] Garrett Birkhoff and Gian-Carlo Rota, On the completeness of Sturm-Liouville expansions, Amer. Math. Monthly 67 (1960), 835–841, DOI 10.2307/2309440. MR125274
[30] L. Bjørnø, Characterization of biological media by means of their non-linearity, Ultrasonics, 24 (1986), no. 5, 254–259.
[31] Ralph Philip Boas Jr., Entire functions, Academic Press, Inc., New York, 1954. MR0068627
[32] S. Bochner, Diffusion equation and stochastic processes, Proc. Nat. Acad. Sci. U.S.A. 35 (1949), 368–370, DOI 10.1073/pnas.35.7.368. MR30151
[33] Andrea Bonito, Juan Pablo Borthagaray, Ricardo H. Nochetto, Enrique Otárola, and Abner J. Salgado, Numerical methods for fractional diffusion, Comput. Vis. Sci. 19 (2018), no. 5-6, 19–46, DOI 10.1007/s00791-018-0289-y. MR3893441
[34] Andrea Bonito, Wenyu Lei, and Joseph E. Pasciak, Numerical approximation of space-time fractional parabolic equations, Comput. Methods Appl. Math. 17 (2017), no. 4, 679–705, DOI 10.1515/cmam-2017-0032. MR3709056
[35] Liliana Borcea, Electrical impedance tomography, Inverse Problems 18 (2002), no. 6, R99–R136, DOI 10.1088/0266-5611/18/6/201. MR1955896
[36] Göran Borg, Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe. Bestimmung der Differentialgleichung durch die Eigenwerte (German), Acta Math. 78 (1946), 1–96, DOI 10.1007/BF02421600. MR15185
[37] Peter Borwein and Tamás Erdélyi, The full Müntz theorem in $C[0, 1]$ and $L_1[0, 1]$, J. London Math. Soc. (2) 54 (1996), no. 1, 102–110, DOI 10.1112/jlms/54.1.102. MR1395070
[38] R. Brown, Brief account of microscopical observations, Phil. Mag. S.4, 161 (1828), 30–39.
[39] Hermann Brunner, Houde Han, and Dongsheng Yin, The maximum principle for time-fractional diffusion equations and its application, Numer. Funct. Anal. Optim. 36 (2015), no. 10, 1307–1321, DOI 10.1080/01630563.2015.1065887. MR3402825
[40] Claudia Bucur and Enrico Valdinoci, Nonlocal diffusion and applications, Lecture Notes of the Unione Matematica Italiana, vol. 20, Springer, [Cham]; Unione Matematica Italiana, Bologna, 2016, DOI 10.1007/978-3-319-28739-3. MR3469920
[41] Martin Burger and Stanley Osher, Convergence rates of convex variational regularization, Inverse Problems 20 (2004), no. 5, 1411–1421, DOI 10.1088/0266-5611/20/5/005. MR2109126
[42] V. Burov, I. Gurinovich, O. Rudenko, and E. Tagunov, Reconstruction of the spatial distribution of the nonlinearity parameter and sound velocity in acoustic nonlinear tomography, Acoustical Physics, 40 (1994), no. 11, 816–823.
[43] Luis Caffarelli and Luis Silvestre, An extension problem related to the fractional Laplacian, Comm. Partial Differential Equations 32 (2007), no. 7-9, 1245–1260, DOI 10.1080/03605300600987306. MR2354493
[44] Luis A. Caffarelli, Sandro Salsa, and Luis Silvestre, Regularity estimates for the solution and the free boundary of the obstacle problem for the fractional Laplacian, Invent. Math. 171 (2008), no. 2, 425–461, DOI 10.1007/s00222-007-0086-6. MR2367025
[45] Wei Cai, Wen Chen, Jun Fang, and Sverre Holm, A survey on fractional derivative modeling of power-law frequency-dependent viscous dissipative and scattering attenuation in acoustic wave propagation, Applied Mechanics Reviews, 70 (2018), no. 3, 6.
[46] Charles A. Cain, Ultrasonic reflection mode imaging of the nonlinear parameter B/A: I. A theoretical basis, The Journal of the Acoustical Society of America, 80 (1986), no. 1, 28–32.
[47] J. R. Cannon, Determination of an unknown heat source from overspecified boundary data, SIAM J. Numer. Anal. 5 (1968), 275–286, DOI 10.1137/0705024. MR231552
[48] J. R. Cannon and Paul DuChateau, Structural identification of an unknown source term in a heat equation, Inverse Problems 14 (1998), no. 3, 535–551, DOI 10.1088/0266-5611/14/3/010. MR1629991
[49] John Rozier Cannon, The one-dimensional heat equation, Encyclopedia of Mathematics and its Applications, vol. 23, Addison-Wesley Publishing Company, Advanced Book Program, Reading, MA, 1984. With a foreword by Felix E. Browder, DOI 10.1017/CBO9781139086967. MR747979
[50] B. Canuto and O. Kavian, Determining coefficients in a class of heat equations via boundary measurements, SIAM J. Math. Anal. 32 (2001), no. 5, 963–986, DOI 10.1137/S003614109936525X. MR1828313
[51] Bruno Canuto and Otared Kavian, Determining two coefficients in elliptic operators via boundary spectral data: a uniqueness result (English, with English and Italian summaries), Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. (8) 7 (2004), no. 1, 207–230. MR2044267 [52] Michele Caputo, Linear models of dissipation whose Q is almost frequency independent—II, Geophys. J. Int., 13 (1967), no. 5, 529–539. [53] Michele Caputo and Francesco Mainardi, Linear models of dissipation in anelastic solids, La Rivista del Nuovo Cimento (1971-1977), 1 (1971), no. 1, 161–198. [54] Michele Caputo and Francesco Mainardi, A new dissipation model based on memory mechanism, Pure Appl. Geophys., 91 (1971), no. 1, 134–147. [55] Alfred Carasso, Determining surface temperatures from interior observations, SIAM J. Appl. Math. 42 (1982), no. 3, 558–574, DOI 10.1137/0142040. MR659413 [56] Carlo Cattaneo, Sulla conduzione del calore (Italian), Atti Sem. Mat. Fis. Univ. Modena 3 (1949), 83–101. MR0032898 [57] Khosrow Chadan, David Colton, Lassi Päivärinta, and William Rundell, An introduction to inverse scattering and inverse spectral problems, SIAM Monographs on Mathematical Modeling and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1997. With a foreword by Margaret Cheney, DOI 10.1137/1.9780898719710. MR1445771 [58] Khosrow Chadan, David Colton, Lassi Päivärinta, and William Rundell, An introduction to inverse scattering and inverse spectral problems, SIAM Monographs on Mathematical Modeling and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1997. With a foreword by Margaret Cheney, DOI 10.1137/1.9780898719710. MR1445771 [59] J. M. Chambers, C. L. Mallows, and B. W. Stuck, A method for simulating stable random variables, J. Amer. Statist. Assoc. 71 (1976), no. 354, 340–344. MR415982 [60] Peter J. Chen and Morton E.
Gurtin, On a theory of heat conduction involving two temperatures, Zeitschrift für angewandte Mathematik und Physik ZAMP, 19 (1968), no. 4, 614–627. [61] W. Chen and S. Holm, Fractional Laplacian time-space models for linear and nonlinear lossy media exhibiting arbitrary frequency power-law dependency, The Journal of the Acoustical Society of America, 115 (2004), no. 4, 1424–1430. [62] YangQuan Chen and Caibin Zeng, Global Padé approximations of the generalized Mittag-Leffler function and its inverse, Fract. Calc. Appl. Anal. 18 (2015), no. 6, 1492–1506, DOI 10.1515/fca-2015-0086. MR3433025 [63] Jin Cheng, Junichi Nakagawa, Masahiro Yamamoto, and Tomohiro Yamazaki, Uniqueness in an inverse problem for a one-dimensional fractional diffusion equation, Inverse Problems 25 (2009), no. 11, 115002, 16, DOI 10.1088/0266-5611/25/11/115002. MR2545997 [64] Jin Cheng, Junichi Nakagawa, Masahiro Yamamoto, and Tomohiro Yamazaki, Uniqueness in an inverse problem for a one-dimensional fractional diffusion equation, Inverse Problems 25 (2009), no. 11, 115002, 16, DOI 10.1088/0266-5611/25/11/115002. MR2545997 [65] Gordon W. Clark and Seth F. Oppenheimer, Quasireversibility methods for non-well-posed problems, Electron. J. Differential Equations (1994), No. 08, approx. 9. MR1302574 [66] Christian Clason and Michael V. Klibanov, The quasi-reversibility method for thermoacoustic tomography in a heterogeneous medium, SIAM J. Sci. Comput. 30 (2007/08), no. 1, 1–23, DOI 10.1137/06066970X. MR2377428 [67] Bernard D. Coleman, Richard J. Duffin, and Victor J. Mizel, Instability, uniqueness and nonexistence theorems for the equation u_t = u_{xx} − u_{xtx} on a strip, Arch. Rational Mech. Anal. 19 (1965), 100–116, DOI 10.1007/BF00282277. MR177215 [68] Albert Compte and Ralf Metzler, The generalized Cattaneo equation for the description of anomalous transport processes, J. Phys. A 30 (1997), no. 21, 7277–7289, DOI 10.1088/0305-4470/30/21/006. MR1603438
[69] John B. Conway, Functions of one complex variable, 2nd ed., Graduate Texts in Mathematics, vol. 11, Springer-Verlag, New York-Berlin, 1978. MR503901 [70] R. Courant and D. Hilbert, Methods of mathematical physics. Vol. I, Interscience Publishers, Inc., New York, N.Y., 1953. MR0065391 [71] David G. Crighton, Model equations of nonlinear acoustics, Annual Review of Fluid Mechanics, 11 (1979), no. 1, 11–33. [72] Philip J. Davis, Leonhard Euler’s integral: A historical profile of the gamma function, Amer. Math. Monthly 66 (1959), 849–869, DOI 10.2307/2309786. MR106810 [73] Klaus Deimling, Nonlinear functional analysis, Springer-Verlag, Berlin, 1985, DOI 10.1007/978-3-662-00547-7. MR787404 [74] Marta D’Elia, Qiang Du, Christian Glusa, Max Gunzburger, Xiaochuan Tian, and Zhi Zhou, Numerical methods for nonlocal and fractional models, Acta Numer. 29 (2020), 1–124, DOI 10.1017/s096249292000001x. MR4189291 [75] Peter Deuflhard, Heinz W. Engl, and Otmar Scherzer, A convergence analysis of iterative methods for the solution of nonlinear ill-posed problems under affinely invariant conditions, Inverse Problems 14 (1998), no. 5, 1081–1106, DOI 10.1088/0266-5611/14/5/002. MR1654603 [76] P. Deuflhard and G. Heindl, Affine invariant convergence theorems for Newton’s method and extensions to related methods, SIAM J. Numer. Anal. 16 (1979), no. 1, 1–10, DOI 10.1137/0716001. MR518680 [77] Peter Deuflhard and Florian A. Potra, Asymptotic mesh independence of Newton-Galerkin methods via a refined Mysovskiĭ theorem, SIAM J. Numer. Anal. 29 (1992), no. 5, 1395–1412, DOI 10.1137/0729080. MR1182736 [78] Kai Diethelm, The analysis of fractional differential equations: An application-oriented exposition using differential operators of Caputo type, Lecture Notes in Mathematics, vol. 2004, Springer-Verlag, Berlin, 2010, DOI 10.1007/978-3-642-14574-2. MR2680847 [79] Mkhitar M. Djrbashian,
Integral transformations and representation of functions in a complex domain, [in Russian], Nauka, Moscow, 1966. MR0209472 [80] Mkhitar M. Djrbashian, Fractional derivatives and the Cauchy problem for differential equations of fractional order, Izv. Akad. Nauk Armajan. SSR, 75 (1970), no. 2, 71–96. [81] Mkhitar M. Djrbashian, Harmonic analysis and boundary value problems in the complex domain, Operator Theory: Advances and Applications, vol. 65, Birkhäuser Verlag, Basel, 1993. Translated from the manuscript by H. M. Jerbashian and A. M. Jerbashian [A. M. Dzhrbashyan], DOI 10.1007/978-3-0348-8549-2. MR1249271 [82] M. M. Djrbashian and A. B. Nersesjan, Fractional derivatives and the Cauchy problem for differential equations of fractional order (Russian, with Armenian and English summaries), Izv. Akad. Nauk Armjan. SSR Ser. Mat. 3 (1968), no. 1, 3–29. MR0224984 [83] Willy Dörfler, Hannes Gerner, and Roland Schnaubelt, Local well-posedness of a quasilinear wave equation, Appl. Anal. 95 (2016), no. 9, 2110–2123, DOI 10.1080/00036811.2015.1089236. MR3515098 [84] Paul DuChateau and William Rundell, Unicity in an inverse problem for an unknown reaction term in a reaction-diffusion equation, J. Differential Equations 59 (1985), no. 2, 155–164, DOI 10.1016/0022-0396(85)90152-4. MR804886 [85] Nelson Dunford and Jacob T. Schwartz, Linear operators. Part I: General theory, Wiley Classics Library, John Wiley & Sons, Inc., New York, 1988. With the assistance of William G. Bade and Robert G. Bartle; Reprint of the 1958 original; A Wiley-Interscience Publication. MR1009162 [86] M. M. Djrbashian, On the integral representation of functions continuous on several rays (generalization of the Fourier integral) (Russian), Izv. Akad. Nauk SSSR. Ser. Mat. 18 (1954), 427–448. MR0065684
[87] M. M. Djrbashian, Integralnye preobrazovaniya i predstavleniya funktsiĭ v kompleksnoĭ oblasti (Russian), Izdat. “Nauka”, Moscow, 1966. MR0209472 [88] Alan Edelman and H. Murakami, Polynomial roots from companion matrix eigenvalues, Math. Comp. 64 (1995), no. 210, 763–776, DOI 10.2307/2153450. MR1262279 [89] Paul P. B. Eggermont, On Galerkin methods for Abel-type integral equations, SIAM J. Numer. Anal. 25 (1988), no. 5, 1093–1117, DOI 10.1137/0725063. MR960868 [90] Samuil D. Eidelman and Anatoly N. Kochubei, Cauchy problem for fractional diffusion equations, J. Differential Equations 199 (2004), no. 2, 211–255, DOI 10.1016/j.jde.2003.12.002. MR2047909 [91] A. Einstein, Ist die trägheit eines körpers von seinem energieinhalt abhängig?, Annalen der Physik, 323 (1905), no. 13, 639–641. [92] A. Einstein, Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen, Ann. Phys., 322 (1905), no. 8, 549–560. [93] A. Einstein, Über einen die erzeugung und verwandlung des lichtes betreffenden heuristischen gesichtspunkt, Annalen der Physik, 322 (1905), no. 17, 132–148. [94] A. Einstein, Zur elektrodynamik bewegter körper, Annalen der Physik, 322 (1905), no. 10, 891–921. [95] Peter Elbau, Otmar Scherzer, and Cong Shi, Singular values of the attenuated photoacoustic imaging operator, J. Differential Equations 263 (2017), no. 9, 5330–5376, DOI 10.1016/j.jde.2017.06.018. MR3688416 [96] Heinz W. Engl, Karl Kunisch, and Andreas Neubauer, Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems, Inverse Problems 5 (1989), no. 4, 523–540. MR1009037 [97] Heinz W. Engl, Martin Hanke, and Andreas Neubauer, Regularization of inverse problems, Mathematics and its Applications, vol. 375, Kluwer Academic Publishers Group, Dordrecht, 1996. MR1408680 [98] Lawrence C. Evans, Partial differential equations, 2nd ed., Graduate Studies in Mathematics, vol.
19, American Mathematical Society, Providence, RI, 2010, DOI 10.1090/gsm/019. MR2597943 [99] Mauro Fabrizio, Some remarks on the fractional Cattaneo–Maxwell equation for the heat propagation, Fract. Calc. Appl. Anal. 18 (2015), no. 4, 1074–1079, DOI 10.1515/fca-2015-0061. MR3377408 [100] Mauro Fabrizio, Claudio Giorgi, and Angelo Morro, Modeling of heat conduction via fractional derivatives, Heat and Mass Transfer, 53 (2017), no. 9, 2785–2797. [101] William Feller, On the classical Tauberian theorems, Arch. Math. (Basel) 14 (1963), 317–322, DOI 10.1007/BF01234960. MR155131 [102] William Feller, On Müntz’ theorem and completely monotone functions, Amer. Math. Monthly 75 (1968), 342–350, DOI 10.2307/2313410. MR230009 [103] William Feller, An introduction to probability theory and its applications. Vol. II, 2nd ed., John Wiley & Sons, Inc., New York-London-Sydney, 1971. MR0270403 [104] Francesca Ferrillo, Renato Spigler, and Moreno Concezzi, Comparing Cattaneo and fractional derivative models for heat transfer processes, SIAM J. Appl. Math. 78 (2018), no. 3, 1450–1469, DOI 10.1137/17M1135918. MR3805552 [105] A. Fick, Über diffusion, Phil. Mag. S.4, 10 (1855), 30–39. [106] Jens Flemming and Bernd Hofmann, Convergence rates in constrained Tikhonov regularization: equivalence of projected source conditions and variational inequalities, Inverse Problems 27 (2011), no. 8, 085001, 11, DOI 10.1088/0266-5611/27/8/085001. MR2819943
[107] Gerald B. Folland, Real analysis: Modern techniques and their applications, 2nd ed., Pure and Applied Mathematics, John Wiley & Sons, Inc., New York, 1999. A Wiley-Interscience Publication. MR1681462 [108] Joseph Fourier, Théorie analytique de la chaleur (French), Éditions Jacques Gabay, Paris, 1988. Reprint of the 1822 original. MR1414430 [109] Avner Friedman, Partial differential equations of parabolic type, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1964. MR0181836 [110] P. R. Garabedian, Partial differential equations, John Wiley & Sons, Inc., New York-London-Sydney, 1964. MR0162045 [111] I. M. Gelfand and B. M. Levitan, On the determination of a differential equation from its spectral function, Amer. Math. Soc. Transl. (2) 1 (1955), 253–304. MR0073805 [112] Tuhin Ghosh, Mikko Salo, and Gunther Uhlmann, The Calderón problem for the fractional Schrödinger equation, Anal. PDE 13 (2020), no. 2, 455–475, DOI 10.2140/apde.2020.13.455. MR4078233 [113] R. Gorenflo and F. Mainardi, Fractional calculus: integral and differential equations of fractional order, Fractals and fractional calculus in continuum mechanics (Udine, 1996), CISM Courses and Lect., vol. 378, Springer, Vienna, 1997, pp. 223–276. MR1611585 [114] Rudolf Gorenflo, Anatoly A. Kilbas, Francesco Mainardi, and Sergei V. Rogosin, Mittag-Leffler functions, related topics and applications, Springer Monographs in Mathematics, Springer, Heidelberg, 2014, DOI 10.1007/978-3-662-43930-2. MR3244285 [115] Rudolf Gorenflo, Joulia Loutchko, and Yuri Luchko, Computation of the Mittag-Leffler function E_{α,β}(z) and its derivative, Fract. Calc. Appl. Anal. 5 (2002), no. 4, 491–518. Dedicated to the 60th anniversary of Prof. Francesco Mainardi. MR1967847 [116] Rudolf Gorenflo, Yuri Luchko, and Francesco Mainardi, Analytical properties and applications of the Wright function, Fract. Calc. Appl. Anal. 2 (1999), no. 4, 383–414. TMSF, AUBG’99, Part A (Blagoevgrad).
MR1752379 [117] Rudolf Gorenflo, Yuri Luchko, and Francesco Mainardi, Wright functions as scale-invariant solutions of the diffusion-wave equation: Higher transcendental functions and their applications, J. Comput. Appl. Math. 118 (2000), no. 1-2, 175–191, DOI 10.1016/S0377-0427(00)00288-0. MR1765948 [118] T. Graham, On the law of the diffusion of gases, Phil. Mag. S.2, 351 (1833), 175–269. [119] Peter Grindrod, The theory and applications of reaction-diffusion equations: Patterns and waves, 2nd ed., Oxford Applied Mathematics and Computing Science Series, The Clarendon Press, Oxford University Press, New York, 1996. MR1423804 [120] G. Gripenberg, S.-O. Londen, and O. Staffans, Volterra integral and functional equations, Encyclopedia of Mathematics and its Applications, vol. 34, Cambridge University Press, Cambridge, 1990, DOI 10.1017/CBO9780511662805. MR1050319 [121] P. Grisvard, Elliptic problems in nonsmooth domains, Monographs and Studies in Mathematics, vol. 24, Pitman (Advanced Publishing Program), Boston, MA, 1985. MR775683 [122] C. W. Groetsch, The theory of Tikhonov regularization for Fredholm equations of the first kind, Research Notes in Mathematics, vol. 105, Pitman (Advanced Publishing Program), Boston, MA, 1984. MR742928 [123] Charles W. Groetsch, Inverse problems in the mathematical sciences, Vieweg Mathematics for Scientists and Engineers, Friedr. Vieweg & Sohn, Braunschweig, 1993, DOI 10.1007/978-3-322-99202-4. MR1247696 [124] Morton E. Gurtin and A. C. Pipkin, A general theory of heat conduction with finite wave speeds, Arch. Rational Mech. Anal. 31 (1968), no. 2, 113–126, DOI 10.1007/BF00281373. MR1553521 [125] Jacques Hadamard, Sur les problèmes aux dérivées partielles et leur signification physique, Princeton University Bulletin, 13(4), 1902.
[126] Jacques Hadamard, La théorie des équations aux dérivées partielles (French), Éditions Scientifiques, Peking; Gauthier-Villars Éditeur, Paris, 1964. MR0224957 [127] Jacques Hadamard, Lectures on Cauchy’s problem in linear partial differential equations, Dover Publications, New York, 1953. MR0051411 [128] P. R. Halmos, What does the spectral theorem say?, Amer. Math. Monthly 70 (1963), 241–247, DOI 10.2307/2313117. MR150600 [129] Mark F. Hamilton and David T. Blackstock, Nonlinear acoustics, vol. 1, Academic Press, San Diego, 1998. [130] Martin Hanke, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems 13 (1997), no. 1, 79–95, DOI 10.1088/0266-5611/13/1/007. MR1435869 [131] Martin Hanke, The regularizing Levenberg-Marquardt scheme is of optimal order, J. Integral Equations Appl. 22 (2010), no. 2, 259–283, DOI 10.1216/JIE-2010-22-2-259. MR2661721 [132] Martin Hanke, Andreas Neubauer, and Otmar Scherzer, A convergence analysis of the Landweber iteration for nonlinear ill-posed problems, Numer. Math. 72 (1995), no. 1, 21–37, DOI 10.1007/s002110050158. MR1359706 [133] Hermann Hankel, Die Euler’schen Integrale bei unbeschränkter Variabilität des Arguments, Z. Math. Phys., 9 (1864), 1–21. [134] Per Christian Hansen, Analysis of discrete ill-posed problems by means of the L-curve, SIAM Rev. 34 (1992), no. 4, 561–580, DOI 10.1137/1034115. MR1193012 [135] Per Christian Hansen, Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems, Numer. Algorithms 6 (1994), no. 1-2, 1–35, DOI 10.1007/BF02149761. MR1260397 [136] Per Christian Hansen and Dianne Prost O’Leary, The use of the L-curve in the regularization of discrete ill-posed problems, SIAM J. Sci. Comput. 14 (1993), no. 6, 1487–1503, DOI 10.1137/0914086. MR1241596 [137] G. H. Hardy and J. E.
Littlewood, Tauberian theorems concerning power series and Dirichlet’s series whose coefficients are positive, Proc. London Math. Soc. (2) 13 (1914), 174–191, DOI 10.1112/plms/s2-13.1.174. MR1577498 [138] G. H. Hardy and J. E. Littlewood, Some properties of fractional integrals. II, Math. Z. 34 (1932), no. 1, 403–439, DOI 10.1007/BF01180596. MR1545260 [139] Stanislav Harizanov, Raytcho Lazarov, and Svetozar Margenov, A survey on numerical methods for spectral space-fractional diffusion problems, Fract. Calc. Appl. Anal. 23 (2020), no. 6, 1605–1646, DOI 10.1515/fca-2020-0080. MR4197087 [140] Bastian Harrach and Yi-Hsuan Lin, Monotonicity-based inversion of the fractional Schrödinger equation I. Positive potentials, SIAM J. Math. Anal. 51 (2019), no. 4, 3092–3111, DOI 10.1137/18M1166298. MR3984302 [141] Bastian Harrach and Yi-Hsuan Lin, Monotonicity-based inversion of the fractional Schrödinger equation II. General potentials and stability, SIAM J. Math. Anal. 52 (2020), no. 1, 402–436, DOI 10.1137/19M1251576. MR4057616 [142] Bastian Harrach and Jin Keun Seo, Exact shape-reconstruction by one-step linearization in electrical impedance tomography, SIAM J. Math. Anal. 42 (2010), no. 4, 1505–1518, DOI 10.1137/090773970. MR2679585 [143] Y. Hatano and N. Hatano, Dispersive transport of ions in column experiments: An explanation of long-tailed profiles, Water Resour. Res., 34 (1998), no. 5, 1027–1033. [144] Yuko Hatano, Junichi Nakagawa, Shengzhang Wang, and Masahiro Yamamoto, Determination of order in fractional diffusion equation, J. Math. for Ind. 5A (2013), 51–57. MR3072335 [145] F. Hettlich and W. Rundell, A second degree method for nonlinear inverse problems, SIAM J. Numer. Anal. 37 (2000), no. 2, 587–620, DOI 10.1137/S0036142998341246. MR1740766
[146] Rudolf Hilfer, Threefold introduction to fractional derivatives, pp. 17–73, Wiley–VCH Verlag GmbH & Co. KGaA, 2008. [147] Einar Hille and Ralph S. Phillips, Functional analysis and semi-groups, American Mathematical Society Colloquium Publications, Vol. 31, American Mathematical Society, Providence, R.I., 1957. rev. ed. MR0089373 [148] B. Hofmann, B. Kaltenbacher, Ch. Pöschl, and O. Scherzer, A convergence rates result in Banach spaces with non-smooth operators, Inverse Problems, 23 (2007), 987–1010. [149] Thorsten Hohage, Regularization of exponentially ill-posed problems, Numer. Funct. Anal. Optim. 21 (2000), no. 3-4, 439–464, DOI 10.1080/01630560008816965. MR1769885 [150] Thorsten Hohage, Logarithmic convergence rates of the iteratively regularized Gauss-Newton method for an inverse potential and an inverse scattering problem, Inverse Problems 13 (1997), no. 5, 1279–1299, DOI 10.1088/0266-5611/13/5/012. MR1474369 [151] Sverre Holm and Sven Peter Näsholm, A causal and fractional all-frequency wave equation for lossy media, The Journal of the Acoustical Society of America, 130 (2011), no. 4, 2195–2202. [152] P. Humbert and R. P. Agarwal, Sur la fonction de Mittag-Leffler et quelques-unes de ses généralisations (French), Bull. Sci. Math. (2) 77 (1953), 180–185. MR60643 [153] Pierre Humbert, Quelques résultats relatifs à la fonction de Mittag-Leffler (French), C. R. Acad. Sci. Paris 236 (1953), 1467–1468. MR54107 [154] Christiaan Huygens, The pendulum clock or geometrical demonstrations concerning the motion of pendula as applied to clocks, Iowa State University Press Series in the History of Technology and Science, Iowa State University Press, Ames, IA, 1986. Translated from the Latin and with a preface and notes by Richard J. Blackwell; With an introduction by H. J. M. Bos. MR865493 [155] Nobuyuki Ichida, Takuso Sato, and Melvin Linzer, Imaging the nonlinear ultrasonic parameter of a medium, Ultrasonic Imaging, 5 (1983), no. 4, 295–299. [156] M.
Ikehata, Size estimation of inclusion, J. Inverse Ill-Posed Probl. 6 (1998), no. 2, 127–140, DOI 10.1515/jiip.1998.6.2.127. MR1637360 [157] Oleg Yu. Imanuvilov, Gunther Uhlmann, and Masahiro Yamamoto, The Calderón problem with partial data in two dimensions, J. Amer. Math. Soc. 23 (2010), no. 3, 655–691, DOI 10.1090/S0894-0347-10-00656-9. MR2629983 [158] Oleg Yu. Imanuvilov and Masahiro Yamamoto, Lipschitz stability in inverse parabolic problems by the Carleman estimate, Inverse Problems 14 (1998), no. 5, 1229–1245, DOI 10.1088/0266-5611/14/5/009. MR1654631 [159] Victor Isakov, Inverse parabolic problems with the final overdetermination, Comm. Pure Appl. Math. 44 (1991), no. 2, 185–209, DOI 10.1002/cpa.3160440203. MR1085828 [160] Victor Isakov, Inverse problems for partial differential equations, 2nd ed., Applied Mathematical Sciences, vol. 127, Springer, New York, 2006. MR2193218 [161] Bangti Jin, Fractional differential equations—an approach via fractional derivatives, Applied Mathematical Sciences, vol. 206, Springer, Cham, 2021, DOI 10.1007/978-3-030-76043-4. MR4290515 [162] Bangti Jin and Yavar Kian, Recovery of the order of derivation for fractional diffusion equations in an unknown medium, SIAM J. Appl. Math. 82 (2022), no. 3, 1045–1067, DOI 10.1137/21M1398264. MR4443674 [163] Bangti Jin, Raytcho Lazarov, Yikan Liu, and Zhi Zhou, The Galerkin finite element method for a multi-term time-fractional diffusion equation, J. Comput. Phys. 281 (2015), 825–843, DOI 10.1016/j.jcp.2014.10.051. MR3281997 [164] Bangti Jin, Raytcho Lazarov, Xiliang Lu, and Zhi Zhou, A simple finite element method for boundary value problems with a Riemann-Liouville derivative, J. Comput. Appl. Math. 293 (2016), 94–111, DOI 10.1016/j.cam.2015.02.058. MR3394205
[165] Bangti Jin, Raytcho Lazarov, Joseph Pasciak, and William Rundell, Variational formulation of problems involving fractional order differential operators, Math. Comp. 84 (2015), no. 296, 2665–2700, DOI 10.1090/mcom/2960. MR3378843 [166] Bangti Jin, Buyang Li, and Zhi Zhou, Numerical analysis of nonlinear subdiffusion equations, SIAM J. Numer. Anal. 56 (2018), no. 1, 1–23, DOI 10.1137/16M1089320. MR3742688 [167] Bangti Jin, Buyang Li, and Zhi Zhou, Pointwise-in-time error estimates for an optimal control problem with subdiffusion constraint, IMA J. Numer. Anal. 40 (2020), no. 1, 377–404, DOI 10.1093/imanum/dry064. MR4050544 [168] Bangti Jin and William Rundell, An inverse problem for a one-dimensional time-fractional diffusion problem, Inverse Problems 28 (2012), no. 7, 075010, 19, DOI 10.1088/0266-5611/28/7/075010. MR2946798 [169] Bangti Jin and William Rundell, An inverse Sturm-Liouville problem with a fractional derivative, J. Comput. Phys. 231 (2012), no. 14, 4954–4966, DOI 10.1016/j.jcp.2012.04.005. MR2927980 [170] Bangti Jin and William Rundell, A tutorial on inverse problems for anomalous diffusion processes, Inverse Problems 31 (2015), no. 3, 035003, 40, DOI 10.1088/0266-5611/31/3/035003. MR3311557 [171] B. Frank Jones Jr., Various methods for finding unknown coefficients in parabolic differential equations, Comm. Pure Appl. Math. 16 (1963), 33–44, DOI 10.1002/cpa.3160160106. MR152760 [172] Mark Kac, Can one hear the shape of a drum?, Amer. Math. Monthly 73 (1966), no. 4, 1–23, DOI 10.2307/2313748. MR201237 [173] Barbara Kaltenbacher and Irena Lasiecka, Global existence and exponential decay rates for the Westervelt equation, Discrete Contin. Dyn. Syst. Ser. S 2 (2009), no. 3, 503–523, DOI 10.3934/dcdss.2009.2.503. MR2525765 [174] Barbara Kaltenbacher, Andreas Neubauer, and Otmar Scherzer, Iterative regularization methods for nonlinear ill-posed problems, Radon Series on Computational and Applied Mathematics, vol. 6, Walter de Gruyter GmbH & Co.
KG, Berlin, 2008, DOI 10.1515/9783110208276. MR2459012 [175] Barbara Kaltenbacher, On Broyden’s method for the regularization of nonlinear ill-posed problems, Numer. Funct. Anal. Optim. 19 (1998), no. 7-8, 807–833, DOI 10.1080/01630569808816860. MR1642573 [176] B. Kaltenbacher, A convergence rates result for an iteratively regularized Gauss-Newton-Halley method in Banach space, Inverse Problems 31 (2015), no. 1, 015007, 20, DOI 10.1088/0266-5611/31/1/015007. MR3302368 [177] Barbara Kaltenbacher, An iteratively regularized Gauss-Newton-Halley method for solving nonlinear ill-posed problems, Numer. Math. 131 (2015), no. 1, 33–57, DOI 10.1007/s00211-014-0682-5. MR3383327 [178] Barbara Kaltenbacher, Andrej Klassen, and Mario Luiz Previatti de Souza, The Ivanov regularized Gauss-Newton method in Banach space with an a posteriori choice of the regularization radius, J. Inverse Ill-Posed Probl. 27 (2019), no. 4, 539–557, DOI 10.1515/jiip-2018-0093. MR3987896 [179] Barbara Kaltenbacher, Irena Lasiecka, and Richard Marchand, Wellposedness and exponential decay rates for the Moore-Gibson-Thompson equation arising in high intensity ultrasound, Control Cybernet. 40 (2011), no. 4, 971–988. MR2977496 [180] Barbara Kaltenbacher and Vanja Nikolić, Time-fractional Moore-Gibson-Thompson equations, Math. Models Methods Appl. Sci. 32 (2022), no. 5, 965–1013, DOI 10.1142/S0218202522500221. MR4430362 [181] Barbara Kaltenbacher and Mario Luiz Previatti de Souza, Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space, Numer. Math. 140 (2018), no. 2, 449–478, DOI 10.1007/s00211-018-0971-5. MR3851063
[182] Barbara Kaltenbacher and William Rundell, On an inverse potential problem for a fractional reaction-diffusion equation, Inverse Problems 35 (2019), no. 6, 065004, 31, DOI 10.1088/1361-6420/ab109e. MR3975371 [183] Barbara Kaltenbacher and William Rundell, On the identification of a nonlinear term in a reaction-diffusion equation, Inverse Problems 35 (2019), no. 11, 115007, 38, DOI 10.1088/1361-6420/ab2aab. MR4019539 [184] Barbara Kaltenbacher and William Rundell, Recovery of multiple coefficients in a reaction-diffusion equation, J. Math. Anal. Appl. 481 (2020), no. 1, 123475, 23, DOI 10.1016/j.jmaa.2019.123475. MR4002155 [185] Barbara Kaltenbacher and William Rundell, Regularization of a backward parabolic equation by fractional operators, Inverse Probl. Imaging 13 (2019), no. 2, 401–430, DOI 10.3934/ipi.2019020. MR3925425 [186] Barbara Kaltenbacher and William Rundell, The inverse problem of reconstructing reaction-diffusion systems, Inverse Problems 36 (2020), no. 6, 065011, 34, DOI 10.1088/1361-6420/ab8483. MR4115070 [187] Barbara Kaltenbacher and William Rundell, Some inverse problems for wave equations with fractional derivative attenuation, Inverse Problems 37 (2021), no. 4, Paper No. 045002, 28, DOI 10.1088/1361-6420/abe136. MR4234446 [188] Barbara Kaltenbacher and William Rundell, Determining damping terms in fractional wave equations, Inverse Problems 38 (2022), no. 7, Paper No. 075004, 35. MR4434591 [189] Barbara Kaltenbacher and William Rundell, On an inverse problem of nonlinear imaging with fractional damping, Math. Comp. 91 (2021), no. 333, 245–276, DOI 10.1090/mcom/3683. MR4350539 [190] Manfred Kaltenbacher, Fundamental equations of acoustics, Computational acoustics, CISM Courses and Lect., vol. 579, Springer, Cham, 2018, pp. 1–33. MR3700720 [191] Barbara Blaschke, Andreas Neubauer, and Otmar Scherzer, On convergence rates for the iteratively regularized Gauss-Newton method, IMA J. Numer. Anal. 17 (1997), no.
3, 421–436, DOI 10.1093/imanum/17.3.421. MR1459331 [192] Hyeonbae Kang, Jin Keun Seo, and Dongwoo Sheen, The inverse conductivity problem with one measurement: stability and estimation of size, SIAM J. Math. Anal. 28 (1997), no. 6, 1389–1405, DOI 10.1137/S0036141096299375. MR1474220 [193] J. Karamata, Über die Hardy-Littlewoodschen Umkehrungen des Abelschen Stetigkeitssatzes (German), Math. Z. 32 (1930), no. 1, 319–320, DOI 10.1007/BF01194636. MR1545168 [194] Carlos E. Kenig, Johannes Sjöstrand, and Gunther Uhlmann, The Calderón problem with partial data, Ann. of Math. (2) 165 (2007), no. 2, 567–591, DOI 10.4007/annals.2007.165.567. MR2299741 [195] Anatoly A. Kilbas, Hari M. Srivastava, and Juan J. Trujillo, Theory and applications of fractional differential equations, North-Holland Mathematics Studies, vol. 204, Elsevier Science B.V., Amsterdam, 2006. MR2218073 [196] J. Klafter and I. M. Sokolov, First steps in random walks: From tools to applications, Oxford University Press, Oxford, 2011, DOI 10.1093/acprof:oso/9780199234868.001.0001. MR2894872 [197] M. Klimek and O. P. Agrawal, Fractional Sturm-Liouville problem, Comput. Math. Appl. 66 (2013), no. 5, 795–812, DOI 10.1016/j.camwa.2012.12.011. MR3089387 [198] Anatoly N. Kochubei, Distributed order calculus and equations of ultraslow diffusion, J. Math. Anal. Appl. 340 (2008), no. 1, 252–281, DOI 10.1016/j.jmaa.2007.08.024. MR2376152 [199] Robert Kohn and Michael Vogelius, Determining conductivity by boundary measurements, Comm. Pure Appl. Math. 37 (1984), no. 3, 289–298, DOI 10.1002/cpa.3160370302. MR739921
[200] M. M. Kokurin, The uniqueness of a solution to the inverse Cauchy problem for a fractional differential equation in a Banach space, Russian Math. (Iz. VUZ) 57 (2013), no. 12, 16–30, DOI 10.3103/S1066369X13120037. Translation of Izv. Vyssh. Uchebn. Zaved. Mat. 2013, no. 12, 19–35. MR3230393 [201] Richard Kowar and Otmar Scherzer, Attenuation models in photoacoustics, Mathematical modeling in biomedical imaging. II, Lecture Notes in Math., vol. 2035, Springer, Heidelberg, 2012, pp. 85–130, DOI 10.1007/978-3-642-22990-9_4. MR3024671 [202] M. A. Krasnoselskiĭ, Positive solutions of operator equations, P. Noordhoff Ltd., Groningen, 1964. Translated from the Russian by Richard E. Flaherty; edited by Leo F. Boron. MR0181881 [203] Adam Kubica, Katarzyna Ryszewska, and Masahiro Yamamoto, Time-fractional differential equations—a theoretical introduction, SpringerBriefs in Mathematics, Springer, Singapore, [2020] ©2020, DOI 10.1007/978-981-15-9066-5. MR4200127 [204] Peter Kuchment and Leonid Kunyansky, Mathematics of photoacoustic and thermoacoustic tomography, Handbook of mathematical methods in imaging. Vol. 1, 2, 3, Springer, New York, 2015, pp. 1117–1167. MR3560086 [205] Mateusz Kwaśnicki, Ten equivalent definitions of the fractional Laplace operator, Fract. Calc. Appl. Anal. 20 (2017), no. 1, 7–51, DOI 10.1515/fca-2017-0002. MR3613319 [206] S. F. Lacroix, Traite du Calcul Differentiel et du Calcul Integral, Mme. VeCourcier, Paris, Tome Troisieme, seconde edition, 1812. [207] R. Lattès and J.-L. Lions, The method of quasi-reversibility. Applications to partial differential equations, Modern Analytic and Computational Methods in Science and Mathematics, No. 18, American Elsevier Publishing Co., Inc., New York, 1969. Translated from the French edition and edited by Richard Bellman. MR0243746 [208] A.-M. Legendre, Exercices de Calcul Intégral, volume 1, Coucier, Paris, 1811. [209] G. W. Leibniz, Mathematische Schriften. Bd.
IV: Briefwechsel zwischen Leibniz, Wallis, Varignon, Guido Grandi, Zendrini, Hermann und Freiherrn von Tschirnhaus (German), Georg Olms Verlagsbuchhandlung, Hildesheim, 1962. Herausgegeben von C. I. Gerhardt. MR0141578 [210] Giovanni Leoni, A first course in Sobolev spaces, Graduate Studies in Mathematics, vol. 105, American Mathematical Society, Providence, RI, 2009, DOI 10.1090/gsm/105. MR2527916 [211] Norman Levinson, The inverse Sturm-Liouville problem, Mat. Tidsskr. B 1949 (1949), 25–30. MR32067 [212] Changpin Li, Zhengang Zhao, and YangQuan Chen, Numerical approximation of nonlinear fractional differential equations with subdiffusion and superdiffusion, Comput. Math. Appl. 62 (2011), no. 3, 855–875, DOI 10.1016/j.camwa.2011.02.045. MR2824676 [213] Gongsheng Li, Dali Zhang, Xianzheng Jia, and Masahiro Yamamoto, Simultaneous inversion for the space-dependent diffusion coefficient and the fractional order in the time-fractional diffusion equation, Inverse Problems 29 (2013), no. 6, 065014, 36, DOI 10.1088/0266-5611/29/6/065014. MR3066390 [214] Zhiyuan Li, Yikan Liu, and Masahiro Yamamoto, Initial-boundary value problems for multiterm time-fractional diffusion equations with positive constant coefficients, Appl. Math. Comput. 257 (2015), 381–397, DOI 10.1016/j.amc.2014.11.073. MR3320678 [215] Zhiyuan Li, Yuri Luchko, and Masahiro Yamamoto, Asymptotic estimates of solutions to initial-boundary-value problems for distributed order time-fractional diffusion equations, Fract. Calc. Appl. Anal. 17 (2014), no. 4, 1114–1136, DOI 10.2478/s13540-014-0217-x. MR3254683 [216] Zhiyuan Li and Masahiro Yamamoto, Initial-boundary value problems for linear diffusion equation with multiple time-fractional derivatives, arXiv:1306.2778, 2013.
Bibliography
[217] Zhiyuan Li and Masahiro Yamamoto, Uniqueness for inverse problems of determining orders of multi-term time-fractional derivatives of diffusion equation, Appl. Anal. 94 (2015), no. 3, 570–579, DOI 10.1080/00036811.2014.926335. MR3306120
[218] Zhiyuan Li and Masahiro Yamamoto, Uniqueness for inverse problems of determining orders of multi-term time-fractional derivatives of diffusion equation, Appl. Anal. 94 (2015), no. 3, 570–579, DOI 10.1080/00036811.2014.926335. MR3306120
[219] Elliott H. Lieb, Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities, Ann. of Math. (2) 118 (1983), no. 2, 349–374, DOI 10.2307/2007032. MR717827
[220] Joseph Liouville, Mémoire sur quelques quéstions de géometrie et de mécanique, et sur un nouveau genre de calcul pour résoudre ces quéstions, Journal de l’Ecole Polytechnique, XIII (1832), 1–69.
[221] Anna Lischke, Guofei Pang, Mamikon Gulian, Fangying Song, Christian Glusa, Xiaoning Zheng, Zhiping Mao, Wei Cai, Mark M. Meerschaert, Mark Ainsworth, and George Em Karniadakis, What is the fractional Laplacian? A comparative review with new results, J. Comput. Phys., 404 (2020), no. 62, 109009.
[222] Jun S. Liu, Monte Carlo strategies in scientific computing, Springer Series in Statistics, Springer, New York, 2008. MR2401592
[223] Yikan Liu, William Rundell, and Masahiro Yamamoto, Strong maximum principle for fractional diffusion equations and an application to an inverse source problem, Fract. Calc. Appl. Anal. 19 (2016), no. 4, 888–906, DOI 10.1515/fca-2016-0048. MR3543685
[224] Alexander Logunov and Eugenia Malinnikova, Nodal sets of Laplace eigenfunctions: estimates of the Hausdorff measure in dimensions two and three, 50 years with Hardy spaces, Oper. Theory Adv. Appl., vol. 261, Birkhäuser/Springer, Cham, 2018, pp. 333–344. MR3792104
[225] Alexander Logunov and Eugenia Malinnikova, Review of Yau’s conjecture on zero sets of Laplace eigenfunctions, Current developments in mathematics 2018, Int. Press, Somerville, MA, [2020] ©2020, pp. 179–212. MR4363378
[226] Bruce D. Lowe, Michael Pilant, and William Rundell, The recovery of potentials from finite spectral data, SIAM J. Math. Anal. 23 (1992), no. 2, 482–504, DOI 10.1137/0523023. MR1147873
[227] Bruce D. Lowe and William Rundell, The determination of a coefficient in a parabolic equation from input sources, IMA J. Appl. Math. 52 (1994), no. 1, 31–50, DOI 10.1093/imamat/52.1.31. MR1270801
[228] Yu. Luchko, Asymptotics of zeros of the Wright function, Z. Anal. Anwendungen 19 (2000), no. 2, 583–595, DOI 10.4171/ZAA/970. MR1769012
[229] Yurii Luchko and Rudolf Gorenflo, An operational method for solving fractional differential equations with the Caputo derivatives, Acta Math. Vietnam. 24 (1999), no. 2, 207–233. MR1710779
[230] Yury Luchko, Algorithms for evaluation of the Wright function for the real arguments’ values, Fract. Calc. Appl. Anal. 11 (2008), no. 1, 57–75. MR2379273
[231] Yury Luchko, Boundary value problems for the generalized time-fractional diffusion equation of distributed order, Fract. Calc. Appl. Anal. 12 (2009), no. 4, 409–422. MR2598188
[232] Yury Luchko, Maximum principle for the generalized time-fractional diffusion equation, J. Math. Anal. Appl. 351 (2009), no. 1, 218–223, DOI 10.1016/j.jmaa.2008.10.018. MR2472935
[233] Yury Luchko, Initial-boundary problems for the generalized multi-term time-fractional diffusion equation, J. Math. Anal. Appl. 374 (2011), no. 2, 538–548, DOI 10.1016/j.jmaa.2010.08.048. MR2729240
[234] F. Mainardi, The fundamental solutions for the fractional diffusion-wave equation, Appl. Math. Lett. 9 (1996), no. 6, 23–28, DOI 10.1016/0893-9659(96)00089-4. MR1419811
[235] Francesco Mainardi, Fractional calculus and waves in linear viscoelasticity: An introduction to mathematical models, Imperial College Press, London, 2010, DOI 10.1142/9781848163300. MR2676137
[236] Francesco Mainardi, Yuri Luchko, and Gianni Pagnini, The fundamental solution of the space-time fractional diffusion equation, Fract. Calc. Appl. Anal. 4 (2001), no. 2, 153–192. MR1829592
[237] Francesco Mainardi, Antonio Mura, and Gianni Pagnini, The M-Wright function in time-fractional diffusion processes: a tutorial survey, Int. J. Differ. Equ., posted on 2010, Art. ID 104505, 29, DOI 10.1155/2010/104505. MR2592742
[238] Francesco Mainardi and M. Tomirotti, On a special function arising in the time fractional diffusion-wave equation, in Transform Methods & Special Functions, Sofia’94 (P. Rusev, I. Dimovski, and V. Kiryakova, eds.), pp. 171–183. Science Culture Technology Publ., Singapore, 1995.
[239] Zhiping Mao, Sheng Chen, and Jie Shen, Efficient and accurate spectral method using generalized Jacobi functions for solving Riesz fractional differential equations, Appl. Numer. Math. 106 (2016), 165–181, DOI 10.1016/j.apnum.2016.04.002. MR3499964
[240] R. Marchand, T. McDevitt, and R. Triggiani, An abstract semigroup approach to the third-order Moore-Gibson-Thompson partial differential equation arising in high-intensity ultrasound: structural decomposition, spectral analysis, exponential stability, Math. Methods Appl. Sci. 35 (2012), no. 15, 1896–1929, DOI 10.1002/mma.1576. MR2982472
[241] V. A. Marčenko, Concerning the theory of a differential operator of the second order (Russian), Doklady Akad. Nauk SSSR. (N.S.) 72 (1950), 457–460. MR0036916
[242] William McLean, Regularity of solutions to a time-fractional diffusion equation, ANZIAM J. 52 (2010), no. 2, 123–138, DOI 10.1017/S1446181111000617. MR2832607
[243] Mark M. Meerschaert and Hans-Peter Scheffler, Limit distributions for sums of independent random vectors: Heavy tails in theory and practice, Wiley Series in Probability and Statistics: Probability and Statistics, John Wiley & Sons, Inc., New York, 2001. MR1840531
[244] Mark M. Meerschaert and Alla Sikorskii, Stochastic models for fractional calculus, De Gruyter Studies in Mathematics, vol. 43, Walter de Gruyter & Co., Berlin, 2012. MR2884383
[245] Ralf Metzler and Joseph Klafter, The random walk’s guide to anomalous diffusion: a fractional dynamics approach, Phys. Rep. 339 (2000), no. 1, 77, DOI 10.1016/S0370-1573(00)00070-3. MR1809268
[246] Ralf Metzler and Joseph Klafter, The restaurant at the end of the random walk: recent developments in the description of anomalous transport by fractional dynamics, J. Phys. A 37 (2004), no. 31, R161–R208, DOI 10.1088/0305-4470/37/31/R01. MR2090004
[247] R. E. Meyer, A simple explanation of the Stokes phenomenon, SIAM Rev. 31 (1989), no. 3, 435–445, DOI 10.1137/1031090. MR1012299
[248] J. G. Mikusiński, Sur les équations différentielles du calcul opératoire et leurs applications aux équations classiques aux dérivées partielles (French), Studia Math. 12 (1951), 227–270, DOI 10.4064/sm-12-1-227-270. MR46550
[249] Kenneth S. Miller and Stefan G. Samko, A note on the complete monotonicity of the generalized Mittag-Leffler function, Real Anal. Exchange 23 (1997/98), no. 2, 753–755. MR1639957
[250] Gösta Magnus Mittag-Leffler, Sur la nouvelle function $E_a$, C. R. Acad. Sci. Paris, 137 (1903), 554–558.
[251] G. Mittag-Leffler, Sur la représentation analytique d’une branche uniforme d’une fonction monogène (French), Acta Math. 29 (1905), no. 1, 101–181, DOI 10.1007/BF02403200. cinquième note. MR1555012
[252] Elliott W. Montroll and George H. Weiss, Random walks on lattices. II, J. Mathematical Phys. 6 (1965), 167–181, DOI 10.1063/1.1704269. MR172344
[253] T. B. Moodie and R. J. Tait, On thermal transients with finite wave speeds, Acta Mech. 50 (1983), no. 1-2, 97–104, DOI 10.1007/BF01170443. MR729370
[254] J. D. Murray, Mathematical biology. I: An introduction, 3rd ed., Interdisciplinary Applied Mathematics, vol. 17, Springer-Verlag, New York, 2002. MR1908418
[255] Mark Naber, Distributed order fractional sub-diffusion, Fractals 12 (2004), no. 1, 23–32, DOI 10.1142/S0218348X04002410. MR2052734
[256] Adrian Nachman, John Sylvester, and Gunther Uhlmann, An n-dimensional Borg-Levinson theorem, Comm. Math. Phys. 115 (1988), no. 4, 595–605. MR933457
[257] Adrian I. Nachman, Reconstructions from boundary measurements, Ann. of Math. (2) 128 (1988), no. 3, 531–576, DOI 10.2307/1971435. MR970610
[258] A. M. Nahušev, The Sturm-Liouville problem for a second order ordinary differential equation with fractional derivatives in the lower terms (Russian), Dokl. Akad. Nauk SSSR 234 (1977), no. 2, 308–311. MR0454145
[259] John P. Nolan, Numerical calculation of stable densities and distribution functions: Heavy tails and highly volatile phenomena, Comm. Statist. Stochastic Models 13 (1997), no. 4, 759–774, DOI 10.1080/15326349708807450. MR1482292
[260] Jace W. Nunziato, On heat conduction in materials with memory, Quart. Appl. Math. 29 (1971), 187–204, DOI 10.1090/qam/295683. MR295683
[261] Ljubica Oparnica and Endre Süli, Well-posedness of the fractional Zener wave equation for heterogeneous viscoelastic materials, Fract. Calc. Appl. Anal. 23 (2020), no. 1, 126–166, DOI 10.1515/fca-2020-0005. MR4069924
[262] John E. Pearson, Complex patterns in a simple system, Science, 261 (1993), no. 5118, 189–192.
[263] K. Pearson, The problem of the random walk, Nature, 72 (1905), no. 294.
[264] Marta Pellicer and Joan Solà-Morales, Optimal scalar products in the Moore-Gibson-Thompson equation, Evol. Equ. Control Theory 8 (2019), no. 1, 203–220, DOI 10.3934/eect.2019011. MR3923818
[265] Alan Pierce, Unique identification of eigenvalues and coefficients in a parabolic problem, SIAM J. Control Optim. 17 (1979), no. 4, 494–499, DOI 10.1137/0317035. MR534419
[266] Michael S. Pilant and William Rundell, Iteration schemes for unknown coefficient problems arising in parabolic equations, Numer. Methods Partial Differential Equations 3 (1987), no. 4, 313–325, DOI 10.1002/num.1690030404. MR1108124
[267] Michael Pilant and William Rundell, Fixed point methods for a nonlinear parabolic inverse coefficient problem, Comm. Partial Differential Equations 13 (1988), no. 4, 469–493, DOI 10.1080/03605308808820549. MR920911
[268] Michael Pilant and William Rundell, An iteration method for the determination of an unknown boundary condition in a parabolic initial-boundary value problem, Proc. Edinburgh Math. Soc. (2) 32 (1989), no. 1, 59–71, DOI 10.1017/S001309150000691X. MR981993
[269] Michael S. Pilant and William Rundell, An inverse problem for a nonlinear parabolic equation, Comm. Partial Differential Equations 11 (1986), no. 4, 445–457, DOI 10.1080/03605308608820430. MR829324
[270] Everett Pitcher and W. E. Sewell, A correction, Bull. Amer. Math. Soc. 44 (1938), no. 12, 888, DOI 10.1090/S0002-9904-1938-06898-X. MR1563897
[271] Everett Pitcher and W. E. Sewell, Existence theorems for solutions of differential equations of non-integral order, Bull. Amer. Math. Soc. 44 (1938), no. 2, 100–107, DOI 10.1090/S0002-9904-1938-06695-5. MR1563690
[272] Igor Podlubny, Fractional differential equations: An introduction to fractional derivatives, fractional differential equations, to methods of their solution and some of their applications, Mathematics in Science and Engineering, vol. 198, Academic Press, Inc., San Diego, CA, 1999. MR1658022
[273] Igor Podlubny, Richard L. Magin, and Iryna Trymorush, Niels Henrik Abel and the birth of fractional calculus, Fract. Calc. Appl. Anal. 20 (2017), no. 5, 1068–1075, DOI 10.1515/fca-2017-0057. MR3721889
[274] Harry Pollard, The representation of $e^{-x^{\lambda}}$ as a Laplace integral, Bull. Amer. Math. Soc. 52 (1946), 908–910, DOI 10.1090/S0002-9904-1946-08672-3. MR18286
[275] Harry Pollard, The completely monotonic character of the Mittag-Leffler function $E_a(-x)$, Bull. Amer. Math. Soc. 54 (1948), 1115–1116, DOI 10.1090/S0002-9904-1948-09132-7. MR27375
[276] Georg Pólya, Über die Nullstellen gewisser ganzer Funktionen (German), Math. Z. 2 (1918), no. 3-4, 352–383, DOI 10.1007/BF01199419. MR1544326
[277] A. Yu. Popov, On the number of real eigenvalues of a boundary value problem for a second-order equation with a fractional derivative (Russian, with English and Russian summaries), Fundam. Prikl. Mat. 12 (2006), no. 6, 137–155, DOI 10.1007/s10948-008-0169-7; English transl., J. Math. Sci. (N.Y.) 151 (2008), no. 1, 2726–2740. MR2314136
[278] A. Yu. Popov and A. M. Sedletskiĭ, Distribution of roots of Mittag-Leffler functions (Russian), Sovrem. Mat. Fundam. Napravl. 40 (2011), 3–171, DOI 10.1007/s10958-013-1255-3; English transl., J. Math. Sci. (N.Y.) 190 (2013), no. 2, 209–409. MR2883249
[279] Jürgen Pöschel and Eugene Trubowitz, Inverse spectral theory, Pure and Applied Mathematics, vol. 130, Academic Press, Inc., Boston, MA, 1987. MR894477
[280] C. Pöschl, Tikhonov regularization with general residual term, PhD thesis, University of Innsbruck, 2008.
[281] Y. Z. Povstenko, Thermoelasticity which uses fractional heat conduction equation (English, with English, Russian and Ukrainian summaries), Mat. Metodi Fiz.-Mekh. Polya 51 (2008), no. 2, 239–246, DOI 10.1007/s10958-009-9636-3; English transl., J. Math. Sci. (N.Y.) 162 (2009), no. 2, 296–305. MR2459441
[282] Yuriy Povstenko, Fractional Cattaneo-type equations and generalized thermoelasticity, Journal of Thermal Stresses, 34 (2011), no. 2, 97–114.
[283] Murray H. Protter and Hans F. Weinberger, Maximum principles in differential equations, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1967. MR0219861
[284] Robert A. Rankin, An Introduction to Mathematical Analysis. International Series of Monographs in Pure and Applied Mathematics. MacMillan, 1963.
[285] Elena Resmerita and Otmar Scherzer, Error estimates for non-quadratic regularization and the relation to enhancement, Inverse Problems 22 (2006), no. 3, 801–814, DOI 10.1088/0266-5611/22/3/004. MR2235638
[286] Andreas Rieder, On convergence rates of inexact Newton regularizations, Numer. Math. 88 (2001), no. 2, 347–365, DOI 10.1007/PL00005448. MR1826857
[287] Bernhard Riemann, Versuch einer Allgemeinen Auffassung der Integration und Differentiations, in Gesammelte Mathematische Werke und Wissenschaftlicher Nachlass (Richard Dedekind and Heinrich Weber, eds.), Teubner, Leipzig, 1875.
[288] M. Riesz, L’intégrale de Riemann-Liouville et le problème de Cauchy pour l’équation des ondes (French), Bull. Soc. Math. France 67 (1939), 153–170. MR1505102
[289] Marcel Riesz, Integrales de Riemann-Liouville et potentiels, Acta Sci. Math. (Szeged), 9 (1938), 1–42.
[290] Marcel Riesz, L’intégrale de Riemann-Liouville et le problème de Cauchy (French), Acta Math. 81 (1949), 1–223, DOI 10.1007/BF02395016. MR30102
[291] M. Rivero, J. J. Trujillo, and M. P. Velasco, A fractional approach to the Sturm-Liouville problem, Central European Journal of Physics, 11 (2013), 1246–1254.
[292] Christian P. Robert and George Casella, Monte Carlo statistical methods, 2nd ed., Springer Texts in Statistics, Springer-Verlag, New York, 2004, DOI 10.1007/978-1-4757-4145-2. MR2080278
[293] Walter Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Book Co., New York, 1987. MR924157
[294] Angkana Rüland and Mikko Salo, The fractional Calderón problem: low regularity and stability, Nonlinear Anal. 193 (2020), 111529, 56, DOI 10.1016/j.na.2019.05.010. MR4062981
[295] William Rundell, Determination of an unknown nonhomogeneous term in a linear partial differential equation from overspecified boundary data, Applicable Anal. 10 (1980), no. 3, 231–242, DOI 10.1080/00036818008839304. MR577334
[296] William Rundell, The determination of a parabolic equation from initial and final data, Proc. Amer. Math. Soc. 99 (1987), no. 4, 637–642, DOI 10.2307/2046467. MR877031
[297] William Rundell and Paul Sacks, Numerical technique for the inverse resonance problem, J. Comput. Appl. Math. 170 (2004), no. 2, 337–347, DOI 10.1016/j.cam.2004.01.035. MR2075015
[298] William Rundell and Paul Sacks, An inverse eigenvalue problem for a vibrating string with two Dirichlet spectra, SIAM J. Appl. Math. 73 (2013), no. 2, 1020–1037, DOI 10.1137/120896426. MR3045668
[299] William Rundell and Paul E. Sacks, The reconstruction of Sturm-Liouville operators, Inverse Problems 8 (1992), no. 3, 457–482. MR1166492
[300] William Rundell and Paul E. Sacks, Reconstruction techniques for classical inverse Sturm-Liouville problems, Math. Comp. 58 (1992), no. 197, 161–183, DOI 10.2307/2153026. MR1106979
[301] William Rundell, Xiang Xu, and Lihua Zuo, The determination of an unknown boundary condition in a fractional diffusion equation, Appl. Anal. 92 (2013), no. 7, 1511–1526, DOI 10.1080/00036811.2012.686605. MR3169116
[302] William Rundell and Masahiro Yamamoto, Uniqueness for an inverse coefficient problem for a one-dimensional time-fractional diffusion equation with non-zero boundary conditions, Applicable Analysis, 2021.
[303] W. Rundell and Z. Zhang, Fractional diffusion: recovering the distributed fractional derivative from overposed data, Inverse Problems 33 (2017), no. 3, 035008, 27, DOI 10.1088/1361-6420/aa573e. MR3626813
[304] William Rundell and Zhidong Zhang, On the identification of source term in the heat equation from sparse data, SIAM J. Math. Anal. 52 (2020), no. 2, 1526–1548, DOI 10.1137/19M1279915. MR4080388
[305] Paul Sacks, Techniques of functional analysis for differential and integral equations, Mathematics in Science and Engineering, Elsevier/Academic Press, London, 2017. MR3643782
[306] Kenichi Sakamoto and Masahiro Yamamoto, Initial value/boundary value problems for fractional diffusion-wave equations and applications to some inverse problems, J. Math. Anal. Appl. 382 (2011), no. 1, 426–447, DOI 10.1016/j.jmaa.2011.04.058. MR2805524
[307] Mikko Salo, The fractional Calderón problem, Journées équations aux dérivées partielles, 2017, talk:7.
[308] Stefan G. Samko, Anatoly A. Kilbas, and Oleg I. Marichev, Fractional integrals and derivatives: Theory and applications, Gordon and Breach Science Publishers, Yverdon, 1993. Edited and with a foreword by S. M. Nikolskiĭ; Translated from the 1987 Russian original; Revised by the authors. MR1347689
[309] A. M. Savchuk and A. A. Shkalikov, On the eigenvalues of the Sturm-Liouville operator with potentials in Sobolev spaces (Russian, with Russian summary), Mat. Zametki 80 (2006), no. 6, 864–884, DOI 10.1007/s11006-006-0204-6; English transl., Math. Notes 80 (2006), no. 5-6, 814–832. MR2311614
[310] Harvey Scher and Elliott W. Montroll, Anomalous transit-time dispersion in amorphous solids, Phys. Rev. B, 12 (Sep 1975), 2455–2477.
[311] Otmar Scherzer, Markus Grasmair, Harald Grossauer, Markus Haltmeier, and Frank Lenzen, Variational methods in imaging, Applied Mathematical Sciences, vol. 167, Springer, New York, 2009. MR2455620
[312] Thomas Schmelzer and Lloyd N. Trefethen, Computing the gamma function using contour integrals and rational approximations, SIAM J. Numer. Anal. 45 (2007), no. 2, 558–571, DOI 10.1137/050646342. MR2300287
[313] W. R. Schneider, Completely monotone generalized Mittag-Leffler functions, Exposition. Math. 14 (1996), no. 1, 3–16. MR1382012
[314] W. R. Schneider and W. Wyss, Fractional diffusion and wave equations, J. Math. Phys. 30 (1989), no. 1, 134–144, DOI 10.1063/1.528578. MR974464
[315] Thomas Schuster, Barbara Kaltenbacher, Bernd Hofmann, and Kamil S. Kazimierski, Regularization methods in Banach spaces, Radon Series on Computational and Applied Mathematics, vol. 10, Walter de Gruyter GmbH & Co. KG, Berlin, 2012, DOI 10.1515/9783110255720. MR2963507
[316] Thomas I. Seidman and Curtis R. Vogel, Well-posedness and convergence of some regularisation methods for nonlinear ill posed problems, Inverse Problems 5 (1989), no. 2, 227–238. MR991919
[317] Hansjörg Seybold and Rudolf Hilfer, Numerical algorithm for calculating the generalized Mittag-Leffler function, SIAM J. Numer. Anal. 47 (2008/09), no. 1, 69–88, DOI 10.1137/070700280. MR2452852
[318] R. E. Showalter, The final value problem for evolution equations, J. Math. Anal. Appl. 47 (1974), 563–572, DOI 10.1016/0022-247X(74)90008-0. MR352644
[319] R. E. Showalter, Quasi-reversibility of first and second order parabolic evolution equations, Improperly posed boundary value problems (Conf., Univ. New Mexico, Albuquerque, N.M., 1974), Res. Notes in Math., No. 1, Pitman, London, 1975, pp. 76–84. MR0477359
[320] R. E. Showalter, Regularization and approximation of second order evolution equations, SIAM J. Math. Anal. 7 (1976), no. 4, 461–472, DOI 10.1137/0507037. MR422808
[321] Ian N. Sneddon, Special functions of mathematical physics and chemistry, 3rd ed., Longman Mathematical Texts, Longman, London-New York, 1980. MR600287
[322] B. Stanković, On the function of E. M. Wright, Publ. Inst. Math. (Beograd) (N.S.) 10(24) (1970), 113–124. MR280762
[323] G. G. Stokes, On the numerical calculation of a class of definite integrals and infinite series, Trans. Cambridge Philos. Soc., 9 (1850), 166–187.
[324] John Sylvester and Gunther Uhlmann, A global uniqueness theorem for an inverse boundary value problem, Ann. of Math. (2) 125 (1987), no. 1, 153–169, DOI 10.2307/1971291. MR873380
[325] Thomas L. Szabo, Time domain wave equations for lossy media obeying a frequency power law, The Journal of the Acoustical Society of America, 96 (1994), no. 1, 491–500.
[326] A. Tauber, Ein Satz aus der Theorie der unendlichen Reihen (German), Monatsh. Math. Phys. 8 (1897), no. 1, 273–277, DOI 10.1007/BF01696278. MR1546472
[327] Gerald Teschl, Ordinary differential equations and dynamical systems, Graduate Studies in Mathematics, vol. 140, American Mathematical Society, Providence, RI, 2012, DOI 10.1090/gsm/140. MR2961944
[328] Vidar Thomée, Galerkin finite element methods for parabolic problems, 2nd ed., Springer Series in Computational Mathematics, vol. 25, Springer-Verlag, Berlin, 2006. MR2249024
[329] Andrey N. Tikhonov and Vasiliy Y. Arsenin, Solutions of ill-posed problems, Scripta Series in Mathematics, V. H. Winston & Sons, Washington, D.C.; John Wiley & Sons, New York-Toronto, Ont.-London, 1977. Translated from the Russian; Preface by translation editor Fritz John. MR0455365
[330] E. C. Titchmarsh, Eigenfunction expansions associated with second-order differential equations. Part I, 2nd ed., Clarendon Press, Oxford, 1962. MR0176151
[331] Bradley E. Treeby and B. T. Cox, Modeling power law absorption and dispersion for acoustic propagation using the fractional Laplacian, The Journal of the Acoustical Society of America, 127 (2010), no. 5, 2741–2748.
[332] Fredi Tröltzsch, Optimal control of partial differential equations: Theory, methods and applications, Graduate Studies in Mathematics, vol. 112, American Mathematical Society, Providence, RI, 2010. Translated from the 2005 German original by Jürgen Sprekels, DOI 10.1090/gsm/112. MR2583281
[333] A. M. Turing, The chemical basis of morphogenesis, Philos. Trans. Roy. Soc. London Ser. B 237 (1952), no. 641, 37–72. MR3363444
[334] A. Tychonoff, Ein Fixpunktsatz (German), Math. Ann. 111 (1935), no. 1, 767–776, DOI 10.1007/BF01472256. MR1513031
[335] Gunther Uhlmann, 30 years of Calderón’s problem, Séminaire Laurent Schwartz—Équations aux dérivées partielles et applications. Année 2012–2013, Sémin. Équ. Dériv. Partielles, École Polytech., Palaiseau, 2014, pp. Exp. No. XIII, 25. MR3381003
[336] François Varray, Olivier Basset, Piero Tortoli, and Christian Cachard, Extensions of nonlinear B/A parameter imaging methods for echo mode, IEEE transactions on ultrasonics, ferroelectrics, and frequency control, 58 (2011), no. 6, 1232–1244.
[337] Urs Vögeli, Khadijeh Nedaiasl, and Stefan A. Sauter, A fully discrete Galerkin method for Abel-type integral equations, Adv. Comput. Math. 44 (2018), no. 5, 1601–1626, DOI 10.1007/s10444-018-9598-4. MR3874031
[338] Daniel Walgraef and E. Aifantis, Dislocation patterning in fatigued metals as a result of dynamical instabilities, Journal of Applied Physics, 58 (1985), no. 8, 688–691.
[339] G. N. Watson, The Harmonic Functions Associated with the Parabolic Cylinder, Proc. London Math. Soc. (2) 17 (1918), 116–148, DOI 10.1112/plms/s2-17.1.116. MR1575566
[340] H. F. Weinberger, A first course in partial differential equations with complex variables and transform methods, Dover Publications, Inc., New York, 1995. Corrected reprint of the 1965 original. MR1351498
[341] Lutz Weis, Operator-valued Fourier multiplier theorems and maximal $L_p$-regularity, Math. Ann. 319 (2001), no. 4, 735–758, DOI 10.1007/PL00004457. MR1825406
[342] F. Werner, Inverse problems with Poisson data: Tikhonov-type regularization and iteratively regularized Newton methods, PhD thesis, University of Göttingen, 2012.
[343] Aleksander Weron and Rafał Weron, Computer simulation of Lévy α-stable variables and processes, Chaos—the interplay between stochastic and deterministic behaviour (Karpacz, 1995), Lecture Notes in Phys., vol. 457, Springer, Berlin, 1995, pp. 379–392, DOI 10.1007/3-540-60188-0_67. MR1452625
[344] P. J. Westervelt, Parametric acoustic array, The Journal of the Acoustical Society of America, 35 (1963) 535–537.
[345] Hermann Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung) (German), Math. Ann. 71 (1912), no. 4, 441–479, DOI 10.1007/BF01456804. MR1511670
[346] Hermann Weyl, Bemerkungen zum Begriff des Differentialquotienten gebrochener Ordnung (German), Vierteljschr. Naturforsch. Ges. Zürich 62 (1917), 296–302. MR3618577
[347] D. V. Widder, Necessary and sufficient conditions for the representation of a function as a Laplace integral, Trans. Amer. Math. Soc. 33 (1931), no. 4, 851–892, DOI 10.2307/1989513. MR1501621
[348] David Vernon Widder, The Laplace transform, Princeton Mathematical Series, vol. 6, Princeton University Press, Princeton, N. J., 1941. MR0005923
[349] A. Wiman, Über die Nullstellen der Funktionen $E_\alpha(x)$ (German), Acta Math. 29 (1905), no. 1, 217–234, DOI 10.1007/BF02403204. MR1555016
[350] Margaret G. Wismer, Finite element analysis of broadband acoustic pulses through inhomogenous media with power law attenuation, The Journal of the Acoustical Society of America, 120 (2006), no. 6, 3493–3502.
[351] R. Wong and Y.-Q. Zhao, Smoothing of Stokes’s discontinuity for the generalized Bessel function, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455 (1999), no. 1984, 1381–1400, DOI 10.1098/rspa.1999.0365. MR1701756
[352] R. Wong and Y.-Q. Zhao, Smoothing of Stokes’s discontinuity for the generalized Bessel function. II, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455 (1999), no. 1988, 3065–3084, DOI 10.1098/rspa.1999.0440. MR1807056
[353] R. Wong and Yu-Qiu Zhao, Exponential asymptotics of the Mittag-Leffler function, Constr. Approx. 18 (2002), no. 3, 355–385, DOI 10.1007/s00365-001-0019-3. MR1906764
[354] E. Maitland Wright, On the Coefficients of Power Series Having Exponential Singularities, J. London Math. Soc. 8 (1933), no. 1, 71–79, DOI 10.1112/jlms/s1-8.1.71. MR1574787
[355] E. Maitland Wright, The asymptotic expansion of the generalized Bessel function, Proc. London Math. Soc. (2) 38 (1935), 257–270, DOI 10.1112/plms/s2-38.1.257. MR1576315
[356] Edward M. Wright, The asymptotic expansion of the generalized hypergeometric function, J. London Math. Soc., 10 (1935), 287–293.
[357] Edward M. Wright, The generalized Bessel function of order greater than one, Quart. J. Math. Oxford Ser. 11 (1940), 36–48, DOI 10.1093/qmath/os-11.1.36. MR3875
[358] Kôsaku Yosida, Lectures on differential and integral equations, Pure and Applied Mathematics, Vol. X, Interscience Publishers, New York-London, 1960. MR0118869
[359] V. Zaburdaev, S. Denisov, and J. Klafter, Lévy walks, Rev. Modern Phys. 87 (2015), no. 2, 483–530, DOI 10.1103/RevModPhys.87.483. MR3403266
[360] Mohsen Zayernouri and George Em Karniadakis, Fractional Sturm-Liouville eigen-problems: theory and numerical approximation, J. Comput. Phys. 252 (2013), 495–517, DOI 10.1016/j.jcp.2013.06.031. MR3101519
[361] Steve Zelditch, Local and global analysis of nodal sets, Surveys in differential geometry 2017. Celebrating the 50th anniversary of the Journal of Differential Geometry, Surv. Differ. Geom., vol. 22, Int. Press, Somerville, MA, 2018, pp. 365–406. MR3838125
[362] Dong Zhang, Xi Chen, and Xiu-fen Gong, Acoustic nonlinearity parameter tomography for biological tissues via parametric array from a circular piston source–theoretical analysis and computer simulations, The Journal of the Acoustical Society of America, 109 (2001), no. 3, 1219–1225.
[363] Dong Zhang, Xiufen Gong, and Shigong Ye, Acoustic nonlinearity parameter tomography for biological specimens via measurements of the second harmonics, The Journal of the Acoustical Society of America, 99 (1996), no. 4, 2397–2402.
[364] Shuqin Zhang, Positive solutions for boundary-value problems of nonlinear fractional differential equations, Electron. J. Differential Equations (2006), No. 36, 12. MR2213580
[365] Wei Zhang, Xing Cai, and Sverre Holm, Time-fractional heat equations and negative absolute temperatures, Comput. Math. Appl. 67 (2014), no. 1, 164–171, DOI 10.1016/j.camwa.2013.11.007. MR3141713
[366] Ying Zhang and Xiang Xu, Inverse source problem for a fractional diffusion equation, Inverse Problems 27 (2011), no. 3, 035010, 12, DOI 10.1088/0266-5611/27/3/035010. MR2772529
[367] Zhidong Zhang and Zhi Zhou, Recovering the potential term in a fractional diffusion equation, IMA J. Appl. Math. 82 (2017), no. 3, 579–600, DOI 10.1093/imamat/hxx004. MR3671483
Index
Abel
  fractional integral, 6
    symbol ${}_aI_x^{\alpha}$, 87
  fractional integral operator, 7
  summability, 52
Abel, Niels Henrik, 5
Ambartsumian, Victor, 13
argument principle, 46
backwards heat problem, 229
Balakrishnan, Alampallam Venkatachalaiyer, 453, 455
Beta function, 56
Bochner spaces, 477
Borg, Goran, 13
Brahe, Tycho, 12
Bromwich path, 50
Brown, Robert, 2
Brownian motion, 1, 18, 21
Calderón problem, 465
  fractional, 466
  uniqueness, 466
Caputo, Michele, 11, 98
Caputo–Wismer–Kelvin model, 449
Cauchy problem, 14
Cauchy representation theorem, 64
central limit theorem, 20
completely monotone functions, 49, 60
contractive map, 404
diffusion equation, 17
Dirichlet–Neumann map
  fractional, 468
Djrbashian, Mkhitar, 11, 72, 98
Djrbashian–Caputo derivative, 7, 29, 121, 127–129, 131, 281, 283, 329, 330, 412, 414
  symbol ${}^{DC}_{a}D^{\alpha}$, 98
Einstein, Albert, 2
energy estimates, 41, 131, 138, 154, 158, 160, 164, 166, 172, 179, 180, 182, 184, 187, 300, 307
Euler, Leonhard, 5
extension theorems, 462, 463
Fick’s law, 37, 43
Fick, Adolf, 2
Fourier series, 474
Fourier transform, 19, 457, 473, 474
Fourier, Joseph, 1
Fourier–Laplace transform, 23
Fox H-function, 33
Fréchet derivative, 382
fundamental solution
  heat equation, 20
  subdiffusion equation, 131
Gamma function, 5, 55
  duplication formula, 57
  integral representation, 58
  plots, 57
  reflection formula, 55
  Stirling’s formula, 59
Gaussian distribution, 20
Gaussian stochastic process, 1
Gel’fand, Israel, 13
Gel’fand–Levitan equation, 13, 296
Graham, Thomas, 2
Green’s functions, 121, 123, 127–129, 234, 350
Gronwall’s inequality, 480
  fractional, 480
Grünwald–Letnikov derivative, 9
Hadamard, Jacques, 14
Halley
  iteration scheme, 214, 218, 220
  method application, 380, 386, 389, 392, 394
Hankel path, 50
Hardy, Godfrey Harold, 10
Hardy–Littlewood inequality, 479
Hardy–Littlewood theorem, 10
Hardy–Littlewood–Sobolev inequality, 479
harmonic functions, 458
Helmholtz equation, 15
Hille–Phillips–Yosida theorem, 229, 454
Hölder inequality, 384, 479
Huygens, Christian, 5, 12
inequalities
  Cauchy–Schwarz, 357
  Friedrichs, 479
  Gagliardo–Nirenberg–Sobolev, 476
  Gronwall, 402, 480
  Hardy–Littlewood, 479
  Hardy–Littlewood–Sobolev, 479
  Parseval, 354
  Poincaré, 479
  Sobolev, 476, 479
  Young, 403, 479
inverse problems
  backwards heat problem, 15
  Calderón, 463
  determining fractional power, 234
    damped wave, 246
    subdiffusion, 234
  DtN map, 463
  eigenvalue, 281–286, 290
  inverse scattering, 15
  inverse Sturm–Liouville, 270
    fractional derivative, 292
  nonlinearity coefficient imaging, 443
  photoacoustic tomography, 418
  potential recovery q(x), 346
  reconstruction of wave speed, 439
  recovering nonlinear boundary condition, 412
    θ-function, 412
  sideways subdiffusion, 327
sideways superdiffusion, 330 undetermined coefficient, 341 inverse scattering, 15 inverse source problems, 333 f (x) from boundary data, 338 fractional reaction-diffusion, 395 f (u) from final time, 397 systems, 407 recovering f (t) from boundary data, 336 recovering f (x) from final time, 333 Jacobi polynomials, 460 Jacobian matrix, 215, 256, 356, 358–360, 450, 451 Kelvin–Voigt wave equation, 41 Kepler, Johannes, 12 Lévy distribution, 30, 31, 36 Lévy process, 33, 470 Lacroix, Sylvestre François, 5 Laplace transform, 7, 429, 471 Laplacian, 470 $-\Delta$, 12, 37, 38, 159, 176, 177, 181, 183, 188, 228, 261, 302, 315, 333, 335–337, 362–365, 380, 383, 389, 390, 396, 418, 419, 444, 454, 466, 474 fractional, 455, 461 Legendre, Adrien-Marie, 5 Leibniz, Gottfried Wilhelm, 4 Levitan, Boris, 13 Liouville transform, 440 Liouville, Joseph, 7 Littlewood, John Ensor, 10 M -Wright function, 82, 85 Mainardi, Francesco, 81, 98 Markov process, 1 maximum principle, 154 fractional case, 457 Maxwell–Cattaneo law, 38 mean square displacement, 30 Mittag-Leffler function, 11, 59, 382 asymptotics, 66 completely monotone, 65 identities, 61 Laplace transform, 60 numerical methods, 74, 76 Padé approximation, 74–76 plots, 67, 76 recurrence formulae, 61 representation, 62
series, 61 Stokes lines, 77 strict positivity, 74 zeros, 72 Mittag-Leffler, Gösta, 11 Müntz–Szász theorem, 51, 245 Newton iteration scheme, 214, 220 method application, 380, 386, 389, 392, 394, 442 Newton, Isaac, 12 Parseval’s theorem, 474 Perrin, Jean Baptiste, 3 Picard iteration, 226 Plancherel’s theorem, 474 Poincaré inequality, 479 Poisson’s equation, 12 Riesz derivative, 461 pseudo-parabolic equation, 230 random walk, 3 anomalous, 32 divergent mean distribution, 24, 26 divergent variance distribution, 24, 30 fractional Laplacian, 33 lattice, 18 Montroll–Weiss, 20 subdiffusion, 27, 30 superdiffusion, 31 Rayleigh, 3rd Baron, 3 regularisation methods analysis, 205 Gauss–Newton, 224 Halley, 225 Landweber, 205, 211 Levenberg–Marquardt, 222 Morozov discrepancy, 208 parameter choice, 208 quasi-reversibility, 229 Tikhonov, 202, 203, 210 truncated singular value decomposition, 200 regularisation strategy, 199 residue theorem, 46 Riemann, Bernhard, 8 Riemann–Liouville derivative, 7, 121–123, 129, 283, 285, 288 symbol ${}^{RL}_{a}D^{\alpha}$, 91 Riesz derivative, 10, 459 fractional integrals, 460
fractional operator, 33 Riesz, Frigyes, 10 Rouché’s theorem, 422 Schauder fixed point theorem, 401 Schauder spaces, 477 Schauder–Tikhonov fixed point, 227 Schrödinger equation, 440 fractional, 467 semigroup (operator), 453–456 singular value decomposition, 198 Sobolev embedding theorem, 478 fractional space, 476 inequality, 479 space, 475 Stanković, Bogoljub, 81 Stokes phenomenon, 51 Sturm–Liouville problem, 13, 292 fractional, 292, 294 inverse, 13, 292, 294 Tauberian theorems, 52, 66 Feller, 54 Karamata, 54 Watson, 54 tautochrone problem, 5 Tikhonov fixed point theorem, 227, 403 regularisation, 202, 210 Tikhonov, Andrey, 210 ultrasound imaging, 443 waiting time distribution, 21 Weierstrass factorisation theorem, 48 Westervelt equation, 444, 445 Weyl derivative, 10 Weyl eigenvalue estimate, 338 Weyl, Hermann, 9 Wright function, 11, 77 M -Wright variant, 81 asymptotic behaviour, 79, 81 history, 78 integral representation, 84 Laplace and Fourier transforms, 82 numerical computation, 83 plots, 85 Wright, Edward Maitland, 11 Young’s inequality, 479 Zener model, 40
Selected Published Titles in This Series
231 Luís Barreira and Yakov Pesin, Introduction to Smooth Ergodic Theory, Second Edition, 2023
230 Barbara Kaltenbacher and William Rundell, Inverse Problems for Fractional Partial Differential Equations, 2023
229 Giovanni Leoni, A First Course in Fractional Sobolev Spaces, 2023
228 Henk Bruin, Topological and Ergodic Theory of Symbolic Dynamics, 2022
227 William M. Goldman, Geometric Structures on Manifolds, 2022
226 Milivoje Lukić, A First Course in Spectral Theory, 2022
225 Jacob Bedrossian and Vlad Vicol, The Mathematical Analysis of the Incompressible Euler and Navier-Stokes Equations, 2022
224 Ben Krause, Discrete Analogues in Harmonic Analysis, 2022
223 Volodymyr Nekrashevych, Groups and Topological Dynamics, 2022
222 Michael Artin, Algebraic Geometry, 2022
221 David Damanik and Jake Fillman, One-Dimensional Ergodic Schrödinger Operators, 2022
220 Isaac Goldbring, Ultrafilters Throughout Mathematics, 2022
219 Michael Joswig, Essentials of Tropical Combinatorics, 2021
218 Riccardo Benedetti, Lectures on Differential Topology, 2021
217 Marius Crainic, Rui Loja Fernandes, and Ioan Mărcuț, Lectures on Poisson Geometry, 2021
216 Brian Osserman, A Concise Introduction to Algebraic Varieties, 2021
215 Tai-Ping Liu, Shock Waves, 2021
214 Ioannis Karatzas and Constantinos Kardaras, Portfolio Theory and Arbitrage, 2021
213 Hung Vinh Tran, Hamilton–Jacobi Equations, 2021
212 Marcelo Viana and José M. Espinar, Differential Equations, 2021
211 Mateusz Michalek and Bernd Sturmfels, Invitation to Nonlinear Algebra, 2021
210 Bruce E. Sagan, Combinatorics: The Art of Counting, 2020
209 Jessica S. Purcell, Hyperbolic Knot Theory, 2020
208 Vicente Muñoz, Ángel González-Prieto, and Juan Ángel Rojo, Geometry and Topology of Manifolds, 2020
207 Dmitry N. Kozlov, Organized Collapse: An Introduction to Discrete Morse Theory, 2020
206 Ben Andrews, Bennett Chow, Christine Guenther, and Mat Langford, Extrinsic Geometric Flows, 2020
205 Mikhail Shubin, Invitation to Partial Differential Equations, 2020
204 Sarah J. Witherspoon, Hochschild Cohomology for Algebras, 2019
203 Dimitris Koukoulopoulos, The Distribution of Prime Numbers, 2019
202 Michael E. Taylor, Introduction to Complex Analysis, 2019
201 Dan A. Lee, Geometric Relativity, 2019
200 Semyon Dyatlov and Maciej Zworski, Mathematical Theory of Scattering Resonances, 2019
199 Weinan E, Tiejun Li, and Eric Vanden-Eijnden, Applied Stochastic Analysis, 2019
198 Robert L. Benedetto, Dynamics in One Non-Archimedean Variable, 2019
197 Walter Craig, A Course on Partial Differential Equations, 2018
196 Martin Stynes and David Stynes, Convection-Diffusion Problems, 2018
195 Matthias Beck and Raman Sanyal, Combinatorial Reciprocity Theorems, 2018
194 Seth Sullivant, Algebraic Statistics, 2018
For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/gsmseries/.
As the title of the book indicates, this is primarily a book on partial differential equations (PDEs) with two definite slants: toward inverse problems and toward the inclusion of fractional derivatives. The standard paradigm, or direct problem, is to take a PDE, including all coefficients and initial/boundary conditions, and to determine the solution. The inverse problem reverses this approach, asking what information about coefficients of the model can be obtained from partial information on the solution. Answering this question requires knowledge of the underlying physical model, including the exact dependence on material parameters. The last feature of the approach taken by the authors is the inclusion of fractional derivatives. This is driven by direct physical applications: a fractional derivative model often allows greater adherence to physical observations than the traditional integer order case. The book also has an extensive historical section, together with material that can be called “fractional calculus” and ordinary differential equations with fractional derivatives. This part is accessible to advanced undergraduates with basic knowledge of real and complex analysis. At the other end of the spectrum lie nonlinear fractional PDEs, which require a standard graduate-level course on PDEs.