English Pages 275 [268] Year 2023
Mathematics for Industry 37
Takashi TAKIGUCHI · Takashi OHE · Jin Cheng · Cheng HUA Editors
Practical Inverse Problems and Their Prospects Proceedings of PIPTP
Mathematics for Industry Volume 37
Editor-in-Chief
Masato Wakayama, Kyushu University, Fukuoka, Japan; NTT Institute for Fundamental Mathematics, Tokyo, Japan

Series Editors
Robert S. Anderssen, Commonwealth Scientific and Industrial Research Organisation, Canberra, ACT, Australia
Yuliy Baryshnikov, Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Heinz H. Bauschke, University of British Columbia, Vancouver, BC, Canada
Philip Broadbridge, School of Engineering and Mathematical Sciences, La Trobe University, Melbourne, VIC, Australia
Jin Cheng, Department of Mathematics, Fudan University, Shanghai, China
Monique Chyba, Department of Mathematics, University of Hawaii at Mānoa, Honolulu, HI, USA
Georges-Henri Cottet, Joseph Fourier University, Grenoble, Isère, France
José Alberto Cuminato, University of São Paulo, São Paulo, Brazil
Shin-ichiro Ei, Department of Mathematics, Hokkaido University, Sapporo, Japan
Yasuhide Fukumoto, Kyushu University, Nishi-ku, Fukuoka, Japan
Jonathan R. M. Hosking, IBM T. J. Watson Research Center, Scarsdale, NY, USA
Alejandro Jofré, University of Chile, Santiago, Chile
Masato Kimura, Faculty of Mathematics and Physics, Kanazawa University, Kanazawa, Japan
Kerry Landman, The University of Melbourne, Victoria, Australia
Robert McKibbin, Institute of Natural and Mathematical Sciences, Massey University, Palmerston North, Auckland, New Zealand
Andrea Parmeggiani, Dir Partenariat IRIS, University of Montpellier 2, Montpellier, Hérault, France
Jill Pipher, Department of Mathematics, Brown University, Providence, RI, USA
Konrad Polthier, Free University of Berlin, Berlin, Germany
Osamu Saeki, Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan
Wil Schilders, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
Zuowei Shen, Department of Mathematics, National University of Singapore, Singapore, Singapur, Singapore
Kim Chuan Toh, Department of Analytics and Operations, National University of Singapore, Singapore, Singapur, Singapore
Evgeny Verbitskiy, Mathematical Institute, Leiden University, Leiden, The Netherlands
Nakahiro Yoshida, The University of Tokyo, Meguro-ku, Tokyo, Japan
Aims & Scope The meaning of “Mathematics for Industry” (sometimes abbreviated as MI or MfI) is different from that of “Mathematics in Industry” (or of “Industrial Mathematics”). The latter is restrictive: it tends to be identified with the actual mathematics that specifically arises in the daily management and operation of manufacturing. The former, however, denotes a new research field in mathematics that may serve as a foundation for creating future technologies. This concept was born from the integration and reorganization of pure and applied mathematics in the present day into a fluid and versatile form capable of stimulating awareness of the importance of mathematics in industry, as well as responding to the needs of industrial technologies. The history of this integration and reorganization indicates that this basic idea will someday find increasing utility. Mathematics can be a key technology in modern society. The series aims to promote this trend by 1) providing comprehensive content on applications of mathematics, especially to industry technologies via various types of scientific research, 2) introducing basic, useful, necessary and crucial knowledge for several applications through concrete subjects, and 3) introducing new research results and developments for applications of mathematics in the real world. These points may provide the basis for opening a new mathematics-oriented technological world and even new research fields of mathematics. To submit a proposal or request further information, please use the PDF Proposal Form or contact directly: Smith Ahram Chae, Publishing Editor ([email protected]).
Editors
Takashi TAKIGUCHI, Department of Mathematics, National Defense Academy of Japan, Yokosuka, Kanagawa, Japan
Takashi OHE, Department of Applied Mathematics, Okayama University of Science, Okayama, Japan
Jin Cheng, School of Mathematical Sciences, Fudan University, Shanghai, China
Cheng HUA, Department of Aeronautics and Astronautics, Fudan University, Shanghai, China
ISSN 2198-350X ISSN 2198-3518 (electronic) Mathematics for Industry ISBN 978-981-99-2407-3 ISBN 978-981-99-2408-0 (eBook) https://doi.org/10.1007/978-981-99-2408-0 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Organizing Committee
Jin Cheng, Fudan University (China)
Kenji Hashizume, West Nippon Expressway Engineering Shikoku Company Limited (Japan)
Cheng Hua, Fudan University (China)
Takayuki Ochi, Tohoku Polytechnic College (Japan)
Takashi Ohe, Okayama University of Science (Japan)
Takashi Takiguchi, National Defense Academy of Japan (Japan)
Sponsoring Institution Institute of Mathematics for Industry, Kyushu University, Japan
Program
Kyushu University IMI Workshop of the Joint Research Projects (I)
Practical Inverse Problems and Their Prospects
March 2nd–March 4th, 2022, held online via Zoom
March 2nd, Wednesday

9:50 Opening

(Chair: T. Ohe)
10:00–10:40 Kenji Hashizume (West Nippon Expressway Engineering Shikoku Company Limited, Japan)
Maintenance of permeable asphalt based on quantitative analysis of deterioration by non-integer dimensional analysis
11:00–11:40 Cheng Hua (Fudan University, China)
Quantitative estimation of the cavity position near the surface in a concrete structure using Rayleigh and the shear waves: Numerical simulation

(Chair: C. Hua)
13:00–13:40 Makoto Maruya (Geoinsight LLC, Japan)
Visual 3D reconstruction of a rotating object in space environment with a least-squares framework
14:00–14:40 Kazumi Tanuma (Gunma University, Japan)
Surface waves in anisotropic elasticity and piezoelectricity
15:00–15:40 Mishio Kawashita (Hiroshima University, Japan)
Inverse problems for wave equations with the Dirichlet and Neumann cavities

(Chair: M. Machida)
19:00–19:40 Hiroshi Fujiwara (Kyoto University, Japan)
On feasibility of numerical reconstruction of the attenuation coefficient in the stationary radiative transport equation
20:00–20:40 Daisuke Kawagoe (Kyoto University, Japan)
Propagation of boundary-induced discontinuity in stationary radiative transfer and its application to the optical tomography
21:00–21:40 Mikyoung Lim (Korea Advanced Institute of Science and Technology, Korea)
Shape recovery of a planar Lipschitz inclusion using the Faber polynomials

March 3rd, Thursday

(Chair: K. Tanuma)
10:00–10:40 Alexandru Tamasan (University of Central Florida, USA)
Applications of A-analytic theory to inverse source problems in two dimensional radiative transport
11:00–11:40 Yikan Liu (Hokkaido University, Japan)
Inverse source problems for time-fractional diffusion equations: Old and new

(Chair: C. Hua)
13:00–13:40 Takahiro Saitoh (Gunma University, Japan)
Inverse scattering for a cavity in 2-D anisotropic and viscoelastic solids
14:00–14:40 Toshiaki Takabatake (West Nippon Expressway Engineering Shikoku Company Limited, Japan)
Investigation of reinforcing bars in reinforced concrete structures by ultrasonic measurements
15:00–15:40 Takayuki Ochi (Tohoku Polytechnic College, Japan)
Recommendation to teach how to analyze an overdetermined system of linear equations with no solution in the class of linear algebra

(Chair: T. Takiguchi)
19:00–19:40 Yoshifumi Saijo (Tohoku University, Japan)
Beyond conventional medical ultrasound imaging
20:00–20:40 Jin Cheng (Fudan University, China)
The inverse contact problem in elasticity
21:00–21:40 Masahiro Yamamoto (The University of Tokyo, Japan)
Theoretical backgrounds and inverse problems for anomalous diffusion-wave equations in applications

March 4th, Friday

(Chair: T. Ohe)
10:00–10:40 Yuko Hatano (Tsukuba University, Japan)
Application of inverse problems to Fukushima accident
11:00–11:40 Manabu Machida (Hamamatsu University School of Medicine, Japan)
Direct and iterative reconstruction methods in optical tomography

(Chair: T. Takiguchi)
13:00–13:40 Naoya Oishi (Kyoto University, Japan)
Denoising in non-invasive medical imaging
14:00–14:40 Dietmar Hömberg (Weierstrass Institute, Germany)
On two-scale topology optimization
15:00–15:40 Takashi Ohe (Okayama University of Science, Japan)
Algebraic reconstruction of a dipolar wave source from observations on several points

16:00 Closing

The time schedule in this program is based on Japan Standard Time (JST = UTC+9).
Preface
These are the proceedings of the Kyushu University IMI Workshop of the Joint Research Projects (I) “Practical inverse problems and their prospects”, held online via Zoom from March 2 to 4, 2022. For the past decade, the organizers of this workshop have held workshops on practical inverse problems about once a year with the support of the Institute of Mathematics for Industry, Kyushu University. This workshop was held as a compilation of that series of workshops. These proceedings contain papers on the following topics:
– Medical imaging.
– Mathematical problems for maintenance of infrastructure.
– Non-invasive and/or non-destructive inspections.
– Education of linear algebra from the viewpoint of practical applications.
– Inverse problems in remote sensing.
– Inverse problems of anomalous diffusions.
– Inverse problems for elastic and/or visco-elastic materials.
– Reconstruction of cavities and/or shapes.
– Reconstruction of dipolar wave sources.
Since the speakers and participants in this workshop had various majors and backgrounds, most of the authors in these proceedings kindly introduced the physical or phenomenological backgrounds of their problems and gave a short summary of related research before introducing their latest contributions, so that each paper serves as a short introductory textbook together with the authors’ latest contributions. On behalf of the organizing committee, I would like to express our gratitude to the IMI Joint Usage/Research Center Office, Kyushu University, for its kind financial and administrative support. Finally, I would like to dedicate these proceedings to Professor Tsutomu Sakurai, my supervisor when I was an undergraduate student and one of the initial members of this series of workshops, who passed away in 2021.
Yokosuka, Japan
March 2023
Takashi TAKIGUCHI
Contents
Beyond Conventional Medical Ultrasound Imaging
Yoshifumi Saijo, Norma Hermawan, Hayato Ikeda, Ryo Shintate, Shiho Furudate, and Takuro Ishii

Denoising with Graphics Processing Units and Deep Learning in Non-invasive Medical Imaging
Naoya Oishi

Tomography from Scattered Signals Obeying the Stationary Radiative Transport Equation
I-Kun Chen, Hiroshi Fujiwara, and Daisuke Kawagoe

The Algebraic Range of the Planar X-Ray Transform of Symmetric Tensors and Applications to Noise Reduction
Hiroshi Fujiwara, Kamran Sadiq, and Alexandru Tamasan

Radiative Transport Equation in Optical Tomography
Manabu Machida

Maintenance of Permeable Asphalt Based on Quantitative Analysis of Deterioration by Non-integer Dimensional Analysis
Kenji Hashizume and Takashi Takiguchi

Investigation of Reinforcing Bars in Reinforced Concrete Structures by Ultrasonic Measurements
Toshiaki Takabatake, Kenji Hashizume, Takayuki Ochi, and Takashi Takiguchi

Recommendation to Teach How to Analyze an Overdetermined System of Linear Equations with No Solution in the Class of Linear Algebra
Takayuki Ochi, Kenji Hashizume, Satoshi Ishikawa, Toshiaki Takabatake, and Takashi Takiguchi

Visual 3D Reconstruction of a Rotating Object in Space Environment with a Least-Squares Framework
Makoto Maruya and Takashi Takiguchi

Uniqueness of Inverse Source Problems for Time-Fractional Diffusion Equations with Singular Functions in Time
Yikan Liu and Masahiro Yamamoto

Long-time Asymptotic Estimate and a Related Inverse Source Problem for Time-Fractional Wave Equations
Xinchi Huang and Yikan Liu

A Big Data Processing Technique Based on Tikhonov Regularization
Yu Chen, Jin Cheng, Jiantang Zhang, and Min Zhong

Quantitative Estimation of Crack on or Near Surface Using Laser-Ultrasonic Surface Wave: Numerical Simulation
Cheng Hua and Takashi Takiguchi

Enclosure Method for Inverse Problems with the Dirichlet and Neumann Combined Case
Mishio Kawashita and Wakako Kawashita

Algebraic Reconstruction of a Dipolar Wave Source from Observations on Several Points
Takashi Ohe and Misa Yokoyama
Beyond Conventional Medical Ultrasound Imaging Yoshifumi Saijo, Norma Hermawan, Hayato Ikeda, Ryo Shintate, Shiho Furudate, and Takuro Ishii
Abstract Ultrasound (US) imaging is one of the most popular medical imaging modalities. Generally, US is characterized as easy, safe, portable, and inexpensive. In addition, US is capable of functional and multi-scale imaging. Examples of functional imaging include speckle tracking of the B-mode (brightness mode) ultrasound image for quantitative assessment of tissue motion, and two-dimensional (2D) blood flow imaging, named echodynamography, which applies fluid dynamics theory. Examples of multi-scale imaging include a 350 MHz acoustic microscope to visualize a single cultured cell, and Optical Resolution Photoacoustic Microscopy (OR-PAM), in which nanosecond pulsed light generates a photoacoustic signal, to visualize a single red blood cell.
Keywords Ultrasound · Speckle tracking · Blood flow · Acoustic microscopy · Photoacoustic imaging
Y. Saijo (corresponding author) · N. Hermawan · H. Ikeda · R. Shintate · S. Furudate · T. Ishii
Tohoku University, Sendai 980-8579, Japan
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_1

1 Speckle Tracking of B-mode Ultrasound Image

1.1 Principle of Speckle Tracking

Basically, ultrasonic imaging reconstructs an image from the reflected echoes of a transmitted ultrasound pulse. When an ultrasound signal travels through tissue, it may encounter reflectors or scatterers. A reflector is a location within the tissue where ultrasound waves are reflected in a specular fashion. It is formed at a transition between tissues with different acoustic impedances. Scatterers, on the other hand, reflect the ultrasound waves in all directions. Scatterers are distributed throughout the tissue, so the signal detected by the ultrasound transducer is normally a superposition of reflections from multiple scatterers. The phase differences between these signals produce constructive or destructive interference in the reflected signal. Constructive superposition yields a larger amplitude, whereas destructive interference yields a smaller one. The spatial distribution of this dark and bright pattern is commonly called speckle.

Speckle tracking is a method of tissue characterization that tracks the motion of ultrasound speckles while the subject moves. In speckle tracking, a spatial distribution of gray values, called a speckle pattern, represents the underlying biological tissue, such as a ligament. The position of this unique pattern moves with the tissue. Tracking this pattern within the 2D image during tissue movement is the basic principle of speckle tracking methods. Speckle tracking techniques are divided into block-matching and optical flow approaches. In the block-matching approach, the best match for a rectangular patch in one frame is sought within a Region of Interest (ROI) in the following frame. The speckle motion is determined from the position of maximum similarity. A map of velocity vectors is produced after the search has been completed over a predetermined ROI.
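The block-matching search described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the program used in the study: the similarity measure (normalized cross-correlation), the exhaustive search, and all names and parameters (`ncc`, `match_block`, `velocity_map`, `size`, `step`, `search`) are choices made here for clarity.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches (assumed similarity measure)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def match_block(frame0, frame1, top, left, size, search):
    """Return ((dy, dx), score) for the patch frame0[top:top+size, left:left+size].

    Every candidate position within +/- `search` pixels in frame1 is tested,
    and the displacement with the highest similarity is kept.
    """
    patch = frame0[top:top + size, left:left + size]
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidates that would fall outside the image.
            if r < 0 or c < 0 or r + size > frame1.shape[0] or c + size > frame1.shape[1]:
                continue
            score = ncc(patch, frame1[r:r + size, c:c + size])
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift, best_score

def velocity_map(frame0, frame1, roi, size=16, step=8, search=4):
    """Displacements (pixels per frame) on a grid of patches inside roi = (top, left, height, width).

    Each row of the result is (top, left, dy, dx).
    """
    top0, left0, height, width = roi
    vectors = []
    for top in range(top0, top0 + height - size + 1, step):
        for left in range(left0, left0 + width - size + 1, step):
            (dy, dx), _ = match_block(frame0, frame1, top, left, size, search)
            vectors.append((top, left, dy, dx))
    return np.array(vectors)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame0 = rng.random((64, 64))                          # synthetic speckle-like texture
    frame1 = np.roll(frame0, shift=(2, -3), axis=(0, 1))   # known rigid motion
    print(velocity_map(frame0, frame1, roi=(16, 16, 32, 32)))
```

Multiplying the pixel displacements by the pixel spacing and the frame rate would convert them to physical velocities; those calibration values depend on the scanner and are not given here.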
1.2 Clinical Study on Adhesive Capsulitis (Hermawan et al. 2022)

Adhesive capsulitis, also called frozen shoulder, or “shoulder of the 50s” in Japanese, is a painful condition in which the movement of the shoulder becomes limited. Frozen shoulder occurs when the strong connective tissue surrounding the shoulder joint becomes thick, stiff, and inflamed. Limited sliding movement of the supraspinatus tendon has been observed by dynamic sonography. The routine examination of the shoulder was performed, including the rotator cuff and biceps tendon, with the arm held in a neutral position, maximal internal rotation, and hyperextension. In a sonographic evaluation during the first external rotation motion of a normal shoulder, the subscapularis is observed gliding freely under the deltoid muscle. The free gliding motion is represented by a high velocity relative to the surrounding tissue. Adhesion is suspected when the subscapularis moves at a speed similar to that of the deltoid muscle.

A total of 26 participants were enrolled. Twenty-two participants were diagnosed with Adhesive Capsulitis (AC); 12 of them were further classified as secondary AC because of associated Rotator Cuff Tears (RCT), and the other 10 were classified as idiopathic AC. In the remaining four participants, no significant pathological change was found in the clinical examinations. All patients complained of shoulder pain, and the majority showed restricted motion, with some exceptions. Both shoulders of the normal subjects were included as controls. Six affected shoulders of idiopathic AC patients and eight affected shoulders of secondary AC patients were analyzed after excluding measurements with severe contracture, in which the patient was unable to perform external rotation. The non-affected shoulders of 10 idiopathic AC and 9 secondary AC patients were included when analyzing the relationship between Range of Motion (ROM) and subscapularis adhesion. Since adhesive capsulitis is not traumatic and develops gradually with age, it is not uncommon for a patient to have adhesion symptoms in both shoulders. Therefore, the non-affected side of adhesive capsulitis patients was not considered a perfect control and was excluded when examining the relationship between symptoms and diagnosis. All participants underwent B-mode sonography using an Aplio i800 TUS-AI800 (Canon Medical Systems Inc.). The measurement data were analyzed by the speckle tracking method with a program written in MATLAB®. The study was approved by the Ethics Committee Review Board of Tohoku University, and written informed consent was obtained from all participants.

Figure 1 shows the velocity analysis of the subscapularis and the deltoid muscle in a rectangular ROI of a healthy shoulder. The tissue velocity is represented by arrows pointing in the direction of motion, with arrow length proportional to the velocity magnitude. The motion is performed from 45° internal rotation to the neutral position. Analysis was carried out in the middle frame of the motion and in the ±3 frames around it. The Adhesion Index (AdI) is calculated as the deltoid muscle velocity divided by the subscapularis velocity, both taken in the direction of motion. In this case, the AdI of the example normal shoulder at the middle frame of the movement is 0.17, and the AdI values in the ±3 frames around the middle frame were below 0.2. This value is interpreted as free gliding motion of the subscapularis under the deltoid muscle. In this circumstance, the subscapularis can be visualized, and its velocity can easily be seen during the movement. Analysis of ligament adhesion by speckle tracking velocity visualization of a participant’s shoulder diagnosed with adhesive capsulitis is shown in Fig. 2. The instantaneous velocity flow pattern was analyzed frame by frame using speckle tracking of the B-mode image.
In this case, the deltoid muscle moves at almost the same velocity as the subscapularis. The severity of adhesion between the subscapularis and the deltoid muscle was quantified by the speckle tracking method. The application of speckle tracking in orthopedics suggests that evaluating the origin of the motion limitation could be a better substitute for physical ROM assessment, since it rules out the pain caused by motion limitation during the examination.

Fig. 1 Velocity analysis of subscapularis and the deltoid muscle of a healthy shoulder

Fig. 2 Speckle tracking velocity visualization of shoulder with adhesive capsulitis
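The adhesion index described in this section is a ratio of two velocities projected onto the direction of motion. The following Python sketch shows one way to compute it. It assumes that each tissue region has already been reduced to a single mean velocity vector; the function name `adhesion_index` and the numbers in the example are hypothetical and are not data from the study.

```python
import numpy as np

def adhesion_index(v_deltoid, v_subscapularis, direction):
    """AdI = deltoid velocity / subscapularis velocity, both projected on the motion direction.

    Inputs are 2D vectors; how the per-region mean vectors are obtained
    (for example, averaging speckle-tracking vectors in each region) is an
    assumption left to the caller.
    """
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)                       # unit vector along the motion
    vd = float(np.dot(np.asarray(v_deltoid, dtype=float), d))
    vs = float(np.dot(np.asarray(v_subscapularis, dtype=float), d))
    if vs == 0.0:
        raise ValueError("subscapularis velocity along the motion direction is zero")
    return vd / vs

if __name__ == "__main__":
    # Made-up velocities in arbitrary units, for illustration only.
    print(adhesion_index((0.5, 0.1), (2.5, -0.2), direction=(1.0, 0.0)))
```

A value well below 1 corresponds to the subscapularis gliding freely under a slower deltoid muscle, while a value near 1 corresponds to the two tissues moving together.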
2 Echodynamography

2.1 Principle of Echodynamography (Oktamuliani et al. 2019a)

Color Doppler imaging shows the two-dimensional (2D) distribution of blood flow. Generally, red indicates blood flow toward the transducer and blue indicates flow away from the transducer (Fig. 3). However, the blood flow measured by the conventional Doppler technique is merely the flow component along the ultrasound beam. Thus, true 2D vectors of blood flow on the observation plane must be obtained either by dual-angle Doppler measurement from two or more directions or by combining Doppler data with fluid dynamics theory. Echodynamography (EDG) uses the continuity equation to calculate the blood flow velocity in the direction perpendicular to the beam. The three-dimensional continuity equation for incompressible flow is

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0 \tag{1}$$

The remaining component v on the x–y plane can be estimated by integrating the continuity equation for incompressible flow along the y-axis as follows:

$$v(x, y) = -\frac{\partial}{\partial x}\int_{y_0^-(x)}^{y} u(x, y')\,dy' - \frac{\partial}{\partial z}\int_{y_0^-(x)}^{y} w(x, y')\,dy' + v\left(x, y_0^-(x)\right) \tag{2}$$
Fig. 3 Schematic illustration of velocity component on ultrasonic beam obtained with conventional color Doppler image (left) and the relation between measured and true velocity vector (right) (Colour figure online)
Unfortunately, the velocity component w in the z-direction cannot be measured. Accordingly, the second term of Eq. (2) is ignored by assuming w = 0. Then, the velocity component on the x–y plane is

$$v(x, y) = -\frac{\partial}{\partial x}\int_{y_0^-(x)}^{y} u(x, y')\,dy' \tag{3}$$

The equation is known as the stream function that is influenced by the flow function

$$F_c(x, y) = \int_{y_0^-(x)}^{y} u(x, y')\,dy' \tag{4}$$
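Equations (3) and (4) can be evaluated numerically on a sampled velocity field. The Python sketch below is a minimal illustration, not the EDG implementation: it assumes a rectangular grid, a flat lower boundary at the first y sample (so that the lower limit of integration does not depend on x), and zero perpendicular velocity on that boundary. The function name `perpendicular_velocity` is hypothetical.

```python
import numpy as np

def perpendicular_velocity(u, x, y):
    """Estimate v(x, y) from u(x, y) using Eqs. (3) and (4).

    u has shape (len(y), len(x)). The lower limit of integration is taken
    as y[0] for every x (assumption: flat boundary with v = 0 on it).
    """
    u = np.asarray(u, dtype=float)
    dy = np.diff(y)
    flow = np.zeros_like(u)
    # Eq. (4): cumulative trapezoidal integral of u along y (axis 0).
    flow[1:, :] = np.cumsum(0.5 * (u[1:, :] + u[:-1, :]) * dy[:, None], axis=0)
    # Eq. (3): v = -d(F_c)/dx, using second-order finite differences.
    return -np.gradient(flow, x, axis=1, edge_order=2)

if __name__ == "__main__":
    # Analytic divergence-free test field: u = cos(x) sin(y), v = sin(x) (1 - cos(y)).
    x = np.linspace(0.0, np.pi, 201)
    y = np.linspace(0.0, np.pi, 201)
    X, Y = np.meshgrid(x, y)
    u = np.cos(X) * np.sin(Y)
    v = perpendicular_velocity(u, x, y)
    print(np.max(np.abs(v - np.sin(X) * (1.0 - np.cos(Y)))))
```

The test field satisfies the two-dimensional form of Eq. (1) with w = 0, so the printed value is only the discretization error of the integration and differentiation steps.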
The EDG method separates the velocity into components along the beam and perpendicular to it. Each of these one-dimensional velocities can in turn be separated into vortex and base flow components. Color Doppler ultrasound measures the velocity in the beam direction; the velocity in the perpendicular direction is obtained by calculating the vortex and base flows. The vortex component is regarded as flow within the observed plane, so the classical stream function is applied. The base flow is defined as the flow without vortex formation on the observed plane; it accounts for blood flow to and from other planes, so the flow function is applied. The EDG method obtains the velocity in the perpendicular direction as follows:

$$v(x, y) = -\frac{\partial}{\partial x}\int_{y_0^-(x)}^{y} u(x, y')\,dy' + \frac{\partial}{\partial x}\int_{y_0^-(x)}^{y} \left(1 - k(x, y')\right) u(x, y')\,dy' + \left(1 - k(x, y)\right) u(x, y) \tan k \tag{5}$$

Thus, if the perpendicular flow velocity can be obtained at all measurement points in the left ventricle (LV), it can be combined with the velocity in the beam direction to give both velocity components. The velocity vector at each point in the observed plane is then obtained as the true velocity, and the distribution of these vectors represents the blood flow:

$$\mathbf{U} = (u_x, u_y) = \left(u_{vx} + u_{bx},\; u_{vy} + u_{by}\right) \tag{6}$$
2.2 Clinical Images of Left Ventricular Blood Flow

Figure 4 shows (a) conventional color Doppler image and (b) 2D blood flow vectors obtained with EDG of normal human left ventricle (LV) at the late diastolic phase. In the EDG image, vortex formation is clearly visualized. Figure 5 shows (a) conventional color Doppler image and (b) 2D blood flow vectors at the ejection phase. The flow is concentrated into the LV outflow. The method is already clinically applied and some interesting and important information has been obtained (Nakajima et al. 2011; Oktamuliani et al. 2019b; Tanaka et al. 2019).

Fig. 4 a Conventional color Doppler image and b 2D blood flow vectors obtained with EDG of normal human left ventricle (LV) at the late diastolic phase

Fig. 5 a Conventional color Doppler image and b 2D blood flow vectors at the ejection phase
3 Acoustic Microscopy

3.1 High-Frequency Ultrasound and High-Resolution Image

The resolution of an ultrasound image depends on the ultrasonic frequency because the wavelength and beam width are inversely proportional to the frequency. In clinical settings, 2.5–3.5 MHz ultrasound is used in abdominal echography and echocardiography, 7.5–13 MHz ultrasound is applied for observation of the carotid artery and muscles, and 20–60 MHz ultrasound is applied for intravascular ultrasound (IVUS) or endoscopic ultrasound (EUS). The trade-off between resolution and penetration depth should be considered when selecting the frequency.

The application of scanning acoustic microscopy (SAM) to medicine and biology started early in the history of acoustic microscopy. SAM introduced a new form of contrast based on the mechanical properties of tissues or cells. SAM has three unique and important characteristics compared with other microscopies such as optical, electron, and atomic force microscopy.

First, SAM provides a simple and easy histopathological examination because it does not require special staining techniques. The contrast observed in SAM images depends on the acoustic properties (i.e., acoustic impedance, sound speed, and attenuation) and on the topographic contour of the tissue.

Second, the microscopic acoustic properties measured by SAM provide important information for assessing echo intensity and texture in macroscopic ultrasound images obtained with lower-frequency ultrasound. Density ρ and sound speed c determine the characteristic acoustic impedance Z of the material as

$$Z = \rho c \tag{7}$$

On the assumption that the interface between two fluid-like media is infinite and plane, the reflected sound power in dB can be determined from the specific acoustic impedance of each medium as

$$\mathrm{dB} = 10 \log_{10} \frac{P_r}{P_i} = 10 \log_{10} \frac{(Z_a - Z_b)^2}{(Z_a + Z_b)^2} \tag{8}$$

where P_r is the sound power reflected at the interface, P_i is the incident sound power, Z_a is the acoustic impedance of medium a, and Z_b is the acoustic impedance of medium b.

Third, SAM provides basic data for assessing the biomechanics of tissues and cells. It is especially useful for microscopic targets to which direct mechanical measurements cannot be applied. In its simplest form, the relation between the sound speed and the elastic bulk modulus of a liquid-like material is expressed as

$$c = \sqrt{\frac{K}{\rho}} \tag{9}$$

where c is the sound speed, K is the elastic bulk modulus, and ρ is the density of the material. If a biological soft tissue is considered as a liquid-like material, this equation can be applied to assess the elasticity of the tissue. Recent biomechanical studies have suggested that tissue cannot be adequately modeled as a liquid and should instead be treated as a soft solid. Even so, the relation between the sound speed and the elasticity of a solid material can be described by the following equation if the material is assumed to be isotropic:

$$c = \sqrt{\frac{E(1 - \sigma)}{\rho(1 + \sigma)(1 - 2\sigma)}} \tag{10}$$

where c is the sound speed, E is Young’s modulus, σ is Poisson’s ratio, and ρ is the density of the material. This relation shows that the Young’s modulus of tissue is proportional to the square of the sound speed. Soft materials are sometimes considered as viscoelastic materials. In that case, viscosity is also related to the acoustic properties:

$$\alpha = \frac{2 f^2 \pi^2}{\rho c^3}\left(\eta_v + \frac{4}{3}\eta_s\right) \tag{11}$$

where α is the absorption, f is the ultrasonic frequency, η_v is the volumetric viscosity, η_s is the shear viscosity, ρ is the density, and c is the sound speed of the material.
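The relations in Eqs. (7) to (10), together with the wavelength relation λ = c/f that underlies the frequency dependence of resolution, can be checked with a short Python sketch. The function names are hypothetical, and the material values in the example are illustrative round numbers (a water-like medium with ρ = 1000 kg/m³ and c = 1500 m/s), not measurements from this chapter.

```python
import math

def acoustic_impedance(rho, c):
    """Eq. (7): Z = rho * c."""
    return rho * c

def reflected_power_db(z_a, z_b):
    """Eq. (8): reflected power in dB at a plane interface between media a and b."""
    if z_a == z_b:
        return -math.inf                     # matched impedances: no reflection
    return 10.0 * math.log10((z_a - z_b) ** 2 / (z_a + z_b) ** 2)

def sound_speed_fluid(bulk_modulus, rho):
    """Eq. (9): c = sqrt(K / rho) for a liquid-like material."""
    return math.sqrt(bulk_modulus / rho)

def sound_speed_solid(youngs_modulus, sigma, rho):
    """Eq. (10): sound speed in an isotropic solid."""
    return math.sqrt(youngs_modulus * (1.0 - sigma)
                     / (rho * (1.0 + sigma) * (1.0 - 2.0 * sigma)))

def youngs_modulus_from_speed(c, sigma, rho):
    """Eq. (10) solved for E; E grows with the square of c."""
    return rho * c ** 2 * (1.0 + sigma) * (1.0 - 2.0 * sigma) / (1.0 - sigma)

def wavelength(c, f):
    """lambda = c / f; a shorter wavelength allows finer resolution."""
    return c / f

if __name__ == "__main__":
    rho, c = 1000.0, 1500.0                  # illustrative water-like medium
    print(acoustic_impedance(rho, c))        # impedance in kg/(m^2 s)
    print(reflected_power_db(1.0, 3.0))      # interface with an impedance ratio of 3
    print(wavelength(c, 3.0e6), wavelength(c, 250.0e6))  # wavelengths at 3 MHz and 250 MHz
```

Solving Eq. (10) for E, as in `youngs_modulus_from_speed`, makes the dependence of Young's modulus on the square of the sound speed explicit.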
3.2 Scanning Acoustic Microscopy

Figure 6 shows a schematic illustration of the reflections from the tissue surface and from the interface between the tissue and the slide glass in acoustic microscopy. The tissue is thinly sliced and attached to the slide glass without a cover slip. High-molecular polymer materials used in cell culture dishes can also be used in place of the slide glass. Single-layered cultured cells are also appropriate objects for SAM. The ultrasound is transmitted through a coupling medium and focused on the surface of the slide glass. The transmitted ultrasound is reflected at both the surface of the tissue and the interface between the tissue and the slide glass, and the transducer receives the sum of these two reflections. Analysis of the phase of the interference between these two reflections yields the thickness, sound speed, and attenuation of the tissue.

Recently, an optical-acoustical hybrid microscope has been developed to measure the thickness and sound speed of cells and to assess biomechanical changes during differentiation (Arakawa et al. 2015, 2018; Nagaoka et al. 2019). The optical module mainly consisted of an inverted optical microscope and a CMOS camera. In this system, a thermo-control system was installed on the inverted optical microscope for observation of living cells. The temperature of the medium was kept between 35 and 37 °C during measurement. The ultrasonic unit mainly consisted of an ultrasonic transducer, a pulser/receiver, an A/D converter, and an automatic stage. The transducer was made of thin-film ZnO attached to a sapphire lens. The central frequency of the transducer was 250 MHz, the focal length was 0.5 mm, and the aperture length was 0.45 mm. The pulser/receiver had an operating frequency of 350 MHz and an amplification of 23 dB. The piezo-actuator automatic stage was used for two-dimensional mechanical scanning with a precision of 1 nm. The thickness and sound speed of the cell were obtained with time-domain and frequency-domain analyses.
3T3-L1 cells are widely used for studying adipogenesis and diabetes because of their potential to differentiate from a fibroblast-like morphology to an adipocyte-like morphology (Hatakeyama and Kanzaki 2011). Generally, fibroblasts are hard because they contain a cytoskeleton consisting mainly of actin filaments. In contrast, adipocytes are considered to be soft because they contain soft lipid droplets. Thus, the biomechanical properties of 3T3-L1 cells may change dramatically during differentiation. Figure 7 shows that the sound speed of the central portion of the fibroblast-like cell was 1650 m/s. Figure 8 shows that the sound speed of the adipocyte-like cell, especially at the position of the lipid droplets (white dotted circles), was 1430 m/s. Optical-acoustical hybrid microscopy may thus contribute to assessing the biomechanics of living cells.
Fig. 6 Schematic illustration of reflections from the tissue surface and from the interface between tissue and slide glass in observation with acoustic microscopy (labeled components: transducer, coupling medium, tissue, slide glass)
Y. Saijo et al.
Fig. 7 a Phase contrast microscope image and b acoustic microscope image (sound speed) of fibroblast-like morphology of 3T3-L1 cell
Fig. 8 a Phase contrast microscope image and b acoustic microscope image (sound speed) of adipocyte-like morphology of 3T3-L1 cell
4 Photoacoustic Imaging
4.1 Principle of Photoacoustic Imaging
Photoacoustic (PA) imaging is one of the most rapidly progressing imaging modalities of the twenty-first century. The principle of PA imaging is that illumination with a very short optical pulse causes thermal expansion of the target and generates a PA signal, which is received by an ultrasound transducer made of piezoelectric materials. There are two major settings in PA microscopic imaging. One is acoustic-resolution photoacoustic microscopy (AR-PAM) (Nagaoka et al. 2017, 2018; Suzuki et al. 2022) and the other is optical-resolution photoacoustic microscopy (OR-PAM) (Yao and Wang 2014; Hu and Wang 2013). In both PAMs, the PA transducer must be mechanically scanned, which results in a relatively slow frame rate compared with ultrasound or photoacoustic tomography (PAT) with an arrayed transducer. In OR-PAM, the optical excitation, usually laser illumination, is focused on one spot of the tissue, and the received PA signal is regarded as the signal from the illuminated spot. In principle, the lateral resolution in OR-PAM depends on the optical focal spot size and is independent of the ultrasonic frequency.
4.2 Red Blood Cell Imaging
An OR-PAM system with a lateral resolution of 650 nm has been developed (Shintate et al. 2020). The system employed a nanosecond pulsed 532 nm laser with a pulse repetition frequency of 10 kHz and a pulse width of 6 ns. The laser beam was collimated by a laser beam expander, attenuated by a neutral density filter, and coupled into a single-mode optical fiber with a 3-µm core diameter using an objective lens. The output laser beam from the optical fiber was focused through an objective lens, passed through the glass bottom of the dish, which is thin enough for the optical aberration to be ignored, and then irradiated the sample. The PA signal generated from the sample was detected by a focused high-frequency ultrasonic transducer with a center frequency of 50 MHz. The detected signals were amplified and recorded by an 8-bit high-speed digitizer at a sampling rate of 5 GS/s. The cell culture dish was scanned by an XY piezo-stage to obtain the 2D distribution of the PA signal. Figure 9 shows bovine red blood cells, in which the biconcave shape is clearly visualized.
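The effect of mechanical scanning on the frame rate can be estimated with simple arithmetic. The sketch below assumes one laser pulse per scan position and ignores stage settling time. The 200 × 200 pixel grid is a hypothetical example, not a value from the cited study; only the 10 kHz repetition frequency, the 5 GS/s sampling rate, and the 50 MHz center frequency come from the text.

```python
def scan_time_seconds(nx, ny, prf_hz, averages=1):
    """Lower bound on the acquisition time of a point-by-point scan.

    Assumes one laser pulse per position (times `averages`) and no stage overhead.
    """
    if prf_hz <= 0 or nx <= 0 or ny <= 0 or averages <= 0:
        raise ValueError("all arguments must be positive")
    return nx * ny * averages / prf_hz


# Values from the text: 10 kHz pulse repetition frequency.
# Hypothetical scan grid: 200 x 200 positions.
frame_time = scan_time_seconds(200, 200, 10_000)

# Samples recorded per period of the 50 MHz center frequency at 5 GS/s.
samples_per_cycle = 5e9 / 50e6
```

Under these assumptions a single 2D image takes several seconds, which is consistent with the relatively slow frame rate noted for mechanically scanned PAM.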
Fig. 9 OR-PAM image of bovine red blood cells (Colour figure online)
5 Conclusions
Ultrasound (US) imaging is one of the most popular medical imaging modalities. US is generally characterized as an easy, safe, portable, and inexpensive imaging modality compared with computed tomography (CT) or magnetic resonance imaging (MRI). But that is not all. Speckle tracking of the B-mode ultrasound image of the shoulder bone and muscles is applied to the quantitative assessment of adhesive capsulitis. Two-dimensional blood flow imaging is obtained with echodynamography, in which fluid dynamics theories are applied to conventional color Doppler imaging. Scanning acoustic microscopy with a frequency of 350 MHz visualizes the shape and biomechanical properties of single cells in different phases of differentiation. Finally, optical-resolution photoacoustic microscopy (OR-PAM), in which nanosecond pulsed light generates a photoacoustic signal, successfully visualizes a single red blood cell. US is not only used to measure the size and shape of organs but is also applied to functional and multi-scale imaging.
References
Arakawa M, Shikama J, Yoshida K, Nagaoka R, Kobayashi K, Saijo Y (2015) Development of an ultrasound microscope combined with optical microscope for multiparametric characterization of a single cell. IEEE Trans Ultrason Ferroelectr Freq Contr 62(9):1615–1622
Arakawa M, Kanai H, Ishikawa K, Nagaoka R, Kobayashi K, Saijo Y (2018) A method for the design of ultrasonic devices for scanning acoustic microscopy using impulsive signals. Ultrasonics 84:172–179
Hatakeyama H, Kanzaki M (2011) Molecular basis of insulin-responsive GLUT4 trafficking systems revealed by single molecule imaging. Traffic 12(12):1805–1820
Hermawan N, Ishii T, Saijo Y (2022) Color Doppler shear wave elastography using commercial ultrasound machine with compensated transducer scanning delay. J Med Ultrason 49(2):1–11
Hu S, Wang LV (2013) Optical-resolution photoacoustic microscopy: auscultation of biological systems at the cellular level. Biophys J 105(4):841–847
Nagaoka R, Kobayashi K, Arakawa M, Hasegawa H, Saijo Y (2019) Correction of phase rotation in pulse spectrum method for scanning acoustic microscopy and its application to measurements of cells. Ultrasonics 99:105949
Nagaoka R, Tabata T, Takagi R, Yoshizawa S, Umemura SI, Saijo Y (2017) Development of real-time 3-D photoacoustic imaging system employing spherically curved array transducer. IEEE Trans Ultrason Ferroelectr Freq Contr 64(8):1223–1233
Nagaoka R, Tabata T, Yoshizawa S, Umemura SI, Saijo Y (2018) Visualization of murine lymph vessels using photoacoustic imaging with contrast agents. Photoacoustics 9:39–48
Nakajima H, Sugawara S, Kameyama T, Tabuchi H, Ohtsuki S, Tanaka M, Saijo Y (2011) Location of flow axis line in the left ventricle and its interaction with local myocardial motion. J Echocardiogr 9(1):24–27
Oktamuliani S, Kanno N, Maeda M, Hasegawa K, Saijo Y (2019a) Validation of echodynamography in comparison with particle-image velocimetry. Ultrason Imag 41(6):336–352
Oktamuliani S, Hasegawa K, Saijo Y (2019b) Left ventricular vortices in myocardial infarction observed with echodynamography. In: Conf Proc 41st IEEE Eng Med Biol Soc 5816–5819
Suzuki R, Shintate R, Ishii T, Saijo Y (2022) Comparative investigation of coherence factor weighting methods for an annular array photoacoustic microscope. Jpn J Appl Phys 61:SG1047
Shintate R, Morino T, Kawaguchi K, Nagaoka R, Kobayashi K, Kanzaki M, Saijo Y (2020) Development of optical resolution photoacoustic microscopy with sub-micron lateral resolution for visualization of cells and their structures. Jpn J Appl Phys 59:SKKE11
Tanaka M, Sakamoto T, Saijo Y, Katahira Y, Sugawara S, Nakajima H, Kurokawa T, Kanai H (2019) Role of intra-ventricular vortex in left ventricular ejection elucidated by echo-dynamography. J Med Ultrason 46(4):413–423
Yao J, Wang LV (2014) Sensitivity of photoacoustic microscopy. Photoacoustics 2(2):87–101
Denoising with Graphics Processing Units and Deep Learning in Non-invasive Medical Imaging
Naoya Oishi
Abstract Medical imaging is not only essential to the diagnostic process, but also plays a very important role in determining the course and effectiveness of treatment. In the last few decades, tremendous technological innovations have been made in the field of non-invasive medical imaging. Among them, imaging methods represented by computed tomography and magnetic resonance imaging are indispensable in current clinical medicine because they can acquire biological structures and functions non-invasively in three to four dimensions with high spatial resolution. However, the acquisition of data with high spatial resolution generally leads to a decrease in the signal-to-noise ratio, and a longer acquisition time is required to improve it. For non-invasive medical image acquisition in clinical settings, a long acquisition time is impractical, so the signal-to-noise ratio is inevitably reduced, especially in high spatial resolution images. It is thus essential to develop effective denoising techniques as post-processing and also to select the optimal denoising method in accordance with the user’s objectives. This review provides a brief overview of denoising techniques as post-processing for medical imaging, and introduces our work on fast and accurate denoising methods using graphics processing units and on denoising with deep learning.
Keywords Medical imaging · Non-invasive · Denoising · Graphics processing units · Deep learning
1 Introduction
Medical imaging is the non-invasive imaging of the internal structures and functions of living organisms for purposes such as screening for and diagnosing various diseases and medical research. Since the discovery of X-rays by Röntgen in 1895, medical imaging has revolutionized medicine, and medical imaging technology has made tremendous progress in recent years. Innovations and new discoveries in ultrasound, nuclear medicine, computed tomography (CT), and magnetic resonance imaging (MRI) have made medical imaging an indispensable tool of both clinical and basic medicine. Medical imaging is a mathematical inverse problem in that the characteristics of biological tissue (the cause) are imaged from the observed signals (the result). Recent technological advances have made it possible to acquire medical images faster, with higher image quality, and less invasively. However, there is a trade-off between faster and less invasive imaging and higher image quality. For example, high-speed imaging leads to a lower image signal-to-noise ratio (SNR), and obtaining high-SNR images requires a longer acquisition time, which also leads to increased invasiveness in methods that involve radiation exposure, such as CT and positron emission tomography (PET). In medical imaging, obtaining accurate information is extremely important for disease diagnosis and treatment decisions. Low-quality medical images containing noise can lead to misdiagnosis (Fig. 1). However, increased acquisition time and increased invasiveness should be avoided in medical practice. It is thus important to use image processing techniques to improve the SNR, that is, to remove noise from the acquired images as post-processing. In this review, we provide an overview of denoising techniques as post-processing of medical images, and introduce high-speed, high-accuracy denoising techniques using general-purpose computing on graphics processing units (GPGPU) and denoising techniques using deep learning, including our own work.
N. Oishi (B) Medical Innovation Center, Kyoto University Graduate School of Medicine, 53 Shogoin-Kawahara-Cho, Sakyo-Ku, Kyoto 606-8507, Japan e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_2
Fig. 1 Examples of low- and high-quality medical images. Left: Noisy chest X-ray. It is difficult to identify abnormal nodules from the image, which can lead to misdiagnosis. Right: The same chest X-ray without noise. Multiple nodules due to metastatic lung cancer can only be identified on a noiseless image (arrows)
2 Denoising with Graphics Processing Units
2.1 Denoising Methods for Medical Imaging
Image processing techniques for denoising can be broadly classified into two categories: one based on unsupervised learning and the other on supervised learning. The former mainly uses methods that take advantage of redundancy in image patterns, while the latter uses deep learning and other techniques that have made remarkable technological progress in recent years. Both require an enormous amount of computation to achieve high-precision denoising, and since it is difficult for ordinary central processing units (CPUs) to perform the real-time processing required in the medical field, CPUs are used only for simple denoising methods. A solution to this problem is GPGPU, a technology that applies graphics processing units (GPUs), which are good at repeatedly applying relatively simple numerical calculations to a large amount of data in parallel, to general-purpose computation. It enables high-speed and high-precision denoising. Another advantage of utilizing GPGPU for denoising is its use in an image viewer. In the medical field, medical images are displayed on an image viewer, and physicians view the images and make a diagnosis. If a denoising filter can be applied and the result displayed in real time at the level of denoising required by the physician, the optimal signal for the detection of abnormalities can be obtained. However, highly accurate denoising requires an enormous amount of computation, making it difficult to adjust the filter in real time. The use of GPGPU not only enables high-speed computation, but also further speeds up the process by allowing images computed on the GPU to be displayed directly on the screen without having to return them to the CPU (Fig. 2).
Fig. 2 General-purpose computing on graphics processing units (GPGPU) workflow in real-time filtering for an image viewer. The workflow requires data transfer from central processing unit (CPU) memory to graphics processing unit (GPU) memory only for the first time, which can be a bottleneck
2.2 Perpendicular Gaussian Filter for Denoising
Although many medical images can nowadays be acquired as three-dimensional (3D) images, certain cross sections, such as axial, coronal, and sagittal sections, must be displayed as 2D images in an image viewer. In this case, efficient denoising can be achieved, especially in medical images obtained with high spatial resolution, by taking into account the information in adjacent cross sections that are not used for display. Blurred boundaries can also be avoided by averaging with stronger weights in the neighborhood of the currently displayed cross section. This filter is called a perpendicular Gaussian filter. It must be computed for each section, and since each computation is large, the use of GPGPU is indispensable for real-time display.
Optical coherence tomography (OCT) is a non-invasive medical imaging technique that uses the scattering of reflected near-infrared light to obtain three-dimensional images of biological tissue. While this technique can obtain 3D images with extremely high spatial resolution, it must be used in conjunction with denoising techniques because of the large influence of noise. Figure 3 shows an OCT image of the skin of a healthy subject’s palm. While it has extremely high spatial resolution, noise is noticeable. By applying a perpendicular Gaussian filter to each cross section, highly accurate denoising can be achieved, and structures such as sweat glands can be clearly seen. By applying this filter to OCT images of medaka brains, we were able to clearly delineate cerebral vasculature of 7–23 µm in size (Suzuki et al. 2020).
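The idea of weighting adjacent cross sections more strongly the closer they are to the displayed one can be sketched in a few lines of Python. This is a minimal CPU illustration under our own assumptions, not the GPGPU implementation described above: the weights are taken to be a 1D Gaussian of the slice offset, and the function name and the parameters `sigma` and `radius` are hypothetical.

```python
import math
import random


def perpendicular_gaussian_slice(volume, index, sigma=1.0, radius=3):
    """Filter one displayed section using its neighbors along the slice axis.

    volume: list of 2D slices, each a list of rows of floats
    index:  index of the section currently displayed
    sigma:  width of the Gaussian weights, in units of slices (assumed form)
    radius: number of neighboring slices used on each side
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    n = len(volume)
    # Keep only neighbors that exist inside the volume.
    neighbors = [k for k in range(index - radius, index + radius + 1) if 0 <= k < n]
    weights = [math.exp(-0.5 * ((k - index) / sigma) ** 2) for k in neighbors]
    total = sum(weights)
    rows, cols = len(volume[index]), len(volume[index][0])
    # Weighted average taken only in the direction perpendicular to the section.
    return [
        [sum(w * volume[k][r][c] for w, k in zip(weights, neighbors)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical test volume: a constant signal of 1.0 with Gaussian noise.
random.seed(0)
volume = [[[1.0 + random.gauss(0.0, 0.5) for _ in range(16)] for _ in range(16)]
          for _ in range(11)]
filtered = perpendicular_gaussian_slice(volume, index=5, sigma=2.0, radius=4)
```

Because no averaging is done within the displayed plane, in-plane edges are not smeared by this step, and each output pixel is independent of the others, which is the kind of workload that maps well onto a GPU.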
2.3 Non-local Means Filter
Not only medical images but also natural images are known to have spatial redundancy in image patterns, and image noise can be efficiently reduced by an averaging process that uses this property. The Gaussian filter, a major noise reduction filter, is based on the principle of signal averaging using spatial redundancy in images, and is widely used in medical imaging, for example in PET image reconstruction and functional MRI (fMRI) analysis, because of its relatively high processing speed. However, this filter has the major drawback of blurring edges because it also averages dissimilar data (Fig. 4, middle panel). The non-local means (NLM) filter, proposed by Buades et al. (2005), averages similar data in the image by giving larger weights to data with similar spatial patterns. This filter thereby achieves efficient denoising while avoiding edge blurring (Fig. 4, right panel).
Fig. 3 Examples of optical coherence tomography (OCT) images of healthy skin of the palm of the hand. Left: Original OCT images in axial, coronal, and sagittal sections. Right: Denoised images with a perpendicular Gaussian filter under general-purpose computing on graphics processing units. Sweat glands, which are three dimensionally coiled tubular structures, can be easily identified (red arrows) (Color figure online)
Fig. 4 Effects of denoising filters on brain magnetic resonance imaging (MRI) of a healthy subject. Left: Original MRI with extremely high spatial resolution (0.5 × 0.5 × 0.5 mm), but with noticeable noise. Middle: MRI denoised using a common three-dimensional (3D) Gaussian filter. Although the noise is reduced, the edges are blurred and detailed structures cannot be distinguished. Right: MRI denoised using the 3D non-local means (NLM) filter, which is extremely effective in denoising while avoiding blurred edges
The NLM filter adjusts the value of each pixel with a weighted average of other pixels in the neighborhood that have similar geometric patterns. Since these pixels are highly correlated while the noise is generally independently and identically distributed (i.i.d.), averaging them reduces the noise component and yields pixels that are close to the ideal value. However, a major drawback of the NLM filter is its huge computational complexity, which is especially apparent in 3D images. The NLM filter requires the calculation of the distance (similarity) between each voxel in the 3D image and all voxels in the spatial region defined as the neighborhood. In other words, if the size of the 3D image is $N^3$, the size of the search region is $(2S + 1)^3$, and the size of the neighborhood region is $(2N_i + 1)^3$, the complexity of the filter algorithm is on the order of $O\big((N(2S + 1)(2N_i + 1))^3\big)$. In fact, in the first application of 3D NLM to 3D MRI, Coupe et al. reported a computation time of 6 h on a 3 GHz CPU with an image size of 181 × 217 × 181 and minimum values for $S$ and $N_i$ ($S = 5$, $N_i = 1$) (Coupe et al. 2008). Our implementation for the same 3D MRI with $S = 7$, $N_i = 1$ on a CPU (Xeon W-2295 3.0 GHz, 1 thread) takes approximately 301 s, which is faster than the previous implementation but not fast enough for real-time viewing. Therefore, we implemented a 3D NLM using GPGPU and achieved a speed-up of approximately 1,000 times over the CPU (Table 1). The same 3D MRI with $S = 7$, $N_i = 1$ takes about 0.28 s, which is 0.001 s for one slice, a level that poses no problem for real-time display.
As an example of the NLM filter, a segmentation using a brain MRI from the BrainWeb database (http://www.bic.mni.mcgill.ca/brainweb/) is shown in Fig. 5. This MRI has an image size of 181 × 217 × 181. Segmentation is a typical form of medical image processing of brain MRI, which separates the brain into gray matter, white matter, and cerebrospinal fluid. This is important for detecting brain atrophy that can occur in neurodegenerative diseases such as Alzheimer’s disease. Segmentation is typically performed using a Gaussian mixture model based on signal intensity values, but noise affects its accuracy. In the middle panel of Fig. 5, 9% noise is added, which reduces the segmentation accuracy. The 3D NLM process improves the segmentation accuracy (Fig. 5, right panel). We applied the 3D NLM filter to reduce the amount of contrast agent, which could be toxic, used in repeated CT angiography in a rat model of carotid artery occlusion (Kitamura et al. 2012). We also applied the 3D NLM filter to capture minute changes in brain structures using MRI in a stress model rat (Yoshii et al. 2017).

Table 1 Comparison of three-dimensional non-local means (3D NLM) filter processing time using general-purpose computing on graphics processing units (GPGPU) and central processing unit (CPU). The magnetic resonance imaging (MRI) image size is 181 × 217 × 181 and the neighborhood size is fixed at 3 ($N_i = 1$)

| Search voxels/blocks | Computational time (in s), GPGPU (1 × GeForce TITAN RTX) | Computational time (in s), CPU (Xeon W-2295 3.0 GHz, 1 thread) | Ratio (CPU/GPGPU) |
| --- | --- | --- | --- |
| $5^3 = 125$ voxels | 0.17 (0.0009 s/slice) | | |
| $7^3 = 343$ voxels | 0.28 (0.001 s/slice) | 301.2 (1.66 s/slice) | 1043.4 |
| $11^3 = 1{,}331$ voxels | 0.73 (0.004 s/slice) | | |
| $15^3 = 3{,}375$ voxels | 1.5 (0.008 s/slice) | | |
| $21^3 = 9{,}261$ voxels | 3.7 (0.02 s/slice) | | |

Fig. 5 Effect of the non-local means (NLM) filter on brain segmentation using three-dimensional magnetic resonance imaging (3D MRI). Left: Segmentation into gray matter, white matter, and cerebrospinal fluid using a Gaussian mixture model based on signal intensities. Since the data are generated by a simulation without noise, accurate segmentation is performed. Middle: Image with 9% Gaussian noise added. The noise has reduced the segmentation accuracy. Right: The 3D NLM filter applied to the noisy image. The segmentation accuracy is close to that of the original image due to efficient denoising that avoids blurring of the edges
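The weighted averaging performed by the NLM filter in Sect. 2.3 can be sketched for a small 2D image as follows. This is a plain-Python illustration of the pixelwise filter of Buades et al. (2005), not the optimized blockwise or GPGPU implementations discussed above. The exponential weight with smoothing parameter `h`, the border clamping, and the test image are assumptions chosen for the example; `search` and `patch` play the roles of $S$ and $N_i$.

```python
import math
import random


def nlm_denoise(img, search=3, patch=1, h=0.2):
    """Non-local means on a 2D image given as a list of rows of floats."""
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Clamp coordinates so that patches near the border stay inside the image.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def patch_dist2(r1, c1, r2, c2):
        # Mean squared difference between the patches centered on the two pixels.
        total = 0.0
        for dr in range(-patch, patch + 1):
            for dc in range(-patch, patch + 1):
                total += (px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)) ** 2
        return total / (2 * patch + 1) ** 2

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            wsum = vsum = 0.0
            # Visit every pixel of the search window around (r, c).
            for rr in range(max(r - search, 0), min(r + search, rows - 1) + 1):
                for cc in range(max(c - search, 0), min(c + search, cols - 1) + 1):
                    # Similar patches get weights near 1, dissimilar ones near 0.
                    w = math.exp(-patch_dist2(r, c, rr, cc) / (h * h))
                    wsum += w
                    vsum += w * img[rr][cc]
            out[r][c] = vsum / wsum
    return out


def mse(a, b):
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


# Hypothetical test image: a vertical step edge with additive Gaussian noise.
random.seed(0)
clean = [[0.0 if c < 8 else 1.0 for c in range(16)] for _ in range(16)]
noisy = [[v + random.gauss(0.0, 0.1) for v in row] for row in clean]
denoised = nlm_denoise(noisy, search=3, patch=1, h=0.2)
```

The four nested loops make the cost proportional to the number of pixels times the search-window size times the patch size, which is the 2D counterpart of the complexity stated above and shows why the 3D case calls for a GPU.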
3 Denoising with Deep Learning
3.1 Denoising Methods with Deep Learning
The emergence of artificial intelligence (AI) in medicine is expected to have an impact comparable to the breakthrough discoveries of vaccines, anesthesia, sterilization, X-rays, antibiotics, and deoxyribonucleic acid (DNA) in the history of medicine. The recent tremendous progress in AI technology has been driven by deep learning. Deep learning is a method in which a multi-layered neural network architecture transforms input information into multiple levels of abstraction, automatically learning from the data the representations, such as extracted features, that are designed by humans in conventional machine learning. In particular, the development of convolutional neural network (CNN) technology for image pattern recognition tasks drove early deep learning techniques. CNNs have also been applied to various problems in medical imaging, such as lesion detection and classification, segmentation, and image reconstruction, of which noise reduction is a representative example. Denoising with deep learning can be categorized as an image-to-image (I2I) translation task. I2I translation is the task of learning a mapping between images from a source domain and images from a target domain. The goal of I2I translation for denoising is to convert an input noisy image in the source domain into the corresponding denoised image in the target domain. I2I translation can be categorized into two main learning methods: supervised and unsupervised I2I translation. Supervised I2I translation can be further divided according to two major types of network: the encoder–decoder network and the generative adversarial network (GAN).
3.2 Denoising with the Encoder–Decoder Network
A well-known encoder–decoder network is U-Net, a CNN-based architecture reported by Ronneberger et al. (2015). U-Net comprises an encoder–decoder network with skip connections. The input image is scaled down toward the lower layers (encoder) and enlarged as it returns to the upper layers (decoder). The encoder increases the ability to abstract the content of the image, whereas the decoder increases the ability to generate an image from the encoder’s information. Skip connections can transfer non-abstract raw information, especially high-frequency signal information, directly to the final output layer. The name U-Net is derived from the shape of the overall architecture, which resembles the letter “U”. Although U-Net was originally developed for segmentation of biological tissue images, its usefulness has led to various applications in medical imaging, including denoising.
We have also developed a multi-task deep learning approach that, in addition to identifying ischemic core regions in cerebrovascular disorders using U-Net, feeds the abstract, dimensionality-reduced representation produced by U-Net into another deep learning model to predict prognosis. We reported that this method significantly improved the prediction of prognosis after mechanical thrombectomy for large-vessel occlusion compared with the conventional method (Nishi et al. 2020).
Figure 6 shows an example of brain MRI denoising using U-Net. Recently, parallel imaging techniques using multi-channel phased array coils have been used to speed up MRI scans by acquiring only a portion of the k-space data collected by the MRI system. Although a variety of reconstruction algorithms have been developed, position-dependent variations in noise values occur in the reconstructed image, especially with sparsely sampled data. In other words, the noise is not spatially constant, making effective denoising difficult with conventional denoising techniques. In the example, a standard structural image (Fig. 6, first column), acquired in 6 min with a 3 T-MRI system, and a noisy structural image (Fig. 6, second column), acquired in 3 min using a high-speed imaging method with a parallel imaging technique, were obtained for each of 10 healthy subjects. The network was trained and validated on these images by five-fold cross-validation. Whereas U-Net achieves highly accurate denoising including at the center of the image (Fig. 6, third column), noise in the center still remains with conventional denoising techniques such as the 3D NLM method (Fig. 6, fourth column). Although U-Net is a supervised learning method and thus requires a large amount of data for general-purpose denoising, it can be a useful method for situations where image acquisition protocols are somewhat fixed, such as in the medical field, because a trained model can be constructed with a small number of data sets.
Fig. 6 Example of brain magnetic resonance imaging (MRI) denoising using U-Net. First column: Standard three-dimensional (3D) structural image acquired in 6 min with a 3 T-MRI system. Second column: Noisy structural image acquired in 3 min by high-speed imaging method using parallel imaging technique. Third column: Denoised image using 3D U-Net. Denoising is achieved throughout the image, including the central region. Fourth column: Denoised image using the 3D non-local means (NLM) filter. Although noise in the periphery has been removed, noise in the center still remains
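Five-fold cross-validation with 10 subjects can be organized as in the sketch below. The text does not state how the folds were formed, so splitting by subject (two validation subjects per fold) and the subject identifiers are assumptions made for illustration.

```python
import random


def five_fold_splits(subject_ids, n_folds=5, seed=0):
    """Yield (train, validation) subject lists for k-fold cross-validation."""
    ids = list(subject_ids)
    if len(ids) < n_folds:
        raise ValueError("need at least one subject per fold")
    random.Random(seed).shuffle(ids)
    # Deal the shuffled subjects into n_folds groups of (nearly) equal size.
    folds = [ids[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train, validation


# Hypothetical identifiers for the 10 healthy subjects.
subjects = [f"subject_{i:02d}" for i in range(1, 11)]
splits = list(five_fold_splits(subjects))
for train, validation in splits:
    # Each fold trains on 8 subjects and validates on the remaining 2.
    print(len(train), len(validation))
```

Splitting by subject rather than by slice keeps all images of one person in the same fold, so the validation score is not inflated by near-identical neighboring slices.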
3.3 Denoising with the Generative Adversarial Network
GAN, a deep learning algorithm for image generation, was presented by Goodfellow et al. (2014). A GAN consists of two models, a generator and a discriminator, and the idea is that these two models learn adversarially. Both models are typically implemented as neural networks. The generator attempts to learn the distribution of true examples in order to generate new images. The discriminator, usually a binary classifier, learns to discriminate as accurately as possible between the generated image and the true image. Learning converges when the discriminator can no longer distinguish between the true image and the image generated by the generator. Since the publication of GAN, a vast number of GAN-derived methods have been proposed, such as the conditional GAN (cGAN), which extends GAN to a conditional model by conditioning both the discriminator and the generator on additional information. Isola et al. reported a GAN that performs image transformation by taking corresponding images as input pairs and identifying whether they are true pairs, which they called Pix2Pix (Isola et al. 2017). This is a type of cGAN because it uses the corresponding images as the conditioning information, and it is an I2I translation method because it performs image transformations.
Figure 7 shows an example of brain MRI denoising using Pix2Pix. In this example, as in the example in Fig. 6, a standard 3D structural image (Fig. 7, first column) and a noisy structural image (Fig. 7, second column) were acquired for 10 healthy subjects. The former was acquired in 6 min with a 3 T-MRI system and the latter in 3 min with a high-speed imaging method using parallel imaging technology. The network was trained and validated on these images by five-fold cross-validation. Pix2Pix achieves more accurate denoising than the U-Net described in Sect. 3.2 (Fig. 7, third and fourth columns). In particular, it should be noted that Pix2Pix reconstructs structures such as the cerebellar fissures and putamen (Fig. 7, third column, red arrows), which are almost completely lost in the noisy structural image. This suggests that the concept of normal brain structure was effectively learned by the GAN.
Fig. 7 Example of brain magnetic resonance imaging (MRI) denoising using Pix2Pix. First column: Normal structural image taken in 6 min with a 3 T-MRI system. Second column: Noisy structural image acquired in 3 min by high-speed imaging method using parallel imaging technique. Third column: Noise-reduced image using Pix2Pix. High-precision noise reduction was achieved, and structures such as the cerebellar fissures and putamen, which were almost completely lost in the noisy structural image, were restored (red arrows). Fourth column: Denoised image using U-Net. Although the noise is removed with high precision, it is worse than Pix2Pix, including the restoration of microstructures (Color figure online)
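For reference, the adversarial learning described in Sect. 3.3 is usually written as the following objectives, as given in Goodfellow et al. (2014) and Isola et al. (2017). In the denoising setting considered here, $x$ would be the noisy image and $y$ the corresponding standard image; this mapping of symbols, and the weight $\lambda$ on the L1 term, are stated for illustration and are not taken from the text.

```latex
% GAN: the generator G and the discriminator D play a minimax game
\min_G \max_D V(D, G) =
  \mathbb{E}_{y \sim p_{\mathrm{data}}}\big[\log D(y)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% Pix2Pix (conditional GAN): both networks also see the input image x
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
  \mathbb{E}_{x, y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x, z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

% L1 term that keeps the output close to the paired target image
\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}\big[\lVert y - G(x, z) \rVert_1\big]

% Full Pix2Pix objective
G^{*} = \arg\min_G \max_D \; \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The discriminator term rewards outputs that look like plausible standard images given the input, while the L1 term ties each output to its paired target.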
4 Conclusions
In this paper, we reviewed rapid and efficient methods for denoising medical images using GPUs and deep learning, including our own work. Table 2 lists the advantages and disadvantages of the denoising methods introduced in the paper. With the tremendous improvement in computer processing speed in recent years, real-time processing of computationally demanding high-precision noise reduction algorithms, which was difficult in the past, can now be realized using GPGPU. Furthermore, by using deep learning, even information lost due to noise can be recovered. This is noise reduction at a level that was previously unthinkable. We believe that these techniques will be widely adopted in medical imaging in the future and will be beneficial to both clinical and basic medicine.

Table 2 Advantages and disadvantages of the denoising methods introduced in this paper

| Denoising methods | Advantages | Disadvantages |
| --- | --- | --- |
| Perpendicular Gaussian filter | The concept is simple; no training data are required; on a two-dimensional display, not only denoising but also 3D structural information of the previous and next slices can be visualized | It is not suitable for 3D image analysis because it cannot be represented as a 3D structure |
| Non-local means filter | By changing the parameters, it can adapt to images with a wide range of noise levels; suitable for 3D image analysis because edge-preserving denoising can be expressed as a 3D structure; no training data are required | Noise remains when there is spatial non-uniformity in the noise properties |
| Encoder–decoder network | Among deep learning algorithms, it is relatively stable during training; it can be adapted to cases where there is spatial non-uniformity in the noise properties | Training data are required; it cannot be adapted to images whose noise properties differ from those of the training data, for which a new model must be trained |
| Generative adversarial network | It is possible to achieve denoising with restoration of even minute structures that have lost most of their information in the image | Training data are required; it cannot be adapted to images whose noise properties differ from those of the training data, for which a new model must be trained; parameters can be difficult to adjust and training is sometimes unstable |
Acknowledgements We would like to thank all of the participants in this study and all research collaborators for their invaluable work in data collection and analysis. This work was supported in part by JSPS KAKENHI grant numbers JP21K07593 and JP18K07712.
References
Buades A, Coll B, Morel JM (2005) A review of image denoising algorithms, with a new one. Multiscale Model Simul 4(2):490–530
Coupe P, Yger P, Prima S, Hellier P, Kervrann C, Barillot C (2008) An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images. IEEE Trans Med Imag 27(4):425–441
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Neural information processing systems, pp 2672–2680
Isola P, Zhu JY, Zhou T, Efros AA (2017) Image-to-image translation with conditional adversarial networks. In: IEEE conference on computer vision and pattern recognition, pp 5967–5976
Kitamura A, Fujita Y, Oishi N, Kalaria RN, Washida K, Maki T, Okamoto Y, Hase Y, Yamada M, Takahashi J, Ito H, Tomimoto H, Fukuyama H, Takahashi R, Ihara M (2012) Selective white matter abnormalities in a novel rat model of vascular dementia. Neurobiol Aging 33(5):e25–e35
Nishi H, Oishi N, Ishii A, Ono I, Ogura T, Sunohara T, Chihara H, Fukumitsu R, Okawa M, Yamana N, Imamura H, Sadamasa N, Hatano T, Nakahara I, Sakai N, Miyamoto S (2020) Deep learning-derived high-level neuroimaging features predict clinical outcomes for large vessel occlusion. Stroke 51(5):1484–1492
Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. arXiv:1505.04597
Suzuki T, Ueno T, Oishi N, Fukuyama H (2020) Intact in vivo visualization of telencephalic microvasculature in medaka using optical coherence tomography. Sci Rep 10(1):19831
Yoshii T, Oishi N, Ikoma K, Nishimura I, Sakai Y, Matsuda K, Yamada S, Tanaka M, Kawata M, Narumoto J, Fukui K (2017) Brain atrophy in the visual cortex and thalamus induced by severe stress in animal model. Sci Rep 7(1):12731
Tomography from Scattered Signals Obeying the Stationary Radiative Transport Equation
I-Kun Chen, Hiroshi Fujiwara, and Daisuke Kawagoe
Abstract We study computerized tomography from scattered signals based on the stationary radiative transport equation, which is a mathematical model of particle propagation with absorption and scattering by media. Discontinuity of the solution induced by a proper boundary condition plays an essential role. The jump across the discontinuity yields the X-ray transform of the attenuation coefficient, and thus X-ray inversion algorithms enable us to reconstruct it. Numerical experiments in two dimensions are exhibited to support its feasibility in quantitative reconstruction.
Keywords Inverse problems · Stationary radiative transport equation · Reconstruction of coefficients · Discontinuous Galerkin method · Computerized tomography
I.-K. Chen: National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan
H. Fujiwara (corresponding author), D. Kawagoe: Kyoto University, Yoshida-Honmachi, Sakyo, Kyoto 606-8501, Japan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_3

1 Introduction
We consider the stationary radiative transport equation
\[ \xi\cdot\nabla_x I(x,\xi) + \bigl(\mu_a(x)+\mu_s(x)\bigr) I(x,\xi) - \mu_s(x)\int_{S^{d-1}} p(x,\xi,\xi')\, I(x,\xi')\, d\sigma_{\xi'} = 0. \tag{1} \]
The stationary radiative transport equation describes migration of particles in turbid media (Chandrasekhar 1960), e.g. photon propagation in biological tissue (Arridge 1999; Arridge and Schotland 2009). The function $I(x,\xi)$ stands for the density of particles at a point $x \in \mathbb{R}^d$, $d = 2$ or $3$, with a direction $\xi \in S^{d-1}$. Here, $S^{d-1}$ is the unit sphere in $\mathbb{R}^d$. The coefficient $\mu_a$ characterizes absorption of particles by the media, and the coefficient $\mu_s$ and the integral kernel $p$ characterize scattering of particles in the media; they are called the absorption coefficient, the scattering coefficient, and the scattering phase function, respectively. In this paper we let $\mu_t = \mu_a + \mu_s$ and call it the (total) attenuation coefficient. Denote the incoming boundary $\Gamma_-$ and the outgoing boundary $\Gamma_+$ by
\[ \Gamma_\pm := \{(x,\xi) \in \partial\Omega \times S^{d-1} \mid \pm n(x)\cdot\xi > 0\}, \]
where $n(x)$ is the outer unit normal vector at $x \in \partial\Omega$. The forward problem is to seek a solution $I$ to Eq. (1) satisfying
\[ I(x,\xi) = I_0(x,\xi), \quad (x,\xi) \in \Gamma_- \tag{2} \]
for a given function $I_0$ on $\Gamma_-$. Contrary to the forward problem, the inverse problem in this paper is to reconstruct the attenuation coefficient $\mu_t$ from the boundary data, $I_0$ and $I|_{\Gamma_+}$, of the solution $I$ to the boundary value problem (1)–(2). In particular, it should be stressed that our method to reconstruct $\mu_t$ assumes no a priori information on $\mu_s$ and $p$. This is a mathematical model of optical tomography, a new medical imaging modality (Arridge and Schotland 2009) used, e.g., for monitoring tissue activities or properties, which appear as changes of $\mu_a$ during activities or disease progression. Although our procedure cannot split $\mu_t$ into $\mu_a$ and $\mu_s$, one can find the variation of $\mu_a$ in that of $\mu_t$ if $\mu_s$ is constant under tissue activity or disease.

We review some previous works concerning the above inverse problem. The attenuation coefficient $\mu_t$ can be reconstructed by the use of an albedo operator (Choulli and Stefanov 1999) for $d \ge 2$, although we restrict ourselves to the cases $d = 2$ or $3$ from the viewpoint of applications. The albedo operator maps the incoming boundary data $I_0$ to the outgoing boundary data $I|_{\Gamma_+}$. The reconstruction in Choulli and Stefanov (1999) is based on the singular decomposition of the albedo operator, which determines the X-ray transform of the attenuation coefficient $\mu_t$ defined by
\[ (X\mu_t)(x,\xi) := \int_{\mathbb{R}} \mu_t(x - r\xi)\, dr, \quad (x,\xi) \in \mathbb{R}^d \times S^{d-1}. \]
However, it is not feasible to determine this decomposition from a finite number of experiments. In contrast, our proposed method uses jumps in boundary measurements to determine $X\mu_t$, and these jumps are observed in a finite number of experiments. Moreover, although our approach reconstructs only $\mu_t$, this is the coefficient that carries the most important information for optical tomography. Anikonov et al. (1993) made use of propagation of the boundary-induced discontinuity, which is the discontinuity of the solution to the forward problem arising from discontinuous incoming boundary data, in order to solve the inverse problem. In particular, they imposed incoming boundary data that are discontinuous with respect to the $\xi$ variable in order to observe the X-ray transform of the attenuation coefficient $\mu_t$.
However, their numerical examples were exhibited in the absence of scattering, i.e. $\mu_s = 0$, since numerical methods for the scattering case had not been well established. On the other hand, a jump of the boundary-induced discontinuity also propagates along a line when the boundary data has a jump with respect to the $x$ variable. Aoki et al. (2001) showed this property for the two-dimensional homogeneous half space with incoming boundary data independent of $\xi$. The first and third authors (Chen and Kawagoe 2019) extended the result to the case of a bounded convex domain.

In numerical reconstruction, one can adopt an iterative method where target coefficients are determined so that simulation results of the forward problem are close to observed data (Hielscher et al. 1999; Klose and Hielscher 1999). Therefore numerical methods for the forward problem play an essential role in reliable and effective reconstruction. Although Monte Carlo methods have been frequently used in particle transport models (Boas et al. 2002; Brown 2005; Wang et al. 1995), their lack of quantitative reliability is a critical drawback in application to inverse problems. Finite difference methods (Klose et al. 2002) could overcome the issue, but they assume regularity of the exact solution. Recently a piecewise constant approximation method has been developed (Fujiwara 2020). It can reproduce discontinuities of the exact solution with respect to both spatial and directional variables.

In this paper, we revisit the results of Chen and Kawagoe (2019) in the next section, and in Sect. 3 we give a remark, with a simple example, on the so-called generalized convexity condition introduced by Anikonov et al. (1993) and assumed in Anikonov et al. (1993) and Chen and Kawagoe (2019). We also exhibit numerical examples in two dimensions in Sects. 4 and 5 to support the feasibility of the method, by the efficient use of the discontinuous Galerkin method (Fujiwara 2020).
2 Propagation of Boundary-Induced Discontinuity
In this section, we review the results on propagation of boundary-induced discontinuity obtained in Chen and Kawagoe (2019), on which the present paper is based.

We introduce a generalized convexity condition (Anikonov et al. 1993). Let $\Omega$ be a bounded convex domain in $\mathbb{R}^d$ with $C^1$ boundary $\partial\Omega$. Suppose that $\overline{\Omega} = \bigcup_{j=1}^N \overline{\Omega_j}$, where $\Omega_j$, $1 \le j \le N$, are disjoint subdomains of $\Omega$ with piecewise $C^1$ boundaries. Let $\Omega_0 := \bigcup_{j=1}^N \Omega_j$. We say that the partition $\{\Omega_j\}_{j=1}^N$ satisfies the generalized convexity condition if, for all $(x,\xi) \in \overline{\Omega} \times S^{d-1}$, the half line $\{x - t\xi \mid t \ge 0\}$ intersects $\partial\Omega_0$ at most finitely many times. In other words, for all $(x,\xi) \in \overline{\Omega} \times S^{d-1}$, there exist a positive integer $l(x,\xi)$ and real numbers $\{t_j(x,\xi)\}_{j=1}^{l(x,\xi)}$ such that $0 \le t_1(x,\xi) < t_2(x,\xi) < \cdots < t_{l(x,\xi)}(x,\xi)$, $x - t\xi \in \partial\Omega_0$ if and only if $t = t_j(x,\xi)$, and $\sup_{(x,\xi)\in\overline{\Omega}\times S^{d-1}} l(x,\xi) < \infty$. In what follows, we use the notations $t_j(x,\xi)$ and $l(x,\xi)$ for the generalized convexity condition, and we put $t_0(x,\xi) = 0$. Figure 1 shows an example of a partition of the domain satisfying the generalized convexity condition.

Fig. 1 An example of the generalized convexity condition

Let $\mu_t$ and $\mu_s$ be nonnegative bounded functions on $\Omega$ such that $\mu_t(x) \ge \mu_s(x)$ for all $x \in \Omega$. We assume that there exists a partition $\{\Omega_j\}_{j=1}^N$ of the domain $\Omega$ such that all discontinuity points of these two functions are contained in $\partial\Omega_0$ and it satisfies the generalized convexity condition. For the discussion in what follows, we reset $\mu_t(x) = \mu_s(x) = 0$ for $x \in \mathbb{R}^d \setminus \Omega_0$. It is worth mentioning that $\mu_t$ and $\mu_s$ are bounded and continuous at least on $\Omega_0$. We also assume that the integral kernel $p$ is a nonnegative bounded function on $\mathbb{R}^d \times S^{d-1} \times S^{d-1}$ which is continuous on $\Omega_0 \times S^{d-1} \times S^{d-1}$, satisfies $p(x,\xi,\xi') = 0$ for $(x,\xi,\xi') \in (\mathbb{R}^d \setminus \Omega_0) \times S^{d-1} \times S^{d-1}$, and satisfies
\[ \int_{S^{d-1}} p(x,\xi,\xi')\, d\sigma_{\xi'} = 1 \tag{3} \]
for all $(x,\xi) \in \Omega_0 \times S^{d-1}$. We regard the directional derivative $\xi\cdot\nabla_x I(x,\xi)$ as
\[ \xi\cdot\nabla_x I(x,\xi) := \frac{d}{dt} I(x + t\xi, \xi)\Big|_{t=0}. \]
The measure $d\sigma_{\xi'}$ is the Lebesgue measure on the sphere $S^{d-1}$.

We introduce some notations. Let
\[ D := (\Omega \times S^{d-1}) \cup \Gamma_-, \qquad \overline{D} := D \cup \Gamma_+, \]
and we define two functions $\tau_\pm$ on $\overline{D}$ by
\[ \tau_\pm(x,\xi) := \inf\{t > 0 \mid x \pm t\xi \notin \Omega\}. \]
Let $\Gamma_{-,\xi}$ and $\Gamma_{-,x}$ be projections of $\Gamma_-$ on $\partial\Omega$ and $S^{d-1}$, respectively:
\[ \Gamma_{-,\xi} := \{x \in \partial\Omega \mid n(x)\cdot\xi < 0\},\ \xi \in S^{d-1}, \qquad \Gamma_{-,x} := \{\xi \in S^{d-1} \mid n(x)\cdot\xi < 0\},\ x \in \partial\Omega. \]
Let $\mathrm{disc}(I)$ be the set of discontinuity points of a function $I$.

We define a solution to the boundary value problem (1)–(2). We call a bounded measurable function $I$ on $D$ a solution to the boundary value problem (1)–(2) if (i) it has the directional derivative $\xi\cdot\nabla_x I(x,\xi)$ at all $(x,\xi) \in \Omega_0 \times S^{d-1}$, (ii) it satisfies the stationary radiative transport equation (1) for all $(x,\xi) \in \Omega_0 \times S^{d-1}$ and the boundary condition (2) for all $(x,\xi) \in \Gamma_-$, (iii) $I(\cdot,\xi)$ is continuous along the line $\{x + t\xi \mid t \in \mathbb{R}\} \cap (\Omega \cup \Gamma_{-,\xi})$ for all $(x,\xi) \in D$, and (iv) $\xi\cdot\nabla_x I(\cdot,\xi)$ is continuous on the open line segments $\{x - t\xi \mid t \in (t_{j-1}(x,\xi), t_j(x,\xi))\}$, $j = 1, \ldots, l(x,\xi)$, with $t_0(x,\xi) = 0$, for all $(x,\xi) \in \Omega_0 \times S^{d-1}$.

The first main result shows how the boundary-induced discontinuity propagates in the media.

Theorem 1 Suppose that the boundary data $I_0$ is bounded and that it satisfies at least one of the following two conditions:
1. $I_0(x,\cdot)$ is continuous on $\Gamma_{-,x}$ for almost all $x \in \partial\Omega$;
2. $I_0(\cdot,\xi)$ is continuous on $\Gamma_{-,\xi}$ for almost all $\xi \in S^{d-1}$.
Then there exists a unique solution $I$ to the boundary value problem (1)–(2), and we have
\[ \mathrm{disc}(I) = \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t < \tau_+(x_*,\xi_*)\}. \]

Theorem 1 shows that the boundary-induced discontinuity propagates only along a positive characteristic line starting from a discontinuity point of the incoming boundary data. Here, a positive characteristic line from a point $(x,\xi) \in \Gamma_-$ is defined by $\{(x + t\xi, \xi) \mid t \ge 0\}$.

Remark 1 Theorem 1 implies that, for bounded continuous boundary data $I_0$ on $\Gamma_-$, there exists a unique solution $I$, which is bounded continuous on $D$.

Remark 2 Anikonov et al. (1993) showed Theorem 1 with Condition 2. Our main contribution is to show Theorem 1 with Condition 1.

As the second result, we discuss the boundary-induced discontinuity of the solution extended up to $\Gamma_+$. In other words, we can extend the domain of the solution $I$ up to $\Gamma_+$, and we see that the boundary-induced discontinuity propagates along a positive characteristic line up to $\Gamma_+$.

Theorem 2 Let the boundary data $I_0$ satisfy the assumptions in Theorem 1 and let $I$ be the solution to the boundary value problem (1)–(2). Then it can be extended up to $\Gamma_+$; the extension, denoted by $\overline{I}$, is given by
\[ \overline{I}(x,\xi) := \begin{cases} I(x,\xi), & (x,\xi) \in D, \\ \lim_{t \downarrow 0} I(x - t\xi, \xi), & (x,\xi) \in \Gamma_+. \end{cases} \]
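Moreover, we have
\[ \mathrm{disc}(\overline{I}) = \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t \le \tau_+(x_*,\xi_*)\}. \]

We state the decay of the boundary-induced discontinuity in a particular situation. Let $\gamma$ be two points in $\partial\Omega$ when $d = 2$, while let $\gamma$ be a simple closed curve in $\partial\Omega$ when $d = 3$. Then $\gamma$ splits $\partial\Omega$ into two connected components $A$ and $B$, that is, $\partial\Omega = A \cup B \cup \gamma$ and $A \cap B = A \cap \gamma = B \cap \gamma = \emptyset$. We impose the incoming boundary data $I_0$ given by
\[ I_0(x,\xi) = \begin{cases} C, & (x,\xi) \in ((A \cup \gamma) \times S^{d-1}) \cap \Gamma_-, \\ 0, & (x,\xi) \in (B \times S^{d-1}) \cap \Gamma_-, \end{cases} \tag{4} \]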
where $C$ is a non-zero constant. We note that $I_0$ satisfies Condition 1 of Theorem 1, and that $\mathrm{disc}(I_0) = \{(x_*,\xi_*) \mid x_* \in \gamma,\ \xi_* \in \Gamma_{-,x_*}\}$. For $(x,\xi) \in \mathrm{disc}(\overline{I})$, we define a jump $[\overline{I}](x,\xi)$ by
\[ [\overline{I}](x,\xi) := \lim_{\substack{x' \to x \\ P(x',\xi) \in (A \cup \gamma)}} \overline{I}(x',\xi) - \lim_{\substack{x' \to x \\ P(x',\xi) \in B}} \overline{I}(x',\xi), \tag{5} \]
where $P(x,\xi) := x - \tau_-(x,\xi)\xi$. We note that, in our situation, $[I_0](x,\xi) = C$ for all $(x,\xi) \in \mathrm{disc}(I_0) = (\gamma \times S^{d-1}) \cap \Gamma_-$. In this situation, we have the following theorem, which is the most important in this paper.

Theorem 3 Let $\overline{I}$ be the extended solution to the boundary value problem (1)–(2) with the incoming boundary data given by (4), and let $(x^*,\xi^*) \in \mathrm{disc}(\overline{I})$. Then,
\[ [\overline{I}](x^*,\xi^*) = C \exp\left(-\int_0^{\tau_-(x^*,\xi^*)} \mu_t(x^* - r\xi^*)\, dr\right). \]
In particular, we take a point $(x^*,\xi^*) \in \mathrm{disc}(\overline{I}) \cap \Gamma_+$. From Theorem 3, we have
\[ X\mu_t(x^*,\xi^*) = \int_0^{\tau_-(x^*,\xi^*)} \mu_t(x^* - r\xi^*)\, dr = -\log\bigl([\overline{I}](x^*,\xi^*)/C\bigr). \]
The right-hand side is obtained from observed data. By varying $\gamma$, we can observe the image $X\mu_t$ of the X-ray transform of $\mu_t$. Then, applying a well-known method such as filtered back projection (Natterer 2001), we can reconstruct the attenuation coefficient $\mu_t$.

We give a sketch of the proofs of the above theorems for the reader's convenience. For all $(x,\xi) \in D$, integrating Eq. (1) with respect to $x$ along the line $\{x - t\xi \mid t > 0\}$ until the line intersects the boundary $\partial\Omega$ and taking the boundary condition (2) into consideration, we obtain the following integral equation:
\[ I(x,\xi) = \exp\bigl(-M_t(x,\xi;\tau_-(x,\xi))\bigr) I_0(P(x,\xi),\xi) + \int_0^{\tau_-(x,\xi)} \mu_s(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr) \int_{S^{d-1}} p(x - s\xi,\xi,\xi')\, I(x - s\xi,\xi')\, d\sigma_{\xi'}\, ds, \tag{6} \]
where
\[ M_t(x,\xi;s) := \int_0^s \mu_t(x - r\xi)\, dr. \]
We call a bounded measurable function $I$ on $D$ satisfying the integral equation (6) for all $(x,\xi) \in D$ a solution to Eq. (6). In our setting, the boundary value problem (1)–(2) is equivalent to the integral equation (6) (Kawagoe 2018). Define a sequence of functions $\{I^{(n)}\}_{n \ge 0}$ on $D$ by
\[ I^{(0)}(x,\xi) := \exp\bigl(-M_t(x,\xi;\tau_-(x,\xi))\bigr) I_0(P(x,\xi),\xi) \tag{7} \]
and
\[ I^{(n+1)}(x,\xi) := \int_0^{\tau_-(x,\xi)} \mu_s(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr) \int_{S^{d-1}} p(x - s\xi,\xi,\xi')\, I^{(n)}(x - s\xi,\xi')\, d\sigma_{\xi'}\, ds. \tag{8} \]
We can easily see that
\[ \sup_{(x,\xi) \in D} |I^{(0)}(x,\xi)| = \sup_{(x,\xi) \in \Gamma_-} |I_0(x,\xi)|. \]
In other words, $I^{(0)}$ is a bounded function on $D$. Moreover, since $\mu_s \le \mu_t$, we have
\begin{align*}
|I^{(n+1)}(x,\xi)| &\le \int_0^{\tau_-(x,\xi)} \mu_s(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr) \int_{S^{d-1}} p(x - s\xi,\xi,\xi')\, |I^{(n)}(x - s\xi,\xi')|\, d\sigma_{\xi'}\, ds \\
&\le \sup_{(x,\xi) \in D} |I^{(n)}(x,\xi)| \int_0^{\tau_-(x,\xi)} \mu_t(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr)\, ds \\
&= \sup_{(x,\xi) \in D} |I^{(n)}(x,\xi)| \int_0^{\tau_-(x,\xi)} \left(-\frac{d}{ds} \exp\bigl(-M_t(x,\xi;s)\bigr)\right) ds \\
&\le M \sup_{(x,\xi) \in D} |I^{(n)}(x,\xi)|
\end{align*}
for all $(x,\xi) \in D$, where
\[ M := \sup_{(x,\xi) \in D} \Bigl(1 - \exp\bigl(-M_t(x,\xi;\tau_-(x,\xi))\bigr)\Bigr). \]
Since the domain $\Omega$ is bounded, we have $M < 1$. Thus, the sum $I := \sum_{n=0}^{\infty} I^{(n)}$ converges uniformly in $D$, and it is a solution to the integral equation (6).

We decompose the solution $I$ to the integral equation (6) into two parts as follows:
\[ I(x,\xi) = I_D(x,\xi) + I_C(x,\xi), \tag{9} \]
where
\[ I_D(x,\xi) := \exp\bigl(-M_t(x,\xi;\tau_-(x,\xi))\bigr) I_0(P(x,\xi),\xi), \qquad I_C(x,\xi) := \sum_{n=1}^{\infty} I^{(n)}(x,\xi). \]
Then, the discontinuity points of the function $I_D$ are described as
\[ \mathrm{disc}(I_D) = \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t < \tau_+(x_*,\xi_*)\}, \]
while the function $I_C$ is bounded continuous on $D$.

Under the assumption of the generalized convexity condition, the function $M_t(\cdot,\cdot;s)$ is continuous on $D$ for all $s \in [0,R]$, where $R$ is the diameter of the domain $\Omega$. In particular, since $\mu_t = 0$ on $\mathbb{R}^d \setminus \Omega_0$, it follows that the function $M_t(\cdot,\cdot;\tau_-(\cdot,\cdot))$ is continuous on $D$. Thus, the discontinuity of the function $I_D$ only comes from that of the boundary data $I_0$.

Now we discuss continuity of the function $I^{(1)}$. Let $G$ be the function defined by
\[ G(x,\xi) := \int_{S^{d-1}} p(x,\xi,\xi')\, I_D(x,\xi')\, d\sigma_{\xi'}. \]
More explicitly, we have
\begin{align*}
G(x,\xi) &= \int_{S^{d-1}} p(x,\xi,\xi') \exp\bigl(-M_t(x,\xi';\tau_-(x,\xi'))\bigr) I_0(P(x,\xi'),\xi')\, d\sigma_{\xi'} \\
&= \int_{\partial\Omega} p\left(x,\xi,\frac{x-y}{|x-y|}\right) \exp\left(-M_t\left(x,\frac{x-y}{|x-y|};|x-y|\right)\right) I_0\left(y,\frac{x-y}{|x-y|}\right) \frac{|n(y)\cdot(x-y)|}{|x-y|^d}\, d\sigma_y. \tag{10}
\end{align*}
Although the function $I_0$ may be discontinuous, under the assumption of Theorem 1, the function $G$ is bounded continuous on $\Omega_0 \times S^{d-1}$. Since the function $I^{(1)}$ is described as
\[ I^{(1)}(x,\xi) = \int_0^{\tau_-(x,\xi)} \mu_s(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr) G(x - s\xi,\xi)\, ds, \]
it is bounded continuous on $D$ by the dominated convergence theorem. We see inductively that the functions $\{I^{(n)}\}_{n \ge 1}$ are bounded continuous on $D$. Since the series $\sum_{n=1}^{\infty} I^{(n)}$ converges uniformly on $D$, the function $I_C$ is also bounded continuous on $D$.

We consider the extension of the functions $I_D$ and $I_C$ up to $\Gamma_+$ in the way stated in Theorem 2, which are denoted by $\overline{I_D}$ and $\overline{I_C}$, respectively. The function $\overline{I_C}$ is bounded continuous on $\overline{D}$ by the dominated convergence theorem, while the boundary-induced discontinuity in $\overline{I_D}$ propagates up to $\Gamma_+$.

Now we discuss the jump in Theorem 3. Let $(x^*,\xi^*) \in \mathrm{disc}(\overline{I})$. From Theorem 2, we have $(P(x^*,\xi^*),\xi^*) \in \mathrm{disc}(I_0)$, that is, $P(x^*,\xi^*) \in \gamma$. Thus, we have
\[ [\overline{I_D}](x^*,\xi^*) = \exp\bigl(-M_t(x^*,\xi^*;\tau_-(x^*,\xi^*))\bigr) [I_0](P(x^*,\xi^*),\xi^*) = C \exp\bigl(-M_t(x^*,\xi^*;\tau_-(x^*,\xi^*))\bigr). \]
On the other hand, since $\overline{I_C}$ is bounded continuous on $\overline{D}$, we have $[\overline{I_C}](x^*,\xi^*) = 0$. Thus, we have
\[ [\overline{I}](x^*,\xi^*) = [\overline{I_D}](x^*,\xi^*) + [\overline{I_C}](x^*,\xi^*) = C \exp\left(-\int_0^{\tau_-(x^*,\xi^*)} \mu_t(x^* - r\xi^*)\, dr\right). \]
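The jump formula of Theorem 3 and its inversion can be tried out numerically. The following is a minimal sketch, assuming the unit disk as the domain and a constant attenuation chosen only for illustration; the function names `tau_minus`, `optical_depth`, `jump` and `x_ray_from_jump` are hypothetical helpers and not part of the original method.

```python
import math

def tau_minus(x, xi):
    """Backward exit distance tau_-(x, xi) for the unit disk (assumed domain).

    Solves |x - t*xi| = 1 for the nonnegative root t.
    """
    b = x[0] * xi[0] + x[1] * xi[1]
    c = x[0] ** 2 + x[1] ** 2 - 1.0
    return b + math.sqrt(max(b * b - c, 0.0))

def optical_depth(mu_t, x, xi, s, n=2000):
    """Midpoint-rule approximation of M_t(x, xi; s) = int_0^s mu_t(x - r*xi) dr."""
    h = s / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += mu_t((x[0] - r * xi[0], x[1] - r * xi[1]))
    return h * total

def jump(mu_t, x, xi, C=1.0):
    """Jump predicted by Theorem 3: C * exp(-M_t(x, xi; tau_-(x, xi)))."""
    return C * math.exp(-optical_depth(mu_t, x, xi, tau_minus(x, xi)))

def x_ray_from_jump(jump_value, C=1.0):
    """Recover the line integral of mu_t from an observed jump."""
    return -math.log(jump_value / C)

if __name__ == "__main__":
    # Illustrative attenuation: constant 0.4 inside the unit disk, zero outside.
    mu = lambda p: 0.4 if p[0] ** 2 + p[1] ** 2 < 1.0 else 0.0
    x_star = (0.5, math.sqrt(3) / 2)   # point on the outgoing boundary
    xi_star = (0.0, 1.0)               # direction of the characteristic line
    j = jump(mu, x_star, xi_star)
    print("jump:", j)
    print("recovered line integral:", x_ray_from_jump(j))
    print("exact line integral:", 0.4 * tau_minus(x_star, xi_star))
```

The midpoint rule is used only for simplicity; for a piecewise continuous $\mu_t$ any quadrature that resolves the interfaces would serve the same purpose.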
3 A Remark on the Generalized Convexity Condition
In this section, we discuss discontinuity of the solution to the boundary value problem (1)–(2) without assuming the generalized convexity condition. For simplicity, we only consider the case where the domain $\Omega$ is the unit disk in $\mathbb{R}^2$ and it contains the square $\Omega_1$ defined by
\[ \Omega_1 := \{(x_1,x_2) \in \mathbb{R}^2 \mid \max\{|x_1|,|x_2|\} < 1/2\} \]
as an inclusion. In this case, we may take $\Omega_2 := \Omega \setminus \overline{\Omega_1}$ so that $\Omega_1 \cap \Omega_2 = \emptyset$ and $\overline{\Omega} = \overline{\Omega_1} \cup \overline{\Omega_2}$. Also we have
\[ \partial\Omega_0 = \{x_1^2 + x_2^2 = 1\} \cup \{|x_1| < 1/2,\ |x_2| = 1/2\} \cup \{|x_1| = 1/2,\ |x_2| < 1/2\}. \]
The domain is shown in Fig. 2.

Fig. 2 The unit disk $\Omega$ with the square inclusion $\Omega_1$; $\Omega_2$ is the rest of the disk

We can see that the set $\Omega_0$ violates the generalized convexity condition. For example, let $\xi = (1,0)$, $1/2 < x_1 < \sqrt{3}/2$ and $x_2 = 1/2$. Then, the half line $\{x - t\xi \mid t > 0\}$ intersects $\partial\Omega_0$ on the line segment $\{|x_1| < 1/2,\ x_2 = 1/2\}$ (and at the point $(-\sqrt{3}/2, 1/2)$), which does not consist of a finite number of points.

We pose the same assumptions on $\mu_t$, $\mu_s$, and $p$ as before. Namely, we assume that $\mu_t$ and $\mu_s$ are nonnegative bounded functions on $\mathbb{R}^d$ such that $\mu_t$ and $\mu_s$ are continuous on $\Omega_0$, $\mu_t(x) \ge \mu_s(x)$ for $x \in \Omega_0$, $\mu_t(x) = \mu_s(x) = 0$ for $x \in \mathbb{R}^d \setminus \Omega_0$, and discontinuity may occur only at $\partial\Omega_0$. Also we assume that the integral kernel $p$ is a nonnegative bounded function on $\mathbb{R}^d \times S^{d-1} \times S^{d-1}$ which is continuous on $\Omega_0 \times S^{d-1} \times S^{d-1}$, satisfies $p(x,\xi,\xi') = 0$ for $(x,\xi,\xi') \in (\mathbb{R}^d \setminus \Omega_0) \times S^{d-1} \times S^{d-1}$, and satisfies (3) for all $(x,\xi) \in \Omega_0 \times S^{d-1}$.

Since the definition of the solution to the boundary value problem (1)–(2) in Sect. 2 is based on the generalized convexity condition, we cannot discuss the existence and uniqueness of the solution in our current setting. However, the definition of the solution to the integral equation (6) still makes sense in our current setting. Thus, in what follows, we discuss discontinuity of the solution to (6). We note that free transport occurs along the flat part of the boundary $\partial\Omega_0$ in the current setting.

In the current setting the function $M_t(\cdot,\cdot;s)$ may not be continuous on $D$ for some $s \in [0,R]$, while it is continuous there for all $s \in [0,R]$ under the generalized convexity condition. Let us describe its discontinuity for our further discussion. Let $\xi^{(j)} := (\cos((j-1)\pi/2), \sin((j-1)\pi/2))$, $j = 1, \ldots, 4$, and let
\begin{align*}
L_1 &:= \{(x_1,x_2) \in \Omega \mid x_1 > -1/2,\ |x_2| = 1/2\}, \\
L_2 &:= \{(x_1,x_2) \in \Omega \mid |x_1| = 1/2,\ x_2 > -1/2\}, \\
L_3 &:= \{(x_1,x_2) \in \Omega \mid x_1 < 1/2,\ |x_2| = 1/2\}, \\
L_4 &:= \{(x_1,x_2) \in \Omega \mid |x_1| = 1/2,\ x_2 < 1/2\}.
\end{align*}
Then, the function $M_t(x,\xi;s)$ is continuous for all $x \in \Omega$ and $s \in [0,R]$ if $\xi \ne \xi^{(j)}$, $j = 1, \ldots, 4$. Its continuity with respect to $\xi$ can be shown as long as $\xi \ne \xi^{(j)}$, $j = 1, \ldots, 4$. On the other hand, if $\xi = \xi^{(j)}$, it may be discontinuous on $L_j$ for some $s \in [0,R]$.

Based on the above observation, we have
\[ \mathrm{disc}(I^{(0)}) \subset \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t < \tau_+(x_*,\xi_*)\} \cup \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in D \mid x \in L_j\}. \]
There are two reasons why we use "$\subset$" instead of "$=$". One is that the function $M_t(x,\xi^{(j)};s)$ may be continuous for some $x \in L_j$ and $s \in [0,R]$. The other is that the boundary-induced discontinuity may be cancelled by discontinuity of the function $M_t(\cdot,\cdot;s)$.

Even if the boundary $\partial\Omega_0$ has a flat part in $\Omega$, under the assumption on the boundary data $I_0$ in Theorem 1, we can apply the dominated convergence theorem to show that the function $G$ defined by (10) is still bounded continuous on $\Omega_0 \times S^{d-1}$. So, the discontinuity of the function $I^{(1)}$ only comes from those of the coefficients. In other words, we have
\[ \mathrm{disc}(I^{(1)}) \subset \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in D \mid x \in L_j\}. \]
For $n \ge 1$, let
\[ G^{(n)}(x,\xi) := \int_{S^{d-1}} p(x,\xi,\xi')\, I^{(n)}(x,\xi')\, d\sigma_{\xi'} \]
so that
\[ I^{(n+1)}(x,\xi) = \int_0^{\tau_-(x,\xi)} \mu_s(x - s\xi) \exp\bigl(-M_t(x,\xi;s)\bigr) G^{(n)}(x - s\xi,\xi)\, ds. \]
By the dominated convergence theorem again, the function $G^{(n)}$ is bounded continuous on $\Omega_0 \times S^{d-1}$, and hence we have
\[ \mathrm{disc}(I^{(n)}) \subset \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in D \mid x \in L_j\}, \quad n \ge 2. \]
Eventually, all of the functions $\{I^{(n)}\}_{n \ge 0}$ defined by (7) and (8) are continuous at least on the set
\[ D \setminus \Bigl(\{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t < \tau_+(x_*,\xi_*)\} \cup \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in D \mid x \in L_j\}\Bigr), \]
and since the sum $I = \sum_{n=0}^{\infty} I^{(n)}$ converges uniformly in $D$, it is also continuous there. In other words, we have
\[ \mathrm{disc}(I) \subset \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t < \tau_+(x_*,\xi_*)\} \cup \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in D \mid x \in L_j\}. \]
Let us recall that the sum $I$ is the solution to the integral equation (6).

The solution $I$ can be extended up to $\Gamma_+$ as in Theorem 2 without the generalized convexity condition, and we denote the extended solution by $\overline{I}$. Correspondingly, let
\begin{align*}
\overline{L}_1 &:= \{(x_1,x_2) \in \overline{\Omega} \mid x_1 > -1/2,\ |x_2| = 1/2\}, \\
\overline{L}_2 &:= \{(x_1,x_2) \in \overline{\Omega} \mid |x_1| = 1/2,\ x_2 > -1/2\}, \\
\overline{L}_3 &:= \{(x_1,x_2) \in \overline{\Omega} \mid x_1 < 1/2,\ |x_2| = 1/2\}, \\
\overline{L}_4 &:= \{(x_1,x_2) \in \overline{\Omega} \mid |x_1| = 1/2,\ x_2 < 1/2\}.
\end{align*}
Then, in the same way as in the proof of Theorem 2, we have
\[ \mathrm{disc}(\overline{I}) \subset \{(x_* + t\xi_*, \xi_*) \mid (x_*,\xi_*) \in \mathrm{disc}(I_0),\ 0 \le t \le \tau_+(x_*,\xi_*)\} \cup \bigcup_{j=1}^{4} \{(x,\xi^{(j)}) \in \overline{D} \mid x \in \overline{L}_j\}. \]

We discuss the jump of the boundary-induced discontinuity with the boundary condition (4). Let $(x_*,\xi_*) \in \mathrm{disc}(I_0)$ and suppose that $\xi_* \ne \xi^{(j)}$, $j = 1, \ldots, 4$. In this case, following the decomposition (9), the function $\overline{I_D}(\cdot,\xi_*)$ is discontinuous on the line $\{x_* + t\xi_* \mid 0 \le t \le \tau_+(x_*,\xi_*)\}$ while the function $\overline{I_C}(\cdot,\xi_*)$ is continuous there. Thus, we have
\[ [\overline{I}](x_* + t\xi_*, \xi_*) = [\overline{I_D}](x_* + t\xi_*, \xi_*) = C \exp\bigl(-M_t(x_* + t\xi_*, \xi_*; t)\bigr), \]
or, letting $x^* = x_* + t\xi_*$ for some $t \in [0, \tau_+(x_*,\xi_*)]$ and $\xi^* = \xi_*$, we have
\[ [\overline{I}](x^*,\xi^*) = [\overline{I_D}](x^*,\xi^*) = C \exp\bigl(-M_t(x^*,\xi^*;\tau_-(x^*,\xi^*))\bigr). \]
If $\xi = \xi^{(j)}$, it is difficult to discuss the jump of the discontinuity of the solution $\overline{I}(\cdot,\xi)$, since it would contain the discontinuity arising from that of the function $M_t(\cdot,\cdot;s)$. However, since the measure of the set $\{\xi^{(j)} \mid j = 1, \ldots, 4\}$ is 0, it does not affect the reconstruction of the attenuation coefficient $\mu_t$.
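The loss of continuity of $M_t$ along the flat sides of the square can be observed numerically. The following is a minimal sketch, assuming a hypothetical piecewise constant attenuation (1.0 in the square, 0.4 in the rest of the unit disk, 0 outside); these values and the helper names are chosen only for illustration. It compares $M_t(x,\xi;R)$, with $R = 2$ the diameter of the disk, at two points just below and just above the line $x_2 = 1/2$.

```python
import math

def mu_t(p, inside=1.0, outside=0.4):
    """Hypothetical attenuation: `inside` in the square, `outside` in the rest of the disk."""
    x1, x2 = p
    if x1 * x1 + x2 * x2 >= 1.0:
        return 0.0          # mu_t is reset to zero outside the domain
    if max(abs(x1), abs(x2)) < 0.5:
        return inside       # square inclusion
    return outside

def optical_depth(x, xi, s=2.0, n=20000):
    """Midpoint-rule approximation of M_t(x, xi; s) = int_0^s mu_t(x - r*xi) dr."""
    h = s / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += mu_t((x[0] - r * xi[0], x[1] - r * xi[1]))
    return h * total

def gap_across_flat_side(xi, eps=1e-4, x1=0.8):
    """Difference of M_t just below and just above the line x2 = 1/2."""
    below = optical_depth((x1, 0.5 - eps), xi)
    above = optical_depth((x1, 0.5 + eps), xi)
    return below - above

if __name__ == "__main__":
    horizontal = (1.0, 0.0)                     # xi^(1), parallel to the flat side
    tilted = (math.cos(0.3), math.sin(0.3))     # a generic direction
    print("gap for xi^(1):", gap_across_flat_side(horizontal))
    print("gap for tilted direction:", gap_across_flat_side(tilted))
```

For the direction $\xi^{(1)}$ the backward ray either runs through the whole square or misses it entirely, so the gap stays close to the contrast times the side length; for the tilted direction the two rays cross the same regions and the gap shrinks with `eps`.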
4 Numerical Detection of Discontinuities
This section is devoted to a numerical demonstration of Theorem 3. The discontinuity of the solution in Theorem 1 is replicated in numerical solutions by the use of the discontinuous Galerkin method with piecewise constant approximation, where spatial meshes are generated along the characteristic line. Throughout the paper, all computations are processed with the standard double precision arithmetic.

In numerical experiments, let $\Omega$ be the unit disk centered at the origin whose absorption and scattering coefficients are given uniformly as $\mu_a \equiv 0.1$ and $\mu_s \equiv 0.3$. The Poisson kernel
\[ p(\xi\cdot\xi') = \frac{1}{2\pi}\, \frac{1 - g^2}{1 - 2g\,\xi\cdot\xi' + g^2}, \quad g = 0.5 \tag{11} \]
is adopted as the scattering phase function. The parameter $g \in [0,1)$ stands for the anisotropy in scattering; $g = 0$ is isotropic, while $g \approx 1$ represents strongly forward scattering. The case $g = 0.5$ lies in the middle of them.

A piecewise constant approximation with respect to both spatial and directional variables (Fujiwara 2020) is employed to solve the boundary value problem (1)–(2) numerically. In what follows, the directional space $S^1 \cong \mathbb{R}/2\pi\mathbb{Z}$ is always discretized into equi-spaced intervals denoted by
\[ \xi_n = \{\xi \in S^1 \mid 2n\pi/360 < \arg\xi < 2(n+1)\pi/360\} \quad \text{for } 0 \le n < 360. \]
For two distinct points $P$ and $Q$ on $\partial\Omega$ with $\arg P \le \arg Q$ (arguments are in a certain fixed branch), we write
\[ \mathrm{arc}[P,Q] = \{x \in \partial\Omega \mid \arg P \le \arg x \le \arg Q\}, \qquad \mathrm{arc}(Q,P) = \partial\Omega \setminus \mathrm{arc}[P,Q]. \]
As a simple case satisfying the assumption in Theorem 1, the boundary condition is introduced by
\[ I_0(x,\xi) = \begin{cases} 1, & x \in \mathrm{arc}[P,Q] \text{ and } \xi \in \Gamma_{-,x}; \\ 0, & \text{otherwise}, \end{cases} \tag{12} \]
which leads to discontinuity of the solution. For $P = (1/2, -\sqrt{3}/2)$ and $Q = (1/2, \sqrt{3}/2)$, Fig. 3 depicts an example of a spatial triangular mesh whose edges cover the characteristic line $PQ$ in the left figure, and the total intensity $\int_{S^1} I(x,\xi)\, d\sigma_\xi$ obtained by numerical computation in the right one.

Fig. 3 Example of a spatial triangular mesh covering the characteristic line $PQ$ (left) and the total intensity obtained by solving the boundary value problem (1) and (12) (right). The number of triangles is 218 and the maximum diameter is $h = 0.283$ (Colour figure online)

Let $P$ and $Q$ be boundary points and suppose that $Q$ is a vertex of the triangulation. Then there uniquely exists a triangle $\tau_1$ (resp. $\tau_0$) having $Q$ as one of its vertices and facing $\mathrm{arc}[P,Q]$ (resp. $\mathrm{arc}(Q,P)$). If the directional vector of the characteristic line $PQ$ coincides with the interface between the directional intervals $\xi_{n-1}$ and $\xi_n$, then the jump (5) at $Q$ appears as the difference $I_h(\tau_1,\xi_n) - I_h(\tau_0,\xi_{n-1})$, where $I_h(\tau,\xi_n)$ is the numerical solution to (1) with the boundary condition (12) on a triangle $\tau$ and a direction $\xi_n$. On the contrary, if the directional vector of the characteristic line belongs to some directional interval $\xi_n$, the jump is measured as the difference $I_h(\tau_1,\xi_{n+1}) - I_h(\tau_0,\xi_{n-1})$. Note that the index $n$ is treated with the periodicity of $\{\xi_n\}$.

The boundary value of our numerical solution is introduced as values on triangles facing the boundary. Specifically, for $x \in \partial\Omega$ that is not a vertex of the triangulation, there uniquely exists a triangle $\tau = \tau(x)$ closest to $x$. This $\tau$ and the direction $\xi_n$ uniquely determine $I_h(\tau,\xi_n)$ in our numerical solution. We consider it as an approximation of the boundary value of our numerical solution at $x \in \partial\Omega$ in the $\xi_n$ direction, and write $I_h(x,\xi_n) = I_h(\tau,\xi_n)$.

For instance, for the points $P = (1/2, -\sqrt{3}/2)$ and $Q = (1/2, \sqrt{3}/2)$ in Fig. 3, the directional vector $(0,1)$ of the characteristic line $PQ$ is the interface between the directional intervals $\xi_{89}$ and $\xi_{90}$. Boundary values $I_h(x,\xi_{89})$ and $I_h(x,\xi_{90})$ near $Q$ are depicted in Fig. 4, where the horizontal and vertical axes are, respectively, $\arg x$ for $x \in \partial\Omega$ and the values of $I_h(x,\xi_n)$. The jump at $Q$ is approximated by the difference between $I_h(\tau_1,\xi_{90}) \approx 0.694$ and $I_h(\tau_0,\xi_{89}) \approx 0.162$, thus it is 0.532, while the exact value is
\[ \exp\bigl(-(\mu_a + \mu_s)|PQ|\bigr) \approx 0.500, \]
where $|PQ| = \sqrt{3}$ is the length of the segment $PQ$. The jump in the numerical solution shows quantitatively good agreement with that of the exact solution stated in Theorem 3.

Fig. 4 Numerically observed outflow near $\arg Q = \pi/3$ in the forward computation on a mesh with the characteristic line; the number of triangles is 22,986 and the maximum diameter is $h = 0.0294$ (Colour figure online)

On the other hand, Fig. 5 illustrates a triangular mesh whose edges do not cover the characteristic line $PQ$ (left) and the total intensity obtained on the mesh (right). Figure 6 depicts the gap of numerical solutions corresponding to Fig. 4. On this mesh, the gap of numerical solutions at $Q$ is approximately 0.106, which does not reproduce the one stated in Theorem 3. Note that the profiles of the numerical solutions $I_h(x,\xi_{89})$ and $I_h(x,\xi_{90})$ along the boundary $\partial\Omega$ coincide with each other as displayed in Fig. 7, except in a neighborhood of $Q$ ($\arg x = \pi/3$) whose profiles are magnified in Figs. 4 and 6. Hence it is concluded that the use of piecewise constant approximation on meshes having the characteristic line $PQ$ is efficient to reproduce discontinuities in numerical solutions.

Fig. 5 Example of mesh (left) and the total intensity (right) where the mesh does not cover the characteristic line. The number of triangles is 238 and the maximum diameter is $h = 0.257$ (Colour figure online)

Fig. 6 Numerically observed outflow around $\arg Q = \pi/3$ in the forward computation on a mesh without the characteristic line; the number of triangles is 22,860 and the maximum diameter is $h = 0.309$ (Colour figure online)

Fig. 7 Comparison of boundary values of numerical solutions in cases with the characteristic line (purple and green) and without it (blue and yellow) (Colour figure online)
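The numbers quoted in this section can be checked with a few lines of arithmetic. The sketch below, with hypothetical helper names, verifies that the Poisson kernel (11) has unit mass on $S^1$ as required by (3), evaluates the exact jump $\exp(-(\mu_a+\mu_s)|PQ|)$, and converts the numerically observed jump $0.694 - 0.162$ into an estimate of the line integral of $\mu_t$ along $PQ$ by the formula following Theorem 3 with $C = 1$.

```python
import math

def poisson_kernel(cos_angle, g=0.5):
    """Two-dimensional Poisson kernel (11) as a function of xi . xi'."""
    return (1.0 - g * g) / (2.0 * math.pi * (1.0 - 2.0 * g * cos_angle + g * g))

def kernel_mass(g=0.5, n=360):
    """Rectangle rule over S^1 with n equi-spaced directions."""
    h = 2.0 * math.pi / n
    return h * sum(poisson_kernel(math.cos(k * h), g) for k in range(n))

mu_a, mu_s = 0.1, 0.3                 # coefficients used in this section
length_PQ = math.sqrt(3.0)            # |PQ| for P = (1/2, -sqrt(3)/2), Q = (1/2, sqrt(3)/2)

exact_line_integral = (mu_a + mu_s) * length_PQ
exact_jump = math.exp(-exact_line_integral)

numerical_jump = 0.694 - 0.162        # values of I_h reported in the text
recovered_line_integral = -math.log(numerical_jump)
relative_error = abs(recovered_line_integral - exact_line_integral) / exact_line_integral

if __name__ == "__main__":
    print("kernel mass:", kernel_mass())
    print("exact jump:", exact_jump)
    print("numerical jump:", numerical_jump)
    print("recovered vs exact line integral:", recovered_line_integral, exact_line_integral)
    print("relative error:", relative_error)
```

Because the logarithm amplifies relative errors in small jumps, the error in the recovered line integral is a more direct indicator of reconstruction quality than the error in the jump itself.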
5 Computerized Tomography Based on the Stationary Radiative Transport Equation Finally, we exhibit numerical experiments on computerized tomography for reconstruction of the profile of attenuation μt in the domain. The measurement, the X-ray transform X μt in two dimensions, is numerically given by jumps of numerical outflows described in the last section. In the example, we use the modified Shepp–Logan model (Toft 1996) to set the absorption coefficients μa in the ellipse x 2 x 2 1 2 E = (x1 , x2 ) + 1/2. If {gn,k } satisfy (15) and (16), then there exists a real-valued f ∈ C μ (Sm ; Ω) (non-unique if m ≥ 2), such that the mapping
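\[
(\Gamma\times\mathbb{S}^1)\ni(e^{i\beta},e^{i\theta})\longmapsto 2\,\mathrm{Re}\Biggl\{\sum_{\substack{n\le -1\\ n\ \mathrm{odd}}}\ \sum_{k\in\mathbb{Z}} g_{n,k}\,e^{in\theta}e^{ik\beta}\Biggr\}
\tag{18}
\]
defines a function in $L^1_{\mathrm{sym,odd}}(\Gamma\times\mathbb{S}^1)$, which coincides with $Xf$ on $\Gamma_+$ (and with $-Xf$ on $\Gamma_-$).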
H. Fujiwara et al.
The constraints (13) are due to the angularly odd extension (9) to the entire torus. The conjugacy constraints (14) are due to the reality of the tensor. The symmetry constraints (15) merely account for each line being doubly parametrized in Γ+ , and they are shared by any function on the torus satisfying the symmetry (11); see Sadiq and Tamasan (2023, Lemma A.2) for details. The moment constraints (16) are due to the nature of the operator (integration) along the line in the definition of the X -ray transform. In the next section, we present a new proof of (16), which does not use the theory of A-analytic maps.
3 An Elementary Proof of the Moment Conditions

In this section we present a new proof of (16), which does not use tools from the theory of A-analytic maps. The starting point is Pantyukhina's result (Pantyukhina 1990), which extends the original GGHL characterization from 0-order tensors to tensors of arbitrary order. The lines in Pantyukhina (1990) are parametrized by points on the tangent bundle $T\mathbb{S}^1=\{(x,\theta)\,|\,\theta\in\mathbb{S}^1,\ x\cdot\theta=0\}$ of the unit circle. To distinguish this from the parametrization of lines by points on a torus, we use the notation
\[
If(x,\theta):=\int_{-\infty}^{\infty}\bigl\langle f(x+t\theta),\theta^m\bigr\rangle\,dt,\qquad (x,\theta)\in T\mathbb{S}^1,
\tag{1}
\]
where $f$ is extended by 0 outside $\Omega$. In two dimensions, Pantyukhina's result states the following.

Theorem 2 (Pantyukhina (1990)) Let $\varphi\in\mathcal{S}(T\mathbb{S}^1)$ be a rapidly decaying function on $T\mathbb{S}^1$. Then $\varphi=If$ for some symmetric $m$-tensor field $f\in\mathcal{S}(\mathbb{R}^2)$ if and only if
1. $\varphi(x,-\theta)=(-1)^m\varphi(x,\theta)$, and
2. for every integer $k\ge 0$, there exist homogeneous polynomials $P_i^k$ of degree $k$, such that
\[
\int_{-\infty}^{\infty}s^k\,\varphi(s\theta^{\perp},\theta)\,ds=\sum_{i=0}^{m}P_i^k(\cos\theta,\sin\theta)\,\cos^i\theta\,\sin^{m-i}\theta.
\tag{2}
\]
Below, we revisit the necessity part of Theorem 2 adapted to functions on the torus. For a tensor $f$ supported in the unit disk, the X-ray transform of $f$ with lines parametrized on $\Gamma\times\mathbb{S}^1$ is connected to the X-ray transform of $f$ with lines parametrized on $T\mathbb{S}^1$ by
\[
Xf(e^{i\beta},e^{i(\alpha+\beta)})=If\bigl(\sin\alpha\,e^{i(\alpha+\beta-\frac{\pi}{2})},e^{i(\alpha+\beta)}\bigr)\quad\text{for } |\alpha|\le\frac{\pi}{2}\ \text{and}\ \beta\in(-\pi,\pi].
\tag{3}
\]
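Relation (3) rests on a small geometric fact: the point $\sin\alpha\,e^{i(\alpha+\beta-\pi/2)}$ is orthogonal to the direction $e^{i(\alpha+\beta)}$, and the line through it in that direction passes through $e^{i\beta}$. The following sketch checks this with complex arithmetic; the function names are ours and serve only as an illustration.

```python
import cmath
import math


def foot_point(alpha, beta):
    """Point sin(alpha) * exp(i(alpha + beta - pi/2)) appearing in relation (3)."""
    return math.sin(alpha) * cmath.exp(1j * (alpha + beta - math.pi / 2))


def check_relation(alpha, beta, tol=1e-12):
    """Check that the two parametrizations describe the same oriented line."""
    direction = cmath.exp(1j * (alpha + beta))
    x = foot_point(alpha, beta)
    boundary_point = cmath.exp(1j * beta)
    # Real inner product of x and the direction: Re(x * conj(direction)).
    orthogonal = abs((x * direction.conjugate()).real) < tol
    # boundary_point = x + t * direction with the real parameter t = cos(alpha).
    t = (boundary_point - x) / direction
    on_line = abs(t.imag) < tol and abs(t.real - math.cos(alpha)) < tol
    return orthogonal and on_line


samples = [(a * math.pi / 12, b * math.pi / 7) for a in range(-6, 7) for b in range(-6, 8)]
print(all(check_relation(alpha, beta) for alpha, beta in samples))
```

The parameter value $t=\cos\alpha$ found here is also the Jacobian factor that appears after the substitution $s=\sin\alpha$.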
To bring Pantyukhina’s moment conditions (2) closer to our needs, we also recast them as orthogonality conditions.
The Algebraic Range of the Planar X -Ray Transform of Symmetric Tensors …
Theorem 3 Let $f\in L^1_0(\mathbf{S}^m;\Omega)$ be a real-valued, integrable symmetric $m$-tensor field satisfying (1), and let $h\in L^1(\Gamma\times\mathbb{S}^1)$ be defined by
\[
h:=\begin{cases}Xf & \text{on }\Gamma_+,\\ -Xf & \text{on }\Gamma_-\end{cases}\;+\;Xf\;=\;\begin{cases}2\,Xf & \text{on }\Gamma_+,\\ 0 & \text{on }\Gamma_-.\end{cases}
\tag{4}
\]
Then, for each $p\in\mathbb{Z}$, $p\ge 0$,
\[
\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}(\sin\alpha)^p(\cos\alpha)\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta=0
\tag{5}
\]
for all
(i) $n\in\mathbb{Z}$ with $m+p-n$ odd, or
(ii) $n\in\mathbb{Z}$ with $|n|>m+p$ and $m+p-n$ even.

Proof Since $f$ is supported in the unit disk, the integration over the line reduces to the integration over the interval $[-1,1]$ regardless of the orientation of the line.
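\[
\begin{aligned}
\int_{-1}^{1}s^p\,If\bigl(s\,e^{i(\theta-\frac{\pi}{2})},e^{i\theta}\bigr)\,ds
&=\int_{-1}^{1}s^p\int_{-1}^{1}\sum_{j=0}^{m}f_j\bigl(s\,e^{i(\theta-\frac{\pi}{2})}+t\,e^{i\theta}\bigr)(\cos\theta)^{m-j}(\sin\theta)^{j}\,dt\,ds\\
&=\sum_{j=0}^{m}\int_{-1}^{1}\int_{-1}^{1}(x_2\cos\theta-x_1\sin\theta)^p f_j(x_1,x_2)(\cos\theta)^{m-j}(\sin\theta)^{j}\,dx_1\,dx_2\\
&=\sum_{j=0}^{m}\sum_{k=0}^{p}c_{j,p,k}\,(\cos\theta)^{m+p-(j+k)}(\sin\theta)^{j+k}\\
&=\sum_{j=0}^{m}\sum_{k=0}^{p}c_{j,p,k}\Bigl(\frac{e^{i\theta}+e^{-i\theta}}{2}\Bigr)^{m+p-(j+k)}\Bigl(\frac{e^{i\theta}-e^{-i\theta}}{2i}\Bigr)^{j+k}\\
&=2^{-m-p}e^{-i\theta(m+p)}\sum_{j=0}^{m}\sum_{k=0}^{p}(-i)^{j+k}c_{j,p,k}\,Q_{m+p,j+k}(e^{2i\theta}),
\end{aligned}
\]
where
\[
c_{j,p,k}=\binom{p}{k}\int_{-1}^{1}\int_{-1}^{1}f_j(x_1,x_2)(-1)^k x_1^{k}x_2^{p-k}\,dx_1\,dx_2,\quad\text{and}\quad Q_{r,k}(t)=(t+1)^{r-k}(t-1)^{k},\ \text{for } 0\le k\le r.
\tag{6}
\]
Since $\{Q_{r,k}(t)\}_{k=0}^{r}$ form a basis for the space of polynomials of degree $r$ (e.g., see Sadiq and Tamasan (2023, Lemma A.1)), the map
\[
\zeta\longmapsto\zeta^{m+p}\int_{\mathbb{R}}s^p\,If\bigl(s\zeta e^{-i\frac{\pi}{2}},\zeta\bigr)\,ds
\]
is a polynomial of degree $2(m+p)$ in $\zeta\in\mathbb{C}$ with even powers only. In particular, we get the orthogonality (in $L^2(\mathbb{S}^1)$) conditions: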
\[
\int_{-\pi}^{\pi}e^{i\theta(m+p-q)}\int_{-1}^{1}s^p\,If\bigl(s\,e^{i(\theta-\frac{\pi}{2})},e^{i\theta}\bigr)\,ds\,d\theta=0,
\]
for $q<0$, or $q>2(m+p)$, or for $1\le q\le 2(m+p)-1$ and $q$ odd.

By setting $n=m+p-q$,
\[
\int_{-\pi}^{\pi}e^{in\theta}\int_{-1}^{1}s^p\,If\bigl(s\,e^{i(\theta-\frac{\pi}{2})},e^{i\theta}\bigr)\,ds\,d\theta=0,
\]
for $|n|>m+p$, or for $|n|\le m+p$ and $m+p-n$ odd.

Thus,
\[
\int_{-\pi}^{\pi}e^{in\theta}\int_{-1}^{1}s^p\,If\bigl(s\,e^{i(\theta-\frac{\pi}{2})},e^{i\theta}\bigr)\,ds\,d\theta=0
\tag{7}
\]
for all
(i) $n\in\mathbb{Z}$ with $m+p-n$ odd, or
(ii) $n\in\mathbb{Z}$ with $|n|>m+p$ and $m+p-n$ even.
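The vanishing pattern in (7) comes down to which Fourier modes a product $\cos^{r-k}\theta\,\sin^{k}\theta$ (with $r=m+p$) can contain. The following sketch, which assumes NumPy and uses helper names of our own choosing, checks numerically the factorization through $Q_{r,k}(e^{2i\theta})$ and the resulting orthogonality for one sample value of $r$.

```python
import numpy as np


def q_poly(r, k, t):
    """Q_{r,k}(t) = (t + 1)^(r - k) (t - 1)^k, as in (6)."""
    return (t + 1) ** (r - k) * (t - 1) ** k


def trig_product(r, k, theta):
    """cos(theta)^(r - k) * sin(theta)^k."""
    return np.cos(theta) ** (r - k) * np.sin(theta) ** k


def fourier_pairing(values, theta, n):
    """Approximate (1 / 2 pi) * integral of exp(i n theta) F(theta) over one period.

    On a uniform periodic grid the mean is exact for trigonometric
    polynomials whose degree plus |n| stays below the number of grid points.
    """
    return np.mean(np.exp(1j * n * theta) * values)


r = 5                                        # plays the role of m + p
theta = 2.0 * np.pi * np.arange(64) / 64
max_factorization_error = 0.0
max_forbidden_mode = 0.0
for k in range(r + 1):
    lhs = trig_product(r, k, theta)
    rhs = (2.0 ** (-r) * (-1j) ** k * np.exp(-1j * r * theta)
           * q_poly(r, k, np.exp(2j * theta)))
    max_factorization_error = max(max_factorization_error, np.max(np.abs(lhs - rhs)))
    for n in range(-2 * r, 2 * r + 1):
        # Modes excluded by (7): |n| > r, or r - n odd.
        if abs(n) > r or (r - n) % 2 == 1:
            max_forbidden_mode = max(max_forbidden_mode,
                                     abs(fourier_pairing(lhs, theta, n)))

print(max_factorization_error, max_forbidden_mode)
```

Both printed numbers should be at the level of rounding error, while an admissible mode such as $n=r$ for $k=0$ is not zero.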
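For each $e^{i\beta}\in\Gamma$, let $\alpha\in\left[-\frac{\pi}{2},\frac{\pi}{2}\right]$ be the angle measured counter-clockwise from the outer unit normal at $e^{i\beta}\in\Gamma$; see Fig. 2. By the change of variables $s=\sin\alpha$ in (7), and using the relation (3), the moment conditions become
\[
\begin{aligned}
0&=\int_{-\pi}^{\pi}\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}}e^{in(\beta+\alpha)}(\sin\alpha)^p\,If\bigl(\sin\alpha\,e^{i(\alpha+\beta-\frac{\pi}{2})},e^{i(\alpha+\beta)}\bigr)\cos\alpha\,d\alpha\,d\beta\\
&=\int_{-\pi}^{\pi}\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}}e^{in(\beta+\alpha)}(\sin\alpha)^p\cos\alpha\,Xf(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta.
\end{aligned}
\]

Fig. 2 $e^{i\beta}\in\Gamma$, $\theta=(\cos\theta,\sin\theta)$, $e^{i\theta}=e^{i(\alpha+\beta)}$, $-\theta^{\perp}=e^{i(\alpha+\beta-\frac{\pi}{2})}$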
Since $h$ in (4) vanishes on $\Gamma_-$, i.e., $h(e^{i\beta},e^{i(\alpha+\beta)})=0$ for all $\beta\in(-\pi,\pi]$ and $\frac{\pi}{2}<|\alpha|\le\pi$, one obtains (5). This ends the proof of Theorem 3.

The moment conditions (16) in Theorem 1 follow as a corollary to Theorem 3. Since the odd angular modes are preserved upon addition with the modes of the angularly even function $Xf$, it suffices to prove (16) for the function $h$ in (4).

Corollary 1 Let $f\in L^1_0(\mathbf{S}^m;\Omega)$ be a real-valued, integrable symmetric tensor field of even order $m\ge 0$ satisfying (1), and let $h\in L^1_{\mathrm{sym}}(\Gamma\times\mathbb{S}^1)$ be defined by (4). Then its Fourier coefficients $\{h_{n,k}\}_{n,k\in\mathbb{Z}}$ satisfy
\[
h_{n,k}=(-1)^k h_{n+2k,-k},\quad\text{for all odd } n\le -m-1 \text{ and all } k\le 0.
\]

Proof We use (5) in Theorem 3 part (ii) with $|n|>m+p$. Since $m$ is even, $m+(p-n)$ and $p-n$ have the same parity. We consider two separate cases: $p$ and $n$ both even, and $p$ and $n$ both odd.

Case 1: $|n|>m+p$, and $p$ and $n$ both even. Since
\[
\mathrm{span}\{\cos\alpha\,(\sin\alpha)^{2j},\ 0\le j\le k\}=\mathrm{span}\{\cos[(2j+1)\alpha],\ 0\le j\le k\}\quad\text{for all } k\ge 0,
\]
the orthogonality in (5) for this case becomes
\[
\begin{aligned}
0&=\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}\cos[(p+1)\alpha]\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta\\
&=\frac{1}{2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta
+\frac{1}{2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{-i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta.
\end{aligned}
\tag{8}
\]
In the last equality of (8) let us consider the first term,
\[
\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta
\overset{\alpha=\theta-\beta}{=}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{i\theta(n+p+1)}e^{-i(p+1)\beta}\,h(e^{i\beta},e^{i\theta})\,d\theta\,d\beta=(2\pi)^2\,h_{-n-p-1,\,p+1}.
\tag{9}
\]
Similarly, the last term in (8) can be rewritten as
\[
\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{-i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta
\overset{\alpha=\theta-\beta}{=}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{i\theta(n-p-1)}e^{i(p+1)\beta}\,h(e^{i\beta},e^{i\theta})\,d\theta\,d\beta=(2\pi)^2\,h_{-n+p+1,\,-p-1}.
\tag{10}
\]
Using (9) and (10), the expression in (8) yields
\[
h_{-n-p-1,\,p+1}=-h_{-n+p+1,\,-p-1},\quad\text{for } n,p \text{ even},\ |n|>m+p,\ \text{and } p\ge 0.
\tag{11}
\]
By setting $k=-p-1$ (odd), $|n|\ge m+p+2=m-k+1$, and $r=-n-k$ (odd), we obtain
\[
h_{r,k}=(-1)^k h_{r+2k,-k},\quad\text{for all odd } k\le -1 \text{ and all odd } r\le -m-1.
\tag{12}
\]
Case 2: $|n|>m+p$, and $p$ and $n$ both odd. Since
\[
\mathrm{span}\{\cos\alpha\,(\sin\alpha)^{2j+1},\ 0\le j\le k\}=\mathrm{span}\{\sin[(2j+2)\alpha],\ 0\le j\le k\}\quad\text{for all } k\ge 0,
\]
the orthogonality in (5) for this case becomes
\[
\begin{aligned}
0&=\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}\sin[(p+1)\alpha]\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta\\
&=\frac{1}{2i}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta
-\frac{1}{2i}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{in(\beta+\alpha)}e^{-i(p+1)\alpha}\,h(e^{i\beta},e^{i(\alpha+\beta)})\,d\alpha\,d\beta.
\end{aligned}
\tag{13}
\]
Using (9) and (10), the expression in (13) yields
\[
h_{-n-p-1,\,p+1}=h_{-n+p+1,\,-p-1},\quad\text{for } n,p \text{ odd},\ |n|>m+p,\ \text{and } p\ge 0.
\]
By setting $k=-p-1$ (even), $|n|\ge m+p+2=m-k+1$, and $r=-n-k$ (odd), we obtain
\[
h_{r,k}=(-1)^k h_{r+2k,-k},\quad\text{for all even } k\le -2 \text{ and all odd } r\le -m-1.
\tag{14}
\]
In summary, (12) and (14) yield
\[
h_{r,k}=(-1)^k h_{r+2k,-k},\quad\text{for all } k\le -1 \text{ and all odd } r\le -m-1.
\]
Since the relation above trivially holds for $k=0$, we have shown that the moment conditions (16) hold.
4 The Algebraic Range of the X-Ray Transform of Even Order Tensors

In this section the square integrability of $Xf$ is needed. The following result gives a sufficient condition on $f$ to ensure square integrability of $Xf$.

Proposition 1 Let the components of $f\in L^2_0(\mathbf{S}^m;\Omega)$ satisfy
\[
\mathrm{supp}\,f_{i_1\dots i_m}\subset\bigl\{z\in\Omega:\ |z|\le\sqrt{1-\delta^2}\bigr\},\quad 0<\delta<1.
\tag{1}
\]
Then $Xf\in L^2(\Gamma\times\mathbb{S}^1)$ and, thus, the extension (9) is square integrable on $\Gamma\times\mathbb{S}^1$.
Proof Since $f$ is symmetric, for any $m$-tuple $(i_1,\dots,i_m)\in\{1,2\}^m$ such that 2 occurs exactly $k$ times (and 1 occurs $m-k$ times), the component $f_{i_1\dots i_m}$ satisfies
\[
f_{i_1\dots i_m}=f_{\underbrace{\scriptstyle 1\cdots 1}_{m-k}\underbrace{\scriptstyle 2\cdots 2}_{k}}=:\tilde f_k.
\tag{2}
\]
Since there are $\binom{m}{k}$ many $m$-tuples $(i_1,i_2,\dots,i_m)$ that contain exactly $k$ many 2's,
\[
\langle f(z),\theta^m\rangle=\sum_{k=0}^{m}\binom{m}{k}\tilde f_k(z)\,(\cos\theta)^{m-k}(\sin\theta)^{k}
=e^{-im\theta}\,2^{-m}\sum_{k=0}^{m}\binom{m}{k}(-i)^k\tilde f_k(z)\,Q_{m,k}(e^{2i\theta}),
\]
where $Q_{m,k}$ are the polynomials in (6). Since the order of the tensor is even, say $m=2l$ for some $l\ge 0$, we obtain
\[
\langle f,\theta^m\rangle=\sum_{k=0}^{l}f_{2k}\,e^{-i(2k)\theta}+\sum_{k=1}^{l}f_{-2k}\,e^{i(2k)\theta},
\tag{3}
\]
where the $f_k$'s are in a one-to-one correspondence with the $\tilde f_k$, and thus with the $f_{i_1\cdots i_m}$; see Sadiq and Tamasan (2023, Lemma A.1). Since all the components $f_{i_1\dots i_m}\in L^2_0(\Omega)$ satisfy the support condition (1), $f_{2k}\in L^2_0(\Omega)$ also satisfies the same support condition (1), for all $-l\le k\le l$. Using the identity (3), the X-ray transform of $f$ (with components extended by 0 outside $\Omega$) can be written as
\[
Xf=\sum_{k=-l}^{l}X_k(f_{2k}),
\]
where $X_k$ is the weighted ray transform
\[
X_k(f)(e^{i\beta},e^{i\theta}):=\int_{-\infty}^{\infty}f(e^{i\beta}+t\,e^{i\theta})\,e^{-i2k\theta}\,dt.
\]
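For any $f\in L^2_0(\Omega)$ satisfying the support condition (1), we show next that $X_k(f)\in L^2(\Gamma\times\mathbb{S}^1)$. In the estimate below (fourth equality) we denote by $f_\theta$ the function obtained from $f$ by a rotation of the domain by an angle $\theta$, $f_\theta(z):=f(ze^{i\theta})$. Note that $\|f_\theta\|^2_{L^2(\Omega)}=\|f\|^2_{L^2(\Omega)}$.
\[
\begin{aligned}
\|X_kf\|^2_{L^2(\Gamma\times\mathbb{S}^1)}
&=\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}|X_kf(e^{i\beta},e^{i\theta})|^2\,d\beta\,d\theta\\
&=\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\Bigl|\int_{-\infty}^{\infty}f(e^{i\beta}+te^{i\theta})e^{-2ik\theta}\,dt\Bigr|^2\,d\beta\,d\theta\\
&=\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\Bigl|\int_{-2}^{2}f(e^{i\beta}+te^{i\theta})e^{-2ik\theta}\,dt\Bigr|^2\,d\beta\,d\theta\\
&\le 4\,\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\int_{-2}^{2}|f(e^{i\beta}+te^{i\theta})|^2\,dt\,d\beta\,d\theta\\
&=\frac{1}{\pi^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\int_{-2}^{2}|f_\theta(e^{i(\beta-\theta)}+t)|^2\,dt\,d\beta\,d\theta\\
&\overset{\alpha=\beta-\theta}{=}\frac{2}{\pi^2}\int_{-\pi}^{\pi}\int_{-\pi/2}^{\pi/2}\int_{-2}^{2}|f_\theta(e^{i\alpha}+t)|^2\,dt\,d\alpha\,d\theta\\
&\overset{s=\sin\alpha}{=}\frac{2}{\pi^2}\int_{-\pi}^{\pi}\int_{-2}^{2}\int_{-1}^{1}\frac{|f_\theta(\sqrt{1-s^2}+t+is)|^2}{\sqrt{1-s^2}}\,ds\,dt\,d\theta\\
&\overset{u=t+\sqrt{1-s^2}}{=}\frac{2}{\pi^2}\int_{-\pi}^{\pi}\int_{-1}^{1}\int_{-1}^{1}\frac{|f_\theta(u+is)|^2}{\sqrt{1-s^2}}\,ds\,du\,d\theta,
\end{aligned}
\tag{4}
\]
where the third equality uses that $\Omega$ has diameter 2, and the last equality uses the fact that the unit disk lies inside any rotated circumscribed square. Since $\mathrm{supp}\,f_\theta\subset\{z:\ |z|\le\sqrt{1-\delta^2}\}$, regardless of the rotation angle $\theta$,
\[
\begin{aligned}
\|X_kf\|^2_{L^2(\Gamma\times\mathbb{S}^1)}
&\le\frac{2}{\pi^2}\int_{-\pi}^{\pi}\int_{-1}^{1}\int_{-\sqrt{1-\delta^2}}^{\sqrt{1-\delta^2}}\frac{|f_\theta(u+is)|^2}{\sqrt{1-s^2}}\,ds\,du\,d\theta\\
&\le\frac{2}{\pi^2\delta}\int_{-\pi}^{\pi}\|f_\theta\|^2_{L^2(\Omega)}\,d\theta=\frac{4}{\pi\delta}\|f\|^2_{L^2(\Omega)}.
\end{aligned}
\]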
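Two elementary steps carry the estimate above, and it may help to see them written out. The inequality with the factor 4 is the Cauchy–Schwarz inequality on the interval $[-2,2]$; here $F$ stands for the integrand $t\mapsto f(e^{i\beta}+te^{i\theta})e^{-2ik\theta}$ (our shorthand, not notation from the text):
\[
\Bigl|\int_{-2}^{2}F(t)\,dt\Bigr|^2\le\Bigl(\int_{-2}^{2}1\,dt\Bigr)\Bigl(\int_{-2}^{2}|F(t)|^2\,dt\Bigr)=4\int_{-2}^{2}|F(t)|^2\,dt .
\]
The factor $1/\delta$ in the final bound comes from the support condition (1): on the range of integration,
\[
|s|\le\sqrt{1-\delta^2}\quad\Longrightarrow\quad\sqrt{1-s^2}\ge\delta\quad\Longrightarrow\quad\frac{1}{\sqrt{1-s^2}}\le\frac{1}{\delta},
\]
and then $\frac{2}{\pi^2\delta}\cdot 2\pi\,\|f\|^2_{L^2(\Omega)}=\frac{4}{\pi\delta}\|f\|^2_{L^2(\Omega)}$.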
The constraints in Theorem 1 determine a closed subspace of $L^2(\Gamma\times\mathbb{S}^1)$, which contains the range of the X-ray transform of square integrable tensors.

Definition 1 Let $m\ge 0$ be even. The algebraic range $AR(X)$ of the X-ray transform of symmetric $m$-tensors is defined by those $g\in L^2(\Gamma\times\mathbb{S}^1)$ with Fourier coefficients satisfying (13), (14), (15), and (16).

Since $m$ is even, (13) yields that only the odd angular modes need be considered. To further describe the algebraic interaction between (14), (15), and (16), let us consider the three-set partition $\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}=G\cup R\cup W$ (see Fig. 3) introduced in Sadiq and Tamasan (2023):

• The region $W:=W^+\cup W^-$, where
\[
\begin{aligned}
W^+&:=\Bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^+:\ \text{odd } n\le -m-1,\ \text{and } 0\le k\le -\frac{n+m+1}{2}\Bigr\},\\
W^-&:=\bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^-:\ \text{odd } n\le -m-1,\ \text{and } k\le 0\bigr\}.
\end{aligned}
\tag{5}
\]
• The region $G:=G_L\cup G_R\cup\Delta$, where
\[
\begin{aligned}
G_L&:=\bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^+:\ k\ge 1 \text{ and } -2k+1\le n<-k,\ n \text{ odd}\bigr\},\\
G_R&:=\bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^+:\ k\ge 1 \text{ and } n>-k,\ n \text{ odd}\bigr\},\\
\Delta&:=\{(-k,k):\ \text{odd } k\ge 1\}.
\end{aligned}
\tag{6}
\]
Fig. 3 (lattice diagram; horizontal axis $n$, vertical axis $k$) An even order $m$-tensor field $f$ is determined by the odd negative angular modes on or above the diagonal $k=-n$ (green region), and the odd negative angular modes (marked red) on the $\frac{m}{2}$ red lines $n+2k=-(m+1)$ for $k\ge 0$. All the odd non-positive angular modes on and below the line $n+2k=-(m+1)$, and on and left of the line $n=-(m+1)$, vanish. For $n\ge 0$ the picture is symmetric with respect to the origin
• For $m\ge 2$, the region $R=R^+\cup R^-$ with
\[
\begin{aligned}
R^+&:=\bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^+:\ k\ge 1,\ \text{and } -m+1-2k\le n\le -1-2k,\ n \text{ odd}\bigr\},\\
R^-&:=\bigl\{(n,k)\in\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}^-:\ k\le 0,\ \text{and } -m+1\le n\le -1,\ n \text{ odd}\bigr\}.
\end{aligned}
\tag{7}
\]
If $m=0$ (the case considered in the numerical experiments in Sect. 5), then $R=\emptyset$ and $\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}=W\cup G$; see Fig. 4. Note the invariance of $G$ under the transformation $(n,k)\mapsto(-n-2k,k)$, and that of $R$ (for $m\ge 2$) under the transformation $(n,k)\mapsto(n+2k,-k)$. More precisely, if $(n,k)\in G_L$, then $(-n-2k,k)\in G_R$, and conversely, if $(n,k)\in G_R$, then $(-n-2k,k)\in G_L$. Similarly, if $(n,k)\in R^+$, then $(n+2k,-k)\in R^-$, and conversely, if $(n,k)\in R^-$, then $(n+2k,-k)\in R^+$. For a given integer $N\ge 1$, we also consider the finite sub-lattice $I_N$ of points inside the rectangle $[-m+1-2N,\,m-1+2N]\times[-N,N]$,
\[
I_N=\{(n,k)\in\mathbb{Z}_{\mathrm{odd}}\times\mathbb{Z}:\ n \text{ odd},\ |n|\le m-1+2N,\ \text{and } |k|\le N\}.
\tag{8}
\]
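The partition in (5)–(7) is easy to mistype, so a small classifier can serve as a sanity check. The sketch below is our own illustration (the function names and the label strings are not from the text); it assigns each negative odd mode to its region and is then used to test the stated invariances on a finite sub-lattice. The name "Delta" stands for the anti-diagonal set in (6).

```python
def classify(n, k, m):
    """Return the region label of the lattice point (n, k), with n odd and negative.

    Labels follow (5)-(7): "W+", "W-", "G_L", "G_R", "Delta", "R+", "R-".
    For k = 0 and n <= -m - 1 the point lies in both W+ and W-; "W-" is reported.
    """
    if m < 0 or m % 2 != 0:
        raise ValueError("m must be a non-negative even integer")
    if n >= 0 or n % 2 == 0:
        raise ValueError("n must be a negative odd integer")
    if k <= 0:
        return "W-" if n <= -m - 1 else "R-"
    if n + 2 * k <= -(m + 1):
        return "W+"
    if n + 2 * k <= -1:          # here -m + 1 - 2k <= n <= -1 - 2k
        return "R+"
    if n < -k:                   # here -2k + 1 <= n < -k
        return "G_L"
    if n == -k:
        return "Delta"
    return "G_R"


def lattice(m, N):
    """Negative odd half of the finite sub-lattice I_N in (8)."""
    n_min = -(m - 1 + 2 * N)
    return [(n, k) for n in range(n_min, 0, 2) for k in range(-N, N + 1)]


labels = {(n, k): classify(n, k, 4) for (n, k) in lattice(4, 6)}
print(sorted(set(labels.values())))
```

For $m=0$ the labels "R+" and "R-" never occur, in line with $R=\emptyset$. Note that a point of $R^-$ with $k=0$ is mapped to itself by $(n,k)\mapsto(n+2k,-k)$, so the check of that invariance is restricted to $k<0$.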
Fig. 4 (lattice diagram; horizontal axis $n$, vertical axis $k$) The 0-tensor $f$ is determined by the odd negative angular modes on or above the diagonal $k=-n$. The diagonal modes $g_{n,-n}$ are real valued. All the odd non-positive angular modes on and below the line $n+2k=-1$ vanish

In the case of even order tensors we only work with functions that are angularly odd. For brevity, we denote
\[
L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1):=\bigl\{g\in L^2(\Gamma\times\mathbb{S}^1):\ g(e^{i\beta},e^{i\theta})=-g(e^{i\beta},-e^{i\theta})\bigr\}.
\]
Following directly from (14), (15), and (16), $g\in L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)\cap AR(X)$ if and only if for all odd $n\le -1$,
\[
g_{n,k}=\begin{cases}
0, & \text{if } (n,k)\in W,\\
(-1)^{1+k}g_{-n-2k,k}, & \text{if } (n,k)\in G,\\
(-1)^{1+k}g_{n+2k,-k}, & \text{if } (n,k)\in R.
\end{cases}
\tag{9}
\]
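Note also that (13) already implies $L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)\cap AR(X)=L^2(\Gamma\times\mathbb{S}^1)\cap AR(X)$. The following result provides the theoretical support of the denoising method proposed in Sect. 5.

Theorem 4 Let $g\in L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)$ be an angularly odd real-valued function with Fourier coefficients $\{g_{n,k}\}$. For some fixed even $m\ge 0$, consider the partition $G\cup R\cup W$ of $\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}$ with $W$ in (5), $G$ in (6), and $R$ in (7), and define the new function
\[
g^*(e^{i\beta},e^{i\theta}):=2\,\mathrm{Re}\Biggl\{\sum_{\text{odd } n\le -1}\ \sum_{k=-\infty}^{\infty}g^*_{n,k}\,e^{in\theta}e^{ik\beta}\Biggr\},
\tag{10}
\]
where
\[
g^*_{n,k}=\begin{cases}
0, & \text{if } (n,k)\in W,\\
\frac{1}{2}\bigl[g_{n,k}+(-1)^{1+k}g_{-n-2k,k}\bigr], & \text{if } (n,k)\in G,\\
\frac{1}{2}\bigl[g_{n,k}+(-1)^{1+k}g_{n+2k,-k}\bigr], & \text{if } (n,k)\in R.
\end{cases}
\tag{11}
\]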
Then
\[
g^*=\mathrm{argmin}\bigl\{\|g-h\|^2_{L^2(\Gamma\times\mathbb{S}^1)}:\ h\in L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)\cap AR(X)\bigr\}.
\tag{12}
\]
Moreover, for any $N\ge 1$ arbitrarily fixed, the band limited approximation
\[
g^*_N(e^{i\beta},e^{i\theta}):=2\,\mathrm{Re}\Biggl\{\sum_{n=-m+1-2N}^{-1}\ \sum_{k=-N}^{N}g^*_{n,k}\,e^{in\theta}e^{ik\beta}\Biggr\}
\tag{13}
\]
is the X-ray transform of some symmetric $m$-tensor in $C^1(\mathbf{S}^m;\Omega)$.

Proof Following directly from its definition in (10), it is easy to see that $g^*\in AR(X)\cap L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)$. Also following directly from the definition (10), on the anti-diagonal $\Delta$ the Fourier coefficients of $g$ and $g^*$ coincide:
\[
g^*_{-k,k}=g_{-k,k},\quad\text{for odd } k\ge 0.
\tag{14}
\]
Moreover, since $g$ is angularly odd, $g_{2n,k}=0$ for all $n\in\mathbb{Z}$. Now let $h\in L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)\cap AR(X)$ be arbitrary. The Fourier coefficients $h_{n,k}$ of $h$ then satisfy (9). In particular, they vanish in $W$. Moreover, the third condition in (9) also yields $h_{n,0}=0$ for all odd $n\le -1$. Since $g$ is real valued, (14) holds and the Fourier modes $\{g_{n,k}\}$ for $n\le 0$ and $k\in\mathbb{Z}$ determine half the norm. We estimate
\[
\begin{aligned}
&\frac{1}{2}\|g-h\|^2_{L^2(\Gamma\times\mathbb{S}^1)}-\sum_{(n,k)\in W}|g_{n,k}|^2-\sum_{n=-m+1}^{-1}|g_{n,0}|^2\\
&\quad=\sum_{(n,k)\in G}|g_{n,k}-h_{n,k}|^2+\sum_{(n,k)\in R}|g_{n,k}-h_{n,k}|^2\\
&\quad=\sum_{(n,k)\in G_L\cup G_R}|g_{n,k}-h_{n,k}|^2+\sum_{(-k,k)\in\Delta}|g_{-k,k}-h_{-k,k}|^2
+\sum_{(n,k)\in R^+}|g_{n,k}-h_{n,k}|^2+\sum_{(n,k)\in R^-}|g_{n,k}-h_{n,k}|^2\\
&\quad\ge\sum_{(n,k)\in G_L}\Bigl(|g_{n,k}-h_{n,k}|^2+|g_{-n-2k,k}-h_{-n-2k,k}|^2\Bigr)
+\sum_{(n,k)\in R^+}\Bigl(|g_{n,k}-h_{n,k}|^2+|g_{n+2k,-k}-h_{n+2k,-k}|^2\Bigr)\\
&\quad=\sum_{(n,k)\in G_L}\Bigl(|g_{n,k}-h_{n,k}|^2+|g_{-n-2k,k}-(-1)^{1+k}h_{n,k}|^2\Bigr)
+\sum_{(n,k)\in R^+}\Bigl(|g_{n,k}-h_{n,k}|^2+|g_{n+2k,-k}-(-1)^{1+k}h_{n,k}|^2\Bigr)\\
&\quad=\sum_{(n,k)\in G_L}\Bigl(|g_{n,k}-h_{n,k}|^2+|(-1)^{1+k}g_{-n-2k,k}-h_{n,k}|^2\Bigr)
+\sum_{(n,k)\in R^+}\Bigl(|g_{n,k}-h_{n,k}|^2+|(-1)^{1+k}g_{n+2k,-k}-h_{n,k}|^2\Bigr).
\end{aligned}
\tag{15}
\]
To show that $g^*$ is a minimizer of the functional defined by the right-hand side above, we use the simple geometric fact: if $z_0,z_1\in\mathbb{C}$, then their midpoint satisfies
\[
\frac{z_0+z_1}{2}=\mathrm{argmin}\bigl\{|z-z_0|^2+|z-z_1|^2:\ z\in\mathbb{C}\bigr\}
\tag{16}
\]
and the minimum value is $\frac{1}{2}|z_0-z_1|^2$.

By applying (16) to each term $(n,k)\in G_L\cup R^+$ in (15), we further estimate
\[
\begin{aligned}
&\frac{1}{2}\|g-h\|^2_{L^2(\Gamma\times\mathbb{S}^1)}-\sum_{(n,k)\in W}|g_{n,k}|^2-\sum_{n=-m+1}^{-1}|g_{n,0}|^2\\
&\quad\ge\sum_{(n,k)\in G_L}\Bigl(|g_{n,k}-g^*_{n,k}|^2+|(-1)^{1+k}g_{-n-2k,k}-g^*_{n,k}|^2\Bigr)
+\sum_{(n,k)\in R^+}\Bigl(|g_{n,k}-g^*_{n,k}|^2+|(-1)^{1+k}g_{n+2k,-k}-g^*_{n,k}|^2\Bigr)\\
&\quad=\sum_{(n,k)\in G}|g_{n,k}-g^*_{n,k}|^2+\sum_{(n,k)\in R}|g_{n,k}-g^*_{n,k}|^2\\
&\quad=\frac{1}{2}\|g-g^*\|^2_{L^2(\Gamma\times\mathbb{S}^1)}-\sum_{(n,k)\in W}|g_{n,k}|^2-\sum_{n=-m+1}^{-1}|g_{n,0}|^2.
\end{aligned}
\]
The next to the last equality above uses the fact in (14) that the anti-diagonal Fourier coefficients of $g$ and $g^*$ coincide. Since the functional in (12) is strictly convex, $g^*$ is the unique minimizer. Moreover, $g^*$ is the orthogonal projection of $g$ onto $AR(X)$. The band limited approximation $g^*_N$ lies in $AR(X)$ and trivially satisfies the decay condition (17) in the sufficiency part of Theorem 1 with $\mu=1$. Thus, $g^*_N$ is the X-ray transform of some symmetric tensor in $C^1(\mathbf{S}^m;\Omega)$. Since $\lim_{N\to\infty}\|g^*-g^*_N\|_{L^2(\Gamma\times\mathbb{S}^1)}=0$, we have also shown that $AR(X)$ is the closure in $L^2(\Gamma\times\mathbb{S}^1)$ of the range of the X-ray transform of symmetric tensors in $C^1(\mathbf{S}^m;\Omega)$.
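The geometric fact (16) used in the proof can be verified in one line. Writing $c=\frac{z_0+z_1}{2}$ and $d=\frac{z_0-z_1}{2}$ (our notation), we have $z-z_0=(z-c)-d$ and $z-z_1=(z-c)+d$, so the parallelogram identity gives
\[
|z-z_0|^2+|z-z_1|^2=2\,|z-c|^2+2\,|d|^2=2\,\Bigl|z-\frac{z_0+z_1}{2}\Bigr|^2+\frac{1}{2}|z_0-z_1|^2 .
\]
The right-hand side is smallest exactly when $z$ is the midpoint, and the minimum value is $\frac{1}{2}|z_0-z_1|^2$, as stated.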
5 Numerical Experiments for the 0-Order Case

We present three numerical examples to illustrate the effect of using the projection method in the inversion of the X-ray transform of functions of compact support in the unit disk. In the 0-order case, recall that the region $R$ in the partition of the lattice $\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}$ is empty and, thus, $\mathbb{Z}^-_{\mathrm{odd}}\times\mathbb{Z}=W\cup G$; see Fig. 4.

For $K$ and $N$ positive even integers let $\Delta\beta=2\pi/K$, $\Delta\theta=2\pi/N$ and
\[
\beta_\ell=-\pi+\ell\,\Delta\beta,\quad 0\le\ell<K,\qquad\theta_m=-\pi+m\,\Delta\theta,\quad 0\le m<N,
\]
which are equi-spaced points in the interval $[-\pi,\pi)$. In practice one is given some (noisy) data $Xf^+=g^+\in L^2(\Gamma_+)$ in the fan-beam coordinates, along lines passing through $e^{i\beta_\ell}\in\partial\Omega$ and going in the $\theta_m$-directions. More precisely, our numerical experiment proceeds as follows:

1. We construct the discretized angularly odd extension $g\in L^2_{\mathrm{odd}}(\Gamma\times\mathbb{S}^1)$ in (9) via
\[
g(e^{i\beta_\ell},e^{i\theta_m})=\begin{cases}
g^+(e^{i\beta_\ell},e^{i\theta_m}), & \text{if } (e^{i\beta_\ell},e^{i\theta_m})\in\Gamma_+,\\
-g^+(e^{i\beta_\ell},-e^{i\theta_m}), & \text{if } (e^{i\beta_\ell},e^{i\theta_m})\in\Gamma_-.
\end{cases}
\tag{1}
\]
2. Compute the Fourier coefficients (see (10)) by
\[
g_{n,k}=\frac{\Delta\theta\,\Delta\beta}{(2\pi)^2}\sum_{\ell=0}^{K-1}\sum_{m=0}^{N-1}g(e^{i\beta_\ell},e^{i\theta_m})\,e^{-in\theta_m}e^{-ik\beta_\ell},\quad |n|<N,\ |k|<K.
\tag{2}
\]
3. Denoise $Xf^+$ by the algebraic range condition. See (13), (14), (9), and Fig. 4.
   a. $g^*_{n,k}=0$ for non-positive even $n>-N$ (including $n=0$) and all $|k|<K$.
   b. $\mathrm{Im}\,g^*_{-n,n}=0$, for positive $n<\min\{K,N\}$.
   c. $g^*_{n,k}=0$ for $(n,k)\in W\cap\{-N<n<0,\ |k|<K\}$.
   d. For $(n,k)\in G_L\cap\{-N<n<0\}$,
\[
g^*_{n,k}=\frac{1}{2}\bigl[g_{n,k}+(-1)^{k+1}g_{-n-2k,k}\bigr]\quad\text{and}\quad g^*_{-n-2k,k}=(-1)^{k+1}g^*_{n,k}.
\]
4. Modify $\{g^*_{n,k}\}$ as an image of Fourier coefficients of a real sequence.
   a. $g^*_{-N-n,K-k}=0$ for negative odd $n>-N$, and $k$ with $0<k<-(n+1)/2$.
   b. Take the average of $g^*_{n,k}$ and $g^*_{-N-n,K-k}$, i.e., they are set as
\[
\frac{1}{2}\bigl[g^*_{n,k}+g^*_{-N-n,K-k}\bigr]
\]
      for $1\le k<K$, $-N/2<n\le -1$, odd $n$, with $-1<n+2k<2K-N$.
5. Compute the denoised X-ray transform image $Xf^*_{\ell m}$ by
\[
Xf^*_{\ell m}=\sum_{n=-N+1}^{-1}\ \sum_{k=0}^{K-1}g^*_{n,k}\,e^{in\theta_m}e^{ik\beta_\ell},
\tag{3}
\]
   for $0\le\ell<K$ and $0\le m<N$ with $(e^{i\beta_\ell},e^{i\theta_m})\in\Gamma_+$.
Fig. 5 (table of coefficients; horizontal axis $n=-7,\dots,0$, vertical axis $k=0,\dots,7$) Example of denoised data for $N=K=8$. For each element, the lower entry expresses the algebraic range condition, while the upper entry should be satisfied as an image of the discrete Fourier transform. It also follows that the entries in the blue region should be zero, since those in the red region are zero by the algebraic range condition. Zero entries $0_{ab}$ in the blue region correspond to the same ones in the red region. The discrete Fourier transform also requires $\alpha=-\zeta$, etc.
6. Reconstruct the function $f$ from $\{Xf^*_{\ell m}\}$.
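Steps 2 and 3 of the procedure above can be sketched in a few lines for the 0-order case. This is a minimal illustration under our own assumptions, not the implementation used for the experiments: it assumes NumPy, stores the coefficients in a dictionary keyed by $(n,k)$, uses hypothetical function names, evaluates the sums directly instead of with a fast Fourier transform, and leaves out the aliasing treatment of Step 4.

```python
import numpy as np


def fourier_coefficients(g, n_max, k_max):
    """Step 2: discrete Fourier coefficients g_{n,k} of the samples g[l, m].

    g[l, m] holds g(e^{i beta_l}, e^{i theta_m}) on the grid
    beta_l = -pi + 2 pi l / K, theta_m = -pi + 2 pi m / N.
    """
    K, N = g.shape
    beta = -np.pi + 2.0 * np.pi * np.arange(K) / K
    theta = -np.pi + 2.0 * np.pi * np.arange(N) / N
    coeffs = {}
    for n in range(-n_max, n_max + 1):
        for k in range(-k_max, k_max + 1):
            phase = np.outer(np.exp(-1j * k * beta), np.exp(-1j * n * theta))
            # (Delta theta * Delta beta) / (2 pi)^2 equals 1 / (K N).
            coeffs[(n, k)] = np.sum(g * phase) / (K * N)
    return coeffs


def project_zero_order(coeffs, n_max, k_max):
    """Steps 3a-3d for m = 0 on the modes -n_max <= n <= 0, |k| <= k_max."""
    if n_max < 2 * k_max - 1:
        raise ValueError("need n_max >= 2 k_max - 1 so that every G_L mode is available")
    star = {}
    for n in range(-n_max, 1):
        for k in range(-k_max, k_max + 1):
            if n % 2 == 0:
                star[(n, k)] = 0j                              # 3a: even modes vanish
            elif n + 2 * k <= -1:
                star[(n, k)] = 0j                              # 3c: region W
            elif n == -k:
                star[(n, k)] = complex(coeffs[(n, k)].real)    # 3b: real diagonal
            elif n < -k:                                       # 3d: (n, k) in G_L
                sign = (-1) ** (k + 1)
                value = 0.5 * (coeffs[(n, k)] + sign * coeffs[(-n - 2 * k, k)])
                star[(n, k)] = value
                star[(-n - 2 * k, k)] = sign * value           # partner in G_R
    return star
```

A quick way to exercise the sketch is to feed it arbitrary coefficients and confirm that the output satisfies the constraints, for instance $g^*_{-1,2}=-g^*_{-3,2}$, and that applying the projection twice changes nothing.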
Figure 5 illustrates the coefficients denoised by the above procedure in $n\le 0$ and $k\ge 0$ for the case $K=N=8$. For even $n$ including $n=0$, $g^*_{n,k}$ is zero by (3a). The diagonals $g^*_{-1,1}$, $g^*_{-3,3}$, $g^*_{-5,5}$, and $g^*_{-7,7}$ are real by (3b). The Greek characters recall the relation in (3d), which is numerically satisfied.

Since $\{g^*_{n,k}\}$ is supposed to be an image of the discrete Fourier transform (2), the coefficients should satisfy $g^*_{n,k}=g^*_{N-n,K-k}$ for all $n$ and $k$, which is indicated by the upper entry of each element in Fig. 5. This is realized in the steps (4a) and (4b). Note that the process (4b) is consistent with the algebraic range condition. For instance, in the step (4b) we modify $g^*_{-1,2}$, $g^*_{-3,2}$, $g^*_{-5,6}$, and $g^*_{-7,6}$, which continue to satisfy $g^*_{-1,2}=-g^*_{-3,2}$ and $g^*_{-5,6}=-g^*_{-7,6}$.

We present three numerical experiments with different types of noisy data. We generate the data for a function $f$ modeled on a modified Shepp–Logan phantom in $E=\{(x_1/0.69)^2+(x_2/0.92)^2<1\}$ inscribed in the unit disk $\Omega$. Outside $E$ we set $f\equiv 0$. In the reconstructions below we use the numerical algorithm in Fujiwara and
Fig. 6 Sinogram generated by specifying uniform random noise of 20% magnitude, which has 11.5% relative error in the L 2 sense (left) and denoised one by the projection to A R(X ) with 5.9% relative L 2 error (right)
Fig. 7 Reconstruction results; (left) Reconstruction from noisy data X f + in the left figure of Fig. 6, (right) Reconstruction from denoised data X f ∗ by the projection onto A R(X ) depicted in the right one in Fig. 6. X f + contains uniform random noise of 20% magnitude
Tamasan (2019), based on the A-analytic theory (Bukhgeim 1995). This avoids the interpolation error that would occur in translating the data from the torus to the tangent bundle of the circle, the latter being needed in a standard filtered back projection algorithm. For discretization we use $K=N=256$.

In the first experiment the exact data $Xf_{\mathrm{exact}}$ is corrupted by additive uniformly distributed noise $\delta$ of 20% magnitude, which amounts to approximately 11.5% in the relative $L^2$ sense; the sinogram is shown on the left of Fig. 6. The reconstruction is performed from this noisy data $Xf^+:=Xf_{\mathrm{exact}}+\delta$ in two different ways, shown in Fig. 7. On the left, the reconstruction is obtained from the “raw” data $Xf^+$. This reconstruction (restricted to the elliptical region $E$) has a 53.1% relative error in the $L^2$ sense. The reconstruction in Fig. 7 on the right is obtained by inverting the projection of $Xf^+$ on the range $AR(X)$. This reconstruction has an error of 34.5% in the relative $L^2$ sense.

Next we show the results in the two extreme scenarios: the worst case, when the entire noise lies in the algebraic range, and the best case, when the entire noise is orthogonal to the algebraic range.
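The error figures quoted for these experiments are relative $L^2$ errors. The sketch below shows one way such a number could be computed from sampled data, using a discrete norm as a stand-in for the $L^2$ norm (our assumption; the helper name is hypothetical). It also checks that the sinogram errors reported in the captions of Figs. 6, 8 and 10 (11.5%, 5.8% and 10.0%) are compatible with an orthogonal splitting of the noise, for which squared norms add up.

```python
import math

import numpy as np


def relative_l2_error(approx, exact):
    """||approx - exact|| / ||exact|| for sampled data of equal shape."""
    approx = np.asarray(approx, dtype=float)
    exact = np.asarray(exact, dtype=float)
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)


# Relative sinogram errors in percent, as quoted in the figure captions.
total_noise = 11.5        # X f_exact + delta            (Fig. 6, left)
in_range_part = 5.8       # X f_exact + delta^{AR}       (Fig. 8, left)
orthogonal_part = 10.0    # X f_exact + delta^{perp}     (Fig. 10, left)

# For orthogonal components, the squared norms add up (Pythagoras).
combined = math.hypot(in_range_part, orthogonal_part)
print(f"sqrt(5.8^2 + 10.0^2) = {combined:.2f}, reported total = {total_noise}")
```

The combined value is about 11.56, which matches the reported 11.5% to within the rounding of the quoted figures.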
In order to simulate noise which lies entirely in the algebraic range $AR(X)$, or entirely in $AR(X)^{\perp}$, we use the knowledge of $Xf_{\mathrm{exact}}$ and decompose the existing noise $\delta$ into its components $\delta^{AR}\in AR(X)$ and $\delta^{\perp}\in AR(X)^{\perp}$ as follows. For $\delta:=Xf^+-Xf_{\mathrm{exact}}$ compute its Fourier coefficients $\delta_{n,k}$ as in (2). Then find $\delta^{AR}_{n,k}$ by projecting $\delta_{n,k}$ on $AR(X)$ via Step 3 of the algorithm. The component $\delta^{AR}$ is found by Step 4 in the algorithm and the discrete inverse Fourier transform of $\{\delta^{AR}_{n,k}\}$ via (3). We also set $\delta^{\perp}:=\delta-\delta^{AR}$.

Figure 8 on the left shows the data $Xf_{\mathrm{exact}}+\delta^{AR}$, corrupted by noise lying entirely in the algebraic range, and Fig. 9 on the left displays the reconstruction from this data. The reconstruction in Fig. 9 on the right first performs a projection on $AR(X)$ and then performs the inversion from it. Both reconstructions (restricted to the elliptical region $E$) in Fig. 9 contain approximately 34.5% relative $L^2$ error, confirming that, in this worst-case scenario, the projection method does not bring any improvement. Indeed, this was expected, since the noise in the data was artificially created to lie in $AR(X)$.

The best case scenario is when the entire noise happens to be orthogonal to the algebraic range. Figure 10 on the left depicts such noisy data $Xf_{\mathrm{exact}}+\delta^{\perp}$, while
Fig. 8 Sinogram of X f exact + δ A R with 5.8% relative L 2 error (left) and denoised one by the projection with 5.9% relative L 2 error (right)
Fig. 9 Worst-case scenario: The data X f exact + δ A R has 5.8% relative L 2 -error. Left: the reconstruction from this data has 34.5% error in E. Right: the reconstruction from denoised data also has 34.5% error in E. Since the noise was entirely in the algebraic range, the projection is redundant
Fig. 10 Sinogram of X f exact + δ ⊥ with 10.0% relative L 2 error (left) and denoised one by the projection with 1.2% relative L 2 error (right)
Fig. 11 Best case scenario: The data X f exact + δ ⊥ has 10.0% relative L 2 -error. Left: the reconstruction from this data has 45.2% relative L 2 -error. Right: the reconstruction via the proposed denoising method has 19.4% relative L 2 -error. Since the noise was entirely orthogonal to the algebraic range, the projection method is most effective
Fig. 11 on the left depicts the reconstruction from this “raw” data. In contrast, Fig. 10 on the right shows the projection of the noisy data on the algebraic range, while Fig. 11 on the right shows the inversion from this projection. The reconstruction from the noisy data in Fig. 11 on the left has a 45.2% error in the relative $L^2$ sense, whereas the reconstruction in Fig. 11 on the right has a 19.4% relative $L^2$ error. In this best case scenario, the reconstruction shown in the right figure is dramatically improved by our “denoising” algorithm. We also note that in this best case scenario the raw data $Xf_{\mathrm{exact}}+\delta^{\perp}$ had a larger error than $Xf_{\mathrm{exact}}+\delta^{AR}$, but because the noise was orthogonal to the range, the reconstruction from its projection on $AR(X)$ gave an accurate reconstruction image; see Fig. 11 on the right.

Acknowledgements The work of H. Fujiwara was supported by JSPS KAKENHI Grant Numbers JP20H01821 and JP22K18674. The work of K. Sadiq was supported by the Austrian Science Fund (FWF), Project P31053-N32. The work of A. Tamasan was supported in part by the National Science Foundation DMS-1907097.
References

Bukhgeim AL (1995) Inversion formulas in inverse problems. In: Lavrentiev MM, Savalev LY (eds) Linear operators and Ill-posed problems. Plenum, New York, pp 323–378
Fujiwara H, Tamasan A (2019) Numerical realization of a new generation tomography algorithm based on the Cauchy-type integral formula. Adv Math Sci Appl 28(2):413–424
Gel’fand IM, Graev MI (1960) Integrals over hyperplanes of basic and generalized functions. Soviet Math Dokl 1:1369–1372
Gompel GV, Defrise M, Dyck DV (2006) Elliptical extrapolation of truncated 2D CT projections using Helgason-Ludwig consistency conditions. In: Flynn MJ, Hsieh J (eds) Medical imaging 2006: physics of medical imaging, vol 6142. International Society for Optics and Photonics, SPIE, p 61424B. https://doi.org/10.1117/12.653293
Helgason S (1965) The Radon transform on Euclidean spaces, compact two-point homogeneous spaces and Grassmann manifolds. Acta Math 113:153–180. https://doi.org/10.1007/BF02391776
John F (1938) The ultrahyperbolic differential equation with four independent variables. Duke Math J 4(2):300–322
Karp JS, Muehllehner G, Lewitt RM (1988) Constrained Fourier space method for compensation of missing data in emission computed tomography. IEEE Trans Med Imaging 7(1):21–25. https://doi.org/10.1109/42.3925
Kudo H, Saito T (1991) Sinogram recovery with the method of convex projections for limited-data reconstruction in computed tomography. J Opt Soc Am A Opt Image Sci Vis 8(7):1148–1160. https://doi.org/10.1364/JOSAA.8.001148
Ludwig D (1966) The Radon transform on Euclidean space. Commun Pure Appl Math 19:49–81. https://doi.org/10.1002/cpa.3160190207
Monard F (2016) Efficient tensor tomography in fan-beam coordinates. Inverse Probl Imaging 10(2):433–459. https://doi.org/10.3934/ipi.2016007
Nadirashvili NS, Sharafutdinov VA, Vlăduţ SG (2016) The John equation for tensor tomography in three-dimensions. Inverse Probl 32(10):105013
Pantyukhina EY (1990) Description of the image of a ray transformation in the two-dimensional case. In: Methods for solving inverse problems (Russian). Akad. Nauk SSSR Sibirsk. Otdel., Inst. Mat., Novosibirsk, pp 80–89, 144
Patch SK (2001) Moment conditions indirectly improve image quality. In: Radon transforms and tomography (South Hadley, MA, 2000), Contemporary Mathematics, vol 278. American Mathematical Society, Providence, RI, pp 193–205. https://doi.org/10.1090/conm/278/04605
Pestov L, Uhlmann G (2004) On characterization of the range and inversion formulas for the geodesic X-ray transform. Int Math Res Not IMRN 2004(80):4331–4347. https://doi.org/10.1155/S1073792804142116
Sadiq K, Tamasan A (2015) On the range of the attenuated Radon transform in strictly convex sets. Trans Am Math Soc 367(8):5375–5398. https://doi.org/10.1090/S0002-9947-2014-06307-1
Sadiq K, Tamasan A (2022) On the range of the planar X-ray transform on the Fourier lattice of the torus. arXiv:2201.10926, under review
Sadiq K, Tamasan A (2023) On the range of the X-ray transform of symmetric tensors compactly supported in the plane. Inverse Probl Imaging 17(3):660–685. https://doi.org/10.3934/ipi.2022070
Sharafutdinov VA (1994) Integral geometry of tensor fields. Inverse Ill-posed Probl Ser, 1. VSP, Utrecht. https://doi.org/10.1515/9783110900095
Xia Y, Berger M, Bauer S, Hu S, Aichert A, Maier A (2017) An improved extrapolation scheme for truncated CT data using 2D Fourier-based Helgason-Ludwig consistency conditions. Int J Biomed Imaging, pp 1867025, 14 pages. https://doi.org/10.1155/2017/1867025
Yu H, Wang G (2007) Data consistency based rigid motion artifact reduction in fan-beam CT. IEEE Trans Med Imaging 26(2):249–260. https://doi.org/10.1109/TMI.2006.889717
Yu H, Wei Y, Hsieh J, Wang G (2006) Data consistency based translational motion artifact reduction in fan-beam CT. IEEE Trans Med Imaging 25(6):792–803. https://doi.org/10.1109/TMI.2006.875424
Radiative Transport Equation in Optical Tomography

Manabu Machida

Abstract A review on recent advances in radiative transport and optical tomography is given. In particular, the technique of rotated frames for the radiative transport equation is reviewed. Inverse problems for optical tomography are considered in terms of the inverse series.

Keywords Radiative transport equation · Inverse problems · Optical tomography
1 Introduction Near-infrared light in biological tissue obeys the radiative transport equation (RTE). Since the RTE is difficult to solve even numerically, the diffusion equation is commonly used for near-infrared imaging. In this review, however, we focus on the RTE. The diffusion equation is derived as an asymptotic limit of the radiative transport equation. Although the diffusion equation is an approximation in this sense, the two equations can be understood as governing equations for two different layers in the hierarchy of light propagation in random media. That is, the RTE describes light propagation at the mesoscopic scale whereas light propagation at the macroscopic scale is governed by the diffusion equation. Here, the term mesoscopic is used for light propagation whose distance is comparable to the transport mean free path ∗ . When the propagation distance is significantly larger than ∗ , the light propagation is said to be at the macroscopic scale. In the early stage of inverse transport problems, inverse problems for the RTE were proposed and considered (Case 1973; Kanal and Moses 1978a, b; Larsen 1975, 1981; McCormick and Kušˇcer 1974; Siewert 1978). Then mathematical studies on M. Machida (B) Department of Informatics, Faculty of Engineering, Kindai University, Higashi-Hiroshima, Japan Institute for Medical Photonics Research, Hamamatsu University School of Medicine, Hamamatsu, Japan JST, PRESTO, Kawaguchi, Saitama, Japan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_5
uniqueness (Choulli and Stefanov 1996a, b; Stefanov 2003; Stefanov and Tamasan 2009) and stability (Bal and Jollivet 2008; Bal et al. 2008; Langmore 2008; Machida and Yamamoto 2014; McDowall et al. 2010a, b; Romanov 1997, 1998; Stefanov and Uhlmann 2003; Tamasan 2002; Wang 1999) followed.
For optical tomography, coefficients of the RTE or diffusion equation are reconstructed from boundary measurements. For those inverse problems, forward problems have to be solved repeatedly. Hence it is essential to obtain solutions to the equations efficiently. The technique of rotated frames gives us access to solutions of the RTE.
The method of rotated reference frames was first formulated as a numerical method with the spherical-harmonic expansion (Markel 2004; Panasyuk et al. 2006). The method has been validated for the slab geometry (Machida et al. 2010) and the half-space geometry (Liemert and Kienle 2012a, b, 2013a, b, 2014). In addition, the method was applied to the time-dependent RTE for an infinite medium (Liemert and Kienle 2012c, d) and the RTE in flatland (Liemert and Kienle 2011, 2012e, 2013c). Then the idea was further developed to find analytical expressions of the fundamental solution for the three-dimensional RTE (Machida 2014) and in flatland (Machida 2016). Until then, the singular-eigenfunction approach for the three-dimensional RTE for anisotropic scattering had not been available.
Let $\Omega$ be a convex, bounded, and open subset of $\mathbb{R}^d$ ($d = 2, 3$). Its boundary, of class $C^1$, is denoted by $\partial\Omega$. We consider light propagation in $\Omega$. Let $u(x,\theta)$ be the specific intensity of light at position $x \in \Omega$ in direction $\theta \in S^{d-1}$. The radiative transport equation is written as
$$\theta\cdot\nabla u + \mu_t u - \mu_s \int_{S^{d-1}} p(\theta,\theta')\,u(x,\theta')\,d\theta' = 0 \tag{1}$$
for $x \in \Omega$, $\theta \in S^{d-1}$. The boundary condition is given by
$$u(x,\theta) = R(|\nu\cdot\theta|)\,u(x,\theta') + g(x,\theta), \quad (x,\theta) \in \Gamma_-.$$
Here, $\Gamma_\pm$ are defined as
$$\Gamma_\pm = \left\{(x,\theta) \in \partial\Omega \times S^{d-1};\ \pm\theta\cdot\nu(x) > 0\right\},$$
where $\nu(x)$ is the outer unit normal vector at $x \in \partial\Omega$. Here, $\mu_t$, $\mu_s$ are the total attenuation and scattering coefficients, respectively. The absorption coefficient $\mu_a$ is given by $\mu_a = \mu_t - \mu_s \ge 0$. In this review, we ignore the angular dependence of $\mu_a$, $\mu_s$ and assume that they only depend on $x$. Coefficients $\mu_t$, $\mu_s$, $\mu_a$ implicitly depend on the wavelength of near-infrared light. Below, we further assume that $\mu_a$, $\mu_s$ are positive constants. In the integrand, the $x$-dependence of $p$ is neglected. The scattering phase function $p$ satisfies $p(\cdot,\theta) \ge 0$ and $\int_{S^{d-1}} p(\theta,\theta')\,d\theta' = 1$ for $\theta \in S^{d-1}$. On the boundary, $R(|\nu\cdot\theta|)$ describes the Fresnel reflection when light propagating in direction $\theta'$ changes its direction to $\theta$ by the reflection. Here, the polar angle of $\theta'$ is $\pi - \vartheta$, where $\vartheta$ is the polar angle of $\theta$. The azimuthal angles
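of $\theta$ and $\theta'$ are the same. Let $n_{\mathrm{in}}$, $n_{\mathrm{out}}$ be the refractive indices inside and outside $\Omega$ ($n_{\mathrm{in}} > n_{\mathrm{out}}$ is assumed). We define $x_c = \sqrt{1 - (n_{\mathrm{out}}/n_{\mathrm{in}})^2}$. Assuming unpolarized light, the function $R$ is given by
$$R(x) = \frac{1}{2}\left[\left(\frac{x - \frac{n_{\mathrm{in}}}{n_{\mathrm{out}}}x_0}{x + \frac{n_{\mathrm{in}}}{n_{\mathrm{out}}}x_0}\right)^2 + \left(\frac{x_0 - \frac{n_{\mathrm{in}}}{n_{\mathrm{out}}}x}{x_0 + \frac{n_{\mathrm{in}}}{n_{\mathrm{out}}}x}\right)^2\right]$$
for $x \ge x_c$, and $R(x) = 1$ for $x < x_c$, where
$$x_0 = \sqrt{1 - \left(\frac{n_{\mathrm{in}}}{n_{\mathrm{out}}}\right)^2 (1 - x^2)}.$$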
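The reflection function $R$ can be evaluated directly from the two formulas above. The following is a minimal sketch, assuming NumPy is available; the function name `fresnel_reflectance` and the sample refractive indices are illustrative choices, not part of the original text.

```python
import numpy as np

def fresnel_reflectance(x, n_in, n_out):
    """R(x) for unpolarized light, where x = |nu . theta| is the cosine
    of the incidence angle and n_in > n_out is assumed."""
    x = np.asarray(x, dtype=float)
    n = n_in / n_out                      # relative refractive index
    x_c = np.sqrt(1.0 - 1.0 / n**2)       # cosine of the critical angle
    # Clip at zero so the square root stays real below x_c; those entries
    # are replaced by total internal reflection (R = 1) at the end.
    x0 = np.sqrt(np.clip(1.0 - n**2 * (1.0 - x**2), 0.0, None))
    with np.errstate(divide="ignore", invalid="ignore"):
        r = 0.5 * (((x - n * x0) / (x + n * x0))**2
                   + ((x0 - n * x) / (x0 + n * x))**2)
    return np.where(x >= x_c, r, 1.0)

# Illustrative values only: a glass-like medium (1.5) surrounded by air (1.0).
print(fresnel_reflectance(np.array([1.0, 0.9, 0.5]), 1.5, 1.0))
```

At normal incidence ($x = 1$) the expression reduces to $((n-1)/(n+1))^2$ with $n = n_{\mathrm{in}}/n_{\mathrm{out}}$, which gives a quick sanity check of an implementation.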
In optical tomography, tomographic images are obtained by the reconstruction of μa , μs . The kernel p is often replaced by a reasonable model and treated as a known function. In particular, the oxygen saturation can be obtained using μa for different wavelengths. In most cases, inverse problems for optical tomography are severely ill-posed. Quite often the diffusion approximation is employed for optical tomography.
2 Forward Problems
2.1 RTE with Constant Coefficients
We set $d = 3$ and consider the following time-independent radiative transport equation:
$$\theta\cdot\nabla u(x,\theta) + \mu_t u(x,\theta) = \mu_s \int_{S^2} p(\theta,\theta')\,u(x,\theta')\,d\theta' + S(x,\theta), \tag{2}$$
where $\mu_t$ and $\mu_s$ are positive constants, and $S(x,\theta)$ is the source term. We assume that $p$ is given by
$$p(\theta,\theta') = \frac{1}{4\pi}\sum_{l=0}^{l_{\max}} \beta_l P_l(\theta\cdot\theta') = \sum_{l=0}^{l_{\max}}\sum_{m=-l}^{l} \frac{\beta_l}{2l+1}\,Y_{lm}(\theta)\,Y_{lm}^*(\theta'), \tag{3}$$
where $l_{\max} \ge 0$, $P_l$ are Legendre polynomials, and $Y_{lm}(\theta)$ are spherical harmonics. The symbol $*$ denotes complex conjugate. Often a constant $g \in [-1, 1]$ is introduced such that $\beta_l = (2l+1)g^l$.
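As a numerical illustration of expansion (3) with $\beta_l = (2l+1)g^l$, the sketch below compares the truncated Legendre series with the closed form $(1-g^2)/\bigl(4\pi(1+g^2-2g\cos\alpha)^{3/2}\bigr)$ of the Henyey-Greenstein function. The closed form is a standard identity that is not written out in the text, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def phase_function_truncated(cos_angle, g, l_max):
    """Truncated series (1/4pi) * sum_{l<=l_max} (2l+1) g^l P_l(cos_angle)."""
    l = np.arange(l_max + 1)
    beta = (2 * l + 1) * g**l            # beta_l = (2l+1) g^l
    return legval(cos_angle, beta) / (4.0 * np.pi)

def henyey_greenstein(cos_angle, g):
    """Closed form reached by the series as l_max -> infinity (|g| < 1)."""
    return (1.0 - g**2) / (4.0 * np.pi * (1.0 + g**2 - 2.0 * g * cos_angle)**1.5)

c = np.linspace(-1.0, 1.0, 11)
for l_max in (5, 20, 60):
    err = np.max(np.abs(phase_function_truncated(c, 0.5, l_max) - henyey_greenstein(c, 0.5)))
    print(l_max, err)
```

The truncation error decays roughly like $g^{l_{\max}}$, so strongly forward-peaked scattering ($g$ close to 1) needs a much larger $l_{\max}$ than the value $g = 0.5$ used here for illustration.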
In the limit of $l_{\max} \to \infty$, this scattering phase function $p$ with $g$ is called the Henyey-Greenstein model (Henyey and Greenstein 1941). Let us introduce
$$\tilde{x} = \mu_t x, \qquad \tilde{u}(\tilde{x},\theta) = u(\tilde{x}/\mu_t,\theta).$$
By dividing both sides of (2) by $\mu_t$, we obtain
$$\theta\cdot\tilde{\nabla}\tilde{u}(\tilde{x},\theta) + \tilde{u}(\tilde{x},\theta) = \varpi\int_{S^2} p(\theta,\theta')\,\tilde{u}(\tilde{x},\theta')\,d\theta' + \frac{1}{\mu_t}S(\tilde{x}/\mu_t,\theta), \tag{4}$$
where $\tilde{\nabla} = \partial/\partial\tilde{x}$ and $\varpi = \mu_s/\mu_t \in (0,1)$ is called the albedo for single scattering. Hereafter we will take the unit of length to be $1/\mu_t$ and drop tildes.
The specific intensity $u$ in (4) is given as a superposition of eigenmodes, which are solutions to the following homogeneous equation:
$$\theta\cdot\nabla u(x,\theta) + u(x,\theta) = \varpi\int_{S^2} p(\theta,\theta')\,u(x,\theta')\,d\theta'. \tag{5}$$
Let $\mu = \cos\vartheta$ be the cosine of the polar angle of $\theta$ and $\varphi$ be the azimuthal angle of $\theta$. We write $p(\theta,\theta')$ in (3) as (McCormick and Kuščer 1966)
$$p(\theta,\theta') = \sum_{l=0}^{l_{\max}}\sum_{m=-l}^{l} \frac{\beta_l}{4\pi}\frac{(l-m)!}{(l+m)!}\left(1-\mu^2\right)^{|m|/2}\left(1-\mu'^2\right)^{|m|/2} p_l^m(\mu)\,p_l^m(\mu')\,e^{im(\varphi-\varphi')}.$$
Here, the polynomials $p_l^m(\mu)$ are related to associated Legendre polynomials $P_l^m(\mu)$ as
$$P_l^m(\mu) = (-1)^m\left(1-\mu^2\right)^{|m|/2} p_l^m(\mu).$$
They satisfy the following recurrence relation:
$$(l-m+1)\,p_{l+1}^m(\mu) = (2l+1)\,\mu\,p_l^m(\mu) - (l+m)\,p_{l-1}^m(\mu).$$
We also have orthogonality relations:
$$\int_{-1}^{1} p_l^m(\mu)\,p_{l'}^m(\mu)\,dm(\mu) = \frac{2(l+m)!}{(2l+1)(l-m)!}\,\delta_{ll'},$$
where we introduced
$$dm(\mu) = \left(1-\mu^2\right)^{|m|}d\mu.$$
Furthermore we have
$$p_{|m|}^m(\mu) = \begin{cases} \dfrac{(2m)!}{2^m\,m!} & \text{for } m \ge 0,\\[2ex] \dfrac{(-1)^m}{2^{|m|}\,|m|!} & \text{for } m < 0. \end{cases} \tag{6}$$
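The recurrence relation, the starting value (6), and the orthogonality relation can be checked against each other numerically. The sketch below is restricted to $m \ge 0$ and assumes NumPy; the helper names `p_lm` and `gram_matrix` are illustrative and do not come from the original text.

```python
import math
import numpy as np

def p_lm(l_max, m, mu):
    """Rows p_l^m(mu) for l = m, ..., l_max (m >= 0), built from the
    starting value (6) and the three-term recurrence."""
    mu = np.asarray(mu, dtype=float)
    out = np.zeros((l_max - m + 1,) + mu.shape)
    out[0] = math.factorial(2 * m) / (2**m * math.factorial(m))   # p_m^m
    if l_max > m:
        out[1] = (2 * m + 1) * mu * out[0]                        # p_{m+1}^m
    for l in range(m + 1, l_max):
        out[l + 1 - m] = ((2 * l + 1) * mu * out[l - m]
                          - (l + m) * out[l - 1 - m]) / (l - m + 1)
    return out

def gram_matrix(l_max, m):
    """Integrals of p_l^m p_l'^m with weight (1 - mu^2)^m on [-1, 1],
    computed by Gauss-Legendre quadrature (exact for these polynomials)."""
    mu, w = np.polynomial.legendre.leggauss(l_max + 2)
    p = p_lm(l_max, m, mu)
    return (p * w * (1.0 - mu**2)**m) @ p.T

print(np.round(gram_matrix(4, 1), 12))
```

The printed matrix should be diagonal, with entries $2(l+m)!/((2l+1)(l-m)!)$ for $l = m, \dots, l_{\max}$.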
2.2 Eigenmodes
To begin with, let us introduce polynomials $h_l^m(\nu)$ which satisfy the following three-term recurrence relation (Inönü 1970):
$$\nu(2l+1)\,\sigma_l\,h_l^m(\nu) - (l-m+1)\,h_{l+1}^m(\nu) - (l+m)\,h_{l-1}^m(\nu) = 0 \tag{7}$$
with
$$h_{|m|}^m(\nu) = p_{|m|}^m,$$
and
$$h_{|m|+1}^{|m|}(\nu) = (2|m|+1)\,\nu\,\sigma_{|m|}\,h_{|m|}^{|m|}(\nu).$$
Define
$$h_l^{-|m|}(\nu) = (-1)^{|m|}\frac{(l-|m|)!}{(l+|m|)!}\,h_l^{|m|}(\nu).$$
Here, $\sigma_l$ are defined as
$$\sigma_l = 1 - \varpi g^l\,\Theta(l_{\max} - l),$$
where the step function $\Theta(\cdot)$ is defined as $\Theta(z) = 1$ for $z \ge 0$ and $\Theta(z) = 0$ for $z < 0$.
We seek solutions to (5) of the form of plane-wave decomposition (Kaper 1969; Kim 2004; Kim and Keller 2003; Panasyuk et al. 2006). We introduce $\nu \in \mathbb{R}$ and $q \in \mathbb{R}^2$, and define vector $k \in \mathbb{C}^3$ as
$$k = \frac{1}{\nu}\hat{k}, \qquad \hat{k} = \begin{pmatrix} -i\nu q \\ \hat{k}_3(\nu|q|) \end{pmatrix}, \qquad \hat{k}_3(\nu|q|) = \sqrt{1 + (\nu|q|)^2}.$$
To proceed further, for a three-dimensional vector $\hat{k}$, we introduce operator $R_{\hat{k}}$. By $R_{\hat{k}}$, angles are measured in a rotated reference frame so that the $x_3$-axis lies in the direction of $\hat{k}$. For example, we have
$$R_{\hat{k}}\,\mu = R_{\hat{k}}\,\theta\cdot\hat{x}_3 = \theta\cdot\hat{k},$$
where $\hat{x}_3$ is the unit vector in the positive $x_3$ direction. We assume the specific intensity of the form
$$u_\nu^m(x,\theta;q) = R_{\hat{k}}\,\Phi_\nu^m(\theta)\,e^{-k\cdot x}, \tag{8}$$
where
$$R_{\hat{k}}\,\Phi_\nu^m(\theta) = R_{\hat{k}}\,\phi^m(\nu,\mu)\left(1-\mu^2\right)^{|m|/2}e^{im\varphi}. \tag{9}$$
We normalize $\phi^m$ as
$$\frac{1}{2\pi}\int_{S^2} R_{\hat{k}}\,\phi^m(\nu,\mu)\left(1-\mu^2\right)^{|m|}d\theta = \int_{-1}^{1}\phi^m(\nu,\mu)\,dm(\mu) = 1. \tag{10}$$
We note that in the laboratory frame ($\hat{k} = \hat{x}_3$), (8) reduces to the form used in McCormick and Kuščer (1966). We will determine solutions $u_\nu^m(x,\theta;q)$ in (8) so that they satisfy (5). We will calculate singular eigenfunctions $\phi^m$ below. By plugging (8) into (5), we obtain
$$\left(1 - \frac{R_{\hat{k}}\,\mu}{\nu}\right)R_{\hat{k}}\,\phi^m(\nu,\mu)\left(1-\mu^2\right)^{|m|/2}e^{im\varphi} = \varpi\int_{S^2} p\left(R_{\hat{k}}\theta, R_{\hat{k}}\theta'\right)R_{\hat{k}}\,\phi^m(\nu,\mu')\left(1-\mu'^2\right)^{|m|/2}e^{im\varphi'}\,d\theta'. \tag{11}$$
The right-hand side can be calculated as
$$\mathrm{RHS} = 2\pi\varpi\,\Theta(l_{\max}-|m|)\,R_{\hat{k}}\left(1-\mu^2\right)^{|m|/2}e^{im\varphi}\sum_{l'=|m|}^{l_{\max}} g^{l'}\,\frac{2l'+1}{4\pi}\frac{(l'-m)!}{(l'+m)!}\,p_{l'}^m(\mu)\int_{-1}^{1} p_{l'}^m(\mu')\,\phi^m(\nu,\mu')\,dm(\mu').$$
Hence,
$$R_{\hat{k}}\,(\nu-\mu)\,\phi^m(\nu,\mu) = 2\pi\varpi\nu\,\Theta(l_{\max}-|m|)\sum_{l'=|m|}^{l_{\max}} g^{l'}\,\frac{2l'+1}{4\pi}\frac{(l'-m)!}{(l'+m)!}\,R_{\hat{k}}\,p_{l'}^m(\mu)\,\bar{h}_{l'}^m(\nu), \tag{12}$$
where we defined
$$\bar{h}_l^m(\nu) = \int_{-1}^{1}\phi^m(\nu,\mu)\,p_l^m(\mu)\,dm(\mu).$$
Since the right-hand side of (12) is zero for $|m| > l_{\max}$ and then $\phi^m = 0$, hereafter we suppose $0 \le |m| \le l_{\max}$. From (12), we obtain
$$\sigma_l\,\nu\,\bar{h}_l^m(\nu) = \int_{-1}^{1}\mu\,\phi^m(\nu,\mu)\,p_l^m(\mu)\,dm(\mu). \tag{13}$$
Equation (13) implies the three-term recurrence relation and indeed, we see $\bar{h}_l^m(\nu) = h_l^m(\nu)$ (McCormick and Kuščer 1966; Mika 1961).
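The polynomials $h_l^m(\nu)$ can be generated from recurrence (7) and its starting values. The sketch below assumes $m \ge 0$, the choice $\beta_l = (2l+1)g^l$, and $\sigma_l = 1 - \varpi g^l$ for $l \le l_{\max}$ (and $\sigma_l = 1$ otherwise); the function name `h_lm` and its argument names are hypothetical.

```python
import math

def h_lm(l_top, m, nu, albedo, g, l_max):
    """Return [h_m^m(nu), ..., h_{l_top}^m(nu)] for m >= 0 from recurrence (7)."""
    def sigma(l):
        # sigma_l = 1 - albedo * g^l for l <= l_max, and 1 beyond the truncation
        return 1.0 - albedo * g**l if l <= l_max else 1.0

    h = [math.factorial(2 * m) / (2**m * math.factorial(m))]     # h_m^m = p_m^m
    if l_top > m:
        h.append((2 * m + 1) * nu * sigma(m) * h[0])             # h_{m+1}^m
    for l in range(m + 1, l_top):
        h.append((nu * (2 * l + 1) * sigma(l) * h[l - m]
                  - (l + m) * h[l - 1 - m]) / (l - m + 1))
    return h

# Without scattering (albedo = 0) every sigma_l equals 1, so h_l^0 reduces
# to the Legendre polynomial P_l; for example h_2^0(0.5) = (3 * 0.25 - 1) / 2.
print(h_lm(2, 0, 0.5, 0.0, 0.9, 2))
```

For isotropic scattering ($l_{\max} = 0$, $m = 0$) the same routine gives $h_1^0(\nu) = \nu(1-\varpi)$, which matches (13) with $l = 0$ and normalization (10).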
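Let us define
$$g^m(\nu,\mu) = \sum_{l'=|m|}^{l_{\max}} (2l'+1)\,g^{l'}\,\frac{(l'-m)!}{(l'+m)!}\,p_{l'}^m(\mu)\,h_{l'}^m(\nu).$$
We note that $g^{-m}(\nu,\mu) = g^m(\nu,\mu)$. The function $\phi^m$ is obtained as
$$\phi^m(\nu,\mu) = \frac{\varpi\nu}{2}\,\mathcal{P}\frac{g^m(\nu,\mu)}{\nu-\mu} + \lambda^m(\nu)\left(1-\nu^2\right)^{-|m|}\delta(\nu-\mu), \tag{14}$$
where $\mathcal{P}$ denotes the Cauchy principal value and $\lambda^m(\nu)$ is given below.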
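2.3 Eigenvalues and Continuous Spectrum
By multiplying (14) by $\left(1-\mu^2\right)^{|m|}$, integrating over $\mu$, and using the normalization (10), (14) becomes
$$1 = \frac{\varpi\nu}{2}\,\mathcal{P}\!\!\int_{-1}^{1}\frac{g^m(\nu,\mu)}{\nu-\mu}\,dm(\mu) + \int_{-1}^{1}\lambda^m(\nu)\,\delta(\nu-\mu)\,d\mu.$$
For $\nu \in (-1,1)$, we obtain
$$\lambda^m(\nu) = 1 - \frac{\varpi\nu}{2}\,\mathcal{P}\!\!\int_{-1}^{1}\frac{g^m(\nu,\mu)}{\nu-\mu}\,dm(\mu).$$
We note that $\lambda^{-m}(\nu) = \lambda^m(\nu)$ and hence $\phi^{-m}(\nu,\mu) = \phi^m(\nu,\mu)$. Let us define
$$\Lambda^m(z) = 1 - \frac{\varpi z}{2}\int_{-1}^{1}\frac{g^m(z,\mu)}{z-\mu}\,dm(\mu), \quad z \in \mathbb{C}\setminus[-1,1].$$
Eigenvalues $\nu \notin [-1,1]$ are solutions to
$$\Lambda^m(\nu) = 0. \tag{15}$$
We write these eigenvalues as $\pm\nu_j^m$ ($\nu_0^m > \nu_1^m > \cdots > \nu_{M-1}^m > 1$). Note that $\nu_j^{-m} = \nu_j^m$. The number of eigenvalues $M$ depends on $|m|$ and we have (McCormick and Kuščer 1966; Mika 1961)
$$M \le l_{\max} - |m| + 1.$$
For $\nu \in (-1,1)$, we have the continuous spectrum.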
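In the simplest case of isotropic scattering ($l_{\max} = 0$, $m = 0$), $g^0 \equiv 1$ and the function in (15) reduces to $\Lambda^0(z) = 1 - \varpi z\,\operatorname{artanh}(1/z)$ for real $z > 1$, so the single discrete eigenvalue $\nu_0^0$ can be found by bisection. The following sketch covers this special case only; the function names are illustrative, and the bracketing assumes an albedo that is not too small, so that the function is negative just above $z = 1$.

```python
import math

def dispersion(z, albedo):
    """Lambda^0(z) for isotropic scattering (l_max = 0, m = 0), real z > 1."""
    return 1.0 - albedo * z * math.atanh(1.0 / z)

def discrete_eigenvalue(albedo, tol=1e-13):
    """Root nu_0 > 1 of the dispersion function, found by bisection."""
    lo, hi = 1.0 + 1e-12, 2.0
    if dispersion(lo, albedo) >= 0.0:
        raise ValueError("albedo too small for this simple bracketing")
    while dispersion(hi, albedo) <= 0.0:    # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if dispersion(mid, albedo) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for albedo in (0.5, 0.9, 0.99):
    print(albedo, discrete_eigenvalue(albedo))
```

As the albedo approaches 1, the eigenvalue grows roughly like $1/\sqrt{3(1-\varpi)}$, which reflects the diffusion limit mentioned in the Introduction. For anisotropic scattering, $g^m$ and the measure $dm(\mu)$ have to be included and several roots may appear.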
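2.4 Fundamental Solution
Let us consider the fundamental solution of the RTE, which obeys
$$\theta\cdot\nabla G_0(x,\theta;x_0,\theta_0) + G_0(x,\theta;x_0,\theta_0) = \varpi\int_{S^2} p(\theta,\theta')\,G_0(x,\theta';x_0,\theta_0)\,d\theta' + \delta(x-x_0)\,\delta(\theta-\theta_0). \tag{16}$$
Let us introduce
$$N_j^m = N^m(\nu_j^m) = \frac{\varpi}{2}\left(\nu_j^m\right)^2 g^m(\nu_j^m,\nu_j^m)\left.\frac{d\Lambda^m(z)}{dz}\right|_{z=\nu_j^m},$$
and for $\nu \in (-1,1)$,
$$N^m(\nu) = \nu\,\Lambda^{m+}(\nu)\,\Lambda^{m-}(\nu)\left(1-\nu^2\right)^{-|m|}.$$
We obtain (Machida 2014)
$$G_0(x,\theta;x_0,\theta_0) = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^2} e^{i\left(q_1(x_1-x_{01})+q_2(x_2-x_{02})\right)} \sum_{m=-l_{\max}}^{l_{\max}}\Biggl[\,\sum_{j=0}^{M-1}\frac{1}{\hat{k}_3(\nu_j^m|q|)\,N_j^m}\,R_{\hat{k}}\,\Phi_{j\pm}^m(\theta)\,\Phi_{j\pm}^{m*}(\theta_0)\,e^{-\hat{k}_3(\nu_j^m|q|)\,|x_3-x_{03}|/\nu_j^m}$$
$$\qquad + \int_0^1 \frac{1}{\hat{k}_3(\nu|q|)\,N^m(\nu)}\,R_{\hat{k}}\,\Phi_{\pm\nu}^m(\theta)\,\Phi_{\pm\nu}^{m*}(\theta_0)\,e^{-\hat{k}_3(\nu|q|)\,|x_3-x_{03}|/\nu}\,d\nu\Biggr]\,dq_1\,dq_2, \tag{17}$$
where upper signs are used for $x_3 > x_{03}$ and lower signs are used otherwise.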
3 Inverse Problems
Here we consider the optical tomography for which $\mu_t(x)$ is reconstructed from boundary values $u|_{\Gamma_+}$. Let us consider the following RTE:
$$\theta\cdot\nabla u(x,\theta) + \mu_t(x)\,u(x,\theta) - \mu_s\int_{S^{d-1}} p(\theta,\theta')\,u(x,\theta')\,d\theta' = 0, \quad (x,\theta) \in \Omega\times S^{d-1},$$
$$u(x,\theta) = g(x,\theta), \quad (x,\theta) \in \Gamma_-.$$
We write $\mu_t > 0$ as
$$\mu_t(x) = (1+\eta(x))\,\bar{\mu}_t,$$
where $\bar{\mu}_t > 0$ is a constant and $\operatorname{supp}\eta \subset B_a \subset \Omega$, where $B_a$ is a closed ball of radius $a$. We define $\Omega_a = B_a \times S^{d-1}$. We suppose that positive constants $\bar{\mu}_t$, $\mu_s$ are known. Let $u_0$ be the solution of the RTE in which $\eta \equiv 0$. Let $G(x,\theta;x',\theta')$ be the Green's function. We construct the Born series as
$$\Phi = K_1\eta + K_2\,\eta\otimes\eta + K_3\,\eta\otimes\eta\otimes\eta + \cdots,$$
where $\Phi = (u_0 - u)|_{\Gamma_+}$ and
$$K_j f(x,\theta) = (-1)^{j+1}\,\bar{\mu}_t^{\,j}\int_{\Omega_a\times\cdots\times\Omega_a} G(x,\theta;x_1,\theta_1)\,G(x_1,\theta_1;x_2,\theta_2)\cdots G(x_{j-1},\theta_{j-1};x_j,\theta_j)\,u_0(x_j,\theta_j)\,f(x_1,\ldots,x_j)\,dx_1 d\theta_1\cdots dx_j d\theta_j,$$
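where $f \in L^\infty(B_a\times\cdots\times B_a)$. The corresponding inverse Born series can be written as (Markel et al. 2003; Moskow and Schotland 2008)
$$\eta^* = \mathcal{K}_1\Phi + \mathcal{K}_2\,\Phi\otimes\Phi + \mathcal{K}_3\,\Phi\otimes\Phi\otimes\Phi + \cdots,$$
where $\mathcal{K}_1$ is a regularized pseudoinverse of $K_1$. Since $\mathcal{K}_1$ is not the inverse of $K_1$, $\eta^*$ is an approximation of $\eta$. The operator $\mathcal{K}_1$ can be constructed as follows. Let $T$ be the Tikhonov functional which is given by
$$T(\eta) = \left\|K_1\eta - \Phi\right\|_{L^1(\Gamma_+)} + \alpha F(\eta),$$
where $F$ is a convex penalty function and $\alpha > 0$ is a regularization parameter. Let $\eta^\dagger$ denote the minimizer of $T$. We define $\mathcal{K}_1$ as
$$\mathcal{K}_1 : \Phi \mapsto \eta^\dagger.$$
For the numerical calculation, $K_1$, $\mathcal{K}_1$ become matrices. We can obtain $\mathcal{K}_1$ with the truncated SVD for $K_1$. For $j \ge 2$, we have
$$\mathcal{K}_j = -\left(\sum_{m=1}^{j-1}\mathcal{K}_m\sum_{i_1+\cdots+i_m=j} K_{i_1}\otimes\cdots\otimes K_{i_m}\right)\mathcal{K}_1\otimes\cdots\otimes\mathcal{K}_1.$$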
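The recursion for $\mathcal{K}_j$ is easiest to follow in a scalar toy analogue, where every $K_i$ is a number, tensor products become ordinary products, and $\mathcal{K}_1 = 1/K_1$ is the exact inverse rather than a regularized pseudoinverse. The sketch below is this toy case only and is not the operator-valued computation of the text; the function name is hypothetical.

```python
import numpy as np

def inverse_series_coefficients(k):
    """Scalar analogue of the inverse Born series.

    k[i-1] plays the role of K_i in the forward series sum_i K_i * eta**i.
    Returns [calK_1, ..., calK_J] computed with the recursion for calK_j.
    """
    J = len(k)
    forward = np.concatenate(([0.0], np.asarray(k, dtype=float)))  # sum_i K_i x^i
    inv = [1.0 / forward[1]]            # calK_1 (exact inverse in the scalar case)
    for j in range(2, J + 1):
        total = 0.0
        power = np.array([1.0])         # forward**0
        for m in range(1, j):
            # forward**m truncated at degree J; its x^j coefficient equals
            # the sum over i_1 + ... + i_m = j of K_{i_1} * ... * K_{i_m}
            power = np.convolve(power, forward)[: J + 1]
            total += inv[m - 1] * power[j]
        inv.append(-total * inv[0] ** j)
    return inv

# Forward series eta + eta^2 + ... (= eta / (1 - eta) when summed to all orders)
# has the inverse phi / (1 + phi) = phi - phi^2 + phi^3 - ...
print(inverse_series_coefficients([1.0, 1.0, 1.0, 1.0, 1.0]))
```

Given scalar data $\Phi$, the reconstruction is then $\eta^* = \sum_j \mathcal{K}_j\Phi^j$. In the operator setting, each product is replaced by the corresponding composition with tensor products and $\mathcal{K}_1$ by the regularized pseudoinverse.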
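Let $G_0(x,\theta;x',\theta')$ be the fundamental solution. We have
$$\int_{\Omega\times S^{d-1}} G(x,\theta;x',\theta')\,dx\,d\theta \le \int_{\Omega\times S^{d-1}} G_0(x,\theta;x',\theta')\,dx\,d\theta$$
for $(x',\theta') \in \Omega\times S^{d-1}$. Let us introduce
$$\xi_0 = \bar{\mu}_t \sup_{(x',\theta')\in\Omega_a}\int_{\Omega_a} G_0(x,\theta;x',\theta')\,dx\,d\theta,$$
$$\zeta_0 = \bar{\mu}_t \int_{\Omega_a} u_0\,dx\,d\theta\,\sup_{(x',\theta')\in\Omega_a}\int_{\Gamma_+} G_0(x,\theta;x',\theta')\,dx\,d\theta.$$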
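We have the following error estimate for the inverse Born series (Machida and Schotland 2015).
Theorem 1 Suppose that $\|\mathcal{K}_1\| < 1/(\xi_0+\zeta_0)$ and $\|\mathcal{K}_1\Phi\|_{L^\infty(B_a)} < 1/(\xi_0+\zeta_0)$. Let $M = \max\left(\|\eta\|_{L^\infty(B_a)}, \|\mathcal{K}_1 K_1\eta\|_{L^\infty(B_a)}\right)$ and assume that $M < 1/(\xi_0+\zeta_0)$. Then there exists a positive constant $C = C(\xi_0,\zeta_0,\|\mathcal{K}_1\|,M)$ such that
$$\left\|\eta - \sum_{j=1}^{\infty}\mathcal{K}_j\,\Phi\otimes\cdots\otimes\Phi\right\|_{L^\infty(B_a)} \le C\left\|(I - \mathcal{K}_1 K_1)\,\eta\right\|_{L^\infty(B_a)}.$$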
4 Concluding Remarks In this review, we have investigated forward and inverse problems of the RTE. The inverse problem which determines coefficients of the RTE is used for optical tomography.
References
Bal G, Jollivet A (2008) Stability estimates in stationary inverse transport. Inv Probl Imaging 2:427–454. https://doi.org/10.3934/ipi.2008.2.427
Bal G, Langmore I, Monard F (2008) Inverse transport with isotropic sources and angularly averaged measurements. Inv Probl Imaging 2:23–42. https://doi.org/10.3934/ipi.2008.2.23
Case KM (1973) Inverse problem in transport theory. Phys Fluids 16:1607–1611. https://doi.org/10.1063/1.1694186
Choulli M, Stefanov P (1996a) Inverse scattering and inverse boundary value problems for the linear Boltzmann equation. Commun Part Differ Equ 21:763–785. https://doi.org/10.1080/03605309608821207
Choulli M, Stefanov P (1996b) Reconstruction of the coefficients of the stationary transport equation from boundary measurements. Inverse Probl 12:L19–L23. https://doi.org/10.1088/0266-5611/12/5/001
Henyey LG, Greenstein JL (1941) Diffuse radiation in the galaxy. Astrophys J 93:70–83. https://doi.org/10.1086/144246
Inönü E (1970) Orthogonality of a set of polynomials encountered in neutron-transport and radiative-transfer theories. J Math Phys 11:568–577. https://doi.org/10.1063/1.1665171
Kanal M, Moses HE (1978a) Direct-inverse problems in transport theory. 1. The inverse problem. J Math Phys 19:1793–1798. https://doi.org/10.1063/1.523878
Kanal M, Moses HE (1978b) Direct-inverse problems in transport theory, the inverse albedo problem for a finite medium. J Math Phys 19:2641–2645. https://doi.org/10.1063/1.523621
Kaper HG (1969) Elementary solutions of the reduced three-dimensional transport equation. J Math Phys 10:286–297. https://doi.org/10.1063/1.1664844
Kim AD (2004) Transport theory for light propagation in biological tissue. J Opt Soc Am A 21:820–827. https://doi.org/10.1364/JOSAA.21.000820
Kim AD, Keller JB (2003) Light propagation in biological tissue. J Opt Soc Am A 20:92–98. https://doi.org/10.1364/JOSAA.20.000092
Langmore I (2008) The stationary transport problem with angularly averaged measurements. Inverse Probl 24:015024. https://doi.org/10.1088/0266-5611/24/1/015024
Larsen EW (1975) The inverse source problem in radiative transfer. J Quant Spec Rad Trans 15:1–5. https://doi.org/10.1016/0022-4073(75)90102-8
Larsen EW (1981) Solution of the inverse problem in multigroup transport theory. J Math Phys 22:158–160. https://doi.org/10.1063/1.524748
Liemert A, Kienle A (2011) Radiative transfer in two-dimensional infinitely extended scattering media. J Phys A: Math Theor 44:505206. https://doi.org/10.1088/1751-8113/44/50/505206
Liemert A, Kienle A (2014) Explicit solutions of the radiative transport equation in the P3 approximation. Med Phys 41:111916. https://doi.org/10.1118/1.4898097
Liemert A, Kienle A (2012a) Light transport in three-dimensional semi-infinite scattering media. J Opt Soc Am A 29:1475–1481. https://doi.org/10.1364/JOSAA.29.001475
Liemert A, Kienle A (2012b) Spatially modulated light source obliquely incident on a semi-infinite scattering medium. Opt Lett 37:4158–4160. https://doi.org/10.1364/OL.37.004158
Liemert A, Kienle A (2012c) Infinite space Green's function of the time-dependent radiative transfer equation. Biomed Opt Express 3:543–551. https://doi.org/10.1364/BOE.3.000543
Liemert A, Kienle A (2012d) Green's function of the time-dependent radiative transport equation in terms of rotated spherical harmonics. Phys Rev E 86:036603. https://doi.org/10.1103/PhysRevE.86.036603
Liemert A, Kienle A (2012e) Analytical approach for solving the radiative transfer equation in two-dimensional layered media. J Quant Spectrosc Radiat Transf 113:559–564. https://doi.org/10.1016/j.jqsrt.2012.01.013
Liemert A, Kienle A (2013a) Exact and efficient solution of the radiative transport equation for the semi-infinite medium. Sci Rep 3:2018. https://doi.org/10.1038/srep02018
Liemert A, Kienle A (2013b) The line source problem in anisotropic neutron transport with internal reflection. Annal Nucl Energy 60:206–209. https://doi.org/10.1016/j.anucene.2013.05.007
Liemert A, Kienle A (2013c) Two-dimensional radiative transfer due to curved Dirac delta line sources. Waves Random Complex Media 23:461–474. https://doi.org/10.1080/17455030.2013.851430
Machida M (2014) Singular eigenfunctions for the three-dimensional radiative transport equation. J Opt Soc Am A 31:67–74. https://doi.org/10.1364/JOSAA.31.000067
Machida M (2016) The radiative transport equation in flatland with separation of variables. J Math Phys 57:073301. https://doi.org/10.1063/1.4958976
Machida M, Schotland JC (2015) Inverse Born series for the radiative transport equation. Inverse Probl 31:095009. https://doi.org/10.1088/0266-5611/31/9/095009
Machida M, Yamamoto M (2014) Global Lipschitz stability in determining coefficients of the radiative transport equation. Inverse Probl 30:035010. https://doi.org/10.1088/0266-5611/30/3/035010
Machida M, Panasyuk GY, Schotland JC, Markel VA (2010) The Green's function for the radiative transport equation in the slab geometry. J Phys A: Math Gener 43:065402. https://doi.org/10.1088/1751-8113/43/6/065402
Markel VA, O'Sullivan JA, Schotland JC (2003) Inverse problem in optical diffusion tomography. IV. Nonlinear inversion formulas. J Opt Soc Am A 20:903–912. https://doi.org/10.1364/JOSAA.20.000903
Markel VA (2004) Modified spherical harmonics method for solving the radiative transport equation. Waves Random Media 14:L13–L19. https://doi.org/10.1088/0959-7174/14/1/L02
McCormick NJ, Kuščer I (1966) Bi-orthogonality relations for solving half-space transport problems. J Math Phys 7:2036–2045. https://doi.org/10.1063/1.1704886
McCormick NJ, Kuščer I (1974) On the inverse problem in radiative transfer. J Math Phys 15:926–927. https://doi.org/10.1063/1.1666771
McDowall S, Stefanov P, Tamasan A (2010) Stability of the gauge equivalent classes in inverse stationary transport. Inverse Probl 26:025006. https://doi.org/10.1088/0266-5611/26/2/025006
McDowall S, Stefanov P, Tamasan A (2010) Gauge equivalence in stationary radiative transport through media with varying index of refraction. Inv Probl Imaging 4:151–167. https://doi.org/10.3934/ipi.2010.4.151
Mika JR (1961) Neutron transport with anisotropic scattering. Nucl Sci Eng 11:415–427. https://doi.org/10.13182/NSE61-1
Moskow S, Schotland JC (2008) Convergence and stability of the inverse scattering series for diffuse waves. Inverse Probl 24:065005. https://doi.org/10.1088/0266-5611/24/6/065005
Panasyuk G, Schotland JC, Markel VA (2006) Radiative transport equation in rotated reference frames. J Phys A: Math Gener 39:115–137. https://doi.org/10.1088/0305-4470/39/1/009
Romanov VG (1997) Stability estimates in the three-dimensional inverse problem for the transport equation. J Inv Ill-Posed Probl 5:463–475. https://doi.org/10.1515/jiip.1997.5.5.463
Romanov VG (1998) A conditional stability theorem in the problem of determining the dispersion index and relaxation for the stationary transport equation. Mat Tr 1:78–115
Siewert CE (1978) The inverse problem for a finite slab. Nucl Sci Eng 67:259–260. https://doi.org/10.13182/NSE78-A15442
Stefanov P (2003) Inverse problems in transport theory. In: Inside out: inverse problems and applications, vol 47 of Mathematical Sciences Research Institute Publications. Cambridge University Press, Cambridge, pp 111–131
Stefanov P, Tamasan A (2009) Uniqueness and non-uniqueness in inverse radiative transfer. Proc Am Math Soc 137:2335–2344
Stefanov P, Uhlmann G (2003) Optical tomography in two dimensions. Methods Appl Anal 10:1–10. https://doi.org/10.4310/MAA.2003.V10.N1.A1
Tamasan A (2002) An inverse boundary value problem in two-dimensional transport. Inverse Probl 18:209–219. https://doi.org/10.1088/0266-5611/18/1/314
Wang J-N (1999) Stability estimates of an inverse problem for the stationary transport equation. Ann Inst Henri Poincaré 70:473–495
Maintenance of Permeable Asphalt Based on Quantitative Analysis of Deterioration by Non-integer Dimensional Analysis
Kenji Hashizume and Takashi Takiguchi
Abstract To fully understand how damage to permeable asphalt pavement progresses on expressways, we analyzed periodically measured data of the road surface and the depths of ruts obtained with Eagle, a high-accuracy pavement-condition measuring vehicle developed by West Nippon Expressway Shikoku Company Limited. The measurements by Eagle show that local subsidence causes cracks to develop into potholes in a short period of time. It is very difficult to detect this damage before a pothole occurs, even with very frequent measurements of the conventional evaluation indexes (cracks, ruts, flatness). The main purpose of this paper is to establish a new technique for predicting the occurrence of potholes by analyzing the road surface and the depths of the ruts. For the analysis of the road surface data, the idea of non-integer dimensional analysis is applied. The proposal contributes to the long-term performance of permeable asphalt pavement and the reduction of potholes.
Keywords Pavement and road inspection · Non-integer dimensional analysis
K. Hashizume (B) West Nippon Expressway Shikoku Company Limited, 3-1-1 Hanazono-Cho Takamatsu, Kagawa 760-0072, Japan. e-mail: [email protected]
T. Takiguchi Department of Mathematics, National Defense Academy of Japan, 1-10-20 Hashirimizu Yokosuka, Kanagawa 239-8686, Japan. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_6

1 Introduction
In order to secure the social durability of infrastructure, the following matters are indispensable: (i) to maintain and manage the organized infrastructure, and (ii) to prevent risk events caused by damage due to the aging of infrastructure. Therefore, it is important to inspect and repair the infrastructure. For this purpose, efficient and effective inspection and maintenance practices are necessary, for which we
develop new inspection technologies in order to comply with the road management rules. Nexco-west's group is in charge of the maintenance of the expressways in the western part of Japan, covering 24 prefectures with an abundance of historical sites, places of interest, rich nature, and fascinating culture. West Nippon Expressway Shikoku Company Limited, a company in Nexco-west's group, maintains and manages the expressways in the Shikoku region, an island located in the western area of Japan. The policy of Nexco-west's group is to implement maintenance that ensures safety and reliability for its customers.
2 Road Inspection and Issues
Nexco's inspections are divided into four types (Table 1). There are initial inspections performed at the beginning of the construction, periodic inspections performed periodically during use, and periodic inspections performed by specialist engineers at different intervals for each structure. The flow (Fig. 1) shows how we plan the maintenance procedures, from inspection to the evaluation of repair works and their schedule. Inspection is essential for making a maintenance plan. In addition, daily inspections are conducted mainly on road surfaces, which are directly linked to risks for road users. Pavement is one of the most important structures for maintenance. For pavement management, we carry out daily on-board inspections and regular inspections in accordance with traffic volume in order to understand the damage status (Fig. 2). A pothole (Fig. 3) is pavement damage that creates a hole in the road surface. Potholes on the highway may cause serious accidents.

Table 1 Inspection types
Type | Procedure | Frequency | In charge
Start-up inspection | The safety of the structure is checked in more detail by short-range visual inspection and hammering test | Once, before use | Inspectors (engineering companies)
Regular inspection | The safety of the structure is regularly confirmed by distant visual inspection, short-range visual inspection, and hammering test | More than once a year | NEXCO-West engineers and inspectors (engineering companies)
Detailed inspection | The safety of the structure is checked in more detail by short-range visual inspection and hammering test | Once every five years | Inspectors (engineering companies)
Daily inspection | Visible unusual conditions and deformations of structures are inspected daily from behind the wheel | Once every four days | Inspectors (engineering companies)
Fig. 1 Operation and maintenance management
Fig. 2 Road surface management (detailed inspection: once every 3 years in the Shikoku area; daily on-board check: 5 days per 2 weeks, according to traffic volume)
Fig. 3 Pothole
We would like to repair all the risky cracks that may develop into potholes (Fig. 4); however, doing so would require the expressways to be closed at all times. Potholes must be repaired quickly, but if they occur frequently, they cannot all be repaired at once. Since we manage a wide range of roads, we have a limited number of repair teams.
Fig. 4 Asphalt pavement repair methods
Therefore, we want to precisely predict the risk of the potholes. This is the motivation for this study.
3 Road Inspection System
3.1 Inspection System
We propose a pavement inspection method using a non-destructive inspection device, which enables us to keep records of the pavement condition and to evaluate it. For this purpose, we need to acquire data by periodic inspection for road surface management, and we have therefore developed Eagle, a vehicle for measuring road profiles (cf. https://www.w-e-shikoku.co.jp/wp-content/uploads/2021/07/77750616deb068c73da5b76821fb0d7d.pdf). By driving Eagle, we can measure the profiles of the cracks, the ruts, the flatness, and so on. Let us explain the main imaging system in Eagle, the L&L System (Fig. 5), an inspection method which uses a line sensor camera and a laser marker. The line sensor cameras have a visual image sensor and can capture seamless, continuous images. This system is applied to tunnel and pavement inspections. The light cutting method irradiates the measured surface vertically with a laser and photographs the laser marker image from an upper, oblique position in order to obtain the shape of the object. This method is used for measuring road surface profiles. Driving Eagle at high speed (between 80 and 100 km/h), we can inspect pavement conditions such as cracks or potholes, as well as the conditions of bridge expansion joints, by using the line sensor cameras on board Eagle. At
Fig. 5 L&L System
the same time, it is also possible to measure rutting, bumps, and upheaval with the laser cameras on Eagle. From these measured data, road surface profile values such as height, flatness, and bump values are obtained.
3.2 Analysis of Deterioration
In Fig. 6, we show the visualized images of the road surface taken by the line sensor camera. The shooting width, that is, the width of the picture taken by the L&L system perpendicular to the traveling direction, is 4.5 m. The image is precise and clear, and cracks on the road surface as small as 1 mm can be detected (Fig. 6). Next, let us introduce the device that acquires the height images of the ruts and the cracks. By the light cutting method of the L&L system, we can measure the surface images and the shapes of objects as well as their depths. This method is used for measuring road surface profiles. In the upper image, the black area is lower than the white area (Fig. 7). The red line indicates the cross section, whose height information is shown in the lower diagram. The visualized image of the three-dimensional road surface profile is obtained by the light cutting method. The shooting width is 4.4 m. The resolution of the vertical height measurement is up to 0.5 mm (Fig. 8).
Fig. 6 Visual image (detecting cracks). Accuracy at a speed of 100 km/h: shooting width 4.5 m (color image), resolution 0.8 mm/pixel
Fig. 7 Height image (road surface)
4 Quantification of the Deterioration Degree of Permeable Asphalt
4.1 Motivation of this Study
Road pavement is indispensable shared social capital for people's daily lives. In Japan, it often rains. Moreover, asphalt pavement is easier to repair than concrete pavement. In view of these two points, permeable asphalt pavement is very popular in Japan. Road administrators must maintain the safety and comfort of users, and the comfortable living environment
Fig. 8 Height image (detecting ruts). Accuracy at a speed of 100 km/h: shooting width 4.4 m; dimension of rutting 1 mm or less; resolution 1.27 mm (transversal), 2.70 mm (longitudinal), 0.50 mm (depth)
of roadside residents. On the expressways, permeable asphalt (Fig. 9) has been adopted in consideration of safety, comfort, and environmental aspects such as noise during driving. This pavement is safe even in rainy conditions. Regular surveys using the newly developed high-precision survey technology, such as measurements by Eagle, enabled us to grasp the deterioration process of permeable asphalt on the expressway and to clarify the deterioration mechanism. In the permeable asphalt of the earthwork part, a minute local subsidence area generated on the road surface is a sign of the occurrence of potholes (Fig. 10). The specific damage to permeable asphalt consisted of aggregate scattering and asphalt binder deterioration. The necessity of quantifying the degree of deterioration was also indicated. Furthermore, the investigation of the actual state of breakage of this local subsidence area showed that the permeable asphalt deterioration occurred in a part deeper than the base layer (Fig. 11). Concretely, if the part deeper than the base layer, especially the lower roadbed material, becomes fragile, interlaminar delamination is accelerated by the infiltration of groundwater, rainwater, etc. Looking at the visible image, it can be
Fig. 9 Japan expressway's pavement (normal asphalt and permeable asphalt)
Fig. 10 Transition by the regular inspection: visual images and height images (after processing) for (a) Oct. 2013, (b) Nov. 2013, (c) Dec. 2013, (d) Jan. 2014, (e) Feb. 2014. Height color scale: red, over 15 mm; yellow, 10 to 15 mm; blue, under 10 mm
seen that the scattering of aggregate and the cracks have increased rapidly. Looking at the height image, it can be seen that local subsidence is present from an early stage. This is exactly the hint needed to find a pothole in advance.
4.2 Quantification of the Deterioration Degree
Compared with conventional dense-graded asphalt, permeable asphalt sinks locally in a short period, and potholes may occur if damage such as cracks grows critically. It is difficult to grasp the damage situation with the conventional evaluation indexes for dense-graded pavement. In this chapter, a quantitative evaluation method is proposed according to the breakage characteristics of permeable asphalt (cf. Hashizume et al. 2014).
Fig. 11 Internal damage of permeable asphalt
First, an evaluation by the local subsidence amount is proposed (Fig. 12). This method extracts local collapse points that predict pothole occurrence and that could not be grasped by conventional evaluation methods or survey frequencies. It uses periodically measured data arranged in chronological order. The evaluation index based on the amount of local subsidence can be calculated from the conventional measurement data obtained in road property surveys, which makes it possible to grasp the current risk of pothole generation. In addition, the prediction results based on the damage growth curve model can be used as a document for mid-to-long-term repair planning. The depth of local subsidence is calculated as the difference between the rut depth at the objective point and the representative rut depth, which is the central value of the maximum rut depths in a vicinity of 10.0 m.
Second, the cracking of permeable asphalt proceeds with aggregate scattering (Fig. 13); by using Eagle, however, an evaluation of the surface texture depth (mean profile depth: MPD) can be applied to the full lane width. The MPD quantifies the progress of the aggregate scattering and cracking peculiar to permeable asphalt. By acquiring highly accurate shape data of the road surface, it is possible to evaluate the MPD quantitatively over the full lane width in a planar way.
(Labels in Fig. 12: red, shape of the rut at the objective point; blue, representative shape of the rut; lane marking; depth of local subsidence. When the depth of the rut is low but that of local subsidence is high, there is a potential risk of pothole occurrence.)
Fig. 12 The local subsidence amount
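The local subsidence index described before Fig. 12 can be sketched as follows. This is only an illustration under stated assumptions: the "central value" is read as the median, the "vicinity of 10.0 m" is read as a window of total length 10 m centred on the objective point, and the function and parameter names are hypothetical.

```python
from statistics import median


def local_subsidence(stations_m, max_rut_mm, index, vicinity_m=10.0):
    """Depth of local subsidence (mm) at the station stations_m[index].

    stations_m : longitudinal positions of the measured profiles (m)
    max_rut_mm : maximum rut depth of each profile (mm)
    Assumption: the representative rut depth is the median of the maximum
    rut depths of all profiles within vicinity_m / 2 of the objective point.
    """
    centre = stations_m[index]
    half = vicinity_m / 2.0
    nearby = [d for s, d in zip(stations_m, max_rut_mm) if abs(s - centre) <= half]
    representative = median(nearby)
    # Difference between the rut depth at the objective point and the
    # representative rut depth.
    return max_rut_mm[index] - representative


# Hypothetical data: one profile per metre, a uniform 5 mm rut with a local
# 12 mm dip at the 5 m station.
stations = [float(s) for s in range(11)]
depths = [5.0] * 11
depths[5] = 12.0
subsidence = local_subsidence(stations, depths, 5)
```

A shallow overall rut combined with a large value of `subsidence` corresponds to the situation labelled as a potential pothole risk in Fig. 12.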
K. Hashizume and T. Takiguchi
Fig. 13 Aggregate scattering
The problem of surface aggregate scattering is increasing. This symptom increases noise and reduces safety, but there has been no way to evaluate it. The MPD measured by Eagle has been confirmed to be highly correlated with the MPD measured with a CT (circular texture) Meter (Fig. 14) (cf. Hayashi et al. 2013). As can be seen in this figure, the surface becomes rough when the aggregate on the surface scatters. Therefore, the accurate data obtained by Eagle were used to analyze the evaluation values of aggregate scattering. In addition, the evaluation of aggregate scattering automatically analyzes, for each block, the average texture depth used by the conventional method (Fig. 15). Although the MPD analysis is effective, it is impossible to predict the occurrence of potholes by this technique. Therefore, we have to develop another technique to predict the occurrence of potholes, for which we develop the new non-integer dimensional analysis technique of the cracks and the ruts in the next section.
Fig. 14 MPD analysis (contrasting the MPD measured by CT Meter with the MPD measured by Eagle)
Fig. 15 Superficial (MPD) quantitative evaluation of aggregate scattering
5 Maintenance of Permeable Asphalt Based on Quantitative Analysis
In order to maintain permeable asphalt rationally, a maintenance method based on its damage characteristics and deterioration mechanism was organized (Fig. 16). Damage forms can be classified into the following three cases: damage that progresses from the surface, damage that progresses from the inside, and damage in which both progress simultaneously. We obtain the profiles of these three damage forms by conducting periodic measurements through a simple road survey and by analyzing the local subsidence amount and the MPD, a combination of indicators suited to the deterioration characteristics of permeable asphalt.
Fig. 16 Maintenance of permeable asphalt
Cracks are one of the important factors in the occurrence of potholes. Since cracks in drainage pavement are difficult to see in visual images, they are investigated using height images taken by Eagle. In order to quantify how critical the cracks on the road surface and the deep ruts are for the occurrence of a pothole, we propose to apply non-integer dimensional analysis. The idea of the Hausdorff dimension (cf. Falconer 2014 for its definition) is famous in non-integer dimensional analysis. If the object C is self-similar, like the Koch curve shown in Fig. 17, then its Hausdorff dimension H(C) is the same as its box dimension B(C) (cf. also Falconer 2014 for its definition).
Fig. 17 Image of non-integer dimensional analysis (Koch curve: self-similar; height image: cracks)
Although the image of the cracks and the deep ruts is not self-similar, we define the idea of the (n)-box dimension in the following way. We first fix a square S on the road surface whose sides are of length ℓ. We represent the square as the direct sum of n² squares, which are called pixels. If a pixel contains a part of the crack or a part where the rut depth is more than the threshold value Tv = 0.2 mm, then we mark the pixel. If the number of marked pixels is x, then we define the (n)-box dimension $B_S^{(n)}(C)$ of the image C of the cracks and the ruts on the square S by
$$B_S^{(n)}(C) = \log_n x. \qquad (1)$$
Note that if the image C of the cracks and the ruts is self-similar, then there holds
$$\lim_{n\to\infty} B_S^{(n)}(C) = B(C|_S) = H(C|_S), \qquad (2)$$
where $C|_S$ is the image C restricted to the square S. We measure the road image and the depth of the ruts by Eagle. Because of the precision of the measurement by Eagle and the porosity of permeable asphalt, the minimal pixel is a square with sides of 2 cm. Therefore, in this study, we set
$$\ell = 50\ \text{cm}, \quad n = 25. \qquad (3)$$
See Fig. 18 for the image of the (n)-box dimension. If the length of one side is 50 cm and the number n of divisions is 25, then the size of the pixel is 2 cm. This pixel size is reasonable considering the 13 mm aggregate size of permeable asphalt pavement and the measurement pitch of Eagle. Shifting the square S by 10 cm vertically and horizontally, we repeatedly calculate $B_S^{(n)}(C)$ (Fig. 19). The pixels in the intersection of the high-dimensional squares are the critical ones where a pothole can happen.
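The definition of the (n)-box dimension and the 10 cm shifting of the square S can be sketched in a few lines. The sketch assumes that the rut depths are already available as a 2D array with one value per 2 cm pixel; the function names, the array layout and the value returned for a square with no marked pixel (where log_n x is undefined) are assumptions made for illustration, not part of the source.

```python
import math

import numpy as np

N = 25        # divisions per side, so S is 50 cm wide with 2 cm pixels
SHIFT_PX = 5  # a 10 cm shift corresponds to 5 pixels of 2 cm


def box_dimension(marked):
    """(n)-box dimension log_n(x) of an n x n boolean array of marked pixels."""
    n = marked.shape[0]
    x = int(np.count_nonzero(marked))
    if x == 0:
        return 0.0  # assumption: the source does not define this case
    return math.log(x) / math.log(n)


def sliding_dimensions(depth_mm, threshold_mm=0.2, n=N, shift=SHIFT_PX):
    """Evaluate the (n)-box dimension on every n x n square, shifted by `shift` pixels."""
    marked = depth_mm > threshold_mm  # mark pixels deeper than the threshold
    rows = range(0, marked.shape[0] - n + 1, shift)
    cols = range(0, marked.shape[1] - n + 1, shift)
    return np.array(
        [[box_dimension(marked[r:r + n, c:c + n]) for c in cols] for r in rows]
    )


# Two limiting cases: a fully marked square and a single marked line.
full = np.ones((N, N), dtype=bool)
line = np.zeros((N, N), dtype=bool)
line[0, :] = True
dims = sliding_dimensions(np.ones((35, 35)))  # hypothetical 70 cm x 70 cm area
```

A fully marked square gives a dimension close to 2 and a single straight line of marked pixels gives a dimension close to 1, which is why values between these extremes can be read as a measure of how densely the damage fills the square.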
Fig. 18 (n)-box dimension (height image with cracks; a square of side ℓ)
Fig. 19 Non-integer dimensional analysis (one lane, W = 3.5 m; squares of side 50 cm with n = 25 and pixels of 2 cm × 2 cm; lap W = 10 cm)
6 Numerical Calculation by Real Data
We calculate the (n)-box dimension by applying it to real road data measured by Eagle. From experience in road maintenance, we know that the occurrence of a pothole depends more strongly on the rut depth than on the shape of the cracks. Therefore, we used only the rut depths for the calculation of the (n)-box dimension. See Table 2. On the left side, we show the picture, the pixel image, and the (n)-box dimensions 27 days before the repair. On the right, we show those just before a pothole would have occurred. After this measurement, we judged that a pothole had almost occurred and repaired it. We calculated the (n)-box dimension of each square of size 50 cm × 50 cm and moved the square by 10 cm, which was repeated to cover the whole area of interest, as proposed in the previous section. After this calculation, we divided the area of interest into the direct sum of 10 cm × 10 cm squares and calculated the average (n)-box dimension of each 10 cm × 10 cm square, which is shown in the last column in Table 2.
Table 2 Numerical calculation with real data (27 days before the pothole occurred; a pothole almost occurred. Contents: height image by Eagle; pixel image, where the minimal pixel is a square with sides of 2 cm (deeper than 2 mm); (n)-box dimension, with legend over 1.15, over 1.10, over 1.05)
(a) Height image; (b) Non-integer dimensional analysis (legend: over 1.15, over 1.10, over 1.05)
Fig. 20 Pothole area zooming (27 days before the pothole almost occurred)
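The averaging over 10 cm × 10 cm squares used for Table 2 and Fig. 20 can be sketched as follows. The sketch assumes that the average for a 10 cm square is taken over all 50 cm squares that contain it, which is one possible reading of the text; the function name and the array layout are hypothetical.

```python
import numpy as np


def block_average(window_dims, n=25, shift=5):
    """Average (n)-box dimension on each block of shift x shift pixels.

    window_dims[i, j] is the dimension of the n x n square whose top-left
    pixel is (i * shift, j * shift). Each square covers k = n // shift
    blocks per axis; a block is averaged over all squares that cover it.
    """
    k = n // shift
    wi, wj = window_dims.shape
    total = np.zeros((wi + k - 1, wj + k - 1))
    count = np.zeros_like(total)
    for i in range(wi):
        for j in range(wj):
            total[i:i + k, j:j + k] += window_dims[i, j]
            count[i:i + k, j:j + k] += 1
    return total / count  # every block is covered by at least one square


# Two neighbouring 50 cm squares with hypothetical dimensions 1.0 and 2.0.
avg = block_average(np.array([[1.0, 2.0]]))
```

Blocks covered by both squares receive the mean of the two values, while blocks at the outer edges keep the value of the single square that covers them.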
The conclusions obtained by this calculation are the following.
(a) If the number of pixels whose (n)-box dimensions are larger than 1.2 rapidly increases, then it is a sign that a pothole may happen in the near future. In this case, however, immediate repair is not necessary; if we repaired all such points, the expressways would always be closed.
(b) If there are more than 10 pixels whose (n)-box dimensions are larger than 1.3, then it is a sign that we have to repair the place immediately; otherwise a pothole can happen.
Figure 20 is an enlarged view of the crack location 27 days before. It can be confirmed that the crack area in the height image has a high dimension.
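Conclusions (a) and (b) can be turned into a simple screening rule. The thresholds 1.2, 1.3 and 10 are taken from the text; the source does not quantify "rapidly", so the `growth_factor` parameter, the function name and the returned labels are hypothetical and would need calibration against inspection records.

```python
def assess(dims_now, dims_before=None, growth_factor=2.0):
    """Classify an area from its per-pixel (n)-box dimensions.

    dims_now    : dimensions from the latest survey
    dims_before : dimensions from the previous survey (optional)
    Assumption: "rapid increase" means that the count of pixels above 1.2
    grew by at least growth_factor between two surveys.
    """
    # Rule (b): more than 10 pixels above 1.3 calls for immediate repair.
    if sum(1 for d in dims_now if d > 1.3) > 10:
        return "repair immediately"
    # Rule (a): a rapid increase of pixels above 1.2 is a warning sign only.
    if dims_before is not None:
        now = sum(1 for d in dims_now if d > 1.2)
        before = sum(1 for d in dims_before if d > 1.2)
        if now >= growth_factor * max(before, 1):
            return "watch"
    return "no action"


status = assess([1.35] * 11)
```

Keeping rule (a) as a warning rather than a repair trigger follows the remark in the text that repairing every such point would keep the expressways closed.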
7 Conclusion
To conclude this paper, we summarize the results obtained in this study.
(1) In order to detect how critical the cracks are for the occurrence of a pothole, we introduced the idea of the (n)-box dimension for the analysis of the road profile data measured by Eagle.
(2) By the conclusions (a) and (b) in the previous section, if the number of pixels whose (n)-box dimensions are larger than 1.2 rapidly increases, then it is a sign that a pothole may happen in the near future, and if there are more than 10 pixels whose (n)-box dimensions are larger than 1.3, then it is a sign that we have to repair the place immediately; otherwise a pothole can happen.
We pose open problems left to be solved for further development.
• In our numerical calculation of the (n)-box dimension, we applied only the rut depth data. If we include both the rut depth data and the images of the cracks, then a much better analysis may be possible.
• By accumulating such numerical calculations of the (n)-box dimension, we may establish a new inspection technique to find where a pothole is about to happen, that is, where we have to repair immediately.
References
Eagle. https://www.w-e-shikoku.co.jp/wp-content/uploads/2021/07/77750616deb068c73da5b76821fb0d7d.pdf
Falconer K (2014) Fractal geometry: mathematical foundations and applications, 3rd edn. John Wiley & Sons Ltd.
Hashizume K, Hashimoto K, Akashi Y, Chong P (2014) One approach of the forecasting method for the pot-hole occurred by the deterioration of deeper than binder course on permeable asphalt pavement. J JSCE, Ser.E1 70(3):I_17–I_24. (in Japanese)
Hayashi S, Hashimoto K, Akashi Y (2013) Proposal of aggregate scattering evaluation method for permeable pavement by road survey vehicle. J JSCE V-339:667–678. (in Japanese)
Investigation of Reinforcing Bars in Reinforced Concrete Structures by Ultrasonic Measurements Toshiaki Takabatake, Kenji Hashizume, Takayuki Ochi, and Takashi Takiguchi
Abstract In order to maintain and manage reinforced concrete (RC) structures, it is important to know the precise position of the reinforcing bars (rebars). Knowing the precise position of the rebars reduces the risk of damaging them in the repair process and allows high-quality repairs. In obtaining the precise position of the rebars, one of the difficult problems is to know the accurate position of their endpoints (ends). We describe the development of a non-destructive inspection method for detecting the precise position of rebar ends in an RC structure. For the reconstruction of rebar ends, we obtain, by ultrasonic measurements, an overdetermined system of linear equations that contains the effects of errors. We try to equalize (or cancel) the effects of the measurement errors by application of the idea of the least square solutions.
Keywords Non-destructive inspection · Ultrasound · Least square solution
T. Takabatake (✉) · K. Hashizume
West Nippon Expressway Engineering Shikoku Company Limited, 3-1-1, Hanazonomachi, Takamatsu, Kagawa 760-0072, Japan
T. Ochi
Tohoku Polytechnic College, 26 Tsukidatehagisawa, Kurihashi, Miyagi 987-2223, Japan
T. Takiguchi
National Defense Academy of Japan, 1-10-20, Hashirimizu, Yokosuka, Kanagawa 239-8686, Japan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_7
1 Introduction
In recent years, in many countries, it has become a serious problem that infrastructure built more than fifty years ago has become very old and is likely to cause accidents. For the secure daily lives of citizens, we have to restore the infrastructure properly. For this purpose, it is required to grasp precisely where and how much the structure is damaged. For the maintenance of reinforced concrete ('RC' for short) structures, it is required to determine whether and where restoration is necessary, by visual inspection, by palpation, by microfracture inspections, and by non-destructive inspections. It is, however, very difficult to decide precisely whether and where restoration is necessary. Though there exist various non-destructive inspection techniques for RC structures, few of them give precise information on the interior. In order to restore the structure properly before the damage becomes serious, it is required to establish a non-destructive inspection method for RC structures that gives the precise places where restoration is necessary.
A few years ago, X-ray computerized tomography ('CT' for short) for concrete structures was put into practical use. By application of X-ray CT, we can inspect the interior concretely in a non-destructive way. It is, however, very expensive, and the harmful side effects of X-rays on human bodies and the environment cannot be ignored. Therefore, X-ray CT is not suitable for the daily maintenance of RC structures. It is required to develop a precise, safe, and cheaply running non-destructive inspection technique to inspect the interior of the structure.
In the maintenance of RC structures, it is important to know the precise position of the reinforcing steel bar ('rebar' for short). In this paper, as the first step to probe the rebar precisely, we develop a non-destructive inspection method to detect the precise position of the rebar ends in RC structures, which is the main purpose of this paper. For this purpose, we apply ultrasonic measurements. To the measured data, we apply the idea of the least square solutions to average the errors in the measurements and probe the precise position of the rebar ends.
By application of our precise rebar end probe technique, we also develop a technique to correct the measurements of concrete cover. We claim that the running cost, as well as the price of the apparatus for ultrasonic measurement, is very low, and that this apparatus has no harmful side effects on human beings or the environment. This paper consists of the following sections.
§1. Introduction
§2. Importance to know the precise position of the rebar
§3. Precise probe of the rebar ends
§4. Verification of the theoretical study by experiments
§5. Conclusion and open problems
In this section, as the introduction of this paper, we introduce the outline of our paper as well as our motivation for this research. In Sect. 2, we explain the importance of probing the rebar precisely. In Sect. 3, we precisely probe the rebar ends by ultrasonic measurements. We first determine the average ultrasonic velocity in concrete, which is applied to reconstruct the rebar ends precisely, where the idea of the least square solution plays an important role.
In Sect. 4, we verify the theoretical study given in Sect. 3 by experiments. We made RC test pieces and applied our theory to them. We have verified that our theoretical study is right by destructing the test piece. In the final section, we summarize the conclusion and mention open problems left to be solved for further development.
2 Importance to Know the Precise Position of the Rebar
In this section, we discuss how important it is to non-destructively probe the precise position of the rebar near the surface of the structure. We enumerate the advantages of detecting the precise position of the rebar nearest to the surface of the structure.
(a) It enables us to check, by a non-destructive inspection, whether the structure has been constructed as it was designed.
(b) When we take some concrete cores for a microfracture inspection of the structure, we must not damage the rebar. The knowledge of the precise position of the rebar prevents us from damaging the rebar in such inspections.
(c) When some repair is necessary, we can determine the precise chipping area in the concrete cover so as not to damage the rebar.
(d) It helps us to check, by a non-destructive inspection, whether the rebar is sound.
(e) It enables us to develop a non-destructive inspection for the concrete cover by application of the ultrasonic measurements developed by Mita and Takiguchi (2018, 2019).
In (a), we claim that knowledge of the precise position of the rebar can be a method to check whether the structure has been constructed as it was designed. We can also check whether the length of the rebar is as designed or not. When we check whether and where there is salt damage, we sometimes conduct a microfracture inspection by taking concrete cores. If we know the precise position of the rebar, then we can take cores without damaging the rebar. This is what is meant by (b). When the repair of the structure is required, it is a tough problem to determine the chipping area in advance so as not to damage the rebar. If we know the precise position of the rebar in advance, then we can determine the chipping area without damaging the rebar. This is what we claim in (c). If the rebar gets corroded, then we cannot detect the rebar in the place where it must be by the inspection technique proposed in this paper (cf. Sect. 3). In this case, immediate repair is required. This is what is meant by (d).
The authors claim that, in the near future, it can be a serious problem that exfoliated pieces of concrete cover cause accidents, especially in high-population-density areas, where there are roads and towns under the expressways. In such areas, a falling piece of exfoliated concrete cover may cause serious accidents on the road or in the town under the expressway. It has already been a problem in Japan. In order to avoid exfoliation of concrete cover, it is very important to inspect whether and where the concrete cover for reinforcement is not sound. If the quality of the concrete cover for the rebar is inferior or its thickness is not sufficient, then it may cause corrosion of the rebar, resulting in lower durability of the structure. By (e), we mean that if we know the precise position of the rebar, then it is possible to inspect the concrete cover, as well as the rebar itself, in a non-destructive way by application of ultrasonic measurements (cf. Mita and Takiguchi (2018)). Let us briefly review this technique.
Property 1 (cf. Mita and Takiguchi (2018)) The ultrasonic primary waves (P-waves) take the route in the concrete structures along which the travel time is the shortest.
Fig. 1 Propagation of the ultrasound
Consider a section of the reinforced concrete structure in Fig. 1, where the part other than the rebar is concrete. Let the velocity of the ultrasound in the rebar be V and the average velocity of the ultrasound in the concrete be v. Usually, V > v holds, and the orbit of the ultrasonic primary wave between the points O and P is the polyline ORP, and the orbit between the points O and Q is the polyline OSQ. Project the ultrasound from the point O and measure the travel times to the points P and Q. The gap of their travel times must theoretically be L/V, where L is the length of the segment RS. If the real gap of the travel times is much larger than L/V, then some part of the rebar in the segment RS is corroded or some part of the concrete cover along the segment SQ is not sound. In either case, it is necessary to repair the structure. This non-destructive inspection technique highly depends on the knowledge of the precise position of the rebar, which shall be given in this paper. For the repair of the concrete cover, the knowledge of the precise position of the rebar is of great help too (cf. the advantage (c) above). For more detail on the non-destructive inspection of concrete cover, cf. Mita and Takiguchi (2018).
We also note that the advantage (e) is closely related to the development of a new ultrasonic imaging technique (cf. Mita and Takiguchi (2018), Takiguchi (2019, 2020)). As we have seen in this section, there are advantages in knowing the precise position of the rebar nearest to the edge surface. In this paper, as the first step to detect the precise position of the rebar by a non-destructive inspection, we develop a technique to probe the rebar ends precisely in a non-destructive way.
3 Precise Probe of the Rebar Ends
In this section, we establish a theory to detect the precise position of the rebar ends. Throughout this section, we assume that the natural number k, which represents the number of measured data, is sufficiently large. In our theory, we take many more measurement data than are required, in order to apply the idea of the least square solutions, which homogenizes some inhomogeneous structures and equalizes the errors and the noises in observation. Assume that a cuboid RC structure contains a rebar of diameter d in its interior (cf. Fig. 2). It is natural to assume that the diameter of the rebar is a priori known. Let us assume that this cuboid is represented as
$$\{0 \le x \le L,\ 0 \le y \le l,\ 0 \le z \le l\} \qquad (1)$$
for some positive numbers L and l. In Fig. 2, we set the axes as shown. To determine the average ultrasonic velocities in the test piece, let us consider the two-dimensional sections of the test piece (Figs. 3 and 4).
Fig. 2 A cuboid test piece
Fig. 3 Determination of v
Fig. 4 Determination of V
We first determine the average ultrasonic velocity v in concrete. We set the source point S and the observation point $O_i$ so that the ultrasonic P-wave does not pass through the rebar between these two points (cf. Fig. 3 for its image). We project the ultrasound at S and receive it at $O_i$. In this case, the ultrasonic P-wave travels along the segment $SO_i$ and the travel time $t_i$ is observed. We set such k observation points $O_i$, i = 1, 2, ..., k, for sufficiently large k, where the lengths $l_i$ of the segments $SO_i$ are a priori obtained. We determine the average ultrasonic velocity v in concrete as the minimizer
$$v = \frac{\sum_{i=1}^{k} t_i l_i}{\sum_{i=1}^{k} t_i^2} \quad \text{of} \quad \phi(v) := \sum_{i=1}^{k} (v t_i - l_i)^2. \qquad (2)$$
By application of the least square solution in the definition of the average ultrasonic velocity v in (2), the measurement errors are equalized and the concrete structure is homogenized. We then calculate the ultrasonic velocity V in the rebar. In Fig. 4, we project the ultrasound at the point O and receive it at the observation points $P_i$, i = 0, 1, 2, ..., k. In this case, the lengths $l_i$, i = 1, 2, ..., k, of the segments $P_0P_i$ are a priori known and the travel times $t_i$, i = 0, 1, 2, ..., k, of the ultrasonic P-wave between the two points O and $P_i$ are observed. The difference of the travel times $t_i - t_0$ represents the travel time in the segment $Q_0Q_i$ in the rebar. See Fig. 4 for its image. We also note that the length of the segment $Q_0Q_i$ is almost the same as the length $l_i$ of the segment $P_0P_i$. We reconstruct the ultrasonic velocity V in the rebar as the minimizer
$$V = \frac{\sum_{i=1}^{k} (t_i - t_0) l_i}{\sum_{i=1}^{k} (t_i - t_0)^2} \quad \text{of} \quad \varphi(V) := \sum_{i=1}^{k} \bigl(V (t_i - t_0) - l_i\bigr)^2. \qquad (3)$$
The reason for the application of the least square solution in (3) is the same as (2). Remark 1 In this paper, we homogenize the concrete structure, where the ultrasonic velocity v is defined by (2). This idea works very well in the reconstruction of rebar ends in RC structures by ultrasonic measurements, not only theoretically (in this section) but also for practical real RC structures (in the next section). This homogenization also works well for non-destructive inspection of concrete cover mentioned in the previous section (cf. Mita and Takiguchi (2018)).
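The closed-form minimizers in (2) and (3) are short enough to check numerically. The following sketch uses hypothetical function names; the sample values are the first four rows of Table 4 (distances in mm, travel times in µs) and are used only as an illustration, not as a reproduction of the value of v reported in Sect. 4.

```python
def ls_velocity(times, lengths):
    """Minimizer of sum_i (v * t_i - l_i)^2, i.e. v = sum(t_i l_i) / sum(t_i^2)."""
    numerator = sum(t * l for t, l in zip(times, lengths))
    denominator = sum(t * t for t in times)
    return numerator / denominator


def rebar_velocity(t0, times, lengths):
    """Same formula as (3): the travel times are first reduced by t0."""
    return ls_velocity([t - t0 for t in times], lengths)


# Source A6, observation points B6 to B9 (first four rows of Table 4).
lengths_mm = [530, 542, 535, 532]
times_us = [101.1, 105.4, 104.0, 104.2]
v_mm_per_us = ls_velocity(times_us, lengths_mm)  # 1 mm/µs = 1000 m/s
```

With only these four pairs the estimate is a little above 5 mm/µs; using all available pairs, as the text prescribes, averages the measurement errors over more data.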
Fig. 5 Reconstruction of the rebar ends
Let us reconstruct the rebar ends. We reconstruct the end b in Fig. 5. We fix the source point S on the surface A where we project the ultrasound, which is received at the observation points $B_i$, i = 0, 1, 2, ..., k, on the surface B. By preliminary experiments, we can set the points $S, B_0, B_1, \ldots, B_k$ so that the ultrasonic P-wave between the points S and $B_i$ passes through the whole rebar. See Fig. 5 for its image. Let b = (x, y, z) and let $r + r_i$ be the length of the segment $bB_i$, where $r_0 = 0$, that is, r is the length of the segment $bB_0$. We observe the travel times $t_i$, i = 0, 1, 2, ..., k, of the ultrasonic P-wave between the points S and $B_i$. Since we have homogenized the concrete structure (without the rebar), the following equality holds.
$$(t_i - t_0) v = r_i \qquad (4)$$
Therefore, we can calculate the $r_i$ from the observed data, which must contain errors. Let $B_i = (x_i, y_i, z_i)$, which may contain small errors. We obtain the following overdetermined system.
$$\begin{cases} (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = (r + r_0)^2 \\ (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = (r + r_1)^2 \\ \qquad \cdots \\ (x - x_k)^2 + (y - y_k)^2 + (z - z_k)^2 = (r + r_k)^2 \end{cases} \qquad (5)$$
In (5), the unknowns are x, y, z and r. What we would like to reconstruct are x, y and z, which is the main purpose; however, as an auxiliary result, we also reconstruct r. We also note that we set the coordinates of the end as b = (x, y, z). This is because the reconstruction of b by application of the least square solution gives the center point of the end circle of the rebar. Therefore, in this paper, we do not care about the diameter d of the rebar section; in other words, whatever the diameter d is, our reconstruction technique gives the same coordinates of the end b = (x, y, z). Subtracting the j-th equation from the i-th one in (5), dividing both sides of the difference by 2 and rearranging, we obtain the following overdetermined system of $\frac{k(k+1)}{2}$ linear equations:
$$(x_i - x_j) x + (y_i - y_j) y + (z_i - z_j) z + (r_i - r_j) r = \frac{1}{2} \left( x_i^2 + y_i^2 + z_i^2 + r_j^2 - x_j^2 - y_j^2 - z_j^2 - r_i^2 \right). \qquad (6)$$
Therefore, our problem comes down to giving an approximate solution to the following system of $n = \frac{k(k+1)}{2}$ linear equations:
$$\begin{cases} a_1 x + b_1 y + c_1 z + d_1 r = s_1 \\ a_2 x + b_2 y + c_2 z + d_2 r = s_2 \\ \qquad \cdots \\ a_n x + b_n y + c_n z + d_n r = s_n \end{cases} \qquad (7)$$
or equivalently
$$A \boldsymbol{x} = \boldsymbol{s}. \qquad (8)$$
With suitable and sufficiently many measurements, rank A = 4 is satisfied. Therefore, the least square solution to the system (7) (or the system (8)) is unique and is given by the unique solution to the following system (cf. Hadrien (2021), Nakamura (2007), Takiguchi (2015))
$${}^t\!A A \boldsymbol{x} = {}^t\!A \boldsymbol{s}, \qquad (9)$$
where ${}^t\!A$ is the transpose of the matrix A. By (9), we approximately reconstruct the position of the rebar end b as well as the length of the segment $bB_0$.
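The passage from the quadratic system (5) to the normal equations (9) can be sketched with NumPy. The coefficient of r in each row is obtained by expanding $(r + r_i)^2 - (r + r_j)^2 = 2(r_i - r_j) r + r_i^2 - r_j^2$. The function name is hypothetical, and the data are synthetic: the observation points reuse the coordinates listed in (12), while the rebar end is chosen arbitrarily and the offsets $r_i$ are computed exactly rather than measured.

```python
import itertools

import numpy as np


def reconstruct_end(points, r_offsets):
    """Least-squares solution (x, y, z, r) of the pairwise-difference system.

    points    : observation points B_i, shape (k + 1, 3)
    r_offsets : r_i = |b B_i| - |b B_0|, so that r_offsets[0] == 0
    """
    P = np.asarray(points, dtype=float)
    ri = np.asarray(r_offsets, dtype=float)
    rows, rhs = [], []
    for i, j in itertools.combinations(range(len(P)), 2):
        # Row: (x_i - x_j, y_i - y_j, z_i - z_j, r_i - r_j)
        rows.append(np.append(P[i] - P[j], ri[i] - ri[j]))
        rhs.append(0.5 * (P[i] @ P[i] - P[j] @ P[j] + ri[j] ** 2 - ri[i] ** 2))
    A = np.array(rows)
    s = np.array(rhs)
    # Normal equations: (A^T A) x = A^T s, assuming rank A = 4.
    return np.linalg.solve(A.T @ A, A.T @ s)


# Synthetic check with an arbitrarily chosen end and noise-free distances.
true_end = np.array([50.0, 95.0, 95.0])
obs = np.array([
    [0, 125, 125], [0, 75, 125], [0, 125, 75], [0, 75, 75],
    [25, 150, 125], [25, 150, 75], [25, 125, 150], [25, 75, 150],
], dtype=float)
dist = np.linalg.norm(obs - true_end, axis=1)
solution = reconstruct_end(obs, dist - dist[0])
```

With exact distances the solver returns the chosen end and the distance to the first observation point; with measured travel times the same call gives the least-squares compromise between inconsistent equations. For badly conditioned configurations, `np.linalg.lstsq(A, s)` is a numerically gentler alternative to forming the normal equations explicitly.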
4 Verification of the Theoretical Study by Experiments
In order to verify our theoretical study in the previous section, we made test pieces of reinforced concrete and experimented on them. We constructed three cuboid RC test pieces of the size 150 × 150 × 530 mm³ as shown in Fig. 2. In these test pieces, the diameter of the rebar is 10 mm and the coordinates of the ends were designed as
$$a = (50, 95, 95), \quad b = (440, 95, 95) \qquad (10)$$
before the construction. We note that the coordinates of the ends a, b represent the center of the section circle of the rebar, that is, in our test pieces, the rebar is designed to be located in the cylindrical domain
$$(y - 95)^2 + (z - 95)^2 \le 25, \quad 50 \le x \le 440. \qquad (11)$$
In this experiment, the measurement accuracy of the coordinate values is 1 mm. Therefore, all values for the coordinates are rounded off to integers. From the viewpoint of the maintenance of concrete structures, the accurate reconstruction of the y- and z-coordinates is much more important than that of the x-coordinate. In this experiment, the target accuracy of the reconstruction for the y- and z-coordinates is less than (the real value ±3 mm) and the target accuracy for the x-coordinate is (the real value ±5 mm). We show the materials, the mix proportion and the quality of the test pieces in Tables 1, 2 and 3. In Table 3, the first three data, the slump, the air content and the density, were obtained at the casting of the test pieces and the others were obtained at the age of eight weeks.

Table 1 Materials of the concrete
| Materials | Symbol | Type | Density (g/cm³) | Coefficient of water absorption (%) | Fineness modulus |
|---|---|---|---|---|---|
| Cement | C | Normal portland cement | 3.16 | – | – |
| Fine aggregate | S1 | Limestone | 2.69 | 1.02 | 2.79 |
| Fine aggregate | S2 | Sea sand | 2.57 | 1.51 | 2.14 |
| Fine aggregate | S | Mixed S1:S2 = 7:3 | 2.65 | 1.17 | 2.60 |
| Coarse aggregate | G | Limestone | 2.70 | 0.46 | 6.69 |
| Water-reducing admixture | – | MasterPolyheed15H[N] | 1.05 | – | – |
| Air-entraining admixture | – | MasterAir202 | 1.04 | – | – |

Table 2 Mix proportion of the concrete
| W/C (%) | A/C | S/A (%) | Unit volume C (kg/m³) | Unit volume W (kg/m³) | Unit volume S1 (kg/m³) | Unit volume S2 (kg/m³) | Unit volume G (kg/m³) | Admixture, water-reducing (kg/m³) | Admixture, air-entraining (kg/m³) |
|---|---|---|---|---|---|---|---|---|---|
| 46.2 | 5.64 | 41.6 | 331 | 153 | 544 | 233 | 1092 | 1.99 | 0.0132 |

Table 3 Quality of the concrete
| At the casting: slump (cm) | At the casting: air content (%) | At the casting: density (kg/l) | Standard curing (8 weeks): compressive strength (N/mm²) | Standard curing (8 weeks): density (kg/l) | Field curing (8 weeks): compressive strength (N/mm²) | Field curing (8 weeks): density (kg/l) |
|---|---|---|---|---|---|---|
| 11.0 | 5.9 | 2.33 | 52.8 | 2.39 | 35.7 | 2.38 |

We tried to detect the end a of one of the three test pieces by application of the theory developed in the previous section. Throughout this experiment, we projected the ultrasound at the points $A_i$ or $B_i$, i = 1, 2, ..., 9, in Fig. 6. When we projected the ultrasound at $A_i$, we received it at the points $B_1, B_2, \ldots, B_9$, $C_{*9}$, $C_{*0}$, $E_{*9}$ and $E_{*0}$, where ∗ = 0, 1, 2, 3, on each surface shown in Fig. 6, and vice versa. We first determined the average ultrasonic velocity v in concrete. We chose a source point and an observation point so that the ultrasonic P-wave between the two points would not go through the rebar; for example, the source point $A_3$ and the observation point $B_3$, $C_{10}$ or $E_{30}$ will do. We took as many such pairs of source and observation points as possible, together with the travel times, in order to determine the average ultrasonic velocity v by (2). See Table 4 for more detail. In our experiment, v was determined as v = 5088.4 m/s, which is faster than in usual concrete (between 3600 m/s and 5000 m/s), since we made the test pieces more solid than usual ones.
Fig. 6 Source and observation points
Fig. 7 The sections of the test piece
We then tried to reconstruct the rebar end a by application of the idea of the least square solutions. In Table 5, the data for the system (7) (or (8)) are given. Note that the coordinates of the observation points are
$$A_1 = (0, 125, 125),\ A_2 = (0, 75, 125),\ A_4 = (0, 125, 75),\ A_5 = (0, 75, 75),\ D_1 = (25, 150, 125),\ D_{11} = (25, 150, 75),\ E_1 = (25, 125, 150),\ E_{11} = (25, 75, 150). \qquad (12)$$
By application of our theory established in the previous section with the above data, the end a is reconstructed as
$$a = (56, 100, 101). \qquad (13)$$
As shown in Figs. 7 and 8, the real position of the rebar end a is
$$a = (51, 100, 100). \qquad (14)$$
Note that Fig. 8 is obtained by measuring all data of the sections shown in Fig. 7. Comparing (13) and (14), we can conclude that our reconstruction of the ends by non-destructive inspection is precise, especially in the reconstruction of the y- and z-coordinates, in terms of which our reconstruction (13) is much closer to the real position (14) than the designed one (10). We also note that our reconstruction satisfied the target accuracy mentioned at the beginning of this section, by which we mean that our reconstruction technique is practically applicable.

Table 4 The measured data applied to determine v
| Source pt. | Obs. pt. | Dist. (mm) | Travel time (µs) | Source pt. | Obs. pt. | Dist. (mm) | Travel time (µs) |
|---|---|---|---|---|---|---|---|
| A6 | B6 | 530 | 101.1 | B6 | A6 | 530 | 101.1 |
| A6 | B7 | 542 | 105.4 | B6 | A7 | 542 | 103.9 |
| A6 | B8 | 535 | 104.0 | B6 | A8 | 535 | 102.9 |
| A6 | B9 | 532 | 104.2 | B6 | A9 | 532 | 103.0 |
| A7 | B6 | 542 | 103.6 | B7 | A6 | 542 | 104.8 |
| A7 | B7 | 530 | 105.3 | B7 | A7 | 530 | 105.5 |
| A7 | B8 | 532 | 105.9 | B7 | A8 | 532 | 106.0 |
| A7 | B9 | 539 | 106.9 | B7 | A9 | 539 | 107.3 |
| A8 | B6 | 535 | 102.9 | B8 | A6 | 535 | 104.2 |
| A8 | B7 | 532 | 105.9 | B8 | A7 | 532 | 105.9 |
| A8 | B8 | 530 | 105.4 | B8 | A8 | 530 | 105.5 |
| A8 | B9 | 532 | 105.9 | B8 | A9 | 532 | 106.3 |
| A9 | B6 | 532 | 102.9 | B9 | A6 | 532 | 103.9 |
| A9 | B7 | 539 | 107.1 | B9 | A7 | 539 | 106.9 |
| A9 | B8 | 532 | 106.3 | B9 | A8 | 532 | 105.9 |
| A9 | B9 | 530 | 106.2 | B9 | A9 | 530 | 106.3 |

Table 5 The measured data applied to reconstruct the rebar end a
| Source pt. | Obs. pt. | Travel time (µs) | Source pt. | Obs. pt. | Travel time (µs) |
|---|---|---|---|---|---|
| B1 | A1 | 98.5 | B4 | A1 | 99.6 |
| B1 | A2 | 98.5 | B4 | A2 | 100.3 |
| B1 | A4 | 99.2 | B4 | A4 | 100.5 |
| B1 | A5 | 99.6 | B4 | A5 | 100.9 |
| B1 | D1 | 92.7 | B4 | D1 | 94.3 |
| B1 | D11 | 94.3 | B4 | D11 | 95.6 |
| B1 | E1 | 92.3 | B4 | E1 | 93.7 |
| B1 | E11 | 93.1 | B4 | E11 | 94.1 |
| B2 | A1 | 99.0 | B5 | A1 | 100.5 |
| B2 | A2 | 98.5 | B5 | A2 | 100.1 |
| B2 | A4 | 99.9 | B5 | A4 | 101.0 |
| B2 | A5 | 99.6 | B5 | A5 | 100.5 |
| B2 | D1 | 93.3 | B5 | D1 | 95.0 |
| B2 | D11 | 94.6 | B5 | D11 | 97.3 |
| B2 | E1 | 92.8 | B5 | E1 | 94.8 |
| B2 | E11 | 93.7 | B5 | E11 | 94.5 |

Fig. 8 Expansion plan of the test piece
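The comparison between (13), (14) and the target accuracy stated at the beginning of this section can be written as a small check. The function name is hypothetical, and the tolerances are read as inclusive bounds, which is an assumption about the wording of the targets.

```python
def within_target(reconstructed, real, tol_x=5, tol_yz=3):
    """True if the x error is within tol_x mm and the y, z errors within tol_yz mm."""
    dx, dy, dz = (abs(p - q) for p, q in zip(reconstructed, real))
    return dx <= tol_x and dy <= tol_yz and dz <= tol_yz


reconstructed_a = (56, 100, 101)  # (13)
real_a = (51, 100, 100)           # (14)
designed_a = (50, 95, 95)         # (10)

reconstruction_ok = within_target(reconstructed_a, real_a)
design_ok = within_target(designed_a, real_a)
```

The reconstruction (13) meets the targets, with errors of 5 mm in x, 0 mm in y and 1 mm in z, whereas the designed position (10) is 5 mm away from the real one in both y and z.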
5 Conclusion and Open Problems
To close this paper, we summarize its conclusions and mention open problems left to be solved for further development. We first summarize the conclusions.
(i) We have developed a non-destructive inspection technique to exactly probe the rebar ends by application of ultrasonic measurements. It may be the first technique to exactly probe the rebar ends in a non-destructive way.
(ii) In order to obtain the above conclusion (i), the idea of 'the least square solutions' played an important role.
(iii) Our theoretical study to precisely probe the rebar ends (the conclusion (i)) has been verified to be good by our experiments (Sect. 4).
There are open problems left to be solved for further development, which we discuss here.
Problem 1 (Open problems for further development)
(i) Though the reconstruction of the y- and z-coordinates of the rebar end is good and sophisticated, the reconstruction of the x-coordinate is not so sharp, which must be improved.
(ii) It is very important to apply our theory to the maintenance of infrastructures.
As a solution to the problem (i) in Problem 1, the authors are developing another non-destructive inspection technique to exactly probe the rebar end with application of the electromagnetic induction technique. It shall be discussed in our forthcoming paper. A precise probe of the rebar is very important in relation to the development of an ultrasonic imaging technique for RC structures (cf. Mita and Takiguchi (2018)). As for the problem (ii), the authors are preparing to conduct experiments on expressways.
Acknowledgements Our research unit is based on collaboration among a private company (Hashizume and Takabatake), mathematics (Takiguchi) and engineering (Ochi), which was organized on the occasion of the international workshop "Practical inverse problems based on interdisciplinary and industry-academia collaboration" held at IMI, Kyushu University, Japan, from Oct. 24th to Oct. 27th, 2017.
References

Hadrien J (2021) Essential math for data science. O’Reilly Media Inc., USA, Sebastopol
Mita N, Takiguchi T (2018) Principle of ultrasonic tomography for concrete structures and nondestructive inspection of concrete cover for reinforcement. Pac J Math Ind 10:6. https://doi.org/10.1186/s40736-018-0040-0
Nakamura I (2007) Linear algebra. Sugaku Shobou, Tokyo (in Japanese)
Takiguchi T (2015) How the computerized tomography was practicalized. Bull Jpn Soc Symb Algebr Comput 21:50–57
Takiguchi T (2019) Ultrasonic tomographic technique and its applications. Appl Sci 9:1005. https://doi.org/10.3390/app9051005
Takiguchi T (2020) A theoretical study of the algorithm to practicalize CT by G. N. Hounsfield and its applications. J Ind Appl Math 37:115–130. https://doi.org/10.1007/s13160-019-00391-1
Recommendation to Teach How to Analyze an Overdetermined System of Linear Equations with No Solution in the Class of Linear Algebra

Takayuki Ochi, Kenji Hashizume, Satoshi Ishikawa, Toshiaki Takabatake, and Takashi Takiguchi

Abstract In the course of elementary linear algebra, it is a must to teach how to solve a system of linear equations with plural solutions; however, it is rarely taught how to analyze an overdetermined system of linear equations with no solution, which is often required in practical applications. In this paper, we discuss how to analyze an overdetermined system of linear equations with no solution, which is important from the viewpoint of both practical applications and mathematical education.

Keywords Least square solution · Overdetermined system of linear equations with no solution
1 Introduction

In this paper, we recommend teaching how to analyze an overdetermined system of linear equations with no solution in the course of elementary linear algebra. In such a course, it is a must to teach how to solve a system of linear equations with plural solutions, but it is rarely taught how to analyze an overdetermined system of linear equations with no solution, which is important from the viewpoint of both practical application and education of linear algebra. In this paper, we study this problem. We develop our theory in the following way.

§2. Motivation of this research
§3. How to analyze an overdetermined system of linear equations with no solution
§4. Importance to teach how to analyze a system of linear equations with no solution
§5. Conclusion

In the next section, we discuss the motivation of this research as well as its importance. In the third section, we discuss how to teach the analysis of an overdetermined system of linear equations with no solution in the course of linear algebra. In the fourth section, we discuss the importance of teaching the analysis studied in the third section. In the final section, we summarize the conclusion of this paper and sum up the teaching materials proposed in it.

T. Ochi
Tohoku Polytechnic College, 26 Tsukidatehagisawa, Kurihashi, Miyagi 987-2223, Japan
K. Hashizume · T. Takabatake
West Nippon Expressway Engineering Shikoku Company Limited, 3-1-1, Hanazonomachi, Takamatsu, Kagawa 760-0072, Japan
S. Ishikawa
Polytechnic University of Japan, 2-32-1, OgawaNishimachi, Kodaira, Tokyo 187-0035, Japan
T. Takiguchi
National Defense Academy of Japan, 1-10-20, Hashirimizu, Yokosuka, Kanagawa 239-8686, Japan

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_8
2 Motivation of This Research

The motivation of this research is based on the analysis of the following overdetermined system, frequently obtained in practical measurements.

$$
\begin{cases}
(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = r_1^2 \\
\qquad \cdots \\
(x - x_n)^2 + (y - y_n)^2 + (z - z_n)^2 = r_n^2
\end{cases}
\tag{1}
$$

where $n > 3$. In the system (1), we measure the distance between the object and the measurement points, where $(x, y, z)$ is the coordinate of the unknown object which is to be reconstructed, the $(x_j, y_j, z_j)$'s are the coordinates of the measurement points and the $r_j$'s are the measured distances. We know the $(x_j, y_j, z_j)$'s and the $r_j$'s by measurements and try to reconstruct $(x, y, z)$. If the measurements are accurate and contain no errors, then the system must have the unique solution $(x, y, z)$, however overdetermined the system is. It is, however, impossible for each equation to be exact, since it is impossible to give measurements with no errors. Therefore, the overdetermined system (1) with errors admits no solution in general. Even if the system (1) admits no solution because of the errors, we have to reconstruct the object. In order to construct an approximate solution to the system (1) containing errors, we subtract the $j$-th equation from the $i$-th one, rearrange the terms and divide both sides by 2, and obtain the following overdetermined system of $\frac{n(n-1)}{2}$ linear equations;

$$
(x_i - x_j)x + (y_i - y_j)y + (z_i - z_j)z = \frac{1}{2}\left(x_i^2 + y_i^2 + z_i^2 - x_j^2 - y_j^2 - z_j^2 + r_j^2 - r_i^2\right).
\tag{2}
$$

Therefore, our problem comes down to giving an approximate solution to the following system of $k = \frac{n(n-1)}{2}$ linear equations, which we write in the general form

$$
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = s_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = s_2 \\
\qquad \cdots\cdots\cdots \\
a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{kn}x_n = s_k
\end{cases}
\tag{3}
$$

or equivalently

$$
A\boldsymbol{x} = \boldsymbol{s}.
\tag{4}
$$

Note that in the general form (3) the letter $n$ denotes the number of unknowns; for the system (2) the unknowns are $x$, $y$, $z$.
In practice, the system (3) admits no solution because of the errors; the object to be reconstructed, however, really exists. Therefore, we have to give an approximate solution to (3) which serves as an approximation of the object, for which we recommend introducing the idea of the least square solutions to (3). We claim that this idea is very important from the viewpoint of both education of linear algebra and practical applications.
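The passage from the quadratic system (1) to the linear system (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five measurement points, the target point and the helper name `linearize` are hypothetical and do not come from the experiments, and the distances are computed exactly so that every equation of (2) can be checked against the target.

```python
from itertools import combinations
from math import dist


def linearize(points, radii):
    """Build system (2): one linear equation per pair (i, j) of equations in (1)."""
    A, s = [], []
    for (p, rp), (q, rq) in combinations(zip(points, radii), 2):
        # Left-hand side coefficients (x_i - x_j, y_i - y_j, z_i - z_j).
        A.append([pc - qc for pc, qc in zip(p, q)])
        # Right-hand side (|P_i|^2 - |P_j|^2 + r_j^2 - r_i^2) / 2.
        s.append(0.5 * (sum(c * c for c in p) - sum(c * c for c in q)
                        + rq ** 2 - rp ** 2))
    return A, s


# Hypothetical measurement points and target (not taken from the paper).
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
target = (0.3, 0.4, 0.5)
radii = [dist(p, target) for p in points]  # exact, error-free distances

A, s = linearize(points, radii)  # k = 5 * 4 / 2 = 10 linear equations

# With exact distances, the target satisfies every linear equation.
defects = [sum(a * t for a, t in zip(row, target)) - si
           for row, si in zip(A, s)]
```

With measured (hence perturbed) radii, the same construction yields a system of the form (3) whose defects can no longer all vanish, which is the situation studied in the next section.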
3 How to Analyze an Overdetermined System of Linear Equations with No Solution

In this section, we review the idea of the least square solutions to an overdetermined system of linear equations. Consider the following system of six linear equations, for example,

$$
\begin{cases}
x + y = 3 \\
x + y = 1 \\
x = 2 \\
x = 0 \\
y = 2 \\
y = 0
\end{cases}
\tag{5}
$$

This system trivially has no solution. We have constructed this system by adding the error of 1 or −1 to each equation for $x = y = 1$. We claim that if we can exclude or average the effect of the errors then we obtain $x = y = 1$ as a substitute for the solution to (5). Let us see another example.

$$
\begin{cases}
(x - 1)^2 + (y - 1)^2 + (z - 1)^2 = 2.8 \\
(x - 1)^2 + (y - 1)^2 + (z + 1)^2 = 2.9 \\
(x - 1)^2 + (y + 1)^2 + (z + 1)^2 = 3.1 \\
(x - 1)^2 + (y + 1)^2 + (z - 1)^2 = 3.2 \\
(x + 1)^2 + (y - 1)^2 + (z - 1)^2 = 3.1 \\
(x + 1)^2 + (y - 1)^2 + (z + 1)^2 = 3.2 \\
(x + 1)^2 + (y + 1)^2 + (z + 1)^2 = 2.8 \\
(x + 1)^2 + (y + 1)^2 + (z - 1)^2 = 2.9
\end{cases}
\tag{6}
$$
Fig. 1 Measurement from 8 points to the origin
In this system, we measure the distances between eight observation points and the origin in the three-dimensional space as shown in Fig. 1. The accurate value of each right-hand side in the system (6) must be 3.0; however, we have added the error of ±0.1 or ±0.2 to each one. In order to exclude or average the effect of the errors, we apply the idea of the least square solution. To introduce it, let us discuss the following system of linear equations.

Problem 1 Solve the following system of linear equations in $y_1, y_2, \ldots, y_n$.

$$
\begin{cases}
b_{11}y_1 + b_{12}y_2 + \cdots + b_{1n}y_n = s_1 \\
b_{21}y_1 + b_{22}y_2 + \cdots + b_{2n}y_n = s_2 \\
\qquad \cdots\cdots\cdots \\
b_{m1}y_1 + b_{m2}y_2 + \cdots + b_{mn}y_n = s_m
\end{cases}
\tag{7}
$$

or equivalently

$$
B\boldsymbol{y} = \boldsymbol{s},
\tag{8}
$$

where $B$ is the coefficient matrix in (7), $\boldsymbol{y} = {}^t(y_1, y_2, \ldots, y_n)$ and $\boldsymbol{s} = {}^t(s_1, s_2, \ldots, s_m)$. If the system (7) (or (8)) allows a solution $\boldsymbol{y}$ then we have

$$
B\boldsymbol{y} - \boldsymbol{s} = \boldsymbol{0} \iff \|B\boldsymbol{y} - \boldsymbol{s}\| = 0
\tag{9}
$$

where, for $\boldsymbol{a} = B\boldsymbol{y} - \boldsymbol{s}$, $\|\boldsymbol{a}\| = \sqrt{a_1^2 + \cdots + a_m^2}$ is the length of the vector $\boldsymbol{a} = {}^t(a_1, a_2, \ldots, a_m) \in \mathbb{R}^m$. If the system (7) (or (8)) allows no solution, then we define the least square solutions to the system (7) (or (8)) as the minimizers $\boldsymbol{y}$ of the length $\|B\boldsymbol{y} - \boldsymbol{s}\|$.

Definition 1 A least square solution to the system (7) (or (8)) is defined as a minimizer $\boldsymbol{y}_0$ of the norm

$$
\|B\boldsymbol{y}_0 - \boldsymbol{s}\| = \min_{\boldsymbol{y} \in \mathbb{R}^n} \|B\boldsymbol{y} - \boldsymbol{s}\|.
\tag{10}
$$

We usually minimize

$$
\|B\boldsymbol{y} - \boldsymbol{s}\|^2 = \langle B\boldsymbol{y} - \boldsymbol{s}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m},
\tag{11}
$$

where $\langle \cdot, \cdot \rangle_{\mathbb{R}^m}$ represents the inner product in $\mathbb{R}^m$, in order to obtain a least square solution to the system (7) (or (8)). This is why a minimizer of (10) (equivalently, a minimizer of (11)) is called a least square solution. For the least square solutions to the system (7), the following theorem is known.

Theorem 1 Let $M_{mn}(\mathbb{R})$ be the set of $m \times n$ matrices whose components are real numbers. For $B \in M_{mn}(\mathbb{R})$, the following conditions are equivalent.
(i) $\boldsymbol{y} \in \mathbb{R}^n$ is a least square solution to (7).
(ii) For any $\boldsymbol{z} \in \mathbb{R}^n$, there holds

$$
\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m} = 0.
\tag{12}
$$

(iii) There holds the following equation.

$$
{}^tB B\boldsymbol{y} - {}^tB\boldsymbol{s} = \boldsymbol{0}.
\tag{13}
$$
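Proof Let us briefly review the proof of this theorem, which is given in Nakamura (2007). We first prove that (i) ⇔ (ii). The condition (i) is equivalent to the following condition; for any $\boldsymbol{z} \in \mathbb{R}^n$ with $\boldsymbol{z} \notin \operatorname{Ker}(B)$, the function

$$
\varphi(t) := \|B(\boldsymbol{y} + t\boldsymbol{z}) - \boldsymbol{s}\|^2 = \langle B(\boldsymbol{y} + t\boldsymbol{z}) - \boldsymbol{s}, B(\boldsymbol{y} + t\boldsymbol{z}) - \boldsymbol{s} \rangle_{\mathbb{R}^m}
\tag{14}
$$

takes its minimal value at $t = 0$. This condition is equivalent to $\varphi'(0) = 0$, since the function $\varphi$ is a quadratic polynomial in $t$ whose coefficient for the quadratic term is positive (see (15) below). The following representation

$$
\begin{aligned}
\varphi(t) &= t^2\|B\boldsymbol{z}\|^2 + 2t\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m} + \|B\boldsymbol{y} - \boldsymbol{s}\|^2, \\
\varphi'(t) &= 2t\|B\boldsymbol{z}\|^2 + 2\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m}, \\
\varphi'(0) &= 2\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m},
\end{aligned}
\tag{15}
$$

yields that the condition $\varphi'(0) = 0$ is equivalent to the condition (ii), where we note that the condition (ii) trivially holds for any $\boldsymbol{z} \in \operatorname{Ker}(B)$. We then prove that (ii) ⇔ (iii). The condition (ii) is equivalent to the condition that for any $\boldsymbol{z} \in \mathbb{R}^n$, there holds

$$
\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m} = \langle \boldsymbol{z}, {}^tB(B\boldsymbol{y} - \boldsymbol{s}) \rangle_{\mathbb{R}^n} = 0,
$$

which is equivalent to the condition

$$
\mathbb{R}^n \ni {}^tB(B\boldsymbol{y} - \boldsymbol{s}) = \boldsymbol{0}.
\tag{16}
$$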
This theorem holds for any linear system (7), whether the system is overdetermined or not. In order to discuss the uniqueness of the least square solution, we prepare the following proposition.

Proposition 1

$$
\operatorname{rank}(B) = \operatorname{rank}({}^tB B).
\tag{17}
$$

This proposition is proved by showing

$$
\operatorname{Ker}(B) = \operatorname{Ker}({}^tB B).
\tag{18}
$$
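For completeness, one standard way to obtain (18) and then (17) is sketched below. This argument is not quoted from the source; it only uses the adjoint identity for the inner product (stated as (20) in Sect. 4.1.1) and the rank formula (stated as (22) there).

```latex
\begin{align*}
\boldsymbol{y} \in \operatorname{Ker}(B)
  &\;\Longrightarrow\; {}^tB B\boldsymbol{y} = {}^tB\,\boldsymbol{0} = \boldsymbol{0}
  \;\Longrightarrow\; \boldsymbol{y} \in \operatorname{Ker}({}^tB B), \\
\boldsymbol{y} \in \operatorname{Ker}({}^tB B)
  &\;\Longrightarrow\; \|B\boldsymbol{y}\|^2
     = \langle B\boldsymbol{y}, B\boldsymbol{y} \rangle_{\mathbb{R}^m}
     = \langle \boldsymbol{y}, {}^tB B\boldsymbol{y} \rangle_{\mathbb{R}^n} = 0
  \;\Longrightarrow\; \boldsymbol{y} \in \operatorname{Ker}(B), \\
\operatorname{rank}(B)
  &= n - \dim \operatorname{Ker}(B)
   = n - \dim \operatorname{Ker}({}^tB B)
   = \operatorname{rank}({}^tB B).
\end{align*}
```

The last line uses that both $B$ and ${}^tB B$ have $n$ columns, so the same $n$ appears in the rank formula for each of them.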
By Proposition 1 and the condition (iii) in Theorem 1, we obtain the uniqueness of the least square solution to the system (7) (or (8)).

Theorem 2 If $\operatorname{rank}(B) = n$ for $B \in M_{mn}(\mathbb{R})$, then the least square solution to the system (7) is uniquely given as the unique solution to the following system;

$$
{}^tB B\boldsymbol{y} = {}^tB\boldsymbol{s}.
\tag{19}
$$
Remark 1 In practice, the condition $\operatorname{rank}(B) = n$ is usually satisfied by taking suitable and sufficient measurements. For more details, see the practical applications in Sect. 4.2 below.
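Theorem 2 can be tried out directly on the system (5). The following pure-Python sketch solves the normal equations (19) and then checks the condition (iii) of Theorem 1 and the minimality in Definition 1 numerically. The helper names `least_squares` and `residual_sq` are hypothetical, and the small Gauss-Jordan solver is just one possible way to solve (19) under the assumption rank(B) = n.

```python
def least_squares(B, s):
    """Solve the normal equations (19), tB B y = tB s, assuming rank(B) = n."""
    n = len(B[0])
    # Augmented matrix [tB B | tB s].
    M = [[sum(row[i] * row[j] for row in B) for j in range(n)]
         + [sum(row[i] * si for row, si in zip(B, s))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def residual_sq(B, s, y):
    """Squared length ||By - s||^2 as in (11)."""
    return sum((sum(b * yj for b, yj in zip(row, y)) - si) ** 2
               for row, si in zip(B, s))


# Coefficient matrix and right-hand side of the system (5).
B = [[1, 1], [1, 1], [1, 0], [1, 0], [0, 1], [0, 1]]
s = [3, 1, 2, 0, 2, 0]

y0 = least_squares(B, s)  # approximately [1.0, 1.0]

# Condition (iii): tB (B y0 - s) should vanish up to rounding.
residual = [sum(b * yj for b, yj in zip(row, y0)) - si
            for row, si in zip(B, s)]
gradient = [sum(row[j] * r for row, r in zip(B, residual))
            for j in range(len(y0))]
```

Evaluating `residual_sq(B, s, y)` at any other point, for example `[1.1, 0.9]`, gives a larger value than at `y0`, which is the minimality required in Definition 1.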
4 Importance to Teach How to Analyze an Overdetermined System of Linear Equations with No Solution

In this section, we discuss how important it is to teach how to construct a least square solution to an overdetermined system of linear equations (7), even if it admits no solution. We discuss the importance from two viewpoints. One is the viewpoint of the education of linear algebra (Sect. 4.1), and the other is the viewpoint of practical applications (Sect. 4.2).

4.1 Educational Viewpoint

There are a number of advantages in teaching how to construct a least square solution to an overdetermined system of linear equations (7) in mathematical education, especially in the education of linear algebra.

(a) It contains good teaching materials of linear algebra.
(b) In the proof of Theorem 1, we learn the integration of linear algebra with other fields, such as minimizing problems, the variational principle and so on, which can be good teaching materials.
(c) It is not easy to introduce practical applications in the class of elementary linear algebra without introducing difficult knowledge of other fields. By teaching how to construct a least square solution to an overdetermined system of linear equations, however, we can introduce some good practical applications in the class of linear algebra without introducing such knowledge.

In what follows in this subsection, let us explain each advantage.
4.1.1 Teaching Material of Linear Algebra

There are many good teaching materials of linear algebra in the contents of the third section. Let us enumerate them.

• In the proof of Theorem 1, we have to review

$$
\langle A\boldsymbol{x}, \boldsymbol{y} \rangle_{\mathbb{R}^m} = \langle \boldsymbol{x}, {}^tA\boldsymbol{y} \rangle_{\mathbb{R}^n} \quad \text{for } A \in M_{mn}(\mathbb{R}),\ \boldsymbol{x} \in \mathbb{R}^n,\ \boldsymbol{y} \in \mathbb{R}^m,
\tag{20}
$$

which is a good exercise on the inner product, the product of matrices and their transposes.

• In the proof of Theorem 1, the fact

$$
\langle \boldsymbol{a}, \boldsymbol{b} \rangle_{\mathbb{R}^n} = 0 \ \text{for } \forall \boldsymbol{a} \in \mathbb{R}^n \iff \boldsymbol{b} = \boldsymbol{0}
\tag{21}
$$

was made use of. Although it is an easy exercise to show (21), this fact is frequently applied to show that a vector is the zero vector. The proof of Theorem 1 is one such typical example.

• In order to introduce the proof of Proposition 1, we have to review the following fact;

$$
\operatorname{rank} A = n - \dim(\operatorname{Ker} A) = \dim(\operatorname{Ker} A)^{\perp} \quad \text{for } A \in M_{mn}(\mathbb{R}).
\tag{22}
$$

It is very important to review this fact. It is a very good teaching material for students to understand the rank of a matrix in connection with the dimension of the subspace $(\operatorname{Ker} A)^{\perp} \subset \mathbb{R}^n$, where we also have to review the orthogonal decomposition

$$
\mathbb{R}^n = \operatorname{Ker} A \oplus (\operatorname{Ker} A)^{\perp}.
\tag{23}
$$

Therefore, we claim that there are very important teaching materials of linear algebra in the proof of Proposition 1.
4.1.2 Integration of Linear Algebra with Other Academic Fields

In the contents of Sect. 3, there are good teaching materials for the integration of linear algebra with other academic fields, such as minimizing problems, the variational principle, elementary calculus and so on. Let us enumerate them.

• In the proof of Theorem 1, we have applied the variational principle. When we give a mathematical representation of a maximal or minimal state, we often give a function $\varphi(t)$ of one variable $t \in \mathbb{R}$ whose maximal or minimal value is attained at $t = 0$. A necessary condition for the maximal or minimal state is given by $\varphi'(0) = 0$. The variational principle is very important for applications in many fields, because we are often required to give a mathematical representation of a maximal or minimal state, for example, to describe the states of the minimal cost, the maximal energy, the maximal effect and so on. Therefore, it is very important to learn the elementary idea of the variational principle, for which the proof of Theorem 1 is a good and suitable teaching material.

• For the variational method mentioned above, we have to compute $\varphi'(t)$ and impose $\varphi'(0) = 0$, where the orthogonal relation (21) in terms of the inner product played an important role. Therefore, the proof of Theorem 1 serves as a good teaching material to integrate linear algebra with elementary calculus, especially with differential calculus.

• In the proof of Theorem 1, since the function $\varphi$ in (14) to be minimized is quadratic, we can represent its minimal state by completing the square, without applying differential calculus; for $\boldsymbol{z} \in \mathbb{R}^n$ with $\boldsymbol{z} \notin \operatorname{Ker}(B)$, there holds

$$
\begin{aligned}
\varphi(t) &= t^2\|B\boldsymbol{z}\|^2 + 2t\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m} + \|B\boldsymbol{y} - \boldsymbol{s}\|^2 \\
&= \|B\boldsymbol{z}\|^2 \left( t + \frac{\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m}}{\|B\boldsymbol{z}\|^2} \right)^2 - \frac{\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m}^2}{\|B\boldsymbol{z}\|^2} + \|B\boldsymbol{y} - \boldsymbol{s}\|^2.
\end{aligned}
\tag{24}
$$

By (24), the function $\varphi$ becomes minimal at $t = -\frac{\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m}}{\|B\boldsymbol{z}\|^2}$. On the other hand, the condition (i) means that $\varphi$ takes its minimal value at $t = 0$. Therefore, the condition (i) holds if and only if

$$
\langle B\boldsymbol{z}, B\boldsymbol{y} - \boldsymbol{s} \rangle_{\mathbb{R}^m} = 0,
\tag{25}
$$

which proves (i) ⇔ (ii) in Theorem 1. Therefore, the integration with differential calculus mentioned in the above point can be replaced by the integration with the maxima and minima of quadratic functions, which is also important for the research of some variational problems.

• Let us compare the two proofs of (i) ⇔ (ii) in Theorem 1. We have discussed the integration of linear algebra and differential calculus in the proof given in the third section, whose ideas are very important for learning the elementary idea of the variational method. On the other hand, we have given another proof in the above point by completing the square for the quadratic function, which is much simpler than the original proof given in the third section. Let the students in the class of elementary linear algebra discuss the advantages and the disadvantages of each proof. For example, if the purpose is only to give a proof of (i) ⇔ (ii) in Theorem 1, it is much better to give the proof by completing the square, which is simpler and requires no additional knowledge, namely, the differential calculus in this case. However, the original proof given in Sect. 3 contains important ideas: the elements of the variational method, the integration of linear algebra with differential calculus and so on. This kind of discussion helps students discover and understand various ways of thinking in mathematics.
4.1.3 Examples of Practical Applications

It is difficult to find interesting and suitable examples of practical applications to be introduced in the class of linear algebra without introducing difficult knowledge of other fields. There are, of course, many practical applications of linear algebra; however, in order to introduce them, we have to introduce some knowledge of other fields, e.g., engineering, economics and so on, in accordance with the examples, which makes it difficult to introduce interesting examples of practical applications in the class of linear algebra. By introducing the idea of the least square solution to an overdetermined system of linear equations, however, we can introduce interesting practical applications in the class of linear algebra without relying on difficult knowledge of other fields. We shall introduce such examples in the next subsection.
4.2 Viewpoint of Practical Applications

In this subsection, we introduce practical applications of the least square solutions to the overdetermined system (7). We introduce three examples, where the hypothesis of Theorem 2, namely that the rank of the coefficient matrix is equal to the number $n$ of unknowns, is satisfied by taking suitable and sufficient measurements.
4.2.1 Computerized Tomography (CT)

Though the essential content of this subsubsection is quoted from Takiguchi (2020), for the readers' convenience we introduce the application of the least square solutions to CT from the beginning, accepting some redundancy with Takiguchi (2020). We consider a section of the human body by a plane. We define the coordinate so that this plane is given by $\{(x, y, z) \in \mathbb{R}^3 \mid z = 0\}$. We let $f(x, y) = f(x, y, 0)$ be the density distribution of the human body in the plane $\{(x, y, z) \in \mathbb{R}^3 \mid z = 0\}$. In the mathematical model of CT, the condition $f \in L^1 \cap L^2(\mathbb{R}^2)$ is usually satisfied. In the class of linear algebra, it is sufficient to assume that the function $f(x, y)$ is bounded and the area of the set of the discontinuous points of $f$ is 0.

We shortly review the mathematical model of CT. The X-rays are projected to the section of the human body as in Fig. 2.

Fig. 2 Image of CT scan

Because the X-ray propagates rectilinearly, we take out the line along which the X-ray propagates and let it be the x-axis. We also let the point where the X-ray enters the human body be $x = a$ and the point where the X-ray goes out of the human body be $x = b$. Let $f(x)$ be the density distribution of the human body and $I(x)$ be the strength of the X-ray, for $a \le x \le b$. The attenuated amount $-dI(x)$ of the X-ray by propagating from the point $x \in (a, b)$ by the length $dx$ is given by

$$
-dI(x) = -I'(x)\,dx = -(I(x + dx) - I(x)) = f(x)I(x)\,dx,
\tag{26}
$$

hence we have

$$
\log I(a) - \log I(b) = \int_a^b f(x)\,dx.
\tag{27}
$$
By (27), observing $I(a)$ and $I(b)$, we obtain the integral of $f(x)$ over the interval $[a, b]$. Therefore, the mathematical problem of CT is formulated as follows.

Problem 2 Determine the function $f(x, y)$ defined on $\mathbb{R}^2$ from the data of its line integrals $\int_l f(x, y)\,dl$ along all lines $l$ in $\mathbb{R}^2$.

We refer the readers to Natterer (2001), Takiguchi (2015) for the properties of the X-rays, the introduction of the mathematical model for CT and the introduction of Problem 2. The authors claim that the mathematical model for CT is a good teaching material for ordinary differential equations (ODE), since it is a good exercise of ODE and it serves as an interesting example of a practical application.

G. N. Hounsfield was the first to put a medical CT device into practical use. In usual numerical analysis, we try to directly discretize a mathematical formula obtained by mathematical analysis in order to implement it for practical application. However, the essence of Hounsfield's idea is to discretize the mathematical model itself, not a mathematical formula. This is very interesting and worked very well for the practical realization of CT. For CT, we can assume that $\operatorname{supp} f$ is compact since $f$ is a density distribution of the human body. We cover $\operatorname{supp} f$ with $n$ congruent squares $c_1, c_2, \ldots, c_n$, whose sides are parallel to the x- or y-axis, whose areas are all equal to $c = |c_j|$, and whose interiors are mutually disjoint. Approximate the function $f$ by a function $g_c(x, y)$ defined as

$$
g_c(x, y) = \sum_{j=1}^{n} x_j \chi_{c_j}(x, y),
\tag{28}
$$

where $\chi_{c_j}(x, y)$ is the characteristic function of the square $c_j$ and the $x_j$ are unknowns. For example, if we take $x_j$ as the integral mean of $f(x, y)$ in the square $c_j$ then it seems easy to understand Hounsfield's idea. We call the function $g_c(x, y)$ a pixel function. There holds that

$$
\lim_{c \to 0} g_c(x, y) = f(x, y) \quad \text{in } L^1(\mathbb{R}^2),
\tag{29}
$$

by the definition of the Lebesgue integral. In the class of elementary linear algebra, it is sufficient to mention that $\lim_{c \to 0} g_c(x, y) = f(x, y)$ for any point $(x, y)$ of continuity of $f(x, y)$, in place of (29). It is very important in practice to construct approximate solutions for the $x_j$'s for small $c > 0$ from measurement data containing errors and noises, which is what Hounsfield tried. He never tried to reconstruct the original function $f(x, y)$ itself from the observed data, which necessarily contain errors in various senses. This idea is very nice. Since the best we can hope for is to obtain an approximation of $f$, not to reconstruct $f$ itself, it is very flexible to approximate $f$ by a suitably simple function $g_c$ so that the problem is simplified.

Let $m$ X-rays, $I_1, I_2, \ldots, I_m$, be projected to the human body. For the $i$-th X-ray, we let $I_i^0$ be the strength of $I_i$ before entering the human body and $I_i^1$ be the one after passing through the human body. By $l_1, l_2, \ldots, l_m$, we denote the lines along which the X-rays $I_1, I_2, \ldots, I_m$ propagate rectilinearly, respectively, and by $a_{ij}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, we denote the length of $l_i \cap c_j$. Letting $s_i := \log I_i^0 - \log I_i^1$, $i = 1, 2, \ldots, m$, together with (28), the problem we have to solve turns out to be the following one (cf. Takiguchi (2015)).

Problem 3 Give an approximate solution to the following system of linear equations in $x_1, x_2, \ldots, x_n$.

$$
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = s_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = s_2, \\
\qquad \cdots\cdots\cdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = s_m,
\end{cases}
\tag{30}
$$

or equivalently

$$
A\boldsymbol{x} = \boldsymbol{s}.
\tag{31}
$$
As a simple model of (30), let us review the system (5). If we project six X-rays to an object consisting of two square pixels whose sides are of length 1 (cf. Fig. 3), then we obtain the system (5).

Fig. 3 Image of the system (5)

In medical CT, we have to treat a huge system with, for instance, $m > 1 \times 10^9$ and $n > 1 \times 10^7$. Because of the fact $m \gg n$ and the effect of the errors mentioned above, the overdetermined system (30) has no solution. Although the system (30) has no solution, the density distribution $f(x, y)$ of the human body undoubtedly exists and must be approximately reconstructed. We note that in this case the probability for $\operatorname{rank}(A) < n$ is 0, so we may assume that $\operatorname{rank}(A) = n$. Therefore, the idea of the least square solution studied in the previous section plays an important role. For more details of Hounsfield's idea for CT, cf. Takiguchi (2020). This idea has a strong connection with the development of a new ultrasonic imaging technique, for which cf. Takiguchi (2019).
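The chain from measured X-ray strengths to pixel values can be sketched on the two-pixel model. In the following pure-Python illustration, the path lengths $a_{ij}$ are read off from the coefficient matrix of (5) (two rays crossing both pixels, two crossing only the first and two crossing only the second, which is an assumption about the geometry of Fig. 3), while the incident strength `I0` and the way the exit strengths are generated are hypothetical. The helper `least_squares` solves the normal equations of Theorem 2.

```python
from math import exp, log


def least_squares(A, s):
    """Solve the normal equations tA A x = tA s, assuming rank(A) = n."""
    n = len(A[0])
    # Augmented matrix [tA A | tA s].
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * si for row, si in zip(A, s))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


# a[i][j]: length of ray i inside pixel j (coefficient matrix of (5)).
a = [[1, 1], [1, 1], [1, 0], [1, 0], [0, 1], [0, 1]]

# Hypothetical measurements: each ray enters with strength I0; the exit
# strengths are generated from the right-hand sides of (5) via (27).
I0 = 1000.0
I1 = [I0 * exp(-si) for si in (3, 1, 2, 0, 2, 0)]

# s_i = log I_i^0 - log I_i^1, as defined before Problem 3.
s = [log(I0) - log(i1) for i1 in I1]

densities = least_squares(a, s)  # pixel values x_1, x_2, both close to 1
```

A realistic CT system is far too large for such a dense solver; the sketch is only meant to show how the data $s_i$ and the matrix $(a_{ij})$ enter the least square problem.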
4.2.2 Non-destructive Inspection of Concrete Structures

For the maintenance of RC (reinforced concrete) structures, it is very important to know the accurate position of the rebar (reinforcing steel bar) embedded in the structure. Although there are many studies on detecting the rebar in RC structures non-destructively (see, for example, He et al. (2009), Tanaka and Wakabayashi (2006)), there exists no non-destructive inspection technique to accurately probe the rebar. We only obtain a rough sketch of the position of the rebar by the known techniques. There are a number of advantages in probing the accurate position of the rebar. Among them, the most important may be that knowledge of the accurate position of the rebar enables us to give a non-destructive inspection of the concrete cover in RC structures by ultrasonic measurements, for which see Mita and Takiguchi (2018), Takiguchi (2019). In the maintenance of RC structures, it is one of the most important tasks to keep the concrete cover in a sound state in order to prevent the rebar from getting corroded.

We discuss how to give a non-destructive technique to accurately probe the endpoints of the rebar, which is one of the most difficult problems in rebar detection. Assume that a cuboid RC structure contains, in its interior, a rebar which is located parallel or perpendicular to each edge surface. See Fig. 4 for its image. In Fig. 4, we set the axes as shown. We reconstruct the endpoints of the rebar. For the reconstruction of the endpoint b in Fig. 4, we take suitable measurement points on the edge surface B and around it (for example, on the surface C).

Fig. 4 A cuboid test piece

We measure the thickness of the concrete cover, equivalently the distance between the measurement points and the endpoint. We let the endpoint be $b = (x, y, z)$ and take $n$ measurement points $(x_1, y_1, z_1), \ldots, (x_n, y_n, z_n)$, whose distances from the endpoint b are measured as $r_1, \ldots, r_n$, respectively. By the measurements, we obtain an overdetermined system (1) of quadratic equations; however, it is well known that it is impossible to measure the thickness of the concrete cover exactly with the existing devices. Hence all measurements necessarily contain errors and noises, and we have to construct an approximate solution of the system (1) with errors and noises. To this end, we derive the overdetermined system (2) of linear equations and apply the idea of the least square solution to the accurate probe of the rebar in RC structures, where we note that the rank of the coefficient matrix is equal to the number of unknowns by taking suitable and sufficient measurements (in practice, we take 6–8 measurements).
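The whole procedure (measure distances, form (1), derive (2), take the least square solution) can be sketched with perturbed data. Everything numerical in the following pure-Python illustration is hypothetical: the six measurement points, the endpoint `b_true`, the unit (mm) and the alternating errors of ±1 added to the exact distances are chosen only for illustration and are not the experimental data of the authors.

```python
from itertools import combinations
from math import dist


def linearize(points, radii):
    """System (2): pairwise differences of the quadratic equations (1)."""
    A, s = [], []
    for (p, rp), (q, rq) in combinations(zip(points, radii), 2):
        A.append([pc - qc for pc, qc in zip(p, q)])
        s.append(0.5 * (sum(c * c for c in p) - sum(c * c for c in q)
                        + rq ** 2 - rp ** 2))
    return A, s


def least_squares(A, s):
    """Solve the normal equations tA A x = tA s, assuming rank(A) = n."""
    n = len(A[0])
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * si for row, si in zip(A, s))]
         for i in range(n)]
    for c in range(n):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical set-up in mm: six measurement points and the true endpoint.
points = [(100, 0, 0), (-100, 0, 0), (0, 100, 0),
          (0, -100, 0), (0, 0, 100), (0, 0, -100)]
b_true = (10.0, 20.0, 30.0)

# Measured distances: exact distances plus hypothetical errors of +1 / -1 mm.
errors = [1, -1, 1, -1, 1, -1]
radii = [dist(p, b_true) + e for p, e in zip(points, errors)]

A, s = linearize(points, radii)   # 6 * 5 / 2 = 15 linear equations
b_est = least_squares(A, s)       # approximate endpoint
deviation = [e - t for e, t in zip(b_est, b_true)]
```

In this particular configuration the deviation of each reconstructed coordinate stays at about the size of the distance errors, although the linear system itself has no exact solution. How the deviation behaves for other configurations depends on the placement of the measurement points.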
4.2.3 GNSS Positioning Technique

“GNSS” is the abbreviation for “global navigation satellite system”. The mathematical model of the GNSS positioning technique is almost the same as that of the probe of the rebar. In order to determine the position of the object, we usually take 6–10 measurements by satellites, each of which gives the distance between the object and the satellite itself. Therefore, we have the same overdetermined system (1) of quadratic equations as in the rebar probe. It is interesting that the numbers of measurements (6–10) are almost the same. The measurements in GNSS are much more accurate than those in the rebar probe; however, they must contain some errors because of the following causes;

• The velocity of the signals from GNSS satellites changes in accordance with the condition of the air (pressure, humidity and so on).
• The inner clock of the satellite gains or loses a slight amount of time.

These two causes are the main ones for the measurements in GNSS to be inaccurate. At present, the more accurate GNSS devices receive additional information, such as the condition of the air, which makes the devices bigger and more expensive. The authors claim that, instead of receiving much information, the application of the idea of the least square solution to the system (2) is of great help. We also note that the rank of the coefficient matrix in (2) is equal to the number of unknowns by taking suitable and sufficient measurements. In practice, we usually take 6–8 measurements, almost the same as in the above example. We claim that our new idea makes the GNSS device much cheaper and smaller, while the positioning remains sufficiently accurate.
5 Conclusion In this section, we first summarize the conclusion in this paper. (i) We studied how to analyze an overdetermined system of linear equations with no solution (Theorem 1). (ii) We studied the advantages of the point (i) from educational viewpoint. The contents in the third section can be; good and suitable teaching materials in linear algebra (Sect. 4.1.1) and good to teach integration of linear algebra with other academic fields containing important ideas for application (Sect. 4.1.2). (iii) We introduced practical applications (Sect. 4.2). The authors claim that it helps students keep their motivation high for learning, by introducing successful practical applications and their future development (Sects. 4.2.1–4.2.3). Let us summarize the educational materials proposed in this paper. We enumerate them in order of priority for teaching. (a) Introduce the problem (5) or some similar problem as an introductory problem to analyze an overdetermined system of linear equations with no solution. (b) Prove Theorem 1, where we review some important points of linear algebra mentioned in the previous section as well we the elements of variational method. (c) Give the least square solution to (5) by application of Theorem 1. Multiplying the transpose of the coefficient matrix to the both hand sides of (5) from the left yields 4x + 2y = 6 (32) 2x + 4y = 6 Therefore, we have x = y = 1 as the least square solution. Give the least square solution to (6) by application of Theorem 1. Multiplying the transpose of the coefficient matrix to the both hand sides of (6) from the left yields ⎛
⎞⎛ ⎞ ⎛ ⎞ 64 0 0 x 0 ⎝ 0 64 0 ⎠ ⎝ y ⎠ = ⎝ 0 ⎠ 0 0 64 z 0
(33)
Therefore, we have x = y = z = 0 as the least square solution. (e) As an example of practical applications, introduce one of the examples mentioned in the Sect. 4.2 in accordance with the learning level of the class. The authors
An overdetermined system of linear equations
125
think that the example of CT is very interesting but it may be easier to introduce the other examples. (f) If the learning level of the class is better, introduce all examples mentioned in the Sect. 4.2. (g) If the learning level of the class is excellent, prove Proposition 1. For this proof, deep understanding of linear algebra is required, which are mentioned in the third point in Sect. 4.1.1. The authors think that the set of materials (a), (b) and (c) is the minimal package, however, if necessary, we can omit to prove Theorem 1. Introducing the statement of Theorem 1 without proof and giving the least square solution to (5) would do. We can add the materials (d), (e) and (f) in accordance with the learning level of the class. Therefore, we can arrange the package of the teaching materials in accordance with the learning level of the class. In this sense, the teaching materials recommended in this paper are good and suitable for the course of elementary linear algebra. For closing this paper, we give the final remark. In some textbooks of advanced course of linear algebra, the idea of the pseudo-inverse matrix or the Moore-Penrose inverse matrix (Moore 1920) is introduced. The Moore-Penrose inverse matrix is very useful to give a least square solution to an overdetermined system of linear equations (cf. Hadrien (2021)) and has a lot of applications. However, students have to learn new ideas, for example, the singular value decomposition of the matrices and so on, for the definition of the Moore-Penrose inverse matrix. The authors claim that our idea is much easier and simpler and students are not required to learn any new additional materials in order to give a least square solution to an overdetermined system of linear equations. Therefore, the teaching materials proposed in this paper are good and suitable for elementary course of linear algebra. 
They are also good for engineers who would like to apply the idea of the least square solutions to practical problems without learning difficult mathematics.
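As a rough illustration of the closing remark, the following sketch computes a least square solution of a small overdetermined system in two ways: through the normal equations, which need only matrix products and the solution of a square system, and through the Moore-Penrose inverse. The 3 × 2 system is a made-up example, and the use of the normal equations is an assumption about the kind of elementary route meant here, not a reproduction of Theorem 1.

```python
import numpy as np

# Hypothetical overdetermined system A x = b (3 equations, 2 unknowns).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 4.0])

# Elementary route (assumed): solve the normal equations (A^T A) x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Advanced route: Moore-Penrose inverse (NumPy computes it via the SVD).
x_pinv = np.linalg.pinv(A) @ b

print(x_normal, x_pinv)
```

Both routes give the same vector here because the columns of A are linearly independent; for rank-deficient systems the normal equations become singular and only the pseudo-inverse route applies directly.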
References
Hadrien J (2021) Essential math for data science. O’Reilly Media Inc., USA, Sebastopol
He XQ, Zhu ZQ, Liu QY, Lu GY (2009) Review of GPR rebar detection, PIERS proceedings, Beijing, China, March 23–27, 2009, pp 804–813
Moore EH (1920) On the reciprocal of the general algebraic matrix. Bull Am Math Soc 26:394–395. https://doi.org/10.1090/S0002-9904-1920-03322-7
Mita N, Takiguchi T (2018) Principle of ultrasonic tomography for concrete structures and nondestructive inspection of concrete cover for reinforcement. Pac J Math Ind 10:6. https://doi.org/10.1186/s40736-018-0040-0
Natterer F (2001) The mathematics of computerized tomography, 2nd ed, SIAM (classics in applied mathematics, vol 32), Philadelphia, PA
Nakamura I (2007) Linear algebra. Sugaku Shobou, Tokyo (in Japanese)
Takiguchi T (2015) How the computerized tomography was practicalized. Bull Jpn Soc Symb Algebr Comput 21:50–57
Takiguchi T (2019) Ultrasonic tomographic technique and its applications. Appl Sci 9:1005. https://doi.org/10.3390/app9051005
Takiguchi T (2020) A theoretical study of the algorithm to practicalize CT by G. N. Hounsfield and its applications. Jpn J Ind Appl Math 37:115–130. https://doi.org/10.1007/s13160-019-00391-1
Tanaka S, Wakabayashi M (2006) On measurement of the depth and the diameter of steel bars in reinforced concrete using electromagnetic wave (Radar). In: 2006 SICE-ICASE international joint conference, pp 2555–2559
Visual 3D Reconstruction of a Rotating Object in Space Environment with a Least-Squares Framework Makoto Maruya and Takashi Takiguchi
Abstract This study focuses on a least-squares framework suitable for spacecraft onboard computing in the visual 3D reconstruction of a rotating target, such as asteroids or space debris. During a proximity mission, recognizing the motion and shape of a target is crucial to touchdown on it or seize it before starting the actual operation. The 3D reconstruction of objects from 2D images, or bundle adjustment, has been intensively studied in the field of computer vision. Most algorithms involve nonlinear optimization and are computationally expensive; therefore, they require high-performance computers on the ground to obtain a solution. However, a strong demand exists for lightweight algorithms that can be operated on onboard computers to reduce spacecraft–ground communication delays. We analyzed the 3D reconstruction problem and formulated it as an overdetermined system of linear equations. We solved these by applying least-square solutions. Linear approximation is justified under the unique circumstances of the space environment, that is, no friction exists with air or other objects, and hence the motion is considerably simple in the absence of an external force. The least-squares framework requires only matrix calculations, and its computational complexity is considerably less than that of conventional algorithms. The simulation results demonstrate the accuracy of the 3D reconstruction acquired by our method. Keywords Visual 3D reconstruction · Asteroid · Debris · Least-squares solution
M. Maruya (B) Geoinsight LLC, Tokyo, Japan e-mail: [email protected] T. Takiguchi National Defense Academy of Japan, Yokosuka, Japan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_9
M. Maruya and T. Takiguchi
1 Introduction
1.1 Visual 3D Reconstruction
Certain space activities include proximity missions. For example, asteroid explorers approach celestial bodies for detailed observations, and even land on them. Another example is space debris removal, in which sweeper satellites approach the target wreckage to catch it. Information on the object shape and relative position/motion is essential for landing on an asteroid or catching debris. Visual 3D reconstruction based on camera information is crucial for obtaining this information. Although objects, such as asteroids or debris, vary in shape and size, they have one thing in common: they rotate. Figure 1 shows a conceptual sketch of the visual 3D reconstruction of a rotating object in space. A typical spacecraft (SC, see Table 1 for abbreviations) carries various sensors, including cameras and LiDARs (see Table 1). Cameras are the primary sensors used for 3D reconstructions. With a camera, the spacecraft captures images of an object while flying nearby. From these images, the GCPs (see Table 1) are extracted by image processing. At the beginning of an observation, the 3D position of the GCPs and the relative position/motion of the SC from the object are unknown. These should be estimated during the 3D reconstruction process. In this paper, we define visual 3D reconstruction as the estimation of the 3D relative position/motion of an SC and the 3D position of GCPs from 2D camera images. GCPs should be extracted from images using certain image-processing techniques before the 3D reconstruction process begins. The rotation axis and rotation period of an object are generally unknown, but we considered them as known in this study for simplicity. We set an object coordinate system such that the rotation axis was along the Z-axis. Note that the object coordinate system does not rotate, but the object rotates with respect to the coordinate system.
Fig. 1 Conceptual sketch of the visual 3D reconstruction of a rotating object in space
Visual 3D Reconstruction of a Rotating Object …
Table 1 Nomenclature

| Symbol | Meaning |
|---|---|
| $O$ | 3 × 3 zero matrix |
| $O_V = [0\ 0\ 0]^T$ | 3 × 1 zero vector |
| $d(n, t)$ | Distance from the SC to the n-th GCP at time t |
| $d_{\mathrm{LiDAR}}(t)$ | LiDAR output at time t |
| $E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ | 3 × 3 identity matrix |
| $E_V = [1\ 0\ 0]^T$ | 3 × 1 unit vector |
| $\mathrm{Error}$ | Error term |
| GCP | Ground control point (distinctive place on an object like a corner of a rock) |
| $\mathrm{GCP}_n = [p_{n,x}\ p_{n,y}\ p_{n,z}]^T$ | Position of the n-th GCP (km) |
| LiDAR | Light detection and ranging |
| $\mathrm{LOS}(n, t)$ | Unit vector from the SC to the n-th GCP at time t |
| $n$ | GCP number |
| $P_0 = [p_{0,x}\ p_{0,y}\ p_{0,z}]^T$ | The starting position of a spacecraft (km) |
| $R_z(\omega t) = \begin{bmatrix} \cos\omega t & -\sin\omega t & 0 \\ \sin\omega t & \cos\omega t & 0 \\ 0 & 0 & 1 \end{bmatrix}$ | 3 × 3 rotation matrix (rotation about the Z-axis) |
| SC | Spacecraft |
| Superscript Est | Final provisional value when an estimation process has been completed |
| Superscript Prov | Provisional value or initial value for simulation. It will be gradually updated by estimation iteration |
| Superscript T | Transpose of a matrix |
| Superscript True | True value for simulation |
| Superscript - | Pseudoinverse |
| $t$ | Time from the beginning of the observation |
| $t_m$ | Time at m-th image ($t_1 = 0$) |
| TNG | Total number of GCPs |
| TNI | Total number of images |
| TNGO | Total number of GCP observations |
| $v = [v_x\ v_y\ v_z]^T$ | Relative motion vector of a spacecraft observed from the object center (km/min) |
| $\omega$ | Rotation speed of an object (radian/min) |
1.2 Conventional Methods and Their Problems
The 3D reconstruction of objects from 2D images, or bundle adjustment, has been intensively studied in the field of computer vision (Triggs et al. 2000). Because these algorithms involve nonlinear optimization with dozens to hundreds of unknown parameters, they are computationally expensive. At present, the processing capabilities of standard SC onboard computers are insufficient for these complex computations. Therefore, image data are transmitted from the SC to Earth and processed with high-performance computers on Earth to obtain a solution (Maruya 2020; Maruya et al. 2006). However, a strong demand exists for lightweight algorithms that can be operated on onboard computers to reduce spacecraft–ground communication delays. Depending on several factors, such as the distance between an SC and Earth, communication bandwidth, and visible state of the antennas,¹ the delays can range from a few minutes to hours. If communication delays are eliminated by onboard processing, SC operational time can be significantly reduced. Certain onboard-oriented algorithms have been proposed, but they involve nonlinear optimization (Pesce et al. 2018).
1.3 Our Idea of Simplifying the Problem Using Unique Circumstances of Outer Space
The conventional methods are complex because they attempt to determine solutions even under general conditions. However, the external forces owing to gravity and air resistance are often small in space environments. In such situations, objects move according to the law of inertia. Based on the aforementioned considerations, we propose a simplified method to obtain a solution at a low computational cost. Our approach assumes the following physical model:
– An SC moves at a constant speed along a straight line.
– An object moves at a constant speed along a straight line.
– An object rotates at a constant speed about a single axis (the rotation axis agrees with the Z-axis, see Sect. 1.1).
Our method is so simple that solutions can be obtained using a least-squares framework. Least-squares frameworks have been shown to yield solutions at very low computational costs (Takiguchi 2020). As this is the first trial of applying a least-squares framework for visual 3D reconstruction, we focus on the general process and basic properties, not on details such as software implementation. The notations used in this study are summarized in Tables 1 and 2.
Footnote 1: If an SC is below the horizon and is not visible from an antenna, we must wait for communication to be restored until the Earth’s rotation makes it visible.
2 Proposed Method
2.1 Observation Equation
Considering the situation depicted in Fig. 1, we can formulate the following observation equation:
\[ R_z(\omega t)\,\mathrm{GCP}_n^{\mathrm{True}} = d(n,t)^{\mathrm{True}}\,\mathrm{LOS}(n,t) + v^{\mathrm{True}}\,t + P_0^{\mathrm{True}} \tag{1} \]
This equation is described in the object coordinate system. The left side of the equation indicates the position of the n-th GCP at time t. The right side expresses the sum of the vector from the camera to the n-th GCP and the camera position at t. Variables with “True” superscript are unknown (see Table 2 for details). This equation holds for all GCP numbers and times. The advantage of the proposed method is that the relationship between the GCP and the SC position can be expressed using a simple formula. Nevertheless, directly solving this equation requires nonlinear optimization and is computationally expensive. Therefore, the following approximate observation equation, Eq. (2), which uses provisional values instead of the true values, is used to determine the unknown variables step by step.
\[ R_z(\omega t)\,\mathrm{GCP}_n^{\mathrm{Prov}} = d(n,t)^{\mathrm{Prov}}\,\mathrm{LOS}(n,t) + v^{\mathrm{Prov}}\,t + P_0^{\mathrm{Prov}} + \mathrm{Error} \tag{2} \]
We obtain one approximate observation equation from one GCP observation at each time. In the following, we assume that we have TNGO such equations.
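The observation equation can be checked numerically with a minimal sketch such as the one below. The function names (`rot_z`, `observe`, `residual`) and the single GCP position are hypothetical; the SC position, motion vector, and rotation speed are borrowed from Table 2. With true values the residual vanishes, as in Eq. (1); with provisional values it is the Error term of Eq. (2).

```python
import numpy as np

def rot_z(angle):
    """3 x 3 rotation matrix about the Z-axis (R_z in Table 1)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def observe(gcp, p0, v, omega, t):
    """Return the distance d(n, t) and unit vector LOS(n, t) from the SC to one GCP."""
    sc_to_gcp = rot_z(omega * t) @ gcp - (p0 + v * t)
    d = np.linalg.norm(sc_to_gcp)
    return d, sc_to_gcp / d

def residual(gcp, p0, v, omega, t, d, los):
    """Left side minus right side of the observation equation (the Error term)."""
    return rot_z(omega * t) @ gcp - (d * los + v * t + p0)

omega = np.deg2rad(7.2)                  # rad/min (7.2 deg/min in Table 2)
p0_true = np.array([100.0, 0.0, 0.5])    # km
v_true = np.array([0.0, 0.0, -0.02])     # km/min
gcp_true = np.array([0.2, -0.1, 0.05])   # km, hypothetical GCP

d, los = observe(gcp_true, p0_true, v_true, omega, t=10.0)
err_true = residual(gcp_true, p0_true, v_true, omega, 10.0, d, los)
# Provisional SC state: one of the four initial cases of Table 2, zero velocity.
err_prov = residual(gcp_true, np.array([90.0, 0.0, 0.0]), np.zeros(3), omega, 10.0, d, los)
```

In this toy case the provisional residual equals the error of the provisional SC position at t = 10 min, which is what the later parts of the estimation process try to reduce.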
2.2 Estimation Process
At the beginning of this process, the only known information is the distance from the SC to a certain point on the object obtained using LiDAR and a rough estimate of $P_0^{\mathrm{Prov}}$ that is manually calculated from camera images and navigation data by the SC control team on Earth. This information is used as the initial value for the estimation. The proposed estimation process consists of four main parts (Fig. 2). Part 1 estimates the GCP position and Part 2 calculates the SC initial position and motion vector. Parts 3 and 4 update the distance data to each GCP and update the GCP position, respectively. Parts 3 and 4 are repeated a predetermined number of times to accelerate the GCP adjustment. We empirically set this parameter to five in the simulation described in Sect. 3. Part 2 through Part 4 are repeated until the iteration count exceeds the preset maximum (max). Details of each part are described in the following subsections.

Table 2 Simulation conditions

| Category | Variable | Known/Unknown | Value (initial) | Unit |
|---|---|---|---|---|
| SC | $P_0^{\mathrm{True}}$ | Unknown | $p_{0,x} = 100$, $p_{0,y} = 0$, $p_{0,z} = 0.5$ | km |
| SC | $P_0^{\mathrm{Prov}}$ | Known (initial value) | Four cases: $p_{0,x} = 80, 90, 110, 120$; $p_{0,y} = p_{0,z} = 0$ | km |
| SC | $v^{\mathrm{True}}$ | Unknown | $[0\ 0\ {-0.02}]^T$ | km/min |
| SC | $v^{\mathrm{Prov}}$ | Known (provisional value) | $[0\ 0\ 0]^T$ | km/min |
| SC | Attitude (orientation) of a spacecraft | Known | Aligned to the object coordinates | Non-dimensional |
| SC | Camera field of view | Known | 1.9 | degree |
| SC | Camera resolution | Known | 500 × 500 | pixel |
| Object | $\mathrm{GCP}_n^{\mathrm{True}}$ | Unknown | | km |
| Object | $\mathrm{GCP}_n^{\mathrm{Prov}}$ | Known | Calculated from $d(n,t)^{\mathrm{Prov}}$ and $\mathrm{LOS}(n,t)$ | km |
| Object | GCP mean position | Known | $[0\ 0\ 0]^T$ | km |
| Object | GCP #1 position | Known | $[1\ 0\ 0]^T$ | km |
| Object | $d(n,t)^{\mathrm{True}}$ | Unknown | | km |
| Object | $d(n,t)^{\mathrm{Prov}}$ | Known | Calculated from the provisional SC position and LiDAR | km |
| Object | Total number of GCPs (TNG) | Known | 81 | – |
| Object | Rotation axis | Known | Z-axis | – |
| Object | Rotation speed $\omega$ | Known | 7.2 (7.2° × 50 steps = 360° for total) | °/min |
| Observed data | $\mathrm{LOS}(n,t)$ | Known | Calculated from the non-dimensional GCP position in images | |
| Observed data | Total number of GCP observations (TNGO) | Known | 1067 | – |
| Observed data | Total number of images (TNI) | Known | 51 | – |
| Observed data | $d_{\mathrm{LiDAR}}(t)$ | Known | LiDAR output | km |
Fig. 2 Estimation process
2.3 GCP Position Estimation (Part 1)
In Part 1, we estimate the GCP position. By aggregating the observation Eq. (2) of visible GCPs (refer to Sect. 3.1 for the visibility of GCPs), we obtain the following linear equations:
\[ M \begin{bmatrix} \mathrm{GCP}_1^{\mathrm{Prov}} \\ \mathrm{GCP}_2^{\mathrm{Prov}} \\ \vdots \\ \mathrm{GCP}_{\mathrm{TNG}}^{\mathrm{Prov}} \end{bmatrix} = \begin{bmatrix} b(t_1, 1) & \cdots & b(t_1, \mathrm{TNG}) & b(t_2, 1) & \cdots & b(t_{\mathrm{TNI}}, \mathrm{TNG}) & O_V & E_V \end{bmatrix}^T \tag{3} \]
(omit $b(t_m, n)$ from the right-hand side of this equation if the n-th GCP is not visible in the m-th image), where
\[ M = \begin{bmatrix}
R_z(\omega t_1) & O & O & O & \cdots & O \\
O & R_z(\omega t_1) & O & O & \cdots & O \\
O & R_z(\omega t_2) & O & O & \cdots & O \\
O & O & O & R_z(\omega t_2) & \cdots & O \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
O & O & \cdots & \cdots & \cdots & R_z(\omega t_{\mathrm{TNI}}) \\
E & E & E & E & \cdots & E \\
E & O & O & O & \cdots & O
\end{bmatrix} \tag{4} \]
(this is an example for the case where GCP1 and GCP2 are visible at $t_1$, and GCP2 and GCP4 are visible at $t_2$), the size of $M$ is $(3\,\mathrm{TNGO} + 6)$ rows by $3\,\mathrm{TNG}$ columns,
\[ b(t, n) = d(n,t)^{\mathrm{Prov}}\,\mathrm{LOS}(n,t) + v^{\mathrm{Prov}}\,t + P_0^{\mathrm{Prov}}, \tag{5} \]
\[ d(n,t)^{\mathrm{Prov}} = d_{\mathrm{LiDAR}}(t). \tag{6} \]
The velocity vector of the SC, $v^{\mathrm{Prov}}$, is generally small and is assumed to be zero at the beginning of the estimation. In Eq. (4), the second line from the bottom implies
\[ \sum_{n=1}^{\mathrm{TNG}} \mathrm{GCP}_n^{\mathrm{Prov}} = [0\ 0\ 0]^T, \tag{7} \]
and the bottom line implies
\[ \mathrm{GCP}_1^{\mathrm{Prov}} = [1\ 0\ 0]^T. \tag{8} \]
Equations (7) and (8) are introduced to prevent the solution from becoming unstable owing to the arbitrariness of the object coordinate definition (refer to Sect. 3.1). In actual situations, cases may exist where these conditions do not hold. However, in certain cases, the proposed method operates with appropriate modifications. For example, if the position of GCP#1 is $[1, 0, g_z]^T$ ($g_z \neq 0$), $E_V$ of Eq. (8) should be modified to $[1, 0, g_z]^T$. The position of the GCPs is approximately obtained by the following equation:
\[ \begin{bmatrix} \mathrm{GCP}_1^{\mathrm{Prov}} \\ \mathrm{GCP}_2^{\mathrm{Prov}} \\ \vdots \\ \mathrm{GCP}_{\mathrm{TNG}}^{\mathrm{Prov}} \end{bmatrix} = M^{-} \begin{bmatrix} b(t_1, 1) & \cdots & b(t_1, \mathrm{TNG}) & b(t_2, 1) & \cdots & b(t_{\mathrm{TNI}}, \mathrm{TNG}) & O_V & E_V \end{bmatrix}^T \tag{9} \]
(omit $b(t_m, n)$ from the right-hand side of this equation if the n-th GCP is not visible in the m-th image), where $M^{-}$ is the pseudoinverse of $M$. In the simulation, we used the Python NumPy library to compute the pseudoinverse (https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html).
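Part 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper `build_gcp_system`, the zero-based GCP indexing, and the four test GCPs are hypothetical, and the right-hand side is built from exact distances and SC states so that the recovered positions can be compared with the truth. The pseudoinverse is taken with `np.linalg.pinv`, which is SVD-based.

```python
import numpy as np

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def build_gcp_system(observations, n_gcp, omega, p0, v):
    """Assemble M of Eq. (4) and the right-hand side of Eq. (3).

    observations: list of (t, n, d, los), one entry per visible GCP per
    image, with a zero-based GCP index n (an assumption of this sketch).
    """
    rows, rhs = [], []
    for t, n, d, los in observations:
        block = np.zeros((3, 3 * n_gcp))
        block[:, 3 * n:3 * n + 3] = rot_z(omega * t)   # R_z(omega t) in block column n
        rows.append(block)
        rhs.append(d * los + v * t + p0)               # b(t, n), Eq. (5)
    rows.append(np.tile(np.eye(3), (1, n_gcp)))        # Eq. (7): sum of GCPs = O_V
    rhs.append(np.zeros(3))
    first = np.zeros((3, 3 * n_gcp))
    first[:, :3] = np.eye(3)                           # Eq. (8): first GCP = E_V
    rows.append(first)
    rhs.append(np.array([1.0, 0.0, 0.0]))
    return np.vstack(rows), np.concatenate(rhs)

# Hypothetical data: four GCPs with zero mean and the first GCP at [1, 0, 0].
gcp_true = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.2],
                     [-0.5, -0.5, 0.3],
                     [-0.5, -0.5, -0.5]])
omega = np.deg2rad(7.2)
p0 = np.array([100.0, 0.0, 0.5])
v = np.array([0.0, 0.0, -0.02])

observations = []
for t, visible in [(0.0, [0, 1]), (1.0, [1, 3]), (2.0, [0, 2, 3])]:
    for n in visible:
        sc_to_gcp = rot_z(omega * t) @ gcp_true[n] - (p0 + v * t)
        d = np.linalg.norm(sc_to_gcp)
        observations.append((t, n, d, sc_to_gcp / d))

M, b = build_gcp_system(observations, len(gcp_true), omega, p0, v)
gcp_est = (np.linalg.pinv(M) @ b).reshape(-1, 3)       # Eq. (9)
```

With seven observations, M has 3 × 7 + 6 = 27 rows and 12 columns. In the actual process the distances and SC state are provisional, so the solution is only approximate and is refined by the later parts.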
2.4 SC Initial Position and Motion Vector Estimation (Part 2)
Part 2 updates the initial position and velocity of the SC using the estimated GCP position. Transforming Eq. (2), we obtain the following equation for the initial position and velocity vector of the SC:
\[ P_0^{\mathrm{Prov}} + v^{\mathrm{Prov}}\,t = -d(n,t)\,\mathrm{LOS}(n,t) + R_z(\omega t)\,\mathrm{GCP}_n + \mathrm{Error}. \tag{10} \]
We formulate a simultaneous equation with $P_0^{\mathrm{Prov}}$ and $v^{\mathrm{Prov}}$ as unknowns,
\[ L \begin{bmatrix} P_0^{\mathrm{Prov}} \\ v^{\mathrm{Prov}} \end{bmatrix} = \begin{bmatrix} c(t_1, 1) & \cdots & c(t_1, \mathrm{TNG}) & c(t_2, 1) & \cdots & c(t_{\mathrm{TNI}}, \mathrm{TNG}) \end{bmatrix}^T \tag{11} \]
(omit $c(t_m, n)$ from the right-hand side of this equation if the n-th GCP is not visible in the m-th image), where
\[ L = \begin{bmatrix} E & t_1 E \\ \vdots & \vdots \\ E & t_{\mathrm{TNI}} E \end{bmatrix}, \quad \text{the size of } L \text{ is } 3\,\mathrm{TNGO} \text{ rows by six columns}, \tag{12} \]
\[ c(t, n) = -d(n,t)^{\mathrm{Prov}}\,\mathrm{LOS}(n,t) + R_z(\omega t)\,\mathrm{GCP}_n. \tag{13} \]
$P_0^{\mathrm{Prov}}$ and $v^{\mathrm{Prov}}$ are approximately obtained by the following equation:
\[ \begin{bmatrix} P_0^{\mathrm{Prov}} \\ v^{\mathrm{Prov}} \end{bmatrix} = L^{-} \begin{bmatrix} c(t_1, 1) & \cdots & c(t_1, \mathrm{TNG}) & c(t_2, 1) & \cdots & c(t_{\mathrm{TNI}}, \mathrm{TNG}) \end{bmatrix}^T \tag{14} \]
(omit $c(t_m, n)$ from the right-hand side of this equation if the n-th GCP is not visible in the m-th image), where $L^{-}$ is the pseudoinverse of $L$.
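Part 2 admits a similarly short sketch. The function `estimate_sc_motion` and the test data are hypothetical; `np.linalg.lstsq` is used here in place of an explicit pseudoinverse, which gives the same least-squares solution when L has full column rank (at least two distinct observation times).

```python
import numpy as np

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def estimate_sc_motion(observations, gcp, omega):
    """Solve Eq. (11) for P_0 and v in the least-squares sense.

    observations: list of (t, n, d, los) with a zero-based GCP index n.
    gcp: (TNG, 3) array holding the current GCP position estimates.
    """
    E = np.eye(3)
    L = np.vstack([np.hstack([E, t * E]) for t, _, _, _ in observations])   # Eq. (12)
    c = np.concatenate([-d * los + rot_z(omega * t) @ gcp[n]
                        for t, n, d, los in observations])                  # Eq. (13)
    solution, *_ = np.linalg.lstsq(L, c, rcond=None)
    return solution[:3], solution[3:]

# Hypothetical check with exact GCPs and distances.
gcp = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.2], [-1.0, -1.0, -0.2]])
omega = np.deg2rad(7.2)
p0_true = np.array([100.0, 0.0, 0.5])
v_true = np.array([0.0, 0.0, -0.02])

observations = []
for t in (0.0, 1.0, 2.0):
    for n in range(len(gcp)):
        sc_to_gcp = rot_z(omega * t) @ gcp[n] - (p0_true + v_true * t)
        d = np.linalg.norm(sc_to_gcp)
        observations.append((t, n, d, sc_to_gcp / d))

p0_est, v_est = estimate_sc_motion(observations, gcp, omega)
```

Because the unknown vector has only six components, this step is cheap regardless of how many GCP observations are stacked into L.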
2.5 Update Distance Data to Each GCP (Part 3)
Part 3 updates the distance from the SC to the GCP using the latest initial position and velocity of the SC. By transforming Eq. (2), we obtain the following equation:
\[ d(n,t)^{\mathrm{Prov}} = \left\| R_z(\omega t)\,\mathrm{GCP}_n^{\mathrm{Prov}} - P_0^{\mathrm{Prov}} - v^{\mathrm{Prov}}\,t \right\|. \tag{15} \]
We update $d(n,t)^{\mathrm{Prov}}$ according to the aforementioned formula.
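The distance update is a one-line computation. The sketch below reads Eq. (15) as the Euclidean length of the vector from the provisional SC position to the rotated GCP; this reading is an assumption, since the left side is a scalar and the right side a vector difference. The function name and sample values are hypothetical.

```python
import numpy as np

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def update_distance(gcp_n, p0, v, omega, t):
    """Distance from the SC to one GCP implied by the current estimates (Eq. (15))."""
    return np.linalg.norm(rot_z(omega * t) @ gcp_n - p0 - v * t)

# Example: GCP #1 of Table 2 seen from the true SC start position at t = 0.
d_new = update_distance(np.array([1.0, 0.0, 0.0]),
                        np.array([100.0, 0.0, 0.5]),
                        np.array([0.0, 0.0, -0.02]),
                        np.deg2rad(7.2), 0.0)
```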
3 Computational Experiments and Discussion
3.1 Simulation Setup
We generated 81 GCPs using the asteroid Itokawa 3D model published by NASA (https://solarsystem.nasa.gov/resources/2377/asteroid-itokawa-3d-model/). We then defined the object coordinates X-Y-Z satisfying the following conditions:
1. The origin of the coordinates equals the GCP mean position.
2. The position of GCP #1 is $[1\ 0\ 0]^T$.
3. The Z-axis is the rotation axis.
The first and second conditions were introduced to determine the orientation and scale of the X-axis, respectively. In the simulation, this coordinate system can be defined according to conditions 1, 2, and 3 because the true values of the GCPs are known in advance. However, in a real situation, such a coordinate system cannot be defined before the positions of GCPs are estimated. In actual cases, a provisional object coordinate system must be used during the estimation phase. The coordinate system is then modified when the GCP values are obtained.
The GCPs rotate at a constant speed $\omega$ and a camera mounted on an SC shoots them once every minute (Fig. 3). The initial position of the SC was $[100\ 0\ 0.5]^T$, and the SC constantly moved with a motion vector $[0\ 0\ {-0.02}]^T$. We set four cases for the initial values of $p_{0,x}$ to evaluate the effect of errors in the initial values. See Table 2 for simulation details. The simulation was performed in 51 time steps from 0 to 50 min. Examples of simulated camera images (500 × 500 pixels) are shown in the upper right corner of Fig. 3. In these images, dots represent the GCPs. In total, 81 GCPs exist; however, only some GCPs on the camera side appear in each image, and the other GCPs are occluded by the object body. The visibility of each GCP depends on the viewing direction. Because of the irregular shape of this object, the number of visible GCPs varied greatly depending on the viewing direction (Fig. 4). In this simulation, GCPs were observed a total of 1067 times in 51 images; therefore, 1067 approximate observation equations were obtained. A total of 81 GCPs and six GCP constraints (conditions 1 and 2 of Sect. 3.1) existed; therefore, the degree of freedom of the GCP position was 81 × 3 − 6 = 237. Because the number of equations is larger than the number of unknown variables, Eq. (3) is an overdetermined linear system of equations. Equation (11) is also an overdetermined linear system of equations because $P_0^{\mathrm{Prov}}$ and $v^{\mathrm{Prov}}$ have only six degrees of freedom in total.
Fig. 3 Observation simulation
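For orientation, the sizes of the two linear systems in this simulation follow directly from the counts above and the stated dimensions of $M$ and $L$; each observation equation is a 3-vector equation, so it contributes three scalar rows:

\[
\begin{aligned}
\text{rows of } M &= 3\,\mathrm{TNGO} + 6 = 3 \times 1067 + 6 = 3207, &
\text{columns of } M &= 3\,\mathrm{TNG} = 3 \times 81 = 243,\\
\text{rows of } L &= 3\,\mathrm{TNGO} = 3 \times 1067 = 3201, &
\text{columns of } L &= 6.
\end{aligned}
\]

In both cases the number of scalar equations exceeds the number of unknowns by more than an order of magnitude.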
Fig. 4 Number of visible GCPs in each image
3.2 Results 1. GCP Position Estimation
Figure 5 shows the estimated and true GCP positions. They agreed to some extent; however, some displacements could also be observed. To analyze this error, we defined the GCP position estimation error as follows:
\[ \mathrm{GCP\_Error}_n = \left\| \mathrm{GCP}_n^{\mathrm{Est}} - \mathrm{GCP}_n^{\mathrm{True}} \right\|, \tag{16} \]
where $\mathrm{GCP}_n^{\mathrm{Est}}$ is the final value of $\mathrm{GCP}_n^{\mathrm{Prov}}$ when the estimation process is complete. We then defined the GCP position mean error as follows:
\[ \text{GCP position mean error} = \frac{\sum_{n=1}^{\mathrm{TNG}} \mathrm{GCP\_Error}_n}{N}, \tag{17} \]
where $N$ is taken to be the total number of GCPs (TNG).
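The two error measures can be computed as in the following sketch, which assumes that Eq. (16) denotes the Euclidean norm of the position difference and that N equals the number of GCPs. The function name and the two sample GCPs are hypothetical.

```python
import numpy as np

def gcp_mean_error(gcp_est, gcp_true):
    """Mean of the per-GCP position errors; inputs are (TNG, 3) arrays."""
    errors = np.linalg.norm(gcp_est - gcp_true, axis=1)   # Eq. (16)
    return errors.mean()                                   # Eq. (17)

gcp_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
gcp_est = np.array([[1.3, 0.4, 0.0], [0.0, 1.0, 0.1]])
mean_error = gcp_mean_error(gcp_est, gcp_true)   # individual errors are 0.5 and 0.1
```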
Figure 6 shows the manner in which the GCP position mean error decreases as the number of iterations increases. The initial value of the error was approximately 0.5, and it dropped to 0.08 or 0.16 at iteration = 50. The rate of decrease slowed as the iterations progressed, and almost no change was observed after 40 iterations. This also showed that the smaller the error in the initial value, the smaller was the final GCP estimation error. Specifically, the magnitude of the residuals of the final estimates depended on the absolute value of the error in the initial values and not on the direction (positive or negative) of the error. Hence, although four initial values existed, only two final estimates were obtained. To analyze the GCP estimation error, we defined the radius error and ratio of radius error as follows (Fig. 7):
\[ \text{Radius error}(n) = \left| r_n^{\mathrm{Est}} - r_n^{\mathrm{True}} \right|, \tag{18} \]
\[ \text{Ratio of radius error}(n) = \frac{\left| r_n^{\mathrm{Est}} - r_n^{\mathrm{True}} \right|}{\left\| \mathrm{GCP}_n^{\mathrm{Est}} - \mathrm{GCP}_n^{\mathrm{True}} \right\|}, \tag{19} \]
where
\[ r_n^{\mathrm{Est}} = \sqrt{\left(\mathrm{GCP}_{n,x}^{\mathrm{Est}}\right)^2 + \left(\mathrm{GCP}_{n,y}^{\mathrm{Est}}\right)^2}, \tag{20} \]
\[ r_n^{\mathrm{True}} = \sqrt{\left(\mathrm{GCP}_{n,x}^{\mathrm{True}}\right)^2 + \left(\mathrm{GCP}_{n,y}^{\mathrm{True}}\right)^2}. \tag{21} \]
Fig. 5 Estimated and true GCP positions ($p_{0,x} = 90$)
Fig. 6 Decrease in the GCP position mean error by iteration
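A minimal sketch of Eqs. (18)–(21), assuming absolute values in the numerator and the Euclidean norm in the denominator so that the ratio lies between 0 and 1. The function name and the two sample GCPs are hypothetical; the ratio is undefined for a GCP whose estimate coincides exactly with the truth.

```python
import numpy as np

def radius_error_ratio(gcp_est, gcp_true):
    """Ratio of the radial (XY-plane) error to the total position error per GCP."""
    r_est = np.hypot(gcp_est[:, 0], gcp_est[:, 1])      # Eq. (20)
    r_true = np.hypot(gcp_true[:, 0], gcp_true[:, 1])   # Eq. (21)
    radius_error = np.abs(r_est - r_true)               # Eq. (18)
    total_error = np.linalg.norm(gcp_est - gcp_true, axis=1)
    return radius_error / total_error                   # Eq. (19)

gcp_true = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
gcp_est = np.array([[1.2, 0.0, 0.0],    # error purely in the radial direction
                    [0.0, 2.0, 0.3]])   # error purely along the Z-axis
ratios = radius_error_ratio(gcp_est, gcp_true)
```

A ratio near 1 means the error is almost entirely radial, and a ratio near 0 means it lies along the rotation axis or in the tangential direction.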
Fig. 7 Radius component (rnEst , rnT r ue ) of the GCP position
The ratio of the radius error is the proportion of the radial component of the difference between the true and estimated GCP positions to the absolute GCP estimation error. Figure 8 shows a histogram of the ratio of the radius error. Forty-seven GCPs were in the range of 0.95–1.00; this was more than half the total (81 GCPs), and most were above 0.75. This implied that most GCP estimation errors were in the radial direction. The reason for this is discussed in Sect. 3.5. Next, we examined the relationship between the number of GCP observations and the GCP estimation error (Fig. 9). For example, GCP#0 was observed 17 times, whereas GCP#55 and #56 were observed only three times. No apparent correlation existed between the number of GCP observations and the GCP estimation error.
Fig. 8 Histogram: ratio of the radius error ( p0,x = 90)
Fig. 9 Relation between the total number of GCP observations and GCP position error ( p0,x = 90)
3.3 Results 2. SC Initial Position Estimation
Figure 10 shows the manner in which the estimated value of the SC initial position, $P_0^{\mathrm{Prov}}$, changes as the number of iterations increases. The Z-component converged to near the true value after 50 iterations, whereas the Y-component started from the true value (zero), shifted slightly, and finally returned to a value close to the true value. For the X-component, no improvement was observed over the iterations. The reason for this is discussed in Sect. 3.5. For all components, the smaller the error in the initial value, the smaller was the error in the final value.
3.4 Results 3. SC Motion Vector Estimation
Figure 11 shows the manner in which the estimated value of the SC motion vector, $v^{\mathrm{Prov}}$, changes as the iteration proceeds. The Z-component converged near the true value after 50 iterations, whereas the Y-component started from the true value (zero), shifted slightly, and finally returned to a value close to the true value. For the X-component, starting from the true value, the error increased slightly after 50 iterations. The reason for this is discussed in Sect. 3.5. For the Y- and Z-components, the smaller the initial value error, the smaller was the final error.
Fig. 10 Results of SC initial position estimation
Fig. 11 Results of SC motion vector estimation
3.5 Discussion
The results discussed in Sect. 3.2 through 3.4 indicate that the estimated variables (GCP position and initial position of SC) contain errors mainly along the X-axis (the optical axis of the camera). This is probably owing to the characteristics of the narrow field of view (FOV) (1.9°) camera. Figure 12 shows the anisotropy of the 3D observation with a narrow field of view camera. The circles, triangles, and squares on the left represent GCP#1, #2, and #3, respectively, and the distance between GCP#1 and #2 is the same as that between GCP#1 and #3. However, in the camera image, the distance between GCP#1 and #2 is considerably shorter than the distance between GCP#1 and #3. Because the image distance is pixel quantized in the imaging process, the distance between GCP#1 and #2 is measured coarsely compared to the distance between GCP#1 and #3. Therefore, the depth estimation accuracy tends to deteriorate more than that of horizontal position estimation.
Fig. 12 Anisotropy of 3D estimation with a narrow FOV camera
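The anisotropy can be illustrated with a back-of-the-envelope calculation. The sketch below uses the camera parameters and SC distance of Table 2; the lateral offset of the GCP and the size of its displacement are hypothetical values chosen only for illustration, and the pixel size uses a small-angle approximation.

```python
import math

fov_deg = 1.9        # camera field of view (Table 2)
n_pixels = 500       # pixels across the field of view (Table 2)
range_km = 100.0     # SC-to-object distance along the optical axis (Table 2)
offset_km = 1.0      # hypothetical lateral offset of a GCP from the optical axis
shift_km = 0.1       # hypothetical displacement of that GCP

pixel_angle = math.radians(fov_deg) / n_pixels   # angular size of one pixel

def image_angle(lateral, depth):
    """Angle between the optical axis and the line of sight to a point."""
    return math.atan2(lateral, depth)

base = image_angle(offset_km, range_km)
# Image shift (in pixels) when the GCP moves sideways versus along the optical axis.
lateral_pixels = abs(image_angle(offset_km + shift_km, range_km) - base) / pixel_angle
depth_pixels = abs(image_angle(offset_km, range_km + shift_km) - base) / pixel_angle

print(f"lateral shift: {lateral_pixels:.2f} px, depth shift: {depth_pixels:.3f} px")
```

Under these assumptions the sideways displacement moves the image point by roughly 15 pixels, whereas the same displacement along the optical axis moves it by a small fraction of a pixel, which is consistent with the coarse depth measurement described above.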
4 Summary
We proposed a least-squares framework for the visual 3D reconstruction of a rotating target. The proposed method is lightweight; thus, it is advantageous for future onboard processing. As demonstrated in the simulation, the proposed method reduced the error to a fraction after 50 iterations. The observations of the experiment are as follows:
• The accuracy of initial values affected the final error.
• GCP estimation errors occurred primarily in the radial direction.
• Errors in the initial SC position and velocity vectors were mainly in the X-axis direction (the optical axis of the camera).
The second and third results could be attributed to the low depth sensitivity of the narrow FOV camera. The method proposed in this study is the first trial to apply a least-squares framework for visual 3D reconstruction, and several issues remain to be resolved. Further studies to be conducted in the future are as follows:
• Accuracy improvement considering the anisotropy of the 3D estimation. Additionally, the new algorithm should be less sensitive to the initial value.
• Estimation of the rotation axis and rotation period of an object.
• Evaluation of computational complexity in the onboard computing environment.
References
Asteroid Itokawa 3D Model | NASA Solar System Exploration. https://solarsystem.nasa.gov/resources/2377/asteroid-itokawa-3d-model/. Accessed 27 Jun 2022
Maruya M (2020) 3D reconstruction of asteroid Ryugu as an inverse problem. In: New technologies for non-destructive and non-invasive inspections and their applications (ISSN 2188-286X)
Maruya M et al (2006) Navigation shape and surface topography model of Itokawa. In: Collection of technical papers - AIAA/AAS astrodynamics specialist conference, 2006, vol 3, pp 1522–1540. https://doi.org/10.2514/6.2006-6659
numpy.linalg.svd - NumPy v1.23 Manual. https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html. Accessed 26 Jun 2022
Pesce V, Agha-Mohammadi AA, Lavagna M (2018) Autonomous navigation & mapping of small bodies. In: IEEE Aerospace conference proceedings, vol 2018-March, pp 1–10, Jun 2018. https://doi.org/10.1109/AERO.2018.8396797
Takiguchi T (2020) A theoretical study of the algorithm to practicalize CT by G. N. Hounsfield and its applications. Jpn J Ind Appl Math 37(1):115–130. https://doi.org/10.1007/s13160-019-00391-1
Triggs B, McLauchlan PF, Hartley RI, Fitzgibbon AW (2000) Bundle adjustment - A modern synthesis. Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 1883:298–372. https://doi.org/10.1007/3540-44480-7_21
Uniqueness of Inverse Source Problems for Time-Fractional Diffusion Equations with Singular Functions in Time Yikan Liu and Masahiro Yamamoto
Abstract We consider a fractional diffusion equation of order $\alpha \in (0, 1)$ whose source term is singular in time: $(\partial_t^{\alpha} + A)u(x, t) = \mu(t) f(x)$, $(x, t) \in \Omega \times (0, T)$, where $\mu$ belongs to a Sobolev space of negative order. In inverse source problems of determining $f|_{\Omega}$ by the data $u|_{\omega \times (0,T)}$ with a given subdomain $\omega \subset \Omega$ and $\mu|_{(0,T)}$ by the data $u|_{\{x_0\} \times (0,T)}$ with a given point $x_0 \in \Omega$, we prove the uniqueness by reducing to the case $\mu \in L^2(0, T)$. The key is a transformation of a solution to an initial-boundary value problem with a regular function in time. Keywords Time-fractional diffusion equation · Inverse source problem · Uniqueness
Y. Liu (B) Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, N12W7, Kita-Ward, Sapporo 060-0812, Japan e-mail: [email protected] M. Yamamoto Graduate School of Mathematical Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan e-mail: [email protected] Honorary Member of Academy of Romanian Scientists, Ilfov, nr. 3, Bucuresti, Romania Correspondence member of Accademia Peloritana dei Pericolanti, Palazzo Università, Piazza S. Pugliatti 1, 98122 Messina, Italy © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_10
Y. Liu and M. Yamamoto
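1 Introduction
Let $\Omega \subset \mathbb{R}^d$ ($d \in \mathbb{N} := \{1, 2, \ldots\}$) be a bounded domain with a smooth boundary $\partial\Omega$, and let $\nu = \nu(x)$ be the unit outward normal vector of $\partial\Omega$. We set
\[ Av(x) = -\sum_{i,j=1}^{d} \partial_j \bigl( a_{ij}(x)\,\partial_i v(x) \bigr) + \sum_{j=1}^{d} b_j(x)\,\partial_j v(x) + c(x)\,v(x), \quad x \in \Omega, \tag{1} \]
where $a_{ij} = a_{ji} \in C^1(\overline{\Omega})$, $b_j \in C(\overline{\Omega})$ ($1 \le i, j \le d$), $c \in C(\overline{\Omega})$ and we assume that there exists a constant $\kappa > 0$ such that
\[ \sum_{i,j=1}^{d} a_{ij}(x)\,\xi_i \xi_j \ge \kappa \sum_{j=1}^{d} |\xi_j|^2, \quad \forall\, x \in \overline{\Omega},\ \forall\, (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d. \]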
We consider an initial-boundary value problem for a time-fractional diffusion equation whose source term is described by $\mu(t) f(x)$, where $f$ is a spatial distribution of the source and $\mu$ is a temporal change factor. We can describe a governing initial-boundary value problem as follows:
\[ \begin{cases} (d_t^{\alpha} + A)u(x, t) = \mu(t) f(x), & (x, t) \in \Omega \times (0, T), \\ u(x, 0) = 0, & x \in \Omega, \\ u(x, t) = 0, & (x, t) \in \partial\Omega \times (0, T). \end{cases} \tag{2} \]
Here, for $0 < \alpha < 1$, we can formally define the pointwise Caputo derivative as
\[ d_t^{\alpha} v(t) := \frac{1}{\Gamma(1 - \alpha)} \int_0^t (t - s)^{-\alpha}\, v'(s)\, ds, \quad v \in C^1[0, T]. \]
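As a quick sanity check of this definition (a standard worked example, not taken from the paper), take $v(t) = t$, so that $v'(s) = 1$:

\[ d_t^{\alpha} t = \frac{1}{\Gamma(1 - \alpha)} \int_0^t (t - s)^{-\alpha}\, ds = \frac{t^{1-\alpha}}{(1 - \alpha)\,\Gamma(1 - \alpha)} = \frac{t^{1-\alpha}}{\Gamma(2 - \alpha)}. \]

As $\alpha \to 1$ the right-hand side tends to $1$, the ordinary first derivative of $t$, while as $\alpha \to 0$ it tends to $t = v(t) - v(0)$.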
Owing to their capability of describing memory effects, time-fractional partial differential equations such as (2) have gained consistent popularity among multidisciplinary researchers as models for anomalous diffusion and viscoelasticity (e.g. Barlow and Perkins 1988; Brown et al. 2018; Hatano and Hatano 1998). Though the history of fractional calculus can be traced back to Leibniz, until recent decades the main focus on fractional equations was biased toward the construction of explicit and approximate solutions via special functions and transforms, owing to the needs of applied science. Only recently have problems like (2) been formulated in appropriate function spaces using modern mathematical tools, followed by a rapidly increasing literature on their fundamental theories, numerical analysis, and inverse problems. Here we do not intend to give a complete bibliography, but only refer to several milestone works (Eidelman and Kochubei 2004; Gorenflo et al. 2015; Kubica et al. 2020; Sakamoto and Yamamoto 2011) and the references therein.
Uniqueness of Inverse Source Problems for Time-Fractional Diffusion …
In particular, the source term in (2) takes the form of separated variables, where $\mu(t)$ and $f(x)$ describe the time evolution and the spatial distribution of some contaminant source, respectively. Therefore, the determination of $\mu(t)$ or $f(x)$ turns out to be important in the context of environmental issues, which motivates us to propose the following problem.
Problem 1 (inverse source problems) Let $u$ satisfy (2), $\emptyset \neq \omega \subset \Omega$ be an arbitrary subdomain and $x_0 \in \Omega$ be an arbitrary point. Determine $\mu|_{(0,T)}$ by $u|_{\{x_0\} \times (0,T)}$ with given $f$ and $f|_{\Omega}$ by $u|_{\omega \times (0,T)}$ with given $\mu$, respectively.
Indeed, Problem 1 includes two inverse problems, namely, the determination of $\mu(t)$ by the single point observation and that of $f(x)$ by the partial interior observation of $u$. Both problems have been studied intensively in the last decade, and in particular the uniqueness is already known in the literature. We can refer to many works, but here only to Liu et al. (2016), Liu (2017), Liu and Zhang (2017) for the inverse t-source problem and Jiang et al. (2017), Kian et al. (2022a, b), Li et al. (2023) for the inverse x-source problem. For a comprehensive survey on Problem 1 especially before 2019, we refer to Liu et al. (2019). However, the existing papers mainly discuss regular temporal components $\mu$, e.g. in $C^1[0, T]$ or $L^2(0, T)$. Such restrictions exclude a wide class of singular functions represented by the Dirac delta function, which corresponds to point sources in practice. This encourages us to reconsider Problem 1 in function spaces with lower regularity. More precisely, in this article we are mainly concerned with the case of $\mu = \mu(t)$ in a Sobolev space of negative order, in particular $\mu \notin L^2(0, T)$. For such less regular $\mu$, we cannot expect the differentiability of $u(x, \cdot\,)$ in time and we must redefine the Caputo derivative $d_t^{\alpha}$ in (2). In this case, the unique existence of a solution to the initial-boundary value problem is more delicate, and an adequate formulation is needed.
This article is composed of 6 sections. Redefining $d_t^{\alpha}$ and thus problem (2) in negative Sobolev spaces, in Sect. 2 we state the main result of this paper. Then Sects. 3–4 are devoted to the two key ingredients for treating the singular system, i.e., the transfer to another regular system in $L^2$ and Duhamel’s principle in the ${}^{\alpha}H(0, T)$-space. Next, the proof of Theorem 1 is completed in Sect. 5. Finally, Sect. 6 provides concluding remarks, and the proof of a technical detail is postponed to the end.
2 Preliminary and Statement of the Main Result

To start with, we first define a fractional derivative for $v \in L^2(0,T)$ which extends the domain of $d_t^\alpha$ and formulate the initial-boundary value problem. To this end, we introduce function spaces and operators. Set the forward and backward Riemann-Liouville integral operators as
Y. Liu and M. Yamamoto
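$$(J^\alpha v)(t) := \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} v(s)\,ds, \quad 0 < t < T, \quad D(J^\alpha) = L^2(0,T),$$
$$(J_\alpha v)(t) := \frac{1}{\Gamma(\alpha)} \int_t^T (s-t)^{\alpha-1} v(s)\,ds, \quad 0 < t < T, \quad D(J_\alpha) = L^2(0,T).$$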
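As a purely illustrative aside, the two operators can be approximated numerically. The following minimal Python sketch is not part of the original argument: the function names `rl_forward`, `rl_backward` and `l2_pairing`, the midpoint quadrature and the substitution $r = (t-s)^\alpha$ are all assumptions made for this illustration. It checks the closed form $J^\alpha t = t^{1+\alpha}/\Gamma(2+\alpha)$ and the duality $(J^\alpha u, v)_{L^2(0,T)} = (u, J_\alpha v)_{L^2(0,T)}$ on simple test functions.

```python
import math


def rl_forward(v, t, alpha, n=2000):
    """Approximate (J^alpha v)(t) = 1/Gamma(alpha) * int_0^t (t-s)^(alpha-1) v(s) ds.

    The substitution r = (t-s)^alpha removes the weak singularity, giving
    (J^alpha v)(t) = 1/Gamma(alpha+1) * int_0^(t^alpha) v(t - r^(1/alpha)) dr,
    which is evaluated with the composite midpoint rule on n cells.
    """
    h = t ** alpha / n
    total = math.fsum(v(t - ((i + 0.5) * h) ** (1.0 / alpha)) for i in range(n))
    return h * total / math.gamma(alpha + 1.0)


def rl_backward(v, t, alpha, T, n=2000):
    """Approximate (J_alpha v)(t) through the reflection (tau v)(t) = v(T - t)."""
    return rl_forward(lambda s: v(T - s), T - t, alpha, n)


def l2_pairing(u, v, T, n=400):
    """Midpoint approximation of int_0^T u(t) v(t) dt."""
    h = T / n
    return h * math.fsum(u((i + 0.5) * h) * v((i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    alpha, T = 0.5, 1.0
    # J^alpha applied to v(s) = s has the closed form t^(1+alpha) / Gamma(2+alpha).
    print(rl_forward(lambda s: s, 1.0, alpha), 1.0 / math.gamma(2.0 + alpha))
    # Duality: (J^alpha u, v) and (u, J_alpha v) should agree up to quadrature error.
    u = lambda s: s
    v = lambda s: 1.0
    lhs = l2_pairing(lambda t: rl_forward(u, t, alpha, 400), v, T)
    rhs = l2_pairing(u, lambda t: rl_backward(v, t, alpha, T, 400), T)
    print(lhs, rhs)
```

The backward operator is obtained here from the forward one by time reflection, which mirrors the role of the reflection operator $\tau$ introduced in this section.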
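We define an operator $\tau : L^2(0,T) \longrightarrow L^2(0,T)$ by $(\tau v)(t) := v(T-t)$; obviously $\tau$ is an isomorphism. By $H^\alpha(0,T)$ we denote the Sobolev-Slobodecki space equipped with the norm $\|\cdot\|_{H^\alpha(0,T)}$ defined by
$$\|v\|_{H^\alpha(0,T)} := \left( \|v\|_{L^2(0,T)}^2 + \int_0^T\!\!\int_0^T \frac{|v(t)-v(s)|^2}{|t-s|^{1+2\alpha}}\,dt\,ds \right)^{1/2}$$
(e.g., Adams 1975). Moreover, we define the Sobolev spaces
$$H_\alpha(0,T) := \begin{cases} H^\alpha(0,T), & 0 < \alpha < 1/2, \\ \left\{ v \in H^{1/2}(0,T) \,\middle|\, \int_0^T \frac{|v(t)|^2}{t}\,dt < \infty \right\}, & \alpha = 1/2, \\ \{ v \in H^\alpha(0,T) \mid v(0) = 0 \}, & 1/2 < \alpha \le 1 \end{cases}$$
with the norm
$$\|v\|_{H_\alpha(0,T)} := \begin{cases} \|v\|_{H^\alpha(0,T)}, & \alpha \ne 1/2, \\ \left( \|v\|_{H^{1/2}(0,T)}^2 + \int_0^T \frac{|v(t)|^2}{t}\,dt \right)^{1/2}, & \alpha = 1/2 \end{cases}$$
and
$${}^{\alpha}H(0,T) := \begin{cases} H^\alpha(0,T), & 0 < \alpha < 1/2, \\ \left\{ v \in H^{1/2}(0,T) \,\middle|\, \int_0^T \frac{|v(t)|^2}{T-t}\,dt < \infty \right\}, & \alpha = 1/2, \\ \{ v \in H^\alpha(0,T) \mid v(T) = 0 \}, & 1/2 < \alpha \le 1 \end{cases}$$
with the norm
$$\|v\|_{{}^{\alpha}H(0,T)} := \begin{cases} \|v\|_{H^\alpha(0,T)}, & \alpha \ne 1/2, \\ \left( \|v\|_{H^{1/2}(0,T)}^2 + \int_0^T \frac{|v(t)|^2}{T-t}\,dt \right)^{1/2}, & \alpha = 1/2. \end{cases}$$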
Then we can characterize the ranges of $J^\alpha$ and $J_\alpha$ as follows.

Lemma 1 Let $0 < \alpha < 1$.
(i) $J^\alpha : L^2(0,T) \longrightarrow H_\alpha(0,T)$ is bijective and an isomorphism.
(ii) $J_\alpha : L^2(0,T) \longrightarrow {}^{\alpha}H(0,T)$ is bijective and an isomorphism.

As for the proof of Lemma 1(i), see Gorenflo et al. (2015), Kubica et al. (2020). Lemma 1(ii) can be readily derived from Lemma 1(i), the identity $J^\alpha v = \tau(J_\alpha(\tau v))$ and the fact that $\tau$ is an isomorphism. Moreover, setting
$${}_0C^1[0,T] := \{ v \in C^1[0,T] \mid v(0) = 0 \}, \quad {}^0C^1[0,T] := \{ v \in C^1[0,T] \mid v(T) = 0 \} = \tau({}_0C^1[0,T]),$$
we can prove
$$H_\alpha(0,T) = \overline{{}_0C^1[0,T]}^{\,H_\alpha(0,T)}, \quad {}^{\alpha}H(0,T) = \overline{{}^0C^1[0,T]}^{\,{}^{\alpha}H(0,T)}$$
(e.g., Kubica et al. 2020). Based on the spaces $H_\alpha(0,T)$ and ${}^{\alpha}H(0,T)$, let
$$H_\alpha(0,T) \subset L^2(0,T) \subset (H_\alpha(0,T))^* =: H_{-\alpha}(0,T),$$
$${}^{\alpha}H(0,T) \subset L^2(0,T) \subset ({}^{\alpha}H(0,T))^* =: {}^{-\alpha}H(0,T)$$
be the Gel'fand triples, where $(\,\cdot\,)^*$ denotes the dual space. Let $X, Y$ be Banach spaces and $K : X \longrightarrow Y$ be a bounded linear operator with its domain $D(K) = X$. By $K^*$ we denote the adjoint operator of $K$, that is, $K^* : Y^* \longrightarrow X^*$ is a linear operator such that
$${}_{X^*}\langle K^* y, x \rangle_X = {}_{Y^*}\langle y, Kx \rangle_Y, \quad \forall\, x \in X, \ \forall\, y \in D(K^*) \subset Y^*.$$
We understand $K^*$ as the unique maximal operator adjoint to $K$, i.e., any other operator adjoint to $K$ is a restriction of $K^*$. Now we can prove the following lemma.

Lemma 2 (Yamamoto 2022, Proposition 9) Let $0 < \beta < 1$.
(i) $(J^\beta)^* : H_{-\beta}(0,T) \longrightarrow L^2(0,T)$ is bijective and an isomorphism.
(ii) $(J_\beta)^* : {}^{-\beta}H(0,T) \longrightarrow L^2(0,T)$ is bijective and an isomorphism.
(iii) There holds $J^\beta \subset (J_\beta)^*$. In particular, $J^\beta v = (J_\beta)^* v$ for $v \in L^2(0,T)$.

Now we can redefine the Caputo derivative for functions in $L^2(0,T)$.

Definition 1 We define $\partial_t^\alpha := ((J_\alpha)^*)^{-1}$, $D(\partial_t^\alpha) = L^2(0,T)$.
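Remark 1 Kubica et al. (2020) defined $\widetilde{\partial}_t^\alpha$ by
$$\widetilde{\partial}_t^\alpha := (J^\alpha)^{-1}, \quad D(\widetilde{\partial}_t^\alpha) = H_\alpha(0,T).$$
By Lemma 2(iii), we see $\partial_t^\alpha \supset \widetilde{\partial}_t^\alpha$ and
$$\partial_t^\alpha v = \widetilde{\partial}_t^\alpha v = \lim_{n\to\infty} d_t^\alpha v_n \ \text{ in } L^2(0,T), \quad \forall\, v \in H_\alpha(0,T), \tag{3}$$
where $v_n \in {}_0C^1[0,T]$ and $v_n \longrightarrow v$ in $H_\alpha(0,T)$. Thus $\partial_t^\alpha$ is an extension of $\widetilde{\partial}_t^\alpha$.

Now we are ready to propose the initial-boundary value problem:
$$(\partial_t^\alpha + A)u = \mu(t) f(x) \ \text{ in } {}^{-\beta}H(0,T;L^2(\Omega)), \tag{4}$$
$$u \in L^2(0,T;L^2(\Omega)) \cap {}^{-\beta}H(0,T;H^2(\Omega)\cap H_0^1(\Omega)), \tag{5}$$
where
$$\mu \in {}^{-\beta}H(0,T), \quad f \in L^2(\Omega), \quad \alpha \le \beta < 1, \quad \beta > 1/2. \tag{6}$$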
Henceforth we assume the existence of such a solution $u$, although it can be verified during the succeeding arguments. Now we are well prepared to state the main result on the uniqueness for Problem 1 with singularity in time.

Theorem 1 Let $u$ satisfy (4)–(5) and assume (6). Let $\omega \subset \Omega$ and $x_0 \in \Omega$ be the same as those in Problem 1.
(i) Let $\mu \not\equiv 0$ in $(0,T)$. If $u = 0$ in $\omega\times(0,T)$, then $f = 0$ in $L^2(\Omega)$.
(ii) In (1) we assume that $b_j = 0$ ($j = 1,\dots,d$) and $c \ge 0$ in $\Omega$. Let $f$ satisfy
$$f \not\equiv 0 \ \text{ in } \Omega, \quad\text{and}\quad f \ge 0 \ \text{ or } \ f \le 0 \ \text{ in } \Omega, \tag{7}$$
$$f \in \begin{cases} L^2(\Omega) & \text{if } d = 1,2,3, \\ D(A^\theta) & \text{if } d \ge 4, \ \text{where } \theta > d/4 - 1. \end{cases} \tag{8}$$
Then $u = 0$ at $\{x_0\}\times(0,T)$ implies $\mu = 0$ in ${}^{-\beta}H(0,T)$.

In Theorem 1(ii), in terms of (8) we can verify
$$u \in {}^{-\beta}H(0,T;C(\overline{\Omega})). \tag{9}$$
Therefore, the data $u(x_0,\cdot) \in {}^{-\beta}H(0,T)$ makes sense. The proof of (9) is given at the end of the article. In Theorem 1(ii), we can further study and prove the uniqueness in the case where we replace assumption (7) by more general conditions, e.g., $b_j \ne 0$ and $c$ is not necessarily non-negative, but we omit details. Moreover, here we omit the treatment of the case $1 < \alpha < 2$.
Since ${}^{\beta}H(0,T) \subset C[0,T]$ by $\beta > 1/2$, we see that the following $\mu$ are in ${}^{-\beta}H(0,T)$: $\mu \in L^1(0,T)$ and $\mu(t) = \sum_{k=1}^{N} r_k \delta_{a_k}(t)$, where $r_k \in \mathbb{R}$, $0 < a_k < T$, and $\delta_{a_k}$ is the Dirac delta function: $\langle \delta_{a_k}, \varphi \rangle = \varphi(a_k)$ for $\varphi \in C[0,T]$. In particular, from Theorem 1(ii) we can directly derive

Corollary 1 Under the same conditions as those in Theorem 1(ii), we set
$$\mu^\ell(t) = \sum_{k=1}^{N^\ell} r_k^\ell\, \delta_{a_k^\ell}(t), \quad \ell = 1,2, \quad a_k^\ell \in (0,T), \ r_k^\ell \in \mathbb{R}\setminus\{0\}, \ k = 1,\dots,N^\ell,$$
where $a_k^\ell$ ($k = 1,2,\dots,N^\ell$) are mutually distinct for $\ell = 1,2$. Let $u^\ell$ be the solution to (4)–(5) with $\mu = \mu^\ell$ ($\ell = 1,2$). Then $u^1(x_0,t) = u^2(x_0,t)$ for $0 < t < T$ implies $N^1 = N^2$, $r_k^1 = r_k^2$ and $a_k^1 = a_k^2$ ($k = 1,\dots,N^1$).

For the case of $\mu \in L^2(0,T)$ or $\mu \in L^1(0,T)$, there are rich references, but here we restrict ourselves to Jiang et al. (2017), Kian et al. (2022a, b), Liu et al. (2016). In particular, Kian et al. (2022a) also considers the case $0 < \alpha \le 2$ and variable $\alpha(x)$ for $\mu \in L^1(0,T)$, and we can apply our method to the inverse problems formulated there, as mentioned in Sect. 6.
3 Transfer to a Regular System

The main purpose of this section is the proof of

Proposition 1 Let $u$ satisfy (4)–(5) with (6). Then
$$v := (J_\beta)^* u \in H_\beta(0,T;L^2(\Omega)) \cap L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega)) \tag{10}$$
satisfies
$$(\partial_t^\alpha + A)v = (J_\beta)^*\mu\, f \ \text{ in } L^2(0,T;L^2(\Omega)). \tag{11}$$
Proof We first show

Lemma 3 Let $\alpha, \gamma > 0$ and $\alpha + \gamma < 1$. Then
$$(J_{\alpha+\gamma})^* v = (J_\alpha)^*(J_\gamma)^* v = (J_\gamma)^*(J_\alpha)^* v \quad \text{for } v \in {}^{-\alpha-\gamma}H(0,T).$$

Proof We can directly verify $J_{\alpha+\gamma} = J_\alpha J_\gamma = J_\gamma J_\alpha$ in $L^2(0,T)$. Therefore, by e.g. Kato (1976, Problem 5.26 (p.168)), we conclude the lemma.

We complete the proof of Proposition 1. Since $u \in L^2(0,T;L^2(\Omega))$, by Lemma 2 and $\alpha \le \beta$, we see
$$(J_\beta)^* u = J^\beta u \in H_\beta(0,T;L^2(\Omega)) \subset H_\alpha(0,T;L^2(\Omega)).$$
Since $u \in {}^{-\alpha}H(0,T;H^2(\Omega)\cap H_0^1(\Omega))$, Lemma 2 yields $(J_\alpha)^* u \in L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega))$. Then
$$(J_\beta)^* u = (J_{\beta-\alpha})^*(J_\alpha)^* u = J^{\beta-\alpha}(J_\alpha)^* u \in H_{\beta-\alpha}(0,T;H^2(\Omega)\cap H_0^1(\Omega)) \subset L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega))$$
by Lemma 1(i), Lemma 2(iii) and $(J_\alpha)^* u \in L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega))$. Therefore, we see $v \in L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega))$. Next we operate $(J_\beta)^*$ on both sides of (4) to deduce
$$(J_\beta)^*((J_\alpha)^*)^{-1} u + A(J_\beta)^* u = (J_\beta)^*\mu\, f.$$
We will prove
$$(J_\beta)^*((J_\alpha)^*)^{-1} u = ((J_\alpha)^*)^{-1}(J_\beta)^* u,$$
that is,
$$(J_\alpha)^*(J_\beta)^*((J_\alpha)^*)^{-1} u = (J_\beta)^* u.$$
In fact, by Lemma 3, we see $(J_\beta)^* = (J_{\beta-\alpha})^*(J_\alpha)^* = (J_\alpha)^*(J_{\beta-\alpha})^*$ and then
$$(J_\alpha)^*(J_\beta)^*((J_\alpha)^*)^{-1} u = (J_\alpha)^*(J_{\beta-\alpha})^*(J_\alpha)^*((J_\alpha)^*)^{-1} u = (J_\alpha)^*(J_{\beta-\alpha})^* u = (J_\beta)^* u.$$
Therefore, it follows that
$$(J_\beta)^*((J_\alpha)^*)^{-1} u = ((J_\alpha)^*)^{-1}(J_\beta)^* u = ((J_\alpha)^*)^{-1} v = \partial_t^\alpha v.$$
Hence, we conclude $\partial_t^\alpha v + Av = (J_\beta)^*\mu\, f$.

In terms of Proposition 1, the unique existence of the solution $u$ to (4)–(5) can be clarified via $v := (J_\beta)^* u$ in (10).
4 Duhamel's Principle

In this section, we establish Duhamel's principle, which transforms a solution to an initial-boundary value problem without the inhomogeneous term into a solution to (11). Such a principle is known; see e.g. Liu et al. (2016) as well as the surveys Liu et al. (2019) and Umarov (2019). Here we reformulate the formula in the space $H_\alpha(0,T)$ in order to apply it within our framework.
Lemma 4 (Duhamel's principle in $H_\alpha(0,T)$) Let $g \in L^2(0,T)$ and $f \in L^2(\Omega)$. Let $z$ satisfy
$$\partial_t^\alpha(z - f) + Az = 0, \quad z - f \in H_\alpha(0,T;L^2(\Omega)), \quad z \in L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega)). \tag{12}$$
Then
$$w(\,\cdot\,,t) = \int_0^t g(s)\, z(\,\cdot\,,t-s)\,ds, \quad 0 < t < T \tag{13}$$
satisfies
$$(\partial_t^\alpha + A)w = (J^{1-\alpha}g) f \ \text{ in } L^2(0,T;L^2(\Omega)), \quad w \in H_\alpha(0,T;L^2(\Omega)) \cap L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega)). \tag{14}$$
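Proof Step 1. We prove

Lemma 5 Let $g \in L^2(0,T)$ and $v \in H_\alpha(0,T)$. Then
$$\partial_t^\alpha \left( \int_0^t g(s) v(t-s)\,ds \right) = \int_0^t g(s)\, \partial_t^\alpha v(t-s)\,ds \ \text{ in } L^2(0,T). \tag{15}$$

Proof First we assume $g \in C_0^1[0,T] := \{ h \in C^1[0,T] \mid h(0) = h(T) = 0 \}$ and $v \in {}_0C^1[0,T]$. Then by $v(0) = 0$, we see
$$\left( \int_0^t g(s) v(t-s)\,ds \right)' = \int_0^t g(s) v'(t-s)\,ds \in L^1(0,T)$$
and thus
$$\begin{aligned}
\partial_t^\alpha \left( \int_0^t g(s) v(t-s)\,ds \right) &= d_t^\alpha \left( \int_0^t g(s) v(t-s)\,ds \right) \\
&= \frac{1}{\Gamma(1-\alpha)} \int_0^t (t-s)^{-\alpha} \left( \int_0^s g(r) v'(s-r)\,dr \right) ds \\
&= \int_0^t g(r) \left( \frac{1}{\Gamma(1-\alpha)} \int_r^t (t-s)^{-\alpha} v'(s-r)\,ds \right) dr \\
&= \int_0^t g(r)\, d_t^\alpha v(t-r)\,dr = \int_0^t g(r)\, \partial_t^\alpha v(t-r)\,dr,
\end{aligned}$$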
where we exchanged the orders of the integrals with respect to $s$ and $r$. Therefore, (15) holds for each $g \in C_0^1[0,T]$ and $v \in {}_0C^1[0,T]$.

Next, let $g \in L^2(0,T)$ and $v \in H_\alpha(0,T)$. Since $H_\alpha(0,T) = \overline{{}_0C^1[0,T]}^{\,H_\alpha(0,T)}$ (e.g., Kubica et al. 2020) and $L^2(0,T) = \overline{C_0^1[0,T]}^{\,L^2(0,T)}$, we can choose sequences $\{g_n\} \subset C_0^1[0,T]$ and $\{v_n\} \subset {}_0C^1[0,T]$ such that $g_n \longrightarrow g$ in $L^2(0,T)$ and $v_n \longrightarrow v$ in $H_\alpha(0,T)$ as $n \to \infty$. Henceforth we write $(g * v)(t) = \int_0^t g(s) v(t-s)\,ds$ for $0 < t < T$, and we regard $\partial_t^\alpha$ as $(J^\alpha)^{-1}$, that is, $D(\partial_t^\alpha) = H_\alpha(0,T)$ (see also (3)). As is directly proved, we have $g_n * v_n \in {}_0C^1[0,T]$ and so $g_n * v_n \in H_\alpha(0,T) = D(\partial_t^\alpha)$. Therefore, we see
$$\partial_t^\alpha (g_n * v_n)(t) = (g_n * \partial_t^\alpha v_n)(t), \quad 0 < t < T.$$
Since $v_n \longrightarrow v$ in $H_\alpha(0,T)$, it follows that $\partial_t^\alpha v_n \longrightarrow \partial_t^\alpha v$ in $L^2(0,T)$. Then Young's convolution inequality yields $g_n * \partial_t^\alpha v_n \longrightarrow g * \partial_t^\alpha v$ in $L^2(0,T)$. Therefore, $\partial_t^\alpha(g_n * v_n)$ converges in $L^2(0,T)$ and $\partial_t^\alpha(g * v) = g * \partial_t^\alpha v$ in $L^2(0,T)$. The proof of Lemma 5 is complete.

Step 2. First we have
$$w(\,\cdot\,,t) = \int_0^t g(s)\big(z(\,\cdot\,,t-s) - f\big)\,ds + f \int_0^t g(s)\,ds, \quad 0 < t < T.$$
Since $\int_0^t g(s)\,ds = (J^1 g)(t)$ for $0 < t < T$ and $(\partial_t^\alpha J^1 g)(t) = ((J^\alpha)^{-1} J^1 g)(t) = (J^{1-\alpha} g)(t)$, by Lemma 5 and $z - f \in H_\alpha(0,T;L^2(\Omega))$ we see
$$\partial_t^\alpha w(\,\cdot\,,t) = \int_0^t g(s)\, \partial_t^\alpha (z - f)(\,\cdot\,,t-s)\,ds + (J^{1-\alpha} g)(t) f, \quad 0 < t < T.$$
Moreover, we have $(Aw)(\,\cdot\,,t) = \int_0^t g(s)\, Az(\,\cdot\,,t-s)\,ds$. Hence, we arrive at
$$(\partial_t^\alpha + A)w(\,\cdot\,,t) = \int_0^t g(s)\big(\partial_t^\alpha(z - f) + Az\big)(\,\cdot\,,t-s)\,ds + (J^{1-\alpha} g)(t) f = (J^{1-\alpha} g)(t) f, \quad 0 < t < T.$$
The regularity of $w$ described in (14) follows directly from (13). Thus, by the uniqueness of the solution to (14), the proof of Lemma 4 is complete.
5 Completion of the Proof of Theorem 1

We first show the following key lemma.

Lemma 6 We assume
$$(\partial_t^\alpha + A)v = g f \ \text{ in } L^2(0,T;L^2(\Omega)), \quad v \in H_\alpha(0,T;L^2(\Omega)) \cap L^2(0,T;H^2(\Omega)\cap H_0^1(\Omega)),$$
where $g \in L^2(0,T)$ and $f \in L^2(\Omega)$. Let $\omega \subset \Omega$ and $x_0 \in \Omega$ be the same as those in Problem 1.
(i) Let $g \not\equiv 0$ in $(0,T)$. If $v = 0$ in $\omega\times(0,T)$, then $f \equiv 0$ in $\Omega$.
(ii) Let $f$ satisfy (7)–(8). If $v = 0$ at $\{x_0\}\times(0,T)$, then $g \equiv 0$ in $(0,T)$.

Let Lemma 6 be proved. Then we can complete the proof of Theorem 1 as follows. Let $u$ be the solution to (4)–(5) and satisfy
$$u = 0 \ \text{ in } \omega\times(0,T) \quad\text{or}\quad u = 0 \ \text{ at } \{x_0\}\times(0,T). \tag{16}$$
Setting $v := (J_\beta)^* u$ and $g := (J_\beta)^*\mu \in L^2(0,T)$, we have (10)–(11) and $v \in H_\alpha(0,T;L^2(\Omega))$ according to Proposition 1. Moreover, (16) yields $v = 0$ in $\omega\times(0,T)$ or $v = 0$ at $\{x_0\}\times(0,T)$. Thus, since $(J_\beta)^*\mu = 0$ in $L^2(0,T)$ implies $\mu = 0$ in ${}^{-\beta}H(0,T)$, the application of Lemma 6 completes the proof of Theorem 1. Thus it suffices to prove Lemma 6.

Proof of Lemma 6 We set $w := J^{1-\alpha} v$. We note that by the injectivity of $J^{1-\alpha}$, it follows that $v = 0$ in $\omega\times(0,T)$ and $v = 0$ at $\{x_0\}\times(0,T)$ are equivalent to $w = 0$ in $\omega\times(0,T)$ and $w = 0$ at $\{x_0\}\times(0,T)$, respectively. Hence, we can reduce the proof to the following: Let $w$ satisfy (14). Then
(i) Let $g \not\equiv 0$ in $(0,T)$. If $w = 0$ in $\omega\times(0,T)$, then $f \equiv 0$ in $\Omega$.
(ii) Let $f \in L^2(\Omega)$ satisfy (7)–(8). If $w = 0$ at $\{x_0\}\times(0,T)$, then $g \equiv 0$ in $(0,T)$.

Proof of (i). The proof is similar to that of Jiang et al. (2017, Theorem 2.6) and we describe the essence. In terms of Lemma 4, we have
$$w(\,\cdot\,,t) = \int_0^t g(s)\, z(\,\cdot\,,t-s)\,ds = 0 \ \text{ in } \omega, \quad 0 < t < T.$$
The Titchmarsh convolution theorem (e.g., Titchmarsh 1926) yields that there exists $t_* \in [0,T]$ such that $g = 0$ in $(0,T-t_*)$ and $z = 0$ in $\omega\times(0,t_*)$. Since $g \not\equiv 0$, we see that $t_* > 0$, indicating that $z(\,\cdot\,,t) = 0$ in $\omega$ holds for $t$ in some open interval in $\mathbb{R}$. We apply a uniqueness result (e.g., Jiang et al. 2017 for non-symmetric $A$) to obtain $z \equiv 0$ in $\Omega\times(0,\infty)$. Consequently (12) implies $\partial_t^\alpha(z - f) \equiv 0$ and thus
$$z - f = J^\alpha \partial_t^\alpha (z - f) \equiv 0 \ \text{ in } \Omega\times(0,T).$$
By $z \equiv 0$ in $\Omega\times(0,T)$, we reach $f \equiv 0$ in $\Omega$. This completes the proof of Lemma 6(i).

Proof of (ii). Henceforth, we set $(f,g) := \int_\Omega f(x) g(x)\,dx$. We define $A$ by (1) with $b_j = 0$ ($j = 1,\dots,d$), $c \ge 0$ in $\Omega$ and $D(A) = H^2(\Omega)\cap H_0^1(\Omega)$. Then we number all of its eigenvalues with their multiplicities as
$$0 < \lambda_1 \le \lambda_2 \le \cdots \longrightarrow \infty.$$
Let $\varphi_n$ be an eigenfunction for $\lambda_n$: $A\varphi_n = \lambda_n\varphi_n$, such that $\{\varphi_n\}_{n\in\mathbb{N}}$ forms an orthonormal basis in $L^2(\Omega)$. Then, the fractional power $A^\gamma$ is defined with $\gamma > 0$, and
$$D(A^\gamma) \subset H^{2\gamma}(\Omega), \quad \|f\|_{H^{2\gamma}(\Omega)} \le C \left( \sum_{n=1}^\infty \lambda_n^{2\gamma} |(f,\varphi_n)|^2 \right)^{1/2}$$
for $\gamma \ge 0$ (e.g., Fujiwara 1967; Pazy 1983). Moreover, we define the Mittag-Leffler functions $E_{\alpha,\beta}(z)$ with $\alpha, \beta > 0$ by
$$E_{\alpha,\beta}(z) = \sum_{k=0}^\infty \frac{z^k}{\Gamma(\alpha k + \beta)}, \quad z \in \mathbb{C},$$
where the power series is uniformly and absolutely convergent in any compact set in $\mathbb{C}$ (e.g., Gorenflo et al. 2014; Podlubny 1999). Then we can represent
$$z(\,\cdot\,,t) = \sum_{n=1}^\infty E_{\alpha,1}(-\lambda_n t^\alpha)(f,\varphi_n)\varphi_n \ \text{ in } L^2(0,T;L^2(\Omega)). \tag{17}$$
Then by (8) we can prove that for any fixed $\delta > 0$ and $T_1 > 0$,
$$\text{the series (17) is absolutely convergent in } L^\infty(\delta,T_1;C(\overline{\Omega})) \tag{18}$$
and
$$z(x_0,\,\cdot\,) \in L^1(0,T_1). \tag{19}$$

Verification of (18) and (19). First let $d = 1,2,3$. Since
$$|E_{\alpha,1}(-\lambda_n t^\alpha)| \le \frac{C}{1 + \lambda_n t^\alpha}, \quad n \in \mathbb{N}, \ t > 0$$
(e.g., Podlubny 1999), using (17), we have
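$$\begin{aligned}
\|Az(\,\cdot\,,t)\|_{L^2(\Omega)}^2 &= \sum_{n=1}^\infty \lambda_n^2 |E_{\alpha,1}(-\lambda_n t^\alpha)|^2 |(f,\varphi_n)|^2 \le C \sum_{n=1}^\infty \left( \frac{\lambda_n}{1 + \lambda_n t^\alpha} \right)^2 |(f,\varphi_n)|^2 \\
&\le C\, t^{-2\alpha} \sum_{n=1}^\infty \left( \frac{\lambda_n t^\alpha}{1 + \lambda_n t^\alpha} \right)^2 |(f,\varphi_n)|^2 \le C\, t^{-2\alpha} \sum_{n=1}^\infty |(f,\varphi_n)|^2 \\
&= C\, t^{-2\alpha} \|f\|_{L^2(\Omega)}^2,
\end{aligned}$$
that is, $\|z(\,\cdot\,,t)\|_{C(\overline{\Omega})} \le C\, t^{-\alpha} \|f\|_{L^2(\Omega)}$ because
$$\|z(\,\cdot\,,t)\|_{C(\overline{\Omega})} \le C \|z(\,\cdot\,,t)\|_{H^2(\Omega)} \le C \|Az(\,\cdot\,,t)\|_{L^2(\Omega)},$$
which is seen by $d \le 3$ and the Sobolev embedding. Therefore, (18) and (19) are seen for $d = 1,2,3$.

Next let $d \ge 4$. Then, since $\theta > d/4 - 1$ by (8), the Sobolev embedding yields $D(A^{1+\theta}) \subset H^{2+2\theta}(\Omega) \subset C(\overline{\Omega})$. Therefore,
$$\begin{aligned}
|z(x_0,t)| &\le C \|z(\,\cdot\,,t)\|_{C(\overline{\Omega})} \le C \|z(\,\cdot\,,t)\|_{H^{2+2\theta}(\Omega)} \le C \|A^{1+\theta} z(\,\cdot\,,t)\|_{L^2(\Omega)} \\
&= C \left\| \sum_{n=1}^\infty \lambda_n E_{\alpha,1}(-\lambda_n t^\alpha)(A^\theta f,\varphi_n)\varphi_n \right\|_{L^2(\Omega)} = C \left( \sum_{n=1}^\infty \lambda_n^2 |E_{\alpha,1}(-\lambda_n t^\alpha)|^2 |(A^\theta f,\varphi_n)|^2 \right)^{1/2} \\
&\le C\, t^{-\alpha} \left( \sum_{n=1}^\infty \left( \frac{\lambda_n t^\alpha}{1 + \lambda_n t^\alpha} \right)^2 |(A^\theta f,\varphi_n)|^2 \right)^{1/2} \le C\, t^{-\alpha} \|A^\theta f\|_{L^2(\Omega)},
\end{aligned}$$
so that the verification of (18) and (19) is complete.

Similarly to the proof of (i), if $g \not\equiv 0$ in $(0,T)$, then $z(x_0,\,\cdot\,) \equiv 0$ in $(0,t_*)$ with some constant $t_* > 0$. In terms of (18), we apply the $t$-analyticity of $z(x_0,t)$ (e.g., Sakamoto and Yamamoto 2011) to reach $z(x_0,t) = 0$ for all $t > 0$. Therefore, we obtain
$$\sum_{n=1}^\infty E_{\alpha,1}(-\lambda_n t^\alpha)(f,\varphi_n)\varphi_n(x_0) = 0, \quad t > \delta,$$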
where the series is absolutely convergent in $L^\infty(\delta,T_1)$ with arbitrary $T_1 > 0$. Not counting the multiplicities, we rearrange all the eigenvalues of $A$ as
$$0 < \rho_1 < \rho_2 < \cdots \longrightarrow \infty$$
and by $\{\varphi_{nj}\}_{1\le j\le d_n}$ we denote an orthonormal basis of $\ker(\rho_n - A)$. In other words, $\{\rho_n\}_{n\in\mathbb{N}}$ is the set of all distinct eigenvalues of $A$. We set $P_n f := \sum_{j=1}^{d_n} (f,\varphi_{nj})\varphi_{nj}$. Hence we can write
$$\sum_{n=1}^\infty E_{\alpha,1}(-\rho_n t^\alpha)(P_n f)(x_0) = 0, \quad t > \delta. \tag{20}$$
On the other hand, we know
$$E_{\alpha,1}(-\rho_n t^\alpha) = \frac{1}{\Gamma(1-\alpha)\rho_n t^\alpha} + O\!\left( \frac{1}{\rho_n^2 t^{2\alpha}} \right) \quad \text{as } t \to \infty \tag{21}$$
(e.g., Podlubny (1999, Theorem 1.4 (pp. 33–34))). Since the series in (20) converges in $L^\infty(\delta,T_1)$, extracting a subsequence of partial sums for the limit, we see that the subsequence of the partial sums is convergent for almost all $t \in (\delta,T_1)$. Hence, by (21) we obtain
$$\frac{1}{\Gamma(1-\alpha)}\,\frac{1}{t^\alpha} \sum_{n=1}^\infty \frac{(P_n f)(x_0)}{\rho_n} + \frac{1}{t^{2\alpha}} \sum_{n=1}^\infty c_n (P_n f)(x_0) = 0$$
for almost all $t \in (\delta,\infty)$, where $|c_n| \le O(\rho_n^{-2})$. Multiplying by $t^\alpha$ and choosing a sequence $\{t_m\}$ tending to $\infty$, we reach
$$\sum_{n=1}^\infty \frac{(P_n f)(x_0)}{\rho_n} = 0.$$
Since it is assumed in Theorem 1(ii) that $c \ge 0$ in $\Omega$, we see that $A^{-1}$ exists and is a bounded operator from $L^2(\Omega)$ to itself and
$$A^{-1} f = \sum_{n=1}^\infty \frac{P_n f}{\rho_n} \ \text{ in } L^2(\Omega).$$
Therefore, we conclude $(A^{-1} f)(x_0) = 0$. Set $\psi := A^{-1} f \in H^2(\Omega)\cap H_0^1(\Omega)$. Then
$$A\psi = f \ \text{ in } \Omega \quad\text{and}\quad \psi(x_0) = 0. \tag{22}$$
Since $c \ge 0$ in (1) and $f \ge 0$ or $f \le 0$ in $\Omega$, using (22) and applying the strong maximum principle for $A$ (e.g., Gilbarg and Trudinger 2001), we conclude $\psi \equiv 0$ in $\Omega$, that is, $f \equiv 0$ in $\Omega$. This contradicts the assumption $f \not\equiv 0$ in $\Omega$. Therefore, $t_* > 0$ is impossible. Hence $t_* = 0$ and so the Titchmarsh convolution theorem yields $g \equiv 0$ in $(0,T)$. This completes the proof of Lemma 6(ii).
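The asymptotic relation (21) used above can be checked numerically in a special case. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the truncated power series is adequate only for moderate arguments, and the closed form $E_{1/2,1}(-x) = e^{x^2}\operatorname{erfc}(x)$ is a standard identity that is not taken from this article. For $\alpha = 1/2$ it compares $E_{\alpha,1}(-x)$ with the leading term $1/(\Gamma(1-\alpha)x)$.

```python
import math


def mittag_leffler(alpha, beta, z, terms=100):
    """Truncated power series sum_{k<terms} z^k / Gamma(alpha*k + beta).

    Only suitable for moderate |z|: for large negative z the alternating
    terms cancel and floating-point accuracy is lost.
    """
    return math.fsum(z ** k / math.gamma(alpha * k + beta) for k in range(terms))


def ml_half(x):
    """Closed form E_{1/2,1}(-x) = exp(x^2) * erfc(x) for x >= 0."""
    return math.exp(x * x) * math.erfc(x)


def leading_term(alpha, x):
    """Leading term 1 / (Gamma(1 - alpha) * x) of E_{alpha,1}(-x) as x -> infinity."""
    return 1.0 / (math.gamma(1.0 - alpha) * x)


if __name__ == "__main__":
    # The series and the closed form agree for a moderate argument.
    print(mittag_leffler(0.5, 1.0, -1.0), ml_half(1.0))
    # For growing x the ratio to the leading term approaches 1.
    for x in (5.0, 10.0, 20.0):
        print(x, ml_half(x) / leading_term(0.5, x))
```

In this special case the ratio tends to 1 at the rate suggested by the $O(x^{-2})$ remainder, which is the decay that allows the leading coefficient to be isolated after multiplying by $t^\alpha$.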
6 Concluding Remarks

In this article, we consider an initial-boundary value problem for
$$(\partial_t^\alpha + A)u = \mu(t) f(x), \tag{23}$$
where $\mu = \mu(t)$ is in a Sobolev space of negative order. The main machinery is to apply the extended Riemann-Liouville fractional integral operator $(J_\beta)^*$ (see Lemma 2 in Sect. 2) to reduce (23) to
$$(\partial_t^\alpha + A)v = ((J_\beta)^*\mu)(t) f(x), \tag{24}$$
where $v := (J_\beta)^* u$ and $(J_\beta)^*\mu \in L^2(0,T)$. Thus for the inverse source problems for (23), we can assume that $\mu \in L^2(0,T)$ by replacing (23) by (24). In this article, we limit the range of $\beta$ to $(0,1)$ for simplicity, but we can choose arbitrary $\beta > 0$. Therefore, for inverse source problems for (23), it is even sufficient to assume that $\mu$ is smooth or $\mu \in L^\infty(0,T)$.

The same transformation for inverse source problems is valid for general time-fractional differential equations including fractional derivatives of variable orders $\alpha_k(x)$:
$$\sum_{k=1}^{N} p_k(x)\, \partial_t^{\alpha_k(x)} u + Au = \mu(t) f(x) \tag{25}$$
with suitable conditions on $p_k$, $\alpha_k$. In particular, also for (25), we can similarly discuss the determination of $\mu \in {}^{-\beta}H(0,T)$ with $\beta > 0$ by transforming $\mu$ to a smooth function.

Proof of (9) Since $g := (J_\beta)^*\mu \in L^2(0,T)$ by $\mu \in {}^{-\beta}H(0,T)$, by Lemma 2(ii), it suffices to prove that the solution $v := (J_\beta)^* u$ to (11) satisfies $v \in L^2(0,T;C(\overline{\Omega}))$ if $f$ satisfies (8). By Sakamoto and Yamamoto (2011) for example, we have
$$v(\,\cdot\,,t) = \sum_{n=1}^\infty \left( \int_0^t s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n s^\alpha)\, g(t-s)\,ds \right) (f,\varphi_n)\varphi_n, \quad 0 < t < T.$$
Moreover, we know
$$\lambda_n t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha) = -\frac{d}{dt} E_{\alpha,1}(-\lambda_n t^\alpha), \quad t > 0 \tag{26}$$
and
$$t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha) \ge 0, \quad t > 0. \tag{27}$$
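We can directly verify (26) by the termwise differentiation because $E_{\alpha,1}(-\lambda_n t^\alpha)$ is an entire function, while (27) follows from the complete monotonicity of $E_{\alpha,1}(-\lambda_n t^\alpha)$ (e.g., Gorenflo et al. 2014).

Let $d = 1,2,3$. Then
$$\|Av(\,\cdot\,,t)\|_{L^2(\Omega)}^2 = \sum_{n=1}^\infty \left| \int_0^t \lambda_n s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n s^\alpha)\, g(t-s)\,ds \right|^2 |(f,\varphi_n)|^2.$$
Hence, Young's convolution inequality implies
$$\left\| \int_0^t \lambda_n s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n s^\alpha)\, g(t-s)\,ds \right\|_{L^2(0,T)} \le \left\| \lambda_n t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha) \right\|_{L^1(0,T)} \|g\|_{L^2(0,T)}, \tag{28}$$
and (26) and (27) yield
$$\left\| \lambda_n t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha) \right\|_{L^1(0,T)} = \int_0^T \lambda_n t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha)\,dt = -\int_0^T \frac{d}{dt} E_{\alpha,1}(-\lambda_n t^\alpha)\,dt = 1 - E_{\alpha,1}(-\lambda_n T^\alpha) \le 1.$$
Therefore, by (28) we see
$$\begin{aligned}
\int_0^T \|Av(\,\cdot\,,t)\|_{L^2(\Omega)}^2\,dt &= \sum_{n=1}^\infty \left\| \int_0^t \lambda_n s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n s^\alpha)\, g(t-s)\,ds \right\|_{L^2(0,T)}^2 |(f,\varphi_n)|^2 \\
&\le \|g\|_{L^2(0,T)}^2 \sum_{n=1}^\infty |(f,\varphi_n)|^2 \le \|g\|_{L^2(0,T)}^2 \|f\|_{L^2(\Omega)}^2.
\end{aligned}$$
Therefore, the Sobolev embedding $H^2(\Omega) \subset C(\overline{\Omega})$ by $d = 1,2,3$ yields
$$\|v\|_{L^2(0,T;C(\overline{\Omega}))}^2 \le C \|g\|_{L^2(0,T)}^2 \|f\|_{L^2(\Omega)}^2, \tag{29}$$
which means (9) for $d = 1,2,3$.

Let $d \ge 4$. Then we assume $f \in D(A^\theta)$ with $\theta > d/4 - 1$. Since $\lambda_n^\theta \varphi_n = A^\theta \varphi_n$, applying (28) and (29), we have
$$\|A^{1+\theta} v(\,\cdot\,,t)\|_{L^2(\Omega)}^2 = \sum_{n=1}^\infty \left| \int_0^t \lambda_n s^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n s^\alpha)\, g(t-s)\,ds \right|^2 |(A^\theta f,\varphi_n)|^2,$$
and so
$$\|v\|_{L^2(0,T;D(A^{1+\theta}))}^2 \le C \sum_{n=1}^\infty \|g\|_{L^2(0,T)}^2 |(A^\theta f,\varphi_n)|^2 \le C \|g\|_{L^2(0,T)}^2 \|A^\theta f\|_{L^2(\Omega)}^2.$$
By the Sobolev embedding $D(A^{1+\theta}) \subset H^{2+2\theta}(\Omega) \subset C(\overline{\Omega})$, we complete the proof of (9).

Acknowledgements Y. Liu is supported by Grant-in-Aid for Early Career Scientists 20K14355 and 22K13954, JSPS. M. Yamamoto is supported by Grant-in-Aid for Scientific Research (A) 20H00117 and Grant-in-Aid for Challenging Research (Pioneering) 21K18142, JSPS.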
References

Adams RA (1975) Sobolev spaces. Academic, New York
Barlow MT, Perkins EA (1988) Brownian motion on the Sierpiński gasket. Probab Theory Related Fields 79:543–623. https://doi.org/10.1007/BF00318785
Brown TS, Du S, Eruslu H, Sayas FJ (2018) Analysis of models for viscoelastic wave propagation. Appl Math Nonlinear Sci 3:55–96. https://doi.org/10.21042/AMNS.2018.1.00006
Eidelman SD, Kochubei AN (2004) Cauchy problem for fractional diffusion equations. J Differ Eqs 199:211–255. https://doi.org/10.1016/j.jde.2003.12.002
Fujiwara D (1967) Concrete characterization of the domains of fractional powers of some elliptic differential operators of the second order. Proc Jpn Acad 43:82–86. https://doi.org/10.3792/pja/1195521686
Gilbarg D, Trudinger NS (2001) Elliptic partial differential equations of second order. Springer, Berlin
Gorenflo R, Kilbas AA, Mainardi F, Rogosin SV (2014) Mittag-Leffler functions. Related topics and applications. Springer, Berlin
Gorenflo R, Luchko Y, Yamamoto M (2015) Time-fractional diffusion equation in the fractional Sobolev spaces. Fract Calc Appl Anal 18:799–820. https://doi.org/10.1515/fca-2015-0048
Hatano Y, Hatano N (1998) Dispersive transport of ions in column experiments: an explanation of long-tailed profiles. Water Resour Res 34:1027–1033. https://doi.org/10.1029/98WR00214
Jiang D, Li Z, Liu Y, Yamamoto M (2017) Weak unique continuation property and a related inverse source problem for time-fractional diffusion-advection equations. Inverse Probl 33:055013. https://doi.org/10.1088/1361-6420/aa58d1
Kato T (1976) Perturbation theory for linear operators. Springer, Berlin
Kian Y, Liu Y, Yamamoto M (2022a) Uniqueness of inverse source problems for general evolution equations. Commun Contemporary Math. https://doi.org/10.1142/S0219199722500092
Kian Y, Soccorsi É, Xue Q, Yamamoto M (2022b) Identification of time-varying source term in time-fractional diffusion equations. Commun Math Sci 20:53–84. https://doi.org/10.4310/CMS.2022.v20.n1.a2
Kubica A, Ryszewska K, Yamamoto M (2020) Theory of time-fractional differential equations: an introduction. Springer, Tokyo
Li Z, Liu Y, Yamamoto M (2023) Inverse source problem for a one-dimensional time-fractional diffusion equation and unique continuation for weak solutions. Inverse Probl Imaging 17:1–22. https://doi.org/10.3934/ipi.2022027
Liu Y (2017) Strong maximum principle for multi-term time-fractional diffusion equations and its application to an inverse source problem. Comput Math Appl 73:96–108. https://doi.org/10.1016/j.camwa.2016.10.021
Liu Y, Li Z, Yamamoto M (2019) Inverse problems of determining sources of the fractional partial differential equations. In: Kochubei A, Luchko Y (eds) Handbook of fractional calculus with applications volume 2: fractional differential equations. De Gruyter, Berlin, pp 411–430. https://doi.org/10.1515/9783110571660-018
Liu Y, Rundell W, Yamamoto M (2016) Strong maximum principle for fractional diffusion equations and an application to an inverse source problem. Fract Calc Appl Anal 19:888–906. https://doi.org/10.1515/fca-2016-0048
Liu Y, Zhang Z (2017) Reconstruction of the temporal component in the source term of a (time-fractional) diffusion equation. J Phys A 50:305203. https://doi.org/10.1088/1751-8121/aa763a
Pazy A (1983) Semigroups of linear operators and applications to partial differential equations. Springer, Berlin
Podlubny I (1999) Fractional differential equations. Academic Press, San Diego
Sakamoto K, Yamamoto M (2011) Initial value/boundary value problems for fractional diffusion-wave equations and applications to some inverse problems. J Math Anal Appl 382:426–447. https://doi.org/10.1016/j.jmaa.2011.04.058
Titchmarsh EC (1926) The zeros of certain integral functions. Proc Lond Math Soc 25:283–302. https://doi.org/10.1112/plms/s2-25.1.283
Umarov S (2019) Fractional Duhamel principle. In: Kochubei A, Luchko Y (eds) Handbook of fractional calculus with applications volume 2: fractional differential equations. De Gruyter, Berlin, pp 383–410. https://doi.org/10.1515/9783110571660-017
Yamamoto M (2022) Fractional calculus and time-fractional differential equations: revisit and construction of a theory. Mathematics 10:698. https://www.mdpi.com/2227-7390/10/5/698
Long-time Asymptotic Estimate and a Related Inverse Source Problem for Time-Fractional Wave Equations

Xinchi Huang and Yikan Liu

Abstract Lying between traditional parabolic and hyperbolic equations, time-fractional wave equations of order $\alpha \in (1,2)$ in time inherit both decaying and oscillating properties. In this article, we establish a long-time asymptotic estimate for homogeneous time-fractional wave equations, which readily implies the strict positivity/negativity of the solution for $t \gg 1$ under some sign conditions on initial values. As a direct application, we prove the uniqueness for a related inverse source problem on determining the temporal component.

Keywords Time-fractional wave equation · Asymptotic estimate · Inverse source problem · Uniqueness

X. Huang, Institute for Innovative Research, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan, e-mail: [email protected]
Y. Liu (B), Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, N12W7, Kita-Ward, Sapporo 060-0812, Japan, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_11

1 Introduction

Recent decades have witnessed the explosive development of nonlocal models based on fractional calculus from various backgrounds. Remarkably, between the fundamental equations of elliptic, parabolic, and hyperbolic types, partial differential equations (PDEs) like
$$(\partial_t^\alpha - \Delta)u = F \tag{1}$$
with fractional orders $\alpha \in (0,1)\cup(1,2)$ of time derivatives have attracted interest from both theoretical and applied sides (the meaning of $\partial_t^\alpha$ will be specified later). Such time-fractional PDEs have been reported to be capable of describing phenomena such as anomalous diffusion in heterogeneous media and viscoelastic materials that usual PDEs fail to describe (e.g., Barlow and Perkins 1988; Brown et al. 2018; Hatano and Hatano 1998). Due to the similarity with their integer counterparts, equations like (1) are called time-fractional diffusion equations for $\alpha \in (0,1)$, and time-fractional wave equations for $\alpha \in (1,2)$. In the last decade, modern mathematical theories have been introduced in the study of time-fractional PDEs, and fruitful results on the well-posedness and important properties of solutions have been established especially for $\alpha \in (0,1)$ (e.g., Eidelman and Kochubei 2004; Gorenflo et al. 2015; Kubica et al. 2020; Sakamoto and Yamamoto 2011, and the references therein). In contrast, time-fractional wave equations (i.e., (1) for $\alpha \in (1,2)$) seem not to be well investigated, especially from the viewpoint of their relation with the cases of $\alpha = 1$ and $\alpha = 2$. Meanwhile, many related inverse problems remain open.

In the sequel, let $\alpha \in (1,2)$, $T > 0$ be constants and $\Omega \subset \mathbb{R}^d$ ($d = 1,2,\dots$) be a bounded domain whose boundary $\partial\Omega$ is sufficiently smooth. The main focuses of this paper are the following two initial-boundary value problems for homogeneous and inhomogeneous time-fractional wave equations:
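$$\begin{cases} (\partial_t^\alpha + A)u = 0 & \text{in } \Omega\times(0,T), \\ u = u_0, \ \partial_t u = u_1 & \text{in } \Omega\times\{0\}, \\ u = 0 & \text{on } \partial\Omega\times(0,T) \end{cases} \tag{2}$$
and
$$\begin{cases} (\partial_t^\alpha + A)u(x,t) = \rho(t) f(x), & (x,t) \in \Omega\times(0,T), \\ u = \partial_t u = 0 & \text{in } \Omega\times\{0\}, \\ u = 0 & \text{on } \partial\Omega\times(0,T). \end{cases} \tag{3}$$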
Here, $\partial_t^\alpha$ denotes the Caputo derivative in the time variable $t > 0$ and $A$ is a symmetric elliptic operator in the space variable $x \in \Omega$, whose definitions will be provided in detail in Sect. 2. In the homogeneous problem (2), there is no external force and $u_0$, $u_1$ stand for the initial displacement and velocity, respectively. In the inhomogeneous problem (3), the initial displacement and velocity vanish and the source term takes the form of separated variables, where $\rho(t)$ and $f(x)$ stand for the temporal and spatial components, respectively. In both problems (2)–(3), we impose the homogeneous Dirichlet boundary condition, which can be replaced by homogeneous Neumann or Robin ones.

There are some results on the well-posedness and the vanishing property of (2)–(3) in Huang and Yamamoto (2022), Liu et al. (2021), Sakamoto and Yamamoto (2011), which basically inherit those for time-fractional diffusion equations. However, the strong positivity property for $0 < \alpha < 1$ (see Liu 2017; Liu et al. 2016) no longer holds for $1 < \alpha < 2$, whose solutions oscillate and change signs in general even with strictly positive initial values. Therefore, we are interested in the sign change of the solution to (2).

Indeed, we acquire some hints from the graphs of Mittag-Leffler functions $t^{j-1} E_{\alpha,j}(-t^\alpha)$ with $j = 1,2$ (see (4) for a definition) in Fig. 1, which are closely related to the solution to (2). First, the numbers of sign changes for both functions seem to be finite and monotonically increasing with respect to $\alpha < 2$, and as $\alpha$ reaches 2 we know $E_{2,1}(-t^2) = \cos t$ and $t E_{2,2}(-t^2) = \sin t$. Next, though both functions tend to 0 as $t \to \infty$, we see that $E_{\alpha,1}(-t^\alpha) < 0$ and $t E_{\alpha,2}(-t^\alpha) > 0$ for sufficiently large $t$. Since $E_{\alpha,1}(-t^\alpha)$ and $t E_{\alpha,2}(-t^\alpha)$ coincide with the solutions to (2) at $x = \pi/2$ with $\Omega = (0,\pi)$, $A = -\partial_x^2$ and the special choices of $u_0(x) = \sin x$, $u_1(x) = 0$ and $u_0(x) = 0$, $u_1(x) = \sin x$, respectively, we are concerned with such long-time strict positivity/negativity for (2) with more general initial values.

Fig. 1 Plots of Mittag-Leffler functions $E_{\alpha,1}(-t^\alpha)$ (a) and $t E_{\alpha,2}(-t^\alpha)$ (b) with several choices of $\alpha \in (1,2]$

As a related topic, we are also interested in the following inverse problem.

Problem 1 (inverse source problem) Fix $x_0 \in \Omega$ and let $u$ be the solution to (3). Provided that the spatial component $f$ of the source term is suitably given, determine the temporal component $\rho$ by the single point observation of $u$ at $\{x_0\}\times(0,T)$.

As before, there is abundant literature on inverse source problems for time-fractional diffusion equations (see Liu et al. 2019 for a survey), but much less on those for time-fractional wave equations. Moreover, the majority of the latter were treated by uniform methodologies for $0 < \alpha \le 2$. We refer to Hu et al. (2020), Liu et al. (2021) for inverse moving source problems, and Kian et al. (2022) for the inverse source problem on determining $f(x)$ in (3). For Problem 1, there are results on uniqueness and stability for $0 < \alpha < 1$ (see Liu 2017; Liu et al. 2016; Liu and Zhang 2017), which heavily rely on the strong positivity property of the homogeneous problem (2). Thus, for $1 < \alpha < 2$ we can only expect to treat Problem 1 by means of the long-time positivity suggested above.
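The limiting identities at $\alpha = 2$ and the long-time signs observed for the two Mittag-Leffler functions can be probed with a few lines of Python. This is a minimal sketch under stated assumptions, not part of the original analysis: the helper `mittag_leffler` is a hypothetical truncated power series that is reliable only for moderate arguments, and the leading terms $1/(\Gamma(1-\alpha)t^\alpha)$ and $t/(\Gamma(2-\alpha)t^\alpha)$ are the standard large-argument asymptotics of $E_{\alpha,1}(-t^\alpha)$ and $t E_{\alpha,2}(-t^\alpha)$ (see, e.g., Podlubny 1999).

```python
import math


def mittag_leffler(alpha, beta, z, max_arg=170.0):
    """Truncated series sum_k z^k / Gamma(alpha*k + beta).

    Terms are added while alpha*k + beta <= max_arg so that math.gamma does
    not overflow. Reliable only for moderate |z| because of cancellation.
    """
    terms = []
    k = 0
    while alpha * k + beta <= max_arg:
        terms.append(z ** k / math.gamma(alpha * k + beta))
        k += 1
    return math.fsum(terms)


if __name__ == "__main__":
    t = 3.0
    # alpha = 2: the two functions reduce to cos t and sin t.
    print(mittag_leffler(2.0, 1.0, -t * t), math.cos(t))
    print(t * mittag_leffler(2.0, 2.0, -t * t), math.sin(t))

    # alpha in (1, 2): Gamma(1 - alpha) < 0 and Gamma(2 - alpha) > 0,
    # which matches the long-time signs read off from the plots.
    alpha, t = 1.5, 20.0
    print(mittag_leffler(alpha, 1.0, -t ** alpha),
          1.0 / (math.gamma(1.0 - alpha) * t ** alpha))
    print(t * mittag_leffler(alpha, 2.0, -t ** alpha),
          t / (math.gamma(2.0 - alpha) * t ** alpha))
```

The sign of the leading coefficient is decided by the Gamma function alone: $1-\alpha \in (-1,0)$ gives a negative value, while $2-\alpha \in (0,1)$ gives a positive one.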
X. Huang and Y. Liu
The remainder of this article is organized as follows. Preparing necessary notations and definitions, in Sect. 2, we state the main results on the long-time asymptotic estimate, strict positivity/negativity, and the uniqueness for Problem 1. In Sect. 3, we show the well-posedness for (2)–(3) and establish a fractional Duhamel’s principle between them. Then Sect. 4 is devoted to the proof of main results, followed by a brief conclusion in Sect. 5.
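2 Preliminary and Statement of Main Results
We start with the definition of the Caputo derivative $\partial_t^\alpha$ in (2)–(3). Recall the Riemann-Liouville integral operator of order $\beta > 0$:
$$J^\beta g(t) := \frac{1}{\Gamma(\beta)} \int_0^t (t-s)^{\beta-1} g(s)\,\mathrm{d}s, \quad g \in C[0,\infty),$$
where $\Gamma(\,\cdot\,)$ is the Gamma function. Then for $1 < \alpha < 2$, the pointwise Caputo derivative $\partial_t^\alpha$ is defined as (e.g., Podlubny 1999)
$$\partial_t^\alpha g(t) := \left(J^{2-\alpha} \circ \frac{\mathrm{d}^2}{\mathrm{d}t^2}\right) g(t) = \frac{1}{\Gamma(2-\alpha)} \int_0^t \frac{g''(s)}{(t-s)^{\alpha-1}}\,\mathrm{d}s, \quad g \in C^2[0,\infty),$$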
where $\circ$ denotes the composition. Notice that the above definition is naive in the sense that it is only valid for smooth functions. For non-smooth functions, a modern definition of $\partial_t^\beta$ as the inverse of $J^\beta$ in the fractional Sobolev space $H_\beta(0, T)$ has been developed in recent years (see, e.g., Gorenflo et al. 2015 for the case of $0 < \beta < 1$ and Huang and Yamamoto 2022 for that of $1 < \beta < 2$). For instance, the problem (2) should be formulated as
$$\begin{cases} \partial_t^\alpha (u - u_0 - t\,u_1) + Au = 0 & \text{in } L^2(0, T; H^{-1}(\Omega)), \\ u(\,\cdot\,, t) \in H_0^1(\Omega), & 0 < t < T, \\ u - u_0 - t\,u_1 \in H_\alpha(0, T; H^{-1}(\Omega)) \end{cases}$$
in that context. Nevertheless, since the definition of $\partial_t^\alpha$ is not the main concern of this article, we prefer the traditional formulations (2)–(3) for better readability.
Next, we invoke the familiar Mittag-Leffler functions for later use:
$$E_{\alpha,\beta}(z) := \sum_{k=0}^\infty \frac{z^k}{\Gamma(\alpha k + \beta)}, \quad z \in \mathbb{C},\ \beta > 0. \tag{4}$$
We collect several frequently used estimates for $E_{\alpha,\beta}(z)$.
Long-time Asymptotic Estimate and a Related Inverse Source Problem …
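Lemma 1 Let $\alpha \in (1, 2)$. Then there exist constants $C_0 > 0$ and $C_0' > 0$ depending only on $\alpha$ such that
$$|E_{\alpha,\beta}(-\eta)| \le \begin{cases} \dfrac{C_0}{1+\eta}, & \forall\, \beta \in [1, 2],\ \beta \ne \alpha, \\[1ex] \dfrac{C_0}{1+\eta^2}, & \beta = \alpha \end{cases} \quad \text{for } \eta \ge 0, \tag{5}$$
$$\left| E_{\alpha,j}(-\eta) - \frac{1}{\Gamma(j-\alpha)\,\eta} \right| \le \frac{C_0'}{\eta^2} \quad \text{for } \eta \gg 1,\ j = 1, 2. \tag{6}$$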
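The series (4) is easy to evaluate for moderate arguments, which gives a quick way to reproduce the qualitative observations made around Fig. 1. The following is a minimal sketch, not the authors' code: the function name `mittag_leffler` and the fixed truncation at 300 terms are assumptions made for illustration, and the plain series loses accuracy for large negative arguments because of cancellation.

```python
import math


def mittag_leffler(alpha, beta, z, n_terms=300):
    """Truncated series (4): sum_k z^k / Gamma(alpha*k + beta), for real z.

    Each term is formed in log scale so that z^k and the Gamma function do
    not overflow. Only suitable for moderate |z|.
    """
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    log_abs_z = math.log(abs(z))
    terms = []
    for k in range(n_terms):
        magnitude = math.exp(k * log_abs_z - math.lgamma(alpha * k + beta))
        sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
        terms.append(sign * magnitude)
    return math.fsum(terms)


# alpha = 2 recovers the classical wave case mentioned in the text.
t = 3.0
err_cos = abs(mittag_leffler(2.0, 1.0, -t * t) - math.cos(t))
err_sin = abs(t * mittag_leffler(2.0, 2.0, -t * t) - math.sin(t))

# For alpha = 1.5 the two functions have opposite signs at a large time.
alpha, t_large = 1.5, 20.0
u_displacement = mittag_leffler(alpha, 1.0, -t_large ** alpha)
u_velocity = t_large * mittag_leffler(alpha, 2.0, -t_large ** alpha)
print(err_cos, err_sin, u_displacement, u_velocity)
```

The leading terms in (6), namely $1/(\Gamma(1-\alpha)\eta)$ and $1/(\Gamma(2-\alpha)\eta)$, explain the signs seen here, since $\Gamma(1-\alpha) < 0 < \Gamma(2-\alpha)$ for $1 < \alpha < 2$.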
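For $\beta \ne \alpha$, the estimate (5) follows immediately from Podlubny (1999, Theorem 1.6). Only in the special case of $\beta = \alpha$ can one apply Podlubny (1999, Theorem 1.4) with $p = 1$ to improve the estimate. Similarly, the estimate (6) also follows from the asymptotic estimate of $E_{\alpha,j}(z)$ with $j = 1, 2$ in Podlubny (1999, Theorem 1.4).
Now we proceed to the space direction. By $(\,\cdot\,,\,\cdot\,)$ we denote the usual inner product in $L^2(\Omega)$, and let $H^\gamma(\Omega)$ ($\gamma > 0$) denote Sobolev spaces (e.g., Adams 1975). The elliptic operator $A$ in (2)–(3) is defined by
$$A : H^2(\Omega) \cap H_0^1(\Omega) \longrightarrow L^2(\Omega), \quad g \longmapsto -\nabla \cdot (a \nabla g) + c\,g,$$
where $\cdot$ and $\nabla$ refer to the inner product in $\mathbb{R}^d$ and the gradient in $x$, respectively. Here $c \in L^\infty(\Omega)$ is non-negative and $a = (a_{ij})_{1 \le i,j \le d} \in C^1(\overline{\Omega}; \mathbb{R}^{d \times d}_{\mathrm{sym}})$ is a symmetric and strictly positive-definite matrix-valued function on $\overline{\Omega}$. More precisely, we assume that $c \ge 0$ in $\Omega$, $a_{ij} \in C^1(\overline{\Omega})$, $a_{ij} = a_{ji}$ on $\overline{\Omega}$ ($1 \le i, j \le d$) and there exists a constant $\kappa > 0$ such that
$$a(x)\xi \cdot \xi \ge \kappa\,(\xi \cdot \xi), \quad \forall\, x \in \overline{\Omega},\ \forall\, \xi \in \mathbb{R}^d.$$
Next, we introduce the eigensystem $\{(\lambda_n, \varphi_n)\}_{n=1}^\infty$ of $A$ satisfying $A\varphi_n = \lambda_n \varphi_n$, $0 < \lambda_1 < \lambda_2 \le \cdots$, $\lambda_n \to \infty$ ($n \to \infty$), where $\{\varphi_n\} \subset D(A)$ forms a complete orthonormal system of $L^2(\Omega)$. As usual, we can further introduce the Hilbert space $D(A^\beta)$ for $\beta \ge 0$ as
$$D(A^\beta) := \left\{ g \in L^2(\Omega) \ \middle|\ \|g\|_{D(A^\beta)} := \left( \sum_{n=1}^\infty \left| \lambda_n^\beta (g, \varphi_n) \right|^2 \right)^{1/2} < \infty \right\}.$$
For $-1 \le \beta < 0$, let
$$D(A^{-\beta}) \subset L^2(\Omega) \subset (D(A^{-\beta}))^* =: D(A^\beta)$$
be the Gel'fand triple, where $(\,\cdot\,)^*$ denotes the dual space. Then the norm of $D(A^\beta)$ for $-1 \le \beta < 0$ is similarly defined by
$$\|g\|_{D(A^\beta)} := \left( \sum_{n=1}^\infty \left| \lambda_n^\beta\, {}_{D(A^\beta)}\langle g, \varphi_n \rangle_{D(A^{-\beta})} \right|^2 \right)^{1/2},$$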
where ${}_{D(A^\beta)}\langle\,\cdot\,,\,\cdot\,\rangle_{D(A^{-\beta})}$ denotes the pairing between $D(A^\beta)$ and $D(A^{-\beta})$. Then the space $D(A^\beta)$ is well defined for all $\beta \ge -1$.
Now we are well prepared to state the main results of this article. First we investigate the asymptotic behavior of the solution to the homogeneous problem (2) as $t \to \infty$.
Theorem 1 (Long-time asymptotic estimate) Let $u_j \in D(A^\beta)$ ($j = 0, 1$) with $\beta \ge 0$ and $u$ be the solution to (2). Then we have $u(\,\cdot\,, t) \in D(A^{\beta+1})$ for $t > 0$ and there exists a constant $C_\alpha > 0$ depending only on $\alpha$ such that
$$\left\| u(\,\cdot\,, t) - \sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right\|_{D(A^{\beta+1})} \le C_\alpha \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \tag{7}$$
for $t \gg 1$.
The above theorem generalizes a similar result for multi-term time-fractional diffusion equations in Li et al. (2015, Theorem 2.4), and the estimate (7) keeps the same structure describing the asymptotic behavior. First, (7) points out that the solution $u(\,\cdot\,, t)$ converges to 0 in $D(A^{\beta+1})$ with the pattern
$$\sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} = \frac{A^{-1} u_0}{\Gamma(1-\alpha)}\, t^{-\alpha} + \frac{A^{-1} u_1}{\Gamma(2-\alpha)}\, t^{1-\alpha} \quad \text{as } t \to \infty,$$
which immediately implies
$$\|u(\,\cdot\,, t)\|_{D(A^{\beta+1})} = \begin{cases} O(t^{1-\alpha}), & u_1 \not\equiv 0 \text{ in } \Omega, \\ O(t^{-\alpha}), & u_1 \equiv 0,\ u_0 \not\equiv 0 \text{ in } \Omega \end{cases} \quad \text{as } t \to \infty.$$
Therefore, the initial velocity $u_1$ impacts the long-time asymptotic behavior more than the initial displacement $u_0$. Second, (7) further gives the convergence rate
$$\left\| u(\,\cdot\,, t) - \sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right\|_{D(A^{\beta+1})} = \begin{cases} O(t^{1-2\alpha}), & u_1 \not\equiv 0 \text{ in } \Omega, \\ O(t^{-2\alpha}), & u_1 \equiv 0,\ u_0 \not\equiv 0 \text{ in } \Omega \end{cases}$$
for $t \gg 1$. In this sense, the estimate (7) provides rich information on the long-time asymptotic behavior of the solution.
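As a sanity check of (7), one can specialize it to the one-dimensional model case that underlies Fig. 1. The worked example below is a reading aid rather than part of the original argument; it assumes $\Omega = (0, \pi)$, $A = -\partial_x^2$ and a single eigenmode as initial data.

```latex
% Model case: \Omega = (0,\pi), A = -\partial_x^2, so that \lambda_n = n^2 and
% \varphi_n(x) = \sqrt{2/\pi}\,\sin(nx).
% Take u_0(x) = \sin x and u_1 \equiv 0. Then the solution and A^{-1}u_0 are
u(x,t) = E_{\alpha,1}(-t^\alpha)\,\sin x, \qquad A^{-1}u_0 = \sin x .
% Since \lambda_1 = 1, the norm of \sin x in D(A^\gamma) equals \sqrt{\pi/2}
% for every \gamma, and (7) reduces to
\left| E_{\alpha,1}(-t^\alpha) - \frac{t^{-\alpha}}{\Gamma(1-\alpha)} \right|
  \le C_\alpha\, t^{-2\alpha}, \qquad t \gg 1,
% which is (6) with j = 1 and \eta = t^\alpha, up to the value of the constant.
% Likewise, u_0 \equiv 0 and u_1(x) = \sin x give
u(x,t) = t\,E_{\alpha,2}(-t^\alpha)\,\sin x, \qquad
\left| t\,E_{\alpha,2}(-t^\alpha) - \frac{t^{1-\alpha}}{\Gamma(2-\alpha)} \right|
  \le C_\alpha\, t^{1-2\alpha}, \qquad t \gg 1 .
```

Because $\Gamma(1-\alpha) < 0 < \Gamma(2-\alpha)$ for $1 < \alpha < 2$, the leading terms are negative in the first case and positive in the second, in agreement with the behavior observed in Fig. 1.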
As a direct consequence of Theorem 1, one can immediately show the following result on the sign of the solution for $t \gg 1$.
Corollary 1 (Long-time strict positivity/negativity) Let $u_j \in D(A^\beta)$ ($j = 0, 1$) with
$$\beta \begin{cases} = 0, & d = 1, 2, 3, \\ > d/4 - 1, & d \ge 4 \end{cases} \tag{8}$$
and $u$ be the solution to (2). Then for any $x \in \Omega$, $u(x, t)$ makes pointwise sense for $t > 0$ and the following statements hold true.
(a) If $A^{-1} u_1(x) = 0$ and $A^{-1} u_0(x) \ne 0$, then there exists a constant $T_0 \gg 1$ depending on $\alpha, \Omega, A, u_0, u_1, x$ such that the sign of $u(x, \,\cdot\,)$ is opposite to that of $A^{-1} u_0(x)$ in $(T_0, \infty)$. In particular, if $u_1 \equiv 0$, $u_0 \not\equiv 0$, and $u_0 \le 0$ ($u_0 \ge 0$) in $\Omega$, then $u(x, \,\cdot\,) > 0$ ($u(x, \,\cdot\,) < 0$) in $(T_0, \infty)$.
(b) If $A^{-1} u_1(x) \ne 0$, then there exists a constant $T_1 \gg 1$ depending on $\alpha, \Omega, A, u_0, u_1, x$ such that the sign of $u(x, \,\cdot\,)$ is the same as that of $A^{-1} u_1(x)$ in $(T_1, \infty)$. In particular, if $u_1 \not\equiv 0$ and $u_1 \ge 0$ ($u_1 \le 0$) in $\Omega$, then $u(x, \,\cdot\,) > 0$ ($u(x, \,\cdot\,) < 0$) in $(T_1, \infty)$.
Similarly to Theorem 1, the above corollary also generalizes a similar result for multi-term time-fractional diffusion equations (see Liu (2017, Lemma 3.1)). However, since solutions to time-fractional wave equations change sign in general, to the best of our knowledge there seems to be no literature discussing the sign of solutions for $1 < \alpha < 2$. On the other hand, notice that in Corollary 1 we directly make assumptions on $A^{-1} u_j(x)$ ($j = 0, 1$) instead of on $u_j$ as in Liu (2017). Since the non-vanishing of $A^{-1} u_j(x)$ is a necessary consequence of
$$u_j \not\equiv 0 \ \text{ and } \ (u_j \ge 0 \text{ or } u_j \le 0) \ \text{ in } \Omega \tag{9}$$
according to the strong maximum principle, the assumptions in Corollary 1 are definitely weaker than the previous one.
Under the special situation (9), let us comment on Corollary 1 in further detail. If the initial velocity $u_1$ vanishes and the sign of the initial displacement $u_0$ ($\not\equiv 0$) does not change, then Corollary 1(a) asserts that the solution $u$ to the homogeneous problem (2) must take the sign opposite to that of $u_0$ for $t \gg 1$. This is a remarkable difference from the case of $0 < \alpha \le 1$ in view of the strong positivity property for the latter. In contrast, if the sign of $u_1$ ($\not\equiv 0$) does not change, then Corollary 1(b) claims that $u$ must take the same sign as that of $u_1$ for $t \gg 1$. Notice that in Corollary 1(b), there is no assumption on $u_0$ because $u_1$ plays a more dominant role in the asymptotic estimate (7) than $u_0$ does. Corollary 1 is not only novel and interesting by itself, but also closely related to the uniqueness issue of Problem 1. Indeed, as a direct application of Corollary 1, one can prove the following theorem.
Theorem 2 (Uniqueness for Problem 1) Let $x_0 \in \Omega$ and $u$ be the solution to (3), where $\rho \in L^p(0, T)$ ($1 \le p \le \infty$) and $f \in D(A^\beta)$ with $\beta$ satisfying (8). If $A^{-1} f(x_0) \ne 0$, then $u = 0$ at $\{x_0\} \times (0, T)$ implies $\rho \equiv 0$ in $(0, T)$. In particular, if $f \not\equiv 0$ and ($f \ge 0$ or $f \le 0$) in $\Omega$, then the same result holds with arbitrary $x_0 \in \Omega$.
As expected, the above theorem again generalizes and improves the corresponding results for (multi-term) time-fractional diffusion equations (see Liu et al. (2016, Theorem 1.2) and Liu (2017, Theorem 1.3)). Moreover, in spite of the difference between time-fractional diffusion and wave equations, the assumption and the result remain essentially the same. Therefore, the key ingredient for proving such uniqueness turns out to be the long-time strict positivity/negativity as in Corollary 1 instead of the strong positivity property. Further, thanks to the weakened assumption in Corollary 1(b), here in Theorem 2 we can also weaken the sign assumption on $f$ simply to $A^{-1} f(x_0) \ne 0$.
3 Well-Posedness and Fractional Duhamel's Principle
This section is devoted to the preparation of basic facts concerning problems (2)–(3) before proceeding to the proofs of main results.
We start with discussing the unique existence and regularity of solutions to (2)–(3). Regarding the well-posedness of time-fractional wave equations, there are partial results, e.g., in Liu et al. (2021), Sakamoto and Yamamoto (2011), and here we generalize their results to fit into the framework of this paper. First we consider the homogeneous problem (2).
Lemma 2 Fix constants $\beta \ge 0$, $\gamma \in [0, 1]$ arbitrarily and assume $u_j \in D(A^\beta)$ ($j = 0, 1$). Then there exists a unique solution $u \in L^\infty(0, T; D(A^\beta))$ to the initial-boundary value problem (2). Moreover, there exists a constant $C_1 > 0$ depending only on $\alpha$ such that
$$\|u(\,\cdot\,, t)\|_{D(A^{\beta+\gamma})} \le C_1 \sum_{j=0}^1 \|u_j\|_{D(A^\beta)}\, t^{j-\alpha\gamma}, \quad t > 0. \tag{10}$$
Further, the map $u : (0, T) \longrightarrow D(A^{\beta+1})$ can be analytically extended to a sector $\{z \in \mathbb{C} \setminus \{0\} \mid |\arg z| < \pi/2\}$.
The above lemma collects the basic unique existence and time-analyticity of the solution to (2). Although the estimate (10) holds for all $t > 0$, it is mainly effective for finite $t$ and especially in describing the short-time asymptotic behavior as $t \to 0$. On the other hand, the solution seems to blow up as $t \to \infty$ since the right-hand side of (10) goes to infinity if $\gamma < 1/\alpha$. However, the solution actually does not blow up because for $0 \le \gamma < 1$, it follows from (10) that
$$\|u(\,\cdot\,, t)\|_{D(A^{\beta+\gamma})} \le \|u(\,\cdot\,, t)\|_{D(A^{\beta+1})} \le C_1 \left( \|u_0\|_{D(A^\beta)}\, t^{-\alpha} + \|u_1\|_{D(A^\beta)}\, t^{1-\alpha} \right) \longrightarrow 0 \quad \text{as } t \to \infty.$$
Proof It follows from Sakamoto and Yamamoto (2011) that the solution to (2) takes the form
$$u(\,\cdot\,, t) = \sum_{j=0}^1 t^j \sum_{n=1}^\infty E_{\alpha,j+1}(-\lambda_n t^\alpha)\,(u_j, \varphi_n)\,\varphi_n. \tag{11}$$
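Employing the estimate (5) in Lemma 1, we estimate
$$\begin{aligned} \|u(\,\cdot\,, t)\|_{D(A^{\beta+\gamma})}^2 &= \sum_{n=1}^\infty \left| \lambda_n^{\beta+\gamma} \sum_{j=0}^1 t^j E_{\alpha,j+1}(-\lambda_n t^\alpha)\,(u_j, \varphi_n) \right|^2 \\ &\le 2 \sum_{j=0}^1 t^{2j} \sum_{n=1}^\infty \left| \lambda_n^\gamma E_{\alpha,j+1}(-\lambda_n t^\alpha) \right|^2 \left| \lambda_n^\beta (u_j, \varphi_n) \right|^2 \\ &\le 2 \sum_{j=0}^1 t^{2j} \sum_{n=1}^\infty \left( \frac{C_0\, (\lambda_n t^\alpha)^\gamma}{1 + \lambda_n t^\alpha}\, t^{-\alpha\gamma} \right)^2 \left| \lambda_n^\beta (u_j, \varphi_n) \right|^2 \\ &\le 2 \sum_{j=0}^1 \left( C_0\, t^{-\alpha\gamma}\, \|u_j\|_{D(A^\beta)}\, t^j \right)^2 \le 2 \left( C_0 \sum_{j=0}^1 \|u_j\|_{D(A^\beta)}\, t^{j-\alpha\gamma} \right)^2. \end{aligned}$$
Then we arrive at (10) by simply putting $C_1 = \sqrt{2}\, C_0$. In particular, taking $\gamma = 0$ in (10) immediately yields $u \in L^\infty(0, T; D(A^\beta))$. Finally, the analyticity of $u(\,\cdot\,, t)$ with respect to $t$ in $D(A^{\beta+1})$ can be proved by the same argument as that of Sakamoto and Yamamoto (2011, Theorem 2.1).
Next, we consider the inhomogeneous problem with a general source term:
$$\begin{cases} (\partial_t^\alpha + A) w = F & \text{in } \Omega \times (0, T), \\ w = \partial_t w = 0 & \text{in } \Omega \times \{0\}, \\ w = 0 & \text{on } \partial\Omega \times (0, T). \end{cases} \tag{12}$$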
Lemma 3 Fix constants $\beta \ge 0$, $p \in [1, \infty]$ arbitrarily and assume $F \in L^p(0, T; D(A^\beta))$.
(a) If $p = 2$, then there exist a unique solution $w \in L^p(0, T; D(A^{\beta+1}))$ to (12) and a constant $C_2 > 0$ depending only on $\alpha, T$ such that
$$\|w\|_{L^p(0,T;D(A^{\beta+1}))} \le C_2\, \|F\|_{L^p(0,T;D(A^\beta))}.$$
(b) If $p \ne 2$, then there exist a unique solution $w \in L^p(0, T; D(A^{\beta+\gamma}))$ to (12) for any $\gamma \in [0, 1)$ and a constant $C_2 > 0$ depending only on $\alpha, T$ such that
$$\|w\|_{L^p(0,T;D(A^{\beta+\gamma}))} \le \frac{C_2}{1-\gamma}\, \|F\|_{L^p(0,T;D(A^\beta))}.$$
The results of the above lemma inherit those of Li et al. (2015, Theorem 2.2(b)), Liu et al. (2021, Lemma 2.3(a)) and Li et al. (2023, Lemma 5(ii)). More precisely, the improvement of the spatial regularity of the solution can reach 2 only for $p = 2$, and is strictly smaller than 2 if $p \ne 2$. The proof of Lemma 3(b) resembles those in the above references and we omit it. However, recall that for $p = 2$, the proof in Li et al. (2015) relies on the positivity of $E_{\alpha,\alpha}(-\eta)$ for $\alpha \in (0, 1)$ and $\eta \ge 0$. Since such positivity no longer holds for $\alpha \in (1, 2)$, we shall give an alternative proof for Lemma 3(a).
Proof (Proof of Lemma 3(a)) Based on the solution formula (see Sakamoto and Yamamoto (2011, Theorem 2.2))
$$w(\,\cdot\,, t) = \sum_{n=1}^\infty \left( \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (t-s)^\alpha)\,(F(\,\cdot\,, s), \varphi_n)\,\mathrm{d}s \right) \varphi_n, \tag{13}$$
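we employ Young's convolution inequality to estimate
$$\begin{aligned} \|w\|_{L^2(0,T;D(A^{\beta+1}))}^2 &= \int_0^T \|w(\,\cdot\,, t)\|_{D(A^{\beta+1})}^2\,\mathrm{d}t \\ &= \int_0^T \sum_{n=1}^\infty \left| \lambda_n^{\beta+1} \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (t-s)^\alpha)\,(F(\,\cdot\,, s), \varphi_n)\,\mathrm{d}s \right|^2 \mathrm{d}t \\ &= \sum_{n=1}^\infty \int_0^T \left| \lambda_n^{\beta+1} \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (t-s)^\alpha)\,(F(\,\cdot\,, s), \varphi_n)\,\mathrm{d}s \right|^2 \mathrm{d}t \\ &\le \sum_{n=1}^\infty \left( \int_0^T \lambda_n t^{\alpha-1} |E_{\alpha,\alpha}(-\lambda_n t^\alpha)|\,\mathrm{d}t \right)^2 \int_0^T \left| \lambda_n^\beta (F(\,\cdot\,, t), \varphi_n) \right|^2 \mathrm{d}t. \end{aligned}$$
It suffices to show the uniform boundedness of $\int_0^T \lambda_n t^{\alpha-1} |E_{\alpha,\alpha}(-\lambda_n t^\alpha)|\,\mathrm{d}t$ for all $n = 1, 2, \ldots$. Indeed, performing the substitution $\eta = \lambda_n t^\alpha$ and utilizing the estimate (5) with $\beta = \alpha$ in Lemma 1, we calculate
$$\int_0^T \lambda_n t^{\alpha-1} |E_{\alpha,\alpha}(-\lambda_n t^\alpha)|\,\mathrm{d}t = \frac{1}{\alpha} \int_0^{\lambda_n T^\alpha} |E_{\alpha,\alpha}(-\eta)|\,\mathrm{d}\eta \le \frac{1}{\alpha} \int_0^\infty |E_{\alpha,\alpha}(-\eta)|\,\mathrm{d}\eta \le \frac{C_0}{\alpha} \int_0^\infty \frac{\mathrm{d}\eta}{1+\eta^2} = \frac{\pi C_0}{2\alpha} =: C_2.$$
Then we immediately obtain
$$\|w\|_{L^2(0,T;D(A^{\beta+1}))}^2 \le C_2^2 \sum_{n=1}^\infty \int_0^T \left| \lambda_n^\beta (F(\,\cdot\,, t), \varphi_n) \right|^2 \mathrm{d}t = \left( C_2\, \|F\|_{L^2(0,T;D(A^\beta))} \right)^2,$$
which completes the proof of Lemma 3(a).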
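For readers who want the convolution step spelled out, the last inequality of the chain can be written as follows. The shorthands $k_n$ and $g_n$ are introduced only for this note and do not appear in the original text.

```latex
% Shorthand (assumed notation for this note only):
k_n(t) := \lambda_n t^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n t^\alpha), \qquad
g_n(t) := \lambda_n^{\beta} \bigl(F(\,\cdot\,,t), \varphi_n\bigr).
% The n-th summand is then a finite-interval convolution:
\lambda_n^{\beta+1} \int_0^t (t-s)^{\alpha-1}
   E_{\alpha,\alpha}\bigl(-\lambda_n (t-s)^\alpha\bigr)
   \bigl(F(\,\cdot\,,s), \varphi_n\bigr)\,\mathrm{d}s
 = \int_0^t k_n(t-s)\, g_n(s)\,\mathrm{d}s =: (k_n * g_n)(t).
% Young's convolution inequality with exponents (1, 2, 2) gives
\| k_n * g_n \|_{L^2(0,T)} \le \| k_n \|_{L^1(0,T)}\, \| g_n \|_{L^2(0,T)},
% and squaring and summing over n yields the stated bound.
```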
Now we discuss the fractional Duhamel's principle which connects the inhomogeneous problem (3) and the homogeneous one (2). Concerning Duhamel's principle for time-fractional partial differential equations, there already exist plentiful results especially for time-fractional diffusion equations, and we refer to Liu et al. (2016), Liu (2017), Umarov (2019). For general $\alpha > 0$, Hu et al. (2020, Lemma 5.2) established a fractional Duhamel's principle for (12) with a smooth source term $F$. Here we provide a result for (3) with a non-smooth source term.
Lemma 4 (Fractional Duhamel's principle) Fix constants $\beta \ge 0$, $p \in [1, \infty]$ arbitrarily and assume $\rho \in L^p(0, T)$, $f \in D(A^\beta)$. Let $\gamma = 1$ for $p = 2$ and $\gamma \in [0, 1)$ be arbitrary for $p \ne 2$. Then for the solution $u$ to (3), there holds
$$J^{2-\alpha} u(\,\cdot\,, t) = \int_0^t \rho(s)\, v(\,\cdot\,, t-s)\,\mathrm{d}s \quad \text{in } L^p(0, T; D(A^{\beta+\gamma})), \tag{14}$$
where $v$ solves the homogeneous problem
$$\begin{cases} (\partial_t^\alpha + A) v = 0 & \text{in } \Omega \times (0, T), \\ v = 0,\ \partial_t v = f & \text{in } \Omega \times \{0\}, \\ v = 0 & \text{on } \partial\Omega \times (0, T). \end{cases} \tag{15}$$
Proof First we confirm that both sides of (14) lie in $L^p(0, T; D(A^{\beta+\gamma}))$ for $\gamma$ assumed in Lemma 4. In fact, since $\rho f \in L^p(0, T; D(A^\beta))$, it follows from Lemma 3 that $u \in L^p(0, T; D(A^{\beta+\gamma}))$. Next, it is readily seen from Young's convolution inequality that $J^{2-\alpha} : L^p(0, T) \longrightarrow L^p(0, T)$ is a bounded linear operator, which implies $J^{2-\alpha} u \in L^p(0, T; D(A^{\beta+\gamma}))$. On the other hand, applying Lemma 2 to (15) yields
$$\|v(\,\cdot\,, t)\|_{D(A^{\beta+\gamma})} \le C_1\, \|f\|_{D(A^\beta)}\, t^{1-\alpha\gamma}, \quad t > 0,\ \forall\, \gamma \in [0, 1].$$
Since $1 - \alpha\gamma > -1$, we have $v \in L^1(0, T; D(A^{\beta+\gamma}))$ for any $\gamma \in [0, 1]$. Therefore, we see that the right-hand side of (14) also makes sense in $L^p(0, T; D(A^{\beta+\gamma}))$ for any $\gamma \in [0, 1]$ by $\rho \in L^p(0, T)$ and again Young's convolution inequality.
Now we can proceed to verify the identity (14) by brute-force calculation based on the solution formulae, because the possibility of exchanging the involved summations and integrals is guaranteed by the above argument. According to (13), we know
$$\begin{aligned} u(\,\cdot\,, t) &= \sum_{n=1}^\infty \left( \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (t-s)^\alpha)\,(\rho(s) f, \varphi_n)\,\mathrm{d}s \right) \varphi_n \\ &= \sum_{n=1}^\infty \mu_n(t)\,(f, \varphi_n)\,\varphi_n, \end{aligned} \tag{16}$$
where
$$\mu_n(t) := \int_0^t (t-s)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (t-s)^\alpha)\,\rho(s)\,\mathrm{d}s.$$
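Then by the definitions of $J^{2-\alpha}$ and $E_{\alpha,\alpha}(\,\cdot\,)$, we calculate
$$\begin{aligned} J^{2-\alpha} \mu_n(t) &= \frac{1}{\Gamma(2-\alpha)} \int_0^t (t-s)^{1-\alpha} \mu_n(s)\,\mathrm{d}s \\ &= \frac{1}{\Gamma(2-\alpha)} \int_0^t (t-s)^{1-\alpha} \left( \int_0^s (s-\tau)^{\alpha-1} E_{\alpha,\alpha}(-\lambda_n (s-\tau)^\alpha)\,\rho(\tau)\,\mathrm{d}\tau \right) \mathrm{d}s \qquad (17) \\ &= \frac{1}{\Gamma(2-\alpha)} \int_0^t \rho(\tau) \left( \int_\tau^t (t-s)^{1-\alpha} (s-\tau)^{\alpha-1} \sum_{k=0}^\infty \frac{(-\lambda_n (s-\tau)^\alpha)^k}{\Gamma(\alpha k + \alpha)}\,\mathrm{d}s \right) \mathrm{d}\tau \\ &= \frac{1}{\Gamma(2-\alpha)} \int_0^t \rho(\tau) \left( \sum_{k=0}^\infty \frac{(-\lambda_n)^k}{\Gamma(\alpha(k+1))} \int_\tau^t (t-s)^{1-\alpha} (s-\tau)^{\alpha(k+1)-1}\,\mathrm{d}s \right) \mathrm{d}\tau, \end{aligned}$$
where we exchanged the order of integration in (17). For the inner integral above, we perform the substitution $s = \theta(t-\tau) + \tau$ ($0 < \theta < 1$) to calculate
$$\int_\tau^t (t-s)^{1-\alpha} (s-\tau)^{\alpha(k+1)-1}\,\mathrm{d}s = (t-\tau)^{\alpha k+1} \int_0^1 (1-\theta)^{1-\alpha}\, \theta^{\alpha(k+1)-1}\,\mathrm{d}\theta = (t-\tau)^{\alpha k+1}\, \frac{\Gamma(2-\alpha)\,\Gamma(\alpha(k+1))}{\Gamma(\alpha k+2)}.$$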
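The Beta-function identity used for the inner integral can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `beta_integral` and `beta_closed_form` are hypothetical, and the midpoint rule after the substitution $1 - \theta = s^{1/(2-\alpha)}$ is one convenient way to remove the integrable endpoint singularity, not a method taken from the source.

```python
import math


def beta_integral(alpha, k, n=200_000):
    """Midpoint-rule value of
        int_0^1 (1 - theta)^(1 - alpha) * theta^(alpha*(k+1) - 1) d(theta).

    With 1 - theta = s^p and p = 1/(2 - alpha), the singular factor cancels
    and the integral becomes p * int_0^1 (1 - s^p)^(alpha*(k+1) - 1) ds.
    """
    p = 1.0 / (2.0 - alpha)
    exponent = alpha * (k + 1) - 1.0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (1.0 - s ** p) ** exponent
    return p * h * total


def beta_closed_form(alpha, k):
    """Gamma(2 - alpha) * Gamma(alpha*(k+1)) / Gamma(alpha*k + 2)."""
    return (math.gamma(2.0 - alpha) * math.gamma(alpha * (k + 1))
            / math.gamma(alpha * k + 2.0))


# Compare quadrature and closed form for a few (alpha, k) pairs.
checks = [(1.5, 0), (1.5, 2), (1.2, 1)]
errors = [abs(beta_integral(a, k) - beta_closed_form(a, k)) for a, k in checks]
print(errors)
```

For $\alpha = 3/2$ and $k = 0$ both sides equal $\pi/2$, which gives a quick independent check of the helper functions.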
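Hence, by the definition of $E_{\alpha,2}(\,\cdot\,)$, we obtain
$$J^{2-\alpha} \mu_n(t) = \int_0^t \rho(\tau) \sum_{k=0}^\infty \frac{(-\lambda_n)^k (t-\tau)^{\alpha k+1}}{\Gamma(\alpha k+2)}\,\mathrm{d}\tau = \int_0^t \rho(s)\,(t-s)\, E_{\alpha,2}(-\lambda_n (t-s)^\alpha)\,\mathrm{d}s.$$
Therefore, we apply $J^{2-\alpha}$ to both sides of (16) and substitute the above equality to obtain
$$\begin{aligned} J^{2-\alpha} u(\,\cdot\,, t) &= \sum_{n=1}^\infty J^{2-\alpha} \mu_n(t)\,(f, \varphi_n)\,\varphi_n \\ &= \sum_{n=1}^\infty \left( \int_0^t \rho(s)\,(t-s)\, E_{\alpha,2}(-\lambda_n (t-s)^\alpha)\,\mathrm{d}s \right) (f, \varphi_n)\,\varphi_n \\ &= \int_0^t \rho(s) \left( (t-s) \sum_{n=1}^\infty E_{\alpha,2}(-\lambda_n (t-s)^\alpha)\,(f, \varphi_n)\,\varphi_n \right) \mathrm{d}s. \end{aligned}$$
Then we reach the conclusion by applying the solution formula (11) to $v$.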
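4 Proofs of Main Results
Proof (Proof of Theorem 1) Since $u(\,\cdot\,, t) \in D(A^{\beta+1})$ by Lemma 2 with $\gamma = 1$, we know that the left-hand side of (7) makes sense for $t > 0$. By the solution formula (11) and
$$A^{-1} u_j = \sum_{n=1}^\infty \frac{(u_j, \varphi_n)}{\lambda_n}\, \varphi_n, \quad j = 0, 1,$$
we obtain
$$u(\,\cdot\,, t) - \sum_{j=0}^1 t^{j-\alpha}\, \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)} = \sum_{j=0}^1 t^j \sum_{n=1}^\infty \left( E_{\alpha,j+1}(-\lambda_n t^\alpha) - \frac{1}{\Gamma(j+1-\alpha)\, \lambda_n t^\alpha} \right) (u_j, \varphi_n)\,\varphi_n.$$
Then we employ the estimate (6) in Lemma 1 to estimate
$$\begin{aligned} &\left\| u(\,\cdot\,, t) - \sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right\|_{D(A^{\beta+1})}^2 \\ &\quad = \sum_{n=1}^\infty \left| \lambda_n^{\beta+1} \sum_{j=0}^1 t^j \left( E_{\alpha,j+1}(-\lambda_n t^\alpha) - \frac{1}{\Gamma(j+1-\alpha)\, \lambda_n t^\alpha} \right) (u_j, \varphi_n) \right|^2 \\ &\quad \le 2 \sum_{j=0}^1 t^{2j} \sum_{n=1}^\infty \left| \lambda_n^2 \left( E_{\alpha,j+1}(-\lambda_n t^\alpha) - \frac{1}{\Gamma(j+1-\alpha)\, \lambda_n t^\alpha} \right) \right|^2 \left| \lambda_n^{\beta-1} (u_j, \varphi_n) \right|^2 \\ &\quad \le 2 \sum_{j=0}^1 t^{2j} \left( C_0'\, t^{-2\alpha} \right)^2 \sum_{n=1}^\infty \left| \lambda_n^{\beta-1} (u_j, \varphi_n) \right|^2 \le 2 \left( C_0' \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \right)^2 \end{aligned}$$
for $t \gg 1$. Then we can conclude (7) by simply putting $C_\alpha = \sqrt{2}\, C_0'$.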
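Proof (Proof of Corollary 1) We have from (8) that $2(\beta+1) > d/2$, and the Sobolev embedding yields $D(A^{\beta+1}) \subset H^{2(\beta+1)}(\Omega) \subset C(\overline{\Omega})$ (e.g., Adams 1975), indicating that $A^{-1} u_j$ ($j = 0, 1$) and $u(\,\cdot\,, t)$ ($t > 0$) make pointwise sense. Then Theorem 1 yields for any $x \in \Omega$ that
$$\begin{aligned} \left| u(x, t) - \sum_{j=0}^1 \frac{A^{-1} u_j(x)}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right| &\le \left\| u(\,\cdot\,, t) - \sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right\|_{C(\overline{\Omega})} \\ &\le C_\Omega \left\| u(\,\cdot\,, t) - \sum_{j=0}^1 \frac{A^{-1} u_j}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} \right\|_{D(A^{\beta+1})} \le C \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \end{aligned}$$
for $t \gg 1$, where $C_\Omega > 0$ is the Sobolev embedding constant depending only on $\Omega$ and $C := C_\Omega C_\alpha$ depends only on $\Omega$ and $\alpha$. This implies
$$\begin{aligned} u(x, t) &\ge \sum_{j=0}^1 \frac{A^{-1} u_j(x)}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} - C \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha}, \\ u(x, t) &\le \sum_{j=0}^1 \frac{A^{-1} u_j(x)}{\Gamma(j+1-\alpha)}\, t^{j-\alpha} + C \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \end{aligned} \tag{18}$$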
for $t \gg 1$. Recall the relation $0 > 1-\alpha > -\alpha > 1-2\alpha > -2\alpha$ by $1 < \alpha < 2$.
(a) Without loss of generality, we only consider the case of $A^{-1} u_0(x) < 0$ because the opposite case can be studied in exactly the same manner. Substituting $A^{-1} u_1(x) = 0$ into (18) yields
$$u(x, t) \ge \frac{A^{-1} u_0(x)}{\Gamma(1-\alpha)}\, t^{-\alpha} - C \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \quad \text{for } t \gg 1. \tag{19}$$
Together with the fact that $\Gamma(1-\alpha) < 0$, we see $\frac{A^{-1} u_0(x)}{\Gamma(1-\alpha)} > 0$ and thus there exists $T_0 \gg 1$ depending on $\alpha, \Omega, A, u_0, u_1, x$ such that the right-hand side of (19) stays strictly positive for all $t > T_0$. As for the special situation of $u_0 \not\equiv 0$, $u_0 \le 0$ in $\Omega$, the strong maximum principle for the elliptic operator $A$ (see Gilbarg and Trudinger (2001, Chapter 3)) guarantees $A^{-1} u_0 < 0$ in $\Omega$. This completes the proof of (a).
(b) Likewise, it suffices to consider the case of $A^{-1} u_1(x) > 0$ without loss of generality. Now it follows from (18) that
$$u(x, t) \ge \frac{A^{-1} u_1(x)}{\Gamma(2-\alpha)}\, t^{1-\alpha} - \left| \frac{A^{-1} u_0(x)}{\Gamma(1-\alpha)} \right| t^{-\alpha} - C \sum_{j=0}^1 \|u_j\|_{D(A^{\beta-1})}\, t^{j-2\alpha} \tag{20}$$
for $t \gg 1$. Then by $\Gamma(2-\alpha) > 0$, similarly we conclude the existence of $T_1 \gg 1$ depending on $\alpha, \Omega, A, u_0, u_1, x$ such that the right-hand side of (20) stays strictly positive for all $t > T_1$. This completes the proof of (b).
Proof (Proof of Theorem 2) We turn to the fractional Duhamel's principle (14) and the corresponding homogeneous problem (15) in Lemma 4. Since the identity (14) holds in $L^p(0, T; D(A^{\beta+\gamma}))$ for any $\gamma \in [0, 1)$ and $\beta$ satisfies (8), one can choose $\gamma$ sufficiently close to 1 such that $2(\beta+\gamma) > d/2$. Then by the Sobolev embedding $D(A^{\beta+\gamma}) \subset H^{2(\beta+\gamma)}(\Omega) \subset C(\overline{\Omega})$, we see that (14) makes sense in $L^p(0, T; C(\overline{\Omega}))$. This allows us to substitute $x = x_0$ into (14) to obtain
$$0 = J^{2-\alpha} u(x_0, t) = \int_0^t \rho(s)\, v(x_0, t-s)\,\mathrm{d}s \quad \text{in } L^p(0, T).$$
Now we are well prepared to apply the Titchmarsh convolution theorem (see Titchmarsh 1926) to conclude the existence of a constant $t_* \in [0, T]$ such that $v(x_0, \,\cdot\,) \equiv 0$ in $(0, t_*)$ and $\rho \equiv 0$ in $(0, T - t_*)$. It remains to verify $t_* = 0$ by contradiction. If $t_* > 0$ instead, then the analyticity of $v(\,\cdot\,, t) \in D(A^{\beta+1}) \subset C(\overline{\Omega})$ with respect to $t$ indicates $v(x_0, \,\cdot\,) \equiv 0$ in $(0, \infty)$. However, owing to the assumption $A^{-1} f(x_0) \ne 0$, it follows from Corollary 1(b) that there exists $T_1 \gg 1$ such that $v(x_0, \,\cdot\,) > 0$ or $v(x_0, \,\cdot\,) < 0$ in $(T_1, \infty)$, which is a contradiction. Consequently, there should hold $t_* = 0$, i.e., $\rho \equiv 0$ in $(0, T)$.
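For convenience, the form of the Titchmarsh convolution theorem that this argument relies on can be stated as follows. This is the standard finite-interval formulation, given here as a reading aid; the symbols $g$, $h$, $t_1$, $t_2$ are notation chosen for this note.

```latex
% Titchmarsh convolution theorem (finite-interval form).
% Let g, h \in L^1(0,T) satisfy
\int_0^t g(s)\, h(t-s)\,\mathrm{d}s = 0 \quad \text{for a.e. } t \in (0,T).
% Then there exist t_1, t_2 \ge 0 with t_1 + t_2 \ge T such that
g = 0 \ \text{a.e. in } (0, t_1), \qquad h = 0 \ \text{a.e. in } (0, t_2).
% In the proof above: g = \rho, h = v(x_0,\,\cdot\,), t_2 = t_*, t_1 = T - t_*.
```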
5 Conclusion
The results obtained in this article reflect both similarities and differences between time-fractional wave and diffusion equations. Owing to the existence of two initial values, the evolution of solutions $u$ to homogeneous time-fractional wave equations becomes more complicated and interesting than that for $0 < \alpha < 1$. Motivated by the asymptotic property of Mittag-Leffler functions, we capture the behavior of $u(\,\cdot\,, t)$ for $t \gg 1$ in Theorem 1 and Corollary 1. Meanwhile, the analysis reveals that the uniqueness of inverse $t$-source problems like Problem 1 only requires the long-time non-vanishing and the time-analyticity of $u$, which was neglected in the proof for $0 < \alpha < 1$.
We close this article with some conjectures on the number of sign changes of the solution $u(x, t)$ to (2) for any $x \in \Omega$. Fixing initial values $u_0, u_1$, from Fig. 1 one can see that this number increases monotonically with respect to $\alpha$, but there seems to be no rigorous proof yet. Further, if either of $u_0, u_1$ vanishes and the other keeps its sign, it seems possible to prove the parity of the number of sign changes, i.e., odd if $u_1 \equiv 0$ and even if $u_0 \equiv 0$. This also relates to the distribution of zeros of $u$, which can be another future topic.
Acknowledgements X. Huang is supported by Grant-in-Aid for JSPS Fellows 20F20319, JSPS. Y. Liu is supported by Grant-in-Aid for Early Career Scientists 20K14355 and 22K13954, JSPS.
References
Adams RA (1975) Sobolev spaces. Academic, New York
Barlow MT, Perkins EA (1988) Brownian motion on the Sierpiński gasket. Probab Theory Related Fields 79:543–623. https://doi.org/10.1007/BF00318785
Brown TS, Du S, Eruslu H, Sayas FJ (2018) Analysis of models for viscoelastic wave propagation. Appl Math Nonlinear Sci 3:55–96. https://doi.org/10.21042/AMNS.2018.1.00006
Eidelman SD, Kochubei AN (2004) Cauchy problem for fractional diffusion equations. J Differ Eqs 199:211–255. https://doi.org/10.1016/j.jde.2003.12.002
Gilbarg D, Trudinger NS (2001) Elliptic partial differential equations of second order. Springer, Berlin
Gorenflo R, Luchko Y, Yamamoto M (2015) Time-fractional diffusion equation in the fractional Sobolev spaces. Fract Calc Appl Anal 18:799–820. https://doi.org/10.1515/fca-2015-0048
Hatano Y, Hatano N (1998) Dispersive transport of ions in column experiments: an explanation of long-tailed profiles. Water Resour Res 34:1027–1033. https://doi.org/10.1029/98WR00214
Hu G, Liu Y, Yamamoto M (2020) Inverse moving source problems for fractional diffusion(-wave) equations: determination of orbits. In: Cheng J et al (eds) Inverse Problems and Related Topics, Springer Proceedings in Mathematics & Statistics, vol 310. Springer, Singapore, pp 81–100. https://doi.org/10.1007/978-981-15-1592-7_5
Huang X, Yamamoto M (2022) Well-posedness of initial-boundary value problem for time-fractional diffusion-wave equation with time-dependent coefficients. arXiv:2203.10448. https://doi.org/10.48550/arXiv.2203.10448
Kian Y, Liu Y, Yamamoto M (2022) Uniqueness of inverse source problems for general evolution equations. Commun Contemporary Math. https://doi.org/10.1142/S0219199722500092
Kubica A, Ryszewska K, Yamamoto M (2020) Theory of time-fractional differential equations: an introduction. Springer, Tokyo
Li Z, Huang X, Liu Y (2023) Well-posedness for coupled systems of time-fractional diffusion equations. Fract Calc Appl Anal 26:533–566. https://doi.org/10.1007/s13540-023-00149-0
Li Z, Liu Y, Yamamoto M (2015) Initial-boundary value problems for multi-term time-fractional diffusion equations with positive constant coefficients. Appl Math Comput 257:381–397. https://doi.org/10.1016/j.amc.2014.11.073
Liu Y (2017) Strong maximum principle for multi-term time-fractional diffusion equations and its application to an inverse source problem. Comput Math Appl 73:96–108. https://doi.org/10.1016/j.camwa.2016.10.021
Liu Y, Hu G, Yamamoto M (2021) Inverse moving source problem for time-fractional evolution equations: determination of profiles. Inverse Probl 37:084001. https://doi.org/10.1088/1361-6420/ac0c20
Liu Y, Li Z, Yamamoto M (2019) Inverse problems of determining sources of the fractional partial differential equations. In: Kochubei A, Luchko Y (eds) Handbook of fractional calculus with applications volume 2: fractional differential equations. De Gruyter, Berlin, pp 411–430. https://doi.org/10.1515/9783110571660-018
Liu Y, Rundell W, Yamamoto M (2016) Strong maximum principle for fractional diffusion equations and an application to an inverse source problem. Fract Calc Appl Anal 19:888–906. https://doi.org/10.1515/fca-2016-0048
Liu Y, Zhang Z (2017) Reconstruction of the temporal component in the source term of a (time-fractional) diffusion equation. J Phys A 50:305203. https://doi.org/10.1088/1751-8121/aa763a
Podlubny I (1999) Fractional differential equations. Academic Press, San Diego
Sakamoto K, Yamamoto M (2011) Initial value/boundary value problems for fractional diffusion-wave equations and applications to some inverse problems. J Math Anal Appl 382:426–447. https://doi.org/10.1016/j.jmaa.2011.04.058
Titchmarsh EC (1926) The zeros of certain integral functions. Proc Lond Math Soc 25:283–302. https://doi.org/10.1112/plms/s2-25.1.283
Umarov S (2019) Fractional Duhamel principle. In: Kochubei A, Luchko Y (eds) Handbook of fractional calculus with applications volume 2: fractional differential equations. De Gruyter, Berlin, pp 383–410. https://doi.org/10.1515/9783110571660-017
A Big Data Processing Technique Based on Tikhonov Regularization
Yu Chen, Jin Cheng, Jiantang Zhang, and Min Zhong
Y. Chen: School of Mathematics, Shanghai University of Finance and Economics, Shanghai 200433, China
J. Cheng (B): School of Mathematical Sciences, Fudan University, Shanghai 200433, China, e-mail: [email protected]
J. Zhang: HiSilicon (Shanghai) Technologies CO.,LIMITED, Shanghai, China
M. Zhong: School of Mathematics, Southeast University, Nanjing 210096, China; Nanjing Center for Applied Mathematics, Nanjing 211135, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_12
Abstract We propose a big data processing technique that can reconstruct a function and its derivative from scattered data with random noise. The main idea is to make use of the data amount and its statistical property to reduce the random error, while using a relatively small and proper number of nodes for interpolation. Tikhonov regularization is adopted to obtain stable and reliable reconstructions of the function and its derivative. While the noise level remains large, the reconstruction error becomes small as the sample size increases. When a dataset is given, rigorous error bounds of function and derivative reconstruction in the continuous $L^2$ norm are obtained using a histogram density estimator as indicator function. When observation points are sampled from a distribution, an asymptotic convergence rate in probability is obtained as the sample size tends to infinity.
Keywords Big data · Numerical differentiation · Regularization
1 Introduction
With the rapid growth of data, how to extract effective information from data is one of the most fundamental problems. Numerical differentiation is a classical problem aiming at recovering a function and its derivative from random noisy samples of function values. In applied sciences, such problems often arise. For example, a numerical reconstruction of the derivative can be used to determine discontinuities of a function (Deans 2007), and to find the solution of an Abel integral equation (Gorenflo and Vessella 2006). However, numerical differentiation is an ill-posed problem in the sense of unstable dependence of solutions on data perturbations. That is, small errors in observations may be severely amplified during the reconstruction process. Thus, many techniques have been proposed to smooth noisy data. First, a moving average of data can reduce high-frequency noise provided that observation points are sufficiently dense. Second, fitting a polynomial function to the observation data can result in a smooth curve, provided that the degree of the fitted polynomial is chosen properly. In fact, using basis functions other than polynomials, such as Fourier basis functions and wavelet basis functions, may give better results for certain classes of data (Addison 2002; Hu and Lu 2012). Third, fitting over short intervals rather than the entire data may also improve accuracy (Eilers and Marx 1996; Klasson 1997; Klaus and van Ness 1967; Savitzky and Golay 1964; Scott and Scott 1989). All of the methods above behave well in certain applications, while having their limitations (Eilers 2003; Scott and Scott 1989). Nowadays, scientists may have an explosively growing amount of noisy data, though observation noise and errors are still inevitable and even large. It is then desirable to use the amount of data in exchange for reconstruction precision in a numerical differentiation problem. In this paper, we develop such a technique based on Tikhonov regularization. In inverse problems, regularization is the process of adding an additional term in an optimization problem to improve smoothness of the solution and prevent over-fitting.
For a numerical differentiation problem, the solution given by Tikhonov regularization is the minimizer of the Tikhonov functional, which balances the fit to the original data against smoothness. This type of method has been investigated by several authors and gives satisfactory function and derivative reconstructions (Cheng et al. 2007; Hanke and Scherzer 2001; Wang et al. 2002; Wang and Wei 2005; Wei and Hon 2007; Wei et al. 2005). In particular, Cheng et al. (2007) give an a priori choice of the regularization parameter, which alleviates the computational burden while still producing satisfactory reconstructions. However, several difficulties prevent direct application of those existing methods to big data. First, in those previous works, the observation noise is required to be bounded in magnitude, and convergence is obtained when the noise level approaches zero and the number of observations approaches infinity. In a big data setting, by contrast, the noise level is typically large. Thus, the reconstruction error mainly depends on the noise level and remains large regardless of the number of observations. We need a method that uses the redundant observations to improve accuracy. Second, the solution given by Tikhonov regularization is actually a spline function whose knots coincide with the observation points, which means that in the big data setting, where observation points are huge in number, the computational cost of such a spline function can be unaffordable. We need to reduce the computational burden while maintaining reconstruction accuracy. Third, the error estimates in previous works hold only if the observation points are equidistant or quasi-equidistant. However, the observation points are more likely to be randomly distributed in a big data setting. Instead of imposing additional constraints
a priori, we desire rigorous and data-driven error bounds for the function and derivative reconstructions. In this paper, we make several modifications to the existing numerical differentiation technique based on Tikhonov regularization, so that the modified technique is better suited to big data. Instead of finding the minimizer of the Tikhonov functional in an infinite-dimensional function space, we fix a finite number of knots and find a solution in the finite-dimensional vector space of spline functions with those fixed knots. This brings two advantages. First, provided that the observation noise is randomly distributed with a finite variance, the reconstruction error induced by noise tends to zero in probability as the number of observations increases. In this way, our modified technique makes use of the redundant observations to improve accuracy. Second, since the knots can be far fewer than the observation points, the computational burden is alleviated. We design an online algorithm that quickly computes the reconstructions when needed, while processing and saving sequential observation data in temporary storage of a fixed size. In addition, we use the histogram density estimate of the given observation points as an indicator function and give rigorous error bounds for the function and derivative reconstructions in confidence intervals. Our error bounds do not require any additional assumptions on the observation points; they thus avoid the quasi-equidistant condition and are better suited to big data. Further, provided that the observation points are sampled from a probability distribution, we can also give an asymptotic convergence rate in probability without imposing the quasi-equidistant condition. In fact, to the authors' knowledge, most previous works on asymptotic convergence rates need the quasi-equidistant condition (Claeskens et al. 2009; Ragozin 1983; Wahba 1975). Our work achieves the same convergence rate as Claeskens et al. (2009) without imposing that condition.
The remainder of this paper is organized as follows. In Sect. 2, we introduce the problem settings, prove the existence and uniqueness of the reconstruction, and give an online algorithm. In Sect. 3.1, we estimate the averaged mean squared error at observation points. In Sect. 3.2, we use the histogram density estimator as an indicator function and give error bounds in the continuous $L^2$ norm. In Sect. 3.3, we give asymptotic convergence rates in probability, provided that the observation points are sampled from a continuous probability distribution. In Sect. 4, numerical test cases are provided to illustrate the effectiveness of our method.
2 Formulation and Algorithm

In this section, we construct the regularized solution and propose the reconstruction algorithm. We characterize the finite-dimensional linear space in which the regularized solution is sought. The expression for the approximant is derived, and its existence and uniqueness are proved. An algorithm for the construction of the approximant is given in pseudo code.
2.1 Formulation of the Problem

On the interval $I = [0, 1]$, suppose that $f(x)$ is a function that belongs to the Sobolev space $W^{2,2}(I)$. We have a set of observation points

$$X_N = \{x_i\}_{i=1}^{N} \subset I,$$

and corresponding observations with random noise

$$y_i = f(x_i) + \eta_i, \quad i = 1, 2, \ldots, N,$$

where the noise terms $\eta_i$, $1 \le i \le N$, are uncorrelated random variables defined in the probability space $(\mathbb{R}, \mathcal{B}(\mathbb{R}), P)$ that have zero means and identical variances $\sigma^2$. Given those observational data, we are interested in the following problems:

1. Constructing a function $f_N(x)$ such that $f_N(x)$ approximates $f(x)$ and $f_N'(x)$ approximates $f'(x)$;
2. Estimating the approximation errors $f_N - f$ and $f_N' - f'$ in the $L^2$-norm on $I$ or on certain subintervals of $I$.

For the construction of $f_N(x)$, it is worthwhile to note that $\max_{1 \le i \le N} |\eta_i|$ may tend to infinity as $N \to \infty$, which makes it improper to directly apply those regularization techniques in which the regularization parameter is determined by an upper bound of the noise level, such as in Cheng et al. (2007). In this paper, we use penalized regression with cubic B-splines for function reconstruction, which is widely applied in the field of statistics (Claeskens et al. 2009). Let $\Delta$ be a uniform subdivision of $I$ with grid points $p_j$, $0 \le j \le M$, where $p_j$ are defined by

$$p_j = jd, \quad j \in \mathbb{Z}, \tag{1}$$

and $d = 1/M$ denotes the mesh size. Let $V_M$ be the finite-dimensional vector space that contains all cubic splines with knots $\Delta$. We define the following Tikhonov regularization functional

$$J(g; \alpha_N, y_N) = \frac{1}{N} \sum_{i=1}^{N} \bigl(g(x_i) - y_i\bigr)^2 + \alpha_N \|g''\|_{L^2(I)}^2, \tag{2}$$

where $\alpha_N > 0$ is the regularization parameter, and the column vector $y_N \in \mathbb{R}^N$ is composed of the observation data $y_i$, $1 \le i \le N$. The approximant $f_N(x)$ is defined by the following minimization problem

$$f_N(x) = \arg\min_{g \in V_M} J(g; \alpha_N, y_N). \tag{3}$$
In Sect. 3.1 we will show that, when $\alpha_N$ is chosen properly, $f_N(x)$ approximates $f(x)$ in the sense of the mean-squared error at the observation points $X_N$, i.e.,

$$\frac{1}{N} \sum_{i=1}^{N} \bigl(f_N(x_i) - f(x_i)\bigr)^2$$

approaches zero when $N, M \to \infty$ and $M \sim o(N)$. We also give an a priori choice of the regularization parameter $\alpha_N$ in Sect. 2, which gives the best asymptotic convergence rate under our settings. While it is always feasible to construct the function $f_N(x)$ from observational data, $f_N(x)$ may fail to approximate $f(x)$ on subintervals of $I$ that lack observations. A common condition used to tackle this problem is the quasi-uniform condition on $X_N$, i.e.,

$$\frac{\max_{1 \le i \le N-1} |x_{i+1} - x_i|}{\min_{1 \le i \le N-1} |x_{i+1} - x_i|} \tag{4}$$
is bounded, where $x_i$, $1 \le i \le N$, are the observation points sorted in ascending order. However, observation data from the real world may fail to satisfy the quasi-uniform condition, which prevents direct application of error bounds based on that condition, such as those of Ragozin (1983). For example, even when $\{x_i\}$ are i.i.d. samples from the uniform distribution on $I$, the quotient (4) tends to infinity as $N \to \infty$. Also, in real applications engineers may place several adjacent observation points to reduce local uncertainty, which may enlarge the quotient in (4) and result in worse error bounds. In this paper, instead of imposing conditions on $X_N$ a priori, we use a histogram density estimator $\rho_N(x)$ of $X_N$ as an indicator function for upper bounds of the approximation errors on different parts of $I$. Denoting the $M$ bins $I_j$, $1 \le j \le M$, as

$$I_1 = [0, d], \quad I_j = (jd - d, jd], \quad 2 \le j \le M, \tag{5}$$

the estimate $\rho_N$ is given by

$$\rho_N(x) = \begin{cases} \rho_{N,j} = N_j / (N d), & \text{if } x \in I_j, \\ 0, & \text{if } x \notin I, \end{cases} \tag{6}$$

where $N_j$ is the number of elements in the set $X_N \cap I_j$. Here the frequency $N_j / N$ is divided by the mesh size $d$, so $\rho_N(x)$ is normalized as a PDF, i.e., $\int_{\mathbb{R}} \rho_N(x)\,dx = 1$. In Sect. 3.2 we will show that, on any subinterval of $I$ whose length is not smaller than the mesh size $d$ multiplied by an explicit constant, if $\rho_N(x)$ has a positive lower bound, then error bounds of $f_N - f$ and $f_N' - f'$ can be given in $L^2$ norms.
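The histogram density estimator (6) is straightforward to compute. The following is a minimal sketch under the bin convention (5); the function name `histogram_density` and the sample points are assumptions made for illustration.

```python
import math

def histogram_density(points, M):
    """Return [rho_{N,1}, ..., rho_{N,M}] with rho_{N,j} = N_j / (N d), d = 1/M.

    Bins follow (5): I_1 = [0, d] and I_j = ((j - 1) d, j d] for j >= 2.
    """
    N = len(points)
    d = 1.0 / M
    counts = [0] * M
    for x in points:
        if not 0.0 <= x <= 1.0:
            raise ValueError("observation points must lie in I = [0, 1]")
        # Right-closed bins: x in ((j-1)d, jd] means j = ceil(x / d); x = 0 goes to I_1.
        j = max(1, math.ceil(x * M))
        counts[min(j, M) - 1] += 1
    return [c / (N * d) for c in counts]

# Hypothetical sample: three of the five points fall into the first bin.
rho = histogram_density([0.05, 0.1, 0.12, 0.6, 0.9], M=4)
print(rho, sum(rho) / 4)  # the second value is the integral of rho_N over I
```

An empty bin gives $\rho_{N,j} = 0$ there, which is exactly the situation in which the lower bound $\gamma$ used in Sect. 3.2 fails on any subinterval containing that bin.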
2.2 Cardinal Cubic B-Splines

In order to construct the solution space, we first give a definition of cardinal cubic B-splines. The cubic spline function $\phi_3(x)$ is explicitly defined by

$$\phi_3(x) = \frac{1}{6} \times \begin{cases} (x + 2)^3, & \text{if } -2 \le x \le -1, \\ (x + 2)^3 - 4(x + 1)^3, & \text{if } -1 < x \le 0, \\ (2 - x)^3 - 4(1 - x)^3, & \text{if } 0 < x \le 1, \\ (2 - x)^3, & \text{if } 1 < x \le 2, \\ 0, & \text{if } |x| > 2. \end{cases}$$

$\phi_3(x)$ is a piecewise cubic polynomial with compact support $[-2, 2]$ and regularity $C^2(\mathbb{R})$. Cardinal cubic B-splines can be obtained by scaling and shifting $\phi_3(x)$ to equidistant knots. Let the partition $\tilde{\Delta}$, extended from the uniform subdivision $\Delta$ of $I$, be

$$\tilde{\Delta} = \{p_j\}_{j=-1}^{M+1} = \Delta \cup \{p_{-1}, p_{M+1}\},$$

where $p_j$ are defined in (1). The cardinal cubic B-splines with knots $\tilde{\Delta}$ are defined by

$$\psi_j(x) = \phi_3\Bigl(\frac{x - p_j}{d}\Bigr), \quad -1 \le j \le M + 1.$$

Next, we give two properties of the functions $\{\psi_j\}_{j=-1}^{M+1}$.

Lemma 1 Let $V_M$ be the vector space of all cubic spline functions on $I$ with knots $\Delta$. Then, the functions $\{\psi_j\}_{j=-1}^{M+1}$ form a basis of $V_M$. That is, $V_M$ is isomorphic to the real vector space $\mathbb{R}^{M+3}$. The corresponding isomorphism is given by

$$\Phi : \mathbb{R}^{M+3} \to V_M, \quad \lambda \mapsto \Phi[\lambda] = \sum_{j=-1}^{M+1} \lambda_j \psi_j, \tag{7}$$

where $\lambda_j$ is the $j$-th component of the $(M + 3)$-dimensional real column vector $\lambda$.

Lemma 2 The functions $\{\psi_j\}_{j=-1}^{M+1}$ are linearly independent on $I$, i.e., $\forall \{a_j\}_{j=-1}^{M+1} \subset \mathbb{R}$, we have

$$\sum_{j=-1}^{M+1} a_j \psi_j \equiv 0 \quad \text{on } I$$

if and only if $a_j = 0$, $-1 \le j \le M + 1$. The proof can be found in Micula (1999).
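The piecewise definition of $\phi_3$ and the scaled, shifted basis $\psi_j$ translate directly into code. The following is a minimal sketch; the function names `phi3` and `cardinal_basis` and the sample point are assumptions made for illustration.

```python
def phi3(x):
    """Cubic B-spline phi_3, evaluated from its piecewise definition."""
    if -2.0 <= x <= -1.0:
        return (x + 2.0) ** 3 / 6.0
    if -1.0 < x <= 0.0:
        return ((x + 2.0) ** 3 - 4.0 * (x + 1.0) ** 3) / 6.0
    if 0.0 < x <= 1.0:
        return ((2.0 - x) ** 3 - 4.0 * (1.0 - x) ** 3) / 6.0
    if 1.0 < x <= 2.0:
        return (2.0 - x) ** 3 / 6.0
    return 0.0

def cardinal_basis(x, M):
    """Values (psi_{-1}(x), ..., psi_{M+1}(x)) with psi_j(x) = phi3((x - p_j) / d)."""
    d = 1.0 / M
    return [phi3((x - j * d) / d) for j in range(-1, M + 2)]

# Hypothetical evaluation point x = 0.37 with M = 5 cells.
values = cardinal_basis(0.37, M=5)
nonzero = sum(1 for v in values if v != 0.0)
print(len(values), sum(values), nonzero)
```

At any point of $I$ at most four of the $M + 3$ basis functions are non-zero, and their values sum to one. This locality is what makes the matrices in Sect. 2.4 banded.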
Denote the natural cubic spline interpolant of $f$ by $s_{f,M} \in V_M$, which has zero second derivative at both ends and is twice continuously differentiable. We have the following lemma that describes the interpolation error.

Lemma 3 Let $s_{f,M} \in V_M$ be the natural cubic spline interpolant of $f \in W^{2,2}(I)$ with knots $\Delta$. Then, the $L^2$ norms of $f''$ and $s_{f,M}''$ satisfy the following relationship

$$\|s_{f,M}''\|_{L^2(I)}^2 + \|s_{f,M}'' - f''\|_{L^2(I)}^2 = \|f''\|_{L^2(I)}^2. \tag{8}$$

The interpolation errors $s_{f,M} - f$ and $s_{f,M}' - f'$ satisfy

$$\begin{aligned} \|s_{f,M} - f\|_{L^2(I)}^2 &\le \frac{d^4}{16} \|s_{f,M}'' - f''\|_{L^2(I)}^2 \le \frac{d^4}{16} \|f''\|_{L^2(I)}^2, \\ \|s_{f,M}' - f'\|_{L^2(I)}^2 &\le \frac{d^2}{2} \|s_{f,M}'' - f''\|_{L^2(I)}^2 \le \frac{d^2}{2} \|f''\|_{L^2(I)}^2. \end{aligned} \tag{9}$$

The proof can be found in Micula (1999).
2.3 Solution of Regularization Problem as Spline Functions

By Lemma 1, we may rewrite the Tikhonov functional (2) in a vector form for functions that belong to $V_M$, as stated in the following lemma.

Lemma 4 $\forall \lambda \in \mathbb{R}^{M+3}$, the Tikhonov functional $J(\Phi[\lambda])$ in (2), where $\Phi[\lambda]$ is defined in (7), can be written as

$$J(\Phi[\lambda]; \alpha_N, y_N) = \frac{1}{N} (H_N \lambda - y_N)^\top (H_N \lambda - y_N) + \alpha_N \lambda^\top P \lambda, \tag{10}$$

where $H_N$ is defined by

$$H_N = \begin{pmatrix} H_{x_1} \\ \vdots \\ H_{x_N} \end{pmatrix} \in \mathbb{R}^{N \times (M+3)},$$

the row vectors $H_{x_i}$, $1 \le i \le N$, are defined by

$$H_x = \bigl(\psi_{-1}(x), \psi_0(x), \ldots, \psi_M(x), \psi_{M+1}(x)\bigr), \tag{11}$$

and the matrix $P \in \mathbb{R}^{(M+3) \times (M+3)}$ is defined by

$$P = (p_{ij})_{i,j=-1}^{M+1}, \quad p_{ij} = \int_I \psi_i''(x) \psi_j''(x)\,dx. \tag{12}$$
Proof It suffices to show that $\forall \lambda \in \mathbb{R}^{M+3}$, we have

$$H_x \lambda = \Phi[\lambda](x) \tag{13}$$

and

$$\lambda^\top P \lambda = \|\Phi[\lambda]''(x)\|_{L^2(I)}^2. \tag{14}$$

The first equality follows directly from (7), since

$$\Phi[\lambda](x) = \sum_{j=-1}^{M+1} \lambda_j \psi_j(x) = H_x \lambda.$$

Due to the bilinearity of the inner product, to prove the second equality, it suffices to prove that

$$e_i^\top P e_j = \int_I \psi_i''(x) \psi_j''(x)\,dx, \quad \forall\, {-1} \le i, j \le M + 1,$$

where $\{e_j\}_{j=-1}^{M+1}$ is the natural basis of $\mathbb{R}^{M+3}$. This is the definition of $p_{ij}$ in (12).
Finally, in the following theorem, we give an explicit formula for the approximant $f_N$ and prove its uniqueness.

Theorem 1 When $N \ge 2$, the observation points $X_N = \{x_i\}_{i=1}^{N}$ are not all identical, and the regularization parameter $\alpha_N > 0$, the Tikhonov functional (2) has a unique minimizer in $V_M$. The unique minimizer is given by

$$f_N = \Phi[\lambda_N],$$

where

$$\Bigl(\alpha_N P + \frac{1}{N} H_N^\top H_N\Bigr) \lambda_N = \frac{1}{N} H_N^\top y_N. \tag{15}$$

Proof Since the Tikhonov functional (10) is a quadratic function of $\lambda$, it must have the form

$$J(\Phi[\lambda]; \alpha_N, y_N) = \frac{1}{2} (\lambda - \lambda_N)^\top A (\lambda - \lambda_N) + q,$$

where the constant $q$ is independent of $\lambda$. When $A$ is positive definite, $\lambda = \lambda_N$ is the unique minimizer of $J(\Phi[\lambda]; \alpha_N, y_N)$. First, we show that $A$ is positive definite under the assumptions of this theorem. We have

$$A = \frac{\partial^2}{\partial \lambda^2} J(\Phi[\lambda]; \alpha_N, y_N) = \frac{2}{N} H_N^\top H_N + 2 \alpha_N P.$$

Assume that $\lambda^\top A \lambda = 0$. Due to (14), $P$ is a positive semi-definite matrix, hence we have

$$\lambda^\top P \lambda = 0 \quad \text{and} \quad H_N \lambda = 0.$$

From the former equality and (14), we have $\|\Phi[\lambda]''(x)\|_{L^2(I)} = 0$, hence $\Phi[\lambda](x)$ is a linear function on $I$. From the latter equality and (13), we have

$$\Phi[\lambda](x_i) = 0, \quad 1 \le i \le N.$$

Since $\{x_i\}_{i=1}^{N}$ are not all identical, the linear function $\Phi[\lambda](x)$ vanishes at two distinct points and must therefore be identically zero on $I$, hence

$$\lambda = 0$$

by Lemma 2. This proves that $A$ is a positive definite matrix. Next we derive the expression of $\lambda_N$ by letting

$$\frac{\partial}{\partial \lambda} J(\Phi[\lambda]; \alpha_N, y_N) = 0,$$

which gives that

$$\Bigl(\alpha_N P + \frac{1}{N} H_N^\top H_N\Bigr) \lambda_N = \frac{1}{N} H_N^\top y_N.$$
2.4 Algorithm Description

We now summarize, in pseudo code, the algorithm that constructs the approximant $f_N$. In the following algorithm, we choose the regularization parameter $\alpha_N$ a priori as

$$\alpha_N = \frac{\sigma^2 M}{N} + d^4. \tag{16}$$

In this algorithm, sequential observation data are processed and stored in the matrix $A_N$ and the vector $b_N$. On every evaluation of $f_N$, we do not need to reprocess previous observation data and only need to solve one linear system with $M + 3$ degrees of freedom. Note that the matrix $P$ has a bandwidth of 3, and each row vector $H_{x_i}$, $i \in \mathbb{N}_+$, has at most 4 consecutive non-zero elements. Thus, the matrix $\alpha_N P + A_N$ in (17) also has a bandwidth of 3. Therefore, processing one datum costs $O(1)$ flops, and one evaluation of $f_N$ costs $O(M)$ flops thanks to sparsity.
Algorithm 1 Reconstruction algorithm based on prior regularization parameter

Require: The number of cubic spline knots $M$; observational noise level $\sigma^2$.
Ensure: Reconstructed function $f_N(x) \in V_M$.

Initialize $A_0 = 0 \in \mathbb{R}^{(M+3) \times (M+3)}$, which stores $\frac{1}{N} H_N^\top H_N$;
Initialize $b_0 = 0 \in \mathbb{R}^{M+3}$, which stores $\frac{1}{N} H_N^\top y_N$;
Generate the quadratic form matrix $P$ according to (12);
for $N \leftarrow 1, 2, \ldots$ do
    Wait for the input of the next datum point $(x_N, y_N)$;
    Generate the row vector $H_{x_N}$ according to (11);
    Update $A_N$ as $A_N \leftarrow \frac{N-1}{N} A_{N-1} + \frac{1}{N} H_{x_N}^\top H_{x_N}$;
    Update $b_N$ as $b_N \leftarrow \frac{N-1}{N} b_{N-1} + \frac{1}{N} H_{x_N}^\top y_N$;
    When the user requires an evaluation of the current reconstruction result $f_N$, solve the following linear equations for $\lambda_N$:
        $$(\alpha_N P + A_N) \lambda_N = b_N, \tag{17}$$
    where $\alpha_N$ follows the prior rule (16). Then, give $f_N$ as $f_N = \Phi[\lambda_N]$.
end for
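Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' implementation: the class and function names, the dense storage, the dense Gaussian elimination, and the test function $f(x) = x^2$ with noise level $\sigma = 0.1$ are all assumptions. The matrix $P$ is computed exactly, using the fact that each $\psi_j''$ is piecewise linear with nodal values $(1, -2, 1)/d^2$ at $p_{j-1}, p_j, p_{j+1}$.

```python
import random

def phi3(t):
    """Cardinal cubic B-spline phi_3 (support [-2, 2]), written via |t|."""
    a = abs(t)
    if a >= 2.0:
        return 0.0
    if a >= 1.0:
        return (2.0 - a) ** 3 / 6.0
    return ((2.0 - a) ** 3 - 4.0 * (1.0 - a) ** 3) / 6.0

def basis_row(x, M):
    """Row vector H_x = (psi_{-1}(x), ..., psi_{M+1}(x)) of (11)."""
    return [phi3(x * M - j) for j in range(-1, M + 2)]

def penalty_matrix(M):
    """Matrix P of (12); the product of two linear functions is integrated
    exactly on every cell [p_k, p_{k+1}]."""
    d, n = 1.0 / M, M + 3
    nodal = {-1: 1.0, 0: -2.0, 1: 1.0}  # d^2 * psi_j''(p_k), indexed by k - j
    P = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            i, j, total = r - 1, c - 1, 0.0
            for k in range(M):
                a0, a1 = nodal.get(k - i, 0.0), nodal.get(k + 1 - i, 0.0)
                b0, b1 = nodal.get(k - j, 0.0), nodal.get(k + 1 - j, 0.0)
                total += 2 * a0 * b0 + a0 * b1 + a1 * b0 + 2 * a1 * b1
            P[r][c] = total / (6.0 * d ** 3)
    return P

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (a banded solver
    would be needed to reach the O(M) cost stated in the text)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

class OnlineSplineSmoother:
    def __init__(self, M, sigma2):
        self.M, self.sigma2, self.N = M, sigma2, 0
        n = M + 3
        self.A = [[0.0] * n for _ in range(n)]  # stores H_N^T H_N / N
        self.b = [0.0] * n                      # stores H_N^T y_N / N
        self.P = penalty_matrix(M)

    def update(self, x, y):
        """Process one datum (dense update; only a 4 x 4 block is non-zero)."""
        self.N += 1
        w, h = 1.0 / self.N, basis_row(x, self.M)
        for r in range(len(h)):
            self.b[r] = (1.0 - w) * self.b[r] + w * h[r] * y
            for c in range(len(h)):
                self.A[r][c] = (1.0 - w) * self.A[r][c] + w * h[r] * h[c]

    def coefficients(self):
        """Solve (17) with alpha_N chosen by the prior rule (16)."""
        alpha = self.sigma2 * self.M / self.N + self.M ** -4.0
        n = len(self.b)
        lhs = [[alpha * self.P[r][c] + self.A[r][c] for c in range(n)]
               for r in range(n)]
        return solve(lhs, self.b)

    def evaluate(self, lam, x):
        return sum(l * h for l, h in zip(lam, basis_row(x, self.M)))

# Hypothetical test: f(x) = x^2 observed at 2000 random points with noise 0.1.
rng = random.Random(0)
smoother = OnlineSplineSmoother(M=8, sigma2=0.01)
xs = [rng.random() for _ in range(2000)]
for x in xs:
    smoother.update(x, x * x + rng.gauss(0.0, 0.1))
lam = smoother.coefficients()
mse = sum((smoother.evaluate(lam, x) - x * x) ** 2 for x in xs) / len(xs)
print(mse)
```

The stored state consists only of the $(M+3) \times (M+3)$ matrix and the $(M+3)$-vector, so memory use is independent of $N$. Exploiting the band structure in `update` and `solve` is left out here for brevity.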
3 Theoretical Analysis

3.1 Averaged Mean Squared Error at Observation Points

We first estimate the averaged mean squared error of $f_N - f$ at the observation points $X_N$. That is, denoting the error function $e_N$ by

$$e_N = f_N - f, \tag{18}$$

we estimate $\frac{1}{N} \sum_{i=1}^{N} e_N^2(x_i)$ by splitting it into a deterministic part and a random part. The observations $y_i$, $1 \le i \le N$, contain two parts. The first part is information from the truth function $f(x_i)$, $1 \le i \le N$, while the second part is the random noise $\eta_i$, $1 \le i \le N$. These two parts have very different properties and should be treated differently. We denote the truth part $f_{N,1}$ and the noise part $f_{N,2}$ of the approximant $f_N$ by

$$\begin{aligned} f_{N,1} &= \arg\min_{g \in V_M} J(g; \alpha_N, y_N - \eta_N) = \Phi[\lambda_{N,1}], \\ f_{N,2} &= \arg\min_{g \in V_M} J(g; \alpha_N, \eta_N) = \Phi[\lambda_{N,2}], \end{aligned} \tag{19}$$
where $\eta_N \in \mathbb{R}^N$ is the column vector composed of the observation noise $\eta_i$, $1 \le i \le N$. Since the minimizer of the Tikhonov regularization functional depends linearly on the observation values, we have $f_N = f_{N,1} + f_{N,2}$. Therefore, in the following text the averaged mean squared error is estimated separately for the deterministic part and the random part. For the deterministic part, we denote the truth error function $e_{N,1}$ by

$$e_{N,1} = f_{N,1} - f, \tag{20}$$

and then estimate $\frac{1}{N} \sum_{i=1}^{N} e_{N,1}^2(x_i)$ by the minimizing property of $f_{N,1}$, i.e.,

$$J(f_{N,1}; \alpha_N, y_N - \eta_N) \le J(g; \alpha_N, y_N - \eta_N), \quad \forall g \in V_M.$$

For the noise part, we give a confidence interval estimation of $\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i)$ by statistical properties of uncorrelated random variables.
Deterministic Part

Recall that $s_{f,M} \in V_M$ is the natural cubic spline interpolant of $f$ with knots $\Delta$. By the inequality

$$J(f_{N,1}; \alpha_N, y_N - \eta_N) \le J(s_{f,M}; \alpha_N, y_N - \eta_N),$$

we have the following lemma.

Lemma 5 Denote the mean-squared interpolation error of $s_{f,M} - f$ at $X_N$ by $E_N$, that is,

$$E_N = \frac{1}{N} \sum_{i=1}^{N} \bigl(s_{f,M}(x_i) - f(x_i)\bigr)^2. \tag{21}$$

Then, we have

$$\frac{1}{N} \sum_{i=1}^{N} e_{N,1}^2(x_i) \le E_N + \alpha_N \|f''\|_{L^2(I)}^2, \qquad \|e_{N,1}''\|_{L^2(I)}^2 \le \frac{2 E_N}{\alpha_N} + 4 \|f''\|_{L^2(I)}^2.$$

Proof By $J(f_{N,1}; \alpha_N, y_N - \eta_N) \le J(s_{f,M}; \alpha_N, y_N - \eta_N)$, we have

$$\frac{1}{N} \sum_{i=1}^{N} e_{N,1}^2(x_i) + \alpha_N \|f_{N,1}''\|_{L^2(I)}^2 \le E_N + \alpha_N \|s_{f,M}''\|_{L^2(I)}^2. \tag{22}$$
From (8), we have $\|s_{f,M}''\|_{L^2(I)} \le \|f''\|_{L^2(I)}$, then

$$\frac{1}{N} \sum_{i=1}^{N} e_{N,1}^2(x_i) \le E_N + \alpha_N \|f''\|_{L^2(I)}^2.$$

On the other hand, dividing both sides of (22) by $\alpha_N$ gives that

$$\|f_{N,1}''\|_{L^2(I)}^2 \le \frac{E_N}{\alpha_N} + \|s_{f,M}''\|_{L^2(I)}^2.$$

Substituting this into (20), we have

$$\begin{aligned} \|e_{N,1}''\|_{L^2(I)}^2 &\le 2 \|f_{N,1}''\|_{L^2(I)}^2 + 2 \|f''\|_{L^2(I)}^2 \\ &\le 2 \frac{E_N}{\alpha_N} + 2 \|s_{f,M}''\|_{L^2(I)}^2 + 2 \|f''\|_{L^2(I)}^2 \\ &\le 2 \frac{E_N}{\alpha_N} + 4 \|f''\|_{L^2(I)}^2. \end{aligned}$$

Random Part

For the random part, we give estimates of the following two terms

$$\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i) = \frac{1}{N} \lambda_{N,2}^\top H_N^\top H_N \lambda_{N,2}, \qquad \|f_{N,2}''\|_{L^2(I)}^2 = \lambda_{N,2}^\top P \lambda_{N,2}, \tag{23}$$

where $\lambda_{N,2} \in \mathbb{R}^{M+3}$ is defined in (19). By Theorem 1, $\lambda_{N,2}$ is given by

$$(N \alpha_N P + H_N^\top H_N) \lambda_{N,2} = H_N^\top \eta_N. \tag{24}$$

Here, if $P$ were invertible, the inverse of $N \alpha_N P + H_N^\top H_N$ could be given by the Woodbury matrix identity, and the desired estimates of (23) could be obtained in expectation form. However, $P$ is only positive semi-definite and thus not invertible. As a workaround, we replace $P$ by

$$P_\varepsilon = P + \varepsilon I, \quad \varepsilon > 0,$$

and then pass to the limit $\varepsilon \to 0^+$ to reach the final result. First, we replace $P$ by $P_\varepsilon$ and obtain the desired estimates. We first introduce the Woodbury matrix identity:

Lemma 6 (Woodbury matrix identity) Let $A \in \mathbb{R}^{n \times n}$ be an invertible matrix, $U \in \mathbb{R}^{n \times k}$, $C \in \mathbb{R}^{k \times k}$, $V \in \mathbb{R}^{k \times n}$, with $C$ and $C^{-1} + V A^{-1} U$ invertible. Then, we have

$$(A + U C V)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}. \tag{25}$$
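We replace $P$ by $P_\varepsilon$ in (24) and denote the perturbed solution by $\lambda_{N,2}^\varepsilon$, i.e.,

$$\lambda_{N,2}^\varepsilon = (N \alpha_N P_\varepsilon + H_N^\top H_N)^{-1} H_N^\top \eta_N.$$

By the Woodbury matrix identity (25), substituting $A = N \alpha_N P_\varepsilon$, $U = H_N^\top$, $C = I_N$, $V = H_N$, the inverse of $N \alpha_N P_\varepsilon + H_N^\top H_N$ is given by

$$(N \alpha_N P_\varepsilon + H_N^\top H_N)^{-1} = (N \alpha_N)^{-1} P_\varepsilon^{-1} - (N \alpha_N)^{-1} P_\varepsilon^{-1} H_N^\top \bigl((N \alpha_N) I + H_N P_\varepsilon^{-1} H_N^\top\bigr)^{-1} H_N P_\varepsilon^{-1}.$$

Thus, $\lambda_{N,2}^\varepsilon$ is given by

$$\lambda_{N,2}^\varepsilon = P_\varepsilon^{-1} H_N^\top \bigl((N \alpha_N) I + H_N P_\varepsilon^{-1} H_N^\top\bigr)^{-1} \eta_N. \tag{26}$$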
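The step from the Woodbury expansion to (26) can be spelled out as follows. This is a short worked derivation added for clarity; the shorthand $S_\varepsilon = H_N P_\varepsilon^{-1} H_N^\top$ is used here only to shorten the formulas. Multiplying the expanded inverse by $H_N^\top \eta_N$ and factoring gives

```latex
\begin{aligned}
\lambda_{N,2}^{\varepsilon}
&= (N\alpha_N)^{-1} P_\varepsilon^{-1} H_N^\top
   \Bigl[ I - \bigl((N\alpha_N) I + S_\varepsilon\bigr)^{-1} S_\varepsilon \Bigr] \eta_N \\
&= (N\alpha_N)^{-1} P_\varepsilon^{-1} H_N^\top
   \bigl((N\alpha_N) I + S_\varepsilon\bigr)^{-1}
   \Bigl[ \bigl((N\alpha_N) I + S_\varepsilon\bigr) - S_\varepsilon \Bigr] \eta_N \\
&= P_\varepsilon^{-1} H_N^\top \bigl((N\alpha_N) I + S_\varepsilon\bigr)^{-1} \eta_N .
\end{aligned}
```

The point of this form is that the random vector $\eta_N$ is acted on only by a function of the symmetric matrix $S_\varepsilon$, so expectations reduce to traces over its eigenvalues.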
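In the following lemma, we estimate the desired terms (23) with $\lambda_{N,2}$ replaced by $\lambda_{N,2}^\varepsilon$ and $P$ replaced by $P_\varepsilon$:

Lemma 7 If $\varepsilon > 0$, then

$$E \|H_N \lambda_{N,2}^\varepsilon\|_2^2 \le \sigma^2 (M + 3), \qquad E\bigl[(\lambda_{N,2}^\varepsilon)^\top P_\varepsilon \lambda_{N,2}^\varepsilon\bigr] \le \frac{\sigma^2 (M + 3)}{4 N \alpha_N}. \tag{27}$$

Proof Let

$$S_\varepsilon = H_N P_\varepsilon^{-1} H_N^\top,$$

then $S_\varepsilon$ is a positive semi-definite matrix. Let its eigendecomposition be $S_\varepsilon = U T U^\top$, where $U \in \mathbb{R}^{N \times N}$ is an orthogonal matrix and $T \in \mathbb{R}^{N \times N}$ is a diagonal matrix composed of all eigenvalues of $S_\varepsilon$ in non-increasing order, that is,

$$T = \operatorname{diag}\{t_1, t_2, \ldots, t_M, t_{M+1}, t_{M+2}, t_{M+3}, \ldots, t_N\},$$

where $t_i \ge t_{i+1}$, $1 \le i \le N - 1$. Notice that although $S_\varepsilon$ is a matrix of order $N$, it has a maximum rank of $M + 3$, since

$$\operatorname{rank}(S_\varepsilon) \le \operatorname{rank}(P_\varepsilon^{-1}) = M + 3.$$

Since $S_\varepsilon$ is positive semi-definite, the number of its positive eigenvalues equals $\operatorname{rank}(S_\varepsilon)$, hence we have

$$t_i = 0, \quad M + 4 \le i \le N.$$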
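Also, since $\eta_i$, $1 \le i \le N$, are uncorrelated random variables with zero means and variances $\sigma^2$, we have

$$E[\eta_N \eta_N^\top] = \sigma^2 I_N.$$

Therefore, substituting (26), we have

$$\begin{aligned} E \|H_N \lambda_{N,2}^\varepsilon\|_2^2 &= E\bigl[(\lambda_{N,2}^\varepsilon)^\top H_N^\top H_N \lambda_{N,2}^\varepsilon\bigr] \\ &= E\Bigl[\eta_N^\top \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-1} S_\varepsilon^2 \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-1} \eta_N\Bigr] \\ &= E \operatorname{tr}\Bigl[S_\varepsilon^2 \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-2} \eta_N \eta_N^\top\Bigr] \\ &= \operatorname{tr}\Bigl[S_\varepsilon^2 \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-2} E[\eta_N \eta_N^\top]\Bigr] \\ &= \sigma^2 \operatorname{tr}\Bigl[S_\varepsilon^2 \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-2}\Bigr] \\ &= \sigma^2 \sum_{i=1}^{N} \frac{t_i^2}{(t_i + N \alpha_N)^2} \le \sigma^2 (M + 3). \end{aligned}$$

For the other term, we have

$$\begin{aligned} E\bigl[(\lambda_{N,2}^\varepsilon)^\top P_\varepsilon \lambda_{N,2}^\varepsilon\bigr] &= E\Bigl[\eta_N^\top \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-1} H_N P_\varepsilon^{-1} P_\varepsilon P_\varepsilon^{-1} H_N^\top \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-1} \eta_N\Bigr] \\ &= E \operatorname{tr}\Bigl[S_\varepsilon \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-2} \eta_N \eta_N^\top\Bigr] \\ &= \sigma^2 \operatorname{tr}\Bigl[S_\varepsilon \bigl((N \alpha_N) I + S_\varepsilon\bigr)^{-2}\Bigr] \\ &= \sigma^2 \sum_{i=1}^{N} \frac{t_i}{(t_i + N \alpha_N)^2} \\ &\le \sigma^2 \sum_{i=1}^{M+3} \frac{1}{4 N \alpha_N} = \frac{\sigma^2 (M + 3)}{4 N \alpha_N}, \end{aligned}$$

where the last inequality uses $(t_i + N \alpha_N)^2 \ge 4 t_i N \alpha_N$ and $t_i = 0$ for $i \ge M + 4$.

Next, recalling Fatou's lemma in probability theory, we pass to the limit $\varepsilon \to 0^+$ in (27) and obtain the following lemma:

Lemma 8 For the noise part $f_{N,2}$, when $N \ge 2$ and $X_N$ contains at least two different observation points, we have the following estimates

$$E\Bigl[\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i)\Bigr] \le \frac{\sigma^2 (M + 3)}{N}, \qquad E \|f_{N,2}''\|_{L^2(I)}^2 \le \frac{\sigma^2 (M + 3)}{4 N \alpha_N}.$$

Proof We first prove that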
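$$\lambda_{N,2} = \lim_{\varepsilon \to 0^+} \lambda_{N,2}^\varepsilon$$

holds almost surely. Denote $A = N \alpha_N P + H_N^\top H_N$ in this proof. By Theorem 1, $A$ is positive definite provided that $X_N$ contains at least two different observation points. Thus, we have $\lambda_{\max}(A) \ge \lambda_{\min}(A) > 0$, where $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ are the largest and the smallest eigenvalues of $A$, respectively. Since

$$\lambda_{N,2} = A^{-1} H_N^\top \eta_N, \qquad \lambda_{N,2}^\varepsilon = (N \alpha_N P_\varepsilon + H_N^\top H_N)^{-1} H_N^\top \eta_N = (A + \varepsilon N \alpha_N I)^{-1} H_N^\top \eta_N,$$

we have

$$(A + \varepsilon N \alpha_N I)(\lambda_{N,2}^\varepsilon - \lambda_{N,2}) = -\varepsilon N \alpha_N \lambda_{N,2}.$$

This gives the following estimate

$$\|\lambda_{N,2}^\varepsilon - \lambda_{N,2}\|_2^2 \le \varepsilon^2 (N \alpha_N)^2 \|(A + \varepsilon N \alpha_N I)^{-1}\|_2^2 \|\lambda_{N,2}\|_2^2 \le \varepsilon^2 \frac{(N \alpha_N)^2}{\lambda_{\min}(A)^2} \|\lambda_{N,2}\|_2^2.$$

Hence, we have

$$\lim_{\varepsilon \to 0^+} \lambda_{N,2}^\varepsilon = \lambda_{N,2}, \quad \text{a.s.}$$

Next, since the following two equalities

$$\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i) = \frac{1}{N} \|H_N \lambda_{N,2}\|_2^2 = \frac{1}{N} \lim_{\varepsilon \to 0^+} \|H_N \lambda_{N,2}^\varepsilon\|_2^2,$$

$$\lim_{\varepsilon \to 0^+} (\lambda_{N,2}^\varepsilon)^\top P_\varepsilon \lambda_{N,2}^\varepsilon = \lim_{\varepsilon \to 0^+} (\lambda_{N,2}^\varepsilon)^\top P \lambda_{N,2}^\varepsilon + \lim_{\varepsilon \to 0^+} \varepsilon \|\lambda_{N,2}^\varepsilon\|_2^2 = \lambda_{N,2}^\top P \lambda_{N,2}$$

hold almost surely, by Fatou's lemma, we have

$$E\Bigl[\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i)\Bigr] = \frac{1}{N} E\Bigl[\lim_{\varepsilon \to 0^+} \|H_N \lambda_{N,2}^\varepsilon\|_2^2\Bigr] \le \frac{1}{N} \liminf_{\varepsilon \to 0^+} E \|H_N \lambda_{N,2}^\varepsilon\|_2^2 \le \frac{\sigma^2 (M + 3)}{N},$$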
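and

$$\begin{aligned} E \|f_{N,2}''\|_{L^2(I)}^2 = E\bigl[\lambda_{N,2}^\top P \lambda_{N,2}\bigr] &= E\Bigl[\lim_{\varepsilon \to 0^+} (\lambda_{N,2}^\varepsilon)^\top P_\varepsilon \lambda_{N,2}^\varepsilon\Bigr] \\ &\le \liminf_{\varepsilon \to 0^+} E\bigl[(\lambda_{N,2}^\varepsilon)^\top P_\varepsilon \lambda_{N,2}^\varepsilon\bigr] \\ &\le \frac{\sigma^2 (M + 3)}{4 N \alpha_N}. \end{aligned}$$

Remark 1 By Markov's inequality, $\forall \delta_1, \delta_2 > 0$ such that $\delta_1 + \delta_2 < 1$, the following two estimates

$$\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i) \le \frac{\sigma^2 (M + 3)}{\delta_1 N}, \qquad \|f_{N,2}''\|_{L^2(I)}^2 \le \frac{\sigma^2 (M + 3)}{4 \delta_2 N \alpha_N}$$

simultaneously hold with probability at least $1 - \delta_1 - \delta_2$.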
3.2 Estimating Continuous $L^2$ Reconstruction Error

In this section, based on the averaged mean squared error at discrete points, we derive upper bounds of the continuous $L^2$ reconstruction error on certain subintervals of $I$. We will see in the final conclusion that for a certain subinterval $I' \subseteq I$, the lower bound of $\rho_N(x)$ on $I'$ and the upper bound of $\rho_N(x)$ on $I$ both affect the reconstruction error on $I'$, which illustrates how a non-uniform distribution of observation points affects reconstruction accuracy.

Converting Discrete Norm to Continuous Norm

First, we show that for a function $u(x)$ and a point $x_0 \in \mathbb{R}$, the function value $u^2(x_0)$ and the $L^2$ norm of $u(x)$ on a neighborhood of $x_0$ are related.

Lemma 9 $\forall u \in C^1[a, b]$, $\forall x_0 \in [a, b]$, we have

$$\begin{aligned} \|u\|_{L^2[a,b]}^2 &\le 2 \bigl((b - a) u^2(x_0) + (b - a)^2 \|u'\|_{L^2[a,b]}^2\bigr), \\ (b - a) u^2(x_0) &\le 2 \|u\|_{L^2[a,b]}^2 + 2 (b - a)^2 \|u'\|_{L^2[a,b]}^2. \end{aligned}$$

Proof $\forall x \in [a, b]$, we have

$$u(x) = u(x_0) + \int_{x_0}^{x} u'(s)\,ds.$$

Taking squares on both sides and applying the Cauchy-Schwarz inequality gives that

$$u^2(x) \le 2 u^2(x_0) + 2 (b - a) \int_a^b |u'(s)|^2\,ds. \tag{28}$$

Then we integrate over $[a, b]$ with respect to $x$ and have

$$\int_a^b u^2(x)\,dx \le 2 (b - a) u^2(x_0) + 2 (b - a)^2 \int_a^b |u'(s)|^2\,ds.$$

Exchanging $x$ and $x_0$ in (28) and then integrating over $[a, b]$ with respect to $x$, we have

$$(b - a) u^2(x_0) \le 2 \int_a^b u^2(x)\,dx + 2 (b - a)^2 \int_a^b |u'(x)|^2\,dx.$$

Next, we substitute the observation points $x_i$, $1 \le i \le N$, for $x_0$, let the neighborhood of $x_i$ be the histogram bin $I_j$ (defined in (5)) that contains $x_i$, and then sum over certain indices $i$. Let

$$I_{p,q} = I_p \cup \cdots \cup I_q, \quad 1 \le p \le q \le M. \tag{29}$$
We have the following two corollaries that connect the discrete averaged mean squared error to the continuous $L^2$ norm via the histogram density estimator $\rho_N(x)$ defined in (6):

Corollary 1 $\forall p, q$ ($1 \le p \le q \le M$), if

$$\inf_{x \in I_{p,q}} \rho_N(x) \ge \gamma > 0,$$

then $\forall u \in C^1(I)$, we have

$$\|u\|_{L^2(I_{p,q})}^2 \le 2 \Bigl(\frac{1}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + d^2 \|u'\|_{L^2(I_{p,q})}^2\Bigr).$$

Proof $\forall j$ ($p \le j \le q$), $\forall x \in I_j$, by Lemma 9 we have

$$\|u\|_{L^2(I_j)}^2 \le 2 \bigl(d\,u^2(x) + d^2 \|u'\|_{L^2(I_j)}^2\bigr).$$
Summing over $x \in X_N \cap I_j$ and then dividing by $N_j = \#\{X_N \cap I_j\}$, we have

$$\|u\|_{L^2(I_j)}^2 \le 2 \Bigl(N_j^{-1} d \sum_{i=1}^{N} 1_{x_i \in I_j} \cdot u^2(x_i) + d^2 \|u'\|_{L^2(I_j)}^2\Bigr). \tag{30}$$

Since

$$\inf_{x \in I_{p,q}} \rho_N(x) \ge \gamma > 0,$$

we have

$$N_j^{-1} d \le \frac{1}{\gamma N}.$$

Finally, we sum (30) over $p \le j \le q$. Since the bins $I_j$, $p \le j \le q$, do not intersect with each other, each $x_i$ appears at most once. Hence,

$$\|u\|_{L^2(I_{p,q})}^2 \le 2 \Bigl(\frac{1}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + d^2 \|u'\|_{L^2(I_{p,q})}^2\Bigr).$$

Corollary 2 If

$$\sup_{x \in I} \rho_N(x) = \beta,$$

then $\forall u \in C^1(I)$, we have

$$\frac{1}{N} \sum_{i=1}^{N} u^2(x_i) \le 2 \beta \bigl(\|u\|_{L^2(I)}^2 + d^2 \|u'\|_{L^2(I)}^2\bigr).$$

Proof $\forall j$ ($1 \le j \le M$), $\forall x \in I_j$, by Lemma 9 we have

$$u^2(x) \le \frac{2}{d} \|u\|_{L^2(I_j)}^2 + 2 d \|u'\|_{L^2(I_j)}^2.$$

Summing over $x \in X_N \cap I_j$, we have

$$\sum_{i=1}^{N} 1_{x_i \in I_j} \cdot u^2(x_i) \le \frac{2 N_j}{d} \|u\|_{L^2(I_j)}^2 + 2 N_j d \|u'\|_{L^2(I_j)}^2.$$

Since

$$\sup_{x \in I} \rho_N(x) = \beta,$$

we have

$$\frac{N_j}{d} \le N \beta,$$
then

$$\sum_{i=1}^{N} 1_{x_i \in I_j} \cdot u^2(x_i) \le 2 N \beta \|u\|_{L^2(I_j)}^2 + 2 N \beta d^2 \|u'\|_{L^2(I_j)}^2.$$

Finally, summing over $1 \le j \le M$ and dividing by $N$, since $I = I_1 \cup \cdots \cup I_M$, we have

$$\frac{1}{N} \sum_{i=1}^{N} u^2(x_i) \le 2 \beta \|u\|_{L^2(I)}^2 + 2 \beta d^2 \|u'\|_{L^2(I)}^2.$$

Error Bound of Function Reconstruction

Corollary 1 involves the $L^2$ norm of the first-order derivative of a function, while in our settings, only the $L^2$ norm of the second-order derivative is bounded. To obtain an error bound of the function reconstruction, we need to estimate the first-order derivative using the second-order derivative. We first introduce a Sobolev inequality which interpolates between orders of smoothness:

Lemma 10 (Sobolev interpolation inequality) $\forall u \in W^{2,2}[a, b]$, $\forall \varepsilon_0 > 0$, there exists a Sobolev constant $K(\varepsilon_0, b - a)$ such that $\forall \varepsilon \in (0, \varepsilon_0]$, there holds

$$\|u'\|_{L^2[a,b]}^2 \le K \bigl(\varepsilon^{-2} \|u\|_{L^2[a,b]}^2 + \varepsilon^2 \|u''\|_{L^2[a,b]}^2\bigr). \tag{31}$$

See Adams and Fournier (2003) for its proof.

Remark 2 If $\varepsilon_0$ is proportional to the interval length $b - a$, then the Sobolev constant $K$ only relies on the ratio. When $\varepsilon_0 = b - a$, the corresponding Sobolev constant can be given by

$$K(b - a, b - a) = K_* = 32. \tag{32}$$

The following lemma generalizes Corollary 1 by removing the dependence on $u'(x)$. Instead, the $L^2$ norm of $u''(x)$ needs to be bounded, and the length of the interval $I_{p,q}$ should not be smaller than an explicit constant.

Lemma 11 $\forall\, 1 \le p \le q \le M$, suppose that

$$\inf_{x \in I_{p,q}} \rho_N(x) \ge \gamma > 0,$$

and the length of $I_{p,q}$ satisfies

$$|I_{p,q}| \ge 2 K_*^{1/2} d. \tag{33}$$

Then, $\forall u \in W^{2,2}(I)$, we have

$$\|u\|_{L^2(I_{p,q})}^2 \le \frac{4}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + 16 K_*^2 d^4 \|u''\|_{L^2(I_{p,q})}^2, \tag{34}$$
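where $K_*$ is the constant in (32).

Proof By Corollary 1, we have

$$\|u\|_{L^2(I_{p,q})}^2 \le 2 \Bigl(\frac{1}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + d^2 \|u'\|_{L^2(I_{p,q})}^2\Bigr).$$

Then we substitute the Sobolev inequality (31). $\forall \varepsilon \in (0, |I_{p,q}|]$, we have

$$\begin{aligned} \|u\|_{L^2(I_{p,q})}^2 &\le 2 \Bigl(\frac{1}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + d^2 \|u'\|_{L^2(I_{p,q})}^2\Bigr) \\ &\le 2 \Bigl(\frac{1}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + d^2 K_* \bigl(\varepsilon^{-2} \|u\|_{L^2(I_{p,q})}^2 + \varepsilon^2 \|u''\|_{L^2(I_{p,q})}^2\bigr)\Bigr) \\ &= \frac{2}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + 2 K_* \Bigl(\frac{d}{\varepsilon}\Bigr)^2 \|u\|_{L^2(I_{p,q})}^2 + 2 K_* d^2 \varepsilon^2 \|u''\|_{L^2(I_{p,q})}^2. \end{aligned}$$

Next, we choose a proper value of $\varepsilon$ and move the term $2 K_* (d/\varepsilon)^2 \|u\|_{L^2(I_{p,q})}^2$ from the right-hand side to the left-hand side. We choose

$$\varepsilon = 2 K_*^{1/2} d \le |I_{p,q}|,$$

thus $\varepsilon^2 = 4 K_* d^2$ and the coefficient $2 K_* (d/\varepsilon)^2$ satisfies

$$2 K_* \Bigl(\frac{d}{\varepsilon}\Bigr)^2 \le \frac{1}{2}.$$

After transposition of the term, we have

$$\|u\|_{L^2(I_{p,q})}^2 \le \frac{4}{\gamma N} \sum_{i=1}^{N} u^2(x_i) + 16 K_*^2 d^4 \|u''\|_{L^2(I_{p,q})}^2.$$

Remark 3 Since

$$2 K_*^{1/2} = 8 \sqrt{2} < 12,$$

condition (33) holds if $q - p + 1 \ge 12$.

Next, we apply the above lemma to $e_{N,1}$ and $f_{N,2}$, and finally obtain the $L^2$ estimates of the error function $e_N$.

Lemma 12 When $M \ge 3$, $N \ge M$, let

$$\beta = \sup_{x \in I} \rho_N(x),$$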
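and the regularization parameter $\alpha_N$ be chosen as (16). Suppose that $I_{p,q}$ is an interval defined by (29), such that $|I_{p,q}| \ge 2 K_*^{1/2} d$ and

$$\inf_{x \in I_{p,q}} \rho_N(x) = \gamma > 0.$$

Then, for $e_{N,1}$ there hold

$$\begin{aligned} \|e_{N,1}\|_{L^2(I_{p,q})}^2 &\le \frac{M \sigma^2}{N} \cdot \frac{4}{\gamma} \|f''\|_{L^2(I)}^2 + d^4 \cdot \Bigl(\frac{9 \beta + 8}{2 \gamma} + 36 K_*^2 \beta + 64 K_*^2\Bigr) \|f''\|_{L^2(I)}^2, \\ \|e_{N,1}''\|_{L^2(I_{p,q})}^2 &\le \|e_{N,1}''\|_{L^2(I)}^2 \le \Bigl(\frac{9}{4} \beta + 4\Bigr) \|f''\|_{L^2(I)}^2. \end{aligned} \tag{35}$$

Also, $\forall \delta$ ($0 < \delta < 1$), the following two bounds for $f_{N,2}$

$$\begin{aligned} \|f_{N,2}\|_{L^2(I_{p,q})}^2 &\le \frac{M \sigma^2}{N} \cdot \frac{16}{\delta \gamma} + d^4 \cdot \frac{16 K_*^2}{\delta}, \\ \|f_{N,2}''\|_{L^2(I_{p,q})}^2 &\le \|f_{N,2}''\|_{L^2(I)}^2 \le \frac{1}{\delta} \end{aligned} \tag{36}$$

hold simultaneously with probability at least $1 - \delta$.

Proof This proof contains three parts. First, we estimate the interpolation error $E_N$ defined in (21). By Corollary 2 and (9), we have

$$\begin{aligned} E_N &\le 2 \beta \bigl(\|f - s_{f,M}\|_{L^2(I)}^2 + d^2 \|f' - s_{f,M}'\|_{L^2(I)}^2\bigr) \\ &\le 2 \beta \Bigl(\frac{d^4}{16} \|f''\|_{L^2(I)}^2 + \frac{d^4}{2} \|f''\|_{L^2(I)}^2\Bigr) \\ &\le \frac{9}{8} \beta d^4 \|f''\|_{L^2(I)}^2. \end{aligned}$$

Second, by Lemma 5, if $\alpha_N$ satisfies (16), we have

$$\begin{aligned} \frac{1}{N} \sum_{i=1}^{N} e_{N,1}^2(x_i) &\le E_N + \Bigl(\frac{M \sigma^2}{N} + d^4\Bigr) \|f''\|_{L^2(I)}^2 \\ &\le \frac{M \sigma^2}{N} \|f''\|_{L^2(I)}^2 + d^4 \Bigl(\frac{9}{8} \beta + 1\Bigr) \|f''\|_{L^2(I)}^2, \end{aligned}$$

and, since $\alpha_N \ge d^4$,

$$\|e_{N,1}''\|_{L^2(I)}^2 \le \frac{9}{4} \beta \|f''\|_{L^2(I)}^2 + 4 \|f''\|_{L^2(I)}^2 \le \Bigl(\frac{9}{4} \beta + 4\Bigr) \|f''\|_{L^2(I)}^2.$$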
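Substituting the above two formulas into (34) to estimate the $L^2$ norm of $e_{N,1}$, we have

$$\begin{aligned} \|e_{N,1}\|_{L^2(I_{p,q})}^2 &\le \frac{4}{\gamma N} \sum_{i=1}^{N} e_{N,1}^2(x_i) + 16 K_*^2 d^4 \|e_{N,1}''\|_{L^2(I_{p,q})}^2 \\ &\le \frac{4}{\gamma} \cdot \Bigl(\frac{M \sigma^2}{N} + d^4 \Bigl(\frac{9}{8} \beta + 1\Bigr)\Bigr) \|f''\|_{L^2(I)}^2 + 16 K_*^2 d^4 \cdot \Bigl(\frac{9}{4} \beta + 4\Bigr) \|f''\|_{L^2(I)}^2 \\ &\le \frac{M \sigma^2}{N} \cdot \frac{4}{\gamma} \|f''\|_{L^2(I)}^2 + d^4 \cdot \Bigl(\frac{9 \beta + 8}{2 \gamma} + 36 K_*^2 \beta + 64 K_*^2\Bigr) \|f''\|_{L^2(I)}^2. \end{aligned}$$

Third, in Remark 1 we take $\delta_1 = \delta_2 = \delta / 2$, substitute (16), and let $M \ge 3$. Then, the following two estimates

$$\frac{1}{N} \sum_{i=1}^{N} f_{N,2}^2(x_i) \le \frac{2 \sigma^2 (M + 3)}{\delta N} \le \frac{M \sigma^2}{N} \cdot \frac{4}{\delta}, \qquad \|f_{N,2}''\|_{L^2(I)}^2 \le \frac{1}{\delta}$$

hold with probability at least $1 - \delta$. Substituting into (34) to estimate the $L^2$ norm of $f_{N,2}$, we have

$$\begin{aligned} \|f_{N,2}\|_{L^2(I_{p,q})}^2 &\le \frac{4}{\gamma N} \sum_{i=1}^{N} f_{N,2}^2(x_i) + 16 K_*^2 d^4 \|f_{N,2}''\|_{L^2(I_{p,q})}^2 \\ &\le \frac{M \sigma^2}{N} \cdot \frac{16}{\delta \gamma} + d^4 \cdot \frac{16 K_*^2}{\delta}. \end{aligned}$$

Remark 4 Since $e_N = e_{N,1} + f_{N,2}$, $\forall \delta$ ($0 < \delta < 1$), under the conditions of Lemma 12, the following estimate of $e_N$

$$\|e_N\|_{L^2(I_{p,q})} \le C_1 \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} + C_2 M^{-2}$$

holds with probability at least $1 - \delta$, where $C_1$ and $C_2$ are given by

$$\begin{aligned} C_1 &= 2 \gamma^{-1/2} \|f''\|_{L^2(I)} + \frac{4}{\sqrt{\delta \gamma}}, \\ C_2 &= 4 K_* \delta^{-1/2} + \|f''\|_{L^2(I)} \sqrt{\frac{9 \beta + 8}{2 \gamma} + 36 K_*^2 \beta + 64 K_*^2}. \end{aligned}$$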
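Error Bound of Derivative Reconstruction

It remains to show the error bound of the derivative reconstruction. Using the Sobolev inequality (31) again, we have the following lemma:

Lemma 13 Under the conditions of Lemma 12, let

$$K_\sigma = \bigl(\sigma |I_{p,q}|^{-2} + 1\bigr) K_*, \tag{37}$$

then for $e_{N,1}$ there holds

$$\|e_{N,1}'\|_{L^2(I_{p,q})}^2 \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot \Bigl(4 K_\sigma \gamma^{-1} + K_* \Bigl(\frac{9}{4} \beta + 4\Bigr)\Bigr) \|f''\|_{L^2(I)}^2 + d^2 \Bigl(\frac{9 \beta + 8}{8 \gamma} + 18 K_*^2 \beta + 32 K_*^2\Bigr) \|f''\|_{L^2(I)}^2.$$

For $f_{N,2}$, provided that (36) holds, we have

$$\|f_{N,2}'\|_{L^2(I_{p,q})}^2 \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot \Bigl(K_\sigma \frac{16}{\delta \gamma} + K_* \frac{1}{\delta}\Bigr) + d^2 \cdot \frac{8 K_*^2}{\delta}.$$

Proof To complete this proof, we consider the following two cases:

1. $(M \sigma^2 / N)^{1/4} \le |I_{p,q}|$,
2. $(M \sigma^2 / N)^{1/4} > |I_{p,q}|$.

When $(M \sigma^2 / N)^{1/4} \le |I_{p,q}|$, we take

$$\varepsilon = \max\Bigl\{\Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/4},\; 2 K_*^{1/2} d\Bigr\}$$

in the Sobolev inequality (31). Then $\varepsilon$ satisfies

$$\varepsilon \le |I_{p,q}|, \quad \varepsilon^2 \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} + 4 K_* d^2, \quad \varepsilon^{-2} \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{-1/2}, \quad \varepsilon^{-2} \le (4 K_* d^2)^{-1}.$$

Thus, by (35), we have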
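$$\begin{aligned} \|e_{N,1}'\|_{L^2(I_{p,q})}^2 &\le K_* \bigl(\varepsilon^{-2} \|e_{N,1}\|_{L^2(I_{p,q})}^2 + \varepsilon^2 \|e_{N,1}''\|_{L^2(I_{p,q})}^2\bigr) \\ &\le K_* \Bigl[\Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot 4 \gamma^{-1} \|f''\|_{L^2(I)}^2 + d^2 \Bigl(\frac{9 \beta + 8}{8 K_* \gamma} + 9 K_* \beta + 16 K_*\Bigr) \|f''\|_{L^2(I)}^2 \\ &\qquad + \Bigl(\Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} + 4 K_* d^2\Bigr) \Bigl(\frac{9}{4} \beta + 4\Bigr) \|f''\|_{L^2(I)}^2\Bigr] \\ &\le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot K_* \Bigl(4 \gamma^{-1} + \frac{9}{4} \beta + 4\Bigr) \|f''\|_{L^2(I)}^2 + d^2 \Bigl(\frac{9 \beta + 8}{8 \gamma} + 18 K_*^2 \beta + 32 K_*^2\Bigr) \|f''\|_{L^2(I)}^2. \end{aligned} \tag{38}$$

Also, provided that (36) holds, we have

$$\begin{aligned} \|f_{N,2}'\|_{L^2(I_{p,q})}^2 &\le K_* \bigl(\varepsilon^{-2} \|f_{N,2}\|_{L^2(I_{p,q})}^2 + \varepsilon^2 \|f_{N,2}''\|_{L^2(I_{p,q})}^2\bigr) \\ &\le K_* \Bigl[\Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot \frac{16}{\delta \gamma} + d^2 \cdot \frac{4 K_*}{\delta} + \Bigl(\Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} + 4 K_* d^2\Bigr) \cdot \frac{1}{\delta}\Bigr] \\ &\le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot K_* \Bigl(\frac{16}{\delta \gamma} + \frac{1}{\delta}\Bigr) + d^2 \cdot \frac{8 K_*^2}{\delta}. \end{aligned} \tag{39}$$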
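In the second case, where $(M \sigma^2 / N)^{1/4} > |I_{p,q}|$, we take $\varepsilon = |I_{p,q}|$ in the Sobolev inequality (31) and obtain the analogous estimates (40) for $e_{N,1}'$ and (41) for $f_{N,2}'$. From (38) and (40), we have

$$\|e_{N,1}'\|_{L^2(I_{p,q})}^2 \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot \Bigl(4 K_\sigma \gamma^{-1} + K_* \Bigl(\frac{9}{4} \beta + 4\Bigr)\Bigr) \|f''\|_{L^2(I)}^2 + d^2 \Bigl(\frac{9 \beta + 8}{8 \gamma} + 18 K_*^2 \beta + 32 K_*^2\Bigr) \|f''\|_{L^2(I)}^2.$$

From (39) and (41), provided that (36) holds, we have

$$\|f_{N,2}'\|_{L^2(I_{p,q})}^2 \le \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} \cdot \Bigl(K_\sigma \frac{16}{\delta \gamma} + K_* \frac{1}{\delta}\Bigr) + d^2 \cdot \frac{8 K_*^2}{\delta}.$$

Finally, by Lemmas 12 and 13, we obtain the error bounds of both function and derivative reconstructions in the following theorem:

Theorem 2 When $M \ge 3$, $N \ge M$, let

$$\beta = \sup_{x \in I} \rho_N(x),$$

and the regularization parameter $\alpha_N$ be chosen as (16). Suppose that $I_{p,q}$ is an interval defined by (29), such that $|I_{p,q}| \ge 2 K_*^{1/2} d$ and

$$\inf_{x \in I_{p,q}} \rho_N(x) = \gamma > 0.$$

Then, $\forall \delta \in (0, 1)$, the following error estimates hold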
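$$\begin{aligned} \|e_N\|_{L^2(I_{p,q})} &\le C_1 \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/2} + C_2 M^{-2}, \\ \|e_N'\|_{L^2(I_{p,q})} &\le C_3 \Bigl(\frac{M \sigma^2}{N}\Bigr)^{1/4} + C_4 M^{-1} \end{aligned}$$

with probability at least $1 - \delta$, where the constants are given by

$$\begin{aligned} C_1 &= 2 \gamma^{-1/2} \|f''\|_{L^2(I)} + \frac{4}{\sqrt{\delta \gamma}}, \\ C_2 &= 4 K_* \delta^{-1/2} + \|f''\|_{L^2(I)} \sqrt{\frac{9 \beta + 8}{2 \gamma} + 36 K_*^2 \beta + 64 K_*^2}, \\ C_3 &= K_\sigma^{1/2} \Bigl(\frac{4}{\sqrt{\delta \gamma}} + 2 \gamma^{-1/2} \|f''\|_{L^2(I)}\Bigr) + K_*^{1/2} \Bigl(\delta^{-1/2} + \|f''\|_{L^2(I)} \sqrt{\frac{9}{4} \beta + 4}\Bigr), \\ C_4 &= 2 \sqrt{2}\, \frac{K_*}{\sqrt{\delta}} + \|f''\|_{L^2(I)} \sqrt{\frac{9 \beta + 8}{8 \gamma} + 18 K_*^2 \beta + 32 K_*^2}, \end{aligned}$$

and $K_\sigma$ is given by (37).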
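Since all constants in Theorem 2 are explicit, the two bounds can be evaluated numerically. The following is a minimal sketch; the function name `theorem2_bounds`, its argument names, and the sample values are assumptions made for illustration, and the formulas are transcribed from the statement of the theorem, (32), and (37).

```python
import math

K_STAR = 32.0  # Sobolev constant K_* from (32)

def theorem2_bounds(M, N, sigma, beta, gamma, delta, f2_norm, interval_length):
    """Right-hand sides of the two estimates in Theorem 2.

    f2_norm is ||f''||_{L^2(I)} and interval_length is |I_{p,q}|.
    Returns (bound on ||e_N||, bound on ||e_N'||) in L^2(I_{p,q}).
    """
    k_sigma = (sigma / interval_length ** 2 + 1.0) * K_STAR  # (37)
    c1 = 2.0 * f2_norm / math.sqrt(gamma) + 4.0 / math.sqrt(delta * gamma)
    c2 = 4.0 * K_STAR / math.sqrt(delta) + f2_norm * math.sqrt(
        (9.0 * beta + 8.0) / (2.0 * gamma)
        + 36.0 * K_STAR ** 2 * beta + 64.0 * K_STAR ** 2)
    c3 = (math.sqrt(k_sigma)
          * (4.0 / math.sqrt(delta * gamma) + 2.0 * f2_norm / math.sqrt(gamma))
          + math.sqrt(K_STAR)
          * (1.0 / math.sqrt(delta) + f2_norm * math.sqrt(9.0 * beta / 4.0 + 4.0)))
    c4 = 2.0 * math.sqrt(2.0) * K_STAR / math.sqrt(delta) + f2_norm * math.sqrt(
        (9.0 * beta + 8.0) / (8.0 * gamma)
        + 18.0 * K_STAR ** 2 * beta + 32.0 * K_STAR ** 2)
    noise = M * sigma ** 2 / N
    function_bound = c1 * math.sqrt(noise) + c2 * M ** -2.0
    derivative_bound = c3 * noise ** 0.25 + c4 / M
    return function_bound, derivative_bound

# Hypothetical values: uniform design (beta = gamma = 1), confidence 75 %.
fb, db = theorem2_bounds(M=10, N=1000, sigma=1.0, beta=1.0, gamma=1.0,
                         delta=0.25, f2_norm=0.0, interval_length=1.0)
print(fb, db)
```

Evaluating the bounds for several pairs $(M, N)$ shows the trade-off between the noise term, which grows with $M$, and the approximation term, which decays with $M$. The constants are conservative, so the values are worst-case guarantees rather than predictions of the actual error.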
3.3 Convergence of Reconstruction with Randomly Distributed Observation Points

In Theorem 2, we obtained the reconstruction error estimates when the observation points are given. However, convergence is not guaranteed, since the number of observations is fixed and no assumption is imposed on the distribution of the observation points. In this section, we assume that the observation points are i.i.d. random variables with a density function that is given a priori. Then, we show the convergence in probability of the function and derivative reconstructions on certain subintervals of $I$ as $N \to \infty$. The assumptions on the observation points and the empirical distribution are summarized as follows.

Assumption 1 We assume that the observation points $x_i$, $i \in \mathbb{N}_+$, are independent and identically distributed continuous random variables defined in the probability space $(I, \mathcal{B}(I), P)$, with sample space $I$ equipped with the Borel $\sigma$-algebra, a distribution function $F(x)$, and a density function $\rho(x)$ that is continuous on $I$. Also, assume that the noise and the observation points are independent, i.e., $\forall N > 0$, the $\sigma$-algebra generated by $\eta_i$, $1 \le i \le N$, is independent of that generated by $x_i$, $1 \le i \le N$.

Under these settings, the histogram density estimator $\rho_N(x)$ converges to the integral mean of $\rho(x)$ on each bin as $N \to \infty$. In the following text, we introduce the Dvoretzky-Kiefer-Wolfowitz inequality and then give a rough estimate of the convergence speed of $\rho_N(x)$.
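Lemma 14 (Dvoretzky-Kiefer-Wolfowitz inequality) Let $F_N(x)$ be the empirical distribution function of $\{x_i\}_{i=1}^N$, which is given by
\[
F_N(x) = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}_{x_i \le x}, \qquad x \in I. \tag{42}
\]
Then, ∀ε > 0, we have
\[
P\left(\sup_{x\in I} \left|F_N(x) - F(x)\right| > \varepsilon\right) \le 2e^{-2N\varepsilon^2}. \tag{43}
\]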
By the Dvoretzky-Kiefer-Wolfowitz inequality, we can derive a confidence interval estimate for the histogram density estimator $\rho_N(x)$.

Lemma 15 Let $N_j$ be defined by (6). Then, ∀δ ∈ (0, 1), the estimate
\[
\sup_{1\le j\le M} \left| \frac{N_j}{Nd} - \frac{1}{d}\int_{(j-1)d}^{jd} \rho(x)\,dx \right| \le \sqrt{\frac{2\ln(2/\delta)}{d^2 N}}
\]
holds with the probability of at least 1 − δ.

Proof By (43), the estimate
\[
F_N(x) - \varepsilon \le F(x) \le F_N(x) + \varepsilon, \qquad \varepsilon = \sqrt{\frac{\ln(2/\delta)}{2N}},
\]
holds with the probability of at least 1 − δ. By the definition of the empirical distribution function (42), we have
\[
\frac{N_j}{Nd} = \frac{1}{d}\bigl(F_N(jd) - F_N((j-1)d)\bigr), \qquad j = 1, 2, \ldots, M.
\]
Combining the above two formulas, we have
\[
\frac{F(jd) - F((j-1)d)}{d} - \frac{2\varepsilon}{d} \le \frac{N_j}{Nd} \le \frac{F(jd) - F((j-1)d)}{d} + \frac{2\varepsilon}{d}, \qquad j = 1, 2, \ldots, M.
\]
Then, noting that $\rho(x)$ is the derivative of $F(x)$, we have
\[
\frac{F(jd) - F((j-1)d)}{d} = \frac{1}{d}\int_{(j-1)d}^{jd} \rho(x)\,dx.
\]
Thus
\[
\sup_{1\le j\le M} \left| \frac{N_j}{Nd} - \frac{1}{d}\int_{(j-1)d}^{jd} \rho(x)\,dx \right| \le \frac{2\varepsilon}{d},
\qquad \text{where } \frac{2\varepsilon}{d} = \sqrt{\frac{2\ln(2/\delta)}{d^2 N}}.
\]
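The bound of Lemma 15 can be checked numerically. The following is a minimal sketch, assuming I = [0, 1], bin width d = 1/M, and that N_j in (6) counts the observation points in the j-th bin; the function names are hypothetical and not part of the original algorithm. It draws uniform points (so the bin mean of ρ is 1 on every bin) and compares the largest deviation of the histogram density with the bound.

```python
import numpy as np

def histogram_density(x, M):
    """Histogram density estimate on [0, 1] with M equal bins of width d = 1/M."""
    counts, edges = np.histogram(x, bins=M, range=(0.0, 1.0))
    d = 1.0 / M
    return counts / (len(x) * d), edges

def dkw_density_bound(N, M, delta):
    """Right-hand side of Lemma 15, sqrt(2 ln(2/delta) / (d^2 N)), with d = 1/M."""
    d = 1.0 / M
    return np.sqrt(2.0 * np.log(2.0 / delta) / (d**2 * N))

rng = np.random.default_rng(0)
N, M, delta = 5000, 40, 0.05
x = rng.uniform(0.0, 1.0, size=N)   # uniform density, rho(x) = 1 on [0, 1]
rho_N, _ = histogram_density(x, M)
bin_means = np.ones(M)              # (1/d) * integral of rho over each bin
max_dev = np.max(np.abs(rho_N - bin_means))
bound = dkw_density_bound(N, M, delta)
print(f"max deviation = {max_dev:.3f}, bound = {bound:.3f}")
```

For these values the bound is rather loose, which is in line with the text calling it a rough estimate; it tightens like N^(-1/2) for fixed d.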
Corollary 3 Let the upper bound of $\rho(x)$ be
\[
\sup_{x\in I} \rho(x) = \beta.
\]
Suppose that $I_{p,q}$ is an interval defined by (29), such that
\[
\inf_{x\in I_{p,q}} \rho(x) = \gamma.
\]
Then, ∀δ ∈ (0, 1), provided that
\[
N \ge \frac{8\ln(2/\delta)}{\gamma^2 d^2},
\]
the estimates
\[
\sup_{x\in I} \rho_N(x) \le 2\beta, \qquad \inf_{x\in I_{p,q}} \rho_N(x) \ge \frac{\gamma}{2}
\]
hold simultaneously with the probability of at least 1 − δ.

Combining Corollary 3 and Theorem 2, we have the following theorem:

Theorem 3 Denote the upper bound of $\rho(x)$ by
\[
\sup_{x\in I} \rho(x) = \beta.
\]
Suppose that I p,q is an interval defined by (29), such that |I p,q | ≥ 2K ∗1/2 d, and inf ρ(x) = γ > 0.
x∈I p,q
Then, ∀δ ∈ (0, 1), when the regularization parameter α satisfies (16), and the number of observations N satisfies
\[
N \ge \frac{8\ln(4/\delta)}{\gamma^2 d^2},
\]
the estimates
\[
\|e_N\|_{L^2(I_{p,q})} \le C_1 \left(\frac{M\sigma^2}{N}\right)^{1/2} + C_2 M^{-2}, \qquad
\|e_N'\|_{L^2(I_{p,q})} \le C_3 \left(\frac{M\sigma^2}{N}\right)^{1/4} + C_4 M^{-1}
\]
hold simultaneously with the probability of at least 1 − δ. Here, the constants are given by √ 8 1 C1 = 2 2γ − 2 f L 2 (I ) + √ , δγ √ 18β + 8 − 21 + 72K ∗2 β + 64K ∗2 , C2 = 4 2K ∗ δ + f L 2 (I ) γ
1 1 √ √ − 1 8 9 1 − 2 2 C3 = K σ √ + 2 2γ 2 f L 2 (I ) + K ∗ β +4 , 2δ 2 + f L 2 (I ) 2 δγ 9β + 4 1 + 36K ∗2 β + 32K ∗2 , C4 = 4K ∗ δ − 2 + f L 2 (I ) 2γ and K σ is given by (37). Remark 5 By Theorem 3, we can give the optimal convergence orders in probability of e N L 2 (I ) and eN L 2 (I ) , provided that ρ(x) has a positive lower bound on I . Let M ∼ N 1/5 , then the convergence orders are e N L 2 (I ) ∼ O p (N −2/5 ), eN L 2 (I ) ∼ O p (N −1/5 ). The notation O p denotes order in probability.
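The choice M ∼ N^{1/5} in Remark 5 can be motivated by balancing the two terms in each bound of Theorem 3. The following is a short sketch that treats σ and the constants C_1, ..., C_4 as fixed; it is an illustration of the orders, not part of the original proof.

\[
\begin{aligned}
\left(\frac{M\sigma^2}{N}\right)^{1/2} \sim M^{-2}
&\iff M^{5/2} \sim N^{1/2} \iff M \sim N^{1/5},
&&\text{so that } \|e_N\|_{L^2(I_{p,q})} = O_p\bigl(N^{-2/5}\bigr),\\
\left(\frac{M\sigma^2}{N}\right)^{1/4} \sim M^{-1}
&\iff M^{5/4} \sim N^{1/4} \iff M \sim N^{1/5},
&&\text{so that } \|e_N'\|_{L^2(I_{p,q})} = O_p\bigl(N^{-1/5}\bigr).
\end{aligned}
\]

Both bounds are balanced by the same scaling of M, which is why a single choice gives the two orders stated in Remark 5.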
4 Numerical Tests

The truth function $f(x)$ and its derivative are given by
\[
f(x) = \frac{1}{1 + 100(x - 0.5)^2}, \qquad
f'(x) = \frac{-200(x - 0.5)}{\bigl(1 + 100(x - 0.5)^2\bigr)^2}. \tag{44}
\]
On the interval I = [0, 1], we place observation points that are i.i.d. with different densities. At each point we add random noise with the normal distribution N(0, σ), σ = 0.1, and generate the corresponding observations. In the algorithm, we take M = 40, and the regularization parameter $\alpha_N$ is chosen by the a priori rule (16).

In the first case, the observation points are uniformly distributed on I. When N = 5000, the reconstructed results are shown in Fig. 1. The reconstructed function agrees well with the exact one on the whole, with some smoothing effect around the peak of the curve. The derivative reconstruction is less accurate than that of the function itself, with some fluctuation in the flat region and weakened derivative peaks.

Fig. 1 Function and derivative reconstruction of f(x) that is given in (44). a the blue dots are 5000 noisy observations with σ = 0.1, while the red line is the truth function f(x); b the histogram density of observation points; c the reconstructed function; d the reconstructed derivative

We point out that, with such a level of observation noise, reconstruction from a small amount of data is usually not reliable. In Fig. 2a–c, only 200 observations are made, which are evenly distributed in I, with M = 40 as in the previous case. The reconstruction deviates significantly from the ground truth, and the result is worse for the derivative reconstruction. When batches of observations are added with the online update algorithm, the relative error decreases almost monotonically as the amount of data increases, as illustrated in Fig. 2d.
Fig. 2 Function and derivative reconstruction of f (x) that is given in (44) when 200 observation points are evenly distributed in panel (a). b and c give the reconstructed function and its derivative. d gives the relative error versus data amount (20 observations in each batch)
Next, we let the observation points be non-uniformly distributed and examine the difference. We split I into $I_L = [0, 1/2]$ and $I_R = [1/2, 1]$, and distribute 90% of the observation locations in $I_L$ and 10% in $I_R$. We expect the reconstruction to be less accurate in $I_R$ than in $I_L$. When N = 5000, the reconstruction results are shown in Fig. 3. Since the observation points are denser in $I_L$, both the reconstructed function and the reconstructed derivative are more accurate in $I_L$ than in $I_R$, which is in accordance with the error analysis.
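The two test datasets described above can be generated along the following lines. This is a minimal sketch: the helper names are hypothetical, the points are assumed to be uniform within each half-interval in the 90%/10% case (the text does not specify the density inside each half), and σ is used as the standard deviation of the noise.

```python
import numpy as np

def f(x):
    """Truth function from Eq. (44)."""
    return 1.0 / (1.0 + 100.0 * (x - 0.5) ** 2)

def f_prime(x):
    """Exact derivative from Eq. (44)."""
    return -200.0 * (x - 0.5) / (1.0 + 100.0 * (x - 0.5) ** 2) ** 2

def sample_points(N, rng, left_fraction=0.5):
    """Draw N i.i.d. points on [0, 1]. Each point falls in [0, 1/2] with
    probability left_fraction and is uniform within its half (assumption)."""
    in_left = rng.random(N) < left_fraction
    u = rng.random(N) * 0.5
    return np.where(in_left, u, 0.5 + u)

def make_observations(N, sigma, rng, left_fraction=0.5):
    """Noisy observations y_i = f(x_i) + eta_i with Gaussian noise."""
    x = sample_points(N, rng, left_fraction)
    y = f(x) + rng.normal(0.0, sigma, size=N)  # sigma taken as standard deviation
    return x, y

rng = np.random.default_rng(0)
x_uni, y_uni = make_observations(5000, 0.1, rng)                     # uniform case
x_non, y_non = make_observations(5000, 0.1, rng, left_fraction=0.9)  # 90% / 10% case
```

With `left_fraction=0.5` the sampler reduces to the uniform distribution on I, so the same routine covers both experiments.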
Fig. 3 Function and derivative reconstruction of f(x) that is given in (44) when observation points are non-uniformly distributed. a the blue line is 5000 noisy observations with σ = 0.1, the red line is the truth function f(x); b the histogram density of observation points; c the reconstructed function; d the reconstructed derivative

5 Conclusion

In this paper, we proposed a numerical differentiation method based on Tikhonov regularization for scattered data with large random noise. The a priori choice rule for the regularization parameter and the choice of interpolation dimension are designed to balance reconstruction resolution and reliability. The algorithm can process a large amount of online data with low computational cost. An error analysis is carried out, and an indicator function is provided to show the regions where the reconstruction is reliable for a given dataset. The method can be used as a preprocessing step in numerical computations for various inverse problems.

Acknowledgements This work was supported by the National Science Foundation of China (No. 11971121, No. 12201386).
Quantitative Estimation of Crack on or Near Surface Using Laser-Ultrasonic Surface Wave: Numerical Simulation Cheng Hua and Takashi Takiguchi
Abstract Cracks are the most common defects in various kinds of materials. Laser-ultrasonic technology has important applications in the nondestructive inspection of surface or near-surface cracks in solids. Based on thermoelastic coupling theory and the finite element method, this study establishes a numerical method for the quantitative estimation of cracks in the near-surface region using laser-excited Rayleigh waves. The observation position for receiving the signal is set between the laser excitation point and the crack. The interaction between the laser-excited Rayleigh wave and the surface crack is analyzed, and the time difference between two reflected wave signals is used to quantitatively estimate the surface or near-surface crack. The arrival time and amplitude of the reflected Rayleigh (RR) wave are extracted to compare the effects of crack depth. The results show that the arrival time of the RR wave is approximately linear in the top depth of the crack but independent of its bottom depth. The amplitude of the RR wave first increases and then decreases as the top depth increases, and it gradually increases as the bottom depth increases. These results provide a possible basis for practical applications of laser-excited Rayleigh waves to the detection and quantitative estimation of surface or near-surface cracks.

Keywords Laser ultrasonic · Nondestructive inspection · Rayleigh wave · Finite element method
C. Hua (B) Department of Aeronautics and Astronautics, Fudan University, Shanghai, China e-mail: [email protected] T. Takiguchi Department of Mathematics, National Defense Academy of Japan, Yokosuka, Japan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_13
1 Introduction

When a pulsed laser beam whose intensity is not high enough to melt the test object irradiates its surface, the irradiated part undergoes sharp thermal expansion. This deformation of the medium is transmitted to the surroundings and generates ultrasonic waves (White 1963). Pulsed laser-ultrasonic detection technology is based on this physical phenomenon. Since then, laser-ultrasonic detection, as a rapidly developing non-contact nondestructive inspection technique, has gradually come into wide use and has become a research focus (Scruby 1989; Scruby and Drain 1990).

Fundamentally, ultrasonic waves can be roughly divided into two main categories: body waves and surface waves. Body waves can be further sub-categorized into primary (P) and secondary (S) waves; the P-wave arrives first and the S-wave arrives second. Various types of surface waves can also be transmitted across the surface of a body, among them the Rayleigh wave. The Rayleigh wave is the strongest of these surface waves and is generated by the surface interference of the P-wave and the S-wave (Viktorov 1967). Because the Rayleigh wave propagates over long distances and is not easily attenuated, it is a good carrier of surface information about the test object and can be applied in the field of nondestructive inspection (NDI) (Sun et al. 2004).

In practice, the surface of industrial materials is inevitably subjected to pressure or friction, which may produce a variety of defects; one of the most common is a surface or near-surface crack. If such a crack is not effectively detected, it is very hard to prevent it from growing, which can pose a serious hazard to the safety of the whole structure. The detection of surface or near-surface cracks with laser-excited Rayleigh waves has attracted the attention of many researchers because it is non-contact and can be performed remotely (Chen et al. 2013). In the 1980s, experiments confirmed that a Rayleigh wave propagating on a solid surface interacts with cracks, and that information about this interaction can be captured at receiving spots in front of or behind the crack. For example, Cooper et al. (1986) applied a point-source pulsed laser to an aluminum plate and placed a signal receiver between the laser spot and the crack. They analyzed the received wave signal and successfully calculated the crack depth. This shows that the interaction between the Rayleigh wave and the crack is reflected in the waveform and that, in turn, crack information can be extracted from the waveform. Such a connection was later confirmed in Jeong's finite element calculation (Jeong 2005) and in other studies (Dai et al. 2010; Jian et al. 2007; Matsuda et al. 2006). There are also follow-up studies and discussions on various solid media, such as concrete (Aggelis et al. 2012; Abraham et al. 2012; Hashimoto et al. 2017), and on more detailed surface cracks (Han et al. 2022). Recently, one of the authors developed a new high-precision nondestructive inspection method (Mita and Takiguchi 2018).

This study is mainly motivated by the work of Cooper et al. and Jeong. We think that the relationship between the Rayleigh wave and the crack can be further clarified. The task of this study is to discuss the Rayleigh wave propagation process quantitatively, based on numerical simulation results. We establish a finite element model of the laser-excited Rayleigh wave to simulate its interaction with surface cracks. The relationship between the propagating Rayleigh wave field and the crack is analyzed qualitatively and quantitatively, providing a possible basis for laser-ultrasonic NDI of surface or near-surface cracks in materials. Finally, some suggestions are put forward on laser-ultrasonic surface waves, especially for crack inspection in concrete.
2 Modeling

2.1 The Governing Equations

Thermoelastic coupling is the theoretical basis for the generation of the Rayleigh wave when a pulsed laser hits an object surface, as shown in Fig. 1. In the real physical process, a temperature change produces thermal stress, which causes the solid to expand. Conversely, expansion and compression cause the temperature to decrease and increase, respectively. Therefore, as thermoelastic waves are generated and propagate, mechanical motion and thermal motion interact with each other and are coupled. However, the coupling region containing thermal stress is actually very small. This model ignores other external forces and considers the strain generated by the temperature field to be very small as well, so we assume that the temperature change caused by the deformation can be ignored. Therefore, only the one-way coupling from temperature to thermal expansion deformation is considered.

Fig. 1 Illustrations of the ultrasonic wave generated by a pulse laser beam (a) and Rayleigh wave propagation (b)

The thermoelastic governing equations can be summarized as follows.

Equation of motion
\[
\sigma_{ij,j} = \rho \ddot{u}_i \tag{1}
\]
Strain–displacement equation
\[
\varepsilon_{ij} = \frac{1}{2}\left(u_{i,j} + u_{j,i}\right) \tag{2}
\]
Constitutive equation
\[
\sigma_{ij} = \left(\lambda + \frac{2}{3}\mu\right)\delta_{ij}\varepsilon_{kk} + 2\mu e_{ij} - \delta_{ij}(3\lambda + 2\mu)\alpha T \tag{3}
\]
Heat conduction equation
\[
k T_{,ii} + \rho Q = \rho c \dot{T} \tag{4}
\]
In the above governing equations, $u_i$ is the displacement component, $T$ is the temperature, $\sigma_{ij}$ is the stress tensor component, $\varepsilon_{ij}$ is the strain tensor component, $\varepsilon_{kk}$ is the first invariant of strain, $e_{ij}$ is the deviatoric strain component, $\rho$ is the density, $\lambda$ and $\mu$ are the Lamé coefficients, $\alpha$ is the coefficient of thermal expansion, $k$ is the thermal conductivity, $c$ is the heat capacity at constant pressure, and $Q$ is the heat source. The term $kT_{,ii}$ in Eq. (4) denotes $k$ times the Laplacian of $T$, which can also be written as $\operatorname{div}(k \operatorname{grad} T)$.
2.2 Laser and Material

For the heat source $Q$ in Eq. (4), a heat source with a radius of 0.1 mm is placed on the upper surface of the model to simulate irradiation by a sinusoidal pulsed laser with a spot of the same radius, and it is assumed that there is no other heat exchange between the model and the outside. The heat source excitation varies sinusoidally in time:
\[
Q = I \sin(2\pi f t), \tag{5}
\]
where the frequency $f$ is 2 MHz, the irradiation time is 0.125 µs, and the amplitude $I$ is 25000 W. Aluminum, a common industrial material with good thermoelastic properties, is selected as the model material. The material parameters used in the above governing equations are listed in Table 1.
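The excitation in Eq. (5) can be written as a small function of time. This is a sketch only: the names are hypothetical, and setting Q to zero outside the irradiation window is an assumption about how the pulse is switched off, which the text does not state explicitly.

```python
import numpy as np

I0 = 25000.0      # amplitude I from the text [W]
F_PULSE = 2.0e6   # frequency f [Hz]
T_ON = 0.125e-6   # irradiation time [s]

def heat_source(t):
    """Q(t) = I sin(2 pi f t) during irradiation, zero otherwise (assumed cutoff)."""
    t = np.asarray(t, dtype=float)
    on = (t >= 0.0) & (t <= T_ON)
    return np.where(on, I0 * np.sin(2.0 * np.pi * F_PULSE * t), 0.0)

t = np.linspace(0.0, 0.25e-6, 101)  # sample times [s]
Q = heat_source(t)
```

With the stated values, f times the irradiation time equals 0.25, so the pulse covers one quarter of a sine period and Q rises from 0 to I over the irradiation window.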
Table 1 Material properties

Material property parameter              Parameter value
Density (ρ)                              2700 [kg/m³]
Young's modulus (E)                      70 [GPa]
Poisson's ratio (ν)                      0.33
Constant pressure heat capacity (c)      900 [J/(kg·K)]
Thermal conductivity (k)                 238 [W/(m·K)]
Coefficient of thermal expansion (α)     23e−6 [1/K]

2.3 The Geometric Model

We assume that cracks on or near the surface of the material are mostly deep and narrow and can be simplified as very thin rectangles. Therefore, the geometry of our computational object is a two-dimensional model with a very thin rectangular crack near the upper surface, as shown in Fig. 2. The length of the model is 20 cm in the x-direction, and the height is 10 cm in the z-direction, as shown in Fig. 1. The laser heat source acts at the upper left corner of the model. To keep echoes from the bottom and sides of the model from affecting the calculation results, a symmetric boundary condition is used on the left boundary of the model, and absorbing boundary conditions are used on the right and bottom boundaries. The crack is described by three parameters: the depth D of the top of the crack below the upper surface of the model, the depth L of the bottom of the crack, and the width of the crack, which is fixed at 0.1 mm. In this study, L and D are varied: L is set to 1 mm, 2 mm, 3 mm, and 4 mm, and D is set to 0 mm, 0.01 mm, 0.05 mm, 0.1 mm, 0.2 mm, 0.5 mm, 0.7 mm, and 1 mm, so we have 4 × 8 = 32 computational models. The observation point A for receiving the signal is 40 mm to the right of the laser spot. Since a stress-free boundary condition is the conventional treatment of crack faces, the natural crack-face boundary condition is used in this study, including in the case D = 0 mm.
Fig. 2 Geometric sketch of model
2.4 Finite Element Modeling

This study applies the finite element method (FEM) for numerical simulation. The governing equations of the physical model can be written in FEM form:
\[
C\dot{T} + KT = P \tag{6}
\]
\[
M\ddot{U} + EU = F \tag{7}
\]
In the above formulas, $C$ is the heat capacity matrix, $K$ is the heat conduction matrix, $P$ is the temperature load matrix, $T$ is the nodal temperature matrix, $M$ is the mass matrix, $E$ is the stiffness matrix, $F$ is the nodal load vector, and $U$ is the nodal displacement vector. In the numerical simulation, a FEM model is built and run with the finite element analysis software COMSOL Multiphysics 5.2 on a 16-core Intel Xeon CPU with 16 GB of main memory. The maximum size of the FEM mesh is 0.02 mm and the minimum size is 0.001 mm; the mesh is refined near the crack and near the laser spot in order to improve the accuracy of the FEM solution, as shown in Fig. 3. The time step of the finite element calculation is set to 2.5 ns, and the ratio of mesh size to time step is much larger than the velocity of the P-wave in the model material, so that the stability condition of the numerical calculation is satisfied (Chung and Hulbert 1993).
3 Results and Discussion

When the laser is applied to the solid surface, the calculation results show that rapid and steep deformation occurs just near the laser spot. This steep deformation is transmitted into the interior and generates P- and S-waves, and these two waves interfere to form Rayleigh waves on the surface. Among them, the P-wave is the fastest, while the Rayleigh wave is slightly slower than the S-wave. According to the calculation, the velocity of the Rayleigh wave in aluminum is 2823 m/s, which is 45% and 90% of the velocities of the P-wave and the S-wave, respectively. This calculation is consistent with the theoretical relation among the velocities of elastic waves in aluminum and provides a foundation for further research on the interaction between the Rayleigh wave and the surface crack.

Fig. 3 Diagram of finite element modeling and mesh (mesh type: triangular; mesh size: 0.001 to 0.02 mm; symmetric boundary condition on the left boundary; absorption boundary conditions on the right and bottom boundaries)
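The velocity ratios quoted above can be compared with the bulk wave speeds implied by Table 1. The sketch below assumes an isotropic linear elastic solid and uses the standard relations between (E, ν) and the Lamé coefficients; the closed-form Rayleigh speed is a common approximation that does not come from the paper and is included only as a rough cross-check.

```python
import math

RHO = 2700.0  # density [kg/m^3], Table 1
E = 70.0e9    # Young's modulus [Pa], Table 1
NU = 0.33     # Poisson's ratio, Table 1

# Lame coefficients of an isotropic elastic solid
lam = E * NU / ((1.0 + NU) * (1.0 - 2.0 * NU))
mu = E / (2.0 * (1.0 + NU))

v_p = math.sqrt((lam + 2.0 * mu) / RHO)  # P-wave speed [m/s]
v_s = math.sqrt(mu / RHO)                # S-wave speed [m/s]
# Common closed-form approximation for the Rayleigh speed (not from the paper)
v_r_approx = v_s * (0.87 + 1.12 * NU) / (1.0 + NU)

V_R_FEM = 2823.0  # Rayleigh speed reported from the FEM calculation [m/s]
print(f"v_p = {v_p:.0f} m/s, v_s = {v_s:.0f} m/s, v_r (approx.) = {v_r_approx:.0f} m/s")
print(f"FEM v_r / v_p = {V_R_FEM / v_p:.3f}, FEM v_r / v_s = {V_R_FEM / v_s:.3f}")
```

Under these assumptions the ratios of the FEM Rayleigh speed to the computed P- and S-wave speeds come out close to the 45% and 90% stated in the text, and the closed-form approximation lies within a few percent of the FEM value.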
3.1 Numerical Results of Laser-Excited Rayleigh Wave for Detection of Surface Crack

The surface crack is the special case in which the top depth parameter D is 0. In this study, the generation and propagation of the laser-excited Rayleigh wave are calculated for bottom depths L of 1 mm, 2 mm, 3 mm, and 4 mm. Figure 4a–f shows snapshots of the wave field of the laser-excited Rayleigh wave for L = 2 mm, illustrating the generation and propagation of the Rayleigh wave and its interaction with the surface crack. In order to show the wave field clearly, different color scales are used at different times. Initially, the incident Rayleigh wave (IR wave) excited by the laser propagates along the surface, as illustrated in Fig. 4a. Figure 4b shows the IR wave encountering the crack; part of the Rayleigh wave is reflected (RR wave) in Fig. 4c, while the other part propagates vertically down the crack in Fig. 4d. When the downward Rayleigh wave meets the corner at the bottom of the crack, it forms a scattered S wave (TS wave) in Fig. 4e, which again interferes on the surface to form a Rayleigh wave (RS wave) in Fig. 4f. Therefore, the observation point A in front of the crack can receive three Rayleigh wave signals: the first is the incident Rayleigh wave (IR wave), the second is the Rayleigh wave reflected from the top of the crack (RR wave), and the third is the Rayleigh wave (RS wave) formed by surface interference of the S wave scattered from the bottom of the crack. These simulation results show that the interaction between the laser-induced Rayleigh wave and the surface crack occurs at the top and bottom of the crack and that the interaction is transmitted to the model surface.
Therefore, by taking the observation point A on the model surface, the surface crack depth can be quantitatively estimated from its waveform; specifically, the bottom depth parameter L of the crack can be obtained from the time difference between the RR and RS waves. Figure 5 shows the y-direction displacement at the observation point A for different values of the parameter L. It can be seen from the waveform that the P-wave has the highest speed: the first wave to reach point A is the longitudinal wave (i.e., the P-wave) moving along the surface (SL wave), followed mainly by the IR wave, the RR wave, and the RS wave. The influence of the parameter L on the waveform therefore appears only after the RR wave arrives, not before, as shown in Fig. 5. As L increases, the time interval Δt between the RR and RS waves, i.e., the time delay between the two peaks of the reflected Rayleigh waves, also increases. According to the empirical formula based on the experiments of Cooper et al. (1986), the time interval Δt can be obtained as follows:
\[
\Delta t = \frac{L}{v_r} + \frac{L}{v_s\cos\theta} - \frac{L\tan\theta}{v_r}, \tag{8}
\]
where $v_r$ is the velocity of the Rayleigh wave and $v_s$ is the velocity of the S-wave. Although θ depends on material properties, it can be approximately regarded as 30° in most materials (Cooper et al. 1986). We think that this formula may be of theoretical value for inverse problems. It is suggested that the bottom depth parameter L can be quantitatively evaluated by inversion based on Eq. (8) as well as by numerical approaches.

Fig. 4 Laser-induced Rayleigh waves (The full-field displacement diagram of the model): a the IR wave is generated and propagates along the surface; b the IR wave meets the crack; c the IR wave splits into RR wave and the Rayleigh wave propagates downward along the crack; d TS wave is generated; e TS wave propagates around the model; f RS wave is generated

Fig. 5 Waveform of point A at different values of crack parameter L (displacement in the y-direction at different crack heights versus time in µs; labeled arrivals: SL wave, IR wave, RSL wave, RR wave, and RS wave; Δt represents the time difference between the RR and RS waves)
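Because Eq. (8) is linear in L, the inversion for the bottom depth is direct once the time delay between the RR and RS waves has been read from the waveform. The following sketch uses the Rayleigh speed of 2823 m/s reported above, takes the S-wave speed from the quoted 90% ratio, and sets θ = 30°; the function names are hypothetical, and the printed delays are values of the formula, not simulation results.

```python
import math

V_R = 2823.0                # Rayleigh speed from the FEM calculation [m/s]
V_S = V_R / 0.90            # S-wave speed, using the 90% ratio quoted in the text
THETA = math.radians(30.0)  # angle theta, taken as 30 degrees

def delay_per_depth(v_r=V_R, v_s=V_S, theta=THETA):
    """Coefficient of L in Eq. (8), in s/m."""
    return 1.0 / v_r + 1.0 / (v_s * math.cos(theta)) - math.tan(theta) / v_r

def time_delay(L, **kw):
    """Eq. (8): delay between the RR and RS arrivals for bottom depth L [m]."""
    return L * delay_per_depth(**kw)

def depth_from_delay(dt, **kw):
    """Invert Eq. (8) for L [m] given a measured delay dt [s]."""
    return dt / delay_per_depth(**kw)

for L_mm in (1.0, 2.0, 3.0, 4.0):
    dt = time_delay(L_mm * 1e-3)
    print(f"L = {L_mm:.0f} mm -> dt = {dt * 1e6:.2f} us")
```

In practice the measured delay would come from picking the RR and RS peaks in a waveform such as Fig. 5, and the sensitivity of the estimated L to the choice of θ can be explored by passing a different `theta`.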
3.2 Numerical Results of Laser-Excited Rayleigh Wave for Detection of Crack Near Surface

The propagation of the laser-excited Rayleigh wave is studied numerically when it encounters near-surface cracks (the top depth parameter D is not 0). We consider the case in which the bottom depth parameter L of the crack is unchanged, which means that the bottom of the crack is fixed. As D becomes larger, the amplitude of the TS wave becomes smaller, that is, the TS wave is less obvious, while the amplitudes of both the RR and RS waves change (see Fig. 6). The numerical results indicate that the arrival time of the RR wave at the observation point A advances as D increases, while the arrival time of the RS wave is not affected. An approximately linear relationship between the arrival time of the RR wave and the parameter D can be obtained by extracting waveform information and fitting, as shown in Fig. 7. The amplitude of the RR wave, however, first increases and then decreases with D, as shown in Fig. 8.

Fig. 6 Waveform diagram of the point A at different top depth parameter D (when L = 2 mm fixed; labeled arrivals: IR wave, RR wave, TS wave, and RS wave; horizontal axis: time in µs)

Fig. 7 The arrival time of RR wave versus parameter D (time of arrival in µs versus the top depth D)

Fig. 8 The amplitude of RR wave versus parameter D (the RR wave amplitude versus the top depth D)

We try to illustrate why the amplitude of the RR wave behaves in this way with Fig. 9, which explains the interaction between the Rayleigh wave and the near-surface crack on the basis of the FEM results. The reason can be conjectured qualitatively: when D increases, the portion of the IR wave transmitted at the crack increases, while the portion travelling down along the crack decreases. The changes in these two Rayleigh waves jointly affect the amplitude and peak value of the RR wave, so that it first increases and then decreases.

Fig. 9 Interaction diagram of Rayleigh waves with near-surface crack

When the top depth parameter D is fixed at 0.1 mm, the waveforms at the observation point A for different values of L are shown in Fig. 10. Figure 10 shows that the arrival time of the RR wave is almost the same for different L, but its amplitude increases as L increases. This also confirms the validity of Eq. (8) in the numerical sense. Thus, it is suggested that the parameter L of a near-surface crack can be quantitatively evaluated by inversion based on numerical approaches.

Fig. 10 The waveform plot of observation point A with different parameter L (y displacement in nm versus time in µs)

The above numerical results indicate that the arrival time of the RR wave is related only to the top depth D, while the amplitude of the RR wave is related to both D and L. The top depth D can be estimated from the arrival time of the RR wave in the waveform, and the bottom depth L can be estimated from the time difference between the RS wave and the RR wave.
4 Conclusions

In this study, a FEM model of the laser-excited Rayleigh wave is established, and the relationship between the Rayleigh wave and surface cracks is analyzed numerically. With this FEM model, a possible method for quantitative estimation of the depth of a crack on or near the surface is proposed. The depth of the crack can be obtained from the waveform at the observation point, which is consistent with the results of Cooper et al. According to the waveform signal, the depth of a surface or near-surface crack can be quantitatively estimated, and the conclusions are as follows:
(1) The arrival time of the RR wave is approximately linear in the top depth D and independent of the bottom depth L.
(2) The amplitude of the RR wave is affected by both the top depth D and the bottom depth L: if the bottom depth remains unchanged, the amplitude of the RR wave first increases and then decreases as the top depth increases.
Finally, from the viewpoint of the Appendix, it is also important to determine the surface position P (see Fig. 2) under which the crack is located. This problem will be addressed in our next work, together with the theoretical research on Eq. (8).

Acknowledgements The authors would like to thank the anonymous reviewers for their constructive comments and suggestions, which have helped to improve this paper significantly. The authors would also like to express their special thanks to Professor Jun Huang for his enthusiastic help with the figures, and to former graduate student Haoyu Wang for his supplementary FEM calculations. The work has been partially supported by Shanghai Key Laboratory for Acupuncture Mechanism and Acupoint Function (No. 21DZ2271800) and Natural Science Foundation of Shanghai (No. 17ZR1402800).
Appendix: Application to Prevention of Concrete Exfoliation
Our theory can be applied to prevent the exfoliation of concrete structures. When a concrete structure becomes old, an exfoliated piece of concrete can fall and may cause a serious accident. In fact, such an accident happened in Japan a few years ago. In order to prevent such accidents, we have to know a priori where concrete exfoliation is likely to happen. The main (and almost the only) cause of concrete exfoliation is the existence of cavities in the concrete structure close to its surface. Not so long ago, it was very common to detect such cavities by manual hammering tests at huge labor cost. In order to reduce this labor cost, West Nippon Expressway Shikoku Company Limited, a Japanese expressway maintenance company, developed an excellent NDI device called J-SYSTEM (https://www.w-e-shikoku.co.jp/wp-content/uploads/2021/07/4ecdce3ad2493561923042109a9fe2ed.pdf). J-SYSTEM consists of a thermal camera and a computer. The computer analyzes the thermal pictures to detect where critical cavities close to the surface are located, using an algorithm based on deep learning. By NDI with J-SYSTEM, all critical cavities near the surface are completely detected, and its running cost is much lower than that of the hammering test. Therefore, we can conclude that NDI with J-SYSTEM is far superior.
Let us briefly review the outline of this NDI. The concrete structure is warmed by sunshine in the daytime, and its surface radiates heat in the evening when the temperature becomes low. If there are cavities near the surface, this produces differences in the surface thermal pictures, which are photographed and analyzed by J-SYSTEM. By its nature, NDI by J-SYSTEM is passive: it is possible only for a limited period of time in a day, and the length of this period depends on the season and the weather. J-SYSTEM includes a subsystem (with a specimen) to judge when its NDI is possible, which is another strong point of J-SYSTEM.
We, however, cannot inspect all concrete structures by this device. NDI by J-SYSTEM is impossible in areas where the sunshine is not sufficient in the daytime. To complement it, some active NDI technique is necessary. We claim that our theory can be applied for this purpose. In this inspection, we do not have to detect the depth of the cavity (the parameter L). What we try to do is to give a rough sketch of the cavities whose depth (the parameter D) is less than 1 cm. In this detection, we need not detect the location of the cavity precisely; a rough sketch is sufficient. This is because, in the practical prevention of concrete exfoliation, when critical cavities near the surface are found, the surface part near the cavity is removed by hammering before it exfoliates, for which a rough sketch of the cavity is sufficient.
For our theory to be applied to prevent the exfoliation of concrete structures, it is desirable to give a rough sketch of the cavities with D less than 1 cm near one whole surface of the structure, using as few laser ultrasonic waves as possible. For this purpose our theory should be modified and generalized. This is our next problem from the viewpoint of practical applications.
References
Abraham O, Piwakowski B, Villain G et al (2012) Non-contact, automated surface wave measurements for the mechanical characterisation of concrete. Constr Build Mater 37:904–915
Aggelis DG, Leonidou E, Matikas TE (2012) Subsurface crack determination by one-sided ultrasonic measurements. Cement Concr Compos 34(2):140–146
Chen K, Fu X, Dorantes-Gonzalez DJ et al (2013) Laser-generated surface acoustic wave technique for crack monitoring: a review. Int J Autom Technol 7(2):211–220
Chung J, Hulbert GM (1993) A time integration algorithm for structural dynamics with improved numerical dissipation: the generalized-α method. J Appl Mech 60(2):371–375
Cooper JA, Dewhurst RJ, Palmer SB et al (1986) Characterization of surface-breaking defects in metals with the use of laser-generated ultrasound [and discussion]. Philos Trans R Soc Lond 320:319–328
Dai Y, Qiang B, Xu et al (2010) Finite element modeling of the interaction of laser-generated ultrasound with a surface-breaking notch in an elastic plate. Optics Laser Technol 42(4):693–697
Han S, Lian Y, Xie L, Hu Q, Ding J, Wang Y, Lu Z (2022) Numerical simulation of angled surface crack detection based on laser ultrasound. Front Phys 10:982232
Hashimoto K, Shiotani T, Nishida T et al (2017) Application of elastic-wave tomography to repair inspection in deteriorated concrete structures. J Disaster Res 12(3):496–505
Jeong H (2005) Finite element analysis of laser-generated ultrasound for characterizing surface-breaking cracks. J Mech Sci Technol 19(5):1116–1122
Jian X, Dixon S, Guo N, Edwards R (2007) Rayleigh wave interaction with surface-breaking cracks. J Appl Phys 101(6):064907
J-SYSTEM. https://www.w-e-shikoku.co.jp/wp-content/uploads/2021/07/4ecdce3ad2493561923042109a9fe2ed.pdf, https://www.w-e-shikoku.co.jp/wp-content/uploads/2021/07/71fae6912be864fbcc7618aa91fe7fe0.pdf
Matsuda Y, Nakano H, Nagai S et al (2006) Surface breaking crack evaluation with photorefractive quantum wells and laser-generated Rayleigh waves. Appl Phys Lett 89(17):324–356
Mita N, Takiguchi T (2018) Principle of ultrasonic tomography for concrete structures and nondestructive inspection of concrete cover for reinforcement. Pacific J Math Ind 10(1):1–10
Scruby CB (1989) Some applications of laser ultrasound. Ultrasonics 27(4):195–209
Scruby CB, Drain LE (1990) Laser ultrasonics: techniques and applications. Adam Hilger
Sun J, Shengwen QI, Zhang H (2004) Application of Rayleigh wave detection in nondestructive testing for engineering. J Eng Geol 12(S1):427–432
Viktorov IA (1967) Rayleigh and Lamb waves. Plenum Press, New York, pp 1–122
White RM (1963) Elastic wave generation by electron bombardment or electromagnetic wave absorption. J Appl Phys 34(7):2123–2124
Enclosure Method for Inverse Problems with the Dirichlet and Neumann Combined Case
Mishio Kawashita and Wakako Kawashita
Abstract In this article, a brief survey of the enclosure method for detecting obstacles is given from the viewpoint of the shortest length. After that, we consider problems for the “combined case,” in which, for example, the Dirichlet boundary condition is satisfied on the boundaries of some cavities and the Neumann boundary condition is satisfied on the boundaries of the others. The difficulty in these cases comes from the fact that the sign of the indicator function with a parameter is not definite when the parameter is large enough. The combined cases are classified into two types: the separated case and the non-separated case. The method of elliptic estimates developed by Ikehata works for the separated case but does not work well for the non-separated case. Hence, we describe how to obtain the shortest length from the indicator function for both types of combined cases.
Keywords Enclosure method · Shortest length · Combined case · Indicator function · Cavities
Partly supported by JSPS KAKENHI Grant Number JP19K03565.
Partly supported by JSPS KAKENHI Grant Number JP20K03684.
M. Kawashita (B) Mathematics Program, Graduate School of Advanced Science and Engineering, Hiroshima University, Higashihiroshima 739-8526, Japan e-mail: [email protected]
W. Kawashita Electrical, Systems, and Control Engineering Program, Graduate School of Advanced Science and Engineering, Hiroshima University, Higashihiroshima 739-8527, Japan e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_14
1 Introduction
In daily life, many people would like to know the inside of a human body or the position of a subterranean cave from observations made outside, without invasion or destruction. Inverse problems arise from these fundamental issues in modern science. Calderón (1980) formulated one such problem, estimating conductivity
from observation data on the outside boundaries as a problem of identifying the coefficients of elliptic boundary value problems. Problems of this kind are called inverse boundary value problems (IBP for short). After this pioneering work, many researchers investigated inverse problems for differential equations such as the IBP. Many types of inverse problems have been formulated and studied, and time-dependent problems have also been investigated widely. The main subjects in inverse problems are divided into three types: uniqueness, stability, and reconstruction. Above all, reconstruction is an important problem, and many methods have been developed, for example, the probe method, the no-response test, the linear sampling method, and the singular source method. These methods differ in their treatments and arguments, but it has been shown that they are essentially equivalent to each other mathematically (cf. Honda et al. 2008 and Nakamura, Potthast and Sini 2006; for the various methods see, e.g., Nakamura and Potthast 2015). As concerns reconstruction, the enclosure method differs from the above. It was introduced by Ikehata (1999, 2000) as a new method for the estimation of cavities in some elliptic boundary value problems. Nowadays many researchers employ the enclosure method for many problems, for example, time-dependent problems, nonlinear problems, and so on.
In the enclosure method, we introduce indicator functions with a large parameter. The indicator function is defined from the weak formulation of the boundary value problem, and we use a special solution with a parameter for the dual problem. From the analysis of the indicator function for a large parameter, in some cases we can obtain all the tangent planes of the cavities and hence their convex hull. This is the origin of the name “enclosure method.” The enclosure method is also useful for time-dependent problems. In Sect. 1.1, we begin with a survey of the studies on the enclosure method for the heat equation.
We explain how the problems are handled and what is obtained by the enclosure method for the heat equation. Ikehata also developed the enclosure method for various problems formulated by hyperbolic equations. Since his series of works is too large for the present authors to introduce in full, we just look back on the enclosure method for the wave equation concerning obstacle scattering in Sect. 1.2. After that, we explain the authors’ present interests in the enclosure method for obstacle scattering problems.
1.1 A Brief Review on the Enclosure Method for Heat Equations
The study of the enclosure method for time-dependent problems started from the IBP for the one-dimensional heat equation (Ikehata 2007) and was extended to the three-dimensional case by Ikehata and Kawashita (2009, 2010, 2014b). Here we restrict the problems to the detection of cavities and introduce the considerations in Ikehata and Kawashita (2010, 2014b) (Ikehata and Kawashita 2010 treats the inclusion case, but the cavity case can be treated in the same way).
Let $\Omega \subset \mathbb{R}^3$ be a set of background media and $D$ be the union of all cavities with $\overline{D} \subset \Omega$. We assume that the temperature distribution $u$ satisfies
$$
\begin{cases}
(\partial_t - \Delta)u(t,x) = 0 & \text{in } (0,T)\times(\Omega\setminus\overline{D}),\\
\nu_x\cdot\nabla_x u(t,x) = 0 & \text{on } (0,T)\times\partial D,\\
u(0,x) = 0 & \text{on } \Omega\setminus\overline{D},
\end{cases}
$$
where $\nu_x = {}^t((\nu_x)_1,(\nu_x)_2,(\nu_x)_3)$ is the unit outer normal of $\partial D$ from the $D$-side. Let $f(t,x)$ be a heat flow on the boundary $\partial\Omega$ as an input for this IBP. This is expressed by $\nu_x\cdot\nabla_x u(t,x) = f(t,x)$ on $(0,T)\times\partial\Omega$. Here we use the same notation $\nu_x$ to express the unit outer normal at $x\in\partial\Omega$ from the $\Omega$-side. For this heat flow, we measure the temperature $u(t,x)$ on $\partial\Omega$ from observation time $t=0$ to $t=T$. For the IBP for the heat equation, our problem is to estimate $D$ from the measurement data $(f,u)$. The indicator function $I_\tau$ ($\tau \gg 1$) is defined by using a solution $v$ of $(\Delta-\tau^2)v(x,\tau)=0$ in $\Omega$ as follows:
$$
I_\tau = \int_0^T\!\!\int_{\partial\Omega}\bigl(\nu_x\cdot\nabla_x\psi_\tau(t,x)\,u(t,x) - \psi_\tau(t,x)\,f(t,x)\bigr)\,dS_x\,dt,
$$
where $\psi_\tau(t,x) = e^{-\tau^2 t}v(x,\tau)$ is a solution of the dual problem $(\partial_t+\Delta)\psi_\tau = 0$. From the form of the indicator function we know that there exist constants $l>0$, $\alpha_1\le\alpha_2$ and $C, C'>0$ such that
$$
I_\tau = p(\tau)e^{-\tau l},\qquad C\tau^{\alpha_1}\le p(\tau)\le C'\tau^{\alpha_2}\quad(\tau\gg1), \tag{1}
$$
namely the indicator function decays exponentially. We can say that the enclosure method for time-dependent problems is an effective use of this property. If we obtain (1), it follows that
$$
\lim_{\tau\to\infty}e^{\tau T}I_\tau =
\begin{cases}
0 & (l>T),\\
\infty & (l<T),
\end{cases}
\qquad
\lim_{\tau\to\infty}\frac{1}{\tau}\log|I_\tau| = -l.
$$
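The logarithmic limit above suggests a simple numerical recipe for reading off $l$ from sampled values of the indicator function. The following is a minimal sketch under stated assumptions: the function names `estimate_l` and `synthetic_indicator`, the prefactor $p(\tau)=3\tau^{-2}$, and the value $l=1.5$ are all hypothetical and serve only to generate test data of the form (1); they are not taken from the works cited here.

```python
import math

def estimate_l(taus, indicator_values):
    """Least-squares slope of log|I_tau| against tau; -slope estimates l in (1)."""
    ys = [math.log(abs(v)) for v in indicator_values]
    n = len(taus)
    mean_t = sum(taus) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in taus)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(taus, ys))
    return -sxy / sxx

def synthetic_indicator(tau, l=1.5):
    """Hypothetical indicator of the form (1) with p(tau) = 3 * tau**-2."""
    return 3.0 * tau ** -2 * math.exp(-tau * l)

taus = [50.0 + 10.0 * k for k in range(16)]        # tau = 50, 60, ..., 200
values = [synthetic_indicator(t) for t in taus]

l_slope = estimate_l(taus, values)                  # slope-based estimate
l_ratio = -math.log(abs(values[-1])) / taus[-1]     # direct (1/tau) log|I_tau| at the largest tau
print(l_slope, l_ratio)
```

Both estimates approach $l$ only as $\tau\to\infty$, because the polynomial factor $p(\tau)$ contributes a correction of order $\log\tau/\tau$; with noisy data the usable range of $\tau$ is limited, which this sketch does not model.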
Here, it is important to know what kind of information on D is carried by l. From Ikehata and Kawashita (2009, 2010, 2014a), it is clear that l expresses the shortest length of the problem under consideration. Note that the shortest length changes depending on the function v chosen in the definition of Iτ. The ways to obtain the shortest length l are roughly classified into two categories: one uses an estimate of Iτ obtained from estimates of the solutions of the elliptic boundary value problem, which is called “the method of elliptic estimates” in this article (cf. Ikehata and Kawashita 2009, 2010), and the other uses an approximate solution for the
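boundary value problem, which is called “the method of approximation” (cf. Ikehata and Kawashita 2014b). Here we set
$$
w(x,\tau) = \int_0^T e^{-\tau^2 t}u(t,x)\,dt,\qquad g(x,\tau) = \int_0^T e^{-\tau^2 t}f(t,x)\,dt,
$$
where $w$ satisfies
$$
\begin{cases}
(\Delta-\tau^2)w(x,\tau) = 0 & \text{in } \Omega\setminus\overline{D},\\
\nu_x\cdot\nabla_x w(x,\tau) = g(x,\tau) & \text{on } \partial\Omega,\\
\nu_x\cdot\nabla_x w(x,\tau) = 0 & \text{on } \partial D.
\end{cases}
\tag{2}
$$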
In Ikehata and Kawashita (2010), $v$ is chosen as a solution of
$$
(\Delta-\tau^2)v(x,\tau) = 0 \ \text{ in } \Omega,\qquad \nu_x\cdot\nabla_x v(x,\tau) = g(x,\tau) \ \text{ on } \partial\Omega.
$$
Here we note that $\nu_x\cdot\nabla_x(w-v)=0$ on $\partial\Omega$. In this case, from the elliptic estimates (to the best of the authors’ knowledge, this method was introduced by Ikehata (1998)), we can show that there exists a constant $C>0$ such that
$$
\begin{aligned}
&C^{-1}\|\nabla_x v\|^2_{L^2(D)} + O(\tau^{-1}e^{-T\tau^2}) \le I_\tau \quad(\tau\gg1),\\
&I_\tau \le C\bigl(\|\nabla_x v\|^2_{L^2(D)} + \tau^2\|\nabla_x v\|^2_{L^2(D)}\bigr) + O(\tau^{-1}e^{-T\tau^2}) \quad(\tau\gg1).
\end{aligned}
\tag{3}
$$
To obtain (1) from (3), we need the asymptotic behavior of $v$ as $\tau\to\infty$. Here, using potential theory, from the expression of $v$ as an integral on $\partial\Omega$ we obtain
$$
v(x,\tau) = \frac{1}{2\pi}\int_{\partial\Omega}\frac{e^{-\tau|x-y|}}{|x-y|}\bigl(g(y,\tau)+O(\tau^{-1})\bigr)\,dS_y \quad(\tau\to\infty). \tag{4}
$$
From (4) and (3), it follows that $l$ in (1) is given by
$$
l = 2\inf_{x\in D,\,y\in\partial\Omega}|x-y| = 2\,\mathrm{dist}(D,\partial\Omega)
$$
(cf., e.g., Theorem 1.1 of Ikehata and Kawashita (2010) for inclusions). In Ikehata and Kawashita (2014b), choosing a point $p\in\mathbb{R}^3\setminus\overline{\Omega}$, we define $I_\tau$ by another choice, $v(x,\tau) = \dfrac{1}{2\pi}\dfrac{e^{-\tau|x-p|}}{|x-p|}$. In this case, since $\nu_x\cdot\nabla_x(w-v)\neq0$ on $\partial\Omega$, the method of elliptic estimates does not work well. Thus, by integration by parts, we have the expression of $I_\tau$:
$$
I_\tau = \frac{1}{2\pi}\int_{\partial D}\nu_x\cdot\nabla_x\Bigl(\frac{e^{-\tau|x-p|}}{|x-p|}\Bigr)w(x,\tau)\,dS_x.
$$
Here we construct an approximation of the solution $w$ of (2). If the heat flow is supplied on the whole of the boundaries $\partial\Omega$ and $\partial D$, we can construct an approximation of $w$ by the usual approach. However, as in (2), $w$ satisfies $\nu_x\cdot\nabla_x w(x,\tau)=0$ on $\partial D$. This means that the heat flow is provided only on the outside boundary $\partial\Omega$, and there is no heat flow on the inside boundary $\partial D$. Therefore, we cannot pick up the exponentially decaying term in (1) by the usual approximation method. In order to overcome this difficulty, we need the precise evaluation of the integral kernel obtained by Ikehata and Kawashita (2014a) when $D$ is strictly convex. Then, in Ikehata and Kawashita (2014b), another expression
$$
I_\tau = \int_{\partial D\times\partial\Omega} e^{-\tau(|x-p|+|x-y|)}G(x,y,\tau)\,dS_x\,dS_y \tag{5}
$$
with an integral kernel $G$ is obtained by using potential theory. From the form of the exponent in (5), we can expect that the shortest length $l$ in this case is given by
$$
l = \inf_{x\in D,\,y\in\partial\Omega}\bigl(|x-p|+|x-y|\bigr). \tag{6}
$$
To check this fact, we must face the problem of the sign of Iτ. Since both G(x0, y0, τ) > 0 and G(x0, y0, τ) < 0 can occur for points (x0, y0) attaining l, the situation is not so simple. But under some conditions, an estimate of the indicator function of the form (1) is obtained (cf. section 6 of Ikehata and Kawashita 2014b). Eventually, we know that (6) gives the shortest length l (cf. Theorems 1.4 and 2.1 of Ikehata and Kawashita 2014b; for the case that D consists of several convex cavities, see Kawashita 2017). As stated above, there is a close relation between the fact that l in (1) is the shortest length and the asymptotic behavior, with respect to the parameter, of the solution of the reduced boundary value problem obtained by the Laplace transform. In the 1960s, Varadhan (1967) already used the asymptotic behavior of the solution of the reduced problem to obtain the short-time asymptotics of the heat kernel. It is beyond our expectations that this idea reappears on the stage of inverse problems in the enclosure method. In Varadhan (1967), the heat flow is supplied on the whole of the boundary. In the above case, however, the heat flow is supplied only on the outside boundary and vanishes on the inside boundary. Even in such a case, we can obtain the asymptotic behavior of the solution of the reduced problem with a large parameter by the kernel estimate given in Ikehata and Kawashita (2014a) (cf. Ikehata and Kawashita 2018).
1.2 The Enclosure Method for Obstacle Scattering
Next, we proceed to discuss the enclosure method for the wave equation. Here, we consider exterior scattering problems by obstacles, which is a more intuitive setting for
wave phenomena. In the exterior scattering problem for wave equations, we estimate the shapes of obstacles by emitting waves from a domain outside the obstacles and observing the reflected waves somewhere. The “shortest length” of this problem (it may be regarded as the shortest time, in the sense of “optical length”) is given by the shortest path connecting three points: the emitting point of the incident wave, the hitting point on the boundary of the object, and the observation point of the reflected wave. Thus, the appearance of “the shortest length” is not peculiar to the IBP for heat equations. Even if we restrict the subject to the exterior scattering problem for a wave equation, Ikehata has carried out many studies in which the shortest length is obtained from the indicator function (for the case of the Neumann boundary condition, see Ikehata 2010; for the case of the Robin type boundary condition with dissipation terms, see Ikehata 2012; and for the case of the Dirichlet boundary condition and bistatic data (i.e., the place where incident waves are emitted is different from the place where reflected waves are measured), see Ikehata 2013). Thus, each of the Dirichlet boundary condition, the Neumann type boundary condition, and the dissipative boundary condition has been studied well. However, all of these cases satisfy the “monotonicity condition,” namely, when there are two or more cavities, every cavity gives the same sign for the indicator function. Moreover, the method of elliptic estimates plays an important role in those studies. When the Dirichlet boundary condition is satisfied on the boundaries of some cavities and the Neumann type boundary condition is satisfied on the boundaries of the others, the sign of the indicator function is not definite. Since this situation may cause cancellation, we need an additional consideration for this case. We call this situation “the combined case” hereafter. This is the main theme of this manuscript.
1.3 Problems for the Combined Case
Let us find the shortest length (or time) for time-dependent problems in the combined case. The method of elliptic estimates developed by Ikehata is effective in a certain combined case, namely when the shortest length is attained only by cavities giving the same sign for the indicator function. We call this case “the combined but separated case” (or “the separated case” for short). By the method of elliptic estimates we can obtain the same result as in the former works of Ikehata. In Sect. 2, the setting and results for the combined case are presented. On the other hand, if the case is not separated, we call it “the combined and non-separated case” (or “the non-separated case” for short). Since the non-separated situation can cause cancellation in the indicator function, it seems difficult to manage this case by the method of elliptic estimates alone. Therefore, we construct an approximate solution to obtain the behavior of the indicator function. In Sect. 2.2, we only state a result for the non-separated case without proof.
2 The Main Results for the Combined Case
Let $D^+$ and $D^-$ be disjoint unions of bounded open sets with $C^{0,1}$ (i.e., Lipschitz) boundaries $\partial D^+$ and $\partial D^-$. (The cavities $D^\pm$ need not be connected; for example, each may consist of a union of finitely many disjoint bounded connected sets.) We put $D = D^+\cup D^-$ and $\Omega = \mathbb{R}^3\setminus\overline{D}$. We denote by $\nu_x = {}^t((\nu_x)_1,(\nu_x)_2,(\nu_x)_3)$ the unit outer normal of $\partial D$ from the $D$-side. Hereafter, $\Omega$ always denotes the exterior domain $\Omega = \mathbb{R}^3\setminus\overline{D}$. This usage of the notation is different from that in Sect. 1.1. For the exterior domain $\Omega$, the unit outer normal vector of $\partial\Omega$ at $x\in\partial\Omega$ from the $\Omega$-side is given by $-\nu_x$, since $\partial\Omega = \partial D$ and $\Omega$ is the exterior domain of $D$. Take $T>0$, and consider the following “combined case” for the wave equation:
$$
\begin{cases}
(\partial_t^2-\gamma_0\Delta)u(t,x) = 0 & \text{in } (0,T)\times\Omega,\\
(\partial_{\nu_x,\gamma_0}-\lambda_1(x)\partial_t-\lambda_0(x))u(t,x) = 0 & \text{on } (0,T)\times\partial D^+,\\
u(t,x) = 0 & \text{on } (0,T)\times\partial D^-,\\
u(0,x) = 0,\ \ \partial_t u(0,x) = f(x) & \text{on } \Omega,
\end{cases}
\tag{7}
$$
where $\gamma_0>0$ is a constant, $\partial_{\nu_x,\gamma_0} = \gamma_0\nu_x\cdot\nabla_x$ is the conormal derivative associated with $\gamma_0$, and $\lambda_j\in L^\infty(\partial D^+)$ ($j=0,1$) are real-valued functions such that $\lambda_1(x)\ge0$ a.e. on $\partial D^+$. Thus, on $\partial D^+$ (resp. $\partial D^-$), the Neumann type (resp. Dirichlet) boundary condition is imposed. We try to clarify the relation between the dissipative coefficient $\lambda_1(x)$ and the speed $\sqrt{\gamma_0}>0$ of waves. Therefore, we do not set $\gamma_0=1$ in this paper. Take an open set $B$ with $\overline{B}\subset\Omega$, and take $f\in L^2(\mathbb{R}^3)\cap L^\infty(\mathbb{R}^3)$ satisfying
$$
f\in L^2(\Omega) \text{ with } \operatorname{supp}f=\overline{B}, \text{ and there exists a constant } c_1>0 \text{ such that } f(x)\ge c_1\ (x\in B) \text{ or } -f(x)\ge c_1\ (x\in B). \tag{8}
$$
This condition (8) ensures that waves are emitted exactly from $B$. Hence, we call (8) the emission condition from $B$. For this initial data $f$, we measure waves on $B$ from the initial time $0$ to $T$. Thus, the measurement is given by $u(t,x)$ for $0\le t\le T$ and $x\in B$. Hence, in this setting, the inverse problem to be considered is to find information on the cavities $D^\pm$ from this measurement.
We put $H^1_{0,\partial D^-}(\Omega) = \{u\in H^1(\Omega)\mid u|_{\partial D^-}=0 \text{ on } \partial D^-\}$, where $u|_{\partial D^-}$ is the trace on $\partial D^-$. In what follows, the usual trace operator is omitted when no confusion arises. We set $\langle\psi,\varphi\rangle = \langle\psi,\varphi\rangle_{(H^1(\Omega))'\times H^1(\Omega)}$. It is well known that for any $f\in L^2(\Omega)$, there exists a unique weak solution $u\in C^0([0,T];H^1_{0,\partial D^-}(\Omega))$ of (7) with $\partial_t u\in C([0,T];L^2(\Omega))$ and $\partial_t^2u\in L^2(0,T;(H^1(\Omega))')$ satisfying $u(0,x)=0$, $\partial_tu(0,x)=f(x)$, such that $\frac{d}{dt}\int_{\partial D^+}\lambda_1(x)u(t,x)\varphi(x)\,dS_x$ is well-defined and
$$
\langle\partial_t^2u(t,\cdot),\varphi\rangle + \int_\Omega\gamma_0\nabla_xu(t,x)\cdot\nabla_x\varphi(x)\,dx + \frac{d}{dt}\int_{\partial D^+}\lambda_1(x)u(t,x)\varphi(x)\,dS_x + \int_{\partial D^+}\lambda_0(x)u(t,x)\varphi(x)\,dS_x = 0 \quad\text{a.e. } t\in(0,T)
$$
for all $\varphi\in H^1_{0,\partial D^-}(\Omega)$ (cf., for example, Dautray and Lions 1992, Chap. 18, Sects. 5 and 6). We consider weak solutions in this class. As in the usual approach of the enclosure method developed by Ikehata (2010), Ikehata (2012) and Ikehata (2013), we introduce the indicator function $I_\tau$ defined by
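$$
I_\tau = \int_\Omega f(x)\bigl(w(x;\tau)-v(x;\tau)\bigr)\,dx, \tag{9}
$$
where
$$
w(x;\tau) = \int_0^T e^{-\tau t}u(t,x)\,dt \quad (x\in\Omega)
$$
and $v(\cdot;\tau)\in H^1(\mathbb{R}^3)$ is the weak solution of
$$
(\gamma_0\Delta-\tau^2)v(x;\tau) + f(x) = 0 \quad\text{in } \mathbb{R}^3 \tag{10}
$$
having the kernel representation
$$
v(x;\tau) = \int_\Omega\Phi_\tau(x,y)f(y)\,dy \quad\text{with}\quad \Phi_\tau(x,y) = \frac{1}{4\pi\gamma_0}\frac{e^{-\tau|x-y|/\sqrt{\gamma_0}}}{|x-y|}. \tag{11}
$$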
Notice that Iτ is obtained from the measurement u(t, x) for 0 ≤ t ≤ T and x ∈ B. Our purpose in this article is to obtain “the shortest length” in the combined case from the asymptotic behavior of the indicator function Iτ as τ → ∞.
2.1 A Result for the Combined but Separated Case
Assume that $D^+$ is divided into two parts $D^{+,+}$ and $D^{+,-}$ having the following properties:
$$
D^+ = D^{+,+}\cup D^{+,-},\quad \overline{D^{+,+}}\cap\overline{D^{+,-}}=\emptyset, \text{ and there exists a constant } \mu_1>0 \text{ such that } \mp(\lambda_1(x)-\sqrt{\gamma_0})>\mu_1 \text{ a.e. on } \partial D^{+,\pm}. \tag{12}
$$
We put
$$
l_0^- = \min\Bigl\{\frac{\mathrm{dist}(D^{+,-},B)}{\sqrt{\gamma_0}},\ \frac{\mathrm{dist}(D^-,B)}{\sqrt{\gamma_0}}\Bigr\},\qquad l_0^+ = \frac{\mathrm{dist}(D^{+,+},B)}{\sqrt{\gamma_0}}. \tag{13}
$$
In the above definitions, we use the convention $\mathrm{dist}(\emptyset,B)=\infty$, which means that we do not consider a distance if there is no corresponding set. For example, if $D$ does not have $D^{+,-}$ (i.e., $D^{+,-}=\emptyset$), we exclude $\mathrm{dist}(D^{+,-},B)$ in (13). We call the sets $D^-$ and $D^{+,-}$ the “minus group,” and the set $D^{+,+}$ the “plus group.” The heterogeneous parts $D^-$ and $D^{+,-}$ belonging to the minus group make $I_\tau$ negative as $\tau>1$ goes to infinity. Conversely, the sets in the plus group make $I_\tau$ positive. For the solution $v$ of (10), as in Kawashita and Kawashita (2021) for example, we have the following.
Proposition 1 Let $U$ and $B$ be bounded sets in $\mathbb{R}^3$ with $\overline{U}\cap\overline{B}=\emptyset$. Suppose that $U$ and $B$ have $C^{0,1}$ boundaries $\partial U$ and $\partial B$. Then, there exists a constant $C>0$ such that for any $\tau\ge1$,
$$
\begin{aligned}
C^{-1}\tau^{-7}e^{-\frac{2\tau}{\sqrt{\gamma_0}}\mathrm{dist}(U,B)} &\le \|\nabla_xv(\cdot;\tau)\|^2_{L^2(U)} \le C\tau^2e^{-\frac{2\tau}{\sqrt{\gamma_0}}\mathrm{dist}(U,B)} \quad(\tau\ge1),\\
C^{-1}\tau^{-7}e^{-\frac{2\tau}{\sqrt{\gamma_0}}\mathrm{dist}(U,B)} &\le \tau^2\|v(\cdot;\tau)\|^2_{L^2(U)} \le C\tau^2e^{-\frac{2\tau}{\sqrt{\gamma_0}}\mathrm{dist}(U,B)} \quad(\tau\ge1),
\end{aligned}
$$
where $v$ is the solution of (10) with $f$ satisfying (8).
We set
$$
\begin{aligned}
I^+(v) &= \|\nabla_xv\|^2_{L^2(D^{+,+})} + \tau^2\|v\|^2_{L^2(D^{+,+})},\\
I^-(v) &= \|\nabla_xv\|^2_{L^2(D^-)} + \tau^2\|v\|^2_{L^2(D^-)} + \|\nabla_xv\|^2_{L^2(D^{+,-})} + \tau^2\|v\|^2_{L^2(D^{+,-})}.
\end{aligned}
$$
If all of $D^{+,\pm}$, $D^-$ and $B$ have $C^{0,1}$ boundaries, from Proposition 1 we have
$$
C^{-1}\tau^{-7} \le e^{2\tau l_0^\pm}I^\pm(v) \le C\tau^2 \quad(\tau\ge1).
$$
Based on these estimates, we obtain the following information on cavities:
Theorem 1 Suppose that $\partial D^\pm$ are of $C^{0,1}$ class and $D^+$ satisfies (12). Then, the indicator function (9) satisfies the following properties:
(1) For $T<2\min\{l_0^+,l_0^-\}$, $\lim_{\tau\to\infty}e^{\tau T}I_\tau=0$.
(2) For $T>2\min\{l_0^+,l_0^-\}$, if $B$ is a convex set and (8) holds, then we have
(ii)$^+$ Under the condition $l_0^+<l_0^-$, if $\lambda_1(x)=0$ a.e. on $\partial D^{+,+}$, or $\partial D^{+,+}$ is of $C^1$ class, then we have $\lim_{\tau\to\infty}e^{\tau T}I_\tau=+\infty$.
(ii)$^-$ Under the condition $l_0^+>l_0^-$, if $\partial D^{+,-}$ is of $C^1$ class, then we have $\lim_{\tau\to\infty}e^{\tau T}I_\tau=-\infty$.
Further, in each of the cases (ii)$^+$ and (ii)$^-$, we also have
$$
\lim_{\tau\to\infty}\frac{1}{2\tau}\log|I_\tau| =
\begin{cases}
-l_0^+ & \text{for the case of (ii)}^+,\\
-l_0^- & \text{for the case of (ii)}^-.
\end{cases}
$$
Remark 1 In Theorem 1, there is no restriction on $B$ except $\overline{B}\subset\Omega$. Since incident signals are emitted from $B$ and reflected ones are caught on $B$, the set $B$, which is the place emitting and receiving signals, can be taken as small as we want.
To show Theorem 1, we can use the method of elliptic estimates. In the separated case, the indicator function $I_\tau$ has the following properties:
Proposition 2 There exist constants $C_j^\pm>0$, $C_j>0$, $\delta_j>0$ and $\tau_j\ge1$ ($j=1,2$) such that if $l_0^+<l_0^-$, we have
$$
I_\tau \ge C_1^+I^+(v) - C_1^-I^-(v) - C_1\bigl(\tau e^{-2\tau(l_0^++\delta_1)} + \tau^{-1}e^{-\tau T}\bigr) \quad(\tau\ge\tau_1), \tag{14}
$$
and if $l_0^+>l_0^-$, we have
$$
-I_\tau \ge C_2^-I^-(v) - C_2^+I^+(v) - C_2\bigl(\tau e^{-2\tau(l_0^-+\delta_2)} + \tau^{-1}e^{-\tau T}\bigr) \quad(\tau\ge\tau_2).
$$
We can see that Theorem 1 follows immediately from Propositions 1 and 2. Thus, the estimates in Proposition 2 are the key to showing Theorem 1. These estimates are basically shown by using the arguments of Ikehata (2010), Ikehata (2012) and Ikehata (2013). However, since in this case we have some boundary integral terms that may cancel out, we need to check that the boundary integrals corresponding to the longer length are negligible.
2.2 A Result for the Combined and Non-separated Case
If $l_0^+=l_0^-$, it seems difficult to handle the problem by the method of elliptic estimates. In this case, we use the solution of
$$
\begin{cases}
(\gamma_0\Delta-\tau^2)w_0(x;\tau) + f(x) = 0 & \text{in } \Omega,\\
B_{\gamma_0}w_0(x;\tau) = 0 & \text{on } \partial D^+,\\
w_0(x;\tau) = 0 & \text{on } \partial D^-,
\end{cases}
\tag{15}
$$
where $B_{\gamma_0}w_0 = \partial_{\nu_x,\gamma_0}w_0 - \lambda(x;\tau)w_0$ and $\lambda(x;\tau) = \lambda_1(x)\tau+\lambda_0(x)$. We change the form of $I_\tau$ as follows:
$$
I_\tau = I_{\tau,0} + O(\tau^{-1}e^{-\tau T}) \quad(\tau\to\infty),\qquad I_{\tau,0} = \int_\Omega f(w_0-v)\,dx. \tag{16}
$$
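The reason $w_0$ is a natural replacement for $w$ can be seen from a formal computation, sketched here under the assumption that $u$ is regular enough to integrate by parts in $t$ (the weak formulation makes this rigorous). Applying $\gamma_0\Delta-\tau^2$ to $w(x;\tau)=\int_0^Te^{-\tau t}u(t,x)\,dt$ and using (7) gives:

```latex
\begin{align*}
(\gamma_0\Delta-\tau^2)w(x;\tau)
  &= \int_0^T e^{-\tau t}\,\partial_t^2u(t,x)\,dt-\tau^2w(x;\tau)\\
  &= \Bigl[e^{-\tau t}\partial_tu\Bigr]_0^T
     +\tau\Bigl[e^{-\tau t}u\Bigr]_0^T+\tau^2w(x;\tau)-\tau^2w(x;\tau)\\
  &= e^{-\tau T}\bigl(\partial_tu(T,x)+\tau u(T,x)\bigr)-f(x)
  \qquad\text{in }\Omega,\\[4pt]
\partial_{\nu_x,\gamma_0}w-\bigl(\lambda_1\tau+\lambda_0\bigr)w
  &= \int_0^T e^{-\tau t}\bigl(\lambda_1\partial_tu+\lambda_0u\bigr)\,dt
     -\bigl(\lambda_1\tau+\lambda_0\bigr)w
   = \lambda_1(x)\,e^{-\tau T}u(T,x)
  \qquad\text{on }\partial D^+ .
\end{align*}
```

Here the initial conditions $u(0,x)=0$ and $\partial_tu(0,x)=f(x)$ were used. Thus $w$ satisfies (15) up to terms carrying the factor $e^{-\tau T}$, which is the source of the remainder $O(\tau^{-1}e^{-\tau T})$ in (16).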
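Here, we handle $I_{\tau,0}$ by constructing an approximation of $w_0$. Roughly speaking, $w_0(x;\tau)$ has an integral kernel of the form $A(x,y;\tau)e^{-\tau|x-y|/\sqrt{\gamma_0}}$ ($x\in\Omega$, $y\in B$) with a function $A(x,y;\tau)$ of polynomial order with respect to $\tau$, and we can express $I_{\tau,0}$ as
$$
I_{\tau,0} = \int_{B\times B}dy\,d\xi\int_{\partial D}\tilde A(x,y,\xi;\tau)\,e^{-\frac{\tau}{\sqrt{\gamma_0}}(|x-y|+|x-\xi|)}\,dS_x, \tag{17}
$$
where $\tilde A(x,y,\xi;\tau)$ is a function of polynomial order with respect to $\tau$. In this form, we use the usual Laplace method (for example Ikehata and Kawashita 2014b, Lemma 4.3). We set $l_0^{I} = \mathrm{dist}(\partial D^{+,+},B)$, $l_0^{II} = \mathrm{dist}(\partial D^{+,-},B)$ and $l_0^{III} = \mathrm{dist}(\partial D^-,B)$, and $l_0 = \min_{\alpha=I,II,III}l_0^\alpha$. The minimizing points attaining $\inf_{(x,y,\xi)\in\partial D\times B\times B}(|x-y|+|x-\xi|)$ contribute to the leading part of $I_{\tau,0}$. We introduce the sets of the minimizing points:
$$
\mathcal{M}_{I} = \{(x,y)\in\partial D^{+,+}\times\partial B \mid |x-y|=l_0^{I}\},\quad
\mathcal{M}_{II} = \{(x,y)\in\partial D^{+,-}\times\partial B \mid |x-y|=l_0^{II}\},\quad
\mathcal{M}_{III} = \{(x,y)\in\partial D^-\times\partial B \mid |x-y|=l_0^{III}\}.
$$
We denote by $\kappa_{1,I}(x)$ and $\kappa_{2,I}(x)$ the principal curvatures of $\partial D^{+,+}$ at $x\in\partial D^{+,+}$ with $\kappa_{1,I}(x)\le\kappa_{2,I}(x)$. We also write $\kappa_{j,II}(x)$ and $\kappa_{j,III}(x)$ ($j=1,2$) with $\kappa_{1,\alpha}(x)\le\kappa_{2,\alpha}(x)$ ($\alpha=II,III$) for the principal curvatures of $\partial D^{+,-}$ and $\partial D^-$, respectively. We also set
$$
A_\alpha(x) = \prod_{j=1}^{2}\Bigl(\kappa_{j,\alpha}(x)+\frac{1}{l_0^\alpha+a}\Bigr) \quad(\alpha=I,II,III).
$$
For simplicity, we take $B$ as a ball with radius $a>0$ for the non-separated case.
Theorem 2 Assume that $\partial D^+$ (resp. $\partial D^-$) is a $C^3$ (resp. $C^4$) boundary and satisfies
$$
\kappa_{1,\alpha}(x)+\frac{1}{l_0^\alpha+a}>0 \quad((x,y)\in\mathcal{M}_\alpha \text{ for some } y\in\partial B,\ \alpha=I,II,III). \tag{18}
$$
Then all of $\mathcal{M}_{I}$, $\mathcal{M}_{II}$ and $\mathcal{M}_{III}$ are finite sets, and if $\lambda_1\in C(\partial D^+)$ and $f\in C^1(\overline{B})$, there exists $\delta>0$ such that
$$
I_\tau = \frac{\pi\gamma_0}{\tau^4}\sum_{\alpha=I,II,III}\ \sum_{(x_0^\alpha,y_0^\alpha)\in\mathcal{M}_\alpha} e^{-\frac{\tau}{\sqrt{\gamma_0}}2l_0^\alpha}\Bigl(\frac{a^2(f(y_0^\alpha))^2}{2(l_0^\alpha+a)^2A_\alpha(x_0^\alpha)}\,b_\alpha(x_0^\alpha)+O(\tau^{-\frac12})\Bigr) + O\bigl(e^{-\frac{\tau}{\sqrt{\gamma_0}}(2l_0+\delta)}\bigr) + O(\tau^{-1}e^{-\tau T}) \quad(\tau\to\infty),
$$
where $b_\alpha(x_0^\alpha) = \dfrac{\sqrt{\gamma_0}-\lambda_1(x_0^\alpha)}{\sqrt{\gamma_0}+\lambda_1(x_0^\alpha)}$ for $\alpha=I,II$ and $b_{III}(x_0^{III})=-1$.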
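The sign structure in Theorem 2 can be explored numerically. The sketch below computes the coefficient $b_\alpha$, the relative Gauss curvature $A_\alpha$, and the bracketed leading coefficient summed over minimizing points. The formula for the leading coefficient follows the statement of Theorem 2 as read here; the function names, the dictionary layout, and all numerical values (curvatures, $l_0$, $a$) are hypothetical and chosen only so that condition (18) holds.

```python
import math

def reflection_coefficient(lambda1, gamma0):
    """b_alpha = (sqrt(gamma0) - lambda1) / (sqrt(gamma0) + lambda1) for alpha = I, II."""
    s = math.sqrt(gamma0)
    return (s - lambda1) / (s + lambda1)

def relative_gauss_curvature(kappa1, kappa2, l0, a):
    """A_alpha = product over j of (kappa_j + 1/(l0 + a))."""
    shift = 1.0 / (l0 + a)
    return (kappa1 + shift) * (kappa2 + shift)

def leading_coefficient(points, a):
    """Sum of a^2 f^2 b / (2 (l0 + a)^2 A) over the given minimizing points."""
    return sum(
        a ** 2 * p["f"] ** 2 * p["b"] / (2.0 * (p["l0"] + a) ** 2 * p["A"])
        for p in points
    )

# Hypothetical data: gamma0 = 1, lambda1 = 0 on the Neumann type cavity, f = 1, a = 1, l0 = 1.
gamma0, a, l0 = 1.0, 1.0, 1.0
point_I = {"f": 1.0, "l0": l0,
           "b": reflection_coefficient(0.0, gamma0),            # equals 1 when lambda1 = 0
           "A": relative_gauss_curvature(1.0, 1.0, l0, a)}      # Neumann type cavity
point_III = {"f": 1.0, "l0": l0,
             "b": -1.0,                                          # Dirichlet cavity
             "A": relative_gauss_curvature(0.5, 0.5, l0, a)}

coefficient = leading_coefficient([point_I, point_III], a)
print(coefficient)
```

With these sample values the Dirichlet cavity has the smaller relative Gauss curvature and the total coefficient is negative, so the two contributions do not cancel; equal values of $A_\alpha$ would make the leading coefficient vanish, which is the cancellation the non-separated case has to deal with.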
For $\alpha=I,II,III$, we set the ball $\tilde B_\alpha$ with radius $l_0^\alpha+a$ and the same center $z_0$ as $B$. Notice that the quantities appearing in (18) stand for the smallest relative curvature of $\partial D$ and $\tilde B_\alpha$ at the tangent point $x_0\in\partial D\cap\tilde B_\alpha$. Hence, $A_\alpha(x)$ are the relative Gauss curvatures. For example, in the case of $\alpha=I$, the condition (18) says that $D^{+,+}$ is outside the ball $\tilde B_I$ and $\partial D^{+,+}$ is tangent to $\tilde B_I$ (see, e.g., Fig. 1).
Fig. 1 The conditions for the curvatures (two panels, the case $\kappa_{1,I}>0$ and the case $\kappa_{1,I}<0$, each showing $D^{+,+}$, the ball $\tilde B_I$, the lengths $l_0^I$ and $a$, and $B$)
It seems hard to say which parts contribute mainly to the asymptotic behavior of $I_\tau$ given in Theorem 2. Consider the simplest case in which $\lambda_1(x)=0$ ($x\in\partial D^+$) and $l_0^{I}$ (resp. $l_0^{III}$) is attained by only one pair of points $(x_0^{I},y_0^{I})$ (resp. $(x_0^{III},y_0^{III})$). In this case, we have $D^{+,-}=\emptyset$, and the sets of minimizing points for $\alpha=I$ and $\alpha=III$ are the singletons $\{(x_0^{I},y_0^{I})\}$ and $\{(x_0^{III},y_0^{III})\}$. If $l_0^{I}=l_0^{III}$ $(=l_0)$ and $f(y)=1$ in $B$, Theorem 2 implies that the leading term of $I_\tau$ is given by
$$
\frac{\pi\gamma_0}{\tau^4}\,e^{-\frac{\tau}{\sqrt{\gamma_0}}2l_0}\,\frac{a^2}{2(l_0+a)^2}\Bigl(\frac{1}{A_{I}(x_0^{I})}-\frac{1}{A_{III}(x_0^{III})}\Bigr).
$$
Thus, the boundary whose relative Gauss curvature is smaller gives the dominant part of the asymptotic behavior of $I_\tau$ as $\tau\to\infty$.
Notes and Comments. In this article, we only announce Theorems 1 and 2. The details of the proofs will be given in forthcoming articles, which are in preparation. For Theorem 1, we can handle the combined cases of both cavities and inclusions.
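3 Outline of the Proofs
In the rest of this article, we give an outline of the proof of (14), which is used to obtain (ii)$^+$ of Theorem 1. In what follows, for simplicity, we assume that $\partial D$ is of $C^2$ class and $\lambda_1,\lambda_0\in C^1(\partial D^+)$, which gives $w(\cdot;\tau)\in H^2(\Omega)$ and $\partial_{\nu_x,\gamma_0}w\in H^{1/2}(\partial D)$ in the usual trace sense. For an open set $U\subset\mathbb{R}^3$, we put
$$
\|\psi\|_{H^1_\tau(U)} = \bigl\{\|\nabla_x\psi\|^2_{L^2(U)}+\tau^2\|\psi\|^2_{L^2(U)}\bigr\}^{1/2} \quad(\psi\in H^1(U),\ \tau\ge1)
$$
and
$$
\|g\|_{H^{1/2}_\tau(\partial U)} = \bigl\{\|g\|^2_{H^{1/2}(\partial U)}+\tau\|g\|^2_{L^2(\partial U)}\bigr\}^{1/2} \quad(g\in H^{1/2}(\partial U),\ \tau\ge1).
$$
We define $H^{-1/2}_\tau(\partial U)$ as the dual space of $H^{1/2}_\tau(\partial U)$ with the norm
$$
\|g\|_{H^{-1/2}_\tau(\partial U)} = \sup\bigl\{\langle g,\varphi\rangle_{\partial U} \mid \varphi\in H^{1/2}(\partial U),\ \|\varphi\|_{H^{1/2}_\tau(\partial U)}=1\bigr\}.
$$
Recall the usual trace estimate with a large parameter.
Proposition 3 Assume that $U\subset\mathbb{R}^3$ is a domain with compact $C^2$ boundary $\partial U$. Then, there exists a constant $C>0$ such that
$$
\sqrt{\tau}\,\|\varphi\|_{L^2(\partial U)} \le C\|\varphi\|_{H^1_\tau(U)} \quad(\tau\ge1,\ \varphi\in H^1(U)).
$$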
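For a domain $K$ and $\psi,\varphi\in H^1(K)$, we introduce
$$
\begin{aligned}
A_{K,\tau}(\psi,\varphi) &= \tau^2\langle\psi,\varphi\rangle_K + a_K(\psi,\varphi), &
a_K(\psi,\varphi) &= \int_K\gamma_0\nabla_x\psi(x)\cdot\nabla_x\varphi(x)\,dx,\\
\langle\psi,\varphi\rangle_K &= \int_K\psi(x)\varphi(x)\,dx, &
b^+_\lambda(\psi,\varphi) &= \int_{\partial D^+}\lambda(x;\tau)\psi(x)\varphi(x)\,dS_x,
\end{aligned}
$$
and $\langle\psi,\varphi\rangle_{\partial K} = \langle\psi,\varphi\rangle_{H^{-1/2}(\partial K)\times H^{1/2}(\partial K)}$. Notice that $\langle\psi,\varphi\rangle_K = \langle\psi,\varphi\rangle_{(H^1(K))'\times H^1(K)}$ for $\psi,\varphi\in L^2(K)$. We give a basic and well-known estimate for the weak solution $w_1(x;\tau)$ of
$$
\begin{cases}
(\gamma_0\Delta-\tau^2)w_1(x;\tau) = F_1(x;\tau) & \text{in } \Omega,\\
B_{\gamma_0}w_1(x;\tau) = G_1(x;\tau) & \text{on } \partial D^+,\\
w_1(x;\tau) = 0 & \text{on } \partial D^-.
\end{cases}
\tag{19}
$$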
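Proposition 4 For $F_1 \in L^2(\Omega)$ and $G_1 \in L^2(\partial D^+)$, the weak solution $w_1 \in H^1_{0,\partial D^-}(\Omega)$ of (19) satisfies
$$\|w_1\|^2_{H^1_\tau(\Omega)} \le C\tau^{-2}\{\|F_1\|^2_{L^2(\Omega)} + \tau\|G_1\|^2_{L^2(\partial D^+)}\} \quad (\tau \ge \tau_1),$$
where the constants $C > 0$ and $\tau_1 \ge 1$ depend only on $\partial\Omega$ and $\|\lambda_0\|_{L^\infty(\partial D^+)}$.

Proof From the definition of the weak solution of (19), it follows that
$$\min\{\gamma_0,1\}\|w_1\|^2_{H^1_\tau(\Omega)} \le A_{\Omega,\tau}(w_1,w_1) = -\langle F_1,w_1\rangle_\Omega - \int_{\partial D^+} G_1 w_1\,dS_x - \tau b^+_{\lambda_1}(w_1,w_1) - b^+_{\lambda_0}(w_1,w_1).$$
Since $\lambda_1(x) \ge 0$ a.e. on $\partial D^+$, we have $b^+_{\lambda_1}(w_1,w_1) \ge 0$. Proposition 3 yields that
$$-\int_{\partial D^+} G_1 w_1\,dS_x - b^+_{\lambda_0}(w_1,w_1) \le \frac{1}{4\delta\tau}\|G_1\|^2_{L^2(\partial D^+)} + C\Big(\delta + \frac{\|\lambda_0\|_{L^\infty(\partial D^+)}}{\tau}\Big)\|w_1\|^2_{H^1_\tau(\Omega)}.$$
Hence, if we take $\tau_1 = \max\{1, \delta^{-1}\|\lambda_0\|_{L^\infty(\partial D^+)}\}$, there exists a constant $\tilde{C} > 0$ such that
$$\min\{\gamma_0,1\}\|w_1\|^2_{H^1_\tau(\Omega)} \le \frac{1}{4\delta\tau^2}\{\|F_1\|^2_{L^2(\Omega)} + \tau\|G_1\|^2_{L^2(\partial D^+)}\} + \tilde{C}\delta\|w_1\|^2_{H^1_\tau(\Omega)} \quad \text{for } \tau \ge \tau_1.$$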
Taking $\delta = (2\tilde{C})^{-1}\min\{\gamma_0,1\} > 0$, we obtain Proposition 4.
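The bounds used in the proof of Proposition 4 can be traced back to the Cauchy–Schwarz and Young inequalities combined with Proposition 3. The following is a sketch of the intermediate steps, assuming that $\partial D^+ \subset \partial\Omega$ so that Proposition 3 applies with $U = \Omega$, and writing $C$ for the constant of Proposition 3 (the constants in the displayed estimates of the proof may be normalized differently):

```latex
\begin{align*}
-\int_{\partial D^+} G_1 w_1\, dS_x
  &\le \|G_1\|_{L^2(\partial D^+)}\,\|w_1\|_{L^2(\partial D^+)}
   \le \frac{1}{4\delta\tau}\|G_1\|^2_{L^2(\partial D^+)}
      + \delta\tau\,\|w_1\|^2_{L^2(\partial D^+)} \\
  &\le \frac{1}{4\delta\tau}\|G_1\|^2_{L^2(\partial D^+)}
      + C^2\delta\,\|w_1\|^2_{H^1_\tau(\Omega)}, \\
-b^+_{\lambda_0}(w_1,w_1)
  &\le \|\lambda_0\|_{L^\infty(\partial D^+)}\,\|w_1\|^2_{L^2(\partial D^+)}
   \le \frac{C^2\|\lambda_0\|_{L^\infty(\partial D^+)}}{\tau}\,
       \|w_1\|^2_{H^1_\tau(\Omega)}, \\
-\langle F_1, w_1\rangle_\Omega
  &\le \frac{1}{4\delta\tau^2}\|F_1\|^2_{L^2(\Omega)}
      + \delta\tau^2\|w_1\|^2_{L^2(\Omega)}
   \le \frac{1}{4\delta\tau^2}\|F_1\|^2_{L^2(\Omega)}
      + \delta\,\|w_1\|^2_{H^1_\tau(\Omega)}.
\end{align*}
```

For $\tau \ge \delta^{-1}\|\lambda_0\|_{L^\infty(\partial D^+)}$ the factor $\|\lambda_0\|_{L^\infty(\partial D^+)}/\tau$ is at most $\delta$, so all terms containing $\|w_1\|^2_{H^1_\tau(\Omega)}$ carry a factor proportional to $\delta$ and can be absorbed into the left-hand side once $\delta$ is small.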
Set $F_1(x;\tau) = e^{-\tau T}(\partial_t u(T,x) + \tau u(T,x))$ and $G_1(x;\tau) = \lambda_1(x)e^{-\tau T}u(T,x)$. Then, it follows from (10) and (15) that $w - w_0$ satisfies (19) for these $F_1$ and $G_1$. Hence, Proposition 4 implies (16), and it suffices to show (14) for $I_{\tau,0}$. We also need some basic and well-known elliptic estimates for the solutions of the inhomogeneous Dirichlet boundary value problem in $D^-$
$$\begin{cases} (\gamma_0\Delta - \tau^2)v^-(x;\tau) = 0 & \text{in } D^-,\\ v^-(x;\tau) = g^-(x) & \text{on } \partial D^- \end{cases} \tag{20}$$
and the inhomogeneous Neumann boundary value problem in $D^+$
$$\begin{cases} (\gamma_0\Delta - \tau^2)v^+(x;\tau) = 0 & \text{in } D^+,\\ B_{\gamma_0}v^+(x;\tau) = g^+(x) & \text{on } \partial D^+. \end{cases} \tag{21}$$
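Lemma 1 For any $\tau \ge 1$ and $g^- \in H^{1/2}(\partial D^-)$ (resp. $g^+ \in H^{-1/2}(\partial D^+)$), there exists a unique weak solution $v^-(\cdot;\tau) \in H^1(D^-)$ of (20) (resp. $v^+(\cdot;\tau) \in H^1(D^+)$ of (21)) satisfying
$$A_{D^\pm,\tau}(v^\pm,v^\pm) = \langle\partial_{\nu_x,\gamma_0}v^\pm(\cdot;\tau), v^\pm(\cdot;\tau)\rangle_{\partial D^\pm},$$
and there exists a constant $C > 0$ independent of $\tau$ such that for any $\tau \ge 1$,
$$C^{-1}\|g^\pm\|_{H^{\mp 1/2}_\tau(\partial D^\pm)} \le \|v^\pm\|_{H^1_\tau(D^\pm)} \le C\|g^\pm\|_{H^{\mp 1/2}_\tau(\partial D^\pm)}, \qquad \|B_{\gamma_0}v^+\|_{H^{-1/2}_\tau(\partial D^+)} \le C\|v^+\|_{H^1_\tau(D^+)}.$$

Since $f(w_0 - v) = w_0(\tau^2 - \gamma_0\Delta)v - v(\tau^2 - \gamma_0\Delta)w_0 = \mathrm{div}(\gamma_0 v\nabla_x w_0 - \gamma_0 w_0\nabla_x v)$ from (10) and (15), we have
$$I_{\tau,0} = \int_{\partial D}\{\partial_{\nu_x,\gamma_0}v\,w_0 - \partial_{\nu_x,\gamma_0}w_0\,v\}\,dS_x = \langle\partial_{\nu_x,\gamma_0}v, w_0\rangle_{\partial D^+} - b^+_\lambda(w_0,v) - \langle\partial_{\nu_x,\gamma_0}w_0, v\rangle_{\partial D^-}. \tag{22}$$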
To obtain estimates of $I_{\tau,0}$ by the method of elliptic estimates, we use another identity for some elliptic problem. We set $w_1 = w_0 - v$. Since $w_1$ satisfies
$$\begin{cases} (\gamma_0\Delta - \tau^2)w_1(x;\tau) = 0 & \text{in } \Omega,\\ B_{\gamma_0}w_1(x;\tau) = -B_{\gamma_0}v(x;\tau) & \text{on } \partial D^+,\\ w_1(x;\tau) = -v(x;\tau) & \text{on } \partial D^-, \end{cases} \tag{23}$$
integration by parts implies that
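$$A_{\Omega,\tau}(w_1,w_1) + b^+_\lambda(w_1,w_1) - \langle B_{\gamma_0}v, w_1\rangle_{\partial D^+} + \langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-} = 0. \tag{24}$$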
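Noting the boundary conditions in (15), we have
$$\langle\partial_{\nu_x,\gamma_0}w_0, v\rangle_{\partial D^-} = -\langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-} + \langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^-}.$$
From this identity, (22) and (24), we obtain
$$I_{\tau,0} = A_{\Omega,\tau}(w_1,w_1) + b^+_\lambda(w_1,w_1) + \langle B_{\gamma_0}v, v\rangle_{\partial D^+} + 2\langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-} - \langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^-}. \tag{25}$$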
From Lemma 1 and the definitions of $A_{\Omega,\tau}$ and $b^+_\lambda$, the first three terms on the right-hand side of (25) are positive for large $\tau \ge 1$. If $\partial D^- = \emptyset$, we obtain (14) immediately. This is the original version of the method of elliptic estimates. In our case, the fifth term $-\langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^-}$ is negative, which makes the sign of the indicator function indefinite. Moreover, we also need to handle the term $2\langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-}$. For this term, we give the following estimate:

Lemma 2 There exists a constant $C > 0$ such that for any $\varepsilon > 0$,
$$\langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-} \le \frac{C}{\varepsilon}I^-(v) + \varepsilon\langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^+}.$$
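Proof Consider the weak solution $\psi$ of the boundary value problem given by replacing $D^+$ with $D^-$ and taking $\lambda = 0$ and $g^+ = \partial_{\nu_x,\gamma_0}w_1$ in (21). Since $w_1 = -v$ on $\partial D^-$, for any $\varepsilon > 0$ we have
$$\langle\partial_{\nu_x,\gamma_0}w_1, w_1\rangle_{\partial D^-} = A_{D^-,\tau}(\psi,-v) \le \frac{C}{4\varepsilon}I^-(v) + \varepsilon A_{D^-,\tau}(\psi,\psi),$$
where we used the estimate $A_{D^-,\tau}(v,v) \le CI^-(v)$. Applying the argument developed in McLean (2000), Lemma 4.3 to (23), we can see that
$$\|\partial_{\nu_x,\gamma_0}w_1\|^2_{H^{-1/2}_\tau(\partial D^-)} \le C\{\|v\|^2_{H^{1/2}_\tau(\partial D^-)} + \|v\|^2_{H^1_\tau(D^+)}\}. \tag{26}$$
Thus, (26), the estimates and the equality given in Lemma 1 imply that
$$A_{D^-,\tau}(\psi,\psi) \le C\|\partial_{\nu_x,\gamma_0}w_1\|^2_{H^{-1/2}_\tau(\partial D^-)} \le C'\{A_{D^-,\tau}(v,v) + A_{D^+,\tau}(v,v)\} = C'\{A_{D^-,\tau}(v,v) + \langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^+}\},$$
where we used $\min\{\gamma_0,1\}\|\varphi\|^2_{H^1_\tau(D^\pm)} \le A_{D^\pm,\tau}(\varphi,\varphi) \le \max\{\gamma_0,1\}\|\varphi\|^2_{H^1_\tau(D^\pm)}$ ($\varphi \in H^1(D^\pm)$, $\tau \ge 1$). From these estimates, we get Lemma 2.

Since $\lambda(x,\tau) = \lambda_1(x)\tau + \lambda_0(x) \ge \lambda_0(x)$, Proposition 3 implies that $b^+_\lambda(w_1,w_1) \ge -\tilde{C}\tau^{-1}A_{\Omega,\tau}(w_1,w_1)$ for some fixed constant $\tilde{C} > 0$ independent of $\tau \ge 1$. From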
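this estimate, (25) and Lemma 2, it follows that there exist constants $C_1 > 0$ and $\tau_1 \ge 1$ such that
$$I_{\tau,0} \ge \int_{\partial D^+}\{(1-\varepsilon)\partial_{\nu_x,\gamma_0}v - \lambda v\}\,v\,dS_x - \frac{C_1}{\varepsilon}I^-(v) \quad (\tau \ge \tau_1,\ \varepsilon > 0). \tag{27}$$
We divide the boundary integral in (27) into two parts, over $\partial D^{+,+}$ and $\partial D^{+,-}$. From (12), we have $\sqrt{\gamma_0} - \mu_1 > \lambda_1(x) \ge 0$ on $\partial D^{+,+}$. If we set $\varepsilon_0 = \frac{\mu_1}{4\sqrt{\gamma_0}} < 1/4$ and $c_0 = \frac{1-3\varepsilon_0}{1-2\varepsilon_0} < 1$, then it follows that $\frac{\lambda(x;\tau)}{1-2\varepsilon_0} \le \sqrt{\gamma_0}c_0\tau$ for $\tau \ge \tau_1 = \max\{1, 4\|\lambda_0\|_{L^\infty(\partial D^{+,+})}/\mu_1\}$ and $x \in \partial D^{+,+}$. Then, taking $\varepsilon = \varepsilon_0$ in (27), we have
$$\langle(1-\varepsilon_0)\partial_{\nu_x,\gamma_0}v - \lambda v, v\rangle_{\partial D^{+,+}} \ge \varepsilon_0\langle\partial_{\nu_x,\gamma_0}v, v\rangle_{\partial D^{+,+}} + (1-2\varepsilon_0)\langle\partial_{\nu_x,\gamma_0}v - \sqrt{\gamma_0}c_0\tau v, v\rangle_{\partial D^{+,+}}. \tag{28}$$
By (11), the second term on the right-hand side of (28) is expressed by
$$\frac{\tau\sqrt{\gamma_0}}{(4\pi\gamma_0)^2}\int_{\partial D^{+,+}}\int_{B\times B} f(y)f(\xi)\,\frac{e^{-\frac{\tau}{\sqrt{\gamma_0}}(|x-y|+|x-\xi|)}}{|x-y||x-\xi|}\,\Phi^{+,+}(x,y,\tau)\,dy\,d\xi\,dS_x, \tag{29}$$
where $\Phi^{+,+}(x,y,\tau) = -\frac{\nu_x\cdot(x-y)}{|x-y|} - c_0 - \frac{\sqrt{\gamma_0}}{\tau}\frac{\nu_x\cdot(x-y)}{|x-y|^2}$. The leading part of (29) is given by the points attaining $\inf_{(x,y,\xi)\in B\times B\times\partial D}(|x-y|+|x-\xi|)$, which is the same as for (17). Since $-\frac{\nu_x\cdot(x-y)}{|x-y|} = 1$ on such points and $\partial D^{+,+}$ is $C^1$, we have $\Phi^{+,+} > 0$ in some neighborhood $\mathcal{U}$ of such points. Noting that there exists a constant $\delta_0 > 0$ such that $|x-y|+|x-\xi| \ge l_0^+ + \delta_0$ ($(x,y,\xi) \in B\times B\times\partial D\setminus\mathcal{U}$), we obtain
$$\langle\partial_{\nu_x,\gamma_0}v - \sqrt{\gamma_0}c_0\tau v, v\rangle_{\partial D^{+,+}} \ge -C\tau e^{-\frac{2\tau}{\sqrt{\gamma_0}}(l_0^+ + \delta_0)}.$$
For the integral on $\partial D^{+,-}$, from Lemma 1 and Proposition 3, we also have
$$\langle\partial_{\nu_x,\gamma_0}v - \sqrt{\gamma_0}c_0\tau v, v\rangle_{\partial D^{+,-}} \le C\|v\|^2_{H^1_\tau(D^{+,-})}.$$
From these estimates, (27), (28) and Lemma 1, we can conclude that (14) holds.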
References

Calderón AP (1980) On an inverse boundary value problem. In: Seminar on numerical analysis and its applications to continuum physics (Rio de Janeiro, 1980). Soc. Brasil. Mat., pp 65–73
Dautray R, Lions J-L (1992) Mathematical analysis and numerical methods for sciences and technology. In: Evolution problems I, vol 5. Springer, Berlin
Honda N, Potthast R, Nakamura G, Sini M (2008) The no-response approach and its relation to non-iterative methods for the inverse scattering. Ann Mat Pura Appl 187(1):7–37
Ikehata M (1998) Size estimation of inclusion. J Inv Ill-Posed Probl 6:127–140
Ikehata M (1999) Enclosing a polygonal cavity in a two-dimensional bounded domain from Cauchy data. Inverse Probl 15:1231–1241
Ikehata M (2000) Reconstruction of the support function for inclusion from boundary measurements. J Inv Ill-Posed Probl 8:367–378
Ikehata M (2007) Extracting discontinuity in a heat conductive body. One-space dimensional case. Appl Anal 86(8):963–1005
Ikehata M, Kawashita M (2009) The enclosure method for the heat equation. Inverse Probl 25:075005; 10 pp
Ikehata M (2010) The enclosure method for inverse obstacle scattering problems with dynamical data over a finite time interval. Inverse Probl 26:055010; 20 pp
Ikehata M, Kawashita M (2010) On the reconstruction of inclusions in a heat conductive body from dynamical boundary data over a finite time interval. Inverse Probl 26(9):095004; 15 pp
Ikehata M (2012) The enclosure method for inverse obstacle scattering problems with dynamical data over a finite time interval: II. Obstacles with a dissipative boundary or finite refractive index and back-scattering data. Inverse Probl 28:045010; 29 pp
Ikehata M (2013) The enclosure method for inverse obstacle scattering problems with dynamical data over a finite time interval: III. Sound-soft obstacle and bistatic data. Inverse Probl 29(8):085013; 35 pp
Ikehata M, Kawashita M (2014) Estimates of the integral kernels arising from inverse problems for a three-dimensional heat equation in thermal imaging. Kyoto J Math 54(1):1–50
Ikehata M, Kawashita M (2014) An inverse problem for a three-dimensional heat equation in thermal imaging and the enclosure method. Inverse Probl Imaging 8:1073–1116
Ikehata M, Kawashita M (2018) Asymptotic behavior of the solutions for the Laplace equation with a large spectral parameter and the inhomogeneous Robin type conditions. Osaka J Math 55:117–163
Kawashita M (2017) An inverse problem for a three-dimensional heat equation in bounded regions with several convex cavities. https://arxiv.org/abs/1709.00165
Kawashita M, Kawashita W (2021) In: Watanabe T (ed) Spectral and scattering theory and related topics, RIMS Kôkyûroku No. 2195, pp 42–63
McLean W (2000) Strongly elliptic systems and boundary integral equations. Cambridge University Press, Cambridge
Nakamura G, Potthast R (2015) Inverse modeling. An introduction to the theory and methods of inverse problems and data assimilation. In: IOP expanding physics. IOP Publishing, Bristol
Nakamura G, Potthast R, Sini M (2006) Unification of the probe and singular sources methods for the inverse boundary value problem by the no-response test. Commun Partial Differ Equs 31(10–12):1505–1528
Varadhan SRS (1967) On the behavior of the fundamental solution of the heat equation with variable coefficients. Commun Pure Appl Math 20:431–455
Algebraic Reconstruction of a Dipolar Wave Source from Observations on Several Points

Takashi Ohe and Misa Yokoyama

Abstract A source reconstruction problem in wave propagation is considered. We assume that the wave source is modeled by a single dipole source whose position is fixed and whose dipole moment varies in time. We suppose that both the position and the dipole moment are unknown, and consider the problem of identifying the unknown position and moment of the dipole source from observations of the wave field at several points. From the explicit expression of the wave field, we derive some algebraic relations between the observations and the unknown parameters. Based on these relations, we propose a procedure for the reconstruction of the unknowns from observations at four well-arranged points. The proposed procedure is examined by some numerical experiments.

Keywords Inverse source problem · Wave equation · Dipole source model · Four well-arranged observation points · Reconstruction procedure
1 Introduction Inverse source problems appear in many science and engineering fields, for example, detection of pollution sources in environment sciences, estimation of the epicenter in seismic sciences, medical imaging, and so on (Ammari et al. 2022; Andrle and El Badia 2012; Isakov 2017). Our interest in this paper is a reconstruction of the unknown source in wave propagation. Such a problem is usually formulated as an inverse source problem for the wave equations and discussed in many papers (Ando et al. 2013; El Badia and Ha-Duong 2001; Komornik and Yamamoto 2005).
T. Ohe (B) Okayama University of Science, Okayama 700-0005, Japan e-mail: [email protected] M. Yokoyama Graduate School of Science, Okayama University of Science, Okayama 700-0005, Japan SUS Co.Ltd, Kyoto 600-8008, Japan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 T. Takiguchi et al. (eds.), Practical Inverse Problems and Their Prospects, Mathematics for Industry 37, https://doi.org/10.1007/978-981-99-2408-0_15
In most of the research on inverse source problems for partial differential equations, the observation is assumed to be given on the whole or some part of the boundary of the domain of interest, that is, the observation data is given at infinitely many points. However, in many practical problems, observation is allowed only at a small number of points. Several researchers consider the identification problem of sources under such restricted situations (Al Jebawy et al. 2022; Bruckner and Yamamoto 2000; Inui and Ohnaka 2008; Nakaguchi et al. 2012; Rashedi and Sini 2015). In Inui and Ohnaka (2008), Inui and Ohnaka consider the case where the unknown source is expressed by a single point source whose position is fixed in space but whose intensity varies in time, and give a reconstruction procedure for the position and intensity of the unknown point source from observations of the wave field at only one point. This result is extended by Nakaguchi et al. to a single moving point source (Nakaguchi et al. 2012). Al Jebawy et al. take another approach for a single moving point source whose intensity is a given constant, and give an identification procedure and Lipschitz stability of the identification results (Al Jebawy et al. 2022). This paper builds on the result in Inui and Ohnaka (2008), but considers the case where the source term is expressed by the dipole model. The dipole model is one of the important source models in electromagnetism and acoustics, for example, the dipole antenna and the loudspeaker. The most important difference from the point source is that the wave field generated by the dipole source does not propagate spherically but propagates bi-directionally. This directionality depends on the dipole moment of the source model. In this paper, we suppose that a single dipole source is placed in $\mathbb{R}^3$, and that the position and moment of the dipole are unknown. Then, we consider the reconstruction of these unknowns from observations of the wave field at several points.
The contents of this paper are as follows. Section 2 gives a formulation of our inverse problem, and shows some related results for our problem. In Sect. 3, we describe the process of our reconstruction procedure. We give some numerical experiments to show the effectiveness of our procedure in Sect. 4.
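2 Problem Formulation

Let $u$ be the solution to the initial value problem for the three-dimensional wave equation
$$\begin{cases} \dfrac{1}{c^2}\partial_t^2 u(t,\mathbf{r}) - \Delta u(t,\mathbf{r}) = -\mathbf{m}(t)\cdot\nabla_{\mathbf{r}}\delta(\mathbf{r}-\mathbf{p}), & \text{in } (0,T)\times\mathbb{R}^3,\\ u(0,\mathbf{r}) = 0, & \text{in } \mathbb{R}^3,\\ \partial_t u(0,\mathbf{r}) = 0, & \text{in } \mathbb{R}^3, \end{cases} \tag{1}$$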
where T > 0 is a fixed time, c > 0 is the speed of the wave propagation, m ∈ C 3 ((0, T ); R3 ) denotes the moment of the dipole source, and p ∈ R3 denotes the position of the dipole source, and δ(·) denotes the three-dimensional Dirac’s delta function. The solution u of (1) has an explicit expression given by
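$$u(t,\mathbf{r}) = \frac{\mu(t_d(t;\mathbf{r},\mathbf{p});\mathbf{r},\mathbf{p})\,Y(t-|\mathbf{r}-\mathbf{p}|/c)}{4\pi|\mathbf{r}-\mathbf{p}|^2} + \frac{\dot{\mu}(t_d(t;\mathbf{r},\mathbf{p});\mathbf{r},\mathbf{p})\,Y(t-|\mathbf{r}-\mathbf{p}|/c)}{4\pi c|\mathbf{r}-\mathbf{p}|}, \tag{2}$$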
where
$$\mu(t;\mathbf{r},\mathbf{p}) = \mathbf{m}(t)\cdot\mathbf{e}_{\mathbf{r},\mathbf{p}}, \quad \mathbf{e}_{\mathbf{r},\mathbf{p}} = \frac{\mathbf{r}-\mathbf{p}}{|\mathbf{r}-\mathbf{p}|}, \quad t_d(t;\mathbf{r},\mathbf{p}) = t - \frac{|\mathbf{r}-\mathbf{p}|}{c}, \quad \dot{\mu}(t;\mathbf{r},\mathbf{p}) = \partial_t\mu(t;\mathbf{r},\mathbf{p}) = d_t\mathbf{m}(t)\cdot\mathbf{e}_{\mathbf{r},\mathbf{p}},$$
and $Y(\cdot)$ is the Heaviside function (Jackson 1999). This expression is easily obtained by computing the directional derivative $\mathbf{m}(t_d(t;\mathbf{r},\mathbf{p}))\cdot\nabla$ of the fundamental solution of the wave equation. We note that $u \in C([0,T]; L^1_{loc}(\mathbb{R}^3)) \cap C^2([0,T]\times\mathbb{R}^3\setminus B_\varepsilon(\mathbf{p}))$ for any $\varepsilon > 0$, where $B_\varepsilon(\mathbf{p})$ denotes the open ball with center $\mathbf{p}$ and radius $\varepsilon$.

Now, we formulate our identification problem. Suppose that the position $\mathbf{p}$ and the moment $\mathbf{m}(t)$ of the dipole source are unknown. We consider the problem of reconstructing the unknown parameters $\mathbf{p}$ and $\mathbf{m}(t)$ from information on $u$ and its derivatives $\partial_t u$, $\partial_t^2 u$, $\nabla u$, $\partial_t\nabla u$, $H(u)$ at four well-arranged points $\mathbf{r}_j \in \mathbb{R}^3$, $j = 1, 2, 3, 4$, where $H(u)$ denotes the Hessian of $u$. In particular, for the space derivatives of $u$, we assume that only one component in a specific direction is available at each observation point, that is, we can observe $\mathbf{d}\cdot\nabla u$, $\mathbf{d}\cdot\partial_t\nabla u$, and $\mathbf{d}^T H(u)\mathbf{d}$ for the given direction $\mathbf{d} \in S^2$. Here, we say that four points $\mathbf{r}_j \in \mathbb{R}^3$, $j = 1, 2, 3, 4$ are well arranged if the three vectors $\mathbf{r}_1 - \mathbf{r}_j$, $j = 2, 3, 4$ are linearly independent. Also, we assume that the time $T$ satisfies the condition
$$T > \max_{j=1,2,3,4}\frac{|\mathbf{r}_j - \mathbf{p}|}{c},$$
since the wave field propagates with speed $c$.

For the case where the wave source is described by the point source model, Inui and Ohnaka show that the wave source can be uniquely identified from the measurement of $u$, $\partial_t u$, and $\nabla u$ at a single point when the position of the source is fixed in time (Inui and Ohnaka 2008). Nakaguchi et al. extend this result to moving point sources, and propose an identification procedure based on the measurement of $u$, $\partial_t u$, $\nabla u$, $\partial_t\nabla u$, and $H(u)$ (Nakaguchi et al. 2012). Al Jebawy et al. consider the identification problem of a moving point source with constant magnitude and give a reconstruction procedure from the measurement of $u$ at six different points. They also show a Lipschitz stability estimate for the reconstruction result (Al Jebawy et al. 2022). In this paper, we take a similar approach as Inui and Ohnaka (2008),
and consider the problem for the case that the wave source is described by the dipole source model.
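The explicit expression (2) is straightforward to evaluate numerically, which is useful for generating synthetic observations. The following is a minimal sketch, assuming NumPy; the function name `dipole_field`, the callable interface for the moment and its derivative, and the sample moment `m_demo` are illustrative choices and do not come from the paper.

```python
import numpy as np

def dipole_field(t, r, p, m, dm, c=1.0):
    """Evaluate u(t, r) from the explicit expression (2).

    m and dm are callables returning the moment m(t) and its time
    derivative as length-3 arrays (a hypothetical interface).
    """
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    dist = np.linalg.norm(r - p)      # |r - p|
    e = (r - p) / dist                # unit vector e_{r,p}
    t_d = t - dist / c                # delayed time t_d(t; r, p)
    if t_d < 0.0:                     # Heaviside factor Y(t - |r - p|/c)
        return 0.0
    mu = np.dot(m(t_d), e)            # mu(t_d; r, p)
    dmu = np.dot(dm(t_d), e)          # time derivative of mu at t_d
    return mu / (4.0 * np.pi * dist**2) + dmu / (4.0 * np.pi * c * dist)

# Illustrative moment along the x-axis with m(0) = 0 and m'(0) = 0.
def m_demo(t):
    return np.array([(1.0 - np.cos(t))**2, 0.0, 0.0])

def dm_demo(t):
    return np.array([2.0 * (1.0 - np.cos(t)) * np.sin(t), 0.0, 0.0])
```

With this sample moment the field vanishes on the line through the source that is orthogonal to the moment, and it changes sign under reflection through the source, which reflects the bi-directional propagation mentioned in the Introduction.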
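3 Reconstruction of Unknown Dipole Source

3.1 Identification of $|\mathbf{r}_j - \mathbf{p}|$ and $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$

In our reconstruction procedure, we do not reconstruct the unknown parameters directly from observation, but identify the distance $|\mathbf{r}_j - \mathbf{p}|$ between the dipole source and each observation point and the $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$-component $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$ of the unknown moment $\mathbf{m}(t)$ at each observation point $\mathbf{r}_j$. Using these parameters for four well-arranged points $\mathbf{r}_j$, $j = 1, 2, 3, 4$, we can uniquely reconstruct the unknown parameters $\mathbf{p}$ and $\mathbf{m}(t)$, as we will show in Sect. 3.2. As observations of $u(t,\mathbf{r})$ at the points $\mathbf{r}_j$, $j = 1, 2, 3, 4$, we assume that
• $u(t,\mathbf{r}_j)$, $\partial_t u(t,\mathbf{r}_j)$ and $\partial_t^2 u(t,\mathbf{r}_j)$,
• $\mathbf{d}_j\cdot\nabla u(t,\mathbf{r}_j)$ and $\mathbf{d}_j\cdot\partial_t(\nabla u(t,\mathbf{r}_j))$,
• $\mathbf{d}_j^T H(u)(t,\mathbf{r}_j)\mathbf{d}_j$,
are available, where $\mathbf{d}_j \in S^2$ denotes the given specific direction of the observation device at the observation point $\mathbf{r}_j$. Using the explicit expression (2) of $u(t,\mathbf{r})$, we derive the following expressions for the observations for $t \ge |\mathbf{r}_j - \mathbf{p}|/c$:
$$u(t,\mathbf{r}_j) = \frac{\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c|\mathbf{r}_j-\mathbf{p}|}, \tag{3}$$
$$\partial_t u(t,\mathbf{r}_j) = \frac{\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c|\mathbf{r}_j-\mathbf{p}|}, \tag{4}$$
$$\partial_t^2 u(t,\mathbf{r}_j) = \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\dddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c|\mathbf{r}_j-\mathbf{p}|}, \tag{5}$$
$$\mathbf{d}_j\cdot\nabla u(t,\mathbf{r}_j) = -\left(\frac{\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi|\mathbf{r}_j-\mathbf{p}|^3} + \frac{\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi c|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c^2|\mathbf{r}_j-\mathbf{p}|}\right)\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}\cdot\mathbf{d}_j, \tag{6}$$
$$\mathbf{d}_j\cdot\partial_t\nabla u(t,\mathbf{r}_j) = -\left(\frac{\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi|\mathbf{r}_j-\mathbf{p}|^3} + \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi c|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\dddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c^2|\mathbf{r}_j-\mathbf{p}|}\right)\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}\cdot\mathbf{d}_j, \tag{7}$$
$$\begin{aligned}\mathbf{d}_j^T H(u)(t,\mathbf{r}_j)\mathbf{d}_j = {} & \left(\frac{2\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{\pi|\mathbf{r}_j-\mathbf{p}|^4} + \frac{2\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{\pi c|\mathbf{r}_j-\mathbf{p}|^3} + \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{\pi c^2|\mathbf{r}_j-\mathbf{p}|^2} + \frac{\dddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c^3|\mathbf{r}_j-\mathbf{p}|}\right)(\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}\cdot\mathbf{d}_j)^2\\ & - \left(\frac{\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi|\mathbf{r}_j-\mathbf{p}|^4} + \frac{\dot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{2\pi c|\mathbf{r}_j-\mathbf{p}|^3} + \frac{\ddot{\mu}(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{4\pi c^2|\mathbf{r}_j-\mathbf{p}|^2}\right).\end{aligned} \tag{8}$$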
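To simplify equations (3)–(8), we introduce the following new variables:
$$y_{k,j} \equiv \frac{\partial_t^{k-1}\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})}{|\mathbf{r}_j-\mathbf{p}|},\ k = 1, 2, 3, 4, \qquad \rho_j \equiv \frac{c}{|\mathbf{r}_j-\mathbf{p}|}, \qquad \xi_j \equiv \mathbf{e}_{\mathbf{r}_j,\mathbf{p}}\cdot\mathbf{d}_j,$$
$$z_{1,j} \equiv 4\pi c\,u(t,\mathbf{r}_j), \quad z_{2,j} \equiv 4\pi c\,\partial_t u(t,\mathbf{r}_j), \quad z_{3,j} \equiv 4\pi c\,\partial_t^2 u(t,\mathbf{r}_j),$$
$$z_{4,j} \equiv -4\pi c^2\,\mathbf{d}_j\cdot\nabla u(t,\mathbf{r}_j), \quad z_{5,j} \equiv -4\pi c^2\,\mathbf{d}_j\cdot\nabla\partial_t u(t,\mathbf{r}_j), \quad z_{6,j} \equiv 4\pi c^3\,\mathbf{d}_j^T H(u)(t,\mathbf{r}_j)\mathbf{d}_j.$$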
Note that $y_{k,j}$, $k = 1, 2, 3, 4$, $\rho_j$ and $\xi_j$ are unknown, and $z_{k,j}$, $k = 1, 2, \ldots, 6$ are known. Using these new variables, we derive the following simplified equations from (3)–(8):
$$z_{1,j} = \rho_j y_{1,j} + y_{2,j}, \tag{9}$$
$$z_{2,j} = \rho_j y_{2,j} + y_{3,j}, \tag{10}$$
$$z_{3,j} = \rho_j y_{3,j} + y_{4,j}, \tag{11}$$
$$z_{4,j} = (2\rho_j^2 y_{1,j} + 2\rho_j y_{2,j} + y_{3,j})\xi_j, \tag{12}$$
$$z_{5,j} = (2\rho_j^2 y_{2,j} + 2\rho_j y_{3,j} + y_{4,j})\xi_j, \tag{13}$$
$$z_{6,j} = (8\rho_j^3 y_{1,j} + 8\rho_j^2 y_{2,j} + 4\rho_j y_{3,j} + y_{4,j})\xi_j^2 - (2\rho_j^3 y_{1,j} + 2\rho_j^2 y_{2,j} + \rho_j y_{3,j}). \tag{14}$$
Substituting (9)–(12) into (13) and (14), we can eliminate $y_{k,j}$, $k = 1, 2, 3, 4$, and derive the system of two algebraic equations for the unknowns $\rho_j$ and $\xi_j$:
$$2z_{1,j}\rho_j^2 + (2z_{2,j} - z_{4,j}/\xi_j)\rho_j + (z_{3,j} - z_{5,j}/\xi_j) = 0, \tag{15}$$
$$2z_{1,j}\xi_j^2\rho_j^2 + z_{4,j}(3\xi_j - 1/\xi_j)\rho_j + (\xi_j^2 z_{3,j} - z_{6,j}) = 0. \tag{16}$$
The system of equations (15) and (16) is complicated and difficult to solve directly. However, if $\xi_j$ takes its exact value, then equations (15) and (16), regarded as quadratic equations in $\rho_j$, have a common solution. Hence, we may determine $\xi_j$ such that equations (15) and (16) have a common solution. We describe a method to determine $\xi_j$ based on this idea in Sect. 3.3. Once $\xi_j$ is identified, we can identify $\rho_j$ as the common solution of equations (15) and (16). Substituting the identified values of $\rho_j$ and $\xi_j$ into (9), (10), and (12), we derive a system of linear equations for $y_{k,j}$, $k = 1, 2, 3$. This linear system is non-singular, and we can easily solve it for $y_{1,j}$. Finally, we identify $|\mathbf{r}_j - \mathbf{p}| = c/\rho_j$ and $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p}) = cy_{1,j}/\rho_j$.

Note. To the best of the authors' knowledge, the uniqueness of the solution of the system of equations (9)–(14) has not been established yet, while the numerical results in Sect. 4 suggest uniqueness.
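The algebraic relations of this subsection can be checked numerically on synthetic values. The sketch below, assuming NumPy, builds $z_{1,j},\ldots,z_{6,j}$ from chosen values of $y_{k,j}$, $\rho_j$, $\xi_j$ via (9)–(14), evaluates the left-hand sides of (15) and (16), and solves the linear system formed by (9), (10), and (12). The function names are hypothetical and the index $j$ is dropped.

```python
import numpy as np

def forward_relations(y, rho, xi):
    """Compute z_1..z_6 from y_1..y_4, rho and xi using (9)-(14)."""
    y1, y2, y3, y4 = y
    z1 = rho * y1 + y2
    z2 = rho * y2 + y3
    z3 = rho * y3 + y4
    z4 = (2 * rho**2 * y1 + 2 * rho * y2 + y3) * xi
    z5 = (2 * rho**2 * y2 + 2 * rho * y3 + y4) * xi
    z6 = ((8 * rho**3 * y1 + 8 * rho**2 * y2 + 4 * rho * y3 + y4) * xi**2
          - (2 * rho**3 * y1 + 2 * rho**2 * y2 + rho * y3))
    return np.array([z1, z2, z3, z4, z5, z6])

def residuals_15_16(z, rho, xi):
    """Left-hand sides of (15) and (16); both vanish for consistent data."""
    z1, z2, z3, z4, z5, z6 = z
    r15 = 2 * z1 * rho**2 + (2 * z2 - z4 / xi) * rho + (z3 - z5 / xi)
    r16 = (2 * z1 * xi**2 * rho**2 + z4 * (3 * xi - 1 / xi) * rho
           + (xi**2 * z3 - z6))
    return r15, r16

def solve_y123(z, rho, xi):
    """Solve the linear system (9), (10), (12) for y_1, y_2, y_3."""
    A = np.array([[rho, 1.0, 0.0],
                  [0.0, rho, 1.0],
                  [2 * rho**2 * xi, 2 * rho * xi, xi]])
    b = np.array([z[0], z[1], z[3]])
    return np.linalg.solve(A, b)
```

The coefficient matrix in `solve_y123` has determinant $\rho^2\xi$, so the system is non-singular whenever $\xi \neq 0$, that is, whenever the observation direction is not orthogonal to $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$.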
3.2 Reconstruction of $\mathbf{p}$ and $\mathbf{m}(t)$ from $|\mathbf{r}_j - \mathbf{p}|$ and $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$

In the previous subsection, we identified the distance $|\mathbf{r}_j - \mathbf{p}|$ between the position $\mathbf{p}$ of the dipole source and the observation point $\mathbf{r}_j$, and the $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$-component $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$ of the moment $\mathbf{m}(t_d(t;\mathbf{r}_j,\mathbf{p}))$. Let us choose four well-arranged observation points $\mathbf{r}_j \in \mathbb{R}^3$, $j = 1, 2, 3, 4$. Then, one can uniquely determine $\mathbf{p}$ as the intersection point of the four spheres whose centers and radii are $\mathbf{r}_j$ and $c/\rho_j$, $j = 1, 2, 3, 4$, respectively, if the identified $\rho_j$, $j = 1, 2, 3, 4$ are exact. Once the position of the dipole is identified, we can estimate the time delay between the position of the dipole source and each observation point by $|\mathbf{r}_j - \mathbf{p}|/c$. Taking these time delays into consideration, we identify $\mu(t;\mathbf{r}_j,\mathbf{p}) = \mathbf{m}(t)\cdot\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$. From the assumption on $\mathbf{r}_j$, there exist three linearly independent vectors among the four vectors $\mathbf{r}_j - \mathbf{p}$, $j = 1, 2, 3, 4$. Hence, we can uniquely identify the vector $\mathbf{m}(t)$ from $\mu(t;\mathbf{r}_j,\mathbf{p})$. Note that the time interval of the reconstructed moment is $0 \le t \le T - \tau_d$, where $\tau_d = \max_{j=1,2,3,4}|\mathbf{r}_j - \mathbf{p}|/c$, due to the time delay between the position of the source and the observation points. Finally, we summarize the procedure for our reconstruction method.

Reconstruction procedure
Step 1. Choose four well-arranged observation points $\{\mathbf{r}_j\}_{j=1,2,3,4} \subset \mathbb{R}^3$, and fix $\mathbf{d}_j \in S^2$.
Step 2. Give observations of the wave field $u(t,\mathbf{r}_j)$, $\partial_t u(t,\mathbf{r}_j)$, $\partial_t^2 u(t,\mathbf{r}_j)$, $\mathbf{d}_j\cdot\nabla u(t,\mathbf{r}_j)$, $\mathbf{d}_j\cdot\partial_t(\nabla u(t,\mathbf{r}_j))$, and $\mathbf{d}_j^T H(u)(t,\mathbf{r}_j)\mathbf{d}_j$ for $0 \le t \le T$ at each observation point $\mathbf{r}_j$.
Step 3. At each observation point $\mathbf{r}_j$, estimate $\xi_j$ such that the two equations (15) and (16) have a common solution, and estimate $\rho_j$ as this common solution of (15) and (16).
Step 4. Reconstruct the position $\mathbf{p}$ which satisfies the system of equations $|\mathbf{r}_j - \mathbf{p}| = c/\rho_j$, $j = 1, 2, 3, 4$.
Step 5. At each observation point $\mathbf{r}_j$, solve the system of linear equations (9), (10), and (12), and obtain $y_{1,j}$. Identify $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p}) = y_{1,j}|\mathbf{r}_j - \mathbf{p}|$.
Step 6. Taking the time delay into consideration, estimate $\mu(t;\mathbf{r}_j,\mathbf{p})$ for $0 \le t \le T - \tau_d$.
Step 7. Solve the system of linear equations $\mathbf{m}(t)\cdot\mathbf{e}_{\mathbf{r}_j,\mathbf{p}} = \mu(t;\mathbf{r}_j,\mathbf{p})$, $j = 1, 2, 3, 4$, and reconstruct $\mathbf{m}(t)$, $0 \le t \le T - \tau_d$.
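For exact distances, Step 4 reduces to a linear problem, which also illustrates why the well-arranged condition is imposed. Subtracting the equation $|\mathbf{r}_1 - \mathbf{p}|^2 = (c/\rho_1)^2$ from the other three gives $2(\mathbf{r}_j - \mathbf{r}_1)\cdot\mathbf{p} = |\mathbf{r}_j|^2 - |\mathbf{r}_1|^2 - (c/\rho_j)^2 + (c/\rho_1)^2$, $j = 2, 3, 4$, whose coefficient matrix is non-singular exactly when $\mathbf{r}_1 - \mathbf{r}_j$ are linearly independent. The following sketch, assuming NumPy, implements this elimination; it is one possible realization of Step 4 for exact data, not necessarily the implementation used by the authors, and the function name is hypothetical.

```python
import numpy as np

def locate_from_distances(points, dists):
    """Recover p from four well-arranged points r_j and exact distances |r_j - p|.

    Subtracting the first sphere equation from the other three removes
    the quadratic term |p|^2 and leaves a 3x3 linear system for p.
    """
    r = np.asarray(points, dtype=float)   # shape (4, 3)
    d = np.asarray(dists, dtype=float)    # shape (4,)
    A = 2.0 * (r[1:] - r[0])
    b = (np.sum(r[1:]**2, axis=1) - np.sum(r[0]**2)) - (d[1:]**2 - d[0]**2)
    return np.linalg.solve(A, b)
```

With noisy distances the four spheres generally do not meet in a single point, so a least-squares formulation is more appropriate in practice.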
3.3 Some Remarks on the Implementation of the Reconstruction Procedure

In the previous subsection, we proposed a reconstruction procedure under the assumption that the observation is exact and the computation in each step can be executed exactly. However, in practice, observations contain errors and noise, and we cannot evaluate the unknown parameters exactly in each step. For this reason, we discuss ideas for the practical implementation of some steps in the reconstruction procedure.

(a) Step 3: Evaluation of $\xi_j$: In Step 3, we estimate $\xi_j$ such that the second-order equations (15) and (16) have a common solution. For the implementation of this idea, we introduce an evaluation function $D_j(\xi)$ which measures the minimum difference between the solutions of equations (15) and (16) as follows:
$$D_j(\xi) = \min_{l,m=1,2}\{|\rho^{(15)}_{j,l}(\xi) - \rho^{(16)}_{j,m}(\xi)|\}, \tag{17}$$
where $\rho^{(15)}_{j,l}(\xi)$ and $\rho^{(16)}_{j,l}(\xi)$ ($l = 1, 2$) are the solutions of the second-order equations (15) and (16), respectively, for the observation point $\mathbf{r}_j$ with $\xi_j = \xi$ substituted. Then, we estimate $\xi_j$ as the value which minimizes the evaluation function $D_j(\xi)$. We confirm this idea numerically in the next section.

(b) Step 4: Reconstruction of $\mathbf{p}$: In Step 4, the position $\mathbf{p}$ of the unknown dipole is reconstructed from the identified values of $|\mathbf{r}_j - \mathbf{p}| = c/\rho_j$ at four well-arranged observation points $\mathbf{r}_j$, $j = 1, 2, 3, 4$. However, due to numerical errors and observation noise, we cannot obtain the exact values of these parameters. Hence, we have to estimate a $\mathbf{p}$ which satisfies the condition approximately. For the approximation of the position $\mathbf{p}$ of the dipole, we introduce the evaluation function $J(\mathbf{p})$ which gives the sum of squares of the differences between $|\mathbf{r}_j - \mathbf{p}|$ and its identified value, and approximate $\mathbf{p}$ by $\hat{\mathbf{p}}$ which minimizes this evaluation function:
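$$\hat{\mathbf{p}} = \operatorname{argmin} J(\mathbf{p}) \equiv \operatorname{argmin}\sum_{j=1}^{4}\left(|\mathbf{r}_j - \mathbf{p}| - \frac{c}{\hat{\rho}_j}\right)^2, \tag{18}$$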
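where $\hat{\rho}_j$ is the identified value of $\rho_j$ in Step 3.

Note. Step 4 is a kind of multilateration, which can be found in GPS (Global Positioning System) and so on (cf. Hou 2022 and its references). The uniqueness of the solution of the minimization problem (18) is not guaranteed if the errors of the estimated distances $|\mathbf{r}_j - \mathbf{p}| \equiv c/\hat{\rho}_j$, $j = 1, 2, 3, 4$ are not small. In such cases, we need additional observation data for the verification of the identified result.

(c) Steps 6 and 7: Reconstruction of $\mathbf{m}(t)$: In Step 6, we evaluate $\mu(t;\mathbf{r}_j,\mathbf{p})$ from $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$ taking the time delay into consideration. However, in practice, we can identify $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$ only at discrete time points, and then we cannot evaluate $\mu(t;\mathbf{r}_j,\mathbf{p})$ directly for an arbitrary time instance $t$. Hence, we approximate $\mu(t;\mathbf{r}_j,\mathbf{p})$ from $\mu(t_d(t;\mathbf{r}_j,\mathbf{p});\mathbf{r}_j,\mathbf{p})$ at the discrete time instances. In this paper, we apply linear interpolation for the approximation of $\mu(t;\mathbf{r}_j,\mathbf{p})$. In Step 7, we need to solve the system of four linear equations
$$\mathbf{m}(t)\cdot\mathbf{e}_{\mathbf{r}_j,\hat{\mathbf{p}}} = \hat{\mu}(t;\mathbf{r}_j,\hat{\mathbf{p}}), \quad j = 1, 2, 3, 4, \tag{19}$$
where $\hat{\mu}(t;\mathbf{r}_j,\hat{\mathbf{p}})$ is the identified value of $\mu(t;\mathbf{r}_j,\mathbf{p})$. Since the identified values $\hat{\mu}(t;\mathbf{r}_j,\hat{\mathbf{p}})$ are not exact, the system of equations (19) may have no solution. Hence, we choose the least-squares solution of (19) using the Moore–Penrose generalized inverse matrix, and approximate $\mathbf{m}(t)$ by
$$\hat{\mathbf{m}}(t) = (A_{\mathbf{r},\hat{\mathbf{p}}}^T A_{\mathbf{r},\hat{\mathbf{p}}})^{-1}A_{\mathbf{r},\hat{\mathbf{p}}}^T\hat{\boldsymbol{\mu}}_{\mathbf{r},\hat{\mathbf{p}}}(t),$$
where the matrix $A_{\mathbf{r},\hat{\mathbf{p}}}$ and the vector $\hat{\boldsymbol{\mu}}_{\mathbf{r},\hat{\mathbf{p}}}(t)$ are defined by
$$A_{\mathbf{r},\hat{\mathbf{p}}} = \begin{pmatrix}\mathbf{e}_{\mathbf{r}_1,\hat{\mathbf{p}}} & \mathbf{e}_{\mathbf{r}_2,\hat{\mathbf{p}}} & \mathbf{e}_{\mathbf{r}_3,\hat{\mathbf{p}}} & \mathbf{e}_{\mathbf{r}_4,\hat{\mathbf{p}}}\end{pmatrix}^T, \qquad \hat{\boldsymbol{\mu}}_{\mathbf{r},\hat{\mathbf{p}}}(t) = \begin{pmatrix}\hat{\mu}(t;\mathbf{r}_1,\hat{\mathbf{p}}) & \hat{\mu}(t;\mathbf{r}_2,\hat{\mathbf{p}}) & \hat{\mu}(t;\mathbf{r}_3,\hat{\mathbf{p}}) & \hat{\mu}(t;\mathbf{r}_4,\hat{\mathbf{p}})\end{pmatrix}^T.$$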
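The least-squares step for the moment can be sketched in a few lines. The example below, assuming NumPy, builds the $4\times 3$ matrix whose rows are the unit vectors $\mathbf{e}_{\mathbf{r}_j,\hat{\mathbf{p}}}$ and calls `numpy.linalg.lstsq`, which returns the same solution as $(A^TA)^{-1}A^T\hat{\boldsymbol{\mu}}$ when the matrix has full column rank. The function name and argument layout are illustrative.

```python
import numpy as np

def reconstruct_moment(points, p_hat, mu_hat):
    """Least-squares solution of m . e_{r_j, p_hat} = mu_hat_j, j = 1..4."""
    diff = np.asarray(points, dtype=float) - np.asarray(p_hat, dtype=float)
    # Rows of A are the unit vectors e_{r_j, p_hat}.
    A = diff / np.linalg.norm(diff, axis=1, keepdims=True)
    m_hat, *_ = np.linalg.lstsq(A, np.asarray(mu_hat, dtype=float), rcond=None)
    return m_hat
```

In a time-stepping implementation this function would be called once per time instance, with `mu_hat` holding the four interpolated values from Step 6.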
4 Numerical Experiments

Now, we show some numerical experiments for our reconstruction procedure. As a test case, we consider the case where the position of the unknown dipole is fixed at $\mathbf{p} = (0.6, 0.4, 1.5)$ and the moment changes in time as shown in Fig. 1. The wave field generated by the unknown dipole is observed at the four points
$$\{\mathbf{r}_j,\ j = 1, 2, 3, 4\} = \{(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, -1)\},$$
and the specific direction of the observation devices is $\mathbf{d}_j = \mathbf{d} = (0, 0, 1)$ for all $j = 1, 2, 3, 4$. We execute the reconstruction procedure for $0 \le t \le 35$ with the time step $\Delta t = 0.1$, and we give the observation data for the wave field on $0 \le t \le 37.5$ with the same time step at $t_l = l\cdot\Delta t$, $l = 0, 1, 2, \ldots, 375$. Before showing the reconstruction results, we show how the evaluation function $D_j(\xi)$ defined by (17) works for the estimation of $\xi_j$. Figure 2 gives the behaviors of the evaluation function $D_3(\xi)$ for the observation point $\mathbf{r}_3$ at $t = 10, 15, 20$. The result of Fig. 2 shows that we can obtain a good approximation of $\xi_j$ by the value $\hat{\xi}$ which gives the minimum value of $D_j(\xi)$.

Fig. 1 Time profile of the dipole moment of an unknown source

Fig. 2 Behaviors of evaluation function $D_3(\xi)$ defined by (17) on $\mathbf{r}_3 = (0, 1, 0)$ for $t = 10, 15, 20$. The exact value of $\xi = \xi_3 = -0.8704$
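The evaluation function (17) can be computed by solving the two quadratic equations (15) and (16) for a trial value of $\xi$ and comparing their roots. The sketch below, assuming NumPy, uses `numpy.roots` and a simple grid search. The treatment of complex roots (compared through the modulus of their difference), the grid search itself, and all function names are assumptions made for illustration, since the paper does not specify these details; the helper `synthetic_z` merely produces consistent test data from (9)–(14).

```python
import numpy as np

def synthetic_z(y, rho, xi):
    """Consistent data z_1..z_6 generated from (9)-(14) (for testing only)."""
    y1, y2, y3, y4 = y
    return np.array([
        rho * y1 + y2,
        rho * y2 + y3,
        rho * y3 + y4,
        (2 * rho**2 * y1 + 2 * rho * y2 + y3) * xi,
        (2 * rho**2 * y2 + 2 * rho * y3 + y4) * xi,
        (8 * rho**3 * y1 + 8 * rho**2 * y2 + 4 * rho * y3 + y4) * xi**2
        - (2 * rho**3 * y1 + 2 * rho**2 * y2 + rho * y3),
    ])

def evaluation_D(z, xi):
    """Minimum distance between the roots of (15) and (16) for a trial xi.

    Assumes xi != 0 and z_1 != 0 so that both equations are genuine
    quadratics in rho.
    """
    z1, z2, z3, z4, z5, z6 = z
    roots15 = np.roots([2 * z1, 2 * z2 - z4 / xi, z3 - z5 / xi])
    roots16 = np.roots([2 * z1 * xi**2, z4 * (3 * xi - 1 / xi), xi**2 * z3 - z6])
    return min(abs(a - b) for a in roots15 for b in roots16)

def estimate_xi(z, grid):
    """Grid search for the trial value of xi that minimizes D."""
    values = [evaluation_D(z, xi) for xi in grid]
    return grid[int(np.argmin(values))]
```

For consistent data the function vanishes at the exact $\xi$ because the two quadratics then share the root $\rho$. Since uniqueness of the minimizer is not established, a grid search may return another value if $D$ also becomes small elsewhere.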
Fig. 3 Reconstruction results of the position $\mathbf{p}$ and the moment $\mathbf{m}(t)$ of the unknown dipole from noise-free observation: (a) position of dipole, (b) moment of dipole. Here, $p_x$ denotes the exact value of the x-component of $\mathbf{p}$, $\hat{p}_x$ denotes its reconstructed value, and so on
4.1 Reconstruction Results for Noise-Free Observations

First, we examine our reconstruction procedure for the noise-free case. Figure 3 gives the reconstruction results of the position $\mathbf{p}$ and the moment $\mathbf{m}(t)$ for each time instance $t_l$, $l = 0, 1, 2, \ldots, 350$. The results of Fig. 3 show that our procedure works well for the reconstruction of the dipole source under the noise-free condition, except around $t = 7, 15, 22, 30$, where it gives poor reconstruction results. To find the reason for these poor estimates, we check the inner product of the two unit vectors $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$ and $\mathbf{m}(t)/|\mathbf{m}(t)|$. Figure 4 shows the results, and we can see that the two vectors $\mathbf{m}(t)$ and $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$ become almost orthogonal around the times at which our procedure gives poor estimates. The orthogonality of $\mathbf{m}(t)$ and $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$ leads to $\mu(t;\mathbf{r}_j,\mathbf{p}) = 0$, and hence the estimation of $\rho_j$ becomes difficult. We consider this phenomenon to be one of the reasons for these poor estimates.
4.2 Reconstruction Results for Noisy Observations

Next, we consider the case where the observation includes some noise. For the evaluation of the robustness of the reconstruction procedure against the observation noise, we add Gaussian noise to the observation data according to the uniform distribution such that
$$\frac{\max_{l=0,1,\ldots,375}\ \max_{j=1,2,3,4}|u_{obs}(t_l,\mathbf{r}_j) - u_{ex}(t_l,\mathbf{r}_j)|}{\max_{l=0,1,\ldots,375}\ \max_{j=1,2,3,4}|u_{ex}(t_l,\mathbf{r}_j)|} = 0.0,\ 0.01,\ 0.05,\ 0.1,\ 0.5,\ 1.0,\ 5.0\%,$$
Fig. 4 Behaviors of the inner product erj ,p · m(t)/|m(t)|
where $u_{ex}(t_l,\mathbf{r}_j)$ denotes the exact value and $u_{obs}(t_l,\mathbf{r}_j)$ the observation data disturbed by the noise. From the assumptions, the exact values of the solution $u(t,\mathbf{r}_j)$, $j = 1, 2, 3, 4$ at the observation points $\mathbf{r}_j$ are continuous with respect to $t$; hence, we evaluate the error with the supremum norm. We add similar noise to the other observations $\partial_t u(t_l,\mathbf{r}_j)$, $\mathbf{d}\cdot\nabla u(t_l,\mathbf{r}_j)$, and so on. Figures 5 and 6 give the reconstruction results of the position $\mathbf{p}$ and the moment $\mathbf{m}(t)$ for various noise levels. The results of Figs. 5 and 6 suggest that our procedure can give a good reconstruction of the parameters of the unknown dipole even if the observation data includes 1% noise. However, our procedure does not work well when the noise level is as large as 5.0%. To find the reason for these poor reconstruction results at the 5.0% noise level, we check the estimates of $\xi_j$. We give the estimation results of $\xi_j$ for various noise levels in Fig. 7. The results of Fig. 7 show that the error of the estimate of $\xi_j$ becomes non-negligible when the noise level is larger than 1%, and the errors of $\xi_j$, $j = 1, 2, 3, 4$ finally affect the reconstruction result of the unknown dipole. We evaluate the dependence of the errors of the reconstruction results on the observation noise. Table 1 shows the average errors of the reconstruction results for various noise levels. Here, to avoid the effect of the bad arrangement of $\mathbf{e}_{\mathbf{r}_j,\mathbf{p}}$, we use the reconstruction results in the interval $I = [9.0, 10.5] \cup [17.0, 19.0] \cup [24.0, 25.5] \cup [30.0, 33.0]$. In Table 1, the averages of errors are evaluated by
Fig. 5 Reconstruction results of the position $\mathbf{p}$ of unknown dipole from noisy observations: (a) 0.01% noise, (b) 0.1% noise, (c) 1.0% noise, (d) 5.0% noise
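\[
\text{Average error of the position } p = \frac{1}{N_I} \sum_{t_l \in I} |\hat{p}(t_l) - p|,
\]
\[
\text{Average error of the moment } m(t) = \frac{1}{N_I} \sum_{t_l \in I} |\hat{m}(t_l) - m(t_l)|,
\]
\[
\text{Average error of } \xi = \max_{j=1,2,3,4} \frac{1}{N_I} \sum_{t_l \in I} |\hat{\xi}_j(t_l) - \xi_j|,
\]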
where N_I is the number of sample times t_l ∈ I. Table 1 shows that the errors grow slowly when the noise level is smaller than 0.5% but grow rapidly when the noise level becomes larger than 1%. The analysis of the reason for this rapid growth of errors is left as an open problem. Taking into account that the range of ξ_j is [−1, 1], the errors of the reconstruction results grow rapidly when the error of ξ_j becomes as large as 10%.
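The noise model and the three averages described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the noise is assumed to be uniformly distributed and rescaled so that its supremum norm equals the prescribed fraction of the supremum norm of the exact data, and |·| is assumed to be the Euclidean norm for the vectors p and m(t).

```python
import numpy as np

# Time windows forming the interval I quoted in the text.
WINDOWS = [(9.0, 10.5), (17.0, 19.0), (24.0, 25.5), (30.0, 33.0)]


def add_sup_norm_noise(u_exact, level_percent, rng):
    """Perturb samples so that max|u_obs - u_ex| / max|u_ex| equals the level.

    The uniform distribution is an assumption made for this sketch.
    """
    u_exact = np.asarray(u_exact, dtype=float)
    raw = rng.uniform(-1.0, 1.0, size=u_exact.shape)
    peak = np.max(np.abs(raw))
    if peak > 0:
        raw = raw / peak  # rescale so that max|raw| == 1
    amplitude = (level_percent / 100.0) * np.max(np.abs(u_exact))
    return u_exact + amplitude * raw


def in_interval(t, windows=WINDOWS):
    """Boolean mask of the sample times t_l that lie in I."""
    t = np.asarray(t, dtype=float)
    mask = np.zeros(t.shape, dtype=bool)
    for lo, hi in windows:
        mask |= (t >= lo) & (t <= hi)
    return mask


def average_errors(t, p_hat, p, m_hat, m, xi_hat, xi):
    """Return the average errors of p, m(t), and xi over t_l in I.

    Shapes: p_hat, m_hat, m are (n_times, 3); p is (3,);
    xi_hat is (n_times, 4); xi is (4,) or (n_times, 4).
    """
    mask = in_interval(t)
    n_i = np.count_nonzero(mask)  # N_I
    err_p = np.linalg.norm(p_hat[mask] - p, axis=1).sum() / n_i
    err_m = np.linalg.norm(m_hat[mask] - m[mask], axis=1).sum() / n_i
    xi_full = np.broadcast_to(xi, xi_hat.shape)
    # Average over time for each j, then take the maximum over j = 1..4.
    err_xi = (np.abs(xi_hat[mask] - xi_full[mask]).sum(axis=0) / n_i).max()
    return err_p, err_m, err_xi
```

A call such as `add_sup_norm_noise(u, 1.0, np.random.default_rng(0))` would produce data with 1% noise in this sense; restricting the averages to the mask returned by `in_interval` mirrors the exclusion of times at which e_{r_j,p} is badly arranged.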
Fig. 6 Reconstruction results of the moment m(t) of unknown dipole from noisy observations. Panels: (a) 0.01% noise, (b) 0.1% noise, (c) 1.0% noise, (d) 5.0% noise
Table 1 Average errors of the reconstruction results for various noise levels
Noise level (%)   Average error of p   Average error of m(t)   Average error of ξ
0.00              2.31E-2              7.65E-2                 2.12E-2
0.01              2.38E-2              8.08E-2                 2.24E-2
0.05              2.36E-2              7.52E-2                 2.64E-2
0.10              3.15E-2              7.82E-2                 3.03E-2
0.50              1.10E-1              1.90E-1                 6.30E-2
1.00              2.89E-1              3.35E-1                 8.79E-2
5.00              9.39E+0              1.39E+0                 1.27E-1
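To make the trend in Table 1 easier to read, the following sketch divides each average error by its noise-free value. The numbers are copied from Table 1; the helper name `growth_factors` is hypothetical and the normalization by the first row is a choice made here for illustration, not part of the authors' analysis.

```python
# Values copied from Table 1 (rows ordered by noise level).
noise_levels = [0.00, 0.01, 0.05, 0.10, 0.50, 1.00, 5.00]  # percent
err_p = [2.31e-2, 2.38e-2, 2.36e-2, 3.15e-2, 1.10e-1, 2.89e-1, 9.39e+0]
err_m = [7.65e-2, 8.08e-2, 7.52e-2, 7.82e-2, 1.90e-1, 3.35e-1, 1.39e+0]
err_xi = [2.12e-2, 2.24e-2, 2.64e-2, 3.03e-2, 6.30e-2, 8.79e-2, 1.27e-1]


def growth_factors(errors):
    """Ratio of each average error to the noise-free value in the first row."""
    base = errors[0]
    return [e / base for e in errors]


if __name__ == "__main__":
    print("noise (%)", " ".join(f"{x:8.2f}" for x in noise_levels))
    for name, errors in (("p", err_p), ("m(t)", err_m), ("xi", err_xi)):
        print(f"{name:9s}", " ".join(f"{f:8.2f}" for f in growth_factors(errors)))
```

Up to 0.1% noise, all three ratios stay below about 1.5. At 5% noise the error of p is roughly 400 times its noise-free value and the error of m(t) roughly 18 times, while the error of ξ grows only by a factor of about 6, which illustrates how a moderate error in ξ_j is amplified in the reconstructed dipole parameters.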
Fig. 7 Estimation results of the parameters ξ_j from noisy observations. Panels: (a) 0.01% noise, (b) 0.1% noise, (c) 1.0% noise, (d) 5.0% noise
5 Conclusions
In this paper, we considered the reconstruction of a single dipole wave source from information on the wave field at four well-arranged points. From the explicit expression of the wave field, we derived algebraic relations between the observations and the unknown parameters, and we proposed a reconstruction procedure for the dipole source based on these relations. We examined the procedure with numerical examples and found that it gives good reconstruction results even if the observations include 1% noise. However, the procedure does not work well when the noise is as large as 5.0%. Future work on the reconstruction of dipole sources includes the uniqueness of the solution, stability analysis of the reconstruction procedure, and extensions of the procedure to moving sources and to multiple sources.
Acknowledgements The authors would like to thank the anonymous reviewers for their helpful comments and advice. This work is supported in part by JSPS KAKENHI Grant Number 18K03438.