English Pages 387 [388] Year 2014
Yong-Gang Li (Ed.) Seismic Imaging, Fault Damage and Heal
Seismic Imaging, Fault Damage and Heal Edited by Yong-Gang Li
Physics and Astronomy Classification 2010 91.30.Ab, 91.30.Bi, 91.30.Jk, 91.30.pd, 93.85.Rt
ISBN 978-3-11-032991-9 e-ISBN 978-3-11-032995-7 Set-ISBN 978-3-11-032996-4 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de. © 2014 Higher Education Press and Walter de Gruyter GmbH, Berlin/Boston Cover image: SteffenHuebner/iStock/Thinkstock Printing and binding: CPI buch bücher.de GmbH, Birkach ♾Printed on acid-free paper Printed in Germany www.degruyter.com
Preface
This book is the second monograph in an earth science series specializing in computational, observational and interpretational seismology and geophysics. It covers the full-3D waveform tomography method and its application; beamlet and curvelet methods for wavefield representation, propagation and imaging; two-way solid-fluid coupling with a discrete element model and a lattice Boltzmann model; fault-zone trapped wave observations and 3-D finite-difference synthetics for high-resolution imaging of subsurface rupture zone segmentation and bifurcation; fault rock damage and healing associated with earthquakes in California and New Zealand; characterization of pre-shock accelerating moment release, with careful consideration of how seismicity from earthquake catalogues is processed and analyzed; and statistical modeling of earthquake occurrences based on ultra-low frequency ground electric signals. Each chapter discusses in detail a state-of-the-art method or technique and its application in case studies. The editor approaches this as a broad interdisciplinary effort, with well-balanced observational, metrological and numerical modeling aspects. Linked with these topics, the book highlights the importance of imaging complex crustal structures and internal fault-zone rock damage at seismic depths, which are closely related to earthquake occurrence and physics. Researchers and graduate students in the geosciences will broaden their horizons on the advanced methodologies and techniques applied in seismology, geophysics and earthquake science.
This book can be taken as an expansion of the first book in the series. It covers multi-disciplinary topics so that readers can grasp the new methods and skills used in data processing and analysis, as well as in numerical modeling for the structural, physical and mechanical interpretation of earthquake phenomena. It also aims to strengthen their understanding of earthquake occurrence and hazards, which helps them evaluate potential earthquake risk in seismogenic regions globally. Readers can make full use of the present knowledge and techniques to help reduce earthquake disasters.
Contents

Seismic Imaging, Fault Damage and Heal: An Overview
References

1 Applications of Full-Wave Seismic Data Assimilation (FWSDA)
1.1 Numerical Solutions of Seismic Wave Equations
1.1.1 Stable Finite-Difference Solutions on Non-Uniform, Discontinuous Meshes
1.1.2 Accelerating Finite-Difference Methods Using GPUs
1.1.3 The ADER-DG Method
1.1.4 Accelerating the ADER-DG Method Using GPUs
1.2 Automating the Waveform Selection Process for FWSDA
1.2.1 Seismogram Segmentation
1.2.2 Waveform Selection
1.2.3 Misfit Measurement Selection
1.2.4 Fréchet Kernels for Waveforms Selected in the Wavelet Domain
1.3 Application of FWSDA in Southern California
1.3.1 Waveform Selection on Ambient-Noise Green’s Functions
1.3.2 Waveform Selection on Earthquake Recordings
1.3.3 Inversion Results after 18 times Adjoint Iteration
1.4 Summary and Discussion
References

2 Wavefield Representation, Propagation and Imaging Using Localized Waves: Beamlet, Curvelet and Dreamlet
2.1 Introduction
2.2 Phase-Space Localization and Wavelet Transform
2.2.1 Time-Frequency Localization
2.2.2 Time-Scale Localization
2.2.3 Extension and Generalization of Time-Frequency, Time-Scale Localizations
2.3 Localized Wave Propagators: From Beam to Beamlet
2.3.1 Frame Beamlets and Orthonormal Beamlets
2.3.2 Beamlet Spreading, Scattering and Wave Propagation in the Beamlet Domain
2.3.3 Beam Propagation in Smooth Media with High-Frequency Asymptotic Solutions
2.3.4 Beamlet Propagation in Heterogeneous Media by the Local Perturbation Approach
2.4 Curvelet and Wave Propagation
2.4.1 Curvelet and Its Generalization
2.4.2 Fast Digital Transforms for Curvelets and Wave Atoms
2.4.3 Wave Propagation in Curvelet Domain and the Application to Seismic Imaging
2.5 Wave Packet: Dreamlets and Gaussian Packets
2.5.1 Physical Wavelet and Wave-Packets
2.5.2 Dreamlet as a Type of Physical Wavelet
2.5.3 Seismic Data Decomposition and Imaging/Migration Using Dreamlets
2.5.4 Gaussian Packet Migration and Paraxial Approximation of Dreamlet
2.6 Conclusions
Acknowledgement
References

3 Two-way Coupling of Solid-fluid with Discrete Element Model and Lattice Boltzmann Model
3.1 Introduction
3.2 Discrete Element Method and the ESyS-Particle Code
3.2.1 A Brief Introduction to the Open Source DEM Code: The ESyS-Particle
3.2.2 The Basic Equations
3.2.3 Contact Laws and Particle Interaction
3.2.4 Fracture Criterion
3.3 Lattice Boltzmann Method
3.3.1 The Basic Principle of LBM
3.3.2 Boundary Conditions of LBM
3.3.3 A Brief Introduction to the Open Source LBM Code: OpenLB
3.4 Two-way Coupling of DEM and LBM
3.4.1 Moving Boundary Conditions
3.4.2 Curved Boundary Conditions
3.4.3 Implementation of Darcy Flow in LBM
3.5 Preliminary Results
3.5.1 Bonded Particles Flow in Fluid
3.5.2 Fluid Flow in the Fractures
3.5.3 Hydraulic Fracture Simulation
3.6 Discussion and Conclusions
Acknowledgement
References

4 Co-seismic Damage and Post-Mainshock Healing of Fault Rocks at Landers, Hector Mine and Parkfield, California Viewed by Fault-Zone Trapped Waves
4.1 Introduction
4.2 Rock Damage and Healing on the Rupture Zone of the 1992 M 7.4 Landers Earthquake
4.2.1 Landers Rupture Zone Viewed with Fault-Zone Trapped Waves
4.2.2 Fault Healing at Landers Rupture Zone
4.2.3 Additional Damage on the Landers Rupture Zone by the Nearby Hector Mine Earthquake
4.3 Rock Damage and Healing on the Rupture Zone of the 1999 M 7.1 Hector Mine Earthquake
4.3.1 Hector Mine Rupture Zone Viewed with FZTWs
4.3.2 Fault Healing at Hector Mine Rupture Zone
4.4 Rock Damage and Healing on the San Andreas Fault Associated with the 2004 M 6 Parkfield Earthquake
4.4.1 Low-Velocity Damaged Structure of the San Andreas Fault at Parkfield from Fault Zone Trapped Waves
4.4.2 Seismic Velocity Variations on the San Andreas Fault Caused by the 2004 M 6 Parkfield Earthquake
4.4.3 Discussion
4.5 Conclusion
Acknowledgment
References

5 Subsurface Rupture Structure of the M 7.1 Darfield and M 6.3 Christchurch Earthquake Sequence Viewed with Fault-Zone Trapped Waves
5.1 Introduction
5.2 The Data and Waveform Analyses
5.2.1 The FZTWs Recorded for Aftershocks along Darfield/Greendale Rupture Zone
5.2.2 The FZTWs Recorded for Aftershocks along Christchurch/Port Hills Rupture Zone
5.3 Subsurface Damage Structure Viewed with FZTWs
5.4 3-D Finite-Difference Simulations of Observed FZTWs
5.5 Conclusion and Discussion
Acknowledgment
References

6 Characterizing Pre-shock (Accelerating) Moment Release: A Few Notes on the Analysis of Seismicity
6.1 Introduction
6.2 The ‘Interfering Events’ and the ‘Eclipse Method’
6.3 Comparing with Linear Increase: The BIC Criterion
6.4 The Time-Space-M_C Mapping of the Scaling Coefficient, m(T, R, M_C)
6.5 Removal of Aftershocks and the ‘De-clustered Benioff Strain’
6.6 ‘Crack-like’ Spatial Window for Great Earthquakes: The 2008 Wenchuan Earthquake
6.7 Looking into a Finite Earthquake Rupture: The 2004 Sumatra-Andaman Earthquake
6.8 Using Seismic Moment Tensors to Investigate the Moment Release: AM_ij R before the 2011 Tohoku Earthquake?
6.9 Concluding Remarks and Discussion
6.10 Appendix: The Magnitude Conversion Problem, and the Completeness of an Earthquake Catalogue
6.10.1 Magnitudes
6.10.2 Conversion of Magnitudes
6.10.3 Completeness of an Earthquake Catalogue
References

7 Statistical Modeling of Earthquake Occurrences Based on External Geophysical Observations: With an Illustrative Application to the Ultra-low Frequency Ground Electric Signals Observed in the Beijing Region
7.1 Introduction
7.2 The Data
7.3 Model Description
7.4 Results for Circles around the Individual Stations
7.5 Results for the 300 km Circle around Beijing
7.6 Results from the Tangshan Region
7.7 Probability Gains from Forecasts Based on Electrical Signals
7.8 Effect of Changes in the Background Seismicity
7.9 Conclusions
References
Seismic Imaging, Fault Damage and Heal: An Overview Yong-Gang Li
This book presents state-of-the-art methods and techniques in observational, computational and analytical seismology for earthquake science. Authors from institutions around the world present multi-disciplinary topics with case studies to illuminate high-resolution imaging of complex crustal structures and earthquake-borne fault zones using full-3D waveform tomography, beamlets and curvelets of localized waves, a discrete element model for fully coupled solid-fluid systems, and 3-D finite-difference simulation of fault-zone trapped (guided) waves observed at recent rupture zones in California and New Zealand. In addition, the authors discuss the significance of characterizing pre-shock moment release using cataloged seismicity, and statistical modeling of earthquake occurrence based on ultra-low frequency ground electric signals. All topics in this book help further the understanding of earthquake physics and hazard assessment in seismogenic regions worldwide.
The detailed crustal structure and physical properties of fault networks are of great interest because they are among the factors that control the occurrence and dynamic rupture of earthquakes. Observations suggest that crustal complexity may segment fault zones (Aki, 1984; Malin et al., 1989; Ellsworth, 1990; Beck and Christensen, 1991) or control the timing of moment release in earthquakes (Harris and Day, 1993; Wald and Heaton, 1994). Rupture models have been proposed that involve variations in fluid pressure over the earthquake cycle (Hickman et al., 1995; Blanpied et al., 1992). Geometrical, structural, and rheological fault discontinuities, caused by spatial variations in strength and stress, affect earthquake rupture (e.g., Wesson and Ellsworth, 1973; Das and Aki, 1977; Rice, 1980; Day, 1984; Duan, 2012).
Rupture segmentation is often related to fault bends, step-overs, branches, and terminations that have been recognized by surface mapping (e.g., Sieh et al., 1993; Johnson et al., 1994), exhumation (e.g., Chester et al., 1993), and seismic profiling and tomography
(e.g., Lees and Malin, 1990; Thurber et al., 2004). In order to relate present-day crustal stresses and fault motions to the geological structures formed by previous ruptures, we must understand the evolution of fault systems on many spatial and temporal scales in the complex earth crust. Because the fault plane is thought to be a plane of weakness in the earth's crust, it facilitates slip under the prevailing stress orientation. As suggested by laboratory experiments, shear faulting is highly resisted in brittle material and proceeds as re-activated faulting along surfaces that have already sustained considerable damage (e.g., Dieterich, 1997; Marone, 1998). Field evidence shows that slip on a mature fault is localized on a more restricted rupture plane at the edge of the damage zone, at the plane of contact with the intact wall rock (Chester et al., 1993; Chester and Chester, 1998). Assuming that this is an accurate picture of rupture preparation on major faults, defining at high resolution the complex crustal structure and the internal damage structure of faults, as well as the temporal variations in their physical properties, is challenging work in earthquake science. Monitoring seismic events and other physical fields related to the principal rupture plane would be crucial for earthquake prediction. The slip of these events in series with the main fault is most likely to load the principal slip plane to the point of a major through-going rupture. In these circumstances, it is important to image where the principal fault plane and its accompanying damage zone lie at depth. Detailing the crustal structure and local variations in seismic velocities has implications for near-fault hazards and expected ground shaking. Greater-amplitude shaking is expected near faults due to both proximity to the fault and localized amplification in damaged material.
Examining the geometry and physical properties of fault zones, as well as the complex crustal structure, will help us understand the origin of spatial and temporal variations in rock damage and the evolution of heterogeneities in stress and strain in a seismogenic region. Other geophysical parameters, such as signals from the ultra-low frequency ground electric field, can be applied to modeling earthquake occurrence. For instance, the version of Ogata’s Lin-Lin algorithm (Ogata, 1988) presented in this book is useful for examining the influence of an explanatory signal on the occurrence of earthquakes in a stochastic point process. Statistical models based on observations of these signals allow earthquakes to be forecast within the associated circle. In this book, we introduce the new methodology and technology used in data assimilation for defining subsurface complexity, seismically imaging multi-scale crustal heterogeneity and fault zone geometry, and characterizing, at high resolution, the magnitude of fault damage, the progression of healing, and the associated physical properties. We also introduce a sophisticated discrete element model with solid-fluid coupling mechanics for rheological simulation of earthquake fracture zones, and the pre-shock accelerating moment release (AMR) model related to the critical-point-like behavior of earthquake preparation.
This book includes seven chapters.
Chapter 1: “Applications of Full-Wave Seismic Data Assimilation (FWSDA)” by Dawei Mu, En-Jui Lee and Po Chen. In the first volume of this book series, Po Chen (2012) introduced the theoretical background and recent advances of full-waveform seismic data assimilation (FWSDA), as well as its mathematical formulations in the framework of the various data assimilation theories. In this chapter, Mu et al. further discuss the full-wave seismological inverse problem as a weakly constrained generalized inverse problem, in which the seismic wave equation with its initial and boundary conditions, the structural and source parameters, and the waveform misfit measurements are all allowed to contain errors. Issues related to applying FWSDA to realistic seismological inverse problems are also discussed in detail. The authors present recent developments in FWSDA that can potentially improve the efficiency of some numerical algorithms used for solving acoustic and visco-elastic seismic wave equations. Taking full advantage of newly emerging computing hardware requires algorithmic changes. For 3-D earth structure models with highly irregular surface topography and fault structures, the efficiency and accuracy of the wave equation solver are highly important for solving the problem in a realistic amount of time. In some of the recent successful full-3D waveform tomography applications, the waveform misfit measurements were made on selected wave packets on the seismograms. To achieve successful full-3D waveform tomography applications with a large amount of seismic data, the waveform selection process needs to be automated to a certain extent.
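As a rough illustration of what a misfit measurement on a selected wave packet can look like, the sketch below tapers a window of a seismogram and measures a cross-correlation delay between an observed and a synthetic trace. The cross-correlation delay is only one common choice and is an assumption here, not necessarily the measurement used in Chapter 1; the function names `window` and `xcorr_delay` are hypothetical.

```python
import math


def window(trace, start, stop):
    """Cut samples [start, stop) and apply a Hann taper to the wave packet.

    Assumes the window holds at least two samples.
    """
    seg = trace[start:stop]
    n = len(seg)
    return [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
            for i, s in enumerate(seg)]


def xcorr_delay(observed, synthetic, dt, max_lag):
    """Return the delay (in seconds) that maximizes the cross-correlation.

    A positive value means the observed packet arrives later than the
    synthetic one. Both traces must have the same length and sampling dt.
    """
    n = len(observed)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for i in range(n):
            j = i - lag
            if 0 <= j < n:
                val += observed[i] * synthetic[j]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag * dt


def gaussian_pulse(n, center, sigma):
    """Illustrative test signal (hypothetical helper, not real data)."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]
```

In an automated workflow, a selection step would supply the `start` and `stop` indices for each packet, and the resulting delays would feed the misfit function.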
The authors describe some of the latest developments in numerical solutions of the forward problem and their implementation and optimization on modern CPU-GPU hybrid parallel computing platforms. A realistic full-3D, full-wave tomography for the crustal structure in Southern California is used to illustrate the various components of FWSDA.
Chapter 2: “Wavefield Representation, Propagation and Imaging Using Localized Waves: Beamlet, Curvelet and Dreamlet” by Ru-Shan Wu and Jinghuai Gao. In this chapter, the authors review phase-space localization, mainly along the line of time-frequency localization, and then phase-space localization using the generalized wavelet transform applied to wavefield and one-way propagator decompositions. Physically, the phase-space localized propagators are beamlet or wave-packet propagators, which are propagator matrices for short-range iterative propagation. When asymptotic solutions are applied to the beamlet for long-range propagation, beamlets evolve into global beams. Various asymptotic beam propagation methods have been developed in the past, such as the Gaussian beam, complex ray, coherent state, and more recently the curvelet methods. The local perturbation method for propagation in strongly heterogeneous media is
also briefly described in this chapter. Finally, the authors review the development of the curvelet transform and its application to propagation and imaging in comparison with the beamlet approach. For wavefield decomposition, both beamlet and curvelet transforms have directional wavelets as their elementary functions. A beamlet is a type of physical wavelet, representing an elementary wave in various wavefield decomposition schemes that use localized building elements, such as the coherent state, Gabor atom, Gabor-Daubechies frame vector, and local trigonometric basis function. The curvelet transform is a specifically defined mathematical transform, characterized by parabolic scaling. Its generalization width is similar to the beam-aperture requirement for the asymptotic beam solution: the beamwidth must be smaller than the scale of heterogeneity and much greater than the wavelength. The optimal beamwidth is reached by balancing the beam geometric spreading and the beam-front distortion. With the optimal beamwidth, the beamlet or curvelet propagator is sparse in smooth media for short-range propagation. For strong and rough heterogeneities, beamlet or curvelet scattering occurs and the asymptotic propagator may not work well. In this case, the local perturbation method can be applied, in which the propagator is decomposed into a background propagator and a perturbation operator for each forward marching step. Numerical examples demonstrate the validity of the approach in this chapter.
Chapter 3: “Two-way Coupling of Solid-fluid with Discrete Element Model and Lattice Boltzmann Model” by Yucang Wang, Sheng Xue and Jun Xie. This chapter presents a fully coupled solid-fluid code using the Discrete Element Method (DEM) and the Lattice Boltzmann Method (LBM).
The new and distinctive features of this coupled approach, compared with existing coupled DEM-LBM models, include support for bonded DEM particles; the capability to simulate fracturing events explicitly through the breakage of bonds; simulation of Darcy flow, free flow, and turbulent flow with the same integrated code; adoption of a more stable and efficient moving boundary condition; and a unified parallel algorithm for both codes based on MPI libraries, which will allow larger-scale parallel computing on supercomputers in the future. Two widely used open source codes, ESyS-Particle and OpenLB, are integrated, as both are written in C++ and parallelized with the MPI library. Recently, LBM has made significant progress as a new method for numerical modeling of fluid dynamics. In contrast to conventional computational fluid dynamics (CFD) techniques that solve the macroscopic Navier-Stokes equations, LBM is built on a mesoscopic scale in which the fluid is described by a group of discrete particles that propagate along a regular lattice and collide with each other. The use of LBM instead of CFD also eliminates the severe mesh distortion caused by the frequent mesh geometry adaptation required in CFD. Because of its Eulerian grids, LBM is particularly suitable for modeling fluid-solid interaction problems, and a large number of solid particles can easily be accommodated.
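The propagate-and-collide picture described above can be sketched in a few lines. The example below is a minimal D2Q9 lattice with a BGK (single-relaxation-time) collision on a periodic grid; it is an illustrative assumption, not the OpenLB implementation or the coupled code of Chapter 3, and the relaxation time `TAU`, the grid size and the initial density bump are hypothetical choices in lattice units.

```python
# D2Q9 lattice velocities and weights (standard values).
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
TAU = 0.8  # hypothetical relaxation time; viscosity = (TAU - 0.5) / 3


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for one lattice node."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def macroscopic(f_node):
    """Density and velocity as moments of the nine populations."""
    rho = sum(f_node)
    ux = sum(fi * cx for fi, (cx, _) in zip(f_node, C)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f_node, C)) / rho
    return rho, ux, uy


def step(f, nx, ny):
    """One update: BGK collision at each node, then streaming to neighbours."""
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            rho, ux, uy = macroscopic(f[x][y])
            feq = equilibrium(rho, ux, uy)
            for i, (cx, cy) in enumerate(C):
                post = f[x][y][i] - (f[x][y][i] - feq[i]) / TAU
                # Periodic boundaries: populations wrap around the grid.
                new[(x + cx) % nx][(y + cy) % ny][i] = post
    return new


def total_mass(f):
    return sum(sum(node) for column in f for node in column)


def run(nx=16, ny=16, steps=20):
    """Relax a small density bump; return total mass before and after."""
    f = [[equilibrium(1.0, 0.0, 0.0) for _ in range(ny)] for _ in range(nx)]
    f[nx // 2][ny // 2] = equilibrium(1.1, 0.0, 0.0)
    m0 = total_mass(f)
    for _ in range(steps):
        f = step(f, nx, ny)
    return m0, total_mass(f)
```

Because every node is updated with the same local rule on a fixed grid, solid particles can be represented by changing the rule at the nodes they cover, which is one reason the method suits fluid-solid coupling.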
The authors present three simple preliminary numerical results to assess the performance of the coupled DEM-LBM approach. Small-scale models are used as a qualitative demonstration of the capability and potential of the coupled approach. Some preliminary 2-D simulations, such as particles moving in a fluid, fluid flow in a narrow tunnel or crack, and hydraulic fracture induced by the injection of fluid into a borehole, are carried out to validate the integrated code. These results show that the new method is capable of simulating solid particle flow in fluid, fluid flow inside a narrow fracture, and hydraulic fracture by injection of fluid. The validation of large-scale simulations in 3-D and detailed comparisons with physical experiments are under development.
Chapter 4: “Co-seismic Damage and Post-Mainshock Healing of Fault Rocks at Landers, Hector Mine and Parkfield, California Viewed by Fault-Zone Trapped Waves” by Yong-Gang Li. This chapter reviews the co-seismic damage and post-mainshock healing of fault rocks associated with the 1992 M 7.4 Landers, the 1999 M 7.1 Hector Mine, and the 2004 M 6.0 Parkfield earthquakes in California through observations and 3-D finite-difference modeling of fault-zone trapped waves (FZTWs) generated by explosions and aftershocks, and recorded at linear seismic arrays deployed across and along the rupture zones (Li et al., 1990, and further references). Because FZTWs arise from coherent multiple reflections at the boundaries between the low-velocity fault zone and the high-velocity surrounding rock, their amplitudes, frequencies and dispersive waveforms depend strongly on the fault geometry and physical properties. These waves therefore provide insight into the internal structure and physical properties of fault zones at seismogenic depths with a higher resolution than ever before.
The author and his colleagues from multiple institutions (see the acknowledgement and references of Chapter 4) have used FZTWs to delineate the studied rupture zones as a low-velocity waveguide about 100 to 250 m wide, in which S velocities are reduced by 40%–50% from wall-rock velocities and Q values are 10–50. This waveguide is interpreted as a remnant of the process zone, where inelastic deformation occurs around the propagating crack tip during dynamic rupture in the mainshocks. The width of the fault zone waveguide scales with the rupture length, as predicted in published dynamic rupture models (e.g., Scholz, 1990). FZTWs also show the rupture segmentation and bifurcation associated with these earthquakes. The strength of the low-velocity anomalies along the fault might vary over the earthquake cycle (e.g., Vidale et al., 1994; Marone, 1998). Repeated seismic experiments conducted at the Landers rupture zone showed fault healing, with seismic velocity recovering by approximately 2% between 1994 and 1998. The 1998 survey showed that the healing rate fell by a factor of two from 1994–1996 to 1996–1998. The ratio of the rates of P-wave and S-wave speed recovery is consistent with healing caused by the closure of cracks that are partially fluid-filled. A similar experiment at Hector Mine has confirmed that healing is
not unique to Landers and shows that healing rates vary among the fault segments that we have measured. However, the healing at the Landers rupture was interrupted in 1999 by the M 7.1 Hector Mine earthquake rupture, which occurred 20–30 km away. The Hector Mine earthquake both strongly shook and permanently strained the Landers fault, adding damage discernible as a temporary reversal of the healing process. The fault has since resumed the trend of strength recovery that it showed after the Landers earthquake. These observations suggest that fault damage caused by strong seismic waves may help to explain earthquake clustering and seismicity triggering by shaking, and may be involved in friction reduction during faulting. At Parkfield, repeated surveys reveal an approximately 2.5% co-seismic decrease in seismic velocity within the San Andreas fault (SAF), due to the co-seismic damage of fault-zone rocks at seismogenic depths during dynamic rupture in the 2004 M 6 Parkfield earthquake. Seismic velocities then increased by approximately 1.2% in the following ∼4 months, indicating that the rock damaged in the M 6 mainshock recovers rigidity through time. As at Landers, these observations point to damage caused by strong seismic waves as a possible factor in earthquake clustering, the triggering of seismicity by shaking, and friction reduction during faulting.
Chapter 5: “Subsurface Rupture Structure of the M 7.1 Darfield and M 6.3 Christchurch Earthquake Sequence Viewed with Fault-Zone Trapped Waves” by Yong-Gang Li, Gregory De Pascale, Mark Quigley and Darren Gravely. In this chapter, Li et al.
present the subsurface fault rock damage structure along the Greendale fault (GF) and Port Hills fault (PHF), which ruptured in the 2010 M 7.1 Darfield and 2011 M 6.3 Christchurch earthquake sequence, using fault-zone trapped waves (FZTWs) generated by aftershocks and recorded at a linear seismic array installed across the surface rupture along the GF. FZTWs were identified for aftershocks occurring on both the GF and the PHF. The post-S duration of these FZTWs increases as focal depths and epicentral distances from the array increase, showing that an effective low-velocity waveguide formed by severely damaged rocks exists along the GF and PHF at seismogenic depths. Locations of aftershocks generating prominent FZTWs delineate the subsurface GF rupture extending eastward as bifurcating blind fault segments an additional ∼5–8 km beyond the mapped ∼30 km surface rupture, into a zone with comparatively low seismic moment release west of the PHF rupture. The propagation of FZTWs through the intervening ‘gap’ indicates moderate GF-PHF structural connectivity. This zone is interpreted as a fracture mesh reflecting the interplay between basement faults and stress-aligned microcracks that enable the propagation of PHF-sourced FZTWs into the GF damage zone. Combined with previous rupture models for slip distributions in the Canterbury earthquake sequence (Quigley et al., 2012; Barnhart et al., 2011; Beavan et al., 2012; Elliott et al., 2012), the authors construct a plausible model of subsurface
rupture zones associated with the Darfield-Christchurch earthquakes. Velocities of basement rocks in this model are constrained by the existing regional velocity models for the Canterbury Plains (e.g., Smith et al., 1995; Eberhart-Phillips and Bannister, 2002; Kaiser et al., 2012). The 3-D finite-difference simulations of observed FZTWs suggest that the GF rupture zone is ∼200–250 m wide, consistent with the surface deformation widths. Within it, velocities are reduced by 35%–55%, with the maximum reduction in the ∼100-m wide damage core zone, which corresponds to surface and shallow subsurface evidence for discrete fracturing. The damage zone delineated by FZTWs indicates an effective low-velocity waveguide extending ∼65 km along the GF and PHF under the Canterbury Plains. The waveguide varies in velocity and geometry along the multiple rupture segments viewed by FZTWs, and penetrates down to a depth of ∼8 km or deeper, consistent with hypocentral locations and geodetically derived fault models. Their experiment also illuminates a potential approach to imaging the buried part of a rupture zone using FZTWs recorded at a seismic array deployed at the surface-exposed part of the rupture zone. The authors have examined possible temporal changes in wave velocity for repeated aftershocks occurring just before and after the large aftershocks, to find the additional co-seismic damage in rocks associated with these large aftershocks. They measured a ∼2% decrease in seismic velocity within the fault rocks due to co-seismic damage by an M 5.3 aftershock. This value is generally consistent with observations of fault rock damage and healing at the San Andreas fault associated with the 2004 M 6 Parkfield earthquake (Li et al., 2006, 2007).
Chapter 6: “Characterizing Pre-shock (Accelerating) Moment Release: A Few Notes on the Analysis of Seismicity” by Changsheng Jiang and Zhongliang Wu. Understanding seismicity is one of the frontiers of modern seismology.
Careful consideration is necessary when processing and analyzing seismicity using earthquake catalogues. In this chapter, Jiang and Wu demonstrate some useful tactics for the analysis of earthquake catalog data and make notes on the existing methods used for careful analysis of seismicity in terms of (1) interfering events and the eclipse method, (2) the Bayesian information criterion, (3) the spatio-temporal scales for the sampling of seismic events, and (4) removal of aftershocks and the de-clustered Benioff strain method. The authors use the pre-shock accelerating moment release (AMR) model (Bufe et al., 1994; Brehm and Braile, 1998; Bowman and King, 2001), which is related to the critical-point-like behavior of earthquake preparation (Sornette and Sammis, 1995; Bowman et al., 1998; Jaumé and Sykes, 1999; Rundle et al., 2000). They explore whether the claimed and controversial pre-shock acceleration has a firm statistical (and seismological) basis through a retrospective investigation in which they focus on the scaling exponent, with the failure time fixed to the origin time of the ‘target’ earthquake so that the fitting can be stabilized by reducing one free
parameter (origin time). Borrowing from the concept used in modern astronomy for analyzing remote planets, they use an ‘eclipse method’ to screen out the seismicity in neighboring active fault zones, as shown in the analysis of catalog seismicity for the 2008 M 8 Wenchuan earthquake. The Bayesian Information Criterion (BIC) provides a useful aid for judging whether the apparent ‘accelerating’ trend is statistically significant. The BIC may be able to reveal more clues regarding the accelerating/quiescence behavior in the seismic moment release. To de-cluster an earthquake catalogue, previous works on AMR tended to use simple schemes (e.g., Robinson, 2005; Jiang and Wu, 2010). An alternative approach is to use the ‘Epidemic-Type Aftershock Sequences’ (ETAS) model (Ogata, 1988; Zhuang et al., 2002; Zhuang and Ogata, 2006), in which a stochastic de-clustering scheme no longer determines definitively whether an earthquake is a ‘background event’ or is triggered by another. To check the accelerating behavior objectively, the authors also map the scaling coefficient calculated for different spatio-temporal windows, with different cutoff magnitudes of the catalog (Jiang and Wu, 2005, 2010). The method extends a manifestation of the Gutenberg-Richter law. Deviation from the G-R power-law relation can be used for judging the completeness of an earthquake catalogue. Quantitatively, the goodness of fit between a power-law fit to the data and the observed frequency-magnitude distribution, as a function of the lower cutoff magnitude, can be used (Wiemer and Wyss, 2000). Finally, they provide case studies in seismicity analysis using real catalog data: (1) a ‘crack-like’ spatial window for the 2008 M 8.0 Wenchuan earthquake, (2) the finite earthquake rupture of the 2004 M 9.1 Sumatra-Andaman earthquake, and (3) seismic moment tensors used to investigate the moment release before the 2011 M 9.0 Tohoku earthquake.
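The comparison between an accelerating trend and a linear increase can be sketched as follows. The example assumes the commonly used AMR power-law form ε(t) = A + B (t_f − t)^m for cumulative Benioff strain, the energy relation log10 E = 1.5 M + 4.8 (E in joules), and the Gaussian-residual form of the BIC, n ln(RSS/n) + k ln n; none of these formulas is spelled out in this overview, and all function names are hypothetical. The failure time t_f is held fixed, as described above, so that for each trial exponent m the fit reduces to ordinary linear least squares.

```python
import math


def benioff_strain(magnitudes):
    """Cumulative Benioff strain: running sum of sqrt(E) over the events,
    with log10 E = 1.5 M + 4.8 (E in joules) assumed for the energy."""
    total, out = 0.0, []
    for mag in magnitudes:
        total += math.sqrt(10.0 ** (1.5 * mag + 4.8))
        out.append(total)
    return out


def linear_lsq(xs, ys):
    """Ordinary least squares for y = a + b x; returns (a, b, rss)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss


def bic(rss, n, k):
    """BIC for Gaussian residuals; the floor avoids log(0) on exact fits."""
    return n * math.log(max(rss, 1e-300) / n) + k * math.log(n)


def fit_amr(times, strain, t_f, exponents=None):
    """Grid search over m in eps(t) = A + B (t_f - t)^m with t_f fixed.

    All times must be earlier than t_f. Returns (A, B, m, rss).
    """
    if exponents is None:
        exponents = [i / 100 for i in range(5, 100)]
    best = None
    for m in exponents:
        xs = [(t_f - t) ** m for t in times]
        a, b, rss = linear_lsq(xs, strain)
        if best is None or rss < best[3]:
            best = (a, b, m, rss)
    return best


def compare(times, strain, t_f):
    """BIC of a straight line (2 parameters) vs. the power law (3)."""
    n = len(times)
    _, _, rss_lin = linear_lsq(times, strain)
    _, _, m, rss_pow = fit_amr(times, strain, t_f)
    return {"m": m,
            "bic_linear": bic(rss_lin, n, 2),
            "bic_power": bic(rss_pow, n, 3)}
```

A lower BIC for the power law than for the straight line is the kind of evidence that would support calling the trend ‘accelerating’; when the two values are close, the extra parameter is not justified by the data.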
Chapter 7: “Statistical Modeling of Earthquake Occurrences Based on External Geophysical Observations: With an Illustrative Application to the Ultralow Frequency Ground Electric Signals Observed in the Beijing Region” by Jiancang Zhuang, Yosihiko Ogata, David Vere-Jones, Li Ma and Huaping Guan. In this chapter, the authors present an approach to developing models for earthquake probability forecasts based on precursor data, using observations of the ultra-low frequency components of underground electric signals as an example to illustrate the modeling strategies. In the case study, signals from 4 stations in the vicinity of Beijing are used to monitor variations in the ultra-low frequency components of the electric field for forecasting the occurrence of M 4 earthquakes within a 300-km circle centered on Beijing. The model used is a version of Ogata’s Lin-Lin algorithm for examining the influence of an explanatory signal on the occurrence of events in a stochastic point process. The influence is found to be highly significant, and greatly superior to the explanatory effect of the same signals applied to a randomized version of the earthquake data. The results from all four stations show significant explanatory power, although in combination the two most effective stations tend to dominate the forecasts. The predictions appear to
be most effective for events with M 5, for which probability gains are up to 3–4 over the simple Poisson process, and for the events closer to the observing stations. Some smaller events appear to produce detectable signals at distances of over 100 km from the source. The probability modeling framework adopted in this chapter is extended to the development of probability forecasts, which can be assessed directly, and in their turn can form the basis for a variety of decision procedures (e.g., Vere-Jones, 1995, and further references). The authors present a brief discussion of the performance of probability forecasts based on the best Lin-Lin model, which provides a strong confirmation of the reality of the explanatory power of the electric signals. They also carefully examine the effect of changes in background seismicity. Results show that the Lin-Lin model based on the electrical signals still out-performs the two-stage Poisson model.

The purpose of this book is to introduce new approaches in solid-earth geophysics research with case studies. The following new methods and results presented in this book will be of particular interest to readers:
– The full-3D waveform tomography method, and beamlets and curvelets methods for imaging complex subsurface structure.
– Observations and 3-D finite-difference simulations of fault-zone trapped waves for high-resolution delineation of fault internal structure and physical properties.
– Co-seismic rock damage and post-mainshock heal in major earthquakes.
– Discrete element method for solid-fluid coupling mechanics in earthquake fracture modeling.
– Pre-shock accelerating moment release with analysis of seismicity for earthquake risk assessment.
– Ultra-low frequency ground electric signals for statistical modeling of earthquake occurrences.

This book is a self-contained volume that starts with an overview of the subject and then explores each topic in depth.
Extensive reference lists and cross references with other volumes facilitate further research. Full-color figures and tables support the text and aid readers in understanding. The content is suited to both senior researchers and graduate students in geosciences who wish to broaden their horizons in observational, computational and applied seismology and earthquake sciences. This book covers multi-disciplinary topics to allow readers to grasp the new methods and techniques used in data analysis and numerical modeling for structural, physical and mechanical interpretation of earthquake phenomena, to aid the understanding of earthquake processes and hazards, and thus to help readers evaluate potential earthquake risk in seismogenic regions globally.
Some of the articles in the preceding book (Book 1), edited by Li (2012), and in this book (Book 2) came out of the International Symposium on Earthquake Seismology and Earthquake Predictability (ISESEP) held in Beijing, China, in 2009, sponsored by the Institute of Geophysics, China Earthquake Administration (CEA), co-sponsored by the Asian Seismological Commission (ASC) of the International Association of Seismology and Physics of the Earth’s Interior (IASPEI) and supported by the International Union for Geodesy and Geophysics (IUGG). The meeting included two special sessions: I. “Wenchuan Earthquake: One Year After” and II. “Keiiti Aki Workshop on Earthquake Physics and Earthquake Predictability”. The meeting highlighted the importance of an international discussion on the seismology, geology, and geodynamics of strong to great earthquakes, their predictability, and how to make full use of present knowledge and techniques to reduce earthquake disasters. Chapter 3 by Yong-Gang Li, Peter E. Malin, and Elizabeth S. Cochran and Chapter 6 by Xiang-Chu Yin, Yue Liu, LangPing Zhang, and Shuai Yuan in Book 1, and Chapter 6 by Changsheng Jiang and Zhongliang Wu and Chapter 7 by Jiancang Zhuang, Yosihiko Ogata, David Vere-Jones, Li Ma and Huaping Guan in Book 2, came from presentations at the 2009 ISESEP meeting. The editor of this book series wishes to thank the reviewers who refereed articles in Volume 1 (Chen, 2012; Wu et al., 2012; Li et al., 2012a,b; Duan, 2012; Yin et al., 2012; Wang et al., 2012) and the present volume. In addition to many chapter authors, reviewers include Zhengxi Ge (PKU), Elizabeth Cochran (UCR), En-Jui Lee (UOW), David Oglesby (UCR), Martha Savage (VUOW), Yushen Sun (MIT), Xiao-Bi Xie (UCSC), Xiangzu Yin, and Yingcai Zheng (MIT). We are grateful to the many organizations and individuals, including HEP Director Bingxiang Li and Editors Zhengxiong Chen and Yan Guan, who helped to make both books possible.
This article was completed partly during the Author’s (YGL) visit as Honorary Professor in Chinese Academy of Geological Science, Beijing, China. Key Words: Data assimilation, Full-3D waveform tomography, Beamlets and curvelets methods, Fault-zone trapped waves, Rock damage and heal, Two-way coupling of solid-fluid, Discrete element model and lattice Boltzmann model, Pre-shock moment release, Earthquake catalogues, Relocation of the Wenchuan earthquake, Statistical modeling of earthquake occurrences, Ultra-low frequency ground electric signals.
References
Aki, K., 1984. Asperities, barriers, characteristic earthquakes, and strong motion prediction. J. Geophys. Res., 89, 5867–5872.
Barnhart, W. D., M. J. Willis, R. B. Lohman, and A. K. Melkonian, 2011. InSAR and optical constraints on fault slip during the 2010–2011 New Zealand earthquake sequence. Seismological Research Letters, 82 (6), 815–823.
Beck, S. L. and D. H. Christensen, 1991. Rupture process of the February 4, 1965, Rat Islands earthquake. J. Geophys. Res., 96, 2205–2221.
Beavan, J., M. Motagh, E. Fielding, N. Donnelly, and D. Collett, 2012. Fault slip models of the 2010–2011 Canterbury, New Zealand, earthquakes from geodetic data, and observations of post-seismic ground deformation. New Zealand Journal of Geology and Geophysics, 55, doi:10.1080/00288306.2012.697472.
Blanpied, M. L., D. A. Lockner, and J. D. Byerlee, 1992. An earthquake mechanism based on rapid sealing of faults. Nature, 359, 574–576.
Bowman, D. D. and G. C. P. King, 2001. Accelerating seismicity and stress accumulation before large earthquakes. Geophys. Res. Lett., 28, 4039–4042.
Bowman, D. D., G. Ouillon, C. G. Sammis, A. Sornette, and D. Sornette, 1998. An observational test of the critical earthquake concept. J. Geophys. Res., 103, 24359–24372.
Brehm, D. J. and L. W. Braile, 1998. Intermediate-term earthquake prediction using precursory events in the New Madrid seismic zone. Bull. Seism. Soc. Am., 88, 564–580.
Bufe, C. G., S. P. Nishenko, and D. J. Varnes, 1994. Seismicity trends and potential for large earthquake in the Alaska-Aleutian region. PAGEOPH, 142, 83–99.
Chen, P., 2012. Full-wave seismic data assimilation: A unified methodology for seismic waveform inversion. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 19–63.
Chester, F. M., J. P. Evans, and R. L. Biegel, 1993. Internal structure and weakening mechanisms of the San Andreas fault. J. Geophys. Res., 98, 771–786.
Chester, F. M. and J. S. Chester, 1998. Ultracataclasite structure and friction processes of the San Andreas fault. Tectonophysics, 295, 199–221.
Das, S. and K. Aki, 1997. Fault plane with barriers: A versatile earthquake model. J. Geophys. Res., 82, 5658–5670.
Day, S. M., 1984. Three-dimensional simulation of spontaneous rupture: The effect of nonuniform prestress. Bull. Seism. Soc. Am., 72, 1881–1902.
Dieterich, J. H., 1997. Modeling of rock friction. 1. Experimental results and constitutive equations. J. Geophys. Res., 84, 2169–2175.
Duan, B. C., 2012. Ground-motion simulations with dynamic source characterization and parallel computing. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 199–218.
Eberhart-Phillips, D. and S. Bannister, 2002. Three-dimensional crustal structure in the Southern Alps region of New Zealand from inversion of local earthquake and active source data. J. Geophys. Res., 107, doi:10.1029/2011JB000567.
Elliott, J. R., E. K. Nissen, P. C. England, J. A. Jackson, S. Lamb, Z. Li, M. Oehlers, and B. Parsons, 2012. Slip in the 2010–2011 Canterbury earthquakes, New Zealand. J. Geophys. Res., 117, B03401, 1–36.
Ellsworth, W. L., 1990. Earthquake history, 1769–1989. In: Wallace, R. E. (Ed.). The San Andreas Fault System, California. U. S. Geol. Surv. Prof. Pap., 1515, 153–187.
Harris, R. A. and S. M. Day, 1993. Dynamics of fault interaction: Parallel strike-slip faults. J. Geophys. Res., 98, 4461–4472.
Hickman, S., R. Sibson, and R. Bruhn, 1995. Introduction to special section: Mechanical involvement of fluids in faulting. J. Geophys. Res., 100, 12831–12840.
Jaumé, S. C. and L. R. Sykes, 1999. Evolving towards a critical point: A review of accelerating seismic moment/energy release prior to large and great earthquake. PAGEOPH, 155, 279–306.
Jiang, C. S. and Z. L. Wu, 2005. Test of the preshock accelerating moment release (AMR) in the case of the 26 December 2004 MW 9.0 Indonesia earthquake. Bull. Seism. Soc. Am., 95, 2016–2025.
Jiang, C. S. and Z. L. Wu, 2010. Seismic moment release before the May 12, 2008, Wenchuan earthquake in Sichuan of Southwest China. Concurrency Computat.: Pract. Exper., 22, 1784–1795.
Johnson, A. M., R. W. Fleming, and K. M. Cruikshank, 1994. Shear zones formed along long straight traces of fault zones during the 28 June 1992 Landers, California, earthquake. Bull. Seism. Soc. Am., 84, 499–510.
Kaiser, A., C. Holden, J. Beavan, D. Beetham, R. Benites, A. Celentano, D. Collett, J. Cousins, M. Cubrinovski, G. Dellow, P. Denys, E. Fielding, B. Fry, M. Gerstenberger, R. Langridge, C. Massey, M. Motagh, N. Pondard, G. McVerry, J. Ristau, M. Stirling, J. Thomas, S. R. Uma, and J. Zhao, 2012. The MW 6.2 Christchurch earthquake of February 2011: Preliminary report. New Zealand Journal of Geology and Geophysics, 55 (1), 67–90.
Lees, J. M. and P. E. Malin, 1990. Tomographic images of P wave velocity variation at Parkfield, California. J. Geophys. Res., 95, 21793–21804.
Li, Y. G. and P. C. Leary, 1990. Fault zone trapped seismic waves. Bull. Seism. Soc. Am., 80, 1245–1271.
Li, Y. G., P. C. Leary, K. Aki, and P. E. Malin, 1990. Seismic trapped modes in the Oroville and San Andreas fault zones. Science, 249, 763–766.
Li, Y. G., K. Aki, D. Adams, A. Hasemi, and W. H. K. Lee, 1994. Seismic guided waves trapped in the fault zone of the Landers, California, earthquake of 1992. J. Geophys. Res., 99, 11705–11722.
Li, Y. G., J. E. Vidale, K. Aki, F. Xu, and T. Burdette, 1998. Evidence of shallow fault zone strengthening after the 1992 M 7.5 Landers, California, earthquake. Science, 279, 217–219.
Li, Y. G., P. Chen, E. S. Cochran, J. E. Vidale, and T. Burdette, 2006. Seismic evidence for rock damage and healing on the San Andreas fault associated with the 2004 M 6 Parkfield earthquake. Special issue for Parkfield M 6 earthquake. Bull. Seism. Soc. Am., 96(4), S1-15, doi:10.1785/0120050803.
Li, Y. G., J. E. Vidale, K. Aki, and F. Xu, 2000. Depth-dependent structure of the Landers fault zone from trapped waves generated by aftershocks. J. Geophys. Res., 105, 6237–6254.
Li, Y. G., J. E. Vidale, S. M. Day, and D. Oglesby, 2002. Study of the M 7.1 Hector Mine, California, earthquake fault plane by fault-zone trapped waves. Hector Mine Earthquake Special Issue. Bull. Seism. Soc. Am., 92, 1318–1332.
Li, Y. G., J. E. Vidale, and E. S. Cochran, 2004. Low-velocity damaged structure of the San Andreas fault at Parkfield from fault-zone trapped waves. Geophys. Res. Lett., 31, L12S06.
Li, Y. G., P. Chen, E. S. Cochran, and J. E. Vidale, 2007. Seismic velocity variations on the San Andreas Fault caused by the 2004 M 6 Parkfield earthquake and their implications. Earth Planets Space, 59, 21–31.
Li, Y. G. and P. E. Malin, 2008. San Andreas Fault damage at SAFOD viewed with fault-guided waves. Geophys. Res. Lett., 35, L08304, doi:10.1029/2007GL032924.
Li, Y. G., 2012. Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 1–262.
Li, Y. G., P. Malin, and E. Cochran, 2012a. Fault-zone trapped waves: High-resolution characterization of the damage zone on the Parkfield San Andreas fault at depth. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 108–150.
Li, Y. G., J. Y. Sue, and T. C. Chen, 2012b. Fault-zone trapped waves at a dip fault: Documentation of rock damage on the thrusting Longmen-Shan fault ruptured in the 2008 M 8 Wenchuan earthquake. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 151–198.
Malin, P. E., S. N. Blakeslee, M. G. Alvarez, and A. J. Martin, 1989. Microearthquake imaging of the Parkfield asperity. Science, 244, 557–559.
Marone, C., 1998. Laboratory-derived friction laws and their application to seismic faulting. Annu. Rev. Earth Planet. Sci., 26, 643–696.
Ogata, Y., 1988. Likelihood analysis of point processes and its application to seismological data. Bulletin of the International Statistical Institute, 50, 943–961.
Quigley, M., R. Van Dissen, N. Litchfield, P. Villamor, D. Barrell, T. Stahl, E. Bilderback, and D. Noble, 2012. Surface rupture during the 2010 MW 7 Darfield (Canterbury) earthquake: Implications for fault rupture dynamics and seismic hazard analysis. Geology, 40 (1), 55–58.
Rice, J. R., 1980. The mechanics of earthquake rupture. In: Dziewonski, A. M. and E. Boschi (Eds.). Physics of the Earth’s Interior. Amsterdam, 555–649.
Robinson, R., S. Y. Zhou, S. Johnston, and D. Vere-Jones, 2005. Precursory accelerating seismic moment release (AMR) in a synthetic seismicity catalog: A preliminary study. Geophys. Res. Lett., 32, L07309, doi:10.1029/2005GL022576.
Rundle, J. B., W. Klein, D. L. Turcotte, and B. D. Malamud, 2000. Precursory seismic activation and critical-point phenomena. PAGEOPH, 157, 2165–2182.
Scholz, C. H., 1990. Wear and gouge formation in brittle faulting. Geology, 15, 493–495.
Sieh, K., et al., 1993. Near-field investigations of the Landers earthquake sequence, April to July 1992. Science, 260, 171–176.
Smith, E. G., T. Stern, and B. O’Brien, 1995. A seismic velocity profile across the central South Island, New Zealand, from explosion data. New Zealand Journal of Geology and Geophysics, 38, 565–570.
Sornette, D. and C. G. Sammis, 1995. Critical exponents from renormalization group theory of earthquakes: Implications for earthquake prediction. J. Phys. I., 5, 607–619.
Thurber, C., S. Roecker, H. Zhang, S. Baher, and W. Ellsworth, 2004. Fine-scale structure of the San Andreas fault zone and location of the SAFOD target earthquakes. Geophys. Res. Lett., 31, L12S02, doi:10.1029/2003GL019398.
Vere-Jones, D., 1995. Forecasting earthquakes and earthquake risk. International Journal of Forecasting, 11, 503–538.
Vidale, J. E., W. L. Ellsworth, A. Cole, and C. Marone, 1994. Rupture variation with recurrence interval in eighteen cycles of a small earthquake. Nature, 368, 624–626.
Vidale, J. E. and Y. G. Li, 2003. Damage to the shallow Landers fault from the nearby Hector Mine earthquake. Nature, 421, 524–526.
Wald, D. J. and T. H. Heaton, 1994. Spatial and temporal distribution of slip for the 1992 Landers, California, earthquake. Bull. Seism. Soc. Am., 84, 668–691.
Wang, Y. C., S. Xue, and J. Xie, 2012. Discrete element method and its applications in earthquake and rock fracture modeling. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 235–262.
Wiemer, S. and M. Wyss, 2000. Minimum magnitude of complete reporting in earthquake catalogs: Examples from Alaska, the Western United States, and Japan. Bull. Seism. Soc. Am., 90, 859–869.
Wesson, R. L. and W. L. Ellsworth, 1973. Seismicity preceding moderate earthquakes in California. J. Geophys. Res., 78, 8527–8545.
Wu, R. S., X. B. Xie, and S. W. Jin, 2012. One-return propagators and the applications in modeling and imaging. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 65–105.
Yin, X. C., Y. Liu, S. A. Yuan, and L. P. Zhang, 2011. LURR and its new progress. In: Li, Y. G. (Ed.). Imaging, Modeling and Assimilation in Seismology. Higher Education Press, Beijing, China, De Gruyter, Boston, USA, 219–234.
Zhuang, J. and Y. Ogata, 2006. Properties of the probability distribution associated with the largest event in an earthquake cluster and their implications to foreshocks. Phys. Rev. E., 73, 046134, doi:10.1103/PhysRevE.73.046134.
Zhuang, J., Y. Ogata, and D. Vere-Jones, 2002. Stochastic declustering of space-time earthquake occurrences. J. Amer. Stat. Assoc., 97, 369–380.
Author Information Yong-Gang Li Department of Earth Sciences, University of Southern California, Los Angeles, CA 90089, USA. E-mail: [email protected]
Chapter 1
Applications of Full-Wave Seismic Data Assimilation (FWSDA) Dawei Mu, En-Jui Lee, and Po Chen
In the first volume of this book series, we introduced the concept of full-wave seismic data assimilation (FWSDA) and its mathematical formulations in the framework of the various data assimilation theories (Chen, 2010). The full-wave seismological inverse problem, which aims at estimating earth structure parameters and seismic source parameters using observed waveform data and the seismic wave equation, can be formulated as a weakly constrained generalized inverse, in which the seismic wave equation (with its initial and boundary conditions), the structural and source parameters and the waveform misfit measurements are all allowed to contain errors. FWSDA provides a unified framework for solving seismological inverse problems and for estimating uncertainties associated with the nonlinear inversion process. Both the adjoint-wavefield (AW) method and the scattering-integral (SI) method can be derived from FWSDA as special cases. In this chapter, we will discuss issues related to the applications of FWSDA in realistic seismological inverse problems. In FWSDA, the seismic wave equation and its adjoint system, if the AW method is adopted, or the receiver-side Green’s tensors (RGTs), if the SI method is adopted, need to be solved many times. For three-dimensional earth structure models with highly irregular surface topography or fault structures, the efficiency and the accuracy of the wave equation solver are highly important in solving the problem in a realistic amount of time. In this chapter, we will review and discuss some of the latest developments in numerical solutions of the forward problem and their implementation and optimization on modern CPU-GPU hybrid parallel computing platforms. In some of the recent successful full-3D waveform tomography applications, the waveform misfit measurements were made on selected wave packets on the seismograms. 
For realistic inversions involving a large amount of seismic data, this waveform selection process needs to be automated to a certain extent. We will discuss
some recent developments in automating seismic waveform data processing and selection. A realistic full-3D, full-wave tomography for the crustal structure in Southern California will be used to illustrate the various components of FWSDA. Key Words: Data assimilation, Full-wave tomography, Full-3D inversion, Earthquake source parameters, Discontinuous Galerkin, Adjoint method and scattering-integral methods, Finite-difference, Discontinuous mesh, GPU, Waveform selection.
1.1 Numerical Solutions of Seismic Wave Equations
Computer simulations of seismic wavefields have played an important role in seismology in the past few decades. However, the accurate and computationally efficient numerical solution of the three-dimensional (visco)elastic seismic wave equation is still a very challenging task, especially when the material properties are complex and the modeling geometry, such as surface topography and subsurface fault structures, is irregular. In the past, several numerical schemes have been developed to solve the elastic seismic wave equation. The finite-difference (FD) method was introduced to simulate SH and P-SV waves on regular, staggered-grid, two-dimensional meshes in Madariaga (1976) and Virieux (1984, 1986). The FD method was later extended to three spatial dimensions and to account for anisotropic, viscoelastic material properties (e.g., Mora, 1989; Igel et al., 1995; Tessmer, 1995; Graves, 1996; Moczo et al., 2002). The spatial accuracy of the FD method is mainly controlled by the number of grid points used to sample each wavelength. The pseudo-spectral (PS) method with Chebyshev or Legendre polynomials (e.g., Carcione, 1994; Tessmer and Kosloff, 1994; Igel, 1999) partially overcomes some limitations of the FD method and allows for highly accurate computations of spatial derivatives. However, due to the global character of its derivative operators, it is relatively cumbersome to account for irregular modeling geometry, and efficient and scalable parallelization on distributed-memory computer clusters is not as straightforward as in the FD method. Another possibility is to consider the weak (i.e., variational) form of the seismic wave equation. The finite-element (FE) method (e.g., Lysmer and Drake, 1972; Bao et al., 1998) and the spectral-element (SE) method (e.g., Komatitsch and Vilotte, 1998; Komatitsch and Tromp, 1999, 2002) are based on the weak form.
An important advantage of such methods is that the free-surface boundary condition is naturally accounted for even when the surface topography is highly irregular. And in the SE method, high-order polynomials (e.g.,
Lagrange polynomials defined on Gauss-Lobatto-Legendre points) are used for approximation, which provides a significant improvement in spatial accuracy and computational efficiency. The arbitrary high-order discontinuous Galerkin (ADER-DG) method on unstructured meshes was introduced to solve the two-dimensional isotropic elastic seismic wave equation in Käser and Dumbser (2006). It was later extended to the three-dimensional isotropic elastic case in Dumbser and Käser (2006) and to account for viscoelastic attenuation (Käser et al., 2007), anisotropy (la Puente et al., 2007) and poroelasticity (la Puente et al., 2009). The p-adaptivity (i.e., the polynomial degrees of the spatial basis functions can vary from element to element) and locally varying time steps were addressed in Dumbser et al. (2007). Unlike conventional numerical schemes, which usually adopt a relatively low-order time-stepping method such as the Newmark scheme (Hughes, 1987) or the 4th-order Runge-Kutta scheme (e.g., Igel, 1999), the ADER-DG method achieves high-order accuracy in both space and time by using the arbitrary high-order derivatives (ADER) approach, which was originally introduced in Titarev and Toro (2002) in the finite-volume framework. The ADER scheme performs high-order explicit time integration in a single step without any intermediate stages. In three dimensions, the ADER-DG scheme achieves high-order accuracy on unstructured tetrahedral meshes, which allows for automated mesh generation even when the modeling geometry is highly complex. Furthermore, the majority of the operators in the ADER-DG method are applied in an element-local way, with weak element-to-element coupling based on numerical flux functions, which results in strong locality in memory access patterns. The high-order nature of this method also lets it require fewer data points, and therefore fewer memory fetches, in exchange for higher arithmetic intensity.
These characteristics of the ADER-DG method make it well suited to run on massively parallel graphics processing units (GPUs). In the following sections, we will discuss in more detail some recent developments in the finite-difference method, in particular its extensions to non-uniform and discontinuous meshes, and the ADER-DG method. It is likely that the literature cited in the following is incomplete. However, some of the key references are included and readers who are interested in studying these topics in depth can use them as a starting point for further investigation. This is a highly active research area with many new ideas and implementations emerging rapidly. Advances in computing architecture certainly play an important role, and many new implementations and optimizations are facilitated by innovations in computer science.
1.1.1 Stable Finite-Difference Solutions on Non-Uniform, Discontinuous Meshes
The finite-difference method for solving acoustic and (visco)elastic seismic wave equations has been used extensively in seismology because it is numerically efficient both on commodity desktops and on modern distributed-memory parallel computing platforms, and it is relatively easy to program and use. In the conventional uniform-mesh finite-difference method, the grid space and the time-step length are determined by the maximum desired frequency of the resulting synthetic seismograms and the CFL (Courant-Friedrichs-Lewy) stability condition, i.e.,

$$\frac{\alpha_{\max}\,\Delta t}{h} < 0.5 \qquad (1.1)$$

where $\alpha_{\max}$ is the maximum P-wave speed, $\Delta t$ is the time-step length and $h$ is the grid space. Using our tomography in Southern California as an example, the maximum desired frequency of the synthetic seismograms is 0.2 Hz and the minimum S-wave speed in our three-dimensional starting model is 900 m/s, which gives a minimum wavelength of 4,500 m. If we choose a grid space of 500 m, we obtain 9 grid points per minimum wavelength in our three-dimensional 4th-order staggered-grid finite-difference simulations. In a 4th-order finite-difference scheme, 5.5–6 grid points per minimum wavelength are usually sufficient to ensure accuracy of the synthetic seismograms. We are using 9 grid points per minimum wavelength in the starting model because the minimum S-wave speed in our structure model may decrease when we update our velocity model during the iterative tomographic inversion process. The maximum P-wave speed in the simulation volume is 8,223 m/s; by Equation (1.1), the time-step length must be smaller than 0.0304 s for the simulation to be stable. For a simulation volume that is 900 km long, 450 km wide and 50 km deep, the total number of grid points is 162 million. If the desired length of the synthetic seismograms is 180 s and the time-step length is around 0.03 s, the total number of time steps is about 6,000.
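The numbers quoted in this example can be checked with a few lines of arithmetic. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the CFL bound of 0.5 is the one given in Equation (1.1), and the grid-point total is obtained by counting grid cells (volume length divided by grid space in each direction), which is the convention that reproduces the 162 million figure.

```python
import math


def min_wavelength(v_s_min, f_max):
    """Shortest wavelength to resolve: minimum S-wave speed over maximum frequency."""
    return v_s_min / f_max


def cfl_time_step_limit(h, alpha_max, cfl=0.5):
    """Upper bound on the time step from alpha_max * dt / h < cfl."""
    return cfl * h / alpha_max


def grid_point_count(lx, ly, lz, h):
    """Number of grid points, counted as cells of size h in each direction (assumption)."""
    return round(lx / h) * round(ly / h) * round(lz / h)


# Values quoted in the text for the Southern California example.
f_max = 0.2          # Hz, maximum desired frequency
v_s_min = 900.0      # m/s, minimum S-wave speed in the starting model
alpha_max = 8223.0   # m/s, maximum P-wave speed
h = 500.0            # m, grid space

lam_min = min_wavelength(v_s_min, f_max)            # about 4,500 m
points_per_wavelength = lam_min / h                 # about 9
dt_max = cfl_time_step_limit(h, alpha_max)          # about 0.0304 s
n_points = grid_point_count(900e3, 450e3, 50e3, h)  # 162 million
n_steps = round(180.0 / 0.03)                       # about 6,000 time steps

print(lam_min, points_per_wavelength, round(dt_max, 4), n_points, n_steps)
```

Halving the grid space in such a calculation multiplies the number of grid points by eight and, through the CFL bound, also halves the admissible time step, which is why the choice of grid space dominates the cost of a uniform-mesh simulation.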
On the latest IBM Blue Gene/Q system, one simulation takes about 15 minutes of wall-time on 2,048 cores. For many earth structure models, the minimum S-wave speed close to the surface of the earth can be much smaller than that at greater depths. If this is the case, using a discontinuous mesh with a finer grid in the upper part of the model and a coarser grid in the lower part of the model may significantly improve computational efficiency without sacrificing simulation accuracy. Considering our Southern California example, the minimum S-wave speed increases from around 900 m/s at 250 m depth to around 3,000 m/s at around 5 km depth. If we adopt a finer grid with 500 m grid space for the modeling volume above 5 km depth and a coarser grid with 1,500 m grid space for the volume below 5 km
depth, the total number of grid points is 21.6 million (16.2 million in the finer grid and 5.4 million in the coarser grid), a reduction of about 87% compared with the uniform mesh configuration, which translates directly into significant savings in wall-time, core count, or both. An important challenge in implementing finite-difference methods on discontinuous meshes is how to reduce the instability caused by the numerical noise generated at the interface between the finer and the coarser grids. On this interface, in order to compute the spatial derivatives of the field variables (e.g., velocity and stress) at the finer-grid boundary, we need access to the field variables at grid positions that do not exist at the coarser-grid boundary. Some type of interpolation scheme is needed to obtain the field variables at those missing grid positions. The existing finite-difference implementations on discontinuous meshes can be categorized based on their interpolation approaches for reducing the instability. For two-dimensional acoustic wave equations, Jastram and Behle (1992) used trigonometric interpolation in the horizontal direction to obtain the pressure at those missing grid positions at the boundary of the coarser grid. The trigonometric interpolation scheme is closely related to Fourier spectral methods, which have been shown to be highly accurate in computing spatial derivatives of the field variables. This interpolation scheme allows an arbitrary integer ratio of the coarser grid space H and the finer grid space h, although intuitively one can expect that the larger the grid ratio H/h, the higher the possibility of generating numerical instability. The same methodology was extended to the two-dimensional P-SV elastic wave equation using a staggered grid in Jastram and Tessmer (1994).
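To make the idea of trigonometric interpolation concrete, the following Python sketch fills in the missing fine-grid positions along one horizontal line of coarse-grid samples. It is a generic one-dimensional illustration rather than the scheme of Jastram and Behle (1992): it assumes periodic, equally spaced samples, an odd number of coarse points (to avoid special handling of the Nyquist term), and a plain discrete Fourier sum; the function names are hypothetical.

```python
import cmath
import math


def trig_interpolate(samples, ratio):
    """Interpolate periodic, equally spaced samples onto a grid `ratio` times finer."""
    n = len(samples)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of samples")
    half = n // 2
    # Discrete Fourier coefficients for wavenumbers -half..half.
    coeffs = {
        k: sum(s * cmath.exp(-2j * math.pi * k * idx / n) for idx, s in enumerate(samples)) / n
        for k in range(-half, half + 1)
    }
    fine = []
    for i in range(n * ratio):
        x = i / ratio  # position measured in units of the coarse grid space H
        value = sum(c * cmath.exp(2j * math.pi * k * x / n) for k, c in coeffs.items())
        fine.append(value.real)
    return fine


def field(x, n):
    """Smooth periodic test field that is band-limited on an n-point coarse grid."""
    return math.cos(2 * math.pi * 2 * x / n) + 0.5 * math.sin(2 * math.pi * 3 * x / n)


n_coarse, ratio = 15, 3  # grid ratio H/h = 3
coarse = [field(j, n_coarse) for j in range(n_coarse)]
fine = trig_interpolate(coarse, ratio)
max_error = max(abs(v - field(i / ratio, n_coarse)) for i, v in enumerate(fine))
print(len(fine), max_error)
```

For a field that is band-limited on the coarse grid, as in this test, the interpolated values match the true field to rounding error, which reflects the spectral accuracy mentioned above. A production code would use a fast Fourier transform instead of the direct sums.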
An interpolation scheme that is closely related to trigonometric interpolation is interpolation in the wavenumber domain, which was adopted in Wang and Schuster (1996) to solve three-dimensional acoustic and elastic wave equations. The same technique was extended to the viscoelastic wave equation in Wang et al. (2001). Simple linear or bilinear interpolation schemes have also been adopted in both two-dimensional (e.g., Hayashi et al., 2001) and three-dimensional (e.g., Aoi and Fujiwara, 1999) finite-difference simulations. In Aoi and Fujiwara (1999), numerical evidence showed that when the grid space ratio H/h = 3 and the number of grid points per wavelength is larger than 10, the error introduced by a linear interpolation scheme is less than 2.2%, which is sufficiently accurate for the 2nd-order staggered-grid finite-difference scheme used in their simulations. A different issue that is also related to the instability problem is how to downsample the field variables from the finer grid to the coarser grid on the interface. Theoretical considerations (e.g., Kristek et al., 2010) and some numerical experiments (e.g., Hayashi et al., 2001; Kristek et al., 2010) have shown that one cannot simply take the field variable values in the finer grid to replace those coarser-grid field variable values at the coarser grid positions that coincide with the grid points in the finer grid when computing spatial derivatives of the field
variables in the coarser grid. From a theoretical point of view (e.g., Kristek et al., 2010), the minimum wavelength supported by the finer grid, $\lambda_h$, is smaller than the minimum wavelength supported by the coarser grid, $\lambda_H$, for a given frequency. When the wave-field enters the coarser grid from the finer grid at the interface, waves with wavelength larger than $\lambda_h$ but smaller than $\lambda_H$ introduce aliasing into the coarser grid, so a filtering process that removes waves with wavelength smaller than $\lambda_H$ is needed at the interface to ensure numerical stability. In Hayashi et al. (2001), a one-dimensional five-point averaging formula was used to improve the stability of their two-dimensional P-SV viscoelastic finite-difference scheme. In Kristek et al. (2010), the Lanczos down-sampling filter was used to improve the stability of their three-dimensional 4th-order staggered-grid finite-difference scheme. The Lanczos filter is a windowed sinc function in space and provides a good approximation to a boxcar in the wavenumber domain. It can be implemented efficiently using a weighted averaging formula on the interface (Kristek et al., 2010). If a single time step is used for the discontinuous spatial mesh, this time step may become unnecessarily small for some spatial grid points. To further improve numerical efficiency, a straightforward extension is to use a locally varying time step that is adapted to the stability condition, Equation (1.1), in each submesh. This type of local-time-step, discontinuous-grid finite-difference method was implemented in Kang and Baag (2004). In their implementation, a simple linear interpolation scheme was adopted for both the temporal and the spatial interpolations of the field variables on the mesh interface, and the 4th-order staggered-grid finite-difference scheme was used for all interior grid points.
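The windowed-sinc idea behind Lanczos down-sampling can be sketched in a few lines of plain Python. This is a generic one-dimensional Lanczos kernel, not the specific weights of Kristek et al. (2010); the function names, the `lobes` parameter and the choice of cutoff at the coarse-grid Nyquist wavenumber are assumptions made for illustration only.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def lanczos_weights(ratio, lobes=2):
    """Normalized weights of a 1-D Lanczos-windowed sinc low-pass filter.

    ratio : grid space ratio H/h (hypothetical parameter name)
    lobes : number of sinc lobes kept by the window (assumed value)
    """
    half_width = ratio * lobes - 1  # the window vanishes at |k| = ratio * lobes
    raw = [sinc(k / ratio) * sinc(k / (ratio * lobes))
           for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]  # normalization preserves constant fields

def downsample(fine, ratio, lobes=2):
    """Weighted average on the fine grid, keeping every `ratio`-th sample."""
    w = lanczos_weights(ratio, lobes)
    hw = len(w) // 2
    coarse = []
    for i in range(hw, len(fine) - hw, ratio):
        coarse.append(sum(wk * fine[i + k - hw] for k, wk in enumerate(w)))
    return coarse

# A constant field passes through unchanged, while the shortest wavelength
# representable on the fine grid (+1, -1, +1, ...) is strongly attenuated.
smooth = downsample([1.0] * 40, 3)
rough = downsample([(-1.0) ** i for i in range(40)], 3)
```

For H/h = 3 the filter has 11 weights, so each coarse-grid value is a weighted average over a short line of fine-grid values, which is the kind of averaging formula the text refers to.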
The efficient implementation of such local-time-step, discontinuous-grid finite-difference schemes on modern distributed-memory parallel computing platforms is still a very challenging issue. If the spatial mesh is distributed evenly among all processors using a simple domain decomposition approach, the processors that are mainly occupied by the coarser grid will likely be idle for a significant amount of time because the field variables on the coarser grid are updated less frequently than those on the finer grid, which is a serious load-balancing problem. Another possibility is to distribute the finer grid and the coarser grid separately so that each processor owns an equal number of finer-grid points as well as an equal number of coarser-grid points. In such a case, every processor will always have some work to do at every time step, but the spatial decomposition of the finer and the coarser grids may no longer conform to simple boundaries and may introduce additional complexity in exchanging boundary field variables among processors. Instead of using a discontinuous mesh, one can also adapt the mesh to the velocity model using a non-uniform but continuous mesh. In a non-uniform mesh, the number of grid points in each direction does not change; therefore one does not need to interpolate field variables. However, the grid space can vary in accordance with the velocity model and avoid oversampling in regions with
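high velocity. Following Pitarka (1999), the 4th-order difference operator $D_x$ acting on a field variable $g(x)$ at location $x_i$ can be expressed as

$$D_x g(x_i) = c_1 g(x_i + \Delta_1) + c_2 g(x_i - \Delta_2) + c_3 g(x_i + \Delta_3) + c_4 g(x_i - \Delta_4) \tag{1.2}$$

where $c_i$ are 4 coefficients to be determined and $\Delta_i$ are spatial increments on both sides of $x_i$, which can be expressed in terms of the non-uniform grid spaces. Transforming Equation (1.2) into the Fourier domain, we obtain an equation in terms of the wavenumber $k$,

$$ik = c_1 \exp(ik\Delta_1) + c_2 \exp(-ik\Delta_2) + c_3 \exp(ik\Delta_3) + c_4 \exp(-ik\Delta_4) \tag{1.3}$$

The exponentials in Equation (1.3) can be expanded into Taylor series and truncated after the cubic term in $k$. Using the first term on the right-hand side as an example, we have

$$\exp(ik\Delta_1) \approx 1 + ik\Delta_1 - \frac{k^2\Delta_1^2}{2} - i\,\frac{k^3\Delta_1^3}{6} \tag{1.4}$$

Substituting Equation (1.4) and the analogous expansions of the other three exponentials into Equation (1.3) and collecting the terms according to the order of $k$, we obtain

$$\begin{aligned}
ik ={}& (c_1 + c_2 + c_3 + c_4) + ik\,(c_1\Delta_1 - c_2\Delta_2 + c_3\Delta_3 - c_4\Delta_4) \\
&+ \frac{k^2}{2}\,(-c_1\Delta_1^2 - c_2\Delta_2^2 - c_3\Delta_3^2 - c_4\Delta_4^2) \\
&+ i\,\frac{k^3}{6}\,(-c_1\Delta_1^3 + c_2\Delta_2^3 - c_3\Delta_3^3 + c_4\Delta_4^3)
\end{aligned} \tag{1.5}$$

Matching the coefficients of each power of $k$ on both sides, Equation (1.5) can be expressed in matrix form as

$$\begin{pmatrix}
1 & 1 & 1 & 1 \\
\Delta_1 & -\Delta_2 & \Delta_3 & -\Delta_4 \\
-\Delta_1^2 & -\Delta_2^2 & -\Delta_3^2 & -\Delta_4^2 \\
-\Delta_1^3 & \Delta_2^3 & -\Delta_3^3 & \Delta_4^3
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \tag{1.6}$$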
which can be solved for the coefficients $c_i$. The same analysis can also be performed along the y- and z-axes. Explicit expressions for $c_i$ in terms of $\Delta_i$ can be obtained by solving Equation (1.6) using a computer algebra system such as Maple or Mathematica. Once the non-uniform mesh has been set up, the spatial increments $\Delta_i$ are known and the coefficients $c_i$ only need to be computed once and stored on disk. For a staggered-grid mesh, two sets of $c_i$ need to be computed, one for field variables located on the grid points and one for those located on positions shifted by half the grid space. Perhaps an even more efficient implementation would be a combination of a discontinuous mesh with a non-uniform mesh. In Liu and Archuleta (2002), the
mesh is allowed to be discontinuous in the vertical direction with a grid space ratio H/h = 3 and also non-uniform in all three spatial dimensions. The perfectly-matched-layer (PML) boundary condition is implemented for all boundaries of the modeling volume except for the free surface, and the 4th-order staggered-grid finite-difference scheme is adopted for all interior grid points. This code has been parallelized using the message-passing interface (MPI). It is used in some of our own modeling and inversion studies in which the effects of surface topography and the curvature of the Earth do not need to be considered. The improvement in computational efficiency over a uniform-mesh 4th-order staggered-grid finite-difference code is substantial. In cases where irregular surface topography and/or subsurface fault structures need to be accounted for, we use the ADER-DG method for solving the seismic wave equation. More discussions about our ADER-DG implementation are presented in Sections 1.1.3 and 1.1.4.
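As a concrete check of Equation (1.6), the 4 × 4 system can also be solved numerically rather than symbolically. The sketch below uses exact rational arithmetic from the Python standard library in place of a computer algebra system; the function name and argument order are hypothetical. For a uniform staggered grid with unit spacing (increments 1/2, 1/2, 3/2, 3/2) it should reproduce the familiar 4th-order staggered-grid weights 9/8, -9/8, -1/24 and 1/24.

```python
from fractions import Fraction

def fd_coefficients(d1, d2, d3, d4):
    """Solve Equation (1.6) for c1..c4 by Gauss-Jordan elimination
    in exact rational arithmetic."""
    d = [Fraction(x) for x in (d1, d2, d3, d4)]
    signs = [1, -1, 1, -1]  # sample points at x+d1, x-d2, x+d3, x-d4
    # One row per power of k (0 to 3), augmented with the right-hand side.
    rows = [
        [Fraction(1) for _ in d] + [Fraction(0)],
        [s * x for s, x in zip(signs, d)] + [Fraction(1)],
        [-(x ** 2) for x in d] + [Fraction(0)],
        [-s * x ** 3 for s, x in zip(signs, d)] + [Fraction(0)],
    ]
    n = 4
    for col in range(n):
        # Pick a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[r][n] for r in range(n)]

# Uniform staggered grid with unit spacing.
uniform = fd_coefficients(Fraction(1, 2), Fraction(1, 2),
                          Fraction(3, 2), Fraction(3, 2))
```

Because the increments are known once the mesh is fixed, a routine of this kind would be called once per grid line during set-up and the results stored, as described above.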
1.1.2 Accelerating Finite-Difference Methods Using GPUs
In the past four decades, development in the computing chip industry has roughly followed Moore's law. Many of the performance improvements were due to increased clock speeds and sophisticated instruction scheduling in a single core. As the transistor density keeps increasing, the industry is now facing a number of engineering difficulties with using a large number of transistors efficiently in individual cores (e.g., power consumption, power dissipation). The effect is that clock speeds are staying relatively constant and core architecture is expected to become simpler. As a consequence, when we consider future platforms for high-performance scientific computing, there are some inevitable trends, for instance, the increase in the number of cores in general-purpose CPUs and the adoption of many-core accelerators (e.g., Field Programmable Gate Array, Graphics Processing Unit, Cell Broadband Engine), which have smaller footprints and lower power consumption per flop than general-purpose CPUs. Users who want to see substantial performance improvements again need to learn how to exploit multiple or many cores. The graphics processing unit (GPU) has become an attractive many-core coprocessor for general-purpose scientific computing in the past few years. In the conventional CPU architecture, a large number of transistors are dedicated to caches, prediction and speculation, mainly to combat the memory bottleneck caused by bandwidth limitations and memory-fetch latency. In a typical GPU, by contrast, many more transistors are dedicated to arithmetic calculations rather than data caching and flow control. The abundance of cheap computing power on a GPU allows us to effectively hide memory-access latencies with massive parallelism. In particular, on a GPU one
can launch a large number of threads and the thread scheduler can effectively overlap memory transactions for some threads with arithmetic calculations on other threads. The massive parallelism offered by GPUs is particularly well suited to data-parallel calculations such as those used in solving seismic wave equations. In fact, most of the numerical algorithms used for solving seismic wave equations can be expressed in terms of simple local-scale operations applied in parallel on many different pieces of distributed data with limited or no interdependence, i.e., in single-instruction-multiple-data (SIMD) style. In the past, programming on GPUs was difficult and different from that on CPUs because of the significant barriers to recasting scientific algorithms into unfamiliar graphics programming frameworks. Recent efforts by GPU vendors, in particular NVIDIA's CUDA (Compute Unified Device Architecture) programming model, the OpenCL (Open Computing Language) framework and the OpenACC compiler directives and APIs, have significantly increased the programmability of commodity GPUs. Using these tools, a programmer can directly issue and manage data-parallel computations on GPUs using high-level instructions without the need to map them into a set of graphics-processing instructions. For readers who are not familiar with CUDA or GPU programming, we give a very brief introduction to the programming model in the following section.
1.1.2.1 CUDA programming model
The CUDA software stack is composed of several layers, including a hardware driver, an application programming interface (API) and its runtime environment. There are also two high-level, extensively optimized CUDA mathematical libraries, the fast Fourier transform library (CUFFT) and the basic linear algebra subprograms (CUBLAS), which are distributed together with the software stack. The CUDA API is designed as an extension to the C programming language to minimize the learning curve. The complete CUDA programming toolkit is distributed free of charge and is regularly maintained and updated by NVIDIA. A CUDA program is essentially a C program with multiple subroutines (i.e., functions). Some of the subroutines may run on the “host” (i.e., the CPU) and others may run on the “device” (i.e., the GPU). The subroutines that run on the device are called CUDA “kernels”. A CUDA kernel is typically executed on a very large number of threads to exploit data parallelism, which is essentially a type of SIMD operation. Unlike on CPUs, where thread generation and scheduling usually take thousands of clock cycles, GPU threads are extremely “light-weight” and cost very few cycles to generate and manage. The large number of threads is organized into many “thread blocks”. The threads within a block are executed in groups of 16, called a “half-warp” (half of a 32-thread warp), by the “multiprocessors” (a
type of vector processor), each of which executes in parallel with the others. A multiprocessor can have a number of “stream processors”, which are sometimes called “cores”. A high-end Fermi GPU has 16 multiprocessors and each multiprocessor has two groups of 16 stream processors, which amounts to 512 processing cores. The memory on a GPU is organized in a hierarchical structure. Each thread has access to its own registers, which are very fast but very limited in number. The threads within the same block have access to a small pool of low-latency “shared memory”. The total amount of registers and shared memory available on a multiprocessor restricts the maximum number of active warps on that multiprocessor (i.e., the “occupancy”), depending upon the amount of registers and shared memory used by each warp. To maximize occupancy, one should minimize the usage of registers and shared memory in the kernel. The most abundant memory type on a GPU is the “global memory”; however, accesses to the global memory have much higher latency. To hide the latency, one needs to launch a large number of thread blocks so that the thread scheduler can effectively overlap the global memory transactions for some blocks with the arithmetic calculations on other blocks. To reduce the total number of global memory transactions, each access needs to be “coalesced” (i.e., consecutive threads accessing consecutive memory addresses); otherwise the access will be “serialized” (i.e., separated into multiple transactions), which may heavily impact the performance of the code. In addition to data parallelism, GPUs are also capable of task parallelism, which is implemented as “streams” in CUDA. Different tasks can be placed in different streams and the tasks will proceed in parallel even if they have nothing in common. Currently, task parallelism on GPUs is not yet as flexible as on CPUs.
Current-generation NVIDIA GPUs now support simultaneous kernel executions and memory copies either to or from the device.
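The way register and shared-memory usage limits occupancy can be illustrated with a deliberately simplified estimate. The sketch below ignores the allocation granularity that real hardware applies, and the default per-multiprocessor limits are assumed, Fermi-class figures chosen for illustration; all function and parameter names are hypothetical, and the result should not be read as a replacement for vendor tools.

```python
def active_warps(threads_per_block, regs_per_thread, smem_per_block,
                 regs_per_sm=32768, smem_per_sm=49152,
                 max_warps=48, max_blocks=8, warp_size=32):
    """Estimate the number of resident warps on one multiprocessor.

    The defaults are assumed limits (registers, bytes of shared memory,
    warps, blocks per multiprocessor) used only for this example.
    """
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division
    regs_per_block = regs_per_thread * warps_per_block * warp_size
    # Each resource imposes its own cap on the number of resident blocks.
    limits = [max_blocks, max_warps // warps_per_block]
    if regs_per_block > 0:
        limits.append(regs_per_sm // regs_per_block)
    if smem_per_block > 0:
        limits.append(smem_per_sm // smem_per_block)
    return min(limits) * warps_per_block

def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              max_warps=48):
    """Fraction of the warp slots that are actually occupied."""
    return active_warps(threads_per_block, regs_per_thread,
                        smem_per_block, max_warps=max_warps) / max_warps

light = occupancy(256, 20, 4096)       # few registers, little shared memory
reg_heavy = occupancy(256, 63, 4096)   # register usage becomes the limit
smem_heavy = occupancy(256, 20, 24576) # shared memory becomes the limit
```

Under these assumed limits, raising the register count per thread from 20 to 63, or the shared memory per block from 4 KB to 24 KB, cuts the estimated occupancy from 1 to 1/3, which is the trade-off described in the text.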
1.1.2.2 CUDA implementations of finite-difference methods
With the rapid development of the GPU programming tools, various numerical algorithms have been successfully ported to GPUs and GPU-CPU hybrid computing platforms, and substantial speedups, compared with pure-CPU implementations, have been achieved for applications in different disciplines. In the area of acoustic/elastic seismic wave propagation simulations, finite-difference methods (e.g., Abdelkhalek et al., 2009; Michéa and Komatitsch, 2010; Okamoto et al., 2010; Wang et al., 2010; Unat et al., 2012; Zhou et al., 2012), the spectral-element method (e.g., Komatitsch et al., 2009; Komatitsch et al., 2010) and the ADER-DG method (Mu et al., 2013) have been successfully ported to GPUs using the CUDA programming model. The speedup obtained varies from around
20-fold to around 60-fold depending on several factors, e.g., whether a particular calculation is amenable to GPU acceleration, how well the reference CPU code is optimized, the particular CPU and GPU architectures used in the comparisons and the specific compilers, as well as the compiler options, used for generating the binary codes. In this section, we discuss CUDA implementations of finite-difference methods. In Section 1.1.4, we will discuss the CUDA implementation of the ADER-DG method. In most finite-difference methods, the majority of the calculations involve a central point and a set of neighboring points in space. The spatial derivatives of the field variable at the central point are approximated using a weighted average of field variables at neighboring points. This neighborhood in space is often referred to as the stencil. The stencil operator applied to every point is the same, except for possible differences in the weights in non-uniform meshes. In a typical C-language implementation, the computation is implemented as nested for-loops, in which the loop indices sweep through every grid point in the mesh and update the field variables in place. A straightforward parallelization scheme is to use one thread to handle one central point. For a 4th-order finite-difference scheme, each stencil is composed of 13 grid points. If all the field variables are stored in the global memory, each thread needs 13 accesses to the global memory to compute the spatial derivatives of the field variable at the central point, which results in very poor performance since the global memory has the highest access latency. In practice, it is not necessary for each thread to carry out all 13 accesses, because neighboring stencils share many grid points and the field variables on those shared grid points can be fetched once from the high-latency global memory and stored in the low-latency shared memory.
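The nested-loop form of a 13-point stencil can be sketched as follows. This is a serial, plain-Python stand-in for the C loops described above, using a 4th-order Laplacian as the stencil operator purely as an assumed example (the chapter's schemes compute first derivatives of velocity and stress on staggered grids). Unlike the in-place update mentioned in the text, the sketch writes to a separate output array for clarity. All names are hypothetical.

```python
from fractions import Fraction

# 4th-order central weights for a second derivative at offsets -2..2
# (unit grid spacing); chosen only to have a concrete 13-point stencil.
WEIGHTS = {-2: Fraction(-1, 12), -1: Fraction(4, 3), 0: Fraction(-5, 2),
           1: Fraction(4, 3), 2: Fraction(-1, 12)}

# Central point plus 4 neighbors along each of the 3 axes.
STENCIL_OFFSETS = {(0, 0, 0)} | {
    tuple(k if axis == a else 0 for a in range(3))
    for axis in range(3) for k in (-2, -1, 1, 2)
}

def laplacian_13pt(f):
    """Apply the stencil to every interior point of f[iy][ix][iz]."""
    ny, nx, nz = len(f), len(f[0]), len(f[0][0])
    out = [[[0 for _ in range(nz)] for _ in range(nx)] for _ in range(ny)]
    for iy in range(2, ny - 2):
        for ix in range(2, nx - 2):
            for iz in range(2, nz - 2):  # on a GPU: one thread per (iy, ix, iz)
                acc = 0
                for k, w in WEIGHTS.items():
                    acc += w * (f[iy + k][ix][iz]
                                + f[iy][ix + k][iz]
                                + f[iy][ix][iz + k])
                out[iy][ix][iz] = acc
    return out

# f = x^2 + y^2 + z^2 has a Laplacian of exactly 6 everywhere.
n = 6
field = [[[ix * ix + iy * iy + iz * iz for iz in range(n)]
          for ix in range(n)] for iy in range(n)]
result = laplacian_13pt(field)
```

Every interior point reads 13 distinct values, and stencils of adjacent points overlap heavily, which is why staging the values in shared memory pays off.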
If the number of threads in a thread block is large enough, on average each thread will need only one access to the global memory to fetch the field variable located at its own central point and store it into the shared memory, and the remaining 12 fetches will be from the low-latency shared memory. For the few threads located at the boundary of a thread block, more fetches from the global memory are needed because different thread blocks cannot share data directly on current-generation GPUs. The use of the shared memory improves the performance of the CUDA code significantly by removing most of the redundant accesses to the high-latency global memory. To further improve the performance of the code, we need to make sure that the remaining accesses to the global memory are coalesced. In our case, this involves understanding two different issues, i.e., how a three-dimensional array is laid out in the global memory and how threads are indexed in a three-dimensional thread block. In both C and CUDA, a three-dimensional field variable, say vz[NY][NX][NZ], is laid out linearly in memory as vz[0][0][0], vz[0][0][1], vz[0][0][2], . . . , vz[0][0][NZ-1], vz[0][1][0], vz[0][1][1], . . . , which is known as “row-major ordering”. For this particular example, we often
say that “the z-axis is the fastest direction and the y-axis is the slowest direction for array vz”. In a three-dimensional thread block, each thread is indexed using three integers, threadIdx.x, threadIdx.y and threadIdx.z, and in the CUDA thread topology threadIdx.x is the fastest direction and threadIdx.z is the slowest direction. To ensure that consecutive threads access consecutive addresses in the global memory, for our example, one would match threadIdx.x with the z-axis of the array vz, threadIdx.y with the x-axis of vz and threadIdx.z with the y-axis of vz. In practice, one often uses a two-dimensional thread topology. In such a case, threadIdx.x should be matched with the z-axis of array vz and threadIdx.y with the x-axis of vz, so that each thread corresponds to one point in the x-z plane of vz and has to loop through the entire y-axis of vz. For a two-dimensional thread block, to compute spatial derivatives of field variables along the y-axis one can store multiple x-z planes of field variables in shared memory if the amount of shared memory on the GPU is large enough. For a 4th-order finite-difference scheme, one widely used algorithm is to store 4 consecutive x-z planes in shared memory. Then for each iteration in the loop direction (the y-axis in our example), 3 out of the 4 x-z planes from the previous iteration can be re-used and we only need to fetch one x-z plane from the global memory. The 4 x-z planes in shared memory are constantly updated during the loop, with one old plane being discarded and one new plane being added. This rotation process eliminates about 75% of the global memory accesses after all the threads in the thread block loop through the entire y-axis. Other issues also need to be considered to take full advantage of the computing capability of the GPU.
Certain directive-based C-to-CUDA translation software, such as mint (Unat et al., 2012), can be used to facilitate this process for finite-difference calculations. We have successfully ported the discontinuous, non-uniform mesh, finite-difference code of Liu and Archuleta (2002) to GPU. On the latest Kepler K20 GPU, we obtained a speedup of around 15-fold when compared with a single Intel Nehalem 2.4 GHz CPU with 4 cores.
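Two of the ideas in this section, the row-major layout that determines which thread index should map to which array axis, and the rotation of x-z planes along the loop direction, can be checked with a short Python sketch. It only does index arithmetic and counts plane fetches; no GPU is involved, and the function names are hypothetical.

```python
from collections import deque

def linear_index(iy, ix, iz, nx, nz):
    """Row-major offset of vz[iy][ix][iz] for an array declared vz[NY][NX][NZ]."""
    return (iy * nx + ix) * nz + iz

def count_plane_loads(ny, planes=4):
    """Count x-z plane fetches from global memory during a sweep along y.

    Returns (naive, rotating): the naive count re-fetches every plane of
    each stencil window, the rotating count fetches one new plane per step.
    """
    window = deque(maxlen=planes)  # stands in for the planes in shared memory
    naive = 0
    rotating = 0
    for iy in range(ny):
        window.append(iy)          # fetch one new plane; the oldest is dropped
        rotating += 1
        if len(window) == planes:  # a complete window of planes is available
            naive += planes
    return naive, rotating

NX, NZ = 128, 256
stride_z = linear_index(0, 0, 1, NX, NZ) - linear_index(0, 0, 0, NX, NZ)
stride_x = linear_index(0, 1, 0, NX, NZ) - linear_index(0, 0, 0, NX, NZ)
stride_y = linear_index(1, 0, 0, NX, NZ) - linear_index(0, 0, 0, NX, NZ)

naive, rotating = count_plane_loads(403)
reduction = 1 - rotating / naive
```

Only the z-axis has unit stride, so consecutive values of threadIdx.x must walk along z for the accesses to be coalesced. For a sweep over 403 planes the rotating scheme fetches 403 planes instead of 1600, a reduction close to the 75% quoted above.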
1.1.3 The ADER-DG Method
The ADER-DG method for solving the seismic wave equation is both flexible and robust. It allows unstructured meshes and easy control of accuracy without compromising simulation stability. Like the SE method, the solution inside each element is approximated using a set of orthogonal basis functions, which leads to diagonal mass matrices. These types of basis functions exist for a wide range of element types. Unlike the SE or typical FE schemes, the solution is allowed to be discontinuous across element boundaries. The discontinuity is treated using well-established ideas of numerical flux functions from the high-order finite-volume framework. The spatial approximation accuracy can be easily adjusted by changing the order of the polynomial basis functions within each element (i.e., p-adaptivity). The ADER time-stepping scheme is composed of three major ingredients: a Taylor expansion in time of the degrees of freedom (DOFs, i.e., the coefficients of the polynomial basis functions in each element), the solution of the Derivative Riemann Problem (DRP) (Toro and Titarev, 2002) that approximates the space derivatives at the element boundaries, and the Cauchy-Kovalewski procedure for replacing the temporal derivatives in the Taylor series with spatial derivatives. In the following, we summarize the major equations of the ADER-DG method for solving the three-dimensional isotropic elastic wave equation on unstructured tetrahedral meshes. Please refer to Dumbser and Käser (2006) for details of the numerical scheme. The three-dimensional elastic wave equation for an isotropic medium can be expressed using a first-order velocity-stress formulation and written in a compact form as

$$\partial_t Q_p + A_{pq}\,\partial_x Q_q + B_{pq}\,\partial_y Q_q + C_{pq}\,\partial_z Q_q = 0 \tag{1.7}$$

where $Q$ is a 9-vector consisting of the 6 independent components of the symmetric stress tensor and the velocity vector, $Q = (\sigma_{xx}, \sigma_{yy}, \sigma_{zz}, \sigma_{xy}, \sigma_{yz}, \sigma_{xz}, u, v, w)^T$, and $A_{pq}$, $B_{pq}$ and $C_{pq}$ are space-dependent 9 × 9 sparse matrices with the nonzero elements given by the space-dependent Lamé parameters and the buoyancy (i.e., the inverse of the density). Summation over all repeated indices is implied in all equations. The seismic source and the free-surface and absorbing boundary conditions can be considered separately as shown in Käser and Dumbser (2006) and Dumbser and Käser (2006).
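To make the structure of Equation (1.7) concrete, the sketch below writes out the matrix $A_{pq}$ for a homogeneous medium. The entries follow from the standard velocity-stress form of the isotropic elastic equations with the component ordering of $Q$ given above; they are written from that general form as an assumption, and Dumbser and Käser (2006) remain the authoritative definition. The helper names are hypothetical.

```python
import math

def jacobian_a(lam, mu, rho):
    """Matrix A of Equation (1.7) for Q = (sxx, syy, szz, sxy, syz, sxz, u, v, w)."""
    a = [[0.0] * 9 for _ in range(9)]
    a[0][6] = -(lam + 2.0 * mu)  # d(sxx)/dt depends on du/dx
    a[1][6] = -lam               # d(syy)/dt depends on du/dx
    a[2][6] = -lam               # d(szz)/dt depends on du/dx
    a[3][7] = -mu                # d(sxy)/dt depends on dv/dx
    a[5][8] = -mu                # d(sxz)/dt depends on dw/dx
    a[6][0] = -1.0 / rho         # du/dt depends on d(sxx)/dx
    a[7][3] = -1.0 / rho         # dv/dt depends on d(sxy)/dx
    a[8][5] = -1.0 / rho         # dw/dt depends on d(sxz)/dx
    return a

def matvec(m, x):
    """Dense matrix-vector product for small list-based matrices."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

lam, mu, rho = 2.0, 1.0, 1.0
A = jacobian_a(lam, mu, rho)
cp = math.sqrt((lam + 2.0 * mu) / rho)  # P-wave speed
cs = math.sqrt(mu / rho)                # S-wave speed

# Eigenvectors of A associated with the P-wave and one S-wave.
r_p = [lam + 2.0 * mu, lam, lam, 0.0, 0.0, 0.0, cp, 0.0, 0.0]
r_s = [0.0, 0.0, 0.0, mu, 0.0, 0.0, 0.0, cs, 0.0]
nonzeros = sum(1 for row in A for v in row if v != 0.0)
```

Only 8 of the 81 entries are nonzero, and the P- and S-wave speeds appear as eigenvalues of $A$, which is what the Riemann solver at element faces relies on.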
Inside each tetrahedral element $T^{(m)}$, the numerical solution $Q_h$ can be expressed as a linear combination of space-dependent and time-independent polynomial basis functions $\Phi_l(\xi, \eta, \zeta)$ of degree $N$ with support on $T^{(m)}$,

$$\left[Q_h^{(m)}\right]_p(\xi, \eta, \zeta, t) = \hat{Q}_{pl}^{(m)}(t)\,\Phi_l(\xi, \eta, \zeta) \tag{1.8}$$

where $\hat{Q}_{pl}^{(m)}(t)$ are time-dependent DOFs and $\xi, \eta, \zeta$ are coordinates in the reference element $T_E$. Explicit expressions for the orthogonal basis functions $\Phi_l(\xi, \eta, \zeta)$ on a reference tetrahedral element are given in Cockburn et al. (2000) and Appendix A of Käser et al. (2007). Substituting Equation (1.8) into Equation (1.7), multiplying both sides with a test function $\Phi_k$, integrating over an element $T^{(m)}$ and then applying integration by parts, we obtain

$$\int_{T^{(m)}} \Phi_k\,\partial_t Q_p\,dV + \int_{\partial T^{(m)}} \Phi_k\,F_p^h\,dS - \int_{T^{(m)}} \left(\partial_x \Phi_k\,A_{pq} Q_q + \partial_y \Phi_k\,B_{pq} Q_q + \partial_z \Phi_k\,C_{pq} Q_q\right) dV = 0 \tag{1.9}$$
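The numerical flux $F_p^h$ between the element $T^{(m)}$ and one of its neighboring elements, $T^{(m_j)}$, $j = 1, 2, 3, 4$, can be computed from an exact Riemann solver,

$$\begin{aligned}
F_p^h ={}& \frac{1}{2}\,T_{pq}^{j}\left(A_{qr}^{(m)} + \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} \hat{Q}_{sl}^{(m)}\,\Phi_l^{(m)} \\
&+ \frac{1}{2}\,T_{pq}^{j}\left(A_{qr}^{(m)} - \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} \hat{Q}_{sl}^{(m_j)}\,\Phi_l^{(m_j)}
\end{aligned} \tag{1.10}$$

where $T_{pq}^{j}$ is the rotation matrix that transforms the vector $Q$ from the global Cartesian coordinate system to a local normal coordinate system that is aligned with the boundary face between the element $T^{(m)}$ and its neighbor element $T^{(m_j)}$. Substituting Equation (1.10) into Equation (1.9) and converting all the integrals from the global $xyz$-system to the $\xi\eta\zeta$-system in the reference element $T_E$ through a coordinate transformation, we obtain the semi-discrete discontinuous Galerkin formulation,

$$\begin{aligned}
&|J|\,\partial_t \hat{Q}_{pl}^{(m)} M_{kl} - |J|\left(A_{pq}^{*}\,\hat{Q}_{ql}^{(m)} K_{kl}^{\xi} + B_{pq}^{*}\,\hat{Q}_{ql}^{(m)} K_{kl}^{\eta} + C_{pq}^{*}\,\hat{Q}_{ql}^{(m)} K_{kl}^{\zeta}\right) \\
&\quad + \frac{1}{2}\sum_{j=1}^{4} |S_j|\,T_{pq}^{j}\left(A_{qr}^{(m)} + \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} \hat{Q}_{sl}^{(m)}\,F_{kl}^{-,j} \\
&\quad + \frac{1}{2}\sum_{j=1}^{4} |S_j|\,T_{pq}^{j}\left(A_{qr}^{(m)} - \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} \hat{Q}_{sl}^{(m_j)}\,F_{kl}^{+,j,i,h} = 0
\end{aligned} \tag{1.11}$$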
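where $|J|$ is the determinant of the Jacobian matrix of the coordinate transformation, which is equal to 6 times the volume of the tetrahedron, $|S_j|$ is the area of face $j$ between the element $T^{(m)}$ and its neighbor element $T^{(m_j)}$, $A_{pq}^{*}$, $B_{pq}^{*}$ and $C_{pq}^{*}$ are linear combinations of $A_{pq}$, $B_{pq}$ and $C_{pq}$ with the coefficients given by the Jacobian of the coordinate transformation, $M_{kl}$, $K_{kl}^{\xi}$, $K_{kl}^{\eta}$ and $K_{kl}^{\zeta}$ are the mass and stiffness matrices, and the flux matrices are given by

$$F_{kl}^{-,j} = \int_{\partial (T_E)_j} \Phi_k\!\left(\boldsymbol{\xi}^{(j)}(\chi, \tau)\right) \Phi_l\!\left(\boldsymbol{\xi}^{(j)}(\chi, \tau)\right) d\chi\,d\tau, \quad \forall\, 1 \le j \le 4 \tag{1.12}$$

$$F_{kl}^{+,j,i,h} = \int_{\partial (T_E)_j} \Phi_k\!\left(\boldsymbol{\xi}^{(j)}(\chi, \tau)\right) \Phi_l\!\left(\boldsymbol{\xi}^{(i)}\!\left(\tilde{\chi}^{(h)}(\chi, \tau), \tilde{\tau}^{(h)}(\chi, \tau)\right)\right) d\chi\,d\tau, \quad \forall\, 1 \le i \le 4,\ \forall\, 1 \le h \le 3 \tag{1.13}$$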
The mass, stiffness and flux matrices are all computed on the reference element, which means that they can be evaluated analytically beforehand using a computer algebra system (e.g., Maple, Mathematica) and stored on disk. If we project Equation (1.7) onto the DG spatial basis functions, the temporal derivative of the DOF can be expressed as

$$\partial_t \hat{Q}_{pn}(t) = \left(-M_{nk}^{-1} K_{lk}^{\xi} A_{pq}^{*} - M_{nk}^{-1} K_{lk}^{\eta} B_{pq}^{*} - M_{nk}^{-1} K_{lk}^{\zeta} C_{pq}^{*}\right) \hat{Q}_{ql}(t)$$

and the m-th temporal derivative can be determined recursively as

$$\partial_t^{m} \hat{Q}_{pn}(t) = \left(-M_{nk}^{-1} K_{lk}^{\xi} A_{pq}^{*} - M_{nk}^{-1} K_{lk}^{\eta} B_{pq}^{*} - M_{nk}^{-1} K_{lk}^{\zeta} C_{pq}^{*}\right) \partial_t^{m-1} \hat{Q}_{ql}(t) \tag{1.14}$$
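The Taylor expansion of the DOF at time $t_n$ is

$$\hat{Q}_{pn}(t) = \sum_{m=0}^{N} \frac{(t - t_n)^m}{m!}\,\partial_t^{m} \hat{Q}_{pn}(t_n)$$

which can be integrated from $t_n$ to $t_{n+1}$,

$$I_{pnql}(\Delta t)\,\hat{Q}_{ql}(t_n) \equiv \int_{t_n}^{t_{n+1}} \hat{Q}_{pn}(t)\,dt = \sum_{m=0}^{N} \frac{\Delta t^{m+1}}{(m+1)!}\,\partial_t^{m} \hat{Q}_{pn}(t_n) \tag{1.15}$$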
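where $\Delta t = t_{n+1} - t_n$, and $\partial_t^{m} \hat{Q}_{pn}(t_n)$ can be computed recursively using Equation (1.14). Considering Equation (1.15), the fully discretized system can then be obtained by integrating the semi-discrete system, Equation (1.11), from $t_n$ to $t_{n+1}$,

$$\begin{aligned}
|J|\left[\left(\hat{Q}_{pl}^{(m)}\right)^{n+1} - \left(\hat{Q}_{pl}^{(m)}\right)^{n}\right] M_{kl}
={}& |J|\left(A_{pq}^{*} K_{kl}^{\xi} + B_{pq}^{*} K_{kl}^{\eta} + C_{pq}^{*} K_{kl}^{\zeta}\right) I_{qlmn}(\Delta t)\left(\hat{Q}_{mn}^{(m)}\right)^{n} \\
&- \frac{1}{2}\sum_{j=1}^{4} |S_j|\,T_{pq}^{j}\left(A_{qr}^{(m)} + \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} F_{kl}^{-,j}\,I_{slmn}(\Delta t)\left(\hat{Q}_{mn}^{(m)}\right)^{n} \\
&- \frac{1}{2}\sum_{j=1}^{4} |S_j|\,T_{pq}^{j}\left(A_{qr}^{(m)} - \left|A_{qr}^{(m)}\right|\right)\left(T_{rs}^{j}\right)^{-1} F_{kl}^{+,j,i,h}\,I_{slmn}(\Delta t)\left(\hat{Q}_{mn}^{(m_j)}\right)^{n}
\end{aligned} \tag{1.16}$$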
Equation (1.16), together with Equations (1.14) and (1.15), provides the mathematical foundation for our GPU implementation and optimization.
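The recursion of Equation (1.14) and the time-integrated Taylor series of Equation (1.15) can be illustrated on a toy linear system. In the sketch below a small constant matrix `L` stands in for the combined operator acting on the DOFs, so that each temporal derivative is obtained by one more multiplication with `L`; the 2 × 2 test operator and all names are assumptions chosen so that the exact time integral is known.

```python
from math import cos, factorial, sin

def matvec(m, x):
    """Dense matrix-vector product for small list-based matrices."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def time_integrated_dofs(L, q, dt, order):
    """Return sum_{m=0}^{order} dt^(m+1)/(m+1)! * L^m q.

    This is the Taylor series of q(t) integrated over one time step,
    with the m-th time derivative of q given recursively by L^m q.
    """
    integral = [0.0] * len(q)
    deriv = list(q)  # m = 0 term: the DOFs themselves
    for m in range(order + 1):
        coeff = dt ** (m + 1) / factorial(m + 1)
        integral = [acc + coeff * d for acc, d in zip(integral, deriv)]
        deriv = matvec(L, deriv)  # next temporal derivative, cf. Equation (1.14)
    return integral

# Toy operator: dq/dt = L q with q(0) = (1, 0) has the exact solution
# q(t) = (cos t, -sin t), whose integral over [0, dt] is (sin dt, cos dt - 1).
L = [[0.0, 1.0], [-1.0, 0.0]]
dt = 0.1
approx = time_integrated_dofs(L, [1.0, 0.0], dt, order=4)
exact = [sin(dt), cos(dt) - 1.0]
```

With five Taylor terms and a step of 0.1 the truncated series agrees with the exact integral to roughly nine digits. In the full scheme the same time-integrated quantity is what enters every term on the right-hand side of Equation (1.16).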
1.1.4 Accelerating the ADER-DG Method Using GPUs
The implementation and optimization of the ADER-DG method on a single GPU was documented in Mu et al. (2013). Extending the implementation to a cluster of GPUs is relatively straightforward. We give a brief summary in the following and demonstrate the performance of our multi-GPU CUDA-MPI code using specific examples. Prior to running our wave-equation solver, a tetrahedral mesh for the entire modeling domain was generated on a CPU using the commercial mesh generation software “GAMBIT”. The mesh generation process is fully automated and the generated tetrahedral mesh conforms to all discontinuities built into the modeling geometry, including irregular surface topography and subsurface fault structures. The entire mesh was then split into subdomains, one per GPU, using
the open-source software “METIS”, which is a serial CPU program for partitioning finite-element meshes in a way that minimizes inter-processor communication cost while maintaining load-balancing.
1.1.4.1 CUDA implementation
In Figure 1.1, we list the major steps in the reference parallel CPU code, “SeisSol” (de la Puente et al., 2009), and those in our parallel CPU-GPU hybrid implementation. In our parallel CPU-GPU hybrid implementation, we assume that each MPI process has access to only one device and each device is controlled by only one MPI process. At the start of the calculation, a sequence of pre-processing steps is executed on the CPUs. The pre-processing sequence includes:
– Reading and processing a control file.
– Reading and processing geometric information, which includes the tetrahedral mesh, the boundary conditions, the material properties (i.e., density and Lamé parameters) for each element and the mesh partitioning information generated by METIS.
– For the elements in each subdomain, creating a list of all the elements that are in contact with elements in other subdomains, which we call the “outer” elements, and those that are not, which we call the “inner” elements.
– Reading and processing the DG matrices, which include the mass, stiffness and flux matrices, which were pre-computed and stored on disk.
– Reading and processing the files describing the seismic source and seismic receivers.
Our CUDA program adopts the typical CUDA programming model. After the pre-processing sequence is carried out on the host CPUs, the arrays needed by the CUDA kernels are copied to the global memory of the devices using “cudaMemcpy”. The hosts then call a sequence of CUDA kernels in every time step. The results of the simulation (e.g., the synthetic seismograms) are stored on the devices during the time loop and copied back to the hosts after all time steps are completed. In our implementation, the calculation for each tetrahedral element is carried out by one thread block. Within each thread block, the number of threads depends upon the dimension of the element's DOFs. The DOFs, $\hat{Q}_{pl}^{(m)}$, are allocated and initialized in the global memory of the device using “cudaMalloc” and “cudaMemset”. For a 5th-order scheme, which is sufficiently accurate for most of our applications, the number of DOFs per component per element is 35. Considering the 9 components of the governing PDE (i.e., 6 stress components and 3 velocity components), the DOFs of each element consist of a 9 × 35
Fig. 1.1 The flowcharts of the major steps in the reference parallel CPU code (a) and those in our CPU-GPU hybrid implementation (b). The whole calculation can be separated into 3 sections: pre-processing, time-stepping and post-processing. The pre-processing section reads and calculates all the data that the time-stepping section will use. The time-stepping section updates the DOFs of each tetrahedral element according to Equations (1.14)–(1.16) and has been ported to the GPU. The post-processing section is in charge of writing out the DOFs and/or the seismograms at the pre-specified locations.
matrix, which is represented in memory as a one-dimensional array of length 315 organized in column-major ordering. To obtain better memory alignment, we padded 5 zeros behind the DOFs of each element so that the length of the one-dimensional DOF array of each element is increased to 320, which is 10 times the number of threads in a warp. For a subdomain of “nElem” elements, the length of the one-dimensional DOF array for the whole subdomain is therefore “nElem × 320”. The amount of memory that is deliberately wasted is less than 1.6%; in return, the better memory alignment improved the performance of some simple operations, such as summation and scalar-product operations, by around 6.3%. Considering Equations (1.14)–(1.16), the calculations on the devices within each time step can be organized into 5 major steps:
(1) Calculating the time-integrated DOFs, i.e., the term $I_{qlmn}(\Delta t)\left(\hat{Q}_{mn}^{(m)}\right)^{n}$, using the DOFs $\left(\hat{Q}_{mn}^{(m)}\right)^{n}$ at the current time step through the Cauchy-Kovalewski procedure, i.e., Equations (1.14) and (1.15).
(2) Calculating the volume contributions, i.e., the first term on the right-hand side of Equation (1.16), using the time-integrated DOFs obtained in step (1).
(3) Calculating the first numerical flux term, i.e., the second term on the right-hand side of Equation (1.16), using the time-integrated DOFs obtained in step (1).
(4) Calculating the second numerical flux term, i.e., the third term on the right-hand side of Equation (1.16), using the time-integrated DOFs of the four neighboring elements obtained in step (1).
(5) Updating the DOFs to the next time step, $\left(\hat{Q}_{pl}^{(m)}\right)^{n+1}$, using the DOFs at the current time step, $\left(\hat{Q}_{pl}^{(m)}\right)^{n}$, the volume contributions obtained in step (2) and the numerical flux terms obtained in steps (3) and (4), as well as any contributions from the seismic source, by using Equation (1.16). This step also involves inverting the mass matrix $M_{kl}$, which is diagonal.
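The DOF bookkeeping described before the five steps reduces to a few lines of integer arithmetic, sketched here in Python. The count of basis functions uses the standard formula for polynomials of degree at most N in three variables, with the assumption that a "5th-order scheme" corresponds to degree N = 4; the function names and the exact offset convention are hypothetical.

```python
def dofs_per_component(degree):
    """Number of 3-D polynomial basis functions of degree <= `degree`."""
    return (degree + 1) * (degree + 2) * (degree + 3) // 6

def padded_length(n_components, n_basis, warp_size=32):
    """Round the per-element DOF count up to a multiple of the warp size."""
    raw = n_components * n_basis
    return -(-raw // warp_size) * warp_size  # ceiling division

def dof_offset(elem, p, l, n_components=9, padded=320):
    """Offset of DOF (component p, basis function l) of element `elem`.

    Each element stores its 9 x 35 DOF matrix in column-major order,
    followed by the zero padding.
    """
    return elem * padded + l * n_components + p

n_basis = dofs_per_component(4)       # 5th-order scheme, degree 4 (assumed)
raw_length = 9 * n_basis
length = padded_length(9, n_basis)
wasted_fraction = (length - raw_length) / length
last_offset = dof_offset(0, 8, 34)    # last real DOF of the first element
```

With these conventions the last real DOF of element 0 sits at offset 314, offsets 315 to 319 hold the padding, and element 1 starts at offset 320, a multiple of the warp size.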
All the calculations in steps (1), (2), (3) and (5) can be performed in an element-local way and require no inter-element information exchange, which is ideal for SIMD-type processors such as GPUs. The calculations in step (4) need to use the time-integrated DOFs from all neighboring elements, which in our distributed-memory parallel implementation requires passing the time-integrated DOFs of the outer elements of each subdomain across different MPI processes. Most of this communication overhead can be hidden by overlapping computation with communication. In our implementation, we calculate the time-integrated DOFs for all the outer elements of a subdomain first. The calculation of the time-integrated DOFs requires access to the DOF array in the global memory. The DOFs of the outer elements are usually scattered throughout the entire DOF array of the subdomain. To avoid non-coalesced memory accesses, which could reduce performance by up to 54%, the entire DOF array is split into two sub-arrays, one for the DOFs of all the outer elements and the other for the DOFs of all the inner elements. Once we complete the calculations of the time-integrated DOFs of the outer elements, the device starts to compute the time-integrated DOFs of the inner elements of the subdomain right away. At the same time, the time-integrated DOFs of the outer elements are assembled into a separate array, which is copied into the host memory asynchronously, using a separate CUDA stream, to fill the MPI buffer; the host then initiates a non-blocking MPI data transfer and returns. While the messages are being transferred, the device completes the calculations of the time-integrated DOFs of the inner elements, combines them with the time-integrated DOFs of the outer elements into a single time-integrated DOF array and proceeds to the calculations of the volume contributions in step (2) and the first numerical flux term in step (3). On the host, synchronization over all the MPI processes is performed. Once the host receives the array containing the time-integrated DOFs of the outer elements of the neighboring subdomains, the array is copied to the device asynchronously using a separate CUDA stream. After completing step (3), the device synchronizes all streams to make sure that the required time-integrated DOFs from all neighboring subdomains have arrived, then proceeds to calculate the second numerical flux term in step (4) and to update the DOFs as in step (5).
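The separation into outer and inner elements can be sketched as a simple partition of the element list. The Python fragment below is an illustration only: it assumes that an element counts as "outer" when at least one of its four face neighbors belongs to another subdomain, and the data layout (a dictionary of neighbor lists, with None marking a physical boundary face) is hypothetical.

```python
def split_outer_inner(neighbors, local_ids):
    """Partition the elements of one subdomain into outer and inner lists.

    neighbors[e] holds the four face neighbors of tetrahedron e
    (None marks a physical boundary face); local_ids lists the
    elements owned by this subdomain.
    """
    local = set(local_ids)
    outer, inner = [], []
    for e in local_ids:
        touches_other_subdomain = any(
            n is not None and n not in local for n in neighbors[e]
        )
        (outer if touches_other_subdomain else inner).append(e)
    return outer, inner


# Three local elements in a row; element 3 lives on another subdomain.
neighbors = {
    0: [1, None, None, None],
    1: [0, 2, None, None],
    2: [1, 3, None, None],
}
outer, inner = split_outer_inner(neighbors, [0, 1, 2])
storage_order = outer + inner      # outer block first, inner block second
ratio = len(outer) / len(inner)    # to be compared with the hardware-dependent bound
```

Storing the outer block contiguously is what allows the time-integrated DOFs destined for the MPI buffer to be read and copied without scattered accesses.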
The overhead for splitting the entire DOF array into two sub-arrays for inner and outer elements and for combining the time-integrated DOFs of the outer and inner elements into a single array amounts to less than 0.1% of the total computing time on the device. The entire process is illustrated in Figure 1.1. To ensure effective communication-computation overlap, the ratio of the number of the outer to inner elements must be sufficiently small. An upper bound of this ratio can be estimated based on both the processing capability of the devices and the speed of the host-device and host-host inter-connections. On the NVIDIA Fermi M2070 GPUs that we experimented with, we achieved nearly zero communication overheads when this ratio is below 2%. We note that if the same approach is implemented using a classic CPU cluster, this ratio can be much larger, since the calculations for the inner elements and steps (2) and (3) are over an order of magnitude slower on a CPU core. Step (5), which simply updates the DOFs, is probably the most straightforward to implement and optimize. We assign 320 threads to each thread block and the number of thread blocks is identical to the number of tetrahedral elements “nElem” on each subdomain. The memory access pattern is fully coalesced, which is also a benefit of padding the 5 zeros behind the 315 DOFs of each element. The calculations in this kernel are less arithmetically intense compared
with those in other kernels and are mostly memory-bound. In practice, we can achieve around 84% of the peak memory bandwidth of the GPU. The fraction of total wall-time spent on this step is around 2.1%. Most of the total wall-time is spent on the calculations in steps (1), (2), (3) and (4), which are dominated by matrix-matrix multiplications. We use step (2), which computes the volume contribution, as an example. A flowchart of the major calculations in step (2) is shown in Figure 1.2a. Considering the first term on the right-hand side of Equation (1.16), the calculations in step (2) involve mathematical operations of the form
$$A^{*}_{pq}\,K^{\xi}_{kl}\,I_{qlmn}(\Delta t)\,\big(Q^{(m)}_{mn}\big)^{n}$$
where the time-integrated DOF $I_{qlmn}(\Delta t)\,\big(Q^{(m)}_{mn}\big)^{n}$, denoted as “dgwork” in Figure 1.2a, is computed in step (1) and has the same dimension and memory layout as the DOF array. The multiplication between $K^{\xi}_{kl}$, denoted as “Kxi” in Figure 1.2a, and the time-integrated DOF is different from the normal matrix-matrix product in linear algebra. This multiplication involves three steps: first, transpose the time-integrated DOF matrix; second, multiply with the stiffness matrix following the usual matrix-matrix product rule; third, transpose the matrix obtained in the previous step. We call this multiplication the “left-multiplication”. A code segment for the baseline CUDA implementation of the left-multiplication is shown in Figure 1.2b. This left-multiplication operation is used extensively throughout the calculations in steps (1)–(4) and deserves more optimization effort. First, we can reduce the number of floating-point operations by exploiting the fact that some of the matrices in this operation are sparse; second, the order of the floating-point operations can be rearranged so that the accesses to “dgwork” in the global memory are as coalesced as possible. The DOF, its temporal derivatives and the time-integrated DOF are generally dense. However, the stiffness matrices and the Jacobians have fill-ratios ranging from 8.8% to 29.6%. To take advantage of this sparsity, one possibility is to adopt an existing sparse linear algebra library such as “CUSP” (Bell and Garland, 2008) or cuSPARSE. However, the result we obtained using “CUSP” was not very satisfactory. The best performance gain, which was obtained using the “HYB” matrix format, was about 36.2% compared with the baseline implementation shown in Figure 1.2b.
This is largely due to the very irregular distribution of the non-zeros in our matrices, which caused a large number of uncoalesced accesses to the time-integrated DOF arrays, and the amount of arithmetic calculations was not large enough to hide the memory-access latencies due to the low fill-ratio in the stiffness matrix. Considering the fact that the locations of the non-zero elements in the stiffness matrices can be determined beforehand and are fixed throughout the program,
the results of the left-multiplications can be evaluated analytically beforehand and expressed in terms of the non-zero elements in those matrices using a computer algebra system. The expressions of the left-multiplication results, which are linear combinations of the time-integrated DOF with coefficients given by the non-zero elements of the stiffness matrices, can be hardwired into the CUDA kernels. This implementation eliminates all redundant calculations involving zero elements, and by carefully arranging the order of the calculations in accordance with the thread layout, we can also minimize the number of uncoalesced memory accesses to the time-integrated DOF array. A code segment of the optimized left-multiplication is shown in Figure 1.2c; it is about 4 times faster than the baseline implementation shown in Figure 1.2b. This approach can also be applied to the normal matrix-matrix product and is used throughout steps (1)–(4) in our optimized CUDA codes. A drawback of this approach is that the resulting kernel source code is quite long and some manual editing is required to ensure coalesced memory accesses. However, modern computer algebra systems (e.g., Mathematica, Maple) usually have automated procedures for translating long mathematical expressions into the C language; the translated expressions are usually error-free and can be incorporated directly into the CUDA kernels with minimal effort.
Fig. 1.2 (a) The flowchart of the calculations in step (2), the volume contributions. “dudt” is the volume contribution; “Kxi”, “Keta” and “Kzeta” correspond to the stiffness matrices $K^{\xi}_{lk}$, $K^{\eta}_{lk}$ and $K^{\zeta}_{lk}$ in the text. “JacobianDet” is the determinant of the Jacobian, $|J|$. “nElem” is the number of tetrahedral elements in the subdomain, “nDegFr” is the number of DOFs per component per tetrahedral element and “nVar” is the number of components in the governing equation. “AStar”, “BStar” and “CStar” correspond to $A^{*}$, $B^{*}$ and $C^{*}$ in the text. Code segments for the calculations in the dark-grey box are listed in Figure 1.2b and Figure 1.2c. (b) Baseline implementation of the CUDA kernel for the “left-multiplication” between the time-integrated DOF and the stiffness matrix $K^{\xi}_{lk}$. “Kxi_dense” corresponds to the dense matrix representation of $K^{\xi}_{lk}$, “dgwork” corresponds to the time-integrated DOF and the result of the multiplication is stored in “temp_DOF”. (c) A segment of the optimized CUDA kernel for the “left-multiplication” between the time-integrated DOF and the stiffness matrix $K^{\xi}_{lk}$. “Kxi_sparse” corresponds to the sparse matrix representation of $K^{\xi}_{lk}$. Meanings of other symbols are identical to those in Figure 1.2b.
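The idea of hard-wiring the left-multiplication can be illustrated on a toy problem. The Python sketch below is not the authors' CUDA kernel. It assumes that the DOF matrix Q has one row per component, reads the three-step recipe as (K Qᵀ)ᵀ, and uses a small string-based generator as a stand-in for the computer algebra system. All names and matrix entries are hypothetical.

```python
def left_multiply_dense(K, Q):
    """Baseline: (K . Q^T)^T, i.e. out[q][k] = sum_l K[k][l] * Q[q][l]."""
    n = len(K)
    return [[sum(K[k][l] * row[l] for l in range(n)) for k in range(n)]
            for row in Q]


def generate_unrolled(K, name="left_multiply_unrolled"):
    """Emit Python source in which only the non-zeros of K appear.

    Each output entry becomes one explicit linear combination, so no
    multiplication by a zero entry of K is ever executed.
    """
    exprs = []
    for K_row in K:
        terms = [f"{v!r} * row[{l}]" for l, v in enumerate(K_row) if v != 0.0]
        exprs.append(" + ".join(terms) if terms else "0.0")
    return "\n".join([
        f"def {name}(Q):",
        "    out = []",
        "    for row in Q:",
        "        out.append([" + ", ".join(exprs) + "])",
        "    return out",
    ])


# Toy stiffness matrix with 4 non-zeros out of 9 entries.
K = [[2.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.5, 0.0, 3.0]]
Q = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

source = generate_unrolled(K)
namespace = {}
exec(source, namespace)            # compile the generated, hard-wired routine
left_multiply_unrolled = namespace["left_multiply_unrolled"]
```

Printing `source` shows the generated routine, which contains four multiplications per row of Q instead of nine. In a real kernel the additional step described in the text, ordering the expressions to match the thread layout, would still have to be done by hand.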
1.1.4.2
Performance analysis
For our single-GPU performance analysis, we computed the speedup factors for problems with 7 different mesh sizes (Fig. 1.3). The numbers of tetrahedral elements used in our experiments are 3,799, 6,899, 12,547, 15,764, 21,121, 24,606 and 29,335. The material property is constant throughout the mesh, with density 3,000 kg/m³ and Lamé parameters $\lambda = 5.325\times10^{10}$ Pa and $\mu = 3.675\times10^{10}$ Pa. We applied the traction-free boundary condition on the top of the mesh and the absorbing boundary condition on all other boundaries. The seismic source is an isotropic explosive source buried in the center of the mesh. The wall-time measurements were obtained by running the simulations for 1,000 time steps. The speedup factors were computed for our single-precision GPU code with respect to the CPU code running on one, two, four and eight cores. For the multi-core runs on the CPUs, the parallel version of the “SeisSol” code is used as the reference. For the 7 different mesh sizes, the speedup factor ranges from 23.7 to 25.4 with respect to the serial CPU code running on one core, from 12.2 to 14 with respect to the parallel CPU code running on two cores, from 6.5 to 7.2 with respect to the parallel CPU code running on four CPU cores and from 3.5 to 3.8 with respect to the parallel CPU code running on eight CPU cores. The speedup factor does not decrease linearly with an increasing number of CPU cores. For instance, the speedup factor with respect to eight CPU cores is about 14% better than what we would have expected from the speedup factor with respect to one CPU core if we had assumed linear scaling. For the parallel version of the CPU code, there are overheads incurred by the MPI communication among different cores, while for the single-GPU simulations such communication overheads do not exist.
Fig. 1.3 Single-GPU speedup factors obtained using 7 different meshes and 4 different CPU core numbers. The total number of tetrahedral elements in the 7 meshes is 3,799, 6,899, 12,547, 15,764, 21,121, 24,606 and 29,335, respectively. The speedup factors were obtained by running the same calculation using our CPU-GPU hybrid code with 1 GPU and using the serial/parallel “SeisSol” CPU code on 1/2/4/8 CPU cores on the same compute node. The black columns represent the speedup of the CPU-GPU hybrid code relative to 1 CPU core, the dark grey columns represent the speedup relative to 2 CPU cores, the light grey columns represent the speedup relative to 4 CPU cores and the lightest grey columns represent the speedup relative to 8 CPU cores.
To analyze the performance of the parallel version of our CUDA codes, we use a simplified version of the SEG/EAGE salt model (Käser et al., 2010) as the benchmark. This model is geometrically complex, as shown in Figures 1.4a,b. However, the generation of the tetrahedral mesh for such a complex model is highly automatic once the geometry of the structural interfaces is imported into the meshing software. The material properties of the different zones in this model are summarized in Table 1.1. We note that a thin layer of water lies on top of the three-dimensional model. The ADER-DG method can accurately handle seismic wave propagation in water simply by setting the shear modulus of the elements in the water region to zero (Käser and Dumbser, 2008).
Fig. 1.4 (a) A perspective view of the 3D geometry of the discretized salt body in the SEG/EAGE salt model. (b) A two-dimensional cross-section view of the SEG/EAGE salt model along the A-A′ profile (Aminzadeh et al., 1997). The material properties for the different geological structures are listed in Table 1.1.
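The wave speeds listed in Table 1.1 follow from the density and the Lamé parameters through the standard isotropic relations $V_p = \sqrt{(\lambda + 2\mu)/\rho}$ and $V_s = \sqrt{\mu/\rho}$. The short Python sketch below evaluates them for two rows of the table. The function and variable names are hypothetical, and the snippet is meant only as a consistency check, not as part of the simulation code.

```python
import math


def p_velocity(rho, lam, mu):
    """P-wave speed of an isotropic elastic medium, in m/s."""
    return math.sqrt((lam + 2.0 * mu) / rho)


def s_velocity(rho, mu):
    """S-wave speed in m/s; zero when the shear modulus vanishes."""
    return math.sqrt(mu / rho)


# (density in kg/m^3, lambda in Pa, mu in Pa), values as listed in Table 1.1
water = (1020.0, 2.2950e9, 0.0)
zone01 = (2000.0, 4.5067e9, 4.5067e9)

vp_water = p_velocity(*water)
vs_water = s_velocity(water[0], water[2])
vp_zone01 = p_velocity(*zone01)
vs_zone01 = s_velocity(zone01[0], zone01[2])
```

Setting µ = 0 for the water layer gives a vanishing shear-wave speed, which is how the fluid region is represented. Rows with λ = µ cannot reveal which column holds which modulus, and the tabulated salt velocities appear to be reproduced only when 20.800 is taken as µ and 14.457 as λ.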
This salt model is discretized into tetrahedral meshes with different numbers of elements. In Figure 1.5, we show the speedup factors obtained for two different mesh sizes, one with 327,886 elements and the other with 935,870 elements. The simulations were run on 8, 16, 32 and 48 CPU cores using the parallel version of the “SeisSol” code, and the speedup factors were obtained by running the same simulations on the same number of GPUs. On average, the speedup factor for our parallel GPU codes is around 28, which is slightly higher than the speedup factor obtained in the single-GPU-single-CPU comparison. This may be due to the fact that in the parallel CPU code the outer elements of a subdomain are not treated separately from the inner elements, which does not allow the parallel CPU code to overlap the computation on the inner elements with the communication of the time-integrated DOFs of the outer elements.
Tab. 1.1 Material properties of the modified SEG 3D salt model.
           ρ (kg/m³)   λ (10⁹ Pa)   µ (10⁹ Pa)   Vp (m/s)   Vs (m/s)
Water      1,020       2.2950       0            1,500      0
Zone 01    2,000       4.5067       4.5067       2,600      1,501
Zone 02    2,050       5.0000       5.0000       2,705      1,562
Zone 03    2,500       7.5000       7.5000       3,000      1,732
Zone 04    2,600       9.0000       9.0000       3,223      1,861
Salt       2,160       20.800       14.457       5,094      3,103
Fig. 1.5 Speedup factors of our parallel GPU codes obtained using 2 different mesh sizes. The numbers of tetrahedral elements used in our experiments are 327,866 and 935,870. The speedup factors were computed for our single-precision multi-GPU code with respect to the CPU code running on 16/32/48/64 cores distributed over different nodes.
To investigate the strong scalability (i.e., the decrease in wall-time with increasing GPU number while holding the total workload, that is, the number of elements and time steps, constant), we discretized the salt model using a mesh with about 1.92 million elements and ran the simulation for 100 time steps. The number of GPUs used in the simulations ranges from 32 to 64. As seen in Figure 1.6, the strong scaling of our parallel GPU codes is close to the ideal case with some fluctuations. Our codes start to slightly underperform the ideal case when the number of GPUs used in the simulation is larger than 48. As analyzed in Section 1.3.2, to effectively overlap computation with communication, the ratio between the number of outer elements and the number of inner elements of a subdomain cannot exceed a certain threshold, which is determined by the processing capability of the GPU and the speed of the inter-connections. In our case, when the number of GPUs used in the simulation starts to exceed 48, this ratio becomes larger than 2%, which we believe is the threshold for our hardware configuration. The performance of our parallel GPU codes depends upon a number of factors, such as load balancing, but we think the extra communication overhead that was not effectively hidden by the computation was the dominant factor causing our codes to underperform the ideal case.
Fig. 1.6 Strong scalability of our multi-GPU codes with 1.92 million elements. The black line shows the average wall-time per 100 time steps for this fixed-size problem run on 32 to 64 GPUs.
In Figure 1.7, we show the results of our weak scaling test (i.e., the workload per GPU is kept about constant while increasing the number of GPUs). By definition, the total number of elements in the weak scaling test increases approximately linearly with the number of GPUs. If the communication cost is effectively hidden by the computation, the weak scaling curve should be approximately flat. In our tests, the average number of elements per GPU was kept around 53,000 with about 6% fluctuation across different simulations. The ratio between the number of outer and inner elements was kept around 1%. The weak scaling is approximately flat (Fig. 1.7), with some fluctuation mostly caused by the variation in the number of elements per GPU used in each simulation.
Fig. 1.7 Weak scalability of our multi-GPU code run on 2–80 GPUs. The black line shows the average wall-time per 100 time steps for these problems of varying size. The average number of elements per GPU is around 53,000 with about 6% fluctuation.
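The two scaling tests can be summarized by simple efficiency measures. The Python sketch below shows one common way of defining them. The function names are hypothetical, and the wall-times in the usage lines are made-up placeholders, not measurements from this study.

```python
def strong_scaling_efficiency(base_gpus, base_time, gpus, time):
    """Ideal over measured wall-time for a fixed total workload.

    The ideal wall-time assumes that the time of the smallest run
    shrinks in proportion to the number of GPUs added.
    """
    ideal_time = base_time * base_gpus / gpus
    return ideal_time / time


def weak_scaling_efficiency(base_time, time):
    """Reference over measured wall-time for a fixed workload per GPU."""
    return base_time / time


# Placeholder timings in seconds, for illustration only.
perfect = strong_scaling_efficiency(32, 100.0, 64, 50.0)
degraded = strong_scaling_efficiency(32, 100.0, 64, 55.0)
flat = weak_scaling_efficiency(20.0, 20.0)
```

An efficiency of 1 corresponds to the ideal line in a strong scaling plot and to a flat curve in a weak scaling plot. Values below 1 indicate overhead that was not hidden, for example communication that was not fully overlapped by computation.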
1.2
Automating the Waveform Selection Process for FWSDA
The waveform is a highly nonlinear function of the seismic velocity model. However, past experience has shown that the phase of the waveform is quasi-linear with respect to seismic velocity. In particular, one can show that the linearization involved in each iteration of the gradient-based nonlinear inversion is the Rytov approximation, which is valid for large phase shifts as long as the phase perturbation per wavelength is small (e.g., Chernov, 1960; Snieder and Lomax, 1996). But if multiple wave arrivals are considered at the same time, this quasi-linear relationship may break down (e.g., Keller, 1969; Woodward, 1989). Therefore, to improve the linearity of the tomographic inversion, it is important to separate different wave arrivals and invert their phases separately. Several recent successful full-3D, full-wave tomographic inversions (e.g., Chen et al., 2007a;
Fichtner et al., 2009; Tape et al., 2010) are all based on inverting phase misfits measured on selected waveforms. Throughout the history of waveform inversions, a large number of waveform selection algorithms have been developed, and many of them have been automated to some extent as the volume of high-quality digital seismograms keeps increasing. In surface-wave studies, automated algorithms have been developed to extract dispersion measurements from the fundamental mode (e.g., Trampert and Woodhouse, 1995; Laske and Masters, 1996; Levshin and Ritzwoller, 2001). Algorithms based on time-frequency analysis have been applied to obtain measurements from higher modes (e.g., Debayle, 1999; Lebedev et al., 2005; Visser et al., 2007). In body-wave studies, and also in some surface-wave studies or studies of joint inversions using both types of waves (e.g., Panning and Romanowicz, 2006), automated waveform selection algorithms usually pick time windows around specific seismic phases defined by traveltimes predicted using ray-theoretic methods in one-dimensional structural models (e.g., Ritsema and van Heijst, 2002; Sigloch and Nolet, 2006). For full-3D waveform inversions, we need a new type of waveform selection algorithm that is not tied to ray-theoretic traveltime predictions of known phases and can accommodate wave arrivals resulting from complex wave propagation effects in three-dimensional structural models. To address some of the difficulties in the waveform selection algorithm used in full-3D waveform inversions, Maggi et al. (2009) proposed an automated time-window selection algorithm based on the short-term average/long-term average ratio (STA/LTA) curve. A set of criteria based on similarities between the observed and synthetic waveforms is used to accept or reject time windows.
The algorithm has been successfully applied to seismograms recorded at local, regional and global distances in different full-3D waveform inversion and modeling studies (e.g., Tape et al., 2010; Zhu et al., 2012). A purely time-domain waveform selection algorithm may have limitations in separating waves arriving at overlapping time windows. We have developed a new semi-automatic seismogram segmentation and waveform selection algorithm to address this issue. In our algorithm, seismogram segmentation is performed in the time-frequency domain through the continuous wavelet transform, which allows extra freedom in separating waves arriving at overlapping time windows but with disjoint frequency domain or time-frequency domain supports.
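For readers unfamiliar with the STA/LTA curve mentioned above, the following Python sketch computes a generic sliding-window version of it from the signal energy. It is a minimal illustration under simple assumptions (boxcar windows ending at the current sample, hypothetical function and parameter names) and is not necessarily the formulation used by Maggi et al. (2009).

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy.

    Both windows end at the current sample; the ratio is left at zero
    until the long window is completely filled.
    """
    energy = [x * x for x in signal]
    ratio = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = sum(energy[i - n_sta + 1: i + 1]) / n_sta
        lta = sum(energy[i - n_lta + 1: i + 1]) / n_lta
        ratio[i] = sta / lta if lta > 0.0 else 0.0
    return ratio


# Synthetic trace: weak background with a stronger arrival at samples 50-59.
trace = [0.1] * 50 + [1.0] * 10 + [0.1] * 40
curve = sta_lta(trace, n_sta=5, n_lta=30)
peak_sample = curve.index(max(curve))
```

The curve stays near 1 on the stationary background and rises sharply when an arrival enters the short window. A purely time-domain measure of this kind cannot distinguish two arrivals that overlap in time, which is the limitation that motivates the time-frequency segmentation described next.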
1.2.1
Seismogram Segmentation
A seismogram is composed of different wave arrivals. Our seismogram segmentation algorithm is based on the speculation that in the time-frequency domain
different wave arrivals may correspond to distinct regions of high power in the scalogram. The continuous wavelet transform of a seismogram $s(t)$ with respect to a mother wavelet $g(t)$ is defined as (Holschneider, 1995)
$$W_g s(b,a) = \frac{1}{a}\int_{-\infty}^{+\infty} g^{*}\!\left(\frac{t-b}{a}\right) s(t)\,dt \qquad (1.17)$$
where $a$ is the dilation ($a>0$), $b$ is the translation and the symbol “∗” denotes complex conjugate. The scalogram used for segmentation is defined as the squared modulus of $W_g s(b,a)$. The corresponding inverse transform is then defined as
$$s(t) = \frac{2}{C_g}\,\mathrm{Re}\int_{0}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{1}{a}\,g\!\left(\frac{t-b}{a}\right) W_g s(b,a)\,\frac{db\,da}{a} \qquad (1.18)$$
where $C_g$ is a normalization constant. In principle, our segmentation algorithm can be based on any type of time-frequency transform. The reason we are adopting the continuous wavelet transform in our algorithm is its linearity, which results in the absence of interfering cross-terms, and its dyadic pavement of the time-frequency space (e.g., Holschneider, 1995; Kulesh et al., 2005), which allows efficient and high-resolution representation of the time-frequency content of the seismogram. In our approach, we did not adopt the discrete wavelet transform due to the lack of redundancy in the discrete wavelet bases, which could reduce the resolution of the resulting time-frequency domain image of the seismogram. The appropriate choice of the mother wavelet is an important issue. We have selected the complex-valued Cauchy wavelet (Holschneider, 1995), which is also known as the Paul wavelet, for our analysis. In the frequency domain, the Cauchy wavelet is defined as
$$g_m(\omega) = \begin{cases} 0, & \text{for } \omega < 0 \\ \omega^{m} e^{-\omega}, & \text{for } \omega \ge 0 \end{cases} \qquad (1.19)$$
where m is the order of the wavelet. We have chosen the Cauchy wavelet because of its convenient algebraic and analytic properties. First, the Cauchy wavelet is admissible, meaning that the seismogram can be reconstructed from its continuous wavelet transform without any loss of information. Second, the Cauchy wavelet has good time and frequency localization properties, which are important for improving the local resolution of the time-frequency domain image of the seismogram. Its time-frequency area asymptotically approaches the smallest area allowed by the Heisenberg inequality as its order tends towards infinity. Third, the first- and second-order temporal derivatives of the Cauchy wavelet are still Cauchy wavelets, which simplifies the application of our algorithm to velocity and acceleration seismograms. Fourth, the Cauchy wavelet is progressive, meaning that its Fourier transform has zero amplitude at negative frequencies. This
property allows us to separate the wavelet transform of the seismogram into prograde and retrograde components, which can be used for analyzing the polarization properties of multi-component seismograms and can potentially provide additional criteria for wave arrival separation (Kulesh et al., 2005). The scalogram is a two-dimensional function of both translation (i.e., time) and dilation (i.e., frequency). An example of the scalogram computed using the Cauchy wavelet for a seismogram used in our full-3D waveform tomography project for Southern California is shown in Figure 1.8b. The dominant arrival on the seismogram is the surface wave, which corresponds to the global maximum on the scalogram. Other local maxima on the scalogram correspond to other seismic phases. The segmentation is performed automatically on the time-frequency domain scalogram using the topological watershed method, a topological transformation designed to cluster all pixels that are connected to the same local extremum (e.g., Vincent and Soille, 1991; Bertrand, 2005). The earliest watershed algorithm was introduced by Beucher and Lantuejoul (1979) as a tool for segmenting gray-scale images. Efficient implementation of the watershed algorithm based on immersion simulations was proposed by Vincent and Soille (1991) and Meyer (1992). In such immersion simulations, the two-dimensional gray-scale image is reversed and the local maxima become local minima, which are called the catchment basins. The catchment basins are flooded through inlets (seeds) pierced at those local minima. As the flooding progresses, some regions could start to mix and at this point a dam is built to keep the regions separated. As the flood reaches the top of the reversed scalogram, all the dams that have been built during the flooding process form the watershed of the scalogram. A conceptual representation of the immersion simulation using a one-dimensional topography is shown in Figures 1.9a–c. 
The number and the locations of the seeds can be selected in advance to avoid over-segmentation when the signal-to-noise ratio is too low. Catchment basins without seeds can be flooded by water coming from a neighboring catchment basin. The topological approach to the watershed problem was introduced in Couprie and Bertrand (1997). The basic idea is to let the water gradually “erode” the relief of the gray-scale image while preserving the connectivity of every lower cross-section. A conceptual comparison between the immersion simulation and the topological approach is shown in Figures 1.9a–c and Figures 1.9d–f. It was proved in Bertrand (2005) that unlike other types of watershed algorithms, this topological approach is “contrast-preserving” (i.e., the height of the dam is the same as the height of the lowest mountain separating the catchment basins), which is a property useful for further processing (e.g., reconnection of oversegmented regions). The result of the topological watershed transform is a set of “mask files” corresponding to the different portions on the segmented scalogram. Each mask file is composed of a sequence of “1”s and “0”s and each number corresponds to a single pixel on the scalogram. The value is “1” for
Fig. 1.8 An example of the seismogram segmentation process. (a) the time-domain seismogram after the pre-processing step; (b) the time-frequency domain scalogram obtained by applying CWT on the time-domain seismogram; (c) segmented scalogram obtained by applying the TW transform on the scalogram; (d) the time-domain wave packets obtained by converting each portion of the segmented scalogram back into the time domain through ICWT.
Fig. 1.9 Conceptual representations of the immersion simulation used in conventional watershed algorithms (a)–(c) and the topological watershed transform (d)–(f) using a one-dimensional topography. In (a)–(c), different stages of flooding are shown in grey. As regions start to merge, a dam (black vertical bar) is built to separate the water. As the water level keeps increasing, more dams are built until the whole topography is fully flooded. In (d)–(f ), the topological watershed transformation is computed by iteratively lowering W-destructible points until no such points exist and the dams are built. We note that the final altitudes of the dams in (c) are different from those in (f).
pixels within the region selected by the mask file and “0” for the outside pixels. The time-frequency domain window (i.e., the mask file), generated by applying the topological watershed transform to the scalogram Wg s(b, a), is then multiplied onto the continuous wavelet transform Wg s(b, a) and the result is then transformed back to the time-domain through the inverse continuous wavelet transform, as illustrated in Figure 1.8. The purpose for developing our seismogram segmentation and waveform selection algorithm is similar to that in Maggi et al. (2009). The authors have packaged their codes in the software “FLEXWIN”, which is open-source and hosted at the Computational Infrastructure for Geodynamics (CIG) website. In their algorithm, a target seismogram is separated into many small time windows based on the local minima in its short-term average/long-term average ratio (STA/LTA) curve. The local minima below the user-defined water level are “strong” barriers and the local minima between strong barriers are “soft”
barriers. The time-windows that cross strong barriers cannot be merged. The time-windows separated by soft barriers are allowed to merge with their neighboring time-windows or to be rejected, depending on the local maxima of their neighboring time-windows and the separation of distinct phases. In our algorithm, the segmentation is performed in the time-frequency domain, which allows us to separate waves arriving at overlapping time windows but with disjoint frequency domain or time-frequency domain supports. Compared with conventional time-frequency domain filtering techniques, the topological watershed algorithm allows the boundaries separating different wave arrivals to conform naturally to the distribution of energy in the time-frequency domain. An example is shown in Figure 1.10. An explosive source is buried at 300 m depth, and receivers with offsets ranging from 2.5 km to 5.3 km are evenly distributed on the surface to the left of the source. Synthetic seismograms were computed by solving the three-dimensional elastic seismic wave equation using a 4th-order staggered-grid finite-difference code (Olsen, 1994). The vertical-component seismograms for all receivers are shown in Figure 1.10a, from which we can see the direct-arriving P-wave, the Rayleigh wave, a stronger reflected phase PP and a weaker reflected phase pPP. At around 4 km offset, the Rayleigh wave interferes with the reflected PP and pPP arrivals; an example is shown in Figure 1.10c. We segmented all seismograms using our algorithm and plotted only the PP waves in Figure 1.10b, which shows a clean separation between PP and the Rayleigh wave for all offsets, with the shapes of the PP waves well preserved. The scalogram for the selected seismogram in Figure 1.10c is shown in Figure 1.10d, and the segmented scalograms, as well as their corresponding time-domain waveforms, are shown in Figures 1.10e–l. The seismogram shown in Figure 1.10c was also segmented using the FLEXWIN code of Maggi et al. (2009) and the results are shown in Figures 1.10m–n. It is possible to isolate the PP wave by manually adjusting the “internal minima” parameter in FLEXWIN (Maggi et al., 2009); however, the shape of the isolated PP wave is still distorted by the Rayleigh wave, as shown in Figure 1.10m.
Fig. 1.10 A comparison with FLEXWIN. (a) Synthetic seismograms of vertical-component velocity computed using a layer-over-half-space velocity model. Within the circled region, the PP reflection and the surface wave arrive in overlapping time windows. (b) The PP reflection isolated using our segmentation algorithm. (c) An example seismogram with PP, pPP and the surface wave arriving in overlapping time windows. (d) The scalogram of the seismogram shown in (c). (e–l) The segmented scalograms and their corresponding time-domain waveforms obtained using our segmentation algorithm. (m–n) The time windows selected by the FLEXWIN code; the boxes indicate the selected time windows and the dashed line on the STA/LTA curve is the water level. The FLEXWIN parameters used for obtaining this result are $T_{0,1} = 0.0225, 10$; $w_E = 0.0365$; $c_0 = 1.0$; $c_1 = 1.0$; $c_2 = 0.0$; $c_{3a,b} = 2.0, 1.0$; $c_{4a,b} = 2.0, 6.0$; $w_{CC} = 1.0$; $w_{len} = 1.0$; $w_{nwin} = 1.0$.
We note that for phases that are well separated in the time domain, FLEXWIN and our algorithm gave very similar results. An example is the direct-arriving P wave as shown in Figure 1.10f and Figure 1.10m.
1 Applications of Full-Wave Seismic Data Assimilation (FWSDA)
1.2.2 Waveform Selection
The wave packets obtained by segmenting the synthetic seismogram need to be paired with those obtained by segmenting the corresponding observed seismogram before we can make frequency-dependent phase and amplitude misfit measurements. Our waveform selection algorithm has three major steps:

Step (1): We reject some of the wave packets based on their temporal lengths and their arrival times. We require the temporal length of the wave packet to be larger than the pre-set parameter WIN MIN (Tab. 1.2), which is usually set to the shortest period of the pre-processed seismogram. We also require the arrival time of the wave packet to be later than WIN BEG and earlier than WIN END (Tab. 1.2). In our local waveform inversion examples, WIN BEG was set to a few seconds before the first arrival and WIN END was set based on the estimated surface-wave arrival time.

Step (2): For a wave packet obtained from the observed seismogram, we reject it if its signal-to-noise ratio is too low, i.e., SN RA

where $\langle b, a \rangle$ stands for the projection of function $a$ onto the wavelet vector $b$. This matrix represents completely the linear operator (Candès and Demanet, 2003). In matrix form, the operator decomposition can be written as (Wu and Yang, 1997; Wu and Wang, 1998)

$$P = B\,\mathcal{P}\,B^{T} \tag{2.23}$$

where $P$ is the space-domain propagation matrix and $\mathcal{P}$ is the one in the beamlet domain; $B$ is the beamlet vector and $B^{T}$ is its transpose. Figure 2.5 illustrates the concept of propagator decomposition and wave propagation in the beamlet domain. For the beamlet scattering problem, two approaches widely used in wave propagation theory can be adopted: one is the asymptotic approximation for smooth media; the other is the perturbation method for rough heterogeneities. Traditionally the Gaussian beam method adopted almost exclusively the asymptotic approach (see Červený, 1983, 2001), while beamlet imaging uses a combination of the asymptotic method for the background propagation and the perturbation approximation for rough heterogeneities (Wu et al., 2000; Wu and Chen, 2001, 2002a; Chen et al., 2006; Wang and Wu, 2002; Luo and Wu, 2003; Wang et al., 2003, 2005).
Ru-Shan Wu and Jinghuai Gao
Fig. 2.5 Beamlet decomposition and propagation of wavefield.
Fig. 2.6 Matrix representations of Kirchhoff propagation operators in space domain: dx = dz = 25 m, v = 2,000 m/s, N = 128. (a) 5.9 Hz, (b) 25 Hz. Only real parts of the complex operators are plotted.
2
Wavefield Representation, Propagation and Imaging Using Localized Waves. . .
Fig. 2.7 Beamlet propagators (propagation operators in beamlet domain) (Daub4: Daubechies-4 wavelet): The left panel is for DWT (discrete wavelet transform), and the right panel is for best basis wavelet-packets. The top panel is for 5.9 Hz, and the bottom panel is for 25 Hz. Only real parts of the complex operators are plotted.
Fig. 2.8 Beamlet propagators (propagation operators in beamlet domain) (Coif5: Coifman-5 wavelet): The left panel is for DWT (discrete wavelet transform), and the right panel is for best basis wavelet-packets. The top panel is for 5.9 Hz, and the bottom panel is for 25 Hz. Only real parts of the complex operators are plotted.
Fig. 2.9 Propagator matrices of the LCB background propagator in the beamlet domain (Top panel: f =5.9 Hz; Bottom panel: f =25.0 Hz) (Left panel: real part; Right panel: imaginary part) with operator aperture of Nx =256.
In the case of smooth media, the local homogeneous approximation or asymptotic approximation can be applied to the propagator decomposition. Under these approximations, the Green’s function is known or can be approximated by asymptotical solutions. Then the propagator matrix in beamlet domain can be
easily calculated. Different wavelet atoms, such as the Daubechies wavelet D4, the Coifman wavelet C5, best bases of wavelet packets, and the local cosine basis, have been tested for the decomposition of the propagator P (Wu and Yang, 1997; Wu and Jin, 1997; Wu and Wang, 1998; Wang and Wu, 1998a, b). Figures 2.5 to 2.9 show the decomposition of the propagation integral under the local homogeneous approximation into different beamlet domains and a comparison of the sparseness of the beamlet propagators. First, the propagator matrices in the beamlet domain are clearly quite sparse compared to the dense space-domain ones. It is also seen that smooth wavelets (with well-localized wavenumber spectra) deliver better-behaved propagator matrices than the wavelets popular in image compression, such as the orthonormal Daubechies wavelets. In particular, the local cosine basis yields very sparse and well-organized propagator matrices, which are important for efficient implementation. Later in our research, we mostly adopted smooth wavelets, such as the G-D (Gabor-Daubechies) frame and LCB beamlets, for wave propagation and imaging.
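The change of basis behind the propagator decomposition can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the space-domain operator is a homogeneous-medium phase-shift propagator (not the Kirchhoff operator of Fig. 2.6), the orthonormal Haar wavelet stands in for the wavelet bases tested in the text, and the names `space_domain_propagator`, `haar_basis` and `fraction_above` are hypothetical helpers introduced here.

```python
import numpy as np

def space_domain_propagator(n, dx, dz, freq, v):
    """One-way phase-shift propagator of a homogeneous medium as an n-by-n matrix."""
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)                  # horizontal wavenumbers
    kz = np.sqrt((2.0 * np.pi * freq / v) ** 2 - xi ** 2 + 0j)  # vertical wavenumber
    F = np.fft.fft(np.eye(n), axis=0)                           # DFT matrix
    return np.conj(F).T @ np.diag(np.exp(1j * kz * dz)) @ F / n

def haar_basis(n):
    """Orthonormal Haar wavelet vectors as columns (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        m = H.shape[0]
        H = np.vstack([np.kron(H, [1.0, 1.0]),
                       np.kron(np.eye(m), [1.0, -1.0])]) / np.sqrt(2.0)
    return H.T

def fraction_above(M, rel=1e-2):
    """Fraction of entries whose magnitude exceeds rel times the largest entry."""
    return np.mean(np.abs(M) > rel * np.abs(M).max())

n = 128
P = space_domain_propagator(n, dx=25.0, dz=25.0, freq=25.0, v=2000.0)
B = haar_basis(n)
P_beamlet = B.T @ P @ B        # propagator expressed in the wavelet domain
P_back = B @ P_beamlet @ B.T   # back to the space domain, P = B P_beamlet B^T
```

Because the basis is orthonormal, applying `P_beamlet` to the wavelet coefficients `B.T @ u` and reconstructing gives the same result as applying `P` directly to `u`. Calling `fraction_above` on `P` and on `P_beamlet` gives one rough way to compare how the significant entries are distributed in the two domains for a chosen basis.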
2.3.3 Beam Propagation in Smooth Media with High-Frequency Asymptotic Solutions
Equation (2.22) formally defines the beamlet propagator matrix. There is no essential difficulty in calculating propagation matrices for smooth media under the local homogeneous approximation or h-f asymptotic approximation. However, the calculation of the beamlet propagator in generally heterogeneous media is quite complicated. Various approximations have been obtained for different applications. Historically, there are basically two approaches. One is the beam propagation methods which calculate the evolution of the elementary wave (a beamlet) globally using asymptotic solution of the wave equation in smooth media, and synthesize the wavefield at the end point by reconstruction (superposition of all the arriving beams). In this approach, the propagator matrices do not enter into the computation and exist only in the definition. Each beamlet, the elementary wave, evolves into a global beam in the propagating space and arrives at the receiving point, contributing to the final summation. The other is the perturbation approach for wave propagation in the beamlet domain. The propagation of a wavefield is a step-by-step beamlet propagation and coupling using propagator matrices. At each step, the laterally varying velocity profile is decomposed into a background velocity profile and local perturbations. Each beamlet will be spread (in the background media) and scattered (by local perturbations) into other beamlets. No global beam solution is used in this approach. The details of this perturbation approach will be summarized in the next subsection.
2.3.3.1 Conditions for the application of asymptotic analysis
Before reviewing the global beam approach, let us discuss the conditions of applicability for asymptotic solutions in inhomogeneous media. The one-way evolution of a beamlet can be formally written as

$$a_{mn}(x) = e^{\pm iA_n \Delta z}\, b_{mn}(x) \tag{2.24}$$

where $A_n$ is the square-root operator (for the case of scalar waves),

$$A_n \equiv \sqrt{\partial_x^2 + \omega^2/v^2(x, z)} \tag{2.25}$$
Note that $a_{mn}$ is no longer a beamlet due to diffraction and scattering. Redecomposition of $a_{mn}$ into beamlets leads to the formulation of the propagator matrix as seen in Equation (2.22). For general heterogeneous media, the solution of Equation (2.24) can be quite involved and sometimes only numerical methods are applicable. However, when the medium is smooth in comparison to the wavelength, h-f asymptotic approximations may be applied to the solution. In Equations (2.22) or (2.24), if we transform the propagator into the wavenumber ($\xi$) domain, the application of the propagator operator to the beamlet can be written as

$$a_{mn}(x, z) = e^{\pm iA_n z}\, b_{mn}(x, z) \;\Rightarrow\; \int d\xi\, P(x, z)\, e^{i\xi x}\, b_{mn}(\xi, z) \tag{2.26}$$

Standard h-f asymptotic analysis (Garding, 1987; Červený, 1983, 2001; Candès and Demanet, 2005; Demanet, 2006) assumes that $P(x, z)e^{i\xi x}$ can be approximated as

$$P(x, z)\, e^{i\xi x} \sim e^{i\Phi(x,\xi,z)}\, \sigma(x, \xi, z) \tag{2.27}$$

where $\Phi(x, \xi, z)$ is the phase term and $\sigma(x, \xi, z)$ the amplitude term. The phase $\Phi$ is homogeneous of degree one in $\xi$; the amplitude $\sigma \sim \sigma_0 + \sigma_1 + \cdots$ with $\sigma_m$ homogeneous of order $-m$ in $\xi$. Physically, the above conditions mean that the amplitude term is a slowly varying function in space, and the phase term is dominated by the linear dependence on frequency (wavenumber). With this asymptotic approximation, the one-way beamlet propagator can be represented by a Fourier integral operator (FIO). If the phase term in Equation (2.27) can be expressed exactly as a linear function of frequency, i.e.,

$$P(x, z)\, e^{i\xi x} \sim e^{i\xi x}\, \sigma(x, \xi, z) \tag{2.28}$$

and $\sigma(x, \xi, z)$ obeys the constraints

$$\left|\partial_\xi^{\alpha} \partial_x^{\beta} \sigma(x, \xi)\right| \le C_{\alpha,\beta}\, (1 + |\xi|)^{m - |\alpha|} \tag{2.29}$$

for multi-indices $\alpha$ and $\beta$, then the propagator belongs to a type of pseudodifferential operator (PsDO) and $\sigma(x, \xi)$ is its symbol of order $m$ (type (1, 0))
(Grossman, 2005; Demanet, 2006). The space-domain expression of the operator is $\sigma(x, D)$ with $D = -i\nabla$. The condition in Equation (2.29) requires fast decay of the operator spectrum at high frequency, implying smooth variation of the wavefield along $x$. Here $z$ in Equation (2.26) stands for the coordinate along the propagation direction. For long-distance propagation using asymptotic methods, it can be replaced with the curvilinear coordinate along the ray, which obeys the eikonal equation for the phase. The amplitude can be determined by the transport equation. The mathematical requirements for asymptotic analysis can be translated into wave propagation regimes in terms of beam and medium parameters. For beam evolution in inhomogeneous media, there are three basic parameters that determine the propagation regimes: the wavelength $\lambda$, the beam width $\beta$, and the scale of the heterogeneity $a$. First, $a \gg \lambda$ and $\beta \gg \lambda$ must hold for the asymptotic approach to be valid. For short-range propagation, the beam asymptotic solution also requires $\beta < a$; when $\beta > a$, the wave front will be distorted and the propagation is in the diffraction regime (ibid). For long-range beam propagation, two more parameters enter: the range $R$ and the total perturbation strength $\Phi$. The requirement $a \gg \lambda$ is replaced by $a \gg r_F \simeq \sqrt{R\lambda/2\pi}$, where $r_F$ is the Fresnel radius along the beam path. This is understandable, since the diffraction regime will be entered if the scale of heterogeneity is smaller than the Fresnel radius. The other requirement, that the total r.m.s. perturbation strength must not exceed a certain value, applies to long-range propagation in randomly heterogeneous media (Flatté et al., 1979, Chapter 8; Aki and Richards, 1980, Chapter 13; Wu and Aki, 1988), which we will not discuss here.
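As a rough illustration, the regime conditions above can be turned into a small checklist. This is a hedged sketch, not a criterion taken from the cited literature: it assumes the short-range condition reads β < a, takes the Fresnel radius as the square root of Rλ/2π, and interprets "much greater than" as a ratio of at least `factor` (an arbitrary choice). The function names are hypothetical.

```python
import math

def fresnel_radius(path_length, wavelength):
    """Fresnel radius r_F = sqrt(R * lambda / (2 * pi)), assumed form."""
    return math.sqrt(path_length * wavelength / (2.0 * math.pi))

def asymptotic_regime_checks(wavelength, beam_width, hetero_scale,
                             path_length=None, factor=10.0):
    """Return a dict of booleans for the conditions discussed in the text.

    'factor' is an assumed threshold for 'much greater than'.
    If path_length is given (long-range case), a >> lambda is replaced by a >> r_F.
    """
    checks = {
        "beta >> lambda": beam_width >= factor * wavelength,
        "beta < a": beam_width < hetero_scale,
    }
    if path_length is None:
        checks["a >> lambda"] = hetero_scale >= factor * wavelength
    else:
        r_f = fresnel_radius(path_length, wavelength)
        checks["a >> r_F"] = hetero_scale >= factor * r_f
    return checks

# Example: 25 Hz wave in a 2,000 m/s medium (wavelength 80 m)
short_range = asymptotic_regime_checks(80.0, 1000.0, 5000.0)
long_range = asymptotic_regime_checks(80.0, 1000.0, 5000.0, path_length=50000.0)
```

In this example the same medium passes the short-range checks but fails the long-range one, because the Fresnel radius grows with the propagation range.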
Therefore, for strongly and roughly heterogeneous media, the h-f asymptotic approximation may not be applicable and the beamlet propagator (or more generally the linear operators of wave propagation in Equations (2.18) and (2.20)) cannot be represented by Fourier integral operators and pseudodifferential operators. There are mainly two approaches related to beam propagation: the Gaussian beam method and the coherent state method. In fact, they are close relatives, but with different historical origins and different approximations to the asymptotic solution.
2.3.3.2 Gaussian beam method
A Gaussian beam can be considered as evolved from a parabolic approximation of the Gabor beamlet (or Gabor-Daubechies beamlet) in Equation (2.30).

$$b^{g}_{mn}(x) \approx g(x - \bar{x}_n)\, \exp\!\left[ i\bar{\xi}_m x + i\left(k_0 - \frac{\bar{\xi}_m^{2}}{2k_0}\right)\Delta z \right] \tag{2.30}$$
where $\Delta z$ implies that the wave front is defined in the vicinity of the reference point. A Gaussian beam is a wave beam with a Gaussian envelope and a parabolic wave front. Its evolution in smoothly varying media has been extensively studied and documented (Červený et al., 1982; Popov, 1982; see Červený, 2001, and Popov, 2002 for detailed expositions; Červený et al., 2007 for a review of recent progress). The beam propagates along a ray path. The spatial trajectories and the travel times are determined by the eikonal equation (ray tracing), and the change of the beam shape and amplitude is governed by the dynamic ray-tracing equation, which calculates the complex-valued second derivatives of the travel time along the central ray. In this way, the traveltime and amplitude in the paraxial region about the ray can be determined. The evolution of a Gaussian beam depends on two initial parameters, the ray curvature and the beam width. The Gaussian beam summation provides a regular wavefield everywhere, even in media where caustics and shadow zones may be present. The solution for a single beam can be constructed in the f-x domain (e.g., Červený et al., 1982) as well as in the t-x domain using a Hilbert transform (e.g., Hill, 1990, 2000). In seismic imaging, the recorded wavefield is first decomposed using a set of overlapping Gaussian windows. For each windowed data section, a local slant stack is performed to form individual beams (e.g., Raz, 1987; Hale, 1992). Such a decomposition is closely related to the decomposition using the windowed Fourier transform, and efficient reconstructions can be obtained based on the “frame” theory (e.g., Daubechies, 1992; Kaiser, 1994) in wavelet analysis. The Gabor frame-based overcomplete representation provides a redundant yet stable reconstruction of the wavefield (e.g., Einziger, 1986; Nowack et al., 2006). The accuracy of the beam solution depends on the beam width. High-frequency asymptotic analysis can also be applied to the non-approximated Gabor beamlet, and the resulting propagating beam is the windowed Fourier beam (Steinberg, 1993; Steinberg and Birman, 1995), the coherent state beam (Klauder, 1987; Foster and Huang, 1991; Thomson, 2001; Albertin et al., 2001; Foster et al., 2002) or the Gabor-frame beam (Gao et al., 2006). Červený et al. also called the parabolic-approximated Gaussian beam the paraxial Gaussian beam, and the non-approximated beam the strict Gaussian beam (Červený, 2007).
2.3.3.3 Asymptotic coherent-state solutions
The original definition of CST (coherent state transform) is a windowed Fourier transform with Gaussian windows (Klauder and Skagerstam, 1985; Klauder, 1987; Foster and Huang, 1991). Each Gabor atom defined by Equation (2.9) is a coherent state. Later Thomson (2001) proposed to call the propagating coherent
state obtained after making the stationary phase approximation (saddle-point approximation) to the h-f asymptotic solution integral of the wave equation the “elementary coherent state”, even though the solution may not have a Gaussian envelope. Therefore, the term coherent state may have different meanings in the literature. Here we use the term in the general sense and refer to the asymptotic solution of a coherent state (CS) as the asymptotic CS. The coherent state transform (CST) is defined as

$$\varphi(p, x) = \int u(x' + x)\, \exp\!\left(-\tfrac{1}{2}\,\omega\Omega\, x'^{2}\right) e^{-i\omega p x'}\, dx' \tag{2.31}$$

where $u$ is the wavefield in the space domain, $p$ is the horizontal slowness (ray parameter) and $\Omega$ is inversely proportional to the beamwidth of the coherent state (width parameter). The inverse transform is

$$u(x) = \frac{\omega}{2\pi} \int \varphi(p, x)\, dp \tag{2.32}$$

The coherent-state transform is a complete but nonorthogonal representation. A property of the coherent-state transform is that the Heisenberg uncertainty cell is minimized due to the Gaussian windowing. A coherent state can be considered as a sum of weighted plane waves (wavenumber-domain summation) or as a Gaussian-windowed beam (Gaussian beam) with a certain width. In a smooth medium, the h-f asymptotic solution (or semi-classical solution, as called by Klauder, 1987) can be obtained for the coherent state propagation. The phase is calculated by solving the complex eikonal equation (Hamilton-Jacobi equation), and the amplitude by the transport equation. Due to the special inverse transform defined by Equation (2.32), the final wavefield is formed by a summation of coherent states with different directions ($p$) at the same spatial location, without contributions from neighboring beams (coherent states). This is different from the Gaussian beam summation, which is a summation over neighboring beams with different directions. Of course, other reconstruction formulas can be defined. However, the inverse transform defined by Equation (2.32) is used in all the relevant articles.
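The statement that the inverse transform recovers the wavefield from coherent states at the same location can be checked with a small discrete experiment. This is only an illustrative sketch: it assumes the normalization ω/2π for the inverse transform (which follows from integrating the forward transform over p), samples the slowness p over exactly one period of the discrete kernel (which includes values beyond the physically meaningful range), and uses arbitrary test values for the wavefield and for Ω. The function names are hypothetical.

```python
import numpy as np

def coherent_state_transform(u, x, x0, p, omega, Omega):
    """Discrete version of the forward transform at one location x0 (a grid point)."""
    xp = x - x0                                      # local coordinate x'
    dx = x[1] - x[0]
    window = np.exp(-0.5 * omega * Omega * xp ** 2)  # Gaussian window
    kernel = np.exp(-1j * omega * np.outer(p, xp))   # plane-wave (slant-stack) kernel
    return dx * (kernel @ (u * window))

def inverse_at_point(phi, p, omega):
    """Discrete version of the inverse transform: sum over slowness at a fixed location."""
    dp = p[1] - p[0]
    return omega / (2.0 * np.pi) * np.sum(phi) * dp

n, dx = 64, 25.0
x = np.arange(n) * dx
omega = 2.0 * np.pi * 10.0
Omega = 1.0e-6                                       # assumed width parameter
u = np.exp(1j * 0.01 * x) * np.exp(-((x - 800.0) / 300.0) ** 2)  # test wavefield

dp = 2.0 * np.pi / (omega * dx * n)                  # one full period split into n samples
p = (np.arange(n) - n // 2) * dp
x0 = x[20]
phi = coherent_state_transform(u, x, x0, p, omega, Omega)
u_rec = inverse_at_point(phi, p, omega)              # should reproduce u at x0
```

Only the coherent states centered at `x0` enter the sum, and the value of `Omega` drops out of the reconstructed value because the window equals one at x' = 0.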
In this sense, asymptotic CS (coherent state) modeling bears more similarity with the Maslov method than with the Gaussian beam method. The Asymptotic CS method can be considered as a windowed Maslov method; while the Maslov method (see Chapman and Drummond, 1982; Chapman, 2004) can be said to be the limiting case of the Asymptotic CS method when the beamwidth is set to infinity (i.e., Ω =0). The other limiting case of the Asymptotic CS method is when Ω approaches to infinity, i.e. when the beamwidth is close to zero, it becomes the classic ray method. Remember that the wavelength must be approaching zero faster than the beamwidth in order to apply the h-f asymptotic solution. From the calculations of the eikonal and transport equations, we see that there are three parameters to consider for the applicability of the method: beamwidth (1/Ω ), wavefront curvature and
wavelength. That the wavelength is much smaller than both the beamwidth and the wavefront curvature is the prerequisite of the asymptotic solution. Under the asymptotic regime, different approximations can be applied to the cases of large or small curvature/beamwidth (Klauder, 1987; Thomson, 2001, 2004). As for numerical modeling schemes, there are different implementations. Foster and Huang (1991) and Foster et al. (2002) directly apply the inverse transform, Equation (2.32), to synthesize the space-domain wavefield, which is a summation of a bundle of asymptotic CS solutions. The other way is to further apply a stationary phase approximation (or saddle-point approximation) to the integral, so that the final solution is similar to a single CS (Klauder, 1987; Thomson, 2001, 2004). In this way, asymptotic CS ray tracing is similar to classic ray tracing, with the advantage of avoiding caustics. The beam width parameter Ω enters only into the amplitude calculation and the boundary condition of the ray tracing. The actual beamwidth is irrelevant to the final results. Therefore, unlike Gaussian beam dynamic ray tracing, where the beamwidth changes along the propagation path, the asymptotic CS solution assumes a fixed Ω along the ray. The other difference between the coherent state and the Gaussian beam is the inability of the former to model frequency-dependent wave phenomena. The asymptotic CS solution is a global, uniform asymptotic solution and can handle all caustics, including the pseudo caustics for which the Maslov method is invalid. It can also give some approximate results even in shadow zones. However, the solution loses the frequency-dependent wave information and can only model primary waves. On the other hand, the Gaussian beam approximates the wavefield in the paraxial region of the ray with a parabolic approximation. In the original frequency-domain formulation, the Gaussian beam method can explicitly simulate frequency-dependent wave phenomena with a certain degree of approximation.
Later, time (complex time) domain versions (Hill, 1990, 2000) were also derived for seismic imaging with high efficiency but with the loss of frequency-dependent wavefield information. The asymptotic CS method avoids the ray-centered coordinates and each CS has an asymptotic solution. Besides some canonical examples for demonstration purpose in the literature, Albertin et al. (2001) made some comparison of impulse responses and imaging results between Maslov, Gaussian beam and the asymptotic CS migration methods for a complex geological model (the Marmousi model).
2.3.4 Beamlet Propagation in Heterogeneous Media by the Local Perturbation Approach
In non-smooth heterogeneous media, the h-f asymptotic solution for long range propagation has very limited applications. Wu et al. (2000) developed a local
perturbation theory for wave propagation in the beamlet domain. The propagation of a wavefield is not synthesized by the superposition of a collection of globally evolving beams, but by step-by-step beamlet propagation using propagator matrices. At each step, the laterally varying velocity profile is decomposed into a background velocity profile and local perturbations. The decomposition in fact is bi-scale decomposition. The large-scale component is a piecewise homogeneous medium with the scale of window-width defined for the spatial localization (see Fig. 2.2c); the small-scale component is the local perturbations with respect to the local reference velocities (see Fig. 2.2d). By comparison, the standard (global) perturbation scheme has a global background velocity (see Fig. 2.2b) and the perturbations are spreading to all scales (see Fig. 2.2c). In the local perturbation approach, the propagator P is decomposed into a background propagator P 0 and a perturbation term P 1 . The background propagation uses one of the two approximations: the local homogeneous approximation and the average slowness approximation. Both belong to the h-f asymptotic solution for small-step propagation. Since the local perturbations are much weaker than the global perturbations, the perturbation correction operator can be approximated by a phase-screen correction. It should be understood that the propagation of large-angle waves is mainly controlled by the background propagator and can be quite accurate due to the wide operator aperture. Therefore, the beamlet propagator in the local perturbation approach is a hybrid solver with the asymptotic method for the large-scale background media, and the perturbation method for the small-scale fluctuations with respect to the background (see Wu et al., 2000; Wu and Chen, 2001, 2002a; Wang and Wu, 2002, 2003; Luo and Wu, 2003; Chen et al., 2006; Wu et al., 2008, for detailed derivation and discussions). 
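The bi-scale decomposition described above can be sketched for a single laterally varying velocity profile. The example below is a simplified illustration under assumptions that are not taken from the text: windows are non-overlapping and of equal length (the actual windows overlap with tapers), the local reference velocity of each window is taken from the mean slowness, and the velocity values are invented for the demonstration. The helper name `local_reference_velocity` is hypothetical.

```python
import numpy as np

def local_reference_velocity(v, window):
    """Piecewise-constant reference velocity, one value per window.

    Assumes v.size is a multiple of 'window' and uses the mean slowness
    of each window as its reference (an assumed choice).
    """
    blocks = v.reshape(-1, window)
    v0 = 1.0 / np.mean(1.0 / blocks, axis=1)
    return np.repeat(v0, window)

# Invented lateral profile: slow sediments next to a fast body
v = np.concatenate([np.linspace(1950.0, 2050.0, 8),
                    np.linspace(4350.0, 4450.0, 8)])

v0_local = local_reference_velocity(v, window=8)        # large-scale, piecewise component
v0_global = local_reference_velocity(v, window=v.size)  # single global reference

ds_local = 1.0 / v - 1.0 / v0_local     # local slowness perturbations
ds_global = 1.0 / v - 1.0 / v0_global   # global slowness perturbations
```

For this profile the largest local slowness perturbation is much smaller than the largest global one, which is the property that motivates a phase-screen correction within each window.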
One special feature of the beamlet decomposition and propagation is the availability of local angle information during the propagation. Of course, the wavefield decomposition into the local angle domain can be performed by local slant stack or local Fourier transform (e.g., Xu and Lambare, 1998; Sava and Fomel, 2003; Xie and Wu, 2002) during propagation using other types of wave extrapolator. However, the beamlet propagator is formulated in the local angle domain and therefore is more efficient to get the angle-related information. Directional illumination analysis has been developed using the wavefield information in the local angle domain to study the influence of the acquisition configuration and overburden structures to the illumination of subsurface targets with different dip-angles (Wu and Chen, 2002b, 2003, 2006; Xie and Wu, 2002, 2003; Xie et al., 2004, 2006). Based on the illumination analysis, amplitude correction theory and methods to compensate the acquisition aperture effects have been also developed (Wu et al., 2004, Wu and Luo, 2005; Jin et al., 2005).
2.3.4.1 Beamlet evolution and wave propagation in the beamlet domain
Wavefield and propagator decompositions

Substituting the wavefield decomposition equation into the wave Equation (2.17), the beamlets will also propagate obeying the scalar wave equation,

$$\sum_n \sum_m \hat{u}(\bar{x}_n, \bar{\xi}_m; z)\,\left[\partial_x^2 + \partial_z^2 + \omega^2/v^2(x, z)\right] b_{mn}(x) = 0 \tag{2.33}$$

Note that in the above equation, $\hat{u}(\bar{x}_n, \bar{\xi}_m; z)$ is a set of coefficients with $z$ as the labeling parameter, not a variable. The propagation effect of the wavefield is now included in the evolution of beamlets. For a local beam evolution problem, invoking the one-way wave approximation (neglecting interactions between the forward-scattered and backscattered waves), we can write a formal solution for the evolution of beamlets

$$a_{mn}(x) = e^{\pm iA_n \Delta z}\, b_{mn}(x) \tag{2.34}$$

where $a_{mn}$ is a function evolved from a beamlet $b_{mn}$ propagating in the heterogeneous medium, and $A_n$ is the square-root operator

$$A_n \equiv \sqrt{\partial_x^2 + \omega^2/v^2(x, z)} \tag{2.35}$$

As we mentioned above, $a_{mn}$ is no longer a beamlet due to distortion after propagation. Decomposing $a_{mn}$ with the same beamlet basis functions (or with the dual frame atom in the case of frame beamlets) gives

$$a_{mn}(x) = \sum_l \sum_j \langle a_{mn}, b_{jl} \rangle\, b_{jl}(x) \tag{2.36}$$

The propagator matrix $\mathcal{P}$ in the beamlet domain is the one we defined in Equation (2.22),

$$\mathcal{P}_{jl,mn} = \mathcal{P}(\bar{x}_l, \bar{\xi}_j; \bar{x}_n, \bar{\xi}_m) = \langle a_{mn}(x), b_{jl}(x) \rangle \tag{2.37}$$

The beamlet-domain wavefield at depth $z + \Delta z$ can be obtained as

$$\hat{u}(\bar{x}_l, \bar{\xi}_j; z + \Delta z) = \sum_n \sum_m \mathcal{P}(\bar{x}_l, \bar{\xi}_j; \bar{x}_n, \bar{\xi}_m)\, \hat{u}(\bar{x}_n, \bar{\xi}_m; z) = \sum_n \sum_m \mathcal{P}_{jl,mn}\, \hat{u}(\bar{x}_n, \bar{\xi}_m; z) \tag{2.38}$$

Here $\mathcal{P}_{jl,mn}$ are the matrix elements of the beamlet propagator matrix $\mathcal{P}$, which governs the beamlet propagation and cross-coupling.

Wavefield reconstruction

Here we perform the beamlet decomposition using orthogonal bases, so the reconstruction atoms are the same as the decomposition atoms. The wavefield at depth $z + \Delta z$ after extrapolation can be reconstructed from the beamlet-domain wavefield through

$$u(x, z + \Delta z) = \sum_n \sum_m \hat{u}(\bar{x}_n, \bar{\xi}_m; z)\, a_{mn}(x) = \sum_l \sum_j \hat{u}(\bar{x}_l, \bar{\xi}_j; z + \Delta z)\, b_{jl}(x) \tag{2.39}$$

2.3.4.2 Beamlet propagator with local perturbation approximation
The main task for beamlet imaging (migration) is to derive efficient propagators in the chosen beamlet domain. The evolution of beamlets is governed by Equation (2.35), which involves a square-root operator, and there is no exact solution available for a general problem. Various approximations are invoked to make the calculation practical. Exploiting the efficiency of fast wavelet transforms and the sparseness of the propagator matrix, Wu et al. (2000) applied a local perturbation approximation to the beamlet propagator, resulting in a split-step implementation of wave propagation in the beamlet domain. Since no high-frequency asymptotics is involved, the local perturbation approach can keep all the wave phenomena of forward propagation, such as diffraction, interference, scattering and cross-coupling between beamlets, in the propagator. One successful example of applying the local perturbation theory is the LCB (local cosine basis) beamlet propagator and imaging algorithm.

In the local perturbation theory, a local reference velocity $v_0(\bar{x}_n, z)$ is selected for each window $\bar{x}_n$, and the local perturbation is calculated from the local reference velocity. Due to the adaptability of local reference velocities to the lateral variations of the velocity model, the local perturbations are generally small, so that the first-order approximation, i.e., the phase-screen approximation, can be adopted for the perturbation correction in each window. This leads to the approximation of the square-root operator (see Wu et al., 2000; Chen et al., 2006) as

$$A_n \equiv \sqrt{\partial_x^2 + \omega^2/v^2(x, z)} \approx \sqrt{\partial_x^2 + \omega^2/v_0^2(\bar{x}_n, z)} + \Delta k_n(x) + \cdots \tag{2.40}$$

where $\Delta k_n(x) = \omega\left(1/v(x, z) - 1/v_0(\bar{x}_n, z)\right)$ denotes the local perturbations. Therefore, the beamlet evolution equation can be approximated by

$$a_{mn}(x) = e^{i\Delta k_n(x)\Delta z}\, \frac{1}{2\pi} \int d\xi\, e^{i\xi x}\, e^{i\zeta_n \Delta z}\, b_{mn}(\xi) \tag{2.41}$$

where

$$b_{mn}(\xi) = \int dx\, e^{-i\xi x}\, b_{mn}(x) \tag{2.42}$$

is the basis vector in the wavenumber domain, $\xi$ is the horizontal wavenumber and

$$\zeta_n = \sqrt{\omega^2/v_0^2(\bar{x}_n, z) - \xi^2} \tag{2.43}$$
is the vertical wavenumber with the local reference velocity. Equation (2.41) is a dual-domain implementation of an operator split-step approximation. The first factor on the right-hand side is a phase-screen term in the space-domain; the second factor is a phase-shift in the wavenumber domain using the local reference velocity and localized by a beamlet projection. Efficient algorithms of propagation and imaging in the beamlet domain based on Equation (2.41) have been developed using frame or basis beamlets (Wu and Chen, 2001, 2002a, b; Wang and Wu, 2002; Wu et al., 2003, 2008; Chen et al., 2006). Figure 2.10 gives a schematic illustration of the decomposition of a lateral velocity section into a background velocity profile and local perturbations. We see that the decomposition in fact is a bi-scale decomposition. The large-scale component is a piecewise homogeneous medium with the scale of window-width; the small-scale component is the local perturbations with respect to the local reference velocities. By comparison, the standard perturbation scheme has a global background velocity and global perturbations spreading to all scales. We see that the local perturbations are in general much smaller than the global perturbations, and therefore can reach higher accuracy in extrapolation and imaging in strongly heterogeneous media.
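The dual-domain structure of the split-step approximation, a phase shift in the wavenumber domain followed by a phase screen in the space domain, can be sketched for a single window as follows. This is a minimal illustration rather than the published beamlet algorithm: it uses a periodic FFT over the whole line instead of a beamlet projection, and the function name `split_step_evolve` and the numerical values are assumptions made for the example.

```python
import numpy as np

def split_step_evolve(b, dx, dz, omega, v, v0):
    """Advance a sampled function b(x) by one depth step dz.

    b     : complex samples of the input (e.g., one beamlet) on a regular grid
    v     : velocity samples v(x) at the current depth
    v0    : local reference velocity of the window (scalar)
    """
    xi = 2.0 * np.pi * np.fft.fftfreq(b.size, d=dx)       # horizontal wavenumber
    zeta = np.sqrt((omega / v0) ** 2 - xi ** 2 + 0j)      # vertical wavenumber, complex if evanescent
    b_hat = np.fft.fft(b)                                 # to the wavenumber domain
    shifted = np.fft.ifft(np.exp(1j * zeta * dz) * b_hat) # phase shift with v0
    dk = omega * (1.0 / v - 1.0 / v0)                     # local perturbation
    return np.exp(1j * dk * dz) * shifted                 # phase screen in the space domain

# Check with a plane wave that fits the periodic grid exactly
n, dx, dz = 128, 25.0, 10.0
x = np.arange(n) * dx
omega, v0 = 2.0 * np.pi * 25.0, 2000.0
xi0 = 2.0 * np.pi * 5 / (n * dx)
plane = np.exp(1j * xi0 * x)
out = split_step_evolve(plane, dx, dz, omega, np.full(n, v0), v0)
expected = plane * np.exp(1j * np.sqrt((omega / v0) ** 2 - xi0 ** 2) * dz)
```

When the velocity equals the reference velocity the phase screen is the identity and a plane wave only acquires the phase of its vertical wavenumber; a uniform velocity different from the reference adds a constant phase factor with exponent Δk Δz.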
Fig. 2.10 Background velocity profile (using local reference velocities) and local perturbations in the local perturbation theory, compared with the global reference velocity (dashed line) and global perturbations in standard perturbation methods. Note that local reference velocities are plotted for the nominal windows. The real windows overlap each other with taper functions.
2.3.4.3 Beamlet imaging in strongly heterogeneous media
As a numerical example, Figure 2.11 shows the image of the prestack depth migration using the LCB (local cosine basis) beamlet propagator for the 2D
SEG/EAGE salt model. The minimum velocity of the model is 5,000 feet/s and the maximum velocity is 14,700 feet/s. The salt boundary is sharp and very irregular, especially on the top. Therefore the model is a strong-contrast and rough heterogeneous medium, and the h-f asymptotic method alone will not work. The acquisition system of this model consists of 325 shots (sources), each with 176 left-hand-side receivers at a receiver interval of 40 feet. Figure 2.11a is the image obtained using the LCB beamlet propagator without acquisition aperture correction. The aperture-effect corrected image is shown in Figure 2.11b. We see the excellent quality of the image and the significant improvement brought by the acquisition-aperture compensation in the local angle domain.
Fig. 2.11 Images of prestack depth migration using LCB (local cosine basis) beamlet propagator for the 2D SEG/EAGE salt model (see Fig. 2.1): (a) the image before the acquisition-aperture correction, (b) image after the acquisition-aperture correction (from Cao and Wu, 2006).
2.4 Curvelet and Wave Propagation
2.4.1 Curvelet and Its Generalization
The curvelet transform was originally developed for efficient representation of images with sharp and curved edges (Candès and Donoho, 2000, 2005). Curvelets are
elementary oscillatory patterns that are highly anisotropic at fine scales, with effective support obeying the parabolic scaling: width ≈ length². Later the scaling law was loosened and modified to width ≈ wavelength² in order to let its cousin, the “wave atom”, and other family members live in the same kingdom (Demanet and Ying, 2006). The latter will be discussed later in this section. Figure 2.11 gives the definition of curvelets and the original meaning of parabolic scaling. A curvelet is a directional wavelet, indexed by three parameters: a scale a, 0

β≫λ is not satisfied, or other criteria of asymptotic analysis are violated, beamlet or curvelet scattering will occur and needs to be studied. For beamlet propagation, a local perturbation theory and method have been developed to handle strong and rough heterogeneities. The propagator is decomposed into a background propagator and a perturbation operator for each forward marching step. For background propagation, asymptotic solutions can be applied to the piecewise background media, and a phase-screen correction is performed for the local perturbations. Numerical examples demonstrated the superior image quality for subsalt structures. Future work on beamlet/curvelet scattering in generally heterogeneous media is needed for the development of efficient methods for propagation and imaging. The multi-scale nature and the advantages of phase-space localization have to be further studied and put into applications in imaging and inversion.
Acknowledgement We have benefited from the discussion with Yingcai Zheng, Jun Cao, Xiao-Bi Xie and Chuck Mosher. Yingcai has contributed to the review of Gaussian beam and coherent state. Yu Geng, Yaofeng He, Jian Mao, Lingling Wang, Bangyu Wu have participated in the preparation of the draft and helped in drawing some figures. We are very grateful to Laurent Demanet, Huub Douma and Martijn de Hoop for allowing the use of their figures in this review article. We would like to acknowledge the support from WTOPI (Wavelet Transform On Propagation and Imaging for seismic exploration) Project and the DOE/Basic Energy Sciences project at University of California, Santa Cruz, California, USA
132
Ru-Shan Wu and Jinghuai Gao
and the support from the National Natural Science Foundation of China (NSFC) and other projects of Xi’an Jiaotong University.
References
Aki, K. and Richards, P.G., 1980. Quantitative Seismology, Theory and Methods, vol. 2. W.H. Freeman and Company, San Francisco.
Albertin, U., Yingst, D., and Jaramillo, H., 2001. Comparing common-offset Maslov, Gaussian beam, and coherent state migrations. Expanded abstracts, SEG 71st Annual Meeting.
Albertin, U., Yingst, D., Jaramillo, H., and Wiggins, W., 2002. Towards a hybrid raytrace-based beam/wavefield-extrapolated beam migration algorithm. Expanded abstracts, SEG 72nd Annual Meeting, 1344–1347.
Ali, S.T., Antoine, J.-P., and Gazeau, J.-P., 2000. Coherent States, Wavelets and Their Generalizations. Springer, Chapter 15.
Antoine, J.-P., 2004. The 2-D wavelet transform, physical applications and generalizations. In: J.C. Van den Berg (Ed.). Wavelets in Physics. Cambridge Univ. Press, 23–75.
Antoine, J.-P. and Murenzi, R., 1996. Two-dimensional directional wavelets and the scale-angle representation. Signal Processing, 52, 259–281.
Auscher, P., 1994. Remarks on the local Fourier bases. In: Benedetto J.J. and Frazier M.W. (Eds.). Wavelets, Mathematics and Applications. CRC Press, 203–218.
Averbuch, A., Braverman, L., Coifman, R., Israeli, M., and Sidi, A., 2002. Efficient computation of oscillatory integrals via adaptive multiscale local Fourier bases. Appl. Comput. Harm. Anal., 9(1), 19–53.
Babich, V. and Ulin, V.V., 1984. Complex space-time ray method and ‘quasiphotons’. Journal of Mathematical Sciences, 24(3), 269–273.
Balian, R., 1981. Un principe d’incertitude fort en théorie du signal ou en mécanique quantique. C. R. Acad. Sci. Paris Sér. II, 292, 1357–1361.
Baraniuk, R.G. and Jones, D.L., 1992. New dimensions in wavelet analysis. Proc. IEEE Intern. Conf. Acoust. Speech Signal Process. IEEE Press, Piscataway, NJ.
Bastiaans, M.J., 1993. Gabor’s signal expansion and its relation to sampling of the sliding-window spectrum. In: Marks II JR (Ed.). Advanced Topics in Shannon Sampling and Interpolation Theory. Springer-Verlag, Berlin.
Battle, G., 1988. Heisenberg proof of the Balian-Low theorem. Lett. Math. Phys., 15, 175–177.
Battle, G., 1992. Wavelets: A renormalization group point of view. In: Ruskai M.B., Beylkin G., Coifman R., Daubechies I., Mallat S., Meyer Y., and Raphael L. (Eds.). Wavelets and Their Applications. Jones and Bartlett, Boston.
Benedetto, J.J. and Frazier, M.W., 1993. Wavelets: Mathematics and Their Applications. CRC Press, Boca Raton.
Benedetto, J.J. and Walnut, D.F., 1993. Gabor frames for L2 and related spaces. In: Benedetto J.J. and Frazier M.W. (Eds.). Wavelets: Mathematics and Their Applications. CRC Press, Boca Raton.
Benedetto, J.J. and Zayed, A.I., 2004. Sampling, Wavelets, and Tomography. Birkhäuser.
Beylkin, G., 1985. Imaging of discontinuities in the inverse scattering problem by inversion of a causal generalized Radon transform. J. Math. Phys., 26, 99–108.
Beylkin, G. and Sandberg, K., 2005. Wave propagation using bases for bandlimited functions. Wave Motion, 41(3), 263–291.
Beylkin, G., Coifman, R., and Rokhlin, V., 1991. Fast wavelet transforms and numerical algorithms. Comm. Pure Appl. Math., 44, 141–183.
Calderón, A.P., 1950. On the behaviour of harmonic functions at the boundary. Trans. Amer. Math. Soc., 68, 47–54.
Candès, E.J., 1999. Harmonic analysis of neural networks. Appl. Comput. Harmon. Anal., 6, 197–218.
Candès, E.J. and Demanet, L., 2003. Curvelets and Fourier integral operators. C. R. Acad. Sci. Paris, Ser. I, 336, 395–398.
Candès, E.J. and Demanet, L., 2005. The curvelet representation of wave propagators is optimally sparse. Comm. Pure Appl. Math., 58(11), 1472–1528.
Candès, E. and Donoho, D., 1999. Ridgelets: The key to high-dimensional intermittency? Phil. Trans. R. Soc. Lond. A, 357, 2495–2509.
Candès, E. and Donoho, D., 2000. Curvelets: A surprisingly effective nonadaptive representation for objects with edges. In: Cohen A., Rabut C., Schumaker L. (Eds.). Curves and Surfaces Fitting. Vanderbilt University Press, Nashville, 105–120.
Candès, E.J. and Donoho, D., 2002. Recovering edges in ill-posed inverse problems: Optimality of curvelet frames. Ann. Statist., 30, 784–842.
Candès, E.J. and Donoho, D.L., 2005. Continuous curvelet transform: I. Resolution of the wavefront set. Appl. Comput. Harmon. Anal., 19(2), 162–197.
Candès, E.J., Demanet, L., Donoho, D.L., and Ying, L., 2006. Fast discrete curvelet transforms. SIAM Multiscale Model. Simul., 5(3), 861–899.
Candès, E.J., Romberg, J.K., and Tao, T., 2006. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8), 1207–1223.
Cao, J. and Wu, R.S., 2005. Influence of propagator and acquisition aperture on image amplitude. Expanded abstracts, SEG 75th Annual Meeting, 1946–1949.
Cao, J. and Wu, R.S., 2006. Amplitude compensation of one-way wave propagators in inhomogeneous media and its application to seismic imaging. Special Issue: Computational Geophysics, Communications in Computational Physics, 3(1), 203–221.
Červený, V., 1983. Synthetic body wave seismograms for laterally varying structures by the Gaussian beam method. Geophys. J. R. astr. Soc., 73, 389–426.
Červený, V., 2001. Seismic Ray Theory. Cambridge University Press.
Červený, V., Klimeš, L., and Pšenčík, I., 2007. Seismic ray method: Recent developments. Advances in Geophysics, 48, 1–126.
Červený, V., Popov, M.M., and Pšenčík, I., 1982. Computation of wave fields in inhomogeneous media: Gaussian beam approach. Geophys. J. R. astr. Soc., 70, 109–128.
Chauris, H., 2006. Seismic imaging in the curvelet domain and its implications for the curvelet design. Expanded abstracts, SEG 76th Annual Meeting, 2404–2410.
Chen, L. and Wu, R.S., 2002. Target-oriented prestack beamlet migration using Gabor-Daubechies frames. Expanded abstracts, SEG 72nd Annual Meeting, 1356–1359.
Chen, L., Wu, R.S., and Chen, Y., 2006. Target-oriented beamlet migration based on Gabor-Daubechies frame decomposition. Geophysics, 71, S37–S52.
Cohen, L., 1995. Time-frequency Analysis. Prentice Hall PTR, New Jersey.
Cohen, A., 2003. Numerical Analysis of Wavelet Methods. North-Holland, Elsevier.
Coifman, R.R. and Meyer, Y., 1991. Remarques sur l’analyse de Fourier à fenêtre. Comptes Rendus de l’Académie des Sciences, Paris, Série I, 312, 259–261.
Coifman, R.R. and Wickerhauser, M.V., 1994. Wavelet and adapted waveform analysis. In: Benedetto J.J. and Frazier M.W. (Eds.). Wavelets, Mathematics and Applications. CRC Press, 399–423.
Coifman, R.R., Matviyenko, G., and Meyer, Y., 1997. Modulated Malvar-Wilson bases. Appl. Comput. Harmon. Anal., 4, 58–61.
Chapman, C.H., 2004. Fundamentals of Seismic Wave Propagation. Cambridge University Press.
Chapman, C.H. and Drummond, R., 1982. Body-wave seismograms in inhomogeneous media using Maslov asymptotic theory. Bull. Seism. Soc. Am., 72, 277–317.
Choudhary, S. and Felsen, L.B., 1974. Analysis of Gaussian beam propagation and diffraction by inhomogeneous wave tracking. Bull. Seism. Soc. Am., 72, 5277–5317.
Daubechies, I., 1988. Time-frequency localization operators: A geometric phase space approach. IEEE Trans. Inform. Theory, 34, 605–612.
Daubechies, I., 1990. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inform. Theory, 36, 961–1005.
Daubechies, I., 1992. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania.
Daubechies, I. and Janssen, A.J.E.M., 1993. Two theorems on lattice expansions. IEEE Trans. Inform. Theory, 39(1), 3–6.
Daubechies, I., Jaffard, S., and Journé, J.-L., 1991. A simple Wilson orthonormal basis with exponential decay. SIAM J. Math. Anal., 22, 554–573.
Demanet, L., 2006. Curvelets, wave atoms, and wave equations. Ph.D. Thesis, California Inst. of Tech., California.
Demanet, L. and Ying, L., 2007. Curvelets and wave atoms for mirror-extended images: Optical Engineering Applications. International Society for Optics and Photonics, 67010J, doi: 10.1117/12.733257.
Deschamps, G.A., 1971. Gaussian beams as a bundle of complex rays. Electronics Lett., 7, 684–685.
De Hoop, M. and Stolk, C.C., 2002. Microlocal analysis of seismic inverse scattering in anisotropic, elastic media. Comm. Pure Appl. Math., 55, 261–301.
Do, M.N. and Vetterli, M., 2003. Contourlets. In: Welland G.V. (Ed.). Beyond Wavelets. Academic Press.
Donoho, D., 1995. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Appl. Comput. Harmon. Anal., 2, 101–126.
Donoho, D., 1999. Tight frames of k-plane ridgelets and the problem of representing objects that are smooth away from d-dimensional singularities in R^n. Proc. Nat. Acad. Sci., 96, 1828–1833.
Donoho, D.L., 2006. Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306.
Douma, H. and de Hoop, M.V., 2004. Wave-character preserving pre-stack map migration using curvelets. Expanded Abstracts, 74th SEG Annual International Meeting.
Douma, H. and de Hoop, M.V., 2005. On common-offset pre-stack time-migration with curvelets. Expanded Abstracts, 75th SEG Annual International Meeting, 2009–2012.
Douma, H. and de Hoop, M.V., 2006. Leading-order seismic imaging using curvelets. Expanded abstracts, SEG 76th Annual Meeting, 2411–2415.
Douma, H. and de Hoop, M.V., 2007. Leading-order seismic imaging using curvelets. Geophysics, 71(1), S13–S28.
Einziger, P.D., Raz, S., and Shapira, M., 1986. Gabor representation and aperture theory. J. Opt. Soc. Am., A 3, 508–522.
Felsen, L.B., 1975. Complex rays. Philips Res. Rep., 30, 185–187.
Felsen, L.B., 1976. Complex-source-point solutions of the field equations and their relation to the propagation and scattering of Gaussian beams. Symposia Mathematica, Istituto Nazionale di Alta Matematica, XVIII. Academic Press, 40–56.
Felsen, L.B., 1984. Geometrical theory of diffraction, evanescent waves, complex rays and Gaussian beams. Geophys. J. R. astr. Soc., 79, 77–88.
Feichtinger, H.G. and Strohmer, T., 1998. Gabor Analysis and Algorithms, Theory and Applications. Birkhäuser.
Fishman, L. and McCoy, J.J., 1984. Derivation and application of extended parabolic wave theories II. Path integral representations. J. Math. Phys., 25, 297–308.
Flatté, S.M., Dashen, R., Munk, W.H., Watson, K., and Zachariasen, F., 1979. Sound Transmission Through a Fluctuating Ocean. Cambridge University Press.
Fomel, S., 2006. Towards the Seislet Transform. SEG Annual Meeting, New Orleans, USA.
Foster, D. and Huang, J., 1991. Global asymptotic solutions of the wave equation. Geophys. J. Int., 105, 163–171.
Foster, D.J., Lane, F.D., Mosher, C.C., and Wu, R.S., 1997. Wavelet transforms for seismic data processing. Expanded abstracts, SEG 67th Annual Meeting, 1318–1321.
Foster, D.J., Wu, R.S., and Mosher, C.C., 2002. Coherent-state solutions of the wave equation. Expanded abstracts, SEG 72nd Annual Meeting, 1348–1351.
Geng, Y., Wu, R.S., and Gao, J.H., 2013. Gabor frame based Gaussian packet migration. Geophysical Prospecting, in press.
Haar, A., 1910. Zur Theorie der orthogonalen Funktionensysteme. Math. Ann., 69, 331–371.
Hale, D., 1992. Migration by the Kirchhoff, slant stack, and Gaussian beam methods. Colorado School of Mines Center for Wave Phenomena Report 121.
Heil, C. and Walnut, D., 1989. Continuous and discrete wavelet transforms. SIAM Rev., 31, 628–666.
Hernández-Figueroa, H., Zamboni-Rached, M., and Recami, E., 2008. Localized Waves. Wiley Series in Microwave and Optical Engineering. Wiley-Interscience.
Herrmann, F., 2003. Optimal seismic imaging with curvelets. Expanded Abstracts, SEG 70th Annual International Meeting, 997–1000.
Herrmann, F.J., 2010. Randomized sampling and sparsity: Getting more information from fewer samples. Geophysics, 75(6), 173–187.
Heyman, E. and Steinberg, B.Z., 1987. Spectral analysis of complex source pulsed beams. J. Opt. Soc. Am., A4, 473–480.
Hill, N.R., 1990. Gaussian beam migration. Geophysics, 55, 1416–1428.
Hill, N.R., 2000. Prestack Gaussian beam depth migration. Geophysics, 66, 1240–1250.
Holschneider, M., 1995. Wavelets, An Analysis Tool. Clarendon Press, Chapter 5.
Gabor, D., 1946. Theory of communication. J. Inst. Electr. Eng., 93(III), 429–457.
Gao, J.H., Zhou, Y., Mao, J., Chen, W., Wu, R.S., and Li, Y., 2006. A wave propagation method in local angle domain. Chinese Geophys. J., 50, 249–259.
Gårding, L., 1987. Singularities in Linear Wave Propagation. Lecture Notes in Mathematics. Springer.
Geng, Y., Wu, R.S., and Gao, J.H., 2009. Dreamlet transform applied to seismic data compression and its effects on migration. Expanded abstracts, SEG 79th Annual Meeting, 28, 3640–3644.
Goupillaud, P., Grossmann, A., and Morlet, J., 1984. Cycle-octave and related transforms in seismic signal analysis. Geoexploration, 23, 85–102.
Grossman, J.P., 2005. Theory of adaptive, nonstationary filtering in the Gabor domain with applications to seismic inversion. Ph.D. Thesis, University of Calgary, Canada.
Grossmann, A. and Morlet, J., 1984. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM J. Math. Anal., 15, 723–736.
Grossmann, A., Morlet, J., and Paul, T., 1985. Transforms associated to square integrable group representations I: General results. J. Math. Phys., 26, 2473–2479.
Grossmann, A., Morlet, J., and Paul, T., 1986. Transforms associated to square integrable group representations II: Examples. Ann. Inst. Henri Poincaré Physique théorique, 45, 293–309.
Grossmann, A. and Paul, T., 1984. Wave functions on subgroups of the group of affine canonical transformations. In: Resonances, Models and Phenomena. Lecture Notes in Physics, Vol. 211. Springer-Verlag, Berlin.
Jaffard, S., Meyer, Y., and Ryan, R.D., 2001. Wavelets: Tools for Science and Technology. Society for Industrial and Applied Mathematics, Philadelphia.
Jin, S., Luo, M.Q., Wu, R.S., and Walraven, D., 2005. Application of beamlet propagator to migration amplitude correction. Expanded abstracts, SEG 75th Annual Meeting, 1962–1965.
Kaiser, G., 1994. A Friendly Guide to Wavelets. Birkhäuser, Boston, Basel, Berlin.
Kaiser, G., 2004. Eigenwavelets of the wave equation. Signals & Waves, Austin, TX.
Keller, J.B. and Streifer, W., 1971. Complex rays with an application to Gaussian beams. J. Opt. Soc. Am., 61, 40–43.
Kiselev, A.P. and Perel, M.V., 1999. Gaussian wave packets. Optics and Spectroscopy, 86(3), 307–309.
Klauder, J.R., 1987. Global, uniform, asymptotic wave-equation solutions for large wavenumbers. Annals of Physics, 180, 108–151.
Klauder, J.R. and Skagerstam, B.-S., 1985. Coherent States. World Scientific, Singapore.
Kleyn, A., 1977. On the migration of reflection time contour maps. Geophysical Prospecting, 25, 125–140.
Klimeš, L., 1989. Gaussian packets in the computation of seismic wavefields. Geophysical Journal International, 99(2), 421–433.
Kravtsov, Yu.A., 1967. Complex rays and complex caustics. Radiophys. Quantum Electronics, 10, 1283–1304.
Kravtsov, Yu.A. and Berczynski, P., 2007. Gaussian beams in inhomogeneous media: A review. Stud. Geophys. Geod., 51, 1–36.
Li, S.X. and Liu, J.Q., 1994. Wavelet Transform and the Mathematical Basis of Inversion. Geological Press of China. (in Chinese)
Low, F., 1985. Complete sets of wave packets. In: A Passion for Physics: Essays in Honor of Geoffrey Chew. World Scientific, Singapore, 17–22.
Luo, M. and Wu, R.S., 2003. 3D beamlet prestack depth migration using the local cosine basis propagator. Expanded abstracts, SEG 73rd Annual Meeting, 985–988.
Luo, M., Wu, R.S., and Xie, X.B., 2004. Beamlet migration using local cosine basis with shifting windows. Expanded abstracts, SEG 74th Annual Meeting, 945–948.
Luo, M., Wu, R.S., and Xie, X.B., 2005. True amplitude one-way propagators implemented with localized corrections on beamlets. Expanded abstracts, SEG 75th Annual Meeting, 1966–1969.
Mallat, S., 1989. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Patt. Anal. Mach. Intell., 11, 674–693.
Mallat, S., 1998. A Wavelet Tour of Signal Processing. Academic Press.
Mallat, S., 1999. A Wavelet Tour of Signal Processing, second edition. Academic Press.
Mallat, S. and Zhang, Z., 1993. Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process., 41, 3397–3415.
Margrave, G.F. and Ferguson, R.J., 1999. Wavefield extrapolation by nonstationary phase shift. Geophysics, 64, 1067–1078.
Malvar, H.S., 1990. Lapped transforms for efficient transform/subband coding. IEEE Trans. Acoust. Speech Signal Process., 38, 969–978.
Malvar, H.S., 1992. Signal Processing with Lapped Transforms. Artech House, Norwood, MA.
Meyer, F.G., 1998. Image compression in libraries of bases. Lecture notes for a course given at the Institut Henri Poincaré, Paris.
Meyer, Y. and Coifman, R.R., 1997. Brushlets: A tool for directional image analysis and image compression. Appl. Comput. Harmon. Anal., 4, 147–187.
Meyer, Y. and Coifman, R.R., 1997. Wavelets, Calderón-Zygmund and Multilinear Operators. Cambridge University Press, Cambridge.
Morlet, J., Arens, G., Fourgeau, E., and Giard, D., 1982a. Wave propagation and sampling theory, Part I: Complex signal and scattering in multilayered media. Geophysics, 47, 203–221.
Morlet, J., Arens, G., Fourgeau, E., and Giard, D., 1982b. Wave propagation and sampling theory, Part II: Sampling theory and complex waves. Geophysics, 47, 222–236.
Mosher, C.C., Foster, D.J., and Wu, R.S., 1996. Phase shift migration with wave packet algorithms. Mathematical Methods in Geophysical Imaging IV. Proc. SPIE, 2822, 2–16.
Nowack, R.L., 2003. Calculation of synthetic seismograms with Gaussian beams. Pure Appl. Geophys., 160, 487–507.
Nowack, R.L., Dasgupta, S., Schuster, G.T., and Sheng, J.M., 2006. Correlation migration using Gaussian beams of scattered teleseismic body waves. Bull. Seism. Soc. Am., 96, 1–10.
Norris, A.N., White, B.S., and Schrieffer, J.R., 1987. Gaussian wave-packets in inhomogeneous media with curved interfaces. Proceedings of the Royal Society of London, Series A: Mathematical, Physical and Engineering Sciences, 412(1842), 93–123.
Pascal, A., Guido, W., and Wickerhauser, M.V., 1992. Local sine and cosine bases of Coifman and Meyer and the construction of smooth wavelets. In: Charles K. Chui (Ed.). Wavelets: A Tutorial in Theory and Applications. Academic Press, Inc., 237–256.
Perel, M.V. and Sidorenko, M.S., 2007. New physical wavelet ‘Gaussian wave packet’. Journal of Physics A: Mathematical and Theoretical, 40(13), 3441–3461.
Popov, M.M., 1982. A new method of computation of wave fields using Gaussian beams. Wave Motion, 4, 85–97.
Popov, M.M., 2002. Ray Theory and Gaussian Beam Method for Geophysicists. EDUFBA, Salvador-Bahia.
Qian, J.L. and Ying, L.X., 2010. Fast Gaussian wavepacket transforms and Gaussian beams for the Schrödinger equation. Journal of Computational Physics, 229(20), 7848–7873.
Qian, S. and Chen, D.P., 1996. Joint Time-Frequency Analysis, Methods and Applications. Prentice-Hall Inc.
Ralston, J., 1983. Gaussian beams and the propagation of singularities. In: Littman W. (Ed.). Studies in Partial Differential Equations. MAA Studies in Mathematics, 23, 206–248.
Raz, S., 1987. Beam stacking: A generalized preprocessing technique. Geophysics, 52, 1199–1210.
Ristow, D. and Rühl, T., 1994. Fourier finite-difference migration. Geophysics, 59, 1882–1893.
Sava, P. and Fomel, S., 2003. Angle-domain common-image gathers by wavefield continuation methods. Geophysics, 68(3), 1065–1074.
Smith, H., 1998a. A Hardy space for Fourier integral operators. J. Geom. Anal., 8, 629–653.
Smith, H., 1998b. A parametrix construction for wave equations with coefficients. Ann. Inst. Fourier (Grenoble), 48, 797–835.
Stein, E.M., 1993. Harmonic Analysis: Real-variable Methods, Orthogonality, and Oscillatory Integrals. Princeton University Press.
Steinberg, B.Z., Heyman, E., and Felsen, L.B., 1991. Phase-space beam summation for time-dependent radiation from large apertures: Continuous parameterization. J. Opt. Soc. Am., 8, 943–958.
Steinberg, B.Z. and Heyman, E., 1991. Phase-space beam summation for time-dependent radiation from large apertures: Discretized parameterization. J. Opt. Soc. Am., 8, 959–966.
Steinberg, B.Z., 1993. Evolution of local spectra in smoothly varying nonhomogeneous environments: Local canonization and marching algorithms. J. Acoust. Soc. Am., 93, 2566–2580.
Steinberg, B.Z. and Birman, R., 1995. Phase-space marching algorithm in the presence of a planar wave velocity discontinuity: A qualitative study. J. Acoust. Soc. Am., 98, 484–494.
Steinberg, B.Z. and McCoy, J.J., 1993. Marching acoustic fields in a phase space. J. Acoust. Soc. Am., 93, 188–204.
Strömberg, J.-O., 1983. A modified Franklin system and higher-order spline systems on R as unconditional bases for Hardy spaces. Conference on Harmonic Analysis in Honor of Antoni Zygmund, vol. II, W. Beckner et al., eds., Wadsworth, Belmont, CA, 475–494.
Sweldens, W. and Schröder, P., 1996. Building Your Own Wavelets at Home. Wavelets in Computer Graphics, 15–87.
Thomson, C.J., 2001. Seismic coherent states and ray geometric spreading. Geophys. J. Int., 144, 320–342.
Thomson, C.J., 2004. Coherent states analysis of the head wave problem: An overcomplete representation and its relationship to rays and beams. Geophys. J. Int., 157, 1189–1205.
Torrésani, B., 1991. Wavelets associated with representations of the affine Weyl-Heisenberg group. J. Math. Phys., 32, 1273–1279.
Van den Berg, J.C., 2004. Wavelets in Physics. Cambridge University Press.
Ville, J., 1948. Théorie et applications de la notion de signal analytique. Câbles et Transmissions. Laboratoire de Télécommunications de la Société Alsacienne de Construction Mécanique, 2A, 61–74.
Wang, Y. and Wu, R.S., 1998a. Migration operator decomposition and compression using a new wavelet packet best basis algorithm. Expanded abstracts, SEG 68th Annual Meeting, 1167–1170.
Wang, Y. and Wu, R.S., 1998b. Decomposition and compression of Kirchhoff migration operator by adapted wavelet packet transform. Wavelet Applications in Signal and Image Processing VI, Proc. SPIE, 3458, 246–258.
Wang, Y. and Wu, R.S., 2002. Beamlet prestack depth migration using local cosine basis propagator. Expanded abstracts, SEG 72nd Annual Meeting, 1340–1343.
Wang, Y., Cook, R., and Wu, R.S., 2003. 3D local cosine beamlet propagator. Expanded abstracts, SEG 73rd Annual Meeting, 981–984.
Wang, Y., Verm, R., and Bednar, B., 2005. Application of beamlet migration to the SmaartJV Sigsbee2A model. Expanded abstracts, SEG 75th Annual Meeting, 1958–1961.
Weber, M., 1955. Die bestimmung einer beliebig gekruemmten schichtgrenze aus seismischen reflexionsmessungen. Geofisica Pura e Applicata, 32, 7–11.
Wickerhauser, M.V., 1993. Smooth localized orthonormal bases. Comptes Rendus de l’Académie des Sciences de Paris, 316, 423–427.
Wickerhauser, M.V., 1994. Adapted Wavelet Analysis from Theory to Software. A K Peters.
Wigner, E.P., 1932. On the quantum correction for thermodynamic equilibrium. Phys. Rev., 40, 749–759.
Wilson, K.G., 1971. Renormalization group and critical phenomena II: Phase-space cell analysis of critical behavior. Phys. Rev. B, 4, 3184–3205.
Wu, B., Wu, R.S., and Gao, J., 2009. Dreamlet prestack depth migration using local cosine basis and local exponential frames. Expanded Abstracts, SEG 79th Annual Meeting, 2753–2757.
Wu, B., Wu, R.S., and Gao, J.H., 2013. Dreamlet source-receiver survey sinking prestack depth migration. Geophysical Prospecting, 61, 63–64.
Wu, R.S., 1985. Gaussian beams, complex rays, and the analytic extension of the Green’s function in smoothly inhomogeneous media. Geophys. J. R. astr. Soc., 83, 93–110.
Wu, R.S. and Aki, K., 1988. Seismic wave scattering in the three-dimensionally heterogeneous earth. In the special issue “Seismic Wave Scattering and Attenuation”, edited by Wu and Aki. Pure and Applied Geophys., 128, 1–6.
Wu, R.S. and Chen, L., 2001. Beamlet migration using Gabor-Daubechies frame propagator. Expanded abstracts, 63rd Conference & Technical Exhibition, EAGE, 74.
Wu, R.S. and Chen, L., 2002a. Wave Propagation and Imaging Using Gabor-Daubechies Beamlets: Theoretical and Computational Acoustics. World Scientific, New Jersey, 661–670.
Wu, R.S. and Chen, L., 2002b. Mapping directional illumination and acquisition-aperture efficacy by beamlet propagators. Expanded Abstracts, 72nd Ann. Internat. Mtg., Soc. Expl. Geophys., 1352–1355.
Wu, R.S. and Chen, L., 2003. Directional illumination and acquisition dip-response. Extended Abstracts, EAGE 65th Annual Meeting.
Wu, R.S. and Chen, L., 2006. Directional illumination analysis using beamlet decomposition and propagation. Geophysics, 71, S147–S159.
Wu, R.S. and Jin, S., 1997. Windowed GSP (generalized screen propagators) migration applied to SEG-EAEG salt model data. Expanded abstracts, SEG 67th Annual Meeting, 1746–1749.
Wu, R.S. and Luo, M.Q., 2005. Comparison of different schemes of image amplitude correction in prestack depth migration. Expanded abstracts, SEG 75th Annual Meeting, 2060–2063.
Wu, R.S. and Maupin, V., 2007. Advances in Wave Propagation in Heterogeneous Earth. Elsevier.
Wu, R.S. and Wang, Y., 1998. Comparison of propagator decomposition in seismic imaging by wavelets, wavelet-packets, and local harmonics. Mathematical Methods in Geophysical Imaging V. Proc. SPIE, 3453, 163–179.
Wu, R.S. and Yang, F., 1997. Seismic imaging in wavelet domain: Decomposition and compression of imaging operator. Wavelet Applications in Signal and Image Processing V. Proc. SPIE, 3169, 148–162.
Wu, R.S., Chen, L., and Wang, Y., 2002. Prestack migration/imaging using synthetic beam-sources and plane sources. Stud. Geophys. Geod., 46, 651–665.
Wu, R.S., Chen, S., and Luo, M., 2004. Migration amplitude correction in angle domain using beamlet decomposition. Expanded abstracts, EAGE 66th Annual Meeting, G029.
Wu, R.S., Geng, Y., and Wu, B., 2011. Physical wavelet defined on an observation plane and the Dreamlet. Expanded abstracts, SEG 81st Annual International Meeting, 3835–3839.
Wu, R.S., Luo, M., Chen, S., and Xie, X.B., 2004. Acquisition aperture correction in angle-domain and true-amplitude imaging for wave equation migration. Expanded abstracts, SEG 74th Annual Meeting, 937–940.
Wu, R.S., Wang, Y., and Gao, J.H., 2000. Beamlet migration based on local perturbation theory. Expanded abstracts, 70th Ann. Internat. Mtg., Soc. Expl. Geophys., 1008–1011.
Wu, R.S., Wang, Y., and Luo, M., 2003. Local-cosine beamlet migration for 3D complex structures. Eighth International Congress of the Brazilian Geophysical Society, 14–18.
Wu, R.S., Wang, Y., and Luo, M., 2008. Beamlet migration using local cosine basis. Geophysics, 73, S207–217.
Wu, R.S., Wu, B., and Geng, Y., 2009. Imaging in compressed domain using dreamlets. CPS/SEG Beijing ’2009, International Geophysical Conference, Expanded abstracts, ID: 57.
Wu, R.S., Yang, F., Wang, Z., and Zhang, L., 1997. Migration operator compression by wavelet transform: Beamlet migrator. Expanded Abstracts of the Technical Program, SEG 67th Annual Meeting, 1646–1649.
Xie, X.B. and Wu, R.S., 2002. Extracting angle related image from migrated wavefield. Expanded abstracts, SEG 72nd Annual Meeting, 1360–1363.
Xie, X.B. and Wu, R.S., 2003. Three-dimensional illumination analysis using wave equation based propagator. Expanded abstracts, SEG 73rd Annual Meeting, 989–992.
Xie, X.B., Jin, S., and Wu, R.S., 2004. Wave equation based illumination analysis. Expanded abstracts, SEG 74th Annual Meeting, 933–936.
Xie, X.B., Jin, S., and Wu, R.S., 2006. Wave-equation based seismic illumination analysis. Geophysics, 71(5), S169–177.
Xu, S. and Lambaré, G., 1998. Maslov + Born migration/inversion in complex media. Expanded abstracts, SEG 68th Annual Meeting, 1702–1707.
Ying, L., Demanet, L., and Candès, E.J., 2006. 3D Curvelet Transform. Proc. Conf. Wavelets XI. San Diego.
Young, R.K., 1993. Wavelet Theory and Its Applications. Kluwer Academic Publishers.
Zacek, K., 2004. Gaussian packet pre-stack depth migration. Expanded Abstracts, SEG 74th Annual International Meeting, 957–960.
Zacek, K., 2005. Gaussian packet pre-stack depth migration of the Marmousi data set. Expanded Abstracts, SEG 75th Annual International Meeting, 1822–1925.
Zacek, K., 2006a. Decomposition of the wave field into optimized Gaussian packets. Studia Geophysica et Geodaetica, 50(3), 367–380.
Zacek, K., 2006b. Optimization of the shape of Gaussian beams. Studia Geophysica et Geodaetica, 50(3), 349–366.
Authors Information
Ru-Shan Wu
Modeling and Imaging Laboratory, Institute of Geophysics and Planetary Physics/Department of Earth and Planetary Sciences, University of California, Santa Cruz, CA 95064, USA
E-mail: [email protected]
Jinghuai Gao
School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
Chapter 3
Two-way Coupling of Solid-fluid with Discrete Element Model and Lattice Boltzmann Model
Yucang Wang, Sheng Xue, and Jun Xie
In this chapter, we present a fully coupled solid-fluid code using the Discrete Element Method (DEM) and the Lattice Boltzmann Method (LBM). The DEM permits particles to be bonded and is used to model the deformation and fracture of solids, while the LBM is used to model fluid flow. The two methods are coupled in a two-way process: the solid part provides a moving boundary condition and transfers momentum to the fluid, and the fluid exerts a drag force on the solid. In addition, Darcy flow is implemented within the Lattice Boltzmann Method to model flow in a narrow tunnel or crack. Two widely used open-source codes, ESyS-Particle and OpenLB, are integrated, as both are written in C++ and parallelized with the MPI library. Some preliminary 2-D simulations, such as particles moving in a fluid, fluid flow in a narrow tunnel or crack, and hydraulic fracture induced by the injection of fluid into a borehole, are carried out to validate the integrated code.
Key Words: Solid-fluid coupling, Discrete Element Method, Lattice Boltzmann Method, ESyS-Particle, OpenLB.
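The two-way exchange summarized above (the fluid drags the solid, and the solid returns the same momentum to the fluid) can be illustrated with a deliberately small sketch. It is not the ESyS-Particle/OpenLB coupling itself: the single particle, the single fluid parcel, the linear drag law and all names (`couple_step`, `relax`, `c_drag`) are assumptions introduced only for illustration.

```python
def couple_step(v_p, u_f, m_p, m_f, c_drag, dt):
    """One explicit two-way coupling step (illustrative linear drag law).

    v_p, u_f : velocities of the solid particle and of the fluid parcel
    m_p, m_f : their masses
    c_drag   : assumed drag coefficient, force = c_drag * (u_f - v_p)
    """
    f_drag = c_drag * (u_f - v_p)        # force exerted by the fluid on the solid
    v_p_new = v_p + dt * f_drag / m_p    # the solid is accelerated by the drag
    u_f_new = u_f - dt * f_drag / m_f    # the fluid receives the opposite impulse
    return v_p_new, u_f_new


def relax(v_p, u_f, m_p, m_f, c_drag, dt, n_steps):
    """Repeat the coupling step; total momentum is exchanged, never created."""
    for _ in range(n_steps):
        v_p, u_f = couple_step(v_p, u_f, m_p, m_f, c_drag, dt)
    return v_p, u_f
```

Because the same force enters both updates with opposite signs, the total momentum m_p v_p + m_f u_f is unchanged by every step, and the two velocities relax toward a common value. A one-way scheme would apply only one of the two updates and would lose this property.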
3.1
Introduction
Fluid-solid interaction problems are of great importance in a number of engineering and scientific fields. These problems include fluidized beds, fluvial erosion, piping, liquefaction, particle suspension, transportation and sedimentation. One class of such problems involves large deformation, or even fracturing, of solids and the flow of fluids in the resulting fractures. For example, in geothermal energy extraction and the petroleum industry, hydraulic fracturing is often used to increase production: high-pressure fluid is injected into boreholes to fracture reservoir rocks and enhance the flow of gas, oil or other fluids. In underground coal mines, coal and gas outbursts result from strong interactions between gas and coal. In geophysical research, there is interest in the relations between earthquake occurrence and underground water flow or water injection in reservoirs. In the study of tsunami generation and inundation, there is strong coupling between the movement of water and solid materials. In these problems, the movement of solid materials is accompanied, influenced or even driven by fluid flow of different forms, such as laminar or turbulent flow, and flow patterns are strongly affected by the presence and movement of solids. Two-way solid-fluid coupling is therefore critical to understanding the behavior of such interactions. The physics involved in these problems is not yet fully understood and is usually described by empirical formulas. It is sometimes very difficult to investigate these interactions experimentally because of their complexity, geometries and large scales, and the difficulty of controlling conditions and parameters. A well-developed numerical model containing the most basic physical mechanisms is a desirable alternative for studying these problems. In the past 30 years, different numerical approaches have been developed to study the movement of solids and fluids. For solids, the most widely used numerical approaches are the Finite Element Method (FEM), Boundary Element Method (BEM), and Finite Difference Method (FDM).
Apart from these continuum-based models, the Discrete Element Method (DEM) has emerged as a powerful numerical tool and has attracted considerable research attention in recent years. The basic idea of DEM is to model solids as an assemblage of discrete particles interacting with one another. Some DEM models allow the bonds between particles to break, explicitly modeling microscopic fracturing events. Owing to this discrete nature, many problems that are highly dynamic, with large deformations and a large number of frequently changing contacts, can be modeled naturally. For fluids, the classical numerical approach is based on numerical solutions of the Navier-Stokes (N-S) equations, as in Computational Fluid Dynamics (CFD). Numerous methods are available to discretize and solve the N-S equations; three of the most widely used are FDM, FEM, and the Finite Volume Method (FVM). In addition to the N-S approach, there are other approaches that are not directly based on the N-S equations, such as the Lattice Boltzmann Method (LBM), the Molecular Dynamics (MD) method and Smoothed Particle Hydrodynamics (SPH). Among these methods, LBM is based
3
Two-way Coupling of Solid-fluid with Discrete Element Model and . . .
on the kinetic theory of gases, and simulates fluid flow by tracking the evolution of the single-particle distribution function. Major advantages of LBM over the classical N-S approach include its ease of implementation, parallelization, and handling of boundary conditions for complicated geometries (Chen et al., 2003). For solid-fluid coupling problems, various combinations of models for the particle phase and the fluid phase can be made, depending on the type of problem (Zhou et al., 2010). Among these, one promising approach combines CFD for the continuum fluid with DEM for the discrete particles, and it has been increasingly used to study the fundamentals of coupled particle-fluid flows. Over the years, different DEM-CFD models have been developed. Owing to limited computer power, some earlier models had to simplify either the fluid component or the solid component. For example, some coupled DEM-CFD models employed a coarse resolution of the fluid flow and the associated coupling (Tsuji et al., 1993; Bruno, 1994), while others ignored the detailed interactions of particles, although the fluid component was fully resolved through the full solution of the N-S equations (Ladd, 1994a, b; Aidun and Lu, 1995; Hu, 1996; Aidun et al., 1998; Qi, 1999). These simplifications limit the applicability of such models, because many important engineering and scientific problems require accurate resolution of discontinuous particle-fluid dynamics at fine spatial scales and across multiple flow regimes. With the rapid improvement of computer capability, such detailed DEM-CFD approaches have become feasible. Various investigators have developed fully coupled DEM-CFD models by solving the N-S equations using FVM (Chu and Yu, 2008; Bluhm-Drenhaus et al., 2010; Wu et al., 2011; Chen et al., 2011; Xiao and Sun, 2011; Chareyre et al., 2012).
Others have proposed coupled DEM-SPH schemes (Potapov et al., 2001; Huang and Nydal, 2012) and DEM-LBM schemes (Cook et al., 2004; Feng et al., 2007; Strack and Cook, 2007; Feng et al., 2010; Lomine et al., 2011). In this chapter, we present a DEM-LBM coupling approach based on two open source codes: the ESyS-Particle and OpenLB. The advantages of the DEM-LBM coupled model come from the capability of DEM to model solid interaction and fracturing, and from the ease of handling complex boundary conditions in LBM. The use of LBM instead of CFD also avoids the severe mesh distortion and frequent mesh adaptation required in CFD. Because it uses Eulerian grids, LBM is particularly suitable for modeling fluid-solid interaction problems, and a large number of solid particles can easily be accommodated. The explicit nature and local interactions of both LBM and DEM make them ideal partners from the perspectives of computation, implementation and parallelization, and make the coupled model a competitive numerical tool for the simulation of particle-fluid systems.
Although some studies have combined DEM and LBM (Cook et al., 2004; Feng et al., 2007; Strack and Cook, 2007; Feng et al., 2010; Lomine et al., 2011), the coupled model presented in this chapter differs from these previous studies in the following respects. Firstly, the previous studies were mainly designed for dilute suspensions of solid particles in fluids, in which the solid particles are truly "discrete" elements, whereas our approach allows solid particles to be bonded and the bonds to be broken, explicitly simulating fracturing events. Secondly, once fractures occur, fluid can flow into narrow cracks, so that free, turbulent flow and Darcy flow may coexist in our model. Thirdly, rather than an immersed moving boundary condition (Noble and Torczynski, 1998), a different moving boundary condition (Yu et al., 2003) is adopted in this study, which is numerically more stable. Lastly, our approach couples two widely used open source codes that are parallelized using MPI, permitting large-scale simulations on supercomputers. This chapter is arranged as follows. A brief introduction to the principles of DEM and the ESyS-Particle code is given in Section 3.2, followed in Section 3.3 by the main ideas of LBM, its boundary conditions and the OpenLB code. A detailed discussion of the two-way coupling of the two codes is presented in Section 3.4. Finally, the coupling scheme is validated through several simulation examples.
3.2 Discrete Element Method and the ESyS-Particle Code
The Discrete Element Method (DEM) was pioneered by Cundall (Cundall and Strack, 1979). It is based on the concept that the material to be modeled can be represented as a collection of discrete solid particles interacting with one another at their contacts. The precise nature of the interaction depends on the scale of interest and the details of the simulation. At each time step, the calculation alternates between integrating the equations of motion for each particle and applying the force-displacement law at each contact, through which the contact forces are updated from the relative motion of the two particles and the relevant contact stiffness. Several kinds of DEM have been proposed. Some, like the ESyS-Particle, permit particles to be bonded so that tensile forces can be transmitted. Fracturing is represented explicitly as broken bonds, which form and coalesce into macroscopic fractures. Bonded DEM models are often used to model wave propagation and fracture in intact materials such as rocks.
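The alternation described above can be illustrated with a minimal sketch. It is not the ESyS-Particle implementation: it assumes a hypothetical 1-D chain of equal spheres, a linear spring of stiffness `k` as the force-displacement law, a bond that fails once its stretch exceeds a fraction `break_strain` of the particle diameter, and a simple semi-implicit Euler update. The function name `dem_step` and all parameter values are illustrative assumptions.

```python
import numpy as np

def dem_step(x, v, bonded, dt, k=1.0e3, r=0.5, m=1.0, break_strain=0.05):
    """One DEM cycle for a 1-D chain of equal spheres (illustrative only)."""
    f = np.zeros_like(x)
    # (1) Force-displacement law at each contact between neighbours i and i+1.
    for i in range(len(x) - 1):
        gap = (x[i + 1] - x[i]) - 2.0 * r      # > 0 stretched, < 0 overlapping
        if bonded[i] and gap > break_strain * 2.0 * r:
            bonded[i] = False                   # bond fails: a micro-fracture
        if bonded[i] or gap < 0.0:
            # Bonded pairs carry tension and compression;
            # unbonded pairs only repel when they overlap.
            fn = k * gap
            f[i] += fn
            f[i + 1] -= fn
    # (2) Integrate the equations of motion (semi-implicit Euler for brevity).
    v = v + dt * f / m
    x = x + dt * v
    return x, v, bonded

# Hypothetical usage: three touching spheres, outer two pulled apart.
x = np.array([0.0, 1.0, 2.0])
v = np.array([-2.0, 0.0, 2.0])
bonded = [True, True]
for _ in range(200):
    x, v, bonded = dem_step(x, v, bonded, dt=1.0e-3)
print(bonded)
```

With these assumed values the pulling speed determines whether the bonds survive: the peak stretch of each bond is roughly the pulling speed divided by the natural frequency sqrt(k/m), so a slower pull leaves the chain intact while a faster one breaks both bonds.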
3.2.1 A Brief Introduction to the Open Source DEM Code: The ESyS-Particle
The ESyS-Particle (https://launchpad.net/esys-particle/Esys-particle) is open source simulation software developed by the Australian Computational Earth Systems Simulator (ACcESS). Written in C++ and Python, the ESyS-Particle is designed for execution on parallel supercomputers (Abe et al., 2004). The simulation engine implements spatial domain decomposition via the Message Passing Interface (MPI). Some of the most critical recent developments were made by the first author of this chapter to model the full set of degrees of freedom (six kinds of independent relative movement are transmitted between two interacting 3-D particles) needed to accurately simulate observed fracture patterns (Wang et al., 2006; Wang and Mora, 2008b; Wang, 2009; Wang and Mora, 2009; Wang and Alonso-Marroquin, 2009). The major features that distinguish the ESyS-Particle from existing DEMs are the explicit representation of particle orientations using unit quaternions, complete interactions, and a new way of decomposing relative rotations between two rigid bodies (Wang et al., 2006; Wang, 2009; Wang and Alonso-Marroquin, 2009). The ESyS-Particle has been successfully used to study physical processes such as rock fracture (Mora and Place, 1993; Place and Mora, 1999; Place et al., 2002; Wang et al., 2004; Wang and Mora, 2008b; Wang and Alonso-Marroquin, 2009) and earthquake dynamics (Mora and Place, 1993, 1994, 1998, 1999, 2002; Mora et al., 2002; Place and Mora, 1999, 2000; Abe et al., 2002, 2006; Abe and Mair, 2009).
3.2.2 The Basic Equations
In the ESyS-Particle, the motion of a solid particle is decomposed into two independent parts: translational motion of the center of mass and rotation about the center of mass. The former is governed by Newton's second law

$$\ddot{\mathbf{r}}(t) = \mathbf{f}(t)/M \tag{3.1}$$

where $\mathbf{r}(t)$ and $M$ are the position and mass of the particle, respectively, and $\mathbf{f}(t)$ is the total force acting on the particle, which may include the spring forces from neighboring particles, forces from the walls, viscous force, gravitational force, etc. The equation above can be integrated using the velocity Verlet scheme (Mora and Place, 1994; Place and Mora, 1999; Allen and Tildesley, 1987).
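As an illustration of how Equation (3.1) may be advanced in time, the following is a minimal sketch of the standard velocity Verlet scheme. It is not taken from the ESyS-Particle source: the function `velocity_verlet`, the callable `force` and the spring test case are assumptions made for this example, and the force is assumed to depend on position only (a velocity-dependent viscous force would need additional treatment).

```python
import numpy as np

def velocity_verlet(r, v, force, mass, dt, n_steps):
    """Integrate M r'' = f(r) with the velocity Verlet scheme."""
    f = force(r)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass    # half kick with the old force
        r = r + dt * v_half                 # drift to the new position
        f = force(r)                        # force at the new position
        v = v_half + 0.5 * dt * f / mass    # half kick with the new force
    return r, v

# Hypothetical test case: one particle on a linear spring anchored at the origin.
k, M = 100.0, 1.0
spring = lambda r: -k * r
r0 = np.array([1.0, 0.0, 0.0])
v0 = np.zeros(3)
period = 2.0 * np.pi * np.sqrt(M / k)
r1, v1 = velocity_verlet(r0, v0, spring, M, period / 1000.0, 1000)
print(r1, v1)   # after one full period the particle is close to its start
```

The scheme needs only one force evaluation per step and is time-reversible, which is why the energy of the test oscillator stays bounded over long runs instead of drifting.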
The particle rotation depends on the total applied torque and usually involves two coordinate frames. The first is fixed in space and is called the space-fixed frame; Equation (3.1) is applied in this frame. The second is attached to the principal axes of the rotating body and is referred to as the body-fixed frame. The particle rotation is governed by Euler's equations in the body-fixed frame (Goldstein, 1980):

$$\begin{aligned}
\tau_x^b &= I_{xx}\,\dot{\omega}_x^b - \omega_y^b\,\omega_z^b\,(I_{yy} - I_{zz}) \\
\tau_y^b &= I_{yy}\,\dot{\omega}_y^b - \omega_z^b\,\omega_x^b\,(I_{zz} - I_{xx}) \\
\tau_z^b &= I_{zz}\,\dot{\omega}_z^b - \omega_x^b\,\omega_y^b\,(I_{xx} - I_{yy})
\end{aligned} \tag{3.2}$$
where $\tau_x^b$, $\tau_y^b$ and $\tau_z^b$ are the components of the total torque $\boldsymbol{\tau}^b$ expressed in the body-fixed frame, $\omega_x^b$, $\omega_y^b$ and $\omega_z^b$ are the components of the angular velocity $\boldsymbol{\omega}^b$ measured in the body-fixed frame, and $I_{xx}$, $I_{yy}$ and $I_{zz}$ are the three principal moments of inertia in the body-fixed frame, in which the inertia tensor is diagonal. For a 3-D sphere, $I_{xx} = I_{yy} = I_{zz} = I$. In the ESyS-Particle, the unit quaternion $q = q_0 + q_1 i + q_2 j + q_3 k$ is used to describe explicitly the orientation of each solid particle. Physically, a quaternion represents a one-step rotation about the vector $q_1\hat{i} + q_2\hat{j} + q_3\hat{k}$ through an angle of $2\arccos(q_0)$ (Kuipers, 1998). The quaternion of each particle satisfies the following equation (Evans, 1977; Evans and Murad, 1977):

$$\dot{Q} = \frac{1}{2}\,Q_0(q)\,\Omega \tag{3.3}$$

where

$$\dot{Q} = \begin{pmatrix} \dot{q}_0 \\ \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \end{pmatrix}, \qquad
Q_0(q) = \begin{pmatrix}
q_0 & -q_1 & -q_2 & -q_3 \\
q_1 & q_0 & -q_3 & q_2 \\
q_2 & q_3 & q_0 & -q_1 \\
q_3 & -q_2 & q_1 & q_0
\end{pmatrix}, \qquad
\Omega = \begin{pmatrix} 0 \\ \omega_x^b \\ \omega_y^b \\ \omega_z^b \end{pmatrix}$$
Equations (3.2) and (3.3) can be solved numerically (Wang et al., 2006; Wang, 2009).
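One possible way to advance Equations (3.2) and (3.3) numerically is sketched below. This is not the algorithm of Wang et al. (2006) or Wang (2009); it assumes a plain explicit midpoint step followed by renormalization of the quaternion, with the torque held constant over the step. The function names `omega_dot`, `q_dot` and `rotate_step` are hypothetical.

```python
import numpy as np

def omega_dot(omega, torque, I):
    """Angular acceleration from Euler's equations (3.2), body-fixed frame."""
    wx, wy, wz = omega
    Ixx, Iyy, Izz = I
    return np.array([
        (torque[0] + wy * wz * (Iyy - Izz)) / Ixx,
        (torque[1] + wz * wx * (Izz - Ixx)) / Iyy,
        (torque[2] + wx * wy * (Ixx - Iyy)) / Izz,
    ])

def q_dot(q, omega):
    """Quaternion rate from Equation (3.3): Qdot = 0.5 * Q0(q) * Omega."""
    q0, q1, q2, q3 = q
    Q0 = np.array([[q0, -q1, -q2, -q3],
                   [q1,  q0, -q3,  q2],
                   [q2,  q3,  q0, -q1],
                   [q3, -q2,  q1,  q0]])
    return 0.5 * Q0 @ np.concatenate(([0.0], omega))

def rotate_step(q, omega, torque, I, dt):
    """One explicit midpoint step, then renormalise q to unit length."""
    w_mid = omega + 0.5 * dt * omega_dot(omega, torque, I)
    q_mid = q + 0.5 * dt * q_dot(q, omega)
    omega_new = omega + dt * omega_dot(w_mid, torque, I)
    q_new = q + dt * q_dot(q_mid, w_mid)
    return q_new / np.linalg.norm(q_new), omega_new

# Hypothetical check: a torque-free sphere spinning about z at 1 rad/s.
q = np.array([1.0, 0.0, 0.0, 0.0])       # identity orientation
w = np.array([0.0, 0.0, 1.0])
for _ in range(1000):
    q, w = rotate_step(q, w, np.zeros(3), (1.0, 1.0, 1.0), (np.pi / 2) / 1000)
angle = 2.0 * np.arccos(q[0])             # rotation angle encoded by q
print(q, angle)
```

In this check the sphere turns for a time of pi/2 at unit angular speed, so the angle recovered from $2\arccos(q_0)$ should be close to pi/2 and the rotation axis should remain along z. Renormalizing after each step keeps the quaternion on the unit sphere, which an explicit integrator does not guarantee on its own.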
3.2.3 Contact Laws and Particle Interaction
Three kinds of interaction exist between particles in contact in the current ESyS-Particle model: bonded, solely normal repulsive, and cohesionless frictional interactions. For the solely normal repulsive interaction, two particles contact elastically, and the normal force exists only when d