English Pages 328 Year 2012
De Gruyter Series in Mathematics and Life Sciences 1 Editors Alexandra V. Antoniouk, Kyiv, Ukraine Roderick V. N. Melnik, Waterloo, Ontario, Canada
Mathematics and Life Sciences Edited by Alexandra V. Antoniouk Roderick V. N. Melnik
De Gruyter
Mathematics Subject Classification 2010: 97M60, 92C42, 62P10, 35Q92, 37N25, 92C15, 92B05, 92C80, 92-03, 92-02.
ISBN 978-3-11-027372-4 e-ISBN 978-3-11-028853-7 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de. © 2013 Walter de Gruyter GmbH, Berlin/Boston Typesetting: PTP-Berlin Protago-TEX-Production GmbH, www.ptp-berlin.eu Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen Printed on acid-free paper Printed in Germany www.degruyter.com
In memory of Alexander Antoniouk
Preface
Life sciences provide a significant source of some of the most challenging problems in modern applied mathematics, leading frequently to the necessity of studying complex systems and developing new systems-science-based approaches that are at the forefront of modern scientific thinking. Moreover, since our understanding of the complex systems in life sciences increasingly requires the development of new quantitative approaches, as well as on-going interactions and close collaboration between different disciplines, the fundamental role of interdisciplinary research at the interface of mathematics and life sciences will continue to grow in its importance. Written by 22 experts from North America, Eurasia, and Australia, this book provides a selection of interdisciplinary areas at the interface of mathematical and life sciences where recent progress has been achieved and where new opportunities exist for exploring this interface. The book is aimed at researchers in academia, practitioners and graduate students who want to foster interdisciplinary collaborations required to meet the challenges at the interface of modern life sciences and mathematics. It can serve as a reference to state-of-the-art original works on the applications of mathematical, statistical and computational methods and tools in seemingly diverse, yet intrinsically connected, areas of life sciences. The book provides a number of state-of-the-art surveys and topics accessible to graduate students and can serve as a source for graduate student projects. Supported by both rigorous mathematical procedures and examples from life science applications, the book has a strong multidisciplinary focus, promoting the methodology of mathematical analysis, modeling and computational experiment as a ubiquitous tool in applications to life sciences. 
Both groups of researchers, mathematicians who are interested in advanced applications and life scientists who would like to learn advanced mathematical tools and methodologies applicable in their disciplines, will benefit from this book. We would like to thank the Alexander von Humboldt Foundation, the NSERC, and the CRC Program for their support. We are grateful to the referees, whose help was invaluable. We are thankful to our many colleagues in Europe, North America, Asia, and Australia whose encouragement was vital for the completion of this project. Last but not least, we are thankful to the De Gruyter editorial team, and in particular to Mrs. Friederike Dittberner, Mrs. Anja Möbius and Mrs. Hella Behrend, for their continuous support and assistance during the process of publishing this book.
Kyiv – Waterloo – Bilbao, December 2012
Alexandra V. Antoniouk Roderick V. N. Melnik
Contents
1 Introduction
  1.1 Scientific Frontiers at the Interface of Mathematics and Life Sciences
      Alexandra V. Antoniouk and Roderick V. N. Melnik
    1.1.1 Developing the Language of Science and Its Interdisciplinary Character
    1.1.2 Challenges at the Interface: Mathematics and Life Sciences
    1.1.3 What This Book Is About
    1.1.4 Concluding Remarks
2 Mathematical and Statistical Modeling of Biological Systems
  2.1 Ensemble Modeling of Biological Systems
      David Swigon
    2.1.1 Introduction
    2.1.2 Background
    2.1.3 Ensemble Model
    2.1.4 Computational Techniques
    2.1.5 Application to Viral Infection Dynamics
    2.1.6 Ensemble Models in Biology
    2.1.7 Conclusions
3 Probabilistic Models for Nonlinear Processes and Biological Dynamics
  3.1 Nonlinear Lévy and Nonlinear Feller Processes: an Analytic Introduction
      Vassili N. Kolokoltsov
    3.1.1 Introduction
    3.1.2 Dual Propagators
    3.1.3 Perturbation Theory for Weak Propagators
    3.1.4 T-Products
    3.1.5 Nonlinear Propagators
    3.1.6 Linearized Evolution Around a Path of a Nonlinear Semigroup
    3.1.7 Sensitivity Analysis for Nonlinear Propagators
    3.1.8 Back to Nonlinear Markov Semigroups
    3.1.9 Concluding Remarks
4 New Results in Mathematical Epidemiology and Modeling Dynamics of Infectious Diseases
  4.1 Formal Solutions of Epidemic Equation
      Vitaly A. Stepanenko and Nikolai Tarkhanov
    4.1.1 Introduction
    4.1.2 Epidemic Models
    4.1.3 Formal Solutions
    4.1.4 Separation of Variables
    4.1.5 Solvability of General Equations
    4.1.6 Concluding Remarks
5 Mathematical Analysis of PDE-based Models and Applications in Cell Biology
  5.1 Asymptotic Analysis of the Dirichlet Spectral Problems in Thin Perforated Domains with Rapidly Varying Thickness and Different Limit Dimensions
      Taras A. Mel’nyk and Andrey V. Popov
    5.1.1 Introduction
    5.1.2 Description of a Thin Perforated Domain with Quickly Oscillating Thickness and Statement of the Problem
    5.1.3 Equivalent Problem
    5.1.4 The Homogenized Theorem
    5.1.5 Asymptotic Expansions for the Eigenvalues and Eigenfunctions
    5.1.6 Conclusions
6 Axiomatic Modeling in Life Sciences with Case Studies for Virus-immune System and Oncolytic Virus Dynamics
  6.1 Axiomatic Modeling in Life Sciences
      Natalia L. Komarova
    6.1.1 Introduction
    6.1.2 Boosting Immunity by Anti-viral Drug Therapy: Timing, Efficacy and Success
    6.1.3 Predictive Modeling of Oncolytic Virus Dynamics
    6.1.4 Conclusions
7 Theory, Applications, and Control of Nonlinear PDEs in Life Sciences
  7.1 On One Semilinear Parabolic Equation of Normal Type
      Andrei V. Fursikov
    7.1.1 Introduction
    7.1.2 Semilinear Parabolic Equation of Normal Type
    7.1.3 The Structure of NPE Dynamics
    7.1.4 Stabilization of Solution for NPE by Start Control
    7.1.5 Concluding Remarks
  7.2 On some Classes of Nonlinear Equations with L1-Data
      Alexander A. Kovalevsky
    7.2.1 Nonlinear Elliptic Second-order Equations with L1-data
    7.2.2 Nonlinear Fourth-order Equations with Strengthened Coercivity and L1-Data
    7.2.3 Concluding Remarks
8 Mathematical Models of Pattern Formation and Their Applications in Developmental Biology
  8.1 Reaction-Diffusion Models of Pattern Formation in Developmental Biology
      Anna Marciniak-Czochra
    8.1.1 Introduction
    8.1.2 Mechanisms of Developmental Pattern Formation
    8.1.3 Motivating Application: Pattern Control in Hydra
    8.1.4 Diffusive Morphogens and Turing Patterns
    8.1.5 Receptor-based Models
    8.1.6 Multistability
    8.1.7 Discussion
9 Modeling the Dynamics of Genetic Mechanism, Pattern Formation, and the Genetics of “Geometry”
  9.1 Modeling the Positioning of Trichomes on the Leaves of Plants
      Robert S. Anderssen, Maureen P. Edwards and Sergiy Pereverzyev Jr.
    9.1.1 Introduction
    9.1.2 Activator-inhibitor Reaction-diffusion Modeling of the Trichome Positioning
    9.1.3 Hexagonal Recursion
    9.1.4 Conclusions
10 Statistical Modeling in Life Sciences and Direct Measurements
  10.1 Error Estimation for Direct Measurements in May–June 1986 of 131I Radioactivity in Thyroid Gland of Children and Adolescents and Their Registration in Risk Analysis
      Illya Likhtarov, Sergii Masiuk, Mykola Chepurny, Alexander Kukush, Sergiy Shklyar, Andre Bouville and Lina Kovgan
    10.1.1 Introduction
    10.1.2 Materials and Methods
    10.1.3 Conclusion and Discussion
    10.1.4 Appendix. Approximation of Conditional Expectations
11 Design and Development of Experiments for Life Science Applications
  11.1 Physiological Effects of Static Magnetic Field Exposure in an in vivo Acute Visceral Pain Model in Mice
      János F. László
    11.1.1 Introduction
    11.1.2 Methods
    11.1.3 Results
    11.1.4 Discussion
    11.1.5 Conclusions
12 Mathematical Biomedicine and Modeling Avascular Tumor Growth
  12.1 Continuum Models of Avascular Tumor Growth
      Helen M. Byrne
    12.1.1 Introduction
    12.1.2 Diffusion-limited Models of Avascular Tumor Growth
    12.1.3 Tumor Invasion
    12.1.4 Multiphase Models of Avascular Tumor Growth
    12.1.5 Conclusions
Index
1
Introduction
Alexandra V. Antoniouk and Roderick V. N. Melnik
1.1 Scientific Frontiers at the Interface of Mathematics and Life Sciences
Abstract. Many areas within life sciences are becoming increasingly quantitative, and progress in those areas will depend more and more on the successful development of advanced mathematical, statistical and modeling methodologies and techniques. This chapter provides a historical perspective on the interface between mathematics and life sciences and identifies a number of frontier areas where such methodologies and techniques have recently been developed.
Keywords. Coupled Dynamic System, Mathematical Modeling, Monte Carlo Method, Partial Differential Equation, Stochastic Process, Systems-Science-Based Approach
2010 Mathematics Subject Classification. 00A71, 92-02, 92-03, 92B05
1.1.1 Developing the Language of Science and Its Interdisciplinary Character
Mathematics has never been developed in isolation and has always been influenced by other disciplines, in turn offering these disciplines a universal language capable of significantly advancing their own fields of knowledge. The intrinsic relationship between mathematics and science goes back to the dawn of human civilizations. For example, scientists in Ancient Egypt deduced their insights into the phenomena observed in nature by using quantitative representations, schemes, and figures. Geometry played a fundamental role in the Ancient World, while the summation series (most likely known in Ancient Egypt at least since the construction of the Chephren Pyramid of Giza in 2500 BCE) was the origin of harmonic design. A well-known example of such a series is provided by the Fibonacci sequence, and in the context of life sciences the significance of the latter is hard to overestimate. We all know that the arrangements of leaves, flowers, seeds, and petals, to give just a few examples, all demonstrate this sequence. Plenty of examples exist also in zoology, and multiple examples can be given for the human body too. Recall, for example, that the DNA (deoxyribonucleic acid) molecule measures 34 angstroms long by 21 angstroms wide for each full cycle of its double helix spiral. While 21 and 34 are consecutive numbers in the Fibonacci series, the ratio between 34 and 21 provides a good approximation to the golden ratio φ.
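The arithmetic behind the closing remark on 21 and 34 can be checked numerically. The following is a minimal sketch, with function and variable names chosen here purely for illustration: it computes ratios of consecutive Fibonacci numbers and compares them with the golden ratio (1 + √5)/2. It illustrates only the numerical convergence, not the biological claim.

```python
from math import sqrt

# Golden ratio, the limit of ratios of consecutive Fibonacci numbers.
PHI = (1 + sqrt(5)) / 2


def fibonacci_ratios(n_terms):
    """Return the first n_terms ratios F(k+1)/F(k), starting from 1/1."""
    a, b = 1, 1
    ratios = []
    for _ in range(n_terms):
        ratios.append(b / a)
        a, b = b, a + b
    return ratios


if __name__ == "__main__":
    # The eighth ratio is 34/21, the pair of numbers quoted in the text.
    for index, ratio in enumerate(fibonacci_ratios(8), start=1):
        print(f"ratio {index}: {ratio:.6f}  (difference from phi: {abs(ratio - PHI):.6f})")
```

The ratio 34/21 differs from the golden ratio only in the third decimal place, and later ratios get closer still.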
In Ancient Greece, Pythagoras of Samos taught that reality is, at its deepest level, mathematical in nature and that numbers provide a key to the ultimate reality. Galileo, who later said that “the Book of Nature is written in the language of mathematics,” followed the Pythagorean tradition. In the Ancient World in general, and in Greece in particular, the initial quantification of life sciences was driven by the development of agriculture and of botany as a science. Aristotle himself, who lived during the 300s BCE, collected information about a variety of plants known in the world at that time, while his student Theophrastus classified them. Two other pillars of the initial quantification of life sciences at that time were zoology and studies in what was later called natural selection. Indeed, many Greek philosophers, including Aristotle, who studied the manner in which species evolve to fit their environment, were important precursors in the development of modern evolutionary theories. The holistic view, which later evolved into systems science, including systems biology, originated in the Ancient World. It was Aristotle who taught that “the whole is more than the sum of its parts,” emphasizing the importance of the systems science approach long before it started being developed theoretically.
After these ancient traditions, the first attempts at quantifying the life sciences, and Galileo’s vision of the fundamental role of mathematics in the description of all phenomena and processes in nature, perhaps one of the greatest stimuli for closer links between mathematics and life sciences was the rediscovery of G. Mendel’s laws in 1900 (originally presented and published in 1865–1866, [2]) and the subsequent growth of genetics. C. Darwin’s theory of evolution, based on natural selection, and the new science of genetics could barely co-exist at the time. It was mathematics that helped to resolve the crisis through what is now known as R. A. Fisher’s Fundamental Theorem of Natural Selection, formulated in his 1930 book [1]. The essence of this theorem is given in the form of a partial differential equation (PDE), expressing the fact that the rate of fitness increase for any living organism is determined by its genetic variance in fitness at the corresponding moment of time. This mathematical result gave a solid foundation for the development of population biology.
Since the beginning of the 20th century, mathematical models have played an ever-increasing role in the life sciences. Examples include L. Michaelis and M. L. Menten’s equation for enzyme kinetics (1913), J. B. S. Haldane’s equation for genetic mapping (1919), A. J. Lotka and V. Volterra’s predator-prey systems (1925–1931), A. A. Malinovsky’s models for evolutionary genetics and systems analysis (1935), the R. Fisher and A. Kolmogorov equation for gene propagation (1937), A. L. Hodgkin and A. F. Huxley’s equations for neural axon membrane potential (1952), and many others. Many such models are based on coupled systems of equations, most typical for life science applications. It is also important to realize that the influence of mathematics on the life sciences has been magnified by the development of new mathematics-based technologies in the applications of these disciplines and in medicine. A now classical example of that is A. M. Cormack’s equation for the representation of a function by its line integrals (1963), which inspired the field of computer-assisted tomography,
medical and biological imaging, although the foundations for this had already been laid in 1917 by the mathematical work of J. Radon. The development of such theories as self-organization and biological pattern formation provides another important example where the cross-fertilization between mathematics and the life sciences has been profoundly influential, clearly demonstrating that only through interdisciplinary efforts can substantial progress be made at the frontiers of science. This development is dotted with now classical mathematical models developed by A. J. Lotka, D’Arcy W. Thompson, A. Turing, B. Belousov and A. Zhabotinski, A. Gierer and H. Meinhardt, among many others. Much of the current development at the interface of mathematics and life sciences is influenced by the availability of detailed molecular, functional, and genomic data. This leads to the development of new data-driven mathematical models, and the tools of mathematical modeling and computational experiments are becoming increasingly important in today’s life sciences. As we know, the interactions between mathematics and life sciences throughout their history have not always been easy. Indeed, to mathematicians it should not come as a surprise that many life scientists would not accept the view that “in every special doctrine of nature only so much science proper can be found as there is mathematics in it,” expressed by I. Kant back in 1786. However, there is a growing recognition in the scientific community of what E. Wigner called in 1960 “the unreasonable effectiveness of mathematics in the natural sciences,” and the life sciences are not an exception. Moreover, the interdisciplinary nature of science makes collaboration between mathematicians and scientists from other disciplines a necessity that is appreciated by the scientific community more than ever before in the history of science.
Looking at the long history of interaction between the mathematical and life sciences, it is clear that in this ongoing process mathematics has become pervasive across many disciplines of life sciences, while the interface between these disciplines and mathematics fosters new methods, tools, and approaches in both areas.
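To make the phrase "coupled systems of equations" concrete, the Lotka and Volterra predator-prey system named earlier in this section can be sketched in a few lines. The standard form assumed here is dx/dt = αx − βxy, dy/dt = δxy − γy for prey x and predators y. The parameter values, initial state, step size, and the classical fourth-order Runge-Kutta integrator are illustrative assumptions made for this sketch, not choices taken from the book.

```python
from math import log


def lotka_volterra(state, alpha, beta, delta, gamma):
    """Right-hand side of the coupled predator-prey system."""
    prey, predator = state
    d_prey = alpha * prey - beta * prey * predator
    d_predator = delta * prey * predator - gamma * predator
    return (d_prey, d_predator)


def rk4_step(rhs, state, dt, params):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = rhs(state, *params)
    k2 = rhs(shifted(k1, 0.5 * dt), *params)
    k3 = rhs(shifted(k2, 0.5 * dt), *params)
    k4 = rhs(shifted(k3, dt), *params)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial_state, params, dt, n_steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [tuple(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(lotka_volterra, trajectory[-1], dt, params))
    return trajectory


def conserved_quantity(state, alpha, beta, delta, gamma):
    """First integral of the system; constant along exact solutions."""
    prey, predator = state
    return delta * prey - gamma * log(prey) + beta * predator - alpha * log(predator)


if __name__ == "__main__":
    params = (1.0, 0.1, 0.075, 1.5)  # alpha, beta, delta, gamma (illustrative)
    trajectory = simulate((10.0, 5.0), params, dt=0.01, n_steps=2000)
    for step in range(0, 2001, 400):
        prey, predator = trajectory[step]
        print(f"t={step * 0.01:5.1f}  prey={prey:8.3f}  predator={predator:8.3f}")
```

Neither population can be advanced in time without the other, which is what coupling means in this context. The conserved quantity is included as a simple check that the numerical solution stays on a closed orbit.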
1.1.2 Challenges at the Interface: Mathematics and Life Sciences
Many areas of life sciences have been profoundly influenced by mathematics, while indispensable stimuli for new advances in mathematical disciplines have been provided by fundamental challenges coming from the life sciences. The current breadth of the two-way interaction between mathematics and life sciences is already impressive, while the potential of this interaction is virtually unlimited. Moreover, substantial benefits from the continuous deepening of such interaction lie on both sides of the interface. Nowadays, many disciplines in life sciences routinely apply mathematics-based experimental techniques. The design and increasing quality of measurements in such disciplines as structural, cellular and molecular biology require the development of new mathematical algorithms, methods, and tools. For example, in the area of drug design and delivery, optimization and control tools, along with graph theoretical approaches,
have become ubiquitous. Mathematics has played an important role in areas ranging from genetic mapping and cell dynamics to the study of functions of biological systems, from the study of organs to the study of organisms, covering a wide spectrum of life science disciplines, from medicine to ecology, from physiology to neuroscience and to bio-nanotechnological innovations. Mathematical modeling tools, ranging from continuum-mechanics-based models to Molecular Dynamics Simulation and Monte Carlo procedures, are fundamental to the progress made in the areas of bio-mechanics and bio-tissue engineering, membrane and cell biology, and in studying the properties of macromolecules, such as DNA, RNA, and proteins, and complex biophysical phenomena in biological systems. Apart from more traditional applications of approximation theory and statistics, other mathematical disciplines such as differential geometry and topology are becoming increasingly important in life sciences. In turn, such problems as sequencing macromolecules, already present in biological databases, provide an important catalyst for the development of new mathematics and new efficient algorithms and methods. The influence of mathematics on life sciences can be both direct and indirect, sometimes appreciated only many years later. The importance of Euclidean geometry in life sciences, which has been with us since 300 BCE, is hard to overestimate. Only in 1830 did the mathematical world come to know the non-Euclidean geometries of J. Bolyai and N. I. Lobachevsky. Yet it was many years later (around 1999–2003) that scientists realized that these seemingly very abstract mathematical constructs can be applied in the flat mapping of the human cerebellum. Similarly, it was in the 1970s that we learned about B. Mandelbrot’s fractal geometry, where symmetry across scales plays a fundamental role.
But it was only years later that life scientists found that many biological processes and systems, including human organs from the bronchi to the liver, can be well described by using methods of fractal geometry. A classical example of the impact of life sciences on mathematics is the discovery of what is now called Brownian motion. The microscope was invented in the late 17th century, and in 1827 R. Brown, a Scottish botanist, was examining under a microscope grains of pollen of the Clarkia pulchella plant suspended in water. Years later, his observations became the basis for the development of a mathematical theory that nowadays is pivotal in probability theory and numerous applied branches of mathematics. New challenges in life sciences provide a great impetus for dynamical systems theory, differential equations (partial, stochastic, integro-differential), mathematical modeling, geometry and topology, data mining, mathematical and numerical analysis, and other branches of modern mathematics. It is also well known that the development of game theory, and contributions by von Neumann in particular, as well as the development of control theory and cybernetics by N. Wiener, were influenced by developments in life sciences. In turn, the growth of systems science has provided one of the fundamental cornerstones for the development of systems biology, where biological systems (e.g., cells, organisms) are considered in a holistic way (recall Aristotle’s quote from our Section 1.1.1). This new discipline aims to account for the interactions between different components of a
biological system at different scales (e.g., from the molecular to the systemic level). Mathematically speaking, we have to account for the multiscale spatio-temporal character of the system and the intrinsic interplay of its components. What comes out of such a consideration is one of the important examples of complex systems. The mathematics of complex systems is at the beginning of its development. On the other hand, as J. E. Cohen once said, “if every biologist who plotted data on x–y coordinates acknowledged the contribution of Descartes to biological understanding, the key role of mathematics in biology would be uncontested.” Recall that this link between geometry and algebra, nowadays taken for granted, was made by René Descartes back in 1637. Present-day life sciences comprise a vast range of different disciplines where the application of mathematical, statistical and computational science tools is growing rapidly. They include, but are not limited to, the following (frequently interconnected) areas:
Systems biology and medicine, including direct and inverse problems applied to specific fields, e.g. cellular systems biology, systems oncology, etc., studies of various organs, their systems and functions, e.g., cardiac and skeletal-muscular functions, studies of human tissues and blood flow, etc.;
Dynamics of complex biological networks, including regulatory networks, their interactions; mathematical models for cell biology, including cell dynamics, membrane biology; developing complex signaling models and their applications;
Mathematical models in neuroscience and physiology, developmental biology, evolution and evolutionary dynamics of biological games;
Mathematical biomedicine, including immunology problems, epidemiology, modeling infectious diseases, mathematics of clinical trials, drug development, delivery, and resistance;
Genomics and genetics research, including the analysis of the mechanism of protein synthesis in control of gene expression (and in cells), molecular machines and gene regulation, gene regulatory networks governing cell proliferation and differentiation decisions, and the development of strategies to control gene expression;
Self-assembly and spatio-temporal pattern formation in biological systems; nonlinear waves and excitable systems, including nonlinear reaction-diffusion systems, phases in biological systems, phase transitions, etc.;
Mathematical models for bio-macromolecules, including DNA, RNA, proteins, their properties, dynamics, and interactions at various length and time scales;
Mathematical models for industrial biotechnology and biotechnological applications, including bio-nanotechnology, bio-imaging, reconstructions of computerized tomography scans, etc.;
Bionics and using nature’s ideas in creating artificial bio-tissues, bio-sensors and bio-actuators, etc.; studies into coupled systems and coupled effects in biological systems, e.g., piezoeffect, chemotaxis, phototaxis, new biophysical phenomena and new models for photosynthesis, biological fluid dynamics and fluid-structure interactions in biological systems, etc.;
Mathematics in other areas of life sciences, including ecology in general and microbial (food, environment) ecology in particular, protecting ecosystems, homeland security, sustainability, plant biology and agriculture, etc.
The problems we encounter in these areas require a variety of methods and tools from mathematical, statistical and computational sciences. For example, probabilistic graphical models are essential in the analysis of cellular networks, while computational methods play a fundamental role in studies of cellular signaling, pathways, and signal transduction. Ordinary, partial, integro-differential, functional, and stochastic differential equations, methods of dynamical systems theory, neural networks, singularity theory, discrete mathematics approaches, and tools from statistics and algebraic geometry, to name just a few, have all made invaluable contributions to life sciences, enriching the body of knowledge so important for our further progress. While these mathematical, statistical and computational tools are ubiquitous and are frequently applied in other areas of human endeavor, their applications in life sciences have certain specifics. Firstly, compared to many physical or engineering systems, where mathematical theories, tools and methods have long played a profound role in understanding and quantification, well beyond the descriptive level (still typical for many life science disciplines), our knowledge of biological systems is quite limited. This should not come as a surprise, as most biological theories are of recent origin and are now undergoing profound transformations. Secondly, one of the consequences of biological system complexity is that most systems we have to deal with are highly heterogeneous, with their parts interacting over a large range of spatio-temporal scales and often exhibiting non-deterministic behavior. Indeed, many problems that appear in life sciences involve systems that are only partially understood. They are inherently uncertain, demanding study with new mathematical tools.
Attempts to analyze such systems with conventional methods of mathematical analysis almost inevitably lead to a situation where our models often remain oversimplified compared to their real-world biological prototypes. Therefore, while in the analysis of physical and engineering systems we often look for simplified mathematical models to understand and explain the data we have, in life sciences this may not always be the case (as many historical examples, including A. Turing’s example, clearly show). As in the physical and engineering sciences, the development of more realistic mathematical models in life sciences frequently leads to coupled systems of equations, e.g., coupled PDEs. Efficient mathematical and computational methodologies for their solution often have to account for the multiscale character of
the problem. In a large variety of situations such methodologies are at the beginning of their development. Typical examples of such problems come from cell biology, the study of metabolic pathways, and complex networks (e.g., biochemical reaction networks and genomic interactions via gene regulatory networks). Moreover, in studying the dynamics of such complex biological networks, stochastic mathematical models are usually needed, and non-equilibrium models from mathematical and statistical mechanics are not uncommon. Indeed, any biological system comprising interacting cells is influenced (already at the cell level) by a variety of external factors. As a result, the models and approaches developed for physical systems may not be sufficient to deal with this biological complexity. However, based on non-equilibrium statistical mechanics, we can still get a flavor of system behavior at the macroscopic level from corresponding moments of the distribution function taken over the microscopic states. Mathematical models are at the heart of the predictive capabilities of life sciences. As a result, the role of mathematical modeling in life sciences has continued to increase. Indeed, mathematical modeling not only can support theories, but can often suggest the need for better experiments and more focused observations, which in turn provide a check on the model accuracy. Observations and experiments may produce large amounts of data that can only be intelligently processed with efficient mathematical data mining algorithms and powerful statistical and visualization tools. The application of these algorithms and tools requires close collaboration between mathematicians and scientists from other respective disciplines. Thus, observations and experiments, theory and modeling reinforce each other, leading together to a better understanding of the phenomena, processes, and systems we study.
With the increase in data in some areas of life sciences (e.g., genomic research), bioinformatics has played an important role in the analysis of such data. This increase in data has also clearly shown that the next step should lie with the integration of this data with mathematical models possessing predictive capabilities. Indeed, this step is essential in bringing us closer to the application of mathematics in such important areas as clinical research and practice, including efficient disease treatments. At the same time, many complex biological systems are still hard to measure on the spatio-temporal scales required for their understanding. This lack of measurements may seriously constrain the application of conventional methods for system parameter identification and estimation. It also leads to the necessity of closer interactions between mathematical modeling, computational analyses and experimental approaches, paving the way for the development of new mathematical techniques based on inverse problem analysis and their applications in life sciences. Finally, we mention that the development of a holistic view, already pursued in many areas of life sciences and generically termed systems biology, opens a practically unlimited number of avenues for exploring the interface of mathematics and life sciences. This view not only is crucial in a large body of disciplines within life sciences to meet their modern challenges, but also provides an important stimulus for the development
of new models, methods and techniques in mathematical sciences in response to those challenges.
1.1.3 What This Book Is About

The book contains 12 state-of-the-art surveys and research articles at the interface of mathematics and life sciences. It is based on selected invited contributions by researchers from Europe, North America, Asia, and Australia. The book presents a broad spectrum of mathematical, statistical and computational methods important in life sciences, as well as some representative examples of modern problems from life sciences where the development of new mathematical approaches is required. After this introductory chapter, each remaining chapter stands alone as a survey or in-depth research within a specific area, exploring the interface of mathematics and life sciences. In what follows, we highlight the main features of each such chapter.
Mathematical and Statistical Modeling of Biological Systems. A survey of a new type of mathematical models in life sciences, termed ensemble models, is given by D. Swigon (Pittsburgh, USA). These models are extensions of conventional deterministic or stochastic mathematical models to situations where we have either a small or an extensive amount of data. Such models are identified and their parameters are estimated by using Bayesian techniques. Examples are given for viral infection dynamics modeling. Computational techniques, implementation of ensemble models, as well as open problems are also discussed.
Probabilistic Models for Nonlinear Processes and Biological Dynamics. An introductory survey of the important analytical aspects of mathematical models based on general nonlinear Markov processes is given by V. N. Kolokoltsov (Warwick, UK). Among such processes are Lévy and Feller processes, as well as nonlinear Markov chains. The author demonstrates mathematical tools that are important in the study of replicator dynamics in evolutionary biology and have potential for applications in many other areas of life sciences.
New Results in Mathematical Epidemiology and Modeling Dynamics of Infectious Diseases. A detailed mathematical analysis leading to formal solutions of general epidemic equations is carried out by V. A. Stepanenko (Krasnoyarsk, Russia) and N. Tarkhanov (Potsdam, Germany). The authors’ analysis is, in fact, performed even for a more general equation describing Markov stochastic processes, of which an epidemic equation is a particular example. Their explicit solution may provide a new efficient way for modeling the transmission of infectious diseases in living organisms.
Mathematical Analysis of PDE-based Models and Applications in Cell Biology. Spectral problems with rapidly oscillating coefficients are analyzed by T. A. Melnyk and A. V. Popov (Kyiv, Ukraine). Such problems arise in numerous life science
applications and the authors focus on one of them, pertinent to cell biology. In particular, certain biophysical processes within biological cells (e.g., some metabolic processes) can be described by boundary value problems in thin perforated domains with a rapidly oscillating boundary. The corresponding mathematical models may assist in the development of new methods of diagnostics and new healing mechanisms for damaged cells. The analysis of such models, based on asymptotic expansions for the eigenvalues and eigenfunctions, is carried out for different limit dimensions and a new homogenization theorem is established. These results and the complete asymptotic expansions constructed by the authors may also prove to be useful in other areas of life sciences.
Axiomatic Modeling in Life Sciences with Case Studies for Virus-Immune System and Oncolytic Virus Dynamics. Approaches based on what is termed as “axiomatic modeling” are surveyed by N. L. Komarova (Irvine, USA). The author argues that, unlike physics or chemistry, life sciences still demonstrate notable resistance to the application of conventional mathematical approaches. At the same time, mathematical modeling provides an appropriate framework in these fields to formulate assumptions in a quantitative way. It is demonstrated that despite very limited biological knowledge about the systems of interest, the axiomatic approach allows us to make some nontrivial statements based on the analysis of the corresponding models. The first case study is considered in the context of strategies for boosting immunity by anti-viral drug therapy and discusses the issues of timing, efficacy and success of such a therapy. The model is considered on the example of data for hepatitis C virus (HCV). However, the developed theoretical framework can also be applied to other situations, e.g., in the case of human immunodeficiency virus (HIV). The second case study concerns the construction of a predictive model for oncolytic virus dynamics where the author analyzes different scenarios of virus spread, as well as stability properties of equilibrium solutions. Advantages and limitations of the axiomatic modeling approach are also discussed.
Theory, Applications, and Control of Nonlinear PDEs in Life Sciences. This area encompasses many different topics at the interface of life sciences and mathematics, and is represented in this chapter by two papers. Mathematical modeling and computational experiments help better understand biological systems, as well as get further insight into many biophysical and biochemical phenomena and processes. The workhorse for many mathematical models in life sciences is built on the basis of PDEs. In this chapter, a detailed analysis of semilinear parabolic equations of normal type is given by A. V. Fursikov (Moscow, Russia). The main focus is on the development of various tools for damping turbulence in fluid flows which find important applications in biology and medicine, among other disciplines. Mathematically, the problem is studied in the context of theory of stabilization of the solution to the Navier–Stokes equations by feedback control. Details of the derivation of normal parabolic equations, the structure of dynamics of such equations and
stabilization of their solutions by starting control are discussed. The tools developed could be important in a range of applications of biological fluid dynamics. The other paper in this chapter, by A. A. Kovalevsky (Donetsk, Ukraine), focuses on nonlinear PDEs with $L^1$ data. The author provides a survey of results on the existence and properties of solutions to several classes of nonlinear second- and fourth-order equations. As many applied mathematics problems in life sciences do not possess higher regularity, the analysis provided here is important. The author notes that in particular cases, the principal parts of analyzed equations can be generalized by the p-Laplacian operator, which arises naturally in the context of the Navier–Stokes equations describing the motion of non-Newtonian fluids. The author describes different kinds of solutions to his PDE-based models such as entropy and proper entropy solutions. Among other areas, the applications of these results are important in the study of the motion of non-Newtonian fluids, e.g., blood, synovial fluid, saliva, etc. Other applications include problems related to modeling biological pattern formation, as well as the interaction of diffusing biological species.
Mathematical Models of Pattern Formation and Their Applications in Developmental Biology. Developmental biology deals with the entire range of biological complexity of living organisms, e.g., from egg to embryo. This area provides a very fertile ground for the formulation of mathematical models, their analysis, the development of efficient numerical techniques, and subsequent computational experiments. This area is represented in this chapter by a paper aiming at a better understanding of developmental processes via the study of pattern formation. Based on reaction-diffusion equations, the paper by A. Marciniak-Czochra (Heidelberg, Germany) focuses on the study of symmetry breaking and the formation of spatially heterogeneous structures during development. A detailed analysis is given for two important cases of pattern formation: (a) diffusion-driven instability (Turing instability) and (b) a hysteresis-driven mechanism. The author demonstrates the main possibilities of these mechanisms and their constraints in explaining different aspects of structure formation in cell systems. The results are discussed in the context of morphogenesis of a freshwater polyp, known as a model organism in developmental biology.
Modeling the Dynamics of Genetic Mechanisms, Pattern Formation, and the “Genetics of Geometry”. One of the challenging problems in life sciences in general, and in plant science in particular, is the recovery of information about the dynamics of genetic mechanisms by which biological systems (e.g., plants) can control the development of their various topological/geometrical features. In this chapter, R. S. Anderssen (Canberra, Australia), M. P. Edwards (Wollongong, Australia), and S. Pereverzyev Jr. (Linz, Austria) address this problem in the context of the positioning of trichomes on the leaves of plants. The authors compare the application of reaction-diffusion models to this problem with cellular signaling and switching models. It is demonstrated that in order to better understand genetic control mechanisms,
leading to signaling and switching between cells to produce the observed patterns, it is essential to utilize cellular models of the plant organ. Furthermore, the authors put their research on the positioning of trichomes on the leaves of plants in the context of a new Anderssen–Pereverzyev framework for performing the biocombinatorial sorting of the genes known to be involved into biomechanistic categories.
Statistical Modeling in Life Sciences and Direct Measurements. Measurements represent an important part of research tools intrinsic to life sciences. A natural question that arises once such measurements are completed is that of error estimations. The development of statistical models for such estimations is fundamental to many areas of life sciences. This is the subject of a paper presented by the group of researchers from the Center for Radiation Medicine (I. Likhtarov, S. Masiuk, M. Chepurny, L. Kovgan), National T. Shevchenko University of Kyiv (A. Kukush, S. Shklyar) in Ukraine and the National Cancer Institute (A. Bouville) in the USA. They develop a new statistical model for radioactivity measurements. The model involves both classical and Berkson measurement errors. While in the estimation of the Berkson error the Monte Carlo method can be applied, a new methodology is needed for the classical error estimation. They carry out a detailed risk analysis where they propose two methods to characterize dose uncertainty, based on parametric and non-parametric calibrations.
Design and Development of Experiments for Life Science Applications. A vast range of mathematical, statistical, and operations research tools is required in this area, which is represented in this chapter by a new quantitative study of in vivo biological responses to static magnetic field (SMF) exposure. In a paper by J. F. László (Debrecen, Hungary), this study is largely motivated by an ever-increasing interest in this topic due to the proliferation of high-field magnetic resonance tomography in medical diagnostics. It is an important example of situations where it is fundamental to optimize the experiment in methods, in materials, as well as conceptually. In doing so, we are making a step toward increasing the interest of clinical practitioners in the results of mathematical, statistical and computational modeling. In the current paper, the author deals with mathematical modeling in physiology, which is frequently defined as the science of the function of living systems. The study involves the analysis of static magnetic fields, which are known to induce a wide range of biological responses. In particular, such a field can induce analgesic effects in humans. Since pain is a very complex state that involves both central and peripheral mechanisms, studying such effects is an extremely difficult task. The author provides a detailed account of recent progress in this field, focusing on statistical and experimental models. Based on his findings, suggestions for SMF-based devices for therapy are also given.
Mathematical Biomedicine and Modeling Avascular Tumor Growth. One of the biggest challenges in life sciences in general, and in biomedicine in particular, lies with the development of mathematical models for various diseases in order to
assist clinical practitioners and help the advancement of strategies for disease treatments. In this chapter a comprehensive survey of a series of increasingly complex mathematical models for avascular tumor growth is given by H. Byrne (Oxford, UK), with the major focus on continuum models. These include two-dimensional models that can be used to determine the stability of radially-symmetric solutions to symmetry-breaking perturbations, allowing the establishment of conditions under which the growth remains localized or invasive. Tumor growth is one of the primary challenges of cancer research. Its dynamics features a range of different scales, from molecular to macroscopic, and its treatment seems to require new, systems biomedicine based, approaches. In this chapter the author explains how the existing models are inter-related as well as the biophysical insight that they provide. A discussion of the theoretical challenges that lie ahead is also given.
1.1.4 Concluding Remarks

In this chapter we have presented a selection of topics, representing part of a vast spectrum of the interface between mathematics and life sciences. The chapters that follow provide a unique collection of in-depth mathematical, statistical, and modeling methods and techniques for life sciences, as well as their applications in a number of areas within life sciences. They also provide a range of new ideas that represent emerging frontiers in life sciences where the application of such quantitative methods and techniques is becoming increasingly important. It is hoped that many mathematical, statistical, and computational tools presented in this book will help address current and future challenges in life sciences. Such challenges lie in different spatio-temporal scales, from genetic and molecular levels to cells, to whole organisms and to the level of entire ecosystems, from fractions of a second to the evolutionary scale. Therefore, we also hope that problems arising in life sciences will provide an exciting ground and continuous stimulus for the development of new theories, methods, and tools in mathematical, statistical, and computational sciences. We are convinced that since our understanding of the complex systems we encounter in life sciences increasingly requires the development of new quantitative approaches, as well as on-going interactions and close collaboration between different disciplines, the fundamental role of interdisciplinary research at the interface of mathematics and life sciences will continue to grow in its importance.
Author Information

Alexandra V. Antoniouk, Institute of Mathematics, National Academy of Sciences of Ukraine, Kyiv, Ukraine

Roderick V. N. Melnik, M2 NeT Laboratory, Wilfrid Laurier University, Waterloo, ON, Canada; Ikerbasque, Basque Foundation for Science and BCAM, Bilbao, Spain
2
Mathematical and Statistical Modeling of Biological Systems
David Swigon
2.1 Ensemble Modeling of Biological Systems*
Abstract. Mathematical modeling of biological systems must cope with difficulties that are rarely present in traditional fields of applied mathematics, such as a large number of components involved, extreme complexity and variety of interactions, and the lack of reproducible and consistent data. These difficulties may be overcome by models of a new type, termed ensemble models, which allow for the parameters and the model structure to vary and thereby describe a population of all models that are consistent with biological knowledge about the system. Ensemble models are identified and their parameters are estimated using Bayesian techniques, and the models are subsequently used to provide probabilistic predictions about the system behavior and its response to changes in conditions. We here survey the basic methodology of ensemble modeling and its applications to biological systems.

Keywords. Bayesian Inference, Ensemble Modeling, Parameter Estimation, System Identification

2010 Mathematics Subject Classification. 62F15, 65C40, 91-08, 92B05, 93B30
2.1.1 Introduction

Mathematical modeling of biological systems has intensified considerably in recent years, especially with the advent of experimental techniques such as microarray analysis of gene expression profiles, whole genome sequencing, or high-throughput flow cytometry and ELISA biochemical assays, that are capable of providing a wealth of new biological data. There are important differences in the character of biological system models, when compared to those in traditional fields of applied mathematics, which require new approaches to model development, parameter estimation and model prediction [44, 52]. The first difference is in the modeling approach. Traditional model development relies on a strategy of reduction: the search for a small number of general laws governing the behavior of the system or a decomposition of the system into simple components governed by few interactions, and the design of experiments in which those laws or interactions can be characterized. Biological modeling has not been universally
D. S. acknowledges support by NSF grant DMS0739261 and NIH grants R01-GM83602 and R01DC008290
successful with this approach, primarily because in many situations the decomposition of the system into the simplest laws is tedious and impractical, especially when complexities in the behavior extend over multiple length- and time-scales [74]. For example, many genetic network models have hundreds of components and thousands of reactions, and it is nearly impossible to design experiments that would isolate each such reaction so that its kinetic rate constants can be measured with sufficient accuracy. Furthermore, the results of such experiments, performed in vitro, may not represent the true interaction in vivo, for that reaction may be influenced by other reactions and molecules in ways yet unknown. The second difference is in the amount and quality of experimental data. A reductionist approach requires a large amount of low-dimensional data to provide statistical confidence in estimates of parameters. In biological modeling, the data may be available only in limited quantities, or in large quantities (obtained by high-throughput techniques) but over multiple dimensions [42]. More importantly, the data may not be available for the same subject (organism or cell), but as a collection of measurements obtained for a population of systems that were subject to an identical stimulus. In the case of time-dependent (longitudinal) data, it is frequently impossible to use the same subject for two time-point measurements because data collection methods require its destruction (e.g., animal sacrifice). The data at different time points thus come from different subjects and it is not guaranteed that these subjects represent identical copies of the same system.
Although at the molecular level the biological processes in such subjects are likely to be governed by chemical and biochemical laws with similar parameters, there may be genetic differences between cells or organisms which can lead to discrepancies in interactions that multiply into large differences in parameters describing macroscopically observable phenomena. The combination of these difficulties calls for the development of a spectrum of phenomenological models tailored to specific situations, and the use of parameter/model inference techniques which result in distributions of parameters rather than specific values, describing parameter variability over a population. Such models have been called ensemble models, and they are understood not to represent the behavior of an individual subject but that of a whole population. The models are probabilistic in principle, because of the underlying distribution of parameters, but not necessarily stochastic—that term is reserved for models that describe interactions and laws that are probabilistic in individual interactions (such as binding and unbinding of molecules). Indeed, the majority of ensemble models in the literature are deterministic, in that the evolution of the system, once its parameters are fixed, is free of random effects. The best approach to formulating ensemble models appears to be using Bayesian inference. The use of Bayesian techniques in biology thus brings a new meaning to the probability densities produced by Bayesian computation. Because data used for parameter estimation comes from a population of systems, the distributions can be thought of as distributions of parameters within the population.
This paper contains the description and properties of ensemble models, gives examples of the use of such models in studies of biological systems, and outlines both conceptual and technical open problems in this new, important, and exciting area of research. A basic introduction to parameter estimation and inverse problems can be found in [2, 37–39] or [70]. The focus here is on modeling within the framework of ordinary differential equations; for recent reviews of the use of Bayesian methods in bioinformatics and computational systems biology, focused primarily on sequence analysis, microarray data analysis, protein bioinformatics, and network inference, see [22, 77].
2.1.2 Background

For simplicity, consider a model formulated as an initial-value problem for a system of ordinary differential equations, in which the time evolution of a vector of state variables $x$ depends on a vector of parameters $p$ (such as kinetic rate constants) and a vector of inputs $u$. The observations about the system are made by measuring output variables $y$, which are functions of the state variables and may depend on the parameters and inputs as well:

$$\dot{x}(t) = f(t, x(t), u(t), p), \qquad x(t_0) = x_0 \tag{2.1}$$

$$y(t) = g(x(t), u(t), p) \tag{2.2}$$
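As a purely illustrative sketch of the structure (2.1)–(2.2), the following Python fragment integrates a one-state model and evaluates its output. The specific right-hand side `f` (first-order clearance with an additive input), the output map `g` (a scaled copy of the state), the helper name `simulate`, and all numerical values are assumptions made for this example and are not taken from the text. A fixed-step classical Runge–Kutta scheme is used only to keep the sketch free of external libraries.

```python
import math

def f(t, x, u, p):
    """Right-hand side of (2.1) for a hypothetical clearance model with input u."""
    return [-p[0] * x[0] + u(t)]

def g(x, u_t, p):
    """Output map of (2.2): the measured quantity is a scaled copy of the state."""
    return [p[1] * x[0]]

def simulate(x0, u, p, t0, t1, n_steps):
    """Integrate (2.1) with the classical fixed-step RK4 scheme and
    evaluate the output (2.2) at every grid point."""
    h = (t1 - t0) / n_steps
    x = list(x0)
    ts, ys = [t0], [g(x, u(t0), p)]
    for k in range(n_steps):
        t = t0 + k * h
        k1 = f(t, x, u, p)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)], u, p)
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)], u, p)
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)], u, p)
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        ts.append(t + h)
        ys.append(g(x, u(t + h), p))
    return ts, ys

# Zero input: the trajectory is then determined by x0 and p alone.
ts, ys = simulate(x0=[1.0], u=lambda t: 0.0, p=[0.5, 2.0],
                  t0=0.0, t1=2.0, n_steps=200)
```

For this particular choice the output has the closed form $y(t) = p_2 x_0 e^{-p_1 (t - t_0)}$, which gives a simple check on the numerical trajectory.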
The function $u(\cdot)$ may be employed to control the dynamics of the system, or to perturb its dynamics in order to study its behavior. In a majority of biological models, $u$ is fixed and the dynamics of the system is studied by changing the initial conditions and/or parameters. We assume that the model (2.1)–(2.2) is mathematically well posed and has a unique solution $(x(t), y(t))$ for any $p$, $x_0$, and admissible function $u(\cdot)$. (This condition can be relaxed by restricting attention to sub-domains of the parameter, state, and input spaces in which existence and uniqueness hold.) In a classical setting, one assumes that (i) the system under consideration is described by a model with a unique parameter set $p$ and a trajectory $y(t; p)$ and (ii) the observed values $\bar{y}^i$ of the output variables at time points $t_i$ are normally distributed random variables with mean $y(t_i; p)$ and variance $\sigma_i^2$ due to independent random measurement noise. In that case the probability density for $\bar{y}^i$ is

$$L_i(\bar{y}^i \mid p) = \prod_j \left(2\pi\sigma_{i,j}^2\right)^{-1/2} \exp\left(-\frac{\left(y_j(t_i; p) - \bar{y}^i_j\right)^2}{2\sigma_{i,j}^2}\right) \tag{2.3}$$

where $j$ ranges over the components of $y$. We assume that the data $Q$ provided about the system consist of the observed values $\bar{y}^i$ and their uncertainties $\sigma_i^2$. If measurement errors are independent, the likelihood
of observing the full set of data for a model with parameters $p$ is

$$L(Q \mid p) = \prod_i L_i(\bar{y}^i \mid p) \tag{2.4}$$
In practice, the values of the parameters are not known and are to be estimated from the data $Q$. A cost function (or objective function) $E(p, Q)$ is constructed to reflect the agreement between the model with parameters $p$ and the data $Q$. It quantifies the difference between measured and model-predicted values of the output at each time point, and depends on the source of the discrepancy between model output and measured data. Traditionally, $E(p, Q)$ is taken to be the negative logarithm of the likelihood $L(Q \mid p)$, i.e.,

$$E(p, Q) = -\log L(Q \mid p) = E_0 + \sum_i \sum_j \frac{\left(y_j(t_i; p) - \bar{y}^i_j\right)^2}{2\sigma_{i,j}^2} \tag{2.5}$$
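The relation between the likelihood (2.3)–(2.4) and the cost function (2.5) can be checked numerically. The sketch below assumes a hypothetical scalar output $y(t; p) = p_2 x_0 e^{-p_1 t}$ and made-up measurements; neither comes from the text. It also assumes that $E_0$ in (2.5) is the parameter-independent constant $\frac{1}{2}\sum_{i}\log(2\pi\sigma_i^2)$ that arises when the logarithm of (2.3) is taken, which is our reading of the formula.

```python
import math

def model_output(t, p, x0=1.0):
    """Hypothetical scalar output y(t; p) = p[1] * x0 * exp(-p[0] * t)."""
    return p[1] * x0 * math.exp(-p[0] * t)

def likelihood(p, times, y_obs, sigma):
    """L(Q|p) of (2.4): product over time points of the Gaussian densities (2.3)."""
    value = 1.0
    for t, y_bar, s in zip(times, y_obs, sigma):
        residual = model_output(t, p) - y_bar
        value *= math.exp(-residual**2 / (2 * s**2)) / math.sqrt(2 * math.pi * s**2)
    return value

def cost(p, times, y_obs, sigma):
    """E(p, Q) of (2.5): constant E_0 plus the weighted sum of squared residuals."""
    e0 = sum(0.5 * math.log(2 * math.pi * s**2) for s in sigma)
    misfit = sum((model_output(t, p) - y_bar)**2 / (2 * s**2)
                 for t, y_bar, s in zip(times, y_obs, sigma))
    return e0 + misfit

# Made-up measurements (not data from the text): three time points, scalar output.
times = [0.5, 1.0, 2.0]
sigma = [0.1, 0.1, 0.2]
y_obs = [1.60, 1.18, 0.75]
p_trial = [0.5, 2.0]

e_direct = cost(p_trial, times, y_obs, sigma)
e_from_likelihood = -math.log(likelihood(p_trial, times, y_obs, sigma))
```

The two values `e_direct` and `e_from_likelihood` agree up to rounding, and a parameter set that fits the data poorly yields a larger cost.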
More general distributions of the measurement error would result in different forms of (2.5). Furthermore, in addition to the information included in the measured data, one can include in $E(p, Q)$ terms that account for observed maximum or minimum values of outputs over the duration of the experiment, expected timing of peak values of output variables, etc. If multiple runs of the experiment are done with different initial conditions and/or input functions $u$, then the function $E$ in (2.5) should include additional summation over all such conditions. Traditional parameter fitting identifies the parameters of the observed system with the parameter set $p^*$ that minimizes $E(p, Q)$ given the data $Q$. When the form (2.5) is used, the vector $p^*$ is called the maximum likelihood estimate of the parameters. The vector $p^*$ exists for any reasonable choice of the function $E$, but it is unique only if the model is structurally identifiable, i.e., if there are no two distinct parameter sets that would lead to systems with identical dynamics. Multiple types of identifiability have been defined and various algebraic criteria and transform methods for testing identifiability have been developed; see, e.g., the review [54]. If the existence and uniqueness of the minimizer $p^*$ is guaranteed, one can proceed with its computation using function minimization algorithms, such as the Levenberg–Marquardt scheme [43]. Such methods may require the computation of the trajectory sensitivity gradient $S(t) = \partial x(t)/\partial p$, which can be obtained for the system (2.1) by simultaneous integration of the associated ODE problem [27]:

$$\dot{S}(t) = \frac{\partial f}{\partial x} S(t) + \frac{\partial f}{\partial p}, \qquad S(t_0) = 0 \tag{2.6}$$
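A minimal sketch of the simultaneous integration in (2.6) is given below for a hypothetical scalar model $\dot{x} = -p\,x$, chosen only because its sensitivity has the closed form $S(t) = -t\,x_0 e^{-pt}$. The model, the helper names `augmented_rhs` and `rk4`, and the numerical values are assumptions for illustration. The state and its sensitivity are stacked into one vector and advanced together with a fixed-step Runge–Kutta scheme.

```python
import math

def augmented_rhs(t, z, p):
    """State x and sensitivity S = dx/dp for the hypothetical model f = -p*x."""
    x, s = z
    df_dx = -p          # partial derivative of f with respect to x
    df_dp = -x          # partial derivative of f with respect to p
    return [-p * x, df_dx * s + df_dp]   # [x', S'] as in (2.6)

def rk4(rhs, z0, p, t0, t1, n_steps):
    """Classical fixed-step RK4 integration of z' = rhs(t, z, p)."""
    h = (t1 - t0) / n_steps
    z = list(z0)
    for k in range(n_steps):
        t = t0 + k * h
        k1 = rhs(t, z, p)
        k2 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)], p)
        k3 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)], p)
        k4 = rhs(t + h, [zi + h * ki for zi, ki in zip(z, k3)], p)
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
    return z

x0, p, t_end = 1.0, 0.5, 2.0
# The sensitivity starts from zero because x0 does not depend on p.
x_end, s_end = rk4(augmented_rhs, [x0, 0.0], p, 0.0, t_end, 200)
s_exact = -t_end * x0 * math.exp(-p * t_end)   # closed form for this model
```

Integrating the sensitivity together with the state reuses the same trajectory and avoids the step-size tuning that finite-difference approximations of $\partial x/\partial p$ would require.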
For stiff or chaotic systems the trajectory is extremely sensitive to the initial value and parameter values. The multiple shooting algorithm removes this difficulty by converting the initial value problem to multiple boundary value problems that are solved for
segments of the full trajectory [3, 9, 73]. The Kalman filter technique is another popular method for estimating a unique parameter set. It is a stepwise algorithm based on predicting the trajectory $x(t + \Delta t)$ using a model estimated from the data up to time $t$, and then correcting the parameters of the model using the data $y(t + \Delta t)$. It has been shown that if the model is linear and the data are subject to Gaussian noise, the parameters of the model converge to the maximum likelihood estimate $p^*$. There is a large amount of literature on extensions of the Kalman filter method to nonlinear systems [75]. Note that the identifiability of a model is a structural property that is independent of the data. Sontag has shown that to determine all $r$ parameters of an identifiable system, one needs at most $2r + 1$ judiciously chosen measurements [66]. Thus, for sparse enough or poorly chosen data the minimizer $p^*$ may not be unique, even if the system is identifiable. One can still proceed with a minimization algorithm for such an under-determined problem, but the resulting value of $p^*$ will depend on the initial guess for $p$ used in the minimization algorithm. By varying initial guesses one can obtain a collection of parameter sets that represent the set of optimal parameter values for the system; these values lie on a lower-dimensional manifold in the parameter space defined as $E(p, Q) = E(p^*, Q)$. A model with such a collection of parameters has been called an ensemble model in some literature [17, 49]. Alternatively, one can try to enforce uniqueness of the minimizer by adding to $E(p, Q)$ an artificial term that controls the magnitude of $p$, such as $\sum_i p_i^2$ or $\sum_i (\log p_i)^2$ (which controls both the large and small values of $p$). An issue of prime importance in parameter estimation is parameter sensitivity of the model, i.e., the dependence of the objective value on a change in parameters.
The sensitivity of a model (given data $Q$) can be assessed by analyzing the quadratic expansion of $E(p, Q)$ about the minimizer $p^*$:

$$E(p, Q) = E_0 + \frac{1}{2}(p - p^*)^T H (p - p^*) + O(\|p - p^*\|^3) \tag{2.7}$$
The eigenvectors of $H$ determine the principal linear combinations of parameters. The largest eigenvalues of the Hessian $H$ correspond to the stiff parameter combinations, i.e., linear combinations of parameters (specified by the corresponding eigenvectors) along which a small change results in a large increase in the cost function. These represent the directions in which the system is sensitive to parameter changes. Small eigenvalues correspond to the soft directions, along which the change in $E(p, Q)$ is relatively small. The sensitivity can be decomposed into three components: the sensitivity of the output to the trajectory, $G(t) = \partial g(x(t), u(t), p)/\partial x$, computed for the trajectory $x(t)$; the sensitivity of the trajectory to parameters, $S(t)$; and the sensitivity of the objective function $E(p, Q) = \tilde{E}(y(\cdot\,; p))$ to the output $y$:

$$H = \sum_i S(t_i)^T G(t_i)^T \frac{\partial^2 \tilde{E}(t_i)}{\partial y^2} G(t_i) S(t_i) \tag{2.8}$$
(If the weighted Euclidean norm (2.5) is used to compute $E(p, Q)$, then $\partial^2 \tilde{E}(t_i)/\partial y^2$ is a diagonal matrix with entries $1/(4\sigma_{i,j}^2)$.) In classical parameter fitting, the sensitivity $H$ is used to characterize the accuracy with which model parameters can be determined. Along the soft directions the accuracy is low, while along the stiff directions the accuracy is high. For a large number of ODE models inspired by biological systems, most parameters contribute to the eigenvector of some soft direction, which results in an apparent parameter insensitivity [29].

In addition to the traditional parameter fitting techniques, there are methods based on the theory of probability. Bayesian parameter inference [14, 22, 37–39, 56] provides, for a given model and a set of data, not a unique set of parameters but a parameter distribution describing the probability (some authors say plausibility) that the model has a parameter set $p$ given the data $Q$. This distribution is characterized by the posterior density $\pi(p|Q)$. The application of Bayesian inference requires that the likelihood $L(Q|p)$ of observing the data $Q$ for a model with parameters $p$ be specified. In the classical setting described above, one would take $L(Q|p) = \exp(-E(p, Q))$ with $E(p, Q)$ defined as in (2.5). The posterior density is related to $L(Q|p)$ by

$$\pi(p|Q) = \frac{L(Q|p)\,\pi(p)}{\int L(Q|p)\,\pi(p)\,dp} \tag{2.9}$$
where $\pi(p)$, the prior density, reflects all information known about the distribution of the parameters before the data are taken into account. The prior density is generally based on probable ranges of parameters obtained from the biological literature, but it can also include heuristic terms that account for qualitative criteria known to be satisfied by the system, such as the number and stability of equilibrium points in the system, the long-term behavior of the trajectory, or the presence of oscillations in the trajectory [33]. The choice of the prior distribution plays an important role in Bayesian parameter estimation [11, 50], although that role diminishes as more data become available for parameter estimation. Some researchers argue for noninformative (also called objective) priors, which are flat relative to the likelihood function, but such priors are difficult to construct [59], while others argue for the use of subjective priors that express our belief in likely ranges of parameter values [8]. Jeffreys [39] discusses a uniform prior based on the Fisher information matrix that does not change much over the region in which the likelihood is significant and does not assume large values outside that range. In biological ODE models it is common practice to employ, for kinetic rate constants, priors that are uniform in the log space over biologically reasonable parameter ranges [13]. Such a choice does not affect the principles of Bayesian inference, only the sampling metric on the parameter space and the prior distributions.

The posterior distribution $\pi(p|Q)$ can be characterized by its mean $\bar{p}$ and a covariance matrix $C$, which, in the case of a quadratic cost function (2.7), is equal to $H^{-1}$. For more general distributions, $C^{-1}$ provides better information about the global parameter sensitivity of the model than $H$. This information can be obtained by eigenvalue analysis as described above.
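As a minimal illustration of the eigenvalue analysis described above, the following Python sketch splits a two-parameter Hessian into its stiff and soft directions. The matrix entries are hypothetical, and the closed-form 2×2 decomposition is only a stand-in for a general eigenvalue routine. Under the quadratic approximation (2.7), where the covariance equals $H^{-1}$, the quantity $1/\sqrt{\lambda}$ is the spread of the parameters along the corresponding eigenvector.

```python
import math

def eig_sym_2x2(h11, h12, h22):
    """Return [(eigenvalue, unit eigenvector), ...], largest eigenvalue first,
    for the symmetric matrix [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    theta = 0.5 * math.atan2(2.0 * h12, h11 - h22)  # angle of the principal axis
    stiff_dir = (math.cos(theta), math.sin(theta))
    soft_dir = (-math.sin(theta), math.cos(theta))
    return [(mean + radius, stiff_dir), (mean - radius, soft_dir)]

# Hypothetical Hessian of the cost 50*(p1 + p2)**2 + 0.005*(p1 - p2)**2:
# the sum p1 + p2 is tightly constrained, the difference p1 - p2 is not.
H = (100.01, 99.99, 100.01)
(stiff_val, stiff_vec), (soft_val, soft_vec) = eig_sym_2x2(*H)

for name, val, vec in (("stiff", stiff_val, stiff_vec),
                       ("soft", soft_val, soft_vec)):
    spread = 1.0 / math.sqrt(val)  # standard deviation along this direction
    print(f"{name}: eigenvalue={val:.4g}, "
          f"direction=({vec[0]:.3f}, {vec[1]:.3f}), spread={spread:.3g}")
```

In this toy case neither parameter is well determined on its own, yet their sum is, which is the situation described above in which most parameters contribute to a soft direction.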
2.1.3 Ensemble Model

Ensemble modeling replaces the uniquely parametrized model (2.2) with a collection of models. In the simplest case, the models in the ensemble all have identical structure (i.e., identical functions $f$, $g$) but different parameter values. The ensemble is then characterized by the probability density function $\rho(p)$. The most common, but not necessarily exclusive, interpretation of an ensemble model is one in which the model represents the response of a population of individuals and the parameter distribution represents the variability of parameters within that population. The solution of the ensemble version of the model (2.2) for any fixed $x_0$ and $u(\cdot)$ is a special type of stochastic process for which each sample trajectory $x(t; p)$ is a deterministic solution of (2.2) and the probability of the trajectory is equal to $\rho(p)$. The output $\hat{y}(t)$ of the ensemble model (2.2) is a time-dependent random variable with probability density

$$p_t(\hat{y})\,d\hat{y} = \int_{\Omega} \rho(p)\,dp \tag{2.10}$$

where $\Omega = \{p \mid \hat{y} < y(t; p) \le \hat{y} + d\hat{y}\}$. The likelihood of observing the value $\bar{y}_i$ as an output of the ensemble model at time $t_i$ is therefore

$$L_i(\bar{y}_i) = p_{t_i}(\bar{y}_i) \tag{2.11}$$
In this case, however, the densities $p_t(\cdot)$ are not independent and hence, in general, the likelihood of observing the data, $L(Q)$, is not equal to $\prod_i L_i(\bar{y}_i)$ (cf. (2.4)). Both the classical model with noisy measurement and the ensemble model result in probability distributions for data and therefore may be difficult to distinguish based on the observations alone. One of the main differences between the two models is that in the case of the classical model, the mean output values $\langle y_i \rangle = \int \eta\, L_i(\eta|p)\,d\eta$, with $L_i$ as in (2.3), all lie on a single trajectory of (2.2), i.e., $\langle y_i \rangle = y(t_i; p)$ for all $i$, while in the case of the ensemble model there is no single trajectory of (2.2) containing the mean values $\langle y_i \rangle = \int \eta\, L_i(\eta)\,d\eta$ with $L_i$ as in (2.11).

The ensemble model resembles a classical model with parameters estimated using Bayesian inference. Both are characterized by a distribution over the space of parameters, but there is an important distinction. The posterior distribution estimate for a classical model reflects the likelihood of a particular parameter value given the available data, while the parameter distribution of the ensemble model, $\rho(p)$, describes the frequency of occurrence of a particular parameter set in the ensemble and is clearly independent of the data. The posterior distribution depends on the data and contains additional information, such as the sensitivity of the model to parameters and the sensitivity of the cost function to the data. The posterior density will converge to unique parameter values if one collects a sufficiently large amount of data so as to average out the measurement error. On the other hand, no amount of data can average out the effect of the parameter distribution; it can only improve the accuracy of its estimate.
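The statement that the ensemble mean values need not lie on any single trajectory can be checked on a toy case. The sketch below is only an illustration under assumptions that are not part of the text: a hypothetical one-parameter output $y(t; p) = e^{-pt}$ and an ensemble of two equally likely parameter values. For each time it computes the single value of $p$ whose trajectory would pass through the ensemble mean; if the means lay on one trajectory, that value would not depend on time.

```python
import math

def output(t, p):
    """Output of a hypothetical one-parameter model, y(t; p) = exp(-p t)."""
    return math.exp(-p * t)

ensemble = [0.5, 2.0]  # two equally likely parameter values (assumed)

def ensemble_mean(t):
    """Mean output of the ensemble at time t."""
    return sum(output(t, p) for p in ensemble) / len(ensemble)

def implied_rate(t):
    """Value of p whose single trajectory passes through the ensemble mean at time t."""
    return -math.log(ensemble_mean(t)) / t

times = [1.0, 2.0, 4.0]
rates = [implied_rate(t) for t in times]
for t, r in zip(times, rates):
    print(f"t={t}: mean output={ensemble_mean(t):.4f}, implied p={r:.4f}")
```

The implied rate drifts toward the slowest-decaying member of the ensemble as time increases, so no single parameter value reproduces the mean outputs at all times.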
For the moment, however, it appears that Bayesian inference is the only readily available tool for estimating the probability $\rho(p)$. It is being used in that way with the hope that, with a sufficient amount of data, the influence of other factors will diminish and the Bayesian estimate $\pi(p|Q)$ will converge to $\rho(p)$, just as it converges to a localized density in the case of a classical model with a unique parameter value.

Brown and Sethna [13] have made several interesting observations about parameter distributions inferred for biological ensemble models with large numbers of parameters. They found that cost functions for such models have $H$ (or $C^{-1}$) with eigenvalues distributed almost uniformly (on the logarithmic scale) over a broad range of magnitudes, no matter how detailed the data used to fit the model are. They also noted that the eigenvectors corresponding to the eigenvalues generally include nontrivial components along multiple parameters. They termed such models "sloppy", which is meant to indicate that both the structure and the parameter values of the models cannot be determined with certainty; the behavior predicted by the ensemble models was nevertheless very well characterized. They also discuss the selection of an optimal model using Bayesian techniques.

Gutenkunst et al. [29] extended the work of [13] by analyzing a collection of biological models and finding that they all have the spectral distribution characteristic of sloppy models. They found that fitting even to a large amount of data leaves many parameters poorly determined. They also found that, contrary to their expectation, if a random perturbation is made to the parameters of the model, of magnitude much smaller than the variance given by the largest eigenvalue, then the fit of the model is worsened significantly. Thus, having poorly determined parameters does not mean that the parameters can take on any value, because their values are tightly correlated.
They suggest shifting the focus of investigation from parameter estimation to ensemble model predictions. Within the Bayesian framework, the posterior likelihood $\pi(R|Q)$ of observing additional output $R$ of the model given existing data $Q$ obeys [20]

$$\pi(R|Q) = \int p(R, p|Q)\,dp = \int L(R|p)\,\pi(p|Q)\,dp \tag{2.12}$$

Of course, in ensemble modeling one is not restricted to using just one model structure. In the case of a finite number of distinct models $M_1, M_2, \ldots, M_m$, the likelihood of the model $i$ given data $Q$ is given by the marginal posterior probability $P(M_i|Q)$:

$$P(M_i|Q) = \frac{g(Q|M_i)\,P(M_i)}{\sum_{j} g(Q|M_j)\,P(M_j)} \tag{2.13}$$

where $P(M_i)$ is the prior probability expressing our belief in the accuracy of the model $i$, and $g(Q|M_i)$ can be written as

$$g(Q|M_i) = \int L(Q|p, M_i)\,\pi(p|M_i)\,dp \tag{2.14}$$

Ensemble models can be extended to include models of different types via weighted averaging with weights corresponding to their posterior probabilities [15, 23].
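The weighting in (2.13) and the averaging over model types can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of [15, 23]: the evidences $g(Q|M_i)$ are assumed to be available already as logarithms (they are typically very small numbers), and the numerical values and the function names `model_posteriors` and `averaged_prediction` are hypothetical.

```python
import math

def model_posteriors(log_evidences, priors):
    """P(M_i|Q) from log g(Q|M_i) and P(M_i), following (2.13).
    The largest log term is subtracted before exponentiating to avoid underflow."""
    log_terms = [le + math.log(pr) for le, pr in zip(log_evidences, priors)]
    shift = max(log_terms)
    weights = [math.exp(lt - shift) for lt in log_terms]
    total = sum(weights)
    return [w / total for w in weights]

def averaged_prediction(predictions, posteriors):
    """Model-averaged prediction, weighted by the posterior model probabilities."""
    return sum(w * y for w, y in zip(posteriors, predictions))

# Three hypothetical models with equal prior probability.
log_g = [-10.0, -12.0, -11.0]
prior = [1.0 / 3.0] * 3
post = model_posteriors(log_g, prior)
print([round(w, 4) for w in post])
print(averaged_prediction([1.0, 2.0, 3.0], post))
```

With equal priors the weights depend only on the differences between the log evidences, so a model whose log evidence is lower by 2 receives a weight smaller by the factor $e^{-2}$.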
2.1.4 Computational Techniques

The posterior probability density $\pi(p|Q)$ can be computed by a variety of methods. The most efficient and generally applicable method appears to be Markov chain Monte Carlo (MCMC) sampling [20, 24], which is based on the Metropolis–Hastings algorithm [31, 53]. Originally, the MCMC method was designed in computational physics to sample the Boltzmann distribution $\pi(p) \propto \exp(-\Phi(p))$ for the states of a system with potential $\Phi(p)$, but the method can be adapted to sample any distribution. MCMC constructs a collection of points $p^1, p^2, \ldots$ as a trajectory of a Markov chain constructed so that its limiting distribution is equal to $\pi(p|Q)$. The algorithm, here adapted to the computation of the Bayesian posterior density (2.9), works as follows:

(i) Initialize $p^k$, $k = 1$
(ii) Sample $\hat{p}$ from a proposal distribution $q(\hat{p}|p^k)$
(iii) Solve the ODE system (2.2) and compute the unnormalized posterior $\pi(\hat{p}|Q) \propto L(Q|\hat{p})\,\pi(\hat{p})$ (the normalizing constant in (2.9) cancels in step (iv))
(iv) Set $p^{k+1} = \hat{p}$ with probability $P = \min\left\{1, \dfrac{\pi(\hat{p}|Q)\,q(p^k|\hat{p})}{\pi(p^k|Q)\,q(\hat{p}|p^k)}\right\}$, or $p^{k+1} = p^k$ with probability $1 - P$
(v) Increment $k$ by 1, go to (ii)

Because step (iv) contains the ratio $q(p^k|\hat{p})/q(\hat{p}|p^k)$, the choice of the proposal distribution $q$ does not affect the limiting distribution of $p^k$, provided $q$ allows the chain to reach the whole support of the posterior. If $q$ is symmetric, i.e., $q(a|b) = q(b|a)$, this ratio equals 1. The algorithm then implies that $\hat{p}$ is accepted with certainty when $\pi(\hat{p}|Q) \ge \pi(p^k|Q)$, and that there is a chance that $\hat{p}$ will be accepted even if $\pi(\hat{p}|Q) < \pi(p^k|Q)$, but that chance decreases with decreasing $\pi(\hat{p}|Q)$. This mechanism allows the chain to escape from local maxima of $\pi(p|Q)$. The variance of the proposal distribution $q$ should be chosen so that the acceptance probability $P$ in step (iv) is on average about 25%. Higher variance leads to larger proportions of rejections and a waste of computing time, while lower variance results in small distances between $\hat{p}$ and $p^k$ and leads to inefficient sampling of the parameter space. The distribution $q(\hat{p}|p^k)$ is usually chosen to be a multivariate Gaussian distribution with mean $p^k$.

As $k \to \infty$, the distribution of points $p^k$ approaches $\pi(p|Q)$. The sample of points $p^1, \ldots, p^m$ can be used to estimate the ensemble average of any trajectory-dependent quantity $G$ as

$$\langle G(t) \rangle \cong \frac{1}{m} \sum_{i=1}^{m} G(x(t; p^i)) \tag{2.15}$$
and the percentile value $P_X(G)$ as the smallest number that is larger than $X\%$ of the values of $\{G(x(t; p^i))\}_{i=1}^{m}$. There are technical issues that need to be addressed when applying MCMC to Bayesian computation [65]. The most important problem is how to decide whether
the Markov chain generated by MCMC has converged to the stationary distribution. A number of statistical tests can be utilized to test the convergence and mixing of the chain [16]. A commonly used test of Gelman and Rubin [21] compares the mean values of parameters of two chains running in parallel. While coding of MCMC and convergence tests may be a hurdle, it can be avoided by using one of several packages for the analysis of biological dynamical systems that include Bayesian inference, for example, ABC-SysBio [48], Synbioss [34], BioBayes [76], or KINSOLVER [1].

MCMC techniques can be generalized to include multiple model types. One possibility is to enlarge the parameter space to include the model as an additional unknown and allow jumps between the models (reversible jump MCMC, or RJMCMC for short). A prior distribution over the model space must be specified, but with a good choice of the jumps the number of models need not be specified in advance [26, 64]. An alternative approach is the birth and death MCMC (BDMCMC), in which the time between jumps to a model of different dimension is determined by a rate constant and moves between models are always accepted; the probability of a model is determined by the length of time MCMC spends at that model [67].

Greater efficiency of MCMC computations can be achieved using the technique of parallel tempering, which results in a more thorough exploration of the parameter space and ultimately faster convergence of the chain [30]. Unlike MCMC, which consists of a single Markov chain, the parallel tempering algorithm generates samples $p^{k,i}$ of multiple Markov chains with limiting probabilities $\pi_i(p) \propto \exp(-\beta_i \Phi(p)) \propto \pi(p)^{\beta_i}$ evaluated for different values $\beta_i$ of a new parameter $\beta$. The origin of this technique is again in statistical physics, where $\Phi$ represents the potential of the system and $\beta$ is proportional to the reciprocal of the temperature at which the system is observed.
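Both the single-chain algorithm of steps (i) to (v) and each tempered chain rest on the same Metropolis–Hastings update, applied to a target proportional to $\pi(p|Q)^{\beta}$ (with $\beta = 1$ for the ordinary posterior). The following Python sketch shows that update for a symmetric Gaussian random-walk proposal. It is a minimal illustration under stated assumptions: the function name `metropolis_hastings`, the use of log densities, and the standard normal stand-in for the posterior are all hypothetical, and in a real application each call to `log_post` would involve a solve of the ODE system (2.2).

```python
import math
import random

def metropolis_hastings(log_post, p0, step, n_samples, beta=1.0, rng=None):
    """Random-walk Metropolis sampler for a density proportional to
    exp(beta * log_post(p)). Returns the samples and the acceptance rate."""
    rng = rng or random.Random(0)
    p = list(p0)
    lp = log_post(p)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Step (ii): symmetric Gaussian proposal centred at the current point.
        cand = [x + rng.gauss(0.0, step) for x in p]
        # Step (iii): evaluate the (unnormalized) log posterior of the candidate.
        lp_cand = log_post(cand)
        # Step (iv): accept with probability min(1, (pi(cand)/pi(p))**beta).
        log_ratio = beta * (lp_cand - lp)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            p, lp = cand, lp_cand
            accepted += 1
        samples.append(list(p))
    return samples, accepted / n_samples

# Stand-in target: a standard normal density in one parameter (assumed).
def toy_log_post(p):
    return -0.5 * p[0] ** 2

samples, acc_rate = metropolis_hastings(toy_log_post, [3.0], step=2.4,
                                        n_samples=20000)
kept = [s[0] for s in samples[2000:]]  # discard the initial transient
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
print(f"acceptance={acc_rate:.2f}, mean={mean:.3f}, variance={var:.3f}")
```

Working with logarithms avoids underflow when the likelihood is a product of many small factors; passing `beta < 1` flattens the target in the same way as a high-temperature chain does.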
At low $\beta$ (i.e., high temperature), the potential differences between any two states of the system are smaller and hence the chain can have a bigger step size and explore a larger region of the state space. At high $\beta$ (i.e., low temperature), the distribution becomes more focused in small regions [19]. The chains are evolved independently and at regular intervals the parameter sets $p^{k,i}$ and $p^{k,i+1}$ for two neighboring values $\beta_i$ and $\beta_{i+1}$ are swapped with probability

$$\hat{P} = \min\left\{1, \left(\frac{\pi(p^{k,i+1}|Q)}{\pi(p^{k,i}|Q)}\right)^{\beta_i - \beta_{i+1}}\right\} \tag{2.16}$$
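The swap probability (2.16) is conveniently evaluated from log posteriors. The short Python sketch below shows one way to do so; the function name `swap_probability` and the numerical values in the usage lines are illustrative assumptions.

```python
import math

def swap_probability(log_post_i, log_post_next, beta_i, beta_next):
    """Swap probability (2.16) for chains i and i+1, computed from the log
    posteriors of their current states:
    min{1, (pi(p^{k,i+1}|Q) / pi(p^{k,i}|Q)) ** (beta_i - beta_next)}."""
    log_ratio = (beta_i - beta_next) * (log_post_next - log_post_i)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

# The hotter chain (beta = 0.33) holds the more probable state: always swap.
p_up = swap_probability(-5.0, -2.0, beta_i=1.0, beta_next=0.33)
# The colder chain already holds the more probable state: swap only sometimes.
p_down = swap_probability(-2.0, -5.0, beta_i=1.0, beta_next=0.33)
print(p_up, round(p_down, 4))
```

Because $\beta_i > \beta_{i+1}$, a swap that moves the more probable parameter set to the colder chain is always accepted, while the reverse swap is accepted with a probability that shrinks as the spacing between the two values of $\beta$ grows.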
Swapping of the parameter values allows the region surrounding a newly discovered high-probability parameter set to be explored by a low-temperature chain. The modified algorithm becomes

(i) Initialize $p^{k,i}$ and $\beta_i$, $k = 1$, $i = 1, \ldots, C$.
(ii) For each $i = 1, \ldots, C$:
    Sample $\hat{p}^i$ from a proposal distribution $q_i(\hat{p}^i|p^{k,i})$
    Solve the ODE system (2.2) with $p = \hat{p}^i$ and compute $\pi(\hat{p}^i|Q) \propto L(Q|\hat{p}^i)\,\pi(\hat{p}^i)$
    Set $p^{k+1,i} = \hat{p}^i$ with probability $P = \min\left\{1, \dfrac{\pi(\hat{p}^i|Q)^{\beta_i}\,q_i(p^{k,i}|\hat{p}^i)}{\pi(p^{k,i}|Q)^{\beta_i}\,q_i(\hat{p}^i|p^{k,i})}\right\}$, or $p^{k+1,i} = p^{k,i}$ with probability $1 - P$
(iii) For each $i = 1, \ldots, C - 1$ swap $p^{k,i}$ and $p^{k,i+1}$ with probability $\hat{P}$ given by (2.16)
(iv) Increment $k$ by 1, go to (ii)

The main benefit of the parallel tempering algorithm is to improve mixing by allowing the low-temperature chain (the primary chain of interest) to escape energy barriers and converge faster to a stationary probability distribution. The values of $\beta_i$ should be chosen so that $1 = \beta_1 > \beta_2 > \cdots > \beta_C$. In the computational physics literature there is an ongoing discussion about the optimal strategy for choosing $\beta_i$ for various problem types [41, 46, 58, 72]. A rule of thumb is that the $\beta_i$ should be spaced so that the swapping probability is on average about 20%. Too closely spaced $\beta_i$ result in an inefficient sampling of the space by the highest-temperature chain. Too widely spaced $\beta_i$ result in insufficient communication between the chains. For any choice of $\beta_i$, the variances of the proposal distributions $q_i$ must be adjusted so that the acceptance of the proposed value in step (ii) is 25% on average for each chain. Only the chain $p^{k,1}$ corresponding to $\beta_1 = 1$ samples the posterior distribution $\pi(p|Q)$.

MCMC with the included parameter $\beta$ can be used to find the maximum likelihood parameter set $p^*$ by using a simulated annealing algorithm [7], in which the value of $\beta$ is gradually increased during the simulation and hence the limiting distribution becomes narrower and localized near $p^*$.

Approximate Bayesian computation (ABC) has become popular recently due to its computational efficiency, especially in cases in which the likelihood function cannot be written out explicitly (for example, when no standard deviations are given for the observed data or because we require precise agreement between the simulated and observed data) [5, 51, 62, 71]. ABC presents an alternative representation of the posterior distribution $\pi(p|Q)$.
The output of the ABC algorithm is a sample from the distribution $\pi(p \mid d(Q^*, Q) \le \epsilon)$, where $d(Q^*, Q)$ denotes the distance between the experimental data $Q$ and the simulated data $Q^*$ corresponding to the parameter $p$. When $\epsilon$ is small, the distribution $\pi(p \mid d(Q^*, Q) \le \epsilon)$ is a good approximation to the distribution $\pi(p|Q)$. Toni et al. [71] proposed a novel approximate Bayesian computation method for evaluating posterior distributions for ODE models based on the sequential Monte Carlo (SMC) method, in which a fixed number of sampled parameter values are propagated through a sequence of intermediate distributions until they represent a sample from the target distribution. They show that ABC SMC performs better than the traditional ABC approach. Busetto and Buhmann [14] proposed a method for Bayesian parameter estimation based on a new stable resampling technique for the sequential Monte Carlo algorithm. They argue that this technique overcomes
some drawbacks of classical SMC methods such as the lack of stability and sample degeneracy.
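The basic idea of ABC can be illustrated with the simplest rejection variant, which is not the ABC SMC method of [71]: parameters are drawn from the prior, data are simulated, and a parameter is kept only if the simulated data lie within distance $\epsilon$ of the observations. The Python sketch below assumes a hypothetical one-parameter model $y(t; p) = e^{-pt}$, a uniform prior on $(0, 3)$, and the maximum absolute difference as the distance $d$; all names are illustrative.

```python
import math
import random

def abc_rejection(simulate, distance, observed, sample_prior, eps,
                  n_accept, rng, max_tries=100000):
    """Keep prior draws p whose simulated data satisfy d(Q*, Q) <= eps."""
    accepted, tries = [], 0
    while len(accepted) < n_accept and tries < max_tries:
        tries += 1
        p = sample_prior(rng)
        if distance(simulate(p), observed) <= eps:
            accepted.append(p)
    return accepted, tries

T_OBS = (1.0, 2.0, 3.0)

def simulate(p):
    """Simulated data Q* for the hypothetical model y(t; p) = exp(-p t)."""
    return [math.exp(-p * t) for t in T_OBS]

def distance(q_sim, q_obs):
    """Maximum absolute difference between simulated and observed data."""
    return max(abs(a - b) for a, b in zip(q_sim, q_obs))

rng = random.Random(0)
observed = simulate(1.0)  # synthetic "data" generated with p = 1 (assumed)
accepted, tries = abc_rejection(simulate, distance, observed,
                                lambda r: r.uniform(0.0, 3.0),
                                eps=0.02, n_accept=200, rng=rng)
print(f"accepted {len(accepted)} of {tries} draws, "
      f"mean p = {sum(accepted) / len(accepted):.3f}")
```

No likelihood is evaluated anywhere in the loop. The price is the large number of rejected draws when $\epsilon$ is small, which is the inefficiency that sequential schemes such as ABC SMC are designed to reduce.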
2.1.5 Application to Viral Infection Dynamics

In this section we give a simple example of the construction and utilization of an ensemble model. Consider the following basic nonlinear model of an acute viral infection [55]:

$$\dot{V} = pI - cV \tag{2.17}$$
$$\dot{H} = -\beta H V \tag{2.18}$$
$$\dot{I} = \beta H V - \delta I \tag{2.19}$$
where $V$ is the concentration of viable virus particles, $H$ is the number of uninfected target cells, and $I$ is the number of infected cells. The virus particles interact with uninfected target cells, which become infected at a rate $\beta H V$ (here the parameter $\beta$ is not to be confused with the inverse temperatures $\beta_i$). Free virus particles are cleared at a rate of $c$ per day. The infected cells increase viral concentration at a rate of $p$ per cell and die at a rate of $\delta$ per day. The initial conditions for the model are $(V, H, I)(0) = (V_0, H_0, 0)$. Baccam et al. [4] discuss the application of this model to modeling influenza A virus infection and calibrate it with virus titer data for individual human subjects. The model can be readily extended to an ensemble model by assuming that the parameters are taken from a distribution; such a model can then be applied to cases in which the data are collected and combined for a group of subjects, such as in the study of the dynamics of virus infection in humans by Hayden et al. [32]. The study provides virus titer measurements from nasal wash, expressed in TCID50 per ml of the wash, for a group of 26 human volunteers inoculated intranasally with influenza A/Texas/91 (H1N1). The combined data are shown in Table 2.1.

Table 2.1. Averaged virus titer data for a group of human volunteers inoculated intranasally with influenza A/Texas/91 (H1N1) [32].

t (days)                          1     2     3     4     5     6     7     8
log10 V (log10(TCID50/ml))     0.95  2.67  2.67  1.90  0.81  0.65  0.48  0.30
σ (log10(TCID50/ml))           0.62  0.94  0.94  0.67  0.52  0.42  0.47  0.21
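For readers who wish to experiment with the model (2.17)–(2.19), the following Python sketch integrates it for a single parameter set with a fixed-step fourth-order Runge–Kutta scheme. It is only a sketch: the parameter values in the usage lines are illustrative assumptions rather than fitted values, and the integrator, the step size, and the function names `rhs` and `simulate` are choices made here for self-containment.

```python
import math

def rhs(V, H, I, p, c, beta, delta):
    """Right-hand sides of (2.17)-(2.19)."""
    infection = beta * H * V
    return p * I - c * V, -infection, infection - delta * I

def simulate(V0, H0, p, c, beta, delta, t_end=8.0, dt=0.005):
    """Fixed-step RK4 integration from (V, H, I)(0) = (V0, H0, 0).
    Returns lists of t, V, H and I."""
    t, state = 0.0, (V0, H0, 0.0)
    ts, Vs, Hs, Is = [t], [V0], [H0], [0.0]
    for _ in range(round(t_end / dt)):
        k1 = rhs(*state, p, c, beta, delta)
        k2 = rhs(*(s + 0.5 * dt * k for s, k in zip(state, k1)), p, c, beta, delta)
        k3 = rhs(*(s + 0.5 * dt * k for s, k in zip(state, k2)), p, c, beta, delta)
        k4 = rhs(*(s + dt * k for s, k in zip(state, k3)), p, c, beta, delta)
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * g + d)
                      for s, a, b, g, d in zip(state, k1, k2, k3, k4))
        t += dt
        ts.append(t)
        Vs.append(state[0])
        Hs.append(state[1])
        Is.append(state[2])
    return ts, Vs, Hs, Is

# Illustrative (assumed) parameter values, not a fit to Table 2.1.
ts, Vs, Hs, Is = simulate(V0=0.25, H0=4e8, p=0.014, c=3.2,
                          beta=2.7e-5, delta=3.2)
peak = max(range(len(Vs)), key=lambda k: Vs[k])
print(f"peak log10 V = {math.log10(Vs[peak]):.2f} at t = {ts[peak]:.2f} days")
print(f"uninfected cells remaining at day 8: {Hs[-1]:.3g}")
```

Since $\dot{H} + \dot{I} = -\delta I \le 0$, the total number of cells $H + I$ can never exceed $H_0$, which provides a simple sanity check on any numerical solution.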
Since the initial level of the virus, $V_0$, is not known in that study, it is included in the list of parameters characterizing the model. The initial number $H_0$ of uninfected target cells is estimated at $4 \times 10^8$. The posterior distribution for $p = (\log_{10} V_0, \log_{10} p, \log_{10} c, \log_{10} \beta, \log_{10} \delta)$ can be obtained using the MCMC algorithm with parallel tempering as described in Section 2.1.4, by sampling 3 chains. For the prior distribution $\pi(p)$ we chose a product of uniform distributions for $p$ of width 2 centered at the baseline values $\bar{p} = \log_{10}$(0.25 TCID50/ml, 0.014 (TCID50/ml)$^{-1}$, $2.7 \times 10^{-5}$ (TCID50/ml) day$^{-1}$, 3.2 day$^{-1}$, 3.2 day$^{-1}$), and for the proposal distributions $q_i$ we chose uncorrelated multivariate Gaussian distributions for $\ln p$ with variance matrices $\Sigma_i = \sigma_i^2 I$, i.e.,

$$\pi(p) = \begin{cases} 1/2^5 & p_i \in (\bar{p}_i - 1, \bar{p}_i + 1) \text{ for all } i \\ 0 & p_i \notin (\bar{p}_i - 1, \bar{p}_i + 1) \text{ for some } i \end{cases} \tag{2.20}$$

$$q_i(\hat{p}|p) = (2\pi\sigma_i^2)^{-5/2} \exp\left(-\sum_{j=1}^{5} (\hat{p}_j - p_j)^2 / (2\sigma_i^2)\right) \tag{2.21}$$
(Note that the uniform prior on $p$ corresponds to the uninformative Jeffreys prior on the original positive parameters.) For the likelihood function we use $L(Q|p) = \exp(-E(p, Q))$ with $E(p, Q)$ as in (2.5), with the output variable $y = \log_{10} V$, and with data as in Table 2.1. The sample size obtained in this example is 900,000 parameter sets. Reasonable acceptance and swapping ratios were obtained for $(\beta_1, \beta_2, \beta_3) = (1, 0.33, 0.11)$ and $(\sigma_1, \sigma_2, \sigma_3) = (0.087, 0.15, 0.26)$.

Marginal histograms of the posterior distribution $\pi(p|Q)$ show that the data contain no information about the initial value $V_0$ (see Figure 2.1 (a)). The coefficient of infectivity $\beta$ is localized at the center of the prior range, while the value of $p$ gravitates toward the lower end of the range. There is an interesting bimodality in the marginal distributions for the two degradation rates, $c$ and $\delta$. Further information about the distribution can be obtained from the correlation plots in Figure 2.1 (b), which show that there is a negative correlation between $\log_{10} \beta$ and $\log_{10} p$ and that the sample can be split into two clusters in the projection onto the $(c, \delta)$ plane. It appears that each cluster is focused around a local maximum of the likelihood function. The ensemble trajectories in Figure 2.1 (c) show that the model has trouble following the prescribed data (it does not reach the peak value of $V$ and the decay is linear on the logarithmic scale) but that the variance of the virus ($V$) and uninfected target cell ($H$) trajectories is rather low. On the other hand, the variance over the ensemble in the trajectory of infected cells ($I$) is rather high. To investigate the matter further, we examine the 500 best fitting trajectories (those from the sample with the largest values of $L(Q|p)$). These trajectories and the histograms of parameters for these trajectories are shown in Figure 2.2. Clearly, the trajectories are almost indistinguishable in $V$ and $H$, but they differ significantly in $I$.
The trajectories with a large maximum value of $I$ correspond to the histograms shown in cyan and are characterized by a well-defined value of $\delta$ and a poorly defined $c$. The trajectories with a small maximum $I$ correspond to the histograms shown in red and are characterized by a well-defined $c$ and a poorly defined $\delta$. In the language of classical modeling, the data can be fitted equally well with multiple parameter sets that fall under two distinct categories. If the parameters were computed by maximization of the likelihood function, it is likely that one or the other optimum would have been missed.

[Figure 2.1: (a) histograms of log10 V0, log10 β, log10 p, log10 c, and log10 δ; (b) pairwise correlation plots of log10 β, log10 p, log10 c, and log10 δ; (c) log10 V, H (×10^8), and I (×10^8) versus time in days (0 to 8).]
Figure 2.1. (a) Marginal distributions and (b) correlation plots for the posterior distribution, and (c) probabilistic trajectory prediction for the ensemble model obtained using (2.15). At each time point, the solid curve shows the median value of the variable over the ensemble, dark gray shows the 50th percentile range, and light gray the 90th percentile range. The data of [32], used to compute the posterior distribution, are shown as triangles.

[Figure 2.2: (a) histograms of log10 V0, log10 β, log10 p, log10 c, and log10 δ; (b) log10 V, H (×10^8), and I (×10^8) versus time in days (0 to 8).]
Figure 2.2. (a) Marginal distributions for two locally optimal clusters of parameters and (b) trajectories corresponding to the clusters. The trajectories with low maximum I are for the red distributions.

Consider now the situation, also studied by Hayden et al. [32], in which an antiviral treatment is initiated about 30 hours after the initial infection. Specifically, suppose that the subjects are administered a neuraminidase inhibitor, such as zanamivir or oseltamivir, which blocks the function of the neuraminidase protein and prevents the virus from budding from the host cell [28]. In the ensemble model this effect can be accounted for by lowering the value of the parameter $p$, which describes the production of new virus particles, to 1/30th of its starting value when $t > 1.25$ days for each parameter set in the sample. The resulting trajectories in Figure 2.3 show a significant decrease of virus levels after the administration of the treatment, with larger variance observed for the predicted virus level after treatment than the variances observed for the original model. The model predictions agree well with the observed data, considering that $\log_{10} V = 0$ is the detection limit of the experiment. The other two variables also show larger variance, especially the trajectory of the uninfected target cells $H$. Although the trajectory of $H$ was well constrained in the original model, the model does not allow us to draw any conclusion about the number of uninfected cells after the treatment. This result, which would have been completely missed with any uniquely parametrized model, further illustrates the benefits of ensemble modeling.

[Figure 2.3: log10 V, H (×10^8), and I (×10^8) versus time in days (0 to 8).]
Figure 2.3. Prediction of the distribution of trajectories for the ensemble treated with neuraminidase inhibitor 30 hrs post infection. The data from [32] are shown as solid circles. The detection limit for log10 V is 0.
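The treatment experiment can be mimicked in a few lines of Python. The sketch below is an illustration under strong assumptions: instead of the MCMC sample it uses a small hypothetical ensemble obtained by perturbing illustrative baseline values uniformly in log space, it integrates (2.17)–(2.19) with a fixed-step Runge–Kutta scheme, and it reduces $p$ to 1/30th of its value for $t > 1.25$ days. The percentile follows the definition of $P_X(G)$ given in Section 2.1.4; all function names are hypothetical.

```python
import math
import random

def log10_V(t_obs, V0, p, c, beta, delta, H0=4e8, treat_time=None, dt=0.005):
    """log10 V(t_obs) for (2.17)-(2.19); p becomes p/30 for t > treat_time."""
    def rhs(t, s):
        V, H, I = s
        p_eff = p / 30.0 if treat_time is not None and t > treat_time else p
        return (p_eff * I - c * V, -beta * H * V, beta * H * V - delta * I)

    def shifted(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))

    t, s = 0.0, (V0, H0, 0.0)
    for _ in range(round(t_obs / dt)):  # classical RK4 steps
        k1 = rhs(t, s)
        k2 = rhs(t + dt / 2, shifted(s, k1, dt / 2))
        k3 = rhs(t + dt / 2, shifted(s, k2, dt / 2))
        k4 = rhs(t + dt, shifted(s, k3, dt))
        s = tuple(x + dt / 6 * (a + 2 * b + 2 * g + d)
                  for x, a, b, g, d in zip(s, k1, k2, k3, k4))
        t += dt
    return math.log10(s[0])

def percentile(values, X):
    """Smallest sample value that is larger than X% of the values."""
    s = sorted(values)
    return s[min(math.ceil(X / 100.0 * len(s)), len(s) - 1)]

# Hypothetical ensemble: baseline values perturbed by up to 0.1 in log10 space.
rng = random.Random(1)
baseline = dict(V0=0.25, p=0.014, c=3.2, beta=2.7e-5, delta=3.2)
ensemble = [{k: v * 10 ** rng.uniform(-0.1, 0.1) for k, v in baseline.items()}
            for _ in range(40)]

untreated = [log10_V(3.0, **ps) for ps in ensemble]
treated = [log10_V(3.0, treat_time=1.25, **ps) for ps in ensemble]
for name, vals in (("untreated", untreated), ("treated", treated)):
    print(name, [round(percentile(vals, X), 2) for X in (5, 50, 95)])
```

The same loop over parameter sets, applied to a posterior sample instead of the perturbed baseline, would yield the median and percentile bands of the kind shown in the figures.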
2.1.6 Ensemble Models in Biology

The number of papers utilizing ensemble modeling in biology is growing steadily, mostly in the area of genetic and biochemical networks. For example, Battogtokh et al. [6] utilized the principles of ensemble modeling in a study of a chemical reaction network for the regulation of the quinic acid (qa) gene cluster of Neurospora crassa. Their model consists of 14 differential equations for the mRNA and protein concentrations of 7 genes. The unknowns are 14 initial concentrations of both mRNA and proteins and 25 rate constants, which were fitted to a total of 42 data points obtained for mRNA concentrations. They pointed out that the poor constraining of parameters may be improved by collecting additional data for protein concentrations, which were predicted by the model with broad distributions.

Putter et al. [63] used a Bayesian approach to estimate the parameters of the HIV model of Griffith, May and Nowak. They argue for the use of Bayesian models because the data are collected from two compartments, one of which is subject to censoring, and random effects in one variable are assumed to be from a beta distribution. They use a mix of informative (Gaussian in the log space) and non-informative prior distributions and estimate the pre-treatment and post-treatment reproductive ratios for the virus.

Brown et al. [12] used the ensemble technique to study a genetic network modeling the action of NGF and mitogenic epidermal growth factor (EGF) in rat pheochromocytoma (PC12) cells. They predict the influence of specific signaling modules in
determining the integrated cellular response to the two growth factors. They note that only a small fraction of parameter combinations are well constrained and most parameters (rate constants) vary over huge ranges. However, the few well-constrained parameters reveal critical features of the network that generates the appropriate output.

Kuepfer et al. [47] utilize ensemble modeling in the analysis of signaling networks. As opposed to the traditional approach, they vary not only the parameters of the model but also the model structure by including or excluding particular interaction terms. They develop a library of alternative dynamical models for the TOR pathway of S. cerevisiae. They perform comparisons of the models using Bayesian methods and point out that significant information about the pathway can be extracted using this procedure even with highly uncertain biological mechanisms and few quantitative experimental data.

Zenker, Rubin, and Clermont [78] present an example of the application of ensemble modeling and Bayesian inference to a disease diagnostic process. The output of the technique, based on a simplified model of hypotension and patient-specific clinical observation, is shown to produce a multi-modal posterior density function whose peaks correspond to clinically relevant differential diagnoses. These can be constrained to a single diagnosis using additional observations from dynamical interventions.

Daun et al. [17] have employed an ensemble model to study the acute inflammatory response to bacterial LPS in rats. The model they construct is not based on Bayesian inference but is rather constructed by finding the maximum likelihood optimal set of parameters starting from different initial guesses. Due to the nonuniqueness of the maximum, a sample of 103 points that fit the data equally well was obtained.
Sensitivity analysis was used to reduce the 46-parameter space of the 8-state model to 18 parameters that were shown to capture satisfactorily the essential dynamical properties of the full model.

Jayawardhana et al. [36] employed Bayesian inference to study the variability of steady states of metabolic pathways. The pathways are modeled using ODE systems; however, no data are available about the dynamics of these pathways, only about their steady states. The authors have adapted the ensemble technique to estimate the parameter distributions and steady state concentrations in the ensemble model based on the observed steady states of some state variables.

Klinke [45] uses a Bayesian approach to calibrate a complex model of early EGF signaling comprised of 35 nonlinear ODEs. Using experimentally determined dissociation constants, he reduced the model parameters to 28 kinetic parameters and 6 initial concentrations, and determined maximum likelihood estimates for unknown parameters from experimental data using a simulated annealing optimization algorithm. The observed significant covariance between specific parameters and a broad range of variance is characteristic of a sloppy model.

Jaeger and Lambert [35] present a systematic study of Bayesian estimation of parameters for linear ODE systems based on an expansion of the solutions of the ODE system in B-splines, whose coefficients are adjusted to satisfy the system of ODEs. The benefit
of such an expansion, compared to the usual numerical integration, is that it is much less time consuming, independent of the inherent drawbacks associated with numerical integration, and that the posterior distribution can be given in a closed form. The B-spline approach also has its problems, as a poor spline fit can lead to poor parameter estimates. Luan et al. [49] report on how ensemble models can be used to characterize the response of a biological system to therapeutic interventions, using the example of the response of the human coagulation cascade to recombinant factor VIIa and prothrombin additions in normal and hemophilic plasma. They develop an ensemble of human coagulation models consisting of 193 state variables (protein concentrations) with 467 unknown parameters. The ensemble model in this work has not been created using Bayesian methods but rather by repeated energy minimization, which led to different optima being found because the minimum is not unique. Ensemble techniques have also been employed in the study of human immune response to influenza A virus infection [60, 61] and in a study of vocal fold inflammation [69]. The influenza model [60, 61] consists of 20 nonlinear differential equations with 90 parameters and was calibrated with a collection of FACS, luminex, and ELISA assay data obtained for mice infected with the virus at two levels of the inoculum, leading to either lethal or sublethal outcome. As each data point was collected from a different mouse, ensemble modeling was essential for describing the parameter variability over the population. Using multi-objective fitting, a parameter distribution was constructed that explained both types of trajectory paths. The study of vocal fold inflammation [69] contains a 4-variable model with 17 constants that is calibrated with sparse data. The ensemble modeling approach made it possible to make probabilistic predictions about the effect of treatment strategies on the outcome of inflammation.
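The multi-start construction described for [17] and [49] can be illustrated on a toy problem. The sketch below is not the authors' code. It assumes a hypothetical two-parameter model y(t) = a exp(-b t) observed at a single time point, so that the best fit is not unique, and it collects the optima reached by plain gradient descent from random initial guesses. All names (model, cost, fit_from, build_ensemble) and all numerical values are illustrative assumptions.

```python
import math
import random

T_OBS, Y_OBS = 1.0, 1.0  # single hypothetical observation y(T_OBS) = Y_OBS


def model(a, b, t):
    """Toy model y(t) = a * exp(-b * t); illustrative only."""
    return a * math.exp(-b * t)


def cost(a, b):
    """Squared misfit between the model and the single observation."""
    return (model(a, b, T_OBS) - Y_OBS) ** 2


def fit_from(a, b, step=0.05, iters=5000):
    """Plain gradient descent on the cost, started from the guess (a, b)."""
    for _ in range(iters):
        e = math.exp(-b * T_OBS)
        r = a * e - Y_OBS                    # residual
        grad_a = 2.0 * r * e                 # d cost / d a
        grad_b = -2.0 * r * a * T_OBS * e    # d cost / d b
        a, b = a - step * grad_a, b - step * grad_b
    return a, b


def build_ensemble(n_starts=50, tol=1e-8, seed=0):
    """Collect the optima found from random starts that fit the data well."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_starts):
        a0, b0 = rng.uniform(0.5, 3.0), rng.uniform(0.0, 1.0)
        a, b = fit_from(a0, b0)
        if cost(a, b) < tol:
            ensemble.append((a, b))
    return ensemble


ensemble = build_ensemble()
spread_b = max(b for _, b in ensemble) - min(b for _, b in ensemble)
print(len(ensemble), "members; spread of b across the ensemble:", spread_b)
```

Every member reproduces the observation, yet the members differ in (a, b). This is the sense in which a non-unique optimum yields an ensemble rather than a single parameter set.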
2.1.7 Conclusions
Notwithstanding the difficulties connected with implementation of ensemble models and parameter estimation techniques, it is clear that such models have their place in theoretical biology. They are easy to construct as straightforward extensions of existing stochastic or deterministic models, they are applicable to situations with small or extensive amounts of data, and they provide probabilistically supported predictions of model behavior with clearly identified implications of all model assumptions. Of course, in the growing area of ensemble modeling there are still many problems to be resolved and techniques to be developed. The utility and applicability of ensemble models depend on how accurately one is able to determine the distribution of the parameters $p$ from the available data. Bayesian inference can provide an estimate of that distribution; however, the posterior distribution of $p$ conditional on $Q$ contains not only information about the parameter variability, but also about parameter sensitivity, and the accuracy and completeness of data. Further analytical work is needed to find methods for deconvoluting the posterior distribution into individual components, provide
convergence proofs, if possible, and criteria for estimating the amount of data needed to achieve accurate prediction of the parameter distribution. Furthermore, since the posterior distribution includes information about parameter sensitivity, it should be possible to use that information to reduce the number of parameters needed to be estimated. This is important for computational efficiency because the number of points needed to obtain a convergent MCMC sample grows exponentially with the dimension of the parameter space [25]. More work is also needed to explore and design appropriate methods for parameter reduction of ensemble models [40, 68]. Open problems remain also in Bayesian inference itself, such as, for example, the appropriate choice of the prior distribution. The presence of heuristic criteria in the prior distribution has proved to be important for controlling the space of admissible qualitative dynamical trajectories of the system [33]. As the parameters of nonlinear ODE systems are changed, such systems undergo bifurcations that qualitatively change their phase portraits. Such changes are usually undesirable in the ensemble model. Moreover, there are intuitive expectations about the behavior of a biology-inspired model that are not captured by the available data, such as the requirement of non-explosion (existence and boundedness of trajectories for all time), or the requirement of a steady homeostatic state of an organism. Additionally, it would also be helpful to incorporate specifics of ODE solutions into Bayesian inference, specifically the dependence of noise on time, noise autocorrelation between data taken at neighboring time points, and possible noise correlation in multi-response data. Extensions of the Bayesian approach from output error analysis to equation error or input error formulation would also be useful.
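As an illustration of how a posterior sample over parameters is obtained in the simplest setting, the following sketch runs a random-walk Metropolis sampler [31, 53] for the single rate constant p of the toy model dx/dt = -p x, x(0) = 1. Everything specific here is an assumption made for the example: the synthetic noise-free data, the Gaussian error model with known standard deviation SIGMA, and the flat prior on p > 0, which acts as a crude admissibility constraint of the kind discussed above.

```python
import math
import random

TIMES = [0.5, 1.0, 1.5, 2.0]
P_TRUE, SIGMA = 1.3, 0.05      # assumed "true" rate and noise level
# Synthetic data generated from the assumed true parameter (no noise added).
DATA = [math.exp(-P_TRUE * t) for t in TIMES]


def log_posterior(p):
    """Log posterior up to a constant: flat prior on p > 0, Gaussian errors."""
    if p <= 0.0:
        return -math.inf       # the prior excludes inadmissible parameters
    sse = sum((math.exp(-p * t) - y) ** 2 for t, y in zip(TIMES, DATA))
    return -0.5 * sse / SIGMA ** 2


def metropolis(n_samples=20000, p0=0.5, step=0.2, seed=1):
    """Random-walk Metropolis chain for p with Gaussian proposals."""
    rng = random.Random(seed)
    p, lp = p0, log_posterior(p0)
    chain = []
    for _ in range(n_samples):
        q = p + rng.gauss(0.0, step)
        lq = log_posterior(q)
        # Accept with probability min(1, posterior(q) / posterior(p)).
        if rng.random() < math.exp(min(0.0, lq - lp)):
            p, lp = q, lq
        chain.append(p)
    return chain


chain = metropolis()
kept = chain[2000:]            # discard an initial burn-in segment
post_mean = sum(kept) / len(kept)
print("posterior mean of p (approximate):", post_mean)
```

In one dimension such a chain mixes quickly. The difficulty noted in the text arises when the same procedure is applied to models with tens or hundreds of parameters.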
There is an alternative way to estimate parameter variability within populations by using the hierarchical mixed-effects modeling approach (see, e.g., [10, 18, 57]). The authors estimate population parameters using a maximum likelihood formulation and then estimate the standard deviation of the parameter variability from the data by linearizing the model about the maximum likelihood trajectory and inverse-mapping the standard deviations of the data. Such methods are convenient when variances in parameter values are small, while the key feature of the Bayesian approach is that it provides global information about the parameter distribution.
Acknowledgments. Many thanks to G. Clermont and J. Rubin for numerous discussions on the subject and to S. Zenker for excellent lecture notes.
Bibliography [1] B. Aleman-Meza, Y. Yu, H. B. Schüttler, J. Arnold, T. R. Taha, KINSOLVER: A simulator for computing large ensembles of biochemical and gene regulatory networks, Computers & Mathematics with Applications 57 (2009), 420–435. [2] R. C. Aster, C. H. Thurber, B. Borchers, Parameter Estimation and Inverse Problems 90, Academic Press, Waltham, Massachusetts, 2005.
[3] E. Baake, M. Baake, H. G. Bock, K. M. Briggs, Fitting ordinary differential equations to chaotic data, Physical Review A 45 (1992), 5524–5529. [4] P. Baccam, C. Beauchemin, C. A. Macken, F. G. Hayden, A. S. Perelson, Kinetics of influenza A virus infection in humans, Journal of Virology 80 (2006), 7590. [5] C. Barnes, D. Silk, X. Sheng, M. P. H. Stumpf, Bayesian design of synthetic biological systems, Arxiv preprint arXiv:1103.1046 (2011). [6] D. Battogtokh, D. K. Asch, M. E. Case, J. Arnold, H. B. Schüttler, An ensemble method for identifying regulatory circuits with special reference to the qa gene cluster of Neurospora crassa, Proceedings of the National Academy of Sciences 99 (2002), 16904. [7] K. J. Beers, Numerical Methods for Chemical Engineering: Applications in Matlab®, Cambridge University Press, Cambridge, 2007. [8] J. Berger, The case for objective Bayesian analysis, Bayesian Analysis 1 (2006), 385– 402. [9] H. G. Bock, Recent advances in parameter identification for ordinary differential equations, Progress in Scientific Computing 2 (1983), 95–121. [10] D. M. Bortz, P. W. Nelson, Model selection and mixed-effects modeling of HIV infection dynamics, Bulletin of Mathematical Biology 68 (2006), 2005–2025. [11] G. E. P. Box, G. C. Tiao, Bayesian Inference in Statistical Analysis, Wiley Online Library, John Wiley & Sons, Hoboken, New Jersey, 1973. [12] K. S. Brown, C. C. Hill, G. A. Calero, C. R. Myers, K. H. Lee, J. P. Sethna, R. A. Cerione, The statistical mechanics of complex signaling networks: nerve growth factor signaling, Physical Biology 1 (2004), 184. [13] K. S. Brown, J. P. Sethna, Statistical mechanical approaches to models with many poorly known parameters, Physical Review E 68 (2003), 021904. [14] A. G. Busetto, J. M. Buhmann, Stable Bayesian parameter estimation for biological dynamical systems, in: 2009 International Conference on Computational Science and Engineering, pp. 148–157, IEEE, John Wiley & Sons, Hoboken, New Jersey, 2009. 
[15] P. Congdon, Bayesian Statistical Modelling, 670, Wiley, 2006. [16] M. K. Cowles, B. P. Carlin, Markov chain Monte Carlo convergence diagnostics: a comparative review, Journal of the American Statistical Association (1996), 883–904. [17] S. Daun, J. Rubin, Y. Vodovotz, A. Roy, R. Parker, G. Clermont, An ensemble of models of the acute inflammatory response to bacterial lipopolysaccharide in rats: results from parameter space reduction, Journal of Theoretical Biology 253 (2008), 843–853. [18] M. Davidian, D. M. Giltinan, Nonlinear Models for Repeated Measurement Data, 62, Chapman & Hall/CRC, London, 1995. [19] D. J. Earl, M. W. Deem, Parallel tempering: Theory, applications, and new perspectives, Physical Chemistry Chemical Physics 7 (2005), 3910–3916. [20] D. Gamerman, H. F. Lopes, Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 68, Chapman & Hall/CRC, London, 2006.
[21] A. Gelman, D. B. Rubin, Inference from iterative simulation using multiple sequences, Statistical Science 7 (1992), 457–472. [22] J. Geweke, Contemporary Bayesian Econometrics and Statistics, 537, WileyInterscience, John Wiley & Sons, Hoboken, New Jersey, 2005. [23] J. Geweke, Bayesian model comparison and validation, The American Economic Review 97 (2007), 60–64. [24] W. R. Gilks, S. Richardson, D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman & Hall/CRC, London, 1996. [25] J. Gill, Is partial-dimension convergence a problem for inferences from MCMC algorithms?, Political Analysis 16 (2008), 153. [26] P. J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82 (1995), 711. [27] M. Guay, D. D. McLean, Optimization and sensitivity analysis for multi-response parameter estimation in systems of ordinary differential equations, Computers and Chemical Engineering 19 (1995), 1271–1286. [28] L. V. Gubareva, Molecular mechanisms of influenza virus resistance to neuraminidase inhibitors, Virus Research 103 (2004), 199–203. [29] R. N. Gutenkunst, J. J. Waterfall, F. P. Casey, K. S. Brown, C. R. Myers, J. P. Sethna, Universally sloppy parameter sensitivities in systems biology models, PLoS Computational Biology 3 (2007), e189. [30] U. H. E. Hansmann, Parallel tempering algorithm for conformational studies of biological molecules, Chemical Physics Letters 281 (1997), 140–150. [31] W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57 (1970), 97. [32] F. G. Hayden, J. J. Treanor, R. F. Betts, M. Lobo, J. D. Esinhart, E. K. Hussey, Safety and efficacy of the neuraminidase inhibitor GG167 in experimental human influenza, JAMA: The Journal of the American Medical Association 275 (1996), 295. [33] C. Higham, Bifurcation analysis informs Bayesian inference in the Hes1 feedback loop, BMC Systems Biology 3 (2009), 12. [34] A. D. Hill, J. R. Tomshine, E. 
Weeding, V. Sotiropoulos, Y. N. Kaznessis, SynBioSS: the synthetic biology modeling suite, Bioinformatics 24 (2008), 2551. [35] J. Jaeger, P. Lambert, Bayesian generalized profiling estimation in hierarchical linear dynamic systems, Technical Report 11001, IAP Statistics Network, Université Catholique de Louvain, Belgium, 2010. [36] B. Jayawardhana, D. B. Kell, M. Rattray, Bayesian inference of the sites of perturbations in metabolic pathways via Markov Chain Monte Carlo, Bioinformatics 24 (2008), 1191. [37] E. T. Jaynes, G. L. Bretthorst, Probability Theory: The Logic of Science, Cambridge University Press, Cambridge, 2003. [38] H. Jeffreys, Scientific Inference, Cambridge University Press, Cambridge, 1937.
[39] H. Jeffreys, Theory of Probability, Clarendon Press, Oxford, 1961. [40] B. Jin, Fast Bayesian approach for parameter estimation, International Journal for Numerical Methods in Engineering 76 (2008), 230–252. [41] H. G. Katzgraber, S. Trebst, D. A. Huse, M. Troyer, Feedback-optimized parallel tempering Monte Carlo, Journal of Statistical Mechanics: Theory and Experiment 2006 (2006), P03018. [42] D. B. Kell, Metabolomics and systems biology: making sense of the soup, Current Opinion in Microbiology 7 (2004), 296–307. [43] C.T. Kelley, Iterative Methods for Optimization, 18, Society for Industrial Mathematics, Philadelphia, 1999. [44] H. Kitano, Systems biology: a brief overview, Science 295 (2002), 1662. [45] D. Klinke, An empirical Bayesian approach for model-based inference of cellular signaling networks, BMC Bioinformatics 10 (2009), 371. [46] A. Kone, D. A. Kofke, Selection of temperature intervals for parallel-tempering simulations, The Journal of Chemical Physics 122 (2005), 206101. [47] L. Kuepfer, M. Peter, U. Sauer, J. Stelling, Ensemble modeling for analysis of cell signaling dynamics, Nature Biotechnology 25 (2007), 1001–1006. [48] J. Liepe, C. Barnes, E. Cule, K. Erguler, P. Kirk, T. Toni, M. P. H. Stumpf, ABCSysBio-approximate Bayesian computation in Python with GPU support, Bioinformatics 26 (2010), 1797. [49] D. Luan, F. Szlam, K. A. Tanaka, P. S. Barie, J. D. Varner, Ensembles of uncertain mathematical models can identify network response to therapeutic interventions, Mol. BioSyst. 6 (2010), 2272–2286. [50] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, Cambridge, 2003. [51] P. Marjoram, J. Molitor, V. Plagnol, S. Tavaré, Markov chain Monte Carlo without likelihoods, Proceedings of the National Academy of Sciences of the United States of America 100 (2003), 15324. [52] F. Mazzocchi, Complementarity in biology, EMBO Reports 11 (2010), 339. [53] N. Metropolis, A. W. Rosenbluth, M. N. 
Rosenbluth, A. H. Teller, E. Teller et al., Equation of state calculations by fast computing machines, The Journal of Chemical Physics 21 (1953), 1087. [54] H. Miao, X. Xia, A. S. Perelson, H. Wu, On identifiability of nonlinear ODE models with applications in viral dynamics, SIAM Review: accepted (2010). [55] M. A. Nowak, R. M. C. May, Virus Dynamics: Mathematical Principles of Immunology and Virology, Oxford University Press, New York, 2000. [56] A. O’Hagan, J. Forster, M. G. Kendall, Bayesian Inference, Arnold, London, 2004. [57] J. C. Pinheiro, D. M. Bates, Mixed-effects Models in S and S-PLUS, Springer, New York, 2009.
[58] C. Predescu, M. Predescu, C. V. Ciobanu, The incomplete beta function law for parallel tempering sampling of classical canonical systems, The Journal of Chemical Physics 120 (2004), 4119. [59] S. J. Press, Subjective and Objective Bayesian Statistics: Principles, Models, and Applications, 328, LibreDigital, 2003. [60] I. Price, Mathematical modeling of chemical signals in inflammatory pathways, Ph.D. thesis, University of Pittsburgh, 2011. [61] I. Price, D. Swigon, B. Ermentrout, F. Toapanta, T. Ross, G. Clermont, Immune response to influenza A, Respiratory Care Clinics of North America 24 (2009), e33–e33. [62] J. K. Pritchard, M. T. Seielstad, A. Perez-Lezaun, M. W. Feldman, Population growth of human Y chromosomes: a study of Y chromosome microsatellites, Molecular Biology and Evolution 16 (1999), 1791. [63] H. Putter, S. H. Heisterkamp, J. M. A. Lange, F. De Wolf, A Bayesian approach to parameter estimation in HIV dynamical models, Statistics in Medicine 21 (2002), 2199– 2214. [64] C. P. Robert, G. Casella, Monte Carlo Statistical Methods, Springer, New York, 2004. [65] A. F. M. Smith, G. O. Roberts, Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods, Journal of the Royal Statistical Society. Series B (Methodological) (1993), 3–23. [66] E. D. Sontag, For differential equations with r parameters, 2r+ 1 experiments are enough for identification, Journal of Nonlinear Science 12 (2002), 553–583. [67] M. Stephens, Bayesian analysis of mixture models with an unknown number of components—an alternative to reversible jump methods, Annals of Statistics (2000), 40–74. [68] C. Sun, J. Hahn, Parameter reduction for stable dynamical systems based on Hankel singular values and sensitivity analysis, Chemical Engineering Science 61 (2006), 5393– 5403. [69] S. Tang, Stochastic methods in modeling the immune response, Ph.D. thesis, University of Pittsburgh, 2011. [70] A. 
Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation, Society for Industrial Mathematics, Philadelphia, 2005. [71] T. Toni, D. Welch, N. Strelkowa, A. Ipsen, M. P. H. Stumpf, Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems, Journal of the Royal Society Interface 6 (2009), 187. [72] S. Trebst, M. Troyer, U. H. E. Hansmann, Optimized parallel tempering simulations of proteins, The Journal of Chemical Physics 124 (2006), 174903. [73] B. Van Domselaar, P.W. Hemker, Nonlinear Parameter Estimation in Initial Value Problems, Stichting Mathematisch Centrum, Amsterdam, 1975. [74] M. H. V. Van Regenmortel, Reductionism and complexity in molecular biology, EMBO Reports 5 (2004), 1016.
[75] H. U. Voss, J. Timmer, J. Kurths, Nonlinear dynamical system identification from uncertain and indirect measurements, International Journal of Bifurcation and Chaos 14 (2004), 1905–1933. [76] V. Vyshemirsky, M. Girolami, BioBayes: A software package for Bayesian inference in systems biology, Bioinformatics 24 (2008), 1933. [77] D. J. Wilkinson, Bayesian methods in bioinformatics and computational systems biology, Briefings in Bioinformatics 8 (2007), 109. [78] S. Zenker, J. Rubin, G. Clermont, From inverse problems in mathematical physiology to quantitative differential diagnoses, PLoS Computational Biology 3 (2007), e204.
Author Information
David Swigon, Department of Mathematics, University of Pittsburgh, Pittsburgh, PA, USA
3
Probabilistic Models for Nonlinear Processes and Biological Dynamics
Vassili N. Kolokoltsov
3.1 Nonlinear Lévy and Nonlinear Feller Processes: an Analytic Introduction*
Abstract. The program of studying general nonlinear Markov processes was put forward in V. N. Kolokoltsov, “Nonlinear Markov Semigroups and Interacting Lévy Type Processes” (Journ. Stat. Physics 126:3 (2007), 585–642), and was developed by the author in the monograph “Nonlinear Markov processes and kinetic equations” (Cambridge University Press, 2010), where, in particular, nonlinear Lévy processes were introduced. Nonlinear Markov processes model numerous real-life processes from the natural, social and life sciences. The present paper is an invitation to the rapidly developing topic of nonlinear Markov processes. We provide a quick (and at the same time more abstract) introduction to the basic analytical aspects of the theory developed in Part II of the above-mentioned book.
Keywords. Nonlinear Feller Process, Nonlinear Lévy Process, Nonlinear Markov Process, Nonlinear Markov Semigroup, Nonlinear ODE in Banach Space, Sensitivity
2010 Mathematics Subject Classification. 34G20, 47J35, 60J99
3.1.1 Introduction
Nonlinear Lévy processes were introduced by the author in [18], where the first systematic analysis of general nonlinear Markov processes was given. Here we provide a quick introduction to the basic analytical aspects of the theory developed in Part II of [18], giving more concise and more general formulations of some basic facts on well-posedness and sensitivity of nonlinear processes. Nonlinear Markov processes (in particular nonlinear Lévy and Feller processes, as well as nonlinear Markov chains) model numerous real-life processes from the natural, social and life sciences. The latter are related to the replicator dynamics of evolutionary biology that represents one of the basic examples of a nonlinear Markov evolution, see again [18] and [20]. For general background in Lévy and Markov processes we refer to books [1, 19, 22], and especially for time non-homogeneous processes to [9] and [10]. For sensitivity of the nonlinear jump-type processes, e.g., Boltzmann or Smoluchovski, we refer to papers [12] and [2].
Supported by the AFOSR grant FA9550-09-1-0664 ‘Nonlinear Markov control processes and games’ http://arxiv.org/abs/1103.5591
Loosely speaking, a nonlinear Markov evolution is just a dynamical system generated by a measure-valued ordinary differential equation (ODE) with the specific feature of preserving positivity. This feature distinguishes it from a general Banach space valued ODE and yields a natural link with probability theory, both in interpreting results and in the tools of analysis. Technical complications for the sensitivity analysis, again compared with the standard theory of vector-valued ODE, lie in the specific unboundedness of generators that causes the derivatives of the solutions to nonlinear equations (with respect to parameters or initial conditions) to live in other spaces than the evolution itself. From the probabilistic point of view, the first derivative with respect to initial data (specified by the linearized evolution around a path of nonlinear dynamics) describes the interacting particle approximation to this nonlinear dynamics (which, in turn, serves as the dynamic law of large numbers to this approximating Markov system of interacting particles), and the second derivative describes the limit of fluctuations of the evolution of particle systems around its law of large numbers (probabilistically the dynamic central limit theorem). In this paper we concentrate only on the analytic aspects of the theory, referring to [18] for probabilistic interpretation. Recall first the definition of a propagator. For a set $S$, a family of mappings $U^{t,r}$ from $S$ to itself, parametrized by the pairs of real numbers $r \le t$ (resp. $t \le r$) from a given finite or infinite interval, is called a forward propagator (resp. a backward propagator) if $U^{t,t}$ is the identity operator in $S$ for all $t$ and the following chain rule, or propagator equation, holds for $r \le s \le t$ (resp. for $t \le s \le r$): $U^{t,s} U^{s,r} = U^{t,r}$. If the mappings $U^{t,r}$ forming a backward propagator depend only on the differences $r - t$, then the family $T_t = U^{0,t}$ forms a semigroup.
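To make the chain rule concrete, here is a small numerical sketch (an illustration, not part of the original exposition). It assumes the scalar linear equation dx/ds = a(s) x on S = R, whose forward propagator multiplies x by exp of the integral of a over [r, t]. The function name U, the Runge-Kutta discretization and the choice a(s) = cos s are hypothetical.

```python
import math


def U(a, t, r, n=2000):
    """Forward propagator U^{t,r} (r <= t) of dx/ds = a(s) x, as a factor.

    The factor is computed by integrating from s = r to s = t with the
    classical fourth-order Runge-Kutta scheme, starting from x = 1.
    """
    h = (t - r) / n
    x, s = 1.0, r
    for _ in range(n):
        k1 = a(s) * x
        k2 = a(s + h / 2) * (x + h / 2 * k1)
        k3 = a(s + h / 2) * (x + h / 2 * k2)
        k4 = a(s + h) * (x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        s += h
    return x


a = math.cos                               # time-dependent coefficient
identity = U(a, 1.0, 1.0)                  # U^{t,t} should be the identity
lhs = U(a, 2.0, 1.0) * U(a, 1.0, 0.0)      # U^{t,s} U^{s,r}
rhs = U(a, 2.0, 0.0)                       # U^{t,r}


def const(s):
    """Constant coefficient: the propagator depends only on t - r."""
    return 0.7


shift_a = U(const, 3.0, 1.0)
shift_b = U(const, 2.0, 0.0)
print(identity, lhs, rhs, shift_a, shift_b)
```

For the time-dependent coefficient the two sides of the propagator equation agree up to discretization error. For the constant coefficient the propagator depends only on the time difference, which is the semigroup case.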
That is why propagators are sometimes referred to as two-parameter semigroups. By a propagator we mean a forward or a backward propagator (which should be clear from the context). Let $\tilde{\mathcal{M}}(X)$ be a dense subset of the space $\mathcal{M}(X)$ of finite (positive Borel) measures on a Polish (complete separable metric) space $X$ (considered in its weak topology). By a nonlinear sub-Markov (resp. Markov) propagator in $\tilde{\mathcal{M}}(X)$ we shall mean any propagator $V^{t,r}$ of possibly nonlinear transformations of $\tilde{\mathcal{M}}(X)$ that do not increase (resp. preserve) the norm. If $V^{t,r}$ depends only on the difference $t - r$ and hence specifies a semigroup, this semigroup is called nonlinear or generalized sub-Markov or Markov respectively. The usual, linear, Markov propagators or semigroups correspond to the case when all the transformations are linear contractions in the whole space $\mathcal{M}(X)$. In probability theory these propagators describe the evolution of averages of Markov processes, i.e., processes whose evolution after any given time $t$ depends on the past $X_{\le t}$ only via the present position $X_t$. Loosely speaking, to any nonlinear Markov propagator there corresponds a process whose behavior after any time $t$ depends on the past $X_{\le t}$ via the position $X_t$ of the process and its distribution at $t$. More precisely, consider the nonlinear kinetic equation
$$\frac{d}{dt}(g, \mu_t) = (A[\mu_t] g, \mu_t) \tag{3.1}$$
with a certain family of operators $A[\mu]$ in $C(X)$ depending on $\mu$ as a parameter and such that each $A[\mu]$ specifies a uniquely defined Markov process (say, via solution to the corresponding martingale problem, or by generating a Feller semigroup). Suppose that the Cauchy problem for equation (3.1) is well posed and specifies the weakly continuous Markov semigroup $T_t$ in $\mathcal{M}(X)$. Suppose also that for any weakly continuous curve $\mu_t \in \mathcal{P}(X)$ (the set of probability measures on $X$) the solutions to the Cauchy problem of the equation
$$\frac{d}{dt}(g, \nu_t) = (A[\mu_t] g, \nu_t) \tag{3.2}$$
define a weakly continuous propagator $V^{t,r}[\mu_\cdot]$, $r \le t$, of linear transformations in $\mathcal{M}(X)$ and hence a Markov process in $X$, with transition probabilities $p^{[\mu_\cdot]}_{r,t}(x, dy)$. Then to any $\mu \in \mathcal{P}(X)$ there corresponds a (usually linear, but time non-homogeneous) Markov process $X_t^\mu$ in $X$ ($\mu$ stands for an initial distribution) such that its distributions $\nu_t$ solve equation (3.2) with the initial condition $\mu$. In particular, the distributions of $X_t^\mu$ (with the initial condition $\mu$) are $\mu_t = T_t(\mu)$ for all times $t$, and the transition probabilities $p^{[\mu_\cdot]}_{r,t}(x, dy)$ specified by equation (3.2) satisfy the condition
$$\int_{X^2} f(y)\, p^{[\mu_\cdot]}_{r,t}(x, dy)\, \mu_r(dx) = (f, V^{t,r}\mu_r) = (f, \mu_t). \tag{3.3}$$
We shall call the family of processes $X_t^\mu$ a nonlinear Markov process. When each $A[\mu]$ generates a Feller semigroup and $T_t$ acts on the whole $\mathcal{M}(X)$ (and not only on its dense subspace), the corresponding process can also be called nonlinear Feller. Allowing for the evolution on subsets $\tilde{\mathcal{M}}(X)$ is however crucial, as it often occurs in applications, say for the Smoluchovski or Boltzmann equation with unbounded rates. Thus, a nonlinear Markov process is a semigroup of the transformations of distributions such that to each trajectory is attached a “tangent” Markov process with the same marginal distributions. The structure of these tangent processes is not intrinsic to the semigroup, but can be specified by choosing a stochastic representation for the generator, that is, of the r.h.s. of (3.2). In this paper we shall prove a general well-posedness result for nonlinear Markov semigroups that will cover, as particular cases,
(i) nonlinear Lévy processes specified by the families
$$A_\mu f(x) = \frac{1}{2}(G(\mu)\nabla, \nabla) f(x) + (b(\mu), \nabla f)(x) + \int \left[ f(x+y) - f(x) - (y, \nabla f(x))\,\mathbf{1}_{B_1}(y) \right] \nu(\mu, dy), \tag{3.4}$$
where, for each probability measure $\mu$ on $\mathbf{R}^d$, $\nu(\mu, \cdot)$ is a Lévy measure (i.e., a Borel measure on $\mathbf{R}^d$ without a mass point at the origin and such that the function $\min(1, |y|^2)$ is integrable with respect to it), $G(\mu)$ is a symmetric non-negative $d \times d$ matrix, $b(\mu)$ a vector in $\mathbf{R}^d$ and $B_1$ is the unit ball in $\mathbf{R}^d$ with $\mathbf{1}_{B_1}$ being the corresponding indicator function;
(ii) processes of order at most one specified by the families
$$A_\mu f(x) = (b(x, \mu), \nabla f(x)) + \int_{\mathbf{R}^d} (f(x+y) - f(x))\, \nu(x, \mu, dy), \tag{3.5}$$
where the Lévy measure $\nu$ is supposed to have a finite first moment;
(iii) mixtures of possibly degenerate diffusions and stable-like processes and processes generated by the operators of order at most one, explicitly defined below in Proposition 3.10.
It is worth noting that equations of type (3.2), which appear naturally as the Dynamic Law of Large Numbers for interacting particles, can be deduced, on the other hand, from the mere assumption of positivity preservation, see [18] and [26]. In case of diffusion (partial second order) operators $A[\mu]$, the corresponding evolution (3.1) was first analyzed by McKean and Freidlin, see e.g., [7], and is often called the McKean or McKean–Vlasov diffusion. For recent developments on nonlinear diffusions we can refer to Belopolskaya [3–5]. An important particular case that arises as the limit of grazing collisions in the Boltzmann collision model is sometimes referred to as the Landau–Fokker–Planck equation, see [11] for some recent results. The case of $A[\mu]$ being a Hamiltonian vector field is often called a Vlasov-type equation, as it contains the celebrated Vlasov equation from plasma physics. The case of $A[\mu]$ being pure integral operators comprises a large variety of models from statistical mechanics (say, Boltzmann and Smoluchovski equations) to evolutionary games (replicator dynamics), see [18] for a comprehensive review and papers [6, 27, 28] for the introduction to nonlinear Markov evolutions from the physical point of view. The following basic notations will be used:
$C_\infty(\mathbf{R}^d) \subset C(\mathbf{R}^d)$ consists of $f$ such that $\lim_{x \to \infty} f(x) = 0$,
$C^k(\mathbf{R}^d)$ (resp. $C^k_\infty(\mathbf{R}^d)$) is the Banach space of $k$ times continuously differentiable functions with bounded derivatives on $\mathbf{R}^d$ (resp. its closed subspace of functions $f$ with $f^{(l)} \in C_\infty(\mathbf{R}^d)$, $l \le k$) with $\|f\|_{C^k(\mathbf{R}^d)} = \sum_{l=0}^{k} \|f^{(l)}\|_{C(\mathbf{R}^d)}$,
$\mathcal{P}(\mathbf{R}^d)$ is the set of probability measures on $\mathbf{R}^d$.
$\|A\|_{D \to B}$ denotes the norm of an operator $A$ in the Banach space $\mathcal{L}(D, B)$ of bounded linear operators between Banach spaces $D$ and $B$, and $\|\xi\|_B$ denotes the norm of $\xi$ as an element of the Banach space $B$.
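A finite state space gives perhaps the simplest instance of the kinetic equation (3.1). The sketch below is an illustration under stated assumptions, not a construction taken from the text: X = {0, 1, 2}, the pairing is (g, mu) = sum_i g_i mu_i, and A[mu] is a rate matrix Q(mu) whose jump rates from i to j are proportional to the mass already at j. Equation (3.1) then reads d(mu)/dt = mu Q(mu) for the row vector mu. The base rates RATES, the imitation-type dependence on mu and the explicit Euler scheme are all hypothetical choices.

```python
# Hypothetical base rates for jumps i -> j (illustrative numbers only).
RATES = [[0.0, 1.0, 2.0],
         [0.5, 0.0, 1.0],
         [1.0, 3.0, 0.0]]


def generator(mu):
    """Rate matrix Q(mu): jumps i -> j occur at rate RATES[i][j] * mu[j]."""
    n = len(mu)
    q = [[RATES[i][j] * mu[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])   # each row sums to zero, so mass is conserved
    return q


def evolve(mu0, t_end=5.0, dt=1e-3):
    """Explicit Euler scheme for the nonlinear equation d(mu)/dt = mu Q(mu)."""
    mu = list(mu0)
    n = len(mu)
    for _ in range(int(round(t_end / dt))):
        q = generator(mu)
        mu = [mu[j] + dt * sum(mu[i] * q[i][j] for i in range(n))
              for j in range(n)]
    return mu


mu_final = evolve([0.2, 0.3, 0.5])
print("distribution at t = 5:", mu_final, "total mass:", sum(mu_final))
```

Because every Q(mu) has non-negative off-diagonal entries and zero row sums, the scheme keeps mu a probability vector as long as dt times the largest exit rate stays below one. This is the discrete counterpart of the positivity preservation that characterizes nonlinear Markov evolutions.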
3.1.2 Dual Propagators
A backward propagator $\{U^{t,r}\}$ of uniformly (for $t, r$ from a compact set) bounded linear operators on a Banach space $B$ is called strongly continuous if the family $U^{t,r}$ depends strongly continuously on $t$ and $r$. For a strongly continuous backward propagator $\{U^{t,r}\}$ of bounded linear operators on a Banach space $B$ with a common invariant domain $D \subset B$, which is itself a Banach space with the norm $\|\cdot\|_D \ge \|\cdot\|_B$, let $\{A_t\}$, $t \ge 0$, be a family of bounded linear operators $D \to B$ depending strongly measurably on $t$ (i.e., the function $t \mapsto A_t f \in B$ is measurable for each $f \in D$). Let us say that the family $\{A_t\}$ generates $\{U^{t,r}\}$ on the invariant domain $D$ if the equations
$$\frac{d}{ds} U^{t,s} f = U^{t,s} A_s f, \qquad \frac{d}{ds} U^{s,r} f = -A_s U^{s,r} f, \qquad t \le s \le r, \tag{3.6}$$
hold a.s. in $s$ for any $f \in D$, that is, there exists a negligible subset $S$ of $\mathbf{R}$ such that for all $t < r$ and all $f \in D$ equations (3.6) hold for all $s$ outside $S$, where the derivatives exist in the Banach topology of $B$. In particular, if the operators $A_t$ depend strongly continuously on $t$ (as bounded operators $D \to B$), this implies that equations (3.6) hold for all $s$ and $f \in D$, where for $s = t$ (resp. $s = r$) it is assumed to be only a right (resp. left) derivative. For a Banach space $B$ or a linear operator $A$ one usually denotes by $B^\star$ or $A^\star$ its Banach dual (space or operator respectively). Alternatively, the notations $B'$ and $A'$ are in use.
Theorem 3.1 (Basic duality). Let $\{U^{t,r}\}$ be a strongly continuous backward propagator of bounded linear operators in a Banach space $B$ with a common invariant domain $D$, which is itself a Banach space with the norm $\|\cdot\|_D \ge \|\cdot\|_B$, and let the family $\{A_t\}$ of bounded linear operators $D \to B$ generate $\{U^{t,r}\}$ on $D$. Then
(i) the family of dual operators $V^{s,t} = (U^{t,s})^\star$ forms a weakly-$\star$ continuous in $s, t$ propagator of bounded linear operators in $B^\star$ (contractions if all $U^{t,r}$ are contractions) such that
$$\frac{d}{dt} V^{s,t}\xi = -V^{s,t} A_t^\star \xi, \qquad \frac{d}{ds} V^{s,t}\xi = A_s^\star V^{s,t}\xi, \qquad t \le s, \tag{3.7}$$
hold weakly-$\star$ in $D^\star$, i.e., say, the second equation means
$$\frac{d}{ds}(f, V^{s,t}\xi) = (A_s f, V^{s,t}\xi), \qquad t \le s, \quad f \in D; \tag{3.8}$$
(ii) $V^{s,t}\xi$ is the unique solution to the Cauchy problem of equation (3.8) in $B^\star$, i.e., if $\xi_t = \xi$ for a given $\xi \in B^\star$ and $\xi_s$, $s \in [t, r]$, is a weakly-$\star$ continuous family in $B^\star$ satisfying
$$\frac{d}{ds}(f, \xi_s) = (A_s f, \xi_s), \qquad t \le s \le r, \quad f \in D, \tag{3.9}$$
then $\xi_s = V^{s,t}\xi$ for $t \le s \le r$.
(iii) $U^{s,r} f$ is the unique solution to the inverse Cauchy problem of the second equation in (3.6).
Proof. Statement (i) is a direct consequence of duality. (ii) Let $g(s) = (U^{s,r} f, \xi_s)$ for a given $f \in D$. Writing
$$(U^{s+\delta,r} f, \xi_{s+\delta}) - (U^{s,r} f, \xi_s) = (U^{s+\delta,r} f - U^{s,r} f, \xi_s) + (U^{s,r} f, \xi_{s+\delta} - \xi_s) + (U^{s+\delta,r} f - U^{s,r} f, \xi_{s+\delta} - \xi_s)$$
and using (3.6), (3.8) and the invariance of $D$, allows one to conclude that
$$\frac{d}{ds} g(s) = -(A_s U^{s,r} f, \xi_s) + (U^{s,r} f, A_s^\star \xi_s) = 0,$$
because a.s. in $s$
$$\left( \frac{U^{s+\delta,r} f - U^{s,r} f}{\delta},\; \xi_{s+\delta} - \xi_s \right) \to 0,$$
as $\delta \to 0$ (since the family $\delta^{-1}(U^{s+\delta,r} f - U^{s,r} f)$ is relatively compact, being convergent, and $\xi_s$ is weakly continuous). Hence, $g(r) = (f, \xi_r) = g(t) = (U^{t,r} f, \xi_t)$, showing that $\xi_r$ is uniquely defined. (iii) is proved similarly to (ii).
Remark 3.2. In addition to the statement of Theorem 3.1 let us note (as one sees directly from duality) that (i) $V^{s,t}\xi$ depends weakly-$\star$ continuously on $s, t$ uniformly for bounded $\xi$ and (ii) $V^{s,t}$ is a weakly-$\star$ continuous operator, that is, $\xi_n \to \xi$ weakly-$\star$ implies $V^{s,t}\xi_n \to V^{s,t}\xi$ weakly-$\star$.
Remark 3.3. Working with discontinuous $A_t$ is crucial for the development of the related theory of SDE with nonlinear noise, see [16] and [17]. In this paper we shall use only continuous families of generators $\{A_t\}$.
We deduce now some corollaries of Theorem 3.1: on the extension of the operators $V^{s,t}$ to $D^\star$, and on their stability with respect to a perturbation of the family $A_t$.
Theorem 3.4. Under the assumptions of Theorem 3.1 suppose additionally that
(i) $\{U^{t,s}\}$ is a strongly continuous backward propagator of uniformly bounded operators in $D$;
(ii) there exists another subspace $\tilde{D} \subset D$, dense in $D$, which is itself a Banach space with the norm $\|\cdot\|_{\tilde{D}} \ge \|\cdot\|_D$ such that the mapping $t \mapsto A_t$ is a continuous mapping $t \to \mathcal{L}(\tilde{D}, D)$;
(iii) $B^\star$ is dense in $D^\star$ (which holds automatically in case of reflexive $D$).
Then the operators $V^{s,t}: B^\star \to B^\star$ extend to the operators $V^{s,t}: D^\star \to D^\star$ forming a weakly-$\star$ continuous propagator in $D^\star$ that solves equation (3.8) weakly-$\star$ in $\tilde D^\star$, that is, for any $\xi \in D^\star$, equation (3.8) holds for all $f \in \tilde D$.

Proof. The fact that $V^{s,t}$ extend to linear operators in $D^\star$ follows without any additional assumption from the invariance of $D$ under $U^{t,s}$. Assumption (i) implies that this extension is bounded and weakly-$\star$ continuous in $D^\star$. In order to prove that (3.8) holds for $f \in \tilde D$ and $\xi \in D^\star$, observe that
$$(f, V^{r,t}\xi) = (f, \xi) + \int_t^r (A_s f, V^{s,t}\xi)\, ds \tag{3.10}$$
for $\xi \in B^\star$, $f \in D$. Now, for a $\xi \in D^\star$ and $f \in \tilde D$, let us pick a sequence $\xi_n \in B^\star$ such that $\xi_n \to \xi$ in the norm topology of $D^\star$ as $n \to \infty$ (which is possible by assumption (iii)). As $A_s f \in D$ (by assumption (ii)), we can pass to the limit in (3.10) with $\xi_n$ instead of $\xi$ (using dominated convergence), yielding (3.10) for $\xi \in D^\star$ and $f \in \tilde D$. Finally, as $(A_s f, V^{s,t}\xi)$ is a continuous function of $s$ (by assumption (ii) and the weak-$\star$ continuity of $V^{s,t}$ in $D^\star$), equation (3.10) implies (3.8) for $\xi \in D^\star$ and $f \in \tilde D$.

Theorem 3.5. Under the assumptions of Theorem 3.4 assume additionally that the backward propagator $\{U^{t,s}\}$ in $D$ is generated by $\{A_t\}$ on the invariant domain $\tilde D$ (in particular, $\tilde D$ is invariant and equations (3.6) hold in the norm topology of $D$ for any $f \in \tilde D$). Then $V^{s,t}\xi$ represents the unique weakly-$\star$ continuous in $D^\star$ solution of equation (3.8) in $\tilde D^\star$. Moreover, for the propagator $\{U^{t,s}\}$ in $D$ to be generated by $\{A_t\}$ on $\tilde D$ it is sufficient to assume that $\{U^{t,s}\}$ is a strongly continuous family of bounded operators in $\tilde D$.

Proof. The first statement is a direct consequence of Theorem 3.1 applied to the pair of spaces $\tilde D$, $D$. The last statement is proved as in the previous theorem. Namely, we first rewrite equation (3.6) in the integral form, i.e., as
$$U^{t,r} f = f + \int_t^r A_s U^{s,r} f\, ds, \qquad U^{t,r} f = f + \int_t^r U^{t,s} A_s f\, ds. \tag{3.11}$$
These equations would imply (3.6) with the derivative defined in the norm topology of $D$, for $f \in \tilde D$, if we can prove that the functions $A_s U^{s,r} f$ and $U^{t,s} A_s f$ are continuous functions $s \mapsto D$. To see that this is true, say for the first function, we can write
$$A_{s+\delta} U^{s+\delta,r} f - A_s U^{s,r} f = A_{s+\delta}(U^{s+\delta,r} f - U^{s,r} f) + (A_{s+\delta} - A_s) U^{s,r} f.$$
The first term tends to zero in the norm topology of $D$, as $\delta \to 0$, by the strong continuity of $U^{s,r}$ in $\tilde D$, and the second term tends to zero by the continuity of the family $A_s$ (assumption (ii) of Theorem 3.4).

We conclude this section with a simple result on the convergence of propagators.

Theorem 3.6. Suppose we are given a sequence of backward propagators $\{U_n^{t,r}\}$, $n = 1, 2, \dots$, generated by the families $\{A_t^n\}$ and a backward propagator $\{U^{t,r}\}$ generated by the family $\{A_t\}$. Let all these propagators satisfy the same conditions as $U^{t,r}$ and $A_t$ from Theorem 3.1 with the same $D$, $B$. Suppose also that all $U^{t,r}$ are uniformly bounded as operators in $D$. Assume finally that, for any $t$ and any $f \in D$, $A_t^n f$ converge to $A_t f$, as $n \to \infty$, in the norm topology of $B$. Then $U_n^{t,r}$ converges to $U^{t,r}$ strongly in $B$. Moreover,
$$\|(V_n^{r,t} - V^{r,t})\xi\|_{D^\star} \le c\, \|A_s^n - A_s\|_{D \to B}\, \|\xi\|_{B^\star}. \tag{3.12}$$
Proof. By the density argument (taking into account that $U_n^{t,r} g$ are uniformly bounded in $B$), in order to prove the strong convergence of $U_n^{t,r}$ to $U^{t,r}$, it is sufficient to prove that $U_n^{t,r} g$ converges to $U^{t,r} g$ for any $g \in D$. But if $g \in D$,
$$(U_n^{t,r} - U^{t,r}) g = U_n^{t,s} U^{s,r} g \Big|_{s=t}^{r} = \int_t^r U_n^{t,s} (A_s^n - A_s) U^{s,r} g\, ds, \tag{3.13}$$
which converges to zero in the norm topology of $B$ by dominated convergence. Estimate (3.12) also follows from (3.13).
3.1.3 Perturbation Theory for Weak Propagators

The main point of the perturbation theory is to build a propagator generated by the family of operators $\{A_t + F_t\}$, when a propagator $U^{t,r}$ generated by $\{A_t\}$ is given and $\{F_t\}$ are bounded. However, if $\{F_t\}$ are only bounded, then instead of the solutions to the equation
$$\frac{d}{ds} f_s = -(A_s f_s + F_s f_s), \qquad t \le s \le r, \tag{3.14}$$
with a given terminal $f_r$, as desired, one can only construct the solutions to the so-called mild form of this equation:
$$f_t = U^{t,r} f_r + \int_t^r U^{t,s} F_s f_s\, ds, \tag{3.15}$$
which is only formally equivalent to (3.14) (i.e., when a solution to the mild equation is regular enough, which may not be the case).
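As a purely illustrative sanity check of the mild form (3.15), one can take $B = \mathbb{R}$ with the scalar choices $U^{t,s} = e^{a(s-t)}$ and $F_s = b$, for which the mild solution should be $f_t = e^{(a+b)(r-t)} f_r$. The following Python sketch iterates the right-hand side of (3.15), starting from the unperturbed flow. The function name `picard_mild`, the scalar data, the grid size and the trapezoidal quadrature are all assumptions made here for illustration; they are not part of the text.

```python
import math

def picard_mild(a, b, f_r, t0, r, n_grid=200, n_iter=25):
    """Picard iteration for f_t = U^{t,r} f_r + int_t^r U^{t,s} F f_s ds
    in the scalar case U^{t,s} = exp(a (s - t)), F = b (illustrative assumption)."""
    h = (r - t0) / n_grid
    grid = [t0 + i * h for i in range(n_grid + 1)]
    # zeroth iterate: the unperturbed backward flow U^{t,r} f_r
    f = [math.exp(a * (r - t)) * f_r for t in grid]
    for _ in range(n_iter):
        new = []
        for i, t in enumerate(grid):
            # integrand U^{t,s} F f_s sampled on the grid points s >= t
            vals = [math.exp(a * (grid[j] - t)) * b * f[j]
                    for j in range(i, n_grid + 1)]
            # trapezoidal rule on [t, r]; gives 0 when t == r
            integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new.append(math.exp(a * (r - t)) * f_r + integral)
        f = new
    return grid, f

grid, f = picard_mild(a=-1.0, b=0.5, f_r=1.0, t0=0.0, r=1.0)
exact = math.exp((-1.0 + 0.5) * 1.0)  # closed form, valid in the scalar case only
```

In this scalar setting the value `f[0]` should agree with `exact` up to the quadrature error. In infinite dimensions the same iteration is only known to produce the mild solution, which is the point made above.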
Let us recall the simplest perturbation theory result for propagators, which clarifies this issue (a proof can be found in, e.g., [19], Theorem 1.9.3, and a simpler, but similar fact for semigroups is discussed in almost any textbook on functional analysis).

Theorem 3.7. (i) Let $U^{t,r}$ be a strongly continuous backward propagator of bounded linear operators in a Banach space $B$, and $\{F_t\}$ be a family of bounded operators in $B$ depending strongly continuously on $t$. Set
$$\Phi^{t,r} = U^{t,r} + \int_t^r U^{t,s} F_s U^{s,r}\, ds + \sum_{m=2}^{\infty} \int_{t \le s_1 \le \cdots \le s_m \le r} U^{t,s_1} F_{s_1} U^{s_1,s_2} \cdots F_{s_m} U^{s_m,r}\, ds_1 \cdots ds_m. \tag{3.16}$$
It is claimed that this series converges in $B$ and the family $\{\Phi^{t,r}\}$ also forms a strongly continuous propagator of bounded operators in $B$ such that $f_t = \Phi^{t,r} f_r$ is the unique solution to equation (3.15).
(ii) Suppose additionally that a family of linear operators $\{A_t\}$ generates $\{U^{t,r}\}$ on the common invariant domain $D$, which is dense in $B$ and is itself a Banach space under a norm $\|\cdot\|_D \ge \|\cdot\|_B$. Suppose that $U^{t,r}$ and $\{F_t\}$ are also uniformly bounded operators in $D$. Then $D$ is invariant under $\{\Phi^{t,r}\}$ and the family $\{A_t + F_t\}$ generates $\{\Phi^{t,r}\}$ on $D$. Moreover, series (3.16) also converges in the operator norms of $D$ and the operators $\Phi^{t,r}$ are bounded as operators in the Banach space $D$.

We presented this theorem because for the sensitivity analysis of nonlinear equations we shall need non-homogeneous extensions of equations (3.9) of the form
$$\frac{d}{ds}(f, \xi_s) = (A_s f, \xi_s) + (F_s f, \xi_s), \qquad t \le s \le r, \tag{3.17}$$
where $F_s$ is a family of operators bounded in $D$, but, what is crucial and necessitates technical complications, not bounded in $B$.

Under the assumption of Theorem 3.5 and assuming $\{F_t\}$ is a bounded strongly continuous family of operators in $D$, it follows directly from Theorem 3.7 (ii) applied to the pair of Banach spaces $(\tilde D, D)$ that the perturbation theory propagator (3.16) solves equation (3.14) in $D$ and is generated on $\tilde D$ by the family $\{A_t + F_t\}$. Hence, by Theorem 3.1, the dual propagator $\{\Psi^{r,t} = (\Phi^{t,r})'\}$ is weakly-$\star$ continuous in $D^\star$ and yields a unique solution to (3.17) in $\tilde D^\star$ (i.e., so that, for $\xi_s = \Psi^{s,t}\xi_t$, equation (3.17) holds for all $f \in \tilde D$). The next result proves the same fact, except for uniqueness, under the weaker assumptions of Theorem 3.4.
Theorem 3.8. Under the assumptions of Theorem 3.4 assume $\{F_t\}$ is a bounded strongly continuous family of operators in $D$. Let $\{\Phi^{t,r}\}$ be given by (3.16), which by Theorem 3.7 (i) (applied to the pair of Banach spaces $(\tilde D, D)$) is a strongly continuous propagator in $D$, and let $\{\Psi^{r,t} = (\Phi^{t,r})'\}$, which is clearly a weakly-$\star$ continuous backward propagator in $D^\star$. Then the curve $\xi_s = \Psi^{s,t}\xi_t$ solves equation (3.17) in $\tilde D^\star$ with a given terminal condition $\xi_t$, that is, (3.17) holds for all $f \in \tilde D$.

Proof. From duality and (3.16) it follows that
$$\Psi^{r,t} = V^{r,t} + \sum_{m=1}^{\infty} \int_{t \le s_1 \le \cdots \le s_m \le r} V^{r,s_m} F'_{s_m} \cdots V^{s_2,s_1} F'_{s_1} V^{s_1,t}\, ds_1 \cdots ds_m, \tag{3.18}$$
where $F'_s$ are of course dual operators to $F_s$, and where the integral is understood in the weak-$\star$ sense and the series converges in the norm topology of $D^\star$ (we need to take into account Remark 3.2 to see that the weak integral is well defined). To prove (3.17) for $f \in \tilde D$ we should now differentiate term by term the corresponding series $(f, \Psi^{r,t}\xi_t)$ with respect to $r$ using Theorem 3.4. This term-by-term differentiation is then justified by the fact that the series of derivatives
$$(A_r f, V^{r,t}\xi_t) + (F_r f, V^{r,t}\xi_t) + \int_t^r (A_r f, V^{r,s} F'_s V^{s,t}\xi_t)\, ds + \cdots$$
converges uniformly in $r$.
3.1.4 T-Products

Here we shall recall the notion of T-products, showing how they can be used to construct propagators generated by families of operators each of which generates a sufficiently regular semigroup. We shall work with three Banach spaces $B_0, B_1, B_2$ with the norms denoted by $\|\cdot\|_i$, $i = 0, 1, 2$, such that $B_0 \subset B_1 \subset B_2$, $B_0$ is dense in $B_1$, $B_1$ is dense in $B_2$ and $\|\cdot\|_0 \ge \|\cdot\|_1 \ge \|\cdot\|_2$. Let $L_t: B_1 \to B_2$, $t \ge 0$, be a family of uniformly (in $t$) bounded operators such that the closure in $B_2$ of each $L_t$ is the generator of a strongly continuous semigroup of bounded operators in $B_2$. For a partition $\Delta = \{0 = t_0 < t_1 < \cdots < t_N = t\}$ of an interval $[0, t]$ let us define a family of operators $U_\Delta(\tau, s)$, $0 \le s \le \tau \le t$, by the rules
$$U_\Delta(\tau, s) = \exp\{(\tau - s) L_{t_j}\}, \qquad t_j \le s \le \tau \le t_{j+1},$$
$$U_\Delta(\tau, r) = U_\Delta(\tau, s)\, U_\Delta(s, r), \qquad 0 \le r \le s \le \tau \le t.$$
Let $\Delta t_j = t_{j+1} - t_j$ and $\delta(\Delta) = \max_j \Delta t_j$. If the limit
$$U(s, r) f = \lim_{\delta(\Delta) \to 0} U_\Delta(s, r) f \tag{3.19}$$
exists for some $f$ and all $0 \le r \le s \le t$ (in the norm of $B_2$), it is called the T-product (or chronological exponent) of $L_t$ and is denoted by $T \exp\{\int_r^s L_\tau\, d\tau\} f$. Intuitively, one expects the T-product to give a solution to the Cauchy problem
$$\frac{d}{dt} \varphi = L_t \varphi, \qquad \varphi_0 = f, \tag{3.20}$$
in $B_2$ with the initial conditions $f$ from $B_1$.
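The construction (3.19) can be tried out in finite dimensions, where every $L_t$ is a matrix and the three spaces $B_0, B_1, B_2$ coincide. The sketch below is an illustration under stated assumptions, not part of the text: `generator(t)` is a hypothetical non-commuting $2 \times 2$ family, the partition is uniform, each generator is frozen at the left endpoint $t_j$ of its subinterval, and the matrix exponential is a hand-written truncated Taylor series.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(m, terms=30):
    """Truncated Taylor series for exp(m); adequate for small 2x2 matrices."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        power = [[x / k for x in row] for row in mat_mul(power, m)]  # m^k / k!
        result = [[result[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return result

def generator(t):
    """Hypothetical time-dependent generator L_t (L_t and L_s do not commute)."""
    return [[0.0, 1.0], [-(1.0 + t), 0.0]]

def t_product(t_end, n_steps):
    """Approximation U_Delta(t_end, 0) on the uniform partition with n_steps cells."""
    h = t_end / n_steps
    u = [[1.0, 0.0], [0.0, 1.0]]
    for j in range(n_steps):
        step = mat_exp([[h * x for x in row] for row in generator(j * h)])
        u = mat_mul(step, u)  # factors for later times act on the left
    return u

def dist(a, b):
    """Entrywise maximum distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

coarse_gap = dist(t_product(1.0, 50), t_product(1.0, 100))
fine_gap = dist(t_product(1.0, 200), t_product(1.0, 400))
```

One expects `fine_gap` to be noticeably smaller than `coarse_gap`, reflecting the convergence of $U_\Delta$ as $\delta(\Delta) \to 0$. The ordering of the factors matters here precisely because the matrices `generator(t)` do not commute for different $t$.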
Theorem 3.9. Let a family $L_t f$, $t \ge 0$, of linear operators in $B_2$ be given such that
(i) each $L_t$ generates a strongly continuous semigroup $e^{sL_t}$, $s \ge 0$, in $B_2$ with invariant core $B_1$,
(ii) $L_t$ are uniformly bounded operators $B_0 \to B_1$ and $B_1 \to B_2$,
(iii) $B_0$ is also invariant under all $e^{sL_t}$ and these operators are uniformly bounded as operators in $B_0$, $B_1$, $B_2$, with the norms not exceeding $e^{Ks}$ with a constant $K$ (the same for all $B_j$ and $L_t$),
(iv) $L_t f$, as a function $t \mapsto B_2$, depends continuously on $t$ locally uniformly in $f$ (i.e., for $f$ from bounded subsets of $B_1$).
Then
(i) the T-product $T \exp\{\int_0^s L_\tau\, d\tau\} f$ exists for all $f \in B_2$, and the convergence in (3.19) is uniform in $f$ on any bounded subset of $B_1$;
(ii) if $f \in B_0$, then the approximations $U_\Delta(s, r)$ converge also in $B_1$;
(iii) this T-product defines a strongly continuous (in $t, s$) family of uniformly bounded operators in both $B_1$ and $B_2$,
(iv) this T-product $T \exp\{\int_0^s L_\tau\, d\tau\} f$ is a solution of problem (3.20) for any $f \in B_1$.

Proof. (i) Since $B_1$ is dense in $B_2$ and all $U_\Delta(s, r)$ are uniformly bounded in $B_2$ (by (iii)), the existence of the T-product for all $f \in B_2$ follows from its existence for $f \in B_1$. In the latter case it follows from the formula
$$U_\Delta(s, r) - U_{\Delta'}(s, r) = U_{\Delta'}(s, \tau)\, U_\Delta(\tau, r) \Big|_{\tau = r}^{s} = \int_r^s \frac{d}{d\tau} \big[ U_{\Delta'}(s, \tau)\, U_\Delta(\tau, r) \big]\, d\tau = \int_r^s U_{\Delta'}(s, \tau) \big( L_{[\tau]_\Delta} - L_{[\tau]_{\Delta'}} \big) U_\Delta(\tau, r)\, d\tau$$
(where we denoted $[s]_\Delta = t_j$ for $t_j \le s < t_{j+1}$), because $L_t$ are uniformly continuous (condition (iv)) and $U_\Delta(s, r)$ are uniformly bounded in $B_2$ and $B_1$ (by condition (iii)).
(ii) If $f \in B_0$, then the equations
$$U_\Delta(s, r) f = f + \int_r^s L_{[\tau]_\Delta} U_\Delta(\tau, r) f\, d\tau$$
imply that the family $U_\Delta(s, t)$ is uniformly Lipschitz continuous in $B_1$ as a function of $t$, because $L_s$ are uniformly bounded operators $B_0 \to B_1$ and $U_\Delta(s, r)$ are uniformly bounded in $B_0$. Hence, one can choose a subsequence, $U_{\Delta_n}(s, r)$, converging in $C([0, T], B_1)$. But the limit is unique (it is the limit in $B_2$), implying the convergence of the whole family $U_\Delta(s, t)$, as $\delta(\Delta) \to 0$.

(iii) It follows from condition (iii) that the limiting propagator is bounded. Strong continuity in $B_1$ is deduced first for $f \in B_0$ and then for all $f \in B_1$ by the density argument.

(iv) If $f \in B_0$, we can pass to the limit in the above approximate equations to obtain the equation
$$U(s, r) f = f + \int_r^s L_\tau U(\tau, r) f\, d\tau.$$
Since $B_0$ is dense in $B_1$, we then deduce the same equation for an arbitrary $f \in B_1$. This implies that $U(s, r) f$ satisfies equation (3.20) by condition (iv) and the basic theorem of calculus.

To conclude the section we present a rather general example of a non-homogeneous generator of a strongly continuous Markov propagator specifying a time-nonhomogeneous Feller process. This will be a time-nonhomogeneous, possibly degenerate diffusion combined with a mixture of possibly degenerate stable-like processes and processes generated by operators of order at most one, that is, a process generated by an operator of the form
$$L_t f(x) = \frac12 \operatorname{tr}\big(\sigma_t(x) \sigma_t^T(x) \nabla^2 f(x)\big) + (b_t(x), \nabla f(x)) + \int (f(x+y) - f(x))\, \nu_t(x, dy) + \int_P (dp) \int_0^K d|y| \int_{S^{d-1}} a_{p,t}(x, s)\, \frac{f(x+y) - f(x) - (y, \nabla f(x))}{|y|^{\alpha_{p,t}(x,s)+1}}\, \omega_{p,t}(ds). \tag{3.21}$$
Here $s = y/|y|$, $K > 0$ and $(P, dp)$ is a Borel space with a finite measure $dp$ and $\omega_{p,t}$ are certain finite Borel measures on $S^{d-1}$.

Theorem 3.10. Let the functions $\sigma, b, a, \alpha$ and the finite measure $|y|\, \nu(x, dy)$ be of smoothness class $C^5$ with respect to all variables (the measure is smooth in the weak
sense), and $a_p$, $\alpha_p$ take values in compact subintervals of $(0, \infty)$ and $(0, 2)$ respectively. Then the family of operators $L_t$ of form (3.21) generates a backward propagator $U^{t,s}$ on the invariant domain $C^2_\infty(\mathbb{R}^d)$, and hence a unique Markov process.

Proof. For a detailed proof (that uses several ingredients including Theorem 3.9 as a final step) we refer to the book [19].

3.1.5 Nonlinear Propagators

The following result from [18] represents the basic tool allowing one to build nonlinear propagators from infinitesimal linear ones. Recall that $V^{s,t}$ denotes the dual of $U^{t,s}$ given by Theorem 3.1. Let $M$ be a bounded subset of $B^\star$ that is closed in the norm topologies of both $B^\star$ and $D^\star$. For a $\mu \in M$ let $C_\mu([0, r], M)$ be the metric space of the continuous in the norm $D^\star$ curves $\xi_s \in M$, $s \in [0, r]$, $\xi_0 = \mu$, with the distance
$$\rho(\xi_\cdot, \eta_\cdot) = \sup_{s \in [0, r]} \|\xi_s - \eta_s\|_{D^\star}.$$

Theorem 3.11. (i) Let $D$ be a dense subspace of a Banach space $B$ that is itself a Banach space such that $\|f\|_D \ge \|f\|_B$, and let $\xi \mapsto A[\xi]$ be a mapping from $B^\star$ to bounded linear operators $A[\xi]: D \to B$ such that
$$\|A[\xi] - A[\eta]\|_{D \to B} \le c\, \|\xi - \eta\|_{D^\star}, \qquad \xi, \eta \in B^\star. \tag{3.22}$$
(ii) For any $\mu \in M$ and $\xi_\cdot \in C_\mu([0, r], M)$, let the operator curve $A[\xi_t]: D \to B$ generate a strongly continuous backward propagator of uniformly bounded linear operators $U^{t,s}[\xi_\cdot]$ in $B$, $0 \le t \le s \le r$, on the common invariant domain $D$ (in particular, (3.6) holds), such that
$$\|U^{t,s}[\xi_\cdot]\|_{D \to D} \le c, \qquad t \le s \le r, \tag{3.23}$$
for some constant $c > 0$ and with their dual propagators $V^{s,t}[\xi_\cdot]$ preserving the set $M$.
Then the weak nonlinear Cauchy problem
$$\frac{d}{dt}(f, \mu_t) = (A[\mu_t] f, \mu_t), \qquad \mu_0 = \mu, \quad f \in D, \tag{3.24}$$
is well posed in $M$. More precisely, for any $\mu \in M$ it has a unique solution $T_t(\mu) \in M$, and the transformations $T_t$ of $M$ form a semigroup for $t \in [0, r]$ depending Lipschitz continuously on time $t$ and the initial data in the norm of $D^\star$, i.e.,
$$\|T_t(\mu) - T_t(\eta)\|_{D^\star} \le c(r, M)\, \|\mu - \eta\|_{D^\star}, \qquad \|T_t(\mu) - \mu\|_{D^\star} \le c(r, M)\, t, \tag{3.25}$$
with a constant $c(r, M)$.
Proof. Since
$$(f, (V^{t,0}[\xi^1_\cdot] - V^{t,0}[\xi^2_\cdot])\mu) = (U^{0,t}[\xi^1_\cdot] f - U^{0,t}[\xi^2_\cdot] f, \mu)$$
and
$$U^{0,t}[\xi^1_\cdot] - U^{0,t}[\xi^2_\cdot] = U^{0,s}[\xi^1_\cdot]\, U^{s,t}[\xi^2_\cdot] \Big|_{s=0}^{t} = \int_0^t U^{0,s}[\xi^1_\cdot] (A[\xi^1_s] - A[\xi^2_s])\, U^{s,t}[\xi^2_\cdot]\, ds,$$
and taking into account (3.22) and (3.23), one deduces that
$$\|(V^{t,0}[\xi^1_\cdot] - V^{t,0}[\xi^2_\cdot])\mu\|_{D^\star} \le \|U^{0,t}[\xi^1_\cdot] - U^{0,t}[\xi^2_\cdot]\|_{D \to B}\, \|\mu\|_{B^\star} \le t\, c(r, M) \sup_{s \in [0, r]} \|\xi^1_s - \xi^2_s\|_{D^\star}$$
(of course we used the assumed boundedness of $M$), implying that for $t \le t_0$ with a small enough $t_0$ the mapping $\xi_t \mapsto V^{t,0}[\xi_\cdot]\mu$ is a contraction in $C_\mu([0, t], M)$. Hence, by the contraction principle there exists a unique fixed point for this mapping. To obtain the unique global solution one just has to iterate the construction on the next interval $[t_0, 2t_0]$, then on $[2t_0, 3t_0]$, etc. The semigroup property of $T_t$ follows directly from uniqueness.

Finally, if $T_t(\mu) = \mu_t$ and $T_t(\eta) = \eta_t$, then
$$\mu_t - \eta_t = V^{t,0}[\mu_\cdot]\mu - V^{t,0}[\eta_\cdot]\eta = (V^{t,0}[\mu_\cdot] - V^{t,0}[\eta_\cdot])\mu + V^{t,0}[\eta_\cdot](\mu - \eta).$$
Estimating the first term as above yields
$$\sup_{s \le t} \|\mu_s - \eta_s\|_{D^\star} \le c(r, M) \Big( t \sup_{s \le t} \|\mu_s - \eta_s\|_{D^\star} + \|\mu - \eta\|_{D^\star} \Big),$$
which implies the first estimate in (3.25) first for small times, which is then extended to all finite times by the iteration. The second estimate in (3.25) follows from (3.8).

Remark 3.12. For our purposes, the basic examples are given by $B = C_\infty(\mathbb{R}^d)$, $M = \mathcal{P}(\mathbb{R}^d)$, and $D = C^2_\infty(\mathbb{R}^d)$ or $D = C^1_\infty(\mathbb{R}^d)$. In order to see that $\mathcal{P}(\mathbb{R}^d)$ is closed in the norm topology of $D^\star$ for $D = C^k_\infty(\mathbb{R}^d)$ with any natural $k$, observe that the distance $d$ on $\mathcal{P}(\mathbb{R}^d)$ induced by its embedding in $(C^k_\infty(\mathbb{R}^d))'$ is defined by
$$d(\mu, \eta) = \sup\{ |(f, \mu - \eta)| : f \in C^k_\infty(\mathbb{R}^d),\ \|f\|_{C^k_\infty(\mathbb{R}^d)} \le 1 \},$$
and hence
$$d(\mu, \eta) = \sup\{ |(f, \mu - \eta)| : f \in C^k(\mathbb{R}^d),\ \|f\|_{C^k(\mathbb{R}^d)} \le 1 \}.$$
Consequently, convergence $\mu_n \to \mu$, $\mu_n \in \mathcal{P}(\mathbb{R}^d)$, with respect to this metric implies the convergence $(f, \mu_n) \to (f, \mu)$ for all $f \in C^k(\mathbb{R}^d)$ and hence for all $f \in C_\infty(\mathbb{R}^d)$ and for $f$ being constants. This implies tightness of the family $\mu_n$ and that the limit $\mu \in \mathcal{P}(\mathbb{R}^d)$.
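The fixed-point argument behind Theorem 3.11 can be visualised on a two-point state space, where a probability measure is a pair $(p_0, p_1)$ and $A[\mu]$ is a $Q$-matrix with $\mu$-dependent jump rates. The Python sketch below is only an illustration under stated assumptions: the rates in `rates` are hypothetical, time is discretised by the explicit Euler scheme, and `linear_flow` plays the role of the dual propagator $V^{t,0}[\xi_\cdot]\mu$ for a frozen curve $\xi_\cdot$. The map $\xi_\cdot \mapsto V^{\cdot,0}[\xi_\cdot]\mu$ is iterated and its fixed point is compared with a direct integration of the nonlinear equation.

```python
def rates(mu):
    """Hypothetical mean-field rates (q01, q10): jumps 0 -> 1 speed up with the mass at 1."""
    return 1.0 + 2.0 * mu[1], 1.0

def euler_step(p, eta, h):
    """One Euler step of d p/dt = p Q(eta) on the state space {0, 1}."""
    q01, q10 = rates(eta)
    flux = h * (q01 * p[0] - q10 * p[1])  # net mass moved from state 0 to state 1
    return (p[0] - flux, p[1] + flux)

def linear_flow(curve, mu0, h):
    """Linear evolution along a frozen curve: discrete analogue of V^{t,0}[curve] mu0."""
    out = [mu0]
    for eta in curve[:-1]:
        out.append(euler_step(out[-1], eta, h))
    return out

def solve_by_fixed_point(mu0, r=1.0, n_grid=200, n_iter=40):
    """Iterate curve -> linear_flow(curve); gaps records the sup-distance between iterates."""
    h = r / n_grid
    curve = [mu0] * (n_grid + 1)  # initial guess: constant curve
    gaps = []
    for _ in range(n_iter):
        new = linear_flow(curve, mu0, h)
        gaps.append(max(abs(a[0] - b[0]) for a, b in zip(new, curve)))
        curve = new
    return curve, gaps

def solve_directly(mu0, r=1.0, n_grid=200):
    """Euler scheme for the nonlinear equation d mu/dt = mu Q(mu)."""
    h = r / n_grid
    out = [mu0]
    for _ in range(n_grid):
        out.append(euler_step(out[-1], out[-1], h))
    return out

curve, gaps = solve_by_fixed_point((1.0, 0.0))
direct = solve_directly((1.0, 0.0))
```

In this toy setting the successive gaps in `gaps` shrink rapidly and the fixed-point curve coincides with `direct` up to rounding, mirroring the contraction argument of the proof. The total mass $p_0 + p_1$ is preserved at every step, which corresponds to the requirement that the dual propagators preserve $M$.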
Theorem 3.9 supplies a useful criterion for condition (ii) of the previous theorem, thus yielding the following corollary.

Theorem 3.13. Under the assumption (i) of Theorem 3.11 assume instead of (ii) the following:
(ii') There exists another Banach space $\tilde D$, which is a dense subspace of $D$, so that all $A[\xi]$, $\xi \in M$, are uniformly bounded operators $\tilde D \to D$ and $D \to B$.
(iii') For any $\xi \in M$ the operator $A[\xi]: D \to B$ generates a strongly continuous semigroup $e^{tA[\xi]}$ in $B$ with invariant core $D$, such that $\tilde D$ is also invariant under all $e^{sA[\xi]}$, and these operators are uniformly bounded as operators in $\tilde D$, $D$, $B$, with the norms not exceeding $e^{Ks}$ with a constant $K$.
(iv') The set $M$ is invariant under all dual semigroups $(e^{tA[\xi]})'$.
Then condition (ii) and hence the conclusion of Theorem 3.11 hold. Moreover, the operators $U^{t,s}[\xi_\cdot]$ form a strongly continuous propagator of bounded operators in $D$.

Proof. For $\xi_\cdot \in C_\mu([0, r], M)$, the operator curve $L_s = A[\xi_s]: D \to B$ clearly satisfies conditions (i)–(iii) of Theorem 3.9. To check its last condition (iv) we have to show that $A[\xi_t] f$ as a function $t \mapsto B$ is continuous uniformly for $f$ from a bounded domain of $D$. And this follows from (3.22), as it implies
$$\|(A[\xi_t] - A[\xi_s]) f\|_B \le c\, \|\xi_t - \xi_s\|_{D^\star}\, \|f\|_D.$$
Hence, Theorem 3.9 is applicable to the curve $L_s = A[\xi_s]: D \to B$, implying condition (ii) of Theorem 3.11.

As a preliminary step in studying sensitivity, let us prove a simple stability result for the above nonlinear semigroups $T_t$ with respect to small perturbations of the generator.

Theorem 3.14. Under the assumptions of Theorem 3.11 suppose $\xi \mapsto \tilde A[\xi]$ is another mapping from $B^\star$ to bounded operators $D \to B$ satisfying the same condition as $A$ with the corresponding $\tilde U^{t,s}$, $\tilde V^{s,t}$ satisfying the same conditions as $U^{t,s}$, $V^{s,t}$. Suppose
$$\|\tilde A[\xi] - A[\xi]\|_{D \to B} \le \kappa, \qquad \xi \in M, \tag{3.26}$$
with a constant $\kappa$. Then
$$\|\tilde T_t(\eta) - T_t(\mu)\|_{D^\star} \le c(r, M)\, (\kappa + \|\mu - \eta\|_{D^\star}). \tag{3.27}$$

Proof. As in the proof of Theorem 3.11, denoting $T_t(\mu) = \mu_t$ and $\tilde T_t(\eta) = \tilde\eta_t$ one can write
$$\mu_t - \tilde\eta_t = (V^{t,0}[\mu_\cdot] - \tilde V^{t,0}[\tilde\eta_\cdot])\mu + \tilde V^{t,0}[\tilde\eta_\cdot](\mu - \eta)$$
and then
$$\sup_{s \le t} \|\mu_s - \tilde\eta_s\|_{D^\star} \le c(r, M)\, t\, \Big( \sup_{s \le t} \|\mu_s - \tilde\eta_s\|_{D^\star} + \kappa \Big) + \|\mu - \eta\|_{D^\star},$$
which implies (3.27) first for small times, and then for all finite times by iterations.
3.1.6 Linearized Evolution Around a Path of a Nonlinear Semigroup

Both for numerical simulations and for the application to interacting particles, it is crucial to analyze the dependence of the solutions to nonlinear kinetic equations on some parameters and on the initial data. Ideally we would like to have smooth dependence. More precisely, suppose we are given a family of operators $A^\alpha[\mu]$, depending on a real parameter $\alpha$ and satisfying the assumptions of Theorem 3.11 for each $\alpha$. For $\mu^\alpha_t = \mu^\alpha_t(\mu^\alpha_0)$, a solution to the corresponding equation (3.1) with the initial condition $\mu^\alpha_0$, we are interested in the derivative
$$\xi_t(\alpha) = \frac{\partial \mu^\alpha_t}{\partial \alpha}. \tag{3.28}$$
In this section we shall start with the analysis of the linearized evolution around a path of a nonlinear semigroup. Namely, differentiating (3.1) (at least formally for the moment) with respect to $\alpha$ yields the equation
$$\frac{d}{dt}(g, \xi_t(\alpha)) = (A^\alpha[\mu^\alpha_t] g, \xi_t(\alpha)) + (D_{\xi_t(\alpha)} A^\alpha[\mu^\alpha_t] g, \mu^\alpha_t) + \left( \frac{\partial A^\alpha[\mu^\alpha_t]}{\partial \alpha} g, \mu^\alpha_t \right), \tag{3.29}$$
with the initial condition
$$\xi_0 = \xi_0(\alpha) = \frac{\partial \mu^\alpha_0}{\partial \alpha}, \tag{3.30}$$
where
$$D_\eta A^\alpha[\mu] = \lim_{s \to 0+} \frac{1}{s} \big( A^\alpha[\mu + s\eta] - A^\alpha[\mu] \big) \tag{3.31}$$
denotes the Gâteaux derivatives of $A[\mu]$ as a mapping $D^\star \to L(D, B)$, assuming that the definition of $A^\alpha[\mu]$ can be extended to a neighborhood of $M$ in $D^\star$. This section is devoted to the preliminary analysis of the solutions to equation (3.29). In the next section we shall explore their connections with the derivatives from the r.h.s. of (3.28).

Let $\tilde D \subset D \subset B$ be, as above, three Banach spaces such that $\|\cdot\|_{\tilde D} \ge \|\cdot\|_D \ge \|\cdot\|_B$, $D$ is dense in $B$ in the topology of $B$ and $\tilde D$ is dense in $D$ in the topology of $B$; and let $M$ and $C_\mu([0, r], M)$ be defined as in Section 3.1.5.
Theorem 3.15. (i) Let, for each $\alpha$, $\xi \mapsto A^\alpha[\xi]$ be a mapping from $B^\star$ to linear operators $A^\alpha[\xi]$ that are uniformly bounded as operators $D \to B$ and $\tilde D \to D$ and such that
$$\|A[\xi] - A[\eta]\|_{D \to B} \le c\, \|\xi - \eta\|_{D^\star}, \qquad \xi, \eta \in B^\star, \tag{3.32}$$
for a constant $c > 0$.
(ii) For any $\alpha$, $\mu \in M$ and $\xi_\cdot \in C_\mu([0, r], M)$, let the operator curve $A^\alpha[\xi_t]$ generate a strongly continuous backward propagator of uniformly bounded linear operators $U^{t,s;\alpha}[\xi_\cdot]$, $0 \le t \le s \le r$, in $B$ on the common invariant domain $D$, and with the dual propagator $V^{s,t;\alpha}[\xi_\cdot]$ preserving the set $M$.
(iii) Let the propagators $\{U^{t,s;\alpha}[\xi_\cdot]\}$, $t \le s$, be strongly continuous and bounded propagators in both $B$ and $D$.
(iv) Let the derivatives $\partial A^\alpha[\mu^\alpha_t]/\partial\alpha$ exist in the norm topologies of $L(D, B)$ and $L(\tilde D, D)$, and represent also bounded operators in $L(D, B)$ and $L(\tilde D, D)$.
(v) Let $A^\alpha[\mu]$ be extendable to a mapping $D^\star \to L(D, B)$ such that the limit in (3.31) exists in the norm topology of $L(D, B)$ for any $\mu \in B^\star$, $\eta \in D^\star$. Moreover, the Gâteaux derivatives $\mu \mapsto D_\eta A^\alpha[\mu]$ are continuous in $\mu$ (taken in the norm topology of $B^\star$) and define a bounded linear operator $D^\star \to L(D, B)$, that is,
$$\|D_\eta A^\alpha[\mu]\|_{D \to B} \le c\, \|\mu\|_{B^\star}\, \|\eta\|_{D^\star} \tag{3.33}$$
with a constant $c$.
(vi) Finally, suppose there exists a representation
$$(D_\eta A^\alpha[\mu] g, \mu) = (F^\alpha[\mu] g, \eta) \tag{3.34}$$
with $F^\alpha[\mu]$ being a continuous mapping $D^\star \to L(D, D)$.
Then, for each $\alpha$ and $\mu \in M$, there exists a weakly-$\star$ continuous in $D^\star$ family of propagators $\Pi^{s,t}[\alpha, \mu]$ (constructed below) solving equation (3.29) in $\tilde D^\star$, that is, for any $\xi_0 \in D^\star$, $\xi^\alpha_t = \Pi^{s,t}[\alpha, \mu]\xi_0$ satisfies (3.29) for any $f \in \tilde D$.

Remark 3.16. Condition (vi) causes no trouble. In fact it follows from duality and an additional weak continuity assumption on $D_\eta$. We shall not formulate this assumption for two reasons. (i) In case of reflexive $B$ it is satisfied automatically. (ii) Though in the case we are most interested in, that is for $B^\star$ being the space of Borel measures, $B$ is not reflexive, in applications to Markov semigroups representation (3.34) again arises automatically, due to the special structure of $A[\mu]$ (of the Lévy–Khintchine type).
Remark 3.17. Construction of propagators from condition (ii) can naturally be carried out via Theorem 3.13, that is via T-products.

Proof. Theorem 3.11 implies that, for any $\alpha$, the weak nonlinear Cauchy problem
$$\frac{d}{dt}(f, \mu^\alpha_t) = (A^\alpha[\mu^\alpha_t] f, \mu^\alpha_t), \qquad \mu_0 = \mu, \quad f \in D, \tag{3.35}$$
is well posed in $M$, and its resolving semigroup $T^\alpha_t$ satisfies (3.25) uniformly in $\alpha$. Next, the equation
$$\frac{d}{dt}(g, \xi_t(\alpha)) = (A^\alpha[\mu^\alpha_t] g, \xi_t(\alpha)) + (D_{\xi_t(\alpha)} A^\alpha[\mu^\alpha_t] g, \mu^\alpha_t) \tag{3.36}$$
has form (3.17) with $F_s$ specified by (3.34), i.e.,
$$(F_s g, \eta) = (F^\alpha[\mu^\alpha_s] g, \eta) = (D_\eta A^\alpha[\mu^\alpha_s] g, \mu^\alpha_s).$$
From (3.33) it follows that
$$\|F_s\|_{D \to D} = \sup_{\|g\|_D \le 1}\ \sup_{\|\eta\|_{D^\star} \le 1} (D_\eta A^\alpha[\mu^\alpha_s] g, \mu^\alpha_s) \le c\, \|\eta\|_{D^\star}\, \|\mu^\alpha_s\|^2_{B^\star}, \tag{3.37}$$
which is uniformly bounded for $\mu^\alpha_s \in M$. Consequently, Theorem 3.8 yields a construction of the strongly continuous family $\{\Phi^{t,r}\}$ in $D$ such that its dual propagator $\{\Psi^{r,t} = (\Phi^{t,r})'\}$ solves the Cauchy problem for equation (3.36). By the Duhamel principle, the solution to equation (3.29) for $r \ge t$ with the initial condition $\xi_t$ can be written as
$$(g, \Pi^{r,t}[\alpha, \mu]\xi_t) = (\Phi^{t,r}[\alpha, \mu] g, \xi_t) + \int_t^r \left( \frac{\partial A^\alpha[\mu^\alpha_s]}{\partial \alpha}\, \Phi^{s,r}[\alpha, \mu] g, \mu^\alpha_s \right) ds. \tag{3.38}$$

Theorem 3.18. Under the assumptions of Theorem 3.15, assume additionally that the backward propagators $\{U^{t,s;\alpha}[\xi_\cdot]\}$, $t \le s$, represent strongly continuous bounded propagators also in $\tilde D$ (and hence, by the last statement of Theorem 3.5, the family $A^\alpha[\xi_t]: D \to B$ also generates $\{U^{t,s;\alpha}\}$, as a propagator in $D$, on $\tilde D$). Then, for each $\alpha$, $\mu \in M$, $\xi_0 \in D^\star$, the curve $\Pi^{s,t}[\alpha, \mu]\xi_0$ represents the unique weakly-$\star$ continuous in $D^\star$ solution to equation (3.29) in $\tilde D^\star$.

Proof. This is a straightforward extension of Theorem 3.15, obtained by taking into account the simple arguments given before Theorem 3.8.

We shall not pay further attention to the somewhat complicated details arising under the conditions of Theorem 3.15, but will use the more natural conditions of Theorem 3.18. We complete this section with an additional stability result for $\Pi^{s,t}$.
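Theorem 3.19. Under the assumptions of Theorem 3.18, suppose that
(i) in addition to (3.32) and (3.33), one has the same properties for the pair $(\tilde D, D)$, i.e.,
$$\|A[\xi] - A[\eta]\|_{\tilde D \to D} \le c\, \|\xi - \eta\|_{D^\star}, \qquad \xi, \eta \in B^\star, \tag{3.39}$$
$$\|D_\eta A^\alpha[\mu]\|_{\tilde D \to D} \le c\, \|\mu\|_{B^\star}\, \|\eta\|_{D^\star}; \tag{3.40}$$
(ii) derivatives of $A^\alpha[\mu]$ are Lipschitz in the norm topology of $D^\star$, more precisely:
$$\left\| \frac{\partial A^\alpha[\mu]}{\partial \alpha} - \frac{\partial A^\alpha[\eta]}{\partial \alpha} \right\|_{\tilde D \to D} \le c\, \|\mu - \eta\|_{D^\star}, \tag{3.41}$$
$$\|D_\xi (A^\alpha[\mu] - A^\alpha[\eta])\|_{D \to B} \le c\, \|\mu - \eta\|_{D^\star}\, \|\xi\|_{D^\star}. \tag{3.42}$$
Suppose now that $\mu^\alpha_0(n) \to \mu^\alpha_0$ in the norm topology of $D^\star$, as $n \to \infty$, for each $\alpha$. Then $\Pi^{s,t}[\alpha, \mu^\alpha_0(n)]\xi_0 \to \Pi^{s,t}[\alpha, \mu^\alpha_0]\xi_0$ weakly-$\star$ in $D^\star$ and in the norm topology of $\tilde D^\star$.

Proof. We shall use the notation for propagators introduced above, adding dependence on $n$ for all objects constructed from $\mu^\alpha_0(n)$. By (3.25) we conclude that $T^\alpha_t \mu^\alpha_0(n) \to T^\alpha_t \mu^\alpha_0$, as $n \to \infty$, in the norm topology of $D^\star$ uniformly in $t$, $\alpha$. Hence, by (3.39) and Theorem 3.6 (applied to the pair of spaces $(\tilde D, D)$),
$$U^{t,s;\alpha}[T^\alpha_\cdot \mu^\alpha_0(n)] \to U^{t,s;\alpha}[T^\alpha_\cdot \mu^\alpha_0]$$
in the norm topology of $L(\tilde D, D)$. Similarly, by (3.40) and (3.41),
$$|(D_\xi A^\alpha[\mu] g, \mu) - (D_\xi A^\alpha[\eta] g, \eta)| \le |(D_\xi (A^\alpha[\mu] - A^\alpha[\eta]) g, \mu)| + |(D_\xi A^\alpha[\eta] g, \mu - \eta)| \le c\, \|\mu - \eta\|_{D^\star}\, \|g\|_{\tilde D}\, \|\xi\|_{D^\star}\, (\|\mu\|_{B^\star} + \|\eta\|_{B^\star}),$$
so that
$$\|F_s[\mu] g - F_s[\eta] g\|_D \le c\, \|\mu - \eta\|_{D^\star}\, \|g\|_{\tilde D}\, (\|\mu\|_{B^\star} + \|\eta\|_{B^\star}),$$
and thus by Theorem 3.6,
$$\Phi^{t,s}[\alpha, T^\alpha_\cdot \mu^\alpha_0(n)] \to \Phi^{t,s}[\alpha, T^\alpha_\cdot \mu^\alpha_0], \qquad n \to \infty,$$
in the norm topology of $L(\tilde D, D)$. Consequently, again by Theorem 3.6,
$$\Psi^{s,t}[\alpha, T^\alpha_\cdot \mu^\alpha_0(n)]\xi \to \Psi^{s,t}[\alpha, T^\alpha_\cdot \mu^\alpha_0]\xi$$
weakly-$\star$ in $D^\star$ and in the norm topology of $\tilde D^\star$, for any $\xi \in D^\star$.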
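Finally, from (3.38) it follows that
$$(g, \Pi^{r,0}[\alpha, \mu(n)]\xi - \Pi^{r,0}[\alpha, \mu]\xi) = ((\Phi^{0,r}(n) - \Phi^{0,r}) g, \xi) + \int_0^r \left( \frac{\partial A^\alpha[\mu^\alpha_s(n)]}{\partial\alpha} (\Phi^{s,r}(n) - \Phi^{s,r}) g, \mu^\alpha_s \right) ds$$
$$\qquad + \int_0^r \left( \frac{\partial A^\alpha[\mu^\alpha_s(n)]}{\partial\alpha}\, \Phi^{s,r}(n) g, \mu^\alpha_s(n) - \mu^\alpha_s \right) ds + \int_0^r \left( \Big( \frac{\partial A^\alpha[\mu^\alpha_s(n)]}{\partial\alpha} - \frac{\partial A^\alpha[\mu^\alpha_s]}{\partial\alpha} \Big) \Phi^{s,r} g, \mu^\alpha_s \right) ds,$$
which allows one to conclude that
$$\|\Pi^{r,0}[\alpha, \mu(n)]\xi - \Pi^{r,0}[\alpha, \mu]\xi\|_{\tilde D^\star} \to 0,$$
as $n \to \infty$, as required.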
3.1.7 Sensitivity Analysis for Nonlinear Propagators

Our final question is whether the solution $\xi_t$ constructed in Theorem 3.15 does in fact yield the derivative (3.28). The difference from the standard case, discussed in textbooks on ODE in Banach spaces, lies in the fact that the solution to the linearized equation (3.29) exists in a different space than the nonlinear curve $\mu_t$ itself.

Theorem 3.20. Under the assumptions of Theorem 3.19, let $\xi_0 = \xi \in B^\star$ be defined by (3.30), where the derivative exists in the norm topology of $\tilde D^\star$ and weakly-$\star$ in $D^\star$. Then the unique solution $\xi_t[\alpha] = \Pi^{t,0}[\alpha, \mu^\alpha_0]\xi_0$ of equation (3.29) constructed in Theorem 3.18 satisfies (3.28), where the derivative exists in the norm topology of $\tilde D^\star$ and weakly-$\star$ in $D^\star$.

Proof. The main idea is to approximate $A^\alpha_s$ by bounded operators, use the standard sensitivity theory for vector-valued ODE and then obtain the required result by passing to the limit. To carry out this program, let us pick a family of operators $A^\alpha_s(n)$, $n = 1, 2, \dots$, bounded in $B$ and $D$, that satisfy all the same conditions as $A^\alpha_s$ and such that $\|(A^\alpha_s(n) - A^\alpha_s) g\|_B \to 0$ for all $g \in D$ and uniformly for all $\alpha$ and $g$ from bounded subsets of $\tilde D$. As such an approximation, one can use either a standard Yosida approximation (which is convenient in an abstract setting) or, in case of the generators of Feller Markov processes, generators of approximating pure-jump Markov processes. As in the proof of Theorem 3.19, we shall use the notation for propagators introduced in the previous section, adding dependence on $n$ for all objects constructed from $A^\alpha_s(n)$.
Since $(A^\alpha_s(n))'$ are bounded linear operators in $B^\star$ and $D^\star$, the equations for $\mu_t$ and $\xi_t$ are both well posed in the strong sense in both $B^\star$ and $D^\star$. Hence, the standard result on the differentiation with respect to initial data is applicable (see, e.g., [25] or Appendix D in [18]), leading to the conclusion that $\xi_t[\alpha](n)$ represent the derivatives of $\mu^\alpha_t(n)$ in both $B^\star$ and $D^\star$. Consequently,
$$\mu^\alpha_t(n) - \mu^{\alpha_0}_t(n) = \int_{\alpha_0}^{\alpha} \xi_t[\beta](n)\, d\beta \tag{3.43}$$
holds as an equation in $D^\star$ (and in $B^\star$ whenever $\xi \in B^\star$).

Using Theorem 3.14 we deduce the convergence of $\mu^\alpha_t(n)$ to $\mu^\alpha_t$ in the norm topology of $D^\star$. Consequently, using Theorem 3.19 we can deduce the convergence of $\xi^\alpha_t(n)$ to $\xi^\alpha_t$ in the norm topology of $\tilde D^\star$. Hence, we can pass to the limit $n \to \infty$ in equation (3.43) in the norm topology of $\tilde D^\star$, yielding the equation
$$\mu^\alpha_t - \mu^{\alpha_0}_t = \int_{\alpha_0}^{\alpha} \xi_t[\beta]\, d\beta, \tag{3.44}$$
where all objects are well defined in $(C^1_\infty(\mathbb{R}^d))^\star$. This equation together with continuous dependence of $\xi_t$ on $\alpha$ (which is proved in literally the same way as continuous dependence on $\mu$ in Theorem 3.19) implies (3.28) in the sense required.
Applying Theorem 3.20 for the case of $A_s$ not depending on any additional parameter, we obtain directly the smooth dependence of the nonlinear evolution $\mu_t$ on the initial data. Namely, for $\mu_t = \mu_t(\mu_0)$, a solution to (3.1) with the initial condition $\mu_0$, we can define the Gâteaux derivatives
$$\xi_t(\mu_0, \xi) = D_\xi \mu_t(\mu_0) = \lim_{s \to 0+} \frac{1}{s} \big( \mu_t(\mu_0 + s\xi) - \mu_t(\mu_0) \big). \tag{3.45}$$
Differentiating (3.1) with respect to initial data yields
$$\frac{d}{dt}(g, \xi_t(\mu_0, \xi)) = (A[\mu_t] g, \xi_t(\mu_0, \xi)) + (D_{\xi_t(\mu_0, \xi)} A[\mu_t] g, \mu_t), \tag{3.46}$$
which represents a simple particular case of equation (3.29). Hence, Theorem 3.20 implies that, under the assumptions of this theorem (that do not involve the dependence on $\alpha$), the derivative (3.45) does exist and is given by the unique solution to equation (3.46) with the initial condition $\xi_0 = \xi$. However, this existence and well-posedness hold weakly-$\star$ in $\tilde D^\star$, not in $B^\star$, as the nonlinear evolution itself.
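The content of Theorem 3.20 can be checked by hand in a finite-dimensional toy case, where all the spaces coincide and the subtleties about $\tilde D^\star$ disappear. The Python sketch below is an assumption-laden illustration, not the construction of the text: the state space is $\{0, 1\}$, the hypothetical family $A^\alpha[\mu]$ has jump rates $q_{01} = 1 + \alpha \mu_1$ and $q_{10} = 1$, and both the nonlinear equation and the linearized equation (3.29) are integrated by the explicit Euler scheme. The three terms of (3.29) are evaluated on the test function $g = \mathbf{1}_{\{1\}}$.

```python
def step(mu, xi, alpha, h):
    """One Euler step for the pair (mu_t, xi_t) on the two-point state space."""
    q01, q10 = 1.0 + alpha * mu[1], 1.0   # hypothetical rates of A^alpha[mu]
    flux = q01 * mu[0] - q10 * mu[1]      # (A^alpha[mu] g, mu): nonlinear equation
    lin = q01 * xi[0] - q10 * xi[1]       # (A^alpha[mu] g, xi)
    gateaux = alpha * xi[1] * mu[0]       # (D_xi A^alpha[mu] g, mu)
    d_alpha = mu[1] * mu[0]               # ((dA^alpha[mu]/d alpha) g, mu)
    dxi = lin + gateaux + d_alpha         # right-hand side of (3.29) for g = 1_{1}
    new_mu = (mu[0] - h * flux, mu[1] + h * flux)
    new_xi = (xi[0] - h * dxi, xi[1] + h * dxi)
    return new_mu, new_xi

def evolve(alpha, mu0=(1.0, 0.0), r=1.0, n=1000):
    """Integrate mu_t and its alpha-derivative xi_t up to time r."""
    mu, xi = mu0, (0.0, 0.0)  # initial data does not depend on alpha, so xi_0 = 0
    for _ in range(n):
        mu, xi = step(mu, xi, alpha, r / n)
    return mu, xi

alpha, eps = 0.7, 1e-5
mu_plus, _ = evolve(alpha + eps)
mu_minus, _ = evolve(alpha - eps)
finite_difference = (mu_plus[1] - mu_minus[1]) / (2 * eps)
_, xi = evolve(alpha)
```

Here `xi[1]` should match `finite_difference` closely, since the Euler scheme for the linearized equation is the exact derivative of the Euler scheme for the nonlinear one. The identity `xi[0] + xi[1] == 0` (up to rounding) reflects the fact that the derivative of a family of probability measures has total mass zero.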
3.1.8 Back to Nonlinear Markov Semigroups

We developed the theory in the most abstract form, for general nonlinear evolutions in Banach spaces, not even using positivity. This unified exposition allows one to obtain various concrete evolutions as a direct consequence of one general result. The main application we have in mind concerns the families $A[\mu]$ of the Lévy–Khintchine type form (with variable coefficients):
$$A[\mu] u(x) = \frac12 (G_\mu(x) \nabla, \nabla) u(x) + (b_\mu(x), \nabla u(x)) + \int \big[ u(x+y) - u(x) - (y, \nabla u(x)) \mathbf{1}_{B_1}(y) \big]\, \nu_\mu(x, dy), \tag{3.47}$$
where $\nu_\mu(x, \cdot)$ is a Lévy measure for all $x \in \mathbb{R}^d$, $\mu \in \mathcal{P}(\mathbb{R}^d)$. The basic examples were given in the introduction.

Applied to nonlinear Lévy processes specified by the families (3.4), our general results yield the following.

Theorem 3.21. Suppose the coefficients of a family (3.4) depend on $\mu$ Lipschitz continuously in the norm of the Banach space $(C^2_\infty(\mathbb{R}^d))'$ dual to $C^2_\infty(\mathbb{R}^d)$, i.e.,
$$\|G(\mu) - G(\eta)\| + \|b(\mu) - b(\eta)\| + \int \min(1, |y|^2)\, |\nu(\mu, dy) - \nu(\eta, dy)| \le \kappa\, \|\mu - \eta\|_{(C^2_\infty(\mathbb{R}^d))'} = \kappa \sup_{\|f\|_{C^2_\infty(\mathbb{R}^d)} \le 1} |(f, \mu - \eta)| \tag{3.48}$$
with constant $\kappa$. Then there exists a unique nonlinear Lévy semigroup generated by $A_\mu$, and hence a unique nonlinear Lévy process.

Proof. The well-posedness of all intermediate propagators is obvious in case of Lévy processes, because they are constructed via the Fourier transform, literally like Lévy semigroups (details are given in [18]). Of course here $M = \mathcal{P}(\mathbb{R}^d)$, $D = C^2_\infty(\mathbb{R}^d)$, $\tilde D = C^4_\infty(\mathbb{R}^d)$.

Remark 3.22. Condition (3.48) is not at all weird. It is satisfied, for instance, when the coefficients $G$, $b$, $\nu$ depend on $\mu$ via certain integrals (possibly multiple) with smooth enough densities, i.e., in a way that is usually met in applications.

Applied to processes of order at most one specified by the families (3.5), our general results yield the following.

Theorem 3.23. Assume that for any $\mu \in \mathcal{P}(\mathbb{R}^d)$, $b(\cdot, \mu) \in C^1(\mathbb{R}^d)$ and $\nabla \nu(x, \mu, dy)$ (gradient with respect to $x$) exists in the weak sense as a signed measure and depends weakly continuously on $x$. Let the following conditions hold:
3.1 Nonlinear Lévy and Nonlinear Feller Processes: an Analytic Introduction
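(i) boundedness:

$$\sup_{x,\mu} \int \min(1,|y|)\,\nu(x,\mu,dy) < \infty, \qquad \sup_{x,\mu} \int \min(1,|y|)\,|\nabla\nu(x,\mu,dy)| < \infty; \tag{3.49}$$

(ii) tightness: for any $\epsilon > 0$ there exists a $K > 0$ such that

$$\sup_{x,\mu} \int_{\mathbb{R}^d \setminus B_K} \nu(x,\mu,dy) < \epsilon, \qquad \sup_{x,\mu} \int_{\mathbb{R}^d \setminus B_K} |\nabla\nu(x,\mu,dy)| < \epsilon, \tag{3.50}$$

$$\sup_{x,\mu} \int_{B_{1/K}} |y|\,\nu(x,\mu,dy) < \epsilon; \tag{3.51}$$

(iii) Lipschitz continuity:

$$\sup_{x} \int \min(1,|y|)\,|\nu(x,\mu_1,dy) - \nu(x,\mu_2,dy)| \le c\,\|\mu_1 - \mu_2\|_{(C^1_\infty(\mathbb{R}^d))^\star}, \tag{3.52}$$

$$\sup_{x} |b(x,\mu_1) - b(x,\mu_2)| \le c\,\|\mu_1 - \mu_2\|_{(C^1_\infty(\mathbb{R}^d))^\star} \tag{3.53}$$

uniformly for bounded $\mu_1$, $\mu_2$.

Then the weak nonlinear Cauchy problem (3.1) with $A_\mu$ given by (3.5) is well posed, i.e., for any $\mu \in \mathcal{M}(\mathbb{R}^d)$ it has a unique solution $T_t(\mu) \in \mathcal{M}(\mathbb{R}^d)$ (so that (3.5) holds for all $g \in C^1_\infty(\mathbb{R}^d)$) preserving the norm, and the transformations $T_t$ of $\mathcal{P}(\mathbb{R}^d)$, $t \ge 0$, form a semigroup depending Lipschitz continuously on time $t$ and the initial data in the norm of $(C^1_\infty(\mathbb{R}^d))^\star$.

Proof. Here we use $M = \mathcal{P}(\mathbb{R}^d)$, $D = C^1_\infty(\mathbb{R}^d)$, $\tilde{D} = C^2_\infty(\mathbb{R}^d)$. The corresponding auxiliary propagators required in Theorem 3.1 are constructed in [18, Chapter 4] and [19, Chapter 5].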
In both cases above, straightforward additional smoothness assumptions on the coefficients of the generator yield smoothness with respect to parameters and/or initial data via Theorem 3.20. Similarly, one gets the well-posedness for mixtures of nonlinear diffusions and stable-like processes given by (3.21) with coefficients depending on the distribution $\mu$. Our theory also applies to nonlinear stable-like processes on manifolds, see [18, Section 11.4], and to nonlinear dynamic quantum semigroups, see [18, Section 11.3]. Let us stress again, referring to [14, 15, 18], that the first and second derivatives of nonlinear Markov semigroups with respect to initial data (for simplicity, we dealt only with the first derivative here) describe the dynamic law of large numbers for interacting particle systems and the corresponding central limit theorem for fluctuations, respectively.
3.1.9 Concluding Remarks

We provided a quick introduction to the basic analytical aspects of the theory of nonlinear Markov processes that model numerous processes from the natural, social and life sciences. The latter are related to the replicator dynamics of evolutionary biology. More generally, applications to life science, engineering and finance arise from the control and optimization problems built into a nonlinear Markov process. For developments in these directions we refer to [8, 13, 21, 23, 24].
Bibliography

[1] D. Applebaum, Lévy Processes and Stochastic Calculus, Cambridge Studies in Advanced Mathematics 93, Cambridge University Press, Cambridge, 2004.
[2] I. Bailleul, Sensitivity for Smoluchovski equation, J. Phys. A, 44 (2011), no. 24, 245004.
[3] Ya. I. Belopol'skaya, Nonlinear equations in diffusion theory. Probability and statistics. Part 4, Zap. Nauchn. Sem. POMI, 278, POMI, St. Petersburg, 2001, 15–35; English version: Journal of Mathematical Sciences, 118:6 (2003), 5513–5524.
[4] Ya. I. Belopol'skaya, A probabilistic approach to a solution of nonlinear parabolic equations, Theory Probab. Appl., 49:4 (2005), 589–611.
[5] Ya. I. Belopol'skaya and Yu. L. Daletcky, Stochastic Equations and Differential Geometry, Kluwer, Norwell, MA, 1990.
[6] T. D. Frank, Nonlinear Markov processes, Phys. Lett. A, 372:25 (2008), 4553–4555.
[7] M. I. Freidlin, Quasilinear parabolic equations and measures on a function space, Funct. Anal. Appl., 1 (1967), 234–240.
[8] N. Gast, B. Gaujal, A Mean Field Approach for Optimization in Particle Systems and Applications, in: Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools, DOI: 10.4108/ICST.VALUETOOLS2009.7477.
[9] I. I. Gikhman, A. V. Skorokhod, The Theory of Stochastic Processes I. Translated from the Russian by S. Kotz, reprint of the 1974 edition, Classics in Mathematics, Springer, Berlin, 2004.
[10] I. I. Gikhman, A. V. Skorokhod, The Theory of Stochastic Processes II. Translated from the Russian by S. Kotz, reprint of the 1975 edition, Classics in Mathematics, Springer, Berlin, 2004.
[11] H. Guérin, S. Méléard, E. Nualart, Estimates for the density of a nonlinear Landau process, Journal of Functional Analysis, 238 (2006), 649–677.
[12] V. N. Kolokoltsov, On the regularity of solutions to the spatially homogeneous Boltzmann equation with polynomially growing collision kernel, Advanced Studies in Contemp. Math., 12 (2006), 9–38.
[13] M. Huang, P. E. Caines, R. P. Malhamé, The NCE (mean field) principle with locality dependent cost interactions, IEEE Trans. Automat. Control, 55:12 (2010), 2799–2805.
[14] V. N. Kolokoltsov, Nonlinear Markov Semigroups and Interacting Lévy Type Processes, Journ. Stat. Physics 126:3 (2007), 585–642.
[15] V. N. Kolokoltsov, The central limit theorem for the Smoluchovski coagulation model, arXiv:0708.0329v1 [math.PR] 2007, Prob. Theory Relat. Fields, 146:1 (2010), 87–153.
[16] V. N. Kolokoltsov, The Lévy–Khintchine type operators with variable Lipschitz continuous coefficients generate linear or nonlinear Markov processes and semigroups, Probability Theory Related Fields, Online First DOI 10.1007/s00440-010-0293-8.
[17] V. N. Kolokoltsov, Stochastic integrals and SDE driven by nonlinear Lévy noise, in: D. Crisan (ed.), Stochastic Analysis in 2010, pp. 227–242, Springer, Berlin–Heidelberg, 2011.
[18] V. N. Kolokoltsov, Nonlinear Markov Processes and Kinetic Equations, Cambridge University Press, Cambridge, 2010.
[19] V. N. Kolokoltsov, Markov Processes, Semigroups and Generators, De Gruyter, Berlin, 2011.
[20] V. N. Kolokoltsov, O. A. Malafeyev, Understanding Game Theory, World Scientific, New Jersey, London, 2010.
[21] V. N. Kolokoltsov, J. Li, W. Yang, Mean Field Games and Nonlinear Markov Processes, in preparation.
[22] A. E. Kyprianou, Introductory Lectures on Fluctuations of Lévy Processes with Applications, Universitext, Springer, Berlin, 2006.
[23] J.-Y. Le Boudec, D. McDonald, J. Mundinger, A generic mean field convergence result for systems of interacting objects, in: QEST 2007 (4th International Conference on Quantitative Evaluation of SysTems), Princeton, New Jersey–Woodstock, Oxfordshire, pp. 3–18, 2007.
[24] J.-M. Lasry, P.-L. Lions, Mean field games, Japanese Journal of Mathematics, 2:1 (2007), 229–260.
[25] R. H. Martin, Nonlinear Operators and Differential Equations in Banach Spaces, New York, 1976.
[26] D. W. Stroock, Markov Processes from K. Ito's Perspective, Annals of Mathematics Studies, Princeton University Press, 2003.
[27] M. Zak, Dynamics of intelligent systems, Int. J. Theor. Phys., 39:8 (2000), 2107–2140.
[28] M. Zak, Quantum evolution as a nonlinear Markov process, Foundations of Physics Letters 15:3 (2002), 229–243.
Author Information Vassili N. Kolokoltsov, Department of Statistics, University of Warwick, Coventry, UK E-mail: [email protected]
4
New Results in Mathematical Epidemiology and Modeling Dynamics of Infectious Diseases
Vitaly A. Stepanenko and Nikolai Tarkhanov
4.1 Formal Solutions of Epidemic Equation
Abstract. Markov stochastic processes give rise to evolution equations of the form $\dot{u} = Au$, where $A$ is a stationary differential operator with polynomial coefficients. The unknown function $u$ is the probability generating function of the process, and so it takes its values in bounded holomorphic functions on the unit polydisk in $\mathbb{C}^n$. As but one example of such equations we study in this paper the epidemic equation. We derive an explicit formula for the formal power series solution of the epidemic equation under initial conditions. For general equations $\dot{u} = Au$ we discuss the solvability question under the initial condition $u(0) = u_0$ with data in holomorphic functions on the unit polydisk.

Keywords. Asymptotic Expansion, Epidemic Equation, Evolution Equation, Representation of Solution, Series Solution

2010 Mathematics Subject Classification. Primary: 34M25; Secondary: 35C10, 35C20
4.1.1 Introduction

By a stochastic process is meant some possible physical process in the real world that has some random or stochastic element involved in its structure. Many obvious examples of such processes are to be found in various branches of science and technology, for example, the phenomenon of Brownian motion, the growth of a bacterial colony, the transmission of infection from person to person, or the fluctuating numbers of electrons and photons in a cosmic-ray shower. In many of these examples the statistical or random variables under study, such as the coordinates of a Brownian particle, are changing with time. In practice, many processes are found to be Markovian, cf. for instance [2]. For Markov processes of the population type a wide class of partial differential equations is available for the probability generating function of the process, cf. Section 3.5 in [2].

This paper was initiated by a question of Rem G. Khlebopros concerning solutions to the epidemic equation. The latter is a second order partial differential equation of
The research of the first author was supported by grant NSh-7347.2010.1 and the DFG grant TA 289/4-1.
4 New Results in Mathematical Epidemiology
evolution type

$$u'_t = \lambda\,(x^2 - xy)\,u''_{x,y} + \mu\,(1 - x)\,u'_x + \nu\,(y - 1)\,u \tag{4.1}$$

occurring in the mathematical theory of epidemics, see [6, 10, 15]. Here, $\lambda$, $\mu$ and $\nu$ are nonnegative constants characterizing the stochastic process, see Section 4.1.2. So the operator $A$ on the right-hand side of (4.1) is a partial differential operator with polynomial coefficients on the plane. As usual one looks for a solution of (4.1) which is subject to the initial condition $u|_{t=0} = u_0$.

Note that (4.1) is not of Cauchy–Kovalevskaya type, hence this theorem is not applicable. An abstract Cauchy–Kovalevskaya theorem still applies to the epidemic equation, thus implying the existence of a unique solution $u$ valued in the scale of Gevrey classes $G_1(\Omega; M(s))$ with $s$ between $0$ and $1$, see Example 6.1 of [17]. However, this solution is not constructive. The construction of the holomorphic semigroup $S_t = \exp(tA)$ described in Section 9.1.6 of [5] is rather complicated, for it uses the resolvent of $A$. Thus, the general theory fails to work.

To be well motivated we seek a formal power series solution to the Cauchy problem for (4.1) with data at $t = 0$. An explicit formula is given in Section 4.1.3. This solution proves to converge in the topology of formal power series, as defined by Krull [9]. Moreover, it is unique. Hence, it follows at once that the Cauchy problem for the epidemic equation has at most one analytic solution in a neighborhood of the origin with data at $t = 0$. However, the formal power series solution need not converge pointwise to any function in a neighborhood of the origin in $\mathbb{R}^3$ for every smooth initial datum $u_0$ on the plane $t = 0$.

To test the solvability we apply separation of variables to the initial problem for (4.1) in Section 4.1.4. A function $u(t; x, y) = \exp(\kappa t)\,u_0(x, y)$ satisfies the initial problem if and only if $Au_0 = \kappa u_0$, i.e., $u_0$ is an eigenfunction of the operator $A$ corresponding to an eigenvalue $\kappa$.
When restricted to the space of polynomials of $x$ and $y$, the operator $A$ preserves polynomials of any degree, and so it has discrete spectrum, each eigenvalue being of finite multiplicity. In this way we obtain a countable linearly independent system of polynomials whose finite linear combinations are dense in the space of admissible Cauchy data.

To specify the function-theoretic setting we take into account that the unknown function $u$ in (4.1) is the probability generating function of the stochastic process. This is an analytic function of $(x, y)$ in the square $Q^2 = \{(x, y) \in \mathbb{R}^2 : |x| < 1,\ |y| < 1\}$ given by

$$p(t; x, y) := \sum_{j,k = 0}^{\infty} p_{j,k}(t)\, x^j y^k,$$

where $0 \le p_{j,k}(t) \le 1$ are transition probabilities whose sum over all $j$ and $k$ just amounts to $1$. Hence, it follows that $p(t; x, y)$ extends to a holomorphic function on the bidisk $D^2 = \{(x, y) \in \mathbb{C}^2 : |x| < 1,\ |y| < 1\}$ and

$$|p(t; x, y)| \le p(t; 1, 1) = 1$$

for all $(x, y) \in D^2$. Admissible Cauchy data are therefore holomorphic functions $u_0$ in the bidisk whose modulus is bounded by $1$. We are thus led to complex dynamical systems to be treated in Section 4.1.5.

It is worth pointing out that the epidemic equation was used by Gani [3, 4] in his stochastic model of AIDS.
4.1.2 Epidemic Models

Let $\{X(t)\}_{t \ge 0}$ be a Markov process where $X(t)$ is a random vector with values in $\mathbb{Z}^n_{\ge 0}$. For $j = 1, \dots, n$, the $j$th component $X_j(t)$ of the vector $X(t)$ can be thought of as the number of individuals of the $j$th type at time $t$. Write $e_j$ for the $n$-dimensional vector whose $j$th component is $1$ and the other components $0$. Given a multi-index $\alpha = (\alpha_1, \dots, \alpha_n)$, we denote by $p_{j,\alpha}(t)$ the probability for the $j$th type individual to have $\alpha_1$ offspring of type 1, $\alpha_2$ offspring of type 2, and so on, if $X(0) = e_j$. Obviously,

$$\sum_{\alpha \in \mathbb{Z}^n_{\ge 0}} p_{j,\alpha}(t) = 1$$

for all $j = 1, \dots, n$. The functions

$$p_j(t; z) = E\big(z^{X_j(t)}\big) = \sum_{\alpha \in \mathbb{Z}^n_{\ge 0}} p_{j,\alpha}(t)\, z^\alpha$$

are called probability generating functions of the stochastic process. The vector-valued function $p = (p_1, \dots, p_n)$ is equivalent to the partition function of statistical mechanics. Write $|z|_\infty = \max\{|z_1|, \dots, |z_n|\}$ for the polycylindric norm of $z = (z_1, \dots, z_n)$ in $\mathbb{C}^n$. It is easy to see that

$$|p(t; z)|_\infty \le \max_{j} \sum_{\alpha \in \mathbb{Z}^n_{\ge 0}} |p_{j,\alpha}(t)| = 1$$

whenever $t \ge 0$ and $z \in D^n$. Hence $p = p(t; z)$ is a family of holomorphic self-mappings of the unit polycylinder $D^n$ parametrized by $t \ge 0$.

The primitive iterative relations for Markov chains in discrete time led naturally to differential and integro-differential equations of various types, see [8]. The classification of these various equations becomes even clearer if one examines the character of the 'forward' and 'backward' equations for Markov processes in continuous time, see for instance [2, Section 3.5]. In terms of the probability generating function $p(t; z)$ they have the form

$$p'_t(t; z) = F\Big(t,\ \log z_1, \dots, \log z_n;\ z_1 \frac{\partial}{\partial z_1}, \dots, z_n \frac{\partial}{\partial z_n}\Big)\, p(t; z).$$

The appropriate equation for many Markov chains can be written down at once in terms of the possible transitions.
We now turn to a typical model associated with transmission of infection from person to person. Assume that a population has at any time $t$ a number $S(t)$ of individuals susceptible to a certain disease, and a number $I(t)$ of individuals actually infected. We shall for the moment suppose that $S$ (but not $I$) can be augmented by new susceptibles entering the population from outside. For example, in the case of measles, which is one of the most convenient epidemic diseases available for study, children are constantly growing up into the critical age period.

We assume the following idealized scheme of random transitions, where the probability generating function variable $x$ corresponds to $I$ and $y$ to $S$. The transition $xy \mapsto x^2$ has rate $\lambda$ and is represented by the operator $\partial^2/\partial x\,\partial y$. The transition $x \mapsto 1$ has rate $\mu$ and is represented by $\partial/\partial x$. The transition $1 \mapsto y$ has rate $\nu$ and is represented by the operator $1$. The equation for the probability generating function $p(t; x, y)$ is correspondingly

$$p'_t = \lambda\,(x^2 - xy)\,p''_{x,y} + \mu\,(1 - x)\,p'_x + \nu\,(y - 1)\,p, \tag{4.2}$$

see [2, p. 124]. A large part of the literature has been devoted to the 'deterministic' form of the equation, see [6, 15] (cf., however, [10]). For direct constructions along stochastic lines we refer the reader to [1–4], etc. We restrict the discussion to an isolated susceptible population with $\mu \neq 0$, where the infected individuals are removed in due course. Then $\nu = 0$.
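The transition scheme above can be illustrated by simulating the underlying jump process directly. The following is only a minimal sketch, assuming the usual reading of the three transitions as infection at rate `lam * S * I`, removal at rate `mu * I` and arrival of a susceptible at rate `nu`; the function name `simulate_epidemic` and its parameters are hypothetical and introduced here for illustration.

```python
import random


def simulate_epidemic(s0, i0, lam, mu, nu=0.0, t_max=float("inf"), seed=0):
    """Gillespie-type simulation of the (S, I) jump process (illustrative sketch).

    Assumed transitions:
      infection   (S, I) -> (S - 1, I + 1)  at rate lam * S * I
      removal     (S, I) -> (S,     I - 1)  at rate mu * I
      immigration (S, I) -> (S + 1, I)      at rate nu
    Returns the list of visited states as tuples (time, S, I).
    """
    rng = random.Random(seed)
    t, s, i = 0.0, s0, i0
    path = [(t, s, i)]
    while t < t_max:
        rates = (lam * s * i, mu * i, nu)
        total = sum(rates)
        if total == 0.0:
            # Absorbing state: no transition is possible any more.
            break
        t += rng.expovariate(total)
        if t >= t_max:
            break
        r = rng.random() * total
        if r < rates[0]:
            s, i = s - 1, i + 1      # one susceptible becomes infected
        elif r < rates[0] + rates[1]:
            i -= 1                   # one infected individual is removed
        else:
            s += 1                   # a new susceptible enters
        path.append((t, s, i))
    return path


if __name__ == "__main__":
    final_time, final_s, final_i = simulate_epidemic(20, 1, 0.1, 1.0)[-1]
    print(final_time, final_s, final_i)
```

With `nu = 0` (the isolated population considered in the text) every run stops after finitely many jumps, once no infected individuals remain.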
4.1.3 Formal Solutions

Let $p(t; x, y)$ be a probability generating function of the epidemic stochastic process. Write $u(t; x, y)$ for either of the two components of $p(t; x, y)$. Then $u(t; x, y)$ is required to satisfy

$$\begin{cases} u'_t = \lambda\,(x^2 - xy)\,u''_{x,y} + \mu\,(1 - x)\,u'_x & \text{for } t > 0, \\ u = u_0 & \text{for } t = 0, \end{cases} \tag{4.3}$$

where $u_0$ is a smooth function of $(x, y) \in Q^2$. From what has been said in Section 4.1.2 it follows that for the existence of a solution to (4.3) with desired properties it is necessary that $u_0$ be actually the restriction to $Q^2$ of a holomorphic function of modulus $\le 1$ in the bidisk $D^2$. We are thus interested in the initial or Cauchy problem for solutions of the epidemic equation (4.1).

We first look for a solution of the problem in the form of a formal power series in $t$, $x$, $y$

$$u(t; x, y) = \sum_{i,j,k = 0}^{\infty} u_{i,j,k}\, \frac{t^i}{i!}\, \frac{x^j}{j!}\, \frac{y^k}{k!}. \tag{4.4}$$

On substituting this series into (4.3) and equating the coefficients of the same powers $t^i x^j y^k$ on both sides of the equality we arrive at the system

$$\begin{aligned} u_{i+1,0,k} &= \mu\, u_{i,1,k}, \\ u_{i+1,1,0} &= \mu\, u_{i,2,0} - \mu\, u_{i,1,0}, \\ u_{i+1,1,k} &= \mu\, u_{i,2,k} - (\lambda k + \mu)\, u_{i,1,k}, \\ u_{i+1,j,0} &= \mu\, u_{i,j+1,0} - \mu j\, u_{i,j,0} + \lambda j(j-1)\, u_{i,j-1,1} \end{aligned}$$

valid for all $i, j, k \ge 0$. These equations can actually be written in the unified recurrent form

$$u_{i+1,j,k} = \lambda j(j-1)\, u_{i,j-1,k+1} - j(\lambda k + \mu)\, u_{i,j,k} + \mu\, u_{i,j+1,k}, \tag{4.5}$$
as is easy to check. To solve (4.5) we represent it schematically in the form of a "Newton polyhedron" or "Newton triangle" (cf. Figure 4.1).

[Figure 4.1: lattice points $(j-1, k+1)$, $(j-1, k)$, $(j, k+1)$, $(j, k)$, $(j+1, k)$ in the $(j, k)$-plane.]

Figure 4.1. The "Newton triangle" $\Delta_1$.
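Expanding the initial data $u_0(x, y)$ as a Taylor series around the origin in $\mathbb{R}^2$, we obtain

$$u_0(x, y) = \sum_{j,k = 0}^{\infty} \frac{\partial^{j+k} u_0}{\partial x^j\, \partial y^k}(0, 0)\, \frac{x^j}{j!}\, \frac{y^k}{k!}.$$

The initial condition now implies

$$u_{0,j,k} = \frac{\partial^{j+k} u_0}{\partial x^j\, \partial y^k}(0, 0)$$

for all $j, k \ge 0$, and so we compute

$$u_{1,j,k} = \lambda j(j-1)\, u_{0,j-1,k+1} - j(\lambda k + \mu)\, u_{0,j,k} + \mu\, u_{0,j+1,k}. \tag{4.6}$$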
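The passage from the coefficients of level $i$ to those of level $i+1$ is purely mechanical, so it can be sketched in a few lines of code. The snippet below is an illustration only: the names `next_level`, `lam` and `mu` are hypothetical, the two rate constants of the equation are passed in as exact fractions, and the normalized coefficients $u_{i,j,k}$ of a polynomial initial datum are stored in a dictionary keyed by $(j, k)$.

```python
from fractions import Fraction


def next_level(u, lam, mu):
    """One step of the three-term recurrence for the formal coefficients.

    `u` maps (j, k) to u_{i,j,k}; the result maps (j, k) to u_{i+1,j,k}.
    lam multiplies the second order term, mu the first order term.
    """
    # Collect the indices (j, k) that can receive a nonzero contribution.
    targets = set()
    for (j, k) in u:
        targets.add((j, k))              # diagonal term u_{i,j,k}
        if k >= 1:
            targets.add((j + 1, k - 1))  # u_{i,j,k} acts as u_{i,j'-1,k'+1}
        if j >= 1:
            targets.add((j - 1, k))      # u_{i,j,k} acts as u_{i,j'+1,k'}
    out = {}
    for (j, k) in targets:
        value = (lam * j * (j - 1) * u.get((j - 1, k + 1), 0)
                 - j * (lam * k + mu) * u.get((j, k), 0)
                 + mu * u.get((j + 1, k), 0))
        if value != 0:
            out[(j, k)] = value
    return out


if __name__ == "__main__":
    lam, mu = Fraction(2), Fraction(3)
    u0 = {(1, 1): Fraction(1)}           # initial datum u_0 = x*y
    print(next_level(u0, lam, mu))
```

Iterating `next_level` produces the coefficients of every order in $t$. For the datum $u_0 = x - 1$ each step merely multiplies the coefficients by the negative of `mu`, which reflects the fact that $x - 1$ is an eigenfunction of the right-hand side.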
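Using (4.5) with $i$ replaced by $i + 1$, we compute $u_{i+2,j,k}$. This yields immediately the formula

$$\begin{aligned} u_{i+2,j,k} ={}& \lambda^2 j(j-1)^2(j-2)\, u_{i,j-2,k+2} - \lambda j(j-1)\big((j-1)\lambda + (2j-1)(\lambda k + \mu)\big)\, u_{i,j-1,k+1} \\ &+ 2\lambda\mu j^2\, u_{i,j,k+1} + j^2(\lambda k + \mu)^2\, u_{i,j,k} - \mu(2j+1)(\lambda k + \mu)\, u_{i,j+1,k} + \mu^2\, u_{i,j+2,k} \end{aligned}$$

for all $i, j, k \ge 0$. In particular, we get

$$\begin{aligned} u_{2,j,k} ={}& \lambda^2 j(j-1)^2(j-2)\, u_{0,j-2,k+2} - \lambda j(j-1)\big((j-1)\lambda + (2j-1)(\lambda k + \mu)\big)\, u_{0,j-1,k+1} \\ &+ 2\lambda\mu j^2\, u_{0,j,k+1} + j^2(\lambda k + \mu)^2\, u_{0,j,k} - \mu(2j+1)(\lambda k + \mu)\, u_{0,j+1,k} + \mu^2\, u_{0,j+2,k} \end{aligned} \tag{4.7}$$

for all $j, k \ge 0$. The expression is cumbersome; however, its "Newton polyhedron" $\Delta_2$ is rather simple (cf. Figure 4.2).

[Figure 4.2: lattice points $(j-2, k+2)$, $(j, k+2)$, $(j-1, k+1)$, $(j, k+1)$, $(j-2, k)$, $(j-1, k)$, $(j, k)$, $(j+1, k)$, $(j+2, k)$ in the $(j, k)$-plane.]

Figure 4.2. The "Newton triangle" $\Delta_2$.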
Arguing in this way we obtain a formula for the coefficients $u_{i,j,k}$ in the general case. It looks like

$$u_{i,j,k} = \sum_{k' = 0}^{i}\ \sum_{j' = -k'}^{i - 2k'} c^{\,j',k'}_{i,j,k}(\lambda, \mu)\, u_{0,j+j',k+k'} \tag{4.8}$$

for $i, j, k \ge 0$, where $c^{\,j',k'}_{i,j,k}$ are polynomials of $\lambda$ and $\mu$ of degree $i$. On substituting these expressions into (4.4) we arrive at a formal power series satisfying the initial problem (4.3), see [9].
Theorem 4.1. For any formal power series $u_0$ at $0 \in \mathbb{R}^2$, the epidemic equation has exactly one formal power series solution $u$ at $0 \in \mathbb{R}^3$ satisfying $u|_{t=0} = u_0$.

It is worth mentioning that (4.3) does not satisfy the hypothesis of the Cauchy–Kovalevskaya theorem, for the principal homogeneous part of the equation degenerates on the planes $\{x = 0\}$ and $\{x - y = 0\}$. The problem is fit for analysis on manifolds with edges.

Interchanging the sums in (4.4), (4.8) leads in general beyond formal power series, implying

$$u(t, x, y) = \sum_{j,k = 0}^{\infty} u_{0,j,k}\, u_{j,k}(t, x, y)$$

where $u_{j,k}$ are formal power series of $t$, $x$, $y$ satisfying

$$\begin{cases} \partial_t u_{j,k} = \lambda\,(x^2 - xy)\,\partial_x \partial_y u_{j,k} + \mu\,(1 - x)\,\partial_x u_{j,k} & \text{for } t > 0, \\[2pt] u_{j,k} = \dfrac{x^j}{j!}\, \dfrac{y^k}{k!} & \text{for } t = 0. \end{cases} \tag{4.9}$$

By Theorem 4.1, the initial problem (4.9) has a unique solution for all $j, k \ge 0$. In particular, for $j = 0$ this is

$$u_{0,k}(t, x, y) = \frac{y^k}{k!}.$$
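4.1.4 Separation of Variables

The formal power series $u_{j,k}(t, x, y)$ are actually summed up to exponential polynomials of the form

$$\sum_{j' + k' \le j + k} \exp\big(-j'(\lambda k' + \mu)\,t\big)\, p_{j',k'}(x, y), \tag{4.10}$$

where $p_{j',k'}$ are polynomials of $x$ and $y$ of degree at most $j' + k'$.

Example 4.2. An immediate computation shows that the coefficient of $u_{0,1,1}$ sums up to

$$\begin{aligned} u_{1,1}(t, x, y) ={}& 1 + \frac{\mu}{\lambda + \mu}\,(y - 1) + 2(x - 1)\,e^{-\mu t} + \frac{\lambda}{\lambda - \mu}\,(x - 1)^2 e^{-2\mu t} \\ &+ \Big(xy - \frac{\lambda}{\lambda - \mu}\,x^2 + \frac{2\mu}{\lambda - \mu}\,x - \frac{\mu}{\lambda + \mu}\,y - \frac{2\mu^2}{\lambda^2 - \mu^2}\Big)\, e^{-(\lambda + \mu)t}, \end{aligned}$$

which depends meromorphically on $\lambda$ and $\mu$.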
The Cauchy problem (4.3) admits separation of variables. To prove this, we look for a solution to the epidemic equation of the form $u(t, x, y) = e(t)\,f(x, y)$. Substituting $u$ into the equation implies

$$\frac{e'}{e} = \frac{Af}{f} = \kappa$$

for some constant $\kappa \in \mathbb{R}$, where $Af := \lambda\,(x^2 - xy)\,f''_{x,y} + \mu\,(1 - x)\,f'_x$. Hence it follows that $e(t) = \exp(\kappa t)$ up to a constant factor and $f$ is an eigenfunction of $A$ corresponding to the eigenvalue $\kappa$.

Let $P_n$ be the space of all polynomials of $x$ and $y$ whose degree is at most $n$. Obviously, the operator $A$ maps $P_n$ to $P_n$ for all $n = 0, 1, \dots$. Since $P_n$ is of finite dimension, the mapping $A : P_n \to P_n$ has a finite number of eigenvalues of finite multiplicity depending on $n$.

Theorem 4.3. The eigenvalues of $A : P_n \to P_n$ are given by $\kappa_{j,k} = -j(\lambda k + \mu)$ with nonnegative integers $j$ and $k$ satisfying $j + k \le n$. They are simple unless $j = 0$, in which case each polynomial of degree at most $n$ depending only on $y$ is an eigenfunction of the operator $A$.

Using Theorem 4.3 we readily establish decomposition (4.10) with $p_{j',k'}$ being an eigenfunction of the operator $A : P_{j+k} \to P_{j+k}$ corresponding to the eigenvalue $\kappa_{j',k'}$.
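The eigenvalue statement can be checked on low-degree examples by applying the operator to polynomials with exact rational coefficients. The sketch below is only an illustration under stated assumptions: the helper names `apply_A` and `scale` are hypothetical, `lam` and `mu` stand for the two rate constants of the operator, and the sample eigenfunction for $j = k = 1$ was worked out by hand for the particular values `lam = 2`, `mu = 3`.

```python
from fractions import Fraction


def apply_A(poly, lam, mu):
    """Apply A f = lam*(x^2 - x*y)*f_xy + mu*(1 - x)*f_x to a polynomial.

    `poly` maps (j, k) to the coefficient of the monomial x^j y^k.
    """
    out = {}

    def add(key, value):
        out[key] = out.get(key, 0) + value

    for (j, k), c in poly.items():
        if j >= 1 and k >= 1:
            add((j + 1, k - 1), lam * j * k * c)   # x^2 * d^2/dxdy
            add((j, k), -lam * j * k * c)          # -x*y * d^2/dxdy
        if j >= 1:
            add((j - 1, k), mu * j * c)            # d/dx
            add((j, k), -mu * j * c)               # -x * d/dx
    return {key: value for key, value in out.items() if value != 0}


def scale(poly, factor):
    """Multiply a polynomial by a scalar, dropping zero coefficients."""
    return {key: factor * c for key, c in poly.items() if factor * c != 0}


if __name__ == "__main__":
    lam, mu = Fraction(2), Fraction(3)
    # Hand-computed candidate eigenfunction for j = k = 1 (valid for lam=2, mu=3):
    f11 = {(1, 1): Fraction(1), (2, 0): Fraction(2), (1, 0): Fraction(-6),
           (0, 1): Fraction(-3, 5), (0, 0): Fraction(18, 5)}
    print(apply_A(f11, lam, mu) == scale(f11, -(lam + mu)))
```

In the same way one can confirm that $(x - 1)^2$ is mapped to a multiple of itself and that any polynomial in $y$ alone is annihilated, in line with the case $j = 0$ of the theorem.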
4.1.5 Solvability of General Equations

In this section we apply methods of semigroup theory to establish the existence of a solution $u(t) := u(t, z)$ to (4.3) in the class of functions on the semiaxis $t \ge 0$ taking their values in analytic functions of $z \in D^2$. More precisely, we consider a general Cauchy problem

$$\begin{cases} \dot{u} = Au & \text{for } t > 0, \\ u(0) = u_0 & \text{for } t = 0, \end{cases} \tag{4.11}$$

where $A$ is a partial differential operator with polynomial coefficients on $\mathbb{R}^n$ independent of $t$, and $u_0(z)$ is a given analytic function in the polydisk $D^n$. By a solution of (4.11) is meant any continuous function of $t \ge 0$ with values in analytic functions on $D^n$, which satisfies

$$u(t) = u_0 + A \int_0^t u(s)\, ds$$

for all $t \ge 0$. Since $A$ is a stationary operator, we get a formal solution to (4.11) by using the exponential of $A$, i.e.,

$$u(t) = u_0 + \sum_{k=1}^{\infty} \frac{t^k}{k!}\, A^k u_0$$

for $t \ge 0$. Whether or not this series converges in some polydisk $D^n_R$ of radius $R > 0$ and center $0$ in $\mathbb{C}^n$ for all $t$ with $|t| < T$ depends on the asymptotic behavior of $|A^k u_0|$ for large $k$.

In order to get asymptotic results, we put some restrictions on the coefficients of $A$. Write

$$A = \sum_{|\alpha| \le m} A_\alpha(z)\, \partial^\alpha,$$
$A_\alpha(z)$ being polynomials of $z \in \mathbb{C}^n$. We require the degree of $A_\alpha$ to be at most $|\alpha|$.

Lemma 4.4. For each $R > 0$ there is a constant $C_R > 0$, such that for any partition $0 < r_k < \dots < r_1 < r_0 \le 1$ one has sup jzj1 0:
x;y!1
Note that this expression could be infinite. Table 6.1 shows examples of virus spread terms allowed by the above requirements. The meaning of "fast" and "slow" is explained later in this section. Note that the most frequently used infection term, $yG(x, y) = xy$, does not satisfy Assumption 5 above. This term corresponds to complete mass action, and can be viewed as $\lim_{\epsilon \to 0} \frac{(1 + \epsilon)\,xy}{x + y + \epsilon}$, see the first term in Table 6.1.

Unless cells divide exponentially ($F = 1$), there is at least one spatial scale defined by the function $F$ which is related to the colony size at which the growth slows down and deviates from exponential. Let us denote the corresponding quantity $s_t$, where the subscript $t$ stands for "tumor". The quantity $s_t$ can be obtained from each particular function $F$. For example, in the case of linear growth, $s_t \sim \eta$. The units of the quantity $s_t$ are the same as the units of $x$, which can be volume, mass or the number of cells. The (linear) spatial scale is thus related to $s_t^{1/3}$. Note that in the general case, the
6.1 Axiomatic Modeling in Life Sciences Table 6.1. Some examples of virus spread terms.
G.x; y/
Hx .y/
Hy .x/
Fast or slow?
x=.x C y C /
y y
x x
Fast Fast
const. p y
const. p x p x
Slow Slow
px p p xCyC1 . xC yC 2 / x .xC1 /.yC2 / px p p p . xyC1 /. xC yC 2 / x p x.yCc/CxC
p
y
Slow
function $F$ could have many parameters corresponding to different scales on which the growth law changes, but in many intuitive cases we envisage a growth which starts off as exponential and then deviates from it. Therefore, we can think of the quantity $s_t$ as the colony size at which cancer growth first starts to slow down.

In a similar way we can define the value $s_v$, where $v$ stands for "virus". This is defined as a characteristic size at which the infection spread becomes slower than exponential. To clarify this in the context of our system, let us consider the equation $\dot{y} = \beta xy - ay$ and assume that the pool of susceptible cells is large and constant. We can see that in this case the number of infected cells grows exponentially as long as $\beta x > a$. This may be a good approximation if the system size is small, but for larger values of $x$ and $y$ this cannot hold anymore. The scale at which the growth of infected cells deviates from exponential is $s_v$.

In what follows we will present a rigorous analysis of system behavior for different types of functions $G$ and $F$. An intuitive understanding of these results can often be achieved by thinking about the two characteristic scales, $s_t$ and $s_v$, and how they trade off and influence the dynamics of disease spread and treatment.

6.1.3.2 Equilibrium Solutions and Two Classes of Virus Spread

The fixed points of system (6.2)–(6.3) are given by $(0, 0)$ and all the solutions of the equations

$$x F(x + y) = a y, \tag{6.4}$$

$$G(x, y) = \frac{a}{\beta}. \tag{6.5}$$

The trivial point $(0, 0)$ has eigenvalues $F(0)$ and $-a$ and is thus a saddle. The number of solutions of equations (6.4)–(6.5) depends on the particular shapes of the functions $F$ and $G$. In order to find the nontrivial equilibria, we solve equation (6.4) to find $y(x)$, and then substitute it into equation (6.5). The equilibria are thus defined by the roots of the equation

$$G(x, y(x)) = a/\beta. \tag{6.6}$$
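The reduction to a single scalar equation makes the equilibria easy to locate numerically. The sketch below is an illustration under explicit assumptions that are not fixed by the text at this point: exponential growth $F = 1$, so that $y(x) = x/a$, and the particular virus spread term $G(x, y) = x/(x + y + \epsilon)$; the function names `g_exp` and `equilibrium_x` are hypothetical. Under these assumptions the composed function is increasing in $x$, so plain bisection suffices.

```python
def g_exp(x, a, eps):
    """G(x, y(x)) for G = x/(x + y + eps) and F = 1, i.e. y(x) = x/a (assumed forms)."""
    return x / (x + x / a + eps)


def equilibrium_x(beta, a, eps, x_hi=1e12, tol=1e-12):
    """Solve g_exp(x) = a/beta on (0, x_hi) by bisection.

    Returns None when there is no crossing below x_hi.
    """
    target = a / beta

    def f(x):
        return g_exp(x, a, eps) - target

    lo, hi = 0.0, x_hi
    if f(hi) <= 0.0:
        # g_exp is increasing, so it never reaches a/beta on (0, x_hi).
        return None
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(equilibrium_x(beta=4.0, a=1.0, eps=1.0))
    print(equilibrium_x(beta=1.5, a=1.0, eps=1.0))
```

For this particular choice the root can also be written in closed form as $x = a\epsilon/(\beta - a - 1)$, which exists only when $\beta > a + 1$; the numerical root can be compared against it as a sanity check.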
6 Axiomatic Modeling in Life Sciences with Case Studies
From equation (6.4) we can see that $y(0) = 0$. We know from Assumption 2 on the function $G$ that $G(0, 0) = 0$. The next step is to study the limiting behavior of $G(x, y(x))$ for large values of $x$. For that we need to know the behavior of $y(x)$ for large $x$. We have from equation (6.4):

$$\lim_{x \to \infty} y(x) = \lim_{x \to \infty} x F(x)/a.$$

There are three cases.

(i) For a linear type growth, we have $\lim_{x \to \infty} x F(x) = c_0$, a nonzero constant. In this case, $\lim_{x \to \infty} y(x) = c_0/a$, with $0 < c_0 < \infty$.

(ii) For any growth $F$ which is superlinear but slower than exponential, we have $\lim_{x \to \infty} y(x) = \infty$, but $\lim_{x \to \infty} y/x = \lim_{x \to \infty} F(x + y)/a = 0$, that is, $y$ increases slower than $x$.

(iii) Finally, for exponential growth, $F = 1$ and $y(x) = x/a$, such that $y(x) \propto x$ for large values of $x$.

From the biological assumptions on the function $G(x, y)$ listed above, it follows that for any of the possible dependencies $y(x)$, the function $G(x, y(x))$ approaches a finite limiting value as $x \to \infty$, and this value can be zero or nonzero. To prove this statement we note that from Requirement 6.1.3.1, $\lim_{x \to \infty} G(x, y) < \infty$ for constant values of $y$. For nonconstant values of $y$ we only need to show that the limit is finite in the case where $y \to \infty$. But from Requirement 6.1.3.1 we deduce that if $\lim_{x \to \infty} y(x) = \infty$, then $\lim_{x \to \infty} G(x, y(x)) \le \lim_{x \to \infty} G(x, \mathrm{const}) < \infty$. This completes the proof.

We will use the exponential case $F = 1$ to separate all functions $G$ into two classes. Let us define

$$y_{\exp}(x) = x/a,$$

which is the solution $y(x)$ of equation (6.4) under the assumption that $F = 1$. Further we introduce the notation

$$G_{\exp}(x) = G(x, y_{\exp}(x)).$$

If $\lim_{x \to \infty} G_{\exp}(x) = 0$, then we will regard the virus spread to be slow, and if $\lim_{x \to \infty} G_{\exp}(x) = G^{\infty}_{\exp} > 0$, with $G^{\infty}_{\exp} < \infty$, we will regard this as fast spread. Examples of fast and slow virus spread terms are given in Table 6.1.¹ For now we will concentrate on mathematical properties of these two classes of functions.

Note that for all laws of cancer growth slower than exponential, we have $G(x, y(x)) \ge G_{\exp}(x)$. This is because $y(x) \le y_{\exp}(x)$, and $G$ is a decreasing function of $y$. In the next sections we explore the mathematical consequences of the virus term being fast or slow, and show how changes in the cancer growth term affect the dynamics.

Fast Virus Spread.
In this case, the function $G_{\exp}(x)$ is either a monotonically increasing function, or it can attain one or more local extrema before converging to its nonzero horizontal asymptote, $G^{\infty}_{\exp}$, see Figure 6.4 (a). We will refer to Figure 6.4 as a graphical way of solving equation (6.6) by plotting the left-hand side and the (constant) right-hand side as functions of $x$ for different values of the parameter $\beta$. The number of intersections of $G_{\exp}(x)$ with the constant function $a/\beta$ equals the number of roots in equation (6.6).

¹ Note that the mass-action virus spread term, which corresponds to $G(x, y) = x$, can be classified as "super-fast", because in this case $G_{\exp}(x)$ diverges as $x \to \infty$.

[Figure 6.4: two panels, (a) fast spread and (b) slow spread; vertical axis $G(x, x/a)$, horizontal axis $x$; marked levels $c_{\min}$, $c_{\max}$ and $G^{\infty}_{\exp}$; arrows indicate increasing $\beta$.]

Figure 6.4. The shape of the function $G_{\exp}(x)$ and the number of equilibria as a function of $\beta$. (a) Fast virus spread. (b) Slow virus spread.

For a monotonically increasing $G_{\exp}$, low values of $\beta$ correspond to zero roots in equation (6.5), which means that the cancer growth will continue indefinitely. As $\beta$ crosses a critical value defined by $a/G^{\infty}_{\exp}$, there is one root. The value of $x$ at this root drops as $\beta$ increases (this is due to the convergence of $G_{\exp}$ to an asymptote, $G^{\infty}_{\exp}$). For large values of $\beta$, the value of $x$ at the intersection tends to zero.

If the function $G_{\exp}$ has an absolute maximum at point $x_{\exp}$ (see Figure 6.4 (a)), then an initial increase of $\beta$ above $a/c_{\max}$ with $c_{\max} = G_{\exp}(x_{\exp})$ results in the appearance of two roots. Additional local extrema will result in appearance and disappearance of pairs of roots. However, as $\beta$ increases through the second threshold given by $a/c_2$, only one (the lowest) root remains. The value of $c_2$ is given by the lower of the values $\{G^{\infty}_{\exp}, c_{\min}\}$, where $G^{\infty}_{\exp}$ is the value of the horizontal asymptote, and $c_{\min}$ is the value at the lowest local minimum, if it exists. In all cases, for sufficiently large values of $\beta$, there will be only one root in equation (6.5).

Introducing other cancer growth laws can increase the limiting value of $G$, thus decreasing the value of $\beta_c$. In the case of a monotonically increasing $G_{\exp}$, there will be no qualitative change. If $G_{\exp}$ is one- or multiple-humped, the hump(s) may disappear. Whether this qualitative change happens depends on the relative size of the two spatial scales involved. The first scale is defined by the location of the maxima of $G_{\exp}$ and is related to the virus spread scale, $s_v$. The second scale is given by the size, $s_t$, at which the cancer growth law starts to deviate from exponential.
Once $s_t \sim s_v$ (or it is smaller), the limiting value of $G$ becomes sufficiently large such that the "hump" disappears.

Figure 6.5 illustrates the case where the function $G_{\exp}$ is monotonically increasing. We use a particular law of virus spread coupled with three different laws of cancer growth: exponential, surface growth and linear growth, see the three solid lines in the figure. In all cases, the function is monotonically growing with a horizontal asymptote. The slower the cancer growth, the higher the asymptote and the lower the threshold of $\beta$ which corresponds to the possibility of treatment success.

[Figure 6.5: curves labeled "linear", "surface", "exponential (limited)" and "exponential"; vertical axis $G(x, y(x))$ from 0 to 1, horizontal axis $x$ on a logarithmic scale from 0.01 to $10^4$, with the size $W$ marked.]

Figure 6.5. Fast virus spread. The function $G(x, y(x))$ (equations (6.4)–(6.5)) for the particular choice of the virus spread law ($G(x, y) = x/(x + y + 1)$) and three different laws of cancer growth: exponential, surface growth and linear growth. The solid lines correspond to the unlimited cancer growth; the dotted lines to a growth up to a given size, $W$. The parameters are: $a = 1$, $\eta = 10$ and $W = 10^4$.

It is useful to investigate the value of $x$ at the equilibrium as a function of $\beta$, for different values of $s_t$. Suppose that the graph of $G(x, x/a)$ is a monotonically growing function of $x$ which approaches a limiting value, $G^{\infty}_{\exp}$. Suppose that the cancer growth slows down around the scales near $s_t$. So near $x \sim s_t$, the function $G(x, y(x))$ deviates from the horizontal line $G^{\infty}_{\exp}$, and starts growing toward a different, and higher horizontal asymptote, which we will call $G_1 > G^{\infty}_{\exp}$, see Figure 6.6 for a particular example. The phase diagram as $\beta$ increases can be seen as follows: for $\beta < a/G_1$, there are no roots. As $\beta$ crosses the first threshold, $a/G_1$, one root appears. The value of $x$ at this equilibrium decays rapidly from infinity to values around $s_t$, as $\beta$ grows (because of the fact that $G_1$ is a horizontal asymptote). Then as $\beta$ grows through its second threshold, $a/G^{\infty}_{\exp}$, the value of $x$ at equilibrium drops from $s_t$ to values of order 1.

The second transition is sharp if the following is satisfied: $s_t \gg x_1$, where $x_1$ is the value of $x$ such that $|G(x_1, y(x_1)) - G^{\infty}_{\exp}| = |G_1 - G^{\infty}_{\exp}|$. In other words, $x_1$ is the value of $x$ where the function $G(x, y(x))$ comes near the horizontal line defined by $G^{\infty}_{\exp}$ ("near" means that it is at least as close to $G^{\infty}_{\exp}$ as $G^{\infty}_{\exp}$ is to $G_1$). If $s_t \gg x_1$, then the function $G$ has a significant interval in $x$ where it approaches its horizontal asymptote, $G^{\infty}_{\exp}$, before it deviates from it to start growing toward $G_1$. This guarantees a threshold effect.
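The two plateaus described above can be reproduced numerically for one concrete model. The sketch below is an illustration only, assuming the pair $F(s) = \eta/(\eta + s)$ and $G(x, y) = x/(x + y + \epsilon)$ used for Figure 6.6; the function names are hypothetical. For this $F$ the relation $xF(x + y) = ay$ is a quadratic in $y$, and its positive root is written in a form that avoids cancellation for large $x$.

```python
import math


def y_of_x(x, a, eta):
    """Positive root y of x*F(x + y) = a*y with F(s) = eta/(eta + s) (assumed form).

    The relation is equivalent to a*y^2 + a*(eta + x)*y - eta*x = 0.
    """
    b = a * (eta + x)
    return 2.0 * eta * x / (b + math.sqrt(b * b + 4.0 * a * eta * x))


def g_along_nullcline(x, a, eta, eps):
    """G(x, y(x)) for G(x, y) = x/(x + y + eps) (assumed form)."""
    y = y_of_x(x, a, eta)
    return x / (x + y + eps)


if __name__ == "__main__":
    a, eps, eta = 4.0, 1.0, 1e7
    for x in (1e-2, 1.0, 1e3, 1e7, 1e12):
        print(x, g_along_nullcline(x, a, eta, eps))
```

With $a = 4$, $\epsilon = 1$ and $\eta = 10^7$, the values for $\epsilon \ll x \ll \eta$ sit close to $a/(a + 1)$, the asymptote of the exponential case for this $G$, while for $x \gg \eta$ they approach $1$. For this particular model the two thresholds in $\beta$ are therefore $a$ and $a + 1$.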
Figure 6.6. The dependence of the equilibrium on $\beta$ and $s_t$. We use the model with $F = \eta/(\eta + x + y)$ (thus, $s_t$ is defined by $\eta$) and $G = x/(x + y + \epsilon)$. The function $G(x, y(x))$ is plotted against $x$ (logarithmic axis) for different values of $\eta$ (from $\eta = 10^0$ to $\eta = 10^7$ in the plot legend); the asymptotes $G_{\mathrm{exp}}^{\infty}$ and $G_1$ are marked on the vertical axis. The dotted vertical lines indicate the scales of interest: the leftmost such line corresponds to $x \sim s_v$, and the rest of the lines to $x \sim s_t$ for different values of $\eta$. The other parameters are: $a = 4$, $\epsilon = 1$.
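The phase diagram in $\beta$ described for Figure 6.6 can be reproduced with a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes the growth and spread laws of the caption, $F = \eta/(\eta + x + y)$ and $G = x/(x + y + \epsilon)$; the parameter names `eta` and `eps`, the function names and the bisection bounds are illustrative assumptions. Equation (6.4) is solved for $y(x)$ in closed form, and the root of $\beta G(x, y(x)) = a$ is found by bisection, which is justified here because $G(x, y(x))$ is monotonically growing. For these particular laws the two asymptotes are $a/(a+1)$ and $1$, so the two thresholds are $\beta = a$ and $\beta = a + 1$.

```python
import math


def y_on_curve(x, a, eta):
    """Positive solution y of a*y = x*F(x + y) with F(s) = eta/(eta + s)."""
    b = a * (eta + x)
    # a*y^2 + b*y - eta*x = 0, written in a form that avoids cancellation.
    return 2.0 * eta * x / (b + math.sqrt(b * b + 4.0 * a * eta * x))


def G_on_curve(x, a, eta, eps):
    """G(x, y(x)) for the fast-spread law G = x/(x + y + eps)."""
    return x / (x + y_on_curve(x, a, eta) + eps)


def equilibrium_x(beta, a=4.0, eta=1e4, eps=1.0, lo=1e-9, hi=1e15):
    """Root of beta*G(x, y(x)) = a in [lo, hi], or None if there is none."""
    target = a / beta

    def f(x):
        return G_on_curve(x, a, eta, eps) - target

    if f(lo) > 0.0 or f(hi) < 0.0:
        return None
    u, v = math.log10(lo), math.log10(hi)  # bisect on log10(x)
    for _ in range(200):
        m = 0.5 * (u + v)
        if f(10.0 ** m) < 0.0:
            u = m
        else:
            v = m
    return 10.0 ** (0.5 * (u + v))


if __name__ == "__main__":
    for beta in (3.9, 4.1, 4.5, 4.9, 5.5, 6.0):
        print(beta, equilibrium_x(beta))
```

With the default $a = 4$, no root is returned for $\beta$ below 4, the equilibrium is of the order of $\eta$ for $\beta$ between 4 and 5, and it is of order 1 once $\beta$ exceeds 5.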
We conclude that for all cancer growth laws and for all functions $G$ corresponding to fast virus spread, increasing $\beta$ beyond a threshold leads to the existence of only one equilibrium, whose value correlates negatively with the infectivity, $\beta$. For large enough $s_t$, there is a “threshold” effect, such that the size at equilibrium decreases very sharply as $\beta$ approaches a defined value. In biological terms, this class of models is always characterized by a viral replication rate threshold beyond which oncolytic virus therapy results in the elimination of the cancer.

Slow Virus Spread. In this case, the function $G_{\mathrm{exp}}(x)$ is a one- or a multiple-humped function, which for large $x$ decreases to zero (Figure 6.4 (b)). In the case of exponential growth, the bifurcation diagram looks as follows. As before, small values of $\beta$ correspond to no equilibria (zero roots in equation (6.5)). As we increase $\beta$, a pair of roots appears after the threshold given by $a/c_{\mathrm{max}}$, where $c_{\mathrm{max}} = G(x_{\mathrm{exp}}, y(x_{\mathrm{exp}}))$ is the maximum value of $G_{\mathrm{exp}}$. As $\beta$ increases, other roots may appear and disappear in pairs. Since the function $G_{\mathrm{exp}}$ has zero as its horizontal asymptote, there will be two equilibria for all values of $\beta$ larger than $a/c_{\mathrm{min}}$, where $c_{\mathrm{min}}$ is the value of $G$ at its lowest local minimum in the case that such a minimum exists; $c_{\mathrm{min}} = c_{\mathrm{exp}}$ otherwise. Two roots for large values of $\beta$ is a universal feature of the systems with a slow virus spread term.

Let us next consider how non-exponential laws of cancer growth modify this picture. In the case of linear growth, $y(x) \equiv y_{\mathrm{lin}}(x)$ converges to a nonzero constant, $c_1$, and we have $\lim_{x\to\infty} G_{\mathrm{lin}}(x) = \lim_{x\to\infty} G(x, c_1) = G_x(c_1) = c_2 < \infty$, which is a nonzero constant. Depending on the value of $s_t$, $G_{\mathrm{lin}}$ can be a one- or a
multiple-humped function, or (for $s_t$ similar to or smaller than $x_{\mathrm{exp}}$) it will become a monotonically increasing function of $x$. In either of these cases, there exists a finite value of $\beta$, given by $a/c_2$, such that for all values of $\beta$ larger than this value there is only one root in equation (6.5).

The following approximate estimate holds. Let us suppose that the function $G_{\mathrm{exp}}(x)$ has one local maximum. The position of the maximum is defined by the only spatial scale present in this case, which is $s_v$, that is, the scale on which the virus spread slows down. Therefore, roughly for $s_t \lesssim s_v$, treatment becomes possible. In other words, the cancer must slow down on spatial scales comparable to or lower than the scale of virus spread in order to yield successful treatment.

By changing the function $F$, we make the cancer growth slower than exponential. In some cases (e.g., the case of linear growth described above), this will lead to the horizontal asymptote of $G(x, y(x))$ becoming nonzero. In general, whether this happens depends on the functional forms of both $G$ and $F$. For growths faster than linear but slower than exponential, we have $y \to \infty$ as $x$ grows, but $y = o(x)$, i.e., it grows slower than $x$. In some cases the function $G$ will retain a zero asymptote (e.g., in the case where $G = x/(x+1)/(y+1)$ and a surface growth law for $F$). In other cases it will acquire a nonzero limit (e.g., with $G = x/(x + 1 + \sqrt{x}\,(y+1))$ and a surface growth law for $F$).

Two particular cases are illustrated in Figure 6.7 (a,b), solid lines. We can see that in (a), where we took $G = x/(x+1)/(y+1)$, both the exponential and surface cancer growth laws lead to a one-humped function $G$ with a zero asymptote, which means that no matter how high $\beta$ is, there are two roots in the system; this corresponds to the existence of a saddle point and a possibility for the system to escape to infinity. A linear cancer growth leads to a one-humped function with a nonzero asymptote for
Figure 6.7. Slow virus spread. The function $G(x, y(x))$ (equations (6.4)–(6.5)), plotted against $x$ on a logarithmic axis, for two particular choices of the virus spread law: (a) $G(x, y) = x/(x+1)/(y+1)$ and (b) $G(x, y) = x/(x + 1 + \sqrt{x}\,(y+1))$. Different laws of cancer growth are implemented: exponential, surface growth and linear growth (in (a), with two values of $\eta$, $\eta = 5$ and $\eta = 1$). The solid lines correspond to unlimited cancer growth; the dotted lines to growth up to a given size, $W$. The other parameters are: $a = 1$, $W = 10^4$ in (a) and $W = 10^5$ in (b).
larger values of $s_t$, and to a monotonically increasing function for smaller $s_t$, such that for $\beta$ high enough, only one root exists, which corresponds to cancer control.

Figure 6.7 (b) presents a different virus spread term, $G = x/(x + 1 + \sqrt{x}\,(y+1))$. We can see that for the surface growth, the particular function $G$ presented in Figure 6.7 (b) acquires a nonzero limit. For this system, the growth of virus is slow ($G_{\mathrm{exp}}$ tends to zero), but if surface cancer growth is implemented, this results in a nonzero asymptote. In this case we can say that the surface cancer growth is sufficiently slow to warrant successful treatment given the particular mode of viral spread.

Bounded Tumor Growth. In all the considerations above we performed our analysis under the assumption of unbounded cancer growth. Next, we consider a growth term which becomes zero in a finite time. We assume that the growth starts off exponential ($F(0) = 1$) and at some size, $s_t$, it slows down (we do not exclude the possibility that $s_t \sim 1$, that is, the growth becomes slower than exponential right away). Then there exists another characteristic size, $W \gg s_t$, such that the growth slows down further and stops. In particular, we define $W$ such that $F(W) = 0$. Note that if $s_t \sim W$ then there is no need to introduce the two scales, $s_t$ and $W$. Therefore, the assumption $s_t \ll W$ must hold. Now, we can see that the analysis above holds on the scales intermediate between $s_t$ and $W$, such that $s_t \ll x \ll W$. In Figures 6.5 and 6.7, the function $G$ in the case of growth limited by a size $W$ is plotted with dotted lines. For values $x \ll W$, the shape of the curve $G(x, y(x))$ is similar to that obtained for the corresponding unlimited growth. As $x$ grows far beyond $s_t$ and approaches $W$, the function $G$ approaches $G(W, 0)$. If, for the unbounded growth, the limiting value of the $G$ function is $c_2$, we have in general $G(W, 0) \geq c_2$. In other words, the curve $G$ takes an upward turn in the vicinity of $x = W$.
This means that equation (6.5) acquires an additional root corresponding to the cancer growing to its carrying capacity, $W$. In the systems with unrestricted growth, this outcome corresponded to unlimited growth of the cell population. It is useful to note the following: in systems with a limited size, the function $G(x, y(x))$ is always bounded away from zero. Therefore, strictly speaking, we can always find a threshold value $\beta_t$ such that for $\beta > \beta_t$, only one root is present. However, if $W \gg s_v$, such values of $\beta$ are very large compared to $\beta_c$, and in most cases are probably not achievable.

6.1.3.3 Stability Properties of the Equilibria

Let us suppose that $(x_0, y_0)$ with $x_0 \geq 0$ and $y_0 \geq 0$ is a solution of system (6.4)–(6.5), and consider its stability. The Jacobian of the system can be written as a $2 \times 2$ matrix, $\{m_{ij}\}$, with

$$m_{11} = F + x_0 F' - \beta y_0 G_x, \qquad m_{12} = x_0 F' - \beta (G + y_0 G_y),$$
$$m_{21} = \beta y_0 G_x, \qquad m_{22} = \beta y_0 G_y,$$

where the functions $F$ and $G$ and their derivatives are evaluated at the point $(x_0, y_0)$: $G_x = \partial G/\partial x|_{x=x_0,\,y=y_0}$, and similarly for $G_y$ and $F'$. The equilibrium is stable if the following two conditions hold:

$$m_{11} + m_{22} < 0, \tag{6.7}$$
$$m_{11} m_{22} - m_{21} m_{12} > 0. \tag{6.8}$$
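Conditions (6.7)–(6.8) are easy to evaluate once $F$ and $G$ are fixed. The sketch below is only an illustration under stated assumptions: it takes exponential growth, $F \equiv 1$, and the slow-spread law $G = x/((x+1)(y+1))$ that appears in Figure 6.7 (a), with the hypothetical parameter values $a = 4$ and $\beta = 10$. For this choice equation (6.4) gives $y = x/a$, and $\beta G = a$ reduces to $x^2 + (a + 1 - \beta)x + a = 0$, so the equilibria are $(1, 0.25)$ and $(4, 1)$. All function names are made up for the example.

```python
# Illustrative model (an assumption, not prescribed by the text):
# F = 1 (exponential growth), G = x / ((x + 1)(y + 1)) (slow virus spread).
A, BETA = 4.0, 10.0


def F(s):
    return 1.0


def dF(s):
    return 0.0


def G(x, y):
    return x / ((x + 1.0) * (y + 1.0))


def Gx(x, y):
    return 1.0 / ((x + 1.0) ** 2 * (y + 1.0))


def Gy(x, y):
    return -x / ((x + 1.0) * (y + 1.0) ** 2)


def jacobian(x0, y0, beta):
    """Entries m11, m12, m21, m22 as written in the text."""
    s = x0 + y0
    m11 = F(s) + x0 * dF(s) - beta * y0 * Gx(x0, y0)
    m12 = x0 * dF(s) - beta * (G(x0, y0) + y0 * Gy(x0, y0))
    m21 = beta * y0 * Gx(x0, y0)
    m22 = beta * y0 * Gy(x0, y0)
    return m11, m12, m21, m22


def stability(x0, y0, beta):
    """Return (trace, determinant, True if (6.7) and (6.8) both hold)."""
    m11, m12, m21, m22 = jacobian(x0, y0, beta)
    trace = m11 + m22
    det = m11 * m22 - m21 * m12
    return trace, det, (trace < 0.0 and det > 0.0)


# Roots of x^2 + (A + 1 - BETA) x + A = 0 with y = x / A.
EQUILIBRIA = [(1.0, 0.25), (4.0, 1.0)]

if __name__ == "__main__":
    for x0, y0 in EQUILIBRIA:
        print((x0, y0), stability(x0, y0, BETA))
```

In this example the first equilibrium has trace $-0.3$ and determinant $1.2$, so both conditions hold, while the second has determinant $-1.2$ and therefore violates condition (6.8).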
Saddle Points. Condition (6.8) is equivalent to the positivity of the derivative of $G$ in the direction defined by the implicit relation $ya = xF(x+y)$, equation (6.4). The latter expression is one of the two equations that define the equilibria. Differentiating it, we get $a\,dy = F\,dx + xF'(dx + dy)$. The directional derivative is equal to

$$G_x + G_y \frac{dy}{dx} = \frac{G_x (a - F' x_0) + G_y (F + x_0 F')}{a - F' x_0}.$$

The denominator is positive, so this expression has the same sign as the left-hand side of condition (6.8). The equilibria are defined by the roots of equation (6.6). From equation (6.4) we can see that $y(0) = 0$. We know from Assumption 2 on the function $G$ that $G(0, 0) = 0$. Therefore, all the odd roots of equation (6.6) will correspond to a positive, and the even ones to a negative, slope of the left-hand side of equation (6.6). This means that all even equilibria are saddles. To prove this we note that in such cases, the directional derivative is negative, condition (6.8) is violated, and therefore there are two real eigenvalues of opposite signs. On the other hand, an odd root can be either a sink, a source or a spiral (stable or unstable). This is because for such a root, condition (6.8) is always satisfied, so that we could either have complex eigenvalues, or real eigenvalues of the same sign (positive or negative).

In the presence of a saddle, an infinite outcome (corresponding to an unchecked cancer growth) is possible. For large values of $x$, we have

$$\dot{x} = x F^{\infty} - \beta y G^{\infty}(y), \tag{6.9}$$
$$\dot{y} = y\,(\beta G^{\infty}(y) - a), \tag{6.10}$$

where $\lim_{x\to\infty} G(x, y) = G^{\infty}(y)$ and $\lim_{x\to\infty} F(x, y) = F^{\infty}$. The growth of $y$ becomes negative as $y$ increases if $\lim_{y\to\infty} G^{\infty}(y) = 0$, which suggests that $y$ settles to a finite value which makes the right-hand side of equation (6.10) zero, such that the outcome $(\infty, \mathrm{const})$ is observed. If $\lim_{y\to\infty} G^{\infty}(y) = \mathrm{const} > 0$, then for large enough values of $\beta$ we can have an outcome of the form $(\infty, \infty)$.

Stability of the Internal Equilibrium. Let us show that for large values of $\beta$, there will be an equilibrium, $(x_0, y_0)$, such that $\lim_{\beta\to\infty} x_0 = 0$ and $\lim_{\beta\to\infty} y_0 = 0$. We call this equilibrium the “internal equilibrium”. Its existence follows from equation (6.6) and the properties of the function $G$. We know that $y(0) = 0$, and also that $G(0, 0) = 0$. It is also clear that there is an interval of $x$, $[0, \xi]$, where $G$ is a growing function. Therefore, by continuity, for all $\beta \geq a/G(\xi, y(\xi))$, there will be a solution of equation (6.6). From monotonicity of the function $G$, the value of $x$ at the intersection with $a/\beta$ decays with $\beta$. From equation (6.4) it follows that there is an interval of
$x$, $[0, \xi_1]$, where $y$ is a growing function of $x$. Therefore, we conclude that for large enough $\beta$, there is an equilibrium, $(x_0, y_0)$, whose values $x_0$ and $y_0$ decay with $\beta$ and approach 0 in the limit $\beta \to \infty$.

Let us evaluate the left-hand side of inequality (6.7) for small values of $x_0$ and $y_0$. First, we approximate the curve $y(x)$ by its Taylor series for small values of $x_0$:

$$y_0 = F\,\frac{x_0}{a} + (a+F)F'\left(\frac{x_0}{a}\right)^2 + (a+F)\left((F')^2 + \tfrac{1}{2}(a+F)F''\right)\left(\frac{x_0}{a}\right)^3 + O\!\left[\left(\frac{x_0}{a}\right)^4\right], \tag{6.11}$$

where the function $F$ and its derivatives are evaluated at 0. This expression follows from expanding both sides of equation (6.4) in Taylor series in terms of $x_0$ and $y_0$, solving for $y_0$ and using a Taylor expansion of this expression. Next, we express $\beta$ from equation (6.5): $\beta = a/G(x_0, y_0)$. Now, let us multiply the left-hand side of inequality (6.7) by $G(x_0, y_0)$, and use expression (6.11). Expanding in terms of small $x_0$, we obtain:

$$G(x_0, y_0)(m_{11} + m_{22}) = \left(F' G_x + G_{xy} - \tfrac{1}{2} G_{xx}\right) x_0^2 + \frac{1}{a}\left((a+1)F'' G_x + (a+2)F' G_{xy} + G_{xyy} + \tfrac{1}{2}\left((a-1)G_{xxy} - F' G_{xx}\right) - \tfrac{1}{3}\,a\,G_{xxx}\right) x_0^3 + O(x_0^4). \tag{6.12}$$

Here the functions $F$ and $G$ and their derivatives² are evaluated at zero. To derive the above expression we also used the fact that the function $G$ and its $y$-derivatives are equal to zero if $x = 0$, and $F(0) = 1$. Next, we evaluate the left-hand side of inequality (6.8) in the same manner:

$$G(x_0, y_0)(m_{11} m_{22} - m_{21} m_{12}) = a G_x x_0 + (a F' G_x + 2 G_{xy} + a G_{xx})\,x_0^2 + O(x_0^3).$$

We can see that the expression above is always positive, so condition (6.8) is satisfied for large enough values of $\beta$. Condition (6.7), however, is not necessarily satisfied, as follows from expression (6.12). The expansion can be positive or negative, depending on the particular properties of the functions $F$ and $G$. Later we will encounter examples where the internal equilibrium changes stability depending on the model parameters.

Next, we would like to investigate whether the eigenvalues are real or complex. For the eigenvalues to have an imaginary part, the following condition has to be satisfied:

$$(m_{11} - m_{22})^2 + 4 m_{12} m_{21} < 0. \tag{6.13}$$

² Here we assume that the functions $F$ and $G$ are differentiable at zero. Non-differentiable functions are handled similarly by using generalized expansions.
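Expansion (6.11) can be sanity-checked numerically for one concrete growth law. The sketch below is an illustration rather than part of the original analysis: it assumes $F(s) = \eta/(\eta + s)$, for which $F(0) = 1$, $F'(0) = -1/\eta$ and $F''(0) = 2/\eta^2$, and for which equation (6.4) is a quadratic in $y$ with an explicit positive root. The function names and the values $a = 4$, $\eta = 10$ are assumptions chosen for the example.

```python
import math


def y_exact(x, a, eta):
    """Positive root of a*y = x*eta/(eta + x + y), i.e. equation (6.4)."""
    b = a * (eta + x)
    return 2.0 * eta * x / (b + math.sqrt(b * b + 4.0 * a * eta * x))


def y_series(x, a, eta):
    """Right-hand side of (6.11) truncated after the cubic term."""
    F0 = 1.0              # F(0)
    F1 = -1.0 / eta       # F'(0)
    F2 = 2.0 / eta ** 2   # F''(0)
    u = x / a
    return (F0 * u
            + (a + F0) * F1 * u ** 2
            + (a + F0) * (F1 ** 2 + 0.5 * (a + F0) * F2) * u ** 3)


def truncation_error(x, a=4.0, eta=10.0):
    """Absolute difference between the exact curve and the cubic series."""
    return abs(y_exact(x, a, eta) - y_series(x, a, eta))


if __name__ == "__main__":
    for x in (0.2, 0.1, 0.05):
        print(x, truncation_error(x))
```

If the cubic truncation is correct, halving $x$ should reduce the error by roughly a factor of $2^4$, which is what the remainder term in (6.11) predicts.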
Performing a Taylor expansion of the left-hand side of condition (6.13) for small values of $x_0$ and $y_0$ at the internal equilibrium, we obtain:

$$G(x_0, y_0)^2\left((m_{11} - m_{22})^2 + 4 m_{12} m_{21}\right) = -4 a G_x^2 x_0^2 - 2 G_x\left(2 a F' G_x + 6 G_{xy} + 3 a G_{xx}\right) x_0^3 + O(x_0^4).$$

We can see that this quantity is always negative. Therefore, we conclude that the internal equilibrium has complex eigenvalues for sufficiently large values of $\beta$.

6.1.3.4 Discussion of the Results for the Virus Spread

We have performed a detailed mathematical analysis of a modeling approach that investigates the dynamics of oncolytic viruses in a general setting, going beyond specific models in which results can depend on unknown and arbitrarily chosen mathematical formulations. This is very important if the aim is to generate predictive models, because the dynamics of the cancer and virus populations, and thus the correlates of successful therapy, can be heavily influenced by those unknown and arbitrary mathematical terms. We found that all possible models can be divided into two categories with fundamentally different behaviors, which we characterized mathematically. With the analysis presented here, the framework can be used for model selection and validation purposes when applied to detailed experimental data that document the dynamics of the cancer cell populations over time during treatment with oncolytic viruses.

Besides providing a general tool to study oncolytic virus dynamics, our work has also given rise to new biological insights about the correlates of success in oncolytic virus treatment. Based on both previous experimental and theoretical work, it is believed that increasing the rate of virus replication will improve the chances of therapy success. In terms of theory, this notion is based on models that consider mass action where any virus can reach any cell for infection, regardless of spatial constraints. This is clearly an unrealistic assumption that can lead to unrealistic dynamics.
In fact, spatial effects play a large role in the way viruses spread. Here we investigated different types of virus spread (without including space explicitly in the model). We found that the model behavior can be divided into two categories. These categories phenomenologically correspond to different types of tumor environments. In the first category (which we called “fast” virus spread), infected cells can be considered interspersed with uninfected cells, such that most infected cells will have uninfected cells in their neighborhood to which the virus can be passed on. This might correspond to non-solid tumors. In this class of models, there is a clear viral replication rate threshold beyond which the outcome of the dynamics corresponds to cancer extinction in the model. Hence, in this setting, the notion remains true that increasing the viral replication rate will promote success. In the second model category (which we called “slow” virus spread), infected cells can be considered to be clustered together. This could correspond to solid tumors. In this case, only the infected cells at the surface of the cluster can pass on the virus to neighboring cells. Infected cells located in the center of the
cluster are unlikely to pass on their virus because they are surrounded only by other infected cells and not by uninfected cells. The larger the number of infected cells, the smaller the proportion of cells that can pass on the virus. This makes successful therapy more difficult to achieve, especially when tumor growth only saturates at larger tumor sizes. In this case, the outcome of the dynamics depends on the initial conditions. If the number of cancer cells lies above a threshold, the cancer cell population will outrun the virus population, and therapy will fail. This creates problems because there is only a narrow window between the size at which the cancer is detected (about $10^{10}$ cells) and the size at which the cancer is lethal (about $10^{13}$ cells). In this case, increasing the rate of viral replication even to unrealistically large values will not significantly promote treatment success. Successful treatment is only possible if tumor growth saturates at relatively low tumor sizes. In this case, a parameter region exists in which tumor control is the only outcome. If tumor growth saturates at even lower sizes, this effect disappears altogether and tumor control is the only outcome. This suggests that with solid tumors, the best chance is to combine oncolytic virus therapy with conventional treatment approaches which will limit tumor growth to a certain degree and allow the virus to gain the upper hand over the cancer. Previous data indicate that a combination of chemotherapy with virus therapy tends to be more effective than virus therapy alone.

In summary, studying constraints in the virus spread term, as well as the cancer growth term, has allowed us to gain new insights into the correlates of successful virus therapy. In particular, our results highlight potential difficulties in the treatment of solid tumors with virus therapy alone, even if the virus replicates at a relatively fast rate. An advantage of our approach is its consistency and generality.
A disadvantage is the fact that the less information we specify about the system, the less we can say about its behavior. For example, if we employ particular functional forms for functions F (the cancer growth term) and G (the virus spread term) and thus define the system of ODEs completely, then we can describe its behavior to any degree of precision, given the set of parameters and initial conditions. On the other hand, if only some (but not all) properties of the functions F and G are known, then the best we can hope to achieve is to describe the phase space in some general terms. A very exciting result of this particular work is that despite a high degree of generality of the system, we were still able to generate a set of predictions about the system’s behavior, both the dynamics and the long-term states. Our approach is necessarily limited by the choice of ODEs as our “toolbox”. By restricting ourselves to this framework we make it impossible to take into account explicitly many essential properties of biological systems such as random fluctuations and spatial constraints. As mentioned before, some of the effects of spatial interactions are mimicked by the choice of rate terms F and G; however this is only a crude approximation whose validity is a topic of a separate investigation.
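As one concrete instance of fixing $F$ and $G$ completely, the sketch below integrates the system numerically and shows the infinite outcome $(\infty, \mathrm{const})$ discussed around equations (6.9)–(6.10). It is a hedged illustration, not the authors' code. The equations $\dot{x} = xF(x+y) - \beta y G(x, y)$, $\dot{y} = \beta y G(x, y) - a y$ are the form consistent with the Jacobian entries given above; the choices $F \equiv 1$, $G = x/((x+1)(y+1))$, $a = 1$, $\beta = 5$, the initial condition and the fixed-step Runge–Kutta scheme are all assumptions made for the example. For this $G$ we have $G^{\infty}(y) = 1/(y+1)$, so equation (6.10) predicts that $y$ settles at $\beta/a - 1$ while $x$ keeps growing.

```python
def rhs(x, y, a, beta):
    """Right-hand side for F = 1 and G = x / ((x + 1)(y + 1))."""
    g = x / ((x + 1.0) * (y + 1.0))
    return x - beta * y * g, beta * y * g - a * y


def simulate(x, y, a=1.0, beta=5.0, t_end=10.0, dt=1e-3):
    """Classical fourth-order Runge-Kutta integration with a fixed step."""
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = rhs(x, y, a, beta)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], a, beta)
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], a, beta)
        k4 = rhs(x + dt * k3[0], y + dt * k3[1], a, beta)
        x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y


if __name__ == "__main__":
    # Start well beyond the saddle: the tumor outruns the virus.
    x_end, y_end = simulate(100.0, 1.0)
    print("x(t_end) =", x_end, " y(t_end) =", y_end)
```

Starting from a large tumor, the uninfected population grows without bound while the infected population approaches the constant $\beta/a - 1 = 4$, which is the behavior the large-$x$ equations suggest when $G^{\infty}(y) \to 0$.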
6.1.4 Conclusions

We have reviewed two case studies to illustrate the advantages and drawbacks of the axiomatic modeling approach. For other examples, we refer to [50,55]. The advantage of our method is the clarity of the approach: by cataloging the exact assumptions that we make on the system’s behavior, we make sure that all the results are a direct consequence of these assumptions, and not an artifact of arbitrary modeling choices made for convenience, simplicity, or any other considerations. The necessary limitation of this approach is that the analysis cannot always be carried out, unless we impose further constraints on the modeling choices. This limitation, however, reflects mostly our lack of knowledge of the underlying system, which is the reason for the model being too “general”. The more we know about the biological system, the more concrete details can be put in the model, and the more specific predictions can be made by means of the mathematical analysis.

It can be argued that the examples presented in this review are not “axiomatic”, but rather represent a different level of generality from, say, the model in equation (6.1). We would like to point out, however, that it is not generality per se that we are advocating. There is nothing wrong with a model being very concrete and containing specific functional forms. The point is that the model has to be justifiable biologically. If we know that a certain response or a control loop has a particular form (say, a response behaving like the Hill function), then this is the form that should be used in the model, and the more specific the model can be made, the better! However, if we only know that one variable positively correlates with another, then we should use a general increasing function and try to derive our conclusions from that.

Of course, the models presented here also make some implicit assumptions. How do we justify the usage of ODEs? How do we justify the absence of stochasticity?
These are important questions. However, using ODEs brings in a well-known and well-understood set of limitations which can be discussed. These are standard modeling approaches, and by using them we subscribe to a well-known set of handicaps. But even within this limited realm of modeling (or within any other limited modeling technique), one must strive to match the available information about the underlying biological system with the model design. One encouraging part is that despite very limited biological knowledge, in the systems presented here, we were still able to make some nontrivial statements based on the analysis of the corresponding models.
Bibliography [1] M. Aghi, R. L. Martuza, Oncolytic viral therapies – the clinical experience, Oncogene, 24 (2005), 7802–7816. [2] Z. Bajzer, T. Carr, K. Josic, S. J. Russell, D. Dingli, Modeling of cancer virotherapy with recombinant measles viruses, J. Theor. Biol., 252 (2008), 109–122.
6.1 Axiomatic Modeling in Life Sciences
139
[3] E. Barnes, G. Harcourt, D. Brown, M. Lucas, R. Phillips, G. Dusheiko, P. Klenerman, The dynamics of T-lymphocyte responses during combination therapy for chronic hepatitis C virus infection, Hepatology, 36 (2002), 743–754. [4] M. Begon, C. R. Townsend, J. L. Harper, Ecology: From Individuals to Ecosystems, Blackwell Publishing, Malden, MA, 2006. [5] F. C. Bekkering, C. Stalgis, J. G. McHutchison, J. T. Brouwer, A. S. Perelson, Estimation of early hepatitis C viral clearance in patients receiving daily interferon and ribavirin therapy using a mathematical model, Hepatology, 33 (2001), 419–423. [6] J. C. Bell, Oncolytic viruses: what’s next? Curr. Cancer Drug Targets, 7 (2007), 127– 131. [7] J. C. Bell, B. Lichty, D. Stojdl, Getting oncolytic virus therapies off the ground, Cancer Cell, 4 (2003), 7–11. [8] C. Boni, A. Penna, G. S. Ogg, A. Bertoletti, M. Pilli, C. Cavallo, A. Cavalli, S. Urbani, R. Boehme, R. Panebianco, F. Fiaccadori, C. Ferrari, Lamivudine treatment can overcome cytotoxic T-cell hyporesponsiveness in chronic hepatitis B: new perspectives for immune therapy, Hepatology, 33 (2001), 963–971. [9] A. Caplin, M. Dean, Axiomatic neuroeconomics, in: P.W. Glimcher, ed., Neuroeconomics: decision making and the brain, pp. 21–31, Elsevier, London, 2009. [10] T. W. Chun, L. Stuyver, S. B. Mizell, L. A. Ehler, J. A. Mican, M. Baseler, A. L. Lloyd, M. A. Nowak, A. S. Fauci, Presence of an inducible HIV-1 latent reservoir during highly active antiretroviral therapy, Proc. Natl. Acad. Sci. USA, 94 (1997), 13193– 13197. [11] A. M. Crompton, D. H. Kirn, From ONYX-015 to armed vaccinia viruses: the education and evolution of oncolytic virus development, Curr. Cancer Drug Targets, 7 (2007), 133–139. [12] J. J. Davis, B. Fang, Oncolytic virotherapy for cancer treatment: challenges and solutions, J. Gene. Med., 7 (2005), 1380–13895. [13] H. M. Diepolder, M. C. Jung, E. Keller, W. Schraut, J. T. Gerlach, N. Gruner, R. Zachoval, R. M. Hoffmann, C. A. Schirren, S. 
Scholz, G. R. Pape, A vigorous virusspecific CD4+ T cell response may contribute to the association of HLA-DR13 with viral clearance in hepatitis B. Clin. Exp. Immunol., 113 (1998), 244–251. [14] D. Dingli, M. D. Cascino, K. Josi´c, S. J. Russell, Z. Bajzer, Mathematical modeling of cancer radiovirotherapy, Math Biosci, 199 (2006), 5–78. [15] A. Friedman, J. P. Tian, G. Fulci, E. A. Chiocca, J. Wang, Glioma virotherapy: effects of innate immune suppression and increased viral replication capacity, Cancer Res., 66 (2006), 2314–2319. [16] S. D. Frost, J. Martinez-Picado, L. Ruiz, B. Clotet, A. J. Brown, Viral dynamics during structured treatment interruptions of chronic human immunodeficiency virus type 1 infection, J. Virol., 76 (2002), 968–979. [17] M. P. Hassell, The Dynamics of Arthopod Predator-Prey Systems, Princeton University Press, Princeton, NJ, 1978.
140
6 Axiomatic Modeling in Life Sciences with Case Studies
[18] M. W. Hirsch, S. Smale, R. L. Devaney, Differential Equations, Dynamical Systems, and an Introduction to Chaos, Pure and Applied Mathematics, Academic Press, San Diego, 2004. [19] J. M. Kaplan, Adenovirus-based cancer gene therapy, Curr. Gene Ther., 5 (2005), 595–605. [20] E. Kelly, S. J. Russell, History of oncolytic viruses: genesis to genetic engineering. Mol. Ther., 15 (2007), 651–659. [21] D. H. Kirn, F. McCormick, Replicating viruses as selective cancer therapeutics, Mol. Med. Today, 2 (1996), 519–527. [22] N. L. Komarova, E. Barnes, P. Klenerman, D. Wodarz, Boosting immunity by antiviral drug therapy: a simple relationship among timing, efficacy, and success, Proc. Natl. Acad. Sci. USA, 100 (2003), 1855–1860. [23] N. L. Komarova, D. Wodarz, ODE models for oncolytic virus dynamics. J. Theor. Biol., 263 (2010), 530–543. [24] F. Lechner, J. Sullivan, H. Spiegel, D. F. Nixon, B. Ferrari, A. Davis, B. Borkowsky, H. Pollack, E. Barnes, G. Dusheiko, P. Klenerman, Why do cytotoxic T lymphocytes fail to eliminate hepatitis C virus? Lessons from studies using major histocompatibility complex class I peptide tetramers, Philos. Trans. R. Soc. Lond., B, Biol. Sci., 355 (2000), 1085–1092. [25] F. Lechner, D. K. Wong, P. R. Dunbar, R. Chapman, R. T. Chung, P. Dohrenwend, G. Robbins, R. Phillips, P. Klenerman, B. D. Walker, Analysis of successful immune responses in persons infected with hepatitis C virus, J. Exp. Med., 191 (2000), 1499– 1512. [26] S. R. Lewin, R. M. Ribeiro, T. Walters, G. K. Lau, S. Bowden, S. Locarnini, A. S. Perelson, Analysis of hepatitis B viral load decline under potent therapy: complex decay profiles observed, Hepatology, 34 (2001), 1012–1020. [27] J. D. Lifson, J. L. Rossio, R. Arnaout, L. Li, T. L. Parks, D. K. Schneider, R. F. Kiser, V. J. Coalter, G. Walsh, R. J. Imming, B. Fisher, B. M. Flynn, N. Bischofberger, M. Piatak, V. M. Hirsch, M. A. Nowak, D. 
Wodarz, Containment of simian immunodeficiency virus infection: cellular immune responses and protection from rechallenge following transient postinoculation antiretroviral treatment, J. Virol., 74 (2000), 2584– 2593. [28] J. D. Lifson, J. L. Rossio, M. Piatak, T. Parks, L. Li, R. Kiser, V. Coalter, B. Fisher, B. M. Flynn, S. Czajak, V. M. Hirsch, K. A. Reimann, J. E. Schmitz, J. Ghrayeb, N. Bischofberger, M. A. Nowak, R. C. Desrosiers, D. Wodarz, Role of CD8(+) lymphocytes in control of simian immunodeficiency virus infection and resistance to rechallenge after transient early antiretroviral treatment, J. Virol., 75 (2001), 10187–10199. [29] H. F. Lohr, S. Krug, W. Herr, S. Weyer, J. Schlaak, T. Wolfel, G. Gerken, K. H. Meyer zum Buschenfelde, Quantitative and functional analysis of core-specific Thelper cell and CTL activities in acute and chronic hepatitis B, Liver, 18 (1998), 405– 413.
6.1 Axiomatic Modeling in Life Sciences
141
[30] R. M. Lorence, A. L. Pecora, P. P. Major, S. J. Hotte, S. A. Laurie, M. S. Roberts, W. S. Groene, M. K. Bamat, Overview of phase I studies of intravenous administration of PV701, an oncolytic virus, Curr. Opin. Mol. Ther., 5 (2003), 618–624. [31] F. Lori, R. Maserati, A. Foli, E. Seminari, J. Timpone, J. Lisziewicz, Structured treatment interruptions to control HIV-1 infection, Lancet, 355 (2000), 287–288. [32] M. K. Maini, A. Bertoletti, How can the cellular immune response control hepatitis B virus replication? J. Viral Hepat., 7 (2000), 321–326. [33] R. Malka, V. Rom-Kedar, Bacteria-phagocyte dynamics, axiomatic modelling and mass-action kinetics, Math Biosci Eng, 8 (2011), 475–502. [34] R. Malka, E. Shochat, V. Rom-Kedar, Bistability and bacterial infections, PLoS ONE, 5 (2010), e10010. [35] R. M. May. Stability and Complexity in Model Ecosystems, Princeton Landmarks in Biology, Princeton University Press, Princeton, NJ, 2001. [36] F. McCormick, Cancer-specific viruses and the development of ONYX-015. Cancer Biol. Ther., 2 (2003), S157–160. [37] F. McCormick, Future prospects for oncolytic therapy, Oncogene, 24 (2005), 7817– 7819. [38] L. J. Montaner, Structured treatment interruptions to control HIV-1 and limit drug exposure, Trends Immunol., 22 (2001), 92–96. [39] V. Muller, A. F. Maree, R. J. De Boer, Small variations in multiple parameters account for wide variations in HIV-1 set-points: a novel modelling approach, Proc. Biol. Sci., 268 (2001), 235–242. [40] A. U. Neumann, N. P. Lam, H. Dahari, D. R. Gretch, T. E. Wiley, T. J. Layden, A. S. Perelson, Hepatitis C viral dynamics in vivo and the antiviral efficacy of interferonalpha therapy, Science, 282 (1998), 103–107. [41] A. S. Novozhilov, F. S. Berezovskaya, E. V. Koonin, G. P. Karev, Mathematical modeling of tumor therapy with oncolytic viruses: regimes with complete tumor elimination within the framework of deterministic models, Biol. Direct, 1 (2006), 6. [42] M. A. Nowak, R. M. 
May, Virus dynamics, Mathematical Principles of Immunology and Virology, Oxford University Press, Oxford, 2000. [43] G. M. Ortiz, J. Hu, J. A. Goldwitz, R. Chandwani, M. Larsson, N. Bhardwaj, S. Bonhoeffer, B. Ramratnam, L. Zhang, M. M. Markowitz, D. F. Nixon, Residual viral replication during antiretroviral therapy boosts human immunodeficiency virus type 1-specific CD8+ T-cell responses in subjects treated early after infection, J. Virol., 76 (2002), 411–415. [44] C. C. O’Shea, Viruses – seeking and destroying the tumor program, Oncogene, 24 (2005), 7640–7655. [45] K. A. Parato, D. Senger, P. A. Forsyth, J. C. Bell, Recent progress in the battle between oncolytic viruses and tumours, Nat. Rev. Cancer, 5 (2005), 965–976. [46] A. S. Perelson, Modelling viral and immune system dynamics, Nat. Rev. Immunol., 2 (2002), 28–36.
142
6 Axiomatic Modeling in Life Sciences with Case Studies
[47] H. Pirim, B. Eksioglu, An axiomatic design for modeling biological systems, in: Proceedings of the Conference on Axiomatic Design, pp. 37–41, Lisbon, Portugal, 2009. [48] D. E. Post, H. Shim, E. Toussaint-Smith, E. G. Van Meir, Cancer scene investigation: how a cold virus became a tumor killer, Future Oncol, 1 (2005), 247–258. [49] M. S. Roberts, R. M. Lorence, W. S. Groene, M. K. Bamat, Naturally oncolytic viruses, Curr. Opin. Mol. Ther., 8 (2006), 314–321. [50] I. A. Rodriguez-Brenes, N. L. Komarova, D. Wodarz, Evolutionary dynamics of feedback escape and the development of stem-cell-driven cancers, Proc. Natl. Acad. Sci. USA, 108 (2011), 18983–18988. [51] E. S. Rosenberg, M. Altfeld, S. H. Poon, M. N. Phillips, B. M. Wilkes, R. L. Eldridge, G. K. Robbins, R. T. D’Aquila, P. J. Goulder, B. D. Walker, Immune control of HIV-1 after early treatment of acute infection, Nature, 407 (2000), 523–526. [52] E. Shochat, V. Rom-Kedar, Novel strategies for granulocyte colony-stimulating factor treatment of severe prolonged neutropenia suggested by mathematical modeling, Clin. Cancer Res., 14 (2008), 6354–6363. [53] E. Shochat, V. Rom-Kedar, L. A. Segel, G-CSF control of neutrophils dynamics in the blood. Bull. Math. Biol., 69 (2007), 2299–2338. [54] J. Maynard Smith, Models in Ecology, Cambridge University Press, New York, 1978. [55] R. Sorace, N. L. Komarova, Accumulation of neutral mutations in growing cell colonies with competition, In submission, 2011. [56] N. P. Suh, Axiomatic Design: Advances and Applications. The Oxford Series on Advanced Manufacturing, Oxford University Press, New York, 2001. [57] M. J. Vaha-Koskela, J. E. Heikkila, A. E. Hinkkanen, Oncolytic viruses in cancer therapy, Cancer Lett., 254 (2007), 178–216. [58] L. M. Wein, J. T. Wu, D. H. Kirn, Validation and analysis of a mathematical model of a replication-competent oncolytic virus for cancer treatment: implications for virus design and delivery, Cancer Res., 63 (2003), 1317–1324. [59] S. A. 
Whalley, J. M. Murray, D. Brown, G. J. Webster, V. C. Emery, G. M. Dusheiko, A. S. Perelson, Kinetics of acute hepatitis B virus infection in humans, J. Exp. Med., 193 (2001), 847–854. [60] D. Wodarz, Viruses as antitumor weapons: defining conditions for tumor remission, Cancer Res., 61 (2001), 3501–3507. [61] D. Wodarz, Gene therapy for killing p53-negative cancer cells: use of replicating versus nonreplicating agents, Hum. Gene Ther., 14 (2003), 153–159. [62] D. Wodarz, N. Komarova, Towards predictive computational models of oncolytic virus therapy: basis for experimental validation and model selection, PLoS ONE, 4 (2009), e4271. [63] D. Wodarz, N. L. Komarova, Computational Biology of Cancer: Lecture Notes and Mathematical Modeling. World Scientific, Hackensack, NJ, 2005.
6.1 Axiomatic Modeling in Life Sciences
Author Information Natalia L. Komarova, Department of Mathematics, University of California Irvine, Irvine, CA, USA E-mail: [email protected]
7
Theory, Applications, and Control of Nonlinear PDEs in Life Sciences
Andrei V. Fursikov
7.1 On One Semilinear Parabolic Equation of Normal Type*
Abstract. The aim of this paper is to begin the mathematical investigation of some tools for the suppression of turbulence in a fluid flow. The notion of a semilinear parabolic equation of normal type is introduced. The structure of the dynamical flow corresponding to equations of this type with periodic boundary condition is studied. A theorem on the stabilization of such an equation with an arbitrary initial condition by a starting control supported in a prescribed subset is formulated.
Keywords. Equation of Normal Type, Structure of Dynamical Flow, Stabilization by Starting Control
2010 Mathematics Subject Classification. 35K58, 37L05, 93D15
7.1.1 Introduction
The investigation of different tools for damping turbulence in a fluid flow is one of the problems of turbulence theory that is very important for many applications, including applications in biology and medicine. Mathematical aspects of this problem are studied in the theory of stabilization of solutions to the Navier–Stokes equations by feedback control. Up to now only the basis of a local stabilization theory has been established. Global aspects of this theory have not been studied at all, for many reasons, and in particular because very few rigorous mathematical results on the structure of fluid dynamics are known so far. Some attempts in this direction were made in [2]. The first step in the construction of the unbounded stable invariant manifold described in [2] is a theorem on the existence and uniqueness of a smooth solution, defined for time $t \in \mathbb{R}_+$, of the 3D Navier–Stokes equations with initial condition from an unbounded ellipsoid. This result is based on the Implicit Function Theorem, which is very general but local by its nature. That is why it seems very probable that the mentioned result from [2] on the existence of smooth solutions is true not only for initial conditions from an unbounded ellipsoid but also for initial conditions from some wider set.¹ If so, then knowing the structure of the dynamical flow corresponding to this system is extremely interesting, in particular because this knowledge can be applied to the creation of tools for turbulence suppression.

It is quite natural to begin the realization of this plan with simple examples. But for such well-known simplifications of the 3D Navier–Stokes system as Burgers' equation or the 2D Navier–Stokes system, global existence of solutions holds, and that is why it is more reasonable to begin with some other example.

In Section 7.1.2 of this paper we introduce a class of semilinear parabolic equations of normal type that fails to satisfy the energy bound "in the most degree." We derive an explicit formula for solutions of normal one-dimensional parabolic equations with periodic boundary conditions and establish some properties of these solutions. In Section 7.1.3 the structure of the phase flow of the dynamical system corresponding to this equation is described. In Section 7.1.4 we formulate the theorem on stabilization of this equation, with an arbitrary initial condition, by a control defined at $t = 0$ and supported in a prescribed subset. As is already known (see [4–6]), this start control can be used to construct a control on the boundary (in the case of a mixed Dirichlet boundary-value problem) that stabilizes the solution of that boundary-value problem. Section 7.1.5 is devoted to some concluding remarks.

* The work has been conducted as part of the RAS program "Theoretical problems of modern mathematics", project "Optimization of numerical algorithms of Mathematical Physics problems".
7.1.2 Semilinear Parabolic Equation of Normal Type
Our aim is to understand better how to investigate semilinear (or quasilinear) parabolic equations that do not satisfy energy estimates. For this we derive a semilinear parabolic equation that fails to satisfy the energy bound in the most degree, in a sense that will become clear below.

7.1.2.1 Derivation of Normal Parabolic Equation (NPE)
We consider Burgers' equation
$$\partial_t v(t,x) - \partial_{xx} v(t,x) - \partial_x v^2(t,x) = 0 \tag{7.1}$$
where $\partial_t v = \partial v/\partial t$, $\partial_{xx} v = \partial^2 v/\partial x^2$, with periodic boundary condition
$$v(t, x + 2\pi) = v(t,x) \tag{7.2}$$
and initial condition
$$v(t,x)|_{t=0} = v_0(x). \tag{7.3}$$

¹ If the millennium problem on the existence of nonlocal smooth solutions of the 3D Navier–Stokes system were solved in the affirmative, this set would coincide with the whole space.
Multiplying (7.1) by $v$, integrating with respect to $x$, and then integrating by parts and integrating in $t$, we obtain the well-known energy estimate
$$\int_0^{2\pi} v^2(t,x)\,dx + 2\int_0^t\!\!\int_0^{2\pi} (\partial_x v(\tau,x))^2\,dx\,d\tau \le \int_0^{2\pi} v_0^2(x)\,dx. \tag{7.4}$$
Let us derive from (7.1) an equation for which (7.4) is not valid. Differentiating (7.1) with respect to $x$ we get
$$\partial_t v_x(t,x) - \partial_{xx} v_x(t,x) - B(v,v_x) = 0 \tag{7.5}$$
where $v_x = \partial v/\partial x$ and
$$B(v,v_x) = 2v_x^2 + 2v\,\partial_x v_x. \tag{7.6}$$
Multiplying (7.6) by $v_x$ scalarly in $L_2(\mathbb{T}^1)$, where $\mathbb{T}^1 = \mathbb{R}/2\pi\mathbb{Z}$ is the circle, and integrating by parts we get
$$\int_0^{2\pi} B(v,v_x)\,v_x\,dx = \int_0^{2\pi} (2v_x^3 + 2v\,v_x\,\partial_x v_x)\,dx = \int_0^{2\pi} v_x^3\,dx. \tag{7.7}$$
Denote
$$L_2^0(\mathbb{T}^1) = \Big\{ v(x) \in L_2(\mathbb{T}^1) : \int_0^{2\pi} v(x)\,dx = 0 \Big\}. \tag{7.8}$$
Let us decompose the operator $B(v,v_x)$ as follows:
$$B(v,v_x) = B_n(v,v_x) + B_\tau(v,v_x). \tag{7.9}$$
Here $B_n(v,v_x) = \Phi(v,v_x)\,v_x$, where $\Phi(v,v_x)$ is a functional, and the vector $B_\tau(v,v_x)$ is orthogonal to the vector $v_x$ in $L_2(\mathbb{T}^1)$:
$$\int_0^{2\pi} B_\tau(v,v_x)\,v_x\,dx = 0. \tag{7.10}$$
To determine the functional $\Phi(v,v_x)$ we substitute (7.9) into (7.7) and use (7.10). As a result we get
$$\int_0^{2\pi} v_x^3\,dx = \int_0^{2\pi} \Phi(v,v_x)\,v_x^2\,dx = \Phi(v,v_x)\int_0^{2\pi} v_x^2\,dx.$$
Therefore the functional $\Phi$ does not depend on $v$; it depends on $v_x$ only, and for $v_x \ne 0$
$$\Phi(v_x) = \int_0^{2\pi} v_x^3\,dx \Big/ \int_0^{2\pi} v_x^2\,dx.$$
It is clear that by continuity $\Phi$ can be defined at $v_x \equiv 0$ by the formula $\Phi(0) = 0$. If $B_n = 0$ in decomposition (7.9), then $B = B_\tau$ is orthogonal to $v_x$, and an energy estimate can be derived for the solution $v_x$ of (7.5). When $B_\tau = 0$ in (7.9), the operator $B(v,v_x) = \Phi(v_x)\,v_x$ fails in the most degree to satisfy the condition that implies the energy estimate. Let us consider the case when $B(v,v_x) = \Phi(v_x)\,v_x$ in (7.5). Using the notation $v_x = y$ we obtain from (7.5) the following equation:
$$\partial_t y(t,x) - \partial_{xx} y(t,x) - \Phi(y)\,y = 0, \tag{7.11}$$
where
$$\Phi(y) = \begin{cases} \displaystyle\int_0^{2\pi} y(x)^3\,dx \Big/ \int_0^{2\pi} y(x)^2\,dx, & y \ne 0, \\ 0, & y \equiv 0. \end{cases} \tag{7.12}$$
Equation (7.11) is called a semilinear parabolic equation of normal type, or normal parabolic equation (NPE). It is the main object of our investigation in this article.

7.1.2.2 Unique Solvability for NPE with Initial Conditions from Unbounded Ellipsoid
We study equation (7.11) with periodic boundary condition, supplied with the initial condition
$$y(t,x)|_{t=0} = y_0(x). \tag{7.13}$$
It is natural to consider the dynamical system (7.11), (7.13) in the phase space $L_2^0(\mathbb{T}^1)$ defined in (7.8). Let us recall the definition of the Sobolev spaces that will be used below. Each periodic function $z(x)$ can be expanded in a Fourier series
$$z(x) = \sum_{k=-\infty}^{\infty} z_k e^{ikx}, \quad\text{where}\quad z_k = \frac{1}{2\pi}\int_0^{2\pi} z(x)\,e^{-ikx}\,dx. \tag{7.14}$$
If $z(x) \in L_2^0(\mathbb{T}^1)$, then $z_0 = 0$. Below we always suppose that this condition is fulfilled. Let $s \in \mathbb{R}$. The Sobolev space $H^s(\mathbb{T}^1)$ is the space of periodic real-valued distributions $z(x)$ satisfying $\int_{\mathbb{T}^1} z(x)\,dx = 0$ with finite norm
$$\|z\|^2_{H^s(\mathbb{T}^1)} \equiv \|z\|_s^2 = \sum_{k \in \mathbb{Z}\setminus\{0\}} |k|^{2s}\,|z_k|^2 < \infty \tag{7.15}$$
where $z_k$ are the Fourier coefficients of $z$ (see (7.14)).
Denote $Q = \mathbb{R}_+ \times \mathbb{T}^1$, $Q_{t_0} = (0,t_0) \times \mathbb{T}^1$ for $t_0 > 0$, and
$$H^{1,2(\alpha)}(Q) = \{ y \in L_2(\mathbb{R}_+; H^{2+\alpha}(\mathbb{T}^1)) : \partial_t y \in L_2(\mathbb{R}_+; H^{\alpha}(\mathbb{T}^1)) \} \tag{7.16}$$
with some $\alpha \le 0$. Note that it is natural to look for a solution of problem (7.11), (7.13) in the space $H^{1,2(-1)}(Q)$. In particular, by virtue of the well-known embedding inequality (see, e.g., [3, Chapter 3, Section 4])
$$\sup_{t \in \mathbb{R}_+} \|y(t,\cdot)\|_{L_2^0(\mathbb{T}^1)} \le c\,\|y\|_{H^{1,2(-1)}(Q)}$$
the space $L_2^0(\mathbb{T}^1)$ should be taken as the phase space of the dynamical system generated by (7.11), (7.13). Since for a real-valued function the Fourier decomposition
$$z(x) = \sum_{k \in \mathbb{Z}\setminus\{0\}} z_k e^{ikx} = 2\sum_{k=1}^{\infty} (\operatorname{Re} z_k \cos kx - \operatorname{Im} z_k \sin kx)$$
is true, we can interpret the set
$$\mathrm{El}_\rho = \Big\{ z \in L_2^0(\mathbb{T}^1) : \|z\|_{-1/2}^2 \equiv \sum_{k \in \mathbb{Z}\setminus\{0\}} \frac{|z_k|^2}{|k|} \le \rho^2 \Big\} = \Big\{ z \in L_2^0(\mathbb{T}^1) : \sum_{k=1}^{\infty} \frac{|\operatorname{Re} z_k|^2 + |\operatorname{Im} z_k|^2}{(\rho^2 k)/2} \le 1 \Big\} \tag{7.17}$$
as an ellipsoid in $L_2^0(\mathbb{T}^1)$ whose semi-axes directed along $\cos kx$, $\sin kx$ have, in the coordinates $\operatorname{Re} z_k$, $\operatorname{Im} z_k$, length $\rho\sqrt{k/2}$. Since $\rho\sqrt{k/2} \to \infty$ as $k \to \infty$, this ellipsoid is unbounded in $L_2^0(\mathbb{T}^1)$. The following analog of the result from [2] holds:

Theorem 7.1. If $\rho > 0$ is sufficiently small, then for each $y_0 \in \mathrm{El}_\rho$ there exists a unique solution $y(t,x) \in H^{1,2(-1)}(Q)$ of problem (7.11), (7.13). Moreover,
$$\|y(t,\cdot)\|_0 \le \alpha\,\|y_0\|_0\,e^{-t} \quad\text{as } t \to \infty \tag{7.18}$$
with a constant $\alpha > 0$ independent of the time $t > 0$ and of the datum $y_0 \in \mathrm{El}_\rho$.

To prove Theorem 7.1 we need

Lemma 7.2. There exists a sufficiently small $\rho > 0$ such that for each initial condition $y_0 \in H^{-1/2}(\mathbb{T}^1)$, $\|y_0\|_{-1/2} < \rho$, there exists a unique solution $y \in H^{1,2(-3/2)}(Q)$ of problem (7.11), (7.13). Moreover, the operator mapping the initial condition $y_0$ to the solution $y(t,x;y_0)$ of (7.11), (7.13) acts continuously from $\{ y_0 \in H^{-1/2}(\mathbb{T}^1) : \|y_0\|_{-1/2} < \rho \}$ to $H^{1,2(-3/2)}(Q)$.

The proof of Lemma 7.2 consists of a reduction to the inverse map theorem (see, e.g., [1]) by well-known methods (see, e.g., [2]).
7.1.2.3 Explicit Formula for Solution of NPE
The following explicit formula for the solution of problem (7.11), (7.13) holds:

Lemma 7.3. Let $S(t,x;y_0)$ be the solution of the heat equation
$$\partial_t S - \partial_{xx} S = 0, \qquad S|_{t=0} = y_0(x) \tag{7.19}$$
with periodic boundary condition. The solution of problem (7.11), (7.13) has the form
$$y(t,x;y_0) = \frac{S(t,x;y_0)}{1 - \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau}. \tag{7.20}$$
The proof of this lemma reduces to substituting (7.20) into (7.11) and a straightforward verification. Formula (7.20) implies

Lemma 7.4. Let $\mathrm{El}_\rho$ be the ellipsoid (7.17). There exists $\rho > 0$ such that for each $y_0 \in \mathrm{El}_\rho$ the solution $y(t,x;y_0)$ of problem (7.11), (7.13) satisfies the estimate
$$\|y(t,\cdot;y_0)\|_0 \le c\,\|y_0\|_0\,e^{-t} \quad \forall t > 0 \tag{7.21}$$
with a constant $c$ independent of $y_0 \in \mathrm{El}_\rho$.

Proof. By a corollary of the Sobolev embedding theorem, denoting by $y_{k,0}$ the Fourier coefficients of $y_0$, we get
$$\begin{aligned}
\Big| \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau \Big| &\le c\int_0^t \|S(\tau,\cdot;y_0)\|_{1/2}\,d\tau = c\int_0^t e^{-\tau/2}\Big( \sum_{k \ne 0} e^{-(2k^2-1)\tau}\,|y_{k,0}|^2\,|k| \Big)^{1/2} d\tau \\
&\le c\,\Big( \int_0^t e^{-\tau}\,d\tau \Big)^{1/2} \Big( \sum_{k \ne 0} \int_0^t e^{-(2k^2-1)\tau}\,d\tau\;|y_{k,0}|^2\,|k| \Big)^{1/2} \\
&\le c_1 \Big( \sum_{k \ne 0} |y_{k,0}|^2\,\frac{|k|}{2k^2-1} \Big)^{1/2} \le c_1\,\|y_0\|_{-1/2}
\end{aligned}$$
where $c_1$ does not depend on $y_0$ and $t > 0$. That is why, since $\|y_0\|_{-1/2} \le \rho$, the inequality
$$1 - \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau \ge 1 - c_1\rho \tag{7.22}$$
holds. As is well known, the solution $S(t,x;y_0)$ of (7.19) satisfies
$$\|S(t,\cdot;y_0)\|_0^2 = \sum_{k \ne 0} e^{-2k^2 t}\,|y_{0,k}|^2 \le e^{-2t}\,\|y_0\|_0^2. \tag{7.23}$$
Formula (7.20) and bounds (7.22), (7.23) yield
$$\|y(t,\cdot;y_0)\|_0^2 \le \frac{\|S(t,\cdot;y_0)\|_0^2}{1 - c_1\rho} \le \frac{e^{-2t}\,\|y_0\|_0^2}{1 - c_1\rho}. \tag{7.24}$$
Inequality (7.24) implies (7.21).

Theorem 7.1 follows from Lemmas 7.2 and 7.4.
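The "straightforward verification" behind Lemma 7.3 can be written out in a few lines. The following is only a sketch: the abbreviation $D(t)$ is introduced here for convenience and is not notation from the text, and the computation is valid on any time interval where $D(t) \ne 0$.

```latex
% Sketch: check that y = S/D solves (7.11), (7.13), where
% D(t) := 1 - \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau   (D is our shorthand).
\begin{align*}
&\text{Homogeneity of (7.12): } \Phi(cz) = c\,\Phi(z)\ (c \ne 0)
  \;\Longrightarrow\; \Phi(y) = \Phi(S/D) = \Phi(S)/D, \\
&\partial_t y = \frac{\partial_t S}{D} - \frac{S\,D'(t)}{D^2}
  = \frac{\partial_{xx} S}{D} + \frac{\Phi(S)\,S}{D^2}
  \quad (\text{by (7.19) and } D'(t) = -\Phi(S(t,\cdot;y_0))), \\
&\partial_{xx} y = \frac{\partial_{xx} S}{D}
  \quad (D \text{ does not depend on } x), \qquad
  \Phi(y)\,y = \frac{\Phi(S)}{D}\cdot\frac{S}{D} = \frac{\Phi(S)\,S}{D^2}, \\
&\partial_t y - \partial_{xx} y - \Phi(y)\,y = 0, \qquad
  y|_{t=0} = \frac{S|_{t=0}}{D(0)} = y_0 .
\end{align*}
```

The same computation shows why the zeros of $D(t)$ matter: the solution given by (7.20) exists exactly as long as the denominator stays away from zero.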
7.1.3 The Structure of NPE Dynamics
The aim of this section is to find the main features of the dynamical flow corresponding to NPE. We decompose the phase space of the dynamical system into three sets with different behavior of the dynamical flow inside each of them.

7.1.3.1 Distinctive Sets of Phase Space
Let us give definitions of three subsets of the phase space for NPE. Recall that we take $L_2^0(\mathbb{T}^1) \equiv H^0(\mathbb{T}^1)$ as the phase space for problem (7.11), (7.13).

Definition 7.5. The set $M_- \equiv M_-(\alpha) \subset H^0(\mathbb{T}^1)$ of $y_0$ such that the solution $y(t,x;y_0) \in H^{1,2(-1)}(Q)$ of problem (7.11), (7.13) exists and satisfies the inequality
$$\|y(t,\cdot;y_0)\|_0 \le \alpha\,\|y_0\|_0\,e^{-t} \quad \forall t > 0 \tag{7.25}$$
is called the set of stability. Here $\alpha > 1$ is a certain fixed number.

The following simple condition guarantees the inclusion $y_0 \in M_-(\alpha)$:

Lemma 7.6. Let $\alpha > 1$. If $y_0 \in H^0(\mathbb{T}^1)$ satisfies the bound
$$\sup_{t \in \mathbb{R}_+} \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau \le \frac{\alpha - 1}{\alpha}, \tag{7.26}$$
then $y_0 \in M_-(\alpha)$.

Proof. By virtue of (7.20), (7.23) and (7.26),
$$\|y(t,\cdot;y_0)\|_0 \le \frac{\|S(t,\cdot;y_0)\|_0}{1 - \sup_{t \in \mathbb{R}_+} \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau} \le \alpha\,e^{-t}\,\|y_0\|_0.$$
Definition 7.7. The set $M_+ \subset H^0(\mathbb{T}^1)$ of initial conditions $y_0$ from (7.11), (7.13) such that the corresponding solution $y(t,x;y_0)$ exists only on a finite interval $t \in (0,t_0)$, with $t_0 > 0$ depending on $y_0$, and blows up at $t = t_0$, is called the set of explosions.

By virtue of formula (7.20) for the solution $y(t,x;y_0)$,
$$M_+ = \Big\{ y_0 \in H^0(\mathbb{T}^1) : \exists\, t_0 > 0, \ \int_0^{t_0} \Phi(S(\tau,\cdot;y_0))\,d\tau = 1 \Big\}. \tag{7.27}$$
The minimal value among the $t_0$ for which the equality in (7.27) holds is called the time of explosion.

Definition 7.8. The collection
$$M_I(\alpha) = H^0(\mathbb{T}^1) \setminus \{ M_-(\alpha) \cup M_+ \} \tag{7.28}$$
is called the intermediate set.

Remark 7.9. The definitions of the set of stability and of the intermediate set include the parameter $\alpha > 1$, and from this point of view they are not absolute. Nevertheless, they are convenient to use.

We study below the properties of these sets and, in particular, we show that none of them is empty. We begin with the set of stability. This set is the most important one for us.

7.1.3.2 Subspaces Belonging to the Set of Stability
We show first of all that $M_-(\alpha) \ne \emptyset$. This fact follows from

Lemma 7.10. Let $\rho \le (\alpha^2 - 1)/(\alpha^2 c_1)$, where $c_1$ is the constant from (7.22). Then
$$\mathrm{El}_\rho \subset M_-(\alpha) \tag{7.29}$$
where the set $\mathrm{El}_\rho$ is defined in (7.17).

Proof. It follows from inequality (7.24) that (7.29) holds if $\alpha^2 \ge 1/(1 - c_1\rho)$. But the last inequality is equivalent to $\rho \le (\alpha^2 - 1)/(\alpha^2 c_1)$.

We prove that there exists an infinite-dimensional subspace belonging to $M_-(\alpha)$. Besides, there are one-dimensional subspaces such that the whole ray along one direction of the subspace belongs to $M_-(\alpha)$, but along the opposite direction only a finite segment belongs to $M_-(\alpha)$.
Let $U_L$ be a subset of the natural numbers with the following property: $k_1 + k_2 - k_3 \ne 0$ for all $k_1, k_2, k_3 \in U_L$, i.e.,
$$U_L = \{ k \in \mathbb{N} : k_1 + k_2 - k_3 \ne 0 \ \ \forall k_1, k_2, k_3 \in U_L \}. \tag{7.30}$$
An example of $U_L$ is the set of all odd natural numbers.

Lemma 7.11. The subspace
$$L = \Big\{ y_0 = \sum_{k \in U_L} (z_k e^{ikx} + \bar z_k e^{-ikx}), \ z_k \in \mathbb{C} \Big\} \subset H^0(\mathbb{T}^1) \tag{7.31}$$
belongs to $M_-(\alpha)$ if the set $U_L$ satisfies (7.30).

Proof. If $y_0 \in L$ then
$$\begin{aligned}
\int_0^{2\pi} S^3(t,x;y_0)\,dx = \sum_{k_1,k_2,k_3 \in U_L} e^{-(k_1^2 + k_2^2 + k_3^2)t} \int_0^{2\pi} \Big[ & z_{k_1} z_{k_2} z_{k_3}\,e^{i(k_1+k_2+k_3)x} + 3 z_{k_1} z_{k_2} \bar z_{k_3}\,e^{i(k_1+k_2-k_3)x} \\
& + \bar z_{k_1} \bar z_{k_2} \bar z_{k_3}\,e^{-i(k_1+k_2+k_3)x} + 3 z_{k_1} \bar z_{k_2} \bar z_{k_3}\,e^{i(k_1-k_2-k_3)x} \Big]\,dx = 0
\end{aligned}$$
because, by the definition of $U_L$, $k_1 + k_2 - k_3 \ne 0$ for all $k_1, k_2, k_3 \in U_L$. Since $\Phi(S(t,\cdot;y_0)) = 0$, by (7.20) the solution $y(t,x;y_0)$ of (7.11), (7.13) satisfies
$$\|y(t,\cdot;y_0)\|_0^2 = \sum_{k \in U_L} e^{-2k^2 t}\,\|z_k e^{ikx} + \bar z_k e^{-ikx}\|_0^2 \le e^{-2t}\,\|y_0\|_0^2.$$
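Condition (7.30) is easy to test on finite sets of indices. The sketch below is illustrative only: the function name `satisfies_condition_730` is ours, and a brute-force check on a finite truncation says nothing rigorous about an infinite set such as all odd numbers (for which the property follows simply because the sum of two odd numbers is even).

```python
from itertools import product


def satisfies_condition_730(indices):
    """Return True if k1 + k2 - k3 != 0 for all k1, k2, k3 in the finite set.

    Equivalently: no sum of two elements (repetition allowed) lies in the set.
    """
    index_set = set(indices)
    return all(k1 + k2 not in index_set
               for k1, k2 in product(index_set, repeat=2))


# Odd numbers below 100: a finite truncation of the example given in the text.
odd_modes = range(1, 100, 2)
print(satisfies_condition_730(odd_modes))   # odd + odd is even, so no violation
print(satisfies_condition_730([1, 2, 3]))   # violated, e.g. 1 + 1 - 2 = 0
```

For index sets that pass this check, the cubic integral in the proof of Lemma 7.11 has no resonant terms, which is what makes $\Phi(S)$ vanish.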
7.1.3.3 Rays from $M_-(\alpha)$
Consider now examples of rays belonging to $M_-(\alpha)$. Let $k \in \mathbb{N}$ be fixed. We take
$$y_0(x) = z_1 e^{ikx} + \bar z_1 e^{-ikx} + z_2 e^{i2kx} + \bar z_2 e^{-i2kx}. \tag{7.32}$$
We will use the following notation for the complex numbers $z_j$, $j = 1, 2$:
$$z_1 = |z_1|\,e^{i\varphi_1}, \quad z_2 = |z_2|\,e^{i\varphi_2}, \quad\text{i.e. } \varphi_j = \operatorname{Arg} z_j, \ j = 1, 2. \tag{7.33}$$
Then simple calculations yield
$$\begin{aligned}
\int_0^{2\pi} S^3(t,x;y_0)\,dx &= \int_0^{2\pi} \big( 2|z_1| \cos(kx + \varphi_1)\,e^{-k^2 t} + 2|z_2| \cos(2kx + \varphi_2)\,e^{-4k^2 t} \big)^3 dx \\
&= 8\int_0^{2\pi} \Big[ |z_1|^3 \cos^3(kx + \varphi_1)\,e^{-3k^2 t} + 3|z_1|^2 |z_2| \cos^2(kx + \varphi_1) \cos(2kx + \varphi_2)\,e^{-6k^2 t} \\
&\qquad\quad + 3|z_1| |z_2|^2 \cos(kx + \varphi_1) \cos^2(2kx + \varphi_2)\,e^{-9k^2 t} + |z_2|^3 \cos^3(2kx + \varphi_2)\,e^{-12k^2 t} \Big]\,dx \\
&= 12\pi\,|z_1|^2 |z_2| \cos(2\varphi_1 - \varphi_2)\,e^{-6k^2 t}
\end{aligned} \tag{7.34}$$
where the initial condition $y_0$ is defined by (7.32). Similarly to (7.34), we obtain
$$\int_0^{2\pi} S^2(t,x;y_0)\,dx = 4\pi\,e^{-2k^2 t}\big( |z_1|^2 + |z_2|^2 e^{-6k^2 t} \big), \tag{7.35}$$
and by the definition (7.12) of $\Phi$ we get
$$\Phi(S(t,\cdot;y_0)) = \frac{3|z_1|^2 |z_2|\,e^{-4k^2 t} \cos(2\varphi_1 - \varphi_2)}{|z_1|^2 + |z_2|^2 e^{-6k^2 t}}. \tag{7.36}$$
This formula provides us with new sets belonging to $M_-(\alpha)$.

Lemma 7.12. In the initial condition $y_0$ from (7.32) let
$$|z_1| = |z_2| = \rho \in \mathbb{R}_+ \quad\text{and}\quad \frac{\pi}{2} \le 2\varphi_1 - \varphi_2 \le \frac{3\pi}{2} \tag{7.37}$$
with fixed $\varphi_1, \varphi_2 \in (0, 2\pi)$. Then the ray
$$\{ \rho \in \mathbb{R}_+, \ \varphi_1, \ \varphi_2 \} \tag{7.38}$$
of initial conditions (7.32) belongs to $M_-(\alpha)$.

Proof. The inclusion $2\varphi_1 - \varphi_2 \in [\pi/2, 3\pi/2]$ implies $\cos(2\varphi_1 - \varphi_2) \le 0$, and hence, by (7.36),
$$1 - \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau \ge 1.$$
Lemma 7.13. Let the parameters $\varphi_1, \varphi_2$ of $y_0$ from (7.32) satisfy
$$0 \le 2\varphi_1 - \varphi_2 \ […] \tag{7.39}$$
[…] $> 0$. Then the segment
$$\{ \rho \in [0, \rho_0], \ \varphi_1, \ \varphi_2 \} \tag{7.40}$$
of initial conditions (7.32) belongs to $M_-(\alpha)$ if $\rho_0$ from (7.40) is defined by the equation
$$\frac{3\rho_0}{k^2} \int_0^\infty \frac{e^{-4t}\,dt}{1 + e^{-6t}}\,\cos(2\varphi_1 - \varphi_2) = \frac{\alpha - 1}{\alpha}. \tag{7.41}$$

Proof. By (7.39) we have $\cos(2\varphi_1 - \varphi_2) > 0$. Nevertheless, using the definition (7.41) of $\rho_0$ and the equality $|z_1| = |z_2| = \rho$, we get from (7.36) that for $\rho \in (0, \rho_0]$
$$\max_{t \in \mathbb{R}_+} \int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau \le \frac{3\rho_0}{k^2} \int_0^\infty \frac{e^{-4t}\,dt}{1 + e^{-6t}}\,\cos(2\varphi_1 - \varphi_2) = \frac{\alpha - 1}{\alpha}.$$
Therefore, Lemma 7.13 follows from Lemma 7.6.

It is clear that extending the segment $\{ \rho \in [0, \rho_0], \varphi_1, \varphi_2 \}$ beyond $\rho_0$ moves the corresponding initial condition $y_0$ from $M_-(\alpha)$ to $M_I(\alpha)$. Lemmas 7.12 and 7.13 imply

Theorem 7.14. Let the initial condition $y_0$ from (7.32), (7.33) satisfy (7.37). Then the ray
$$R(\varphi_1, \varphi_2) = \{ |z_1| = |z_2| = \rho \in \mathbb{R}_+, \ \varphi_1, \ \varphi_2 \} \cup \{ |z_1| = |z_2| = \rho \in (0, \rho_0), \ \varphi_1 + \pi, \ \varphi_2 + \pi \}, \tag{7.42}$$
where $\rho_0$ is defined by equation (7.41), belongs to $M_-(\alpha)$.

Proof. We only have to note that if $\varphi_1, \varphi_2$ satisfy (7.37), then $\varphi_1 + \pi, \varphi_2 + \pi$ satisfy (7.39), and to take advantage of Lemmas 7.12 and 7.13.

7.1.3.4 Set of Explosions and Intermediate Set
First of all we show that $M_+ \ne \emptyset$ and $M_I(\alpha) \ne \emptyset$ for each $\alpha > 1$.

Lemma 7.15. The set $M_+$ is not empty.
Proof. We take the initial condition (7.32), (7.33) with $|z_1| = |z_2| = \rho > 0$ and $\cos(2\varphi_1 - \varphi_2) > 0$. Then by (7.36)
$$\int_0^t \Phi(S(\tau,\cdot;y_0))\,d\tau = 3\rho \int_0^t \frac{e^{-4k^2\tau}\,d\tau}{1 + e^{-6k^2\tau}}\,\cos(2\varphi_1 - \varphi_2).$$
We take $\rho$ such that
$$3\rho \int_0^\infty \frac{e^{-4k^2\tau}\,d\tau}{1 + e^{-6k^2\tau}}\,\cos(2\varphi_1 - \varphi_2) = 2.$$
Then there exists $t_0$ such that the denominator in (7.20) equals zero at $t = t_0$. Hence the solution $y(t,x;y_0)$ of (7.11), (7.13) defined by (7.20) blows up.

Lemma 7.16. For each $\alpha > 1$ the set $M_I(\alpha)$ is not empty. Moreover, there exists an initial condition $y_0 \in M_I(\alpha)$ such that
$$\|y(t,\cdot;y_0)\|_0 \to \infty \quad\text{as } t \to \infty \tag{7.43}$$
where $y(t,x;y_0)$ is the solution of problem (7.11), (7.13).
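For the two-mode initial data (7.32) with $|z_1| = |z_2| = \rho$, the quantities in Lemmas 7.13 and 7.15 can be evaluated numerically. The sketch below is an illustration under our own assumptions, not part of the original argument: the function names are ours, the integral is approximated by Simpson's rule after the substitution $s = k^2\tau$, the tail beyond $s = 12$ is dropped, and `delta` stands for $2\varphi_1 - \varphi_2$.

```python
import math


def integrand(s):
    """Kernel e^{-4s} / (1 + e^{-6s}) appearing in (7.41)."""
    return math.exp(-4.0 * s) / (1.0 + math.exp(-6.0 * s))


def base_integral(upper, n=2000):
    """Composite Simpson approximation of the kernel integral over [0, upper]."""
    if upper <= 0.0:
        return 0.0
    h = upper / n  # n is even, as Simpson's rule requires
    total = integrand(0.0) + integrand(upper)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return total * h / 3.0


S_MAX = 12.0                   # the kernel is below e^{-48} past this point
I_INF = base_integral(S_MAX)   # approximates the integral over [0, infinity)


def phi_integral(t, rho, k, delta):
    """Integral of Phi(S(tau)) over [0, t], from (7.36) with |z1| = |z2| = rho."""
    scaled = min(k * k * t, S_MAX)  # substitution s = k^2 * tau
    return 3.0 * rho * math.cos(delta) / (k * k) * base_integral(scaled)


def rho_zero(alpha, k, delta):
    """Segment length rho_0 solving (7.41); needs cos(delta) > 0."""
    c = math.cos(delta)
    if c <= 0.0:
        raise ValueError("(7.41) requires cos(2*phi1 - phi2) > 0")
    return (alpha - 1.0) / alpha * k * k / (3.0 * I_INF * c)


def blow_up_time(rho, k, delta):
    """Smallest t0 at which the denominator of (7.20) vanishes, else None."""
    amp = 3.0 * rho * math.cos(delta) / (k * k)
    if amp * I_INF <= 1.0:
        return None  # the integral never reaches 1
    lo, hi = 0.0, S_MAX
    for _ in range(60):  # bisection; the integral is increasing when amp > 0
        mid = 0.5 * (lo + hi)
        if amp * base_integral(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return hi / (k * k)


if __name__ == "__main__":
    print("rho_0 for alpha = 2, k = 1, delta = 0:", rho_zero(2.0, 1, 0.0))
    rho_explode = 2.0 / (3.0 * I_INF)  # the choice of rho made in Lemma 7.15
    print("time of explosion:", blow_up_time(rho_explode, 1, 0.0))
```

With $\cos(2\varphi_1 - \varphi_2) \le 0$ the function `blow_up_time` returns `None`, in line with Lemma 7.12; with $\rho$ chosen as in the proof of Lemma 7.15 it returns a finite time of explosion.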
7.1.4 Stabilization of Solution for NPE by Start Control
We consider the semilinear parabolic equation (7.11):
$$\partial_t y(t,x) - \partial_{xx} y(t,x) - \Phi(y)\,y = 0 \tag{7.44}$$
where
$$\Phi(y) = \begin{cases} \displaystyle\int_0^{2\pi} y(x)^3\,dx \Big/ \int_0^{2\pi} y(x)^2\,dx, & y \ne 0, \\ 0, & y \equiv 0, \end{cases} \tag{7.45}$$
with periodic boundary condition
$$y(t, x + 2\pi) = y(t,x) \tag{7.46}$$
and initial condition
$$y(t,x)|_{t=0} = y_0(x) + u(x). \tag{7.47}$$
Here $y_0(x) \in H^0(\mathbb{T}^1) \equiv L_2^0(\mathbb{T}^1)$ is an arbitrary given initial datum and $u(x) \in H^0(\mathbb{T}^1)$ is a control. We assume that on the circle $\mathbb{T}^1 = \mathbb{R}/2\pi\mathbb{Z}$ a segment $[a,b]$ is given, and the control $u(x)$ is supported in $[a,b]$:
$$\operatorname{supp} u \subset [a,b]. \tag{7.48}$$
The setting of the stabilization problem is as follows: given $y_0(x) \in H^0(\mathbb{T}^1)$, find a control $u \in H^0(\mathbb{T}^1)$ satisfying (7.48) such that there exists a unique solution $y(t,x;y_0 + u) \in H^{1,2(-1)}(Q)$, and this solution satisfies the estimate
$$\|y(t,\cdot;y_0 + u)\|_0 \le \alpha\,\|y_0 + u\|_0\,e^{-t} \quad \forall t > 0 \tag{7.49}$$
with a certain $\alpha > 1$.

Note that if the initial condition $y_0 \in M_-(\alpha)$, then by Definition 7.5 of the set of stability $M_-(\alpha)$ the control $u \equiv 0$ is a solution of the stabilization problem. In other words, in this case the stabilization problem is trivial. If $y_0 \in M_+$, or $y_0 \in M_I(\alpha)$ and the corresponding solution $y(t,x;y_0)$ satisfies (7.43), then the stabilization problem is rich in content. The following theorem is true:

Theorem 7.17. Let $y_0 \in H^0(\mathbb{T}^1)$ be given. Then there exists a control $u \in H^0(\mathbb{T}^1)$ satisfying (7.48) such that there exists a unique solution $y(t,x;y_0 + u)$, and this solution satisfies the bound (7.49) with a certain $\alpha > 1$.

The proof of Theorem 7.17 will be published elsewhere.
7.1.5 Concluding Remarks
The structure of the dynamical flow corresponding to the semilinear parabolic equation of normal type (NPE) has been studied, and a theorem on the stabilization of this equation by a start feedback control supported in a given subdomain has been formulated. Our conjecture (which we intend to verify in future investigations) is that the results obtained have some relation to the original (i.e., Burgers') equation. More exactly, we expect that the set $M_-$ of stability for NPE and the analogous set for Burgers' equation are close to each other as sets in the common phase space, and that the $H^1$-norm of the solution of Burgers' equation with initial condition from the set $M_+$ of explosions for NPE should increase in the first stage of its evolution. Moreover, we hope that for the construction of a stabilizing feedback control for Burgers' equation we will be able to use the stabilizing control for NPE. The final purpose of our future studies is to extend these investigations to the Navier–Stokes equations, i.e., to the construction of nonlocal tools for turbulence damping in the adequate mathematical model.
Bibliography
[1] V. M. Alekseev, V. M. Tikhomirov, S. V. Fomin, Optimal Control, Consultants Bureau, New York, 1987.
[2] A. V. Fursikov, Local existence theorems with unbounded set of input data and unboundedness of stable invariant manifolds for 3D Navier-Stokes equations, Discr. Cont. Dyn. Syst. Series S, 3:2 (2010), 269–290.
[3] A. V. Fursikov, Optimal control of distributed systems. Theory and applications, in: Translations of Mathematical Monographs 187, Amer. Math. Society, Providence, Rhode Island, 2000.
[4] A. V. Fursikov, Stabilizability of quasilinear parabolic equation by feedback boundary control, Sbornik: Mathematics, 192:4 (2001), 593–639.
[5] A. V. Fursikov, Stabilizability of two-dimensional Navier-Stokes equations with help of a boundary feedback control, J. Math. Fluid Mech., 3 (2001), 259–301.
[6] A. V. Fursikov, Stabilization for the 3D Navier-Stokes system by feedback boundary control, Discrete and Cont. Dyn. Syst., 10:1&2 (2004), 289–314.
[7] R. Temam, Navier-Stokes Equations – Theory and Numerical Analysis, AMS Chelsea Publishing, Providence, 2001.
Author Information Andrei V. Fursikov, Department of Mechanics and Mathematics, Moscow State University, Moscow, Russia E-mail: [email protected]
Alexander A. Kovalevsky
7.2 On some Classes of Nonlinear Equations with $L^1$-Data
Abstract. In this article, we give a survey of results on the existence and properties of solutions to the Dirichlet problem for some classes of nonlinear second- and fourth-order equations with $L^1$-data. In particular cases, the principal parts of the equations under consideration may be generated by the p-Laplace operator or may include it. The study of equations with the p-Laplacian and its generalizations is of great importance, since such operators are involved in the statement of many applied problems, for instance, problems modeling the motion of non-Newtonian fluids, biological pattern formation and the interaction of diffusing biological species.
Keywords. Dirichlet Problem, Entropy and Weak Solution, Existence, $L^1$-Data, Nonlinear Elliptic Second- and Fourth-order Equation, Uniqueness and Summability of Solution
2010 Mathematics Subject Classification. 35B45, 35J60, 35J65
In this article, we give a survey of results on the existence and properties of solutions for some classes of nonlinear equations with $L^1$-data. The material of the article is divided into two main parts. In Section 7.2.1, we deal with the Dirichlet problem for nonlinear elliptic second-order equations with $L^1$-data. The cases of usual and degenerate coercivity for the coefficients of the equations are treated, and the notions of weak, entropy and renormalized solutions to the given problem are considered. We describe relations between these notions and discuss known results on the existence and summability of the solutions. Section 7.2.2 is devoted to the Dirichlet problem for nonlinear fourth-order equations with $L^1$-data and coefficients satisfying a strengthened coercivity condition. We consider different kinds of solution to this problem. Entropy and proper entropy solutions are among them. We describe relations between the kinds of solution under consideration and state known results on their existence, uniqueness and summability properties.

We observe that in particular cases, the principal parts of the equations considered in the article may be generated by the p-Laplace operator ($p > 1$) or may include it. The investigation of equations with the p-Laplacian and its generalizations is of great importance, since such operators arise, for instance, in the study of the motion of non-Newtonian fluids (dilatant, pseudo-plastic fluids) as well as in some problems of glaciology and astronomy. The p-Laplace operator and its generalizations are also involved in the statement of problems modeling biological pattern formation and the interaction of diffusing biological species.
7.2.1 Nonlinear Elliptic Second-order Equations with $L^1$-data

7.2.1.1 The Dirichlet Problem for Equations with Usual Coercivity
Let $n \in \mathbb{N}$, $n \ge 2$, let $\Omega$ be a bounded open set of $\mathbb{R}^n$ and $p \in (1,n)$. Let $c_1, c_2 > 0$, $g \in L^{p/(p-1)}(\Omega)$, $g \ge 0$ in $\Omega$, and for every $i \in \{1,\dots,n\}$ let $a_i : \Omega \times \mathbb{R}^n \to \mathbb{R}$ be a Carathéodory function. We shall suppose that for almost every $x \in \Omega$ and for every $\xi \in \mathbb{R}^n$ the following inequalities hold:
$$\sum_{i=1}^n |a_i(x,\xi)| \le c_1 |\xi|^{p-1} + g(x), \tag{7.50}$$
$$\sum_{i=1}^n a_i(x,\xi)\,\xi_i \ge c_2 |\xi|^p. \tag{7.51}$$
Moreover, we shall assume that for almost every $x \in \Omega$ and for every $\xi, \xi' \in \mathbb{R}^n$, $\xi \ne \xi'$, the next inequality holds:
$$\sum_{i=1}^n [\,a_i(x,\xi) - a_i(x,\xi')\,](\xi_i - \xi_i') > 0. \tag{7.52}$$
Let $f \in L^1(\Omega)$. We consider the following Dirichlet problem:
$$-\sum_{i=1}^n \frac{\partial}{\partial x_i}\,a_i(x,\nabla u) = f \quad\text{in } \Omega, \tag{7.53}$$
$$u = 0 \quad\text{on } \partial\Omega. \tag{7.54}$$
There is an extensive literature devoted to the study of the existence and properties of solutions of problem (7.53), (7.54) and similar problems with $L^1$- or measure data (see, for instance, [1–5, 7–13, 15, 20, 21, 23, 24, 27, 28, 37–41]). The notions of weak, entropy and renormalized solutions of the problems under consideration were introduced and investigated.

The notion of a weak solution (a solution in $\mathring{W}^{1,1}(\Omega)$ in the sense of an integral identity for smooth functions) is a natural analogue of the usual notion of generalized solution which, generally speaking, makes no sense in the case of $L^1$-data. Theorems on the existence of weak solutions were established in [8–11]. We note that in general a weak solution does not exist for all admissible values of the parameter $p$ (see [3]). Moreover, in the case $p = 2$ there exists an example of the non-uniqueness of a weak solution [8].

An effective approach to the study of the solvability of problems with $L^1$-data similar to problem (7.53), (7.54) was proposed in [3]. This approach is connected with introducing the notion of entropy solutions to the problems under consideration. Generally speaking, entropy solutions are elements of a functional set which is essentially larger than the corresponding energy space in the case of sufficiently regular data. It turns out that an entropy solution of problem (7.53), (7.54) and of some close problems exists without additional restrictions on the parameter $p$ and is unique.

In application to problem (7.53), (7.54), the essence of the approach proposed in [3] is the following. A sequence of analogous problems with data $f_l \in C_0^\infty(\Omega)$ approximating the function $f$ in $L^1(\Omega)$ is considered, and the corresponding sequence of generalized solutions $u_l \in \mathring{W}^{1,p}(\Omega)$ of these problems is investigated. In so doing,
(i) some estimates, uniform in $l$, for the measures of the sets $\{|u_l| \ge k\}$ and $\{|\nabla u_l| \ge k\}$, $k > 0$, are established;
(ii) with the use of these estimates it is proved that for an increasing sequence $\{l_j\} \subset \mathbb{N}$ the sequences $\{u_{l_j}\}$ and $\{D_i u_{l_j}\}$, $i = 1,\dots,n$, converge almost everywhere in $\Omega$ to some functions $u : \Omega \to \mathbb{R}$ and $v_i : \Omega \to \mathbb{R}$, $i = 1,\dots,n$, respectively;
(iii) using the convergences established, the limit passage in the integral identity corresponding to the approximate problems is made.
As a result of these steps a family of integral inequalities is obtained. This defines the function $u$ as an entropy solution of problem (7.53), (7.54). We remark that the functions $v_i$ are uniquely determined by the function $u$. They are peculiar derivatives of $u$.

Besides [3], the questions of the existence and uniqueness of entropy solutions to the Dirichlet problem for nonlinear elliptic second-order equations with $L^1$-data or measures as data were studied, for instance, in [12, 41].

As far as renormalized solutions are concerned, briefly speaking, these are elements of the same functional set which contains entropy solutions but, unlike the latter, renormalized solutions satisfy another family of integral relations. In some cases the notions of entropy and renormalized solution are equivalent. In particular, this is true for problem (7.53), (7.54). As regards the investigation of the existence and uniqueness of renormalized solutions see, for instance, [4, 15, 37, 40].

Now we give formulations of definitions and results concerning problem (7.53), (7.54).
Definition 7.18. A weak solution of problem (7.53), (7.54) is a function $u \in \mathring{W}^{1,1}(\Omega)$ such that:
(i) for every $i \in \{1,\dots,n\}$, $a_i(x,\nabla u) \in L^1(\Omega)$;
(ii) for every function $\varphi \in C_0^\infty(\Omega)$,
$$\int_\Omega \Big\{ \sum_{i=1}^n a_i(x,\nabla u)\,D_i\varphi \Big\}\,dx = \int_\Omega f\varphi\,dx.$$

In regard to this definition see, for instance, [8, 9]. If $p > 2 - 1/n$, then according to [9, Theorem 1] there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,\lambda}(\Omega)$ for every $\lambda \in [1, n(p-1)/(n-1))$.

Next, for every $k > 0$ let $T_k : \mathbb{R} \to \mathbb{R}$ be the function such that
$$T_k(s) = \begin{cases} s & \text{if } |s| \le k, \\ k\,\operatorname{sign} s & \text{if } |s| > k. \end{cases}$$
It is known that if $\lambda \ge 1$, $u \in \mathring{W}^{1,\lambda}(\Omega)$ and $k > 0$, then $T_k(u) \in \mathring{W}^{1,\lambda}(\Omega)$ and for every $i \in \{1,\dots,n\}$ we have $D_i T_k(u) = D_i u \cdot 1_{\{|u| < k\}}$ a. e. in $\Omega$. […] $\mathring{T}^{1,p}(\Omega)$ denotes the set of all functions $u : \Omega \to \mathbb{R}$ such that for every $k > 0$, $T_k(u) \in \mathring{W}^{1,p}(\Omega)$. Obviously,
$$\mathring{W}^{1,p}(\Omega) \subset \mathring{T}^{1,p}(\Omega). \tag{7.56}$$
At the same time, the set $\mathring{T}^{1,p}(\Omega)$ contains functions which do not belong to $L^1(\Omega)$. In this regard see, for instance, [24].

We note that elements of the set $\mathring{T}^{1,p}(\Omega)$ are measurable functions. In fact, if $u \in \mathring{T}^{1,p}(\Omega)$, then the measurability of the function $u$ follows from the measurability of the functions $T_k(u)$, $k \in \mathbb{N}$, and the pointwise convergence of the sequence $\{T_k(u)\}$ to $u$.

The set $\mathring{T}^{1,p}(\Omega)$ was introduced in [3]. For every $u : \Omega \to \mathbb{R}$ and for every $x \in \Omega$ we set $k(u,x) = \min\{ l \in \mathbb{N} : |u(x)| \le l \}$.

Definition 7.19. Let $u \in \mathring{T}^{1,p}(\Omega)$ and $i \in \{1,\dots,n\}$. Then $\delta_i u : \Omega \to \mathbb{R}$ is the function such that for every $x \in \Omega$,
$$\delta_i u(x) = D_i T_{k(u,x)}(u)(x). \tag{7.57}$$
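The truncation $T_k$ and the index $k(u,x)$ used in Definition 7.19 act on values of $u$ pointwise, so they are easy to illustrate on numbers. The sketch below is illustrative only: the function names are ours, and we assume $\mathbb{N} = \{1, 2, \dots\}$, so that $k(u,x) = 1$ wherever $u(x) = 0$.

```python
import math


def truncate(s, k):
    """T_k(s): equal to s if |s| <= k, and to k * sign(s) otherwise (k > 0)."""
    return max(-k, min(k, s))


def level_index(value):
    """k(u, x) = min{ l in N : |u(x)| <= l } for the value u(x), with N = {1, 2, ...}."""
    return max(1, math.ceil(abs(value)))


# At its own level index, truncation leaves the value unchanged; this is why
# T_{k(u,x)}(u) coincides with u at the point x in Definition 7.19.
for value in (0.0, 0.3, -2.0, 2.5, -7.1):
    k = level_index(value)
    print(value, k, truncate(value, k))
```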
7.2 On some Classes of Nonlinear Equations with L1 -Data
165
Proposition 7.20. Let $u \in \mathring{T}^{1,p}(\Omega)$ and $i \in \{1,\dots,n\}$. Then for every $k > 0$ we have
$$D_i T_k(u) = \delta_i u \cdot 1_{\{|u|<k\}} \quad \text{a.e. in } \Omega. \tag{7.58}$$
[…] for every $k > 0$ and for every $i \in \{1,\dots,n\}$ we have $D_i T_k(u - v) = \delta_i u - \delta_i v$ a.e. in $\{|u - v| < k\}$. For the proof of this result see, for instance, [24].
Next, we note that if $u \in \mathring{T}^{1,p}(\Omega)$, $\varphi \in \mathring{W}^{1,p}(\Omega) \cap L^\infty(\Omega)$, $k > 0$ and $i \in \{1,\dots,n\}$, then the function $a_i(x,\delta u)(\delta_i u - \delta_i\varphi)$ is summable in the set $\{|u - \varphi| < k\}$. This is a consequence of inequality (7.50) and Proposition 7.20.
Definition 7.24. An entropy solution of problem (7.53), (7.54) is a function $u \in \mathring{T}^{1,p}(\Omega)$ such that for every $\varphi \in C_0^\infty(\Omega)$ and for every $k > 0$ the following inequality holds:
$$\int_{\{|u-\varphi|<k\}} \Big\{\sum_{i=1}^n a_i(x,\delta u)(\delta_i u - \delta_i\varphi)\Big\}\,dx \le \int_\Omega f\,T_k(u - \varphi)\,dx.$$
[…] for every $k > 0$, $T_k(u_l) \to T_k(u)$ strongly in $\mathring{W}^{1,p}(\Omega)$. In turn, this leads to the conclusion that for every $\varphi \in \mathring{W}^{1,p}(\Omega) \cap L^\infty(\Omega)$ and for every $k > 0$ the following equality holds:
$$\int_{\{|u-\varphi|<k\}} \Big\{\sum_{i=1}^n a_i(x,\delta u)(\delta_i u - \delta_i\varphi)\Big\}\,dx = \int_\Omega f\,T_k(u - \varphi)\,dx. \tag{7.59}$$
[…] Let $p > 2 - 1/n$, and let $u$ be the entropy solution of problem (7.53), (7.54). Then $u$ is a weak solution of problem (7.53), (7.54), and for every $\lambda \in [1, r)$ we have $u \in \mathring{W}^{1,\lambda}(\Omega)$.
Finally, the next result is a consequence of Theorems 7.25 and 7.28.
Theorem 7.29. Let $p > 2 - 1/n$. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,\lambda}(\Omega)$ for every $\lambda \in [1, r)$.
We have already mentioned this result with reference to [9].
Remark 7.30. If $p \le 2 - 1/n$, then problem (7.53), (7.54) may, in general, have no weak solutions. In this regard see an example in [3]. The construction of such examples relies on the principle of uniform boundedness [16, Chapter 2].
Remark 7.31. Weak solutions of problem (7.53), (7.54) may, in general, fail to belong to the space $\mathring{W}^{1,r}(\Omega)$. This fact was noted in [9], although the corresponding examples were not given there. Some examples showing that entropy and weak solutions of problem (7.53), (7.54) may not belong to $\mathring{W}^{1,r}(\Omega)$ were given in [21, 27].
In connection with Remark 7.31 the question arises of additional conditions on the function $f$ ensuring that solutions of problem (7.53), (7.54) belong to the limit space $\mathring{W}^{1,r}(\Omega)$. Thus, in [9] the existence of a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,r}(\Omega)$ was proved under the conditions $p > 2 - 1/n$ and $f \ln(1 + |f|) \in L^1(\Omega)$. In [20, 21, 27, 28] the same result was established under weaker assumptions on $p$ and $f$ and with the use of other techniques. Let us state the corresponding assertions.
First of all we consider the results of [28] on general conditions of limit summability of solutions of problem (7.53), (7.54). Let $\tilde f : [0,+\infty) \to \mathbb{R}$ be the function such that for every $s \in [0,+\infty)$,
$$\tilde f(s) = \int_{\{|f| \ge s\}} |f|\,dx.$$
The function $\tilde f$ is nonnegative, nonincreasing and measurable.
Theorem 7.32. Let
$$\int_1^{+\infty} \frac{1}{s}\,[\tilde f(s)]^{n/(n-p)}\,ds < +\infty, \tag{7.60}$$
and let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in L^q(\Omega)$.
From Theorems 7.25, 7.28 and 7.32 it follows that if $p > 2 - 1/n$ and inequality (7.60) holds, then there exists a weak solution of problem (7.53), (7.54) belonging to $L^q(\Omega)$.
Theorem 7.33. Let
$$\int_1^{+\infty} \frac{1}{s}\,[\tilde f(s)]^{n/(n-1)}\,ds < +\infty, \tag{7.61}$$
and let $u$ be the entropy solution of problem (7.53), (7.54). Then $|\delta u| \in L^r(\Omega)$.
The following result is a consequence of Theorem 7.33 and Proposition 7.22.
Theorem 7.34. Let $p \ge 2 - 1/n$, and let inequality (7.61) hold. Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in \mathring{W}^{1,r}(\Omega)$.
The next result follows from Theorems 7.25, 7.26, 7.33 and 7.34.
Theorem 7.35. Let $p \ge 2 - 1/n$, and let inequality (7.61) hold. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,r}(\Omega)$.
Now we state several results which follow from Theorems 7.32, 7.34 and 7.35.
Theorem 7.36. Let $\sigma > (n-p)/n$ and $f[\ln(1+|f|)]^\sigma \in L^1(\Omega)$. Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in L^q(\Omega)$.
Theorem 7.37. Let $p \ge 2 - 1/n$, $\sigma > (n-1)/n$ and $f[\ln(1+|f|)]^\sigma \in L^1(\Omega)$. Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in \mathring{W}^{1,r}(\Omega)$.
Theorem 7.38. Let $p \ge 2 - 1/n$, $\sigma > (n-1)/n$ and $f[\ln(1+|f|)]^\sigma \in L^1(\Omega)$. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,r}(\Omega)$.
We note that in essence Theorems 7.37 and 7.38 were proved in [20]. Theorem 7.36 was established in [35, Section 1.3].
In order to consider more general consequences of Theorems 7.32, 7.34 and 7.35, we introduce some numbers and functions. First, we define the sequence of numbers $s_j$ as follows:
$$s_1 = 1; \qquad s_j = e^{s_{j-1}}, \quad j = 2, 3, \dots$$
Now for every $j \in \mathbb{N}$ let $b_j : [s_j,+\infty) \to [0,+\infty)$ be the function such that
$$b_j(s) = \underbrace{\ln\ln\cdots\ln}_{j}\, s, \qquad s \in [s_j,+\infty).$$
Observe that if $j \in \mathbb{N}$ and $s > s_j$, then $b_j(s) > 0$.
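To get a feeling for how fast the thresholds grow and how slowly the iterated logarithms increase, one may evaluate them numerically. The following is a minimal Python sketch that is not part of the cited works; the function names `s` and `b` simply mirror the notation of the text. We assume the threshold sequence starts at 1 and is obtained by repeated exponentiation, and that the j-th function is the j-fold composition of the natural logarithm.

```python
import math

def s(j: int) -> float:
    """Threshold sequence: s_1 = 1, s_j = exp(s_{j-1}) for j >= 2."""
    value = 1.0
    for _ in range(j - 1):
        value = math.exp(value)  # overflows double precision for j >= 5
    return value

def b(j: int, x: float) -> float:
    """j-fold iterated natural logarithm, defined only for x >= s_j."""
    if x < s(j):
        raise ValueError("b_j(x) is defined only for x >= s_j")
    for _ in range(j):
        x = math.log(x)
    return x

for j in (1, 2, 3, 4):
    print(j, s(j), b(j, 1.0e7))
```

In double precision only the first four thresholds are representable, which already indicates how weak the logarithmic conditions of the subsequent theorems are compared with power-type conditions on the data.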
Theorem 7.39. Let $m \in \mathbb{N}$, $\sigma > (n-p)/n$ and
$$f\,\Big[\prod_{j=1}^m b_j(s_j + |f|)\Big]^{(n-p)/n} [b_{m+1}(s_{m+1} + |f|)]^\sigma \in L^1(\Omega).$$
Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in L^q(\Omega)$.
Theorem 7.40. Let $p \ge 2 - 1/n$, $m \in \mathbb{N}$, $\sigma > (n-1)/n$ and
$$f\,\Big[\prod_{j=1}^m b_j(s_j + |f|)\Big]^{(n-1)/n} [b_{m+1}(s_{m+1} + |f|)]^\sigma \in L^1(\Omega). \tag{7.62}$$
Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in \mathring{W}^{1,r}(\Omega)$.
Theorem 7.41. Let $p \ge 2 - 1/n$, $m \in \mathbb{N}$, $\sigma > (n-1)/n$, and let inclusion (7.62) hold. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,r}(\Omega)$.
We note that Theorem 7.41 was proved in [21], and Theorems 7.39 and 7.40 were established in [35, Section 1.3].
Now we give consequences of Theorems 7.32, 7.34 and 7.35 connected with more general conditions on $f$ as compared with the requirements of Theorems 7.39 and 7.40.
Theorem 7.42. Let $c > 0$, $m \in \mathbb{N}$, $\sigma > (n-p)/n$, and let for every $k > s_{m+1}$,
$$\int_{\{|f|>k\}} |f|\,dx \le c\,\Big[\prod_{j=1}^m b_j(k)\Big]^{-(n-p)/n} [b_{m+1}(k)]^{-\sigma}.$$
Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in L^q(\Omega)$.
Theorem 7.43. Let $p \ge 2 - 1/n$. Let $c > 0$, $m \in \mathbb{N}$, $\sigma > (n-1)/n$, and let for every $k > s_{m+1}$,
$$\int_{\{|f|>k\}} |f|\,dx \le c\,\Big[\prod_{j=1}^m b_j(k)\Big]^{-(n-1)/n} [b_{m+1}(k)]^{-\sigma}. \tag{7.63}$$
Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in \mathring{W}^{1,r}(\Omega)$.
Theorem 7.44. Let $p \ge 2 - 1/n$. Let $c > 0$, $m \in \mathbb{N}$, $\sigma > (n-1)/n$, and let for every $k > s_{m+1}$ inequality (7.63) hold. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,r}(\Omega)$.
Theorems 7.43 and 7.44 were established in [27], and Theorem 7.42 was proved in [35, Section 1.4].
Remark 7.45. In [27] we gave an example which shows that the assumption $\sigma = (n-1)/n$ in inequality (7.63) leads to cases where the entropy solution of problem (7.53), (7.54) does not belong to $\mathring{W}^{1,r}(\Omega)$, and there exists a weak solution of problem (7.53), (7.54) not belonging to $\mathring{W}^{1,r}(\Omega)$.
Further, we state some results concerning solutions of problem (7.53), (7.54) in the case where $f \in L^m(\Omega)$ with $m > 1$. For every $\lambda \in [1, n)$ we set $\lambda^* = n\lambda/(n - \lambda)$. We recall (see, for instance, [17, Chapter 7]) that if $\lambda \in [1, n)$, then $\mathring{W}^{1,\lambda}(\Omega) \subset L^{\lambda^*}(\Omega)$ and there exists a positive constant $c_{n,\lambda}$ depending only on $n$ and $\lambda$ such that for every function $u \in \mathring{W}^{1,\lambda}(\Omega)$,
$$\Big(\int_\Omega |u|^{\lambda^*}\,dx\Big)^{1/\lambda^*} \le c_{n,\lambda} \Big(\int_\Omega |\nabla u|^{\lambda}\,dx\Big)^{1/\lambda}. \tag{7.64}$$
Theorem 7.46. Let $1 < m < p^*/(p^*-1)$ and $f \in L^m(\Omega)$. Let $u$ be the entropy solution of problem (7.53), (7.54). Then the following assertions hold:
(i) for every $\lambda \in (0, nm(p-1)/(n-mp))$ we have $u \in L^\lambda(\Omega)$;
(ii) for every $\lambda \in (0, m^*(p-1))$ we have $|\delta u| \in L^\lambda(\Omega)$.
Theorem 7.47. Let $1 < m < p^*/(p^*-1)$, $m^*(p-1) \ge 1$ and $f \in L^m(\Omega)$. Let $u$ be the entropy solution of problem (7.53), (7.54). Then $u \in \mathring{W}^{1,m^*(p-1)}(\Omega)$.
Theorem 7.48. Let $1 < m < p^*/(p^*-1)$, $m^*(p-1) \ge 1$ and $f \in L^m(\Omega)$. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,m^*(p-1)}(\Omega)$.
It is easy to see that if $p \ge 2 - 1/n$ and $m \in [1, n)$, then $m^*(p-1) \ge 1$. This fact and Theorem 7.48 imply the following result.
Theorem 7.49. Let $p \ge 2 - 1/n$, $1 < m < p^*/(p^*-1)$ and $f \in L^m(\Omega)$. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,m^*(p-1)}(\Omega)$.
The same result was obtained in [9, Theorem 3] with the only difference that the inequality $p > 2 - 1/n$ was assumed in [9] instead of the inequality $p \ge 2 - 1/n$. Besides Theorem 7.49, which slightly improves the mentioned result of [9], an assertion on the existence of a weak solution of problem (7.53), (7.54) in the case $p < 2 - 1/n$ follows from Theorem 7.48. Before formulating this assertion, we observe that $n/(np - n + 1) < p^*/(p^*-1)$ and that if $p < 2 - 1/n$, then $1 < n/(np - n + 1)$.
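The two observations on the exponents can be checked with exact rational arithmetic for concrete values. The following is a minimal Python sketch, not part of the cited works. The function names are hypothetical, and the sample values n = 3, p = 3/2 are an assumption chosen so that p lies below the threshold 2 - 1/n. We use the Sobolev conjugate n*lam/(n - lam) of an exponent lam, as recalled before inequality (7.64).

```python
from fractions import Fraction

def sobolev_conjugate(n, lam):
    """Sobolev conjugate lam* = n*lam/(n - lam) for 1 <= lam < n."""
    return n * lam / (n - lam)

def m_range(n, p):
    """Return the pair (n/(np - n + 1), p*/(p* - 1)) bounding m."""
    p_star = sobolev_conjugate(n, p)
    return Fraction(n) / (n * p - n + 1), p_star / (p_star - 1)

def gradient_exponent(n, p, m):
    """Exponent m*(p - 1) of the Sobolev space in the existence results."""
    return sobolev_conjugate(n, m) * (p - 1)

n, p = 3, Fraction(3, 2)  # hypothetical sample values with p < 2 - 1/n
lower, upper = m_range(n, p)
print(lower, upper, gradient_exponent(n, p, lower))
```

For these sample values the lower bound for m is exactly the value at which the gradient exponent equals 1, which is consistent with the requirement that the gradient exponent be at least 1 in Theorem 7.48.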
Theorem 7.50. Let $p < 2 - 1/n$, $n/(np - n + 1) \le m < p^*/(p^*-1)$ and $f \in L^m(\Omega)$. Then there exists a weak solution of problem (7.53), (7.54) belonging to $\mathring{W}^{1,m^*(p-1)}(\Omega)$.
We note that Theorems 7.46 and 7.47 and in essence Theorem 7.50 were established in [20]. Inequality (7.64) is used in an essential way in the proof of Theorems 7.46 and 7.47. An analogue of Theorem 7.47 for entropy solutions of the Dirichlet problem for nonlinear second-order equations with degenerate coercivity was obtained in [1]. Theorem 7.48 is a consequence of Theorems 7.25, 7.26 and 7.47.
Theorem 7.51. Let $f \in L^{p^*/(p^*-1)}(\Omega)$, and let $u$ be the entropy solution of problem (7.53), (7.54). Then the following assertions hold:
(i) $u \in \mathring{W}^{1,p}(\Omega)$;
(ii) $u$ is a weak solution of problem (7.53), (7.54);
(iii) $u$ is the generalized solution of problem (7.53), (7.54).
Details concerning the proof of this theorem are found in [35, Section 1.5].
Finally, we remark that in the case where $f \in L^m(\Omega)$ with $m > p^*/(p^*-1)$ the improvement of summability of the generalized solution of problem (7.53), (7.54) is established by Stampacchia's method. In this regard see, for instance, [14, 18, 43].
Concluding this subsection, we consider the notion of renormalized solution of problem (7.53), (7.54).
Definition 7.52. A renormalized solution of problem (7.53), (7.54) is a function $u \in \mathring{T}^{1,p}(\Omega)$ such that:
(i) $\displaystyle \lim_{m\to\infty} \frac{1}{m} \int_{\{m \le |u| \le 2m\}} |\delta u|^p\,dx = 0$;
(ii) for every $h \in C_0^1(\mathbb{R})$ and for every $\varphi \in \mathring{W}^{1,p}(\Omega) \cap L^\infty(\Omega)$,
$$\int_\Omega \Big\{\sum_{i=1}^n a_i(x,\delta u)\,D_i\varphi\Big\} h(u)\,dx + \int_\Omega \Big\{\sum_{i=1}^n a_i(x,\delta u)\,\delta_i u\Big\} h'(u)\varphi\,dx = \int_\Omega f\,h(u)\varphi\,dx.$$
In regard to this definition see, for instance, [37].
Theorem 7.53. A function $u : \Omega \to \mathbb{R}$ is a renormalized solution of problem (7.53), (7.54) if and only if the function $u$ is the entropy solution of problem (7.53), (7.54).
We note that from the proof of this theorem given in [35, Section 1.1] it follows that if $u$ is the entropy solution of problem (7.53), (7.54), then for every $\varphi \in \mathring{W}^{1,p}(\Omega) \cap L^\infty(\Omega)$ and for every $k > 0$ equality (7.59) holds.
In the next two subsections, we state results of [23, 24] on the existence and a priori properties of entropy solutions of the Dirichlet problem for nonlinear elliptic second-order equations with degenerate coercivity and $L^1$-data. Unlike (7.53), the coefficients of such equations additionally depend on the argument $s \in \mathbb{R}$ corresponding to the unknown function, and instead of condition (7.51) an analogous condition is assumed in which the term $\bar c|\xi|^p/(1+|s|)^{p_1}$, $\bar c, p_1 > 0$, replaces the term $c_2|\xi|^p$. Among previous works devoted to the study of the existence of solutions of elliptic equations with degenerate coercivity we mention, for instance, the articles [1, 7]. In particular, in [1] a theorem on the existence of an entropy solution of the Dirichlet problem for nonlinear elliptic equations with degenerate coercivity and $L^1$-data was proved. However, as regards growth conditions on the coefficients of the equations under consideration with respect to the unknown function, in [23] we obtained a more general result as compared with the noted theorem of [1].
7.2.1.2 The Dirichlet Problem for Equations with Degenerate Coercivity and Zero Lower-order Term
Let for every $i \in \{1,\dots,n\}$, $a_i : \Omega \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ be a Carathéodory function. We shall assume that the following conditions are satisfied:
(1) for every $k > 0$ there exist $\bar c_k > 0$ and $\bar g_k \in L^1(\Omega)$, $\bar g_k \ge 0$ in $\Omega$, such that for almost every $x \in \Omega$, for every $s \in \mathbb{R}$, $|s| \le k$, and for every $\xi \in \mathbb{R}^n$,
$$\sum_{i=1}^n |a_i(x,s,\xi)|^{p/(p-1)} \le \bar c_k|\xi|^p + \bar g_k(x);$$
(2) there exist $p_1 \in [0, p-1)$ and $c_1 > 0$ such that for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi \in \mathbb{R}^n$,
$$\sum_{i=1}^n a_i(x,s,\xi)\xi_i \ge \frac{c_1|\xi|^p}{(1+|s|)^{p_1}};$$
(3) for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi, \xi' \in \mathbb{R}^n$, $\xi \ne \xi'$,
$$\sum_{i=1}^n [a_i(x,s,\xi) - a_i(x,s,\xi')](\xi_i - \xi_i') > 0.$$
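Let $f \in L^1(\Omega)$. We consider the following Dirichlet problem:
$$-\sum_{i=1}^n \frac{\partial}{\partial x_i}\,a_i(x,u,\nabla u) = f \quad \text{in } \Omega, \tag{7.65}$$
$$u = 0 \quad \text{on } \partial\Omega. \tag{7.66}$$
Observe that if $u \in \mathring{T}^{1,p}(\Omega)$, $v \in \mathring{W}^{1,p}(\Omega) \cap L^\infty(\Omega)$, $k > 0$ and $i \in \{1,\dots,n\}$, then the function $a_i(x,u,\delta u)(\delta_i u - \delta_i v)$ is summable in the set $\{|u - v| < k\}$. This is a consequence of condition (1) and Proposition 7.20.
Definition 7.54. An entropy solution of problem (7.65), (7.66) is a function $u \in \mathring{T}^{1,p}(\Omega)$ such that for every $v \in C_0^\infty(\Omega)$ and for every $k > 0$ the following inequality holds:
$$\int_{\{|u-v|<k\}} \Big\{\sum_{i=1}^n a_i(x,u,\delta u)(\delta_i u - \delta_i v)\Big\}\,dx \le \int_\Omega f\,T_k(u - v)\,dx.$$
[…] and $g_1 \in L^1(\Omega)$, $g_1 \ge 0$ in $\Omega$, such that for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi \in \mathbb{R}^n$,
$$\sum_{i=1}^n a_i(x,s,\xi)\xi_i \ge \frac{c_1|\xi|^p}{(1+|s|)^{p_1}} - c_2\,p_2\,(1+|s|)^{p_2} - g_1(x).$$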
Let $a_0 : \Omega \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ be a Carathéodory function, and let $f \in L^1(\Omega)$. We consider the following Dirichlet problem:
$$-\sum_{i=1}^n \frac{\partial}{\partial x_i}\,a_i(x,u,\nabla u) + a_0(x,u,\nabla u) = f \quad \text{in } \Omega, \tag{7.67}$$
$$u = 0 \quad \text{on } \partial\Omega. \tag{7.68}$$
First of all we give definitions of several kinds of solution of this problem and describe relations between them.
Definition 7.56. A weak solution of problem (7.67), (7.68) is a function $u \in \mathring{W}^{1,1}(\Omega)$ such that:
(i) for every $i \in \{1,\dots,n\}$, $a_i(x,u,\nabla u) \in L^1(\Omega)$;
(ii) $a_0(x,u,\nabla u) \in L^1(\Omega)$;
(iii) for every function $v \in C_0^\infty(\Omega)$,
$$\int_\Omega \Big\{\sum_{i=1}^n a_i(x,u,\nabla u)\,D_i v + a_0(x,u,\nabla u)\,v\Big\}\,dx = \int_\Omega f v\,dx.$$
Definition 7.57. A $T$-solution of problem (7.67), (7.68) is a function $u \in \mathring{T}^{1,p}(\Omega)$ such that:
(i) for every $i \in \{1,\dots,n\}$, $a_i(x,u,\delta u) \in L^1(\Omega)$;
(ii) $a_0(x,u,\delta u) \in L^1(\Omega)$;
(iii) for every function $v \in C_0^\infty(\Omega)$,
$$\int_\Omega \Big\{\sum_{i=1}^n a_i(x,u,\delta u)\,D_i v + a_0(x,u,\delta u)\,v\Big\}\,dx = \int_\Omega f v\,dx.$$
Proposition 7.58. Let $u$ be a $T$-solution of problem (7.67), (7.68), and let $|\delta u| \in L^1(\Omega)$. Then $u$ is a weak solution of problem (7.67), (7.68).
Definition 7.59. An entropy solution of problem (7.67), (7.68) is a function $u \in \mathring{T}^{1,p}(\Omega)$ such that:
(i) $a_0(x,u,\delta u) \in L^1(\Omega)$;
(ii) for every $v \in C_0^\infty(\Omega)$ and for every $k > 0$, $\int_{\{|u-v|<k\}}$ […]
[…] and $\bar g \in L^1(\Omega)$, $\bar g \ge 0$ in $\Omega$. Let for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi \in \mathbb{R}^n$ the following inequality hold:
$$\sum_{i=1}^n |a_i(x,s,\xi)|^{p/(p-1)} \le \bar c\,(|s|^{\bar p} + |\xi|^p) + \bar g(x). \tag{7.69}$$
Let $u$ be an entropy solution of problem (7.67), (7.68). Then $u$ is a $T$-solution of problem (7.67), (7.68).
In turn, Propositions 7.58, 7.60 and 7.63 imply the next result.
Proposition 7.64. Let $p > 2 - 1/n$, $p_1 < \min\{(np - 2n + 1)/(n-1),\ (p-1)/(n-p+1)\}$, $p_2 < n(p-1-p_1)/(n-p)$, $0 < \bar p < p^*(p-1-p_1)/(p-1)$, $\bar c > 0$ and $\bar g \in L^1(\Omega)$, $\bar g \ge 0$ in $\Omega$. Let for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi \in \mathbb{R}^n$ inequality (7.69) hold. Let $u$ be an entropy solution of problem (7.67), (7.68). Then $u$ is a weak solution of problem (7.67), (7.68).
Now we concentrate our attention on results concerning summability properties of entropy solutions of problem (7.67), (7.68). Some of these properties have already been described in Proposition 7.60. The subsequent statements give a more precise description of the summability properties of entropy solutions of the given problem under an additional condition on $p_2$.
Proposition 7.65. Let $p_2 = 0$ and $g_1 = 0$ in $\Omega$. Let $u$ be an entropy solution of problem (7.67), (7.68). Then the following assertions hold:
(i) if $h \in C(\mathbb{R})$, $h \ge 0$ in $\mathbb{R}$ and $\int_{-\infty}^{+\infty} (1+|t|)^{p_1} h(t)\,dt < +\infty$, then the function $|\delta u|^p h(u)$ is summable in $\Omega$;
(ii) if $h \in C^1(\mathbb{R})$, $h(0) = 0$ and $\int_{-\infty}^{+\infty} (1+|t|)^{p_1} |h'(t)|^p\,dt < +\infty$, then $h(u) \in L^{p^*}(\Omega)$.
Proposition 7.66. Let $p_2 < n(p-1-p_1)/(n-p)$. Let $u$ be an entropy solution of problem (7.67), (7.68). Then the following assertions hold:
(i) if $h \in C(\mathbb{R})$, $h \ge 0$ in $\mathbb{R}$, $\int_{-\infty}^{+\infty} (1+|t|)^{p_1} h(t)\,dt < +\infty$ and $\sup_{t\in\mathbb{R}} (1+|t|)^{p_1} h(t) < +\infty$, then the function $|\delta u|^p h(u)$ is summable in $\Omega$;
(ii) if $h \in C^1(\mathbb{R})$, $h(0) = 0$, $\int_{-\infty}^{+\infty} (1+|t|)^{p_1} |h'(t)|^p\,dt < +\infty$ and $\sup_{t\in\mathbb{R}} (1+|t|)^{p_1} |h'(t)|^p < +\infty$, then $h(u) \in L^{p^*}(\Omega)$.
The next two results are consequences of Proposition 7.66.
Proposition 7.67. Let $p_2 < n(p-1-p_1)/(n-p)$. Let $u$ be an entropy solution of problem (7.67), (7.68). Let $\beta > 1$. Then the function
$$\frac{|\delta u|^p}{(1+|u|)^{p_1+1}\,[\ln(2+|u|)]\,[\ln\ln(3+|u|)]^\beta}$$
is summable in $\Omega$.
Proposition 7.68. Let $p_2 < n(p-1-p_1)/(n-p)$. Let $u$ be an entropy solution of problem (7.67), (7.68). Let $h \in C(\mathbb{R})$, and let the following conditions be satisfied: the function $h$ is even, $h \ge 0$ in $\mathbb{R}$, the function $h$ is nonincreasing in $[0,+\infty)$ and
$$\int_1^{+\infty} \frac{1}{t}\,[h(t)]^{(n-p)/n}\,dt < +\infty.$$
Then the functions $|u|^{n(p-1-p_1)/(n-p)}\,h(u)$ and $|\delta u|^{n(p-1-p_1)/(n-1-p_1)}\,[h(u)]^{(n-p)/(n-1-p_1)}$ are summable in $\Omega$.
This proposition implies the next result.
Proposition 7.69. Let $p_2 < n(p-1-p_1)/(n-p)$, let $u$ be an entropy solution of problem (7.67), (7.68), and let $\beta > 1/(p-1-p_1)$. Then
$$\frac{|u|}{[\ln(2+|u|)]^{1/(p-1-p_1)}\,[\ln\ln(3+|u|)]^\beta} \in L^{n(p-1-p_1)/(n-p)}(\Omega),$$
$$\frac{|\delta u|}{[\ln(2+|u|)]^{1/(p-1-p_1)}\,[\ln\ln(3+|u|)]^\beta} \in L^{n(p-1-p_1)/(n-1-p_1)}(\Omega).$$
Obviously, in the case $p_2 < n(p-1-p_1)/(n-p)$ Proposition 7.69 gives more precise information on summability properties of entropy solutions of problem (7.67), (7.68) as compared with that described in Proposition 7.60.
Now let us state an existence result.
Theorem 7.70. Let $p_2 = 0$ and $g_1 = 0$ in $\Omega$. Let $c \ge 0$, $0 < \sigma < p-1-p_1$, $g \in L^1(\Omega)$, $g \ge 0$ in $\Omega$, $\varphi \in C(\mathbb{R})$, $\varphi \ge 0$ in $\mathbb{R}$ and $\int_{-\infty}^{+\infty} (1+|t|)^{p_1} \varphi(t)\,dt < +\infty$. Let for almost every $x \in \Omega$, for every $s \in \mathbb{R}$ and for every $\xi \in \mathbb{R}^n$,
$$|a_0(x,s,\xi)| \le c\,(|s|^\sigma + |\xi|^\sigma) + |\xi|^p \varphi(s) + g(x). \tag{7.70}$$
Then there exists an entropy solution of problem (7.67), (7.68).
This theorem was announced in [35, Section 1.8]. Its detailed proof will be given in a forthcoming publication of the author. A similar result was stated in [24] with the only difference that the inequality $p_1 < (p-1)/(n-p+1)$ was additionally assumed there. A result analogous to Theorem 7.70 holds in the case $p_2 \ne 0$ as well. In that case the additional condition $\sup_{t\in\mathbb{R}} (1+|t|)^{p_1} \varphi(t) < +\infty$ is required.
Furthermore, in regard to the results of this subsection we remark the following. Propositions 7.61, 7.63 and 7.64 generalize results obtained in [3] for the case $p_1 = 0$, $p_2 = 0$ and $g_1 \equiv 0$. Even in this case Proposition 7.69 gives stronger results as compared with those obtained in [10] for weak solutions.
In the case where the leading coefficients of equation (7.67) do not depend on $u$, have growth of order $p-1$ with respect to $\nabla u$ and satisfy the usual coercivity condition ($p_1 = 0$, $p_2 = 0$ and $g_1 \equiv 0$), and the lower-order coefficient $a_0$ does not depend on $\nabla u$, has an arbitrary growth with respect to $u$ and is nondecreasing with respect to $u$, the existence of an entropy solution of problem (7.67), (7.68) was proved in [3]. In the case where the leading coefficients of equation (7.67) may have a growth with respect to $u$ of order not greater than $p-1$, have growth of order $p-1$ with respect to $\nabla u$ and satisfy the usual coercivity condition, the lower-order coefficient $a_0$ satisfies inequality (7.70) with $c = 0$ and $\varphi \in L^1(\mathbb{R})$, and the right-hand side of the equation is a bounded Radon measure in $\Omega$, the existence of a $T$-solution of the corresponding Dirichlet problem was established in [38]. Finally, we note that in the case where the leading coefficients of equation (7.67) satisfy the same conditions as in [38], the lower-order coefficient
has the behavior with respect to $\nabla u$ analogous to that assumed in [38], and the right-hand side of the equation belongs to $L^1(\Omega)$, the existence of an entropy solution of the corresponding Dirichlet problem was proved in [41].
7.2.2 Nonlinear Fourth-order Equations with Strengthened Coercivity and $L^1$-Data
In the previous section, we pointed out that an effective approach to the study of the solvability of nonlinear elliptic second-order equations with $L^1$-right-hand sides was proposed in [3]. As far as high-order equations with $L^1$-data are concerned, it turned out that an approach analogous to that of [3] can be realized under a condition of strengthened coercivity on the coefficients of the equations. This was first shown in [19, 25] for nonlinear fourth-order equations. Later on, the ideas of these works were developed in [26, 29–31, 33], where the solvability of nonlinear elliptic equations of fourth order and arbitrary even order with strengthened coercivity and $L^1$-data was investigated, including the consideration of degenerate and anisotropic equations.
In this section, we state results of [19, 25, 29] on the existence and properties of solutions of the Dirichlet problem for nonlinear elliptic fourth-order equations with strengthened coercivity and $L^1$-right-hand sides.
Let $n \in \mathbb{N}$, $n > 2$, and let $\Omega$ be a bounded open set of $\mathbb{R}^n$. We denote by $\Lambda$ the set of all $n$-dimensional multi-indices $\alpha$ such that $|\alpha| = 1$ or $|\alpha| = 2$. Moreover, we shall use the following notation: $\mathbb{R}^{n,2}$ is the space of all mappings $\xi : \Lambda \to \mathbb{R}$; if $u \in W^{2,1}(\Omega)$, then $\nabla_2 u : \Omega \to \mathbb{R}^{n,2}$ is the mapping such that for every $x \in \Omega$ and for every $\alpha \in \Lambda$, $(\nabla_2 u(x))_\alpha = D^\alpha u(x)$.
Let $p \in (1, n/2)$ and $q \in (2p, n)$. Let $c_1, c_2 > 0$, $g_1, g_2 \in L^1(\Omega)$, $g_1, g_2 \ge 0$ in $\Omega$, and let for every $\alpha \in \Lambda$, $A_\alpha : \Omega \times \mathbb{R}^{n,2} \to \mathbb{R}$ be a Carathéodory function. We shall suppose that for almost every $x \in \Omega$ and for every $\xi \in \mathbb{R}^{n,2}$,
$$\sum_{|\alpha|=1} |A_\alpha(x,\xi)|^{q/(q-1)} + \sum_{|\alpha|=2} |A_\alpha(x,\xi)|^{p/(p-1)} \le c_1 \Big\{\sum_{|\alpha|=1} |\xi_\alpha|^q + \sum_{|\alpha|=2} |\xi_\alpha|^p\Big\} + g_1(x), \tag{7.71}$$
$$\sum_{\alpha\in\Lambda} A_\alpha(x,\xi)\,\xi_\alpha \ge c_2 \Big\{\sum_{|\alpha|=1} |\xi_\alpha|^q + \sum_{|\alpha|=2} |\xi_\alpha|^p\Big\} - g_2(x). \tag{7.72}$$
Moreover, we shall assume that for almost every $x \in \Omega$ and for every $\xi, \xi' \in \mathbb{R}^{n,2}$, $\xi \ne \xi'$,
$$\sum_{\alpha\in\Lambda} [A_\alpha(x,\xi) - A_\alpha(x,\xi')](\xi_\alpha - \xi_\alpha') > 0. \tag{7.73}$$
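Let $F : \Omega \times \mathbb{R} \to \mathbb{R}$ be a Carathéodory function. We consider the following Dirichlet problem:
$$\sum_{\alpha\in\Lambda} (-1)^{|\alpha|} D^\alpha A_\alpha(x,\nabla_2 u) = F(x,u) \quad \text{in } \Omega, \tag{7.74}$$
$$D^\alpha u = 0, \quad |\alpha| = 0, 1, \quad \text{on } \partial\Omega. \tag{7.75}$$
Let us state definitions and main results of [19, 25] concerning some kinds of solutions and the solvability of the given problem.
We denote by $W^{1,q}_{2,p}(\Omega)$ the set of all functions in $W^{1,q}(\Omega)$ having the weak derivatives of the second order in $L^p(\Omega)$. The set $W^{1,q}_{2,p}(\Omega)$ is a Banach space with the norm
$$\|u\| = \|u\|_{W^{1,q}(\Omega)} + \Big(\sum_{|\alpha|=2} \int_\Omega |D^\alpha u|^p\,dx\Big)^{1/p}.$$
We denote by $\mathring{W}^{1,q}_{2,p}(\Omega)$ the closure of the set $C_0^\infty(\Omega)$ in $W^{1,q}_{2,p}(\Omega)$.
Let for every $k \in \mathbb{N}$, $\psi_k : \mathbb{R} \to \mathbb{R}$ be the function such that
$$\psi_k(s) = s - s^{k+2} + \frac{k+1}{k+3}\,s^{k+3}, \qquad s \in \mathbb{R}.$$
For every $k \in \mathbb{N}$ we define the function $h_k : \mathbb{R} \to \mathbb{R}$ by
$$h_k(s) = \begin{cases} s & \text{if } |s| \le k, \\[2pt] \Big[\psi_k\Big(\dfrac{|s|-k}{k}\Big) + 1\Big]\,k\,\operatorname{sign} s & \text{if } k < |s| < 2k, \\[6pt] 2k\,\dfrac{k+2}{k+3}\,\operatorname{sign} s & \text{if } |s| \ge 2k. \end{cases}$$
For every $k \in \mathbb{N}$ we have $h_k \in C^2(\mathbb{R})$, $|h_k| \le 2k$ in $\mathbb{R}$, $0 \le h_k' \le 1$ in $\mathbb{R}$ and $|h_k''| \le 3$ in $\mathbb{R}$.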
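The stated properties of these smooth substitutes for the standard truncations can be checked numerically. The following is a minimal Python sketch that is not part of the cited works. The names `psi` and `h` are hypothetical and stand for the auxiliary polynomial and the smooth truncation. We assume the three-branch definition above: identity up to level k, a polynomial transition between k and 2k, and the constant 2k(k+2)/(k+3) beyond 2k.

```python
import math

def psi(k: int, t: float) -> float:
    """Auxiliary polynomial: t - t^(k+2) + (k+1)/(k+3) * t^(k+3)."""
    return t - t ** (k + 2) + (k + 1) / (k + 3) * t ** (k + 3)

def h(k: int, s: float) -> float:
    """Smooth truncation: identity on [-k, k], constant for |s| >= 2k."""
    a = abs(s)
    if a <= k:
        return s
    sign = math.copysign(1.0, s)
    if a < 2 * k:
        # Transition zone k < |s| < 2k, glued through psi.
        return (psi(k, (a - k) / k) + 1.0) * k * sign
    return 2 * k * (k + 2) / (k + 3) * sign

# Sample the function on a grid and inspect its increments.
k = 3
grid = [i * 0.01 for i in range(-1000, 1001)]
values = [h(k, s) for s in grid]
increments = [v2 - v1 for v1, v2 in zip(values, values[1:])]
print(max(abs(v) for v in values), min(increments), max(increments))
```

On the grid the increments stay between 0 and the grid step, in agreement with the bounds on the first derivative, and the values never exceed 2k in absolute value. The sketch checks only these pointwise properties and says nothing about the function spaces involved.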
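We denote by $\mathring{H}^{1,q}_{2,p}(\Omega)$ the set of all functions $u : \Omega \to \mathbb{R}$ such that for every $k \in \mathbb{N}$, $h_k(u) \in \mathring{W}^{1,q}_{2,p}(\Omega)$.
Any function $u$ in $\mathring{H}^{1,q}_{2,p}(\Omega)$ is measurable. This follows from the measurability of the functions $h_k(u)$, $k \in \mathbb{N}$, and the pointwise convergence of $\{h_k(u)\}$ to $u$.
We note that $\mathring{W}^{1,q}_{2,p}(\Omega) \subset \mathring{H}^{1,q}_{2,p}(\Omega)$ and $\mathring{H}^{1,q}_{2,p}(\Omega) \setminus L^1_{\mathrm{loc}}(\Omega) \ne \emptyset$.
Definition 7.71. If $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ and $\alpha \in \Lambda$, then $\delta^\alpha u : \Omega \to \mathbb{R}$ is the function such that $\delta^\alpha u = D^\alpha h_1(u)$ in $\{|u| \le 1\}$ and for every $k \in \mathbb{N}$, $\delta^\alpha u = D^\alpha h_{2^k}(u)$ in $\{2^{k-1} < |u| \le 2^k\}$.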
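Proposition 7.72. Let $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$. Then for every $\alpha \in \Lambda$ and for every $k \in \mathbb{N}$ we have $\delta^\alpha u = D^\alpha h_k(u)$ a.e. in $\{|u| \le k\}$.
Observe that if $u \in \mathring{W}^{1,q}_{2,p}(\Omega)$ and $\alpha \in \Lambda$, then $\delta^\alpha u = D^\alpha u$ a.e. in $\Omega$.
Definition 7.73. If $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$, then $\delta_2 u : \Omega \to \mathbb{R}^{n,2}$ is the mapping such that for every $x \in \Omega$ and for every $\alpha \in \Lambda$, $(\delta_2 u(x))_\alpha = \delta^\alpha u(x)$.
We note that if $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$, $\varphi \in \mathring{W}^{1,q}_{2,p}(\Omega) \cap L^\infty(\Omega)$, $k \in \mathbb{N}$ and $\alpha \in \Lambda$, then by (7.71) and Proposition 7.72, there exist finite integrals of the functions $A_\alpha(x,\delta_2 u)\,\delta^\alpha u$ and $A_\alpha(x,\delta_2 u)\,\delta^\alpha\varphi$ over the set $\{|u - \varphi| < 2k\}$.
We set $r = n(q-1)/(n-1)$.
Definition 7.74. An entropy solution of problem (7.74), (7.75) is a function $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ satisfying the following conditions:
(i) $F(x,u) \in L^1(\Omega)$;
(ii) there exist $c > 0$, $b \in (1, r)$ and a further positive constant such that for every $\varphi \in C_0^\infty(\Omega)$ and for every $k \in \mathbb{N}$,
$$\int_{\{|u-\varphi|<2k\}} \Big\{\sum_{\alpha\in\Lambda} A_\alpha(x,\delta_2 u)(\delta^\alpha u - \delta^\alpha\varphi)\Big\}\,h_k'(u - \varphi)\,dx \ \text{[…]}$$
[…] $q > p_1$ and the following conditions are satisfied:
(i) for almost every $x \in \Omega$ the function $F(x,\cdot)$ is nonincreasing in $\mathbb{R}$;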
(ii) for every $s \in \mathbb{R}$ the function $F(\cdot,s)$ belongs to $L^1(\Omega)$.
Then there exists an entropy solution of problem (7.74), (7.75).
From Theorems 7.76 and 7.77 it follows that under the conditions of Theorem 7.77 there exists a unique entropy solution of problem (7.74), (7.75). We stress the fact that condition (7.73) is important for the proof of both the uniqueness and the existence of an entropy solution.
Definition 7.78. An $H$-solution of problem (7.74), (7.75) is a function $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ satisfying the following conditions:
(i) $F(x,u) \in L^1(\Omega)$;
(ii) for every $\alpha \in \Lambda$, $A_\alpha(x,\delta_2 u) \in L^1(\Omega)$;
(iii) for every function $\varphi \in C_0^\infty(\Omega)$,
$$\int_\Omega \Big\{\sum_{\alpha\in\Lambda} A_\alpha(x,\delta_2 u)\,\delta^\alpha\varphi\Big\}\,dx = \int_\Omega F(x,u)\varphi\,dx.$$
Theorem 7.79. Let $u$ be an entropy solution of problem (7.74), (7.75). Then $u$ is an $H$-solution of the same problem.
Theorem 7.80. Suppose that conditions (i) and (ii) of Theorem 7.77 are satisfied. Then there exists an $H$-solution of problem (7.74), (7.75).
We set $p_2 = np/(np - n + 1)$. Since $p \in (1, n/2)$, we have $p_2 \in (1, n)$.
Proposition 7.81. Let $q > p_2$, and let $u$ be an entropy solution of problem (7.74), (7.75). Then
(i) for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 2$, there exists the weak derivative $D^\alpha u$, and $D^\alpha u = \delta^\alpha u$ a.e. in $\Omega$;
(ii) for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 2$, and for every $\lambda \in [1, rp/q)$ we have $D^\alpha u \in L^\lambda(\Omega)$.
Definition 7.82. A $W$-solution of problem (7.74), (7.75) is a function $u \in \mathring{W}^{2,1}(\Omega)$ satisfying the following conditions:
(i) $F(x,u) \in L^1(\Omega)$;
(ii) for every $\alpha \in \Lambda$, $A_\alpha(x,\nabla_2 u) \in L^1(\Omega)$;
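(iii) for every function $\varphi \in C_0^\infty(\Omega)$,
$$\int_\Omega \Big\{\sum_{\alpha\in\Lambda} A_\alpha(x,\nabla_2 u)\,D^\alpha\varphi\Big\}\,dx = \int_\Omega F(x,u)\varphi\,dx.$$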
Theorem 7.83. Let $q > p_2$, and let $u$ be an entropy solution of problem (7.74), (7.75). Then $u$ is a $W$-solution of problem (7.74), (7.75).
Theorem 7.84. Suppose that $q > p_2$ and conditions (i) and (ii) of Theorem 7.77 are satisfied. Then there exists a $W$-solution of problem (7.74), (7.75).
Observe that the inequality $q > p_2$ holds if $p \ge 3/2 - 1/n$.
Now let us state the main results of [29] concerning the notion of a proper entropy solution of problem (7.74), (7.75). We remark that the definition of a proper entropy solution is more natural than the above definition of an entropy solution of problem (7.74), (7.75). Besides, the form of the integral inequalities in the definition of a proper entropy solution agrees with an equivalent formulation of the entropy condition in the case of second-order equations with $L^1$-data (as regards this formulation, see [3, Lemma 3.2]).
We denote by $\mathcal{H}$ the set of all functions $h \in C^2(\mathbb{R})$ satisfying the conditions: $h(0) = 0$ and there exists $\tau > 0$ such that for every $s \in \mathbb{R}$, $|s| \ge \tau$, the equality $h'(s) = 0$ holds. Obviously, $\{h_k\} \subset \mathcal{H}$.
Observe that if $u : \Omega \to \mathbb{R}$, then the inclusion $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ is equivalent to the following assertion: for every $h \in \mathcal{H}$, $h(u) \in \mathring{W}^{1,q}_{2,p}(\Omega)$. Moreover, if $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$, $w \in \mathring{W}^{1,q}_{2,p}(\Omega) \cap L^\infty(\Omega)$ and $h \in \mathcal{H}$, then $h(u - w) \in \mathring{W}^{1,q}_{2,p}(\Omega) \cap L^\infty(\Omega)$.
For every function $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ we set
$$\Phi_u = \sum_{|\alpha|=1} |\delta^\alpha u|^q + \sum_{|\alpha|=2} |\delta^\alpha u|^p.$$
From Proposition 7.72 it follows that if $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ and $k \in \mathbb{N}$, then the function $\Phi_u$ is summable in $\{|u| < k\}$.
For every function $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$ we set
$$M_u = \sup_{k\in\mathbb{N}} \frac{1}{k} \int_{\{|u|<k\}} \Phi_u\,dx.$$
[…] $q > p_2$, then for every $\lambda \in [1, rp/q)$ we have $H^{1,q}_{2,p}$ […]
We denote by $\mathcal{H}_+$ the set of all functions $h \in \mathcal{H}$ such that $h' \ge 0$ in $\mathbb{R}$. Obviously, $\{h_k\} \subset \mathcal{H}_+$.
Observe that if $u \in \mathring{H}^{1,q}_{2,p}(\Omega)$, $w \in \mathring{W}^{1,q}_{2,p}(\Omega) \cap L^\infty(\Omega)$, $h \in \mathcal{H}$ and $\alpha \in \Lambda$, then the function $A_\alpha(x,\delta_2 u)\,D^\alpha h(u - w)$ is summable in $\Omega$.
Definition 7.85. A proper entropy solution of problem (7.74), (7.75) is a function $u \in \widehat{H}^{1,q}_{2,p}(\Omega)$ such that:
(i) $F(x,u) \in L^1(\Omega)$;
(ii) for every $w \in \mathring{W}^{1,q}_{2,p}(\Omega) \cap L^\infty(\Omega)$ and for every $h \in \mathcal{H}_+$,
$$\int_\Omega \Big\{\sum_{\alpha\in\Lambda} A_\alpha(x,\delta_2 u)\,D^\alpha h(u - w)\Big\}\,dx \le \int_\Omega F(x,u)\,h(u - w)\,dx.$$
Theorem 7.86. Suppose that conditions (i) and (ii) of Theorem 7.77 are satisfied. Then there exists a proper entropy solution of problem (7.74), (7.75).
Theorem 7.87. Suppose that $q > p_1$. Let $u$ be a proper entropy solution of problem (7.74), (7.75). Then $u$ is an entropy solution of problem (7.74), (7.75).
Theorem 7.88. Suppose that all the conditions of Theorem 7.77 are satisfied. Let $u$ be an entropy solution of problem (7.74), (7.75). Then $u$ is a proper entropy solution of problem (7.74), (7.75).
Theorems 7.76 and 7.87 imply the following result.
Theorem 7.89. Suppose that $q > p_1$ and for almost every $x \in \Omega$ the function $F(x,\cdot)$ is nonincreasing in $\mathbb{R}$. Let $u$ and $v$ be proper entropy solutions of problem (7.74), (7.75). Then $u = v$ a.e. in $\Omega$.
We remark that if $u$ is a proper entropy solution of problem (7.74), (7.75), then $u$ is an $H$-solution of problem (7.74), (7.75). Furthermore, if $q > p_2$ and $u$ is a proper entropy solution of problem (7.74), (7.75), then $u$ is a $W$-solution of problem (7.74), (7.75).
Proposition 7.90. Let $u$ be a proper entropy solution of problem (7.74), (7.75). Let $h \in C^1(\mathbb{R})$, $M > 0$, and let the following conditions be satisfied: the function $h$ is nonnegative and bounded in $\mathbb{R}$, $|h'| \le M h$ in $\mathbb{R}$ and $\int_{-\infty}^{+\infty} h(\tau)\,d\tau < +\infty$. Then $\Phi_u h(u) \in L^1(\Omega)$.
The next two results are consequences of Proposition 7.90.
Proposition 7.91. Let $u$ be a proper entropy solution of problem (7.74), (7.75). Then for every $\lambda > 1$ we have $\Phi_u\,(1+|u|)^{-1}\,[\ln(2+|u|)]^{-\lambda} \in L^1(\Omega)$.
Proposition 7.92. Let $u$ be a proper entropy solution of problem (7.74), (7.75). Let $h \in C^1(\mathbb{R})$, $M > 0$, and let the following conditions be satisfied: the function $h$ is even, positive in $\mathbb{R}$ and nonincreasing in $[0,+\infty)$, $|h'| \le M h$ in $\mathbb{R}$ and
$$\int_1^{+\infty} \frac{1}{\tau}\,[h(\tau)]^{(n-q)/n}\,d\tau < +\infty.$$
Then $|u|^{n(q-1)/(n-q)}\,h(u) \in L^1(\Omega)$, for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 1$, we have $|\delta^\alpha u|^r\,[h(u)]^{(n-q)/(n-1)} \in L^1(\Omega)$, and for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 2$, we have $|\delta^\alpha u|^{rp/q}\,[h(u)]^{(n-q)/(n-1)} \in L^1(\Omega)$.
In turn, Proposition 7.92 implies the following result.
Proposition 7.93. Let $u$ be a proper entropy solution of problem (7.74), (7.75). Let $\lambda > 1$. Then
$$|u|^{n(q-1)/(n-q)}\,[\ln(2+|u|)]^{-n/(n-q)}\,[\ln\ln(3+|u|)]^{-\lambda n/(n-q)} \in L^1(\Omega);$$
for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 1$, we have
$$|\delta^\alpha u|^r\,[\ln(2+|u|)]^{-n/(n-1)}\,[\ln\ln(3+|u|)]^{-\lambda n/(n-1)} \in L^1(\Omega);$$
and for every $n$-dimensional multi-index $\alpha$, $|\alpha| = 2$, we have
$$|\delta^\alpha u|^{rp/q}\,[\ln(2+|u|)]^{-n/(n-1)}\,[\ln\ln(3+|u|)]^{-\lambda n/(n-1)} \in L^1(\Omega).$$
Obviously, Propositions 7.90–7.93 are analogues of Propositions 7.65–7.69. Besides, they are analogous to results of [34] for $W$-solutions of the Dirichlet problem for a class of degenerate nonlinear high-order equations with strengthened monotonicity and $L^1$-data.
We note that using the a priori estimates of proper entropy solutions of problem (7.74), (7.75) obtained in [29, Section 5] along with techniques analogous to those of [20, 22, 28], one can establish results on the improvement of the summability of proper entropy solutions of problem (7.74), (7.75) under additional conditions on the function $F$, in particular under the assumption that the function $F(\cdot,0)$ belongs to some logarithmic classes or to $L^t(\Omega)$ with $t > 1$.
Finally, we remark that high-order equations with conditions on the coefficients of kind (7.71) and (7.72) and sufficiently regular data were introduced in [42], and the data regularity assumed there, unlike the case of $L^1$-data, admits the study of the solvability of the equations within the usual framework of the monotone operator theory [36].
7.2.3 Concluding Remarks In the present article, we gave a survey of results belonging to one of the intensively developed directions in the theory of nonlinear partial differential equations. Within the framework of this direction the existence and properties of different kinds of solutions of nonlinear equations with L1 -data or measure data are studied. To a great extent, the development of these investigations was inspired by the works [8, 9] and, especially, [3]. In the latter work, an effective approach to the study of the solvability of nonlinear elliptic second-order equations with L1 -data was proposed. The survey presented in the given article exposes mainly the author’s results in the mentioned field. Among the results stated in Section 7.2.1 we pointed out theorems on limit summability of weak solutions to the Dirichlet problem for nonlinear elliptic second-order equations with L1 -right-hand sides. These theorems were published in [20,21,27,28]. They are essentially stronger than the corresponding assertion on limit summability given in [9]. Here we also mention recent author’s results on the existence and a priori properties of entropy solutions to the Dirichlet problem for nonlinear elliptic secondorder equations whose coefficients admit degenerate coercivity and arbitrary growth with respect to unknown function (see Subsections 7.2.1.2 and 7.2.1.3). These results, published in [23, 24], generalize and improve those obtained in [1, 3, 10]. We observe that in particular cases the left-hand sides of the second-order equations considered in Section 7.2.1 may be generatedPor may include the p-Laplace operator p which is defined by the equality p u D niD1 Di .jrujp2 Di u/ with p > 1. Section 7.2.2 presents the author’s results on the existence and properties of different kinds of solutions to the Dirichlet problem for nonlinear elliptic fourth-order equations with strengthened coercivity and L1 -right-hand sides. 
These results are published in [19, 25, 29], where an approach analogous to that of [3] is realized for fourth-order equations. We emphasize that the realization of this approach, unlike [3], has some distinctive features. Firstly, in the case of equations of fourth and higher order, one cannot use standard truncated functions in the same way as in [3] for obtaining the necessary estimates of solutions of approximate problems with regular data. Secondly, the use in the corresponding integral identities of other (smooth) functions instead of the standard truncations makes it necessary to handle rather delicately the terms related to the high-order derivatives of the appropriate test functions. In the end, these circumstances dictate which structure of high-order equations with $L^1$-data and which energy space for the approximate problems should be chosen in order to realize the approach under discussion. As is shown in [19, 25, 29], in the case of the Dirichlet problem for fourth-order equations with $L^1$-data in a bounded open set $\Omega$ of $\mathbb{R}^n$ with $n > 2$, precisely the strengthened coercivity condition (7.72) with the exponents $p \in (1, n/2)$ and $q \in (2p, n)$ and the corresponding Sobolev space $\mathring{W}^{1,q}_{2,p}(\Omega)$ are suitable.
We note that the equation $\Delta^2 u - \Delta_q u + |u|^{\sigma-1} u = f$, where $f \in L^1(\Omega)$, $\sigma > 1$, $q \in (4, n)$, $\Delta_q$ is the q-Laplacian and $\Delta^2$ is the biharmonic operator, is a particular representative of the class of equations considered in Section 7.2.2. As we mentioned in the introduction, the p-Laplace operator and its generalizations are involved in the statement of many applied problems. The same applies to the biharmonic operator, which is used, for instance, in modeling such problems as the flexure of thin plates, slow viscous flow of Newtonian fluids and plane problems of elasticity theory, as well as in the study of some biological processes.
Bibliography
[1] A. Alvino, L. Boccardo, V. Ferone, L. Orsina, G. Trombetti, Existence results for nonlinear elliptic equations with degenerate coercivity, Ann. Mat. Pura Appl., (4) 182 (2003), no. 1, 53–79.
[2] A. Alvino, V. Ferone, G. Trombetti, Nonlinear elliptic equations with lower-order terms, Differential Integral Equations, 14 (2001), no. 10, 1169–1180.
[3] Ph. Bénilan, L. Boccardo, T. Gallouët, R. Gariepy, M. Pierre, J. L. Vazquez, An $L^1$ theory of existence and uniqueness of solutions of nonlinear elliptic equations, Ann. Scuola Norm. Sup. Pisa Cl. Sci., (4) 22 (1995), no. 2, 241–273.
[4] M. F. Betta, A. Mercaldo, F. Murat, M. M. Porzio, Existence of renormalized solutions to nonlinear elliptic equations with a lower-order term and right-hand side a measure, J. Math. Pures Appl., (9) 82 (2003), no. 1, 90–124.
[5] M. F. Betta, T. Del Vecchio, M. R. Posteraro, Existence and regularity results for nonlinear degenerate elliptic equations with measure data, Ricerche Mat., 47 (1998), no. 2, 277–295.
[6] D. Blanchard, F. Murat, H. Redwane, Existence et unicité de la solution renormalisée d’un problème parabolique non linéaire assez général, C.R. Acad. Sci. Paris Sér. I Math., 329 (1999), no. 7, 575–580.
[7] L. Boccardo, A. Dall’Aglio, L. Orsina, Existence and regularity results for some elliptic equations with degenerate coercivity, Atti Sem. Mat. Fis. Univ. Modena, 46 (1998), suppl., 51–81.
[8] L. Boccardo, T. Gallouët, Non-linear elliptic and parabolic equations involving measure data, J. Funct. Anal., 87 (1989), no. 1, 149–169.
[9] L. Boccardo, T. Gallouët, Nonlinear elliptic equations with right hand side measures, Comm. Partial Differential Equations, 17 (1992), no. 3–4, 641–655.
[10] L. Boccardo, T. Gallouët, Summability of the solutions of nonlinear elliptic equations with right hand side measures, J. Convex Anal., 3 (1996), no. 2, 361–365.
[11] L. Boccardo, T. Gallouët, P. Marcellini, Anisotropic equations in $L^1$, Differential Integral Equations, 9 (1996), no. 1, 209–212.
[12] L. Boccardo, T. Gallouët, L. Orsina, Existence and uniqueness of entropy solutions for nonlinear elliptic equations with measure data, Ann. Inst. H. Poincaré. Anal. Non Linéaire, 13 (1996), no. 5, 539–551.
[13] L. Boccardo, T. Gallouët, J. L. Vazquez, Nonlinear elliptic equations in $\mathbb{R}^N$ without growth restrictions on the data, J. Differential Equations, 105 (1993), no. 2, 334–363.
[14] L. Boccardo, D. Giachetti, Alcune osservazioni sulla regolarità delle soluzioni di problemi fortemente non lineari e applicazioni, Ricerche Mat., 34 (1985), no. 2, 309–323.
[15] G. Dal Maso, F. Murat, L. Orsina, A. Prignet, Renormalized solutions of elliptic equations with general measure data, Ann. Scuola Norm. Sup. Pisa Cl. Sci., (4) 28 (1999), no. 4, 741–808.
[16] N. Dunford, J. T. Schwartz, Linear Operators. Part I. General Theory, John Wiley & Sons, Inc., New York, 1988.
[17] D. Gilbarg, N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer, Berlin, 1983.
[18] D. Kinderlehrer, G. Stampacchia, An Introduction to Variational Inequalities and their Applications, Academic Press, New York, 1980.
[19] A. A. Kovalevskii, Entropy solutions of the Dirichlet problem for a class of non-linear elliptic fourth-order equations with right-hand sides in $L^1$, Izv. Math., 65 (2001), no. 2, 231–283.
[20] A. A. Kovalevskii, Integrability of solutions of nonlinear elliptic equations with right-hand sides from classes close to $L^1$, Math. Notes, 70 (2001), no. 3–4, 337–346.
[21] A. A. Kovalevskii, Integrability of solutions of nonlinear elliptic equations with right-hand sides from logarithmic classes, Math. Notes, 74 (2003), no. 5–6, 637–646.
[22] A. A. Kovalevskii, On the summability of entropy solutions for the Dirichlet problem in a class of non-linear elliptic fourth-order equations, Izv. Math., 67 (2003), no. 5, 881–894.
[23] A. A. Kovalevskii, On the convergence of functions from a Sobolev space satisfying special integral estimates, Ukrainian Math. J., 58 (2006), no. 2, 189–205.
[24] A. A. Kovalevskii, A priori properties of solutions of nonlinear equations with degenerate coercivity and $L^1$-data, J. Math. Sci. (N.Y.), 149 (2008), no. 5, 1517–1538.
[25] A. Kovalevsky, Entropy solutions of Dirichlet problem for a class of nonlinear elliptic fourth order equations with $L^1$-data, Nonlinear Boundary Value Problems, 9 (1999), 46–54.
[26] A. A. Kovalevsky, Entropy solutions of Dirichlet problem for a class of nonlinear elliptic high-order equations with $L^1$-data, Nelinejnye Granichnye Zadachi, 12 (2002), 119–127.
[27] A. A. Kovalevsky, On a sharp condition of limit summability of solutions of nonlinear elliptic equations with $L^1$-right-hand sides, Ukr. Math. Bull., 2 (2005), no. 4, 507–545.
[28] A. A. Kovalevsky, General conditions for limit summability of solutions of nonlinear elliptic equations with $L^1$-data, Nonlinear Anal., 64 (2006), no. 8, 1885–1895.
[29] A. A. Kovalevsky, Nonlinear fourth-order equations with a strengthened ellipticity and $L^1$-data, in: On the Notions of Solution to Nonlinear Elliptic Problems: Results and Developments, Quad. Mat. 23, pp. 283–337, Dept. Math., Seconda Univ. Napoli, Caserta, 2008.
[30] A. Kovalevsky, F. Nicolosi, Solvability of Dirichlet problem for a class of degenerate nonlinear high-order equations with $L^1$-data, Nonlinear Anal., 47 (2001), no. 1, 435–446.
[31] A. Kovalevsky, F. Nicolosi, Entropy solutions of Dirichlet problem for a class of degenerate anisotropic fourth-order equations with $L^1$-right-hand sides, Nonlinear Anal. Ser. A: Theory Methods, 50 (2002), no. 5, 581–619.
[32] A. Kovalevsky, F. Nicolosi, Existence of solutions of some degenerate nonlinear elliptic fourth-order equations with $L^1$-data, Appl. Anal., 81 (2002), no. 4, 905–914.
[33] A. Kovalevsky, F. Nicolosi, Solvability of Dirichlet problem for a class of degenerate anisotropic equations with $L^1$-right-hand sides, Nonlinear Anal., 59 (2004), no. 3, 347–370.
[34] A. A. Kovalevsky, F. Nicolosi, On multipliers characterizing summability of solutions for a class of degenerate nonlinear high-order equations with $L^1$-data, Nonlinear Anal., 69 (2008), no. 3, 931–939.
[35] A. A. Kovalevsky, I. I. Skrypnik, A. E. Shishkov, Singular Solutions of Nonlinear Elliptic and Parabolic Equations, Naukova Dumka, Kyiv, 2010. (In Russian)
[36] J.-L. Lions, Quelques Méthodes de Résolution des Problèmes aux Limites non Linéaires, Dunod, Gauthier–Villars, Paris, 1969.
[37] F. Murat, Équations elliptiques non linéaires avec second membre $L^1$ ou mesure, in: Actes du 26ème Congrès National d’Analyse Numérique, pp. A12–A24, Les Karellis, Juin 1994, Université de Lyon I, France, 1994.
[38] A. Porretta, Nonlinear equations with natural growth terms and measure data, Proceedings of the 2002 Fez Conference on Partial Differential Equations, Electron. J. Differ. Equ. Conf., 9 (2002), 183–202.
[39] J.-M. Rakotoson, Generalized solutions in a new type of sets for problems with measures as data, Differential Integral Equations, 6 (1993), no. 1, 27–36.
[40] J.-M. Rakotoson, Uniqueness of renormalized solutions in a T-set for the $L^1$-data problem and link between various formulations, Indiana Univ. Math. J., 43 (1994), no. 2, 685–702.
[41] S. Segura de León, Existence and uniqueness for $L^1$ data of some elliptic equations with natural growth, Adv. Differential Equations, 8 (2003), no. 11, 1377–1408.
[42] I. V. Skrypnik, High-order quasilinear elliptic equations with continuous generalized solutions, Differential Equations, 14 (1978), no. 6, 786–795.
[43] G. Stampacchia, Équations elliptiques du second ordre à coefficients discontinus, Séminaire de Mathématiques Supérieures, No. 16 (Été, 1965), Les Press. Univ. Montreal, Montreal, 1966.
Author Information Alexander A. Kovalevsky, Institute of Applied Mathematics and Mechanics, Donetsk, Ukraine E-mail: [email protected]
8
Mathematical Models of Pattern Formation and Their Applications in Developmental Biology
Anna Marciniak-Czochra
8.1 Reaction-Diffusion Models of Pattern Formation in Developmental Biology
Abstract. In this paper we present mathematical approaches to understanding symmetry breaking and the formation of spatially heterogeneous structures during development. We focus on models given by reaction-diffusion equations and address the question of which mechanisms can produce spatially heterogeneous structures. We discuss two mechanisms of pattern formation, diffusion-driven instability (Turing instability) and a hysteresis-driven mechanism, and demonstrate their possibilities and constraints in explaining different aspects of structure formation in cell systems. Depending on the type of nonlinearities, we show the existence of Turing patterns, the maxima of which may be of the spike or plateau type, and the existence of transition layer stationary solutions. These concepts are discussed using the example of morphogenesis of the fresh-water polyp Hydra, which is a model organism in developmental biology.
Keywords. Hysteresis, Pattern Formation, Reaction-diffusion Equation, Receptor-based Model, Turing Instability
2010 Mathematics Subject Classification. 35K57, 35Q92, 92C15
This work was supported by European Research Council Starting Grant 210680 “Biostruct” and the Emmy Noether Programme of the German Research Council (DFG).
8.1.1 Introduction
Spatial and spatio-temporal structures occur widely in physics, chemistry and biology. In many cases, they seem to be generated spontaneously. Understanding the principles of development and design in biology is among the crucial issues not only in developmental biology but also in the field of regenerative medicine. In order to develop methods for intelligent engineering of functional tissues, the main principles of development and design have to be understood. Recent advances made in genetic and molecular biology have led to detailed descriptions of a number of events in embryological development. Although genes control pattern formation, genetics alone is insufficient to understand which physico-chemical interactions of embryonic material produce the complex spatio-temporal signaling
cues which ultimately determine the cell’s fate. Since the establishment of symmetry breaking by cell polarity in developing tissues is determined by quantitative integration of multiple signals in a highly dynamical and self-organized process, it can hardly be understood using conventional molecular biology methods alone. The role of mathematical modeling is to verify which processes are sufficient to produce the patterning. Model mechanisms can suggest to the embryologist possible scenarios as to how, and sometimes when, a pattern is laid down and how the embryonic form might be created. Modeling also makes it possible to formulate experimentally testable predictions and may provide alternative explanations for the observed phenomena. In other areas of biology, such as neurophysiology or ecology, mathematical modeling has led to many discoveries and insights through a process of synthesis and integration of experimental data, see [37] and references therein. In developmental biology, too, many different morphologies have been the subject of mathematical modeling. Some of the biological systems have attained the status of a paradigm in theoretical work [3, 7]. One such example, which shows how the study of model mechanisms can suggest real scenarios for the process of pattern formation, is limb development [37]. A mechanochemical model describes the diffusion, haptotaxis and advection of mesenchymal cells which evolve in a developing limb bud and which eventually become cartilage. Another developmental process used for studying pattern formation and different aspects of embryogenesis is the segmentation of the insect embryo [3, 12]. Models based on chemotaxis and the response of cells to gradients in the chemoattractant were applied to study the life cycle of the slime mould Dictyostelium discoideum and the emergence of concentrated patterns of cell density [11, 13, 42].
More recently, models of morphogenesis have been applied to understand the growth of tumors. They involve a wide range of biological phenomena such as cell adhesion and cell traction, angiogenesis, pattern formation in cancer and macrophage dynamics [5, 37]. All these models, although based on different biological hypotheses, have many common mathematical features and are mostly based on a few views of pattern generation. One is the chemical prepattern approach involving hypothetical chemicals (morphogens) which diffuse and react in such a way that spatially heterogeneous patterns can evolve from uniform steady states. Coupling the diffusion of signaling molecules with the nonlinear dynamics of intracellular processes and with cellular growth and transformation leads to receptor-based models, which differ from the usual reaction-diffusion systems. Next, the mechanochemical approach takes into account mechanical forces and properties of cells and tissues. Another class of models relies on taxis, e.g., chemotaxis or haptotaxis, and the response of cells to gradients in the concentration of signaling molecules in the environment [1, 30]. Different models are able to produce similar patterns. The question is how to distinguish between them so as to determine which may be the relevant mechanisms. Of course, the first necessary condition is that the model must produce the observed patterns. But then it is important to design new experiments, which could allow for model
validation and therefore would also lead to the verification of model hypotheses and biological theories. Whereas different developmental processes are involved in different organisms, it is striking how conserved the processes are across different taxonomic units such as the phyla. Also, the same processes are involved in various diseases, in particular in cancer. Molecules found to be oncogenic factors also play an important role in developmental processes. This unity and conservation of basic processes implies that their mathematical models can have an impact across the spectrum of normal and pathological development. In this paper we present different approaches to model development and regeneration of the fresh-water polyp Hydra. We focus on the framework of continuous models given by partial differential equations. In particular, we employ reaction-diffusion equations and discuss their performance in the context of key experiments. One of the objectives is to understand how the structure of nonlinear feedbacks determines qualitative behavior of the system, in particular existence of stable spatially heterogenous patterns. We discuss different mechanisms of pattern formation, i.e., diffusion-driven instability and hysteresis-driven pattern formation. The first class of models uses special features of diffusion, which results in the destabilization of the spatially homogeneous steady state and emergence of spatial heterogeneity. The second mechanism of pattern formation in such systems is based on the existence of multiple steady states and hysteresis in the intracellular dynamics. Diffusion of the signaling molecules tries to average different states and is the cause of spatio-temporal patterns.
8.1.2 Mechanisms of Developmental Pattern Formation
One of the crucial issues in developmental biology is to understand how coordinated systems of positional information are established during an organism’s development and how cells in the organism respond to the associated signaling cues, processes which ultimately result in the subdivided and patterned tissues of multicellular organisms [35, 36, 46]. Experiments suggest that during development cells respond to local positional cues that are dynamically regulated. The hypothesis is that cells differentiate according to positional information [46]. The question is how this information is supplied to the cells. There exist a number of models for pattern formation and regulation based on the idea that positional information is supplied to cells by a diffusing biochemical morphogen [7, 37, 46]. This idea links the expression of target genes with local concentrations of morphogen molecules (ligands). Different concentrations of morphogens are able to activate transcription of distinct target genes and thus cell differentiation. However, both regulatory and signaling molecules (ligands) act by binding and activating receptor molecules which are located in the cell membrane (or, with lipophilic ligands, in the cytoplasm) [20, 36]. This observation leads to the hypothesis that the positional value of the cell may be determined by the density of bound receptors, which do not diffuse [35].
8.1.3 Motivating Application: Pattern Control in Hydra
One of the most frequently discussed organisms in theoretical papers on biological pattern formation is the fresh-water polyp Hydra. What is peculiar about Hydra? It is one of the oldest and simplest multicellular organisms equipped with typically animal cells such as sensory cells, nerve cells and muscle cells. The animal has an almost unlimited life span and regeneration capacity. As in plants, Hydra tissue contains stem cells that are constantly dividing and regenerating the adult structures of the polyp. This unlimited growth indicates that Hydra does not undergo senescence and, in this sense, is biologically immortal [4]. Morphogenetic mechanisms active in adult polyps are responsible for the regenerative ability and the establishment of a new body axis. Research on Hydra might reveal how to selectively reactivate the genes and proteins to regenerate human tissues. The fact that no tumor formation or other malignancies have been reported for Hydra so far indicates that growth control and tissue homeostasis in normal Hydra polyps are very efficient. The developmental processes governing formation of the Hydra body plan and its regeneration are well understood at the tissue level [35, 36]. Therefore, experiments performed on Hydra provide a good ground for testing the abilities and limitations of mathematical models. We may distinguish three main experiments:
De novo pattern formation. It was shown that normal Hydra can regenerate from random cellular aggregates [10, 36] (see Figure 8.1). Reorganization does not result from a spatial rearrangement, but it is an effect of concerted changes in the functional state of the cells. The cells do not sort with respect to the positional origin along the body axis [39]. These experiments suggest that there exist mechanisms which define new centers of head organizing activity within an initially chaotic mass of cells.
Cutting experiments. Hydra has a high capacity to regenerate any lost body part, which occurs mainly by the re-patterning of existing tissues and is an example of morphallaxis [46]. The lack of growth requirements for regeneration is shown in heavily irradiated polyps. No cell divisions occur, but the animals can still regenerate normally. Consequently, the mechanism of pattern formation in Hydra seems to be independent of growth. Overlapping cut levels show that the same cells can form either the gastric region, or the head, or the foot, according to their position along the body axis (see Figure 8.2). The experiment shows that after a transverse cut both parts of the animal can regenerate [35]. Moreover, the polarity is maintained even in small pieces of the body. A tissue piece containing 150–300 epithelial cells, i.e., about 1 percent of normal polyp, regenerates a complete Hydra [41]. However, below this size no regeneration takes place. There are also observations showing that the time required for the regeneration decreases with increasing tissue size [41].
Figure 8.1. Time evolution of aggregates from randomly arranged Hydra cells. Several polyps were disintegrated into a suspension of isolated cells, which subsequently were allowed to reestablish contact. Within two weeks the aggregate reintegrated itself into intact animals [Courtesy of W. Müller].
Figure 8.2. Cutting experiment. Hydra regenerates after a transverse cut of cells of the gastric region (from both upper and lower half of the body column). In one experiment (left-hand side) the lower body column is removed, in the second experiment (right-hand side) the upper part is removed. The cut levels are not identical but somewhat different to show that one and the same group of cells (marked in grey) can form a foot (left-hand side), or a head (right-hand side) or a gastric segment (original state in the middle). The function of the cells depends on the position along the body column [Courtesy of W. Müller].
Grafting experiments. Grafting experiments show how disparities between the positional value of the transplant and the surrounding host tissue result in the head or foot formation leading to development of new organisms with multiple heads or feet [35,36] (see Figure 8.3).
Figure 8.3. Grafting experiment. Determination of relative positional information values by transplantation. Pieces of tissue are grafted from one animal to another and one of three outcomes is observed. (1) If the tissue is transplanted from the upper position along the body column to the lower position then a new head is formed. (2) If the former and new position is the same then the piece is integrated and nothing is observed. (3) If the tissue is grafted to the upper position a new foot is formed [Courtesy of W. Müller].
Figure 8.4. The illustration of the idea of “positional value”, which is supplied to the cells and interpreted by them. The hypothesis is that the formation of the head is determined by the high “positional value” (which is above some threshold). The figure shows the “positional value” for a supernumerary head structure [Courtesy of W. Müller].
To conclude, experiments of this kind suggest that the cells respond to local positional cues that are dynamically regulated. It leads to the hypothesis of Wolpert on positional information [46] (compare Figure 8.4). The question is how this information is supplied to the cells and which mechanisms control the formation of spatially heterogenous structures in the positional information, and consequently patterns of cell differentiation. In the remainder of this paper we will address this question based on the results of mathematical models employing different hypotheses. Since the exact molecular mechanism of pattern formation in Hydra is unknown, the proposed models are hypothetical. They attempt to answer the following questions:
What minimal processes are sufficient to produce de novo patterns?
Which models are able to capture the results of the above experiments?
The problem was first approached by Wolpert [46], who suggested a gradient model to account for head formation, in which at the head end a morphogen S is emitted. The morphogen spreads by diffusion and is distributed down the body. This diffusing chemical induces formation of the head. The proposed model corresponds to the assumption that morphogens are secreted only by a group of cells in some restricted region of a tissue and then transported in an adjacent tissue. While there is experimental evidence of such signaling in other systems, such as, for example, Drosophila wing imaginal disc or Spemann organizer, it is not the case in Hydra de novo development from the dissociated cells [36].
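The gradient idea can be illustrated with a minimal numerical sketch. It is not Wolpert's original formulation: we assume, purely for illustration, that the morphogen S is held at a fixed level at the head end (x = 0), diffuses along a body column of length `length`, is degraded at a constant linear rate, and cannot leave through the foot end (no-flux condition). The function names, the parameter values and the threshold are hypothetical. Under these assumptions the steady state of S_t = diffusion * S_xx - decay * S is a monotone profile, and cells compare it with a threshold.

```python
import numpy as np

def morphogen_gradient(x, length, diffusion, decay, s_head):
    """Steady state of S_t = diffusion*S_xx - decay*S on (0, length),
    with S(0) = s_head (source at the head) and S_x(length) = 0 (no flux at the foot).
    The linear decay term is an assumption of this sketch."""
    ell = np.sqrt(diffusion / decay)  # characteristic length of the gradient
    return s_head * np.cosh((length - x) / ell) / np.cosh(length / ell)

def head_region_extent(x, S, threshold):
    """Largest position at which the morphogen level still reaches the threshold."""
    above = x[S >= threshold]
    return above.max() if above.size else None

# Hypothetical, dimensionless parameter values.
x = np.linspace(0.0, 1.0, 201)  # body axis, head at x = 0
S = morphogen_gradient(x, length=1.0, diffusion=0.05, decay=1.0, s_head=10.0)
extent = head_region_extent(x, S, threshold=5.0)
print(f"cells up to x = {extent:.3f} read a value above the threshold")
```

In such a sketch the pattern is imposed by the localized source, so it cannot by itself account for regeneration from dissociated cells, where no source region is prescribed.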
8.1.4 Diffusive Morphogens and Turing Patterns
The question of de novo pattern formation in a homogeneous tissue was addressed by Turing in his pioneering paper [45]. He proposed a hypothesis that can be stated as follows: When two chemical species with different diffusion rates react with each other, the spatially homogeneous state may become unstable, thereby leading to a nontrivial spatial structure. The idea looks counterintuitive, since diffusion is expected to lead to a uniform distribution of the particles. Mathematical analysis of reaction-diffusion equations provides an explanation for the phenomenon postulated by Turing. The proposed mechanism of pattern formation is related to the local behavior of solutions of a reaction-diffusion system in the neighborhood of a constant solution that is destabilized via diffusion. Patterns arise through a bifurcation, which we call diffusion-driven instability (DDI). They can be spatially monotone, corresponding to gradients in positional information, or spatially periodic.
Definition 8.1 (Turing instability). A system of reaction-diffusion equations exhibits DDI (Turing instability) if and only if there exists a constant stationary solution which is stable to spatially homogeneous perturbations, but unstable to spatially heterogeneous perturbations.
The original idea was presented by Turing on the example of two linear reaction-diffusion equations of the form
\[
\begin{aligned}
\frac{\partial u}{\partial t} &= D \Delta u + A u && \text{in } \Omega, \\
\partial_n u(t, x) &= 0 && \text{on } \partial\Omega, \\
u(0, x) &= u_0(x),
\end{aligned}
\tag{8.1}
\]
where $u \in \mathbb{R}^2$ is a vector of two variables, $A$ is a constant $2 \times 2$ matrix, $D$ is a diagonal matrix with nonnegative coefficients $d_u$, $d_v$ on the diagonal, the symbol $\partial_n$ denotes the normal derivative
(no-flux condition), and $\Omega$ is a bounded region. Here the only constant steady state is $(0, 0)$. Following Turing [45], we can formulate the following result on DDI:
Theorem 8.2 (Alan Turing). Assume that $\operatorname{tr} A < 0$, $\det A > 0$ and $d_v > 0$. There exists $d_u > 0$ (small enough) such that the constant steady state $(0, 0)$ is unstable for the reaction-diffusion equation (8.1).
It can be proven using a spectral decomposition of the Laplace operator with homogeneous Neumann boundary conditions and calculating the eigenvalues of the resulting finite-dimensional operators. Due to the local character of Turing instability, the notion has been extended in a natural way to nonlinear equations using linearization around a constant positive steady state. However, in the case of nonlinear systems we may have to deal with multiple constant steady states. In such cases we observe the existence of heterogeneous structures far from the equilibrium, and the global behavior of the solutions cannot be predicted by the properties of the linearized system, e.g., [27, 43]. In fact, we can observe a variety of possible dynamics depending on the type of nonlinearities. On the other hand, Turing instability can also be exhibited in degenerate systems such as reaction-diffusion-ODE models or integro-differential equations, for example, shadow systems obtained through reduction of the reaction-diffusion model [26, 27]. Following all these observations and the original character of Turing’s system, we define Turing patterns in the following way:
Definition 8.3. By Turing patterns we refer to the solutions of reaction-diffusion equations that are
stable,
stationary,
continuous,
spatially heterogenous and
arise due to the Turing instability (DDI) of a constant steady state.
It can happen in a reaction-diffusion system with DDI that all nonconstant stationary solutions are unstable and then the solution converges to another constant solution or to a dynamical structure such as a spike pattern [26, 43]. In case of at least three equations, the system can also exhibit a Turing-type Hopf bifurcation, which leads to spatio-temporal oscillations [19].
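The spectral argument behind DDI for the linear system (8.1) can be sketched numerically. On an interval of length `length` with no-flux boundary conditions, the Laplacian has eigenvalues -(k*pi/length)^2, k = 0, 1, 2, ..., so each spatial mode grows or decays according to the eigenvalues of the 2x2 matrix A - (k*pi/length)^2 D. The matrix `A` below is a hypothetical example with negative trace, positive determinant and a self-activating first component (positive upper-left entry); the function name, the domain length and the diffusion coefficients are likewise assumptions made for illustration only.

```python
import numpy as np

def growth_rates(A, d_u, d_v, length=1.0, n_modes=40):
    """Largest real part of the eigenvalues of A - lam_k * D for each Neumann mode k,
    where lam_k = (k*pi/length)^2 and D = diag(d_u, d_v)."""
    D = np.diag([d_u, d_v])
    rates = np.empty(n_modes)
    for k in range(n_modes):
        lam_k = (k * np.pi / length) ** 2
        rates[k] = np.linalg.eigvals(A - lam_k * D).real.max()
    return rates

# Hypothetical kinetics: tr A = -1 < 0, det A = 1 > 0, first component self-activating.
A = np.array([[1.0, -1.0],
              [3.0, -2.0]])

slow_activator = growth_rates(A, d_u=0.001, d_v=1.0)   # d_u much smaller than d_v
equal_diffusion = growth_rates(A, d_u=1.0, d_v=1.0)    # no difference in diffusion rates

print("growth rate of the homogeneous mode (k = 0):", slow_activator[0])
print("unstable modes for d_u = 0.001:", np.flatnonzero(slow_activator > 0))
print("unstable modes for d_u = d_v:  ", np.flatnonzero(equal_diffusion > 0))
```

In this example the mode k = 0 decays in both cases, which is stability to spatially homogeneous perturbations, while a band of modes with k >= 1 grows only when d_u is much smaller than d_v.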
8.1.4.1 Activator-inhibitor Model
The most famous realization of Turing’s idea in a mathematical model of biological pattern formation is the activator-inhibitor model proposed by Gierer and Meinhardt [7]. The model aims to explain head formation in Hydra through the coupling of a local activation to a long-range inhibition process. An activator promotes head formation and increases itself autocatalytically. An inhibitor acts as a suppressant against the self-enhancing activator to prevent the system from unlimited growth. In this approach the positional value is interpreted as the density of the activator. Gradients of morphogens are formed by the DDI mechanism. Each of the various body parts is assumed to be under control of a separate activator-inhibitor system (for details, see [32]). The basic activator-inhibitor model takes the form
\[
\begin{aligned}
\frac{\partial a}{\partial t} &= D_a \frac{\partial^2 a}{\partial x^2} + \rho_a \frac{a^2}{h} + \sigma_a - \mu_a a, \\
\frac{\partial h}{\partial t} &= D_h \frac{\partial^2 h}{\partial x^2} + \rho_h a^2 + \sigma_h - \mu_h h,
\end{aligned}
\tag{8.2}
\]
where $a$ and $h$ denote the concentrations of the activator and the inhibitor, respectively. The parameters $\sigma_a$ and $\sigma_h$ describe de novo production, $\mu_a$ and $\mu_h$ are the rates of degradation, and $\rho_a$ and $\rho_h$ are the parameters of the activator-inhibitor interactions. The model and several of its modifications were applied in the study of various topics from developmental biology (see, e.g., [33, 34]). Due to its interesting mathematical features and emerging singularities, the model has also attracted a lot of attention in mathematical analysis, e.g., [31, 43]. The activator-inhibitor theory operates with purely hypothetical morphogens. As we can see in Theorem 8.2, the key mechanism of Turing-type patterns is that an inhibitor diffuses faster than an activator. However, dynamics and complex tissue topologies are likely to prevent the establishment of long-range inhibitor gradients. Furthermore, diffusion rates of typical morphogens are often found to be quite small [9], i.e., they do not allow the significantly different diffusion rates required by the Turing mechanism. In the case of Hydra, while Wnt has recently been identified as an activator [10], a long-range inhibitor is missing [21, 34]. These observations support the search for a different inhibitory mechanism, such as mechanical inhibition [6], or for a mechanism of pattern formation different from DDI [24]. In the context of Hydra experiments it is also important to note that the shape of Turing patterns depends on the size of the domain and on diffusion rather than on initial conditions. Therefore, one of the difficulties of Turing-type models is their inability to reproduce the experiments resulting in multiple head formation in Hydra [23].
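The requirement that the inhibitor diffuses faster than the activator can be checked on the activator-inhibitor kinetics with a short computation. The sketch below makes several simplifying assumptions: the de novo production terms are set to zero so that the constant steady state has a closed form, the interaction and degradation parameters (called `rho_a`, `rho_h`, `mu_a`, `mu_h` here) take hypothetical values, and the domain is an interval of unit length with no-flux boundary conditions. It linearizes the kinetics at the constant steady state and evaluates the growth rate of each spatial mode.

```python
import numpy as np

def gm_steady_state(rho_a, rho_h, mu_a, mu_h):
    """Constant steady state of the kinetics rho_a*a^2/h - mu_a*a, rho_h*a^2 - mu_h*h
    (basal production neglected, which is an assumption of this sketch)."""
    a = rho_a * mu_h / (rho_h * mu_a)
    h = rho_h * a**2 / mu_h
    return a, h

def gm_jacobian(a, h, rho_a, rho_h, mu_a, mu_h):
    """Jacobian of the kinetics with respect to (a, h)."""
    return np.array([[2 * rho_a * a / h - mu_a, -rho_a * a**2 / h**2],
                     [2 * rho_h * a,            -mu_h]])

def max_growth_rate(J, D_a, D_h, length=1.0, n_modes=60):
    """Largest growth rate over the Neumann modes k = 0, ..., n_modes - 1."""
    D = np.diag([D_a, D_h])
    return max(np.linalg.eigvals(J - (k * np.pi / length) ** 2 * D).real.max()
               for k in range(n_modes))

# Hypothetical parameter values; mu_h > mu_a keeps the homogeneous state stable.
params = dict(rho_a=1.0, rho_h=1.0, mu_a=1.0, mu_h=2.0)
a_star, h_star = gm_steady_state(**params)
J = gm_jacobian(a_star, h_star, **params)

print("steady state (a, h):", (a_star, h_star))
print("fast inhibitor, D_h = 100 * D_a:", max_growth_rate(J, D_a=0.01, D_h=1.0))
print("equal diffusion, D_h = D_a:     ", max_growth_rate(J, D_a=1.0, D_h=1.0))
```

With these values the constant state loses stability only in the first case, consistent with the remark that comparable diffusion rates of activator and inhibitor rule out the Turing mechanism.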
8.1.5 Receptor-based Models
Another type of mathematical model for pattern formation follows the hypothesis that the positional value of the cell is determined by the density of cell-surface receptors, which regulate the expression of genes responsible for cell differentiation [35], see Figure 8.5.
Figure 8.5. Bound receptors density determining “positional value”: the head is formed if the density of bound receptors is high (above some threshold). Consequently, in normal development we expect a gradient-like distribution of bound receptors [Courtesy of W. Müller]. (Labels in the figure: “Increase in receptor density”, “Decrease in receptor density”.)
The receptor-based models are based on the idea that epithelial cells secrete ligands (regulatory biochemicals), which diffuse locally within the interstitial space and bind to free receptors on the cell surface [23, 29]. Binding results in a bound receptor, which can be removed from the cell surface through degradation or internalization, or can dissociate back into a free receptor and a ligand. Both ligands and free receptors are produced within the whole tissue and undergo natural decay. The first receptor-based model for Hydra was proposed by Sherratt, Maini, Jäger and Müller in [40] in the following form (SMJM model),
\[
\begin{aligned}
\frac{\partial a}{\partial t} &= D_a \frac{\partial^2 a}{\partial x^2} + s_a(x) - \mu_a a - k_e a e - k_a a f + k_d b, \\
\frac{\partial f}{\partial t} &= k_d b - k_a a f + k_i \left[ \alpha(x) + \beta b - f \right], \\
\frac{\partial b}{\partial t} &= k_a a f - (k_d + k_i) b, \\
\frac{\partial e}{\partial t} &= D_e \frac{\partial^2 e}{\partial x^2} + s_e(x) - \mu_e e,
\end{aligned}
\tag{8.3}
\]
defined on a bounded one-dimensional domain with zero-flux boundary conditions for $a$ and $e$. The variables $f$, $b$, $a$ and $e$ denote the density of free receptors, bound receptors, biochemical (ligands) and enzyme, respectively. In order to achieve the required
local competition phenomenon, it is assumed in the SMJM model that the terms describing de novo production of free receptors, ligands and enzyme depend on the position of the tissue along the body axis, $x \in [0, L]$, and also on $y(x)$, which denotes the position from which the tissue at location $x$ originates. The functions describing the production of new receptors and the production of enzyme are linearly decreasing in $x$. The production of ligands is assumed to be constant on 4/5 of the domain and then to decrease linearly to zero. It is assumed that
\[
\alpha(x) = \alpha_1 \left[ 1 - y(x)/L \right] + \alpha_2\, y(x)/L, \qquad
s_e(x) = s_1 \left[ 1 - y(x)/L \right] + s_2\, y(x)/L,
\]
while $s_a(x)$ is constant for $y(x) \in [0, 4L/5]$ and decreases linearly to zero for $y(x) \in [4L/5, L]$. The combination of these two parallel gradients enables the model to capture some results of grafting and cutting experiments. Thus, the model functions not because of nonlinear interactions between receptors and ligands, but because of the assumption that cells produce new molecules depending on the position they had in the donor organism. In conclusion, the SMJM model is not a model for de novo pattern formation. Later, receptor-based models that do not impose initial gradients were proposed by Marciniak-Czochra [23]. In general, the equations of such models can be represented by the following initial boundary-value problem,
\[
\begin{aligned}
u_t &= D \Delta u + f(u, v) && \text{in } \Omega, \\
v_t &= g(u, v) && \text{in } \Omega, \\
\partial_n u &= 0 && \text{on } \partial\Omega, \\
u(x, 0) &= u_0(x), \quad v(x, 0) = v_0(x),
\end{aligned}
\tag{8.4}
\]
where $u$ is a vector of variables describing the dynamics of diffusing extracellular molecules and enzymes, which provide cell-to-cell communication, while $v$ is a vector of variables localized on cells, describing cell surface receptors and intracellular signaling molecules, transcription factors, mRNA, etc. $D$ is a diagonal matrix with positive coefficients on the diagonal, the symbol $\partial_n$ denotes the normal derivative (no-flux condition), and $\Omega$ is a bounded region. A rigorous derivation, using methods of asymptotic analysis (homogenization), of the macroscopic reaction-diffusion models describing the interplay between the nonhomogeneous cellular dynamics and the signaling molecules diffusing in the intercellular space has been undertaken in [25, 29]. It is shown that receptor-ligand binding processes can be modeled by reaction-diffusion equations coupled with ordinary differential equations in the case when all membrane processes are homogeneous within the membrane. As shown in [23] and also more recently highlighted in [16], receptor-based models may exhibit Turing-type instability. The simplest receptor-based model takes into account only one type of diffusive signaling molecule. The basic model of this type,
for a one-dimensional epithelial sheet, takes the form
\[
\begin{aligned}
\frac{\partial r_f}{\partial t} &= -\mu_f r_f + p_r(r_f, r_b) - b r_f l + d r_b, \\
\frac{\partial r_b}{\partial t} &= -\mu_b r_b + b r_f l - d r_b, \\
\frac{\partial l}{\partial t} &= \frac{1}{\gamma} \frac{\partial^2 l}{\partial x^2} - \mu_l l - b r_f l + p_l(l, r_b) + d r_b,
\end{aligned}
\tag{8.5}
\]
with zero-flux boundary conditions for $l$, $\partial_x l(t, 0) = \partial_x l(t, 1) = 0$. The model takes into account the dynamics of free and bound receptors on cell membranes, denoted by $r_f(t, x)$ and $r_b(t, x)$, respectively, and of diffusing signaling molecules, denoted by $l(t, x)$. The original model operated with purely hypothetical molecules. However, in the case of Hydra pattern formation we may associate the variables with Frizzled receptors and Wnt ligands [17]. The hypothesis that Wnt is a diffusing ligand is supported by experimental evidence [10, 21]. The effects of intracellular dynamics are modeled via nonlinear functions describing the production of new signaling molecules and free receptors, $p_l$ and $p_r$, respectively. Besides, the kinetics describe binding at the rate $b$, dissociation at the rate $d$ and natural decay at the rates $\mu_f$, $\mu_b$ and $\mu_l$. The parameter $\gamma = L^2 / d_l$ is a scaling coefficient depending on the domain length $L$ and the diffusion coefficient $d_l$. As stated in [28, Proposition 3.1], a generic system of two ordinary differential equations coupled with a reaction-diffusion equation exhibits DDI if there exists a positive, spatially constant steady state for which the following conditions are satisfied:
\[
-\operatorname{tr}(A) > 0, \tag{8.6}
\]
\[
-\operatorname{tr}(A) \sum_{i<j} \det(A_{ij}) + \det(A) > 0, \tag{8.7}
\]
\[
-\det(A) > 0, \tag{8.8}
\]
\[
-\det(A_{12}) > 0, \tag{8.9}
\]
where $A$ denotes the Jacobian matrix of the kinetics system evaluated at the steady state and $A_{ij}$ its principal submatrix formed by the $i$-th and $j$-th rows and columns. […] $> 0$. This inequality can be interpreted as an autocatalysis at the steady state in the first equation of the system. This condition leads to the instability of those constant solutions for which an autocatalysis occurs.
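The conditions (8.6)–(8.9) can be checked numerically for a given Jacobian. The sketch below reads them as the Routh-Hurwitz stability conditions for a 3×3 kinetics Jacobian $A$, combined with a negative determinant of the 2×2 block $A_{12}$ of the two non-diffusing variables; this reading, the function names and the example matrix are assumptions made for illustration and are not taken from [28]. The second helper computes the largest growth rate of a perturbation with wavenumber $k$ when only the third variable diffuses.

```python
import numpy as np


def principal_minor(A, i, j):
    """Determinant of the 2x2 principal submatrix built from rows/columns i, j."""
    idx = [i, j]
    return np.linalg.det(A[np.ix_(idx, idx)])


def satisfies_ddi_conditions(A):
    """Check the four inequalities for a 3x3 Jacobian (third variable diffuses)."""
    tr = np.trace(A)
    det = np.linalg.det(A)
    minors = sum(principal_minor(A, i, j)
                 for i in range(3) for j in range(i + 1, 3))
    # Routh-Hurwitz: the kinetics system alone is linearly stable.
    stable_without_diffusion = (-tr > 0) and (-tr * minors + det > 0) and (-det > 0)
    # The block of the two non-diffusing variables has a negative determinant.
    ode_block_unstable = -principal_minor(A, 0, 1) > 0
    return bool(stable_without_diffusion and ode_block_unstable)


def max_growth_rate(A, diffusion, k):
    """Largest real part of the eigenvalues of A - k^2 * diag(0, 0, diffusion)."""
    D = np.diag([0.0, 0.0, diffusion])
    return np.linalg.eigvals(A - k**2 * D).real.max()


# Hypothetical Jacobian chosen only to illustrate the conditions.
A_EXAMPLE = np.array([[1.0, 0.0, -2.0],
                      [0.0, -2.0, 0.0],
                      [2.0, 0.0, -3.0]])
```

For `A_EXAMPLE` the steady state is stable against spatially homogeneous perturbations (`max_growth_rate(A_EXAMPLE, 1.0, 0.0)` is negative) but unstable for sufficiently large wavenumbers, for instance `k = 4.0`, which is the behaviour referred to as DDI in the text.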
Figure 8.6. Spatial profile of $r_b$ for different initial perturbations and a fixed $\gamma$. On the left-hand side we present the initial condition and on the right-hand side the final pattern originating from such an initial condition. The perturbation in the initial data is of order $10^{-2}$ (it can be arbitrarily small) while the final peak is of height $8 \times 10^4$. (a) Different initial conditions and corresponding solutions are depicted using matching line styles. The location of the peak strongly depends on the initial condition. (b) The results for the initial conditions with two maxima or two minima: the result is always one peak (depicted using matching line styles).
Following the classical Turing idea, one expects stable patterns to appear around the constant steady state in a system with the DDI property. Interestingly, in numerical simulations of model (8.5), diffusion-driven instability of the constant steady state leads to the emergence of growth patterns concentrated around discrete points along the spatial coordinate, which take the mathematical form of spike-type spatially inhomogeneous solutions [23]. The structures are not robust and depend strongly on initial conditions. In some cases, blow-up occurs. The observed solutions are clearly not Turing patterns, see Figure 8.6. Recent analytical studies of reaction-diffusion-ODE models with only one nonzero diffusion coefficient revealed that monotone or periodic stationary solutions can be constructed for most interesting models [8, 26, 27]. However, the same mechanism that destabilizes constant solutions of these models also destabilizes non-constant solutions [26]. Consequently, there exist no stable continuous stationary solutions of the initial boundary-value problems for ODE-PDE systems such as (8.5). While in some other applications such dynamical spike patterns are of biological relevance [27, 28, 38], they cannot be applied to describe pattern formation in Hydra. Considering two diffusing signaling factors leads to a four-variable receptor-based model exhibiting Turing patterns [23]. In this model it is additionally assumed that
there exists a second diffusing substance, functioning as an enzyme as in the SMJM model [40], which is secreted by cells, diffuses along the body column and degrades the ligands. The equations have the following form
\[
\begin{aligned}
\frac{\partial r_f}{\partial t} &= -\mu_f(r_f) + p_r(r_f, r_b) - b(r_f, l) + d(r_b), \\
\frac{\partial r_b}{\partial t} &= -\mu_b(r_b) + b(r_f, l) - d(r_b), \\
\frac{\partial l}{\partial t} &= d_l \frac{\partial^2 l}{\partial x^2} - \mu_l(l) - b(r_f, l) + p_l(r_f, r_b) + d(r_b) - b_e(l, e), \\
\frac{\partial e}{\partial t} &= d_e \frac{\partial^2 e}{\partial x^2} - \mu_e(e) + p_e(l, r_b),
\end{aligned}
\tag{8.10}
\]
with zero-flux boundary conditions for $l$ and $e$. Here $e$ denotes the density of enzyme, $b_e$ the rate of binding of ligands and enzyme, $p_e$ the rate of production of enzyme, $\mu_e$ the rate of decay of the enzyme, $d_e$ is the diffusion coefficient for the enzyme, and the other terms are as in the three-variable model. The role of the enzyme is to remove the biological regulator, i.e., the ligand, before it binds to the receptors on the cell surface. It is important to stress that this model cannot be simplified to an activator-inhibitor system of the type (8.2). The four-variable receptor-based model consists of two subsystems. The reaction-diffusion subsystem describing ligand and enzyme dynamics is not of the activator-inhibitor type and cannot produce diffusion-driven patterns by itself. It is the ODE subsystem that causes destabilization of a constant solution and emergence of a Turing pattern. In such a model, patterns can evolve due to DDI even if no self-enhancement of either free receptors or ligands is assumed. The production term $p_r$ is assumed to be a function of $r_b$, since it is known from a number of other biological contexts that there can exist a positive feedback loop between the density of bound receptors on the cell surface and the subsequent expression of new receptors [15, 44]. Also, no assumption on the range of enzyme diffusion is needed. These observations show that including receptor dynamics in the model of interacting diffusing signaling molecules allows one to relax the assumptions on the range of diffusion and on the type of nonlinear interactions necessary for the formation of stable spatially heterogeneous patterns. In particular, it seems that although the Wnt antagonist Dickkopf found in Hydra tissue [2] does not satisfy the assumptions on the inhibitor from the Gierer–Meinhardt model [34], its interactions with Wnt signaling may lead to stable gradient-like pattern formation as in the receptor-based model.
In the four-variable receptor-based model, similarly to the activator-inhibitor model, the spatially homogeneous steady state bifurcates into a spatially inhomogeneous solution which, for some range of the domain size, has a maximum at one end and a minimum at the other end. This model is robust and the final pattern does not depend on small perturbations of initial conditions. The pattern formation phenomenon is similar to
Figure 8.7. Numerical simulation of the cutting experiment in the four-variable receptor-based model ($r_b$ plotted against $x$). Left-hand panel: Initial data (dashed line) corresponding to a surgical removal of the lower part (half) of the body column. We observe that a reorganization of the “gradient” on a smaller domain corresponds to the formation of a new “foot”. Right-hand panel: Initial data (dashed line) corresponding to a surgical removal of the upper part (half) of the body column. We observe that a reorganization of the “gradient” on a smaller domain corresponds to the formation of a new “head”.
that in the activator-inhibitor model of Gierer and Meinhardt (8.2), with the difference that the maxima of the pattern have the shape of plateaux and not spikes as was the case in (8.2). This is related to the uniform boundedness of solutions in model (8.10). Models with Turing patterns can describe self-organization of Hydra cells (see Figure 8.8, left-hand panel) and are able to simulate the cutting experiments (see Figure 8.7). In conclusion, the introduction of a second diffusing biochemical species improved the performance of the model. The four-variable receptor-based model can explain at least as much as the
Figure 8.8. Left-hand panel: De novo gradient-like pattern formation in the four-variable receptor-based model (bound receptors plotted over $x$ and time). Right-hand panel: Simulation of a transplantation experiment for the initial data corresponding to the head grafting ($r_b$ plotted against $x$). The final distribution shows the transplant disappearance.
activator-inhibitor model regarding de novo pattern formation and basic experiments. Numerical studies of both models based on the Turing mechanism showed that the grafting experiments could not be explained within such an approach without changing the size of the domain (or the diffusion coefficient), which does not reflect experimental conditions (see Figure 8.8, right-hand panel).
8.1.6 Multistability
Transplantation experiments suggest that there are a number of locally asymptotically stable patterns depending on the past history. The patterns may have multiple peaks, which depend on the local cues induced by the grafted tissue. Such experiments suggest a mechanism of pattern formation based on multistability in intracellular signaling. Coupling diffusion with a kinetics system with multiple stable steady states and hysteresis may lead to the coexistence of different patterns for the same parameters, selected by the initial conditions. Such a hypothesis was incorporated in the receptor-based model proposed in [24] by replacing the function $p_l$, describing the rate of production of diffusing signaling molecules, by a new variable modeled using an additional ordinary differential equation. The model includes a hysteresis-based relation in the quasi-stationary state of the ODE subsystem, i.e., $g(u, v) = 0$ in the system of equations (8.4) (see Figure 8.9).
Figure 8.9. A typical configuration of the kinetic functions in a receptor-based model (8.4) with hysteresis in the quasi-stationary ODEs subsystem. (Plot labels: the curves $f(u, v) = 0$ and $g(u, v) = 0$, the branches $v = h_H(u)$ and $v = h_T(u)$, the points $S_0$, $S_1$ and $T$; axes $u$ and $v$.)
The model suggests how the nonlinearities of intracellular signaling may result in spatial patterning. It allows for formation of gradient-like patterns corresponding to the normal development as well as emergence of patterns with multiple maxima describing transplantation experiments (see Figure 8.10). Numerical simulations show the existence of stationary patterns resulting from the existence of multiple steady states and switches in the production rates of diffusing molecules, see Figure 8.10, left-hand panel. The patterns observed in such models are not Turing patterns. In fact, the system does not need to exhibit DDI. Indeed, in most cases its constant steady states do not
Figure 8.10. Simulations of the receptor-based model with hysteresis (bound receptors plotted over $x$ and time; labels “Head” and “Foot”). Formation of a gradient-like pattern corresponding to normal development and head formation in Hydra (left-hand panel) and formation of a two-head pattern for the initial conditions corresponding to the transplantation experiment (right-hand panel).
change stability. In such models, spatially heterogeneous stationary solutions appear far from equilibrium due to the existence of multiple quasi-steady states. Properties of a hysteresis-based mechanism of pattern formation have recently been studied in a minimal version of the model consisting of one reaction-diffusion equation coupled to one ODE [17]. In such a model, infinitely many stationary solutions can be constructed. Such solutions are discontinuous in the nondiffusing variables. The shape of the spatial structures can be very irregular, since it depends strongly on the initial conditions. Therefore, the model can simulate the effects of transplantation experiments and the formation of multiple heads, see Figure 8.10, right-hand panel. On the other hand, it was shown that a system with multistability but reversible quasi-steady states in the ODE subsystem, i.e., $g(u, v)$ globally invertible, cannot exhibit stable spatially heterogeneous patterns. Hysteresis is necessary to obtain stable patterns.
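The role of multiple quasi-steady states can be illustrated with a toy example. The sketch below uses a hypothetical cubic kinetics function $g(u, v) = u + v - v^3$, which is not the nonlinearity of [17] or [24] and is chosen only because its zero set is S-shaped: for small $|u|$ the equation $g(u, v) = 0$ has three real solutions $v$, for large $|u|$ only one. The second function relaxes the non-diffusing variable from different starting values to show that, for the same $u$, the final state depends on the history.

```python
import numpy as np


def quasi_steady_states(u):
    """Real solutions v of g(u, v) = u + v - v**3 = 0 (hypothetical kinetics)."""
    roots = np.roots([-1.0, 0.0, 1.0, u])  # coefficients of -v^3 + 0*v^2 + v + u
    real_roots = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real_roots)


def relax(u, v0, dt=0.01, steps=5000):
    """Integrate dv/dt = g(u, v) by explicit Euler, starting from v0."""
    v = v0
    for _ in range(steps):
        v += dt * (u + v - v**3)
    return v
```

For $u = 0$ the three quasi-steady states are $v = -1, 0, 1$; starting from `v0 = 0.5` the variable settles near $1$, while starting from `v0 = -0.5` it settles near $-1$. In a spatially extended model of type (8.4), neighbouring cells could in this way settle on different branches, which is consistent with the discontinuous stationary solutions mentioned above.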
8.1.7 Discussion
Transplantation and tissue manipulation experiments provided data for models of patterning in Hydra, starting with the positional information ideas of Wolpert [46], the activator-inhibitor model of Gierer and Meinhardt [7, 32–34] and, finally, receptor-based models of Marciniak-Czochra [17, 23, 24]. Each model has shed light on different but overlapping aspects of self-organization and regeneration. It is now possible to state which conceptual elements have to be present in a complete model, although modeling of Hydra regeneration still involves quite a few unsolved problems. In the framework of reaction-diffusion systems there are essentially two ways in which a system of identical cells can start to differentiate:
(i) There is a critical number of cells (size of domain), above which the spatially homogeneous attractor loses stability, which leads to “spontaneous” spatial patterning. This is the case for the models with the Turing instability. Such models can explain de novo pattern formation since, for a given set of parameters and a given domain size, the final pattern is the same and does not depend on the initial perturbation.
(ii) There is an external inducing signal which drives the system into a new, spatially inhomogeneous state. Such a signal originates from another group of already differentiated cells and it must be strong enough to trigger differentiation. It corresponds to a sufficiently strong initial perturbation of the homogeneous steady state. This type of initialization of the pattern-forming mechanism is involved in the model with hysteresis.
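The notion of a critical domain size in the first scenario can be made concrete with a small linear-stability computation. The sketch below assumes a generic two-component reaction-diffusion system on $[0, L]$ with zero-flux boundaries, so that admissible perturbations have wavenumbers $k_n = n\pi/L$; a mode is unstable when the matrix $J - k_n^2 D$ has an eigenvalue with positive real part. The Jacobian `J_EXAMPLE` and the diffusion coefficients `D_EXAMPLE` are hypothetical values chosen for illustration, not parameters of any model discussed in this chapter.

```python
import numpy as np


def unstable_modes(J, D, L, n_max=50):
    """Mode numbers n >= 1 for which J - (n*pi/L)^2 * diag(D) is unstable."""
    modes = []
    for n in range(1, n_max + 1):
        k2 = (n * np.pi / L) ** 2
        growth = np.linalg.eigvals(J - k2 * np.diag(D)).real.max()
        if growth > 0:
            modes.append(n)
    return modes


# Hypothetical activator-inhibitor Jacobian at a homogeneous steady state
# (stable without diffusion: negative trace, positive determinant).
J_EXAMPLE = np.array([[1.0, -1.0],
                      [2.0, -1.5]])
# Hypothetical diffusion coefficients: the inhibitor diffuses much faster.
D_EXAMPLE = [1.0, 20.0]
```

With these illustrative values, `unstable_modes(J_EXAMPLE, D_EXAMPLE, 3.0)` is empty, whereas for `L = 4.0` the first mode is unstable: the homogeneous state loses stability once the domain is long enough to accommodate a wavenumber inside the unstable band.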
The experiments showing de novo formation of Hydra from dissociated cells could suggest a Turing-type mechanism. On the other hand, transplantation experiments suggest the coexistence of different spatially inhomogeneous stationary patterns which develop from different initial conditions. Experiments show that large perturbations (but within the range of the values of the solution itself) of the gradient-like solution should lead to another solution. These observations may be explained by models with multiple steady states exhibiting hysteresis. In such models, solutions depend on the initial condition, just as the Hydra resulting from a grafting experiment depends on the graft position and not on the size of the animal. The question is whether these two kinds of experiments can be explained using the same mechanism and whether it could be a combination of the mechanisms already considered. To clarify these issues, new models including a more detailed description of cell-to-cell and intracellular signaling should be developed. Mathematical understanding of the relation between the structure of nonlinearities and the dynamics of model solutions should be helpful both in building new models and in designing experiments that might help to verify different hypotheses. Many uncertainties exist regarding the biological foundations of the models. Further biological discoveries are needed to gain insight into the molecular nature of cell communication and of the positional value. To understand morphogenesis in Hydra it is necessary to bridge the gap between experimental observations at the cellular level and those at the genetic and biochemical levels. The advent of new techniques in molecular biology has recently made it possible to advance the understanding of the development of multicellular organisms. Large-scale expression screening helps to identify new factors involved in embryonic development.
Recently, expression analysis during regeneration and budding indicated a pivotal role of the Wnt (wingless gene) pathway in the Hydra head organizer [10]. Evidence of Dkk (Dickkopf) signaling in Hydra regeneration has also been provided [2]. Experimentally observed patterns of Wnt and Dkk gene expression give rise to many new questions. New aspects of Wnt-Dkk signaling, such as bistability in Wnt dynamics and switches in the Dkk functionality depending on the cellular context, were also
recently found in other model organisms [14, 18, 22]. Mathematical modeling should integrate these concepts and observations into a new model of pattern formation controlled by the intracellular dynamics of Wnt-Dkk signaling. In conclusion, growth and pattern formation provide a great source of interesting and novel mathematical problems, while mathematics can be used as a tool to explore different mechanisms and processes underlying these phenomena. The use of realistic models may help to understand many complex processes.
Bibliography
[1] A. R. A. Anderson, M. A. J. Chaplain, E. L. Newman, R. J. C. Steele, A. M. Thompson, Mathematical modelling of tumour invasion and metastasis, J. Theor. Med. 2 (2000), 129–154.
[2] R. Augustin, A. Franke, K. Khalturin, R. Kiko, S. Siebert, G. Hemmrich, T. C. Bosch, Dickkopf related genes are components of the positional value gradient in Hydra, Dev. Biol. 296 (2006), 62–70.
[3] R. E. Baker, S. Schnell, P. K. Maini, Waves and patterning in developmental biology: vertebrate segmentation and feather bud formation as case studies, Int. J. Dev. Biol. 53 (2009), 783–794.
[4] T. C. G. Bosch, Hydra and the evolution of stem cells, BioEssays 31 (2009), 478–486.
[5] M. A. J. Chaplain, M. Ganesh, I. G. Graham, Spatio-temporal pattern formation on spherical surfaces: numerical simulation and application to solid tumor growth, J. Math. Biol. 42 (2001), 387–423.
[6] N. Desprat, W. Supatto, P. A. Pouille, E. Beaurepaire, E. Farge, Tissue deformation modulates twist expression to determine anterior midgut differentiation in Drosophila embryos, Dev. Cell 15 (2008), 470–477.
[7] A. Gierer, H. Meinhardt, A theory of biological pattern formation, Kybernetik 12 (1972), 30–39.
[8] Y. Golovaty, A. Marciniak-Czochra, M. Ptashnyk, Stability of nonconstant stationary solutions in a reaction-diffusion equation coupled to the system of ordinary differential equations, Comm. Pure Appl. Anal. 11 (2012), 229–241.
[9] T. Gregor, E. F. Wieschaus, A. P. McGregor, W. Bialek, D. W. Tank, Stability and nuclear dynamics of the bicoid morphogen gradient, Cell 130 (2007), 141–152.
[10] B. Hobmayer, F. Rentzsch, K. Kuhn, C. M. Happel, C. C. Laue, P. Snyder, U. Rothbacher, T. W. Holstein, Wnt signaling and axis formation in the diploblastic metazoan Hydra, Nature 407 (2000), 186–189.
[11] D. Horstmann, From 1970 until present: The Keller-Segel model in chemotaxis and its consequences, Jahresbericht der DMV 105 (2003), 103–165.
[12] S. A. Kauffman, Pattern formation in the Drosophila embryo, Phil. Trans. R. Soc. Lond. 295 (1981), 567–594.
[13] E. F. Keller, L. A. Segel, A model of chemotaxis, J. Theor. Biol. 30 (1971), 225–234.
[14] H. A. Kestler, M. Kühl, Generating a Wnt switch: it’s all about the right dosage, J. Cell. Biol. 193 (2011), 431–433.
[15] M. Kerszberg, Morphogen propagation and action towards molecular models, Semin. Cell. Dev. Biol. 10 (1999), 297–302.
[16] V. Klika, R. E. Baker, D. Headon, E. A. Gaffney, The influence of receptor-mediated interactions on reaction-diffusion mechanisms of cellular self-organization, Bull. Math. Biol., DOI 10.1007/s11538-011-9699-4.
[17] A. Köthe, A. Marciniak-Czochra, Multistability and hysteresis-based mechanism of pattern formation in biology, in: V. Capasso, M. Gromov, N. Morozova (eds.), Pattern Formation in Morphogenesis-problems and their Mathematical Formalization, Springer, Berlin–Heidelberg, 2012, 153–175.
[18] J. Kreuger, L. Perez, A. J. Giraldez, S. M. Cohen, Opposing activities of Dally-like glypican at high and low levels of Wingless morphogen activity, Dev. Cell 7 (2004), 503–512.
[19] S. Krömker, Model and Analysis of Heterogeneous Catalysis with Phase Transition, PhD thesis, University of Heidelberg, 1997.
[20] D. A. Lauffenburger, J. J. Linderman, Receptors. Models for Binding, Trafficking, and Signaling, Oxford University Press, New York, 1993.
[21] B. T. MacDonald, K. Tamai, X. He, Wnt/β-catenin signaling: components, mechanisms, and diseases, Developmental Cell 17 (2009), 9–26.
[22] B. Mao, C. Niehrs, Kremen2 modulates Dickkopf2 activity during Wnt/LRP6 signaling, Gene 302 (2003), 179–183.
[23] A. Marciniak-Czochra, Receptor-based models with diffusion-driven instability for pattern formation in Hydra, J. Biol. Sys. 11 (2003), 293–324.
[24] A. Marciniak-Czochra, Receptor-based models with hysteresis for pattern formation in Hydra, Math. Biosci. 199 (2006), 97–119.
[25] A. Marciniak-Czochra, Strong two-scale convergence and corrector result for the receptor-based model of the intercellular communication, IMA J. Appl. Math. (2012), accepted.
[26] A. Marciniak-Czochra, G. Karch, K. Suzuki, Unstable patterns in reaction-diffusion model of early carcinogenesis, J. Math. Pures et Appliques (2012), accepted.
[27] A. Marciniak-Czochra, M. Kimmel, Modelling of early lung cancer progression: Influence of growth factor production and cooperation between partially transformed cells, Math. Mod. Meth. Appl. Sci. 17 (2007), 1693–1719.
[28] A. Marciniak-Czochra, M. Kimmel, Reaction-diffusion model of early carcinogenesis: The effects of influx of mutated cells, Math. Mod. Natural Phenomena 7 (2008), 90–114.
[29] A. Marciniak-Czochra, M. Ptashnyk, Derivation of a macroscopic receptor-based model using homogenization techniques, SIAM J. Math. Anal. 40 (2008), 215–237.
[30] A. Marciniak-Czochra, M. Ptashnyk, Boundedness of solutions of a haptotaxis model, Math. Mod. Meth. Appl. Sci. 20 (2010), 449–476.
[31] K. Masuda, K. Takahashi, Reaction-diffusion systems in the Gierer-Meinhardt theory of biological pattern formation, Japan J. Appl. Math. 4 (1987), 47–58.
[32] H. Meinhardt, A model for pattern formation of hypostome, tentacles and foot in Hydra: How to form structures close to each other, how to form them at a distance, Dev. Biol. 157 (1993), 321–333.
[33] H. Meinhardt, Turing’s theory of morphogenesis of 1952 and the subsequent discovery of the crucial role of local self-enhancement and long-range inhibition, Interface Focus (2012), doi:10.1098/rsfs.2011.0097.
[34] H. Meinhardt, Modeling pattern formation in hydra - a route to understand essential steps in development, Int. J. Dev. Biol. (2012), doi:10.1387/ijdb.113483hm.
[35] W. A. Müller, Pattern control in hydra: basic experiments and concepts, in: H. G. Othmer, P. K. Maini, J. D. Murray (eds.), Experimental and Theoretical Advances in Biological Pattern Formation, Plenum Press, New York, 1993, 237–253.
[36] W. A. Müller, Developmental Biology, Springer, New York, 1997.
[37] J. D. Murray, Mathematical Biology, 2nd edn., Springer, New York, 2003.
[38] K. Pham, A. Chauviere, H. Hatzikirou, X. Li, H. M. Byrne, V. Cristini, J. Lowengrub, Density-dependent quiescence in glioma invasion: instability in a simple reaction-diffusion model for the migration/proliferation dichotomy, J. Biol. Dyn. 6 (2011), 54–71.
[39] M. Sato, H. Tashiro, A. Oikawa, Y. Sawada, Patterning of hydra cells aggregates without sorting of cells from different axial origins, Dev. Biol. 151 (1992), 111–116.
[40] J. A. Sherratt, P. K. Maini, W. Jäger, W. Müller, A receptor based model for pattern formation in Hydra, Forma 10 (1995), 77–95.
[41] H. Shimizu, Y. Sawada, T. Sugiyama, Minimum tissue size required for hydra regeneration, Dev. Biol. 155 (1993), 287–296.
[42] A. Stevens, The derivation of chemotaxis equations as limit dynamic of moderately interacting stochastic many-particle systems, SIAM J. Appl. Math. 61 (2000), 183–212.
[43] K. Suzuki, I. Takagi, Collapse of patterns and effect of basic production terms in some reaction-diffusion systems, GAKUTO Internat. Ser. Math. Sci. Appl. 32 (2010), 168–187.
[44] J. R. Tata, Autoinduction of nuclear hormone receptors during metamorphosis and its significance, Insect. Biochem. Mol. Biol. 30 (2000), 645–651.
[45] A. M. Turing, The chemical basis of morphogenesis, Phil. Trans. Roy. Soc. B 237 (1952), 37–72.
[46] L. Wolpert, Positional information and the spatial pattern of cellular differentiation, J. Theor. Biol. 25 (1969), 1–47.
Author Information Anna Marciniak-Czochra, Institute of Applied Mathematics, Interdisciplinary Center of Scientific Computing (IWR) and BIOQUANT, University of Heidelberg, Heidelberg, Germany E-mail: [email protected]
9
Modeling the Dynamics of Genetic Mechanism, Pattern Formation, and the Genetics of “Geometry”
Robert S. Anderssen, Maureen P. Edwards and Sergiy Pereverzyev Jr.
9.1 Modeling the Positioning of Trichomes on the Leaves of Plants
Abstract. A continuing and future challenge in plant science is the “genetics of geometry” [3]: the recovery of information about the dynamics of the genetic mechanisms by which plants control the development of various features of their geometry. Some representative publications dealing with such issues include: (i) the modeling of plant architecture using L-systems and rewriting [18], (ii) the genetic control of floral development [4,10], and (iii) the positioning of the trichomes (hairs) on the leaves of plants such as Arabidopsis thaliana [23, 25]. It is the positioning of trichomes which is examined in this chapter. The use of reaction-diffusion models is compared with cellular signaling and switching models. It is concluded that, in performing simulations to understand the dynamics of the mechanisms that control pattern formation in plants, it is necessary to work with a cellular model of the plant organ being studied in order to improve on current understanding about how the genetics controls the signaling and switching between cells to produce the observed patterns. Keywords. Genetics of Geometry, Hexagonal Recursion, Leaf, Plant, Reaction-diffusion, Trichome 2010 Mathematics Subject Classification. 35K57, 92C15, 92C80
9.1.1 Introduction The importance of plants (and insects) in the study of the genetics and biology of all organisms relates to the fact that information recovered about the developmental biology of plants can be utilized, through bioinformatics, to improve on current understanding about the developmental biology of non-plants. This is a direct consequence of the revolution in molecular biology that has followed the publication of the double helix interpretation of heredity [33], and the consequential technological revolution associated with full genome sequencing of key organisms (e.g., Arabidopsis), with the discovery and exploitation of gene silencing, and with the fact, being exploited as an essential aspect of bioinformatics, that genes in different organisms with similar DNA sequences are often associated with similar phenotypes and roles. In addition, experimentation with plants is less expensive and ethically more acceptable than with mammals.
9 The Genetics of “Geometry”
A key example is the genetics of developmental biology. As detailed in the published literature, there has been a two-way street between the genetic studies of the developmental biology of plants and fruit flies [2]. This leads naturally to the study of pattern formation. Biologically, it is important from both theoretical and practical perspectives. For the breeding of plants, there is a need to know which genes control the geometry of plants so that new varieties allow for ease of harvesting and protection against wind damage. Theoretically, any research which yields an enhanced understanding of pattern formation contributes to improving the science of genetics. An early illustration of the fact that complex biological developmental processes can be simulated using simple algorithms was given by Young [38] in his modeling of the growth of the pea plant. This is consistent with the Wolpert hypothesis [37] that “It is clear that the egg contains not a description of the adult, but a program for making it, and this program may be simpler than the description. Relatively simple cellular forces can give rise to complex changes in form; it seems simpler to specify how to make complex shapes than to describe them.” The relevance and importance of this observation, especially from a mathematical modeling perspective, is reflected in the fact that any system which is robust can be modeled by the simple model which captures the essence of the phenomenon involved [5]. In addition, the recent research on the crocheting of structures with hyperbolic surfaces [30], of which corals are real-world examples, represents independent validation for the Wolpert hypothesis. The importance of Young’s contribution is that he argued and showed that an appropriate algorithm for the growth of the pea must not only be simple but also be able to generate, with small changes in key parameters, the known mutants. 
In fact, it represents a generic comment about modeling developmental processes in biology and, indirectly, about modeling pattern formation. Consequently, in order to illustrate the role of mathematics in the life sciences, the motivation for this chapter is an examination of how to construct simple models for the positioning of trichomes on the leaves of plants. Various models have been proposed including (a) activator-inhibitor reaction-diffusion (AIRD) PDE models [13], (b) cellular signaling and switching models [7, 25], (c) phenomenological genetic models, i.e., logical descriptions of the gene activity in terms of diagrammatic models, and (d) systems of ordinary differential equations which model the known genetics and a biological interpretation of the roles of the genes [6].
9.1 Modeling the Positioning of Trichomes on the Leaves of Plants
It is known that, in the positioning of trichomes, there are not only mutants with no trichomes or with sparse, irregularly spaced trichomes but also ones with random clumping of the trichomes. Consequently, a good test of a model of a developmental process is whether that model can, with a simple change in parameterization, simulate both the wild type and known mutants such as the clumping. Here, the clumping mutant test is chosen as it highlights a clear difference between the various models and gives strong support to the use of the cellular signaling and switching ones.

Comment. As already mentioned, there is an extensive literature which discusses trichome positioning on Arabidopsis leaves where theoretical modeling is matched with experimental details which include the assumed activity of genes known to be involved. The emphasis ranges from the philosophical and experimental to the highly technical. The challenge is the identification of a framework within which logical conclusions can be made about how the genes orchestrate the resulting pattern. It is more than simply saying that such a system can generate patterns and is therefore the mechanism involved. In fact, the situation is sometimes confused because of the failure to draw a clear distinction between the simulation of pattern formation by a complex mathematical model, for which there is no natural biological mechanistic interpretation, and the formulation of simple combinatorial algorithms (such as Young’s pea growth model) that have a biological mechanistic structure. Digiuni et al. [6] acknowledge the importance of modeling the essential biological mechanism. They propose a coupled theoretical/experimental approach based on an ordinary differential equation model of the time evolution of key genes.

The chapter has been organized in the following manner. The role and limitations of AIRD PDEs in modeling the positioning of the trichomes are examined in Section 9.1.2.
Here, their failure in being able, with a small change in parameterization, to generate clumping mutants is explained. These limitations represent motivation for cellular signaling and switching models. The hexagonal recursion implementation [25] is introduced and discussed in Section 9.1.3. Using simulations, it is shown how it can generate, with small changes in the control parameters, the wild type and various mutants including the clumping ones. It is stressed that, for the structure of an algorithm of a developmental process, a minimum requirement must be that mutants can be generated by changing the key parameters that control the development of the wild type. Examples include the turning ON or OFF of a parameter to simulate the ON-OFF (or OFF-ON) switching of a key gene, or, as in Young’s model of pea development, the accumulating of a signaling molecule (hormone) which controls change from one state to another. The chapter concludes in Section 9.1.4 with a discussion about cellular modeling as a basis for the identification of the dynamics of mechanisms controlling various aspects of pattern formation in plants.
9.1.2 Activator-inhibitor Reaction-diffusion Modeling of the Trichome Positioning

For the modeling of the positioning of trichomes (hairs) on the leaves of plants (in particular Arabidopsis thaliana), activator-inhibitor reaction-diffusion modeling has been proposed as a framework within which to interpret the known genetics [15, 26, 27]. The motivation was the publication by Gierer and Meinhardt [9] of their activator-inhibitor mechanism for biological pattern formation, since this mechanism directly yields a structure in which to interpret the formation of patterns, such as the positioning of trichomes, in terms of cellular and molecular processes. Their emphasis was on explaining how an activator-inhibitor mechanism could be the essential control of biological development and the associated pattern formation. Meinhardt subsequently explained how this concept could be applied to the modeling of specific situations [21] including pattern formation on shells [20] and in plants [19]. The application of the Gierer and Meinhardt activator-inhibitor interpretation to the positioning of trichomes from a genetic perspective followed [13, 15, 26, 27] once the developing molecular biology technology allowed for the easier identification of the genes controlling the differences between the wild type and mutants.

The motivation for the formulation of activator-inhibitor mechanisms was the seminal publication by Turing [32] of a mathematical theory of chemical morphogenesis. Turing’s conceptualization was profound because of its essential simplicity. His ansatz was that if, in the absence of diffusion, the reaction dynamics were such that the solution tended to a linearly stable uniform steady state, then, under appropriate conditions, the full reaction-diffusion system would generate spatially inhomogeneous patterns if the diffusion was destabilizing.
The importance of Turing’s 1952 paper, published the year before Watson and Crick’s double helix paper [33], is that conditions were derived for which such destabilization would occur. The fact that such a process could generate patterns with a wide range of complexity which agreed with observed patterns in the physical and biological sciences became the stimulus for the subsequent explosion in reaction-diffusion research from both a practical [22] and theoretical perspective [12].

Remark. With respect to the above comments, it is appropriate to mention that reaction-diffusion partial differential equations first arose in the study of biological invasion, and the nature of the dynamics being modeled there is quite different from that resulting from Turing’s chemical morphogenesis theory. Biological invasion has an interesting history starting with the paper by Fisher in 1937 [8], subsequently followed by different research contributions by Kolmogorov, Weinberger and others [1, 16, 17, 29, 31, 34].

Comment. The extent to which order and localization are essential features of the patterns that can be generated by reaction-diffusion systems has been examined from a number of independent perspectives. Lacalli [14] compared the patterns of cell wall growth in unicellular algae with a two morphogen reaction-diffusion model (Tyson’s
Brusselator) and concluded that the simulation of the growth could be viewed as a reaction-diffusion process. Holloway and Harrison [11] used the Clark and Evans $R$ parameter and the radial distribution function $g(t)$ to examine and characterize localization and order in reaction-diffusion patterns. Their discussion included a comparison of reaction-diffusion models with the inhibiting field concept of Wigglesworth [35]. Important as such publications are, they do not examine mathematically the essential framework within which reaction-diffusion systems generate patterns, nor the biological limitations that are thereby imposed. As explained in Murray [22, Chapter 14], for a general two-component system, which includes the Gierer and Meinhardt [9] activator-inhibitor mechanism as a special case, the essence of the associated mathematics involves the following steps:

(i) The Two-Component Reaction-Diffusion System. On some two-dimensional region $V$, the general non-dimensional form of a two-component reaction-diffusion system is given by

$$u_t = f(u, v) + \nabla^2 u, \qquad v_t = g(u, v) + d\,\nabla^2 v, \tag{9.1}$$

where $u = u(x, y, t)$, $v = v(x, y, t)$, $u_t = \partial u/\partial t$ and $v_t = \partial v/\partial t$, with zero flow conditions on the boundary $\partial V$ and given initial conditions on $V$

$$(\mathbf{n} \cdot \nabla) \begin{pmatrix} u \\ v \end{pmatrix} = 0; \qquad u(x, y, 0),\ v(x, y, 0) \ \text{given},$$

where $d$ denotes the ratio of the diffusion coefficients and can be given various interpretations including being the ratio of the relative strengths of the reaction and the diffusion.

(ii) The Reaction Terms. Depending on the application, the non-dimensional reaction terms $f(u, v)$ and $g(u, v)$ will take different forms. In terms of studied simple two-component systems, the applications include Schnakenberg’s [28] two-species, chemically plausible, tri-molecular reaction, Thomas’ real empirical substrate-inhibition system (Murray [22, Chapter 5]) and the activator-inhibitor mechanism of Gierer and Meinhardt [9]. That for Gierer and Meinhardt takes the form

$$f(u, v) = a - bu + \frac{u^2}{v}, \qquad g(u, v) = u^2 - v, \qquad a, b \ \text{constants}.$$

Its special dynamics is the result of the interplay between the linear and quadratic terms driven by the autocatalytic term $u^2/v$. The other mentioned applications have a generically similar structure and thereby can be given an activator-inhibitor interpretation in terms of an interplay between linear and quadratic terms driven by an activator.
(iii) Linear Stability of Steady State. The pure reaction kinetics takes the following autonomous form (because no explicit spatial variation is involved in the definition of the reaction terms)

$$u_t = f(u, v), \qquad v_t = g(u, v). \tag{9.2}$$

Linearization of this equation about the steady state solution $(u_0, v_0)$ (i.e., the solution of the steady state equations $f(u, v) = 0$ and $g(u, v) = 0$) yields

$$\mathbf{w}_t = A\mathbf{w}, \qquad A = \begin{pmatrix} f_u & f_v \\ g_u & g_v \end{pmatrix}_{(u_0, v_0)}, \qquad A \ \text{the stability matrix}, \tag{9.3}$$

where $\mathbf{w} = (u - u_0, v - v_0)^T$ is the perturbation about the steady state. A standard analysis (cf. Murray [22], Section 14.3), which seeks the conditions for which $\mathbf{w} \propto \exp(\lambda t)$, yields the following conditions, which guarantee linear stability (i.e., $\operatorname{Re} \lambda < 0$):

$$\operatorname{trace}(A) = f_u + g_v < 0, \qquad \operatorname{determinant}(A) = f_u g_v - g_u f_v > 0. \tag{9.4}$$
(iv) Destabilizing Diffusion. On returning to the full reaction-diffusion system of equations (9.1), it is now necessary to perform a linear stability analysis of these equations about the steady state solution of this system. The details, not unexpectedly, are involved. However, in essence, it is now necessary to satisfy not only the condition (9.4) but also the conditions that arise in the stability analysis of the full system. Within the resulting set of constraints, it is possible for instability to occur for just one of the wave numbers $k$ that define the full solution of the linearized reaction-diffusion system. This leads naturally to the conclusion that a discrete pattern forms with the discreteness controlled by the relevant value of $k$.

For the reaction-diffusion modeling of pattern formation in plants, the connection to the associated biology and genetics is orchestrated via the assumption that the molecular dynamics controlling the pattern formation is an activator-inhibitor mechanism. This then imposes the constraint that the known genes involved must be interpreted from this perspective. In this mechanism, the activator tries to find a point in the pattern domain where it can locally dominate the inhibitor. At such a point, the pattern feature (e.g., trichome) will arise. The local nature of the dominance of the activator does not allow more than one activator to be present near the pattern feature activation point, and this in turn limits clumping effects.

Though such modeling is able to reproduce some of the data for trichome positioning, it has a number of drawbacks:

(i) The pattern arises as an instability in the reaction-diffusion process which is difficult to explain biologically.
(ii) The control of its parameterization to produce mutants is problematic and the resulting parameter changes are hard to interpret biologically.
(iii) It struggles to reproduce the clumping mutant.
(iv) The reaction-diffusion equations model an outward moving wave which is not consistent with the assumption that the fate of the leaf is decided at the meristem.

An alternative modeling concept, which captures cell communication and allows clumping effects, is proposed in the next section.
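The linear analysis behind steps (iii) and (iv), on which the drawbacks listed above rest, can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the Gierer and Meinhardt kinetics $f = a - bu + u^2/v$, $g = u^2 - v$ are taken from the text, but the parameter values $a = 0.1$, $b = 1$, the diffusion ratios and the grid of squared wave numbers are arbitrary choices, and all function names are hypothetical. For a perturbation proportional to $\exp(\lambda t)$ with squared wave number $k^2$, the growth rates $\lambda$ are the eigenvalues of $A - k^2\,\mathrm{diag}(1, d)$, which is the standard linearization of (9.1).

```python
import cmath


def gm_steady_state(a, b):
    """Steady state (u0, v0) and partial derivatives (fu, fv, gu, gv) of the
    kinetics f = a - b*u + u**2/v, g = u**2 - v, evaluated at (u0, v0)."""
    u0 = (a + 1.0) / b        # g = 0 gives v = u**2; then f = 0 gives a - b*u + 1 = 0
    v0 = u0 ** 2
    fu = -b + 2.0 * u0 / v0   # d f / d u
    fv = -(u0 / v0) ** 2      # d f / d v
    gu = 2.0 * u0             # d g / d u
    gv = -1.0                 # d g / d v
    return (u0, v0), (fu, fv, gu, gv)


def stable_without_diffusion(jac):
    """Conditions (9.4): trace(A) < 0 and determinant(A) > 0."""
    fu, fv, gu, gv = jac
    return (fu + gv < 0) and (fu * gv - gu * fv > 0)


def max_growth_rate(jac, d, k2):
    """Largest Re(lambda) among the eigenvalues of A - k2 * diag(1, d)."""
    fu, fv, gu, gv = jac
    tr = (fu - k2) + (gv - d * k2)
    det = (fu - k2) * (gv - d * k2) - fv * gu
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(((tr + root) / 2).real, ((tr - root) / 2).real)


def unstable_band(jac, d, k2_values):
    """Squared wave numbers on the grid for which perturbations grow."""
    return [k2 for k2 in k2_values if max_growth_rate(jac, d, k2) > 0]


if __name__ == "__main__":
    # Illustrative (assumed) parameter values, not taken from the chapter.
    (u0, v0), jac = gm_steady_state(a=0.1, b=1.0)
    k2_values = [i / 100 for i in range(201)]
    print("stable without diffusion:", stable_without_diffusion(jac))
    for d in (1.0, 2.0, 20.0):
        band = unstable_band(jac, d, k2_values)
        print("d =", d, "unstable k^2 range:",
              (min(band), max(band)) if band else None)
```

In this sketch the steady state is stable in the absence of diffusion, and growth appears only when $d$ is large enough, and then only for a bounded band of wave numbers. On a finite domain only a few admissible $k$ fall inside such a band, which is why a pattern obtained this way has a characteristic spacing and why isolated clumps are hard to obtain by a small parameter change.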
9.1.3 Hexagonal Recursion

The following discussion of hexagonal recursion and its application to modeling the positioning of trichomes on the leaves of plants is based on the earlier deliberations of [7]. The basic assumption on which that modeling was based is that the fate of the cells on the leaf of a plant is determined as it grows out of the meristem, as illustrated in Figure 9.1 (a). The structure of the epidermal surface of a leaf is assumed to take the form of a hexagonal array. This is in keeping with the results in Table 1 of [15], where the average number of sides of the cells is approximately six. In general, plant cells are not hexagonal; however, the cells do tend to have a structure that is topologically similar to a hexagonal array. For example, a running rectangular brick array is topologically equivalent to a hexagonal array. It is assumed that the concentration of the signal controlling whether a cell becomes a trichome accumulates according to the additive rule of Figure 9.1 (b). In the sequel, this formula will be referred to as the “hexagonal recursion”. The numerical values generated by this recursion will be referred to as “hexagonal concentration values” (HCV).

For this model, a number of important biological constraints are automatically taken into account:

(i) The model is cellular, and not some macroscopic model that has smoothed out the cellular details.
(ii) The fate of a cell is determined by its neighbors [24].
(iii) As implied in various papers by Wolpert such as [36], any cellular model must respect the known positional information behavior in the biological development of that part of an organism that is being modeled.
(iv) Because of the directional nature of the hexagonal recursion, in defining how the concentration of the key signal accumulates, the model automatically involves a polar transport mechanism.
(v) Trichomes do not form in boundary cells of an Arabidopsis leaf [26].
Figure 9.1. (a) A hexagonal cell approximation of the epidermal cells on the upper side of a leaf. The top hexagonal cell corresponds to the tip of the leaf. The red line represents the meristem out of which the leaf is growing. Further details can be found in [25]. (b) The localized additive relationship, $C^* = w_l C_1 + w_c C_2 + w_r C_3$, that models how the concentration of the signal in the cells $C_1$, $C_2$ and $C_3$ above the meristem determines its concentration in the cell $C^*$ forming at the meristem.
In Section 9.1.3.1, the hexagonal recursion is defined. It must be robust in that small changes in the value of the defining parameters give only small changes in the patterns generated. This does not rule out the fact that large changes in the parameters will give large changes in the patterns. In Section 9.1.3.2, patterns are generated which are of wild type. In Section 9.1.3.3, by varying a particular parameter in the hexagonal recursion, an extensive range of synthetic mutants is generated.

9.1.3.1 The Hexagonal Recursion Rule

The hexagonal recursion rule for determining the HCV of a cell and the consequential positioning of trichomes is given by:

R1. Tiles located on the periphery of the leaf have the “boundary” HCV $P_0$.

R2. The leaf starts growing from the three tip hexagons in the manner indicated in Figure 9.1 (b). The HCV $C^*$ of the cell forming at the meristem is determined by

$$C^* = w_l C_1 + w_c C_2 + w_r C_3, \qquad w_l \ge 0,\ w_c \ge 0,\ w_r \ge 0, \tag{9.5}$$

where $w_l + w_c + w_r \ge 1$, which guarantees that, until reset, the HCV increases in the cell(s) forming at the meristem.

R3. This process continues progressively for each triple of hexagonal cells above the meristem as the leaf step-by-step grows out of the meristem.

R4. With respect to a specified threshold $T_1 \ge P_0$, if a cell, forming at the meristem as indicated in Figure 9.1 (b), has HCV $C^* \ge T_1$, then the fate of that cell is set to be a trichome.

R5. With respect to a specified threshold $T_2 \ge T_1$, if, in a triple of hexagonal cells above the one that they are involved in forming at the meristem, at least one of them has HCV greater than or equal to $T_2$, then the HCV $C^*$ is reset to be $P_0$.
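Rules R1 to R5 can be sketched in a few lines of code. The following is a minimal illustration and not the authors' implementation; several choices are assumptions made only for the sketch: the hexagons are indexed by "doubled" coordinates $(x, y)$ with $x + y$ even, so that the cells $C_1$, $C_2$, $C_3$ of Figure 9.1 (b) sit at $(x-1, y-1)$, $(x, y-2)$ and $(x+1, y-1)$; the leaf is a wedge that widens from the tip until it reaches a maximum half-width; any cell lacking one of its three upstream cells is treated as periphery (R1) and is never marked as a trichome, in line with constraint (v); and the reset of R5 is tested before the threshold of R4. The function names are hypothetical.

```python
def hexagonal_recursion(height, half_width, wl, wc, wr, T1, T2, P0=1):
    """Grow a wedge-shaped hexagonal leaf from the tip (0, 0) downwards.

    Returns (hcv, trichomes): a dict mapping cell -> HCV and the set of
    cells whose fate is set to trichome.
    """
    hcv = {}
    trichomes = set()
    for y in range(height):                      # growth step (half-row index)
        reach = min(y, half_width)
        for x in range(-reach, reach + 1):
            if (x + y) % 2:                      # not a cell in doubled coordinates
                continue
            upstream = [(x - 1, y - 1), (x, y - 2), (x + 1, y - 1)]  # C1, C2, C3
            if any(cell not in hcv for cell in upstream):
                hcv[(x, y)] = P0                 # R1: periphery keeps the boundary value
                continue
            c1, c2, c3 = (hcv[cell] for cell in upstream)
            if max(c1, c2, c3) >= T2:
                hcv[(x, y)] = P0                 # R5: reset, trichome switched off
                continue
            value = wl * c1 + wc * c2 + wr * c3  # R2: recursion (9.5)
            hcv[(x, y)] = value
            if value >= T1:
                trichomes.add((x, y))            # R4: fate set to trichome
    return hcv, trichomes


def has_clump(trichomes):
    """True if some trichome has another trichome among its upstream neighbors."""
    return any((x + dx, y + dy) in trichomes
               for (x, y) in trichomes
               for dx, dy in ((-1, -1), (0, -2), (1, -1)))


if __name__ == "__main__":
    # Weights and boundary value as quoted in the captions of Figures 9.2 and 9.4;
    # the leaf size (12 half-rows, half-width 6) is an arbitrary choice.
    for T1, T2 in ((20, 20), (10, 20)):
        _, tri = hexagonal_recursion(12, 6, wl=4, wc=2, wr=1, T1=T1, T2=T2)
        print("T1 =", T1, "T2 =", T2,
              "trichomes:", len(tri), "clumping:", has_clump(tri))
```

Because the geometry here is a simplification, the output is not expected to reproduce the published figures cell for cell; the sketch is meant only to show how the two thresholds act as ON and OFF switches within a single pass over the growing leaf.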
Two trivial asymptotic situations can arise from this hexagonal recursion. When the first threshold value $T_1 = P_0$, the boundary value, and the second threshold value $T_2 \to \infty$, the ON threshold is always met and the OFF threshold is never met, and so trichomes are produced in every cell, including the boundary. The other trivial case arises when the threshold value $T_1$ is so high that it is never met and so no trichomes are ever activated.

9.1.3.2 Wild Type Patterns

The recursion rule is used with the OFF threshold value $T_2$ equal to the ON threshold value $T_1$. This means that as soon as a cell reaches the threshold value to produce a trichome, any adjacent cells below have their value reset to the initial value $P_0$, and the possibility of a trichome is “switched off”. In Figures 9.2 and 9.3, the HCVs have been determined according to the hexagonal recursion. The colour of the cell represents the corresponding HCV with white corresponding to zero and black to the maximal value one less than the threshold value $T_1$ (i.e., $T_1 - 1$). The cells which become trichomes, where the HCV matches or exceeds the threshold value $T_1$, are marked green.

Figure 9.2. Final distributions generated by the hexagonal recursion with $T_1 = T_2$ and (a) $T_1 = 20$, (b) $T_1 = 100$, (c) $T_1 = 500$, with weights $w_l = 4$, $w_c = 2$, $w_r = 1$ and boundary value $P_0 = 1$. (Note that the shading changes as the value of $T_1$ changes.)

Figure 9.3. Final distributions generated by the hexagonal recursion with $T_1 = T_2$ and (a) $T_1 = 10$, (b) $T_1 = 50$, (c) $T_1 = 500$, with weights $w_l = 4$, $w_c = 1$, $w_r = 0$ and boundary value $P_0 = 1$.

It can be shown that if symmetric weights (i.e., $w_l = w_r$) are chosen, symmetric patterns will be produced. These types of patterns are not presented and all cases illustrated have non-symmetric weights $w_l \ne w_r$. Even so, artifactual, symmetric-like patterns (e.g., Figure 9.3 (a)) can form. It is also possible to generate diagonal patterns (e.g., Figures 9.2 (b) and 9.3 (b)). In Figure 9.2, the value of the switching thresholds $T_1\,(= T_2)$ is increased with all other parameter values fixed. This leads to a delay in the appearance of cells producing trichomes. Increasing the value of $T_1$ (and $T_2$) will lead to the first trichome production being located further down the leaf. This pattern is also observed in Figure 9.3. From Figures 9.2 and 9.3, it can also be seen that as the threshold value $T_1\,(= T_2)$ increases, the number of cells producing trichomes decreases and the pattern becomes more sparse. Further increases in $T_1$ will result in less trichome production, with the limiting case of no trichome production occurring as $T_1 \to \infty$.

9.1.3.3 Mutant Patterns

As is clear from the discussion in Section 9.1.3.2, there is no possibility of generating clumps of trichomes when $T_2 = T_1$, since as soon as a cell reaches the threshold value to produce a trichome, any adjacent cells below have their value reset to the initial value $P_0$, and the possibility of a trichome is “switched off”. If, however, a cell that has produced a trichome has an HCV that has not reached or exceeded the second, higher threshold $T_2$, then the cells forming below it are not reset, and neighboring cells can also become trichomes. This establishes that two distinct thresholds are required to produce clumping and, consequently, parameter choices with $T_2 > T_1$ are required to generate mutants.
Figures 9.4 and 9.5 both have $w_l \ne w_r$ and the resulting patterns are generally not symmetric. It is obvious that clumping of trichomes is possible in the hexagonal recursion model when $T_2 > T_1$. Holding the threshold value $T_1$ fixed and increasing the value of $T_2$ leads to a thickening of the groups of cells producing trichomes. Increasing the threshold value $T_1$ will result in the initial production of trichomes occurring further down the leaf. Setting the right-hand weight $w_r = 0$ leads to a trichome pattern which is pushed toward the right-hand side of the leaf.

Figure 9.4. Final distributions generated by the hexagonal recursion with $T_1 = 10$ and (a) $T_2 = 20$, (b) $T_2 = 100$, (c) $T_2 = 300$, with weights $w_l = 4$, $w_c = 2$, $w_r = 1$ and boundary value $P_0 = 1$.

Figure 9.5. Final distributions generated by the hexagonal recursion with $T_1 = 60$ and (a) $T_2 = 100$, (b) $T_2 = 400$, (c) $T_2 = 800$, with weights $w_l = 4$, $w_c = 1$, $w_r = 0$ and boundary value $P_0 = 1$.
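The role of the second threshold can be followed by hand over the first few cells below the tip. The arithmetic below uses the weights $w_l = 4$, $w_c = 2$, $w_r = 1$ and boundary value $P_0 = 1$ quoted for Figure 9.4, together with the additive rule of Figure 9.1 (b). The leaf outline is an assumption made only for this illustration: the three tip cells and every cell on the outer edge are taken to hold $P_0$, and the leaf is taken to widen by one cell on each side at every step.

```latex
% First interior cell: all three cells above it are tip cells holding P_0 = 1.
\[ C^* = 4\cdot 1 + 2\cdot 1 + 1\cdot 1 = 7 . \]
% Cell forming to its lower left: C_1 and C_2 are periphery cells, C_3 = 7.
\[ C^* = 4\cdot 1 + 2\cdot 1 + 1\cdot 7 = 13 . \]
% Cell forming to its lower right: C_1 = 7, C_2 and C_3 are periphery cells.
\[ C^* = 4\cdot 7 + 2\cdot 1 + 1\cdot 1 = 31 . \]
% Wild type, T_1 = T_2 = 20: only 31 >= T_1, so a single trichome forms; every
% cell with that trichome in its triple sees a value >= T_2 and is reset to P_0.
% Mutant, T_1 = 10, T_2 = 20: both 13 and 31 reach T_1, so both are trichomes.
% The cell to the lower left of the 13-cell has C_1 = C_2 = 1 (periphery) and
% C_3 = 13 < T_2, so it is not reset:
\[ C^* = 4\cdot 1 + 2\cdot 1 + 1\cdot 13 = 19 \ge T_1 , \]
% and it becomes a trichome adjacent to an existing one, i.e. a clump.
% The cell with triple (13, 7, 31) is reset to P_0 because 31 >= T_2.
```

Under these assumptions the same three starting values give an isolated trichome when $T_2 = T_1$ and adjacent trichomes when $T_2 > T_1$, which is the switching behavior described in Sections 9.1.3.2 and 9.1.3.3.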
9.1.4 Conclusions

As conjectured in Pereverzyev and Anderssen [25], having a notional model of the type formulated above yields a new framework for performing the biocombinatorial sorting of the genes known to be involved into biomechanistic categories. The traditional approach, as exemplified in Digiuni et al. [6], is basically biological as the modus operandi is driven by comparative genetics based on differences in phenotype between mutants and wild type. Appealing to a complex mathematical differential equation model is of little assistance unless that model has, or can be related explicitly to, the genetic/biological processes occurring. In addition, the formulation of a differential equation model that simulates the observed biological dynamics, though interesting and informative, is not a proof of, or a framework for the analysis of, the biomechanistic dynamics occurring. An alternative strategy, which is being proposed here, is the formulation of simple models, defined in terms of variables which can be given specific biological meaning (e.g., the concentrations of key hormones), which yield a mechanistic framework within which the comparative genetics is performed. As already mentioned, the earlier pea leaf modeling of Young [38] represents an excellent model system of the process being proposed. A notional protocol for doing this in the context of the positioning of trichomes on the leaves of plants can be found in Section 5 of Pereverzyev and Anderssen [25].
Acknowledgments. The authors are grateful to the referees whose comments have assisted in the improvement of this chapter. Publications by the authors related to earlier versions of this research have been acknowledged in the text. This chapter represents an extension of and further development of their ideas in their MODSIM 2011 Conference paper entitled “Modelling pattern formation in plants”.
Bibliography

[1] D. G. Aronson, H. F. Weinberger, Multidimensional non-linear diffusion arising in population-genetics, Advances in Math. 30 (1978), 33–76.
[2] E. Coen, The Art of Genes, Oxford University Press, Oxford, 1999.
[3] E. Coen, A. G. Rolland-Lagan, M. Matthews, J. A. Bangham, P. Prusinkiewicz, The genetics of geometry, PNAS 101 (2004), 4728–4735.
[4] M.-L. Cui, L. Copsey, A. A. Green, J. A. Bangham, E. Coen, Quantitative control of organ shape by combinatorial gene activity, PLOS Bio. 8 (2010), e1000538.
[5] F. R. de Hoog, Why are simple models often appropriate in industrial mathematics? in: 18th World IMACS/MODSIM Congress, Cairns, July 13-17, Proceedings (2009), 23–36.
[6] S. Digiuni, S. Schellmann, F. Geier, B. Greese, M. Pesch, K. Wester, B. Dartan, V. Mach, B. P. Srinivas, J. Timmer, C. Fleck, M. Hulskamp, A competitive complex formation mechanism underlies trichome patterning on Arabidopsis leaves, Mol. Systems Bio. 4, Article Number 217 (2008).
[7] M. P. Edwards, S. Pereverzyev Jr., R. S. Anderssen, Modelling pattern formation in plants, in: MODSIM 2011 Congress, Perth, December 12-16, Proceedings (2011), 378–384.
[8] R. A. Fisher, The wave of advance of advantageous genes, Annals Eugenics 7 (1937), 355–369.
[9] A. Gierer, H. Meinhardt, Theory of biological pattern formation, Kybernetik 12 (1972), 30–39.
[10] A. A. Green, R. Kennaway, A. I. Hanna, J. A. Bangham, E. Coen, Genetic control of organ shape and tissue polarity, PLOS Biology 8, Article Number e1000537 (2010).
[11] D. M. Holloway, L. G. Harrison, Order and localization in reaction-diffusion pattern, Physica A - Stat. Mech. and Appl. 222 (1995), 210–233.
[12] R. B. Hoyle, Pattern Formation: An Introduction to Methods, Cambridge University Press, Cambridge, 2006.
[13] M. Hulskamp, Plant trichomes: A model for cell differentiation, Nature Rev. Molecular Cell Biol. 5 (2004), 471–480.
[14] T. C. Lacalli, Dissipative structures and morphogenetic pattern in unicellular algae, Phil. Trans. Roy. Soc. London – B. Bio. Sciences 294 (1981), 547–588.
[15] J. C. Larkin, N. Young, M. Prigge, M. D. Marks, The control of trichome spacing and number in Arabidopsis, Development 122 (1996), 997–1005.
[16] B. T. Li, M. A. Lewis, H. F. Weinberger, Existence of traveling waves for integral recursions with nonmonotone growth functions, J. Math. Biology 58 (2009), 323–338.
[17] B. T. Li, H. F. Weinberger, M. A. Lewis, Spreading speeds as slowest wave speeds for cooperative systems, Math. Biosci. 196 (2005), 82–98.
[18] A. Lindenmayer, Developmental algorithms for multicellular organisms: Survey of L-systems, J. Theor. Biology 54 (1975), 3–22.
[19] H. Meinhardt, Models of pattern formation and their application to plant development, in: P. W. Barlow and D. J. Carr (eds.), Positional Control in Plant Development, Chapter 1, Cambridge University Press, Cambridge, (1984), 1–32.
[20] H. Meinhardt, M. Klingler, A model for pattern-formation on the shells of mollusks, J. Theor. Bio. 126 (1987), 63–89.
[21] H. Meinhardt, Models of biological pattern formation: From elementary steps to the organization of embryonic axes, in: S. Schnell, P. K. Maini, S. A. Newman, T. J. Newman (eds.), Multiscale Modelling of Developmental Systems, Current Topics in Developmental Biology 81, pp. 1–63, 2008, 9th Biocomplexity Workshop, Bloomington, IN, May, 2006.
[22] J. D. Murray, Mathematical Biology, Springer, Berlin, 1989.
[23] C. M. O’Keefe, S. Pereverzyev Jr., R. S. Anderssen, The algebra of hexagonal numbers, The Mathematical Scientist 36 (2011), 1–9.
[24] R. I. Pennell, Q. C. B. Cronk, S. Forsberg, C. Stohr, L. Snogerup, P. Kjellbom, P. F. McCrae, Cell-context signalling, Phil. Trans. Roy. Soc. London, B-Biological Sci. 350 (1995), 87–93.
[25] S. Pereverzyev Jr., R. S. Anderssen, Recursive algebraic modelling of gene signalling, communication and switching, RICAM Report 24 (2008), 17.
[26] S. Schellmann, A. Schnittger, V. Kirik, T. Wada, K. Okada, A. Beermann, J. Thumfahrt, G. Jurgens, M. Hulskamp, TRIPTYCHON and CAPRICE mediate lateral inhibition during trichome and root hair patterning in Arabidopsis, EMBO J. 21 (2002), 5036–5046.
[27] B. Scheres, Plant patterning: TRY to inhibit your neighbors, Current Bio. 12 (2002), R804–R806.
[28] J. Schnakenberg, Simple chemical-reaction systems with limit-cycle behavior, J. Theor. Biol. 81 (1979), 389–400.
[29] N. Shigesada, K. Kawasaki, Biological Invasion: Theory and Practice, Oxford University Press, Oxford, 1997.
[30] D. Taimina, Crocheting Adventures with Hyperbolic Planes, A. K. Peters Ltd., Wellesley, MA, 2009.
[31] H. R. Thieme, Density-dependent regulation of spatially distributed populations and their asymptotic speed of spread, J. Math. Biology 8 (1979), 173–187.
[32] A. M. Turing, The chemical basis of morphogenesis, Phil. Trans. R. Soc. London Ser. B-Biol. Sci. 237 (1952), 37–72.
[33] J. D. Watson, F. H. C. Crick, Molecular structure of nucleic acids – A structure for deoxyribose nucleic acid, Nature 171 (1953), 737–738.
[34] H. F. Weinberger, Long-time behavior of a class of biological models, SIAM J. Math. Anal. 13 (1982), 353–396.
[35] V. B. Wigglesworth, Local and general factors in the development of “pattern” in Rhodnius prolixus (hemiptera), J. Exp. Bio. 17 (1940), 180–200.
[36] L. Wolpert, Positional information and spatial pattern of cell differentiation, J. Theoretical Biol. 25 (1969), 1–47.
[37] L. Wolpert, The Development of Pattern and Form in Animals, Carolina Biology Readers, No. 51, Carolina Biological Supplies, Burlington, NC, (1977), 1–16.
[38] J. P. W. Young, Pea leaf morphogenesis – a simple-model, Annals Botany 52 (1983), 311–316.
Author Information Robert S. Anderssen, CSIRO Mathematics, Informatics and Statistics, Canberra, Australia E-mail: [email protected] Maureen P. Edwards, School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, Australia E-mail: [email protected] Sergiy Pereverzyev Jr., Industrial Mathematics Institute, Johannes Kepler University Linz, Linz, Austria E-mail: [email protected]
10
Statistical Modeling in Life Sciences and Direct Measurements
Illya Likhtarov, Sergii Masiuk, Mykola Chepurny, Alexander Kukush, Sergiy Shklyar, Andre Bouville and Lina Kovgan
10.1 Error Estimation for Direct Measurements in May–June 1986 of $^{131}$I Radioactivity in Thyroid Gland of Children and Adolescents and Their Registration in Risk Analysis
Abstract. A statistical model of thyroid gland radioactivity measurements is proposed. The measurement error is of classical type and heteroscedastic. Its variance can be reliably estimated. A model of thyroid exposure dose is constructed that involves both classical and Berkson errors. Two methods are proposed to deal with dose uncertainty in risk analysis: (a) parametric calibration, where the true doses are assumed log-normally distributed, and (b) nonparametric calibration, where the form of dose distribution is not specified. Keywords. Berkson Measurement Error, Classical Measurement Error, Exposure Dose Uncertainty, Radioactivity Measurement, Regression Calibration, Thyroid Cancer 2010 Mathematics Subject Classification. 62P10
10.1.1 Introduction
In May and June 1986 millions of inhabitants of Ukraine, Belarus and Russia were exposed to radioactive fallout caused by the Chornobyl accident. The most substantial exposure was that of the thyroid, resulting from the fallout of iodine radioisotopes, first of all 131I (cf. Likhtarev et al. [9, 13, 15]). Within 5–6 years after the accident, a sharp increase in thyroid cancer cases was already observed among children and adolescents who lived in the territories where the estimated exposure doses for this organ were quite large [1, 5, 12, 25]. In fact, the increasing thyroid cancer prevalence among children and adolescents was caused by internal exposure to Chornobyl fallout. This was the main statistically significant remote effect
This work was supported by the Ukrainian Radiation Protection Institute of the Academy of Technological Science and by the Scientific Center for Radiation Medicine of the Academy of Medical Science of Ukraine, and by the intramural program of the U.S. National Cancer Institute, NIH, DHHS.
of the Chornobyl accident. It is not surprising that this phenomenon was of great interest to radioepidemiologists all over the world, which resulted in a series of studies in Ukraine, Belarus and Russia [6, 14, 24, 26]. The strong interest in this problem is also due to the fact that quite extensive and reliable information was available about the risk of radiation-induced thyroid cancer resulting from external exposure of this organ (cf. Ron et al. [22]). For internal exposure, there are not enough data about the magnitude of the risk [6, 14, 16, 24, 26]. The interpretation of results in most radioepidemiological investigations presented in the papers cited above was founded on a series of general approaches, first of all concerning estimation of the acting factor, i.e., the exposure dose. Those approaches are the following:
It was accepted that estimates of exposure dose contain uncertainty that is, as a rule, considerable;
Even if it was possible to determine the size of the errors in the dose estimates, the applied analytic tools of risk analysis ignored this circumstance;
In practically all radioepidemiological studies, no instrumental measurements were used in the process of their dosimetric support.
As a result of general properties of dosimetric support listed above, in analytic procedures of risk analysis only the stochastic nature of thyroid cancer cases was taken into account, whereas the exposure doses were supposed to be estimated without error. Moreover, one of the most popular instruments of such risk analysis, the computer package EPICURE (Preston et al. [20]) ignores the fact that the exposure doses contain significant uncertainty. Investigations performed by the authors in [8] showed that dose uncertainties can be quite correctly taken into account within risk analysis. Some difficulty is that the main sources of dose uncertainties are related to errors in estimation of the mass of exposed organ, level of the measured activity, and ecological dose component. In Kukush et al. [8] it is shown that the thyroid mass measurements contain Berkson error and the instrumental measurements contain the classical error. While it is easy to estimate Berkson error using Monte Carlo methods, cf. Likhtarev et al. [15], for the classical error estimation a special analysis is needed. The present investigation is devoted exactly to this problem. The paper is organized as follows. Section 10.1.2 contains the main results and includes the procedure of direct radioactivity measurements, the measurement error structure, the dose model, and the proposed methods of dealing with dose uncertainty in risk analysis. Section 10.1.3 concludes. A quadratic approximation to the conditional expectation of the latent variable given the observed variable is presented in Appendix 10.1.4. The approximation can be used within nonparametric calibration methods in risk analysis. In the paper, E and var denote expectation and variance, respectively.
10.1.2 Materials and Methods
10.1.2.1 Direct Measurements of 131I Radioactivity in the Thyroid
In May–June 1986, more than 150 000 measurements of 131I content in the thyroid were made within the territory of Ukraine for inhabitants of three northern Oblasts of Ukraine¹ who suffered from the most intensive radioactive nuclide fallout, including 115 000 measurements among children and adolescents aged 0 to 18 years, as in Likhtarev et al. [10, 11]. Measurements were made by special accident brigades under the general guidance of the Health Minister of Ukraine. Advisory support was provided by a group of specialists from the Leningrad Institute of Sea Transport Hygiene. The group elaborated a general measurement method and provided the brigades with reference 131I sources for calibration of instruments. At the beginning of the mass measurements of 131I content in the thyroid, significant numbers of adolescents from affected regions were taken to traditional summer holiday locations in southern, less affected Oblasts of the country, and the population of the 30 km zone around the Chornobyl Nuclear Power Plant was completely evacuated. Therefore, 47 000 measurements were made within the territory of 10 Oblasts that were quite far from the Chornobyl Nuclear Power Plant, whereas 103 000 measurements were made within the three northern Oblasts of Ukraine where there was considerable radionuclide pollution. In the monitoring, about 100 instruments with NaI(Tl) scintillation detectors were used. About one third of them (27 instruments) were single-channel impulse radiometers of 5 different types that worked in the regime of impulse accumulation (the typical accumulation time for different instruments varied from 15 to 200 seconds). The most popular instruments (about 65) were integral (not energy-selective) radiometers SRP 68-01 that worked in the count rate regime. The electronics of those instruments made it possible to show continuously on the pointer indicator the number of impulses accumulated during intervals of nearly 5 seconds.
Typically, the measurement scheme was as follows. The measurements were made in well-ventilated rooms with damp cleaning every hour. In order to decrease background exposure, the instrumental detectors were shielded with lead collimators, factory-made (for energy-selective instruments) or hand-made (for SRP 68-01 instruments). An instrumental detector was brought to the person's neck, and a single reading was taken and recorded in the list. Once an hour or once a day the background exposure was measured at the same point. Those measurements were recorded in the list as well. In order to calibrate an instrument, every hour or every day a reading was taken from a bottle phantom containing a reference 131I solution. Unfortunately, some of the SRP 68-01 instruments were not calibrated at all.
¹ The territory of Ukraine is administratively divided into 26 Oblasts with an approximate area of 20–30 thousand square km each.
10.1.2.2 Error Estimation of Direct Measurements
It is known (cf. Gol’danskiy et al. [3] as well as Ruark and Devol [23]) that for fixed intensity $n$ of a radioactive source, the probability to register $k$ readings on a measuring instrument, e.g., on a Geiger–Mueller counter, during the time period $t$ is determined by the Poisson law $\mathrm{Pois}(nt)$ with parameter $nt$,
$$p_n(k) = \frac{(nt)^k}{k!}\, e^{-nt}, \qquad k = 0, 1, 2, \dots \tag{10.1}$$
Based on (10.1) and the methods of activity² measurement of 131I in the thyroid that were described above, we get
$$Q^{mes} = K^{mes}\left(\frac{k_{th}}{t_{th}} - f_{sh}\,\frac{k_{bg}}{t_{bg}}\right), \tag{10.2}$$
where $Q^{mes}$ is a measured value of 131I activity in the thyroid, $K^{mes}$ is a measured value of the device calibration coefficient, $k_{th}$ is the number of impulses registered by the device in the process of 131I activity measurement in the thyroid during the time interval $t_{th}$, $k_{bg}$ is the number of impulses registered by the device in the process of background activity measurement during the time interval $t_{bg}$, and $f_{sh}$ is a factor of background shielding (the degree of weakening of the background during measurements of a subject).
² Hereafter for brevity we write activity instead of radioactivity and source instead of radiosource.
Because for large enough $n$ the Poisson distribution (10.1) is close to normal (cf. Molina [19]),
$$\mathrm{Pois}(nt) \approx N(nt,\, nt), \tag{10.3}$$
and one can write
$$n^{mes}_{th} \sim N(n^{tr}_{th},\, \sigma_{th}^2), \qquad n^{mes}_{bg} \sim N(n^{tr}_{bg},\, \sigma_{bg}^2), \tag{10.4}$$
where
$$n^{mes}_{th} = \frac{k_{th}}{t_{th}}, \qquad n^{mes}_{bg} = \frac{k_{bg}}{t_{bg}}, \qquad \sigma_{th}^2 = \frac{n^{tr}_{th}}{t_{th}}, \qquad \sigma_{bg}^2 = \frac{n^{tr}_{bg}}{t_{bg}},$$
and the index tr denotes the true value of a quantity.
Besides the statistical registration error, the variables $n^{mes}_{th}$ and $n^{mes}_{bg}$ contain an additional instrumental error as well, and we denote its variance by $\sigma_{dev}^2$. Using (10.3) one can estimate the total variances of 131I intensities in the thyroid gland and of background intensities,
$$\hat\sigma_{th}^2 = \frac{n^{mes}_{th}}{t_{th}} + \sigma_{dev}^2, \qquad \hat\sigma_{bg}^2 = \frac{n^{mes}_{bg}}{t_{bg}} + \sigma_{dev}^2. \tag{10.5}$$
Due to the way the measuring instrument was calibrated, an approximate relation holds
$$K^{mes} \approx K^{tr}(1 + \sigma_K \gamma_1), \qquad \gamma_1 \sim N(0, 1), \tag{10.6}$$
where $\sigma_K$ is estimated based on the error of the standard 131I source and the instrumental error. Using (10.6) together with (10.4) and (10.5), formula (10.2) is rewritten as
$$Q^{mes} \approx K^{tr}(1 + \sigma_K \gamma_1)\,(n^{tr}_{th} - f_{sh}\, n^{tr}_{bg} + \sigma_n \gamma_2), \tag{10.7}$$
with $\sigma_n = \sqrt{\hat\sigma_{th}^2 + f_{sh}^2\, \hat\sigma_{bg}^2}$. From (10.7) we get
$$Q^{mes} \approx K^{tr}\bigl(n^{tr}_{th} - f_{sh}\, n^{tr}_{bg} + (n^{tr}_{th} - f_{sh}\, n^{tr}_{bg})\,\sigma_K \gamma_1 + \sigma_n \gamma_2 + \sigma_K \sigma_n \gamma_1 \gamma_2\bigr). \tag{10.8}$$
Because
$$Q^{tr} = K^{tr}(n^{tr}_{th} - f_{sh}\, n^{tr}_{bg}), \tag{10.9}$$
we obtain after substitution of (10.9) in (10.8) that
$$Q^{mes} \approx Q^{tr} + K^{tr}\bigl(\sigma_n \gamma_2 + (n^{tr}_{th} - f_{sh}\, n^{tr}_{bg})\,\sigma_K \gamma_1 + \sigma_K \sigma_n \gamma_1 \gamma_2\bigr) \approx Q^{tr} + \sigma_Q \gamma,$$
with
$$\sigma_Q = K^{tr}\sqrt{\sigma_n^2 + \sigma_n^2 \sigma_K^2 + (n^{tr}_{th} - f_{sh}\, n^{tr}_{bg})^2 \sigma_K^2}, \qquad \gamma \sim N(0, 1).$$
Because $n^{tr}_{th}$ and $n^{tr}_{bg}$ are unknown, we estimate $\sigma_Q$ by
$$\sigma_Q^{mes} = K^{mes}\sqrt{\sigma_n^2 + \sigma_n^2 \sigma_K^2 + (n^{mes}_{th} - f_{sh}\, n^{mes}_{bg})^2 \sigma_K^2}.$$
Finally, we get the observation model of activity with additive error
$$Q^{mes} = Q^{tr} + \sigma_Q^{mes} \gamma. \tag{10.10}$$
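The chain from the raw counts to the measured activity and its error standard deviation, i.e., (10.2), (10.5) and the estimate of the error term in (10.10), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `activity_with_error` and its argument names are hypothetical, the symbols `sigma_K` and `sigma_dev` stand for the relative calibration error and the instrumental error of the text, and all numerical inputs in the usage line are made up.

```python
import math


def activity_with_error(k_th, t_th, k_bg, t_bg, K_mes, f_sh, sigma_K, sigma_dev=0.0):
    """Return (Q_mes, sigma_Q_mes) for one thyroid measurement.

    Hypothetical helper: follows (10.2) for the activity, (10.5) for the
    count-rate variances and the plug-in estimate of sigma_Q used in (10.10).
    """
    n_th = k_th / t_th                      # measured count rate over the thyroid
    n_bg = k_bg / t_bg                      # measured background count rate
    var_th = n_th / t_th + sigma_dev ** 2   # total variance of n_th, see (10.5)
    var_bg = n_bg / t_bg + sigma_dev ** 2   # total variance of n_bg, see (10.5)
    var_n = var_th + f_sh ** 2 * var_bg     # sigma_n^2
    net = n_th - f_sh * n_bg                # background-corrected count rate
    Q_mes = K_mes * net                     # measured activity, see (10.2)
    sigma_Q = K_mes * math.sqrt(var_n + var_n * sigma_K ** 2 + net ** 2 * sigma_K ** 2)
    return Q_mes, sigma_Q


# Illustrative call with made-up counts, times and calibration values.
Q_mes, sigma_Q_mes = activity_with_error(
    k_th=1000, t_th=100.0, k_bg=400, t_bg=100.0, K_mes=2.0, f_sh=1.0, sigma_K=0.1
)
```

The standard deviation grows with the net count rate whenever `sigma_K` is positive, which is the heteroscedasticity mentioned in the abstract.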
10.1.2.3 Dose Model
Likhtarev et al. [13, 15] show that the true individual thyroid dose of the $i$-th person from a cohort consisting of $N$ persons is equal to
$$D_i^{tr} = f_i^{tr}\, Q_i^{tr} / M_i^{tr}, \tag{10.11}$$
where $M_i^{tr}$ is the true thyroid mass, $f_i^{tr}$ is a multiplier obtained using an ecological model of radiation transition along trophic chains, and $Q_i^{tr}$ is the true 131I activity in the thyroid. Denote $f_i^{tr} / M_i^{tr} = F_i^{tr}$. Then relation (10.11) takes the form
$$D_i^{tr} = F_i^{tr}\, Q_i^{tr}. \tag{10.12}$$
But the true dose $D_i^{tr}$ is unknown, because the parameters $F_i^{tr}$ and $Q_i^{tr}$ are unknown. Only the measured dose is given,
$$D_i^{mes} = F_i^{mes}\, Q_i^{mes}. \tag{10.13}$$
Here, the relation between $F_i^{tr}$ and $F_i^{mes}$ is described by additive³ Berkson error: $F_i^{tr} = F_i^{mes} + \delta_i$, $E\delta_i = 0$, and $F_i^{mes}$ and $\delta_i$ are stochastically independent; $Q_i^{mes}$ is the measured thyroid activity that can be written in the form (see formula (10.10))
$$Q_i^{mes} = Q_i^{tr} + \sigma_{Q,i}^{mes}\, \gamma_i, \qquad i = 1, \dots, N, \tag{10.14}$$
where $\gamma_1, \dots, \gamma_N$ are standard normal variables, the value $\sigma_{Q,i}^{mes}$ is given and nonrandom, and the variables $\gamma_i$, $Q_i^{tr}$, $i = 1, \dots, N$, are jointly independent. The empirical distribution of the multiplier $F_i^{tr}$ and its characteristics (expectation, variance, etc.) can be obtained by the Monte Carlo procedures described in Likhtarev et al. [15].
We now consider several methods of accounting for measurement error in radiation risk estimation.
10.1.2.4 Methods of Dose Uncertainty Registration in Risk Analysis
There exist several methods for how to include errors in independent variables in the procedure of regression parameters estimation, see [2, 8, 17, 18, 21]. But the simplest and the most popular method is regression calibration. Its popularity is related to the fact that after calibration (that is, substitution of an independent variable with its conditional expectation) it is possible to utilize standard procedures of regression parameters estimation and to use for that purpose the computer package EPICURE. There exist two kinds of calibration, parametric and nonparametric. Both are considered in the present paper.
The main idea of regression calibration (see Carroll et al. [2] and Kukush et al. [8]) is as follows. In the radiation risk model (see Health Risks from Exposure to Low Levels of Ionizing Radiation [4]), instead of the true doses $D_i^{tr}$, their conditional expectations are used,
$$E(D_i^{tr} \mid D_i^{mes}). \tag{10.15}$$
Substitute (10.12) and (10.13) in (10.15) and obtain
$$E(D_i^{tr} \mid D_i^{mes}) = E(F_i^{tr} Q_i^{tr} \mid D_i^{mes}) = E(F_i^{mes} Q_i^{tr} \mid D_i^{mes}).$$
Denote $\bar D_i^{tr} = F_i^{mes} Q_i^{tr}$ and get
$$E(D_i^{tr} \mid D_i^{mes}) = E(\bar D_i^{tr} \mid D_i^{mes}).$$
Using (10.14) we obtain
$$D_i^{mes} = F_i^{mes} Q_i^{mes} = F_i^{mes}\,(Q_i^{tr} + \sigma_{Q,i}^{mes} \gamma_i) = \bar D_i^{tr} + F_i^{mes} \sigma_{Q,i}^{mes} \gamma_i. \tag{10.16}$$
³ The error can be either additive or multiplicative. In this case it is not so important, and the main requirement is that the equality $E(F_i^{mes}) = E(F_i^{tr})$ holds.
Random variables $\{\delta_i,\ i \ge 1\}$, $\{\gamma_i,\ i \ge 1\}$ and vectors $\{(F_i^{mes}, Q_i^{tr}),\ i \ge 1\}$ are jointly independent, while $F_i^{mes}$ and $Q_i^{tr}$ could be correlated. Denote $\sigma_i = F_i^{mes} \sigma_{Q,i}^{mes}$; then (10.16) takes the form
$$D_i^{mes} = \bar D_i^{tr} + \sigma_i \gamma_i. \tag{10.17}$$
In fact (10.17) is a dose observation model with classical error.
The parametric calibration method assumes that the sample $\bar D_i^{tr}$, $i = 1, \dots, N$, comes from a known distribution (or the latter can be estimated reliably). Because the radiation doses of thyroids are essentially positive, and their distribution is left-asymmetric, see Likhtarev et al. [15], a log-normal distribution provides a good approximation. Therefore, we suppose that
$$\log \bar D^{tr} \sim N(\mu_{\bar D^{tr}},\, \sigma^2_{\bar D^{tr}}). \tag{10.18}$$
If the measured doses $D_i^{mes}$, $i = 1, \dots, N$, are given, then the parameters of the distribution (10.18) can be easily estimated. First, we estimate $m_{\bar D^{tr}} = E\bar D^{tr}$ and $v_{\bar D^{tr}} = \mathrm{var}(\bar D^{tr})$,
$$\hat m_{\bar D^{tr}} = \frac{1}{N} \sum_{i=1}^{N} D_i^{mes}, \qquad \hat v_{\bar D^{tr}} = \frac{1}{N-1} \sum_{i=1}^{N} (D_i^{mes} - \hat m_{\bar D^{tr}})^2 - \frac{1}{N} \sum_{i=1}^{N} \sigma_i^2.$$
Using relations for the moments of a log-normal distribution, see Korolyuk et al. [7], it is easy to construct the estimators of $\mu_{\bar D^{tr}}$ and $\sigma^2_{\bar D^{tr}}$:
$$\hat\mu_{\bar D^{tr}} = \log\left(\frac{(\hat m_{\bar D^{tr}})^2}{\sqrt{\hat v_{\bar D^{tr}} + (\hat m_{\bar D^{tr}})^2}}\right), \qquad \hat\sigma^2_{\bar D^{tr}} = \log\left(\frac{\hat v_{\bar D^{tr}}}{(\hat m_{\bar D^{tr}})^2} + 1\right).$$
If the sequence $\{\frac{1}{N} \sum_{i=1}^{N} \sigma_i^2,\ N \ge 1\}$ is bounded, then those estimators are strongly consistent, that is, converge to $\mu_{\bar D^{tr}}$ and $\sigma^2_{\bar D^{tr}}$ a.s., as $N \to \infty$. In what follows the parameters $\mu_{\bar D^{tr}}$ and $\sigma^2_{\bar D^{tr}}$ will be assumed known.
Let $\rho_{tr}$ be a pdf of $\bar D_i^{tr}$, $\rho_{mes}$ be a pdf of $D_i^{mes}$, $\rho_{tr,mes}$ be a joint pdf of $\bar D_i^{tr}$ and $D_i^{mes}$, and $\rho_\gamma$ be a pdf of $\sigma_i \gamma_i$. Then
$$\rho_{tr,mes}(\bar D_i^{tr}, D_i^{mes}) = \rho_{tr}(\bar D_i^{tr})\, \rho_\gamma(D_i^{mes} - \bar D_i^{tr}),$$
and the conditional pdf is
$$\rho_{tr \mid mes}(\bar D_i^{tr}) = \frac{\rho_{tr,mes}(\bar D_i^{tr}, D_i^{mes})}{\rho_{mes}(D_i^{mes})} = \frac{\rho_{tr}(\bar D_i^{tr})\, \rho_\gamma(D_i^{mes} - \bar D_i^{tr})}{\int_0^\infty \rho_{tr,mes}(t, D_i^{mes})\, dt}.$$
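The moment estimators of the log-normal parameters introduced after (10.18) are simple enough to sketch in code. The snippet below is a minimal illustration, assuming the measured doses and the error standard deviations of model (10.17) are available as plain Python lists; the function name `lognormal_params` and the guard against a non-positive corrected variance are assumptions of this sketch, not part of the original method description.

```python
import math


def lognormal_params(d_mes, sigma):
    """Method-of-moments estimates (mu_hat, sigma2_hat) of the log-normal law (10.18).

    d_mes : measured doses D_i^mes
    sigma : error standard deviations sigma_i from model (10.17)
    """
    n = len(d_mes)
    m_hat = sum(d_mes) / n
    sample_var = sum((d - m_hat) ** 2 for d in d_mes) / (n - 1)
    # Subtract the average error variance to estimate var of the error-free dose.
    v_hat = sample_var - sum(s ** 2 for s in sigma) / n
    if v_hat <= 0.0:
        # Assumption of this sketch: refuse to proceed when the correction
        # wipes out the whole sample variance.
        raise ValueError("corrected variance estimate is not positive")
    mu_hat = math.log(m_hat ** 2 / math.sqrt(v_hat + m_hat ** 2))
    sigma2_hat = math.log(v_hat / m_hat ** 2 + 1.0)
    return mu_hat, sigma2_hat
```

By construction the returned pair reproduces the estimated mean and variance through the usual log-normal moment relations, which gives a quick consistency check on any implementation.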
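This implies that
$$E(\bar D_i^{tr} \mid D_i^{mes}) = \int_0^\infty t\, \rho_{tr \mid mes}(t)\, dt = \frac{1}{\rho_{mes}(D_i^{mes})} \int_0^\infty t\, \rho_{tr}(t)\, \rho_\gamma(D_i^{mes} - t)\, dt,$$
with
$$\rho_{mes}(D_i^{mes}) = \int_0^\infty \rho_{tr}(t)\, \rho_\gamma(D_i^{mes} - t)\, dt.$$
Because of (10.18),
$$\rho_{tr}(t) = \frac{1}{t \sqrt{2\pi}\, \sigma_{\bar D^{tr}}} \exp\left(-\frac{(\log t - \mu_{\bar D^{tr}})^2}{2 \sigma^2_{\bar D^{tr}}}\right).$$
Then
$$\rho_{mes}(D_i^{mes}) = \int_0^\infty \frac{1}{t \sqrt{2\pi}\, \sigma_{\bar D^{tr}}} \exp\left(-\frac{(\log t - \mu_{\bar D^{tr}})^2}{2 \sigma^2_{\bar D^{tr}}}\right) \frac{1}{\sqrt{2\pi}\, \sigma_i} \exp\left(-\frac{(D_i^{mes} - t)^2}{2 \sigma_i^2}\right) dt. \tag{10.19}$$
Similarly,
$$\int_0^\infty t\, \rho_{tr}(t)\, \rho_\gamma(D_i^{mes} - t)\, dt = \int_0^\infty \frac{1}{\sqrt{2\pi}\, \sigma_{\bar D^{tr}}} \exp\left(-\frac{(\log t - \mu_{\bar D^{tr}})^2}{2 \sigma^2_{\bar D^{tr}}}\right) \frac{1}{\sqrt{2\pi}\, \sigma_i} \exp\left(-\frac{(D_i^{mes} - t)^2}{2 \sigma_i^2}\right) dt. \tag{10.20}$$
We change variables in integrals (10.19) and (10.20):
$$dz = \frac{1}{t \sqrt{2\pi}\, \sigma_{\bar D^{tr}}} \exp\left(-\frac{(\log t - \mu_{\bar D^{tr}})^2}{2 \sigma^2_{\bar D^{tr}}}\right) dt, \qquad z = \int_0^t \frac{1}{s \sqrt{2\pi}\, \sigma_{\bar D^{tr}}} \exp\left(-\frac{(\log s - \mu_{\bar D^{tr}})^2}{2 \sigma^2_{\bar D^{tr}}}\right) ds = G(t), \qquad t = G^{-1}(z). \tag{10.21}$$
Because $z = G(t)$ is a cdf of a log-normal law,
$$z(0) = 0, \qquad z(\infty) = 1. \tag{10.22}$$
We plug (10.21) and (10.22) into (10.19) and (10.20) and get
$$\rho_{mes}(D_i^{mes}) = \int_0^1 \frac{1}{\sqrt{2\pi}\, \sigma_i} \exp\left(-\frac{(D_i^{mes} - G^{-1}(z))^2}{2 \sigma_i^2}\right) dz, \tag{10.23}$$
$$\int_0^\infty t\, \rho_{tr}(t)\, \rho_\gamma(D_i^{mes} - t)\, dt = \int_0^1 \frac{G^{-1}(z)}{\sqrt{2\pi}\, \sigma_i} \exp\left(-\frac{(D_i^{mes} - G^{-1}(z))^2}{2 \sigma_i^2}\right) dz. \tag{10.24}$$
In spite of $G^{-1}(z) \to +\infty$ as $z \to 1$, the integrals (10.23) and (10.24) are proper, because the integrand can be defined as 0 by continuity at the point $z = 1$. To compute the integrals we use the identity
$$G^{-1}(z) = \exp\bigl(\mu_{\bar D^{tr}} + \sigma_{\bar D^{tr}}\, \Phi^{-1}(z)\bigr),$$
where $\Phi(z)$ stands for the cdf of the standard normal law. The function $\Phi^{-1}(z)$ is tabulated, and therefore one can use the trapezoid formula to compute the integrals (10.23) and (10.24).
Within nonparametric calibration we do not parametrize the distribution of the doses $\bar D_i^{tr}$, $i = 1, \dots, N$. Therefore, in order to compute the conditional expectation $E(\bar D_i^{tr} \mid D_i^{mes})$, one can use a polynomial approximation in powers of $D_i^{mes}$ as described in the Appendix, or estimation methods of the distribution of $\bar D^{tr}$ based on the maximum likelihood function presented in Kukush et al. [8].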
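For the parametric calibration described above, the conditional expectation is the ratio of the integrals (10.24) and (10.23), both taken over the unit interval. The following sketch evaluates that ratio with the trapezoid rule on a uniform grid in $z$. It is an illustration only: the function name `calibrated_dose`, the default grid size and the use of `statistics.NormalDist().inv_cdf` from the Python standard library in place of tabulated values of the inverse standard normal cdf are choices made here, and the accuracy of such a plain grid for extreme parameter values has not been examined.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal law; inv_cdf plays the role of Phi^{-1}


def calibrated_dose(d_mes, sigma_i, mu, sigma, n_nodes=2001):
    """Approximate E(D_tr | D_mes) as the ratio of (10.24) to (10.23).

    mu, sigma : parameters of the log-normal law (10.18)
    sigma_i   : standard deviation of the classical error in (10.17)
    The common factor h / (sqrt(2*pi) * sigma_i) cancels in the ratio.
    """
    h = 1.0 / (n_nodes - 1)
    num = 0.0
    den = 0.0
    # The node z = 1 is skipped: both integrands equal 0 there by continuity.
    for j in range(n_nodes - 1):
        if j == 0:
            t, weight = 0.0, 0.5  # G^{-1}(0) = 0; trapezoid end-point weight
        else:
            t = math.exp(mu + sigma * _STD_NORMAL.inv_cdf(j * h))
            weight = 1.0
        kernel = math.exp(-((d_mes - t) ** 2) / (2.0 * sigma_i ** 2))
        den += weight * kernel       # integrand of (10.23)
        num += weight * t * kernel   # integrand of (10.24)
    return num / den
```

When the error is small relative to the spread of the doses, the result stays close to the measured dose; when the error is large, the result is pulled towards the bulk of the assumed log-normal distribution.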
10.1.3 Conclusion and Discussion
In the present paper we constructed a statistical model of radioactivity measurements. It was shown that the measurements always contain a random additive error of the classical type. It was also shown that the measurement error can be regarded as normally distributed with sufficient accuracy. Methods for estimating the measurement error variance were elaborated. A model of thyroid dose was constructed that takes into account the presence of both the classical error, which is present in thyroid radioactivity measurements, and the Berkson error, which is inevitably present in the ecological dosimetry model. Summarizing, we obtain an observation model of the dose with classical additive normal error. This model differs from the one proposed by the authors in [8], where methods were elaborated to estimate radiation risks under uncertainty in doses and the error was assumed to be multiplicative without particular investigation. In the present paper, the dose model is more realistic.
Also the authors elaborated two methods to deal with dose uncertainty in risk analysis. One of them, parametric calibration of dose, is based on the log-normality assumption for the sample of true doses. This assumption is based on the paper by Likhtarev et al. [15]. The second method, nonparametric calibration of dose, uses polynomial approximation for conditional expectation of the true dose in powers of the measured dose. A simulation study is not presented here and will be reported in a forthcoming paper. Information about the efficiency of parametric and nonparametric calibration and other methods dealing with uncertainty in risk analysis can be found in [8, 17, 18, 22].
10.1.4 Appendix. Approximation of Conditional Expectations
Let $w_n = x_n + u_n$, $n = 1, 2, \dots, N$, be realizations of a random variable $w$, $u_n \sim N(0, \sigma_n^2)$, $\sigma_n > 0$, and let the variables $\{x_n, u_n,\ n \ge 1\}$ be jointly independent. Here, $x_n$ is the unknown true value of the observable variable (dose), and $w_n$ is a measurement of $x_n$. Denote $Ex = \mu_x$, $\mathrm{var}\, x = \sigma_x^2$. Suppose also that $Ex^2 < \infty$ and the sequence $\{\frac{1}{N} \sum_{n=1}^{N} \sigma_n^2,\ N \ge 1\}$ is bounded.
10.1.4.1 Linear Approximation of $E(x_n \mid w_n)$
We search for the conditional expectation in the form
$$E(x_n \mid w_n) = a + b w_n. \tag{10.25}$$
From (10.25) we have
$$Ex = \mu_x = E(a + b w_n) = a + b \mu_x,$$
$$E x_n w_n = Ex^2 = \sigma_x^2 + \mu_x^2 = E\bigl(w_n E(x_n \mid w_n)\bigr) = E(a w_n + b w_n^2) = a \mu_x + b(\sigma_{w_n}^2 + \mu_x^2).$$
Solve the system of two algebraic equations in $a$ and $b$ and obtain
$$a = (1 - K)\mu_x, \qquad b = K,$$
where
$$K = \frac{\sigma_x^2}{\sigma_{w_n}^2} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_n^2}$$
is the reliability ratio. The parameters $\mu_x$ and $\sigma_x^2$ can be easily estimated from observations,
$$\hat\mu_x = \bar w = \frac{1}{N} \sum_{n=1}^{N} w_n, \qquad (\hat\sigma_x)^2 = \frac{1}{N-1} \sum_{n=1}^{N} (w_n - \bar w)^2 - \overline{\sigma^2},$$
with $\overline{\sigma^2} = \frac{1}{N} \sum_{n=1}^{N} \sigma_n^2$. Thus,
$$E(x_n \mid w_n) \approx (1 - K)\mu_x + K w_n. \tag{10.26}$$
In case $x \sim N(\mu_x, \sigma_x^2)$, the equality is attained in (10.26), that is, $E(x_n \mid w_n) = (1 - K)\mu_x + K w_n$.
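A minimal sketch of the linear calibration (10.26) with the plug-in estimates of the mean and variance of $x$ is given below. The function name `linear_calibration` is hypothetical, the reliability ratio is computed separately for each observation because the error variances may differ between observations, and the error raised for a non-positive variance estimate is an assumption of this sketch rather than something prescribed by the text.

```python
def linear_calibration(w, sigma):
    """Return the linear approximations of E(x_n | w_n), see (10.26).

    w     : measurements w_n
    sigma : error standard deviations sigma_n
    """
    n = len(w)
    w_bar = sum(w) / n                              # estimate of mu_x
    sigma2_bar = sum(s ** 2 for s in sigma) / n     # average error variance
    var_x = sum((v - w_bar) ** 2 for v in w) / (n - 1) - sigma2_bar
    if var_x <= 0.0:
        # Assumption of this sketch: stop when the corrected variance is not positive.
        raise ValueError("estimated variance of x is not positive")
    calibrated = []
    for w_n, s_n in zip(w, sigma):
        k = var_x / (var_x + s_n ** 2)              # reliability ratio for this observation
        calibrated.append((1.0 - k) * w_bar + k * w_n)
    return calibrated
```

Each calibrated value is a weighted average of the observation and the sample mean, so noisier observations are shrunk more strongly towards the mean.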
10.1.4.2 Quadratic Approximation of $E(x_n \mid w_n)$
Now, suppose that $Ex^4 < \infty$ and the sequence $\{\frac{1}{N} \sum_{n=1}^{N} \sigma_n^4,\ N \ge 1\}$ is bounded. We search for the conditional expectation in the form
$$E(x_n \mid w_n) = \sum_{i=0}^{2} a_i w_n^i. \tag{10.27}$$
From (10.27) we have for $j = 0, 1, 2$:
$$E\bigl(w_n^j E(x_n \mid w_n)\bigr) = E\, E(w_n^j x_n \mid w_n) = E(w_n^j x_n) = E\bigl((x_n + u_n)^j x_n\bigr) =: b_j,$$
$$b_j = E\Bigl(w_n^j \sum_{i=0}^{2} a_i w_n^i\Bigr) = \sum_{i=0}^{2} a_i\, E w_n^{i+j}. \tag{10.28}$$
Equalities (10.28) form a system of three linear equations with three unknowns $a_i$, $i = 0, 1, 2$. The matrix of the system
$$G = (G_{ij}) = (E w_n^{i+j}), \qquad i, j = 0, 1, 2,$$
is a Gram matrix. It is positive definite, because the random variables $1$, $w_n$, $w_n^2$ are linearly independent in the space $L_2(\Omega, \mathcal{F}, P)$ of random variables with finite second moment (the linear independence follows from the fact that $w_n$ has a continuous cdf and, therefore, is not concentrated at two points). Let $a = (a_0, a_1, a_2)^T \in \mathbb{R}^{3 \times 1}$, $b = (b_0, b_1, b_2)^T \in \mathbb{R}^{3 \times 1}$. Then (10.28) can be written in the vector form $Ga = b$. Hence,
$$a = G^{-1} b.$$
To estimate the entries $G_{ij}$ of the matrix $G$ we construct estimators of $E w_n^k$, $0 \le k \le 4$. We have
$$E w_n^0 = 1; \quad E w_n^1 = Ex; \quad E w_n^2 = Ex^2 + \sigma_n^2; \quad E w_n^3 = Ex^3 + 3\sigma_n^2\, Ex; \quad E w_n^4 = Ex^4 + 6\sigma_n^2\, Ex^2 + 3\sigma_n^4.$$
Next,
$$b_0 = Ex; \quad b_1 = E(x_n + u_n) x_n = Ex^2; \quad b_2 = E(x_n + u_n)^2 x_n = Ex^3 + \sigma_n^2\, Ex.$$
Estimators of $Ex$, $Ex^2$, $Ex^3$, $Ex^4$ can be constructed based on the observations $w_n$:
$$\widehat{Ex} = \bar w; \quad \widehat{Ex^2} = \overline{w^2} - \overline{\sigma^2}; \quad \widehat{Ex^3} = \overline{w^3} - 3\bar w\, \overline{\sigma^2};$$
$$\widehat{Ex^4} = \overline{w^4} - 6\,\overline{\sigma^2}\, \widehat{Ex^2} - 3\,\overline{\sigma^4} = \overline{w^4} - 6\,\overline{\sigma^2}\, \overline{w^2} + 6\bigl(\overline{\sigma^2}\bigr)^2 - 3\,\overline{\sigma^4}.$$
Here the hat denotes the estimator, and the bar stands for the average over $n = 1, 2, \dots, N$. Thus, $\hat a = (\hat G)^{-1} \hat b$, and the final approximation is
$$E(x_n \mid w_n) \approx \sum_{i=0}^{2} \hat a_i w_n^i.$$
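The quadratic approximation of this appendix reduces to solving a 3×3 linear system built from estimated moments. The sketch below is one possible reading of that recipe, assuming the system is formed separately for each observation from the estimated moments of $x$ and that observation's error variance; the function names `solve3` and `quadratic_calibration` are hypothetical, and no check is made that the estimated matrix remains positive definite for small or noisy samples.

```python
def solve3(G, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [rhs] for row, rhs in zip(G, b)]   # augmented matrix
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, 3):
            f = A[r][c] / A[c][c]
            for k in range(c, 4):
                A[r][k] -= f * A[c][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                              # back substitution
        x[r] = (A[r][3] - sum(A[r][k] * x[k] for k in range(r + 1, 3))) / A[r][r]
    return x


def quadratic_calibration(w, sigma):
    """Return quadratic approximations of E(x_n | w_n), see (10.27) and (10.28)."""
    n = len(w)

    def mean(vals):
        return sum(vals) / n

    s2 = mean([s ** 2 for s in sigma])               # average of sigma_n^2
    s4 = mean([s ** 4 for s in sigma])               # average of sigma_n^4
    ex1 = mean(w)                                    # estimate of E x
    ex2 = mean([v ** 2 for v in w]) - s2             # estimate of E x^2
    ex3 = mean([v ** 3 for v in w]) - 3.0 * ex1 * s2 # estimate of E x^3
    ex4 = mean([v ** 4 for v in w]) - 6.0 * s2 * ex2 - 3.0 * s4  # estimate of E x^4
    calibrated = []
    for w_n, s_n in zip(w, sigma):
        v = s_n ** 2
        # Moments E w_n^k, k = 0..4, for this observation's error variance.
        m = [1.0, ex1, ex2 + v, ex3 + 3.0 * v * ex1, ex4 + 6.0 * v * ex2 + 3.0 * v ** 2]
        G = [[m[i + j] for j in range(3)] for i in range(3)]
        b = [ex1, ex2, ex3 + v * ex1]
        a = solve3(G, b)
        calibrated.append(a[0] + a[1] * w_n + a[2] * w_n ** 2)
    return calibrated
```

With all error variances equal to zero the right-hand side coincides with the second column of the matrix, so the solution is $a = (0, 1, 0)$ and the calibrated values reproduce the observations, which is a convenient sanity check.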
Bibliography [1] E. E. Buglova, J. E. Kenigsberg, N. V. Sergeeva, Cancer risk estimation in Belarussian children due to thyroid irradiation as a consequence of the Chernobyl nuclear accident, Health Phys 71 (1996), 45–49. [2] R. J. Carroll, D. Ruppert, L. A. Stefanski, C. A. Crainiceanu, Measurement Error in Nonlinear Models. A Modern Perspective, Chapman and Hall/CRC, Boca Raton, 2006. [3] V. I. Gol’danskiy, A. V. Kutsenko, M. I. Podgoretskiy, Statistics of Counting-out During Registration of Nuclear Particles, Fizmatgiz, Moscow, 1959 (in Russian). [4] Health Risks from Exposure to Low Levels of Ionizing Radiation, BEIR VII Phase 2, National Academy Press, Washington D.C., 2006. [5] P. Jacob, T. I. Bogdanova, E. Buglova et al., Thyroid cancer risk in areas of Ukraine and Belarus affected by the Chernobyl accident, Radiation Research 165 (2006), 1–8. [6] K. J. Kopecky, V. Stepanenko, N. Rivkind et al., Childhood thyroid cancer, radiation dose from Chernobyl and dose uncertainties in Bryansk Oblast, Russia: A populationbased case-control study, Radiation Research 166 (2006), 367–374. [7] V. S. Korolyuk, N. I. Portenko, A. V. Skorokhod, A. F. Turbin, Reference Book on Probability Theory and Mathematical Statistics, Nauka, Moscow, 1985 (In Russian). [8] A. Kukush, S. Shklyar, S. Masiuk, I. Likhtarov, L. Kovgan, R. Carroll, A. Bouville, Methods for estimation of radiation risk in epidemiological studies accounting for classical and Berkson errors in doses, The International Journal of Biostatistics 7(1), article 15, (2011), DOI: 10.2202/1557-4679.1281.
[9] I. A. Likhtarev, N. K. Shandala, G. M. Goulko, I. A. Kairo, Exposure doses to thyroid of the Ukrainian population after the Chernobyl accident Health Phys. 64 (1993), 594–599. [10] I. A. Likhtarev, G. Prohl, K. Henrichs, Reliability and accuracy of the 131I thyroid activity measurements performed in the Ukraine after the Chernobyl accident in 1986, Munich, GSF-Bericht 19/93, Institut für Strahlenschutz (1993). [11] I. A. Likhtarev, G. M. Goulko, B. G. Sobolev, I. A. Kairo, G. Prohl, P. Rath, K. Henrichs, Evaluation of the 131I thyroid-monitoring measurements performed in Ukraine during May and June of 1986. Health Phys. 69 (1995), 6–15. [12] I. A. Likhtarev, B. G. Sobolev, I. A. Kairo et al., Thyroid cancer in Ukraine. Nature 375 (1995), 365–378. [13] I. Likhtarev, L. Kovgan, S. Vavilov, M. Chepurny, A. Bouville, N. Luckyanov, P. Jacob, P. Voilleque, G. Voigt, Post-Chornobyl thyroid cancers in Ukraine. Report 1: Estimation of thyroid doses. Radiation Research 163 (2005), 125–136. [14] I. Likhtarev, L. Kovgan, S. Vavilov, M. Chepurny, A. Bouville, N. Luckyanov, P. Jacob, P. Voilleque, G. Voigt, Post-Chornobyl thyroid cancers in Ukraine. Report 2: Risk analysis. Radiation Research 166 (2006), 375–386. [15] I. Likhtarev, A. Bouville, L. Kovgan, N. Luckyanov, P. Voilleque, M. Chepurny, Questionnaire- and measurement-based individual thyroid doses in Ukraine resulting from the Chornobyl Nuclear Reactor accident. Radiation Research 166 (2006), 271–286. [16] J. L. Lyon, S. C. Alder, M. B. Stone, A. Scholl, J. C. Reading, R. Holubkov, X. Sheng, G. L. White, K. T. Hegmann, L. Anspaugh, F. O. Hoffman, S. L. Simon, B. Thomas, R. J. Carroll, A. W. Meikle, Thyroid disease associated with exposure to the Nevada Test Site radiation: a reevaluation based on corrected dosimetry and examination data, Epidemiology 17 (2006), 604–614. [17] B. Mallick, F. O. Hoffman, R. J. 
Carroll, Semiparametric regression modeling with mixtures of Berkson and classical error, with application to fallout from the Nevada Test Site. Biometrics 58 (2002), 13–20. [18] S. V. Masiuk, S. V. Shklyar, A. G. Kukush, S. E. Vavilov, Impact of doses uncertainties on the radiation risks estimation. Radiation and Risk – Bulletin of the National Radiation and Epidemiological Registry 17(3) (2008), 64–75 (in Russian). http://mechmat.univ.kiev.ua/eng/ppages/kukush/Public/files/ Unsert_risk_influence.doc. [19] E. C. Molina, Poisson’s Exponential Binomial Limit, D. Van Nostrand Co., New York, 1945. [20] D. L. Preston, J. H. Lubin, D. A. Pierce, M. E. McConney, EPICURE User’s Guide, Hirosoft Corporation, Seattle (WA), 1993. [21] E. Ron, F. O. Hoffman, Uncertainties in radiation dosimetry and their impact on doseresponse analysis. Proceedings of a workshop held September 3–5, 1997 in Bethesda, Maryland, NIH Publication No. 99-4541, 1999.
[22] E. Ron, J. Lubin, R. Shore et al., Thyroid cancer after exposure to external radiation: a pooled analysis of seven studies, Radiation Research 141 (1995), 259–277. [23] A. Ruark, L. Devol, The General Theory of Fluctuations in Radioactive Disintegration, Phys. Rev. 49 (1936), 355–367. [24] M. D. Tronko, G. R. Howe, T. I. Bogdanova et al., A cohort study of thyroid cancer and other thyroid diseases after the Chornobyl accident: thyroid cancer in Ukraine detected during first screening. Journal of the National Cancer Institute 98 (2006), 897–903. [25] A. F. Tsyb, E. M. Parshkov, V. V. Shakhtarin et al., Thyroid cancer in children and adolescents of Bryansk and Kaluga regions. Proceedings of the first international conference, in: The Radiological Consequences of the Chernobyl Accident, 691–698, European Commission, 1996. [26] L. Zablotska, E. Ron, A. Rozhko, Thyroid cancer risk in Belarus among children and adolescents exposed to radioiodine after the Chornobyl accident, British Journal of Cancer 104 (2011), 181–187.
Author Information Illya Likhtarov, Scientific Center for Radiation Medicine of the Academy of Medical Sciences of Ukraine, Ukraine E-mail: [email protected] Sergii Masiuk, Scientific Center for Radiation Medicine of the Academy of Medical Sciences of Ukraine, Ukraine E-mail: [email protected] Mykola Chepurny, Scientific Center for Radiation Medicine of the Academy of Medical Sciences of Ukraine, Ukraine E-mail: [email protected] Alexander Kukush, National Taras Shevchenko University of Kyiv, Ukraine E-mail: [email protected] Sergiy Shklyar, National Taras Shevchenko University of Kyiv, Ukraine E-mail: [email protected] Andre Bouville, National Cancer Institute, NIH, DHHS, Bethesda, Maryland, USA E-mail: [email protected] Lina Kovgan, Scientific Center for Radiation Medicine of the Academy of Medical Sciences of Ukraine, Ukraine E-mail: [email protected]
11
Design and Development of Experiments for Life Science Applications
János F. László
11.1 Physiological Effects of Static Magnetic Field Exposure in an in vivo Acute Visceral Pain Model in Mice
Abstract. Static magnetic field (SMF) exposure has been shown to induce a wide range of biological responses. In this presentation we give an account of recent progress in a special area of experimental research. We look for evidence supporting the following hypotheses. Hypothesis 1. There is an SMF configuration, exposure to which induces a statistically significant analgesia in the writhing test in mice [28]. Hypothesis 2. In order to show an effect, the applied SMF must be strong and/or strongly inhomogeneous [28]. Hypothesis 3. The background mechanism of action lies in the excitation of the endogenous opioid system [11]. On the basis of these findings, we may raise the possibility of using devices with such SMF for therapy [29]. Keywords. Acute Pain, Analgesia, Antinociception, Static Magnetic Field (SMF), Writhing Test 2010 Mathematics Subject Classification. 92C05, 92C30
11.1.1 Introduction
Pain is a complex state involving both central and peripheral mechanisms. Even nowadays morphine and aspirin (and related compounds) serve as the most prevalent painkillers, although both groups of analgesics have a number of considerable adverse effects. Efforts have been made to develop a safer painkiller. The development of selective cyclooxygenase-2 inhibitors (e.g., coxibs) promised a safer alternative to traditional non-steroidal anti-inflammatory drugs. However, because of an unforeseen increase in cardiovascular morbidity and mortality, some of these, for example rofecoxib, have been withdrawn from the market. Besides the pharmacological management of pain, non-pharmacological attempts have also been made. An increasing amount of evidence suggests that a static magnetic field (SMF) can induce analgesic action in humans. Due to space restrictions we do not review human studies here; instead we refer the reader to the most recent review [57]. In the case of in vivo experiments with mammals, Xu et al. [71] made efforts to explore the role of SMF in modulating the muscle capillary microcirculation in pentobarbital-anesthetized mice under full-body exposure. They found that the peak blood velocity significantly increased at a field
induction above 1 mT. Veliks et al. [65] were interested in how SMF influenced the brain function of rats. They found that the predominant effects were bradycardia and the disappearance of respiratory sinus arrhythmia. Ichioka et al. [19] tested whether the skin temperature of rats under high induction (8 T) SMF decreased, but found no convincing evidence. In a series of investigations, Okano and Ohkubo [41] showed that there was a relationship between the spatial gradients of SMF and the suppression of blood pressure elevation in rats. They observed that under hypertensive conditions, the full-body exposure to non-uniform SMF with peak magnetic gradient in the carotid sinus baroreceptor significantly attenuated the vasoconstriction, and suppressed the elevation of blood pressure. McLean et al. [38] carried out experiments to find a potential supplement or alternative to the pharmacological treatment of epilepsy. They studied how SMF modulated the severity of audiogenic seizures and anticonvulsant effects of phenytoin in mice, and found that SMF pretreatment potentiated the effect of phenytoin. Prato et al. [49] investigated the analgesic effect in mice due to magnetic field shielding and concluded that there was a robust effect of the shielding on analgesia. Rogachefsky et al. [51] studied the effect of a mattress with permanent magnets on osteoarthritis in dogs. They conclude that osteoarthritis developed in the medial femoral condyle might have been inhibited by SMF. Some authors have reported that acute exposure of mice to SMF suppressed stress- and morphine-induced analgesia (Choleris et al. [5], Kavaliers et al. [23]). The effects of SMF on several behavioral patterns and neural functions, such as induction of locomotor activity (Houpt et al. [17]), conditioned taste aversion (Nolte et al. [40]), and vestibular activation (Snyder et al. [60]) have been studied. 
The proliferation of magnetic resonance (MR) tomography as a diagnostic tool has encouraged research into the physiological effects that strong SMF can potentially induce. Hansen probably pioneered the investigations on small animals [13]. Good reviews of the state of the art can be found in the paper of Pirko et al. [45], in Section 7.2.2 of the report of the World Health Organization [68], and in the above-mentioned SCENIHR report [57]. However, with the exception of behavioral studies, in almost all experiments the subjects were immobilized by anesthesia (Lukasik and Gillies [36]). Beyond mammalian reproduction and development studies, most of what we know about the complex effects of MR apparatuses on rodents is as follows. Innis et al. carried out behavioral tests in rats, e.g., a spatial memory test [20]. They did not find an effect on spatial memory. They also studied open field behavior and the passive avoidance test in mice [42], and found no effect. Teskey et al. [63] examined survival and stress reactions. They found no effect on hormone levels or on the weight of the animals in a period of 13–22 months following the exposure, and could not identify any change in survival. The blood-brain barrier was investigated by Shivers et al. [58]. They observed a temporary opening of the blood-brain barrier, which recovered 15–30 min after exposure. Prato et al. [47,48] discovered a significant increase in the permeability of the blood-brain barrier. Kwong-Hing et al. [26] examined the acute exposure effects
of a 0.05 T MRI on dentin and bone formation in mice. They found that the exposure caused a significant increase in the synthesis of the collagenous matrix of dentin in the incisors. Levine, Bluni and coworkers [32, 33] sought to reveal the effect of the 0.3 or 2 T homogeneous SMF of an MR on the left-right discrimination learning ability and the serum melatonin levels of mice with exposure durations of 30 to 100 min. They found a significant interference of SMF with spatial discrimination learning, but no influence of SMF on serum melatonin levels. Prasad et al. [46] examined the chromosomes in the bone marrow of mice after 1 h exposure to homogeneous 0.75 T SMF in an MR and observed no chromosomal damage. They also ran the taste aversion paradigm in rats under exposure to 1.89 T for 30 min (Messmer et al. [39]) and found no effect. Rofsky et al. [50] examined the stability of chromosomal damage in regenerating liver cells. They used an MR with 1.5 T homogeneous static plus gradient magnetic fields for 5–10 min. No effect induced by the MR alone or in combination with gadopentetate dimeglumine was observed. High et al. [15] published a substantial contribution to the study of the effects of a strong homogeneous SMF (9.4 T) on a wide range of biological endpoints, from spatial memory to gross pathologic findings, in male and female rats exposed to the SMF 3 h a day, twice a week for 5 weeks. Their basic conclusion was that no adverse biological effects in either the parents or the progeny could be attributed to the exposure. Weiss et al. [66], and later Nolte et al. [40], investigated the behavior of laboratory rodents in SMF of 4 or 9.4 T. They found that SMF stronger than 4 T may be unpleasant and may induce aversive responses and conditioned avoidance. Houpt et al. 
[17, 18] confirmed these findings when they published a result on rats that could move freely through the 4 T homogeneous SMF of an MR, but avoided entering the field. The authors related this observation to motion-induced currents in the vestibular system. Although the mechanisms by which SMF exerts its great variety of actions are unknown (Lockwood et al. [35, 56]), modification of the endogenous opioidergic system [24] and of the ion channel conduction properties [52, 54] has been suggested. Several findings indicate the ability of relatively weak SMF to diminish neural excitability by the inhibition of Ca²⁺ and Na⁺ currents [52, 53, 69]. The aim of the present series of experiments was to elucidate the effects of SMF exposure on acute visceral chemonociception in mice using (i) an optimized SMF generating device designed, developed and validated by us, and (ii) the SMF component of a clinical MR. During the MR experiments we had to control for the potential effect of noise, vibration and illumination stimuli. We also had to estimate the potential effect of motion-induced current densities and time derivatives of the magnetic induction.
11.1.2 Methods

11.1.2.1 Magnetic Exposure Conditions

A static magnetic device in which magnet-holding matrices could be easily changed was introduced to carry out the examinations. A special Plexiglas animal cage of size
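Table 11.1. Summarized technical data of the individual magnets and their arrangements used in the different experiments. Generator numbers are allocated to the configurations. The columns from material to width describe the individual magnet; the columns from polarity to matrix constant describe the magnet arrangement.

| Generator number | Material | Grade | Remanent induction (T) | Radius (mm) | Height (mm) | Length (mm) | Width (mm) | Polarity | Number of matrices | Magnetic coupling | Matrix constant (mm) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | NdFeB | N35 | 1.20 | 5 | 10 | n/a | n/a | bidir. | 2 | no | 10 |
| 2 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| 3 | NdFeB | N35 | 1.20 | 5 | 10 | n/a | n/a | bidir. | 1 | no | 10 |
| 4 | ferrite | n/a | 0.40 | 5 | 10 | n/a | n/a | unidir. | 2 | no | 10 |
| 5 | NdFeB | N35 | 1.20 | 5 | 10 | n/a | n/a | unidir. | 2 | no | 10 |
| 6 | NdFeB | N50 | 1.47 | 5 | 10 | n/a | n/a | unidir. | 2 | no | 10 |
| 7 | ferrite | n/a | 0.40 | 5 | 10 | n/a | n/a | bidir. | 2 | no | 10 |
| 8 | NdFeB | N50 | 1.47 | 5 | 10 | n/a | n/a | bidir. | 2 | no | 10 |
| 9 | NdFeB | N35 | 1.20 | 2.5 | 10 | n/a | n/a | bidir. | 2 | no | 5 |
| 10 | NdFeB | N35 | 1.20 | 2.5 | 10 | n/a | n/a | bidir. | 2 | no | 10 |
| 11 | NdFeB | N50 | 1.47 | 5 | 10 | n/a | n/a | bidir. | 2 | yes | 10 |
| 12 | NdFeB | N50 | 1.47 | 5 | 10 | n/a | n/a | bidir. | 2 | yes | 20 |
| 13 | NdFeB | N50 | 1.47 | 5 | 20 | n/a | n/a | bidir. | 2 | yes | 10 |
| 14 | NdFeB | N50 | 1.47 | 5 | 20 | n/a | n/a | bidir. | 2 | yes | 20 |
| 15 | ferrite | n/a | 0.40 | n/a | 25 | 70 | 50 | bidir. | 2 | yes | n/a |
| 16 | ferrite | n/a | 0.40 | n/a | 25 | 70 | 50 | unidir. | 2 | yes | n/a |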
L × W × H = 140 × 140 × 46 mm could be inserted between the magnetic matrices. The individual magnets used in the matrices were ferrite block magnets, ferrite cylindrical magnets, or NdFeB cylindrical magnets. Table 11.1 contains data on the individual magnets and their arrangements. The generator numbers will be referred to from here onwards. The SMF in each experimental setup was analyzed with a calibrated 5 V Hall probe with 12.3 mV/T sensitivity. These measurements were carried out separately from the animal experiments. Lateral scans were taken in parallel planes at 3, 10, and 15 mm from the top surface of the magnets. When NdFeB N50 cylindrical magnets sit next to one another with alternating polarity, the configuration will be referred to as “bidirectional” (“bidir.” in Table 11.1), as opposed to when the magnets point in the same direction with their identical poles, which will be referred to as “unidirectional” (“unidir.” in Table 11.1). The representative scanned area was 41 × 41 mm in the center of the
[Figure 11.1, panels (a)–(d): arbitrary signal versus distance (mm); the peak-to-peak value (P-P) and the spacing λ are marked in each panel.]
Figure 11.1. Schematic figure of a resultant arbitrary signal (e.g., magnetic flux density) distribution of a possible magnet arrangement. The resultant signal can be drastically different from the individual signals depending on the arrangement of the individual magnets. Dashed lines denote the individual signals, solid line is the resultant. (a) Cylindrical magnets of 5 mm radius are arranged with alternating polarity along a line 10 mm away from each other. (b) Same as in Panel (a), but the magnets are 20 mm away from each other. (c) Same as in Panel (a), but the magnets are arranged with identical polarity along a line 10 mm away from each other. (d) Same as in Panel (c), but the magnets are 20 mm away from each other (from [28]).
matrices. Within this area the contribution of the asymmetrical magnetic induction of the matrix edges could be neglected. Approximating the scanned values of the SMF distribution along the x axis, we fitted normal distribution functions to the individual peaks, where the contribution of the $k$th peak to the distribution was
$$\varepsilon^{\operatorname{mod}(k,2)+1}\,\frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{[x-(k-1)\lambda]^{2}}{2\sigma^{2}}\right\},$$
with $k = 1, 2, \ldots, n$ ($\sigma$ was considered equal for the peaks) and $\lambda$ the distance between the means (the magnetic matrix constant). In this simple approach, widely used for finding the spatial resolution in periodic systems (see e.g., the review of Hofmann [16]), the magnets were bidirectionally ($\varepsilon = -1$) arranged and the radius $R$ of each magnet was 5 mm (equal to $\sigma/2$), see Figure 11.1 (a). Dashed lines represent arbitrary signals (e.g., magnetic induction) belonging to each contributing magnet, while the solid line shows
Table 11.2. Summarized field scan data of SMF as obtained with a calibrated Hall probe at a distance of 3, 10, or 15 mm from the magnets’ surface in the different experiments. For each distance z the columns give the average B value (mT), the standard deviation of B (mT), the average P-P value (mT), and the average surface roughness (T/m).

| Generator number | z = 3 mm: average B (mT) | z = 3 mm: st. dev. of B (mT) | z = 3 mm: average P-P (mT) | z = 3 mm: surface roughness (T/m) | z = 10 mm: average B (mT) | z = 10 mm: st. dev. of B (mT) | z = 10 mm: average P-P (mT) | z = 10 mm: surface roughness (T/m) | z = 15 mm: average B (mT) | z = 15 mm: st. dev. of B (mT) | z = 15 mm: average P-P (mT) | z = 15 mm: surface roughness (T/m) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 200.88 | 108.79 | 389.46 | 39.25 | 202.28 | 4.17 | 17.94 | 1.78 | 202.47 | 0.66 | 2.97 | 0.22 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 198.11 | 47.16 | 192.28 | 18.89 | 197.67 | 2.12 | 10.16 | 1.01 | 198.23 | 0.64 | 3.19 | 0.31 |
| 4 | 159.33 | 5.08 | 25.30 | 2.82 | 160.32 | 1.89 | 7.21 | 0.80 | 160.84 | 1.51 | 4.83 | 0.58 |
| 5 | 88.94 | 2.60 | 10.59 | 1.15 | 82.60 | 1.24 | 4.38 | 0.43 | 81.56 | 0.87 | 6.90 | 0.78 |
| 6 | 87.72 | 1.16 | 6.39 | 0.62 | 83.72 | 1.80 | 4.38 | 0.69 | 81.36 | 1.46 | 4.88 | 0.50 |
| 7 | 196.08 | 28.54 | 110.65 | 11.14 | 83.72 | 1.80 | 7.64 | 0.69 | 82.61 | 1.46 | 2.21 | 0.50 |
| 8 | 208.82 | 137.00 | 513.69 | 51.37 | 201.67 | 13.33 | 57.20 | 5.72 | 201.40 | 1.44 | 5.74 | 0.69 |
| 9 | 201.56 | 21.80 | 62.83 | 19.59 | 199.51 | 0.33 | 1.09 | 0.31 | 199.31 | 0.20 | 0.74 | 0.22 |
| 10 | 197.88 | 19.78 | 120.38 | 8.63 | 202.96 | 1.01 | 2.26 | 0.46 | 203.03 | 0.27 | 0.77 | 0.13 |
| 11 | 212.90 | 191.62 | 783.22 | 74.16 | 200.62 | 25.01 | 108.68 | 10.69 | 201.08 | 0.29 | 1.46 | 0.15 |
| 12 | 196.60 | 76.67 | 337.43 | 16.55 | 197.18 | 15.67 | 65.54 | 3.22 | 200.96 | 7.73 | 30.62 | 1.58 |
| 13 | 206.01 | 72.87 | 259.36 | 25.82 | 203.61 | 3.29 | 17.28 | 1.47 | 202.86 | 0.92 | 4.52 | 0.44 |
| 14 | 196.60 | 76.67 | 321.82 | 16.55 | 201.97 | 17.90 | 72.85 | 3.64 | 202.07 | 8.39 | 30.77 | 1.67 |
| 15 | 200.01 | 101.78 | 258.39 | 5.16 | 197.57 | 67.07 | 220.80 | 4.34 | 199.26 | 46.02 | 169.90 | 3.40 |
| 16 | 83.75 | 6.99 | 16.18 | 0.78 | 81.57 | 1.03 | 4.98 | 0.9 | 81.86 | 0.37 | 1.55 | 0.03 |
the resultant signal due to the interaction between the magnets. If $\lambda$ is small and we do not consider the special non-symmetrical side peaks, the P-P (peak-to-peak) value of the resultant signal can be significantly smaller than the individual peaks would suggest. For example, if $\sigma/2 = R = 5$ mm and $\lambda = 10$ mm, the signal is reduced to 15.1 % of the P-P value (Figure 11.1 (a)). For $\sigma/2 = R = 5$ mm and $\lambda = 20$ mm, the signal is reduced to only 73.0 % (Figure 11.1 (b)). For a similar situation when $\sigma/2 = R = 2.5$ mm, $\lambda = 5$ and 10 mm, the corresponding values are 31.9 and 73.0 %. The spatial resolution gets better with decreasing $\sigma$. Similar to Panels (a) and (b) in Figure 11.1, Panels (c) and (d) show the situation when the magnets are aligned unidirectionally ($\varepsilon = 1$). If $\sigma/2 = R = 5$ mm and $\lambda = 10$ mm, the magnitude of the resultant magnetic field becomes 2.5 times greater than the individual signal, but in the center the distribution can be considered homogeneous, since the P-P value is almost zero. However, the P-P value becomes more pronounced if $\sigma/2 = R = 5$ mm and $\lambda = 20$ mm (see Panel (d) in Figure 11.1). To take this effect into account we defined the surface roughness as $\alpha = \Delta B/\lambda$ in T/m. Here the flux density difference ($\Delta B$) was taken as equal to the P-P value over the distance between adjacent peaks in both directions (for symmetry reasons, $\lambda$ was equal in both the x and y directions). In this sense, a homogeneous field is described by $\alpha = 0$ (see Panel (c) in Figure 11.1). The average values for all peaks, P-P and $\alpha$ are collected in Table 11.2. In some cases, the SMF between the matrices was adjusted by magnetically connecting the opposite poles of the lower and upper magnets in a horseshoe-like coupling (referred to as “mc” further on). For a square arrangement of 25 NdFeB cylindrical magnets of grade N50 with magnet radius $R = 5$ mm and magnet height $L = 10$ mm, this had the following effects. 
In the case of a unidirectional arrangement, mc increases the homogeneity of the field: the field starts to fall off at $z = 20$ mm rather than at 10 mm. In the case of a bidirectional arrangement, mc also enhances the field (by 58.8 % on average between $z = 10$ and 20 mm), but it does not change the shape of the function. For a square arrangement of 9 magnets the effects were similar, but less pronounced (e.g., 21.99 vs. 58.8 %). A commercial Philips Achieva 3 T MR with magnetic shielding was used in the MR experiments. It has a 3 T homogeneous, horizontal SMF parallel to the axis inside the bore of the MR. The gradient system and the RF radiation were not used in our measurements.
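The superposition of normal-distribution peaks used above to describe the magnet matrices, and the surface roughness derived from it, can be explored numerically. The following is only a minimal sketch under stated assumptions: the function names, the number of magnets (41) and the two-period evaluation window around the array center are illustrative choices, not taken from the original measurements, and the polarity factor is passed as `eps` (−1 for bidirectional, +1 for unidirectional). With these choices the sketch gives about 73 % for peaks whose width parameter equals half the matrix constant and about 2.5 for the unidirectional level with equal width and matrix constant; it does not attempt to reproduce the other quoted percentages, whose exact evaluation window is not stated.

```python
import math


def peak(x, k, lam, sigma, eps):
    """Contribution of the k-th magnet (k = 1, 2, ...) at position x.

    eps = -1 alternates the sign from magnet to magnet (bidirectional),
    eps = +1 keeps the same sign for all magnets (unidirectional).
    """
    sign = eps ** (k % 2 + 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sign * norm * math.exp(-((x - (k - 1) * lam) ** 2) / (2.0 * sigma ** 2))


def resultant(x, n, lam, sigma, eps):
    """Sum of the contributions of n equally spaced magnets."""
    return sum(peak(x, k, lam, sigma, eps) for k in range(1, n + 1))


def central_pp_ratio(n, lam, sigma, eps, steps=400):
    """P-P of the resultant near the array center, relative to the P-P
    (twice the peak height) of isolated peaks of alternating sign.

    Assumption: the window spans one matrix constant on either side of
    the central magnet, so edge effects are ignored.
    """
    center = (n - 1) * lam / 2.0
    xs = [center - lam + 2.0 * lam * i / steps for i in range(steps + 1)]
    values = [resultant(x, n, lam, sigma, eps) for x in xs]
    single_pp = 2.0 / (sigma * math.sqrt(2.0 * math.pi))
    return (max(values) - min(values)) / single_pp


def central_level_ratio(n, lam, sigma):
    """Unidirectional resultant at the array center relative to one peak height."""
    center = (n - 1) * lam / 2.0
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return resultant(center, n, lam, sigma, 1) / norm


def surface_roughness(pp_mT, lam_mm):
    """Surface roughness alpha = delta B / lambda; mT per mm equals T/m."""
    return pp_mT / lam_mm


if __name__ == "__main__":
    print(round(central_pp_ratio(41, 20.0, 10.0, -1), 3))   # 0.73
    print(round(central_level_ratio(41, 10.0, 10.0), 2))    # 2.51
    print(round(surface_roughness(513.69, 10.0), 2))        # 51.37
```

The last line uses the average P-P value and matrix constant listed for Generator 8 at 3 mm; other generators need not match this simple quotient exactly, since the tabulated roughness is an average over peaks.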
11.1.2.2 Model

The professional expectations for in vivo experimental models are the following: the model should (i) reflect a pathological condition including symptoms similar to that occurring in humans, and (ii) be specific, sensitive, valid, reliable, and reproducible [31]. Provided all these requirements are fulfilled, data from in vivo experiments are irreplaceable for assessing the therapeutic method under investigation,
since under experimental conditions the psychosomatic/placebo effects can be excluded. The writhing test is a widely used acute visceral pain assay and counts as a good predictor of results in human studies [31]. In these experiments pain is elicited by the intraperitoneal (i.p.) injection of a slightly irritating agent. The pain sensation manifests itself in a characteristic stretching, writhing movement of the animal; the number of writhings per time period reflects the response to visceral pain. Both opioid and nonopioid analgesics are able to exert antinociceptive activity in this pain model. Numerous agents have been described to cause abdominal pain (Collier et al. [6]) and, depending on the irritant, inflammatory or non-inflammatory pain can develop (Gyires and Torma [12]). In our experiments we used primarily acetic acid (CH3COOH) in a volume of 0.2 ml/mouse (0.6 %) as an irritant. (We also tested magnesium sulfate, MgSO4 (0.2 ml/mouse, 2 %), as an irritant, but, as the results were not significantly different, we restrict ourselves to acetic acid in this review.) The solution injected i.p. induces a well reproducible, relatively long-lasting, mild pain reaction. We used the method as described by Wende and Margoli [67], modified by Witkin et al. [70]. The number of writhings was monitored in several time periods within the 0–40 min time of the experiment. That is, the nociceptive reaction was measured during the exposure of mice to SMF for 10, 20, or 30 min as well as after the cessation of the SMF exposure for 30, 20, or 10 min in order to determine the duration of the effect of SMF on nociception. Mice were sacrificed 48 h after the experiments; meanwhile, their behavior and appetite were checked. No difference between treated and untreated mice could be identified. Neither morbidity nor mortality was observed. Animals, handling. In the tests male CFLP mice (24–26 g) were used. 
The animals were housed in groups of 5; the room was kept under a 12:12 h light/dark cycle at 20 ± 2 °C, with feed and water ad libitum. In order to reduce experimental stress, an adaptation period was introduced: mice were placed in the Plexiglas cage (where they were kept during the experiments) for 30 min on 4 consecutive days preceding the experiment. Environment, stress sources. No effort was made to magnetically shield any experimental setup from the geomagnetic field. Over hundreds of millions of years of phylogenesis on Earth, living creatures have adapted to this field (the vertical component of the magnetic induction was 43.448 ± 0.157 μT on average between 2001 and 2011 at the location of the experiments). Therefore, shielding might have introduced new effects [49]. For SMF exposure, 2–3 animals were put into the Plexiglas cage at a time, keeping in mind that mice are socially sensitive [27]; then the cage with the animals was inserted into the exposure chamber of a generator for a maximum of 40 min in a single session. The treatment was whole-body exposure, while the animals were free to move in the cage. Blinding was provided by keeping 2–3 control animals (not exposed to SMF) at a time in identical Plexiglas boxes for the same time as their mates were exposed to SMF. The evaluator was blinded and did not know which results came from exposed and which from unexposed animal groups.
The effect of vibration stimulus, known to produce anesthesia [59] and analgesia (for references see Chapter 19 in [7]), had to be investigated in the MR trial, since the cooling of the device may have generated vibration above the perception threshold of mice. Fortunately, the vibration of the table at rest (without imaging sequence) was found to be negligible: the magnet's mass was 6 tons, while the moving table supporting the cage weighed less than 20 kg. We checked for vibration optically by simply positioning a glass of water at the spot of measurement on the table and watching for surface waves or disturbances. We failed to find any sign of vibration. This control is reliable in the 10–100 Hz frequency range. Another potential confound was the effect of illumination. It has long been known that light can be an aversive stimulus for rodents [22, 44, 61]. Garcia et al. [9] reported that 3 lx seemed to be the threshold of aversion in the exploratory behavior of rats in an elevated plus maze, but not in their locomotor activity. (Some animals use the polarization of sunlight for their orientation; even the million times dimmer polarization of moonlight proved to be a sufficient source to orient the African dung beetle [8].) This is why the side walls of the animal cage were transparent, the top was covered with an opaque, air-permeable material, and the support under the cage was always solid and flat. One parameter of the circadian cycle is the change of lighting. Rodents are subject to the circadian cycle in almost all areas of their life [4]. All experiments were therefore carried out in the same period of the day, between 8 a.m. and 12 noon. The background noise was limited to 50 dBA in all experiments except the MR. In the MR examination the noise stimulus originating from the cold head of the helium pump of the MR exceeded 50 dBA; it could be a stress source for mice. We therefore introduced an additional control group, see Point 11.1.3.3. Ethics. 
All experimental procedures were carried out according to the 1998/XXVIII Act of the Hungarian Parliament on Animal Protection and Consideration Decree of Scientific Procedures of Animal Experiments (243/1988), and complied with the recommendations of the International Association for the Study of Pain [73] and the Helsinki Declaration. The studies were approved by the Animal Care Committee of Semmelweis University of Budapest under license no. 1810/003/2004. Statistical analysis. We defined the effect of a treatment in a given time interval as $100\,(1 - \bar{x}/\bar{y})\,\%$, where $\bar{x}$ is the average of the number of writhings measured in the treated and $\bar{y}$ is that in the control group. As some treatments were applied in several experiments on different days with different numbers of animals, they had more than one observed value. For such treatments we calculated a pooled average based on all experiments. For pooling we applied weighted averages with weights inversely proportional to the variances of the particular estimate. This is optimal in the sense that it results in a minimum variance pooled estimate ([34], pp. 401–402). From a general result about the variance of a ratio estimate ([34], pp. 63–64), the asymptotic variance of the average is
$$\sigma^{2} = \frac{\bar{x}^{2}}{\bar{y}^{2}}\left(\frac{\sigma_x^{2}}{\bar{x}^{2}} + \frac{\sigma_y^{2}}{\bar{y}^{2}}\right),$$
where $\sigma_x^{2}$ is the variance of the number of writhings for the treated and $\sigma_y^{2}$ is that for the control animals. Thus, the pooled average for a treatment that occurred in $m$ different experiments with $n_j$ treated and $u_j$ control animals on the $j$th date ($1 \le j \le m$) can be obtained as
$$100\,\sum_{j=1}^{m} w_j\left(1 - \frac{\bar{x}_j}{\bar{y}_j}\right)\Bigg/\sum_{j=1}^{m} w_j\ \%,$$
where $\bar{x}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ij}$ is the average value of the number of writhings for the treated and $\bar{y}_j = \frac{1}{u_j}\sum_{i=1}^{u_j} y_{ij}$ that for the control animals on the $j$th day of measurement. The weights $w_j$ are inversely proportional to the variance estimates from the particular experiments, i.e.,
$$w_j = \sigma_j^{-2} = \left[\frac{\bar{x}_j^{2}}{\bar{y}_j^{2}}\left(\frac{\sigma_x^{2}}{\bar{x}_j^{2}} + \frac{\sigma_y^{2}}{\bar{y}_j^{2}}\right)\right]^{-1}.$$
This definition of the effect has the advantages that (i) it is based on comparisons between treated and control animals examined on the same day, and (ii) it takes into account different sample sizes correctly. To achieve the best possible transparency of the data, we summarized the results of the tests in tables including the standard error of the mean (SEM), e.g., for treated animals on the $j$th date:
$$\mathrm{SEM}_j = \sqrt{\frac{\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^{2}}{(n_j-1)\,n_j}}.$$
The p values were computed from the 30 min interval using two-sample homo- or heteroscedastic Student’s t-tests, depending on the result of the F-test. Unequal sized data sets were compared using the Bonferroni correction. The difference between data sets was considered significant at the 95 % confidence level if $p < 0.05$. For experiments carried out on several different dates, the highest p value ($p_{\max}$) was reported. We also carried out a joint analysis of all experiments using a mixed model one-way ANOVA with the date of the experiment as a random factor.
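The effect measure, the inverse-variance pooling over dates and the SEM described in the statistical analysis can be written out in a few lines. The sketch below is illustrative only: the function names and the writhing counts are hypothetical, and it assumes that the two variances in the ratio formula are estimated by the per-date sample variances of the treated and control counts, which the text does not spell out.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sem(values):
    """Standard error of the mean: sqrt(sum of squares / ((n - 1) n))."""
    return math.sqrt(sample_variance(values) / len(values))


def ratio_variance(treated, control):
    """Variance of the ratio of averages, following the formula in the text.

    Assumption: the variances are the sample variances of the counts
    observed on that date.
    """
    x_bar, y_bar = mean(treated), mean(control)
    var_x, var_y = sample_variance(treated), sample_variance(control)
    return (x_bar / y_bar) ** 2 * (var_x / x_bar ** 2 + var_y / y_bar ** 2)


def effect(treated, control):
    """Effect of a treatment on one date, in percent."""
    return 100.0 * (1.0 - mean(treated) / mean(control))


def pooled_effect(experiments):
    """Pooled effect in percent over several dates.

    experiments: list of (treated_counts, control_counts), one pair per date;
    each date is weighted by the inverse of its ratio variance.
    """
    numerator = 0.0
    denominator = 0.0
    for treated, control in experiments:
        weight = 1.0 / ratio_variance(treated, control)
        numerator += weight * (1.0 - mean(treated) / mean(control))
        denominator += weight
    return 100.0 * numerator / denominator


if __name__ == "__main__":
    # Hypothetical writhing counts for two dates (not data from the study).
    date_1 = ([10, 20], [40, 60])
    date_2 = ([8, 12], [18, 22])
    print(effect(*date_1), effect(*date_2))
    print(pooled_effect([date_1, date_2]))
```

Because each date contributes its own treated-to-control ratio, the pooled value always lies between the smallest and largest single-date effects, and dates with noisier counts pull it less.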
11.1.3 Results

11.1.3.1 Testing the SMF Parameters with the Writhing Test [28]

From now on in this chapter we present the results organized around problems and their solutions regarding the characteristics of the specific magnetic arrangements. The results presented in Table 11.3 time-resolve the experiment, i.e., they split the measurement into 3 sections: 0–5, 6–20, and 21–30 min. The last columns (0–30 min, in bold) denote the weighted averages of all 3 previous columns. Note that the headline of Table 11.3 displays pooled numbers of writhings.
Table 11.3. Summarized data of the writhing tests for the first 5 min, from 6 to 20 min, and from 21 to 30 min. Bold values stand for the whole 30 min time interval. * means significant difference, p < 0.05.

Control ($\sum_{j=1}^{m} u_j = 109$), $\bar{y}_j \pm \mathrm{SEM}_j$: 8.52 ± 0.37 (0–5 min), 34.91 ± 0.63 (6–20 min), 27.54 ± 0.58 (21–30 min), **70.97 ± 1.08** (0–30 min).

| Generator number | 0–5 min: treated, $\bar{x}_j \pm \mathrm{SEM}_j$ | 0–5 min: analgesic effect (%) | 6–20 min: treated, $\bar{x}_j \pm \mathrm{SEM}_j$ | 6–20 min: analgesic effect (%) | 21–30 min: treated, $\bar{x}_j \pm \mathrm{SEM}_j$ | 21–30 min: analgesic effect (%) | 0–30 min: treated, $\bar{x}_j \pm \mathrm{SEM}_j$ | 0–30 min: analgesic effect (%) | t-test | Number of animals, n |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2.34 ± 0.26 | 89.04 | 19.22 ± 0.67 | 55.37 | 16.58 ± 0.50 | 46.70 | **36.30 ± 1.79** | **54.07** | p < 0.003 * | 76 |
| 2 | 10.18 ± 0.82 | -0.55 | 38.64 ± 2.43 | -2.92 | 26.55 ± 1.59 | 7.71 | **75.36 ± 4.17** | **1.40** | p > 0.883 | 11 |
| 3 | 4.73 ± 1.00 | 40.25 | 28.18 ± 4.49 | 26.91 | 23.73 ± 2.92 | 17.81 | **56.64 ± 7.59** | **24.39** | p > 0.104 | 11 |
| 4 | 8.11 ± 1.67 | -102.78 | 39.33 ± 1.44 | 0.84 | 33.44 ± 0.75 | -2.91 | **80.89 ± 2.94** | **-6.20** | p > 0.216 | 9 |
| 5 | 6.08 ± 0.73 | 42.61 | 27.50 ± 2.48 | 24.66 | 18.42 ± 0.95 | 36.05 | **52.00 ± 3.13** | **31.49** | p < 0.001 * | 12 |
| 6 | 2.50 ± 0.40 | 66.93 | 18.70 ± 0.88 | 52.16 | 18.00 ± 1.10 | 45.71 | **39.20 ± 1.70** | **50.21** | p < 0.013 * | 10 |
| 7 | 3.15 ± 0.37 | 67.32 | 19.81 ± 0.89 | 48.74 | 18.73 ± 1.03 | 39.92 | **41.69 ± 1.89** | **47.54** | p < 0.001 * | 26 |
| 8 | 2.70 ± 0.47 | 76.86 | 15.80 ± 0.74 | 59.94 | 14.00 ± 1.90 | 55.07 | **32.50 ± 2.89** | **59.81** | p < 0.001 * | 10 |
| 9 | 2.00 ± 4.17 | 72.72 | 24.00 ± 0.67 | 41.78 | 18.89 ± 1.75 | 46.13 | **44.89 ± 1.93** | **46.05** | p < 0.001 * | 9 |
| 10 | 2.33 ± 0.51 | 76.61 | 15.75 ± 1.56 | 68.02 | 9.75 ± 1.16 | 70.45 | **27.83 ± 2.46** | **68.07** | p < 0.002 * | 12 |
| 11 | 1.00 ± 0.21 | 89.07 | 8.60 ± 1.96 | 79.97 | 5.30 ± 0.93 | 89.48 | **14.90 ± 2.28** | **83.40** | p < 0.001 * | 10 |
| 12 | 0.30 ± 0.15 | 92.50 | 9.00 ± 1.20 | 77.31 | 6.50 ± 0.76 | 80.00 | **15.80 ± 1.76** | **79.26** | p < 0.001 * | 10 |
| 13 | 0.33 ± 0.24 | 91.89 | 8.44 ± 0.82 | 78.20 | 6.11 ± 0.72 | 80.94 | **14.89 ± 1.55** | **80.75** | p < 0.003 * | 9 |
| 14 | 0.22 ± 0.22 | 89.47 | 6.56 ± 2.06 | 83.82 | 7.22 ± 0.98 | 79.18 | **14.00 ± 2.40** | **81.81** | p < 0.003 * | 9 |
| 15 | 13.64 ± 2.96 | 44.11 | 34.00 ± 4.06 | 35.71 | 24.91 ± 1.00 | 18.76 | **72.55 ± 7.04** | **27.57** | p > 0.077 | 11 |
| 16 | 8.56 ± 1.70 | 21.38 | 34.56 ± 3.91 | 16.09 | 21.56 ± 1.93 | 25.62 | **64.67 ± 5.97** | **19.98** | p > 0.084 | 9 |
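When reading Table 11.3 it helps to keep two things apart: the treated means of the three intervals add up to the 0–30 min mean within rounding, whereas the tabulated analgesic effects cannot be recomputed from the pooled control headline, because each effect is formed against the controls of the same day. The short sketch below illustrates this for three generators; the values are transcribed from the table, while the function names and the choice of generators are illustrative assumptions.

```python
# Pooled control mean for the whole 0-30 min interval (headline of Table 11.3).
CONTROL_0_30 = 70.97

# generator: (treated 0-5, treated 6-20, treated 21-30, treated 0-30,
#             reported 0-30 min analgesic effect in %)
ROWS = {
    8: (2.70, 15.80, 14.00, 32.50, 59.81),
    11: (1.00, 8.60, 5.30, 14.90, 83.40),
    16: (8.56, 34.56, 21.56, 64.67, 19.98),
}


def interval_sum_gap(row):
    """Absolute difference between the summed interval means and the 0-30 mean."""
    return abs(row[0] + row[1] + row[2] - row[3])


def naive_effect(treated_mean, control_mean):
    """Effect in percent computed against the pooled control mean.

    This is NOT how the tabulated effects were obtained; it is shown
    only for comparison.
    """
    return 100.0 * (1.0 - treated_mean / control_mean)


if __name__ == "__main__":
    for generator, row in sorted(ROWS.items()):
        naive = naive_effect(row[3], CONTROL_0_30)
        print(generator, round(interval_sum_gap(row), 2), round(naive, 1), row[4])
```

For these rows the naive values come out several percentage points away from the tabulated effects, which is what one would expect if the daily control values differ from the pooled headline.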
[Figure 11.2: number of writhings versus time (min); legend: control (n = 109), SMF treatment (case 1, n = 76).]
Figure 11.2. The raw data of the number of writhings due to visceral pain elicited by i.p. injection of acetic acid in mice. Data corresponding to the 0–5, 6–20, and 21–30 min time intervals are deliberately scattered in the time coordinate in a random manner in 5 min intervals around 5, 20, and 30 min to ease comprehension. Arrows show average values (from [28]).
On the analgesic effect caused by SMF. Generator 1 was used first. The raw data of the number of writhings for this situation are presented in Figure 11.2. The data corresponding to the 0–5, 6–20, and 21–30 min intervals are deliberately scattered in the time coordinate in a random manner in 5 min intervals around 5, 20, and 30 min, respectively, for better visibility. Arrows show the averages. The overall analgesic effect was found to be over 54 %. There is a significant analgesic effect associated with the SMF exposure.

On the reproducibility of the experiments. In these examinations the SMF of Generator 1 was still under investigation. The average number of writhings is illustrated in Figure 11.3, Panels (a) and (b), as a function of the time of the experiments in weeks. Solid lines only guide the eye. Panel (a) of Figure 11.3 shows the control results (headline in Table 11.3), Panel (b) the results of SMF. For the controls, the average value of writhings for the 0–30 min interval was 70.97, and the linear trend line fitted to the 0–30 min curve had a slope of 0.29. For SMF, these were 36.30 and 0.07, respectively. Since the slopes of the trend lines were small, the experiments could be regarded as stable; a sufficiently good reproducibility was achieved and maintained. The residual standard deviation was 11.24, the standard deviation due to the random factor (the date) was 6.76. Accordingly, the variance within a date was 126.34, while the total (including all dates) was 147.30. This is why we accepted pooling the data from experiments executed on different days (see Point 11.1.2.2). The experiments proved to be reproducible.

On the effect of the instrument. The next set of experiments was executed with “placebo” matrices; this was Generator 2 (sham with “dummy” magnets). The
[Figure 11.3: number of writhings; legend: 0–30 min mean, 0–30 min mean + st. dev., 0–30 min mean – st. dev.]
[…] p > 0.216), while for NdFeB N35 and N50 the results were significant (p < 0.001 and p < 0.013, resp.). The negative analgesic effect value in the 0–5 min period for the weakest unidirectional SMF was due to an extremely low daily control value. We should note here that unidirectionality strongly decreases $\alpha$, the average surface roughness, see Point 11.1.2.1.

On the average surface roughness, $\alpha$ (the inhomogeneity of the magnetic field) and on the average peak-to-peak field intensity values, P-P. The performance of Generators 1, 7, and 8 was compared. The maximal activity of 59.81 % was achieved at 51 T/m surface roughness (at $z = 3$ mm). In all 3 cases the analgesic effect was statistically significant ($p_{\max} \le 0.003$). The bidirectional arrangements caused a higher analgesic effect than the unidirectional arrangements. Increasing the average surface
roughness as well as increasing the average P-P values causes an increasing analgesic effect.

On the magnetic matrix constant, $\lambda$. The situation presented was obtained in Generators 9 and 10. Introducing a gap between the magnets increased the analgesic effect. This is in harmony with the observations in the case of the normal distributions (see Panels (a)–(d) in Figure 11.1). The next situation was Generator 11 vs. 12. At first sight it might be surprising that the difference in the analgesic effect vanished. Our explanation for this is that two concurrent and opposite effects added up: different individual magnet radii appeared simultaneously with the modification of the average P-P values (e.g., at $z = 3$ mm). The P-P values changed in a different manner from Generator 9 to 10 than from Generator 11 to 12. The t-test showed a significant analgesic effect in all 4 cases ($p_{\max} \le 0.002$). The change of the magnetic matrix constant ($\lambda$) from 2R to 4R did not reduce the analgesic effect.

On the height of the individual cylindrical magnets. Generators 11 and 13 were compared. There was hardly any difference in the analgesic effect, which suggested that the analgesic effect did not benefit from the slight increase in the magnetic induction due to longer magnets. This may be due to a threshold in the induction that manifests itself in a saturation of the analgesic effect at a value somewhere over 80 %. The probabilities showed a significant analgesic effect in both cases ($p_{\max} < 0.001$). Doubling the height of the individual magnets did not significantly contribute to the enhancement of the analgesic effect.

Remark. If we let both height and matrix constant double, we get Generators 13 and 14 for $L = 20$ mm magnets (to be compared to Generators 11 and 12 for $L = 10$ mm). The high analgesic effect did not change.

On the diameter of the individual cylindrical magnets. The following generators were compared: 1 vs. 9. 
The slimmer magnets had about the same surface induction values as the larger ones (200.88 vs. 201.56 mT as measured at $z = 3$ mm from the magnets’ surface). There was no difference between the analgesic effects for the magnets with different radii; in both cases there was a significant analgesic effect ($p_{\max} \le 0.001$). Halving the radius of the individual magnets did not significantly reduce the analgesic effect.

On the magnetic coupling (mc). Generators 8 and 11 were tested here. The analgesic effect for the mc situation was found to be more pronounced. The t-test showed a significant analgesic effect in both cases ($p_{\max} \le 0.001$). In the course of our investigations, the highest average analgesic effect, 83.4 %, was found in Generator 11. The effect caused by mc (see Point 11.1.2.1) could explain this ratio of increase. In the case of bidirectional arrangements, mc increases the analgesic effect.

On the form of the magnets (do block magnets fulfill the tasks?). The arrangements in these cases were such that ferrite block magnets were used on both sides either bidirectionally or unidirectionally (refer to Cases 15 and 16 in Table 11.3). The
results in Table 11.3 show that both bidirectional and unidirectional cases resulted in analgesic effects below 30 %. Block magnets did not achieve a significant analgesic effect. On the basis of these studies (see Point 11.1.4), we used Generator 11 in our further experiments.

11.1.3.2 The Role of Endogenous Opioid Receptors [11]

Comparison of the antinociceptive effect of SMF (Generator 11) applied either before or after the i.p. injection of acetic acid is shown in Figure 11.4. Inhibition of the pain reaction induced by SMF did not differ between mice exposed before and mice exposed after the acetic acid challenge in the 0–5 or 6–20 min observation periods. However, no antinociceptive effect could be observed in the 21–30 min period if mice were exposed to SMF before the injection of acetic acid. The abdominal pain reaction during the whole 30 min observation period was inhibited by 46 % (p < 0.05) when SMF was applied before, and by 64 % (p < 0.005, cf. 54 % in Point 3.1) when SMF was applied following the i.p. injection of acetic acid.

[Figure 11.4: bar chart. Vertical axis: number of writhing (0–100). Horizontal axis: time (min), intervals 0–5, 6–20, 21–30, and 0–30. n = 6/group.]

Figure 11.4. The effect of SMF, applied for 30 min either after or before the injection of 0.6 % acetic acid, on the writhing syndrome, measured 0–5, 6–20, and 21–30 min after the acetic acid challenge. Open columns: number of writhings in control animals that were not exposed to SMF. Light shaded columns: number of writhings of mice exposed to SMF for 30 min after the administration of acetic acid. Dark shaded columns: number of writhings in mice exposed to SMF for 30 min before the administration of acetic acid. Each column represents the mean ± S.E.M. of 6 mice. *p < 0.05, **p < 0.005; ANOVA test (from [11]).

The effect of naloxone, β-funaltrexamine, naltrindole and norbinaltorphimine given peripherally on SMF-induced analgesic action. Naloxone in the dose of 0.2 or 1 mg/kg s.c. resulted in a significant inhibition of the antinociceptive action induced by SMF applied for 30 min after the acetic acid challenge. The μ-opioid receptor selective irreversible antagonist β-funaltrexamine also inhibited the SMF-induced analgesic effect. Naltrindole significantly decreased the SMF-induced antinociception as well, but the action was not fully reversed. In contrast, norbinaltorphimine failed to affect the analgesic effect of SMF (Figure 11.5).

[Figure 11.5: four bar-chart panels (a)–(d). Vertical axis in each panel: number of writhing in 30 min. Each panel compares saline s.c. with the antagonist, without and with SMF; n = 6–12/group.]

Figure 11.5. The effect of (a) naloxone (1.0 or 0.2 mg/kg s.c.), (b) β-funaltrexamine (20 mg/kg s.c.), (c) naltrindole (0.5 mg/kg s.c.), and (d) norbinaltorphimine (20 mg/kg s.c.) on the SMF-induced antinociceptive effect on writhing syndrome in mice. Each column represents the mean ± S.E.M. of 6–12 mice. Open columns: number of writhings in control animals that were not exposed to SMF. Shaded columns: number of writhings of mice exposed to SMF. **p < 0.005 compared to control groups, †p < 0.01 compared to SMF-saline group; ANOVA test (from [11]).

The effect of naloxone on SMF-induced antinociceptive action given intracerebroventricularly. Exposure of mice to SMF resulted in more than 50 % inhibition of the acetic acid-induced pain reaction. Naloxone injected intracerebroventricularly (i.c.v.) failed to affect the SMF-induced antinociceptive action (Figure 11.6).

11.1.3.3 MR “Therapy” [29]

As mentioned in Point 11.1.2.2, illumination can be a stress source for rodents. Care had to be taken to produce similar illumination at all experimental locations: inside the bore of the MR and outside. A large enough dimmed (< 20.6 lx) area was produced inside the cage by screening light emitted from halogen lamps above the cage.
[Figure 11.6: bar chart. Vertical axis: number of writhing in 30 min (0–80). Groups: saline s.c. and naloxone i.c.v. (10 mg/mouse), without and with SMF; n = 12/group.]

Figure 11.6. The effect of naloxone given intracerebroventricularly (10 mg/mouse i.c.v.) on the SMF-induced antinociceptive action on writhing syndrome in mice. Open columns: number of writhings in control animals. Shaded columns: number of writhings of mice exposed to SMF. Each column represents the mean ± S.E.M. of 12 mice. **p < 0.005 compared to saline-control group; †p < 0.005 compared to naloxone-control group; ANOVA test (from [11]).
The highest illumination was provided at the location of the control experiments. The lighting conditions inside the cage were basically independent of the cage location. The cage was illuminated from above during the experiments. The shaded area was always bigger than 94 × 140 (× 46) mm during the experiments. The lamps generated scattered light of 3.9–20.6 lx in the shaded area of the cage. The measured illumination values are shown in Figure 11.7.

Accordingly, three additional locations (tested with groups of 6 animals each) were used. The cage was placed in the service area outside the MR lab, where the MR’s stray field had a horizontal induction component of either 0.1 mT or 0.5 mT, denoted as “out 0.1 mT” or “out 0.5 mT”, respectively (see Figure 11.7). The vertical field component could be neglected since the cage was in the median plane of the bore (denoted by M in Figure 11.7). Neither gradient nor radiofrequency fields were on. The only horizontal field components were the axial ones, which could certainly be regarded as homogeneous within the cage. In the third group (denoted by “in 3 mT”), animals were exposed to SMF inside the MR lab in the median plane (M), where the horizontal magnetic induction originating from the MR was 3 mT, but noise and vibration were similar to those in the bore. The 0.1, 0.5, and 3 mT induction values were taken from the magnetic field map of the MRI as provided by Philips, showing the main horizontal components of the stray field in the median plane (M).

We usually assume that the motion of the animals is similar with and without SMF. If so, we can suppose that the induced currents, if any, have the same effect in both exposed groups. In the case of the MR model, we tried to minimize the generation of extra motion-induced currents in the animals on their way into the bore. Therefore, groups of mice at locations “3 T” and “NX 3 T” were moved into the experimental position in the bore along its axis (cf. the double arrow in Figure 11.7) with a constant speed not exceeding 0.5 m/s. The path of the cage out of the bore did not influence the measurement.
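The reasoning behind the speed limit can be illustrated with Faraday's law. The following is a minimal sketch, not the authors' calculation: it treats a conducting path in the animal as a circular loop perpendicular to the bore axis, so that the induced electromotive force is the loop area times the axial field gradient times the speed. The function name, the loop radius, and the average gradient (3 T reached over an assumed 2 m path) are hypothetical illustration values; only the 0.5 m/s speed comes from the text.

```python
import math


def motional_emf(loop_radius_m, gradient_t_per_m, speed_m_per_s):
    """Magnitude of the EMF (in volts) induced in a circular loop that moves
    along the bore axis through a field whose axial gradient is dB/dz.

    Faraday's law for a loop of area A perpendicular to the axis gives
    |emf| = A * |dB/dz| * v.
    """
    area = math.pi * loop_radius_m ** 2
    return area * abs(gradient_t_per_m) * speed_m_per_s


# Hypothetical values: a 1 cm loop radius and an average gradient of 1.5 T/m
# (3 T reached over an assumed 2 m path); 0.5 m/s is the speed limit quoted above.
emf = motional_emf(loop_radius_m=0.01, gradient_t_per_m=1.5, speed_m_per_s=0.5)
print(f"induced EMF: {emf * 1e6:.0f} microvolts")
```

Because the EMF scales linearly with the speed, capping the speed caps the induced voltage for a given field profile.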
[Figure 11.7: floor plan showing the MR with its median plane M, the table, the sound insulation, the motion of the cage, and the probed locations with their illumination values: “3 T” and “NX 3 T” 8.2 lx, “in 3 mT” 4.7 lx, “out 0.5 mT” 20.6 lx, “out 0.1 mT” 3.9 lx.]

Figure 11.7. Scheme of the experimental arrangement. Crossed circles denote the locations where the magnetic field has been probed: (i) 0 T SMF (control, n = 18), not seen in the figure, (ii) 3 mT inside the MR lab (“in 3 mT”, n = 6), (iii) 0.1 mT outside the MR lab (“out 0.1 mT”, n = 6), (iv) 0.5 mT outside the MR lab (“out 0.5 mT”, n = 6), (v) 3 T inside the bore of the MR (“3 T”, n = 18), and (vi) 3 T inside the bore of the MR following naloxone pretreatment (“NX 3 T”, n = 6). n denotes animal numbers. The double arrow shows the motion of the cage into and out of the bore. M is the median plane of the bore. Dosimetric illumination values in units of lx are also shown (from [30]).
The average number of writhings in different time intervals following the acetic acid challenge in the writhing test in mice can be seen in Figure 11.8. The experiments were performed according to the following protocol:
(i) no SMF (control, n = 18),
(ii) 3 mT inside the MR lab (“in 3 mT”, n = 6),
(iii) 0.1 mT outside the MR lab (“out 0.1 mT”, n = 6),
(iv) 0.5 mT outside the MR lab (“out 0.5 mT”, n = 6),
(v) 3 T inside the bore of the MR (“3 T”, n = 18), and
(vi) 3 T inside the bore of the MR following 0.2 mg/kg naloxone pretreatment (“NX 3 T”, n = 6).
The “NX 3 T” group achieved an average antinociceptive activity of 1 % and 0 % in the 6–20 and 21–30 min time intervals, respectively, and thus these values remain hidden in Figure 11.8. One-way ANOVA resulted in pmax < 0.001 with Fmin > 18.7, Fcrit = 2.4 for the four time intervals 0–5, 6–20, 21–30, and 0–30 min. The p and η² values of the pair-wise comparisons with the control can be seen in Table 11.4.
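The antinociceptive activity percentages quoted here can be read as the relative reduction of the mean number of writhings with respect to the control. The sketch below assumes this definition, which is not spelled out in this section; the function name and the sample counts are hypothetical.

```python
def percent_inhibition(control_mean, treated_mean):
    """Relative reduction (in %) of the mean writhing count versus control.

    Assumed definition: 100 * (1 - treated / control). A negative value means
    that the treated group writhed more than the control group.
    """
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (1.0 - treated_mean / control_mean)


# Hypothetical mean writhing counts, for illustration only.
print(round(percent_inhibition(control_mean=50.0, treated_mean=16.0), 1))
print(round(percent_inhibition(control_mean=50.0, treated_mean=53.0), 1))
```

Under this reading, an activity close to 0 % means that the treated group writhed about as often as its control.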
[Figure 11.8: bar chart. Vertical axis: number of writhing (0–100). Horizontal axis: time (min), intervals 0–5, 6–20, 21–30, and 0–30. Groups: control (n = 18), “in 3 mT” (n = 6), “out 0.1 mT” (n = 6), “out 0.5 mT” (n = 6), “3 T” (n = 18), “NX 3 T” (n = 6).]

Figure 11.8. The average number of writhings (mean ± S.E.M.) in mice in different time intervals following the intraperitoneal (i.p.) injection of 0.6 % acetic acid. The locations where the magnetic field has been probed are the same as summarized in Figure 11.7. The data were analyzed by means of one-way ANOVA. Equal numbers of animals were compared. A probability of p < 0.05 was considered statistically significant. ANOVA resulted in pmax < 0.001 with Fmin > 18.7, Fcrit = 2.4 for the four time intervals 0–5, 6–20, 21–30, and 0–30 min. In the case of the comparison between the “3 T” group and control, we used pooled data; in every other case we compared the treated group to the daily control. The p and η² values of the pair-wise comparisons are listed in Table 11.4 (from [30]).
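The one-way ANOVA decision rule used above (reject the null hypothesis of equal group means when F exceeds the critical value) can be sketched as follows. This is a generic textbook computation, not the authors' analysis script; the function name and the writhing counts are hypothetical, and the critical value has to be taken from an F table for the resulting degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    F is the ratio of the between-group mean square to the within-group
    mean square.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical writhing counts for three groups of 6 mice each.
control = [52, 47, 55, 49, 51, 50]
weak_field = [48, 50, 46, 53, 47, 49]
strong_field = [18, 15, 21, 17, 16, 19]
f_value, df1, df2 = one_way_anova_f([control, weak_field, strong_field])
print(f"F({df1}, {df2}) = {f_value:.1f}")
```

A significant F only states that at least one group mean differs; the pair-wise comparisons with the control are then needed to locate the difference.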
Table 11.4. p and η² values as calculated with Student's t-test for pair-wise comparisons between the different treated groups and control. Equal numbers of animals were compared. In the comparison between group “3 T” and control, we used pooled data (n = 3 × 6); in every other case we compared the group to the daily control (n = 6). * denotes a significant difference at the 5 % risk level (p < 0.05).

Time period   “3 T”             “in 3 mT”         “out 0.1 mT”      “out 0.5 mT”      “NX 3 T”
(min)         (n = 18)          (n = 6)           (n = 6)           (n = 6)           (n = 6)
              p          η²     p          η²     p          η²     p          η²     p          η²
0–5           < 0.001*   0.2    > 0.1      0.1    > 0.1      0.1    < 0.05*    0.0    > 0.1      0.2
6–20          < 0.001*   0.3    > 0.5      0.1    > 0.1      0.0    > 0.5      0.1    > 0.5      0.2
21–30         < 0.001*   0.4    < 0.01*    0.1    < 0.05*    0.1    < 0.05*    0.1    > 0.5      0.1
0–30          < 0.001*   0.3    > 0.05     0.1    < 0.05*    0.1    > 0.05     0.1    > 0.5      0.2
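The pair-wise comparisons in Table 11.4 rely on Student's t-test for two groups of equal size. A minimal sketch of the pooled-variance t statistic follows; it is a generic formula rather than the authors' code, the function name and the counts are hypothetical, and the p value still has to be looked up for the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test
    with pooled variance (equal variances assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical writhing counts: daily control versus one exposed group (n = 6 each).
daily_control = [52, 47, 55, 49, 51, 50]
exposed = [40, 44, 38, 47, 41, 43]
t_value, df = student_t(daily_control, exposed)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```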
11.1.4 Discussion

The present data demonstrate that an SMF configuration exists that can achieve a high and statistically significant analgesia in the writhing test in mice. In an effort to optimize the SMF generator for analgesia, we should first recognize the possible errors introduced in our evaluation. These were: (i) the reading error of the magnetic induction scans (for greater elevations and lower inductions the scans were distorted); (ii) the lateral components of the magnetic field were not measured; (iii) there was a difference between the Br remanent induction values (presumably equal to the absolute value of the homogeneous magnetization vector M in the magnetic material) given in the technical specification of the magnets and the ones calculated from the surface induction value; (iv) the magnetization was neither perfectly homogeneous nor even perfectly cylindrically symmetrical in the cylindrical magnets; (v) the experiments were not factorially designed, so the survey may lack some experiments. However, the following 10 trends and relationships could still be identified.

(1) The analgesic effect has a correlation of 85.61 % (p < 5 × 10⁻⁵) with the grade of the magnet, which emphasizes the role of the grade. Due to collinearities this also involves a strong correlation with Br (84.06 %), with the induction measured on the magnet surface (82.96 %), and with the material of the magnet (76.07 %). The grade should be optimized, since the constructional and budgetary requirements of a potential medical device conflict.

(2) The analgesic effect correlates with the polarity of the magnet arrangement to an extent of 69.27 % (p < 5 × 10⁻⁵). Accordingly, the bidirectional arrangement should be preferred. A bidirectional field generates a local P-P value at 3 mm from the magnet surface of a minimum of 4.37 (ferrite magnets, Generators 4 and 7 in Table 11.2) to a maximum of 80.39 (NdFeB N50 magnets, Generators 6 and 8 in Table 11.2) times that of a unidirectional field, and contributes to the enhancement of the average surface roughness, which consequently generates a higher analgesic effect (–6.20 vs. 47.54 % and 50.21 vs. 59.81 %, Generator 4 vs. 7 and 6 vs. 8 in Table 11.3, respectively).

(3) The cylindrical shape is better than the block shape, as suggested by the 63.95 % (p < 6 × 10⁻⁵) correlation between the analgesic effect and the shape of the magnet.

(4) The analgesic effect has a correlation of 57.06 % with magnetic coupling (mc) (p < 5 × 10⁻⁵); mc should be included in a successful arrangement for analgesia. It enhances the analgesic effect by 39.44 %; see, for example, Generator 8 vs. 11 in Table 11.3.

(5) The analgesic effect shows a correlation of 52.19 % (p < 6 × 10⁻⁵) with the average B value of the induction scan at 3 mm from the magnet surface (see column 5 in Table 11.2). This may imply that the analgesia happens in the mice close to the magnet surface; in other words, the effect may be more peripheral than central. This answers the question of what the possible relation is between the field generated at the magnet source and the field at the target (animal) site.

(6) There is a correlation of 55.99 % (p < 6 × 10⁻⁵) with the number of magnetic matrices; 2 are better than 1. A single-sided arrangement causes far less than 50 % as much analgesic effect as a double-sided one.

(7) We found a moderate correlation of 46.24 to 47.93 % (pmax < 7 × 10⁻⁵) between the analgesic effect and the surface roughness; it was consistent for all three surface scans (at 3, 10, and 15 mm from the surface).

(8) The analgesic effect is most pronounced in the first 5 min of the experiments, as suggested by the number of writhings per 5 min period (5 min writhing rate) in the treated groups. The 5 min writhing numbers increase by as much as 16 times from the first 5 to the last 5 min of the 30 min time period.

(9) A greater matrix constant does not reduce the analgesic effect (it decreases the resultant induction, but increases the surface roughness) and is therefore preferable on budgetary grounds.

(10) Block magnets with lateral sizes exceeding or comparable to those of a mouse do not perform well in analgesia, probably due to the laterally slowly changing SMF and, therefore, the small surface roughness.

A stepwise and a best subset regression were also executed to explore deeper relations. In this estimate, the matrix in question consisted of data for the number of writhings corresponding to all 17 types of experiments (carried out on 71 dates) and to 31 variables, which were not necessarily independent, i.e., collinearity appeared. The variables were distributed into three groups: magnet arrangement, magnetic field map, and animal experiment. The magnet arrangement group contained: material, grade, remanent induction, shape, height, radius, side length, matrix constant, number of bonds, induction value on the magnet surface, magnetic coupling, and polarity. The field map group contained: P-P value, surface roughness, mean, and standard deviation of the induction at 3, 10, and 15 mm z values. The animal experiment group contained: animal identification number, period of examination, mean, and standard deviation of writhings. We looked for variables that could be the best predictors of analgesia. On the basis of our experience so far, we could not identify a single parameter that would exclusively be responsible for the results.

Remark. There might be a chance that magnetic induction, surface roughness, matrix constant, or other quantities work in synergy only in a “window”. See, for example, the paper of Kirson et al. [25] concerning the window effect in a situation induced by an electric field. Further studies must be devoted to this question.
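The correlation percentages in the list above can be understood as Pearson correlation coefficients between the analgesic effect and one predictor at a time. The sketch below shows the computation for a binary predictor (polarity coded as 0 = unidirectional, 1 = bidirectional). The coding and the function name are assumptions, and only the four effect values quoted in point (2) are used, so the result illustrates the method and does not reproduce the 69.27 % obtained from the full data set.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Generators 4, 7, 6, 8: polarity (assumed coding) and analgesic effect in %
# as quoted in point (2) of the list above.
polarity = [0, 1, 0, 1]
analgesic_effect = [-6.20, 47.54, 50.21, 59.81]
r = pearson_r(polarity, analgesic_effect)
print(f"correlation with polarity: {100 * r:.1f} %")
```

Because predictors such as grade, Br, and surface induction are collinear, a high coefficient for one of them does not isolate its individual contribution, which is why the regression analysis was carried out in addition.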
[Figure 11.9: chart of the analgesic effect (%) on the vertical axis (–10 to 100) for the case numbers #1 to #16, grouped into Category 1, Category 2, and Category 3.]

Figure 11.9. The analgesic effect in visceral pain elicited by i.p. injection of acetic acid in mice in the 0–30 min time interval. SMF was applied full-body in different magnetic configurations; see Tables 11.1–11.3 for the generator numbers (from [28]).
Summarizing the results for the 0–30 min interval in Figure 11.9, the following relationships can be established. The strongest analgesic effect of SMF (category 1) is achieved by bidirectional magnet arrangements with mc and a gap of 4R. In the second category, there are arrangements which consist of bidirectional or very strong unidirectional magnetic fields with no gap. Arrangements in the last class (category 3) cannot be proposed for purposes of analgesia. These include arrangements with unidirectional or block magnets, or magnets on only one side of the cage.

We suspected that opioid receptors mediate the analgesic effect of SMF exposure, since they are involved in most forms of analgesia in general. The question was raised which opioid receptor may play a role in the antinociceptive effect of SMF. Since naloxone (0.2 mg/kg) as well as the μ-opioid receptor selective irreversible antagonist β-funaltrexamine inhibited the antinociceptive effect of SMF, the involvement of opioid receptors in the antinociceptive action may be suggested. The δ-opioid receptor antagonist naltrindole also strongly reduced the effect of SMF. In contrast, the κ-opioid receptor antagonist norbinaltorphimine failed to affect the antinociceptive effect. These results indicate that activation of μ-, as well as δ-, but not κ-opioid receptors may play a role in the SMF-induced antinociceptive action in the mouse. Our results are consistent with the findings of Thomas et al. [64], who suggested that κ-opioid receptors may not be involved in the analgesic action of magnetic fields in the land snail against thermal nociception. Based on considerations on endogenous ligands, their affinity, and specificity for opioid receptors, it might be speculated that the release of either the μ-opioid receptor selective agonist endomorphins and/or the δ- and μ-opioid receptor agonist β-endorphin may be responsible for the antinociceptive action
of SMF in the mouse. The potential role of opioid peptides in SMF-induced biological responses is supported by the finding that an increased concentration of β-endorphin (and substance P) was observed in the hypothalamus following repeated exposure of rats to an extremely low frequency magnetic field [2].

The potential site of action of the SMF-induced antinociceptive effect has also been studied. Intracerebroventricular administration of naloxone failed to affect the analgesic effect of SMF, indicating that the site of action is not likely to be at the supraspinal level. However, a spinal mechanism in the SMF-induced analgesic action may be considered. Specifically, the acetic acid-induced pain reaction was suggested to involve inflammatory components [12], which can result in a prolonged afferent input and the concomitant physiological changes. For example, repeated small afferent input leads to the release of active factors in the dorsal horn neurons, initiating a spontaneous afferent traffic [72]. Endomorphin-2 is present in the primary afferent fibers and in the spinal cord [3, 37, 43], and endomorphin-2 released from fibers may function to modify pain sensations from the visceral organs [37]. Furthermore, β-endorphin was also identified in the spinal cord [10]. Consequently, it may be hypothesized that the analgesic effect of SMF is due to the release of either β-endorphin or endomorphin-2 in the spinal cord.

It was also shown in the present study that the SMF of a clinical MR is able to produce analgesia in vivo in the writhing test in mice, and that naloxone reverses the effect. Although we cannot exclude the possibility that the pain perception of mice was influenced by motion-induced currents in their body, we think that this effect was negligible in the present study. The question to be answered primarily concerns the motion of the cage (and the mice inside) into the bore of the MR, along which the induction changes by five orders of magnitude. In order to control the effect of the induced currents due to the gradient of the magnetic field, we moved the cage along the axis of the bore with a constant speed not exceeding 0.5 m/s. We refer the reader here to a study focused on the potential effect of induced electric currents during the writhing test [30].

Our examination of the background mechanisms of action of analgesia proved to be transferable to models other than the writhing test in mice. There is evidence that SMF exposure also influences the rodent responses to acute peripheral pain [55], and also to chronic pain [1].
11.1.5 Conclusions

Verification of Hypothesis 1 and Hypothesis 2: Having tested a number of different SMF arrangements for their analgesic effect in mice, we concluded that the optimal arrangement for this purpose is an arrangement with two magnetic matrices, one below and one above the animal cage. The separation of the two matrices should be kept as small as possible. The lower poles of the lower matrix and the opposite upper poles
of the upper matrix should be coupled magnetically, the individual magnets must sit next to one another or at a double diameter distance from each other with alternating poles, and their material should be NdFeB grade N50, cylindrical in shape with a radius of 5 mm and a height of 10 mm or 20 mm.

Verification of Hypothesis 3: The present data demonstrated that SMF inhibits the pain reaction due to chemical irritation of the peritoneum in the mouse. The opioid system is likely to be involved in the SMF-induced analgesic effect, either at the periphery or at the spinal level. The analgesic action may be mediated by μ- and (to a lesser extent) δ-opioid receptors. The present study shows that the 3 T homogeneous SMF of a clinical MR induces a significant pain-inhibitory effect in the writhing test in mice. Since the antinociceptive activity for the whole 30 min observation period is 68 % (p < 0.01, n = 18), the MR’s SMF should be regarded as a potential therapeutic tool. Noise, vibration, lighting stimuli as well as motion-induced effects were estimated not to contribute to pain sensation in a significant manner in the present experiments.

Acknowledgments. The author is grateful to his coauthors listed in the references.
Bibliography

[1] M. Antal, J. László, Exposure to inhomogeneous static magnetic field ceases mechanical allodynia in neuropathic pain, Bioelectromagnetics 30(6) (2009), 438–445.
[2] X. Bao, Y. Shi, X. Huo, A possible involvement of β-endorphin, substance P, and serotonin in rat analgesia induced by extremely low frequency magnetic field, Bioelectromagnetics 27(6) (2006), 467–472.
[3] G. A. Barr, J. E. Zadina, Maturation of endomorphin-2 in the dorsal horn of the medulla and spinal cord of the rat, Neuroreport 10 (1999), 3857–3860.
[4] S. W. Cain, T. Chou, M. R. Ralph, Circadian modulation of performance on an aversion-based place learning task in hamsters, Behavioral Brain Research 150 (2004), 201–205.
[5] E. Choleris, C. Del Seppia, A. W. Thomas, P. Luschi, S. Ghione, G. R. Moran, F. S. Prato, Shielding, but not zeroing of the ambient magnetic field reduces stress-induced analgesia in mice, Proceedings of the Royal Society of London Series B-Biological Sciences 269 (2002), 193–201.
[6] H. O. J. Collier, L. C. Dinnen, L. A. Johnson, C. Schneider, The abdominal constriction response and its suppression by analgesic drugs in the mouse, British Journal of Pharmacology and Chemotherapy 32 (1968), 295–310.
[7] J. E. Charlton (ed.), Core curriculum for professional education in pain, 3rd edition, International Association for the Study of Pain, IASP Press, Seattle, 2005.
[8] M. Dacke, D. E. Nilsson, C. H. Scholtz, M. Byrne, E. J. Warrant, Animal behavior: insect orientation to polarized moonlight, Nature 424(6944) (2003), 33.
[9] A. M. B. Garcia, F. P. Cardenas, S. Morato, Effects of different illumination levels on rat behavior in the elevated plus maze, Physiology & Behavior 85 (2005), 265–270.
[10] H. B. Gutstein, D. M. Bronstein, H. Akil, β-endorphin processing and cellular origins in rat spinal cord, Pain 51(2) (1992), 241–247.
[11] K. Gyires, B. Rácz, Z. S. Zádori, J. László, Pharmacological analysis of static magnetic field-induced antinociceptive action in the mouse, Bioelectromagnetics 29(6) (2008), 456–462.
[12] K. Gyires, Z. Torma, The use of writhing test in mice for screening different types of analgesics, Archives Internationales de Pharmacodynamie et de Thérapie 267 (1984), 131–140.
[13] G. Hansen, L. E. Crooks, P. Davis, J. De Groot, R. Herfkens, A. R. Margulis, C. Gooding, L. Kaufman, J. Hoenninger, M. Arakawa, R. McRee, J. Watts, In vivo imaging of the rat anatomy with nuclear magnetic resonance, Radiology 136 (1980), 695–700.
[14] Zs. Helyes, Á. Szabó, J. Németh, B. Jakab, E. Pintér, Z. Szilvássy, J. Szolcsányi, Antiinflammatory and analgesic effects of somatostatin released from capsaicin-sensitive sensory nerve terminals in Freund’s adjuvant-induced chronic arthritis model of the rat, Arthritis & Rheumatism 50 (2004), 1677–1685.
[15] W. B. High, J. Sikora, K. Ugurbil, M. Garwood, Subchronic in vivo effects of a high static magnetic field (9.4 T) in rats, Journal of Magnetic Resonance Imaging 12 (2000), 122–139.
[16] S. Hofmann, Quantitative depth profiling in surface analysis – A Review, Surf. Interf. Anal. 2(3) (1980), 148–160.
[17] T. A. Houpt, D. W. Pittman, J. M. Barranco, E. H. Brooks, J. C. Smith, Behavioral effects of high-strength static magnetic fields on rats, J. Neurosci. 23(4) (2003), 1498–1505.
[18] T. A. Houpt, J. A. Cassell, C. Riccardi, M. D. DenBleyker, A. Hood, J. C. Smith, Rats avoid high magnetic fields: dependence on an intact vestibular system, Physiol. Behav. 92(4) (2007), 741–747.
[19] S. Ichioka, M. Minegishi, M. Iwasaka, M. Shibata, T. Nakatsuka, J. Ando, S. Ueno, Skin temperature changes induced by strong magnetic field exposure, Bioelectromagnetics 24 (2003), 380–386.
[20] N. K. Innis, K. P. Ossenkopp, F. S. Prato, E. Sestini, Behavioral effects of exposure to nuclear magnetic resonance imaging, II. Spatial memory tests, Magnetic Resonance Imaging 4(4) (1986), 281–284.
[21] B. J. Jones, D. J. Roberts, A rotarod suitable for quantitative measurements of motor incoordination in naïve mice, Naunyn-Schmiedebergs Archiv für Experimentelle Pathologie und Pharmakologie 259(2) (1968), 211.
[22] M. Kaplan, B. Jackson, R. Sparer, Escape behavior under continuous reinforcement as a function of aversive light intensity, Journal of the Experimental Analysis of Behavior 8(5) (1965), 321–323.
[23] M. Kavaliers, K. P. Ossenkopp, M. Hirst, Magnetic fields abolish the enhanced nocturnal analgesic response to morphine in mice, Physiology & Behavior 32 (1984), 261–264.
[24] M. Kavaliers, K. P. Ossenkopp, Magnetic field inhibition of morphine-induced analgesia and behavioral activity in mice – Evidence for involvement of calcium ions, Brain Res. 379(1) (1986), 30–38.
[25] E. D. Kirson, Z. Gurvich, R. Schneiderman, E. Dekel, A. Itzhaki, Y. Wasserman, R. Schatzberger, Y. Palti, Disruption of cancer cell replication by alternating electric fields, Cancer Research 64 (2004), 3288–3295.
[26] A. Kwong-Hing, H. S. Sandhu, F. S. Prato, J. R. Frappier, M. Kavaliers, Effects of magnetic resonance imaging (MRI) on the formation of mouse dentin and bone, Journal of Experimental Zoology 252(1) (1989), 53–59.
[27] D. J. Langford, S. E. Crager, Z. Shehzad, S. B. Smith, S. G. Sotocinal, J. S. Levenstadt, M. L. Chanda, D. J. Levitin, J. S. Mogil, Social modulation of pain as evidence for empathy in mice, Science 312 (2006), 1967–1970.
[28] J. László, J. Reiczigel, L. Székely, A. Gasparics, I. Bogár, L. Bors, B. Rácz, K. Gyires, Optimization of static magnetic field parameters improves analgesic effect in mice, Bioelectromagnetics 28(8) (2007), 615–627.
[29] J. László, K. Gyires, 3 T homogeneous static magnetic field of a clinical MR significantly inhibits pain in mice, Life Sciences 84(1-2) (2008), 12–17.
[30] J. László, K. Gyires, Analysis of inhomogeneous static magnetic field-induced antinociceptive activity in mice, PIERS Online 6(4) (2010), 307–313.
[31] D. Le Bars, M. Gozariu, S. W. Cadden, Animal models of nociception, Pharmacol. Rev. 53 (2001), 597–652.
[32] R. L. Levine, T. D. Bluni, Magnetic field effects on spatial discrimination learning in mice, Physiology & Behavior 55(3) (1994), 465–467.
[33] R. L. Levine, J. K. Dooley, T. D. Bluni, Magnetic field effects on spatial discrimination and melatonin levels in mice, Physiology & Behavior 58(3) (1995), 535–537.
[34] E. Lloyd (ed.), Handbook of Applicable Mathematics, Volume VI. Statistics, John Wiley & Sons, Chichester, 1984.
[35] D. R. Lockwood, B. Kwon, J. C. Smith, T. A. Houpt, Behavioral effects of static high magnetic fields on unrestrained and restrained mice, Physiology & Behavior 78 (2003), 635–640.
[36] V. M. Lukasik, R. J. Gillies, Animal anaesthesia for in vivo magnetic resonance, NMR in Biomedicine 16 (2003), 459–467.
[37] S. Martin-Schild, J. E. Zadina, A. A. Gerall, S. Vigh, A. J. Kastin, Localization of endomorphin-2-like immunoreactivity in the rat medulla and spinal cord, Peptides 18 (1997), 1645–1649.
[38] M. J. McLean, S. Engström, R. R. Holcomb, D. Sanchez, A static magnetic field modulates severity of audiogenic seizures and anticonvulsant effects of phenytoin in DBA/2 mice, Epilepsy Res. 55(1-2) (2003), 105–116.
[39] J. M. Messmer, J. H. Porter, P. Fatouros, U. Prasad, M. Weisberg, Exposure to magnetic resonance imaging does not produce taste aversion in rats, Physiology & Behavior 40(2) (1987), 259–261.
[40] C. M. Nolte, D. W. Pittman, B. Kalevitch, R. Henderson, J. C. Smith, Magnetic field conditioned taste aversion in rats, Physiology & Behavior 63 (1998), 683–688.
[41] H. Okano, C. Ohkubo, Anti-pressor effects of whole body exposure to static magnetic field on pharmacologically induced hypertension in conscious rabbits, Bioelectromagnetics 24 (2003), 139–147.
[42] K. P. Ossenkopp, N. K. Innis, F. S. Prato, E. Sestini, Behavioral effects of exposure to nuclear magnetic resonance imaging, I. Open-field behavior and passive avoidance learning in rats, Magnetic Resonance Imaging 4(4) (1986), 275–280.
[43] T. L. Pierce, M. D. Grahek, M. V. Wessendorf, Immunoreactivity for endomorphin-2 occurs in primary afferents in rats and monkey, Neuroreport 9 (1998), 385–389.
[44] J. P. Pinel, D. G. Mumby, F. N. Dastur, J. G. Pinel, Rat (Rattus norvegicus) defensive behavior in total darkness, risk-assessment function of defensive burying, Journal of Comparative Psychology 108(2) (1994), 140–147.
[45] I. Pirko, S. T. Fricke, A. J. Johnson, M. Rodriguez, S. I. Macura, Magnetic resonance imaging, microscopy, and spectroscopy of the central nervous system in experimental animals, NeuroRx 2 (2005), 250–264.
[46] N. Prasad, S. C. Bushong, J. I. Thornby, R. N. Bryan, C. F. Hazlewood, J. E. Harrell, Effect of nuclear magnetic resonance on chromosomes of mouse bone marrow cells, Journal of Magnetic Resonance Imaging 2(1) (1984), 37–39.
[47] F. S. Prato, J. R. Frappier, R. R. Shivers, M. Kavaliers, P. Zabel, D. Drost, T. Y. Lee, Magnetic resonance imaging increases the blood-brain barrier permeability to 153gadolinium diethylenetriaminepentaacetic acid in rats, Brain Research 523(2) (1990), 301–304.
[48] F. S. Prato, J. M. Wills, J. Roger, H. Frappier, D. J. Drost, T. Y. Lee, R. R. Shivers, P. Zabel, Blood-brain barrier permeability in rats is altered by exposure to magnetic fields associated with magnetic resonance imaging at 1.5 T, Microscopy Research and Technique 27(6) (1994), 528–534.
[49] F. S. Prato, J. A. Robertson, D. Desjardins, J. Hensel, A. W. Thomas, Daily repeated magnetic field shielding induces analgesia in CD-1 mice, Bioelectromagnetics 26(2) (2005), 109–117.
[50] N. M. Rofsky, D. J. Pizzarello, M. O. Duhaney, A. K. Falick, N. Prendergast, J. C. Weinreb, Effect of magnetic resonance exposure combined with gadopentetate dimeglumine on chromosomes in animal specimens, Academic Radiology 2(6) (1995), 492–496.
[51] R. A. Rogachefsky, R. D. Altman, M. S. Markov, H. S. Cheung, Use of a permanent magnetic field to inhibit the development of canine osteoarthritis, Bioelectromagnetics 25 (2004), 260–270.
[52] A. D. Rosen, Inhibition of calcium channel activation in GH3 cells by static magnetic fields, Biochimica et Biophysica Acta 1148 (1996), 149–155.
[53] A. D. Rosen, Effect of a 125 mT static magnetic field on the kinetics of voltage activated Na+ channels in GH3 cells, Bioelectromagnetics 24 (2003), 517–523. [54] A. D. Rosen, Mechanism of action of moderate-intensity static magnetic fields on biological systems, Cell Biochem. Biophys. 39 (2003), 163–173. [55] K. Sándor, Zs. Helyes, K. Gyires, J. Szolcsányi, J. László, Static magnetic field-induced antinociceptive effect and the involvement of capsaicin-sensitive sensory nerves in this mechanism, Life Sciences 81 (2007), 97–102. [56] R. Saunders, Static magnetic fields – Animal studies, Progress in Biophysics and Molecular Biology 87 (2005), 225–239. [57] SCENIHR – Scientific Committee on Emerging and Newly Identified Health Risks, Health Effects of Exposure to EMF, http://ec.europa.eu/health/ph_risk/committees/04_scenihr/docs/scenihr_o_022.pdf. [58] R. R. Shivers, M. Kavaliers, G. C. Teskey, F. S. Prato, R. M. Pelletier, Magnetic resonance imaging temporarily alters blood-brain barrier permeability in the rat, Neuroscience Letters 76(1) (1987), 25–31. [59] K. C. Smith, S. L. Comite, S. Balasubramanian, A. Carver, J. F. Liu, Vibration anesthesia: a noninvasive method of reducing discomfort prior to dermatologic procedures, Dermatology Online Journal 10(2) (2004), 1. [60] D. J. Snyder, J. W. Jahng, J. C. Smith, T. A. Houpt, c-Fos induction in visceral and vestibular nuclei of the rat brain stem by a 9.4 T magnetic field, Neuroreport 11(12) (2000), 2681–2685. [61] S. Stern, V. G. Laties, 60 Hz electric fields and incandescent light as aversive stimuli controlling the behavior of rats responding under concurrent schedules of reinforcement, Bioelectromagnetics 19 (1998), 210–221. [62] Á. Szállási, F. Joó, P. M. Blumberg, Duration of desensitization and ultrastructural changes in dorsal root ganglia in rats treated with resiniferatoxin, an ultrapotent capsaicin analog, Brain Research 503 (1989), 68–72. [63] G. C. Teskey, K. P. Ossenkopp, F. S. Prato, E. 
Sestini, Survivability and long-term stress reactivity levels following repeated exposure to nuclear magnetic resonance imaging procedures in rats, Physiological Chemistry & Physics & Medical NMR 19(1) (1987), 43–49. [64] A. W. Thomas, M. Kavaliers, F. S. Prato, K. P. Ossenkopp, Pulsed magnetic field induced “analgesia” in the land snail, Cepaea nemoralis, and the effects of μ-, δ- and κ-opioid receptor agonists/antagonists, Peptides 18(5) (1997), 703–709. [65] V. Veliks, E. Ceihnere, I. Svikis, J. Aivars, Static magnetic field influence on rat brain function detected by heart rate monitoring, Bioelectromagnetics 25 (2004), 211–215. [66] J. Weiss, R. C. Herrick, K. H. Taber, C. Contant, G. A. Plishker, Bio-effects of high magnetic fields: a study using a simple animal model, Journal of Magnetic Resonance Imaging 10(4) (1992), 689–694. [67] V. C. Wende, S. Margoli, Analgesic test based upon experimentally induced acute abdominal pain in rats, Fed Proc 15 (1956), 494.
[68] WHO – World Health Organization, Environmental Health Criteria 232, Static Fields http://who.int/peh-emf/publications/EHC_232_Static_Fields_full_document.pdf, 2006. [69] A. Wieraszko, Dantrolene modulates the activity of steady magnetic fields on hippocampal evoked potentials in vitro, Bioelectromagnetics 21 (2000), 175–182. [70] L. B. Witkin, C. F. Heuter, F. Galdi, E. O’Keefe, P. Spitaletta, A. J. Plummer, Pharmacology of 2-amino-indane hydrochloride (Su-8629), a potent non-narcotic analgesic, J. Pharmacol. Exp. Ther. 133 (1961), 400–408. [71] S. Xu, H. Okano, C. Ohkubo, Acute effects of whole-body exposure to static magnetic fields and 50-Hz electromagnetic fields on muscle microcirculation in anesthetized mice, Bioelectrochemistry 53(1) (2001), 127–135. [72] T. L. Yaksh, Spinal systems and pain processing: Development of novel analgesic drugs with mechanistically defined models, TiPS 20 (1999), 329–337. [73] M. Zimmermann, Ethical guidelines for investigations of experimental pain in conscious animals, Pain 16 (1983), 109–110.
Author Information János F. László, Faculty of Informatics, University of Debrecen, Debrecen, Hungary E-mail: [email protected]
12
Mathematical Biomedicine and Modeling Avascular Tumor Growth
Helen M. Byrne
12.1 Continuum Models of Avascular Tumor Growth
Abstract. In this chapter we review existing continuum models of avascular tumor growth, explaining how they are inter-related and the biophysical insight that they provide. The models range in complexity and include one-dimensional studies of radially-symmetric growth, and two-dimensional models of tumor invasion in which the tumor is assumed to comprise a single population of cells. We also present more detailed, multiphase models that allow for tumor heterogeneity. The chapter concludes with a summary of the different continuum approaches and a discussion of the theoretical challenges that lie ahead.
Keywords. Avascular Tumor Growth, Cancer, Hypoxia, Mathematical Modeling, Moving Boundary Problem, Multicellular Tumor Spheroid, Multiphase Model, Partial Differential Equation
2010 Mathematics Subject Classification. 35Q92, 92B05
12.1.1 Introduction
Cancer is a complex, multi-factorial disease which continues to devastate lives and cause widespread morbidity and mortality throughout the world. The vast amounts of money that have been invested in cancer research have undoubtedly advanced understanding of how the disease progresses and contributed to the significant increases in five-year survival rates for certain types of cancer, including breast and colon. Unfortunately this trend does not apply to all cancers, with survival rates for brain tumors and cervical cancer showing little change since 1985. Further, it has been argued that the increases in survival rates should be attributed to better screening programs and earlier diagnosis rather than scientific advances. While debates about the societal benefit of cancer research will continue to excite interest for years to come, it is less controversial to view cancer as a multistage disease, characterized by the progressive loss of function of a range of regulatory genes, including repair genes that correct mutations and DNA damage before cell division and tumor suppressor genes that signal for cell-cycle arrest or induce programmed
This publication was based on work supported in part by Award No. KUK-C1-013-03, made by King Abdullah University of Science and Technology (KAUST).
cell death (apoptosis) if substantial genetic damage is detected [51]. The uncontrolled division of a mutated cell leads to the growth of a (small) avascular tumor, which will then remain dormant unless it is eliminated by immune surveillance or it acquires its own blood supply by the process of angiogenesis. The resulting vascular tumor is well supplied with nutrients and may increase rapidly in size. Finally, malignant tumors are able to invade the surrounding tissue, leading to metastatic spread, with secondary tumors arising elsewhere in the host. Further genetic mutations within the tumor mass may endow it with drug resistance and facilitate its later stages of development. The complex pattern of vascular tumor growth, combined with the paucity of reliable experimental assays for collecting dynamic data from the same tumor, has deterred many theoreticians from attempting to model this phase of tumor growth [9, 77, 97]. Fortunately, the situation is starting to change, stimulated in large part by technological advances which now make it possible to visualize simultaneously changes in the size and spatial composition of vascular tumors using certain experimental assays. Thus, as we look to the future, we should anticipate the development of realistic models of vascular tumor growth that have been validated against experimental data. However, it will be some years before this vision becomes a reality and there are assays for vascular tumor growth which generate data that are as accurate and reproducible as those obtained when clusters of tumor cells are cultured in vitro as multicellular spheroids [65]. For these reasons, in this chapter we will focus on reviewing theoretical efforts to describe the early phase of avascular tumor growth. During avascular growth, externally-supplied nutrients are consumed by live, proliferating cells as they diffuse toward the tumor center. 
As the tumor grows, the amount of nutrient reaching the center declines until there is insufficient nutrient to sustain viable cells. There ensues the formation of a central core of dead (necrotic) cellular material whose size increases as the tumor continues to grow. Thus, a well-developed avascular tumor comprises an outer rim of nutrient-rich, proliferating cells and a central core of nutrient-starved, necrotic debris. These regions may be separated by a layer of oxygen-poor (hypoxic) cells which are quiescent (viable but nonproliferating). Since diffusion controls the delivery of nutrients (e.g., oxygen and glucose) to, and the removal of waste products from, avascular tumors [36, 93], the diameter to which they may grow is typically limited to several millimetres, growth halting when the rate of volume increase of the tumor balances its rate of volume loss, increases being due to cell growth and proliferation and decreases to cell death. There is a large theoretical literature devoted to mathematical models of the growth of avascular tumors and multicellular tumor spheroids. While in this chapter attention will focus on deterministic models that can be formulated as mixed systems of partial differential equations (PDEs), we pause here to mention some of the other approaches that are being used. These include discrete, cell-based models which view the tumor as a collection of interacting cells, each assigned its own set of parameter values and behavioral rules [6, 90]. Such models are gaining in popularity and have been used to study not only the growth of multicellular tumor spheroids [59, 69] but
also tumor invasion [32, 101] and the fixation of clonal sub-populations within the intestinal crypt [102]. Additionally, Anderson and coworkers have used cellular automata models to investigate how the microenvironment (specifically, the local oxygen concentration and extracellular matrix density) influences (and is influenced by) the growth dynamics and phenotypic diversity of a tumor [7, 84]. Their simulations predict that when oxygen levels are low the tumor will rapidly diverge from its initial phenotype and exhibit high levels of population diversity, with aggressive phenotypes quickly becoming dominant.
In this chapter we present a series of increasingly complex, spatially-structured models of avascular growth, starting in Section 12.1.2 with one-dimensional models. The stability of these models to symmetry-breaking perturbations is considered in Section 12.1.3 and the results used to determine conditions under which a radially-symmetric tumor remains compact and localized (i.e., stability) and conditions under which it is predicted to become irregularly shaped and invasive (i.e., instability). In Section 12.1.4 we use a multiphase modeling framework to extend the models from Sections 12.1.2 and 12.1.3 to allow for tumor heterogeneity. In so doing, we not only provide justification for the use of Darcy’s law to describe cell motion (this law states that cells move down pressure gradients) but also enable other descriptions of the tumor’s material properties to be incorporated. The chapter concludes in Section 12.1.5 with a summary of the models presented and a discussion of future theoretical challenges.
12.1.2 Diffusion-limited Models of Avascular Tumor Growth
12.1.2.1 Introduction
In this section we consider the earliest continuum models for multicellular tumor spheroids growing in free suspension [11, 47, 88]. We view the tumor as a radially-symmetric and spatially-uniform mass of cells whose net growth rate is determined by local levels of a single, diffusible species (such as oxygen or glucose) which is present in the culture medium that surrounds the spheroids. As we explain below, these models couple a reaction-diffusion equation for the growth-rate limiting nutrient $c(r,t)$ to an integro-differential equation for the outer tumor radius $R(t)$, with additional equations defining implicitly internal boundaries $R_H(t)$ and $R_N(t)$ that mark the transitions between nutrient-rich regions of cell proliferation, and nutrient-poor regions of hypoxia and necrosis. The governing equations are introduced in Subsection 12.1.2.2 and analyzed in Subsection 12.1.2.3. We identify conditions under which the models reduce to a nonlinear ordinary differential equation for $R(t)$ and algebraic equations for the internal free boundaries, $R_H(t)$ and $R_N(t)$. The section concludes in Subsection 12.1.2.4 with a summary of the biological insights that these models, and their extensions, can provide and a discussion of their shortcomings.
12.1.2.2 Model Development
In order to describe the radially-symmetric growth of an avascular cluster of tumor cells growing in free suspension, we introduce dependent variables $R(t)$ and $c(r,t)$ to represent respectively the tumor radius at time $t > 0$ and the concentration of a single, growth-rate limiting, diffusible chemical (henceforth oxygen) at time $t$ and distance $r$ from the cluster center ($0 \le r \le R(t)$). We denote by $r = R_H(t)$ and $r = R_N(t)$ the internal boundaries that mark the transitions between regions of cell proliferation, quiescence and necrosis. The principle of mass balance is used to derive equations for $c(r,t)$ and $R(t)$, whereas $R_H(t)$ and $R_N(t)$ are defined implicitly, occurring when the oxygen concentration $c(r,t)$ passes through known threshold values.
The Oxygen Concentration, $c(r,t)$. We assume that the dominant processes regulating the distribution of oxygen within the spheroid are diffusive transport and its consumption by the tumor cells. Combining these processes we deduce that $c(r,t)$ satisfies the following reaction-diffusion equation:
$$\frac{\partial c}{\partial t} = \underbrace{\frac{D}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial c}{\partial r}\right)}_{\text{diffusive transport}} - \underbrace{\Gamma(c, R, R_H, R_N)}_{\text{rate of oxygen consumption}}, \tag{12.1}$$
where $D$ denotes the assumed constant diffusion coefficient of the oxygen and $\Gamma = \Gamma(c)$ its rate of consumption. In practice, $\Gamma(c)$ will be a non-linear function which depends on the tumor cell line under investigation. Here, for simplicity and to demonstrate the qualitative behavior of the model, we suppose that oxygen is consumed at the constant rate $\Gamma$ by both proliferating and quiescent cells and so we fix
$$\Gamma(c) = \Gamma H(c - c_N),$$
where $H(\cdot)$ denotes the Heaviside step function ($H(x) = 1$ if $x > 0$ and $H(x) = 0$ otherwise) and $c_N$ denotes the threshold oxygen concentration at which cells become necrotic (i.e., live cells are restricted to regions where $c > c_N$).
The Outer Tumor Radius, $R(t)$. We assume that the oxygen distribution determines $S(c)$ and $N(c)$, the local rates of cell proliferation and cell death at all points within the tumor, and use the principle of mass balance to derive the following equation for the evolution of the outer tumor radius, $R(t)$:
$$\underbrace{\frac{d}{dt}\left(\frac{4\pi R^3}{3}\right)}_{\text{rate of change of tumor volume}} = \underbrace{\iiint S(c)\, r^2 \sin\theta \, d\theta \, d\varphi \, dr}_{\text{total rate of cell proliferation}} - \underbrace{\iiint N(c)\, r^2 \sin\theta \, d\theta \, d\varphi \, dr}_{\text{total rate of cell death}},$$
where, for radially-symmetric growth, $c = c(r,t)$ and the above equation reduces to
$$R^2 \frac{dR}{dt} = \int_0^{R(t)} \left[S(c) - N(c)\right] r^2 \, dr. \tag{12.2}$$
As simple, representative examples, we assume that cell proliferation is localized in nutrient-rich regions (where $c > c_H$) where it occurs at a rate which is proportional to the local nutrient concentration, $c$. We suppose further that both apoptosis (or natural cell death) and necrosis contribute to cell death, with apoptosis occurring at a constant rate throughout the tumor and necrosis being localized to nutrient-starved regions (where $c \le c_N < c_H$). Thus, we fix
$$S(c) = s\,c\,H(c - c_H) \quad \text{and} \quad N(c) = s\lambda_A + s\lambda_N H(c_N - c),$$
where $s$, $\lambda_A$ and $\lambda_N$ are positive constants.
The Hypoxic and Necrotic Boundaries, $R_H(t)$ and $R_N(t)$. The internal free boundaries $r = R_H(t)$ and $r = R_N(t)$ are defined implicitly in terms of threshold oxygen concentrations $c_H$ and $c_N < c_H$. These constants denote respectively the minimum oxygen concentration at which cell proliferation can occur and the minimum concentration at which quiescent cells can remain alive. Three different cases arise according to whether the minimum oxygen concentration within the tumor falls below none, one or both of these threshold values:
Case 1: uniformly proliferating tumor ($R_N(t) = R_H(t) = 0$). In this case, $c(r,t) > c_H$ for all $r \in (0, R(t))$.
Case 2: central quiescent core surrounded by a proliferating annulus ($R_N(t) = 0 < R_H(t) < R(t)$). In this case, $c(r,t) > c_N$ for all $r \in (0, R)$, so that $R_N(t) = 0$, and there exists $0 < R_H(t) < R(t)$ such that $c(r,t) \le c_H$ for $r \in (0, R_H)$, $c_H < c(r,t)$ for $r \in (R_H, R)$ and $c(R_H, t) = c_H$.
Case 3: fully-developed spheroid ($0 < R_N(t) < R_H(t) < R(t)$). In this case, $c(R_N, t) = c_N$ and $c(R_H, t) = c_H$, with $c(r,t) \le c_N$ for $r \in (0, R_N)$, $c_N < c(r,t) < c_H$ for $r \in (R_N, R_H)$ and $c(r,t) > c_H$ for $r \in (R_H, R)$.
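The three cases above amount to comparing the minimum oxygen concentration in the tumor with the two thresholds. A minimal Python sketch of that comparison follows; the function name `classify_case` and its integer return convention are illustrative assumptions, not part of the original model description.

```python
def classify_case(c_min, c_H, c_N):
    """Return 1, 2 or 3 according to which growth case applies.

    c_min : minimum oxygen concentration inside the tumor
            (attained at r = 0 for a radially-symmetric profile)
    c_H   : threshold below which cells stop proliferating
    c_N   : threshold at or below which cells become necrotic (c_N < c_H)
    """
    if not c_N < c_H:
        raise ValueError("the model assumes c_N < c_H")
    if c_min > c_H:
        return 1  # Case 1: uniformly proliferating, R_H = R_N = 0
    if c_min > c_N:
        return 2  # Case 2: quiescent core, R_N = 0 < R_H < R
    return 3      # Case 3: necrotic core, 0 < R_N < R_H < R
```

For instance, with hypothetical thresholds $c_H = 0.5$ and $c_N = 0.2$, a tumor whose central concentration is $0.3$ falls into Case 2.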
We consider each of these cases in turn in Subsection 12.1.2.3. In Figure 12.1 we provide a schematic diagram to illustrate the structure of a well-developed avascular tumor which possesses a central necrotic core and a hypoxic region. Our model comprises equations (12.1)–(12.2) for $c(r,t)$ and $R(t)$ and supplementary equations for $R_H(t)$ and $R_N(t)$. It remains to specify appropriate boundary and
Figure 12.1. Schematic diagram of a fully-developed avascular tumor. An outer proliferating rim (the red region, where $c > c_H$ and $R_H \le r \le R$) surrounds a hypoxic or quiescent annulus (the blue region, where $c_N < c < c_H$ and $R_N \le r \le R_H$) and a central necrotic core (the yellow region, where $c(r,t) \le c_N$ and $0 \le r \le R_N$).
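initial conditions to close the model. We assume that the oxygen profile is symmetric about $r = 0$ and that on the outer tumor boundary the oxygen concentration is maintained at a constant value, $c = c_\infty$ say. Finally, we prescribe the initial tumor radius ($R(0) = R_0$) and the initial oxygen distribution ($c(r,0) = c_0(r)$). Thus we have
$$\left.\begin{aligned}
&\frac{\partial c}{\partial r} = 0 && \text{at } r = 0, \\
&c = c_\infty && \text{on } r = R(t), \\
&c,\ \frac{\partial c}{\partial r} \ \text{continuous} && \text{across } r = R_H(t) \text{ and } r = R_N(t), \\
&c(r,0) = c_0(r), && R(t = 0) = R_0.
\end{aligned}\right\} \tag{12.3}$$
Nondimensionalization. Before analyzing the model equations, we recast them in dimensionless variables. Denoting typical length and time scales by $X$ and $T$ and a typical oxygen concentration by $C$, we introduce the following dimensionless variables
$$r^* = \frac{r}{X}, \quad t^* = \frac{t}{T}, \quad c^* = \frac{c}{C}, \quad R^* = \frac{R}{X}, \quad R_H^* = \frac{R_H}{X}, \quad R_N^* = \frac{R_N}{X}.$$
When written in terms of dimensionless variables, our model equations become
$$\frac{\partial c^*}{\partial t^*} = \frac{DT}{X^2}\,\frac{1}{r^{*2}}\frac{\partial}{\partial r^*}\left(r^{*2}\frac{\partial c^*}{\partial r^*}\right) - \frac{\Gamma T}{C}\, H(c^* - c_N^*),$$
$$R^{*2}\frac{dR^*}{dt^*} = \int_0^{R^*} \left\{ sTC\, c^* H(c^* - c_H^*) - \lambda_A^* - \lambda_N^* H(c_N^* - c^*) \right\} r^{*2}\, dr^*,$$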
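where $\lambda_A^* = sT\lambda_A$ and $\lambda_N^* = sT\lambda_N$. Guided by our interest in timescales on which the tumor’s size and spatial structure change, we focus on the tumor doubling timescale and choose
$$T = \frac{1}{sC}.$$
Following [47], we exploit the fact that the oxygen diffusion timescale ($\sim$ minutes) is much shorter than a typical tumor doubling time ($\sim$ weeks or months) and make a quasi-steady approximation in the nutrient equation so that
$$0 = \frac{1}{r^{*2}}\frac{\partial}{\partial r^*}\left(r^{*2}\frac{\partial c^*}{\partial r^*}\right) - \Gamma^* H(c^* - c_N^*),$$
where $\Gamma^* = \Gamma X^2/(DC) = O(1)$. (We remark that, as a result, prescription of the initial oxygen profile in (12.3) is redundant.) For completeness, we now state our dimensionless model equations in full, omitting the asterisks for clarity:
$$0 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial c}{\partial r}\right) - \Gamma H(c - c_N), \tag{12.4}$$
$$R^2\frac{dR}{dt} = \int_0^R \left\{ c\,H(c - c_H) - \lambda_A - \lambda_N H(c_N - c) \right\} r^2\, dr, \tag{12.5}$$
$$R_H = 0 \ \text{if } c > c_H \ \forall\, r \ \text{and otherwise } c(R_H, t) = c_H, \tag{12.6}$$
$$R_N = 0 \ \text{if } c > c_N \ \forall\, r \ \text{and otherwise } c(R_N, t) = c_N, \tag{12.7}$$
$$\frac{\partial c}{\partial r} = 0 \ \text{at } r = 0, \tag{12.8}$$
$$c = c_\infty \ \text{on } r = R, \tag{12.9}$$
$$R(0) = R_0, \ \text{prescribed}. \tag{12.10}$$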
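The scaling step that leads from (12.1) to the quasi-steady equation (12.4) can be written out explicitly. The short derivation below is a sketch under the assumptions that $\Gamma$ denotes the constant dimensional consumption rate (units of concentration per unit time) and that asterisks mark dimensionless quantities; it is included only to show where the small parameter comes from.

```latex
% Substitute r = X r^*, t = T t^*, c = C c^* into (12.1) with \Gamma(c) = \Gamma H(c - c_N).
% Since C > 0, H(c - c_N) = H(c^* - c_N^*).
\frac{C}{T}\,\frac{\partial c^*}{\partial t^*}
  = \frac{DC}{X^2}\,\frac{1}{r^{*2}}\frac{\partial}{\partial r^*}
    \left(r^{*2}\frac{\partial c^*}{\partial r^*}\right)
  - \Gamma\, H(c^* - c_N^*).

% Multiply through by X^2/(DC):
\underbrace{\frac{X^2}{DT}}_{\text{diffusion time}/\text{doubling time}\ \ll\ 1}
\frac{\partial c^*}{\partial t^*}
  = \frac{1}{r^{*2}}\frac{\partial}{\partial r^*}
    \left(r^{*2}\frac{\partial c^*}{\partial r^*}\right)
  - \underbrace{\frac{\Gamma X^2}{DC}}_{\Gamma^*}\, H(c^* - c_N^*).

% Neglecting the left-hand side, which is small because oxygen diffuses
% across the tumor in minutes while the tumor doubles over weeks,
% gives the quasi-steady balance (12.4).
```

With $T = 1/(sC)$ the same substitution in (12.2) gives $sTC = 1$ and $\lambda_A^* = \lambda_A/C$, $\lambda_N^* = \lambda_N/C$, which is the form that appears in (12.5).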
We remark that $c_\infty$ could be eliminated from the model equations by fixing $C = c_\infty$. However, since we want to investigate the effect of varying $c_\infty$, we retain it as an explicit model parameter. For similar reasons, we choose not to scale lengths with $R_0$.
12.1.2.3 Model Analysis
In this subsection we derive analytical expressions for $c(r,t)$ and show how the model may be reduced to a nonlinear ordinary differential equation (ODE) for $R(t)$ and algebraic equations for $R_H(t)$ and $R_N(t)$. We show further how the form of these relations changes as $R(t)$ increases, and we progress from Case 1 to Cases 2 and 3.
Case 1: Uniformly Proliferating Tumor. In this case, $R_H = R_N = 0$ and equations (12.4)–(12.10) reduce to give
$$c(r,t) = c_\infty - \frac{\Gamma}{6}\left(R^2 - r^2\right),$$
$$\frac{dR}{dt} = \frac{R}{3}\left(c_\infty - \lambda_A - \frac{\Gamma R^2}{15}\right), \tag{12.11}$$
where, for valid solutions, $c(r,t) > c_H$ for $0 < r < R(t)$. This growth phase persists until either the tumor attains a steady state (at which $dR/dt = 0$) or the model ceases to be valid. Recalling that a spherically-symmetric tumor of radius $R(t)$ has volume $V(t) = 4\pi R^3(t)/3$, we remark that equation (12.11) is equivalent to
$$\frac{1}{V}\frac{dV}{dt} = c_\infty - \lambda_A - \frac{\Gamma}{15}\left(\frac{3V}{4\pi}\right)^{2/3}.$$
Thus, our spatially-structured model for the growth of a uniformly proliferating avascular tumor exhibits the same dynamics as a time-dependent ODE. However, without performing the above analysis, it would not be obvious what powers of $V$ to include or how to relate the model parameters to physically defined quantities. From (12.11) we deduce that $dR/dt = 0$ when $R = 0$ (the trivial solution) or, assuming $c_\infty > \lambda_A$, when $R = [15(c_\infty - \lambda_A)/\Gamma]^{1/2}$. Linear stability analysis reveals that the trivial solution is unstable if $c_\infty > \lambda_A$ (i.e., where a nontrivial steady state exists) and stable otherwise. As stated above, the nontrivial steady state is physically realistic if $c(r,t) > c_H$ for all $r \in (0, R)$. Since $c$ attains its minimum, $c_\infty - \Gamma R^2/6 = (5\lambda_A - 3c_\infty)/2$, at $r = 0$, we deduce that $R = [15(c_\infty - \lambda_A)/\Gamma]^{1/2}$ is a physically realistic steady state if
$$3c_\infty < 5\lambda_A - 2c_H.$$
In practice, $\lambda_A$ and $c_H$ will be fixed for a given tumor cell line. If $\lambda_A < 2c_H/5$ then the above inequality never holds, irrespective of $c_\infty$, and no non-trivial steady state solution of this type exists. By contrast, if $\lambda_A > 2c_H/5$ then we predict that for $\lambda_A < c_\infty < (5\lambda_A - 2c_H)/3$ (a range that is non-empty only when $\lambda_A > c_H$), there is a nontrivial steady state with $0 = R_H = R_N < R$ and which linear stability analysis shows to be stable. For $c_\infty > (5\lambda_A - 2c_H)/3$, the model breaks down before the steady state is attained, a region of quiescence forming when $R = [6(c_\infty - c_H)/\Gamma]^{1/2}$.
Case 2: Intermediate-sized Tumor. 
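In this case $R_N = 0$ and equations (12.4)–(12.10) supply
$$c(r,t) = c_\infty - \frac{\Gamma}{6}\left(R^2 - r^2\right),$$
$$\frac{dR}{dt} = \frac{R}{3}\left\{\left(c_\infty - \frac{\Gamma R^2}{6}\right)\left(1 - \frac{R_H^3}{R^3}\right) + \frac{\Gamma R^2}{10}\left(1 - \frac{R_H^5}{R^5}\right) - \lambda_A\right\}, \tag{12.12}$$
where
$$R_H^2 = R^2 - \frac{6}{\Gamma}\left(c_\infty - c_H\right), \tag{12.13}$$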
and for valid solutions $c(r,t) > c_N$ for $0 < r < R(t)$. By differentiating (12.13) with respect to time, we can recast our model as a pair of ODEs for $R$ and $R_H$, with
$$\frac{dR_H}{dt} = \frac{R}{R_H}\frac{dR}{dt}.$$
Since $0 < R_H < R$, we deduce that $|dR_H/dt| > |dR/dt|$. Thus, if the tumor contains a quiescent region, then $R_H(t)$ changes more rapidly than $R(t)$. In particular, if the tumor is growing (i.e., $dR/dt > 0$) then the quiescent region is growing more rapidly. We remark that, instead of differentiating (12.13) with respect to time, we could use it to eliminate $R_H$ from (12.12) and, in so doing, reduce the model to a single, nonlinear ODE for $R(t)$. In contrast to equation (12.11), the resulting ODE does not lend itself to physical interpretation. In this case, the absence of a clear link between the spatially-structured model and the corresponding ODE model underlines the difficulty associated with relating the parameters that appear in spatially-homogeneous models to physically relevant quantities. As for Case 1, by setting $d/dt = 0$ in equation (12.12), it is possible to identify conditions under which the system evolves to a nontrivial steady state. This equilibrium solution is physically realistic provided that $c_N < c_{\min} = c_\infty - \Gamma R^2/6 < c_H$ or, equivalently, $6(c_\infty - c_H)/\Gamma < R^2 < 6(c_\infty - c_N)/\Gamma$. Otherwise, assuming $dR/dt > 0$, so that the tumor is increasing in size, a central necrotic core will form and we must consider Case 3.
Case 3: Fully-developed Tumor. In this case $0 < R_N(t) < R_H(t) < R(t)$ and the model equations reduce to give
$$c(r,t) = \begin{cases} c_N, & 0 < r < R_N, \\ c_N + \dfrac{\Gamma (r - R_N)^2 (r + 2R_N)}{6r}, & R_N < r < R, \end{cases} \tag{12.14}$$
$$\frac{dR}{dt} = \frac{R}{3}\left\{c_N\left(1 - \frac{R_H^3}{R^3}\right) - \lambda_A - \lambda_N\frac{R_N^3}{R^3}\right\} + \frac{\Gamma R^3}{6}\left\{\frac{1}{5}\left(1 - \frac{R_H^5}{R^5}\right) - \frac{R_N^2}{R^2}\left(1 - \frac{R_H^3}{R^3}\right) + \frac{R_N^3}{R^3}\left(1 - \frac{R_H^2}{R^2}\right)\right\}, \tag{12.15}$$
with
$$\left(1 - \frac{R_N}{R}\right)^2\left(1 + \frac{2R_N}{R}\right) = \frac{6}{\Gamma R^2}\left(c_\infty - c_N\right), \tag{12.16}$$
and
$$\left(1 - \frac{R_N}{R_H}\right)^2\left(1 + \frac{2R_N}{R_H}\right) = \frac{6}{\Gamma R_H^2}\left(c_H - c_N\right). \tag{12.17}$$
Once again, the model reduces to a nonlinear ODE for $R(t)$. However, in this case the ODE is coupled to two algebraic equations for $R_H$ and $R_N$. While it is, in principle, possible to use equations (12.16) and (12.17) to eliminate both $R_H$ and $R_N$ from (12.15), the complexity of the resulting ODE means that it is more informative to consider equations (12.15)–(12.17) together.
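As a numerical illustration of the simplest of the reduced models, the Case 1 equation (12.11) can be integrated directly and compared with its nontrivial steady state $R = [15(c_\infty - \lambda_A)/\Gamma]^{1/2}$. The sketch below uses a classical fourth-order Runge-Kutta step; the function names, step size and parameter values are assumptions chosen so that the tumor stays uniformly proliferating (the central concentration remains above a hypothetical $c_H = 0.5$).

```python
import math


def growth_rate(R, c_inf, lam_A, Gamma):
    """Right-hand side of (12.11) for a uniformly proliferating tumor."""
    return (R / 3.0) * (c_inf - lam_A - Gamma * R * R / 15.0)


def integrate_radius(R0, t_end, dt, c_inf, lam_A, Gamma):
    """Advance R(t) from R0 to t_end with fixed-step RK4."""
    R = R0
    for _ in range(int(round(t_end / dt))):
        k1 = growth_rate(R, c_inf, lam_A, Gamma)
        k2 = growth_rate(R + 0.5 * dt * k1, c_inf, lam_A, Gamma)
        k3 = growth_rate(R + 0.5 * dt * k2, c_inf, lam_A, Gamma)
        k4 = growth_rate(R + dt * k3, c_inf, lam_A, Gamma)
        R += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return R


def steady_radius(c_inf, lam_A, Gamma):
    """Nontrivial steady state of (12.11); requires c_inf > lam_A."""
    return math.sqrt(15.0 * (c_inf - lam_A) / Gamma)


if __name__ == "__main__":
    c_inf, lam_A, Gamma = 1.0, 0.9, 1.0  # illustrative values only
    R_end = integrate_radius(0.1, 600.0, 0.1, c_inf, lam_A, Gamma)
    print(R_end, steady_radius(c_inf, lam_A, Gamma))
```

For these values the central concentration at equilibrium is $c_\infty - \Gamma R^2/6 = 0.75$, so the Case 1 assumption is respected throughout. For parameter sets that violate it, the integration would have to switch to (12.12) once $R^2$ exceeds $6(c_\infty - c_H)/\Gamma$.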
12.1.2.4 Discussion
In this section we have studied simple, radially-symmetric models of avascular tumor growth and shown how the governing equations can be used to determine how the size and structure of the tumor change over time and how the long-time or equilibrium composition of the tumor depends on physical parameters such as $c_\infty$, the concentration of a diffusible nutrient such as oxygen which is supplied externally, and $\lambda_A$, the basal level of cell death due to apoptosis. We remark that the kinetic terms used in our analyses are highly idealized and should be replaced by functions which have been fitted to experimental data [85]. In general, these functions will be non-linear and the resulting models must be solved numerically. Whilst numerical simulations are of great value, they may obscure the manner in which the various mechanisms interact. In such cases, complementary insight into the system’s dynamics can be gained by using asymptotic techniques to study cases for which the model equations simplify greatly. For example, as in [19], when the tumor is small (so that $0 < R \ll 1$ and $R_H = R_N = 0$), equation (12.11) predicts exponential growth of $R(t)$. Alternatively, if $c_\infty - c_N \ll 1$ and $\lambda_A + \lambda_N \ll 1$ then, since $c_N < c_H < c_\infty$, we deduce that $c_\infty - c_H \ll 1$ while equations (12.15)–(12.17) imply that the tumor will evolve to a non-trivial equilibrium, with a thin proliferating rim ($0 < R - R_H \ll 1$) and a thin quiescent rim ($0 < R - R_N \ll 1$). This resembles the structure of multicellular spheroids cultured in vitro where, typically, the outer viable rim is only three to seven cell layers thick [36]. There are many ways in which the model presented above has been extended. In addition to using experimentally-determined functions for the rates of cell proliferation, apoptosis and necrosis, additional reaction-diffusion equations can be incorporated to describe the action of other diffusible growth factors which may be present in the tumor environment. 
These chemicals may be supplied externally or expressed by the tumor cells themselves and may promote or inhibit the tumor’s growth. For example, by-products of the degradation process that accompanies cellular necrosis are believed to inhibit cell proliferation (models of this type are studied in [18, 47]). If we view the nutrient as a (growth-) activator and the by-product of necrosis as an inhibitor, it may be possible to apply classical Turing theory to this reaction-diffusion system in order to determine whether the system will generate spatial patterns, with hot-spots of high cell density [24]. Similarly, it is possible to include a more realistic description of cell metabolism by introducing separate variables to describe oxygen, glucose, pH and lactate. In models that distinguish between normal and tumor cells, this enables
us to investigate the advantage over normal cells that tumor cells enjoy under acidosis [45, 89, 106]. Alternatively, the models can be used to investigate the response of tumors to treatment with radiotherapy and/or chemotherapy [35, 56, 58]. A further model extension which provides a simple description of vascular tumor growth involves including a distributed source of nutrient (or a distributed sink of waste products) in the outer portion of the tumor so that, in place of equation (12.4), we write
$$0 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial c}{\partial r}\right) + h\left(c_{\mathrm{vess}} - c\right)H(c - c_H) - \Gamma H(c - c_N),$$
where $h$ and $c_{\mathrm{vess}}$ denote respectively the rate at which oxygen exchanges with the vasculature and the oxygen concentration within the vessels. As a result, nutrients may be supplied to the tumor either by exchange with its vasculature or by diffusion across the tumor’s outer boundary (for details, see [16, 17]). The models presented in this section can be termed moving boundary problems since the domain on which they are formulated (i.e., the tumor size) must be determined as part of the solution procedure. As such, they have excited the interest of Friedman and coworkers who have constructed proofs of existence and uniqueness of the model solutions [27, 30, 43], which place the earlier analysis on a more rigorous footing. Given that avascular tumors are usually radially-symmetric while vascular tumors possess highly irregular boundaries, it is natural to ask whether this change in morphology is due to the non-uniform distribution of blood vessels or whether the radially-symmetric avascular tumors are intrinsically unstable to asymmetric perturbations. Although the spatially-structured models presented in this section are not amenable to such analysis, in the next section we show how they have been extended to address this important question.
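To see how the vascular exchange term changes the oxygen profile, consider the simplest situation in which the whole tumor is proliferating, so that both Heaviside factors in the modified nutrient equation equal one. Under that assumption the equation is linear and has the bounded solution $c(r) = c^\dagger + (c_\infty - c^\dagger)\,\dfrac{R\,\sinh(\sqrt{h}\,r)}{r\,\sinh(\sqrt{h}\,R)}$ with $c^\dagger = c_{\mathrm{vess}} - \Gamma/h$. This closed form is our own illustrative derivation rather than a result quoted from [16, 17], and the Python sketch below (function names and parameter values are assumptions) checks it against the differential equation by finite differences.

```python
import math


def vascular_profile(r, R, Gamma, c_inf, h, c_vess):
    """Oxygen profile with vascular exchange, assuming c > c_H everywhere.

    Solves 0 = (1/r^2) d/dr(r^2 dc/dr) + h*(c_vess - c) - Gamma
    with dc/dr = 0 at r = 0 and c = c_inf at r = R.
    """
    k = math.sqrt(h)
    c_dagger = c_vess - Gamma / h  # uniform solution far from the boundary

    def sinh_over_x(x):
        # sinh(k*x)/x, using its limiting value k at x = 0
        return k if x == 0.0 else math.sinh(k * x) / x

    return c_dagger + (c_inf - c_dagger) * sinh_over_x(r) / sinh_over_x(R)


def ode_residual(r, R, Gamma, c_inf, h, c_vess, d=1e-3):
    """Central-difference residual of the modified nutrient equation at radius r > 0."""
    def c(x):
        return vascular_profile(x, R, Gamma, c_inf, h, c_vess)

    c_rr = (c(r + d) - 2.0 * c(r) + c(r - d)) / d ** 2
    c_r = (c(r + d) - c(r - d)) / (2.0 * d)
    return c_rr + 2.0 * c_r / r + h * (c_vess - c(r)) - Gamma
```

As $h \to 0$ the profile reduces to the Case 1 expression $c_\infty - \Gamma(R^2 - r^2)/6$, so the avascular model is recovered when the exchange with the vasculature is switched off.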
12.1.3 Tumor Invasion
12.1.3.1 Introduction
In this section we show how the one-dimensional models of Section 12.1.2 have been extended to study avascular tumor growth in two and three dimensions, following an approach originally proposed by Greenspan [48] in which new dependent variables for the local cell velocity and pressure within the tumor were introduced. The physical principles that underpin models of this type can be summarized as follows: if the tumor is incompressible and contains no voids or holes, then cell proliferation and death generate spatial variations in the pressure within the tumor which drive cell motion, with cells moving down pressure gradients, away from regions of net cell proliferation and toward regions of net cell death. Surface tension is also incorporated into the model as a mechanism for maintaining the tumor’s compactness and counteracting the expansive force caused by cell proliferation.
The remainder of this section is organized as follows. In Subsections 12.1.3.2 and 12.1.3.3 we develop the model equations and show how the models of Section 12.1.2 are recovered when growth is one-dimensional. The stability of steady, radially-symmetric solutions to symmetry-breaking perturbations is investigated in Subsection 12.1.3.4 before the section concludes, in Subsection 12.1.3.5, with a discussion of the models and suggestions for further research.
12.1.3.2 The Model Equations
The model that we study is presented below in dimensionless form (for details, see [18, 48]). For simplicity, we assume that the tumor is small enough that it is nutrient-rich, with all cells proliferating and, hence, that there is no quiescence or necrosis. The key variables are the nutrient concentration $c(\mathbf{r},t)$, the cell velocity $\mathbf{v}(\mathbf{r},t)$ and the pressure $p(\mathbf{r},t)$. The variables $c$, $\mathbf{v}$ and $p$ satisfy the following system of dimensionless partial differential equations
$$0 = \nabla^2 c - \Gamma, \tag{12.18}$$
$$\nabla \cdot \mathbf{v} = S(c) - N(c) = c - \lambda_A, \tag{12.19}$$
$$\mathbf{v} = -\mu \nabla p. \tag{12.20}$$
Equation (12.18) is the natural extension to higher spatial dimensions of the quasi-steady reaction-diffusion equation that was used in Section 12.1.2. Equation (12.19) expresses mass conservation within the (assumed incompressible) tumor. We highlight similarities with the models from Section 12.1.2 by employing the same proliferation and death rates here. Following [48], in equation (12.20) Darcy’s law relates the cell velocity to the pressure, with cells moving down pressure gradients and the constant of proportionality $\mu$ denoting the sensitivity of the tumor cells to the pressure gradients. Equations (12.19) and (12.20) can be combined to eliminate $\mathbf{v}$ from the model equations, giving
$$0 = \mu\nabla^2 p + (c - \lambda_A). \tag{12.21}$$
Equations (12.18) and (12.21) are closed by imposing the following boundary conditions:
$$\frac{\partial c}{\partial r} = \frac{\partial p}{\partial r} = 0 \quad \text{at } r = 0, \tag{12.22}$$
$$c = c_\infty, \quad p = \gamma\kappa \quad \text{on } \Gamma(\mathbf{r},t) = 0. \tag{12.23}$$
Equations (12.22) guarantee that the nutrient and pressure profiles are bounded at the origin. In (12.23), $c_\infty$ and $p_\infty = 0$ are the assumed constant nutrient concentration and pressure outside the tumor, $\gamma \ge 0$ is the surface tension coefficient and $\kappa$ the mean curvature of the tumor boundary, on which $\Gamma(\mathbf{r},t) = 0$. Thus, equations (12.23) state that the nutrient concentration is continuous across the tumor boundary and that there is a jump discontinuity in the pressure, this jump being proportional to the curvature
of the boundary (and playing the role of a surface tension force which maintains the tumor’s compactness). In order to complete our model, it remains to determine how the tumor boundary $\Gamma(\mathbf{r},t) = 0$ evolves. Following [48], we write $\mathbf{r} = (r,\theta)$ and parameterize the tumor boundary as follows:
$$\Gamma(\mathbf{r},t) = 0 = r - R(\theta,t).$$
We assume further that cells located on the boundary move with the local velocity there so that
$$\frac{\partial R}{\partial t} = \mathbf{v}\cdot\mathbf{n}\,\big|_{r=R(\theta,t)} = -\mu\nabla p\cdot\mathbf{n}\,\big|_{r=R(\theta,t)}, \tag{12.24}$$
with $R(\theta,t=0) = R_0(\theta)$. In equation (12.24), $\mathbf{n}$ is the unit outward normal to the tumor boundary and $r = R_0(\theta)$ denotes the position of the tumor boundary at $t = 0$. In summary, our model of solid tumor growth comprises equations (12.18) and (12.21)–(12.24).

12.1.3.3 Radially-Symmetric Model Solutions

Under the assumption of radial symmetry, $c = c(r,t)$, $p = p(r,t)$, $r = R(t)$ on the tumor boundary and the model equations reduce to give
$$0 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial c}{\partial r}\right) - \Gamma, \tag{12.25}$$
$$0 = \frac{\mu}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial p}{\partial r}\right) + (c - \lambda_A), \tag{12.26}$$
$$\frac{dR}{dt} = -\mu\frac{\partial p}{\partial r}\bigg|_{r=R(t)}. \tag{12.27}$$
Integrating equation (12.26) once with respect to $r$ and imposing (12.22), we deduce that
$$\mu\frac{\partial p}{\partial r} = -\frac{1}{r^2}\int_0^r (c - \lambda_A)\,\xi^2\,d\xi \quad\Longrightarrow\quad R^2\frac{dR}{dt} = \int_0^R (c - \lambda_A)\,r^2\,dr, \tag{12.28}$$
which shows how the current model reduces to the simpler models presented in Section 12.1.2 under radial symmetry. By integrating equations (12.25)–(12.26) subject to the boundary conditions, we obtain the following expressions for $c$ and $p$:
$$c(r,t) = c_\infty - \frac{\Gamma}{6}\,(R^2 - r^2),$$
$$p(r,t) = \frac{\gamma}{R} - \frac{\Gamma}{120\mu}\,(R^2 - r^2)^2 + \frac{1}{6\mu}\left(c_\infty - \lambda_A - \frac{\Gamma R^2}{15}\right)(R^2 - r^2).$$
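As a numerical sanity check on the radially-symmetric reduction, the sketch below evaluates the integral on the right-hand side of (12.28) using the quadratic nutrient profile $c = c_\infty - \Gamma(R^2 - r^2)/6$ and compares it with the closed form $\tfrac{1}{3}R^3\,(c_\infty - \lambda_A - \Gamma R^2/15)$ obtained by integrating by hand. The function names and the parameter values are illustrative assumptions and are not taken from the chapter.

```python
import math


def nutrient(r, R, c_inf, Gamma):
    """Quasi-steady nutrient profile c(r) = c_inf - Gamma/6 * (R^2 - r^2)."""
    return c_inf - Gamma / 6.0 * (R**2 - r**2)


def growth_integral(R, c_inf, lam_A, Gamma, m=2000):
    """Composite Simpson estimate of int_0^R (c - lam_A) r^2 dr; m must be even."""
    if m % 2:
        raise ValueError("m must be even for Simpson's rule")
    h = R / m
    total = 0.0
    for i in range(m + 1):
        r = i * h
        weight = 1 if i in (0, m) else (4 if i % 2 else 2)
        total += weight * (nutrient(r, R, c_inf, Gamma) - lam_A) * r**2
    return total * h / 3.0


def closed_form(R, c_inf, lam_A, Gamma):
    """Hand-integrated value of R^2 dR/dt: R^3/3 * (c_inf - lam_A - Gamma R^2/15)."""
    return R**3 / 3.0 * (c_inf - lam_A - Gamma * R**2 / 15.0)


def steady_radius(c_inf, lam_A, Gamma):
    """Nontrivial zero of the closed form: R0^2 = 15 (c_inf - lam_A) / Gamma."""
    return math.sqrt(15.0 * (c_inf - lam_A) / Gamma)


if __name__ == "__main__":
    # Hypothetical dimensionless parameter values, chosen only for illustration.
    c_inf, lam_A, Gamma = 1.0, 0.4, 0.5
    for R in (1.0, 2.0, 4.0):
        print(R, growth_integral(R, c_inf, lam_A, Gamma),
              closed_form(R, c_inf, lam_A, Gamma))
    R0 = steady_radius(c_inf, lam_A, Gamma)
    print("steady radius:", R0, "net growth there:",
          closed_form(R0, c_inf, lam_A, Gamma))
```

The quadrature and the closed form should agree to within the Simpson truncation error, and the net growth should vanish at the steady radius, where proliferation integrated over the tumor balances cell death.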
Substitution with $c(r,t)$ in (12.28) then yields
$$\frac{dR}{dt} = \frac{R}{3}\left(c_\infty - \lambda_A - \frac{\Gamma R^2}{15}\right),$$
which is identical to equation (12.11), showing how the current model reduces to Case 1 from Section 12.1.2 under the assumption of radial symmetry.

12.1.3.4 Linear Stability Analysis

In this subsection, we investigate what happens when steady radially-symmetric solutions are subjected to small, symmetry-breaking perturbations. Our aim is to identify conditions under which the perturbations grow over time and conditions under which they decay: in the former case the radially-symmetric solution is said to be unstable (to symmetry-breaking perturbations) and in the latter it is stable. For simplicity, we consider the stability of steady radially-symmetric solutions by seeking solutions to the governing equations of the form
$$\left.\begin{aligned} c &\sim c_0(r) + \epsilon\,c_1(r,\theta,t) \\ p &\sim p_0(r) + \epsilon\,p_1(r,\theta,t) \\ R &\sim R_0 + \epsilon\,R_1(\theta,t) \end{aligned}\right\} \quad \text{where } 0 < \epsilon \ll 1, \tag{12.29}$$
and, from Section 12.1.3.3,
$$c_0(r) = c_\infty - \frac{\Gamma}{6}\,(R_0^2 - r^2), \quad p_0(r) = \frac{\gamma}{R_0} - \frac{\Gamma}{120\mu}\,(R_0^2 - r^2)^2, \quad \text{and} \quad R_0^2 = \frac{15}{\Gamma}\,(c_\infty - \lambda_A). \tag{12.30}$$
Substituting with (12.29) in equations (12.18), (12.21), (12.24), (12.23) and (12.22) and equating to zero coefficients of $O(\epsilon)$ we deduce that $(c_1, p_1, R_1)$ solve
$$0 = \nabla^2 c_1 = \mu\nabla^2 p_1 + c_1, \tag{12.31}$$
$$\frac{\partial R_1}{\partial t} = -\mu\left[\frac{\partial p_1}{\partial r} + R_1\frac{d^2 p_0}{dr^2}\right]_{r=R_0}, \tag{12.32}$$
with
$$\frac{\partial c_1}{\partial r} = \frac{\partial p_1}{\partial r} = 0 \quad \text{at } r = 0, \tag{12.33}$$
$$c_1\big|_{r=R_0} = -R_1\frac{dc_0}{dr}\bigg|_{r=R_0}, \tag{12.34}$$
$$p_1\big|_{r=R_0} = -R_1\frac{dp_0}{dr}\bigg|_{r=R_0} - \frac{\gamma}{2R_0^2}\,\bigl(2R_1 + L(R_1)\bigr), \tag{12.35}$$
$$R_1(\theta,0) = R_{10}(\theta), \quad \text{prescribed}. \tag{12.36}$$
In (12.35),
$$L(f) = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right)$$
so that
$$\nabla^2 f(r,\theta) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{L(f)}{r^2}.$$
We remark that the derivation of equation (12.35) involves determining the $O(\epsilon)$ contributions to the curvature and the normal derivative of the pressure on the tumor boundary (details of these calculations are contained in references [18, 47]). Equations (12.31)–(12.36) are linear and, hence, admit separable solutions of the form
$$\left.\begin{aligned} c_1(r,\theta,t) &= \chi_k(t)\,r^k P_k(\cos\theta) \\ p_1(r,\theta,t) &= \left(\pi_k(t) - \frac{\chi_k(t)\,r^2}{2\mu(2k+3)}\right) r^k P_k(\cos\theta) \\ R_1(\theta,t) &= \rho_k(t)\,P_k(\cos\theta) \end{aligned}\right\} \tag{12.37}$$
where the Legendre polynomials $P_k(\cos\theta)$ satisfy $L(P_k) = -k(k+1)P_k$ so that $\nabla^2(r^k P_k) = 0$. It is straightforward to show that equations (12.31) and (12.33) are automatically satisfied by the above choice of $c_1$ and $p_1$. The coefficients $\chi_k$ and $\pi_k$ are determined by imposing conditions (12.34) and (12.35) and exploiting the orthogonality of the Legendre polynomials. This gives
$$\chi_k R_0^k = -\frac{\Gamma R_0}{3}\,\rho_k \quad \text{and} \quad \pi_k R_0^k = \frac{\gamma\rho_k}{2R_0^2}\,(k-1)(k+2) + \frac{\chi_k R_0^{k+2}}{2\mu(2k+3)}.$$
Using these results in (12.32) we deduce that the amplitude $\rho_k$ of a perturbation to the tumor radius, $R(\theta,t)$, involving $P_k(\cos\theta)$ satisfies
$$\frac{1}{\rho_k}\frac{d\rho_k}{dt} = (k-1)\left[\frac{2\Gamma R_0^2}{15(2k+3)} - \frac{\mu\gamma\,k(k+2)}{2R_0^3}\right]. \tag{12.38}$$
From (12.38) we note that all modes evolve independently and that the system is insensitive to perturbations involving the first Legendre polynomial (the latter result is unsurprising since $P_{k=1}(\cos\theta)$ corresponds simply to a translation of the co-ordinate axes). We note also that if surface tension effects are neglected ($\gamma = 0$) then the radially-symmetric steady state is unstable to all modes for which $k \ge 2$. More generally, if $\gamma > 0$ then the steady state is unstable to the finite number of modes for which
$$k(k+2)(2k+3) < \frac{4\Gamma R_0^5}{15\mu\gamma}.$$
Since the steady state radius $R_0$ is defined in terms of the system parameters (see equation (12.30)), this result shows how the choice of parameter values influences
the modes to which the steady state is unstable. We note also that as $\gamma$ increases, the number of unstable modes falls. In particular, if $4\Gamma R_0^5/(15\mu\gamma) < 15$ then, since $k(k+2)(2k+3) \ge 15$ for every integer $k \ge 1$, there are no integers which satisfy the above inequality and we deduce that in this case the radially-symmetric steady state is stable to perturbations involving Legendre polynomials of arbitrary order. Appealing to (12.30), we deduce that this will be the case if the external nutrient concentration $c_\infty$ satisfies
$$c_\infty < \lambda_A + \frac{\Gamma}{15}\left(\frac{225\,\mu\gamma}{4\Gamma}\right)^{2/5}.$$
By differentiating (12.38) with respect to $k$ we may determine the fastest growing mode for a given set of parameter values (and, hence, for a given value of $R_0$). After some manipulation, we deduce that the fastest growing mode satisfies
$$(2k+3)^2\,(3k^2 + 2k - 2) = \frac{4\Gamma R_0^5}{3\mu\gamma}.$$
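The dispersion relation (12.38) is simple enough to explore numerically. The sketch below is a minimal illustration, assuming the relative growth rate $(k-1)\bigl[2\Gamma R_0^2/(15(2k+3)) - \mu\gamma k(k+2)/(2R_0^3)\bigr]$ for mode $k$. It lists the unstable integer modes and picks the fastest-growing one by direct comparison rather than by solving the quartic condition above. The function names, the truncation `k_max` and the parameter values are hypothetical choices made for illustration.

```python
import math


def steady_radius(c_inf, lam_A, Gamma):
    """R0 from (12.30): R0^2 = 15 (c_inf - lam_A) / Gamma."""
    return math.sqrt(15.0 * (c_inf - lam_A) / Gamma)


def growth_rate(k, R0, Gamma, mu, gamma):
    """Relative growth rate (1/rho_k) d(rho_k)/dt of mode k, following (12.38)."""
    driving = 2.0 * Gamma * R0**2 / (15.0 * (2 * k + 3))
    surface_tension = mu * gamma * k * (k + 2) / (2.0 * R0**3)
    return (k - 1) * (driving - surface_tension)


def unstable_modes(R0, Gamma, mu, gamma, k_max=200):
    """Integer modes 2 <= k <= k_max whose growth rate is positive."""
    return [k for k in range(2, k_max + 1)
            if growth_rate(k, R0, Gamma, mu, gamma) > 0.0]


def fastest_mode(R0, Gamma, mu, gamma, k_max=200):
    """Unstable integer mode with the largest growth rate, or None if all decay."""
    modes = unstable_modes(R0, Gamma, mu, gamma, k_max)
    if not modes:
        return None
    return max(modes, key=lambda k: growth_rate(k, R0, Gamma, mu, gamma))


if __name__ == "__main__":
    # Hypothetical dimensionless parameters; gamma is the surface tension coefficient.
    c_inf, lam_A, Gamma, mu = 1.0, 0.4, 0.5, 1.0
    R0 = steady_radius(c_inf, lam_A, Gamma)
    for gamma in (1.0, 10.0):
        print("gamma =", gamma,
              "unstable modes:", unstable_modes(R0, Gamma, mu, gamma),
              "fastest:", fastest_mode(R0, Gamma, mu, gamma))
```

With these illustrative values the weaker surface tension leaves a small band of unstable modes, while the stronger one stabilizes every mode. This is the qualitative trend described in the text. For $\gamma = 0$ every mode up to `k_max` would be reported as unstable, so the truncation matters only in that limit.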
12.1.3.5 Discussion

The analysis presented above provides a mechanism which may explain how the irregular morphology that characterizes invasive tumors is initiated. To understand this, consider a uniform cluster of tumor cells for which the surface tension coefficient $\gamma$ is sufficiently large that the underlying radially-symmetric solution is (linearly) stable to symmetry-breaking perturbations involving Legendre polynomials. Our analysis predicts that such a cluster will remain radially symmetric throughout its development. Suppose, now, that the cells undergo a transformation which weakens the surface tension forces holding the tumor cells together. If the reduction in $\gamma$ is large enough, then the tumor will become unstable to a finite range of asymmetric perturbations and develop an irregular morphology. We note that similar qualitative behavior is obtained if, instead of invoking surface tension (and the associated jump in the pressure across the tumor boundary), we assume that the nutrient concentration is discontinuous across the tumor boundary, with a jump that is proportional to the local curvature [12]. The physical motivation for this boundary condition is that nutrient (or its energy-equivalent) is utilized by cells on the tumor boundary to maintain its compactness. Many modifications to the basic model of asymmetric tumor growth presented in this section have been considered. These include an investigation of more general perturbations involving spherical harmonics $Y_{km}(\theta,\varphi)$, Cartesian rather than spherical geometries and a study of the stability of avascular tumors that contain quiescent and necrotic regions [13, 18]. We note also that linear stability analysis predicts only local behavior. Where instability is predicted it is natural to ask how the asymmetry develops at longer times. Two complementary approaches have been used to investigate this issue. The first involves using weakly nonlinear analysis to extend the linear theory [13, 22].
Alternatively, numerical methods can be used to solve the nonlinear governing equations [29].
As mentioned in Section 12.1.2, another natural extension involves incorporating additional diffusible species into the model and studying their combined effect on the tumor’s development [24]. These growth factors may promote cell proliferation (e.g., oxygen and glucose) or inhibit it (e.g., tumor necrosis factor, anti-cancer drugs); they may be supplied externally (e.g., oxygen, drugs) or be produced as a by-product of the cells’ normal functions (e.g., tumor necrosis factor is a by-product of cell degradation). Further, the introduction of cell pressure into the model makes it possible to investigate the potential influence of pressure on cell proliferation, a question motivated by the increasing amount of experimental data suggesting that mechanical effects have a significant influence on cell proliferation [54]. In [20], analysis of a modified version of Greenspan’s basic model of asymmetric growth [48] reveals that contact-inhibition of cell proliferation (where cell proliferation halts when the cell pressure exceeds a threshold value) produces tumors whose growth dynamics are identical to those undergoing nutrient-limited growth. Two aspects of the tumor growth model studied in this section that warrant further consideration are the assumption that the tumor cell population is spatially uniform and the use of Darcy’s law to describe cell motion. In the next section we explain how a multiphase modeling approach can be used to relax these assumptions.
12.1.4 Multiphase Models of Avascular Tumor Growth 12.1.4.1 Introduction The spatially-structured models studied thus far treat the tumor as a homogeneous mass of cells, whose proliferation and death rates are regulated by a single, diffusible species. Additionally, where it is considered, cell movement is assumed to be governed by Darcy’s law, an empirical constitutive law more usually associated with fluid flow through a porous medium! Several authors have developed spatially-structured models of avascular tumor growth that account for cellular heterogeneity. For example, in [103–105] Ward and King distinguish between live and dead cells but assume that all species move with a common velocity and show how the two cell types separate to reproduce the layered structure that characterizes multicellular spheroids cultured in vitro. An alternative approach is employed by Thompson and Byrne [98], and Pettet and coworkers [79]: they allow for differential movement of distinct cell species but prescribe the velocities in a phenomenological manner. In this section we explain how a multiphase framework can be used to develop new models that account for cellular heterogeneity and permit the incorporation of alternative descriptions of cell movement. The models also allow for more detailed investigation of the impact that mechanical stimuli may have on cell proliferation and cell movement. The earliest multiphase models of solid tumor growth were developed by Please and coworkers [64,80] and based on an analogy between the layered structure of multicellular spheroids (MCS) and a compacting porous medium such as soil, the idea being that cells in the periphery are in direct physical contact (i.e., compacted) and therefore transmit any external mechanical forces that are acting whereas those in the
necrotic core are no longer in contact (they are fluidized) and so any load is distributed between the cells and the fluid. In this section we follow an alternative approach developed by Byrne and coworkers [10, 21, 23] which is based on continuum models originally developed to study the response of cartilage to mechanical loading [70]. A key difference between the multiphase models presented here and those used to study cartilage deformation is that in the latter case cell proliferation is neglected whereas in our tumor models cell proliferation (and death) must be included, leading to mass conversion between the constituent phases. The remainder of the section is organized as follows. In Section 12.1.4.2 we develop a two-phase model of avascular tumor growth in vitro, the two phases representing the tumor cells and the extracellular fluid in which they are bathed. In Section 12.1.4.3 we explain how the model equations can be reduced to a simpler system of PDEs and demonstrate how the constitutive assumptions that are used to close the model influence the structure of the reduced model. For example, if the cell and fluid phases are assumed to be incompressible, with fixed volume fractions, then we recover moving boundary problems of the type presented in Section 12.1.2. Alternatively, if the two phases are treated as isotropic fluids whose pressures are related by a prescribed function of the cell volume fraction, then we recover a reaction-diffusion model of tumor invasion similar to those proposed in [44,87]. Section 12.1.4.3 also contains numerical and analytical results. We conclude in Section 12.1.4.4 by explaining how the multiphase modeling framework has been adapted to describe other aspects of solid tumor growth and outlining ideas for future work. 12.1.4.2 Model Development Following [10, 21, 23], we view the tumor as a two-phase mixture of cells and extracellular fluid or water. 
For simplicity, the model is formulated in one-dimensional Cartesian geometry. We denote by $n(x,t)$ and $w(x,t)$ the volume fractions occupied by the tumor cells and extracellular fluid and by $(v_n, \sigma_n, p_n)$ and $(v_w, \sigma_w, p_w)$ their respective velocities, stress tensors and pressures. We derive the governing equations by applying the principles of mass and momentum balances to each phase and close the model by making constitutive assumptions about their material properties, interactions between the two phases and the factors that regulate cell proliferation and death.

The Mass and Momentum Balance Equations. Applying the principle of mass balance to $n$ and $w$ yields
$$\frac{\partial n}{\partial t} + \frac{\partial}{\partial x}(v_n n) = S_n, \tag{12.39}$$
$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x}(v_w w) = S_w = -S_n. \tag{12.40}$$
In (12.39) and (12.40), $S_n$ and $S_w$ are the net rates at which cells and water are produced and we have assumed that there are no internal sources or sinks of mass so that mass is converted from one phase to the other (i.e., $S_n = -S_w$).
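Applying the principle of momentum balance to $n$ and $w$, and neglecting inertial effects, we have
$$0 = \frac{\partial}{\partial x}(n\sigma_n) + F_{nw} + p\,\frac{\partial n}{\partial x}, \tag{12.41}$$
$$0 = \frac{\partial}{\partial x}(w\sigma_w) - F_{nw} + p\,\frac{\partial w}{\partial x}. \tag{12.42}$$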
In (12.41) and (12.42) the first term represents internal forces in each phase while $F_{nw}$ denotes the force exerted by the water on the cells, with an equal and opposite force acting on the water. The third term models interfacial effects and introduces the interfacial pressure, $p$. This term arises when we average over discrete cells to obtain a continuum limit (for details, see [33]). We determine $p$ by assuming that there are no holes or voids within the tumor so that
$$n + w = 1. \tag{12.43}$$

Constitutive Assumptions. To close our model we must impose boundary and initial conditions and specify functional forms for $S_n$, $\sigma_n$, $\sigma_w$ and $F_{nw}$. We start by considering the cell proliferation rate $S_n$ that appears in (12.39). Following [23], we assume that $S_n = S_n(n,c)$ where $c(x,t)$ denotes the concentration of an externally-supplied nutrient and
$$S_n(n,c) = \begin{cases} \dfrac{S_0\,(c - c_N)\,n}{1 + S_1 c} - \delta n, & \text{if } c \ge c_N, \\[2ex] -\dfrac{1 + S_2\tilde{c}}{1 + S_2 c}\,n, & \text{if } c \le c_N, \end{cases} \tag{12.44}$$
where $S_0$, $S_1$, $S_2$, $c_N$, $\tilde{c}$ and $\delta$ are positive constants. The first term models cell proliferation as an increasing, saturating function of nutrient concentration (in practice, there is a physical limit to how rapidly cells can divide), the second models apoptosis and the third necrosis. As in earlier sections, we assume that $c(x,t)$ satisfies a quasi-steady reaction-diffusion equation and introduce positive constants $Q_0$ and $Q_1$ such that
$$0 = \frac{\partial^2 c}{\partial x^2} - \frac{Q_0\,c\,n}{1 + Q_1 c}. \tag{12.45}$$
We remark that other choices for $S_n$ could be used to investigate, for example, the influence on the rates of cell proliferation and death of mechanical stimuli such as pressure and shear. Theoretical studies showing how such mechanical stimuli may alter the composition of a biological tissue are presented in [73–75].
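To see what the quasi-steady nutrient equation (12.45) implies in the simplest setting, the sketch below makes two simplifying assumptions that are not part of the chapter's model: $Q_1 = 0$ (no saturation of uptake) and a spatially uniform cell volume fraction $n$. Together with the conditions imposed on $c$ in this model, namely $\partial c/\partial x = 0$ at $x = 0$ and $c = c_\infty$ at $x = R$, the equation then reduces to $c'' = Q_0 n c$, whose solution is $c = c_\infty \cosh(ax)/\cosh(aR)$ with $a = \sqrt{Q_0 n}$. The function names and the parameter values are illustrative.

```python
import math


def nutrient_profile(x, R, c_inf, Q0, n):
    """Solution of c'' = Q0*n*c with c'(0) = 0 and c(R) = c_inf.

    Assumes Q1 = 0 and a constant cell volume fraction n (simplifications
    made for this illustration only).
    """
    a = math.sqrt(Q0 * n)
    return c_inf * math.cosh(a * x) / math.cosh(a * R)


def residual(x, R, c_inf, Q0, n, h=1e-3):
    """Central-difference estimate of c'' - Q0*n*c at x; should be close to zero."""
    def c(s):
        return nutrient_profile(s, R, c_inf, Q0, n)
    second_derivative = (c(x + h) - 2.0 * c(x) + c(x - h)) / h**2
    return second_derivative - Q0 * n * c(x)


if __name__ == "__main__":
    # Hypothetical dimensionless values.
    R, c_inf, Q0, n = 2.0, 1.0, 0.5, 0.8
    for x in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(x, nutrient_profile(x, R, c_inf, Q0, n))
    print("residual at x = 1:", residual(1.0, R, c_inf, Q0, n))
```

The profile is lowest at the tumor center and rises to $c_\infty$ at the boundary, which is the nutrient gradient that drives the spatial variation in $S_n$.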
We assume that the interaction force $F_{nw}$ is due to relative motion of the two phases and accordingly specify
$$F_{nw} = k\,n\,w\,(v_w - v_n), \tag{12.46}$$
assuming, for simplicity, that the drag coefficient is proportional to the volume fractions of each phase, with constant of proportionality $k$ (for other choices, see [37]).
When prescribing the stress tensors, $\sigma_n$ and $\sigma_w$, we view the water as an inviscid fluid and the cell phase as a viscous fluid, with viscosity $\mu_n$ so that
$$\sigma_w = -p_w \quad \text{and} \quad \sigma_n = -p_n + 2\mu_n\frac{\partial v_n}{\partial x}. \tag{12.47}$$
We interpret the viscosity $\mu_n$ heuristically as a measure of the cells’ affinity for each other: as the cells become more pathological, and less well differentiated, they will be less likely to maintain contact with each other and/or to shear and so their viscosity will decrease. With $\sigma_n$, $\sigma_w$ and $F_{nw}$ specified via equations (12.46)–(12.47), equations (12.41)–(12.42) define $v_n$ and $v_w$. However, two additional equations are needed to determine the phase pressures $p_w$ and $p_n$. For simplicity (and consistency with existing multiphase models), we fix
$$p_w = p, \tag{12.48}$$
and focus on two alternative closures for $p_n$. In the first case, we assume that the cell phase is incompressible so that
$$n = n^*, \tag{12.49}$$
for some positive constant $n^* \in (0,1)$. We remark that in this case, the no-voids assumption, equation (12.43), guarantees incompressibility of the fluid phase. In the second case, we follow [23], viewing the cells as bags of water, with an additional pressure, $\Sigma(n)$, to account for the way in which cells differ from water. In more detail,
$$p_n = p + \Sigma(n), \tag{12.50}$$
where
$$\Sigma(n) = \begin{cases} 0, & 0 \le n < n_0, \\[1ex] \alpha\,\dfrac{(n - n_0)^2\,(n - n_2)}{(1 - n)^{\beta}}, & n_0 \le n < 1, \end{cases} \tag{12.51}$$
and
$$\beta = \frac{(1 - n_1)(3n_1 - 2n_2 - n_0)}{(n_1 - n_0)(n_2 - n_1)}, \qquad \alpha = \frac{\Sigma_1\,(1 - n_1)^{\beta}}{(n_1 - n_0)^2\,(n_2 - n_1)},$$
so that $\Sigma(n_1) = -\Sigma_1$ and $\Sigma'(n_1) = 0$ for $n_0 < n_1 < n_2$.
Thus, if $n < n_0$ the cells exert no influence on each other. By contrast, when $n_0 < n < n_2$ the cells are attracted to each other. Finally, when $n_2 < n < 1$ the cells repel each other, the repulsive force becoming infinite as $n \to 1$. We illustrate these effects in Figure 12.2 where we sketch $\Sigma(n)$. It remains to specify the tumor’s growth rate. As in Section 12.1.3, we assume that the tumor boundary $x = R(t)$ moves with the local cell velocity there so that
$$\frac{dR}{dt} = v_n(R,t). \tag{12.52}$$
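[Figure 12.2: sketch of $\Sigma$ (vertical axis, with $\Sigma_1$ marked) against the cell volume fraction $\phi_T$ (horizontal axis, with $\phi_0$, $\phi_1$, $\phi_2$ and 1 marked).]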
Figure 12.2. Schematic diagram showing how the function $\Sigma(n)$ varies with $n$. The black arrows represent the forces experienced by the black cells due to the presence of the grey cells (and conversely).
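The shape sketched in Figure 12.2 can be reproduced directly from the formula for $\Sigma(n)$. The following sketch assumes the form $\Sigma(n) = \alpha\,(n - n_0)^2 (n - n_2)/(1 - n)^{\beta}$ for $n_0 \le n < 1$ (and zero below $n_0$), with $\alpha$ and $\beta$ defined in terms of $n_0$, $n_1$, $n_2$ and $\Sigma_1$ as in the text. It checks the three regimes: no interaction, attraction with a minimum of $-\Sigma_1$ at $n = n_1$, and repulsion. The numerical values of $n_0$, $n_1$, $n_2$ and $\Sigma_1$ are hypothetical.

```python
def sigma_constants(n0, n1, n2, Sigma1):
    """alpha and beta in terms of n0 < n1 < n2 and the depth Sigma1 of the minimum."""
    beta = (1.0 - n1) * (3.0 * n1 - 2.0 * n2 - n0) / ((n1 - n0) * (n2 - n1))
    alpha = Sigma1 * (1.0 - n1) ** beta / ((n1 - n0) ** 2 * (n2 - n1))
    return alpha, beta


def Sigma(n, n0, n1, n2, Sigma1):
    """Additional cell pressure Sigma(n) for a cell volume fraction 0 <= n < 1."""
    if not 0.0 <= n < 1.0:
        raise ValueError("the volume fraction n must lie in [0, 1)")
    if n < n0:
        return 0.0  # cells too sparse to interact
    alpha, beta = sigma_constants(n0, n1, n2, Sigma1)
    return alpha * (n - n0) ** 2 * (n - n2) / (1.0 - n) ** beta


if __name__ == "__main__":
    # Hypothetical values; beta > 0 requires 3*n1 > 2*n2 + n0.
    n0, n1, n2, Sigma1 = 0.2, 0.5, 0.6, 1.0
    for n in (0.1, 0.3, 0.5, 0.6, 0.8, 0.95):
        print(n, Sigma(n, n0, n1, n2, Sigma1))
```

Note that $\beta$ is positive, and hence the repulsion grows without bound as $n \to 1$, only if $3n_1 > 2n_2 + n_0$. The illustrative values above satisfy this constraint.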
Boundary and Initial Conditions. Our two-phase model of avascular tumor growth comprises equations (12.39)–(12.52) which we close by specifying the following boundary and initial conditions:
$$v_n = 0 = v_w = \frac{\partial c}{\partial x} \quad \text{at } x = 0, \tag{12.53}$$
$$\sigma_w = -p = 0, \quad \sigma_n = -p_n + 2\mu_n\frac{\partial v_n}{\partial x} = 0, \quad c = c_\infty \quad \text{at } x = R(t), \tag{12.54}$$
$$n(x,0) = n_0(x), \quad w(x,0) = 1 - n(x,0) = 1 - n_0(x), \quad R(0) = R_0. \tag{12.55}$$
Equations (12.53) ensure symmetry about $x = 0$ while the first two of equations (12.54) guarantee continuity of (the normal component of) the cell and water stress tensors across $x = R(t)$, with the pressure outside the tumor normalized so that $p = 0$ there. Additionally, we assume that $c$ is continuous across $x = R$ and denote by $c_\infty$ the nutrient concentration in the surrounding medium. Finally, equations (12.55) specify the initial cell and water distributions within the tumor and its initial size.

12.1.4.3 Model Reduction

Using the constitutive assumptions specified above, it is straightforward to show that the governing equations can be written
$$\frac{\partial n}{\partial t} + \frac{\partial}{\partial x}(n v_n) = S_n, \quad \frac{\partial}{\partial x}\bigl(n v_n + (1 - n)\,v_w\bigr) = 0, \tag{12.56}$$
$$\frac{\partial}{\partial x}\left(-n p_n - w p + 2\mu_n n\,\frac{\partial v_n}{\partial x}\right) = 0, \quad \frac{\partial p}{\partial x} = -k\,n\,(v_w - v_n), \tag{12.57}$$
$$0 = \frac{\partial^2 c}{\partial x^2} - \frac{Q_0\,c\,n}{1 + Q_1 c}, \tag{12.58}$$
$$\frac{dR}{dt} = v_n(R,t), \tag{12.59}$$
wherein $p_n$ is determined by imposing either (12.49) or (12.50).
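Integrating the second of equations (12.56) subject to (12.53) (which gives $n v_n + w v_w = 0$) and substituting in equations (12.57), we deduce that
$$v_w = -\frac{1}{k}\frac{\partial p}{\partial x}, \quad v_n = -\frac{w}{n}\,v_w = \frac{w}{k n}\frac{\partial p}{\partial x}$$
and
$$p_n = -\frac{w}{n}\,p + 2\mu_n\frac{\partial v_n}{\partial x} = -\frac{w}{n}\,p + 2\mu_n\frac{\partial}{\partial x}\left(\frac{w}{k n}\frac{\partial p}{\partial x}\right).$$
Under closure 1, $n = n^*$ and $w = 1 - n^*$ and our model simplifies further to give
$$\frac{\partial v_n}{\partial x} = \frac{S_n}{n^*} \quad\Longrightarrow\quad \frac{dR}{dt} = v_n\big|_{x=R(t)} = \frac{1}{n^*}\int_0^R S_n\,dx. \tag{12.60}$$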
With $S_n = S_n(n^*, c)$, the model comprises a reaction-diffusion equation for $c$ (see equation (12.58)) and an integro-differential equation for $R$ (see equation (12.60)), along with the appropriate boundary and initial conditions. This reduced model is equivalent to the PDE models discussed in Section 12.1.2. The analysis presented here shows how these early models of diffusion-limited growth can be viewed as special cases of the two-phase models under the assumption of incompressibility. We remark further that it is possible to relax the assumption that cell proliferation and death are dominated by the availability of diffusible nutrients and to consider, instead, situations in which these processes are regulated by mechanical phenomena. Indeed, with $S_n = S_n(p_n, p_w, v_n)$, we obtain new models whose investigation will be the subject of future work. Under closure 2, $p_n = p + \Sigma(n)$ and our model reduces to the first of equations (12.56), together with equations (12.58), (12.59) and
$$\frac{\partial}{\partial x}\left(2\mu_n n\,\frac{\partial v_n}{\partial x} - n\Sigma\right) = \frac{k n}{1 - n}\,v_n. \tag{12.61}$$
These equations for $n$, $v_n$, $c$ and $R$ are solved subject to conditions (12.53)–(12.55). A typical numerical simulation taken from [23] is presented in Figure 12.3. It shows how the cell volume fraction $n(x,t) = \phi_T(x,t)$ and tumor radius $R(t) = L(t)$ evolve over time toward a steady state. For further details of numerical simulations and model analysis, see [10, 23]. Under the additional assumption that viscous effects are negligible, so that $\mu_n \to 0$, equation (12.61) may be used to eliminate $v_n$ from the first of equations (12.56), yielding
$$\frac{\partial n}{\partial t} = \frac{\partial}{\partial x}\left(\frac{1 - n}{k}\,\frac{\partial}{\partial x}(n\Sigma)\right) + S_n.$$
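One way to read the inviscid equation is as a nonlinear diffusion equation, $\partial n/\partial t = \partial_x\bigl(D(n)\,\partial_x n\bigr) + S_n$, with the effective coefficient $D(n) = \frac{1 - n}{k}\,\frac{d}{dn}\bigl(n\Sigma(n)\bigr)$ obtained by applying the chain rule to $\partial_x(n\Sigma)$. The sketch below evaluates $D(n)$ by central differences, assuming the same form of $\Sigma(n)$ as in (12.51). The name `effective_diffusivity`, the drag coefficient value and the constants $n_0$, $n_1$, $n_2$, $\Sigma_1$ are hypothetical choices for illustration.

```python
def Sigma(n, n0=0.2, n1=0.5, n2=0.6, Sigma1=1.0):
    """Additional cell pressure, using hypothetical default constants."""
    if n < n0:
        return 0.0
    beta = (1.0 - n1) * (3.0 * n1 - 2.0 * n2 - n0) / ((n1 - n0) * (n2 - n1))
    alpha = Sigma1 * (1.0 - n1) ** beta / ((n1 - n0) ** 2 * (n2 - n1))
    return alpha * (n - n0) ** 2 * (n - n2) / (1.0 - n) ** beta


def effective_diffusivity(n, k=1.0, h=1e-6):
    """D(n) = (1 - n)/k * d(n*Sigma)/dn, derivative taken by central differences."""
    def potential(s):
        return s * Sigma(s)
    slope = (potential(n + h) - potential(n - h)) / (2.0 * h)
    return (1.0 - n) / k * slope


if __name__ == "__main__":
    for n in (0.1, 0.3, 0.4, 0.55, 0.7, 0.8, 0.9):
        print(n, effective_diffusivity(n))
```

Under these assumptions $D(n)$ vanishes for $n < n_0$, is negative wherever $n\Sigma$ decreases with $n$ (so the cell flux is directed up the gradient of $n$, consistent with attraction between cells) and is positive at high densities, where repulsion spreads the cells out.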
[Figure 12.3: panel (a) plots the cell volume fraction $\phi_T$ against $x$; panel (b) plots the tumor radius $L$ against time $\tilde{t}$.]
Figure 12.3. (a) Diagram showing the evolution of the cell volume fraction $n(x,t) = \phi_T(x,t)$ toward a steady state at times $t = 500, 1000, 2000, 4000$; (b) Diagram showing how the evolution of the tumor radius, $R(t) = L(t)$, to its equilibrium value varies as the maximum tumor cell proliferation rate is increased. As $S_0$ increases, the tumor’s equilibrium size increases while the time taken to reach the steady state decreases. From the lower to the upper curve $S_0$ increases from $S_0 = 0.005$ to $S_0 = 0.0125$.
Thus, when $\mu_n \to 0$ our model comprises a quasi-steady diffusion equation for $c(x,t)$, a nonlinear diffusion equation for $n$ and associated boundary and initial conditions. This model is similar to existing reaction-diffusion models of tumor growth [44, 87]. The key differences here are our formulation of the problem on a moving domain and the physical mechanisms driving cell diffusion: here they are drag and cell-cell interactions rather than random cellular motion. It is also possible to show that when viscous effects are neglected the pressures and velocities in each phase are such that
$$v_w = -\frac{1}{k}\frac{\partial p_w}{\partial x} \quad \text{and} \quad v_n = -\frac{(1 - n)}{k n}\,\frac{\partial}{\partial x}\left(\frac{n}{1 - n}\,p_n\right).$$
Thus for one-dimensional growth, a Darcy-type law governs the velocities of each phase, providing justification for the models of tumor growth that were originally developed by Greenspan [48] and discussed in Section 12.1.3.

12.1.4.4 Discussion

In this section we have used a multiphase approach to develop models of avascular tumor growth that account for cellular heterogeneity and permit a more general description of cell movement than was possible using the models of Sections 12.1.2 and
12.1.3. We have also shown how reaction-diffusion models can be recovered when appropriate asymptotic limits are taken and provided stronger justification for the use of Darcy’s law to describe cell movement in existing phenomenological models [48]. There are many ways in which this two-phase model could be extended. Motivated by experimental work which suggests that mechanical effects can influence cell proliferation, in [23] Byrne and Preziosi assumed $S_n = S_n(c,p)$ in equation (12.44) and showed that a proliferation rate that was a decreasing function of the pressure $p$ could limit cell growth as an alternative to nutrient limitation. A different approach was adopted in [26]. There the growth of a two-phase tumor embedded in a poroelastic gel was studied, the gel compressing in response to the expansive force exerted on it by the growing tumor mass, and exerting an equal and opposite restraining force on the tumor. The model reproduced experimental results which showed that the equilibrium size attained by a tumor spheroid cultured in such a gel decreases as the stiffness of the gel increases [54]. Roose et al. obtained similar results using a similar approach in which the tumor was treated as a two-phase poroelastic solid [86]. Other authors have developed multiphase models involving three or more phases to describe vascular tumor growth in one and two dimensions [9, 55], the response of tumors to treatment involving macrophages that have been genetically engineered to release anti-tumor drugs when they localize in low oxygen regions of the tumor [76], and tumor encapsulation [57]. Tumor encapsulation refers to the process by which a dense collagenous rim forms around a tumor, keeping it localized and preventing it from invading the surrounding tissue.
Using a three-phase model, Jackson and Byrne showed that the capsule was more likely to form as a result of compression of the existing extracellular matrix (caused by expansion of the tumor mass) than as a result of the immune response stimulating the deposition of new collagenous material. In a related manner, Preziosi and Tosin have, more recently, used a multiphase approach to investigate how interactions between the tumor and the extracellular matrix influence the tumor’s growth and remodeling of the extracellular matrix [83, 99]. Several other authors have used continuum mechanical approaches to develop models of solid tumor growth. In [25], Chaplain and Sleeman viewed the proliferating rim of a radially-symmetric expanding multicellular spheroid (MCS) as an elastic shell, enclosing a fluid-like necrotic core. A weakness of this model is that the volume of the proliferating rim does not change. Consequently, as the tumor grows, the rim gets progressively thinner: in practice, however, when MCS are cultured in vitro the width of the proliferating rim evolves to a constant value and, so, the volume of the proliferating rim increases as the tumor radius increases. Jones et al. formulated a single phase mechanical model in which the usual constitutive law for linear elasticity was reformulated in terms of rates of strain and stress in order to account for the continuous (and nonuniform) cell growth and death within the tumor mass [60]. A weakness of this model is that the stress becomes unbounded (and, hence, physically unrealistic) when the tumor reaches an equilibrium configuration. Several authors have since suggested modifications that resolve this deficiency. Araujo and McElwain introduce anisotropy,
assuming that cell proliferation increases the stress in the tangential direction whereas cell death relieves stress in the radial direction [8]. By contrast, MacArthur and Please show that the introduction of viscosity into the material constitutive law relieves the excessive stresses that develop in Jones et al.’s purely elastic tumor [66], while Roose et al. decompose the tumor into a two-phase mixture [86]. Other areas that merit further investigation include extending the multiphase models to two and three spatial dimensions and investigating the impact on the tumor’s growth dynamics of using different constitutive laws to describe its mechanical properties. For example, we might view the cell phase as an elastic [86] or viscoelastic material. Finally, when developing mechanical models of the type discussed in this section, it is important to verify that the constitutive laws used to close the models do not violate the laws of thermodynamics (for details, see [4, 5] and references therein).
12.1.5 Conclusions

In this chapter we have reviewed a series of increasingly complex, continuum models of avascular tumor growth. These range from one-dimensional models of radially-symmetric growth (see Section 12.1.2) to two-dimensional models that can be used to determine the stability of the radially-symmetric solutions to symmetry-breaking perturbations and, hence, to establish conditions under which the tumor remains localized (i.e., stable) and conditions under which it becomes invasive (i.e., unstable; see Section 12.1.3). In Section 12.1.4 we introduced a multiphase modeling approach which extended the models of Sections 12.1.2 and 12.1.3 to allow for tumor heterogeneity. We showed how the constitutive assumptions that are used to close the model influence its structure. For example, if the tumor cell and fluid phases are assumed to be incompressible, with fixed volume fractions, then moving boundary problems of the type presented in Section 12.1.2 are recovered. Alternatively, if the two phases are treated as isotropic fluids whose pressures are related by a prescribed function of the cell volume fraction then we recover reaction-diffusion models of tumor invasion similar to those proposed in [44, 87]. As mentioned in the introduction, we have focused our review on spatially-structured continuum models and neglected the rapidly increasing number of stochastic and cell-based models of avascular tumor growth. Indeed, several probabilistic models of avascular tumor growth that reproduce the same behavior as the PDE models have been developed and shown to exhibit good qualitative and quantitative agreement with experimental data. Models of this type that focus on individual cells and their interactions with neighboring cells use concepts ranging from Markov chain processes [53], through cellular automata [6, 34, 69] to Potts models which are based on stochastic energy minimization techniques [59, 92, 101].
When comparing discrete, cell-based and continuum models of tumor growth, an obvious advantage of cell-based models is the relative ease with which parameters to model their behavior can be estimated from measurable biological and biophysical quantities, such as cell growth rates during the cell cycle and cell membrane deformation in response to mechanical loading. However, given that tumors growing in vitro and in vivo typically contain between 10^6 and 10^11 cells, it might be more practical to use a continuum rather than a cell-based model to simulate their development. In [20], Byrne and Drasdo developed complementary cell-based and continuum models of the growth of tumor spheroids that exhibited similar growth kinetics. By fitting the profiles for the tumor radius and pressure distribution generated by each model they estimated parameters for the continuum model from parameters in the cell-based model. In this way, they have shown how cell-based models can be used as an intermediate step to relate measurable biophysical properties of individual cells to parameters that appear in continuum models of tumor spheroids. Other authors have used more formal theoretical approaches to relate discrete and continuum models. For example, in [38] a continuum model, comprising a mixed system of partial differential equations, is derived, in the limit of large numbers of cells, to describe the dynamics of a system of tightly adherent (visco-elastic) cells which are subject to drag due to cell-substrate adhesion. In [71], Murray and coworkers show formally that it is possible to model the movement of a population of individual cells connected by overdamped elastic springs by a nonlinear diffusion equation for the cell density. In [72] the approach is extended to develop a continuum model of cell movement and proliferation in which cell proliferation depends on the subcellular dynamics of particular proteins. An exciting, alternative approach is being developed by Kim and coworkers [63]: they propose a new type of hybrid model in which a continuum model is used in regions where the tissue density does not change markedly and a discrete, cell-based model is used in regions characterized by high rates of cell proliferation.
One of the key challenges associated with such models is determining how to couple the continuum and discrete regions of the tissue domain. The wide range of mathematical approaches being used to study tumor growth, and avascular tumor growth in particular, can make it difficult to know what type of model is best suited to a particular problem and what level of detail should be included. The situation can be further exacerbated when we realize that different mathematical approaches can reproduce the same experimental results! In such cases, it may be appropriate to appeal to Occam’s razor and develop a model that includes sufficient detail to address the question of interest but not so much that it becomes obscured in detail. In practice, close collaboration between theoreticians and biomedical researchers is vital to getting this balance right, because the models are only ever as good as the assumptions used to construct them and the data with which they are validated. Indeed, in many respects the form of the initial model is less important than starting the dialogue between experimentalists and modelers because the model will almost certainly be wrong. In the same way that a new experimental protocol requires testing and optimization before data collection can begin, mathematical models require refinement before they can be used to address real problems.
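As a rough illustration of the discrete-to-continuum link mentioned above for cells joined by overdamped springs (cf. [71]), the following Python sketch evolves a one-dimensional chain of cell centres and tracks the local cell density, taken here as the reciprocal of the spacing between neighbours. It is a minimal sketch under stated assumptions and not the model of [71]: the spring constant `k`, drag coefficient `eta`, pinned end cells, explicit Euler time stepping and the initial positions are all hypothetical choices made for illustration.

```python
import numpy as np


def relax_spring_chain(x0, k=1.0, eta=1.0, dt=0.01, steps=20000):
    """Evolve cell centres joined by identical linear springs (overdamped limit).

    Assumed force balance for interior cell i (illustrative, not from [71]):
        eta * dx_i/dt = k * (x_{i+1} - 2 * x_i + x_{i-1})
    The spring rest length cancels because all springs are identical.
    The two end cells are held fixed.
    """
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        # Net spring force on each interior cell, from its two neighbours.
        force = k * (x[2:] - 2.0 * x[1:-1] + x[:-2])
        # Explicit Euler step; stable here because dt * k / eta <= 0.5.
        x[1:-1] += dt * force / eta
    return x


def cell_density(x):
    """Local cell density, approximated as 1 / (distance between neighbours)."""
    return 1.0 / np.diff(x)


# Hypothetical initial state: 11 cells on [0, 10], crowded near the left end.
n_cells, length = 11, 10.0
x_start = length * np.linspace(0.0, 1.0, n_cells) ** 2
x_end = relax_spring_chain(x_start)

print("initial density:", np.round(cell_density(x_start), 3))
print("final density:  ", np.round(cell_density(x_end), 3))
```

In this sketch the initially crowded region spreads out until the spacing, and hence the density, is uniform, which is the qualitative behavior one expects of a diffusion-type equation for the cell density. Recovering the specific nonlinear diffusion coefficient requires the formal analysis given in [71].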
To date, most of the mathematical modeling that has been carried out has been retrospective, being developed in response to a set of experimental data or to test a biological hypothesis. As theoreticians become more involved in experimental design, the models that they develop should better represent the experimental data, and vice versa. Additionally, the quality and practical use of the mathematical models should also increase, contributing ultimately to improved treatment and a better prognosis for cancer patients worldwide.
Bibliography
[1] J. A. Adam, A simplified mathematical model of tumour growth, Math. Biosci. 81 (1986), 229–242.
[2] J. A. Adam, A mathematical model of tumour growth. II Effects of geometry and spatial uniformity on stability, Math. Biosci. 86 (1987), 183–211.
[3] J. A. Adam, N. Bellomo, A survey of models for tumour immune system dynamics, Birkhäuser Press, Boston, Cambridge (MA), 1997.
[4] D. Ambrosi, F. Mollica, On the mechanics of a growing tumour, Intl. J. Eng. Sci. 40 (2002), 1297–1316.
[5] D. Ambrosi, L. Preziosi, On the closure of mass balance models for tumor growth, Math. Models Methods Appl. Sci. 12 (2002), 737–754.
[6] A. R. Anderson, A hybrid mathematical model of solid tumour invasion: the importance of cell adhesion, Math. Med. Biol. 22 (2005), 163–186.
[7] A. R. A. Anderson, A. M. Weaver, P. T. Cummings, V. Quaranta, Tumour morphology and phenotypic evolution driven by selective pressure from the microenvironment, Cell 127(5) (2006), 905–915.
[8] R. P. Araujo, D. L. S. McElwain, A history of the study of solid tumor growth: the contribution of mathematical modelling, Bull. Math. Biol. 66 (2004), 1039–1091.
[9] C. J. W. Breward, H. M. Byrne, C. E. Lewis, A multiphase model describing vascular tumour growth, J. Math. Biol. 65 (2003), 609–640.
[10] C. J. W. Breward, H. M. Byrne, C. E. Lewis, The role of cell-cell interactions in a two-phase model of solid tumor growth, J. Math. Biol. 45 (2002), 125–152.
[11] A. C. Burton, Rate of growth of solid tumours as a problem of diffusion, Growth 30 (1966), 157–176.
[12] H. M. Byrne, The importance of intercellular adhesion in the development of carcinomas, IMA J. Math. Appl. Med. Biol. 14 (1997), 305–323.
[13] H. M. Byrne, A weakly nonlinear analysis of a model of avascular solid tumour growth, J. Math. Biol. 33 (1999), 59–89.
[14] H. M. Byrne, Mathematical modelling of solid tumour growth: from avascular to vascular, via angiogenesis, in: IAS/Park City Mathematics Series, Volume 14 (Editors: M. A. Lewis, M. A. J. Chaplain, J. P. Keener and P. K. Maini). American Mathematical Society, 2009.
[15] H. M. Byrne, Dissecting cancer through mathematics: from the cell to the animal model, Nature Reviews Cancer 10(3) (2010), 221–230.
[16] H. M. Byrne, M. A. J. Chaplain, Growth of non-necrotic tumours in the presence and absence of inhibitors, Math. Biosci. 130 (1995), 151–181.
[17] H. M. Byrne, M. A. J. Chaplain, Growth of necrotic tumours in the presence and absence of inhibitors, Math. Biosci. 131 (1995), 187–216.
[18] H. M. Byrne, M. A. J. Chaplain, Free boundary value problem associated with the growth and development of multicellular spheroids, Eur. J. Appl. Math. 8 (1997), 639–658.
[19] H. M. Byrne, M. A. J. Chaplain, Necrosis and apoptosis: Distinct cell loss mechanisms in a mathematical model of solid tumour growth, J. Theor. Med. 1 (1998), 223–236.
[20] H. Byrne, D. Drasdo, Individual-based and continuum models of growing cell populations: a comparison, J. Math. Biol. 58 (2009), 657–687.
[21] H. M. Byrne, J. R. King, D. L. S. McElwain, L. Preziosi, A two-phase model of solid tumor growth, Appl. Math. Lett. 16 (2003), 567–573.
[22] H. M. Byrne, P. C. Matthews, Asymmetric growth of avascular tumours: exploiting symmetries, IMA J. Math. Appl. Med. Biol. 19 (2002), 1–29.
[23] H. M. Byrne, L. Preziosi, Modelling solid tumor growth using the theory of mixtures, IMA J. Math. Appl. Med. Biol. 20 (2003), 341–366.
[24] M. A. J. Chaplain, M. Ganesh, I. Graham, Spatio-temporal pattern formation on spherical surfaces: numerical simulation and application to solid tumour growth, J. Math. Biol. 42 (2001), 387–423.
[25] M. A. J. Chaplain, B. D. Sleeman, Modelling the growth of solid tumours and incorporating a method for their classification using nonlinear elasticity theory, J. Math. Biol. 31 (1993), 431–473.
[26] C. Y. Chen, H. M. Byrne, J. R. King, The influence of growth-induced stress from the surrounding medium on the development of multicell spheroids, J. Math. Biol. 43 (2001), 191–220.
[27] X. Chen, A. Friedman, A free boundary problem for elliptic-hyperbolic system: an application to tumour growth, SIAM J. Math. Anal. 35 (2003), 974–986.
[28] J. Crank, Free and Moving Boundary Problems, Clarendon Press, Oxford, 1984.
[29] V. Cristini, J. Lowengrub, Q. Nie, Nonlinear simulation of tumour growth, J. Math. Biol. 46 (2003), 191–224.
[30] S. B. Cui, A. Friedman, Analysis of a mathematical model of the effect of inhibitors on the growth of tumours, Math. Biosci. 164 (2000), 103–137.
[31] E. De Angelis, L. Preziosi, Advection-diffusion models for solid tumour evolution in vivo and related free boundary problems, Math. Models Methods Appl. Sci. 10 (2000), 379–407.
[32] D. Drasdo, S. Hoehme, M. Block, On the role of physics in the growth and pattern formation of multi-cellular systems: what can we learn from individual-cell based models?, J. Stat. Phys. 128 (2007), 287–345.
[33] D. A. Drew, L. A. Segel, Averaged equations for two-phase flows, Stud. Appl. Math. 50 (1971), 205–231.
[34] W. Duchting, Spatial structure of tumour growth: a simulation study, IEEE Transactions on Systems, Man and Cybernetics 10(6) (1980), 292–296.
[35] H. Enderling, M. A. J. Chaplain, A. R. A. Anderson, J. S. Vaidya, A mathematical model of breast cancer development, local treatment and recurrence, J. Theor. Biol. 246(2) (2007), 245–259.
[36] J. Folkman, M. Hochberg, Self-regulation of growth in three-dimensions, J. Exp. Med. 138 (1973), 745–753.
[37] A. C. Fowler, Mathematical models in the applied sciences, Cambridge University Press, Cambridge, 1997.
[38] J. A. Fozard, H. M. Byrne, O. E. Jensen, J. R. King, Continuum approximations of individual-based models for multicellular systems, Math. Med. Biol. 27(1) (2010), 39–74.
[39] S. J. Franks, H. M. Byrne, J. R. King, C. E. Lewis, Modelling the growth of ductal carcinoma in situ, J. Math. Biol. 47 (2003), 424–452.
[40] S. J. Franks, H. M. Byrne, H. S. Mudhar, J. C. E. Underwood, C. E. Lewis, Mathematical modelling of comedo ductal carcinoma in situ of the breast, Math. Med. Biol. 20 (2003), 277–308.
[41] S. J. Franks, H. M. Byrne, J. C. E. Underwood, C. E. Lewis, Biological inferences from a mathematical model of comedo duct carcinoma in situ of the breast, J. Theor. Biol. 232 (2005), 523–543.
[42] A. Friedman, Mathematical analysis and challenges arising from models of tumour growth, Math. Mod. Meth. Appl. Sci. 17 (2007), 1751–1772.
[43] A. Friedman, F. Reitich, Analysis of a mathematical model for the growth of tumours, J. Math. Biol. 38 (1999), 262–284.
[44] R. A. Gatenby, E. T. Gawlinski, A reaction-diffusion model of cancer invasion, Cancer Res. 56 (1996), 5745–5753.
[45] R. A. Gatenby, E. T. Gawlinski, The glycolytic phenotype in carcinogenesis and tumor invasion: insights through mathematical models, Cancer Res. 63 (2003), 3847–3854.
[46] R. A. Gatenby, P. K. Maini, Mathematical oncology: Cancer summed up, Nature 421 (2003), 321.
[47] H. P. Greenspan, Models for the growth of a solid tumour by diffusion, Stud. Appl. Math. 52 (1972), 317–340.
[48] H. P. Greenspan, On the growth and stability of cell cultures and solid tumours, J. Theor. Biol. 56 (1976), 229–242.
[49] K. Groebe, W. Mueller-Klieser, Distributions of oxygen, nutrient and metabolic waste concentration in multicellular spheroids and their dependence on spheroid parameters, Eur. Biophys. J. 19 (1991), 169–181.
[50] K. Groebe, W. Mueller-Klieser, On the relation between size of necrosis and diameter of tumour spheroids, Int. J. Radiat. Oncol. 34 (1996), 395–401.
[51] D. Hanahan, R. A. Weinberg, The hallmarks of cancer, Cell 100 (2000), 57–70.
[52] D. Hanahan, R. A. Weinberg, Hallmarks of cancer: the next generation, Cell 144 (2011), 646–674.
[53] L. G. Hanin, A stochastic model of tumour response to fractionated radiation: limit theorems and rate of convergence, Math. Biosci. 191(1) (2004), 1–17.
[54] G. Helmlinger, P. A. Netti, H. C. Lichtenbeld, R. J. Melder, R. K. Jain, Solid stress inhibits the growth of multicellular tumour spheroids, Nature Biotech. 15 (1997), 778–783.
[55] M. E. Hubbard, H. M. Byrne, Multiphase modelling of vascular tumour growth in two spatial dimensions, J. Math. Biol. (2012), (under revision).
[56] T. L. Jackson, Intracellular accumulation and mechanism of action of doxorubicin in a spatio-temporal tumour model, J. Theor. Biol. 220(2) (2003), 201–213.
[57] T. L. Jackson, H. M. Byrne, A mechanical model of tumor encapsulation and transcapsular spread, Math. Biosci. 180 (2002), 307–328.
[58] T. L. Jackson, S. R. Lubkin, J. D. Murray, Theoretical analysis of conjugate localization in two-step cancer chemotherapy, J. Math. Biol. 39 (1999), 353–376.
[59] Y. Jiang, J. Pjesivan-Grbovic, C. Cantrell, J. P. Freyer, A multiscale model for avascular tumour growth, Biophys. J. 89 (2005), 3884–3894.
[60] A. S. Jones, H. M. Byrne, J. W. Dold, J. Gibson, A mathematical model of the stress induced during solid tumour growth, J. Math. Biol. 40 (2000), 473–499.
[61] C. E. Kelly, R. D. Leek, H. M. Byrne, S. M. Cox, A. L. Harris, C. E. Lewis, Modelling macrophage infiltration into avascular tumours, J. Theor. Med. 4 (2002), 21–38.
[62] J. F. R. Kerr, Shrinkage necrosis: a distinct mode of cell death, J. Path. 105 (1971), 13–20.
[63] Y. Kim, M. A. Stolarska, H. G. Othmer, A hybrid model for tumour spheroid growth in vitro I: theoretical development and early results, Math. Model. Meth. Appl. Sci. 17 (2007), S1773–S1798.
[64] K. A. Landman, C. P. Please, Tumour dynamics and necrosis: Surface tension and stability, IMA. J. Math. Appl. Med. 18 (2001), 131–158.
[65] C.-Y. Li, S. Shan, Q. Huang, R. D. Braun, J. Lanzen, K. Hu, L. Lin, M. W. Dewhirst, Initial stages of tumour cell-induced angiogenesis: evaluation via skin window-chambers in rodent models, J. Natl. Cancer Inst. 92(2) (2000), 143–147.
[66] B. D. MacArthur, C. P. Please, Residual stress generation and necrosis formation in multicell tumour spheroids, J. Math. Biol. 49 (2004), 537–552.
[67] N. Mantzaris, S. Webb, H. G. Othmer, Mathematical modelling of tumour-induced angiogenesis, J. Math. Biol. 95 (2004), 111–187.
[68] D. L. S. McElwain, L. E. Morris, Apoptosis as a volume loss mechanism in mathematical models of solid tumour growth, Math. Biosci. 39 (1978), 147–157.
[69] J. Moreira, A. Deutsch, Cellular automaton models of tumour development: a critical review, Adv. Complex Systems 5 (2002), 247–267.
[70] V. C. Mow, S. C. Kuei, W. M. Lai, C. G. Armstrong, Biphasic creep and stress relaxation of articular cartilage in compression: theory and experiments, J. Biomech. Eng. 102(1) (1980), 73–84.
[71] P. J. Murray, C. M. Edwards, M. J. Tindall, P. K. Maini, From a discrete to a continuum model of cell dynamics in one dimension, Phys. Rev. E 80 (2009), 031912.
[72] P. J. Murray, J.-W. Kang, G. R. Mirams, S.-Y. Shin, H. M. Byrne, P. K. Maini, K.-H. Cho, Modelling spatially regulated β-catenin dynamics and invasion in intestinal crypts, Biophys. J. 99(3) (2010), 716–725.
[73] R. D. O’Dea, J. M. Osborne, A. J. El-Haj, H. M. Byrne, S. L. Waters, The interplay between scaffold degradation, tissue growth and cell behaviour in engineered tissue constructs, J. Math. Biol. (2012), (submitted).
[74] R. D. O’Dea, S. L. Waters, H. M. Byrne, A multiphase model for tissue construct growth in a perfusion bioreactor, Math. Med. Biol. 27(2) (2010), 95–127.
[75] J. M. Osborne, R. D. O’Dea, J. P. Whiteley, H. M. Byrne, S. L. Waters, The influence of bioreactor geometry and the mechanical environment on engineered tissues, J. Biomech. Eng. 132 (2010), 051006.
[76] M. R. Owen, H. M. Byrne, C. E. Lewis, Mathematical modelling of the use of macrophages as vehicles for drug delivery to hypoxic tumour sites, J. Theor. Biol. 226 (2004), 377–391.
[77] H. Perfahl, M. R. Owen, T. Alarcon, A. Lapin, P. K. Maini, M. Reuss, H. M. Byrne, 3D hybrid multiscale modelling of vascular tumour growth, PLoS One 6(4) (2011), e14790.
[78] A. J. Perumpanani, H. M. Byrne, Extracellular matrix concentration exerts selective pressure on invasive cells, Eur. J. Cancer 35 (1999), 1274–1280.
[79] G. J. Pettet, C. P. Please, M. J. Tindall, D. L. S. McElwain, The migration of cells in multicell tumour spheroids, Bull. Math. Biol. 63 (2001), 231–257.
[80] C. P. Please, G. J. Pettet, D. L. S. McElwain, A new approach to modelling the formation of necrotic regions in tumours, Appl. Math. Letters 11 (1998), 89–94.
[81] J. B. Plotkin, M. A. Nowak, The different effects of apoptosis and DNA repair on tumorigenesis, J. Theor. Biol. 214 (2002), 453–467.
[82] L. Preziosi, Cancer modelling and simulation, Chapman and Hall/CRC, Boca Raton (FL), 2003.
[83] L. Preziosi, A. Tosin, Multiphase modelling of tumour growth and extracellular matrix interaction: mathematical tools and applications, J. Math. Biol. 58 (2009), 625–656.
[84] V. Quaranta, K. A. Rejniak, P. Gerlee, A. R. Anderson, Invasion emerges from cancer cell adaptation to competitive microenvironments: quantitative predictions from multiscale mathematical models, Semin. Cancer Biol. 18 (2008), 338–348.
[85] T. Roose, S. J. Chapman, P. K. Maini, Mathematical models of avascular tumour growth: a review, SIAM Review 49 (2006), 179–208.
[86] T. Roose, P. A. Netti, L. L. Munn, Y. Boucher, R. K. Jain, Solid stress generated by spheroid growth estimated using a linear poroelasticity model, Microvasc. Res. 66 (2003), 204–212.
[87] J. A. Sherratt, Cellular growth and travelling waves of cancer, SIAM J. Appl. Math. 53 (1993), 1713–1730.
[88] R. M. Shymko, L. Glass, Cellular and geometric control of tissue growth and mitotic instability, J. Theor. Biol. 63 (1976), 355–374.
[89] K. Smallbone, D. J. Gavaghan, R. A. Gatenby, P. K. Maini, The role of acidity in solid tumour growth and invasion, J. Theor. Biol. 235 (2005), 476–484.
[90] S. L. Spencer, R. A. Gerety, K. J. Pienta, S. Forrest, Modelling somatic evolution in tumorigenesis, PLoS Comp. Biol. 2 (2006), e108.
[91] W. G. Stetler-Stevenson, S. Aznavoorian, L. A. Liotta, Tumour cell interactions with the extracellular matrix during invasion and metastasis, Ann. Rev. Cell. Biol. 9 (1993), 541–573.
[92] E. L. Stott, N. F. Britton, J. A. Glazier, M. Zajac, Stochastic simulation of benign avascular tumour growth using the Potts model, Math. Comp. Mod. 30 (1999), 183–198.
[93] R. M. Sutherland, R. E. Durand, Growth and cellular characteristics of multicell spheroids, Recent Results in Cancer Research 95 (1984), 24–49.
[94] K. R. Swanson, E. C. Alvord, J. D. Murray, A quantitative model for differential motility of gliomas in grey and white matter, Cell. Prolif. 33 (2000), 317–329.
[95] K. R. Swanson, E. C. Alvord, J. D. Murray, Quantifying efficacy of chemotherapy of brain tumors with homogeneous and heterogeneous drug delivery, Acta Biotheoretica 50 (2002), 223–237.
[96] K. R. Swanson, C. Bridge, J. D. Murray, E. C. Alvord, Virtual and real brain tumors: using mathematical modeling to quantify glioma growth and invasion, J. Neur. Sci. 216 (2003), 1–10.
[97] K. R. Swanson, R. C. Rockne, J. Claridge, M. A. Chaplain, E. C. Alvord Jr., A. R. Anderson, Quantifying the role of angiogenesis in malignant progression of gliomas: In silico modeling integrates imaging and histology, Cancer Res. 71 (2011), 7366.
[98] K. E. Thompson, H. M. Byrne, Modelling the internalisation of labelled cells in tumour spheroids, Bull. Math. Biol. 61 (1999), 601–623.
[99] A. Tosin, L. Preziosi, Multiphase modelling of tumour growth with matrix remodelling and fibrosis, Math. Comp. Modelling 52 (2010), 969–976.
[100] P. Tracqui, From passive diffusion to active cellular migration in mathematical models of tumour invasion, Acta Biotheoretica 43 (1995), 443–464.
[101] S. Turner, J. A. Sherratt, Intercellular adhesion and cancer invasion: a discrete simulation using the extended Potts model, J. Theor. Biol. 216 (2002), 85–100.
[102] I. M. M. van Leeuwen, G. R. Mirams, A. Walter, P. Murray, J. Osborne, S. Varma, S. J. Young, J. Cooper, B. Doyle, J. Pitt-Francis, L. Momtahan, P. Pathmanathan, J. P. Whiteley, S. J. Chapman, D. J. Gavaghan, O. E. Jensen, J. R. King, P. K. Maini, S. L. Waters, H. M. Byrne, An integrative computational model for intestinal tissue renewal, Cell Prolif. 42 (2009), 617–636.
[103] J. P. Ward, J. R. King, Mathematical modelling of avascular-tumour growth, IMA. J. Math. Appl. Med. 14 (1997), 39–69.
[104] J. P. Ward, J. R. King, Mathematical modelling of avascular-tumour growth II: Modelling growth saturation, IMA. J. Math. Appl. Med. 15 (1998), 1–42.
[105] J. P. Ward, J. R. King, Mathematical modelling of the effects of mitotic inhibitors on avascular tumor growth, J. Theor. Med. 1 (1999), 171–211.
[106] S. D. Webb, J. A. Sherratt, G. Fish, Alterations in proteolytic activity at low pH and its association with invasion: a theoretical model, Clin. Exptl. Metastasis 17(5) (1999), 397–407.
[107] L. M. Wein, J. T. Wu, D. H. Kirn, Validation and analysis of a mathematical model of a replication-competent oncolytic virus for cancer treatment: implications for virus design and therapy, Cancer Res. 15 (2003), 1317–1324.
Author Information
Helen M. Byrne, Oxford Centre for Collaborative and Applied Mathematics, Mathematical Institute, Oxford, UK
E-mail: [email protected]