Biomat 2012 - International Symposium On Mathematical And Computational Biology 9789814520829, 9789814520812

This is a book of a series on interdisciplinary topics of the Biological and Mathematical Sciences. The chapters corresp


English Pages 406 Year 2013


Author / Uploaded
Rubem P Mondaini


BIOMAT 2012 International Symposium on Mathematical and Computational Biology Tempe, Arizona, USA

6 – 10 November 2012

edited by

Rubem P Mondaini Federal University of Rio de Janeiro, Brazil

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

BIOMAT 2012 International Symposium on Mathematical and Computational Biology Copyright © 2013 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 978-981-4520-81-2

Printed in Singapore


Preface

The BIOMAT 2012 International Symposium was scheduled to be held in Mexico. However, due to some misunderstanding on the logistics of the submitted proposal of our Mexican colleagues, the BIOMAT Consortium (http://www.biomat.org) decided to accept an alternative offer of organizing the conference at a university in the state of Arizona, USA. The continuation of this history does not give any contribution to scientific development except by the report of some sad but useful lessons on political controversy. Everything which seems relevant to say is that the organization of the BIOMAT 2012 at the Four Points by Sheraton Hotel in Tempe, Arizona, was a tour de force and has been characterized by an intransigent defense of the multidisciplinary tradition of the BIOMAT Consortium activities as well as its fundamental mission of enhancing the scientific collaboration of dedicated practitioners of developing countries worldwide. This defense has been conducted on a “no retreat, no surrender” basis and we are indebted to all colleagues, authors of accepted papers, Keynote Speakers, Senior professionals, Post Docs and Research Students who attended the conference and gave a lucid example of scientific professionalism.

We have no authorities and/or representatives of universities and sponsoring institutions to acknowledge this time. After twelve years of organization of conferences of the BIOMAT series, the fund-raising for the BIOMAT 2012 was made essentially from registration fees and some savings of the two previous conferences. This was strictly necessary to keep the minimum quality of the administrative work as well as to offer fellowships to twenty research students and young Post Docs, in terms of accommodation with breakfast included on a double occupancy level. Unfortunately, this also corresponds to the present “tabula rasa” financial situation of the Consortium.

On behalf of the BIOMAT Consortium, we are pleased to acknowledge the excellent professional work of some collaborators: Alicia Johnson, the Sales Manager of the conference venue hotel, and the hotel staff for their help with the catering services of the Reception Cocktail, the Coffee-Breaks and the Conference Dinner. We also thank the staff of the BIOMAT 2012 Symposium, Larissa Costa, Marcelo Domingues and Reinaldo Viana, for their competent work on the conference registration office, the expert technical assistance on a talk given in teleconference format and the photographic


record of the conference, respectively. The Editor of this BIOMAT book series would like to acknowledge his wife Carmem Lucia for her exceptional dedication to the editorial work of the present book and Dr. Leonardo Mondaini for correction of some typos on the chapter of page 208. Last but not least, he thanks Jose Martinez Guerrero from Chile and Edgar G.G. do Amaral for their help to solve some problems with LaTeX files.

Rubem P. Mondaini
President of the BIOMAT Consortium
Chairman of the BIOMAT 2012 International Symposium
Tempe, Arizona, USA, November 2012


Editorial Board of the BIOMAT Consortium

Rubem Mondaini (Chair), Federal University of Rio de Janeiro, Brazil
Alain Goriely, University of Arizona, USA
Alan Perelson, Los Alamos National Laboratory, New Mexico, USA
Alexander Grosberg, New York University, USA
Alexei Finkelstein, Institute of Protein Research, Russian Federation
Ana Georgina Flesia, National University of Cordoba, Argentina
Anna Tramontano, University of Rome La Sapienza, Italy
Avner Friedman, Ohio State University, USA
Carlos Condat, National University of Cordoba, Argentina
Charles Pearce, Adelaide University, Australia
Christian Gautier, Université Claude Bernard, Lyon, France
Christodoulos Floudas, Princeton University, USA
Denise Kirschner, University of Michigan, USA
David Landau, University of Georgia, USA
De Witt Sumners, Florida State University, USA
Ding Zhu Du, University of Texas, Dallas, USA
Dorothy Wallace, Dartmouth College, USA
Eduardo González-Olivares, Catholic University of Valparaíso, Chile
Eduardo Massad, Faculty of Medicine, University of S. Paulo, Brazil
Frederick Cummings, University of California, Riverside, USA
Fernando Cordova-Lepe, Catholic University del Maule, Chile
Fernando R. Momo, National University of Gen. Sarmiento, Argentina
Gonzalo Robledo, Universidad de Chile, Santiago, Chile
Guy Perrière, Université Claude Bernard, Lyon, France
Gustavo Sibona, National University of Cordoba, Argentina
Helen Byrne, University of Nottingham, UK
Jaime Mena-Lorca, Pontifical Catholic University of Valparaíso, Chile
Jean Marc Victor, Université Pierre et Marie Curie, Paris, France
John Harte, University of California, Berkeley, USA
John Jungck, Beloit College, Wisconsin, USA
Jorge Velasco-Hernández, Instituto Mexicano del Petróleo, México
José Flores, University of South Dakota, USA
José Fontanari, University of São Paulo, Brazil
Juan Pablo Aparício, National University of Salta, Argentina
Kristin Swanson, University of Washington, USA
Kerson Huang, Massachusetts Institute of Technology, MIT, USA
Lisa Sattenspiel, University of Missouri-Columbia, USA
Louis Gross, University of Tennessee, USA
Ludek Berec, Biology Centre, ASCR, Czech Republic
Mariano Ricard, Havana University, Cuba
Michael Meyer-Hermann, Frankfurt Inst. for Adv. Studies, Germany
Nicholas Britton, University of Bath, UK
Panos Pardalos, University of Florida, Gainesville, USA
Peter Stadler, University of Leipzig, Germany
Pedro Gajardo, Federico Santa Maria University, Valparaíso, Chile
Philip Maini, University of Oxford, UK
Pierre Baldi, University of California, Irvine, USA
Ramit Mehr, Bar-Ilan University, Ramat-Gan, Israel
Raymond Mejía, National Institutes of Health, USA
Reidun Twarock, University of York, UK
Richard Kerner, Université Pierre et Marie Curie, Paris, France
Robijn Bruinsma, University of California, Los Angeles, USA
Rui Dilão, Instituto Superior Técnico, Lisbon, Portugal
Ruy Ribeiro, Los Alamos National Laboratory, New Mexico, USA
Timoteo Carletti, Facultés Universitaires Notre Dame de la Paix, Belgium
Vitaly Volpert, Université de Lyon 1, France
William Taylor, National Institute for Medical Research, UK
Zhijun Wu, Iowa State University, USA


Professor C.E.M. Pearce – In Memoriam

The work done during the organization of the BIOMAT 2012 Conference is dedicated to the memory of our dear colleague and friend Charles Edward Miller Pearce (born March 29, 1940; died June 9, 2012), Professor of Mathematics and Elder Chair of Mathematics, University of Adelaide, Australia. He gave Keynote Speaker talks at six BIOMAT conferences (2003, 2004, 2007, 2009, 2010, 2011). An Honorary Member of the BIOMAT Consortium, he was a superb human being, a great scientist and scholar, and a very honourable gentleman. We missed him deeply.


Contents

Preface
Editorial Board of the BIOMAT Consortium
Professor C.E.M. Pearce - In Memoriam

Mathematical Epidemiology
  Compartmental Age of Infection Epidemic Models
    Fred Brauer

Mathematical Modelling of Infectious Diseases
  Lyme Pathogen Transmission in Tick Populations with Multiple Host Species
    Yijun Lou, Jianhong Wu, Xiaotian Wu
  Quantifying the Risk of Mosquito-Borne Infections Basing on the Equilibrium Prevalence in Humans
    Marcos Amaku, Francisco A.B. Coutinho, Eduardo Massad
  Seasonal Fluctuation in Tsetse Fly Populations and Human African Trypanosomiasis: A Mathematical Model
    T. Madsen, D.I. Wallace, N. Zupan

Modelling Physiological Disorders
  A Mathematical Model for the Immunotherapy of Advanced Prostate Cancer
    Travis Portz, Yang Kuang
  Seizure Manifold of the Epileptic Brain: A State Space Reconstruction Approach
    Mujahid N. Syed, Pando G. Georgiev, Panos M. Pardalos
  Synchronous Calcium Induced Calcium Release (CICR) in a Multiple Site Model of the Cardiac Myocyte
    D.I. Wallace, J.E. Tanembaum

Theoretical Immunology
  Modelling Natural Killer Cell Repertoire Development and Activation Dynamics
    Michal Sternberg-Simon, Ramit Mehr
  Saturation Effects on T-Cell Activation in a Model of a Multi-Stage Pathogen
    Michael Shapiro, Edgar Delgado-Eckert

Dynamic and Geometric Modelling of Biomolecular Structure
  Advances in DE NOVO Protein Design for Monomeric, Multimeric, and Conformational Switch Proteins
    James Smadbeck, George A. Khoury, Meghan B. Peterson, Christodoulos A. Floudas
  Mathematical Models and Techniques of Biomolecular Geometric Analysis
    K.L. Xia, F. Xin, Y. Tong, G.W. Wei
  Towards a New Bio-Quantum Model for Signaling and Repair of DNA Damage
    A. Martinez Aragon, J. D. de Toledo Arruda-Neto, Y. Medina Guevara

Population Dynamics
  Viral Evolution and Adaptation as a Multivariate Branching Process
    F. Antoneli, F. Bosco, D. Castro, L. M. Janini
  Associative Learning of a Lexicon in a Noisy Cross-Situational Scenario
    P.F.C. Tilles, J.F. Fontanari
  Relationship between Rainfall and Control Effectiveness of the Aedes aegypti Population through a Non-linear Dynamical Model: Case of Lavras City, Brazil
    L.B. Barsante, R.T.N. Cardoso, J.L. Acebal, M.M. Morais, A.E. Eiras
  Spatiotemporal Dynamics of Telegraph Reaction-Diffusion Predator-Prey Models
    Eliseo Hernandez-Martinez, Hector Puebla, Teresa Perez Munoz, Margarita Gonzalez Brambila, Jorge X. Velasco-Hernandez
  Population Dynamics of Spider Monkey (Ateles hybridus) in a Fragmented Landscape of Colombia
    J.M. Cordovez, J.R. Arteaga B., M. Marino, A.G. de Luna, A. Link

Computational Biology
  The Contribution of Stop Codon Frequency and Purine Bias to the Classification of Coding Sequences
    N. Carels, D. Frias
  Multiclass Classification of Tree Structured Objects: The K-NN Case
    Ana Georgina Flesia

Optimal Control Techniques in Mathematical Modelling of Biological Phenomena
  Regularity of Optimal Cost Functional Applied to Study of Environmental Pollution
    Santina Arantes, Jaime Rivera

Pattern Recognition on Biological Phenomena
  A Sensitivity Analysis of Gene Expression Model
    N.A. Barbosa
  A Wavelet-based Time-varying Irregular Vector Autoregressive Model
    G.E. Salcedo, O.E. Molina, R.F. Porto

Index


COMPARTMENTAL AGE OF INFECTION EPIDEMIC MODELS∗

FRED BRAUER
Department of Mathematics
University of British Columbia
Vancouver, BC V6T 1Z2, Canada

The age of infection epidemic model, originally introduced by W.O. Kermack and A.G. McKendrick in 1927, includes as special cases all of the standard compartmental epidemic models with homogeneous mixing. We give a self-contained description of the basic properties of such models, including the relation between the initial exponential growth rate and the reproduction number, and the final size relation. We also extend the results to age of infection models with heterogeneity of mixing.

1. Epidemic models with homogeneous mixing

We will describe models for epidemics from which individuals who recover have immunity against reinfection, that is, models of SIR type. We assume that the demographic time scale is much slower than the epidemiological time scale, or that we are concerned only with a single outbreak of a disease, so that demographic effects such as births and natural deaths may be ignored.

Throughout recorded history, epidemics have invaded populations causing many infections and often many deaths before disappearing. The “Spanish flu” epidemic of 1918-19 may have caused over 50,000,000 deaths worldwide. In the twenty-first century we have already experienced the SARS epidemic of 2002-3, a threat of an avian influenza (H5N1) outbreak in 2005, and the H1N1 influenza pandemic of 2009.

We will concentrate on relatively simple models, but one should be aware that the models needed to help plan public health measures for attempting to control a disease outbreak must include a great deal of detail. Usually, such models can be analyzed only by numerical simulations; fortunately, the increased availability of high-speed computing has made this possible. Relatively simple strategic models are useful for gaining qualitative understanding of the possible behavior of a model.

∗ This work is supported by the Natural Sciences and Engineering Research Council and M-prime.

1.1. The simple Kermack-McKendrick model

One of the early triumphs of mathematical epidemiology was the formulation of a simple model by Kermack and McKendrick in 1927 [24] whose predictions are very similar to the behavior, observed in countless epidemics, of diseases that invade a population suddenly, grow in intensity, and then disappear leaving part of the population untouched. The Kermack-McKendrick model is a compartmental model based on relatively simple assumptions on the rates of flow between different classes of members of the population.

The SARS epidemic of 2002-3 revived interest in epidemic models, which had been largely ignored since the time of Kermack and McKendrick, in favor of models for endemic diseases. More recently, the threat of spread of avian flu raised in 2005 and the H1N1 influenza A pandemic of 2009 have provided a continuing source of important modeling questions about epidemics.

In order to model such an epidemic we divide the population being studied into three classes labeled S, I, and R. We let S(t) denote the number of individuals who are susceptible to the disease, that is, who are not (yet) infected at time t. I(t) denotes the number of infected individuals, assumed infectious and able to spread the disease by contact with susceptibles. R(t) denotes the number of individuals who have been infected and then removed from the possibility of being infected again or of spreading infection. Removal is carried out either through isolation from the rest of the population, or through immunization against infection, or through recovery from the disease with full immunity against reinfection, or through death caused by the disease.

We will use the terminology SIR to describe a disease which confers immunity against reinfection, to indicate that the passage of individuals is from the susceptible class S to the infective class I to the removed class R.
Epidemics are usually diseases of this type, and in these notes we will consider only models in which infectives recover with full immunity against reinfection. Compartmental models have serious shortcomings as a description of


the beginning of a disease outbreak, and a very different kind of model is required. Compartmental epidemic models assume that the sizes of the compartments are large enough that the mixing of members is homogeneous, or at least that there is homogeneous mixing in each subgroup if the population is stratified by activity levels. At the beginning of a disease outbreak, there is a very small number of infective individuals and the transmission of infection is a stochastic event depending on the pattern of contacts between members of the population. One possible approach to a realistic description of an epidemic would be to use a branching process model initially and then make a transition to a compartmental model when the epidemic has become established and there are enough infectives that mass action mixing in the population is a reasonable approximation, and this is the approach that we will follow.

The special case of the model proposed by Kermack and McKendrick in 1927 which is the starting point for our study of epidemic models is

$$
\begin{aligned}
S' &= -\beta S I \\
I' &= \beta S I - \alpha I \\
R' &= \alpha I .
\end{aligned}
\tag{1}
$$

It is based on the following assumptions:

(i) An average member of the population makes contact sufficient to transmit infection with $\beta N$ others per unit time, where $N$ represents total population size (mass action incidence).
(ii) Infectives leave the infective class at rate $\alpha I$ per unit time.
(iii) There is no entry into or departure from the population, except possibly through death from the disease.
(iv) There are no disease deaths, and the total population size is a constant $N$.

According to (i), since the probability that a random contact by an infective is with a susceptible, who can then transmit infection, is $S/N$, the number of new infections in unit time per infective is $(\beta N)(S/N)$, giving a rate of new infections $(\beta N)(S/N)I = \beta S I$. Alternately, we may argue that for a contact by a susceptible the probability that this contact is with an infective is $I/N$ and thus the rate of new infections per susceptible is $(\beta N)(I/N)$, also giving a rate of new infections $(\beta N)(I/N)S = \beta S I$.

The assumption (ii) is really an assumption that the infectious period is exponentially distributed. One of the features of the age of infection


epidemic model is that it includes the possibility of arbitrary distributions of stay in a compartment.

In our model $R$ is determined once $S$ and $I$ are known, and we can drop the $R$ equation from our model, leaving the system of two equations

$$
\begin{aligned}
S' &= -\beta S I \\
I' &= (\beta S - \alpha) I ,
\end{aligned}
\tag{2}
$$

together with initial conditions

$$S(0) = S_0, \qquad I(0) = I_0, \qquad S_0 + I_0 = N.$$
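As a minimal numerical sketch of the system (2), the following integrates the two equations with a classical fourth-order Runge-Kutta step. The function names (`sir_rhs`, `simulate_sir`), the step size, and the parameter values are illustrative assumptions and do not come from the text; the values are chosen so that $\beta N/\alpha = 2$.

```python
def sir_rhs(s, i, beta, alpha):
    """Right-hand side of model (2): S' = -beta*S*I, I' = (beta*S - alpha)*I."""
    new_infections = beta * s * i
    return -new_infections, new_infections - alpha * i


def simulate_sir(s0, i0, beta, alpha, t_end, dt=0.01):
    """Integrate model (2) with the classical fourth-order Runge-Kutta method."""
    s, i = s0, i0
    trajectory = [(0.0, s, i)]
    steps = int(round(t_end / dt))
    for n in range(1, steps + 1):
        k1s, k1i = sir_rhs(s, i, beta, alpha)
        k2s, k2i = sir_rhs(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i, beta, alpha)
        k3s, k3i = sir_rhs(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i, beta, alpha)
        k4s, k4i = sir_rhs(s + dt * k3s, i + dt * k3i, beta, alpha)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
        trajectory.append((n * dt, s, i))
    return trajectory


# Illustrative (assumed) values: N = 1000 and beta*N/alpha = 2.
N = 1000.0
ALPHA = 0.25
BETA = 2.0 * ALPHA / N
traj = simulate_sir(s0=N - 1.0, i0=1.0, beta=BETA, alpha=ALPHA, t_end=200.0)
peak_infectives = max(i for _, _, i in traj)
s_final, i_final = traj[-1][1], traj[-1][2]
```

With these assumed values the computed trajectory shows the qualitative behavior described in the text: the number of infectives rises, peaks, and returns to zero, while some susceptibles remain uninfected.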

The analysis of the model begins with the observation that the sum of the two equations of (2) is

$$(S + I)' = -\alpha I.$$

Thus $S + I$ is a non-negative smooth decreasing function and therefore tends to a limit as $t \to \infty$. Also, it is not difficult to prove that the derivative of a smooth decreasing function that tends to a limit must tend to zero, and this shows that

$$I_\infty = \lim_{t \to \infty} I(t) = 0.$$

Thus $S + I$ has limit $S_\infty$. Integration of the sum of the two equations of (2) from 0 to $\infty$ gives

$$\alpha \int_0^\infty I(t)\,dt = S_0 + I_0 - S_\infty = N - S_\infty.$$

Division of the first equation of (2) by $S$ and integration from 0 to $\infty$ gives

$$\ln \frac{S_0}{S_\infty} = \beta \int_0^\infty I(t)\,dt = \frac{\beta}{\alpha}\,[N - S_\infty] = R_0 \left[ 1 - \frac{S_\infty}{N} \right]. \tag{3}$$

The equation (3) is called the final size relation. It gives a relation between the basic reproduction number and the size of the epidemic. The final size of the epidemic, the number of members of the population who are infected over the course of the epidemic, is $N - S_\infty$. This is often described in terms of the attack rate $(1 - S_\infty/N)$. [Technically, the attack rate should be called an attack ratio, since it is dimensionless and is not a rate.]

The quantity

$$R_0 = \frac{\beta N}{\alpha}$$

introduced here is called the basic reproduction number. It is defined as the number of secondary infections caused by introducing a single infective


individual into an entirely susceptible population. An infective individual makes $\beta N$ contacts in unit time, and since all of these are with susceptibles, they produce $\beta N$ secondary infections in unit time, for a mean infective period of $1/\alpha$.

In the model (2), if $S_0 \approx N$, it is easy to see that if $R_0 > 1$, the number of infectives increases initially and we have an epidemic, while if $R_0 < 1$, the number of infectives decreases from the start, and there is no epidemic. The basic reproduction number is a threshold quantity, distinguishing between an epidemic and a disease outbreak that does not spread. In addition, since the right side of (3) is finite, the left side is also finite, and this shows that $S_\infty > 0$. The final size relation (3) is valid for a large variety of epidemic models, as we shall see in later sections. It is not difficult to prove that there is a unique solution of the final size relation (3), and that this solution satisfies the bound S∞
0. In the remainder of these notes, we will assume that there are no disease deaths, so that the final size relation is an equality and may be used to


determine the extent of an epidemic. If there are disease deaths, it is necessary to integrate the model equations numerically to determine the extent, but if the disease death rate is small it is possible to show that the final size relation is an approximate equality. Also, there will be occasions in which it is more convenient to express models in terms of the number of contacts $a$ per individual in unit time rather than the fraction $\beta$ of the total population size contacted by an average member of the population in unit time.

1.3. More complicated epidemic models

We have established that the simple Kermack-McKendrick epidemic model (2) has the basic properties

• There is a basic reproduction number $R_0$ such that if $R_0 < 1$, the disease dies out while if $R_0 > 1$ there is an epidemic.
• The number of infectives always approaches zero and the number of susceptibles always approaches a positive limit as $t \to \infty$.
• There is a relation between the reproduction number and the final size of the epidemic which is an equality if there are no disease deaths.

In fact, these properties hold for epidemic models with more complicated compartmental structure. We will describe some common epidemic models as examples. These models may be considered as general templates which can be modified to fit the properties of specific diseases.

In many infectious diseases there is an exposed period after the transmission of infection from susceptibles to potentially infective members but before these potential infectives develop symptoms and can transmit infection. To incorporate an exposed period with mean exposed period $1/\kappa$ we add an exposed class $E$ and use compartments $S$, $E$, $I$, $R$ and total population size $N = S + E + I + R$ to give a generalization of the epidemic model (2)

$$
\begin{aligned}
S' &= -\beta S I \\
E' &= \beta S I - \kappa E \\
I' &= \kappa E - \alpha I .
\end{aligned}
\tag{5}
$$
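The claim that the final size relation carries over to models with more compartments can be checked numerically for the model (5). The sketch below is an illustration under assumed parameter values, with hypothetical helper names (`seir_rhs`, `rk4_step`, `seir_final_state`). It integrates (5) to a large time and evaluates the residual of the relation (3), $\ln(S_0/S_\infty) - R_0[1 - S_\infty/N]$ with $R_0 = \beta N/\alpha$, which should be close to zero once $E$ and $I$ have decayed.

```python
import math


def seir_rhs(state, beta, kappa, alpha):
    """Right-hand side of model (5) for the state (S, E, I)."""
    s, e, i = state
    new_infections = beta * s * i
    return (-new_infections, new_infections - kappa * e, kappa * e - alpha * i)


def rk4_step(state, dt, beta, kappa, alpha):
    """One classical Runge-Kutta step for model (5)."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = seir_rhs(state, beta, kappa, alpha)
    k2 = seir_rhs(shifted(state, k1, 0.5 * dt), beta, kappa, alpha)
    k3 = seir_rhs(shifted(state, k2, 0.5 * dt), beta, kappa, alpha)
    k4 = seir_rhs(shifted(state, k3, dt), beta, kappa, alpha)
    return tuple(
        x + dt * (a + 2 * b + 2 * c + d) / 6.0
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def seir_final_state(s0, e0, i0, beta, kappa, alpha, t_end, dt=0.01):
    """Integrate model (5) up to t_end and return the final (S, E, I)."""
    state = (s0, e0, i0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, beta, kappa, alpha)
    return state


# Illustrative (assumed) values: N = 1000, mean exposed period 2, R0 = 2.
N = 1000.0
ALPHA, KAPPA = 0.25, 0.5
R0 = 2.0
BETA = R0 * ALPHA / N
S0 = N - 1.0
s_inf, e_inf, i_inf = seir_final_state(S0, 0.0, 1.0, BETA, KAPPA, ALPHA, t_end=400.0)

# Residual of the final size relation (3) evaluated at the end of the run.
residual = math.log(S0 / s_inf) - R0 * (1.0 - s_inf / N)
```

The exposed stage delays the outbreak but, in this sketch, does not change the value of $S_\infty$ that satisfies (3); only $\beta N/\alpha$ enters the relation.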

The analysis of this model is the same as the analysis of (2), but with I replaced by E + I. That is, instead of using the number of infectives as


one of the variables we use the total number of infected members, whether or not they are capable of transmitting infection. For the model (5) it is no longer possible to distinguish whether there is an epidemic or not by determining whether the number of infectives grows or decreases initially. It will be necessary to refine the definition of an epidemic, and we will do this when we introduce the general age of infection epidemic model.

Another extension of the simple Kermack-McKendrick model (2) includes treatment, decreasing the infectivity and perhaps decreasing the length of the infective period. This may be modeled by supposing that a fraction $\gamma$ per unit time of infectives is selected for treatment, and that treatment reduces infectivity by a fraction $\delta$. Suppose that the rate of removal from the treated class is $\eta$. This leads to the SITR model, where $T$ is the treatment class, given by

$$
\begin{aligned}
S' &= -\beta S\,[I + \delta T] \\
I' &= \beta S\,[I + \delta T] - (\alpha + \gamma) I \\
T' &= \gamma I - \eta T .
\end{aligned}
\tag{6}
$$

It is not difficult to prove, much as was done for the model (2), that

$$S_\infty = \lim_{t \to \infty} S(t) > 0, \qquad \lim_{t \to \infty} I(t) = \lim_{t \to \infty} T(t) = 0.$$

In order to calculate the basic reproduction number, we may argue that an infective in a totally susceptible population causes $\beta N$ new infections in unit time, and the mean time spent in the infective compartment is $1/(\alpha + \gamma)$. In addition, a fraction $\gamma/(\alpha + \gamma)$ of infectives are treated. While in the treatment stage the number of new infections caused in unit time is $\delta \beta N$, and the mean time in the treatment class is $1/\eta$. Thus $R_0$ is

$$R_0 = \frac{\beta N}{\alpha + \gamma} + \frac{\gamma}{\alpha + \gamma}\,\frac{\delta \beta N}{\eta}. \tag{7}$$
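The two contributions in (7) can be evaluated directly. The short sketch below uses a hypothetical function name (`treatment_r0`) and assumed parameter values, and keeps the untreated and treated contributions separate so that the effect of the treatment rate $\gamma$ is visible.

```python
def treatment_r0(beta, n, alpha, gamma, delta, eta):
    """Basic reproduction number (7) of the treatment model (6)."""
    # Infections caused while in the infective class I.
    untreated = beta * n / (alpha + gamma)
    # Fraction of infectives that move on to the treatment class T.
    treated_fraction = gamma / (alpha + gamma)
    # Infections caused while in the treatment class T.
    treated = treated_fraction * delta * beta * n / eta
    return untreated + treated


# Illustrative (assumed) values with beta * N = 0.5 and alpha = 0.25.
r0_no_treatment = treatment_r0(beta=0.0005, n=1000.0, alpha=0.25,
                               gamma=0.0, delta=0.5, eta=0.5)
r0_with_treatment = treatment_r0(beta=0.0005, n=1000.0, alpha=0.25,
                                 gamma=0.25, delta=0.5, eta=0.5)
```

With $\gamma = 0$ the expression reduces to $\beta N/\alpha$, the reproduction number of the model (2). In this assumed example, treating infectives at rate $\gamma = 0.25$ lowers $R_0$ from 2 to 1.25.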

It is also possible to establish the final size relation (3) by means very similar to those used for the simple model (2).

In the various compartmental models that we have studied, there are significant common features, and these common features also hold for models with more complicated compartmental structure [18,23]. This suggests that compartmental models can be put into a more general framework. In fact, this general framework is the age of infection epidemic model originally introduced by Kermack and McKendrick in 1927.


2. The age of infection epidemic model

The general epidemic model described by Kermack and McKendrick (1927) included a dependence of infectivity on the time since becoming infected (age of infection). We may view it as an extension and reinterpretation of the model (2). In the model (2) it is assumed that the infective period is distributed exponentially with a distribution function $e^{-\alpha \tau}$. By this we mean that a fraction $e^{-\alpha \tau}$ of infective individuals are still infective $\tau$ time units after having been infected. Now suppose that we assume that the infective period is distributed according to a general function $P(\tau)$, meaning that a fraction $P(\tau)$ of infective individuals are still infective $\tau$ time units after having been infected. It is assumed that the function $P(\tau)$ has the properties

$$P(0) = 1, \qquad P'(\tau) \le 0, \qquad \int_0^\infty P(\tau)\,d\tau < \infty.$$

Also, in order to prepare for a study of models with heterogeneous mixing, we formulate the age of infection model in terms of the number $a = \beta N$ of contacts per individual in unit time instead of $\beta$. If no other assumptions are changed, the model (2) is replaced by a model

$$
\begin{aligned}
S' &= -a\,\frac{S}{N}\,I \\
I(t) &= I_0 P(t) + \int_0^t a\,\frac{S(t - \tau)}{N}\,I(t - \tau)\,P(\tau)\,d\tau \\
&= I_0 P(t) + \int_0^t [-S'(t - \tau)]\,P(\tau)\,d\tau .
\end{aligned}
$$

Here, it is assumed that all infectives at time $t = 0$ have infection age zero. More generally, we assume that the number of infectives at time $t$ who were already infective at time $0$ is $I_0(t)$. Thus we obtain the model

$$
\begin{aligned}
S' &= -a\frac{S}{N}I\\
I(t) &= I_0(t) + \int_0^t[-S'(t-\tau)]P(\tau)\,d\tau.
\end{aligned}
\tag{8}
$$

The general age of infection model is obtained from (8) by letting $\phi(t)$ be the total infectivity at time $t$, defined as the sum of products of the number of infected members with each infection age and the mean infectivity for that infection age. We let $\pi(\tau)$ with $0 \le \pi(\tau) \le 1$ be the mean infectivity at infection age $\tau$, and then we let $A(\tau) = \pi(\tau)P(\tau)$, the mean infectivity of members of the population with infection age $\tau$. We assume that there are no disease deaths, so that the total population size is a constant $N$. The age of infection epidemic model is

$$
\begin{aligned}
S' &= -a\frac{S}{N}\phi\\
\phi(t) &= \phi_0(t) + \int_0^t a\,\frac{S(t-\tau)}{N}\,\phi(t-\tau)A(\tau)\,d\tau\\
&= \phi_0(t) + \int_0^t[-S'(t-\tau)]A(\tau)\,d\tau.
\end{aligned}
\tag{9}
$$
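The model (9) can be explored numerically with a simple time-stepping scheme. The Python sketch below is one possible discretization, not a method taken from the text: it assumes that all initial infectives have infection age zero, so that $\phi_0(t) = (N - S_0)A(t)$, it places the new infections of each time step at the midpoint of that step, and it uses an exponential update for $S$. The function name, step size, and parameter values are illustrative assumptions.

```python
import math


def simulate_age_of_infection(a, N, S0, A, dt=0.05, t_end=60.0):
    """Crude time-stepping of model (9), assuming every initial infective
    has infection age zero, so that phi_0(t) = (N - S0) * A(t)."""
    steps = int(round(t_end / dt))
    # A at the infection ages (m - 1/2) * dt; index 0 is an unused placeholder.
    A_mid = [0.0] + [A((m - 0.5) * dt) for m in range(1, steps + 1)]
    S = [float(S0)]
    new_cases = []  # new_cases[j] = S(t_j) - S(t_{j+1})
    for k in range(steps):
        # Infectivity of the initial infectives plus the convolution term.
        phi = (N - S0) * A(k * dt)
        phi += sum(new_cases[j] * A_mid[k - j] for j in range(k))
        # S' = -a * S * phi / N, integrated over one step with phi frozen.
        S_next = S[-1] * math.exp(-a * phi * dt / N)
        new_cases.append(S[-1] - S_next)
        S.append(S_next)
    return S


# Hypothetical example: A(tau) = exp(-tau), that is, P(tau) = exp(-tau) and pi = 1.
S = simulate_age_of_infection(a=2.0, N=1000.0, S0=999.0, A=lambda tau: math.exp(-tau))
print(S[-1])
```

The cost of this sketch grows quadratically with the number of steps because the convolution is recomputed at every step; for kernels with special structure (for example exponential kernels) the model reduces to ordinary differential equations and cheaper methods apply.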

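We note that the analysis of (9) is exactly the same as the analysis of (8) with $I$ replaced by $\phi$ and $P(\tau)$ replaced by $A(\tau)$. We will carry out the analysis of (9). It is easy to see that the basic reproduction number of (9) is

$$
R_0 = a\int_0^\infty A(\tau)\,d\tau.
$$

We write

$$
-\frac{S'(t)}{S(t)} = \frac{a}{N}\phi_0(t) + \frac{a}{N}\int_0^t[-S'(t-\tau)]A(\tau)\,d\tau.
$$

Integration with respect to $t$ from $0$ to $\infty$ gives

$$
\begin{aligned}
\ln\frac{S_0}{S_\infty} &= \frac{a}{N}\int_0^\infty\phi_0(t)\,dt + \frac{a}{N}\int_0^\infty\!\!\int_0^t[-S'(t-\tau)]A(\tau)\,d\tau\,dt\\
&= \frac{a}{N}\int_0^\infty\phi_0(t)\,dt + \frac{a}{N}\int_0^\infty A(\tau)\int_\tau^\infty[-S'(t-\tau)]\,dt\,d\tau\\
&= \frac{a}{N}\int_0^\infty\phi_0(t)\,dt + \frac{a}{N}[S_0-S_\infty]\int_0^\infty A(\tau)\,d\tau\\
&= a\left[1-\frac{S_\infty}{N}\right]\int_0^\infty A(\tau)\,d\tau + \frac{a}{N}\int_0^\infty[\phi_0(t)-(N-S_0)A(t)]\,dt\\
&= R_0\left[1-\frac{S_\infty}{N}\right] - \frac{a}{N}\int_0^\infty[(N-S_0)A(t)-\phi_0(t)]\,dt.
\end{aligned}
\tag{10}
$$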


Here, $\phi_0(t)$ is the total infectivity of the initial infectives at time $t$. If all initial infectives have infection age zero at $t = 0$, then $\phi_0(t) = [N-S_0]A(t)$, and

$$
\int_0^\infty[\phi_0(t)-(N-S_0)A(t)]\,dt = 0.
$$

Then (10) takes the form

$$
\ln\frac{S_0}{S_\infty} = R_0\left[1-\frac{S_\infty}{N}\right], \tag{11}
$$

and this is the general final size relation. If there are initial infectives with infection age greater than zero, let $u(\tau)$ be the fraction of these individuals with infection age $\tau$, with $\int_0^\infty u(\tau)\,d\tau = 1$. At time $t$ these individuals have infection age $t+\tau$ and mean infectivity $A(t+\tau)$. Since there are $N - S_0$ initial infectives,

$$
\phi_0(t) = (N-S_0)\int_0^\infty u(\tau)A(t+\tau)\,d\tau,
$$

and

$$
\begin{aligned}
\int_0^\infty\phi_0(t)\,dt &= (N-S_0)\int_0^\infty\!\!\int_0^\infty u(\tau)A(t+\tau)\,d\tau\,dt\\
&= (N-S_0)\int_0^\infty u(\tau)\int_\tau^\infty A(v)\,dv\,d\tau\\
&= (N-S_0)\int_0^\infty A(v)\int_0^v u(\tau)\,d\tau\,dv \;\le\; (N-S_0)\int_0^\infty A(v)\,dv,
\end{aligned}
$$

since $\int_0^v u(\tau)\,d\tau \le 1$. Thus, the initial term satisfies

$$
\int_0^\infty[(N-S_0)A(t)-\phi_0(t)]\,dt \ge 0.
$$
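The final size relation (11) is a scalar equation for $S_\infty$ and is easy to solve numerically. The following Python sketch uses bisection; the function name and the sample values are illustrative assumptions. It relies on the observation that, for $0 < S_0 < N$, the difference of the two sides of (11) is positive at $S_0e^{-R_0}$ and negative at $S_0$, so a root lies between these two values.

```python
import math


def final_size(R0, N, S0, tol=1e-12):
    """Solve ln(S0/S_inf) = R0 * (1 - S_inf/N) for S_inf in (0, S0) by bisection.
    Assumes 0 < S0 < N, so that a root in this interval exists."""
    def g(x):
        return math.log(S0 / x) - R0 * (1.0 - x / N)

    lo, hi = S0 * math.exp(-R0), S0  # g(lo) > 0 > g(hi)
    while hi - lo > tol * N:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: R0 = 2 in a population of 1000 with one initial infective.
S_inf = final_size(R0=2.0, N=1000.0, S0=999.0)
print(S_inf, 1.0 - S_inf / 1000.0)  # final susceptibles and attack ratio
```

When there are initial infectives with positive infection age, the nonnegative initial term in (10) would have to be subtracted from the right side; this sketch does not cover that case.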

The examples studied in Section 1.3 are all included in the age of infection model (9) as special cases. However, although the age of infection formulation gives a general structure, calculations involving integrals depending on $A(\tau)$ may be quite complicated.

Example 1. Consider an SEIR model in which the exposed period has a distribution given by a function $Q$ and the infective period has a distribution given by a function $P$. It is necessary to do some preliminary analysis before we can formulate the model. We begin with the equations for $S$ and $E$,

$$
\begin{aligned}
S' &= -\frac{a}{N}SI\\
E(t) &= E_0Q(t) + \int_0^t[-S'(s)]Q(t-s)\,ds.
\end{aligned}
$$

In order to obtain an equation for $I$, we differentiate the equation for $E$, obtaining

$$
E'(t) = E_0Q'(t) - S'(t) + \int_0^t[-S'(s)]Q'(t-s)\,ds.
$$

The term $-S'(t)$ is the input to $E$. The remaining terms are nonpositive because $Q$, like $P$, is nonincreasing, and they describe departures from $E$. Thus the input to $I$ at time $t$ is

$$
-E_0Q'(t) - \int_0^t[-S'(s)]Q'(t-s)\,ds,
$$

and

$$
I(t) = -E_0\int_0^tQ'(u)P(t-u)\,du - \int_0^t\!\!\int_0^u[-S'(s)]Q'(u-s)\,ds\,P(t-u)\,du.
$$

The first term in this expression may be written as $I_0(t)$, and the second term may be simplified, using interchange of the order of integration in the iterated integral, to yield

$$
-\int_0^t\!\!\int_0^u[-S'(s)]Q'(u-s)\,ds\,P(t-u)\,du = -\int_0^t\!\!\int_s^tQ'(u-s)P(t-u)\,du\,[-S'(s)]\,ds.
$$

If we define

$$
M(t-s) = -\int_s^tQ'(u-s)P(t-u)\,du = -\int_0^{t-s}Q'(t-s-v)P(v)\,dv,
$$

or

$$
M(\tau) = -\int_0^\tau Q'(\tau-v)P(v)\,dv, \tag{12}
$$

we obtain

$$
I(t) = I_0(t) + \int_0^t[-S'(s)]M(t-s)\,ds.
$$

Then the model is

$$
\begin{aligned}
S' &= -\frac{a}{N}SI\\
E(t) &= E_0Q(t) + \int_0^t[-S'(s)]Q(t-s)\,ds\\
I(t) &= I_0(t) + \int_0^t[-S'(s)]M(t-s)\,ds,
\end{aligned}
\tag{13}
$$


which is in age of infection form with $\phi = I$ and $A(\tau) = M(\tau)$, and we have an explicit expression (12) for $M(\tau)$.

As suggested by this example, there are general methods for calculating integrals involving $A(\tau)$ without calculating the function $A$ explicitly [8,40].

2.1. The initial exponential growth rate

Earlier we calculated the initial exponential growth rate for the simple Kermack-McKendrick epidemic model (2). The initial exponential growth rate is related to the basic reproduction number, but the relation depends also on the specific model. In fact, our derivation was not completely correct, because it involved linearization of a model at a non-isolated equilibrium, and the use of linearization at an equilibrium to analyze asymptotic behaviour requires additional justification. In this section we give a new definition of an epidemic for a general age of infection model and establish rigorously an equation relating the initial exponential growth rate and the basic reproduction number.

Definition 2.1. In a disease transmission model with no demographic effects, there is no epidemic if the equilibrium with all members of the population susceptible is (locally) asymptotically stable, and there is an epidemic if this equilibrium is unstable, in each case considering only perturbations of the equilibrium with positive infected initial states.

In order to validate this definition it is necessary to develop the analysis of equilibria of (9). The first step is to replace (9) by the limit equation

$$
\begin{aligned}
S' &= -a\frac{S}{N}\phi\\
\phi(t) &= a\int_0^\infty\frac{S(t-\tau)}{N}\,\phi(t-\tau)A(\tau)\,d\tau.
\end{aligned}
\tag{14}
$$

This limit equation is just the model (9) with a particular choice of initial function $\phi_0$. According to the asymptotic theory of integral equations [25], the asymptotic behavior of (9) is the same as that of the limit equation (14) for every initial function with $\lim_{t\to\infty}\phi_0(t) = 0$. To analyze (14), we would ordinarily linearize about an equilibrium, but there is a line of equilibria $\phi = 0$, and the standard linearization theory is not applicable. In order to avoid this problem, we replace the model (14) by the model

$$
\begin{aligned}
S' &= \mu N - a\frac{S}{N}\phi - \mu S\\
\phi(t) &= a\int_0^\infty\frac{S(t-\tau)}{N}\,\phi(t-\tau)e^{-\mu\tau}A(\tau)\,d\tau.
\end{aligned}
\tag{15}
$$

The model (15) is obtained from the model (14) by including a birth rate $\mu N$ of susceptibles and a proportional death rate $\mu$ in each class. In fact, the age of infection epidemic model neglects demographic processes, arguing that these operate on a much slower time scale than the epidemiological process. In the model (15) we are restoring the demographic process; we will assume that $\mu$ is small and will ultimately return to (14) and (9) by letting $\mu \to 0$. The model (15) has an isolated disease-free equilibrium $S = N$, $\phi = 0$, and the linearization at this equilibrium, in terms of $u = N - S$ and $v = \phi$, is

$$
\begin{aligned}
u'(t) &= -\mu u + av(t)\\
v(t) &= a\int_0^\infty v(t-\tau)e^{-\mu\tau}A(\tau)\,d\tau.
\end{aligned}
$$

Note that if $v = 0$ for $t \le 0$, the linearization has solution

$$
u(t) = u(0)e^{-\mu t}, \qquad v(t) \equiv 0.
$$

In our definition of an epidemic we have ruled out such initial states. The characteristic equation is the condition on $\lambda$ that the linearization have a solution $u = u_0e^{\lambda t}$, $v = v_0e^{\lambda t}$, and this is

$$
\det\begin{bmatrix} -(\mu+\lambda) & a\\[2pt] 0 & a\int_0^\infty e^{-(\lambda+\mu)\tau}A(\tau)\,d\tau - 1 \end{bmatrix} = 0.
$$

There are two roots of the characteristic equation, namely $\lambda = -\mu$ and the solution of

$$
a\int_0^\infty e^{-(\lambda+\mu)\tau}A(\tau)\,d\tau = 1. \tag{16}
$$

If the solution of (16) is positive there is an epidemic. The left side of (16) is a decreasing function $F_\mu(\lambda)$ of $\lambda$, and if $F_\mu(0) > 1$ the solution of (16) is positive. If $R_0 > 1$, then $F_\mu(0) > 1$ for all sufficiently small positive $\mu$. We may take the limit as $\mu \to 0$ to see that the relevant root of the characteristic equation of the linearization of (14) at the disease-free equilibrium $S = N$, $\phi = 0$ is the solution of

$$
a\int_0^\infty e^{-\lambda\tau}A(\tau)\,d\tau = 1. \tag{17}
$$
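Equation (17) rarely has a closed-form solution, but its positive root is easy to approximate. The Python sketch below combines a composite Simpson rule for the integral with bisection on $[0, a]$. This bracket is valid under the assumptions $R_0 > 1$ and $0 \le A(\tau) \le 1$, since then the left side of (17) exceeds 1 at $\lambda = 0$ and is at most 1 at $\lambda = a$. The function names, the truncation point of the integral, and the sample kernel are illustrative assumptions.

```python
import math


def laplace_of_A(A, lam, upper=40.0, n=4000):
    """Composite Simpson approximation of the integral of exp(-lam*tau)*A(tau)
    over [0, upper]; assumes A is negligible beyond `upper` and n is even."""
    h = upper / n
    total = A(0.0) + math.exp(-lam * upper) * A(upper)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-lam * i * h) * A(i * h)
    return total * h / 3.0


def growth_rate(a, A, tol=1e-10):
    """Positive root of a * integral(exp(-lam*tau) * A(tau)) = 1 by bisection.
    Assumes R0 > 1 and 0 <= A <= 1, which makes [0, a] a valid bracket."""
    lo, hi = 0.0, a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if a * laplace_of_A(A, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical check: for A(tau) = exp(-alpha*tau), (17) reads a/(lam + alpha) = 1,
# so the root should be close to a - alpha.
print(growth_rate(2.0, lambda tau: math.exp(-tau)))
```

For kernels whose Laplace transform is known in closed form, the quadrature step can be replaced by that formula, which is both faster and more accurate.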


The solution $\lambda$ of (17) is positive if $R_0 > 1$ and negative if $R_0 < 1$. We have thus established the following result.

Theorem 2.1. There is an epidemic for the model (9) if and only if $R_0 > 1$. If there is an epidemic, there are solutions with exponential growth rate given by the solution of (17).

The result that the initial exponential growth rate in an epidemic is given by the solution of (17) was stated in [39], but with an incomplete proof. The results that we have obtained are valid for the age of infection epidemic model, and thus are applicable to compartmental models that can be interpreted in an age of infection context. By interpretation as an age of infection epidemic model we mean the ability to calculate the infectivity function $A(\tau)$.

3. Heterogeneous mixing age of infection models

The classical simple epidemic models [1,2,12,30] assume homogeneous mixing of members of the population being studied, and this is certainly unrealistically simple. Contact rates may be age-dependent, and this would suggest the use of age-structured models. In this section we consider heterogeneity in behavior, specifically contact rates. In sexually transmitted diseases there is often a “core” group of very active members who are responsible for most of the disease cases, and control measures aimed at this core group have been very effective in control [22]. In epidemics there are often “super-spreaders”, who make many contacts and are instrumental in spreading disease, and in general some members of the population make more contacts than others.

Recently there has been a move to complicated network models for simulating epidemics [4,32,33,34,35,37]. These assume knowledge of the mixing patterns of groups of members of the population and make predictions based on simulations of a stochastic model. The theoretical analysis of network models is a very active and rapidly developing field.

It is possible to consider models more realistic than simple compartmental models but simpler to analyze than detailed network models. To model heterogeneity in mixing we may assume that the population is divided into subgroups with different activity levels. In this way, we may hope to give models intermediate between the too simple compartmental models with homogeneous mixing and the too complicated full network models. Just as with homogeneous mixing epidemic models, there is a general setting that includes many different compartmental structures. In this section, we describe an age of infection model dividing the population into two subgroups with different contact rates. Our description generalizes easily to any finite number of subgroups and even to a continuous distribution of subgroups [9], but we restrict ourselves to two subgroups for simplicity and clarity.

Consider two subpopulations of constant sizes $N_1$, $N_2$ respectively, each divided into susceptibles, infected, and removed members with subscripts to identify the subpopulation. Suppose that group $i$ members make $a_i$ contacts in unit time and that the fraction of contacts made by a member of group $i$ that is with a member of group $j$ is $p_{ij}$ $(i, j = 1, 2)$. Then $p_{11} + p_{12} = p_{21} + p_{22} = 1$. A two-group age of infection model with general mixing would be

$$
\begin{aligned}
S_1'(t) &= -a_1S_1\left[p_{11}\frac{\phi_1(t)}{N_1} + p_{12}\frac{\phi_2(t)}{N_2}\right]\\
\phi_1(t) &= \phi_1^{(0)}(t) + \int_0^t[-S_1'(t-\tau)]A_1(\tau)\,d\tau\\
S_2'(t) &= -a_2S_2\left[p_{21}\frac{\phi_1(t)}{N_1} + p_{22}\frac{\phi_2(t)}{N_2}\right]\\
\phi_2(t) &= \phi_2^{(0)}(t) + \int_0^t[-S_2'(t-\tau)]A_2(\tau)\,d\tau,
\end{aligned}
\tag{18}
$$

where $\phi_i(t)$ is the infectivity in group $i$ at time $t$, $\phi_i^{(0)}(t)$ is the infectivity at time $t$ of members of group $i$ who were infected at time $0$, and $A_i(\tau)$ is the average infectivity of members of group $i$ with infection age $\tau$. The infectivity of an infected member of group 2 with infection age $\tau$ towards a susceptible member of group 1 is $a_1p_{12}A_2(\tau)$. The next generation matrix, in the sense of [38], is

$$
\begin{bmatrix}
a_1p_{11}\int_0^\infty A_1(\tau)\,d\tau & a_1p_{12}\int_0^\infty A_2(\tau)\,d\tau\\[4pt]
a_2p_{21}\int_0^\infty A_1(\tau)\,d\tau & a_2p_{22}\int_0^\infty A_2(\tau)\,d\tau
\end{bmatrix}.
$$

Thus $R_0$ is the largest root of

$$
\det\begin{bmatrix}
p_{11}a_1\int_0^\infty A_1(\tau)\,d\tau - \lambda & p_{12}a_1\int_0^\infty A_2(\tau)\,d\tau\\[4pt]
p_{21}a_2\int_0^\infty A_1(\tau)\,d\tau & p_{22}a_2\int_0^\infty A_2(\tau)\,d\tau - \lambda
\end{bmatrix} = 0. \tag{19}
$$


In order to obtain a more useful expression for $R_0$, it is necessary to make some assumptions about the nature of the mixing between the two groups. There has been much study of mixing patterns, see for example [5,6,10]. One possibility is proportionate mixing, that is, that the number of contacts between groups is proportional to the relative activity levels. In other words, mixing is random but constrained by the activity levels [36]. Under the assumption of proportionate mixing,

$$
p_{ij} = \frac{a_jN_j}{a_1N_1 + a_2N_2},
$$

and we may write

$$
p_{11} = p_{21} = p_1, \qquad p_{12} = p_{22} = p_2,
$$

with $p_1 + p_2 = 1$. In particular, $p_{11}p_{22} - p_{12}p_{21} = 0$, and thus

$$
R_0 = p_1a_1\int_0^\infty A_1(\tau)\,d\tau + p_2a_2\int_0^\infty A_2(\tau)\,d\tau.
$$

Another possibility is preferred mixing [36], in which a fraction $\pi_i$ of each group mixes randomly with its own group and the remaining members mix proportionately. Thus, preferred mixing is given by

$$
\begin{aligned}
p_{11} &= \pi_1 + (1-\pi_1)p_1, & p_{12} &= (1-\pi_1)p_2,\\
p_{21} &= (1-\pi_2)p_1, & p_{22} &= \pi_2 + (1-\pi_2)p_2,
\end{aligned}
\tag{20}
$$

with

$$
p_i = \frac{(1-\pi_i)a_iN_i}{(1-\pi_1)a_1N_1 + (1-\pi_2)a_2N_2}.
$$

Proportionate mixing is the special case of preferred mixing with $\pi_1 = \pi_2 = 0$. It is also possible to have like-with-like mixing, in which members of each group mix only with members of the same group. This is the special case of preferred mixing with $\pi_1 = \pi_2 = 1$. For like-with-like mixing,

$$
p_{11} = p_{22} = 1, \qquad p_{12} = p_{21} = 0.
$$

Then the roots of (19) are $a_1\int_0^\infty A_1(\tau)\,d\tau$ and $a_2\int_0^\infty A_2(\tau)\,d\tau$, and the reproduction number is

$$
R_0 = \max\left[a_1\int_0^\infty A_1(\tau)\,d\tau,\; a_2\int_0^\infty A_2(\tau)\,d\tau\right].
$$
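The mixing patterns above and the resulting reproduction number can be put together in a few lines. The following Python sketch builds the preferred mixing matrix (20) and computes the largest eigenvalue of the $2\times 2$ next generation matrix from its trace and determinant. The function names and the sample values are illustrative assumptions, and `A_hat[j]` stands for $\int_0^\infty A_j(\tau)\,d\tau$, which is assumed to have been computed beforehand.

```python
import math


def preferred_mixing(a, N, pi):
    """Mixing matrix (20) for two groups. pi = (0, 0) gives proportionate
    mixing and pi = (1, 1) gives like-with-like mixing."""
    w = [(1.0 - pi[i]) * a[i] * N[i] for i in range(2)]
    total = sum(w)
    # With pi = (1, 1) nobody mixes outside the own group and p_i is not needed.
    p = [w_i / total for w_i in w] if total > 0 else [0.0, 0.0]
    return [[(pi[i] if i == j else 0.0) + (1.0 - pi[i]) * p[j] for j in range(2)]
            for i in range(2)]


def reproduction_number(a, p, A_hat):
    """Largest eigenvalue of the 2x2 next generation matrix with entries
    a[i] * p[i][j] * A_hat[j], where A_hat[j] is the integral of A_j."""
    k = [[a[i] * p[i][j] * A_hat[j] for j in range(2)] for i in range(2)]
    half_trace = 0.5 * (k[0][0] + k[1][1])
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))


# Hypothetical example: a less active majority and a more active minority.
a, N, A_hat = (1.0, 3.0), (800.0, 200.0), (1.0, 1.0)
for pi in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
    print(pi, reproduction_number(a, preferred_mixing(a, N, pi), A_hat))
```

For more than two groups the closed-form eigenvalue would no longer be available, and a general eigenvalue routine would be the natural replacement.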


3.1. The final size relations

For a one-group model there is a final size relation that makes it possible to calculate the size of an epidemic from the reproduction number [7,29]. There is a corresponding final size relation for the two-group model (18), established in much the same way. This relation does not involve the reproduction number explicitly but it still makes it possible to calculate the size of the epidemic from the model parameters. Integration of the equation for $S_i'(t)/S_i(t)$ in (18) gives

$$
\begin{aligned}
\ln\frac{S_i(0)}{S_i(\infty)} &= a_i\int_0^\infty\sum_{j=1}^2p_{ij}\frac{\phi_j(t)}{N_j}\,dt\\
&= a_i\int_0^\infty\sum_{j=1}^2p_{ij}\frac{\phi_j^{(0)}(t) + \int_0^t[-S_j'(t-\tau)]A_j(\tau)\,d\tau}{N_j}\,dt\\
&= a_i\int_0^\infty\sum_{j=1}^2p_{ij}\frac{\phi_j^{(0)}(t)}{N_j}\,dt + a_i\int_0^\infty\sum_{j=1}^2p_{ij}\frac{\int_0^t[-S_j'(t-\tau)]A_j(\tau)\,d\tau}{N_j}\,dt.
\end{aligned}
\tag{21}
$$

If all initial infectives have infection age zero at time $t = 0$,

$$
\phi_j^{(0)}(t) = A_j(t)[N_j - S_j(0)],
$$

and the first term on the right side of (21) is

$$
a_i\int_0^\infty\sum_{j=1}^2p_{ij}A_j(t)\left[1 - \frac{S_j(0)}{N_j}\right]dt.
$$

The second term on the right side of (21) is

$$
\begin{aligned}
a_i\int_0^\infty\sum_{j=1}^2\frac{p_{ij}}{N_j}\int_0^t[-S_j'(t-\tau)]A_j(\tau)\,d\tau\,dt
&= a_i\sum_{j=1}^2\frac{p_{ij}}{N_j}\int_0^\infty\!\!\int_\tau^\infty[-S_j'(t-\tau)]\,dt\,A_j(\tau)\,d\tau\\
&= a_i\sum_{j=1}^2p_{ij}\frac{S_j(0) - S_j(\infty)}{N_j}\int_0^\infty A_j(\tau)\,d\tau\\
&= a_i\sum_{j=1}^2p_{ij}\left[1 - \frac{S_j(\infty)}{N_j}\right]\hat A_j - a_i\sum_{j=1}^2p_{ij}\left[1 - \frac{S_j(0)}{N_j}\right]\hat A_j,
\end{aligned}
$$

where $\hat A_j = \int_0^\infty A_j(\tau)\,d\tau$.


Thus we may rewrite (21) as the final size system

$$
\ln\frac{S_i(0)}{S_i(\infty)} = a_i\sum_{j=1}^2p_{ij}\left[1 - \frac{S_j(\infty)}{N_j}\right]\int_0^\infty A_j(\tau)\,d\tau, \qquad i = 1, 2. \tag{22}
$$

If there are initial infectives with positive infection age, the final size system contains an initial term and takes the form

$$
\ln\frac{S_i(0)}{S_i(\infty)} = a_i\sum_{j=1}^2p_{ij}\left[1 - \frac{S_j(\infty)}{N_j}\right]\int_0^\infty A_j(\tau)\,d\tau - \Gamma_i, \tag{23}
$$

with

$$
\Gamma_i = a_i\sum_{j=1}^2\frac{p_{ij}}{N_j}\int_0^\infty\left[A_j(t)\,(N_j - S_j(0)) - \phi_j^{(0)}(t)\right]dt \ge 0.
$$

The system of equations (23) has a unique solution $(S_1(\infty), S_2(\infty))$. The final size relation takes a simpler form if the mixing is proportionate. With proportionate mixing, since $p_{ij}$ is independent of $i$,

$$
\frac{1}{a_i}\ln\frac{S_i(0)}{S_i(\infty)} = \frac{1}{a_j}\ln\frac{S_j(0)}{S_j(\infty)}.
$$

This enables us to write $S_2(\infty)$ on the right side of the final size relation in terms of $S_1(\infty)$, and gives the final size system as an equation for $S_1(\infty)$. Then we obtain the epidemic final size by solving a single equation for $S_1(\infty)$ and using the expression for $S_2(\infty)$ in terms of $S_1(\infty)$.

Numerical simulations indicate that models with heterogeneous mixing may give very different epidemic sizes than models with the same basic reproduction number and homogeneous mixing. The reproduction number of an epidemic model is not sufficient to determine the size of the epidemic if there is heterogeneity in the model. It is possible to show that if the mixing is proportionate, then for a given value of the basic reproduction number the maximum epidemic size is obtained with homogeneous mixing. We conjecture that this result is also valid if we allow arbitrary mixing, that is, we conjecture that for a given value of the basic reproduction number the maximum epidemic size for any mixing is obtained with homogeneous mixing.
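The final size system (22) can be solved numerically by a simple fixed-point iteration. The Python sketch below rewrites (22) as $S_i(\infty) = S_i(0)\exp\bigl(-a_i\sum_j p_{ij}[1 - S_j(\infty)/N_j]\hat A_j\bigr)$ and iterates from zero. It is an illustrative sketch under the assumption that all initial infectives have infection age zero (so that the terms $\Gamma_i$ vanish); the function name, the tolerance, and the sample values are assumptions, and `A_hat[j]` again denotes $\int_0^\infty A_j(\tau)\,d\tau$.

```python
import math


def final_sizes(a, p, N, S0, A_hat, tol=1e-10, max_iter=10000):
    """Fixed-point iteration for the final size system (22) with two groups.
    A_hat[j] is the integral of A_j over [0, infinity)."""
    S = [0.0, 0.0]  # the map is increasing in S, so the iterates increase from 0
    for _ in range(max_iter):
        S_new = [
            S0[i] * math.exp(-a[i] * sum(p[i][j] * (1.0 - S[j] / N[j]) * A_hat[j]
                                         for j in range(2)))
            for i in range(2)
        ]
        if max(abs(S_new[i] - S[i]) for i in range(2)) < tol:
            return S_new
        S = S_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical example with proportionate mixing, p_ij = a_j N_j / (a_1 N_1 + a_2 N_2).
a, N, S0, A_hat = (1.0, 3.0), (800.0, 200.0), (799.0, 199.0), (1.0, 1.0)
p = [[4.0 / 7.0, 3.0 / 7.0], [4.0 / 7.0, 3.0 / 7.0]]
S_inf = final_sizes(a, p, N, S0, A_hat)
print(S_inf)
# With proportionate mixing the two printed quantities should agree.
print(math.log(S0[0] / S_inf[0]) / a[0], math.log(S0[1] / S_inf[1]) / a[1])
```

The iteration converges only linearly, and slowly when the reproduction number is close to 1; a Newton iteration would be a reasonable alternative in that regime.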


3.2. The initial exponential growth rate

The analysis of the model (18) is analogous to the analysis carried out for the homogeneous mixing model (9). In order to obtain an expression for the initial exponential growth rate if there is an epidemic, we first replace the model (18) by the limit system

$$
\begin{aligned}
S_1' &= -a_1S_1\left[p_{11}\frac{\phi_1}{N_1} + p_{12}\frac{\phi_2}{N_2}\right]\\
\phi_1(t) &= \int_0^\infty[-S_1'(t-\tau)]A_1(\tau)\,d\tau\\
S_2' &= -a_2S_2\left[p_{21}\frac{\phi_1}{N_1} + p_{22}\frac{\phi_2}{N_2}\right]\\
\phi_2(t) &= \int_0^\infty[-S_2'(t-\tau)]A_2(\tau)\,d\tau.
\end{aligned}
\tag{24}
$$

According to the asymptotic theory of [25], the asymptotic behaviour of (18) is the same as that of the limit system (24) for all initial functions $\phi_1^{(0)}(t)$, $\phi_2^{(0)}(t)$ that tend to zero as $t \to \infty$. In order to avoid the difficulties posed by the fact that there is a two-dimensional subspace of equilibria $\phi_1 = \phi_2 = 0$, we include small birth rates in the equations for $S_1$, $S_2$ and corresponding proportional natural death rates in each compartment, to give the system

$$
\begin{aligned}
S_1' &= \mu N_1 - \mu S_1 - a_1S_1\left[p_{11}\frac{\phi_1}{N_1} + p_{12}\frac{\phi_2}{N_2}\right]\\
\phi_1(t) &= \int_0^\infty[-S_1'(t-\tau)]e^{-\mu\tau}A_1(\tau)\,d\tau\\
S_2' &= \mu N_2 - \mu S_2 - a_2S_2\left[p_{21}\frac{\phi_1}{N_1} + p_{22}\frac{\phi_2}{N_2}\right]\\
\phi_2(t) &= \int_0^\infty[-S_2'(t-\tau)]e^{-\mu\tau}A_2(\tau)\,d\tau.
\end{aligned}
\tag{25}
$$

Our procedure now follows the procedure used in the analysis of the initial exponential growth rate in the homogeneous mixing age of infection model (9) in Section 2. We linearize about the unique disease-free equilibrium

$$
S_1 = N_1, \qquad \phi_1 = 0, \qquad S_2 = N_2, \qquad \phi_2 = 0,
$$

and form the characteristic equation (the condition on $r$ that this linearization have an exponential solution with growth rate $r$ and non-zero initial values $u_1(0), v_1(0), u_2(0), v_2(0)$). There is a double root $r = -\mu < 0$, and the remaining roots of the characteristic equation are the roots of

$$
\det\begin{bmatrix}
p_{11}a_1\int_0^\infty e^{-(r+\mu)\tau}A_1(\tau)\,d\tau - 1 & p_{12}a_1\int_0^\infty e^{-(r+\mu)\tau}A_2(\tau)\,d\tau\\[4pt]
p_{21}a_2\int_0^\infty e^{-(r+\mu)\tau}A_1(\tau)\,d\tau & p_{22}a_2\int_0^\infty e^{-(r+\mu)\tau}A_2(\tau)\,d\tau - 1
\end{bmatrix} = 0.
$$

Since this is valid for every sufficiently small $\mu > 0$, we may let $\mu \to 0$ and conclude that if there is an epidemic, corresponding to an unstable equilibrium of the model, there is a positive root of the characteristic equation

$$
\det\begin{bmatrix}
a_1p_{11}\int_0^\infty e^{-r\tau}A_1(\tau)\,d\tau - 1 & a_1p_{12}\int_0^\infty e^{-r\tau}A_2(\tau)\,d\tau\\[4pt]
a_2p_{21}\int_0^\infty e^{-r\tau}A_1(\tau)\,d\tau & a_2p_{22}\int_0^\infty e^{-r\tau}A_2(\tau)\,d\tau - 1
\end{bmatrix} = 0, \tag{26}
$$

and the initial exponential growth rate is equal to this root. In the special case of proportionate mixing, in which $p_{11} = p_{21}$, $p_{12} = p_{22}$, so that $p_{12}p_{21} = p_{11}p_{22}$, the basic reproduction number is given by

$$
R_0 = p_{11}a_1\int_0^\infty A_1(\tau)\,d\tau + p_{22}a_2\int_0^\infty A_2(\tau)\,d\tau,
$$

and the characteristic equation (26) reduces to

$$
p_{11}a_1\int_0^\infty e^{-r\tau}A_1(\tau)\,d\tau + p_{22}a_2\int_0^\infty e^{-r\tau}A_2(\tau)\,d\tau = 1. \tag{27}
$$
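The characteristic equation (26) says that 1 is an eigenvalue of the matrix with entries $a_ip_{ij}\int_0^\infty e^{-r\tau}A_j(\tau)\,d\tau$. Since these entries decrease as $r$ increases, the largest root is the value of $r$ at which the largest eigenvalue of this matrix equals 1, and it can be found by bisection. The Python sketch below does this under the purely illustrative assumption of exponential kernels $A_j(\tau) = e^{-\alpha_j\tau}$, for which the integrals equal $1/(r + \alpha_j)$; the function names and parameter values are hypothetical.

```python
import math


def spectral_radius_2x2(k):
    """Largest eigenvalue of a 2x2 matrix with nonnegative entries."""
    half_trace = 0.5 * (k[0][0] + k[1][1])
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))


def two_group_growth_rate(a, p, alpha, tol=1e-12):
    """Largest root r of (26), assuming the hypothetical kernels
    A_j(tau) = exp(-alpha[j] * tau), whose transform is 1 / (r + alpha[j])."""
    def rho(r):
        k = [[a[i] * p[i][j] / (r + alpha[j]) for j in range(2)] for i in range(2)]
        return spectral_radius_2x2(k)

    # rho(0) = R0 > 1 is assumed; at r = max(a) every row sum is at most 1.
    lo, hi = 0.0, max(a)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rho(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example with proportionate mixing and alpha_1 = alpha_2 = 1;
# here (27) gives r = p_1 a_1 + p_2 a_2 - 1.
p = [[4.0 / 7.0, 3.0 / 7.0], [4.0 / 7.0, 3.0 / 7.0]]
print(two_group_growth_rate((1.0, 3.0), p, (1.0, 1.0)))
```

For general kernels the closed-form transform would be replaced by numerical quadrature, at the cost of one pair of integrals per bisection step.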

There is an epidemic if and only if $R_0 > 1$.

We have seen that in the case of homogeneous mixing, knowledge of the initial exponential growth rate and the infectious period distribution is sufficient to determine the basic reproduction number and thence the final size of an epidemic. In the case of heterogeneous mixing, knowledge of the initial exponential growth rate and the infectious period distribution is sufficient to determine the basic reproduction number, but not to determine the final size of the epidemic. This raises the question of what additional information that may be measured at the start of a disease outbreak would suffice to determine the epidemic final size if the mixing is heterogeneous. We assume that $A_1(\tau)$, $A_2(\tau)$ and the mixing matrix

$$
M = \begin{bmatrix} p_{11} & p_{12}\\ p_{21} & p_{22} \end{bmatrix}
$$

are known. The next generation matrix is

$$
K = \begin{bmatrix}
a_1p_{11}\int_0^\infty A_1(\tau)\,d\tau & a_1p_{12}\int_0^\infty A_2(\tau)\,d\tau\\[4pt]
a_2p_{21}\int_0^\infty A_1(\tau)\,d\tau & a_2p_{22}\int_0^\infty A_2(\tau)\,d\tau
\end{bmatrix},
$$


and $R_0$ is the largest (positive) eigenvalue of this matrix. There is a corresponding eigenvector with positive components

$$
u = \begin{bmatrix} u_1\\ u_2 \end{bmatrix}.
$$

Since the components of this eigenvector give the proportions of infectious cases in the two groups initially, it is reasonable to hope to be able to determine this eigenvector from early outbreak data. The final size relations (22) may be solved for $S_1(\infty)$, $S_2(\infty)$ if the contact rates $a_1$, $a_2$ can be determined from the available information. The condition that the vector $u$ with components $(u_1, u_2)$ is an eigenvector of the next generation matrix corresponding to the eigenvalue $R_0$ is

$$
\begin{aligned}
a_1\left[p_{11}u_1\int_0^\infty A_1(\tau)\,d\tau + p_{12}u_2\int_0^\infty A_2(\tau)\,d\tau\right] &= R_0u_1\\
a_2\left[p_{21}u_1\int_0^\infty A_1(\tau)\,d\tau + p_{22}u_2\int_0^\infty A_2(\tau)\,d\tau\right] &= R_0u_2,
\end{aligned}
$$

and since it is assumed that the functions $A_1(\tau)$, $A_2(\tau)$, the vector $u$, and the mixing matrix $(p_{ij})$ are known, these equations determine $a_1$ and $a_2$. When these values are substituted into the final size system, $S_1(\infty)$ and $S_2(\infty)$ may be determined. This argument extends easily to models with an arbitrary number of activity groups. In real-life applications, there are usually many groups, and the final size of an epidemic is obtained most efficiently by numerical simulations. The results obtained here are more likely to be useful in theoretical applications, such as comparisons of different control strategies.

4. Different models for the same epidemic

Suppose we have an age of infection model for which we know the function $A(\tau)$ representing the total infectivity of individuals with age of infection $\tau$. That is, we assume $A(\tau) = A_1(\tau) = A_2(\tau)$. If we assume that mixing is homogeneous, we are led to a model (9). If we measure (or estimate) the initial exponential growth rate $r$, then, as we have seen earlier (dividing $R_0 = a\int_0^\infty A(\tau)\,d\tau$ by (17) with $\lambda = r$),

$$
R_0 = \frac{\int_0^\infty A(\tau)\,d\tau}{\int_0^\infty e^{-r\tau}A(\tau)\,d\tau}. \tag{28}
$$

Now suppose, however, that there is in fact some heterogeneity in the model, specifically that there are two subgroups of size $N_1$, $N_2$ with $N = N_1 + N_2$ and with contact rates $a_1$, $a_2$ respectively, which mix in an arbitrary way. Then $R_0$ is the largest root of (19) with $A_1(\tau) = A_2(\tau)$, or

$$
\det\begin{bmatrix}
p_{11}a_1\int_0^\infty A(\tau)\,d\tau - \lambda & p_{12}a_1\int_0^\infty A(\tau)\,d\tau\\[4pt]
p_{21}a_2\int_0^\infty A(\tau)\,d\tau & p_{22}a_2\int_0^\infty A(\tau)\,d\tau - \lambda
\end{bmatrix} = 0. \tag{29}
$$

The equation for the initial exponential growth rate is (26) with $A_1(\tau) = A_2(\tau)$, or

$$
\det\begin{bmatrix}
p_{11}a_1\int_0^\infty e^{-r\tau}A(\tau)\,d\tau - 1 & p_{12}a_1\int_0^\infty e^{-r\tau}A(\tau)\,d\tau\\[4pt]
p_{21}a_2\int_0^\infty e^{-r\tau}A(\tau)\,d\tau & p_{22}a_2\int_0^\infty e^{-r\tau}A(\tau)\,d\tau - 1
\end{bmatrix} = 0. \tag{30}
$$

Comparing (29) and (30), we see that each of $R_0/\int_0^\infty A(\tau)\,d\tau$ and $1/\int_0^\infty e^{-r\tau}A(\tau)\,d\tau$ is the largest root of the equation

$$
x^2 - (a_1p_{11} + a_2p_{22})x + a_1a_2(p_{11}p_{22} - p_{12}p_{21}) = 0.
$$

Thus

$$
\frac{R_0}{\int_0^\infty A(\tau)\,d\tau} = \frac{1}{\int_0^\infty e^{-r\tau}A(\tau)\,d\tau},
$$

which implies the same relation (28) as for the homogeneous mixing model (9). Thus, if we assume heterogeneous mixing, we obtain the same estimate of the reproduction number from observation of the initial exponential growth rate, and this conclusion remains valid for an arbitrary number of groups with different contact rates.

This result does not generalize to the case $A_1(\tau) \ne A_2(\tau)$. One may think of the case $a_1 \ne a_2$, $A_1(\tau) = A_2(\tau)$ as a model for a disease with heterogeneous mixing but no treatment, and the case $a_1 = a_2$, $A_1(\tau) \ne A_2(\tau)$ as a model for a disease in which the mixing is homogeneous but treatment that changes the infectious period distribution has been applied to a part of the population. Of course, if the treatment also includes quarantine that also changes the contact rate, the case $a_1 \ne a_2$, $A_1(\tau) \ne A_2(\tau)$ would be appropriate.
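Relation (28) translates directly into an estimator of the reproduction number from an observed growth rate. The Python sketch below evaluates both integrals with a composite Simpson rule truncated at a finite upper limit; the function names, the truncation point, and the sample kernel and growth rate are illustrative assumptions rather than values from the text.

```python
import math


def simpson(f, upper, n=20000):
    """Composite Simpson rule for the integral of f over [0, upper]; n is even."""
    h = upper / n
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def r0_from_growth_rate(r, A, upper=200.0):
    """Equation (28): R0 = int A / int exp(-r*tau) A, with both integrals
    truncated at `upper` (A is assumed negligible beyond it)."""
    numerator = simpson(A, upper)
    denominator = simpson(lambda tau: math.exp(-r * tau) * A(tau), upper)
    return numerator / denominator


# Hypothetical check: for A(tau) = exp(-alpha*tau), (28) gives R0 = 1 + r/alpha.
alpha, r = 0.25, 0.2
print(r0_from_growth_rate(r, lambda tau: math.exp(-alpha * tau)))
```

By the argument of this section the same estimate applies whether or not the mixing is homogeneous, provided the infectivity kernel is the same in every group.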


References

1. Anderson, R.M. & R.M. May (1979) Population biology of infectious diseases I, Nature 280: 361–367.
2. Anderson, R.M. & R.M. May (1991) Infectious Diseases of Humans, Oxford University Press.
3. Arino, J., F. Brauer, P. van den Driessche, J. Watmough & J. Wu (2007) A final size relation for epidemic models, Math. Biosc. & Eng. 4: 159–176.
4. Bansal, S., J. Read, B. Pourbohloul & L.A. Meyers (2010) The dynamic nature of contact networks in infectious disease epidemiology, J. Biol. Dyn. 4: 478–489.
5. Blythe, S.P., S. Busenberg & C. Castillo-Chavez (1995) Affinity and paired-event probability, Math. Biosc. 128: 265–284.
6. Blythe, S.P., C. Castillo-Chavez, J. Palmer & M. Cheng (1991) Towards a unified theory of mixing and pair formation, Math. Biosc. 107: 379–405.
7. Brauer, F. (2005) The Kermack-McKendrick epidemic model revisited, Math. Biosc. 198: 119–131.
8. Brauer, F., C. Castillo-Chavez & Z. Feng (2010) Discrete epidemic models, Math. Biosc. & Eng. 7: 1–15.
9. Brauer, F. & J. Watmough (2009) Age of infection models with heterogeneous mixing, J. Biol. Dyn. 3: 324–330.
10. Busenberg, S. & C. Castillo-Chavez (1989) Interaction, pair formation and force of infection terms in sexually transmitted diseases, In: Mathematical and Statistical Approaches to AIDS Epidemiology, Lect. Notes Biomath. 83, C. Castillo-Chavez (ed.), Springer-Verlag, Berlin-Heidelberg-New York: 289–300.
11. Castillo-Chavez, C., K. Cooke, W. Huang & S.A. Levin (1989a) The role of long incubation periods in the dynamics of HIV/AIDS. Part 1: Single Populations Models, J. Math. Biol. 27: 373–98.
12. Diekmann, O. & J.A.P. Heesterbeek (2000) Mathematical epidemiology of infectious diseases: Model building, analysis and interpretation, John Wiley & Sons, New York.
13. Diekmann, O., J.A.P. Heesterbeek & J.A.J. Metz (1990) On the definition and the computation of the basic reproductive ratio R0 in models for infectious diseases in heterogeneous populations, J. Math. Biol. 28: 365–382.
14. Dietz, K. (1982) Overall patterns in the transmission cycle of infectious disease agents, In: R.M. Anderson, R.M. May (eds), Population Biology of Infectious Diseases, Life Sciences Research Report 25, Springer-Verlag, Berlin-Heidelberg-New York: 87–102. Am. J. Epidem. 103: 152–165.
15. Ferguson, N.M., D.A.T. Cummings, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai, S. Iamsirithaworn & D.S. Burke (2005) Strategies for containing an emerging influenza pandemic in Southeast Asia, Nature 437: 209–214.
16. Ferguson, N.M., D.A.T. Cummings, C. Fraser, J.C. Cajka, P.C. Cooley & D.S. Burke (2006) Strategies for mitigating an influenza pandemic, Nature 442: 448–452.
17. Gani, R., H. Hughes, T. Griffin, J. Medlock & S. Leach (2005) Potential impact of antiviral use on hospitalizations during influenza pandemic, Emerg. Infect. Dis. 11: 1355–1362.
18. Gumel, A., S. Ruan, T. Day, J. Watmough, P. van den Driessche, F. Brauer, D. Gabrielson, C. Bowman, M.E. Alexander, S. Ardal, J. Wu & B.M. Sahai (2004) Modeling strategies for controlling SARS outbreaks based on Toronto, Hong Kong, Singapore and Beijing experience, Proc. Roy. Soc. London 271: 2223–2232.
19. Heesterbeek, J.A.P. (1992) R0, Thesis, CWI, Amsterdam.
20. Heesterbeek, J.A.P. & J.A.J. Metz (1993) The saturating contact rate in marriage and epidemic models, J. Math. Biol. 31: 529–539.
21. Heffernan, J.M., R.J. Smith? & L.M. Wahl (2005) Perspectives on the basic reproductive ratio, J. Roy. Soc. Interface 2: 281–293.
22. Hethcote, H.W. & J.A. Yorke (1984) Gonorrhea Transmission Dynamics and Control, Lect. Notes in Biomath. 56, Springer-Verlag, Berlin-Heidelberg-New York.
23. Hyman, J.M., J. Li & E.A. Stanley (1999) The differential infectivity and staged progression models for the transmission of HIV, Math. Biosci. 155: 77–109.
24. Kermack, W.O. & A.G. McKendrick (1927) A contribution to the mathematical theory of epidemics, Proc. Royal Soc. London 115: 700–721.
25. Levin, J.J. & D.F. Shea (1972) On the asymptotic behavior of the bounded solutions of some integral equations, I, II, III, J. Math. Anal. & Appl. 37: 42–82, 288–326, 537–575.
26. Longini, I.M., M.E. Halloran, A. Nizam & Y. Yang (2004) Containing pandemic influenza with antiviral agents, Am. J. Epidem. 159: 623–633.
27. Longini, I.M., A. Nizam, S. Xu, K. Ungchusak, W. Hanshaoworakul, D.A.T. Cummings & M.E. Halloran (2005) Containing pandemic influenza at the source, Science 309: 1083–1087.
28. Longini, I.M. & M.E. Halloran (2005) Strategy for distribution of influenza vaccine to high-risk groups and children, Am. J. Epidem. 161: 303–306.
29. Ma, J. & D.J.D. Earn (2006) Generality of the final size formula for an epidemic of a newly invading infectious disease, Bull. Math. Biol. 68: 679–702.
30. May, R.M. & R.M. Anderson (1979) Population biology of infectious diseases II, Nature 280: 455–461.
31. Mena-Lorca, J. & H.W. Hethcote (1992) Dynamic models of infectious diseases as regulators of population size, J. Math. Biol. 30: 693–716.
32. Meyers, L.A. (2007) Contact network epidemiology: Bond percolation applied to infectious disease prediction and control, Bull. Am. Math. Soc. 44: 63–86.
33. Meyers, L.A., M.E.J. Newman & B. Pourbohloul (2006) Predicting epidemics on directed contact networks, J. Theor. Biol. 240: 400–418.
34. Newman, M.E.J. (2002) The spread of epidemic disease on networks, Phys. Rev. E 66, 016128.
35. Newman, M.E.J. (2003) The structure and function of complex networks, SIAM Review 45: 167–256.
36. Nold, A. (1980) Heterogeneity in disease transmission modeling, Math. Biosc. 52: 227–240.
37. Pourbohloul, B. & J. Miller (2008) Network theory and the spread of communicable diseases, Center for Disease Modeling Preprint 2008-03, www.cdm.yorku.ca/cdmprint03.pdf
38. van den Driessche, P. & J. Watmough (2002) Reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission, Math. Biosc. 180: 29–48.
39. Wallinga, J. & M. Lipsitch (2007) How generation intervals shape the relationship between growth rates and reproductive numbers, Proc. Royal Soc. B 274: 599–604.
40. Yang, C.K. & F. Brauer (2008) Calculation of R0 for age-of-infection models, Math. Biosc. & Eng. 5: 585–599.

May 6, 2013

12:5

BC: 8846 - BIOMAT 2012

02˙lou-wu

LYME PATHOGEN TRANSMISSION IN TICK POPULATIONS WITH MULTIPLE HOST SPECIES

YIJUN LOU Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Mprime Centre for Disease Modelling, York Institute of Health Research, Toronto, Ontario, M3J 1P3, Canada E-mail: [email protected] JIANHONG WU∗ AND XIAOTIAN WU Laboratory for Industrial and Applied Mathematics, York University, Toronto, Ontario, M3J 1P3, Canada Mprime Centre for Disease Modelling, York Institute of Health Research, Toronto, Ontario, M3J 1P3, Canada E-mail: [email protected], [email protected]

The vectors of Lyme disease can feed on a wide range of host species with variable reservoir competence, and the transmission dynamics of Lyme pathogen is also influenced by seasonal temperature. Here we summarize some recent results based on a stage-structured time-periodic deterministic model integrating seasonal tick development and activity, multiple host species and complex transmission routes between ticks and hosts. Of particular focus is on qualitative conditions for successful tick invasion and disease persistence, numerical simulations on the impact of climate warming on the pathogen transmission dynamics, and analysis how host diversity will dilute or amplify the Lyme disease risk to public health.

∗ Corresponding author

1. Introduction

Lyme disease can be transmitted to humans when B. burgdorferi-infected blacklegged ticks (Ixodes scapularis) feed on them. The pathogen transmission involves three ecological and epidemiological processes: nymphal ticks infected in the previous year appear first; these ticks transmit the pathogen to their susceptible vertebrate hosts during the feeding period; the next generation of larvae acquire the infection by feeding on the blood of recently infected hosts later in the same year, and these larvae develop into nymphs in the next year, which completes the transmission cycle. Among the many factors affecting Lyme risk [17] are host diversity [21,13,14,20], stage structure of ticks [6,19,18,15,14,12,2,26] and climate effects [18,15,14,7,1]. Modelling disease transmission with multiple life stages, tick seasonality and host community composition is crucial to understanding the pathogen transmission. There have been some modelling efforts, but many of the studies that incorporate some of the aforementioned factors permit only simulations rather than analytic investigation. In [25], we developed a modelling framework to investigate the impact of multiple tick life stages, tick seasonality and host diversity on the infection cycle of the Lyme disease agent, and this model seems to be mathematically tractable.

In this study, we followed the framework proposed by Randolph and Rogers [19], and we divided the vector population into 7 stages with 12 subclasses. This scheme can account for the following key features of tick development and pathogen transmission: (1) temperature-dependent/temperature-independent development rates; (2) a temperature-dependent host-seeking rate; (3) density-dependent mortality, caused by the hosts' responses during the feeding period; (4) density-independent/constant mortality induced by abiotic factors acting on the off-host development stages. The modelling study aimed to answer the following questions on Lyme transmission from a theoretical point of view: Could climate change redistribute the disease pathogen? Could a change in host diversity, through the addition of alternative host species to the community, dilute or amplify the Lyme pathogen? This paper gives a brief summary of the study [25].

2. The Model and Analysis

The model developed in [25] used periodic differential equations to account for the effects of temperature variations on tick development and pathogen transmission. The ticks, I. scapularis, pass through four stages, labeled E, L, N and A for eggs, larvae, nymphs and adults, respectively. Each post-egg stage is subdivided into questing and feeding phases according to activity. We use $i_{FS}$/$i_{FI}$ to denote the number of susceptible/infected ticks in the feeding phase of stage $i$ ($i = L, N, A$), and $i_{QS}$/$i_{QI}$ the number of susceptible/infected ticks in the questing phase of stage $i$ ($i = L, N, A$). The host community contains three species: two hosts for immature ticks (the white-footed mouse $H_1$ and an alternative host $H_2$) and one host for adult ticks (deer $D$). The death rates of the hosts for immature ticks are $\mu_{H_1}$ and $\mu_{H_2}$, respectively. We assume that the total number of each host species (susceptible plus infected) in the isolated habitat is constant. To describe Lyme disease transmission between I. scapularis and rodents, we denote by $H_{1I}$ and $H_{2I}$ the number of infected mice and the number of infected alternative hosts, respectively.

Eggs hatch into larvae at a development rate $d_E(t)$, while feeding larvae and feeding nymphs progress to the next stage (nymphs and adults, respectively) at the development rates $d_L(t)$ and $d_N(t)$. The host-attaching rates $F_L(t)$, $F_N(t)$ and $F_A(t)$ between questing ticks and hosts of each class are reported in [15]; they depend on both host density and temperature. Because hosts resist the feeding ticks, we also include density-dependent mortality of each feeding stage as a quadratic function with coefficient $D_i(t)$ ($i = L, N, A$). We use $\mu_{Qi}$ and $\mu_{Fi}$ ($i = L, N, A$) to denote the natural death rates of larvae, nymphs and adults in the questing and feeding phases, respectively. We also suppose that eggs are produced by feeding adult ticks at a rate $b(t)$ and die at a rate $\mu_E(t)$.

In the host-pathogen-tick transmission cycle, susceptible larvae can be infected by feeding on the blood of infected hosts. After a developmental delay, the infected nymphs that develop from infected larvae transmit the pathogen to their new hosts during the nymphal feeding period. In order to distinguish the biting rates on the two host species for immature ticks, we use biting bias coefficients [8,9] to describe the ticks' relative preference for the different host species. We assume that $p_1$ ($p_2$) represents the larval (nymphal) biting bias for the alternative host. A biting bias coefficient $p_1 > 1$ ($p_2 > 1$) indicates that larvae (nymphs) are biased towards the alternative host; on the other hand, $0 < p_1 < 1$ ($0 < p_2 < 1$) means that larvae (nymphs) are biased towards the main immature host (Peromyscus leucopus).

Using a derivation similar to that in [5],

$$F_L(t)\,\frac{H_1}{H_1+p_1H_2}\,\frac{H_{1I}(t)}{H_1}$$

is the average rate at which a susceptible questing larva finds and attaches successfully onto an infected mouse, and

$$\beta_{H_1L}\,F_L(t)\,\frac{H_1}{H_1+p_1H_2}\,\frac{H_{1I}(t)}{H_1}$$

is the average rate at which a susceptible larva gets infected from mice, where $\beta_{H_1L}$ is the transmission probability per bite from infectious mice ($H_1$) to susceptible larvae. Using the same idea to account for the infection of larvae by the infected alternative host ($H_2$), the larval infection rate is given by

$$
\begin{aligned}
&\beta_{H_1L}F_L(t)\,\frac{H_1}{H_1+p_1H_2}\,\frac{H_{1I}(t)}{H_1}\,L_Q(t)
+\beta_{H_2L}F_L(t)\,\frac{p_1H_2}{H_1+p_1H_2}\,\frac{H_{2I}(t)}{H_2}\,L_Q(t)\\
&\qquad=\Bigl(\beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2}\Bigr)F_L(t)L_Q(t).
\end{aligned}
$$

Similarly, the rate at which questing susceptible nymphs become newly infected feeding nymphs through contact with infectious hosts is given by

$$
\begin{aligned}
&\beta_{H_1N}F_N(t)\,\frac{H_1}{H_1+p_2H_2}\,\frac{H_{1I}(t)}{H_1}\,N_{QS}(t)
+\beta_{H_2N}F_N(t)\,\frac{p_2H_2}{H_1+p_2H_2}\,\frac{H_{2I}(t)}{H_2}\,N_{QS}(t)\\
&\qquad=\Bigl(\beta_{H_1N}\frac{H_{1I}(t)}{H_1+p_2H_2}+\beta_{H_2N}\frac{p_2H_{2I}(t)}{H_1+p_2H_2}\Bigr)F_N(t)N_{QS}(t).
\end{aligned}
$$

Susceptible hosts can become infected when they are bitten by infected questing nymphs. By conservation of bites (that is, the number of bites made by ticks must equal the number of bites received by hosts), the disease incidence rate for mice (Peromyscus leucopus) is

$$
\begin{aligned}
&F_N(t)\,\beta_{NH_1}\bigl(N_{QI}(t)+N_{QS}(t)\bigr)\,\frac{N_{QI}(t)}{N_{QI}(t)+N_{QS}(t)}\,\frac{H_1}{H_1+p_2H_2}\,\frac{H_1-H_{1I}(t)}{H_1}\\
&\qquad=F_N(t)\,\beta_{NH_1}N_{QI}(t)\,\frac{H_1-H_{1I}(t)}{H_1+p_2H_2}.
\end{aligned}
$$

Similarly, the alternative host is infected by the bites of infectious nymphs at a rate

$$F_N(t)\,\beta_{NH_2}N_{QI}(t)\,\frac{p_2\bigl(H_2-H_{2I}(t)\bigr)}{H_1+p_2H_2}.$$
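The bite-allocation terms above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and all numerical values are hypothetical (only $p_1 = 0.4$ is borrowed from the Figure 1 caption). It confirms that the host shares of bites sum to one and that the unsimplified and simplified forms of the larval infection rate agree.

```python
def bite_shares(h1, h2, bias):
    """Fractions of tick bites landing on H1 and on H2 for a given biting bias."""
    denom = h1 + bias * h2
    return h1 / denom, bias * h2 / denom


def tick_infection_rate(attach, questing, beta1, beta2, h1, h2, h1_inf, h2_inf, bias):
    """Simplified form: (beta1*H1I + beta2*bias*H2I) / (H1 + bias*H2) * F * Q."""
    return (beta1 * h1_inf + beta2 * bias * h2_inf) / (h1 + bias * h2) * attach * questing


def host_incidence(attach_n, nymph_inf, beta_nh1, beta_nh2, h1, h2, h1_inf, h2_inf, p2):
    """New infections per day in H1 and H2 caused by infected questing nymphs."""
    denom = h1 + p2 * h2
    new_h1 = attach_n * beta_nh1 * nymph_inf * (h1 - h1_inf) / denom
    new_h2 = attach_n * beta_nh2 * nymph_inf * p2 * (h2 - h2_inf) / denom
    return new_h1, new_h2


# Hypothetical values for illustration only (p1 = 0.4 as quoted for Figure 1).
H1, H2, H1I, H2I, P1 = 200.0, 20.0, 50.0, 5.0, 0.4
F_L, L_Q, BETA_H1L, BETA_H2L = 0.1, 1000.0, 0.6, 0.3

share1, share2 = bite_shares(H1, H2, P1)
# Unsimplified form: attach rate x host share x infected fraction, summed over hosts.
unsimplified = (BETA_H1L * F_L * share1 * (H1I / H1) * L_Q
                + BETA_H2L * F_L * share2 * (H2I / H2) * L_Q)
simplified = tick_infection_rate(F_L, L_Q, BETA_H1L, BETA_H2L, H1, H2, H1I, H2I, P1)
print(share1 + share2, unsimplified, simplified)
```

Writing the host incidence with the common denominator $H_1 + p_2H_2$ also keeps the expression well defined when the alternative host is absent ($H_2 = 0$).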

Therefore, the disease transmission process between ticks and their hosts can be described by the following system:

$$
\begin{aligned}
\frac{dE}{dt} &= b(t)\bigl(A_{FS}(t)+A_{FI}(t)\bigr)-\mu_E(t)E(t)-d_E(t)E(t),\\
\frac{dL_Q}{dt} &= d_E(t)E(t)-\mu_{QL}(t)L_Q(t)-F_L(t)L_Q(t),\\
\frac{dL_{FS}}{dt} &= \Bigl(1-\Bigl(\beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2}\Bigr)\Bigr)F_L(t)L_Q(t)\\
&\quad-\mu_{FL}(t)L_{FS}(t)-D_L(t)\bigl(L_{FS}(t)+L_{FI}(t)\bigr)L_{FS}(t)-d_L(t)L_{FS}(t),\\
\frac{dL_{FI}}{dt} &= \Bigl(\beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2}\Bigr)F_L(t)L_Q(t)\\
&\quad-\mu_{FL}(t)L_{FI}(t)-D_L(t)\bigl(L_{FS}(t)+L_{FI}(t)\bigr)L_{FI}(t)-d_L(t)L_{FI}(t),\\
\frac{dN_{QS}}{dt} &= d_L(t)L_{FS}(t)-\mu_{QN}(t)N_{QS}(t)-F_N(t)N_{QS}(t),\\
\frac{dN_{QI}}{dt} &= d_L(t)L_{FI}(t)-\mu_{QN}(t)N_{QI}(t)-F_N(t)N_{QI}(t),\\
\frac{dN_{FS}}{dt} &= \Bigl(1-\Bigl(\beta_{H_1N}\frac{H_{1I}(t)}{H_1+p_2H_2}+\beta_{H_2N}\frac{p_2H_{2I}(t)}{H_1+p_2H_2}\Bigr)\Bigr)F_N(t)N_{QS}(t)\\
&\quad-\mu_{FN}(t)N_{FS}(t)-D_N(t)\bigl(N_{FS}(t)+N_{FI}(t)\bigr)N_{FS}(t)-d_N(t)N_{FS}(t),\\
\frac{dN_{FI}}{dt} &= F_N(t)N_{QI}(t)+\Bigl(\beta_{H_1N}\frac{H_{1I}(t)}{H_1+p_2H_2}+\beta_{H_2N}\frac{p_2H_{2I}(t)}{H_1+p_2H_2}\Bigr)F_N(t)N_{QS}(t)\\
&\quad-\mu_{FN}(t)N_{FI}(t)-D_N(t)\bigl(N_{FS}(t)+N_{FI}(t)\bigr)N_{FI}(t)-d_N(t)N_{FI}(t),\\
\frac{dA_{QS}}{dt} &= d_N(t)N_{FS}(t)-\mu_{QA}(t)A_{QS}(t)-F_A(t)A_{QS}(t),\\
\frac{dA_{QI}}{dt} &= d_N(t)N_{FI}(t)-\mu_{QA}(t)A_{QI}(t)-F_A(t)A_{QI}(t),\\
\frac{dA_{FS}}{dt} &= F_A(t)A_{QS}(t)-\mu_{FA}(t)A_{FS}(t)-D_A(t)\bigl(A_{FS}(t)+A_{FI}(t)\bigr)A_{FS}(t),\\
\frac{dA_{FI}}{dt} &= F_A(t)A_{QI}(t)-\mu_{FA}(t)A_{FI}(t)-D_A(t)\bigl(A_{FS}(t)+A_{FI}(t)\bigr)A_{FI}(t),\\
\frac{dH_{1I}}{dt} &= F_N(t)\beta_{NH_1}N_{QI}(t)\,\frac{H_1-H_{1I}(t)}{H_1+p_2H_2}-\mu_{H_1}H_{1I}(t),\\
\frac{dH_{2I}}{dt} &= F_N(t)\beta_{NH_2}N_{QI}(t)\,\frac{p_2\bigl(H_2-H_{2I}(t)\bigr)}{H_1+p_2H_2}-\mu_{H_2}H_{2I}(t).
\end{aligned}
\tag{1}
$$


All parameter definitions are given in Table 1. We assume that all coefficients in the system are nonnegative and $T$-periodic with period $T = 365$ days. Using the change of variables $L_F = L_{FS} + L_{FI}$, $N_Q = N_{QS} + N_{QI}$, $N_F = N_{FS} + N_{FI}$, $A_Q = A_{QS} + A_{QI}$ and $A_F = A_{FS} + A_{FI}$, system (1) can be reduced to

$$
\begin{aligned}
\frac{dE}{dt} &= b(t)A_F(t)-\bigl(\mu_E(t)+d_E(t)\bigr)E(t),\\
\frac{dL_Q}{dt} &= d_E(t)E(t)-\bigl(\mu_{QL}(t)+F_L(t)\bigr)L_Q(t),\\
\frac{dL_F}{dt} &= F_L(t)L_Q(t)-D_L(t)L_F^2(t)-\bigl(\mu_{FL}(t)+d_L(t)\bigr)L_F(t),\\
\frac{dN_Q}{dt} &= d_L(t)L_F(t)-\bigl(\mu_{QN}(t)+F_N(t)\bigr)N_Q(t),\\
\frac{dN_F}{dt} &= F_N(t)N_Q(t)-D_N(t)N_F^2(t)-\bigl(\mu_{FN}(t)+d_N(t)\bigr)N_F(t),\\
\frac{dA_Q}{dt} &= d_N(t)N_F(t)-\bigl(\mu_{QA}(t)+F_A(t)\bigr)A_Q(t),\\
\frac{dA_F}{dt} &= F_A(t)A_Q(t)-\mu_{FA}(t)A_F(t)-D_A(t)A_F^2(t),\\
\frac{dL_{FI}}{dt} &= \Bigl(\beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2}\Bigr)F_L(t)L_Q(t)\\
&\quad-D_L(t)L_F(t)L_{FI}(t)-\bigl(\mu_{FL}(t)+d_L(t)\bigr)L_{FI}(t),\\
\frac{dN_{QI}}{dt} &= d_L(t)L_{FI}(t)-\bigl(\mu_{QN}(t)+F_N(t)\bigr)N_{QI}(t),\\
\frac{dH_{1I}}{dt} &= F_N(t)\beta_{NH_1}N_{QI}(t)\,\frac{H_1-H_{1I}(t)}{H_1+p_2H_2}-\mu_{H_1}H_{1I}(t),\\
\frac{dH_{2I}}{dt} &= F_N(t)\beta_{NH_2}N_{QI}(t)\,\frac{p_2\bigl(H_2-H_{2I}(t)\bigr)}{H_1+p_2H_2}-\mu_{H_2}H_{2I}(t).
\end{aligned}
\tag{2}
$$
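As a check on this reduction, the feeding-stage equations of system (1) can be added pairwise. The shorthand $\Lambda_L(t)$ and $\Lambda_N(t)$ below is introduced here only for brevity and does not appear in the original; the infection terms cancel in the sums, so the total-population equations no longer involve the host infection state.

```latex
\[
\Lambda_L(t) := \beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2},
\qquad
\Lambda_N(t) := \beta_{H_1N}\frac{H_{1I}(t)}{H_1+p_2H_2}+\beta_{H_2N}\frac{p_2H_{2I}(t)}{H_1+p_2H_2}.
\]
\begin{align*}
\frac{dL_F}{dt} &= \frac{dL_{FS}}{dt}+\frac{dL_{FI}}{dt}
 = \bigl(1-\Lambda_L\bigr)F_LL_Q+\Lambda_LF_LL_Q
   -\bigl(\mu_{FL}+d_L\bigr)\bigl(L_{FS}+L_{FI}\bigr)-D_L\bigl(L_{FS}+L_{FI}\bigr)^2\\
 &= F_LL_Q-D_LL_F^2-\bigl(\mu_{FL}+d_L\bigr)L_F,\\[4pt]
\frac{dN_F}{dt} &= \frac{dN_{FS}}{dt}+\frac{dN_{FI}}{dt}
 = \bigl(1-\Lambda_N\bigr)F_NN_{QS}+F_NN_{QI}+\Lambda_NF_NN_{QS}
   -\bigl(\mu_{FN}+d_N\bigr)N_F-D_NN_F^2\\
 &= F_N\bigl(N_{QS}+N_{QI}\bigr)-D_NN_F^2-\bigl(\mu_{FN}+d_N\bigr)N_F
 = F_NN_Q-D_NN_F^2-\bigl(\mu_{FN}+d_N\bigr)N_F.
\end{align*}
```

The remaining totals ($N_Q$, $A_Q$, $A_F$) follow in the same way because their equations are linear in the susceptible and infected classes apart from the shared density-dependent factor.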

Note also that there are three further equations, for infected feeding nymphs ($N_{FI}$), infected questing adults ($A_{QI}$) and infected feeding adults ($A_{FI}$), which can be decoupled from system (2).

2.1. The Tick Population Dynamics

We first considered the following stage-structured system:

$$
\begin{aligned}
\frac{dE}{dt} &= b(t)A_F(t)-\bigl(\mu_E(t)+d_E(t)\bigr)E(t),\\
\frac{dL_Q}{dt} &= d_E(t)E(t)-\bigl(\mu_{QL}(t)+F_L(t)\bigr)L_Q(t),\\
\frac{dL_F}{dt} &= F_L(t)L_Q(t)-D_L(t)L_F^2(t)-\bigl(\mu_{FL}(t)+d_L(t)\bigr)L_F(t),\\
\frac{dN_Q}{dt} &= d_L(t)L_F(t)-\bigl(\mu_{QN}(t)+F_N(t)\bigr)N_Q(t),\\
\frac{dN_F}{dt} &= F_N(t)N_Q(t)-D_N(t)N_F^2(t)-\bigl(\mu_{FN}(t)+d_N(t)\bigr)N_F(t),\\
\frac{dA_Q}{dt} &= d_N(t)N_F(t)-\bigl(\mu_{QA}(t)+F_A(t)\bigr)A_Q(t),\\
\frac{dA_F}{dt} &= F_A(t)A_Q(t)-\mu_{FA}(t)A_F(t)-D_A(t)A_F^2(t).
\end{aligned}
\tag{3}
$$
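System (3) is straightforward to integrate numerically. The sketch below is only an illustration under loudly stated assumptions: the cosine-shaped coefficients, the constant mortality rates and every numerical value are placeholders invented for the example (the source estimates its coefficients from temperature data), and a hand-written classical Runge-Kutta step stands in for whatever solver the authors used.

```python
import math

T = 365.0  # period in days, as in the text


def seasonal(mean, amplitude=0.8, peak_day=200.0):
    """Hypothetical nonnegative T-periodic coefficient; not taken from the source."""
    return lambda t: mean * (1.0 + amplitude * math.cos(2.0 * math.pi * (t - peak_day) / T))


# Placeholder coefficients, chosen only so that the sketch runs.
birth, dev_E, dev_L, dev_N = seasonal(2.0), seasonal(0.03), seasonal(0.03), seasonal(0.03)
att_L, att_N, att_A = seasonal(0.05), seasonal(0.05), seasonal(0.05)
D_L = D_N = D_A = 1e-3                       # density-dependent mortality coefficients
MU_E, MU_QL, MU_QN, MU_QA = 0.002, 0.01, 0.01, 0.01
MU_FL, MU_FN, MU_FA = 0.01, 0.01, 0.05       # taken constant here for simplicity


def tick_rhs(t, x):
    """Right-hand side of system (3) for x = (E, L_Q, L_F, N_Q, N_F, A_Q, A_F)."""
    E, LQ, LF, NQ, NF, AQ, AF = x
    return [
        birth(t) * AF - (MU_E + dev_E(t)) * E,
        dev_E(t) * E - (MU_QL + att_L(t)) * LQ,
        att_L(t) * LQ - D_L * LF ** 2 - (MU_FL + dev_L(t)) * LF,
        dev_L(t) * LF - (MU_QN + att_N(t)) * NQ,
        att_N(t) * NQ - D_N * NF ** 2 - (MU_FN + dev_N(t)) * NF,
        dev_N(t) * NF - (MU_QA + att_A(t)) * AQ,
        att_A(t) * AQ - MU_FA * AF - D_A * AF ** 2,
    ]


def rk4(f, x, t0, t1, h):
    """Classical fourth-order Runge-Kutta integration of x' = f(t, x) from t0 to t1."""
    t = t0
    for _ in range(int(round((t1 - t0) / h))):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)])
        k4 = f(t + h, [xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * p + 2 * q + r)
             for xi, a, p, q, r in zip(x, k1, k2, k3, k4)]
        t += h
    return x


# Start from 100 eggs and integrate for 30 days with a 0.1-day step.
state = rk4(tick_rhs, [100.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 0.0, 30.0, 0.1)
print(state)
```

Running the integration over many periods from different initial states is one informal way to observe the convergence to a single annual cycle that the theory below makes precise.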

Linearization of (3) at zero leads to the matrix

$$F(t)=\bigl(f_{ij}(t)\bigr)_{7\times 7}, \tag{4}$$

where $f_{1,7}(t)=b(t)$ and $f_{i,j}(t)=0$ if $(i,j)\neq(1,7)$, and to the matrix

$$
V(t)=\begin{pmatrix}
H_E(t) & 0 & 0 & 0 & 0 & 0 & 0\\
-d_E(t) & J_L(t) & 0 & 0 & 0 & 0 & 0\\
0 & -F_L(t) & H_L(t) & 0 & 0 & 0 & 0\\
0 & 0 & -d_L(t) & J_N(t) & 0 & 0 & 0\\
0 & 0 & 0 & -F_N(t) & H_N(t) & 0 & 0\\
0 & 0 & 0 & 0 & -d_N(t) & J_A(t) & 0\\
0 & 0 & 0 & 0 & 0 & -F_A(t) & \mu_{FA}(t)
\end{pmatrix},
$$

where

$$
\begin{aligned}
H_E(t)&=\mu_E(t)+d_E(t), & H_L(t)&=\mu_{FL}(t)+d_L(t), & H_N(t)&=\mu_{FN}(t)+d_N(t),\\
J_L(t)&=\mu_{QL}(t)+F_L(t), & J_N(t)&=\mu_{QN}(t)+F_N(t), & J_A(t)&=\mu_{QA}(t)+F_A(t).
\end{aligned}
$$

We also introduce the evolution operator $Y(t,s)$ defined by

$$\frac{d}{dt}Y(t,s)=-V(t)Y(t,s)\quad\forall t\ge s,\qquad Y(s,s)=I,\quad s\in\mathbb{R},$$

where $I$ is the $7\times 7$ identity matrix. Let $C_T$ be the Banach space of all $T$-periodic functions from $\mathbb{R}$ to $\mathbb{R}^7$, equipped with the maximum norm. Suppose $\phi\in C_T$ is the initial distribution of tick individuals in this periodic environment. We then defined the linear operator $G: C_T\to C_T$ by

$$(G\phi)(t)=\int_0^\infty Y(t,t-a)\,F(t-a)\,\phi(t-a)\,da\quad\forall t\in\mathbb{R},\ \phi\in C_T.$$

Following [23], we define the threshold value $R_v$ as $R_v:=\rho(G)$, the spectral radius of $G$. Using [23, Theorem 2.2; 22, Theorem 4.1.1; 28, Theorem 2.3.4], we obtained the following result.

Theorem 2.1. The following statements are valid:
(1) If $R_v\le 1$, then zero is globally asymptotically stable for system (3) in $\mathbb{R}^7_+$;
(2) If $R_v>1$, then system (3) admits a unique positive $T$-periodic solution $(E^*(t), L_Q^*(t), L_F^*(t), N_Q^*(t), N_F^*(t), A_Q^*(t), A_F^*(t))$, and it is globally asymptotically stable for system (3) with initial values in $\mathbb{R}^7_+\setminus\{0\}$.

We want to emphasize that this threshold value is related to the basic reproduction ratio introduced in [3], and that a few numerical algorithms for calculating this spectral radius have been proposed in the recent mathematical biology literature (see, for example, [4,23,27]).
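One such numerical approach can be sketched on a toy problem. As we understand the characterisation in [23], for periodic systems of this type the threshold is the positive number $\lambda$ for which the monodromy matrix of $x' = (F(t)/\lambda - V(t))x$ has spectral radius one. The example below applies this idea to a hypothetical two-stage (juvenile, adult) model rather than to the seven-stage tick system; the rates, the rescaled period of one time unit and the bisection bracket are all assumptions made for illustration.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def shifted(x, k, s):
    """Return x + s*k for 2x2 matrices."""
    return [[x[i][j] + s * k[i][j] for j in range(2)] for i in range(2)]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))


def monodromy(a_of_t, period=1.0, steps=200):
    """RK4 integration of Phi' = A(t) Phi, Phi(0) = I, returning Phi(period)."""
    h = period / steps
    phi = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(steps):
        t = n * h
        k1 = matmul(a_of_t(t), phi)
        k2 = matmul(a_of_t(t + h / 2), shifted(phi, k1, h / 2))
        k3 = matmul(a_of_t(t + h / 2), shifted(phi, k2, h / 2))
        k4 = matmul(a_of_t(t + h), shifted(phi, k3, h))
        phi = [[phi[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
                for j in range(2)] for i in range(2)]
    return phi


def threshold(birth, develop=0.5, mu_juv=0.5, mu_adult=1.0, lo=0.05, hi=50.0, iters=30):
    """Bisection (on a log scale) for lambda with rho(monodromy of F/lambda - V) = 1.

    Toy model: F = [[0, birth(t)], [0, 0]], V = [[mu_juv + develop, 0], [-develop, mu_adult]].
    Assumes the root lies in [lo, hi].
    """
    def rho(lam):
        return spectral_radius_2x2(monodromy(
            lambda t: [[-(mu_juv + develop), birth(t) / lam], [develop, -mu_adult]]))

    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if rho(mid) > 1.0:
            lo = mid  # still growing: the threshold is larger than mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Constant birth rate: the result should agree with rho(F V^{-1}) = 3 * 0.5 / (1 * 1) = 1.5.
r_constant = threshold(lambda t: 3.0)
# Seasonal birth rate with the same mean (hypothetical forcing).
r_seasonal = threshold(lambda t: 3.0 * (1.0 + 0.8 * math.cos(2.0 * math.pi * t)))
print(r_constant, r_seasonal)
```

The constant-coefficient case serves as a sanity check, since there the threshold reduces to the spectral radius of $FV^{-1}$. For the full tick model the same bisection would apply with $7\times 7$ matrices and a general eigenvalue routine.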


2.2. The Global Dynamics

If the threshold value for ticks satisfies $R_v>1$, then there exists a positive periodic solution $(E^*(t), L_Q^*(t), L_F^*(t), N_Q^*(t), N_F^*(t), A_Q^*(t), A_F^*(t))$ of system (3) such that

$$\bigl(E(t), L_Q(t), L_F(t), N_Q(t), N_F(t), A_Q(t), A_F(t)\bigr)\to\bigl(E^*(t), L_Q^*(t), L_F^*(t), N_Q^*(t), N_F^*(t), A_Q^*(t), A_F^*(t)\bigr)\quad\text{as } t\to\infty.$$

In this case, the equations for the infected populations in system (2) give rise to the following limiting system:

$$
\begin{aligned}
\frac{dL_{FI}}{dt} &= \Bigl(\beta_{H_1L}\frac{H_{1I}(t)}{H_1+p_1H_2}+\beta_{H_2L}\frac{p_1H_{2I}(t)}{H_1+p_1H_2}\Bigr)F_L(t)L_Q^*(t)\\
&\quad-D_L(t)L_F^*(t)L_{FI}(t)-\bigl(d_L(t)+\mu_{FL}(t)\bigr)L_{FI}(t),\\
\frac{dN_{QI}}{dt} &= d_L(t)L_{FI}(t)-\bigl(\mu_{QN}(t)+F_N(t)\bigr)N_{QI}(t),\\
\frac{dH_{1I}}{dt} &= F_N(t)\beta_{NH_1}N_{QI}(t)\,\frac{H_1-H_{1I}(t)}{H_1+p_2H_2}-\mu_{H_1}H_{1I}(t),\\
\frac{dH_{2I}}{dt} &= F_N(t)\beta_{NH_2}N_{QI}(t)\,\frac{p_2\bigl(H_2-H_{2I}(t)\bigr)}{H_1+p_2H_2}-\mu_{H_2}H_{2I}(t).
\end{aligned}
\tag{5}
$$
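For orientation, linearizing system (5) at the zero solution suggests one natural decomposition into new-infection and transition terms. The matrices below are our reading, using the usual convention that tick-host transmission terms go into $\tilde F$ and everything else into $\tilde V$; the summary itself does not spell them out, so they should be treated as an assumption. The state ordering is $(L_{FI}, N_{QI}, H_{1I}, H_{2I})$.

```latex
\[
\tilde F(t)=\begin{pmatrix}
0 & 0 & \dfrac{\beta_{H_1L}\,F_L(t)\,L_Q^*(t)}{H_1+p_1H_2} & \dfrac{\beta_{H_2L}\,p_1\,F_L(t)\,L_Q^*(t)}{H_1+p_1H_2}\\[8pt]
0 & 0 & 0 & 0\\[4pt]
0 & \dfrac{\beta_{NH_1}\,F_N(t)\,H_1}{H_1+p_2H_2} & 0 & 0\\[8pt]
0 & \dfrac{\beta_{NH_2}\,F_N(t)\,p_2H_2}{H_1+p_2H_2} & 0 & 0
\end{pmatrix},
\]
\[
\tilde V(t)=\begin{pmatrix}
D_L(t)L_F^*(t)+d_L(t)+\mu_{FL}(t) & 0 & 0 & 0\\
-d_L(t) & \mu_{QN}(t)+F_N(t) & 0 & 0\\
0 & 0 & \mu_{H_1} & 0\\
0 & 0 & 0 & \mu_{H_2}
\end{pmatrix}.
\]
```

With this choice the host equations are linearized by replacing $H_1-H_{1I}(t)$ and $H_2-H_{2I}(t)$ with $H_1$ and $H_2$, and the periodic tick solution enters only through $L_Q^*(t)$ and $L_F^*(t)$.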

We can then define $\tilde F(t)$, $\tilde V(t)$ and the evolution process $\tilde Y(t,s)$ in a similar way as before to define the threshold value $R_d$ for the pathogen. We also obtained the following result.

Theorem 2.2. The following statements are valid:
(1) If $R_d\le 1$, then zero is globally asymptotically stable for system (5) in $\mathbb{R}^4_+$;
(2) If $R_d>1$, then system (5) admits a unique positive periodic solution $(L_{FI}^*(t), N_{QI}^*(t), H_{1I}^*(t), H_{2I}^*(t))$, and it is globally asymptotically stable for system (5).

Therefore, based on the two threshold values, the threshold for ticks ($R_v$) and the threshold for the pathogen ($R_d$), we can completely determine the global dynamics of the model system (2).

Theorem 2.3. Let $x(t,x_0)$ be the solution of system (2) through $x_0$. Then the following statements are valid:
(1) If $R_v\le 1$, then zero is globally attractive for system (2);
(2) If $R_v>1$ and $R_d\le 1$, then
$$\bigl(x_1(t), x_2(t), x_3(t), x_4(t), x_5(t), x_6(t), x_7(t)\bigr)\to\bigl(E^*(t), L_Q^*(t), L_F^*(t), N_Q^*(t), N_F^*(t), A_Q^*(t), A_F^*(t)\bigr)\quad\text{as } t\to\infty,$$
and $\lim_{t\to\infty}x_i(t)=0$ for $i\in[8,11]$;
(3) If $R_v>1$ and $R_d>1$, then there exists a positive periodic solution $x^*(t)$, and it is globally attractive for system (2) with respect to all positive solutions.

3. Numerical Simulations

We then simulated the influences of climate warming and host diversity on tick population abundance and disease invasion. The parameter values were estimated from the literature and experimental reports. We ran the model until the tick population and pathogen level stabilized at an annual cycle. The simulation results show that solutions starting from different initial conditions all attain the same stable annual cycle, which is consistent with the theoretical results in Section 2.

We compared four indices to measure the disease risk to humans. The first is the total number of questing nymphs (TQN) at equilibrium, which gives a precise description of questing nymphal abundance and seasonality. The second is the abundance and seasonality of all actually active infected questing nymphs (AIQN) at equilibrium. Here the number of AIQN is the product of the immature tick activity proportion and the number of all infected questing nymphs; this number reflects the actual risk of Lyme disease from a public health perspective. The third is the infection prevalence in questing nymphs (INP), that is, the number of infected questing nymphs divided by the total number of questing nymphs at the stable state. The last index consists of the threshold values for the tick population and the Lyme pathogen.

3.1. Climate Warming Effects

The model accounts for tick seasonality arising from both temperature-dependent and temperature-independent tick development and activities. This enabled us to investigate the effect of climate change on seasonal tick abundance and disease risk. Here, we changed the mean monthly temperature data from those of the 1961-1990 period to those of the 2000-2009 period; both datasets were collected from a meteorological station near a tick-endemic area in Canada.
For both periods, the threshold values can be computed numerically. The threshold value for ticks, $R_v$, increased from 1.6996 for the 1961-1990 period to 2.1915 for the 2000-2009 period, while the threshold value for the pathogen, $R_d$, also increased, from 0.7585 to 1.0867. Our mathematical result (Theorem 2.3) predicts that the tick population can successfully invade the habitat under both temperature regimes since $R_v>1$. However, the transmission cycle fails to establish under the 1961-1990 temperature data, whereas it establishes under the 2000-2009 temperature data, since $R_d$ increased from below one to above one.

Table 1. The parameter definitions.

Parameter        Meaning
$\mu_E$          daily basal mortality rate of eggs
$\mu_{QL}$       daily basal mortality rate of questing larvae
$\mu_{QN}$       daily basal mortality rate of questing nymphs
$\mu_{QA}$       daily basal mortality rate of questing adults
$\mu_{FL}$       daily basal mortality rate of feeding larvae
$\mu_{FN}$       daily basal mortality rate of feeding nymphs
$\mu_{FA}$       daily basal mortality rate of feeding adults
$\mu_{H_1}$      daily death rate of white-footed mice
$\mu_{H_2}$      daily death rate of the alternative host $H_2$
$b(t)$           daily time-dependent birth rate of eggs produced per feeding adult female
$d_E(t)$         daily time-dependent development rate of eggs
$d_L(t)$         daily time-dependent development rate of larvae
$d_N(t)$         daily time-dependent development rate of nymphs
$H_1$            number of white-footed mice
$H_2$            number of alternative hosts
$D$              number of deer
$p_1$            larval biting bias for the alternative host $H_2$
$p_2$            nymphal biting bias for the alternative host $H_2$
$F_L(t)$         daily time- and density-dependent host-attaching rate of larvae
$F_N(t)$         daily time- and density-dependent host-attaching rate of nymphs
$F_A(t)$         daily time- and density-dependent host-attaching rate of adults
$D_L(t)$         daily density-dependent mortality rate of feeding larvae on hosts
$D_N(t)$         daily density-dependent mortality rate of feeding nymphs on hosts
$D_A(t)$         daily density-dependent mortality rate of feeding adults on hosts
$\beta_{H_1L}$   transmission probability from infected host species $H_1$ to susceptible larvae
$\beta_{NH_1}$   transmission probability from infected nymphs to susceptible host species $H_1$
$\beta_{H_2L}$   transmission probability from infected host species $H_2$ to susceptible larvae
$\beta_{NH_2}$   transmission probability from infected nymphs to susceptible host species $H_2$
$\beta_{H_1N}$   transmission probability from infected host species $H_1$ to susceptible nymphs
$\beta_{H_2N}$   transmission probability from infected host species $H_2$ to susceptible nymphs

3.2. Host Diversity Effects

The host community composition can affect tick population abundance and pathogen spread, because hosts serve as the blood source for the vector and as the reservoir for the pathogen. As a first step, we assumed that only one species of alternative host can be added to the model while the abundance of the competent host P. leucopus is held constant. For these simulations, we fixed the climate condition (the temperature data of the 2000-2009 period) and varied the size and species of the alternative host population. Once the temperature data are given, the birth rate $b(t)$, the development rates $d_E(t)$, $d_L(t)$, $d_N(t)$ and the host-attaching rates $F_L(t)$, $F_N(t)$ and $F_A(t)$ can be estimated using the methodology introduced in [24]. All the possible alternative species listed in [10] were tested in our simulations, with the exception of deer, which in our model are an incompetent host used exclusively by adult ticks.

3.2.1. Effects of Adding Alternative Hosts without Interspecific Host Competition

Adding one species of alternative host ($H_2$) to the system, we tested how the alternative host species affects the abundance of the total and infected questing nymphal tick populations. Figure 1 shows that both the number of TQN (Figure 1(a)) and the number of infected questing nymphs, IQN (Figure 1(b)), become larger when the density of the alternative host (Eastern chipmunk) increases from zero to higher levels. The numbers of active total/infected questing nymphs (Figure 1(c) and 1(d)) increase as well.
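The quantities plotted in Figure 1, together with the INP index defined in Section 3, can be computed from simulated daily series along the following lines. This is a minimal sketch: the function name, the input series and all numbers are hypothetical, and the activity proportion is assumed to be supplied as a daily fraction between 0 and 1.

```python
def risk_indices(nq_total, nq_infected, activity):
    """Return daily TQN, IQN, AIQN and INP series over one annual cycle.

    nq_total    -- total questing nymphs per day
    nq_infected -- infected questing nymphs per day
    activity    -- proportion of immature ticks that are active per day
    """
    aiqn = [a * i for a, i in zip(activity, nq_infected)]
    inp = [i / n if n > 0 else 0.0 for i, n in zip(nq_infected, nq_total)]
    return {"TQN": list(nq_total), "IQN": list(nq_infected), "AIQN": aiqn, "INP": inp}


# Two made-up days of output for a baseline community and for a community
# with an added alternative host (values are illustrative only).
activity = [0.5, 0.8]
baseline = risk_indices([1000.0, 1200.0], [200.0, 240.0], activity)
scenario = risk_indices([1500.0, 1800.0], [240.0, 288.0], activity)

print(max(baseline["AIQN"]), max(scenario["AIQN"]))
print(max(baseline["INP"]), max(scenario["INP"]))
```

In this made-up example the added host raises the peak AIQN while lowering the INP, which illustrates how the two indices can point in opposite directions when total nymph numbers grow faster than infected ones.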
Figure 1(d) shows that the Eastern chipmunk amplifies the number of active infected questing nymphal ticks, and that the peak of infected questing nymphs occurs in summer (July and August), which coincides with peak human outdoor activity and thus contributes to the high risk of contracting Lyme disease.

[Figure 1: four panels plotted against time (month, January to December), each with three curves labeled H1(200), H1(200)+H2(10) and H1(200)+H2(20). Vertical axes: (a) number of questing nymphs; (b) number of infected questing nymphs; (c) number of active questing nymphs; (d) number of active infected questing nymphs.]

Figure 1. Variations of the abundance of total questing nymphal ticks and infected questing nymphal ticks without/with the alternative host Eastern chipmunk, where $p_1 = 0.4$, $p_2 = 3.5$. (a): Variations in the numbers of total questing nymphal ticks (TQN) with/without the alternative host Eastern chipmunk; (b): Variations in the numbers of infected questing nymphal ticks (IQN) with/without the alternative host Eastern chipmunk; (c): Variations in the numbers of total active questing nymphal ticks (AQN) with/without the alternative host Eastern chipmunk; (d): Variations in the numbers of active infected questing nymphal ticks (AIQN) with/without the alternative host Eastern chipmunk. Solid lines represent the scenario without any alternative hosts; dashed lines represent the scenario with 10 Eastern chipmunks; dot-dashed lines represent the scenario with 20 Eastern chipmunks. When $H_1 = 200$ and $H_2 = 0$, the threshold value for ticks is $R_v = 2.1915$ and the threshold value for the pathogen is $R_d = 1.0867$; when $H_1 = 200$ and $H_2 = 10$, $R_v = 2.3511$ and $R_d = 1.1777$; when $H_1 = 200$ and $H_2 = 20$, $R_v = 2.4978$ and $R_d = 1.2579$.

However, we also had simulations showing that the size of the (active) total questing nymph population increases while that of the (active) infected questing nymph population decreases when we increase the density of the alternative host Virginia opossum. In this case, the Virginia opossum, serving as a blood source for the ticks, can dilute the Lyme disease pathogen and thus decrease the risk of Lyme disease.

Using a similar idea, we tested the effects of adding a specific alternative host species on disease risk. We noticed that adding an alternative host always increases the threshold value for ticks, $R_v$, because the alternative host serves as an additional food supply and promotes tick development. However, adding an alternative host may increase or decrease the threshold value for the pathogen, $R_d$, since this value is affected by both the tick development and the reservoir competence of the added host. Moreover, two indices, the active infected questing nymphs (AIQN) and the infected nymphal proportion (INP), may generate conflicting predictions when determining amplification and dilution effects. For example, adding 20 raccoons to the existing host community may increase the AIQN while decreasing the INP. Therefore, different indices, rather than a single index, should be used to measure the disease risk for different specific purposes.

3.2.2. Effects of Adding the Alternative Host with Interspecific Host Competition

In some situations, the addition of alternative hosts to the model will reduce the abundance of P. leucopus through interspecific competition on a one-for-one basis, so that the total number of hosts (mice plus alternative hosts) remains constant. This scenario is based on the assumption that the environment can only support a saturated number of rodents. When alternative hosts replace P. leucopus in the model, a dilution or amplification effect was observed in all the simulations, which differs slightly from the results obtained when interspecific competition between host species is ignored.

3.3. Sensitivity Analysis

Due to the complexity of the vector life cycle, the broad range of host species and the variable reservoir competence of different host species, estimating the parameter values of the model involves a high degree of uncertainty. Latin Hypercube Sampling (LHS) [11] was used, and partial rank correlation coefficients (PRCC) were computed, to identify the impact of uncertainties in the parameter estimates on the model prediction, namely the threshold ratio $R_d$, in the presence of the alternative species Eastern chipmunk without interspecific host competition. We varied each parameter within ±20% of its baseline value and used the 2000-2009 temperature data for the sensitivity analysis. We noticed that $R_d$ is particularly sensitive to variations in temperature, especially those in July, August and June. All the parameter values related to the main host (white-footed mice), such as the abundance, mortality and transmission probabilities between hosts and ticks, are important sources of variation in $R_d$. Parameters pertaining to alternative hosts also play a role in determining $R_d$. The analysis also suggests that $R_d$ is more sensitive to the nymphal biting bias coefficient $p_2$ than to the larval biting bias coefficient $p_1$.
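The LHS/PRCC procedure described above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: the three parameters, their baseline values and the output function `toy_output` are hypothetical stand-ins (the real analysis evaluates $R_d$ for the full model), and the regression is solved with plain normal equations to keep the example dependency-free.

```python
import random


def latin_hypercube(n, ranges, rng):
    """Draw n samples with exactly one point per equal-width stratum of each range."""
    columns = []
    for lo, hi in ranges:
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([lo + (hi - lo) * (s + rng.random()) / n for s in strata])
    return [list(row) for row in zip(*columns)]


def ranks(values):
    """Rank-transform a list (no ties expected for continuous samples)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = float(rank)
    return out


def residuals(y, xs):
    """Residuals of a least-squares fit of y on an intercept plus the columns in xs."""
    n = len(y)
    cols = [[1.0] * n] + xs
    k = len(cols)
    # Augmented normal equations, solved by Gauss-Jordan elimination with pivoting.
    a = [[sum(ci[m] * cj[m] for m in range(n)) for cj in cols]
         + [sum(ci[m] * y[m] for m in range(n))] for ci in cols]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    return [y[m] - sum(c * col[m] for c, col in zip(coef, cols)) for m in range(n)]


def pearson(u, v):
    """Pearson correlation coefficient of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def prcc(samples, outputs):
    """Partial rank correlation of each parameter with the output."""
    cols = [ranks(list(c)) for c in zip(*samples)]
    ry = ranks(outputs)
    result = []
    for j, cj in enumerate(cols):
        others = cols[:j] + cols[j + 1:]
        result.append(pearson(residuals(cj, others), residuals(ry, others)))
    return result


def toy_output(beta, attach, mu):
    """Hypothetical monotone stand-in for the model output (not the real R_d)."""
    return beta * attach / mu


baseline = [0.5, 0.05, 0.01]                         # made-up baseline values
ranges = [(0.8 * v, 1.2 * v) for v in baseline]      # +/-20% around each baseline
samples = latin_hypercube(200, ranges, random.Random(0))
coeffs = prcc(samples, [toy_output(*s) for s in samples])
print(coeffs)
```

For this toy output the coefficients of the first two parameters should come out strongly positive and that of the third strongly negative, which mirrors how the sign and size of a PRCC are read in a sensitivity ranking.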

4. Discussion

We summarized some findings, based on a temperature-driven Lyme disease model, on how climate and host community composition jointly affect tick distribution and pathogen invasion. There are two threshold values: the ecological threshold $R_v$, which predicts tick persistence, and the epidemiological threshold $R_d$ for pathogen transmission. These thresholds can predict successful pathogen invasion, and both are affected by abiotic factors (such as temperature) and biotic factors (such as host community composition).

The model was calibrated with two temperature datasets, corresponding to the 1961-1990 and 2000-2009 periods, from a tick-endemic area in Canada, and we calculated the threshold values for both periods. The model predicted a tick threshold $R_v = 1.6996$ and a pathogen threshold $R_d = 0.7585$ for the 1961-1990 period, and $R_v = 2.1915$ and $R_d = 1.0867$ for the 2000-2009 period. In both periods $R_v$ is larger than 1, and therefore the tick can successfully survive in the habitat. However, since $R_d$ is smaller than 1 for the 1961-1990 period, the pathogen transmission cycle cannot establish. For the 2000-2009 period, by contrast, $R_d$ exceeds one, which ensures that the pathogen will remain endemic in the habitat. Therefore, climate warming may facilitate tick invasion and pathogen transmission.

We also carried out simulations to demonstrate that host community diversity may trigger the dilution effect and the amplification effect [16]. Dilution or amplification was observed in our model simulations when competent or incompetent hosts were added. Interestingly, both dilution and amplification effects were observed with an incompetent reservoir host. We noticed that the predictions depend largely on the choice of the biting bias coefficients. Our predictions for adding an alternative host to the white-footed mouse community illustrate the importance of considering the details of the ecological mechanisms underlying host diversity.


Acknowledgements

This work was partially funded by the Canada Research Chairs program (CRC), the Natural Sciences and Engineering Research Council of Canada (NSERC), the Mathematics for Information Technology and Complex Systems (MITACS), the GEOmatics for Informed Decision (GEOIDE), and the Public Health Agency of Canada (PHAC).

References

1. T. Awerbuch-Friedlander, R. Levins and M. Predescu, The role of seasonality in the dynamics of deer tick populations, Bull. Math. Biol. 67, 467-486 (2005).
2. T. E. Awerbuch and S. Sandberg, Trends and oscillations in tick population dynamics, J. Theor. Biol. 175, 511-516 (1995).
3. N. Bacaër and S. Guernaoui, The epidemic threshold of vector-borne diseases with seasonality. The case of cutaneous leishmaniasis in Chichaoua, J. Math. Biol. 53, 421-436 (2006).
4. N. Bacaër, Approximation of the basic reproduction number R0 for vector-borne diseases with a periodic vector population, Bull. Math. Biol. 69, 1067-1091 (2007).
5. C. Bowman, A. B. Gumel, P. van den Driessche, J. Wu and H. Zhu, A mathematical model for assessing control strategies against West Nile virus, Bull. Math. Biol. 67, 1107-1133 (2005).
6. T. Caraco, S. Glavanakov, G. Chen, J. E. Flaherty, T. K. Ohsumi and B. K. Szymanski, Stage-structured infection transmission and a spatial epidemic: a model for Lyme disease, Am. Nat. 160, 348-359 (2002).
7. M. Ghosh and A. Pugliese, Seasonal population dynamics of ticks, and its influence on infection transmission: a semi-discrete approach, Bull. Math. Biol. 66, 1659-1684 (2004).
8. G. R. Hosack, P. A. Rossignol and P. van den Driessche, The control of vector-borne disease epidemics, J. Theor. Biol. 255, 16-25 (2008).
9. J. G. Kingsolver, Mosquito host choice and the epidemiology of malaria, Am. Nat. 130, 811-827 (1987).
10. K. LoGiudice, R. S. Ostfeld, K. A. Schmidt and F. Keesing, The ecology of infectious disease: Effects of host diversity and community composition on Lyme disease risk, Proc. Natl. Acad. Sci. 100, 567-571 (2003).
11. S. Marino, I. B. Hogue, C. J. Ray and D. E. Kirschner, A methodology for performing global uncertainty and sensitivity analysis in systems biology, J. Theor. Biol. 254, 178-196 (2008).
12. H. G. Mwambi, J. Baumgärtner and K. P. Hadeler, Ticks and tick-borne diseases: a vector-host interaction model for the brown ear tick (Rhipicephalus appendiculatus), Stat. Methods. Med. Res. 9, 279-301 (2000).
13. R. Norman, R. G. Bowers, M. Begon and P. J. Hudson, Persistence of tick-borne virus in the presence of multiple host species: tick reservoirs and parasite mediated competition, J. Theor. Biol. 200, 111-118 (1999).


14. N. H. Ogden, M. Bigras-Poulin, C. J. O'Callaghan, I. K. Barker, K. Kurtenbach, L. R. Lindsay and D. F. Charron, Vector seasonality, host infection dynamics and fitness of pathogens transmitted by the tick Ixodes scapularis, Parasitology 134, 209-227 (2007).
15. N. H. Ogden, M. Bigras-Poulin, C. J. O'Callaghan, I. K. Barker, L. R. Lindsay, A. Maarouf, K. E. Smoyer-Tomic, D. Waltner-Toews and D. Charron, A dynamic population model to investigate effects of climate on geographic range and seasonality of the tick Ixodes scapularis, Int. J. Parasitol. 35, 375-389 (2005).
16. N. H. Ogden and J. I. Tsao, Biodiversity and Lyme disease: Dilution or amplification? Epidemics 1, 196-206 (2009).
17. R. S. Ostfeld, Lyme Disease: The Ecology of a Complex System, New York, Oxford University Press, 2011.
18. S. E. Randolph, Epidemiological uses of a population model for the tick Rhipicephalus appendiculatus, Trop. Med. Int. Health. 4, A34-A42 (1999).
19. S. E. Randolph and D. J. Rogers, A generic population model for the African tick Rhipicephalus appendiculatus, Parasitology 115, 265-279 (1997).
20. R. Rosà and A. Pugliese, Effects of tick population dynamics and host densities on the persistence of tick-borne infections, Math. Biosci. 208, 216-240 (2007).
21. R. Rosà, A. Pugliese, R. Norman and P. J. Hudson, Thresholds for disease persistence in models for tick-borne infections including non-viraemic transmission, extended feeding and tick aggregation, J. Theor. Biol. 224, 359-376 (2003).
22. H. L. Smith, Monotone Dynamical Systems: An Introduction to the Theory of Competitive and Cooperative Systems, Math. Surveys Monogr. 41, AMS, Providence, RI, 1995.
23. W. Wang and X.-Q. Zhao, Threshold dynamics for compartmental epidemic models in periodic environments, J. Dyn. Differ. Equ. 20, 699-717 (2008).
24. X. Wu, V. R. S. K. Duvvuri, Y. Lou, N. H. Ogden, Y. Pelcat and J. Wu, Developing a temperature-driven map of the basic reproductive numbers of the emerging tick vector of Lyme disease Ixodes scapularis in Canada (published online in J. Theor. Biol.).
25. Y. Lou, J. Wu and X. Wu, submitted.
26. X. Wu, V. R. S. K. Duvvuri and J. Wu, Modeling dynamical temperature influence on the Ixodes scapularis population, 2010 International Congress on Environmental Modelling and Software, 2010.
27. J. Zhang, Z. Jin, G. Sun, X. Sun and S. Ruan, Erratum to: Modelling seasonal rabies epidemics in China, Bull. Math. Biol. 74, 1226-1251 (2012).
28. X.-Q. Zhao, Dynamical Systems in Population Biology, Springer-Verlag, New York, 2003.


QUANTIFYING THE RISK OF MOSQUITO-BORNE INFECTIONS BASING ON THE EQUILIBRIUM PREVALENCE IN HUMANS

MARCOS AMAKU
School of Veterinary Medicine, University of São Paulo, Brazil

FRANCISCO ANTONIO BEZERRA COUTINHO
School of Medicine, University of São Paulo, Brazil

EDUARDO MASSAD
School of Medicine, University of São Paulo, Brazil and London School of Hygiene and Tropical Medicine, UK

This paper proposes a general model for vector-borne infections that is flexible enough to comprise the dynamics of some known diseases transmitted by arthropods. From equilibrium analysis, we determined the number of infected vectors as an explicit function of the model’s parameters and the prevalence of infection in the hosts. From the analysis, it is also possible to derive the Basic Reproduction Number and the equilibrium force of infection as a function of those parameters and variables. From the force of infection, we were able to conclude that, depending on the disease’s structure and the model’s parameters, it is possible to estimate a risk quantifier for those diseases. The analysis is exemplified by the case of malaria.

1. Introduction

Vector-borne diseases such as malaria, dengue, yellow fever, plague, trypanosomiasis and leishmaniasis have been major causes of morbidity and mortality throughout human history.1 Currently, half of the world's population is infected with at least one type of vector-borne pathogen.2,3 A single mosquito-borne infection, dengue fever, affects the lives of 3.6 billion people worldwide.4 From the 17th through the early 20th centuries, human morbidity and mortality due to vector-borne diseases outstripped all other causes combined.5-7 By the 1960s the majority of vector-borne infections had been effectively controlled or targeted for intensive programmes. However, such programmes were discontinued in the 1970s because vector-borne infections were no longer considered major public health problems.7-11 As a consequence, in the 1980s the world observed a resurgence of old vector-borne diseases and the emergence of new ones.5

The historical paradigm of mosquito-borne infections, malaria, accounts for by far the most deaths of any human vector-borne disease, with approximately 300 million people infected and up to one million deaths every year.12,13 Explosive epidemics have also marked the resurgence of dengue and yellow fever,1 and nearly all of the most important vector-borne human diseases have exhibited dramatic changes in incidence and geographic range in recent decades.5

Vectors of human diseases are typically species of mosquitoes that are able to transmit viruses, bacteria, or parasites to humans and other warm-blooded hosts.14 Among the mosquito-borne infections, arthropod-borne viruses (arboviruses) comprise the largest class of vector-borne human pathogens, with more than 500 arboviruses described to date, 20 percent of which cause human diseases.5,15,16 Examples of arboviral diseases include dengue and dengue haemorrhagic fever, yellow fever, Rift Valley fever, West Nile virus and Japanese encephalitis, among others.5,16,17

Approximately 80 percent of vector-borne disease transmission typically occurs among 20 percent of the host populations.18,19 Thus, the impact of vector-borne infections falls disproportionately on tropical and subtropical countries.2 Unfortunately, this considerable economic, ecological, and public health impact of vector-borne diseases is expected to continue, given limited capabilities for detecting, identifying and addressing likely epidemics.1

Therefore, much remains to be known about the dynamics of vector-borne infections, and the need to better understand the disease transmission potential should motivate the development of quantitative tools to describe those rich dynamics. In addition, mathematical models can provide public health authorities with predictive tools that can guide the design of control strategies.

The risk of acquiring vector-borne infections has usually been estimated as a direct and linear function of endemicity levels, defined as the prevalence of infected individuals. For instance, the map of malaria endemicity is defined as the proportion of people with circulating parasites (the so-called, and erroneously named, "parasite rate"). This is illustrated by the work of Hay et al.,20 who estimated the parasite rate with a sophisticated statistical model. The results of this exercise stratified the world into


three classes: the spatial representation of no risk, unstable risk (P. falciparum annual parasite incidence [PfAPI] < 0.1 per 1,000 people per annum [pa]), and stable risk (PfAPI ≥ 0.1 per 1,000 people pa) of P. falciparum transmission for 2007.20

This paper proposes a general model for vector-borne infections that is flexible enough to comprise the dynamics of some known diseases transmitted by arthropods. From equilibrium analysis, we determined the number of infected vectors as an explicit function of the model's parameters and the prevalence of infection in the hosts. From the analysis, it is also possible to derive the Basic Reproduction Number and the equilibrium force of infection as a function of those parameters and variables. From the force of infection, we were able to conclude that, depending on the disease's structure and the model's parameters, it is possible to estimate a risk quantifier for those diseases. The analysis is exemplified by the case of malaria.

2. The Model

The basic model that is used to calculate the efficiency of control strategies can be found in Refs. 11, 21 and 22. The populations involved in the transmission are human hosts, mosquitoes and their eggs. For the purposes of this paper, the term "eggs" also includes the intermediate stages, such as larvae and pupae. Therefore, the population densities are divided into the compartments described in Table 1.

Table 1. Model variables and their biological meanings.

Variable   Biological meaning
S_H        Susceptible humans
L_H        Latent humans
I_H        Infectious humans
R_H        Recovered humans
S_M        Uninfected mosquitoes
L_M        Latent mosquitoes
I_M        Infectious mosquitoes
S_E        Uninfected eggs (immature stages)
I_E        Infected aquatic forms

The model is defined by the following equations:


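\[
\begin{aligned}
\frac{dS_H}{dt} &= -a b I_M \frac{S_H}{N_H} + \sigma_H R_H + \theta_H I_H - \mu_H S_H + r_H N_H \left(1 - \frac{N_H}{\kappa_H}\right) \\
\frac{dL_H}{dt} &= a b I_M \frac{S_H}{N_H} - (\mu_H + \delta_H) L_H \\
\frac{dI_H}{dt} &= \delta_H L_H - (\mu_H + \alpha_H + \gamma_H + \theta_H) I_H \\
\frac{dR_H}{dt} &= \gamma_H I_H - \mu_H R_H - \sigma_H R_H \\
\frac{dS_M}{dt} &= p\, c_S(t) S_E - \mu_M S_M - a c S_M \frac{I_H}{N_H} \\
\frac{dL_M}{dt} &= a c S_M \frac{I_H}{N_H} - \gamma_M L_M - \mu_M L_M \\
\frac{dI_M}{dt} &= \gamma_M L_M - \mu_M I_M + p\, c_S(t) I_E \\
\frac{dS_E}{dt} &= \left[ r_M S_M + (1 - g) r_M (I_M + L_M) \right] \left(1 - \frac{S_E + I_E}{\kappa_E}\right) - \mu_E S_E - p\, c_S(t) S_E \\
\frac{dI_E}{dt} &= \left[ g\, r_M (I_M + L_M) \right] \left(1 - \frac{S_E + I_E}{\kappa_E}\right) - \mu_E I_E - p\, c_S(t) I_E ,
\end{aligned}
\tag{1}
\]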

where

\[
N_H = S_H + L_H + I_H + R_H, \qquad N_M = S_M + L_M + I_M, \qquad N_E = S_E + I_E,
\]

and $c_S(t) = d_1 - d_2 \sin(2\pi f t + \phi)$ is a factor mimicking seasonal influences in the mosquito population.23,24 The model's parameters are described in Table 2.

In addition, we separate from the general human population (individuals that reside in the area) a cohort,25,26 denoted by primes and called the "probe", which is followed through its entire exposure to calculate the risk of infection. The evolution equation for the probe cohort is:

\[
\frac{dS'_H}{dt} = -a' b' I_M \frac{S'_H}{N_H}\, \theta(t - t_0) \tag{2}
\]

for

\[
a' = \mathrm{POISSON}(0.3), \qquad b' = \mathrm{GAMMA}(0.088, 0.017),
\]


and θ(t − t_0) is the Heaviside function. Note that Eq. (2) considers no competing risks.

Table 2. Model's parameters and their biological significance.

Parameter   Biological meaning
a           Average daily rate of biting
b           Fraction of bites actually infective to humans
σ_H         Loss of immunity rate
δ_H         Latency rate in humans
θ_H         Loss of infectiousness in humans
µ_H         Human natural mortality rate
r_H         Birth rate of humans
κ_H         Carrying capacity of humans
α_H         Dengue mortality in humans
γ_H         Human recovery rate
p           Hatching rate of susceptible eggs
γ_M         Latency rate in mosquitoes
µ_M         Natural mortality rate of mosquitoes
r_M         Oviposition rate
g           Proportion of infected eggs
κ_E         Carrying capacity of eggs
µ_E         Natural mortality rate of eggs
c           Fraction of bites actually infective to mosquitoes
c_S         Climatic factor
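The right-hand side of system (1) can be sketched in code as follows. This is a minimal illustration in plain Python, not the authors' implementation: the dictionary keys, the helper names `seasonal_factor`, `model_rhs` and `rk4_step`, and every numerical value in `EXAMPLE_PARAMS` are hypothetical placeholders chosen only so that the code runs. The sketch also assumes that infectious mosquitoes die at the mosquito mortality rate µ_M.

```python
import math

# Hypothetical placeholder values (per day); they are NOT taken from the paper.
EXAMPLE_PARAMS = {
    "a": 0.3, "b": 0.1, "c": 0.5,
    "sigmaH": 0.0, "deltaH": 0.1, "thetaH": 0.0, "gammaH": 0.1, "alphaH": 1e-3,
    "muH": 1.0 / (70 * 365), "rH": 1.0 / (35 * 365), "kappaH": 1e4,
    "p": 0.15, "gammaM": 0.1, "muM": 0.1, "rM": 5.0, "g": 0.0,
    "kappaE": 1e4, "muE": 0.1,
    "d1": 1.0, "d2": 0.1, "f": 1.0 / 365.0, "phi": 0.0,
}

def seasonal_factor(t, d1, d2, f, phi):
    """Climatic factor c_S(t) = d1 - d2 * sin(2*pi*f*t + phi)."""
    return d1 - d2 * math.sin(2.0 * math.pi * f * t + phi)

def model_rhs(t, y, q):
    """Time derivatives of (S_H, L_H, I_H, R_H, S_M, L_M, I_M, S_E, I_E)."""
    SH, LH, IH, RH, SM, LM, IM, SE, IE = y
    NH = SH + LH + IH + RH
    hatch = q["p"] * seasonal_factor(t, q["d1"], q["d2"], q["f"], q["phi"])
    to_humans = q["a"] * q["b"] * IM * SH / NH   # new human infections
    to_mosq = q["a"] * q["c"] * SM * IH / NH     # new mosquito infections
    room = 1.0 - (SE + IE) / q["kappaE"]         # logistic limit on eggs
    dSH = (-to_humans + q["sigmaH"] * RH + q["thetaH"] * IH - q["muH"] * SH
           + q["rH"] * NH * (1.0 - NH / q["kappaH"]))
    dLH = to_humans - (q["muH"] + q["deltaH"]) * LH
    dIH = (q["deltaH"] * LH
           - (q["muH"] + q["alphaH"] + q["gammaH"] + q["thetaH"]) * IH)
    dRH = q["gammaH"] * IH - (q["muH"] + q["sigmaH"]) * RH
    dSM = hatch * SE - q["muM"] * SM - to_mosq
    dLM = to_mosq - (q["gammaM"] + q["muM"]) * LM
    # Assumption: infectious mosquitoes die at the mosquito rate mu_M.
    dIM = q["gammaM"] * LM - q["muM"] * IM + hatch * IE
    dSE = ((q["rM"] * SM + (1.0 - q["g"]) * q["rM"] * (IM + LM)) * room
           - (q["muE"] + hatch) * SE)
    dIE = q["g"] * q["rM"] * (IM + LM) * room - (q["muE"] + hatch) * IE
    return [dSH, dLH, dIH, dRH, dSM, dLM, dIM, dSE, dIE]

def rk4_step(t, y, h, q):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = model_rhs(t, y, q)
    k2 = model_rhs(t + h / 2, [v + h / 2 * k for v, k in zip(y, k1)], q)
    k3 = model_rhs(t + h / 2, [v + h / 2 * k for v, k in zip(y, k2)], q)
    k4 = model_rhs(t + h, [v + h * k for v, k in zip(y, k3)], q)
    return [v + h / 6 * (a + 2 * b + 2 * c + d)
            for v, a, b, c, d in zip(y, k1, k2, k3, k4)]
```

Repeated calls to `rk4_step` with a small step (for example a tenth of a day) advance the state in time. Setting `d2` to zero removes the seasonal forcing, which is the situation assumed in the equilibrium analysis.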

Model (1) is a general model for some known vector-borne infections. Therefore, depending on the values of some parameters, the model can describe any type of dynamics in the human subpopulation, such as seen in Table 3. The model can also include vaccination, for instance, against yellow fever or dengue, but this subject will not be treated in this work.

From system (1), it is possible to determine the equilibrium densities of the variables of interest. We carried out a detailed equilibrium analysis in a related article.27 For our purposes, we calculate the equilibrium density $I_M^*$, the number of infected mosquitoes:

\[
I_M^* = \frac{(\delta_H + \mu_H)(\mu_H + \gamma_H + \alpha_H + \sigma_H)\, I_H^*}{a b\, \delta_H \left\{ 1 - \left[ \dfrac{(\mu_H + \sigma_H)(\mu_H + \gamma_H + \alpha_H + \delta_H + \theta_H) + \gamma_H \delta_H}{\delta_H (\mu_H + \sigma_H)} \right] \dfrac{I_H^*}{N_H^*} \right\}} . \tag{3}
\]


Table 3. Model's structure as a function of the parameters.

Model's structure   δ_H    γ_H    σ_H    θ_H
SI                  → ∞    0      0      0
SIS                 → ∞    0      0      ≠ 0
SIR                 → ∞    ≠ 0    0      0
SIRS                → ∞    ≠ 0    ≠ 0    0
SEIR                ≠ 0    ≠ 0    0      0
SEIRS               ≠ 0    ≠ 0    ≠ 0    0
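As a worked check on the first column of Table 3, the limit δ_H → ∞ can be taken directly in system (1). The following short derivation is an illustration added here, with $\lambda^* = a b I_M^* / N_H^*$ used as shorthand for the equilibrium force of infection; it relies only on the second equation of system (1) and on the definition of Q in Eq. (5).

\[
L_H^* = \frac{\lambda^* S_H^*}{\mu_H + \delta_H} \;\longrightarrow\; 0,
\qquad
\delta_H L_H^* = \frac{\delta_H}{\mu_H + \delta_H}\, \lambda^* S_H^* \;\longrightarrow\; \lambda^* S_H^*
\qquad (\delta_H \to \infty),
\]

so newly infected hosts pass straight into the infectious class and the latent compartment disappears, which is what distinguishes the SI, SIS, SIR and SIRS rows from the SEIR and SEIRS rows. In the same limit,

\[
Q = \frac{(\mu_H + \sigma_H)(\mu_H + \gamma_H + \alpha_H + \delta_H + \theta_H) + \gamma_H \delta_H}{\delta_H (\mu_H + \sigma_H)}
\;\longrightarrow\; 1 + \frac{\gamma_H}{\mu_H + \sigma_H},
\]

which reduces to Q = 1 when, in addition, γ_H = 0 (the SI and SIS rows).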

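Replacing the values of $I_H^*$ and $N_H^*$ given below in Eq. (3), it is possible to see that $I_M^*$ increases with the biting rate a.

The expressions for $I_H^*$ and $N_H^*$, in terms of the model's parameters, that appear in Eq. (3) are:

\[
\frac{I_H^*}{N_H^*} = \frac{(\gamma_M + g \mu_M)\, a^2 b c\, \dfrac{N_M^*}{N_H^*} - Q (\mu_M + \gamma_M)\, \mu_M (1 - g)}{(\gamma_M + g \mu_M)\, a^2 b c\, \delta_H\, \dfrac{N_M^*}{N_H^*}\, Z + a c\, Q (\mu_M + \gamma_M)} , \tag{4}
\]

which is the equilibrium prevalence of the infection in humans, and where

\[
Q = \frac{(\mu_H + \sigma_H)(\mu_H + \gamma_H + \alpha_H + \delta_H + \theta_H) + \gamma_H \delta_H}{\delta_H (\mu_H + \sigma_H)} , \tag{5}
\]

\[
Z = \frac{(\mu_H + \delta_H)(\mu_H + \gamma_H + \alpha_H + \sigma_H)}{\delta_H} , \tag{6}
\]

and

\[
N_M^* = \frac{p\, c_S}{\mu_M}\, \kappa_E \left[ 1 - \frac{\mu_M (\mu_E + p\, c_S)}{r_M\, p\, c_S} \right] . \tag{7}
\]

The calculation of the total human population expression at equilibrium, $N_H^*$, is slightly more complicated and results in:

\[
N_H^* = \frac{-B + \sqrt{B^2 - 4AC}}{2A} , \tag{8}
\]

where

\[
\begin{aligned}
A &= a c\, r_H \Phi , \\
B &= -a c\, \Phi \kappa_H (r_H - \mu_H) + \Gamma Z r_H - \Phi \mu_M (1 - g)\, \alpha_H \kappa_H , \\
C &= -\Gamma \kappa_H (r_H - \mu_H) Z + \Gamma \alpha_H \kappa_H
\end{aligned}
\tag{9}
\]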


and

\[
\Phi = Q (\gamma_M + \mu_M) , \qquad \Gamma = (\gamma_M + g \mu_M)\, a^2 b c\, \delta_H N_M^* . \tag{10}
\]
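The equilibrium mosquito population in Eq. (7) can be recovered from system (1) in a few lines. The derivation below is a reader's sketch rather than the authors' own presentation; it assumes that the climatic factor $c_S$ is held constant and that all adult mosquito compartments die at the rate µ_M. Adding the three mosquito equations and the two egg equations of system (1) gives

\[
\frac{dN_M}{dt} = p\, c_S N_E - \mu_M N_M ,
\qquad
\frac{dN_E}{dt} = r_M N_M \left( 1 - \frac{N_E}{\kappa_E} \right) - (\mu_E + p\, c_S) N_E .
\]

Setting both derivatives to zero, the first equation yields $N_E^* = \mu_M N_M^* / (p\, c_S)$. Substituting this into the second and dividing by $N_M^*$ gives

\[
1 - \frac{N_E^*}{\kappa_E} = \frac{\mu_M (\mu_E + p\, c_S)}{r_M\, p\, c_S}
\quad \Longrightarrow \quad
N_M^* = \frac{p\, c_S}{\mu_M}\, N_E^* = \frac{p\, c_S}{\mu_M}\, \kappa_E \left[ 1 - \frac{\mu_M (\mu_E + p\, c_S)}{r_M\, p\, c_S} \right] ,
\]

which is Eq. (7). Under these assumptions a positive mosquito population exists only when $r_M\, p\, c_S > \mu_M (\mu_E + p\, c_S)$.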

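From Eq. (4), it is possible to deduce the expression of the Basic Reproduction Number of model (1) as:11,28,29

\[
R_0 = \frac{(\gamma_M + g \mu_M)\, a^2 b c\, \dfrac{N_M(0)}{N_H(0)}}{\mu_M (1 - g)(\mu_M + \gamma_M) \left[ \dfrac{(\mu_H + \sigma_H)(\mu_H + \gamma_H + \alpha_H + \delta_H + \theta_H) + \gamma_H \delta_H}{\delta_H (\mu_H + \sigma_H)} \right]} , \tag{11}
\]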

where $N_M(0)$ and $N_H(0)$ are the populations of vectors and hosts calculated in the absence of the infection.

3. Estimating Risks

In order to calculate the probability π of an individual acquiring an infection after the introduction of a single case in an entirely susceptible population, we consider the probe cohort followed through an entire outbreak. The probability of infection in this self-limiting outbreak is then given by the following expression:

\[
\pi = \frac{\int_0^{\infty} S'_H(t)\, \lambda'(t)\, dt}{N'_H(0)} . \tag{12}
\]

In the above expression, $S'_H(t)$ and $N'_H(t)$ are respectively the number of susceptible hosts and the total population of the cohort used as a probe, and $\lambda'(t)$ is the force of infection to which the probe is subject, defined as the per capita number of new cases per time unit30 and expressed as:

\[
\lambda'(t) = a' b'\, \frac{I_M(t)}{N_H(t)} , \tag{13}
\]

where $I_M(t)$ is the number of infected mosquitoes.

One can also calculate the average risk (probability) of infection, $\pi_1$, for a single individual who arrives in the affected region at week Ω after the outbreak is triggered and remains there for ω weeks. This is done by setting the limits of integration in Eq. (12) as:

\[
\pi_1 = \frac{\int_{\Omega}^{\Omega + \omega} S'_H(t)\, \lambda'(t)\, dt}{N'_H(\Omega)} . \tag{14}
\]
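To make Eq. (14) concrete, the integral can be evaluated numerically once a force of infection λ'(t) is available. The sketch below is a deterministic illustration only: it does not reproduce the random draws of a' and b' used in the paper, it assumes the whole probe cohort is susceptible on arrival, and the function names and the example curve `seasonal_lambda` are hypothetical. The probe cohort is depleted according to Eq. (2), with λ' treated as constant within each small time step.

```python
import math

def probe_risk(force_of_infection, arrival, stay, steps=1000):
    """Approximate pi_1 of Eq. (14) for a visitor arriving at `arrival`
    and staying for `stay` time units (weeks in the text)."""
    h = stay / steps
    susceptible = 1.0  # S'_H / N'_H at arrival: everyone still susceptible
    risk = 0.0
    for n in range(steps):
        t_mid = arrival + (n + 0.5) * h
        # Eq. (2) with lambda' frozen at the midpoint of the step
        survived = susceptible * math.exp(-force_of_infection(t_mid) * h)
        risk += susceptible - survived  # integral of S' * lambda' over the step
        susceptible = survived
    return risk

def seasonal_lambda(t):
    """Hypothetical weekly force of infection with a yearly cycle."""
    return 0.01 * (1.0 + 0.5 * math.sin(2.0 * math.pi * t / 52.0))
```

For a constant force of infection λ the routine returns 1 − exp(−λω), which shows that the risk saturates rather than growing linearly with the length of stay. A call such as `probe_risk(seasonal_lambda, arrival=10, stay=4)` gives the risk for a four-week visit starting in week 10 under the hypothetical curve.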


In Figure 1, we show the result of a numerical simulation of Eq. (14) for 20 stochastic realizations from Ref. 26.

Figure 1. Numerical simulation of Eq. (14).

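The force of infection (the incidence density rate) for humans at the equilibrium is given by:

\[
\lambda^* = \frac{(\delta_H + \mu_H)(\mu_H + \gamma_H + \alpha_H + \sigma_H)\, \dfrac{I_H^*}{N_H^*}}{\delta_H \left\{ 1 - \left[ \dfrac{(\mu_H + \sigma_H)(\mu_H + \gamma_H + \alpha_H + \delta_H + \theta_H) + \gamma_H \delta_H}{\delta_H (\mu_H + \sigma_H)} \right] \dfrac{I_H^*}{N_H^*} \right\}} . \tag{15}
\]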

Hence, provided the values of the parameters in Eq. (15) are known, it is possible to calculate the value of the force of infection from equilibrium prevalence data. The force of infection can then be used to estimate the per capita risk of infection, according to Eq. (14). Note that the force of infection is non-linearly associated with the human prevalence. Therefore, for instance, if a given region has an endemic prevalence twice as high as


another, the per capita risk of acquiring the infection is not twice as high but will depend on a very complicated relationship between the force of infection, the system’s parameters and the equilibrium prevalence of the infection in the host.
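The non-linearity can be seen by coding Eq. (15) directly. With Q and Z as printed in Eqs. (5) and (6), Eq. (15) reads λ* = Z x / (1 − Q x), where x is the equilibrium prevalence I_H*/N_H*. The sketch below is illustrative only: the function names are hypothetical, and the parameter values in the usage note are arbitrary placeholders rather than estimates for any disease.

```python
def q_factor(muH, sigmaH, gammaH, alphaH, deltaH, thetaH):
    """Q as printed in Eq. (5)."""
    return (((muH + sigmaH) * (muH + gammaH + alphaH + deltaH + thetaH)
             + gammaH * deltaH) / (deltaH * (muH + sigmaH)))

def z_factor(muH, sigmaH, gammaH, alphaH, deltaH):
    """Z as printed in Eq. (6)."""
    return (muH + deltaH) * (muH + gammaH + alphaH + sigmaH) / deltaH

def equilibrium_foi(prevalence, Q, Z):
    """Equilibrium force of infection of Eq. (15), lambda* = Z x / (1 - Q x).
    The denominator vanishes at x = 1/Q, so larger prevalences are rejected."""
    if not 0.0 <= prevalence < 1.0 / Q:
        raise ValueError("prevalence must lie in [0, 1/Q)")
    return Z * prevalence / (1.0 - Q * prevalence)
```

As a usage example with placeholder daily rates, take `Q = q_factor(1/(70*365), 1/365, 0.01, 0.0, 1/12, 0.0)` and `Z = z_factor(1/(70*365), 1/365, 0.01, 0.0, 1/12)`. Comparing `equilibrium_foi(0.10, Q, Z)` with `equilibrium_foi(0.05, Q, Z)` gives a ratio larger than 2, because the factor 1 − Q x in the denominator shrinks as the prevalence grows. The same denominator shows that, in Eq. (15), the equilibrium prevalence cannot reach 1/Q.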

4. Discussion

Since the seminal work by Ronald Ross, mathematical models have provided a great deal of theoretical support for understanding the complex dynamics of vector-borne infections, in addition to the important role those models have played in designing and assessing control strategies.31 Key concepts like the Basic Reproduction Number, Vectorial Capacity and the Force of Infection, derived from the theoretical works on vector-borne infections, are currently central to the quantification of transmission, as well as to the proposal of public health measures to control them.11

In this work, we propose a general, although very sketchy, model that considers all the aspects related to the dynamics of mosquito-borne microparasites. From the equilibrium analysis, we calculated the prevalence of the infection in the host populations, from which the number of infected vectors was deduced. In addition, we deduced explicit expressions for the Basic Reproduction Number and the equilibrium force of infection. It was then possible to demonstrate that, provided an equilibrium is reached, each mosquito-borne microparasite has a maximum host prevalence, depending on the disease's structure and on the values of the parameters. This analysis was exemplified by the calculation of the maximum equilibrium prevalences of malaria and dengue. However, once the disease's structure is determined and the values of the parameters are known, it is possible to calculate the maximum equilibrium prevalence of any mosquito-borne microparasitic infection.

It may be argued that malaria is not exactly a good example of a microparasitic infection. However, although malaria can behave sometimes as a microparasite and sometimes as a macroparasite,30 in the specific context of the proposed model it can be considered a microparasitic disease.

Another important limitation of our approach is that, in order to calculate the equilibrium densities of each of the model's variables, we have to neglect seasonal fluctuations, which can be very important in the transmission dynamics of infections such as dengue. However, seasonality in some tropical areas is not too important, and the results can be applied to the average trend in prevalence levels.


Finally, a comment on an important aspect of Eq. (11) for the basic reproduction number, R0. This expression has a discontinuity when g = 1, that is, when 100% of the eggs are laid infected. This is a theoretical possibility, and when g → 1 there is a structural change in our model: the populations of susceptible and infected eggs become completely decoupled. It can be verified that the disease is able to sustain itself even without human hosts. As a matter of fact, as previously demonstrated,32 this is the only way the infection circulates exclusively among vectors without the hosts. In addition, when g = 1 and the human hosts are introduced into the system, then since all the eggs of infected mosquitoes are infected, the time evolution leads to a situation where all mosquitoes are infected. Therefore, when g = 1 and human hosts are introduced, the population of susceptible mosquitoes and eggs goes to zero. In any case, there is no known infection that is 100% transmitted to mosquitoes' eggs.

Acknowledgments

The research from which these results were obtained has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 282589, from LIM01 HCFMUSP and CNPq. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Conflicts of Interest

The authors have declared that no competing interests exist.

References

1. S. M. Lemon, P. F. Sparling, M. A. Hamburg, Vector-Borne Diseases: Understanding the Environmental, Human Health, and Ecological Connections. Washington. The National Academies Press (2008).
2. CIESIN (Center for International Earth Science Information Network) (2012). Changes in the incidence of vector-borne diseases attributable to climate change. http://www.ciesin.columbia.edu/TG/HH/veclev2.html. Accessed in 25 July 2012.
3. WHO (World Health Organization) (2004). Global strategic framework for integrated vector management. http://www.emro.who.int/rbm/PDF/GlobalStratFrameIVM.pdf.
4. A. Wilder-Smith, K. E. Renhorn, H. Tissera, S. Abu Bakar, L. Alphey, P. Kittayapong, S. Lindsay, J. Logan, C. Hatz, P. Reiter, J. Rocklöv, P. Byass, V. R. Louis, Y. Tozan, E. Massad, A. Tenorio, C. Lagneau, G. L'ambert, D. Brooks, J. Wegerdt, D. Gubler, DengueTools: innovative tools and strategies for the surveillance and control of dengue. Glob Health Action 5, doi: 10.3402/gha.v5i0.17273 (2012).
5. D. J. Gubler, Resurgent vector-borne diseases as a global health problem. Emerg. Infect. Dis. 4, 442-450 (1998).
6. D. J. Gubler, The global emergence/resurgence of arboviral diseases as public health problems. Arch. Med. Res. 33, 330-342 (2002).
7. D. J. Gubler, The changing epidemiology of yellow fever and dengue, 1990 to 2003: full cycle? Comp. Immunol. Microbiol. Infect. Dis. 27, 319-330 (2004).
8. D. J. Gubler, Aedes aegypti and Aedes aegypti-borne disease control in the 1990s: top down or bottom up. Am. J. Trop. Med. Hyg. 40, 571-578 (1989).
9. D. J. Gubler, M. L. Wilson, The global resurgence of vector-borne diseases: lessons learned from successful and failed adaptation. In: Integration of public health with adaptation to climate change: lessons learned and new directions, edited by K. L. Ebi, J. Smith and I. Burton. London. Taylor and Francis, pp 44-59 (2005).
10. IOM (Institute of Medicine), Emerging infections: microbial threats to health in the United States. Washington. The National Academies Press (1992).
11. L. F. Lopez, F. A. B. Coutinho, M. N. Burattini, E. Massad, Threshold conditions for infection persistence in complex host-vectors interactions. C. R. Biol. 325, 1073-1084 (2002).
12. K. Karunamoorthi, Global malaria burden: Socialomics implications. J. Socialomics 1, 2 (2012). http://dx.doi.org/10.4172/jsc.1000e108.
13. C. J. L. Murray, L. C. Rosenfeld, S. S. Lim, K. G. Andrews, K. J. Foreman, D. Haring, N. Fullman, M. Naghavi, R. Lozano, A. D. Lopez, Global malaria mortality between 1980 and 2010: a systematic analysis. The Lancet 379 (9814), 413-431 (2012). doi:10.1016/S0140-6736(12)60034-8.
14. O. P. Forattini, I. Kakitani, E. Massad, D. Marucci, Studies on mosquitoes (Diptera: Culicidae) and anthropic environment 2. Immature stages research at a rice irrigation system location in South-Eastern Brazil. Rev. Saude Publica 27, 227-236 (1993).
15. S. M. Gray, N. Banerjee, Mechanisms of arthropod transmission of plant and animal viruses. Microbiol. Mol. Biol. Rev. 63, 128-148 (1999).
16. WHO (World Health Organization) (2005). Vector-borne viral infections. In: State of the art vaccine research and development. http://www.who.int/vaccine_research/documents/Vector_Borne_Viral_Infection.pdf.
17. CDC (Centers for Diseases Control) (2005). Information of arboviral encephalitis. http://www.cdc.gov/ncidod/dvbid/arbor/arbdet.htm.
18. D. L. Smith, J. Dushoff, R. W. Snow, S. I. Hay, The entomological inoculation rate and Plasmodium falciparum infection in African children. Nature 438 (7067), 492-495 (2005).
19. M. E. J. Woolhouse, C. Dye, J. F. Etard, J. D. Smith, J. D. Charlwood, G. P. Garnett, P. Hagan, J. L. Hii, P. D. Ndhlovu, R. J. Quinell, C. H. Watts, S. K. Chandiwana, R. M. Anderson, Heterogeneities in the transmission of infectious agents: implications for the design of control programs. Proc. Natl. Acad. Sci. U. S. A. 94, 338-342 (1997).
20. S. I. Hay, C. A. Guerra, P. W. Gething, A. P. Patil, et al., A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Med. 6 (3), e1000048 (2009). doi:10.1371/journal.pmed.1000048
21. F. A. B. Coutinho, M. N. Burattini, L. F. Lopez, E. Massad, An approximate threshold condition for non-autonomous system: An application to a vector-borne infection. Math. Comput. Simul. 70, 149-158 (2005).
22. F. A. B. Coutinho, M. N. Burattini, L. F. Lopez, E. Massad, Threshold conditions for a non-autonomous epidemic system describing the population dynamics of dengue. Bull. Math. Biol. 68, 2263-2282 (2006).
23. M. N. Burattini, M. Chen, A. Chow, F. A. B. Coutinho, K. T. Goh, L. F. Lopez, S. Ma, E. Massad, Modelling the control strategies against dengue in Singapore. Epidemiol. Infect. 136, 309-319 (2008).
24. E. Massad, F. A. B. Coutinho, L. F. Lopez, D. R. da Silva, Modeling the impact of global warming on vector-borne infections. Phys. Life Rev. 8 (2), 169-199 (2011).
25. E. Massad, F. A. B. Coutinho, M. N. Burattini, L. F. Lopez, C. J. Struchiner, Yellow fever vaccination: how much is enough? Vaccine 23, 3908-3914 (2005).
26. E. Massad, R. H. Behrens, M. N. Burattini, F. A. B. Coutinho, Modeling the risk of malaria for travellers to areas with stable malaria transmission. Malaria Journal 8, 296 (2009). doi:10.1186/1475-2875-8-296.
27. M. Amaku, M. N. Burattini, F. A. B. Coutinho, L. F. Lopez, S. M. Raimundo, E. Massad, A comparative analysis of the relative efficacy of vector-control strategies against dengue fever. Submitted for publication.
28. G. MacDonald, The analysis of equilibrium in malaria. Trop. Dis. Bull. 49, 813-828 (1952).
29. E. Massad, F. A. B. Coutinho, H. M. Yang, H. B. de Carvalho, F. Mesquita, M. N. Burattini, The basic reproduction ratio of HIV among intravenous drug users. Math. Biosci. 123, 227-247 (1994).
30. R. M. Anderson, R. M. May, Infectious Diseases in Humans: Dynamics and Control. Oxford. Oxford University Press (1991).
31. E. Massad, F. A. B. Coutinho, Vectorial capacity, basic reproduction number, force of infection and all that: formal notation to complete and adjust their classical concepts and equations. Mem. Inst. Oswaldo Cruz 107 (4), 564-567 (2012).
32. B. Adams, M. Boots, How important is vertical transmission in mosquitoes for the persistence of dengue? Insights from a mathematical model. Epidemics 2, 1-10 (2010).


SEASONAL FLUCTUATION IN TSETSE FLY POPULATIONS AND HUMAN AFRICAN TRYPANOSOMIASIS: A MATHEMATICAL MODEL

T. MADSEN, D. I. WALLACE, N. ZUPAN
Dartmouth College, Hanover, NH 03755, USA

Human African trypanosomiasis, commonly known as sleeping sickness, is a vector-borne disease endemic to Sub-Saharan Africa. An estimated 55 million people are at risk, and the World Health Organization classifies it as one of the world's neglected tropical diseases. We develop a model of the dynamics of one species of vector, Glossina tachinoides, which incorporates the impact of seasonal temperature fluctuation on the life cycle of the disease vector.

1. Introduction

Human African Trypanosomiasis (abbreviated HAT and commonly known as sleeping sickness) is an endemic public health threat to Sub-Saharan Africa. Classified by the World Health Organization (WHO) as a neglected tropical disease 1, HAT is a protozoan parasitic infection borne by over 30 species of tsetse fly 1. There are two known forms of the infection: one caused by the protozoan Trypanosoma brucei gambiense and the other caused by Trypanosoma brucei rhodesiense 2. T. b. gambiense is responsible for 97% of all HAT infections 3 and causes the chronic form of the disease, which can be asymptomatic for months or years 2. Infection by T. b. gambiense is fatal if left untreated 4. The clinical treatment of HAT is difficult; by the time the infection presents symptoms, few drugs are capable of fighting it, and those that exist are unpleasant and potentially life-threatening 5. Because the infection can remain asymptomatic for so long, surveillance of HAT is difficult; only 16,000 cases are reported per annum 2. Despite these figures, the WHO estimates that 55 million people are at risk and that, with proper surveillance, there would be 300,000-500,000 reported cases and 50,000 deaths per annum 6.

Furthermore, similar protozoa (notably T. b. brucei, T. congolense, and T. vivax) cause animal African trypanosomiasis (AAT), an infection that has a major impact on agricultural production in the region 7. All trypanosomiases infect mammals exclusively 8.

Trypanosomiasis depends completely on the tsetse fly as a vector, with various stages taking place in both the mammalian host and the insect vector. Trypanosomes multiply in mammalian hosts, and are taken up when the fly takes a blood meal 2. They then mature and migrate to the salivary glands of the fly, which permits transmission back to a mammalian host during a later blood meal 2. These twenty to thirty fly species are highly localized, so the specific vector of HAT varies greatly from region to region 8. Certain species of tsetse are more vulnerable than others to infection 8. Adding to this complexity is the fact that non-human mammals, both domestic and wild, can serve as a reservoir for trypanosomiases including HAT, even if these hosts are not directly affected by the disease 9. Indeed, the prevalence of T. b. gambiense, which afflicts only humans, is in some areas much higher in non-human reservoirs than it is in the human population 8. Furthermore, different species of tsetse fly prefer different hosts 10. The differences in behavior between tsetse flies of different regions make it difficult to develop general mathematical models of infection prevalence.

Despite the inherent challenges, one such general model of trypanosomiasis prevalence has been developed by Rogers 8. This model is composed of a system of ordinary differential equations describing the prevalence rates of one species of trypanosome in one vector species and two host species (either human and domestic animal in the case of HAT or wild and domestic animal in the case of animal trypanosomiasis). Trypanosome transmission rates are considered to be a function of biting rates, the proportion of infected vectors and hosts, the proportion of bites resulting in transmission, and the ratio of vectors to hosts 5.
For simplicity, the model assumed that the populations of vector and hosts remained constant over time, as did other relevant factors such as temperature and humidity. Values for parameters that varied significantly by fly species (such as the duration of the tsetse feeding cycle and the average adult life expectancy) were calculated as a crude average. Consequently, this model is useful as a general model, but lacks the specificity to make strong predictions about any particular region or tsetse species. In this paper we develop a model of insect dynamics based on the behavior and lifecycle of a particular species of tsetse fly. Such a model may be coupled with the disease model of Rogers to give a better picture of disease dynamics.


2. Insect population submodel

A species of tsetse that lends itself to mathematical modeling is Glossina tachinoides, found in west and central Sub-Saharan Africa 10. G. tachinoides is one of the major vectors of HAT 8. It is a member of the palpalis group and is best suited to live in very humid areas such as rainforests, swamps, lakeshores, and gallery forests 10. This species can live further north than most species of tsetse, so in some regions it is the only vector of HAT 10. G. tachinoides is only susceptible to T. brucei infection during its teneral stage (that is, it can only be infected during its first meal) 8. The life span of adult females of this species is dependent on temperature and humidity, as are the durations of the pupal period and of the feeding cycle 10. Populations of G. tachinoides near human settlements tend to prefer pigs as their source of blood meals, with cattle and humans as the next most popular options 11. This species was chosen as the point of reference for the model presented here.

The life cycle of tsetse is complicated and rather unique 10. Adult females mate only once during their lifetime, although males can mate more than once. The adult male deposits a large ball of sperm directly into the uterus of the female, which travels into the spermathecae. The sperm remains active for the rest of the female's life. The female incubates one egg at a time. The egg passes into the uterus, where it is immediately fertilized. The egg spends four days developing into a larva, and about five days in a combination of three larval stages. About nine days after the egg passes into her uterus, the female deposits the fully-grown larva from her uterus into a patch of loose, protected soil, where it quickly develops a hard, dark shell and becomes a pupa. The pupal period can last from about twenty to forty days, depending on the species, humidity, and temperature 10. At the end of the period, the shell breaks and a small fly emerges.
The time between hatching and the fly’s first meal is known as the teneral stage 8 . The first meal is very important as this food is used to develop the flight muscles in the thorax, which are undeveloped at emergence 10 . Flies are vulnerable to infection by T. brucei in this weak state, and develop immunity after their first blood meal 8 . After the teneral stage, the flies enter the adult stage. After mating, adult females give birth to a single larva approximately every nine to ten days for the remainder of their lives. Males live about three weeks. Females usually live longer, although their life expectancy varies greatly between different species and is very sensitive to atmospheric temperature 10 .


2.1. Insect Model Equations

We assume that every adult female fly mates successfully, independent of the male population. We do not assume that temperature is constant. However, we assume that its effect on a parameter is linear when that effect increases monotonically over the temperature range of 19 to 33 degrees Celsius. If the effect is optimal somewhere in that range and declines otherwise, we model it with a quadratic. This assumption allows a crude match to known parameters at various temperatures in a reasonable range around the average.

Most population models rely on a carrying capacity to bound population growth. Considering the life cycle of the tsetse fly, it is difficult to imagine what would impose such a bound. Fly populations are observed to vary widely 12 13 14 and there appear to be plenty of mammals to provide blood meals. Production of pupae is so small that no constraint seems relevant there. Therefore we do not assume any a priori bound on fly populations, and instead explore the role of temperature as a controlling factor.

Equations 1-4 below describe the dynamics of the fly population (P = pupae, R = teneral, F = female, M = male). Throughout, r = T/W denotes the ratio of the current temperature to the yearly average temperature. The rate of change in the pupa population is given by the rate of pupa deposition less the rate of maturation and the death rate.

$$P' = I^{-1}F - QL^{-1}rP - VP \qquad (1)$$

The rate of change in the teneral population is given by the rate of maturation of pupae less the rate of maturation of tenerals and the death rate.

$$R' = QL^{-1}rP - CE^{-1}rR - BHrR \qquad (2)$$

The rate of change in the adult female population is given by the rate of maturation of tenerals into female adults less the death rate.

$$F' = 0.5\,CE^{-1}rR - \left(S(r - DW^{-1})^2 + A^{-1}\right)F \qquad (3)$$

The rate of change in the adult male population is given by the rate of maturation of tenerals into male adults less the death rate.

$$M' = 0.5\,CE^{-1}rR - N^{-1}M \qquad (4)$$

Values of all variables and constants are summarized in Table 1 below.


Table 1. Parameters for the insect submodel

Notation  Value     Units     Description                                       Source
A         90        days      maximum life expectancy of female fly             10
B         .97       -         teneral death temperature correction              10
C         1.12      -         feeding cycle temperature correction              10
D         25        Celsius   temperature for optimal female life expectancy    10
E         4         days      average duration of fly feeding cycle             8
F         variable  number    female fly population, initially 3000             -
H         .14       days^-1   average teneral death rate                        10
I         9         days      duration of larval period                         10
L         26        days      duration of pupal period                          10
M         variable  number    male fly population, initially 2000               -
N         21        days      average life expectancy of male fly               10
P         variable  number    pupal population, initially 5000                  -
Q         1.02      -         pupal period temperature dependency correction    10
R         variable  number    teneral population, initially 1000                -
S         .292      -         female life expectancy temperature correction     10
T         variable  Celsius   temperature, initially 25 C (January 1st)         15
V         .04       -         average pupal death rate                          10
W         29        Celsius   average yearly temperature                        15
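As an illustration, the right-hand sides of Equations 1-4 can be written as a single function. This is only a sketch: the dictionary layout, the function name `insect_rhs`, and the choice to pass the temperature as an argument are assumptions made here, while the numerical values are the defaults listed in Table 1.

```python
# Default parameter values copied from Table 1 (the dict layout is an assumption).
PARAMS = {
    "A": 90.0, "B": 0.97, "C": 1.12, "D": 25.0, "E": 4.0, "H": 0.14,
    "I": 9.0, "L": 26.0, "N": 21.0, "Q": 1.02, "S": 0.292, "V": 0.04, "W": 29.0,
}


def insect_rhs(state, temperature, p=PARAMS):
    """Return (P', R', F', M') from Equations 1-4 at a temperature in Celsius."""
    P, R, F, M = state
    r = temperature / p["W"]                      # temperature ratio r = T/W
    emergence = p["Q"] / p["L"] * r * P           # pupae maturing into tenerals
    maturation = p["C"] / p["E"] * r * R          # tenerals maturing into adults
    teneral_death = p["B"] * p["H"] * r * R
    female_death = (p["S"] * (r - p["D"] / p["W"]) ** 2 + 1.0 / p["A"]) * F
    dP = F / p["I"] - emergence - p["V"] * P      # Equation 1
    dR = emergence - maturation - teneral_death   # Equation 2
    dF = 0.5 * maturation - female_death          # Equation 3
    dM = 0.5 * maturation - M / p["N"]            # Equation 4
    return (dP, dR, dF, dM)


if __name__ == "__main__":
    # Initial populations from Table 1, evaluated at the initial temperature of 25 C.
    print(insect_rhs((5000.0, 1000.0, 3000.0, 2000.0), 25.0))
```

A function of this shape can be handed to any standard ODE integrator once a temperature function of time is supplied.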

2.2. Explanation of Equations

Pupa population

The number of pupae deposited daily depends on the number of adult females and the rate at which the adult females are depositing pupae. On average, an adult female deposits a pupa every ten days and has a larval period of about 9 days 10 . We estimate that on a given day, the number of pupae deposited is one-ninth the number of adult females. Tsetse flies breed continuously, not in waves or cycles, so this is a reasonable assumption 16 . We'll call this nine-day period the larval period, and denote its duration by I. It is worth noting that using a 10-day period does not significantly alter any of the results described below.

The number of pupae emerging into tenerals daily depends on atmospheric temperature. On average, the length of the so-called pupal period (L) is about thirty days. However, as temperatures rise, the pupa develops more quickly. At 19 C, the pupal period lasts about thirty-eight days, and at 33 C, the period only lasts twenty-three days 10 . To model this behavior,


we assumed that the number of daily emergences from puparia was the product of the number of pupae, the average rate of daily emergence (estimated as the inverse of the length of the pupal period, 1/L), the ratio of the current temperature to the yearly average temperature (T/W, which we’ll abbreviate by r), and a correction factor Q that quantifies the impact of the temperature ratio on the length of the pupal period. Q was estimated by comparing the average pupal period and annual temperature to the known pupal period-temperature data given above, and averaging the size of the necessary correction factor for each data point. This method is highly approximate, but does provide a loose estimate of the dependence of average pupal period duration on temperature. The value of L (or equivalently, Q) was adjusted to fit approximately the two endpoints at 19 and 33 degrees. The daily death rate of pupae is unfortunately very difficult to estimate. Pupae can die in many ways - parasites, predators, flooding, dehydration and freezing all appear to be major sources of pupal death, and little is known about their relative importance 10 . Some data indicates that pupal death is correlated with temperature 10 . At the end of the four-month long rainy season, about half of pupae collected are dead, but at other times, nearly all pupae found will give rise to adult flies 10 . Since this data is collected seasonally, we estimate that if about eight percent of twelve consecutive ten-day generations of pupae die, then at the end of the season, we’d expect to find about as many dead pupae as living ones. So we estimate that during such a rainy season, approximately eight percent of pupae die. At average temperatures, it appears that no pupae die, and there is little data to support any particular model of intermediate conditions. An average relative death rate of V = .04 was therefore used. 
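To make the temperature dependence of pupal emergence concrete, the short sketch below evaluates the effective pupal period implied by the emergence rate Q r / L, using the default values of Q, L and W from Table 1. The function name is an assumption introduced for illustration; the calculation is simply the inverse of the per-capita emergence rate.

```python
# Default values from Table 1.
Q = 1.02   # pupal period temperature dependency correction
L = 26.0   # duration of pupal period (days)
W = 29.0   # average yearly temperature (Celsius)


def pupal_period_days(temperature_c):
    """Effective pupal period, the inverse of the emergence rate Q * r / L."""
    r = temperature_c / W
    return L / (Q * r)


if __name__ == "__main__":
    for temp in (19.0, 29.0, 33.0):
        print(temp, round(pupal_period_days(temp), 1))
```

With these defaults the computed periods at 19 and 33 degrees fall close to the thirty-eight and twenty-three days quoted in the text, which is the endpoint fit the authors describe.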
Teneral population

To estimate the size of the teneral population, we assumed that only three factors affected its size: emergence from puparia, death of tenerals, and emergence into adults. Of course, the pupal emergence term is given above, and serves as the gain term for tenerals in the same way it serves as the loss term for pupae. The daily number of tenerals emerging into adults depends on the number of tenerals and how frequently they eat, which turns out to also be temperature-dependent 10 . High temperatures reduce the amount of time that flies can live without a blood meal, and low temperatures have the opposite effect 10 . Other species of tsetse (not G. tachinoides), for which more data is available, demonstrate about a 20% fluctuation in the duration of their feeding cycle at extreme temperatures 10 . We assume that the daily


rate of emergence into adults is dependent on the product of the inverse of the average length of the feeding cycle (1/E, where E is also the average length of the teneral period), the ratio of the actual temperature to the average temperature, and a correction factor C that reflects the significance of temperature in determining the length of the feeding cycle. Note that this term takes precisely the same form as the teneral emergence term in the differential equation for P. This correction factor was calculated in the same way as Q. We assume a similar dependence on temperature for the death rate of tenerals. With the constants chosen for this model, at 19 degrees Celsius the amount of time spent in the teneral stage is 5.3 days and at 33 degrees it is about 3 days. These values bracket an intermediate value of E = 4, as suggested by Rogers 8 .

Female adult population

We assume that the size of the female population depends only on emergence by tenerals and death. Only half of emerging tenerals become adult females. The death rate of adult females is believed to be temperature dependent 10 . During the dry season, the average female life expectancy of G. tachinoides is one month; in the rainy season, the life expectancy is three months 10 . To model this temperature dependence, we assume that the rate of adult female death is the sum of the inverse of the optimal life expectancy (1/A) and a correction term that increases as the temperature departs from the optimum (D), modeled as a quadratic function of the ratio of that difference to the mean annual temperature, W. A correction factor, S, adjusts the quadratic so that endpoints approximate observations on the interval from 19 to 33 degrees Celsius.
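The quadratic female death rate can be explored with a few lines of code. The sketch below uses the default values of A, D, S and W from Table 1; the function names are assumptions introduced here, and life expectancy is taken to be the inverse of the death rate in Equation 3.

```python
# Default values from Table 1.
A = 90.0    # maximum life expectancy of female fly (days)
D = 25.0    # temperature for optimal female life expectancy (Celsius)
S = 0.292   # female life expectancy temperature correction
W = 29.0    # average yearly temperature (Celsius)


def female_death_rate(temperature_c):
    """Per-day female death rate S * (r - D/W)**2 + 1/A from Equation 3."""
    r = temperature_c / W
    return S * (r - D / W) ** 2 + 1.0 / A


def female_life_expectancy_days(temperature_c):
    """Life expectancy implied by the death rate (its inverse)."""
    return 1.0 / female_death_rate(temperature_c)


if __name__ == "__main__":
    for temp in (19.0, 25.0, 33.0):
        print(temp, round(female_life_expectancy_days(temp), 1))
```

At the optimal temperature D the quadratic term vanishes and the life expectancy equals A, while at 33 degrees it drops to roughly one month, in line with the dry-season figure cited above.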
Male adult population

The growth of the male population is identical to that of the female population, except that the male life expectancy is not temperature-dependent 10 , so the temperature-dependent life expectancy term in the above equation can be replaced with a constant 1/N term, where N is the average life expectancy of male tsetse. Table 1 summarizes the parameters used in the insect population equations, and gives default parameter values for numerical experiments and the sources from which these data were taken.


2.3. Temperature Model

It is clear that fly populations fluctuate seasonally, both in size and in density 8 . Unfortunately, it is very difficult to estimate the number of tsetse living in an area at a given time. The best metric is given by their so-called ‘apparent density’, which is measured by the number of non-teneral males caught in traps per day. This metric is flawed, though, as a high apparent density may reflect a hungry population instead of a large population 10 . The duration of almost every stage in the tsetse cycle is temperature dependent 10 . We defined a parametric temperature function (T) using average daily temperature data collected at Ouagadougou Airport in Ouagadougou, Burkina Faso 15 , where the average annual temperature is 29 C. Clearly, not all years are the same, but the model was run with periodic temperatures for several years to get a picture of its behavior over time.

3. Analysis of model

The insect submodel is a homogeneous linear model. It has only one equilibrium, with all populations zero. If we set q = Q/L, c = C/E, d = D/W, r = T/W, and write v for the pupal death rate V, the system may be represented by the following matrix acting on the state vector (P, R, F, M):

$$\begin{pmatrix} -(qr + v) & 0 & I^{-1} & 0 \\ qr & -(cr + BHr) & 0 & 0 \\ 0 & .5cr & -(A^{-1} + S(r - d)^2) & 0 \\ 0 & .5cr & 0 & -N^{-1} \end{pmatrix} \qquad (5)$$

All of the entries in this matrix are constant except r, which varies above and below 1. In the following discussion we will treat r as a constant and look at the behavior of the system over a range of possible values of r. The characteristic polynomial for the system is given in Equations 6 and 7.

$$P(\lambda) = (-\lambda - N^{-1})\,Q(\lambda) \qquad (6)$$

where

$$Q(\lambda) = (-\lambda - (qr + v))(-\lambda - (cr + BHr))(-\lambda - (S(r - d)^2 + A^{-1})) + (2I)^{-1}qcr^2 \qquad (7)$$

The cubic factor is a polynomial with three negative roots plus a quantity added at the end. This small quantity has the potential effect of increasing the root nearest zero to a positive number, thereby creating an unstable system.


3.1. Instability of the model with constant temperature D

The characteristic polynomial for the system factors into a linear term and a cubic.

$$P(\lambda) = (-\lambda - N^{-1})(-\lambda^3 - \alpha\lambda^2 - \beta\lambda + \delta) \qquad (8)$$

The values of α, β, and δ all vary with r. However, α and β are always positive. Thus, by Descartes’ rule of signs, the system will have a positive root if δ is positive, and in that case will be unstable. It is easy to see that δ is given by

$$\delta(r) = -(qr + v)(cr + BHr)(S(r - d)^2 + A^{-1}) + (2I)^{-1}qcr^2 \qquad (9)$$
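Equation 9 is straightforward to evaluate numerically. The following sketch computes δ(r) with the default values from Table 1; the function name `delta` and the sample values of r are choices made here for illustration and are not taken from the paper.

```python
# Default values from Table 1.
A, B, C, D, E = 90.0, 0.97, 1.12, 25.0, 4.0
H, I, L, Q = 0.14, 9.0, 26.0, 1.02
S, V, W = 0.292, 0.04, 29.0

# Abbreviations used in Section 3.
q, c, d = Q / L, C / E, D / W


def delta(r):
    """Constant term of the cubic factor, Equation 9."""
    loss = (q * r + V) * (c * r + B * H * r) * (S * (r - d) ** 2 + 1.0 / A)
    gain = q * c * r ** 2 / (2.0 * I)
    return gain - loss


if __name__ == "__main__":
    # r = d corresponds to the temperature that optimizes female lifespan.
    for r in (0.2, d, 1.0, 1.5):
        print(round(r, 3), delta(r))
```

The sign of the printed value indicates whether the cubic factor has a positive root at that temperature ratio, according to the argument above.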

If the temperature is set to optimize the lifespan of the female, then r = d. With other constants at default values, this gives a positive value for δ, and hence instability of the system.

3.2. Sufficient insect death leads to stability

Because the diagonal entries of the matrix in Equation 5 are negative, a corollary of Gerschgorin’s Circle Theorem 17 guarantees stability when the column sums are negative. The only column sum which is not necessarily negative is I^{-1} − (S(r − d)^2 + A^{-1}). At default parameters this expression is negative for r < .25 and r > 1.44, guaranteeing stability at those (extreme) temperatures. The theorem does not preclude the possibility of stability in a more reasonable range as well.

3.3. Variable temperature as a switched system

It is worth noting that the system exhibits both stable and unstable behavior depending on the temperature, with stability at both extreme temperatures. In this case the system is a continuous example of a “switched system” 18 whose long term behavior goes to some intermediate value. This model shows the long term behavior of a population which, under some conditions, experiences unconstrained growth. This growth is tempered by temperature conditions which, if in place long enough, would eliminate the population entirely.

4. Numerical results of insect submodel

Equations 1-4 were solved numerically using Matlab’s ode45 solver. The sensitivity diagrams of Figure 3 were produced from Matlab output using Excel. All figures were postprocessed using Adobe Photoshop.
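The column-sum criterion of Section 3.2 can also be checked numerically. The sketch below builds the system matrix from Equations 1-4 with the default values of Table 1 and sums its columns; the function names are assumptions introduced here. Thresholds computed this way from the tabulated values come out near, but need not coincide exactly with, the figures quoted in the text.

```python
# Default values from Table 1.
A, B, C, D, E = 90.0, 0.97, 1.12, 25.0, 4.0
H, I, L, N, Q = 0.14, 9.0, 26.0, 21.0, 1.02
S, V, W = 0.292, 0.04, 29.0
q, c, d = Q / L, C / E, D / W


def system_matrix(r):
    """Matrix of the linear system for the state (P, R, F, M) at temperature ratio r."""
    return [
        [-(q * r + V), 0.0, 1.0 / I, 0.0],
        [q * r, -(c * r + B * H * r), 0.0, 0.0],
        [0.0, 0.5 * c * r, -(1.0 / A + S * (r - d) ** 2), 0.0],
        [0.0, 0.5 * c * r, 0.0, -1.0 / N],
    ]


def column_sums(matrix):
    """Sum each column of a square matrix given as a list of rows."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix))]


if __name__ == "__main__":
    for r in (0.2, 1.0, 1.5):
        sums = column_sums(system_matrix(r))
        print(r, [round(s, 4) for s in sums], all(s < 0 for s in sums))
```

Only the third column sum changes sign over the range of r, which matches the observation that it is the one sum not guaranteed to be negative.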


Figure 1 shows the results of the population model in Equations 1-4, coupled with the temperature model described in Section 2.3.

Figure 1. The top graph shows all four populations over an 1800 day interval. The lower graph shows temperature data input for that period.

As the analysis suggests, the fly population is controlled effectively by the oscillating temperatures. When the temperature rises to about 32 degrees Celsius the female fly population begins to decline, and continues to do so until temperatures drop to around 29 degrees. Because this model is strictly periodic in temperature, the overall population must either decline or grow. For these parameter choices it declines, although very slowly.

4.1. Rogers’ model revisited

The model Rogers proposed 8 assumes one vector population and two host populations. The full model is described in that paper. Here, for simplicity, we assume only a human host population. The fly populations assumed to be constant by Rogers we take to be varying, and use the model developed here as the source for those varying populations over time. Figure 2 shows the resulting variation in human disease prevalence, along with the temperature variation during the same period. There are a few things worth noting here. First of all, the peaks of prevalence coincide with high temperatures. This is a sum of two time


Figure 2. Rogers’ model is coupled with the insect dynamics developed in this paper. The top graph shows percent of humans infected and percent of vectors infected (negligible by comparison) over the same time period as in Figure 1. The lower graph shows the temperature input during that period.

lags. There is a lag between when the temperature drops to optimal for the insect vector and when its populations peak, visible in Figure 1. There is then a second lag between the rise in insect population and the peak of infection. The second thing we note is that the oscillations are not large. Rogers assumes a relatively slow return rate from the infected to the susceptible pool. If this rate were improved through aggressive medical interventions, the picture could be different. Four reasons for seasonal fluctuation of vector borne disease have been proposed, one of which is variation in the population of the vectors themselves 19 . Observed seasonal fluctuations of tsetse are documented in Kenya 14 , Burkina Faso 12 , and Ethiopia 13 . Accurate measurements of these fluctuations are difficult. Generally cattle are used as bait and the result is some measurement of both the density of flies and how hungry they are. Comparison with trapping data in the Burkina Faso study 12 does show a drop in fly density after temperature peaks, as does our model in Figure 1.


5. Sensitivity of the model

A sensitivity analysis was conducted on both the full model and the insect submodel. For each quantity, the outcome analyzed is the average of the high and the low values on the interval from t=500 to t=1800. The sensitivity analysis is computed by fixing all the parameters in the model but one, which is varied by +/- 10% (submodel) or +/- 1% (full model). The resulting change in each outcome is recorded as a percent of the base value, and parameters are ordered by effect size. The results are summarized in Figure 3, which shows the sensitivity of the female fly population on the left and human disease incidence on the right. All of the fly populations behaved similarly, so only this one is displayed. The parameter D, which represents the optimal temperature for prolonging the life expectancy of female flies, has by far the largest effect, changing the female fly population by a thousand percent and the human disease prevalence by seventy five percent. The parameters with the largest effect size in the human disease model all come from the population model developed in this paper. That is, variation in insect population dynamics has a bigger effect on disease prevalence than the disease parameters themselves. Parameters listed in lower case letters occur only in the model of human disease prevalence. The one with the largest effect in our sensitivity analysis was a, which represents the proportion of blood meals taken from humans (as opposed to other mammals). The default value for this parameter was twelve percent. This, and all default parameters for the disease model, are as in Rogers’ original work 8 . The next two largest effects are from m, which represents the total human population, and b, the probability of a bite leading to infection.

6. Summary of results

In this paper we construct a temperature dependent model of tsetse fly population dynamics and couple it with a version of Rogers’ epidemiology model for Human African Trypanosomiasis. Analysis and numerical investigation of the insect model yields the following insights.
(1) The temperature dependent insect population model has oscillating behavior with time lags qualitatively similar to observed data 12 .
(2) The model varies between stable and unstable states, with stability at the extreme values of temperature and instability at intermediate


Figure 3. On the left parameters for the insect model are ordered by the size of their effect on the female fly population, as described above. On the right the same is done for the insect model coupled with a version of Rogers’ disease model.

values. In this way temperature controls the insect population.
(3) The model is an example of a continuously switched system, in the sense of control theory 18 .
(4) Behavior of both the insect model and the coupled disease model is most sensitive to insect related parameters, and especially the temperature that optimizes the female life span.
(5) The coupled disease model was relatively insensitive to the parameters not associated with the insect submodel.
These results suggest that control of the fly is the key to control of this disease. They also suggest that climate change will play a big role in the spread


of trypanosomiasis. Finally, the model shows that a better understanding of the particular response of tsetse to temperature is key to predicting outbreaks, as the model is very sensitive to the parameters related to temperature.

References

1. E. Fevre, B. Wissmann, S. Welburn, and P. Lutumba, Public Library of Science of Neglected Tropical Diseases 2 12 (2008).
2. C. Nimmo, Travel Medicine and Infectious Disease 8, 263-268 (2010).
3. P. Solano, S. Ravel, and T. de Meeus, Trends in Parasitology 26 5, 255-263 (2010).
4. D. Rogers and M. Packer, The Lancet 342, 1282-1284 (1993).
5. J.J. McDermott and P.G. Coleman, International Journal for Parasitology 31, 603-609 (2001).
6. A. Fairlamb, Trends in Parasitology 19 11, 488-494 (2003).
7. B. Swallow, Review Paper for PAAT (Programme Against African Trypanosomiasis) 1, 1-46 (1999).
8. D. Rogers, Parasitology 97, 193-212 (1988).
9. S. Welburn and I. Maudlin, Parasitology Today 15 10, 399-404 (1999).
10. J.N. Pollock, Training Manual for Tsetse Control Personnel. Rome: Food and Agriculture Organization of the United Nations (1982).
11. D.A.T. Baldry, Insect Science Applications 1, 85-93 (1980).
12. N. Koné, E.K. N'Goran, I. Sidibe, A.W. Kombassere and J. Bouyer, Medical and Veterinary Entomology 25, 156-168 (2011).
13. S.G.A. Leak, W. Mulatu, E. Authi, G.D.M. d’Ieteren, A.S. Peregrine, G.J. Rowlands and J.C.M. Trail, Acta Tropica 53, 121-134 (1993).
14. M. Baylis, Acta Tropica 65, 81-96 (1997).
15. WeatherSpark, n.d. Web. 30 October 2012. <http://www.climatecharts.com/Locations/n/NG61024.php>.
16. D. Rogers, J. of Animal Ecology 48 3, 825-849 (1979).
17. S. Gerschgorin, Izv. Akad. Nauk. USSR Otd. Fiz.-Mat. Nauk 6, 749-754 (1931).
18. N. Wang, M. Egerstedt and C. Martin, Proc. of the 48th IEEE Conference on Decision and Control, held jointly with the 2009 28th Chinese Control Conference, CDC/CCC 3721-3726 (2009).
19. N.C. Grassly and C. Fraser, Proc. R. Soc. B 273, 2541-2550 (2006).

May 6, 2013

14:22

BC: 8846 - BIOMAt 2012

05˙portz

A MATHEMATICAL MODEL FOR THE IMMUNOTHERAPY OF ADVANCED PROSTATE CANCER

TRAVIS PORTZ
Computer Sciences Department, University of Wisconsin-Madison

YANG KUANG
School of Mathematical and Statistical Sciences, Arizona State University

A mathematical model of advanced prostate cancer treatment is developed to examine the combined effects of androgen deprivation therapy and immunotherapy. Androgen deprivation therapy has been the primary form of treatment for advanced prostate cancer for the past 50 years. While initially successful, this therapy eventually results in a relapse after two to three years in the form of androgen-independent prostate cancer. Intermittent androgen deprivation therapy attempts to prevent relapse by cycling the patient on and off treatment. Over the past decade, dendritic cell vaccines have been used in clinical studies for the immunotherapy of prostate cancer with some success. The model presented in this paper examines the efficacy of dendritic cell vaccines when used with continuous or intermittent androgen deprivation therapy schedules. Numerical simulations of the model suggest that immunotherapy can successfully stabilize the disease using both continuous and intermittent androgen deprivation.

1. Introduction

Prostate cancer is the most common type of cancer in American men and the second leading cause of cancer mortality 11 . Beginning as early as the second decade of life, the development of prostate cancer can require over 50 years to reach a detectable state 1 . Due to the slow growth rate of prostate cancer, chemotherapy has a very limited effect on the disease. Instead, treatment focuses on surgery and radiotherapy for localized disease and hormone therapy for metastatic cancer. Since prostate cells and their malignant counterparts require stimulation by androgen to grow, advanced prostate cancer can be treated by androgen deprivation therapy (ADT). This therapy reduces the androgen-dependent (AD) cancer cells by preventing growth and inducing apoptosis 1 . While ADT is initially successful in most patients, almost all patients experience


a relapse within several years. At this hormone refractory stage, the AD cells have been replaced by androgen-independent (AI) cells. These cells do not require the normal levels of androgen to sustain growth and are also resistant to the apoptotic effect of a low-androgen environment 3 .

Intermittent androgen deprivation (IAD) is an alternative therapy schedule where androgen deprivation therapy is administered until a patient experiences a remission of the disease and then is withheld until the disease progresses back to a certain level 13 . This therapy schedule aims to maintain the apoptotic effect of androgen deprivation on the prostate cancer cells. Clinical studies have shown that patients are responsive to multiple cycles of the hormone therapy 4,7,13 . Progression to androgen independence was observed after two to five cycles lasting an average of 128 weeks 7 .

Additional treatment options are needed to prevent the progression to androgen independence and to treat those who already have androgen independent prostate cancer. Immunotherapy by dendritic cell vaccines is such an option. Dendritic cell (DC) vaccines are created by extracting DCs from a patient, loading the DCs with target antigens, and then reinfusing the DCs 6 . The antigen-loaded DCs activate naive T cells, resulting in a cellular immune response against the target antigen. Clinical prostate cancer studies have used prostatic acid phosphatase (PAP) as the target antigen for prostate cancer DC vaccines 6,19 . These studies treated hormone refractory patients by administering DC vaccines loaded with PAP on a monthly basis. All patients developed an immune response following vaccination. Some patients experienced a significant decline in PSA level, and others experienced stabilization of their previously progressing disease.

Androgen deprivation therapy has been studied with a mathematical model to investigate the mechanism for androgen-independent relapse 10 .
This model assumed continuous administration of androgen deprivation therapy and predicted that the treatment is only successful for a small range of biological parameters. Intermittent androgen deprivation was applied to this model and predicted that relapse can only be prevented by IAD if normal androgen levels have a negative effect on the growth rate of AI cells 9 . This is biologically unlikely since AI cells typically have androgen receptors with increased sensitivity 8 . Using biologically likely hypotheses for AI cell growth rates, mathematical models may or may not show that continuous therapy results in a longer time to androgen-independent relapse than intermittent therapy 17,5 . Cell-mediated anti-tumor immune responses have been studied with a mathematical model by Kirschner and Panetta 12 . Their model examines


the dynamics between the adaptive immune system, tumor cells, and the cytokine interleukin-2 (IL-2). The model shows the immune system can control tumors with average to high antigenicity at a dormant state. They also explore the possibility of treatment with IL-2. Their results suggest that treatment with IL-2 alone is not enough to clear the tumor without administering toxic levels of the cytokine.

In this paper, a mathematical model for advanced prostate cancer treatment is proposed which combines immunotherapy with intermittent androgen deprivation therapy. The immunotherapy portion of the model is based on the Kirschner and Panetta model, and the intermittent androgen deprivation therapy portion of the model is based on the work by Ideta et al. The model will investigate the efficacy of DC vaccines when administered before a hormone refractory stage of cancer. While the clinical studies found that the DC vaccines could slow the progression of the disease in hormone refractory patients, they did not show how the vaccine would affect patients actively undergoing hormone therapy. The model proposed in this paper will determine if intermittent therapy has any benefits over continuous therapy other than improved quality of life when combined with the use of DC vaccines. The necessary conditions for disease elimination or stabilization using this method of treatment will also be examined.

2. The Model

Prostate cancer treatment by immunotherapy and androgen deprivation therapy is modeled by a system of ordinary differential equations which takes the form

$$\frac{dX_1}{dt} = \underbrace{r_1(A)X_1}_{\text{proliferation and death}} - \underbrace{m(A)X_1}_{\text{mutation to AI}} - \underbrace{\frac{e_1 X_1 T}{g_1 + X_1}}_{\text{killed by T cells}} \qquad (1)$$

$$\frac{dX_2}{dt} = \underbrace{r_2 X_2}_{\text{proliferation and death}} + \underbrace{m(A)X_1}_{\text{mutation from AD}} - \underbrace{\frac{e_1 X_2 T}{g_1 + X_2}}_{\text{killed by T cells}} \qquad (2)$$

$$\frac{dT}{dt} = \underbrace{\frac{e_2 D}{g_2 + D}}_{\text{activation by dendritic cells}} - \underbrace{\mu T}_{\text{natural death}} + \underbrace{\frac{e_3 T I_L}{g_3 + I_L}}_{\text{clonal expansion}} \qquad (3)$$

$$\frac{dI_L}{dt} = \underbrace{\frac{e_4 T (X_1 + X_2)}{g_4 + X_1 + X_2}}_{\text{production by stimulated T cells}} - \underbrace{\omega I_L}_{\text{clearance}} \qquad (4)$$


$$\frac{dA}{dt} = \underbrace{\gamma(a_0 - A)}_{\text{homeostasis}} - \underbrace{\gamma a_0 u(t)}_{\text{deprivation therapy}} \qquad (5)$$

$$\frac{dD}{dt} = \underbrace{-cD}_{\text{natural death}}. \qquad (6)$$

The variables used in the model and their meanings are listed in Table 1. Parameter interpretations and estimates are given in Table 2.

Table 1. Variables in the model

variable  meaning                                       unit
X1        number of androgen dependent cancer cells     cells
X2        number of androgen independent cancer cells   cells
T         number of activated T cells                   cells
IL        concentration of cytokines                    ng/mL
A         concentration of androgen                     nmol/mL
D         number of dendritic cells                     cells
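The six equations above translate directly into a right-hand-side function. The sketch below is illustrative only: the function name, the parameter dictionary keys, and the decision to pass the rate functions r1 and m as callables are assumptions made here, and no particular parameter values are implied.

```python
def cancer_rhs(state, u, p, r1, m):
    """Right-hand sides of Equations (1)-(6).

    state : (X1, X2, T, IL, A, D)
    u     : treatment switch u(t) appearing in Equation (5)
    p     : dict of rate constants (key names are assumptions)
    r1, m : callables of the androgen level A
    """
    X1, X2, T, IL, A, D = state
    kill_ad = p["e1"] * X1 * T / (p["g1"] + X1)   # AD cells killed by T cells
    kill_ai = p["e1"] * X2 * T / (p["g1"] + X2)   # AI cells killed by T cells
    dX1 = r1(A) * X1 - m(A) * X1 - kill_ad
    dX2 = p["r2"] * X2 + m(A) * X1 - kill_ai
    dT = (p["e2"] * D / (p["g2"] + D)             # activation by dendritic cells
          - p["mu"] * T                           # natural death
          + p["e3"] * T * IL / (p["g3"] + IL))    # clonal expansion
    dIL = p["e4"] * T * (X1 + X2) / (p["g4"] + X1 + X2) - p["omega"] * IL
    dA = p["gamma"] * (p["a0"] - A) - p["gamma"] * p["a0"] * u
    dD = -p["c"] * D
    return (dX1, dX2, dT, dIL, dA, dD)
```

Keeping r1 and m as arguments makes it easy to swap in alternative hypotheses for androgen-dependent growth and mutation without touching the rest of the system.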

The androgen-dependent functions for AD cell growth and mutation are defined as follows:

$$r_1(A) = \alpha_1 \frac{A}{A + k_1} - \beta_1\left(k_2 + (1 - k_2)\frac{A}{A + k_3}\right) \qquad (7)$$

$$m(A) = m_1\left(1 - \frac{A}{a_0}\right). \qquad (8)$$

The parameters in the expression for r1(A) are chosen such that the net growth rate of AD cells is approximately α1 − β1 when A = a0 (since a0 is much larger than k1 and k3) and −β1 k2 when A = 0. The value of parameter k2 is chosen such that β1 k2 matches the rate of decline of serum PSA concentration during continuous androgen deprivation therapy 9 . The net growth rate of AI cells, r2, is a constant in this version of the model. Ideta et al. proposed two alternatives for r2 which assumed that androgen had a negative effect on the proliferation rate of AI cells 9 . Mutation from AD cells to AI cells occurs at a rate m1 when A = 0, and no mutation occurs when A = a0. Larger values of m1 result in a shorter time to androgen-independent relapse, thus relapse time can be used to estimate the value of m1 9 .

Intermittent androgen deprivation therapy is modeled by (5) where u(t) = 1 indicates an on-treatment period and u(t) = 0 indicates an off-treatment period. During off-treatment periods, the androgen level tends toward homeostasis at the normal androgen level a0. The androgen level decays at a rate γ during on-treatment periods. The state variable u(t) is


Table 2. Parameters in the model

parameter  meaning                                              value                      ref.
α1         AD cell proliferation rate                           0.025/day                  1
β1         AD cell death rate                                   0.008/day                  1
k1         AD cell proliferation rate dependence on androgen    2 ng/ml                    9
k2         effect of low androgen level on AD cell death rate   8                          2
k3         AD cell death rate dependence on androgen            0.5 ng/ml                  9
r2         AI cell net growth rate                              0.006/day                  9
m1         maximum mutation rate                                0.00005/day                1
a0         normal androgen concentration                        30 ng/ml                   9
γ          androgen clearance and production rate               0.08/day                   9
ω          cytokine clearance rate                              10/day                     18
µ          T cell death rate                                    0.03/day                   12
c          dendritic cell death rate                            0.14/day                   15
e1         maximum rate T cells kill cancer cells               0 – 1/day                  12
g1         cancer cell saturation level for T cell kill rate    10 × 10^9 cells            12
e2         maximum T cell activation rate                       20 × 10^6 cells/day        12
g2         DC saturation level for T cell activation            400 × 10^6 cells           19
e3         maximum rate of clonal expansion                     0.1245/day                 12
g3         IL-2 saturation level for T cell clonal expansion    1000 ng/ml                 12
e4         maximum rate T cells produce IL-2                    5 × 10^-6 ng/ml/cell/day   12
g4         cancer cell saturation level for T cell stimulation  10 × 10^9 cells            12
D1         DC vaccine dosage                                    300 × 10^6 cells           19
c1         AD cell PSA level correlation                        1 × 10^-9 ng/ml/cell       9
c2         AI cell PSA level correlation                        1 × 10^-9 ng/ml/cell       9

controlled by monitoring the serum PSA level as follows:

$$y(t) = c_1 X_1 + c_2 X_2 \qquad (9)$$

$$u(t) = \begin{cases} 0 \to 1 & \text{when } y(t) > L_1 \text{ and } dy/dt > 0, \\ 1 \to 0 & \text{when } y(t) < L_0 \text{ and } dy/dt < 0, \end{cases} \qquad (10)$$


where y(t) is the serum PSA concentration. Androgen deprivation is switched on when the serum PSA concentration exceeds some level L1 and switched off when the serum PSA concentration drops below some level L0 with L0 < L1. This method for IAD has been used in several clinical studies 4,7,13 .

Since T cells are activated and stimulated through interactions between proteins (antigens and cytokines) and receptors 14 , Michaelis-Menten kinetics are used for all immune response terms in the model. This is the approach taken by Kirschner and Panetta and is reasonable given that high levels of antigens and cytokines are likely to have a saturation effect on the T cells. IL-2 is included in the model to provide the clonal expansion dynamics of helper T cells. When stimulated by the antigens presented on tumor cells, the helper T cells produce IL-2. The IL-2 then stimulates the clonal expansion of T cells in a positive feedback loop 14 . The model assumes a constant ratio of cytotoxic and helper T cells, which greatly simplifies the model and should not have a significant impact on the long-term behavior of the system. The cytotoxic T cells interact with and kill the tumor cells based on antigen stimulation. The rate of interaction is assumed to be the same for both AD and AI cells. There is no biological reason to assume otherwise.

The antigen-loaded dendritic cells are modeled by (6) which assumes that the DCs undergo apoptosis at a constant rate and are not being replenished by any mechanisms other than further vaccinations. Vaccinations are administered every 30 days in model simulations. Each vaccination contains D1 antigen-loaded DCs. The DCs are assumed to activate naïve T cells based on Michaelis-Menten kinetics as shown in (3). The model assumes that there are always naïve T cells available for activation. The Michaelis-Menten terms could be replaced by simpler mass action terms to make the non-zero steady states easier to find analytically.
However, the system is repeatedly being perturbed by the administration of DC vaccines, so steady-state analysis has limited use in this model. Since the model will be analyzed through numerical simulations, the more physically accurate Michaelis-Menten terms are used.
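The two discrete ingredients described above, the PSA-triggered switching of androgen deprivation and the periodic DC vaccinations, can be sketched in a few lines of Python. This is only an illustrative sketch and not the simulation code used for the paper: the function names, the decay rate passed to `dc_level`, and the assumption that the first vaccination is given at t = 0 are ours.

```python
import math


def update_treatment(on_treatment, psa, l0, l1):
    """Hysteresis rule: switch on above l1, switch off below l0, else keep state."""
    if l0 >= l1:
        raise ValueError("expected l0 < l1")
    if psa > l1:
        return True
    if psa < l0:
        return False
    return on_treatment


def dc_level(t, dose, decay_rate, interval=30.0):
    """DC population at time t >= 0 for doses given at t = 0, interval, 2 * interval, ...

    Assumes pure exponential decay between doses (constant apoptosis rate)
    and that the first dose is given at t = 0; both are assumptions of this sketch.
    """
    n_doses = int(t // interval) + 1
    return sum(dose * math.exp(-decay_rate * (t - k * interval))
               for k in range(n_doses))


# Hypothetical PSA readings, with L0 = 5 and L1 = 15 as in the IAD simulations.
state = False
schedule = []
for psa in [10, 16, 12, 6, 4, 9, 16]:
    state = update_treatment(state, psa, l0=5, l1=15)
    schedule.append(state)
```

Between the two thresholds the treatment state is simply carried over, which is what makes the schedule depend on the history of y(t) and not only on its current value.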

3. Numerical Simulations

The mathematical model is simulated using the parameter estimates shown in Table 2. We vary the parameter e1 based on the assumption that the T cell response against antigen-presenting tumor cells varies greatly from patient to patient. All simulations are run with the initial conditions X1(0) = 14, X2(0) = 0.1, A(0) = 30, and T(0) = IL(0) = D(0) = 0, where cell populations are expressed in billions.

We first consider the case with androgen deprivation therapy only. The on-treatment PSA level L1 is fixed at 15 (ng/mL), and the off-treatment PSA level L0 is varied from 0 to 14 (ng/mL) to show the effects of IAD. The simulations in Figure 1 show that intermittent androgen therapy alone results in a shorter time to relapse than continuous therapy, reproducing the results of the Ideta et al. model. Figure 2 shows the AD cell population being replaced by AI cells, resulting in hormone refractory prostate cancer. With shorter treatment cycles, the androgen-independent relapse occurs sooner.

Figure 1. Solutions of the model without immunotherapy. The on-treatment PSA level is fixed at L1 = 15. When intermittent therapy is used, the cancer relapses sooner. [Plot: PSA (ng/mL) versus time (days) for L0 = 0, 2, 5, 10, and 14.]

Figure 2. AD and AI cell populations for the solutions in Figure 1. [Two panels: AD cells and AI cells versus time (days).]

We next consider the case of continuous ADT with the DC vaccine. The parameter e1 is varied to examine the effect that the immune response has on the tumor. The simulations in Figure 3 show that the cancer can be eliminated with a large enough value of e1 within our parameter range.

Figure 3. Solutions of the model with continuous androgen deprivation therapy and immunotherapy. The cancer is eliminated with stronger anti-tumor immune responses, as seen for e1 = 0.75 and e1 = 1.0. [Plot: PSA (ng/mL) versus time (days) for e1 = 0, 0.25, 0.50, 0.75, and 1.00.]

Figure 4 shows a log-linear plot of the AI cell population, where we can see that the AI population grows exponentially for e1 ≤ 0.5 and decays exponentially for e1 ≥ 0.75. The exact value of e1 at which the cancer becomes manageable will be found in section 4.
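Reading growth or decay off a log-linear plot amounts to checking the sign of the slope of log X2 against time, and the threshold value of e1 can then be bracketed by bisection. The following Python sketch illustrates the idea under loud assumptions: `toy_rate` is a hypothetical stand-in for a full simulation of the model, and its coefficients (0.02 and 0.03) are invented for illustration, so the threshold it produces has no relation to the values reported in this paper.

```python
import numpy as np


def log_growth_rate(times, population):
    """Slope of log(population) against time, from a least-squares line."""
    slope, _intercept = np.polyfit(times, np.log(population), 1)
    return slope


def bisect_threshold(growth_rate_of, lo, hi, tol=1e-6):
    """Find the parameter value at which growth_rate_of changes sign.

    Requires growth_rate_of(lo) > 0 (growth) and growth_rate_of(hi) < 0 (decay).
    """
    if not (growth_rate_of(lo) > 0 > growth_rate_of(hi)):
        raise ValueError("threshold is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if growth_rate_of(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_rate(e1):
    """Hypothetical stand-in for a model run: X2 grows at net rate 0.02 - 0.03 * e1."""
    t = np.linspace(0.0, 1000.0, 201)
    x2 = 0.1 * np.exp((0.02 - 0.03 * e1) * t)
    return log_growth_rate(t, x2)


threshold = bisect_threshold(toy_rate, 0.5, 0.75)
```

With a real simulation in place of `toy_rate`, the slope would be fitted only to the late part of the trajectory, after transients have died out.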

Figure 4. AD and AI cell populations for the solutions in Figure 3. [Two panels: AD cells and AI cells (logarithmic scale) versus time (days).]

Figure 5. Solutions of the model with intermittent androgen deprivation therapy and immunotherapy. The on-treatment PSA level is fixed at L1 = 15. Relapse is prevented with stronger anti-tumor immune responses, as seen for e1 = 0.75 and e1 = 1.0. [Plot: PSA (ng/mL) versus time (days) for e1 = 0, 0.25, 0.50, 0.75, and 1.00.]

Finally, we consider intermittent androgen deprivation therapy with the DC vaccine. Again, we vary the parameter e1 to examine the effect of the immune response. The IAD is performed with L0 fixed at 5 (ng/mL) and L1 fixed at 15 (ng/mL). These values are similar to the PSA levels used for treatment scheduling in clinical studies 4,7. The simulations in Figure 5 show that relapse is prevented for e1 ≥ 0.75. Unlike continuous therapy, the intermittent therapy is not able to completely eliminate the disease.

Figure 6. AD and AI cell populations for the solutions in Figure 5. [Two panels: AD cells and AI cells (logarithmic scale) versus time (days).]

Due to the intermittent therapy schedule, the AD cell population
will always oscillate between two sizes determined by L0 and L1 when androgen-independent relapse is prevented. The exact value of e1 at which relapse becomes preventable will be found in section 4.

4. Mathematical Analysis

Since the administration of DC vaccines will perturb the system away from any possible non-zero steady states, the only case where we could have a non-zero steady state is with a single administration of the DC vaccine at t = 0 and continuous androgen deprivation therapy. Based on the simulations in Figure 3, where DC vaccines are being administered every 30 days, we can see that there will be no non-zero steady state with just one vaccine administration of a reasonable dosage and with reasonable parameter estimates. For this reason, steady-state analysis will not be performed in this paper. Instead, we will find the values of e1 at which the disease becomes controllable by immunotherapy for both the continuous and the intermittent therapy schedules.

Figure 7. The limit cycle found for e1 = 0.69197 with continuous androgen deprivation therapy. [Plot: X2 versus T.]

We first find the value of e1 at which continuous ADT combined with the DC vaccine is able to stabilize the disease. This value is known to be between 0.5 and 0.75. The exact value can be found numerically by running simulations at parameter values within the known range and integrating the
solution over a long period of time. When the solution is seen converging toward a limit cycle, the correct value of e1 has been found. The limit cycle, shown in Figure 7, was found at e1 ≈ 0.69197. The period of this limit cycle is 30 days, the time between DC vaccinations. The cusp in the limit cycle is due to the discontinuity created by the repeated injections of DCs. For e1 < 0.69197, the cancer eventually relapses and then grows exponentially, as indicated by a diverging solution. For e1 > 0.69197, relapse is prevented and the cancer decays exponentially.

Figure 8. The torus found for e1 = 0.68973 with intermittent androgen deprivation therapy. [Plot: X2 versus X1.]

We next find the value of e1 at which IAD therapy combined with the DC vaccine is able to prevent an androgen-independent relapse. Again, we know this value is between 0.5 and 0.75. With intermittent therapy, the solution converges to a torus for a large enough value of e1 rather than decaying exponentially. Thus we need to find the lowest value of e1 at which the solution no longer diverges and grows exponentially. This is done again by integrating the solution over a long period of time and checking for divergence. The first stable torus, shown in Figure 8, was found at e1 ≈
0.68973. For larger values of e1, the torus is located at lower values of X2. The bifurcation diagram in Figure 9 shows the maximum and minimum values of X2 on these tori. Tori are produced in this case because X2 is experiencing oscillations due to the changes in T cell populations, which are periodically driven by the DC vaccinations, and the changes in the X1 mutation rate, which oscillates due to the changing androgen levels. X1 also experiences these oscillations, although the IAD therapy is the main source of the oscillations due to the much larger cell population and the effects of androgen on the growth rate of AD cells.

Figure 9. A bifurcation diagram showing the maximum and minimum values of X2 on the tori. [Plot: X2 versus e1.]

5. Discussion

The model has shown that androgen deprivation therapy alone is not enough to eliminate advanced prostate cancer or prevent an androgen-independent relapse, consistent with actual clinical results and the results of the Ideta et al. model in the case where the growth rate of AI cells is not inhibited by androgen. The model also shows that intermittent androgen deprivation therapy does not delay the progression of androgen-independent cancer when compared to continuous androgen deprivation. The model can
help explain the shorter time to relapse. The number of mutations from AD to AI cells is highest early in the on-treatment period of therapy, when androgen is low and the AD cell population is still relatively high. With continuous therapy, this high mutation rate only occurs once at the beginning of treatment, but with intermittent therapy, the high mutation rate occurs for each treatment cycle. This leads to a higher average mutation rate and therefore a higher average net growth rate for the AI cell population. While there is no benefit over continuous therapy with respect to relapse time, intermittent therapy is still an appealing treatment option due to the increased quality of life during the off-treatment periods.

When a dendritic cell vaccine is combined with continuous androgen deprivation, the model shows that the cancer can be eliminated with a relatively strong (but still within our reasonable parameter range) anti-tumor immune response. Clinical studies have shown that DC vaccines are able to stabilize disease progression in some patients with hormone refractory prostate cancer. Since these patients have androgen-independent cancer, we can safely assume that DC vaccines are capable of stopping the net growth of AI cells. When administered to a patient actively undergoing hormone therapy, we would then expect the DC vaccine to prevent an androgen-independent relapse while allowing continuous androgen deprivation therapy to eliminate the AD cell population. Thus the results of our model are reasonable in the case of DC vaccination combined with continuous androgen deprivation.

An interesting result of the model is that the DC vaccine is able to prevent relapse with a slightly weaker anti-tumor immune response when intermittent androgen deprivation is used instead of continuous androgen deprivation. While the difference was small (e1 = 0.68973 compared to e1 = 0.69197), the result was still surprising considering the effect that intermittent therapy has on relapse time without immunotherapy. This small difference can likely be attributed to the ability of the immune response to offset the higher mutation rate when intermittent therapy is used, and also to the consistently larger cancer cell population, which is necessary to stimulate the clonal expansion of T cells. With the low toxicity of DC vaccines 6,19 and the quality-of-life benefits of IAD, the combination of these two treatments could be very advantageous over continuous ADT for the treatment of advanced prostate cancer.

The complexity of the model combined with its discrete nature makes standard analysis difficult. This forced the mathematical analysis in section 4 to use specific parameter values when finding the desired values of e1, thus preventing us from having an algebraic expression for those values of e1. Simplifying the model by replacing the Michaelis-Menten terms with mass action terms was considered in section 2, but the discrete nature of the model would still present difficulties for bifurcation analysis.
Simplifying the model further may be possible by using continuous functions for D(t) and u(t), although this would make clinical application of the model difficult.

Accuracy in the estimation of parameter values for the immune response is a significant limitation of the model. Most of the parameter estimates were based on the parameter estimates used in the Kirschner and Panetta model. While many of these parameter estimates were based on data from biological studies, the estimates are for generic anti-tumor immune responses. The proper parameter values for an immune response against prostate cancer may be significantly different, and additional data from clinical studies of DC vaccines are needed before these parameter estimates can
be made accurately.

A possibly greater limitation of the model is the lack of regulatory T cells. These cells are responsible for reducing sensitivity to self-antigens and preventing toxic levels of lymphocytes in response to a pathogen 16. Since targeting cancer cells requires targeting a self-antigen, the regulatory T cells may have a significant suppressive effect on anti-tumor immune responses 20. Several studies have examined ways to target regulatory T cells in order to reduce their suppressive effect on immunotherapy 20. Including regulatory T cells in the model is a possible direction for future work, and the therapies which target the regulatory T cells could then be examined as well.

Acknowledgements

This work is supported in part by NSF DMS-0436341 and DMS-0920744.

References

1. R. R. Berges, J. Vukanovic, J. I. Epstein, M. CarMichel, L. Cisek, D. E. Johnson, R. W. Veltri, P. C. Walsh and J. T. Isaacs. Implication of cell kinetic changes during the progression of human prostatic cancer. Clin. Cancer Res., 1(5), 473 (1995).
2. N. Bruchovsky, L. Klotz, J. Crook and S. L. Goldenberg. Locally advanced prostate cancer-biochemical results from a prospective phase II study of intermittent androgen suppression for men with evidence of prostate-specific antigen recurrence after radiotherapy. Cancer, 109(5) (2007).
3. N. Craft, C. Chhor, C. Tran, A. Belldegrun, J. DeKernion, O. N. Witte, J. Said, R. E. Reiter and C. L. Sawyers. Evidence for Clonal Outgrowth of Androgen-independent Prostate Cancer Cells from Androgen-dependent Tumors through a Two-Step Process. Cancer Res., 59(19), 5030 (1999).
4. J. M. Crook, E. Szumacher, S. Malone, S. Huan and R. Segal. Intermittent androgen suppression in the management of prostate cancer. Urology, 53(3), 530 (1999).
5. S. Eikenberry, J. D. Nagy and Y. Kuang. The evolutionary impact of androgen levels on prostate cancer in a multi-scale mathematical model. Biology Direct, 5(24), 1 (2010).
6. L. Fong, D. Brockstedt, C. Benike, J. K. Breen, G. Strang, C. L. Ruegg and E. G. Engleman. Dendritic Cell-Based Xenoantigen Vaccination for Prostate Cancer Immunotherapy. J. Immunol., 167(12), 7150 (2001).
7. S. L. Goldenberg, N. Bruchovsky, M. E. Gleave, L. D. Sullivan and K. Akakura. Intermittent androgen suppression in the treatment of prostate cancer: a preliminary report. Urology, 45(5), 839 (1995).
8. M. E. Grossmann, H. Huang and D. J. Tindall. Androgen receptor signaling in androgen-refractory prostate cancer. J. Nat. Cancer Inst., 93(22), 1687 (2001).
9. A. M. Ideta, G. Tanaka, T. Takeuchi and K. Aihara. A Mathematical Model of Intermittent Androgen Suppression for Prostate Cancer. J. Nonlinear Sci., 18(6), 593 (2008).
10. T. L. Jackson. A mathematical model of prostate tumor growth and androgen-independent relapse. Discrete Cont. Dyn. Syst. Ser. B, 4(1), 187 (2004).
11. A. Jemal, R. Siegel, E. Ward, Y. Hao, J. Xu, T. Murray and M. J. Thun. Cancer Statistics. CA Cancer J. Clin., 58(2), 71 (2008).
12. D. Kirschner and J. C. Panetta. Modeling immunotherapy of the tumor-immune interaction. J. Math. Biol., 37(3), 235 (1998).
13. L. H. Klotz, H. W. Herr, M. J. Morse and W. F. Whitmore Jr. Intermittent endocrine therapy for advanced prostate cancer. Cancer, 58(11) (1986).
14. A. Lanzavecchia and F. Sallusto. Regulation of T cell immunity by dendritic cells. Cell, 106(3), 263 (2001).
15. M. T. Lotze and A. W. Thomson. Dendritic cells: biology and clinical applications. Academic Pr. (2001).
16. A. O'Garra and P. Vieira. Regulatory T cells and mechanisms of immune system control. Nature Med., 10(8), 801 (2004).
17. T. Portz, Y. Kuang and J. D. Nagy. A clinical data validated mathematical model of prostate cancer growth under intermittent androgen suppression therapy. AIP Advances, 2(011002), doi: 10.1063/1.3697848 (2012).
18. S. A. Rosenberg and M. T. Lotze. Cancer immunotherapy using interleukin-2 and interleukin-2-activated lymphocytes. Annu. Rev. Immunol., 4(1), 681 (1986).
19. E. J. Small, P. Fratesi, D. M. Reese, G. Strang, R. Laus, M. V. Peshwa and F. H. Valone. Immunotherapy of hormone-refractory prostate cancer with antigen-loaded dendritic cells. J. Clin. Oncol., 18(23), 3894 (2000).
20. W. Zou. Regulatory T cells, tumour immunity and immunotherapy. Nature Rev. Immunol., 6(4), 295 (2006).

May 10, 2013

14:59

BC: 8846 - BIOMAT 2012

06˙syed˙georgiev˙pardalos

SEIZURE MANIFOLD OF THE EPILEPTIC BRAIN: A STATE SPACE RECONSTRUCTION APPROACH

MUJAHID N. SYED∗, PANDO G. GEORGIEV† and PANOS M. PARDALOS‡§ Industrial and Systems Engineering Dept., University of Florida.

The objective of this paper is to provide a methodological overview of the nonlinear dynamical approach to analyzing ElectroEncephaloGram (EEG) signals. Specifically, the focus of this paper is to review reconstruction methods of the state space manifold of the epileptic brain using EEG time series. The paper presents the usage of Takens' theorem, and highlights the importance of time delay embedding dimensions. A seizure in the brain activity adds a chaotic structure to the existing nonlinear dynamics, and different measures can be used to describe the chaotic behavior of the seizure. The reconstructed manifold can also be used to separate the preictal and postictal periods from the ictal period of epileptic seizures. Finally, the paper concludes by presenting a short criticism of the usage of non-linear dynamics in the analysis of EEG signals.

Keywords: ElectroEncephaloGraphy (EEG), Nonlinear Dynamical Analysis, Phase Space Portrait, Seizure Manifold, Seizure Prediction, Surrogate Test, Time Delay Embedding, Takens Theorem

∗ smujahid@ufl.edu
† [email protected]
‡ pardalos@ufl.edu
§ This work is partially supported by NSF grant.

1. Introduction

Epilepsy is a chronic neurological disorder marked by repeated seizures or convulsions. Impulsive synchronous neuronal activity in the cerebral cortex is the main reason behind the occurrence of seizures. Epilepsy is an ancient disorder, and its origin was long attributed to superstitious or religious causes. It was not until the late 1800s and early 1900s that recordings of electrical activity from the human brain, called ElectroEncephaloGraphy (EEG), were analyzed in order to understand the disorder. In fact, due to the analysis and study of EEG, other functions of the brain as well as its anatomy are being understood.

Based on EEG analysis, it has been concluded that the impulsive synchronous neuronal activity may be kindled locally (in specific portions of the cerebral hemispheres) or globally (in both cerebral hemispheres) in the brain. Seizures that are initiated from local activity and remain confined to the region are called partial or focal seizures, whereas seizures that are initiated from global activity involving almost the entire brain are termed generalized seizures. In almost all cases, seizures occur without any prior indication, which is a critical issue for patients suffering from epilepsy. Usually, after a seizure's onset, there will be traces of seizures within the brain. These traces can be characterized by mild cognitive, psychic, sensory, motor or autonomic symptoms. Moreover, the frequency of seizures per day in a given patient is unpredictable.

Even in the current decade, there have been no efficient drugs that can be used to stop or control seizures. The only pragmatic approach, as of now, is to ask the patient to take the drug almost every day. Due to the 'impulsive' nature of the disorder, the patient cannot take part in social, educational and vocational activities. In addition, seizures result in several limitations on the family or personal activities of the patient. The patient's quality of life can be dramatically and permanently changed. In the extreme case, it may lead to segregation of the patient from society. The thought-provoking aspect of this disorder is that seizures may occur several times per day, once per day, once per month or once per year. Thus, a seizure is almost a discrete event, yet it has a lifelong effect on the patient. The only solution to avoid segregation of the patient is to identify the onset of a seizure sufficiently early, such that drugs can be administered to prevent the onset.
Fortunately, experimental analysis of EEG series has affirmed the existence of preictal periods with identifiable precursors that warn of the future onset of a seizure. Although identifying the preictal precursors from EEG is a challenging task, practical algorithms that work on scalp EEG will be a boon for any epilepsy patient.88

Depending upon the level of stochasticity and nonlinearity, different models can be applied to a complex system. A simplistic overview of available models, based on the work presented in Ref. 104, is illustrated in Figure 1. Understanding the characteristics of the system is the crucial and foremost task for any analysis of the system. If the theoretical characteristics in the form of equations are available, then a specific model can be applied to analyze the system. The problem of ambiguity arises when no theoretical information is available, and the characteristics of the system have to be identified from a series of scalar measures, like time series. The finiteness of the time series, in addition to the noise, will inhibit proper understanding of the underlying characteristics of the system. Moreover, the sampling rate at which the response is collected plays an important role in the concrete analysis of the system. In the absence of stochasticity, the major practices that are used to analyze a system can be categorized under linear and non-linear methods.

Figure 1. Models available for analysis of time series.

In contrast to linear methods, dynamical methods exploit the nonlinear dynamics of the time series. Generally, the nonlinear dynamics of a system can be explained by a deterministic component and a stochastic component. Most dynamical analysis methods assume that the stochastic component is almost negligible in the system. Such systems are self-organizing in nature, i.e. the non-linearity can be explained by a deterministic rule. Moreover, at certain periods, these systems behave abruptly, and are unpredictable for some time interval. This phenomenon is also known as deterministic chaos, and the systems are called chaotic systems. Although at first glance it may appear that the unpredictability is due to a stochastic nature, the actual reason for chaos is attributed to a past subtle change in the system's state. In other words, any infinitesimal perturbation will eventually result in an exponential disturbance. In fact, chaotic systems are marked by this exponential divergence and convergence in the state space. Moreover, EEG does show such exponential divergence and convergence (or chaos) at the time of seizure.42,87,67,40 Thus, seizure analysis of EEG has been widely investigated using dynamical methods.8,36

One of the basic measures that can be used for a chaotic system is the attractor's dimension d, which is linked to the manifold. Given the current state of a chaotic system, all the future states of the system can in principle be predicted, owing to the existence of an underlying deterministic rule. Usually, the state of the system is represented as a point in a vector space, also known as the Phase Space or State Space. The dynamics of the system can be studied through the dynamics of the points in the state space. However, in order to define the state space, the dimension of the system should be known (manifold dimension). Typically, dimensional information about the system is unknown a priori; thus the dimension of the state space is to be identified from the time series. In order to construct the state space, a Time Delay Embedding (TDE) is generally used. Typically, the right delay time τ and embedding dimension m for the TDE are unknown. A successful TDE manifold is created when the main ambiguities are resolved, i.e. identification of determinism, embedding dimension, and delay time. A manifold is referred to as a "strange attractor" when it never repeats its trajectory. This strange phenomenon is attributed to the fractal (non-integer) dimension of the manifold, which is in turn attributed to the dissipative nature of the system.
Manifolds created from EEG do show this strange phenomenon.40 The manifolds are very useful in understanding and identifying the subtle changes in the system's state that may instigate future exponential divergences (which can be related to the occurrence of seizures). The focus of the dynamical analysis of EEG is to identify, using appropriate measures, the infinitesimal changes (preictal precursors) that occur long before the actual unpredictability (seizure). The measures may be obtained using local or global information of the manifold. Unlike the linear case, in dynamical analysis the results obtained from any of the measures are to be statistically verified against surrogate data. This ensures that the results are statistically meaningful.

In this paper, the basic ideas of the steps involved in nonlinear dynamical analysis of EEG will be reviewed. Furthermore, the application of dynamical methods in the analysis of EEG signals, and their role in seizure prediction in particular, will be addressed.
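The sensitivity to infinitesimal perturbations described in this introduction can be illustrated with a standard toy system. The sketch below uses the logistic map with r = 4, a textbook chaotic map chosen here purely for illustration; it is not a model of EEG, and the initial value and perturbation size are arbitrary assumptions.

```python
def logistic_orbit(x0, n_steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return the orbit."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two orbits whose initial states differ by an (arbitrary) tiny perturbation.
orbit_a = logistic_orbit(0.2, 50)
orbit_b = logistic_orbit(0.2 + 1e-10, 50)
gaps = [abs(u - v) for u, v in zip(orbit_a, orbit_b)]
```

The gap between the two orbits starts far below any measurement resolution and grows until it is comparable to the size of the attractor itself, even though each orbit follows the same deterministic rule.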

2. Review

In this section, a review that highlights the key ideas underlying the state space reconstruction of a dynamical system from a scalar measure will be presented. Specifically, the conditions which ensure that the reconstructed manifold is an embedding (a smooth one-to-one mapping with a smooth inverse) will be reviewed. Our approach will be to present the ideology, and interested readers are referred to Refs. 116, 110, 100, 12, 3, 49 for precise statements of the theorems and definitions presented in this section. Before proceeding further, consider the following definitions:

Definition 2.1. Homeomorphism: A bijective mapping $\phi$ is said to be homeomorphic if $\phi$ and $\phi^{-1}$ are continuous.

Definition 2.2. Manifold: An $n$-dimensional set $M \subset \mathbb{R}^n$ is called a manifold if for every $x \in M$ there exists an open neighborhood $N(x)$ for which there is a homeomorphism $\phi : S \to N(x)$, where $S$ is an open set in $\mathbb{R}^n$.

Definition 2.3. Diffeomorphism: Two manifolds $M_1$ and $M_2$ are said to be diffeomorphic if there exists a bijective mapping $\varphi$ between $M_1$ and $M_2$ (i.e. $M_1 = \varphi(M_2)$) such that both $\varphi$ and $\varphi^{-1}$ are differentiable.

Definition 2.4. Discrete Dynamical System: A Discrete Dynamical System (DDS) is defined by a tuple $(T, M, \Phi)$, where $T \subset \mathbb{Z}$, $M$ is a manifold, and $\Phi$ is an evolution function defined as $\Phi : U \subset T \times M \to M$.

For example, let $x = \{x_i\}_{i=1}^{N}$ represent a discrete time series, and let $s_i$ represent the state of the system generated by $x$. Let $s_{t_1}$ represent the state of the system at time $t_1$ and let $s_{t_2}$ represent any future state at time $t_2$. If there exists a deterministic rule $\Phi(t_2, s_{t_1})$ such that $s_{t_2}$ can be predicted (or observed) from $s_{t_1}$, i.e.

$$ s_{t_2} = \Phi(t_2, s_{t_1}) \qquad (1) $$

then the collection of rules $\Phi(t, s_r)$ for all possible values of discrete times $t, r$ is called a DDS.

Definition 2.5. Embedding: Let $M_1$ be a smooth manifold and let $f : M_1 \to M_2$ be a differentiable mapping. $f$ is called an embedding of $M_1$ in $M_2$ if:
(1) $f(M_1) \subset M_2$ is a differentiable submanifold of $M_2$, and
(2) $f : M_1 \to f(M_1)$ is a diffeomorphism,
where smooth is defined as continuously differentiable ($C^1$).

2.1. Embedding

Let $s_t$ be the unknown state of the attractor at time $t$. In order to infer $s_t$, let $m$ independent scalar measures $x_t = [x_{t,1}, \ldots, x_{t,m}]^T$ be taken from the system at time $t$. The key idea of identifying $s_t$ is to show that there exists a map $f$ such that $x_t = f(s_t)$. Since a point in the state space uniquely defines the state of the system, the map should be one-to-one. The following theorems state the conditions under which such a one-to-one mapping exists.

Theorem 2.1. (a conclusion from Whitney's work): Let $A$ be a $d$-dimensional smooth manifold in $\mathbb{R}^k$, and let $f : \mathbb{R}^k \to \mathbb{R}^m$ be generic. If $m > 2d$, then $f$ is one-to-one on $A$.

Theorem 2.2. (Summary of Whitney's theorem): Let $M_d \subset \mathbb{R}^d$ be a smooth, Hausdorff, second-countable compact manifold. Let $f : \mathbb{R}^d \to \mathbb{R}^m$ be a generic smooth mapping. Let $M_m$ be the manifold obtained by applying the map $f$ to $M_d$, i.e. $M_m = f(M_d)$. If $m > 2d > 0$, then $M_d$ and $M_m$ are diffeomorphic.

Here the term "generic" means open and dense, i.e. for a given mapping $F$ there is an arbitrarily small perturbation which is an embedding, and any arbitrarily small perturbation of an embedding is again an embedding.

Let $h : \mathbb{R}^k \to \mathbb{R}$ be a general measurement function. Let $g : M_d \to M_d$ be a map that defines the dynamics of the system. Takens defined the delay coordinate function $f_d : \mathbb{R}^k \to \mathbb{R}^m$ as:

$$ f_d(s_t) = [h(s_t), h(g^{-\Delta t}(s_t)), \ldots, h(g^{-(m-1)\Delta t}(s_t))]^T \qquad (2) $$

where, for discrete systems, $g^{-\alpha\Delta t}(s_t)$ can be boiled down to $s_{t-\alpha\Delta t}$, and $\Delta t$ can be considered as some positive multiple of the sampling interval between the scalar measurements.
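For a sampled scalar series, the delay coordinate function in equation (2) reduces to stacking lagged copies of the series. A minimal Python sketch is given below; the function name `delay_embed` and the convention that `tau` counts samples (i.e. the delay is `tau` times the sampling interval) are assumptions of this sketch, not notation from the references.

```python
import numpy as np


def delay_embed(x, m, tau):
    """Delay-coordinate vectors of a scalar series.

    Row i is [x[t], x[t - tau], ..., x[t - (m - 1) * tau]] with
    t = i + (m - 1) * tau, mirroring equation (2) for a sampled series.
    """
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau
    if m < 1 or tau < 1 or start >= len(x):
        raise ValueError("need m >= 1, tau >= 1 and len(x) > (m - 1) * tau")
    columns = [x[start - j * tau: len(x) - j * tau] for j in range(m)]
    return np.column_stack(columns)


# A series of 10 samples embedded with m = 3 and tau = 2 gives 6 state vectors.
embedded = delay_embed(np.arange(10), m=3, tau=2)
```

Each row of the result is one reconstructed state; the first (m - 1) * tau samples have no complete history and therefore produce no row.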

Theorem 2.3. (Summary of Takens' theorem): Let $M_d$ be a $d$-dimensional compact manifold in $\mathbb{R}^k$, which is invariant under a map $g : M_d \to M_d$. Let $h : \mathbb{R}^k \to \mathbb{R}$ be a general measurement function and $\Delta t$ be a time delay. If $m > 2d$ and $f_d : \mathbb{R}^k \to \mathbb{R}^m$ is defined as in equation (2), then the delay coordinate map $f_d$ is an embedding of $M_d$.

Despite the fact that Theorem 2.3 can be applied to reconstruct the manifold of a DDS, there are some practical limitations to the theorem in the case of experimental analysis. These limitations stem from the theorem's assumptions, which can be enumerated as:
(1) $d$ is an integer,
(2) the time-varying scalar measurements are noise free, and
(3) infinitely many measurements are available.

It turns out that Theorems 2.2 and 2.3 are valid when $d$ is non-integer, as long as the dimension is measured in terms of the box counting dimension. To the best of our knowledge, for practical scenarios, the theorems were extended by Sauer et al.100 in terms of fractal dimensions. Furthermore, Sauer et al. provided multivariate extensions of Takens' theorem, which can be summarized as:

Theorem 2.4. (Fractal Multivariate Sauer et al. Theorem): Let $M_d$ be a compact set in $\mathbb{R}^k$ with box counting dimension $d_B$, which is invariant under a map $g : M_d \to M_d$. Let $h_1, \ldots, h_r : \mathbb{R}^k \to \mathbb{R}$ be $r$ generic measurement functions and $\Delta t_1, \ldots, \Delta t_r$ be $r$ generic time delays. If $m_1, \ldots, m_r$ are integers such that $m = m_1 + \cdots + m_r > 2 d_B$, and $f_d : \mathbb{R}^k \to \mathbb{R}^m$ is defined as:

$$ f_d(s_t) = [h_1(s_t), h_1(g^{-\Delta t_1}(s_t)), \ldots, h_1(g^{-(m_1-1)\Delta t_1}(s_t)), \ldots, h_r(s_t), h_r(g^{-\Delta t_r}(s_t)), \ldots, h_r(g^{-(m_r-1)\Delta t_r}(s_t))]^T $$

then the delay coordinate map $f_d$ is an embedding of $M_d$.

Up to this point, noise and finite data have not been incorporated in the definitions. Casdagli et al.12 illustrated extensions to incorporate noise and finite data using statistical methods. In sum, the major contribution of Casdagli et al.'s work is to stress the optimal selection of the embedding dimension $m$ and delay time $\tau$. Based on their work, various theories have been developed to select the two parameters. Some of these theories will be presented in the following section.


3. Methodology

One of the most widely used methods of understanding the dynamics of a time series is to develop a state space portrait, whose primary purpose is to depict the time evolution of a dynamical system. The main steps in analyzing a time series using dynamical methods are depicted in Figure 2. The remainder of this section discusses these steps in detail.

Figure 2. Overview of Dynamical Analysis

3.1. Preprocessing

Filtering Noise

For a time series, band pass filters (or low/high pass filters) are usually used to remove noise from the data. Although a similar methodology can be used for noise filtering in dynamical analysis, care should be taken while applying filters. Filters may add deterministic nonlinearity to the data, and this may tamper with the existing dynamics within the data. Many studies in the literature show that improper filtering results in a false characterization of the time series [7]. In fact, infinite impulse response filters should be avoided while filtering time series for dynamical analysis, since such filters may produce faulty dimensions in the reconstructed manifold [90]. Despite these complexities, in practice, traditional band pass filters do perform well when the noise is restricted to certain frequencies (which is the case for EEG data). In addition to traditional band pass filtering, alternative filtering methods have been developed for dynamical data analysis [108, 55, 18, 48, 10, 72]. An excellent review of noise filtering methods for a general chaotic series (not confined to dissipative systems like EEG time series) is given in Ref. 54.

Identifying Stationarity

Stationarity is a usual assumption in most dynamical measures, and verifying it is one of the important preprocessing steps. It may be argued that stationarity of the time series does not sufficiently reflect stationarity of the dynamical system. However, the dynamical system is unknown, and stationarity of the time series is the only feature of the system that can be measured at this point. Stationarity of a time series can be verified by various methods [95, 48, 70]. If the time series is found to be non-stationary, then methods like windowing and de-trending can be used to overcome the effects of non-stationarity. In addition to traditional time-series stationarity, methods for detecting stationarity of dynamical systems have been proposed in Refs. 103, 118, 98. For an arbitrary time series, incorporating non-stationarity is a difficult task (for example: methods like recurrence plots, Refs. 26, 73, 19, 53, and space time index plots, Ref. 118). However, for EEG time series, simple techniques like windowing and de-trending work well for any analysis based on stationarity [84].

Identifying Determinism

Without determinism, there would be no dynamical analysis. Determinism cannot be verified from the time series directly; it has to be assessed from the state space. Kaplan and Glass [50] pioneered a simple theoretical method that can classify a system as deterministic or stochastic. The main hypothesis of their approach is that a deterministic manifold will have smooth flow with respect to gradient directions and values, whereas a random (or stochastic) manifold will have abrupt changes in the gradients. However, the theoretical idea could not be easily applied to practical data due to the cumbersome task of grid analysis. Cao [11] proposed a practical method to detect any determinism hidden in the given time series. The measure is calculated from the time series, but using the state space dimension information. Let $D(m)$ be a measure of determinism defined as:

$$D(m) = \frac{1}{N - m\tau} \sum_{i=1}^{N - m\tau} \left| x_{i+m\tau} - x_{I^*(i,m)+m\tau} \right| \tag{3}$$

where $I^*(i, m)$ denotes the index of the point closest to $y_i(m)$, other than $y_i(m)$ itself, defined as $I^*(i, m) = \arg\min_{l : l \neq i} \| y_i(m) - y_l(m) \|$, and $y_i(m)$ is defined as:

$$y_i(m) = [x_i, x_{i+\tau}, \ldots, x_{i+(m-2)\tau}, x_{i+(m-1)\tau}]^T \quad \forall\, i = 1, \ldots, N - m\tau \tag{4}$$

Let $DR(m)$ be defined as:

$$DR(m) = \frac{D(m+1)}{D(m)} \tag{5}$$
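Equations (3) to (5) can be sketched in a few lines of code. The version below is a minimal illustration rather than Cao's reference implementation: the function names are hypothetical, indices are 0-based, the nearest neighbour is found by brute force, and the maximum norm is assumed for $\| \cdot \|$ (the text does not fix the norm at this point).

```python
def delay_vector(x, i, m, tau):
    """y_i(m) of equation (4), with a 0-based index i."""
    return [x[i + j * tau] for j in range(m)]


def max_norm_distance(u, v):
    """Maximum-norm distance; the choice of norm is an assumption of this sketch."""
    return max(abs(a - b) for a, b in zip(u, v))


def determinism_measure(x, m, tau=1):
    """D(m) of equation (3), averaged over the N - m*tau usable points."""
    n = len(x) - m * tau
    if n < 2:
        raise ValueError("time series too short for this (m, tau)")
    vectors = [delay_vector(x, i, m, tau) for i in range(n)]
    total = 0.0
    for i in range(n):
        # I*(i, m): index of the delay vector closest to y_i(m), excluding i itself.
        nearest = min((l for l in range(n) if l != i),
                      key=lambda l: max_norm_distance(vectors[i], vectors[l]))
        total += abs(x[i + m * tau] - x[nearest + m * tau])
    return total / n


def determinism_ratio(x, m, tau=1):
    """DR(m) = D(m + 1) / D(m) of equation (5)."""
    return determinism_measure(x, m + 1, tau) / determinism_measure(x, m, tau)
```

The brute-force neighbour search costs on the order of $N^2$ distance evaluations per value of $m$, which is acceptable for short illustrative series only.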

The behavior of $DR(m)$ with respect to different values of $m$ gives information regarding the determinism of the time series. Usually, a completely random time series will have $DR(m) = 1$ for all values of $m$, whereas for a deterministic time series there exists at least one $m$ such that $DR(m) \neq 1$.

3.2. Manifold Generation

A simple and powerful method to construct the state space manifold is TDE. Although TDE had been applied to time series even prior to the development of the theory of dynamical systems [119], its success in dynamical analysis is mostly attributed to Takens [110]. Roughly speaking, Takens showed that the information from a scalar measurement (or a one-dimensional time series) is sufficient to construct the whole state space manifold (see Section 2 for the strict definition). Apart from TDE, the method of derivatives [89] is another approach that stems from Takens' theorem. Principal value decomposition may also be used as an alternative for state space reconstruction. In this work, manifold construction by the method of TDE will be presented.


Time Delay Embedding

Let $x = \{x_i\}_{i=1}^{N}$ represent a discrete time series observed from the system under consideration. An $m$-dimensional TDE is a manifold $M \subset \mathbb{R}^m$, defined as $M = \{y_k(m)\}_{k=1}^{N-(m-1)\tau}$. Each point $y_k(m)$ is generated from the time series as:

$$y_k(m) = [x_k, x_{k+\tau}, \ldots, x_{k+(m-2)\tau}, x_{k+(m-1)\tau}]^T \quad \forall\, k = 1, \ldots, N - (m-1)\tau \tag{6}$$

where $m$ represents the embedding dimension and $\tau$ represents the delay time (or lag time). Typically, any geometrical analysis conducted on the manifold can be regarded as a dynamical analysis of the system [69, 89].

Equation (6) is a simple reduction of Takens' theorem. The delay embedding presented in equation (6) is simple and easy to understand. However, the bottleneck of such an embedding lies in identifying the optimal embedding dimension $m$ and the optimal delay time $\tau$. When Takens proposed the embedding, there was no restriction on the value of $\tau$. However, there was a lower limit on the value of $m$, given as $m > 2d$, where $d$ is the fractal dimension of the underlying attractor. The critical aspect of Takens' approach is the assumption of noise-free data of infinite length, whereas real world data is never noise free and always has finite length. Thus, for real world data (noisy and finite in nature), Takens' theorem has been extended, see Refs. 12, 86, 54. These papers suggested that there is a need to select proper values of $m$ and $\tau$. Improper selection of these two parameters will produce false-positive results that wrongly confirm a dynamical structure in the manifold. Therefore, it is very important to select the proper parameters. There are methods that identify the optimal embedding dimension for a given delay time, and there are methods that identify the optimal delay time for a given embedding dimension. Thus, a bidirectional search is required to find the optimal values of $m$ and $\tau$. For the bidirectional search, a rule of thumb [112] is to set a constant $K$ and search over all possible values of $m$ and $\tau$ such that $(m - 1) \times \tau = K$. The search is repeated for possible values of $K$ to find the optimal parameters.
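As a minimal sketch of equation (6) and of the window rule $(m - 1) \times \tau = K$, the delay vectors and the candidate parameter pairs can be generated as follows. The function names `delay_embed` and `window_pairs` are hypothetical, indices are 0-based, and only integer delays are considered.

```python
def delay_embed(x, m, tau):
    """Return the delay vectors y_k(m) of equation (6) as a list of lists.

    The k-th vector is [x[k], x[k + tau], ..., x[k + (m - 1) * tau]],
    for the N - (m - 1) * tau values of k that fit inside the series.
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    n_points = len(x) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("time series too short for this (m, tau)")
    return [[x[k + j * tau] for j in range(m)] for k in range(n_points)]


def window_pairs(K):
    """All integer pairs (m, tau) with (m - 1) * tau == K and m >= 2."""
    return [(K // tau + 1, tau) for tau in range(1, K + 1) if K % tau == 0]
```

For example, a series of six samples embedded with $m = 3$ and $\tau = 2$ yields two delay vectors, and $K = 6$ yields the candidate pairs $(7, 1)$, $(4, 2)$, $(3, 3)$ and $(2, 6)$.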

Embedding Dimension

Takens' theorem indicates that if $m_0$ is the dimension of an appropriate embedding, then any dimension $m_1 > m_0$ will result in an appropriate embedding as well. Therefore, the best choice of $m$ is the minimum value that can exploit the determinism of the dynamical system without any loss of information. One may argue that a sufficiently high value of $m$ should serve the purpose, but this is not true in general, since a very high value of $m$ may introduce ambiguity in calculating some of the measures (like Lyapunov exponents). Furthermore, a higher value of $m$ results in higher computation time. From the literature [52, 33, 97, 24] it can be seen that the false neighborhood method is considered one of the ways to estimate a sufficient embedding dimension. One simple and robust method of determining the sufficient embedding dimension will be presented in this paper. For the other available methods, interested readers are directed to Refs. 29, 31, 10, 76, 51.

A False Nearest Neighbor (FNN) of a point $P_1$ is any point $P_2$ that appears close to $P_1$ in dimension $m$, but is not close to $P_1$ in dimension $m + 1$. Such closeness in dimension $m$ occurs because the value of $m$ is too small. When $m$ is smaller than the sufficient dimension, then due to projection, points appear closer than they actually are in the real manifold. Thus, by determining the fraction of FNN points for a given $m$, a decision can be made regarding the validity of $m$. The basic rule of FNN is that if the fraction of FNN over all points of the time series is below some threshold [97], or zero, then the dimension is appropriate. A threshold-free practical method to identify the sufficient dimension is presented in Ref. 11, described as:

$$\mu(k, m) = \frac{\| y_k(m+1) - y_{I^*(k,m)}(m+1) \|_p}{\| y_k(m) - y_{I^*(k,m)}(m) \|_p} \tag{7}$$

where $\mu(k, m)$ is a measure of FNN, $I^*(k, m)$ is as defined in equation (3), and $p$ is taken as $\infty$ in Ref. 11; however, any suitable norm can be used. A mean FNN measure for a given time series is defined as:

$$\Delta(m) = \frac{1}{N - m\tau} \sum_{k=1}^{N - m\tau} \mu(k, m) \tag{8}$$

Next, a relative measure of FNN is defined as:

$$\tilde{\Delta}(m) = \frac{\Delta(m+1)}{\Delta(m)} \tag{9}$$

Equation (9) provides a non-subjective measure for identifying the sufficient embedding dimension. Let $m_{-1}$ be the first value of $m$ where the curve of $\tilde{\Delta}(m)$ with respect to $m$ starts to saturate, i.e., there is no significant change in the value of $\tilde{\Delta}(m)$ for any value of $m > m_{-1}$. Then $m_0 = m_{-1} + 1$ is taken as the sufficient embedding dimension of the time delay embedding. At this stage of finding the sufficient embedding dimension, the optimal delay time is unknown. As a preliminary approach for a discrete time series, $\tau = 1$ can be used. However, there are ample methods in the literature that can be used to find the optimal value of $\tau$. The following subsection presents the topic of selecting an appropriate delay time.
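The quantities in equations (7) to (9) can be sketched as follows. This is an illustrative brute-force version under stated assumptions: the names `mean_fnn` and `relative_fnn` are hypothetical, the maximum norm ($p = \infty$) is used, and neighbours at exactly zero distance are skipped to avoid division by zero (so the average runs over the points that have a usable neighbour).

```python
def embed(x, m, tau, n_points):
    """First n_points delay vectors of dimension m (0-based indices)."""
    return [[x[k + j * tau] for j in range(m)] for k in range(n_points)]


def max_norm_distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def mean_fnn(x, m, tau=1):
    """Delta(m) of equation (8): the mean of mu(k, m) from equation (7)."""
    n = len(x) - m * tau
    if n < 2:
        raise ValueError("time series too short for this (m, tau)")
    low = embed(x, m, tau, n)
    high = embed(x, m + 1, tau, n)
    total, used = 0.0, 0
    for k in range(n):
        best, best_dist = None, float("inf")
        for l in range(n):
            if l == k:
                continue
            d = max_norm_distance(low[k], low[l])
            # Assumption of this sketch: ignore exact duplicates (d == 0).
            if 0.0 < d < best_dist:
                best, best_dist = l, d
        if best is None:
            continue
        total += max_norm_distance(high[k], high[best]) / best_dist
        used += 1
    if used == 0:
        raise ValueError("all delay vectors coincide")
    return total / used


def relative_fnn(x, m, tau=1):
    """Relative FNN measure of equation (9): Delta(m + 1) / Delta(m)."""
    return mean_fnn(x, m + 1, tau) / mean_fnn(x, m, tau)
```

With the maximum norm, every ratio $\mu(k, m)$ is at least 1, because the $(m+1)$-dimensional vectors contain the $m$-dimensional ones as their leading components. In practice `relative_fnn` would be evaluated for $m = 1, 2, \ldots$ and inspected for saturation.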

Delay Time

Theoretically, the value of $\tau$ does not interfere in determining the sufficient value of $m$, but practically, the value of $\tau$ plays a significant role in correctly identifying $m$. This difference is again attributed to the assumption of infinite and noise-free data in the theoretical result of Takens. Usually, if the value of $\tau$ is smaller than the optimal value, say $\tau_0$, then the successive points in the TDE are strongly correlated, whereas if $\tau \gg \tau_0$ then the successive points in the TDE are almost independent. Strong correlation leads to false cluster generation along the diagonal in $\mathbb{R}^m$, whereas independence leads to a spread of points in $\mathbb{R}^m$. Both scenarios tamper with the deterministic dynamics of the system. The value of $\tau$ that results in zero autocorrelation may be defined as an optimal value. A primitive linear independence approach is to search for the minimum value of $\tau$ such that the autocorrelation function between any two consecutive state space points passes through zero [1]. The linear autocorrelation method is further extended to a higher-order autocorrelation method in Ref. 2. On the other hand, a method based on mutual information is proposed in Ref. 23 to identify the optimal value of $\tau$. In this work, we present the method of mutual information, since some practical success has been seen with this approach.

Let $[L, U]$ be the interval explored by the time series, i.e. $L \le x_i \le U$ $\forall\, i = 1, \ldots, N$. Let the interval be equally divided into bins of width $\epsilon$. These bins can be used to create a histogram. Let $N_i$ be the total number of data points from the given time series, $x$, that fall into the $i$th bin. Now, $p_i = N_i / N$ represents the probability that a point of the time series belongs to the $i$th bin. Similarly, let $p_{ij}(\tau)$ be the probability that $x_r$ belongs to the $i$th bin and $x_{r+\tau}$ belongs to the $j$th bin, taken over every value of $r$. The mutual information obtained with delay time $\tau$ is given as:

$$H(\tau) = \sum_{i,j} p_{ij}(\tau) \ln(p_{ij}(\tau)) - 2 \sum_{i} p_i \ln(p_i) \tag{10}$$

The width of the bins can be adaptive or fixed. An easy approach to calculating mutual information is to set the bin width to a coarse value. Since the actual value of the mutual information is not important, and only the location of its first minimum (which is taken as the delay time) is required, a coarse bin width can be used without much error.
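Equation (10) with fixed-width bins can be sketched as below. The function names, the default of 16 bins and the simple local-minimum rule are assumptions made for illustration; they are not prescribed by the text or by Ref. 23.

```python
import math
from collections import Counter


def delay_mutual_information(x, tau, n_bins=16):
    """H(tau) of equation (10), estimated from a fixed-width histogram."""
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("constant series has no usable histogram")
    if not 0 < tau < len(x):
        raise ValueError("tau must satisfy 0 < tau < len(x)")
    width = (hi - lo) / n_bins
    # Bin index of every sample; the maximum value is put in the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in x]
    n = len(x)
    n_pairs = n - tau
    single = Counter(bins)                         # N_i
    pairs = Counter(zip(bins[:-tau], bins[tau:]))  # counts behind p_ij(tau)
    joint_term = sum((c / n_pairs) * math.log(c / n_pairs)
                     for c in pairs.values())
    marginal_term = sum((c / n) * math.log(c / n) for c in single.values())
    return joint_term - 2.0 * marginal_term


def first_minimum_delay(x, max_tau, n_bins=16):
    """Smallest tau at which H(tau) has a local minimum, or None if none is found."""
    values = [delay_mutual_information(x, t, n_bins)
              for t in range(1, max_tau + 1)]
    for idx in range(1, len(values) - 1):
        if values[idx] < values[idx - 1] and values[idx] <= values[idx + 1]:
            return idx + 1  # values[idx] corresponds to tau = idx + 1
    return None
```

Empty bins are simply absent from the counters, so no $0 \ln 0$ terms arise. For noisy data the curve $H(\tau)$ is usually inspected visually as well, since spurious shallow minima can appear.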

3.3. Measures of DDS

The true success of dynamical analysis is displayed when a system that is assumed to be random can be described by dynamic rules. The primary question while applying dynamical analysis is how to distinguish random perturbations from deterministic chaos. In fact, most of the proposed measures of DDS were used to quantify the determinism in the manifold. Measures can be used to characterize and/or predict the TDE manifold locally or globally. Although it is not easy to predict the global behavior of the system unless its complete determinism is extracted from the data, local predictions are quite possible. Most of the widely known measures of DDS were developed for a known dynamical system (one for which an analytical form is available). Moreover, such theoretical systems are based on infinite time series, and usually neglect the presence of noise. These theoretical methods were later extended to analyze manifolds constructed from experimental data. In the following part of this subsection, some of the widely known dynamical measures will be discussed.

Fractal Dimension

The dimension of the system gives an estimate of the system's complexity. One of the most common global measures that characterizes the dynamics of a time series is the fractal dimension of the system. This dimension is widely known as the Hausdorff dimension, and can be approximately calculated as the box dimension. In the literature of DDS, a classical work on calculating the fractal dimension of the TDE manifold is presented by Grassberger and Procaccia [31] using the concept of correlation. Their methodology was updated and widely applied to calculate dimension because of its simplicity [31, 27, 109, 102]. One of the modified practical approaches that calculates the dimension of the system is presented by Judd in Ref. 44. In Ref. 25 a head-to-head comparison between both methods is illustrated. In addition to these methods, another classical method that basically differs from Grassberger and Procaccia's method is independently presented by Takens [111] and Ellner [22].

Let $C(\epsilon)$ denote the correlation integral of the manifold. Theoretically, according to Grassberger and Procaccia [31], the fractal dimension, $d_c$, of the manifold can be defined through:

$$C(\epsilon) = a\, \epsilon^{d_c} \tag{11}$$

where $\epsilon > 0$ is very small. In Ref. 31 the definition of $C(\epsilon)$ is given as:

$$C(\epsilon) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i,j=1}^{N} \Theta(\epsilon - \| y_i - y_j \|) \tag{12}$$

where $\Theta(\cdot)$ is the Heaviside function defined as:

$$\Theta(h) = \begin{cases} 1 & h \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{13}$$

As can be seen from equation (12), the limit $N \to \infty$ is applicable only to analytical models or to models with infinite data. Thus, the equation cannot be applied directly to limited data. To overcome this limitation, the following practical extension is used for experimental data. Let $L_0$ be the largest inter-point distance in the manifold. Let us define bins $B_i$ on the real axis as:

$$B_i = [l_i, l_{i+1}) \quad i = 1, \ldots, (\tilde{B} - 1) \tag{14}$$

where $l_i = \alpha^i L_0$, $0 < \alpha < 1$, and $\tilde{B}$ is the total number of bins. Now all the inter-point distances on the manifold are calculated as $d_{j,k}$, where $d_{j,k} = \| y_k - y_j \|$ denotes the distance between two points on the manifold. If the total number of distances $d_{j,k}$ that belong to interval $B_i$ is denoted by $\tilde{B}_i$, then the distribution of distances is given as:

$$p_i = \frac{\tilde{B}_i}{\sum_k \tilde{B}_k} \tag{15}$$

With the knowledge of the distribution, the discretised version of the correlation integral can be defined as:

$$C(l_i) = C_i = \sum_{j=i}^{\tilde{B}} p_j \tag{16}$$

Thus, by equation (11), the slope of the line on the log-log plot of $C_i$ against $l_i$ gives the desired fractal dimension of the manifold.

Unlike the dimension, which measures the complexity of the system, the Lyapunov exponent and the Kolmogorov entropy measure the level of chaos in the system. In the following part of this subsection, these two types of measures will be presented.
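The estimate described by equations (11) to (16) can be sketched as follows. The sketch makes several assumptions that go beyond the text: the function names are hypothetical, Euclidean distance is used, self-pairs ($i = j$) are excluded, the cumulative fraction of distances not exceeding $l_i = \alpha^i L_0$ is computed directly instead of through explicit bins, and the dimension is taken as the least-squares slope of $\ln C_i$ against $\ln l_i$ over a hand-picked range of levels.

```python
import bisect
import math


def pairwise_distances(points):
    """Sorted Euclidean distances between all distinct pairs of points."""
    return sorted(math.dist(p, q)
                  for i, p in enumerate(points)
                  for q in points[i + 1:])


def correlation_dimension(points, alpha=0.5, levels=range(2, 6)):
    """Slope of ln C(l_i) versus ln l_i for the radii l_i = alpha**i * L0."""
    dists = pairwise_distances(points)
    l0 = dists[-1]  # largest inter-point distance L0
    log_eps, log_c = [], []
    for i in levels:
        eps = (alpha ** i) * l0
        # Fraction of pair distances that do not exceed eps.
        c = bisect.bisect_right(dists, eps) / len(dists)
        if c > 0.0:
            log_eps.append(math.log(eps))
            log_c.append(math.log(c))
    if len(log_eps) < 2:
        raise ValueError("not enough non-empty radii to fit a slope")
    mean_x = sum(log_eps) / len(log_eps)
    mean_y = sum(log_c) / len(log_c)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_eps, log_c))
    den = sum((a - mean_x) ** 2 for a in log_eps)
    return num / den
```

The choice of `levels` decides which part of the curve is treated as the scaling region; in practice this region is normally selected by inspecting the log-log plot rather than fixed in advance.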

Lyapunov Exponents

The Lyapunov exponent is another widely known measure, used to determine the divergence rate of trajectories in the state space. Basically, it measures the sensitivity of the system to its initial conditions. Chaotic systems are very sensitive to perturbations, i.e., any infinitesimal change in the initial stages of the system results in exponential divergence in the later stages. This is one of the fundamental characteristics of a chaotic system. The growth rate of this divergence is called the Lyapunov exponent. From the theory, the basic idea of calculating the Lyapunov exponent is to measure the rate of growth in the distance between two states. Let $y_{t_1}$ and $y_{t_2}$ be any two states of the system at the initial observation, such that $\| y_{t_2} - y_{t_1} \| = d_0$. Let the second observation be made after an elapsed time of $\Delta t$. Then the maximum Lyapunov exponent $\lambda_{\max}$ is calculated as:

$$\lambda_{\max} = \lim_{\Delta t \to \infty} \lim_{d_0 \to 0} \frac{1}{\Delta t} \log_e \frac{d_{\Delta t}}{d_0} \tag{17}$$

where $\| y_{t_2 + \Delta t} - y_{t_1 + \Delta t} \| = d_{\Delta t}$. Different local directions from the initial observation lead to different values of the Lyapunov exponent. These exponents together are called the Lyapunov spectrum, and theoretically the spectrum can provide global information about the growth rate. However, in practice, only the maximum Lyapunov exponent has been able to provide useful information. Moreover, non-maximum Lyapunov exponents are difficult to calculate. In general, a finite positive value of $\lambda_{\max}$ is an indicator of exponential divergence of nearby trajectories of the manifold. An early algorithm for calculating Lyapunov exponents, proposed by Wolf et al. [117], was sensitive to noise and to the time delay $\tau$. An improved method incorporating finite data was proposed by Rosenstein et al. [99]. A similar but robust measure that is insensitive to the TDE dimension was proposed by Kantz [47]. A simplified robust practical method is presented in the following paragraph.

Let $\mathcal{N}_\epsilon(y_t)$ denote an $\epsilon$-neighborhood of the state space point $y_t$. The average maximum Lyapunov exponent is given by the slope of the line fitted to the plot of $S(\Delta t)$ against $\Delta t$, where $S(\Delta t)$ is defined as:

$$S(\Delta t) = \frac{1}{N} \sum_{t=1}^{N} \ln \left( \frac{1}{|\mathcal{N}_\epsilon(y_t)|} \sum_{k \in \mathcal{N}_\epsilon(y_t)} \mathrm{dist}(y_t, y_k, \Delta t) \right) \tag{18}$$

where $N$ is the full length of the time series. Schreiber [104] proposed using a randomly selected set of points instead of the full length $N$. Furthermore, Kantz [47] defined $\mathrm{dist}(y_t, y_k, \Delta t)$ as:

$$\mathrm{dist}(y_t, y_k, \Delta t) = | x_{t + \Delta t} - x_{k + \Delta t} | \tag{19}$$
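Equations (18) and (19) can be sketched as below. This is a simplified illustration, not the algorithm of Ref. 47 in full: the names `stretching_curve` and `fitted_slope` are hypothetical, neighbours are found by brute force in the maximum norm, each point is excluded from its own neighbourhood, and reference points with an empty neighbourhood are skipped (so the outer average runs over the remaining points).

```python
import math


def stretching_curve(x, m, tau, eps, max_dt):
    """Return [S(0), S(1), ..., S(max_dt)] following equations (18)-(19)."""
    n = len(x) - (m - 1) * tau - max_dt  # reference points with a full future
    if n < 2:
        raise ValueError("time series too short for these parameters")
    vectors = [[x[k + j * tau] for j in range(m)] for k in range(n)]
    neighbours = []
    for t in range(n):
        neighbours.append([
            k for k in range(n)
            if k != t and max(abs(a - b)
                              for a, b in zip(vectors[t], vectors[k])) <= eps
        ])
    curve = []
    for dt in range(max_dt + 1):
        logs = []
        for t, nb in enumerate(neighbours):
            if not nb:
                continue  # assumption: skip points without neighbours
            mean_dist = sum(abs(x[t + dt] - x[k + dt]) for k in nb) / len(nb)
            if mean_dist > 0.0:
                logs.append(math.log(mean_dist))
        if not logs:
            raise ValueError("no usable neighbourhoods; try a larger eps")
        curve.append(sum(logs) / len(logs))
    return curve


def fitted_slope(curve, dt=1.0):
    """Least-squares slope of S against elapsed time (samples spaced by dt)."""
    times = [i * dt for i in range(len(curve))]
    mean_t = sum(times) / len(times)
    mean_s = sum(curve) / len(curve)
    num = sum((a - mean_t) * (b - mean_s) for a, b in zip(times, curve))
    den = sum((a - mean_t) ** 2 for a in times)
    return num / den
```

The slope should be fitted only over the range of $\Delta t$ in which $S(\Delta t)$ grows roughly linearly; for large $\Delta t$ the distances saturate at the size of the attractor and the curve flattens.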

In addition, fast neighborhood searching algorithms [74] can be used to obtain $\mathcal{N}_\epsilon(y_t)$. Typically, for small values of $\Delta t$, a linear relation between $\Delta t$ and $S(\Delta t)$ cannot be observed; however, it can be seen for intermediate values of $\Delta t$.

Kolmogorov Entropy

Kolmogorov entropy [30] estimates the dynamics of a time series. The idea behind this measure is that an ordered system has zero entropy, whereas a completely random system has infinite entropy. Given an $m$-dimensional state space, consider a partition of the state space into boxes of size $\epsilon^m$. Let the state of the system be observed at intervals of $\Delta t$, i.e. $y_{\Delta t}, y_{2\Delta t}, \ldots, y_{\delta \Delta t}$. Let $p(b_1, \ldots, b_\delta)$ be the joint probability that the state at time $\Delta t$ is in box $b_1$, the state at time $2\Delta t$ is in box $b_2$, $\ldots$, and the state at time $\delta \Delta t$ is in box $b_\delta$. The Kolmogorov entropy ($K$) is then defined as:

$$K = - \lim_{\Delta t \to 0} \lim_{\epsilon \to 0} \lim_{\delta \to \infty} \frac{1}{\delta \Delta t} \sum_{b_1, \ldots, b_\delta} p(b_1, \ldots, b_\delta) \ln(p(b_1, \ldots, b_\delta)) \tag{20}$$

In order to calculate $p(b_1, \ldots, b_\delta)$, one simple method is to count the number of points in box $b_i$, say $N_i$. Then $p(b_i)$ is calculated as:

$$p(b_i) = p_i = \frac{N_i}{N} \tag{21}$$

and $K$ is calculated as:

$$K \approx - \sum_{i} \frac{N_i}{N} \log_2 \frac{N_i}{N} \tag{22}$$
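The box-counting estimate of equations (21) and (22) can be sketched as below. The function name `box_entropy` is hypothetical, and the partition is assumed to be a regular grid of side $\epsilon$ aligned with the origin.

```python
import math
from collections import Counter


def box_entropy(points, eps):
    """Estimate of equation (22): -sum_i (N_i / N) * log2(N_i / N).

    Each point is assigned to the grid box of side eps that contains it;
    N_i is the number of points falling in box i.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    boxes = Counter(tuple(math.floor(c / eps) for c in p) for p in points)
    n = len(points)
    return -sum((count / n) * math.log2(count / n) for count in boxes.values())
```

For instance, four points that fall into four different boxes give 2 bits, and points that all share one box give 0.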

Although equations (21) and (22) can be used to calculate $K$ from experimental data, this approach is not efficient. Therefore, an improved method to calculate $K$ is proposed in Ref. 14. The idea is to use an individual correlation function defined as:

$$C_\epsilon(y) = \frac{|\mathcal{N}_{\sqrt{\epsilon}}(y)|}{N} \tag{23}$$

where $|\mathcal{N}_{\sqrt{\epsilon}}(y)|$ is the number of points on the manifold in the $\sqrt{\epsilon}$-neighborhood of $y$ (in other words, it is the number of points on the manifold whose squared distance from $y$ is less than $\epsilon$). Let $\tilde{y}_i$ be the center of box $b_i$; then the probability $p_i$ can be defined as:

$$p_i = \frac{N_i}{N} = C_{\epsilon^2}(\tilde{y}_i) \tag{24}$$

In addition, the following assumption is made in Ref. 14:

$$\frac{1}{N_i} \sum_{k=1}^{N_i} \log_2 C_{\epsilon^2}(y_{i_k}) = \log_2 C_{\epsilon^2}(\tilde{y}_i) \tag{25}$$

where $y_{i_k}$ is defined as the $k$th point that belongs to box $b_i$, $1 \le k \le N_i$, and the total number of points in the box is denoted by $N_i$. Using equations (24) and (25) in equation (22), the new definition of $K$ can be given as:

$$K \approx - \frac{1}{N} \sum_{i=1}^{N} \log_2 C_{\epsilon^2}(y_i) \tag{26}$$

3.4. Surrogate Tests

In order to validate the results of dynamical analysis, surrogate data based validation methods are used. Surrogate data help to classify the interesting properties observed in the dynamical analysis either as a mere false-positive error or as the true behavior of the system. Before any conclusion is drawn from the measures of DDS, the results have to be validated by surrogate tests. This is the standard way to establish the validity of a dynamical analysis. The methods that are available for generating surrogate data can be found in Refs. 113, 105. Typically, a surrogate data set is generated based on a null hypothesis, which usually states that certain features of the time series are preserved but there is no further structure in the time series. Among the different methods of generating surrogate data, the three most important methods of conducting surrogate tests, ordered in a hierarchy of generality, are explained in Ref. 113. In the following part of this subsection, this hierarchy of surrogate tests will be presented.

Surrogate Data Test 1

This test is used to evaluate the existence of any sort of dynamic behavior in the time series. The null hypothesis is stated as:

H0: Observed data is indistinguishable from IID random noise.

A simple and practical way to generate surrogate data for this test is to randomly shuffle the actual time series data. The random shuffling preserves the mean and variance, but destroys the temporal correlation of the time series.

Surrogate Data Test 2

A generalization of test 1 is to question whether the time series is merely linearly autocorrelated noise. The null hypothesis is stated as:

H0: Observed data is indistinguishable from linearly autocorrelated noise.

The basic idea of generating surrogate data for this null hypothesis is to generate a random time series that has the same power spectrum as the actual time series. A simple and practical way to generate such a time series is to obtain the windowed Fourier transform of the actual time series. The first step is to randomize the phases, i.e., all the complex amplitudes of the transform are multiplied by $e^{i\theta}$, where $\theta$ is randomly taken from the interval $[0, 2\pi]$. After randomizing the phases, the phases are symmetrized, i.e. we set $\theta(f) = -\theta(-f)$, where $f$ is the frequency in the transform. The symmetrization is done in order to obtain a real-valued inverse Fourier transform. Thus, the inverse Fourier transform with randomized and symmetrized phases results in surrogate data suitable for test 2.

Surrogate Data Test 3

This surrogate test is a generalization of test 2. Test 3 addresses the question of similarity between the actual time series and a monotonic nonlinear transformation of linear random noise. The null hypothesis is stated as:

H0: Observed data is indistinguishable from a monotonic nonlinear transformation of linearly autocorrelated noise.

Let $a = \{a_i\}_{i=1}^{N}$ and $b = \{b_i\}_{i=1}^{N}$ be any two unrelated time series of equal length. Let each data point of time series $a$ be ranked based on its value, i.e. if $a_i$ is the $r$th smallest value, then its rank is $r$. Now, if the data points in time series $b$ are reordered into $\hat{b}$ such that each $\hat{b}_i$ has the same rank $r$ as $a_i$, then let us call time series $\hat{b}$ the rank reordered $b$ with respect to $a$.

In order to generate surrogate data for test 3, first generate a random Gaussian time series having length equal to the actual data $x = \{x_i\}_{i=1}^{N}$; say $z = \{z_i\}_{i=1}^{N}$ is the random time series. Now generate $\hat{z}$, which is the rank reordered $z$ with respect to $x$. Let $z_\theta$ be the surrogate test 2 data for time series $\hat{z}$. Next generate $\hat{x}$, which is the rank reordered $x$ with respect to $z_\theta$. The time series denoted by $\hat{x}$ is the surrogate data for test 3.

These three tests are the most general tests that can be applied to any dynamical analysis data. Moreover, apart from using the modified t-statistic to test the hypothesis, there are other approaches available in the literature [96, 34] that can be used for surrogate tests.


3.5. Low Dimensional Phase Portraits

If the TDE dimension is higher than 3, then it is very difficult to visualize the manifold. To overcome this drawback of high dimensional manifolds, dimensionality reduction methods can be used. Principal Component Analysis (PCA), Kernel PCA and Independent Component Analysis (ICA) are widely known dimensionality reduction methods. Apart from the advantage of being able to visualize the data in low dimensions, the first two principal components are comparatively noise free (since singular value decomposition is a noise reduction procedure). Thus, low dimensional phase portraits provide a clear picture of the dynamical behavior, which can be used together with the information provided by the various measures to better understand the dynamical system.

4. Seizure Manifold

Analysis of EEG time series using dynamical methods has stirred researchers in the field of seizure identification and prediction. Dynamical analysis of EEG data began with the seminal work of Iasemidis et al. [42] in 1988. This work has kindled many other investigations into the dynamical aspects of EEG data [16, 15]. Significant results from invasive EEG recordings have been shown in the literature, Refs. 93, 42, 71. Scalp EEG suffers from noise generated by eye blinking, head movement, and muscle activity. Typically, the entire dynamical analysis of EEG can be divided into two basic ideas:
• The state space can be used to completely characterize a seizure
• The onset of a seizure has identifiable traces hidden in the preictal phase

Experimental existence of the preictal phase has been reported in a number of articles, like Refs. 41, 39, 35. Moreover, activities that drive or suppress seizure activity have been reported as well [107]. However, the latter point relates to seizures in general, whereas the former is more specific to the topic of this paper. Thus, the former point will be discussed in the following part of this section.

The phenomenon of seizure is hypothesized to be a nonlinear dynamic chaotic process [40, 9]. The proponents supported the hypothesis by presenting experimental evidence and reasonable arguments [87]. Based on the theory of dynamical analysis, two different theories on the evolution of seizures have been proposed by Lopes da Silva et al. [15, 16]. Theories that illustrate the development and propagation of focal seizures through the notion of spatio-temporal dynamics have been proposed in Refs. 38, 20, 61, 13. In addition, dynamical methods and measures for the prediction of epileptic seizures, focused on tracking the onset of the seizure, have been presented in Refs. 17, 93, 60, 64, 91, 92, 46.

5. Criticism

Despite their success over linear methods, the use of dynamical methods in the analysis of EEG has been criticized [114, 68]. The very notion of the existence of chaos in the seizure phenomenon has been posed as a fundamental question [43, 75]. The non-existence of chaos in seizures has been shown by experimental results [63]. It has been argued repeatedly that the successful results were based on selective data with a low noise to signal ratio. Even the seizure type and patient state were restricted in the analysis and the results. The reproducibility of the successful results shown in the literature has been questioned [6]. The inability of correlation dimensions [32] and Lyapunov exponents [58, 57] to predict seizures has been addressed experimentally. Furthermore, it has been criticized that the results reported in the literature are for the specific case of focal epilepsy [71], while no successful results are shown for general epilepsy. In addition, the results were based on cases wherein the seizure's focal location and onset time were known a priori. The measures that worked for one patient did not work correspondingly for other patients. For instance, dynamical measures like $STL_{\max}$ [37, 39] appear to be patient specific and, for a given patient, seizure specific. The specific nature of this measure has been criticized as ambiguous in seizure prediction. Lastly, it has been criticized that information from the dynamical measures cannot be mapped to a corresponding physiological phenomenon. Thus, the results from dynamical analysis provide only a vague understanding of the epilepsy phenomenon.

6. Conclusion

Seizure prediction from EEG is based on the notion of a preictal state, which is marked by the presence of measurable identifiers. From the evidence shown in Refs. 60, 65, 66, 35, 63, it is unanimously accepted that, in theory and in practice, certain types of seizures can be detected early. The idea that directed the attention of epilepsy researchers towards a possible connection between epilepsy and nonlinear dynamics can be attributed to the seminal work of Schwartzkroin [106]. The novelty that the theory of dynamics added to seizure prediction is the existence of a long preictal state [41]. Thus, with dynamical analysis, the onset of a seizure can be predicted hours before its occurrence, unlike the case of traditional linear analysis [35]. However, the results from dynamical analysis were not unanimously accepted by the research community due to various ambiguities, ranging from the existence of determinism to the interpretation of the results. Most of the ambiguities can be attributed to the non-standardization of definitions for seizure onset and ictal events. It was not until the early 2000s that an international society called the International Seizure Prediction Group (ISPG) was formed to address the standardization issues. The outcomes of the first international conference held by ISPG were contradictory and inconclusive. Standardization of definitions, test data and assessment criteria with unanimous acceptance is still an ongoing process in the field of seizure prediction. In Refs. 82, 85, 59, 45, 81, it has been reported that even for focal epilepsy, the recordings in non-focal areas may provide much more information than those in the focal area. This interesting finding points towards possibilities of improving TDE in the dynamical analysis of EEG. Furthermore, at present, manifold construction based on multiple EEG channels is a simple appending of single channel methods. There is a need for control, synchronization and interactivity based methods for manifold generation, which may result in fruitful findings [83, 80, 78, 28]. In fact, there have been recent proposals to use multivariate measures based on the concept of synchronization in DDS [94]. Instead of constructing a single manifold, a group of manifolds, one for each functional area of the brain, may be constructed; inter-manifold measures between different functional areas may then unfold the mysteries of the seizure phenomenon.

Although no direct strong evidence may be available, it can be argued that the brain is a complex system that comprises many interconnected subsystems at multiple levels [101, 56, 115, 4]. Functionally connected subsystems may have a protocol based communication with one another. Research directions that address measures of intercommunication, in addition to synchronization, will be helpful in seizure detection. To sum up, at present there exists no practical algorithm that can detect seizures from scalp EEG, and there is no unanimously accepted mechanism that describes epilepsy [77, 115, 79, 5]. However, given the information about the seizure, algorithms have been developed to detect precursors from EEG recordings.

Based on the experimental results of the dynamical analysis of EEG, it can be stated that the complexity of the brain (represented by the fractal/correlation dimension) decreases prior to the onset of a seizure.62,21 Unambiguous detection and flawless prediction of seizures from scalp EEG using dynamical measures are not yet achievable. Dynamical analysis in conjunction with other time series analysis methods may provide fruitful results in the future.35,63

References

1. A.M. Albano, J. Muench, C. Schwartz, A.I. Mees, and P.E. Rapp. Singular-value decomposition and the Grassberger-Procaccia algorithm. Physical Review A, 38(6):3017, 1988.
2. A.M. Albano, A. Passamante, and M.E. Farrell. Using higher-order correlations to define an embedding window. Physica D: Nonlinear Phenomena, 54(1):85–97, 1991.
3. K.T. Alligood, T. Sauer, and J.A. Yorke. Chaos: an introduction to dynamical systems. Springer Verlag, 1997.
4. M. Amiri, F. Bahrami, and M. Janahmadi. Functional modeling of astrocytes in epilepsy: a feedback system perspective. Neural Computing & Applications, 20(8):1131–1139, 2011.
5. M. Amiri, F. Bahrami, and M. Janahmadi. Modified thalamocortical model: A step towards more understanding of the functional contribution of astrocytes to epilepsy. Journal of Computational Neuroscience, pages 1–15, 2012.
6. R. Aschenbrenner-Scheibe, T. Maiwald, M. Winterhalder, H.U. Voss, J. Timmer, and A. Schulze-Bonhage. How well can epileptic seizures be predicted? An evaluation of a nonlinear method. Brain, 126(12):2616–2626, 2003.
7. R. Badii, G. Broggi, B. Derighetti, M. Ravani, S. Ciliberto, A. Politi, and M.A. Rubio. Dimension increase in filtered chaotic signals. Physical Review Letters, 60(11):979–982, 1988.
8. E. Başar and T.H. Bullock. Chaos in brain function. Springer, 1990.
9. S. Blanco, H. Garcia, R.Q. Quiroga, L. Romanelli, and O.A. Rosso. Stationarity of the EEG series. Engineering in Medicine and Biology Magazine, IEEE, 14(4):395–399, 1995.
10. D.S. Broomhead and G.P. King. Extracting qualitative dynamics from experimental data. Physica D: Nonlinear Phenomena, 20(2-3):217–236, 1986.
11. L. Cao. Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1):43–50, 1997.
12. M. Casdagli, S. Eubank, J.D. Farmer, and J. Gibson. State space reconstruction in the presence of noise. Physica D: Nonlinear Phenomena, 51(1):52–98, 1991.
13. M. Chávez, M. Le Van Quyen, V. Navarro, M. Baulac, and J. Martinerie. Spatiotemporal dynamics prior to neocortical seizures: amplitude versus phase couplings. Biomedical Engineering, IEEE Transactions on, 50(5):571–583, 2003.
14. A. Cohen and I. Procaccia. Computing the Kolmogorov entropy from time signals of dissipative and conservative dynamical systems. Physical Review A, 31(3):1872, 1985.
15. F.H.L. da Silva, W. Blanes, S.N. Kalitzin, J. Parra, P. Suffczynski, and D.N. Velis. Dynamical diseases of brain systems: different routes to epileptic seizures. Biomedical Engineering, IEEE Transactions on, 50(5):540–548, 2003.
16. F.H.L. Da Silva, W. Blanes, S.N. Kalitzin, J. Parra, P. Suffczynski, and D.N. Velis. Epilepsies as dynamical diseases of brain systems: basic models of the transition between normal and epileptic activity. Epilepsia, 44:72–83, 2003.
17. F.H.L. Da Silva, J.P. Pijn, and W.J. Wadman. Dynamics of local neuronal networks: Control parameters and state bifurcations in epileptogenesis. Prog. Brain Res, 102:359–370, 1994.
18. M. Davies. Noise reduction schemes for chaotic time series. Physica D: Nonlinear Phenomena, 79(2-4):174–192, 1994.
19. J.P. Eckmann, S.O. Kamphorst, and D. Ruelle. Recurrence plots of dynamical systems. EPL (Europhysics Letters), 4:973, 1987.
20. C.E. Elger and K. Lehnertz. Ictogenesis and chaos. Epileptic seizures and syndromes, pages 547–552, 1994.
21. C.E. Elger, G. Widman, R. Andrzejak, J. Arnhold, P. David, and K. Lehnertz. Nonlinear EEG analysis and its potential role in epileptology. Epilepsia, 41:S34–S38, 2000.
22. S. Ellner. Estimating attractor dimensions from limited data: a new method, with error estimates. Physics Letters A, 133(3):128–133, 1988.
23. A.M. Fraser and H.L. Swinney. Independent coordinates for strange attractors from mutual information. Physical Review A, 33(2):1134, 1986.
24. D.R. Fredkin and J.A. Rice. Method of false nearest neighbors: A cautionary note. Physical Review E, 51:2950–2954, 1995.
25. A. Galka, T. Maaß, and G. Pfister. Estimating the dimension of high-dimensional attractors: A comparison between two algorithms. Physica D: Nonlinear Phenomena, 121(3-4):237–251, 1998.
26. J.B. Gao. Detecting nonstationarity and state transitions in a time series. Physical Review E, 63(6):066202, 2001.
27. N.A. Gershenfeld. Dimension measurement on high-dimensional systems. Physica D: Nonlinear Phenomena, 55(1-2):135–154, 1992.
28. J. Gómez García, C. Ospina Aguirre, E. Delgado Trejos, and G. Castellanos Dominguez. Methodology for epileptic episode detection using complexity-based features. New Challenges on Bioinspired Applications, pages 454–462, 2011.
29. P. Grassberger and I. Procaccia. Characterization of strange attractors. Physical Review Letters, 50(5):346–349, 1983.
30. P. Grassberger and I. Procaccia. Estimation of the Kolmogorov entropy from a chaotic signal. Physical Review A, 28(4):2591–2593, 1983.
31. P. Grassberger and I. Procaccia. Measuring the strangeness of strange attractors. Physica D: Nonlinear Phenomena, 9(1):189–208, 1983.
32. M.A.F. Harrison, I. Osorio, M.G. Frei, S. Asuri, and Y.C. Lai. Correlation dimension and integral do not predict epileptic seizures. Chaos: An Interdisciplinary Journal of Nonlinear Science, 15(3):033106–033106, 2005.
33. R. Hegger and H. Kantz. Improved false nearest neighbor method to detect determinism in time series data. Physical Review E, 60(4):4970, 1999.
34. A.C.A. Hope. A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society. Series B (Methodological), pages 582–598, 1968.
35. L.D. Iasemidis. Epileptic seizure prediction and control. Biomedical Engineering, IEEE Transactions on, 50(5):549–558, 2003.
36. L.D. Iasemidis, L.D. Olson, R.S. Savit, and J.C. Sackellares. Time dependencies in the occurrences of epileptic seizures. Epilepsy Research, 17(1):81–94, 1994.
37. L.D. Iasemidis, P. Pardalos, J.C. Sackellares, and D.S. Shiau. Quadratic binary programming and dynamical system approach to determine the predictability of epileptic seizures. Journal of Combinatorial Optimization, 5(1):9–26, 2001.

38. L.D. Iasemidis, J.C. Principe, and J.C. Sackellares. Measurement and quantification of spatiotemporal dynamics of human epileptic seizures. Nonlinear biomedical signal processing, 2:294–318, 2000.
39. L.D. Iasemidis and J.C. Sackellares. The evolution with time of the spatial distribution of the largest Lyapunov exponent on the human epileptic cortex. Measuring chaos in the human brain, pages 49–82, 1991.
40. L.D. Iasemidis and J.C. Sackellares. Review: Chaos theory and epilepsy. The Neuroscientist, 2(2):118–126, 1996.
41. L.D. Iasemidis, J.C. Sackellares, H.P. Zaveri, and W.J. Williams. Phase space topography and the Lyapunov exponent of electrocorticograms in partial seizures. Brain Topography, 2(3):187–201, 1990.
42. L.D. Iasemidis, H.P. Zaveri, J.C. Sackellares, and W.J. Williams. Modelling of ECoG in temporal lobe epilepsy. In 25th Ann. Rocky Mountain Bioeng. Symposium, volume 24, pages 187–193, 1988.
43. J. Jeong, J.C. Gore, and B.S. Peterson. A method for determinism in short time series, and its application to stationary EEG. Biomedical Engineering, IEEE Transactions on, 49(11):1374–1379, 2002.
44. K. Judd. Estimating dimension from small samples. Physica D: Nonlinear Phenomena, 71(4):421–429, 1994.
45. S. Kalitzin, D. Velis, P. Suffczynski, J. Parra, and F.L. da Silva. Electrical brain-stimulation paradigm for estimating the seizure onset site and the time to ictal transition in temporal lobe epilepsy. Clinical Neurophysiology, 116(3):718–728, 2005.
46. N. Kannathal, M.L. Choo, U.R. Acharya, and P.K. Sadasivan. Entropies for detection of epilepsy in EEG. Computer Methods and Programs in Biomedicine, 80(3):187–194, 2005.
47. H. Kantz. A robust method to estimate the maximal Lyapunov exponent of a time series. Physics Letters A, 185(1):77–87, 1994.
48. H. Kantz and T. Schreiber. Nonlinear time series analysis, volume 7. Cambridge Univ Pr, 2004.
49. T. Kapitaniak and S.R. Bishop. The illustrated dictionary of nonlinear dynamics and chaos. Recherche, 67:02, 1999.
50. D.T. Kaplan and L. Glass. Direct test for determinism in a time series. Physical Review Letters, 68(4):427–430, 1992.
51. M.B. Kennel, R. Brown, and H.D.I. Abarbanel. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Physical Review A, 45(6):3403, 1992.
52. M.B. Kennel and M. Buhl. Estimating good discrete partitions from observed data: Symbolic false nearest neighbors. Physical Review Letters, 91(8):84102, 2003.
53. M. Koebbe and G. Mayer-Kress. Use of recurrence plots in the analysis of time-series data. Nonlinear modeling and forecasting, 21:361–378, 1992.
54. E.J. Kostelich and T. Schreiber. Noise reduction in chaotic time-series data: A survey of common methods. Physical Review E, 48(3):1752, 1993.
55. E.J. Kostelich and J.A. Yorke. Noise reduction in dynamical systems. Physical Review A, 38(3):1649, 1988.
56. M.A. Kramer, U.T. Eden, E.D. Kolaczyk, R. Zepeda, E.N. Eskandar, and S.S. Cash. Coalescence and fragmentation of cortical networks during focal seizures. The Journal of Neuroscience, 30(30):10076–10085, 2010.
57. Y.C. Lai, M.A. Harrison, M.G. Frei, and I. Osorio. Controlled test for predictive power of Lyapunov exponents: their inability to predict epileptic seizures. Chaos, 14(3):630–642, 2004.
58. Y.C. Lai, M.A.F. Harrison, M.G. Frei, and I. Osorio. Inability of Lyapunov exponents to predict epileptic seizures. Physical Review Letters, 91(6):68102, 2003.
59. M. Le Van Quyen, J. Soss, V. Navarro, R. Robertson, M. Chavez, M. Baulac, and J. Martinerie. Preictal state identification by synchronization changes in long-term intracranial EEG recordings. Clinical Neurophysiology, 116(3):559–568, 2005.
60. K. Lehnertz, R.G. Andrzejak, J. Arnhold, T. Kreuz, F. Mormann, C. Rieke, G. Widman, and C.E. Elger. Nonlinear EEG analysis in epilepsy: Its possible use for interictal focus localization, seizure anticipation, and prevention. Journal of Clinical Neurophysiology, 18(3):209, 2001.
61. K. Lehnertz and C.E. Elger. Neuronal complexity loss of the contralateral hippocampus in temporal lobe epilepsy: a possible indicator of secondary epileptogenesis. Epilepsia, 36(suppl 4):21, 1995.
62. K. Lehnertz and C.E. Elger. Can epileptic seizures be predicted? Evidence from nonlinear time series analysis of brain electrical activity. Physical Review Letters, 80(22):5019–5022, 1998.
63. K. Lehnertz, F. Mormann, H. Osterhage, A. Müller, J. Prusseit, A. Chernihovskyi, M. Staniek, D. Krug, S. Bialonski, and C.E. Elger. State-of-the-art of seizure prediction. Journal of Clinical Neurophysiology, 24(2):147, 2007.
64. D.E. Lerner. Monitoring changing dynamics with correlation integrals: case study of an epileptic seizure. Physica D: Nonlinear Phenomena, 97(4):563–576, 1996.
65. B. Litt and J. Echauz. Prediction of epileptic seizures. The Lancet Neurology, 1(1):22–30, 2002.
66. B. Litt and K. Lehnertz. Seizure prediction and the preseizure period. Current Opinion in Neurology, 15(2):173, 2002.
67. M.C. Mackey and L. Glass. Oscillation and chaos in physiological control systems. Science, 197(4300):287–289, 1977.
68. T. Maiwald, M. Winterhalder, R. Aschenbrenner-Scheibe, H.U. Voss, A. Schulze-Bonhage, and J. Timmer. Comparison of three nonlinear seizure prediction methods by means of the seizure prediction characteristic. Physica D: Nonlinear Phenomena, 194(3):357–368, 2004.
69. R. Mañé. On the dimension of the compact invariant sets of certain non-linear maps. Dynamical systems and turbulence, Warwick 1980, pages 230–242, 1981.
70. R. Manuca and R. Savit. Stationarity and nonstationarity in time series analysis. Physica D: Nonlinear Phenomena, 99(2-3):134–161, 1996.
71. J. Martinerie, C. Adam, M. Le Van Quyen, M. Baulac, S. Clemenceau, B. Renault, F.J. Varela, et al. Epileptic seizures can be anticipated by non-linear analysis. Nature Medicine, 4(10):1173–1176, 1998.
72. J.M. Martinerie, A.M. Albano, A.I. Mees, and P.E. Rapp. Mutual information, strange attractors, and the optimal estimation of dimension. Physical Review A, 45(10):7058, 1992.
73. N. Marwan, M. Carmen Romano, M. Thiel, and J. Kurths. Recurrence plots for the analysis of complex systems. Physics Reports, 438(5):237–329, 2007.
74. J. McNames. A fast nearest-neighbor algorithm based on a principal axis search tree. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 23(9):964–976, 2001.
75. P.E. McSharry, L.A. Smith, and L. Tarassenko. Prediction of epileptic seizures: are nonlinear methods relevant? Nature Medicine, 9(3):241–242, 2003.
76. A.I. Mees, P.E. Rapp, and L.S. Jennings. Singular-value decomposition and embedding dimension. Physical Review A, 36(1):340, 1987.
77. J.G. Milton. Epilepsy as a dynamic disease: A tutorial of the past with an eye to the future. Epilepsy & Behavior, 18(1):33–44, 2010.

78. J.G. Milton. The delayed and noisy nervous system: implications for neural control. Journal of Neural Engineering, 8:065005, 2011.
79. J.G. Milton. Neuronal avalanches, epileptic quakes and other transient forms of neurodynamics. European Journal of Neuroscience, 36(2):2156–2163, 2012.
80. P. Mirowski, D. Madhavan, Y. LeCun, and R. Kuzniecky. Classification of patterns of EEG synchronization for seizure prediction. Clinical Neurophysiology, 120(11):1927–1940, 2009.
81. F. Mormann, R.G. Andrzejak, C.E. Elger, and K. Lehnertz. Seizure prediction: the long and winding road. Brain, 130(2):314–333, 2007.
82. F. Mormann, T. Kreuz, C. Rieke, R.G. Andrzejak, A. Kraskov, P. David, C.E. Elger, and K. Lehnertz. On the predictability of epileptic seizures. Clinical Neurophysiology, 116(3):569–587, 2005.
83. F. Mormann, K. Lehnertz, P. David, and C.E. Elger. Mean phase coherence as a measure for phase synchronization and its application to the EEG of epilepsy patients. Physica D: Nonlinear Phenomena, 144(3-4):358–369, 2000.
84. A. Naït-Ali. Advanced biosignal processing. Springer Verlag, 2009.
85. V. Navarro, J. Martinerie, M. Le Van Quyen, M. Baulac, F. Dubeau, and J. Gotman. Seizure anticipation: Do mathematical measures correlate with video-EEG evaluation? Epilepsia, 46(3):385–396, 2005.
86. L. Noakes. The Takens embedding theorem. International Journal of Bifurcation and Chaos, 1(4):867–872, 1991.
87. L.D. Olson, L.D. Iasemidis, and J.C. Sackellares. Evidence that interseizure intervals exhibit low dimensional dynamics. Epilepsia, 30:644, 1989.
88. I. Osorio. Will the new antiseizure devices fill the gap between drugs and surgery? treating the brain as a. Epilepsy & Behavior, 19(1):17–19, 2010.
89. N.H. Packard, J.P. Crutchfield, J.D. Farmer, and R.S. Shaw. Geometry from a time series. Physical Review Letters, 45(9):712–716, 1980.
90. M. Paluš and I. Dvořák. Singular-value decomposition in attractor reconstruction: pitfalls and precautions. Physica D, 55:221–234, 1992.
91. L. Pezard, J. Martinerie, F. Breton, J.C. Bourzeix, and B. Renault. Non-linear forecasting measurements of multichannel EEG dynamics. Electroencephalography and Clinical Neurophysiology, 91(5):383–391, 1994.
92. L. Pezard, J. Martinerie, J. Müller-Gerking, F.J. Varela, and B. Renault. Entropy quantification of human brain spatio-temporal dynamics. Physica D: Nonlinear Phenomena, 96(1-4):344–354, 1996.
93. J.P.M. Pijn, D.N. Velis, M.J. Heyden, J. DeGoede, C.W.M. Veelen, and F.H. Lopes da Silva. Nonlinear dynamics of epileptic seizures on basis of intracranial EEG recordings. Brain Topography, 9(4):249–270, 1997.
94. A. Pikovsky, M. Rosenblum, and J. Kurths. Synchronization: A universal concept in nonlinear sciences, volume 12. Cambridge Univ Pr, 2003.
95. M.B. Priestley and T.S. Rao. A test for non-stationarity of time-series. Journal of the Royal Statistical Society. Series B (Methodological), pages 140–149, 1969.
96. P.E. Rapp, A.M. Albano, I.D. Zimmerman, and M.A. Jimenez-Montano. Phase-randomized surrogates can produce spurious identifications of non-random structure. Physics Letters A, 192(1):27–33, 1994.
97. C. Rhodes and M. Morari. False-nearest-neighbors algorithm and noise-corrupted time series. Physical Review E, 55(5):6162, 1997.
98. C. Rieke, K. Sternickel, R.G. Andrzejak, C.E. Elger, P. David, and K. Lehnertz. Measuring nonstationarity by analyzing the loss of recurrence in dynamical systems. Physical Review Letters, 88(24):244102–244102, 2002.

99. M.T. Rosenstein, J.J. Collins, and C.J. De Luca. A practical method for calculating largest Lyapunov exponents from small data sets. Physica D: Nonlinear Phenomena, 65(1-2):117–134, 1993.
100. T. Sauer, J.A. Yorke, and M. Casdagli. Embedology. Journal of Statistical Physics, 65(3):579–616, 1991.
101. J.M. Schoffelen and J. Gross. Source connectivity analysis with MEG and EEG. Human Brain Mapping, 30(6):1857–1865, 2009.
102. J.C. Schouten, F. Takens, and C.M. van den Bleek. Estimation of the dimension of a noisy attractor. Physical Review E, 50(3):1851, 1994.
103. T. Schreiber. Detecting and analyzing nonstationarity in a time series using nonlinear cross predictions. Physical Review Letters, 78(5):843–846, 1997.
104. T. Schreiber. Interdisciplinary application of nonlinear time series methods. Physics Reports, 308(1):1–64, 1999.
105. T. Schreiber and A. Schmitz. Surrogate time series. Physica D: Nonlinear Phenomena, 142(3-4):346–382, 2000.
106. P.A. Schwartzkroin. Origins of the epileptic state. Epilepsia, 38(8):853–858, 1997.
107. F. Shayegh, R. Amirfattahi, S. Sadri, and K. Ansari-Asl. Evaluation of some physiological statements about seizure, using processing of epileptic EEG signals. In Electrical Engineering (ICEE), 2011 19th Iranian Conference on, pages 1–6. IEEE, 2011.
108. K. Shin, J.K. Hammond, and P.R. White. Iterative SVD method for noise reduction of low-dimensional chaotic time series. Mechanical Systems and Signal Processing, 13(1):115–124, 1999.
109. R.L. Smith. Estimating dimension in noisy chaotic time series. Journal of the Royal Statistical Society. Series B (Methodological), pages 329–351, 1992.
110. F. Takens. Detecting strange attractors in turbulence. Dynamical systems and turbulence, Warwick 1980, pages 366–381, 1981.
111. F. Takens. On the numerical determination of the dimension of an attractor. Dynamical systems and bifurcations, pages 99–106, 1985.
112. J. Theiler. Estimating fractal dimension. JOSA A, 7(6):1055–1073, 1990.
113. J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. Doyne Farmer. Testing for nonlinearity in time series: the method of surrogate data. Physica D: Nonlinear Phenomena, 58(1):77–94, 1992.
114. J. Theiler and P.E. Rapp. Re-examination of the evidence for low-dimensional, nonlinear structure in the human electroencephalogram. Electroencephalography and Clinical Neurophysiology, 98(3):213–222, 1996.
115. F. Wendling, P. Chauvel, A. Biraben, and F. Bartolomei. From intracerebral EEG signals to brain connectivity: identification of epileptogenic networks in partial epilepsy. Frontiers in Systems Neuroscience, 4, 2010.
116. H. Whitney. Differentiable manifolds. The Annals of Mathematics, 37(3):645–680, 1936.
117. A. Wolf, J.B. Swift, H.L. Swinney, and J.A. Vastano. Determining Lyapunov exponents from a time series. Physica D: Nonlinear Phenomena, 16(3):285–317, 1985.
118. D. Yu, W. Lu, and R.G. Harrison. Space time-index plots for probing dynamical nonstationarity. Physics Letters A, 250(4-6):323–327, 1998.
119. G.U. Yule. On a method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 226:267–298, 1927.

May 6, 2013

15:21

BC: 8846 - BIOMAT 2012

07˙wallace

SYNCHRONOUS CALCIUM INDUCED CALCIUM RELEASE (CICR) IN A MULTIPLE SITE MODEL OF THE CARDIAC MYOCYTE

D. I. WALLACE AND J. E. TANENBAUM Department of Mathematics, Dartmouth College, Hanover, NH 03755, USA E-mail: [email protected]

The behavior of heart muscle depends upon changes in free Ca2+ concentration in the sarcoplasm and cytoplasm of a ventricular myocyte. We present a model of the free Ca2+ concentration across twenty components, one cytoplasmic and one sarcoplasmic for each of ten adjacent sarcoplasmic/cytoplasmic junctions. It incorporates diffusion both in the cytoplasm and in the sarcoplasmic reticulum (SR). The model shows qualitative agreement with experimental observations on the effects of altering the rate of calcium input to a cell, the efficiency of the SERCA pump, the threshold setting for CICR, and the rate of a cell's calcium loss. In addition, it displays spontaneous recurring calcium peaks in the presence of sufficient extracellular calcium, and these peaks are seen to synchronize across the junctions. The model confirms the importance of the SR by demonstrating that this robust synchronization across sites does not occur in the absence of SR diffusion.

1. Introduction

Ca2+ serves as the predominant ionic messenger within the cardiac myocyte. As a result, Ca2+ plays a significant role in controlling Excitation Contraction Coupling (ECC). This is a process by which an increase of cytoplasmic calcium near a particular type of receptor on the sarcoplasmic reticulum (SR) results in a release of calcium into the cytoplasm. The receptors responsible for both ECC and sequestering of calcium in the SR are clustered at junctional areas called t-tubules, distributed along the length of the myocyte. Although the system is well studied, the literature currently contains few, if any, models that qualitatively describe the synchronicity of Ca2+ release across junctional sites within the myocyte. Most mathematical models, such as those studied by Zahradnikova 1, Faber 2, Greenstein 3, Dupont 4, and others (Falke 5, Shannon 6, Tang 7, Hinch 8, Spiro 9, Keizer 10, Jafri 11, Tuan 12), describe the role of Ca2+ in ECC by relying on so-called calcium release units. These units include the cardiac t-tubule and a closely juxtaposed junctional sarcoplasmic reticulum (JSR). Ca2+ can enter the cytoplasm as the result of the opening of a Ca2+ channel in the t-tubule. The entry of Ca2+ into the cytoplasm may then lead to Ca2+ release from a nearby SR junction 13. This process, termed Calcium Induced Calcium Release (CICR), depends on the movement of Ca2+ throughout the cell and on the influx of Ca2+ into the cell 14. In addition, myocytes utilize Ca2+ leaks to maintain homeostasis. These leaks run both from the SR into the cytoplasm and from the cytoplasm into the extracellular medium (ECM).

The SR plays a pivotal role in Ca2+ homeostasis by sequestering Ca2+, and is implicated in both normal and abnormal heart cell function 15, 16, 17. The SR accomplishes this regulation by activating a Ca2+ ATPase pump, termed the SERCA pump, and by releasing Ca2+ via both passive and active means. These Ca2+ releases can only occur when the Ca2+ concentration in the SR reaches a threshold level 18. A physiologically observed time lag exists between the time when the region of the SR closest to the site of the spark undergoes CICR and the time when the region of the SR furthest from the spark undergoes CICR 19. Similarly, because Ca2+ moves through the SR as a wave 20, the SR Ca2+ concentration is not equal at every point of the SR in the time directly following a Ca2+ spark. Calcium waves in the SR have also been observed directly 21. Physiological studies show that, in time, the oscillations in Ca2+ concentration of both the SR and the cytoplasm synchronize 22. Whereas previous attempts at modeling Ca2+ dynamics within a myocyte have succeeded in accounting for calcium release units and other complexities of this system, no previous models demonstrate this experimentally observed synchronization.
Our model consists of a set of twenty ordinary differential equations and is based on a model first developed by B. V. Williams 23. It differs from others by coupling ten cytoplasm-SR junctions, which show an initial lag in the onset of CICR that depends on the distance from the Ca2+ input site, as well as an eventual synchronization of the Ca2+ concentration oscillations. Each junctional model is based on an early model of Goldbeter et al. 24. The coupled mathematical model was tested against four research findings that demonstrated different outcomes based upon experimental variations that correspond to changes in parameters or initial conditions of heart cell activity. First, we confirmed the writings of Klabunde 25 by concluding that the amount of input Ca2+ controlled the ability of the heart cell to undergo ECC. Second, we confirmed the findings of Jiang et al. 26 by concluding that reducing the SERCA pump activity in the SR led to heart failure. In this study, heart failure was measured as a cessation of synchronous Ca2+ oscillations in both the SR and the cytoplasm. Third, we confirmed the findings of Marks et al. 27, 28 by concluding that changing the threshold at which RyR channels embedded in the SR membrane release Ca2+ similarly led to heart failure. Finally, we confirmed the descriptions in Marin-García’s text 16, which link an increased Ca2+ leak rate (from the cytoplasm to the ECM) to heart failure.

2. Model Formulation

The model described here includes the following features:
(i) variable Ca2+ concentration in both the cytoplasm and the sarcoplasmic reticulum
(ii) uptake of Ca2+ from the cytoplasm to the SR via the SERCA pump
(iii) discharge of Ca2+ from the SR to the cytoplasm via CICR
(iv) passive diffusion within both the cytoplasm and the SR
(v) external input of Ca2+ to the cytoplasm

There are many molecular pumps and exchangers besides these, both from the SR to the cytoplasm and from the cytoplasm to the cell exterior. These are not modeled explicitly but are summarized in two linear leaks, one from the SR to the cytoplasm and one from the cytoplasm out of the cell. This model assumes that a portion of the calcium dynamics can be captured by a relatively simple local model coupled with itself to include multiple local t-tubule/SR junctional sites. It also treats all the SERCA pumps in one site as a single unit with one overriding dynamic, and does the same with the CICR pumps. Our model consists of twenty coupled ordinary differential equations describing changes in both cytoplasmic (C) and sarcoplasmic (S) Ca2+ concentrations in a cardiac myocyte. Fig. 1 illustrates the interactions of these components. The arrows denote the movement of Ca2+ throughout the myocyte. The Ca2+ input to the system is denoted by the R arrow. Ca2+ is added to the cell and then travels through the cytoplasm as a wave. As the Ca2+ reaches the different JSRs, the Ca2+ threshold for CICR is reached. We assume that the makeup of both the cytoplasm and the SR is uniform throughout.

Figure 1. The box model pictures three of the ten junctions in the model, which includes CICR from the SR junction denoted M, SERCA uptake P, other leaks N from the SR, input R from the extracellular medium and discharge Q from cytoplasm to the extracellular medium.

The dynamics of calcium passing between the sarcoplasmic reticulum and the cellular region near a t-tubule have been studied extensively and expressed in various ways. In this model we use two quantities, $C_d$ and $S_d$, to represent the concentrations of Ca2+ in the cytoplasm and within the sarcoplasmic reticulum, respectively, in the region near a t-tubule. The index $d$ runs over the ten junctions we consider. The changes in cytoplasmic and SR Ca2+ concentrations can therefore be represented by a system of twenty ordinary differential equations. The definitions of the parameters are given along with their physiologically measured default values in Table 1. All parameters are positive and taken from Goldbeter 24 and Williams 23. For each of these pairs, $(C_d, S_d)$, we take into account the SERCA pump that sequesters calcium in the SR, the RyR that is responsible for calcium induced calcium release from the SR into the cytoplasm, and passive diffusion within both the cytoplasm and the SR. All other dynamics are summarized by linear leaks. For each pair we have the following equations.

$$C_d' = i - \frac{V_u C_d^u}{K_u^u + C_d^u} + \frac{V_{ra} S_d^r C_d^a}{(K_r^r + S_d^r)(K_a^a + C_d^a)} + k_1 S_d - q(C_d - C_{d+1}) + q(C_{d-1} - C_d) - e C_d \tag{1}$$

$C_d'$ = (Input from cell exterior) - (SR uptake via SERCA pump) + (CICR release from SR) + (Leak from SR) + (Diffusion within cytoplasm) - (Leak to exterior)

$$S_d' = \frac{V_u C_d^u}{K_u^u + C_d^u} - \frac{V_{ra} S_d^r C_d^a}{(K_r^r + S_d^r)(K_a^a + C_d^a)} - k_1 S_d - r_1(S_d - S_{d+1}) + r_1(S_{d-1} - S_d) \tag{2}$$

$S_d'$ = (SR uptake via SERCA pump) - (CICR release from SR) - (Leak to cytoplasm) + (Diffusion within SR)

Table 1.

Default Parameter Values

| Parameter | Symbol | Default Value |
|---|---|---|
| Calcium influx from the ECM into the intracellular space | i | 3 µM/s |
| Max rate of calcium uptake from the cytoplasm into the SR | Vu | 65 µM/s |
| Max rate of calcium release from the SR into the cytoplasm | Vra | 500 µM/s |
| The threshold constant for calcium uptake into the SR | Ku | 1 µM |
| The threshold constant for calcium release from the SR into the cytoplasm | Kr | 2 µM |
| The threshold constant for activation of the SR calcium release | Ka | 0.9 µM |
| The passive leak from the SR into the cytoplasm | k1 | 1/s |
| Calcium efflux from the cytoplasm to the ECM | e | 9/s |
| Rate of calcium transport between the cytosolic calcium components | q | 0.3/s |
| Rate of calcium transport between the SR calcium components | r1 | 0.2/s |
| The Hill coefficient for calcium uptake | u | 2 |
| The Hill coefficient for calcium release | r | 4 |
| The Hill coefficient for activation of calcium release | a | 2 |
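As a reading aid, the right-hand sides of Eq. 1 and Eq. 2 with the default values of Table 1 might be coded as follows. This is a sketch under stated assumptions, not the authors' implementation: the text does not spell out the adjustments at the two ends of the sequence, so the missing neighbour term is simply dropped there (a no-flux assumption), and the input i is applied at every site as Eq. 1 is written. The names `PARAMS`, `diffusion` and `rhs` are hypothetical.

```python
# Default values from Table 1 (concentrations in µM, rates per second)
PARAMS = dict(i=3.0, Vu=65.0, Vra=500.0, Ku=1.0, Kr=2.0, Ka=0.9,
              k1=1.0, e=9.0, q=0.3, r1=0.2, u=2, r=4, a=2)


def diffusion(x, d, rate):
    """Net diffusive flux into site d.

    Assumption: at the two ends the term for the missing neighbour is dropped.
    """
    flux = 0.0
    if d + 1 < len(x):
        flux -= rate * (x[d] - x[d + 1])
    if d > 0:
        flux += rate * (x[d - 1] - x[d])
    return flux


def rhs(C, S, p=PARAMS):
    """Return (dC/dt, dS/dt) for all junctions, following Eq. 1 and Eq. 2."""
    dC, dS = [], []
    for d in range(len(C)):
        uptake = p["Vu"] * C[d] ** p["u"] / (p["Ku"] ** p["u"] + C[d] ** p["u"])
        release = (p["Vra"] * S[d] ** p["r"] * C[d] ** p["a"]
                   / ((p["Kr"] ** p["r"] + S[d] ** p["r"])
                      * (p["Ka"] ** p["a"] + C[d] ** p["a"])))
        leak = p["k1"] * S[d]
        dC.append(p["i"] - uptake + release + leak
                  + diffusion(C, d, p["q"]) - p["e"] * C[d])
        dS.append(uptake - release - leak + diffusion(S, d, p["r1"]))
    return dC, dS


# Ten junctions with arbitrary illustrative initial concentrations
dC, dS = rhs([0.5] * 10, [1.0] * 10)
```

Because uptake, release and the SR leak only move Ca2+ between the two compartments of a junction, the sum of the two derivatives at a site with uniform neighbours reduces to i - e C_d, which gives a quick consistency check on any implementation.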

2.1. Input from cell exterior

The first term of Eq. 1, i, describes the influx of Ca2+ from the ECM into the intracellular space. For the numerical experiments in this study, i is taken to be constant.

2.2. SR uptake via SERCA pump

The second term, $V_u C_d^u / (K_u^u + C_d^u)$, uses a Hill function of degree u to model the rate of Ca2+ uptake from the cytoplasm into the SR Ca2+ store. Although this model is based on that of Goldbeter 24, the form of this term is widely agreed upon and appears in many models 6, 8, 9.

2.3. CICR release from SR

The third term, $V_{ra} S_d^r C_d^a / ((K_r^r + S_d^r)(K_a^a + C_d^a))$, uses two Hill functions, of degree r and a respectively, to model the rate of Ca2+ release from the SR via a process activated by the cytoplasmic Ca2+ concentration. In the second of the two Hill functions, the degree a reflects the cooperativity of this activation process. Here the model takes its inspiration from a model of Goldbeter 24.

Other authors use a different term to reflect the action of both the CICR pump and the SERCA pump. This term, as in Shannon 6, is the difference between a forward process that releases Ca2+ from the SR into the cytoplasm and the reverse process modeled in the usual way by the SERCA pump term described above. In those models, the forward process gives a larger rate when the Ca2+ concentration in the cytoplasm is higher, as is required by CICR. However, the forward process in those models gives a lower rate when the Ca2+ concentration in the SR is higher. This is not consistent with experimental observations (described in Gyorke 29, Lukyanenko 30, Hobai 31 and elsewhere, and summarized in Bers 32) that report higher Ca2+ release when the SR concentration rises. The functional form for the CICR pump in Goldbeter's model solves this problem, as the response rises with Ca2+ concentration in either the cytoplasm or the SR. See Williams' thesis 23 for a graphical representation of this functional response.

Some authors model the states of the CICR pumps (e.g. open, closed) separately 8, 9, 10 or incorporate more of the details of the chemistry, but the model studied here sacrifices some of the local complexity in order to couple multiple sites together.
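To illustrate the property discussed above, namely that the release term rises with Ca2+ concentration in either the cytoplasm or the SR, the following minimal Python sketch evaluates the product of the two Hill functions with the default values from Table 1. The function name `cicr_release` and the sample concentrations are assumptions made for illustration; they do not come from the paper.

```python
def cicr_release(C, S, Vra=500.0, Kr=2.0, Ka=0.9, r=4, a=2):
    """CICR release rate (uM/s) as a product of two Hill functions.

    C: cytoplasmic Ca2+ (uM), S: SR Ca2+ (uM).
    Default parameter values follow Table 1.
    """
    sr_factor = S**r / (Kr**r + S**r)        # grows with SR content
    activation = C**a / (Ka**a + C**a)       # grows with cytoplasmic Ca2+
    return Vra * sr_factor * activation


# Hypothetical sample concentrations, chosen only for illustration.
baseline = cicr_release(C=0.1, S=1.7)
more_sr = cicr_release(C=0.1, S=2.2)      # same cytoplasm, fuller SR
more_cyt = cicr_release(C=0.3, S=1.7)     # same SR, more cytoplasmic Ca2+
```

In this sketch both `more_sr` and `more_cyt` exceed `baseline`, and the rate can never exceed `Vra`, because each Hill factor lies between 0 and 1.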


2.4. Leak from SR

The fourth term, k1 Sd, is the passive leak of Ca2+ from the SR into the cytoplasm. As stated before, this term is a proxy for a variety of other ways in which Ca2+ may move from the SR into the cytoplasm.

2.5. Leak to exterior

The final term of Eq. 1, −eCd, models the passive leak of Ca2+ from the cytoplasm to the ECM. As stated before, this term is a proxy for a variety of other ways in which Ca2+ may leave the cytoplasm.

2.6. Diffusion within SR and cytoplasm

Each pair of a C and an S is interconnected not only within itself, but also with the preceding and succeeding C-S pairs. The remaining terms, −q(Cd − Cd+1) + q(Cd−1 − Cd) in Eq. 1 and −r1(Sd − Sd+1) + r1(Sd−1 − Sd) in Eq. 2, represent the movement of Ca2+ within the cytoplasm and within the SR, respectively. The t-tubules are assumed to be arranged in a sequence, and appropriate adjustments are made to these terms at the two ends of the sequence. These diffusion terms are taken from Williams 23. Many models incorporate diffusion within the cytoplasm, but to our knowledge only Swietach et al 33 incorporate diffusion within the SR, which must also occur. The rate of diffusion within the SR has not been measured, and this constant would have to take into account not only that rate but also the presumed distance between t-tubules. This model takes its constant from Williams' thesis 23, but it would be misleading to suggest that it was experimentally determined. It is comparable in magnitude to the diffusion constant for the cytoplasm.

3. Methods

Fig. 2 shows the synchronization of the Ca2+ concentration of the twenty different regions. The presence and stability of this phenomenon in the coupled system was the main result of the thesis of Williams 23. The oscillations in Fig. 2 represent the Ca2+ concentration in the ten cytoplasmic and ten SR regions. Each line represents a unique SR or cytoplasmic region. The Ca2+ concentration oscillations appear staggered initially but eventually synchronize. The default parameters listed in Table 1 and the default initial Ca2+ concentration levels for the twenty different regions listed in Table 2 were used to generate Fig. 2. The synchronization displayed in Figure 2 is robust and occurs across a range of parameter values and initial conditions.

Figure 2. This image depicts the baseline run of twenty quantities with parameters as in Table 1 and initial conditions as in Table 2.

Table 2. Default Initial Conditions

| Region | Initial Value |
|---|---|
| C1 | 0.1 µM |
| C2 to C10 | 0.06 µM |
| S1 | 2.2 µM |
| S2 to S10 | 1.7 µM |
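The coupled system of Eq. 1 and Eq. 2 with the values of Table 1 and Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the paper does not specify the adjustments made at the two ends of the sequence, so the sketch assumes no-flux ends, and it uses a fixed-step fourth-order Runge-Kutta scheme in place of MATLAB's ode113. The names `rhs`, `rk4_step`, `simulate` and the step size are hypothetical.

```python
# Default parameters from Table 1 (rates in uM/s or 1/s, thresholds in uM).
P = dict(i=3.0, Vu=65.0, Vra=500.0, Ku=1.0, Kr=2.0, Ka=0.9,
         k1=1.0, e=9.0, q=0.3, r1=0.2, u=2, r=4, a=2)
N = 10  # number of coupled (C_d, S_d) pairs


def rhs(C, S, p=P):
    """Right-hand sides of Eq. 1 (dC) and Eq. 2 (dS) for all N sites."""
    dC, dS = [], []
    for d in range(N):
        uptake = p["Vu"] * C[d]**p["u"] / (p["Ku"]**p["u"] + C[d]**p["u"])
        release = (p["Vra"] * S[d]**p["r"] / (p["Kr"]**p["r"] + S[d]**p["r"])
                   * C[d]**p["a"] / (p["Ka"]**p["a"] + C[d]**p["a"]))
        leak = p["k1"] * S[d]
        # ASSUMPTION: no-flux ends, a missing neighbour contributes nothing.
        diff_C = diff_S = 0.0
        for nb in (d - 1, d + 1):
            if 0 <= nb < N:
                diff_C += p["q"] * (C[nb] - C[d])
                diff_S += p["r1"] * (S[nb] - S[d])
        dC.append(p["i"] - uptake + release + leak + diff_C - p["e"] * C[d])
        dS.append(uptake - release - leak + diff_S)
    return dC, dS


def rk4_step(C, S, dt):
    """One classical Runge-Kutta step (stand-in for ode113)."""
    def shifted(X, dX, h):
        return [x + h * dx for x, dx in zip(X, dX)]
    k1C, k1S = rhs(C, S)
    k2C, k2S = rhs(shifted(C, k1C, dt / 2), shifted(S, k1S, dt / 2))
    k3C, k3S = rhs(shifted(C, k2C, dt / 2), shifted(S, k2S, dt / 2))
    k4C, k4S = rhs(shifted(C, k3C, dt), shifted(S, k3S, dt))
    newC = [c + dt / 6 * (a + 2 * b + 2 * g + h)
            for c, a, b, g, h in zip(C, k1C, k2C, k3C, k4C)]
    newS = [s + dt / 6 * (a + 2 * b + 2 * g + h)
            for s, a, b, g, h in zip(S, k1S, k2S, k3S, k4S)]
    return newC, newS


def simulate(t_end, dt=1e-3):
    """Integrate from the Table 2 initial values up to t_end seconds."""
    C = [0.1] + [0.06] * (N - 1)
    S = [2.2] + [1.7] * (N - 1)
    for _ in range(round(t_end / dt)):
        C, S = rk4_step(C, S, dt)
    return C, S


C_end, S_end = simulate(t_end=1.0)  # short demonstration run
```

A useful sanity check on such a sketch is conservation: summed over all sites, the uptake, release, leak and diffusion terms cancel, so the total Ca2+ changes only through the influx i and the efflux eCd. Longer runs and recorded trajectories would be needed to examine synchronization.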

The model was then compared with four distinct results by altering the parameters associated with reported physiological experiments, described below.

(i) The initial Ca2+ concentrations of the twenty different regions were varied. Altering the initial Ca2+ concentrations of the twenty areas did not change the behavior of the observed baseline in Figure 2.
(ii) Lakatta et al 19 showed that the rate of Ca2+ influx directly affected the ability of the cardiac myocyte to undergo ECC. The value of the model parameter controlling calcium influx, i, was varied from the default value to determine the range in which sustained oscillations were observed.
(iii) Jiang et al 26 showed that reducing the activity of the SERCA pump leads to heart failure. The value of the model parameter controlling the rate of Ca2+ uptake, Vu, was varied from the default value to determine the range in which sustained oscillations were observed.
(iv) Marks et al 27, 28 propose that decreasing the threshold constant for SR Ca2+ release leads to heart failure. The value of the model parameter controlling the threshold constant for SR Ca2+ release, Kr, was varied from the default value to determine the range in which sustained oscillations were observed.
(v) Marin-García 16 describes research reporting that an increased rate of Ca2+ leaving the cytoplasm led to heart failure. The value of the model parameter controlling the rate of Ca2+ removal from the cytoplasm to the ECM, e, was varied from the default value to determine the range in which sustained oscillations were observed.
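The paper does not state how "sustained oscillations" were identified when a parameter was varied. One simple possibility, shown here purely as an assumed criterion, is to test whether the late part of a simulated trace still swings by more than a small peak-to-peak amplitude. The function name, the tail fraction and the amplitude threshold are all hypothetical, and the traces below are synthetic stand-ins rather than model output.

```python
import math


def sustained_oscillation(trace, tail_fraction=0.25, min_amplitude=0.01):
    """Return True if the last part of `trace` still oscillates.

    The criterion (peak-to-peak swing of the final `tail_fraction` of the
    samples above `min_amplitude`) is an illustrative assumption.
    """
    tail = trace[int(len(trace) * (1 - tail_fraction)):]
    return max(tail) - min(tail) > min_amplitude


# Synthetic traces standing in for a simulated C_d(t); not model output.
t = [0.01 * k for k in range(5000)]
persistent = [0.5 + 0.3 * math.sin(2 * math.pi * x) for x in t]
damped = [0.5 + 0.3 * math.exp(-x) * math.sin(2 * math.pi * x) for x in t]
flat = [0.5 for _ in t]
```

Applying such a test to one trace per parameter value in a sweep would give the lowest and highest values at which oscillations persist. Checking synchrony across sites would need an additional comparison between traces.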

All model simulations were run in MATLAB R2010b using the solver ode113. Unless otherwise specified, all parameters and initial conditions were set to the default values given in Table 1 and Table 2.

4. Results

4.1. Propagation of calcium release

For a large range of parameter choices the persistent propagation of calcium waves was observed, and these tended to synchronize across all of the (Cd, Sd) pairs, representing a cell whose calcium cycle is synchronized in time across the length of the myocyte. Altering the initial Ca2+ concentrations of the twenty areas did not change our observed baseline. The presence and stability of this phenomenon in the coupled system was the main result of the thesis of Williams 23. Loosely speaking, the cell is "contracting" when this happens, as observed in experiments 21, 34. Synchronous firing is also observed 22. These spontaneous recurring discharges have been implicated in cardiac pathology 19. Recurring spontaneous traveling waves have also been observed 14, 20, 35, 36. This model does not display a noticeable traveling wave, most likely because it would require a much larger number of (Cd, Sd) pairs to simulate the length of a cell. But it clearly displays the regenerative pulse seen in experiments. Initial conditions for the baseline run in Figure 2 were quite similar across sites. Figure 3 below shows synchronization even when the initial conditions vary greatly across sites.

Figure 3. Slower synchronization with less homogeneous initial conditions. All parameters and initial conditions are as in Figure 2 except C1 = 1, C9 = C10 = .000005.

4.2. Importance of diffusion in the SR

The parameter r1 controlling diffusion in the SR is a key parameter in this model. Without it, synchronization is lost, as seen in Figure 4. In this figure, the only synchronized sites are those with identical initial conditions.

4.3. The Effect of Changing i on CICR

Klabunde 25 writes that the concentration of Ca2+ entering a cardiac myocyte directly influences both CICR and myocyte contractions. Any model that attempts to numerically describe Ca2+ dynamics in a cardiac myocyte must therefore show a dependence on the concentration of influxing Ca2+ in order to be a viable, working model. As shown in Figure 5, increasing the value of i, the rate of Ca2+ influx in our model, leads to a cessation of CICR. Because CICR has been shown to be the central mechanism of ECC, a cell that can no longer perform CICR will no longer undergo ECC. Similarly, decreasing the value of i leads to a complete cessation of CICR and therefore to termination of ECC. Our results are therefore consistent with Klabunde's text.


Figure 4. No synchronization without SR transport. All parameters and initial conditions as in Figure 3, except the rate of calcium transport in the SR, r1 , is set to zero.

Figure 5 shows the effect of changing the influx of calcium on the initiation, propagation, and synchronization of Ca2+ concentration oscillations in both the cytoplasm and the SR. When i is substantially below the lower limit of sustained, synchronous oscillations, there is no CICR. When i lies between the value at which flat lining begins and the lowest value at which sustained, synchronous oscillations occur, the initial, asynchronous Ca2+ concentration oscillations in both the cytoplasm and the SR eventually flat line. Figure 5 also shows the lowest and the highest values of i for which sustained, synchronous Ca2+ concentration oscillations develop in both the cytoplasm and the SR. When i lies between the highest value at which sustained, synchronous oscillations occur and the value at which flat lining begins, asynchronous Ca2+ concentration oscillations develop in both the cytoplasm and the SR. Finally, Figure 5 illustrates the lack of CICR when i is substantially above the upper limit of sustained, synchronous oscillations. The parameter ranges across which sustained, synchronous Ca2+ concentration oscillations occur are presented in Table 3. What is observed in Figure 5 is a Hopf bifurcation, with bifurcation parameter i.

4.4. Effect of Changing SERCA Pump Activity on CICR

In the second experiment, the value of Vu was varied from the default value to correspond to the experiments of Jiang et al 26, who demonstrated that decreasing the activity of the SERCA pump directly influences both CICR and myocyte contractions. Specifically, they found that heart failure resulted from a partial deactivation of the SERCA pump. Therefore, only a model that successfully incorporates this finding represents a truly viable, working model. As shown in Figure 6, decreasing the value of Vu, the maximum rate of Ca2+ uptake from the cytoplasm into the SR, leads to a cessation of CICR. As stated above, a cell that can no longer perform CICR will no longer undergo ECC, a process the heart depends upon to maintain normal function. Similarly, we examined how increasing the value of Vu affected CICR. We found that Vu could not be increased beyond a certain point while still maintaining CICR, as increasing Vu by too much led to a complete cessation of CICR. Our results are therefore consistent with the findings of Jiang et al.

The parameter Vu is also a Hopf bifurcation parameter, and the results of varying it are similar to Figure 5, displaying the lack of CICR when Vu is substantially below the lower limit or above the upper limit of sustained, synchronous oscillations. When the value of Vu is between the lowest value at which sustained, synchronous oscillations occur and the value at which flat lining begins, the initial, asynchronous Ca2+ concentration oscillations in both the cytoplasm and the SR eventually flat line. The parameter ranges across which sustained, synchronous Ca2+ concentration oscillations occur are presented in Table 3.

4.5. Effect of Changing the SR Ca2+ Release Threshold on CICR

Researchers have shown that there is an increase in Ca2+ passing from the SR to the cytoplasm in myocytes of animals in chronic heart failure 38, 39, which Shannon et al 6 attribute to RyR dysregulation, based on experiments of Marx 28 and others. Marks et al 27, 28 demonstrated that increasing the rate of the passive Ca2+ leak from the SR into the cytoplasm led to heart failure. Much like the two previous cases, only a model that successfully incorporates this third finding represents a truly viable, working model. As shown in Figure 7, decreasing the value of Kr, the threshold for SR Ca2+ release into the cytoplasm, leads to a cessation of CICR. As stated above, a cell that can no longer perform CICR will no longer undergo ECC. Similarly, increasing the value of Kr leads to a complete cessation of CICR and therefore to termination of ECC. Our results are therefore consistent with Marks' findings because our model also predicts heart failure when Kr becomes either too large or too small.


Figure 5. The Effect of Changing i on CICR


In the third experiment, the value of Kr was varied from the default value. This parameter controls the threshold for the CICR response, and the model displays synchronization of Ca2+ concentration oscillations only within a range of values, again showing a Hopf bifurcation similar to Figure 5. The parameter ranges across which sustained, synchronous Ca2+ concentration oscillations occur are presented in Table 3.

4.6. Effect of Changing the Rate of Cytoplasm to ECM Ca2+ Movement on CICR

Marin-García 16 reported that increasing the rate of Ca2+ movement from the cytoplasm to the ECM led to heart failure. In the fourth experiment, we changed the value of e from the default value and examined the effect of this change on the synchronization of Ca2+ concentration oscillations in CICR. As in the three previous cases, this model successfully demonstrates this effect. Increasing the value of e, the rate of Ca2+ efflux from the cytoplasm into the ECM, leads to a cessation of CICR. A cell that can no longer perform CICR will no longer undergo ECC. Similarly, decreasing the value of e leads to a complete cessation of CICR and therefore to termination of ECC. Our results are therefore consistent with the description in Marin-García's text because our model also predicts heart failure when e becomes either too large or too small. As in previous figures, the system shows a passage from no CICR to asynchronous oscillation, to synchronous oscillation, and back again. The parameter ranges across which sustained, synchronous Ca2+ concentration oscillations occur are presented in Table 3.

Table 3.

Summary of regions of oscillation as parameters vary

| Observation | Study | Parameter | Low Boundary | High Boundary |
|---|---|---|---|---|
| The amount of input calcium determines ability for CICR | 25 | i | 2.8 µM/s | 6.275 µM/s |
| Reduced SERCA pump stops CICR | 26 | Vu | 49.5 µM/s | 172 µM/s |
| Decreased threshold for SR Ca release stops CICR | 27 | Kr | 0.45 µM | 2.8 µM |
| Increased rate of Ca movement from cytoplasm to ECM stops CICR | 16 | e | 4/s | 9.6/s |


5. Summary

This paper presents a novel model of calcium homeostasis in cardiac myocytes that builds on the work of Goldbeter et al 24 and Williams 23 by coupling together ten different junctional release sites. Because the model works on the scale of anywhere from one to ten distinct cytoplasm-SR junctions, it would be possible with sufficient computing power to extend it to the many thousands of junctions that exist in a single cardiac myocyte. This model extends previous work in this field by generating Ca2+ concentrations for each cytoplasmic and SR region that initially oscillate independently but ultimately synchronize.

In this study, we compared our model to four different experimental conditions. The analysis of these comparisons is given below. This model exhibits many of the same phenomena observed in the laboratory using only two different kinds of Ca2+ pumps. In particular, it

(i) displays spontaneous recurring calcium peaks in the presence of sufficient calcium input;
(ii) displays synchronization of these peaks across sites under a range of parameters;
(iii) shows appropriate dependence on the efficiency of the SERCA pump, as described in Jiang et al 26;
(iv) shows appropriate dependence on the threshold setting for CICR, as described in Marks et al 27, 28;
(v) shows cessation of oscillations when the calcium leak out of the cell is too large or too small, consistent with the description in Marin-García 16;
(vi) confirms the importance of diffusion in the SR by demonstrating that synchronization ceases without SR diffusion.

This model allows its user to distinguish which observed aspects of myocyte Ca2+ dynamics occur because of intracellular communication and which phenomena occur simply because of the local physiological arrangement of the system near each junction. It further demonstrates that the SR, and more specifically the communication between different regions of the SR, plays a key role in the synchronization of Ca2+ concentration oscillations.


Acknowledgements

The authors wish to thank the James O. Freedman Presidential Scholarship for funding the work of J. E. Tanenbaum.

References

1. A. Zahradnikova and I. Zahradnik, Biophysical Journal 71, 2996-3012 (1996).
2. G. M. Faber, J. Silva, L. Livshitz, and Y. Rudy, Biophysical Journal 92, 1522-1543 (2007).
3. J. L. Greenstein and R. L. Winslow, Biophysical Journal 83, 2918-2945 (2002).
4. G. Dupont, M. J. Berridge, and A. Goldbeter, Cell Calcium 12, 73-85 (1991).
5. M. Falcke, Biophys J 84, 28-41 (2003).
6. T. R. Shannon, F. Wang, and D. M. Bers, Biophysical Journal 89, 4096-4110 (2005).
7. Y. Tang, J. L. Stephenson, and H. G. Othmer, Biophysical Journal 70, 246-263 (1996).
8. R. Hinch, J. L. Greenstein, A. J. Tanskanen, L. Xu, and R. L. Winslow, Biophysical Journal 87, 3723-3736 (2004).
9. P. A. Spiro and H. G. Othmer, Bulletin of Mathematical Biology 61, 651-681 (1999).
10. J. Keizer and L. Levine, Biophysical Journal 71, 3477-3487 (1996).
11. M. S. Jafri, J. J. Rice, and R. L. Winslow, Biophysical Journal 74, 1149-1168 (1998).
12. H. T. Tuan, G. S. Williams, A. C. Chikando, E. A. Sobie, W. J. Lederer, and M. S. Jafri, Conf Proc IEEE Eng Med Biol Soc., 4677-80 (2011).
13. D. M. Bers, Excitation-Contraction Coupling and Cardiac Contractile Force, edition 2, Kluwer Academic, Dordrecht, Netherlands (2001).
14. H. Cheng, W. J. Lederer, and M. B. Cannell, Science, New Series 262 (5134), 740-744 (1993).
15. M. Fu, R-X. Li, L. Fan, G-W. He, K. L. Thornburg, Z. Wanga, Biochemical Pharmacology 75, 2147-2156 (2008).
16. J. Marin-García, Contemporary Cardiology 15, Springer Science+Business Media (2010).
17. G. Hasenfuss and B. Pieske, J Mol Cell Cardiol 34, 951-969 (2002).
18. T. R. Shannon, K. S. Ginsburg, and D. M. Bers, Circ. Research 91, 594-600 (2002).
19. E. Lakatta, Cardiovasc. Res 26, 193-214 (1992).
20. A. A. Kort, M. C. Capogrossi, and E. G. Lakatta, Circ. Res. 57, 844-855 (1985).
21. M. H. P. Wussling, K. Krannich, G. Landgraf, A. Herrmann-Frank, D. Wiedenmann, F. N. Gellerich, and H. Podhaisky, FEBS Lett. 463, 103-109 (1999).
22. M. C. Capogrossi, S. R. Houser, A. Bahinski, and E. G. Lakatta, Circ. Res. 61, 498-503 (1987).
23. B. V. Williams, Ph.D. Thesis, Dartmouth College (2005).
24. A. Goldbeter, G. Dupont, and M. J. Berridge, Proc Natl Acad Sci U S A 87, 1461-1465 (1990).
25. R. E. Klabunde, Cardiovascular Physiology Concepts, Second Edition, Lippincott Williams & Wilkins (2011).
26. M. T. Jiang, A. J. Lokuta, E. F. Farrell, M. R. Wolff, R. A. Haworth, and H. H. Valdivia, Circ. Res. 91, 1015-1022 (2002).
27. A. R. Marks, J Mol Cell Cardiol 33, 615-624 (2001).
28. R. S. Marx, Y. Hisamatsu, T. Jayaraman, D. Burkhoff, N. Rosemblit, and A. R. Marks, Cell 101, 365-376 (2000).
29. S. Gyorke, I. Gyorke, V. Lukyanenko, D. Terentyev, S. Viatchenko-Karpinski, and T. F. Wiesner, Frontiers in Bioscience 7, 1454-1463 (2002).
30. V. Lukyanenko, S. Subramanian, I. Gyorke, T. F. Wiesner, and S. Gyorke, Journal of Physiology 518 (1), 173-186 (1999).
31. I. A. Hobai and B. O'Rourke, Circulation 103, 1577-1584 (2001).
32. D. M. Bers, D. A. Eisner, and H. H. Valdivia, Circ. Res. 93, 487-490 (2003).
33. P. Swietach, K. W. Spitzer, and R. D. Vaughan-Jones, Front Biosci. 15, 661-680 (2010).
34. Y. Nakayama, K. Kawahara, M. Yoneyama, and T. Hachiro, Biol. Rhythm Res. 36, 317-326 (2005).
35. J. Engel, A. J. Sowerby, S. A. E. Finch, M. Fechner, and A. Stier, Biophysical Journal 68, 40-45 (1995).
36. J. Engel, M. Fechner, A. J. Sowerby, S. A. E. Finch, and A. Stier, Biophys. J 66, 1756-1762 (1994).
37. D. A. Williams, L. M. Delbridge, S. H. Cody, P. J. Harris, and T. O. Morgan, Am J Physiol Cell Physiol 262 (3), C731-C742 (1992).
38. P. Neary, A. M. Duncan, S. M. Cobbe, and G. L. Smith, Pflugers Arch. 444, 360-371 (2002).
39. T. R. Shannon, S. M. Pogwizd, and D. M. Bers, Circ. Res. 93, 592-594 (2003).

May 6, 2013

15:31

BC: 8846 - BIOMAT 2012

08˙mehr

MODELING NATURAL KILLER CELL REPERTOIRE DEVELOPMENT AND ACTIVATION DYNAMICS

MICHAL STERNBERG-SIMON, RAMIT MEHR
The Mina and Everard Goodman Faculty of Life Science, Bar-Ilan University, Ramat-Gan, Israel

Natural killer (NK) cells are immune system cells that play a key role in several clinical conditions, such as cancer, transplantations and pregnancies. Their activation is determined by the integration of opposing signals received through a set of activating and inhibitory receptors. Self tolerance of NK cells is ensured during a process termed "education", which acts by adjusting the expression frequency of the inhibitory receptors and by tuning the responsiveness of the cell, according to the strength of inhibitory signals it receives. As a result, the NK cell repertoire is comprised of a heterogeneous population of cells that vary in their receptor composition as well as in their functionality. The high complexity of the NK cell repertoire calls for the use of computational methods to elucidate the rules governing its formation and function. Our group investigates several aspects concerning the development and activation of NK cells, at different levels. At the population level, we study the dynamics of developing NK cells using labeling data and overall differential equations. At the repertoire level, we investigate the forces shaping the repertoire composition, in terms of the expression frequencies of each receptor. At the single cell level, it has been shown that the NK cell population is highly heterogeneous, as each NK cell behaves differently on encountering a target cell. We wish to use computer simulation to understand which characteristics distinguish one NK cell from another. Finally, we investigate the internal mechanisms that control the activation of the NK cell.

1. Introduction

1.1. The immune response and its complexity

The immune system involves various cell types, including T, B and Natural Killer (NK) lymphocytes. Each of these cell types exhibits a diverse repertoire of individual cells: T and B cells express somatically-rearranged antigen receptors which differ between T and B cell clones and are capable of distinguishing between self and foreign antigens; NK cells express one or more inhibitory and activating receptors and kill non-self or altered-self body cells, such as those infected by viruses or transformed into cancer cells. These repertoires interact with other repertoires and other immune system


cells in a complex network of interactions. Furthermore, lymphocyte activation is regulated by a complicated network of signals. As a result, the immune system and its functionality are highly complex and non-linear. The analysis of immune system data is non-trivial and calls for the use of systems biology methods, including mathematical and computational models.

1.2. Natural killer cells

1.2.1. NK cell function

NK cells are lymphocytes that can secrete cytokines and kill virally infected, transformed or otherwise stressed target cells. They play an important role in many clinical conditions, such as cancer, viral infections and autoimmune diseases, and determine the outcome of transplantation and pregnancy. They are traditionally considered innate immune cells, having no immune memory and requiring no prior sensitization. However, both concepts have been challenged by recent findings showing that resting NK cells require additional signals prior to their activation 1,2 and that, following their response, long-lived memory NK cells are generated, which are better at responding to secondary challenges 3.

1.2.2. NK cell receptors

NK cells differ from T and B cells, as they do not rearrange their antigen receptor genes, and thus their specificity for target cells is not dominated by a single antigen receptor. Instead, the cell's activation is determined by the balance between signals from a range of stimulatory and inhibitory receptors expressed at the surface of each NK cell 4,5. NK activating receptors, such as CD16, NK1.1 and NKG2D, bind certain membrane-bound molecules that are upregulated in stressed cells, and antibodies that coat target cells. The main inhibitory receptor families are Ly49 in rodents, killer-cell immunoglobulin-like receptors (KIRs) in humans and NKG2A/CD94 in both 6. These receptors mostly bind self Major Histocompatibility Complex (MHC) class-I molecules, which are downregulated in virally-infected or transformed cells. Such downregulation removes the inhibition and allows NK cells to kill the target cells. The genes encoding NK cell receptors, particularly the inhibitory ones, are highly polymorphic and their expression is polygenic, such that an individual's genome contains one or more of many variants of each gene, and several genes may be


expressed simultaneously in each cell 5,7. Each cell expresses a randomly chosen subset of the available receptor genes, and thus a highly variable repertoire of NK cell subsets expressing different receptor combinations is created. In addition, the different receptors exhibit diverse binding properties. The degree of MHC class I binding is related to the extent of functional inhibition and to the educating impact of the receptors. How the NK cell integrates the signals from all activating and inhibitory receptors to reach a decision of whether to kill the target cells and/or secrete cytokines, or do nothing, is still not completely known and has been the subject of several mathematical and modeling studies (reviewed below).

1.2.3. NK cell education

NK cell education is the process that leads to the maturation of an NK cell functional repertoire that is adapted to the self MHC class I environment. NK cells in MHC class I-deficient hosts are hyporesponsive and are therefore tolerant to self cells 8. NK cells in MHC class I-sufficient hosts either express at least one inhibitory receptor or are also hyporesponsive. Several mechanisms that lead to self tolerance have been proposed, among them "arming", "licensing", "disarming" and the most recent one, the "dynamic rheostat model". According to this model, the efficiency with which NK cells respond to a certain stimulus is set in developing NK cells, and is directly proportional to the strength of the signal delivered by MHC class I ligands during education 9−11. Furthermore, peripheral NK cells from MHC-deficient mice gain responsiveness upon their adoptive transfer into an MHC-sufficient environment 12,13, implying that NK cell education is a dynamic process and that even mature NK cells are constantly tuning their responsiveness according to their environment.

There is an ongoing debate regarding whether NK receptor expression is random and independent, or is affected by the MHC class I background. The 'product rule' (PR) hypothesis proposes that the frequency of an NK subset co-expressing multiple inhibitory receptors is the product of the expression probabilities of each receptor, that is, individual receptor expression frequencies are independent. However, if tolerance is achieved by regulating, among other factors, the expression of different combinations of receptors, then the expression pattern of the inhibitory receptors would not comply with the product rule, but rather differ between hosts with different MHC backgrounds and be similar between hosts with a similar MHC background. In addition, the NK cell repertoire in MHC-deficient hosts,


where no education effect is possible, has so far been assumed to follow the product rule. While it is now clear that in mice, MHC class I does exert an effect on the frequencies of the different Ly49 subsets 14, the question of whether a similar effect takes place in humans as well is yet to be answered 15,16. We have used mathematical and computational modeling to address this question (see below).

1.2.4. NK cell development and maturation

A popular view is that NK cell development and education take place in the bone marrow. However, NK cells have been shown to arise at additional sites, such as the thymus and lymph nodes 17−21. The maturation stages that the NK cell undergoes on its way to becoming a mature and potent NK cell are also not well characterized. Several differentiation markers distinguish NK cell subsets that differ not only in their phenotype but also in their functionality and anatomic distribution 21−23. Several studies from our group have used computational and mathematical models to investigate NK cell education and development. Among them are models that evaluate the plausibility of different education theories 24−26 and models that simulate the population dynamics of NK cells during their maturation, based on different maturation markers [Elemans et al, manuscript in preparation]. Some of these models are reviewed below.

1.2.5. NK cell activation – Synapses and Signaling

As already mentioned, the activation or inhibition of NK cells is regulated by the expression of MHC class I molecules on the target cells. Studies have shown that there is a threshold in the amount of MHC expressed by target cells required to inhibit NK cell cytotoxicity 27,28. NK cells are, however, highly heterogeneous in their thresholds and responses. Önfelt et al. used imaging of individual NK cells and showed that the NK cell population is heterogeneous in its cytotoxicity, indicating a separate mechanism of killing or a different killing efficiency for each NK cell 29. NK cell activating and inhibitory receptors bind their respective ligands in the NK cell immunological synapse (NKIS), a highly organized supramolecular structure that forms at the contact area between the NK cell and the target cell and involves inhibitory and activating receptors, adhesion molecules and lipid microdomains (also known as rafts). The NKIS can be an inhibitory NKIS or a cytolytic one, based on the signals exchanged


between the NK cell and target cell. If a cytolytic NKIS is formed, the lipid rafts move and recruit various signaling molecules towards the center of the NKIS. In contrast, if enough inhibitory receptors are bound to inhibit the signals from the activating receptors, then the directed movement is stopped, the recruitment of the lipid rafts is prevented and the synapse does not become cytolytic.

2. Using mathematical models to understand NK cell education

2.1. Methods for comparing repertoires

We aimed to determine whether the MHC class I background affects NK cell repertoire composition, in both humans and mice. For this purpose, we analyzed data on NK cell repertoires extracted from the bone marrow (BM) of mice from different single-MHC and MHC-deficient backgrounds. In addition, we analyzed NK cell repertoire data from blood samples of humans with different MHC backgrounds. In both cases, cells were probed for the expression of five inhibitory receptors (from the Ly49 or KIR families, in mice and humans, respectively).

We first aimed to determine whether repertoires of individuals from the same background are more similar to each other than to repertoires of individuals from different backgrounds. For this purpose, we implemented the Jensen-Shannon divergence, which measures the similarity between two probability distributions (Equation 1) 30.

\[ JSD(P|Q) = \frac{1}{2} \sum_i p_i \ln\frac{p_i}{q_i} + \frac{1}{2} \sum_i q_i \ln\frac{q_i}{p_i}. \qquad (1) \]

In the above equation, P and Q are two distributions, and $p_i$ and $q_i$ are the frequencies of the i-th observation in P and Q, respectively. Second, we aimed to quantify the effects of MHC class I on the repertoire by comparing the observed repertoire to the one expected under the assumption that repertoire formation follows the product rule (PR). If the MHC background does not affect the repertoire, one would expect to see only minor deviations of the observed repertoire from the PR. Conversely, if there is an MHC effect, then the deviations should be consistent in both size and direction for all individuals from the same background. In any case, the deviations in individuals from MHC-deficient backgrounds should be minor. In general, when comparing repertoires, we examine

the ratio between the observed and the expected frequency of each receptor combination, rather than the absolute difference between them, which is less informative. We therefore calculated log(observed/expected) for each receptor combination, and present sample results below.
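The two comparisons described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the product rule is taken to mean that receptors are expressed independently (so the expected frequency of a combination is the product of the marginal frequencies), natural logarithms are used, and all frequencies are assumed to be strictly positive. The first function evaluates Equation 1 exactly as printed; note that the Jensen-Shannon divergence as defined by Lin [30] compares each distribution to their average rather than to each other directly.

```python
import math


def divergence_eq1(p, q):
    """Equation 1 as printed: 0.5*sum(p ln(p/q)) + 0.5*sum(q ln(q/p)).

    Assumes p and q are aligned lists of strictly positive frequencies.
    """
    return 0.5 * sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
                     for pi, qi in zip(p, q))


def marginal_frequencies(observed, n_receptors):
    """Fraction of cells expressing each receptor.

    observed maps a combination (tuple of 0/1 flags) to its frequency.
    """
    return [sum(freq for combo, freq in observed.items() if combo[r])
            for r in range(n_receptors)]


def product_rule_expected(marginals, combo):
    """Expected frequency of a combination if receptors are independent."""
    expected = 1.0
    for f, on in zip(marginals, combo):
        expected *= f if on else 1.0 - f
    return expected


def log_deviation(observed, n_receptors):
    """log(observed/expected) for every combination with nonzero frequency."""
    marginals = marginal_frequencies(observed, n_receptors)
    return {combo: math.log(freq / product_rule_expected(marginals, combo))
            for combo, freq in observed.items() if freq > 0}


# Hypothetical two-receptor repertoire (frequencies sum to 1).
example = {(0, 0): 0.40, (1, 0): 0.30, (0, 1): 0.20, (1, 1): 0.10}
deviations = log_deviation(example, 2)
```

A positive entry in `deviations` marks a combination that is more frequent than the product rule predicts, a negative entry one that is less frequent.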

2.2. Repertoire formation in mice

Our results show that (a) the repertoires in MHC−/− mice differ from the PR, and thus some MHC-independent effects must participate in shaping the NK cell repertoire; and (b) the differences from the PR observed in MHC-sufficient mice correlate with self-MHC binding (Figure 1). For example, in mice with the H-2Kb background, among the single-receptor combinations (which are usually the most frequent combinations in the repertoire), only those of self-MHC-specific receptors (Ly49I and Ly49C) were overexpressed, and thus preferred. In general, combinations of one or two receptors are over-represented, which may indicate that they are preferred. The results also show that adding a further receptor to a combination results in under-expression of the new combination, regardless of whether the added receptor is a self or a non-self receptor. This may indicate that the probability of expressing too many receptors is low, and that this effect is more dominant than other processes shaping the repertoire.

2.3. Repertoire formation in humans

In the case of human NK cell repertoires, the characterization of the forces shaping the repertoire is much more complicated, for several reasons. First, the variation between individuals, even those with the same MHC background, is very large. The Jensen-Shannon distances within groups of individuals from the same background are similar to the distances between individuals with different backgrounds. This can be explained by the fact that, unlike laboratory mice kept in specific pathogen-free conditions, the immune system of each of these individuals has been challenged by different antigens, which affected the repertoire composition. In addition, these humans are not genetically identical, unlike lab mice, which are usually inbred. Another obstacle in the study of repertoire formation in humans is that we do not have enough data on MHC-deficient individuals, making it impossible to separate the MHC-dependent and MHC-independent effects on the repertoire. In spite of these difficulties, we have observed clear deviations from the PR in the human repertoires tested (data not shown).

Figure 1. The deviation from the PR of 43 MHC−/− mice and 26 Kb mice. Deviations were calculated as log(observed/expected). An asterisk next to the name of a combination indicates that its deviation from the PR was found to be statistically significant, using Student’s t-test with the FDR correction for multiple comparisons.

2.4. Deciding between hypotheses: The sequential vs. the Two-step selection models

Two conceptual models have been proposed by Vance and Raulet (1998) to explain the process of NK cell “education”, in which the cells adapt to the self-MHC environment [31]. The first is the sequential activation model. In this scheme, Ly49 genes are randomly expressed; once a receptor gene is activated, it remains stably activated (that is, it stays “on”). During maturation, the cell expresses new receptors until a receptor that binds self-MHC is expressed. The cells are periodically tested for interaction with self-MHC molecules. Strong interactions between the receptors and their MHC ligands prevent additional receptors from being expressed and result in the maturation of the cell. In this scheme, a single testing step accomplishes the education process;

however, this single step may be repeated many times during the cell’s development. The second scheme is the two-step selection model, in which the repertoire is fully formed at an initial stage by a stochastic process and subsequently shaped by two selection steps: one selects for cells expressing at least one self-specific receptor, the other selects against cells expressing multiple self-specific receptors. The two-step selection thus occurs only once for each cell, when it has completed its receptor gene activation. Depending on the signaling thresholds of these steps, the process may allow maturation of cells expressing more than one self-specific receptor. We have used mathematical models and computer simulations to evaluate the two hypothesized selection mechanisms [24]. The “two-step” model was implemented as a computer program that calculates the expected frequencies of cells with given receptor expression patterns as a function of the selection thresholds, based on exact mathematical formulas we have developed. The “sequential” model of NK cell selection was implemented as a stochastic agent-based simulation of the development of the NK cell repertoire, again giving the expected frequencies of cells with given receptor expression patterns based on model parameters. The original Raulet group data were not sufficient to decide conclusively between the models [25]. Hence we applied the models to a specifically generated, larger data set from single-MHC mice, and have shown that the two-step selection model fits the data significantly better than the sequential model [26].
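To make the two-step idea concrete, the following Python sketch enumerates all receptor combinations and applies two hard selection cut-offs. It is a deliberately simplified illustration and not the exact formulas of [24]: it assumes that receptors are initially expressed independently with given probabilities, and it replaces the signaling thresholds of the model with hypothetical integer bounds `min_self` and `max_self` on the number of self-specific receptors a surviving cell may express.

```python
from itertools import product


def two_step_repertoire(express_prob, is_self, min_self=1, max_self=1):
    """Expected post-selection frequencies of receptor combinations.

    express_prob: per-receptor expression probability (assumed independent).
    is_self: per-receptor flag, True if the receptor binds self-MHC.
    min_self / max_self: hypothetical hard thresholds standing in for the
    two selection steps (at least min_self, at most max_self self receptors).
    """
    surviving = {}
    for combo in product((0, 1), repeat=len(express_prob)):
        # Step 0: stochastic repertoire formation (independent expression).
        weight = 1.0
        for p, on in zip(express_prob, combo):
            weight *= p if on else 1.0 - p
        # Steps 1 and 2: keep cells within the allowed self-receptor range.
        n_self = sum(on for on, self_flag in zip(combo, is_self) if self_flag)
        if min_self <= n_self <= max_self:
            surviving[combo] = weight
    total = sum(surviving.values())
    # Renormalize over the cells that passed both selection steps.
    return {combo: w / total for combo, w in surviving.items()}


# Hypothetical example: three receptors, the first two bind self-MHC.
repertoire = two_step_repertoire([0.5, 0.5, 0.5], [True, True, False])
```

Raising `max_self` in this sketch lets cells with several self-specific receptors mature, which mirrors the remark above that the outcome depends on the thresholds of the two steps.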

2.5. Models of NK cell signaling

While NK cell repertoire and functionality have been extensively studied, we still lack knowledge regarding the signaling mechanisms responsible for the integration of signals transmitted into the cell and for NK cell activation. Mathematical models and computer simulations serve as a useful tool for investigating the complex signaling pathways involved in NK cell activation. Das has investigated how activation of NK cells is modulated by the strength of receptor-ligand binding and by ligand concentrations [32]. This work created a mathematical model composed of stochastic differential equations describing membrane-proximal initial signaling events, and solved the equations using the Gillespie algorithm. It showed that the magnitude of the concentrations of the inhibitory and activating signaling molecules determines how fast the cell will be activated, while the ratio between the inhibitory and activating signaling molecules determines whether

the cell will be activated at all. In addition, it showed that NK cell activation is not linear in the concentration of the stimulatory ligand. Rather, as this concentration increases beyond a threshold, the activation (measured in the model by the amount of activating signaling molecules) increases substantially, and this threshold increases as the concentration of inhibitory ligand increases. Another interesting insight provided by this model is that, in the presence of weak-affinity stimulatory ligands, the activating receptors transmit an inhibitory signal that is stronger than the activating signal transmitted, resulting in the deactivation of the cell. Under certain conditions, the opposite effect can also occur, that is, inhibitory receptors may induce activation. Watzl et al. have also investigated NK cell activation mechanisms through the use of mathematical models and experiments [33]. They tested several concepts regarding the integration of activating and inhibitory signals by creating a library of mechanistically different modules, an approach known as ensemble modeling. Using Matlab’s SimBiology package, which translates a model to differential equations, they generated 72 putative models, each combining one or more of the possible modules. Their results show that, in order to obtain the activation pattern observed in the experiments, the model must include an association between the activating receptor and the kinase.

2.6. Simulations of the NK cell immune synapse

The models reviewed in the previous section did not address the question of how signals from activating and inhibitory receptors are spatially integrated within the NK cell immunological synapse. We addressed this question, among others, in our studies of the formation of the NK cell immune synapse and how it regulates NK cell function [34]. We constructed a computer simulation of the assembly of the NK cell-target cell synapse, which is based on many experimental findings. Our model of the NK cell immune synapse is a generalized cellular automaton containing two parallel two-dimensional grids, representing the contact areas on the membranes of the two interacting cells: the NK cell grid and the target cell grid. The grids contain some of the different cell surface molecules, and membrane microdomains, present in the NK cell immune synapse. The simulation starts at the moment of cell-cell contact, and molecules move around the grid based on pre-defined rules, the state of each molecule and the molecules in neighboring positions, and the signals received by the cell in the preceding

time steps. We chose to implement our model as an agent-based model (ABM) because molecule numbers in the NKIS are small and stochastic effects are important, and also because such models are more easily accessible to biologists and are easy to program and modify. The first question we addressed using this simulation is how NK cells integrate the signals from the various receptors. We assumed that when an activating receptor binds its ligand, it transduces a positive signal, which may be cancelled by bound inhibitory receptors. The inhibitory receptors recruit phosphatases that interfere with the signaling from activating receptors. If phosphatases cannot diffuse very fast, the effect will be a distance-based inhibition. To model such inhibition, we counted signals transduced by activating receptors. For each bound activating receptor, we did not count its signal at all if there were bound inhibitory receptors in its local environment, defined by a local inhibition radius, which was varied in the simulation. On the other hand, if inhibition extends over the whole contact area due to diffusion of the phosphatase, then inhibition may be regarded as quantity-based, and described by a “simple sum” model: the cell “sums” over all positive signals (number of bound activating receptors), subtracts the inhibitory signals (number of bound inhibitory receptors), and acts according to the net total signal. The integrated signal behavior obtained when using the distance-based inhibition model with an inhibition radius of the order of 3-10 molecules was closer to the experimentally observed behavior. We therefore concluded that inhibitory receptors act locally, i.e., that every bound inhibitory receptor acts on activating receptors within a certain radius around it. The simulation generated a few more interesting insights, as follows. First, the total signal is highly influenced by activating complex dissociation rates, but not by adhesion and inhibitory complex dissociation rates. Second, concerted motion of receptors in clusters (such as those formed by molecule aggregation on certain membrane microdomains) significantly accelerates NKIS maturation.
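The difference between the two signal-integration rules can be illustrated with a short Python sketch. It is not the simulation of [34]: the function names are hypothetical, bound receptors are represented simply as (x, y) grid coordinates, and the "local environment" is assumed here to be a Euclidean disc of the given radius, measured in grid units.

```python
def simple_sum_signal(activating, inhibitory):
    """Quantity-based rule: bound activating minus bound inhibitory receptors."""
    return len(activating) - len(inhibitory)


def distance_based_signal(activating, inhibitory, radius):
    """Distance-based rule: count bound activating receptors that have no
    bound inhibitory receptor within `radius` (Euclidean distance, an
    assumption of this sketch)."""
    radius_sq = radius * radius
    signal = 0
    for ax, ay in activating:
        inhibited = any((ax - ix) ** 2 + (ay - iy) ** 2 <= radius_sq
                        for ix, iy in inhibitory)
        if not inhibited:
            signal += 1
    return signal


# Hypothetical positions of bound receptors on the contact grid.
bound_activating = [(0, 0), (10, 10), (20, 0)]
bound_inhibitory = [(1, 1), (2, 0)]

net_simple = simple_sum_signal(bound_activating, bound_inhibitory)
net_local = distance_based_signal(bound_activating, bound_inhibitory, radius=3)
```

In this toy configuration the two inhibitory receptors sit next to the same activating receptor, so the local rule cancels only one activating signal while the simple sum subtracts two; clustering of inhibitory receptors therefore matters under one rule and not under the other.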

2.7. NK cell activation thresholds

The inhibition of NK cells depends on the amount of MHC class I expressed by the target cells. It has been shown that there is a clear threshold in the amount of surface MHC class I required for inhibition of NK cell cytotoxicity. In spite of the insights gained in the studies reviewed above, however, we still lack knowledge about the molecular forces tuning NK cell response

during education, as well as the regulation of NK cell activation upon encountering a target cell. We have addressed the question of how activation and inhibition of human NK cells are regulated by the expression level of MHC class I on target cells, using both experimental and computational tools [35]. Target cells were transfected to stably express different levels of the MHC class I protein HLA-Cw6. NK cell clones expressing varying levels of inhibitory receptors were incubated with target cells expressing different MHC class I levels. The degranulation and interferon (IFN)-γ secretion levels were measured. It was found that the expression levels of the inhibitory receptor KIR2DL1 determined the amount of HLA-Cw6 required for NK cell inhibition, with very little contribution from other receptors. To model the dependence of the activation/inhibition threshold on MHC class I and KIR expression levels, we used a simple mathematical equation. We assumed that the target cell killing probability per encounter, k, depends on the product of MHC and KIR numbers (MHC × KIR), such that higher numbers of either will lead to higher inhibition, as the probability of an inhibitory receptor finding its ligand depends on the number of ligands on the target cell (and vice versa). We used a sigmoid threshold function (Equation 2), where S denotes the threshold of the NK cell clone in the same experiment, in units of (molecule numbers)², as it is given in terms of the MHC × KIR product. The parameter $k_{max}$ is the maximum killing capacity of each cell in the clone per encounter, and n is the exponent of the sigmoid function.

\[
k = k_{max}\,\frac{S^n}{(MHC \times KIR)^n + S^n} \tag{2}
\]
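Equation 2 is straightforward to evaluate numerically. The Python sketch below is only an illustration: the function name and the parameter values in the example are hypothetical and are not taken from the fitted data of [35].

```python
def killing_probability(mhc, kir, threshold, k_max=1.0, exponent=2.0):
    """Equation 2: k = k_max * S^n / ((MHC x KIR)^n + S^n).

    mhc, kir: molecule numbers on the target cell and the NK cell.
    threshold: S, in units of (molecule numbers)^2.
    k_max, exponent: maximum killing capacity and sigmoid exponent n.
    """
    inhibition = float(mhc * kir)
    s_n = threshold ** exponent
    return k_max * s_n / (inhibition ** exponent + s_n)


# Hypothetical clone: S = 1e8, k_max = 0.8, n = 3.
for mhc_level in (1e2, 1e4, 1e6):
    k = killing_probability(mhc_level, kir=1e4, threshold=1e8,
                            k_max=0.8, exponent=3.0)
    print(f"MHC = {mhc_level:.0e}: k = {k:.4f}")
```

With these made-up values the middle MHC level sits exactly at the threshold (MHC × KIR = S), where the killing probability is half of `k_max` regardless of the exponent.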

This function approaches $k_{max}$ for MHC × KIR ≪ S (insufficient inhibition), and for MHC × KIR ≫ S it approaches zero (complete inhibition). Around MHC × KIR = S, threshold sharpness is determined by the exponent n. We fitted this function to the data on NK cell killing of target cells with various MHC levels, obtained in experiments from many different human NK cell clones, in order to see whether clones differ in their activation threshold, maximal killing capacity, or both. While the values of $k_{max}$, MHC and KIR were set according to the experimental data, the parameters S and n varied between simulations. Fitting was done using the least-squares fitting function of Matlab. The results show that the values of S varied about ten-fold between NK cell clones, while the values of $k_{max}$ and n varied about four- and two-fold, respectively. Thus, NK cells differ mostly in their intrinsic activation threshold (S), which may be set during the education process, rather than in their maximal killing capacity ($k_{max}$). We next explored the kinetics of NK cell activation and target cell lysis. Our experiments, as well as others’, showed that while the number of NK cell-target cell conjugations reaches saturation in the first half-hour of the experiment, lysis is hardly seen even after 3 hours; only after 5 hours is significant target cell lysis observed. To address this issue, we simulated the kinetics of NK cell activation under three alternative mathematical models. Due to lack of sufficient data, we did not perform quantitative fitting, but only looked for a qualitative resemblance to the data. In the simplest model, any NK cell-target cell encounter may result in immediate target cell killing (equations 3-5).

\[
\frac{dC}{dt} = \beta N T - \frac{C}{\tau} \tag{3}
\]

\[
\frac{dN}{dt} = -\beta N T + \frac{C}{\tau} \tag{4}
\]

\[
\frac{dT}{dt} = -\beta N T + \frac{(1-k)\,C}{\tau} \tag{5}
\]

Here, the numbers of free NK cells, free target cells and NK-target cell conjugations are denoted by N, T and C, respectively. The conjugation rate is denoted by β, in units of (cell × min)⁻¹, such that the number of conjugations formed per unit time, βNT, is proportional to the numbers of free target cells and free NK cells. Conjugate lifetime is denoted by τ (min), such that conjugates dissociate at rate 1/τ (min⁻¹). C/τ is therefore the number of NK cells becoming free per unit time due to conjugation release. The parameter k denotes the probability that a target cell dies following an encounter, such that (1 − k) × C/τ living target cells return to the free target population. This model, however, gave very fast killing kinetics, unlike the kinetics observed experimentally. In the second model version, there is a delay in target cell death post-encounter, and the dying target cells are a separate population, denoted by $T_d$. The death rate of dying target cells is denoted by µ (min⁻¹) (equation 6).

\[
\frac{dT_d}{dt} = \frac{k\,C}{\tau} - \mu T_d \tag{6}
\]

However, even in this model, target cell death kinetics were not sufficiently similar to the experimental data.
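The first two kinetic models can be explored numerically with a few lines of Python. The sketch below integrates equations 3-5, and optionally equation 6, with a plain forward-Euler scheme; this integrator, the function name and all parameter values are assumptions chosen for illustration, not the simulation setup of [35].

```python
def simulate_killing(beta, tau, k, mu=None, n0=1000.0, t0=1000.0,
                     t_end=300.0, dt=0.05):
    """Forward-Euler integration of equations 3-5 (and 6 when mu is given).

    mu=None: first model, hit target cells die immediately.
    mu=value: second model, hit target cells enter a dying pool T_d
    and die at rate mu.
    Time is in minutes; returns the final state as a dictionary.
    """
    free_nk, free_target, conjugates = n0, t0, 0.0
    dying, dead = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        formed = beta * free_nk * free_target   # beta * N * T
        released = conjugates / tau             # C / tau
        hit = k * released                      # targets killed on release
        d_dying = 0.0 if mu is None else hit - mu * dying
        d_dead = hit if mu is None else mu * dying
        free_nk += dt * (-formed + released)                    # eq. 4
        free_target += dt * (-formed + (1.0 - k) * released)    # eq. 5
        conjugates += dt * (formed - released)                  # eq. 3
        dying += dt * d_dying                                   # eq. 6
        dead += dt * d_dead
    return {"N": free_nk, "T": free_target, "C": conjugates,
            "Td": dying, "dead": dead}


# Hypothetical parameter values, for qualitative comparison only.
immediate = simulate_killing(beta=1e-4, tau=10.0, k=0.5)
delayed = simulate_killing(beta=1e-4, tau=10.0, k=0.5, mu=0.01)
```

Because equations 3-5 are unchanged in the second model, both runs see the same number of lethal hits; the delayed run simply holds part of them in the dying pool, so its count of dead targets lags behind that of the first model.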

In the third model, the NK cell will kill its target cell only if the NK cell was already primed, by a former target cell encounter. Thus, the population of NK cells is divided into resting NK cells ($N_r$) that do not kill targets, and activated NK cells ($N_a$), that are capable of killing; resting cells become activated after their first encounter with a target cell (equations 7-11). In the following equations, $C_r$ and $C_a$ denote complexes of resting and activated cells with target cells.

\[
\frac{dC_r}{dt} = \beta_r N_r T - \frac{C_r}{\tau_r} \tag{7}
\]

\[
\frac{dC_a}{dt} = \beta_a N_a T - \frac{C_a}{\tau_a} \tag{8}
\]

\[
\frac{dN_r}{dt} = -\beta_r N_r T \tag{9}
\]

\[
\frac{dN_a}{dt} = -\beta_a N_a T + \frac{C_a}{\tau_a} + \frac{C_r}{\tau_r} \tag{10}
\]

\[
\frac{dT}{dt} = -T\,(\beta_r N_r + \beta_a N_a) + \frac{C_r}{\tau_r} + \frac{(1-k)\,C_a}{\tau_a} \tag{11}
\]

Our results showed that the model that best fits the experimental data was the third model, which requires NK cells to be primed before they can kill target cells, and that this priming occurs after the first encounter of the resting NK cell with a target cell. Furthermore, the possibility that resting NK cells could also kill target cells, but with lower efficiency than that of primed cells, yielded graphs that were more similar to those of the simple model, and were less in correspondence with the experimental data.

2.8. Concluding remarks and future questions

Many issues remain unresolved in the field of NK cells. It is now known that NK cell education acts not only through the shaping of the repertoire, but also, if not primarily, through the tuning of the responsiveness of each cell individually, according to the strength of the cell’s binding to self-MHC. Future studies should examine the repertoire both phenotypically and functionally. In the field of NK cell signaling, we still lack knowledge about the mechanisms controlling the responsiveness of the cell and the integration of signals transmitted into the cell during its activation. Though computational models are a useful tool for exploring such mechanisms, experimental

work is greatly needed in order to gain more detailed information about these processes and also for the validation of conclusions arising from the computational work.

Acknowledgments

The studies in our lab that are described in this work have benefitted from our collaborations with Catarina R. Almeida, Petter Brodin, Daniel M. Davis, Marjet Elemans, Petter Höglund, Maria Johansson, Sofia Johansson, Klas Kärre, Karl-Johan Malmberg, Peter Parham, Markus Uhrberg, and Eric Vivier. We are indebted to all of them for the data and fruitful discussions that led to the work described herein. These studies were funded over the years by The Israel Science Foundation (grant number 759/01); The Human Frontiers Science Program (a Young Investigator Grant); and a Systems Biology Prize grant from Teva Pharmaceuticals. The collaborative work on NK cells with groups in Sweden was supported by the Swedish Foundation for Strategic Research and the Swedish Research Council.

References

1. Y. T. Bryceson, M. E. March, H. G. Ljunggren and E. O. Long. Activation, coactivation and costimulation of resting human natural killer cells. Immunol. Rev., 214, 73 (2006).
2. M. Lucas, W. Schachterle, K. Oberle, P. Aichele and A. Diefenbach. Dendritic cells prime natural killer cells by trans-presenting interleukin 15. Immunity, 26, 503 (2007).
3. J. C. Sun, S. Lopez-Verges, C. C. Kim, J. L. DeRisi and L. L. Lanier. NK cells and immune “memory”. J. Immunol., 186, 1891 (2011).
4. S.T. P. Agrawal and S. Naik. Natural Killer Cells in Clinical and Experimental Transplantation: NK Cells in Self-tolerance. Expert Rev. Clin. Immunol. (2008).
5. L. L. Lanier. NK cell recognition. Annu. Rev. Immunol., 23, 225 (2005).
6. L. L. Lanier. NK cell receptors. Annu. Rev. Immunol., 16, 359 (1998).
7. A. Rouhi, C. B. Lai, T. P. Cheng, F. Takei, W. M. Yokoyama and D. L. Mager. Evidence for high bi-allelic expression of activating Ly49 receptors. Nucleic Acids Research (2009).
8. S. Kim, J. Poursine-Laurent, S. M. Truscott, L. Lybarger, Y. J. Song, L. P. Yang, A. R. French, J. B. Sunwoo, S. Lemieux, T. H. Hansen and W. M. Yokoyama. Licensing of natural killer cells by host major histocompatibility complex class I molecules. Nature, 436, 709 (2005).
9. D. H. Raulet and R. E. Vance. Self-tolerance of natural killer cells. Nature Reviews Immunology, 6, 520 (2006).
10. N. C. Fernandez, E. Treiner, R. E. Vance, A. M. Jamieson, S. Lemieux and D. H. Raulet. A subset of natural killer cells achieves self-tolerance without expressing inhibitory receptors specific for self-MHC molecules. Blood, 105, 4416 (2005).
11. P. Brodin and P. Hoglund. Beyond licensing and disarming: a quantitative view on NK-cell education. Eur. J. Immunol., 38, 2934 (2008).
12. N. T. Joncker, N. Shifrin, F. Delebecque and D. H. Raulet. Mature natural killer cells reset their responsiveness when exposed to an altered MHC environment. J. Exp. Med., 207, 2065 (2010).
13. J. M. Elliott, J. A. Wahle and W. M. Yokoyama. MHC class I-deficient natural killer cells acquire a licensed phenotype after transfer into an MHC class I-sufficient environment. J. Exp. Med., 207, 2073 (2010).
14. P. Hoglund and P. Brodin. Current perspectives of natural killer cell education by MHC class I molecules. Nat. Rev. Immunol., 10, 724 (2010).
15. S. Andersson, C. Fauriat, J. A. Malmberg, H. G. Ljunggren and K. J. Malmberg. KIR acquisition probabilities are independent of self-HLA class I ligands and increase with cellular KIR expression. Blood, 114, 95 (2009).
16. K. Schonberg, M. Sribar, J. Enczmann, J. C. Fischer and M. Uhrberg. Analyses of HLA-C-specific KIR repertoires in donors with group A and B haplotypes suggest a ligand-instructed model of NK cell receptor acquisition. Blood, 117, 98 (2011).
17. L. L. Veinotte, T. Y. F. Halim and F. Takei. Unique subset of natural killer cells develops from progenitors in lymph node. Blood, 111, 4201 (2008).
18. J. P. Di Santo and C. A. Vosshenrich. Bone marrow versus thymic pathways of natural killer cell development. Immunological Reviews, 214, 35 (2006).
19. C. Luther, K. Warner and F. Takei. Unique progenitors in mouse lymph node develop into CD127+ NK cells: thymus-dependent and thymus-independent pathways. Blood (2011).
20. N. D. Huntington and J. P. Di Santo. Humanized immune system (HIS) mice as a tool to study human NK cell development. Current Topics in Microbiology and Immunology, 324, 109 (2008).
21. N. D. Huntington, H. Tabarias, K. Fairfax, J. Brady, Y. Hayakawa, M. A. Degli-Esposti, M. J. Smyth, D. M. Tarlinton and S. L. Nutt. NK cell maturation and peripheral homeostasis is associated with KLRG1 up-regulation. J. Immunol., 178, 4764 (2007).
22. L. Chiossone, J. Chaix, N. Fuseri, C. Roth, E. Vivier and T. Walzer. Maturation of mouse NK cells is a 4-stage developmental program. Blood, 113, 5488 (2009).
23. E. Sitnicka. Early cellular pathways of mouse natural killer cell development. J. Innate Immun., 3, 329 (2011).
24. M. Salmon-Divon, P. Hoglund and R. Mehr. Generation of the natural killer cell repertoire: The sequential vs. the two-step selection model. Bulletin of Mathematical Biology, 65, 199 (2003).
25. M. Salmon-Divon, P. Hoglund and R. Mehr. Models for natural killer cell repertoire formation. Clin. Dev. Immunol., 10, 183 (2003).
26. S. Johansson, M. Salmon-Divon, M. H. Johansson, Y. Pickman, P. Brodin, K. Karre, R. Mehr and P. Hoglund. Probing natural killer cell education by Ly49 receptor expression analysis and computational modelling in single MHC class I mice. PLoS One, 4, e6046 (2009).
27. C. R. Almeida and D. M. Davis. Segregation of HLA-C from ICAM-1 at NK cell immune synapses is controlled by its cell surface density. Journal of Immunology, 177, 6904 (2006).
28. T. D. Holmes, Y. M. El-Sherbiny, A. Davison, S. L. Clough, G. E. Blair and G. P. Cook. A human NK cell activation/inhibition threshold allows small changes in the target cell surface phenotype to dramatically alter susceptibility to NK cells. J. Immunol., 186, 1538 (2011).
29. M. A. Khorshidi, B. Vanherberghen, J. M. Kowalewski, K. R. Garrod, S. Lindstrom, H. Andersson-Svahn, H. Brismar, M. D. Cahalan and B. Onfelt. Analysis of transient migration behavior of natural killer cells imaged in situ and in vitro. Integr. Biol. (Camb.), 3, 770 (2011).
30. J. Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37, 145 (1991).
31. R. E. Vance and D. H. Raulet. Toward a quantitative analysis of the repertoire of class I MHC-specific inhibitory receptors on natural killer cells. Curr. Top. Microbiol. Immunol., 230, 135 (1998).
32. J. Das. Activation or tolerance of natural killer cells is modulated by ligand quality in a nonmonotonic manner. Biophys. J., 99, 2028 (2010).
33. S. Mesecke, D. Urlaub, H. Busch, R. Eils and C. Watzl. Integration of activating and inhibitory receptor signaling by regulated phosphorylation of Vav1 in immune cells. Sci. Signal., 4, ra36 (2011).
34. A. Kaplan, S. Kotzer, M. Salmon-Divon, C. R. Almeida, P. Hoglund, D. M. Davis and R. Mehr. Modelling natural killer cell immunological synapses. Tissue Antigens, 69, 382 (2007).
35. C. R. Almeida, A. Ashkenazi, G. Shahaf, D. Kaplan, D. M. Davis and R. Mehr. Human NK cells differ more in their KIR2DL1-dependent thresholds for HLA-Cw6-mediated inhibition than in their maximal killing capacity. PLoS One, 6, e24927 (2011).

May 6, 2013

15:36

BC: 8846 - BIOMAT 2012

09˙shapiro

SATURATION EFFECTS ON T-CELL ACTIVATION IN A MODEL OF A MULTI-STAGE PATHOGEN

MICHAEL SHAPIRO∗
Pathology Department, Tufts University
Boston, MA, USA
E-mail: [email protected]

EDGAR DELGADO-ECKERT
Children’s Hospital (UKBB), University of Basel
Spitalstr. 33, Postfach, 4031 Basel, Switzerland
E-mail: [email protected]

In [6], we studied host response to a pathogen which uses a cycle of immunologically distinct stages to establish and maintain infection. We showed that for generic parameter values, the system has a unique biologically meaningful stable fixed point. That paper used a simplified model of T-cell activation, making proliferation depend linearly on antigen-T-cell encounters. Here we generalize the way in which T-cell proliferation depends on the sizes of the antigenic populations. In particular, we allow this response to become saturated at high levels of antigen. We show that this family of generalized models shares the same steady-state behavior properties with the simpler model contemplated in [6] while offering a new mathematical explanation of post-transplant lympho-proliferative disorder.

∗ This work was partially supported by NIH grant K25AI079404 to MS.

1. Introduction

Pathogens that cyclically traverse different stages during their life cycle or during an infection process have been studied since the late nineteenth century. Important examples are Plasmodium [7], Trypanosoma [18], and the family of herpes viruses, including the Epstein-Barr virus (EBV) [17,16]. One remarkable characteristic of infections with many such pathogens is lifelong persistent infection [9,17,12,7,18]. In [6], we introduced a model of a pathogen that uses a cycle of n antigenically distinct stages to establish and maintain infection. The model is given by 2n differential equations,

\[
\begin{aligned}
\frac{dS_j}{dt} &= F_j(S, T) = r_{j-1} f_{j-1} S_{j-1} - a_j S_j - f_j S_j - p_j S_j T_j \\
\frac{dT_j}{dt} &= G_j(S, T) = c_j S_j T_j - b T_j .
\end{aligned}
\tag{1}
\]

Here $S_j$ denotes the pathogen population at stage j, and $T_j$ is the cognate host response. The indices j = 0, . . . , n − 1 are taken modulo n. The parameters represent the following processes:

• $a_j$ is the decay rate of stage $S_j$. If $a_j$ is negative, this stage proliferates.
• $f_j$ is the rate at which stage $S_j$ is lost to become (or produce) stage $S_{j+1}$.
• $r_j$ is an amplification factor in the process by which stage $S_j$ becomes (or produces) stage $S_{j+1}$. For example, the loss of one lytically infected cell may produce $r_j \approx 10^4$ free virus.
• $p_j$ represents the efficacy of the immune response $T_j$ in killing infected stage $S_j$.
• $c_j$ is the antigenicity of stage $S_j$, i.e., its efficacy in inducing proliferation of immune response $T_j$.
• b is the natural death rate of the response $T_j$. We assume it is the same for all stages.

We refer to the parameters collectively as θ. Except for $a_j$, j = 0, . . . , n − 1, these are assumed non-negative. Our flagship result is that while (1) has $2^n$ fixed points for generic values of θ, exactly one of these is biologically meaningful and stable [6].

Let us focus for the moment on the terms $-p_j S_j T_j$ of $F_j$ and $c_j S_j T_j$ of $G_j$. The term $-p_j S_j T_j$ represents the killing of pathogen at stage j (usually infected cells in a particular differentiation state) by the cognate T-cell population. This takes place pursuant to an encounter between T-cells and infected cells displaying antigen complexed to MHC. To a first approximation the rate of such encounters is proportional to the product of the sizes of the two populations. Thus, to a first approximation, this term reflects the mechanism of the biological process it represents.
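As a concrete reading of system (1), the Python sketch below evaluates its right-hand side and computes the fixed point at which every stage is regulated, that is, $S_j = b/c_j$ with $T_j$ chosen so that $F_j = 0$. The function names and parameter values are hypothetical, and the sketch makes no claim about which of the fixed points is the biologically meaningful one for a given θ; it only checks that the formulas are mutually consistent.

```python
def rhs(S, T, r, f, a, p, c, b):
    """Right-hand side of system (1); stage indices are taken modulo n.

    Python's negative indexing makes S[j - 1] wrap to S[n - 1] when j = 0.
    """
    n = len(S)
    dS = [r[j - 1] * f[j - 1] * S[j - 1] - a[j] * S[j] - f[j] * S[j]
          - p[j] * S[j] * T[j] for j in range(n)]
    dT = [c[j] * S[j] * T[j] - b * T[j] for j in range(n)]
    return dS, dT


def fully_regulated_fixed_point(r, f, a, p, c, b):
    """Fixed point with every T_j nonzero: G_j = 0 forces S_j = b / c_j,
    and F_j = 0 is then solved for T_j. Requires c_j > 0 and p_j > 0."""
    n = len(c)
    S = [b / c[j] for j in range(n)]
    T = [(r[j - 1] * f[j - 1] * S[j - 1] - (a[j] + f[j]) * S[j])
         / (p[j] * S[j]) for j in range(n)]
    return S, T


# Hypothetical two-stage parameter set.
params = dict(r=[2.0, 3.0], f=[0.1, 0.2], a=[0.05, 0.05],
              p=[0.01, 0.01], c=[0.001, 0.002], b=0.1)
S_star, T_star = fully_regulated_fixed_point(**params)
dS, dT = rhs(S_star, T_star, **params)
```

For this parameter set both components of `T_star` come out positive, and `dS` and `dT` vanish up to rounding error, as expected at a fixed point.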

The term $c_j S_j T_j$ of $G_j$ represents proliferation of T-cells in response to the presence of antigen. Here, the underlying biological processes are considerably more complicated, involving a number of cell types. Initially, T-cells are activated and begin to proliferate only in response to antigen presenting cells, particularly dendritic cells (DCs) [14]. The density of presented antigen is known to affect these T-cell-DC interactions [8,21], eliciting differing CTL responses at different densities, including T-cell exhaustion at high concentrations of presented antigen [13]. Activated CD8+ T-cells also exhibit central and effector memory phenotypes, and the relationships between these phenotypes are not well understood [4]. Finally, the length of the cell cycle places a hard limit on the rate at which the T-cell population can proliferate. Thus, the rate of T-cell proliferation becomes saturated for large amounts of antigen [10]. In this, it bears a similarity to the rates of enzyme-catalyzed chemical reactions (reviewed in [2], see [3] for experimental evidence). To accommodate dose-dependent effects, we will study the system

\[
\begin{aligned}
\frac{dS_j}{dt} &= \hat{F}_j(S, T) = r_{j-1} f_{j-1} S_{j-1} - a_j S_j - f_j S_j - p_j S_j T_j \\
\frac{dT_j}{dt} &= \hat{G}_j(S, T) = \varphi_j(S_j)\, T_j - b T_j ,
\end{aligned}
\tag{2}
\]

which generalizes (1). The function $\hat{F}$ is unchanged from $F$. The terms of $\hat{G}_j$ represent proliferation of CTLs in response to the presence of antigen and the loss of CTLs due to death or decommissioning. We use functions $\varphi_j : [0, \infty) \to [0, \infty)$ to denote the dose-response curves. We assume that for each j, $\varphi_j(0) = 0$, and that for $x \in (0, \infty)$, $\varphi'_j(x)$ exists and is positive. In particular, each $\varphi_j$ is continuous on $[0, \infty)$ and strictly monotone increasing. The possibility of dose-response saturation arises from the case where there is an $m_j \in \mathbb{R}$ so that $\lim_{x \to \infty} \varphi_j(x) = m_j$. We show that with appropriate modification, the major results of [6] hold for (2). While the term $\varphi_j(S_j)$ is still phenomenological in that it omits discussion of biological mechanism, we argue in Section 4 that it may well offer a way to address this limitation. The system (2) can exhibit a behavior which does not arise with (1). In either case, if $a_j + f_j < 0$, we say that j is self-establishing. It is not hard to see that a self-establishing stage which is not regulated will expand without bound. On the other hand, as we will show, if $m_j \le b$, the host cannot mount a response to $S_j$. It is immuno-incompetent with respect to this stage. If j is self-establishing and the host is immuno-incompetent with respect to j, we say that the parameter set is fatal. We will suggest

in Section 4 that this mathematical behavior casts light on post-transplant lympho-proliferative disorder (PTLD). If the host is immunologically incompetent at all stages we say the host is totally immunologically incompetent. Clearly, in this case, if the basic reproductive number of the pathogen is greater than one, infection is also fatal to the host. Accordingly, we will assume that the host is immuno-competent for at least one stage.

2. Background and definitions

We start by transforming (2) through a change of coordinates. For this purpose we take $m_j := \lim_{x\to\infty} \varphi_j(x) \in \mathbb{R} \cup \{\infty\}$. Notice that if $m_j > b$, there is a unique value $b_j \in \mathbb{R}$ so that $\varphi_j(b_j) = b$. If $m_j \le b$, we take $b_j := \infty$. So we define

\[
c_j := \begin{cases} \varphi_j'(b_j) & \text{if } b_j < \infty \\ 1 & \text{otherwise} \end{cases}
\]

In the former case, $c_j$ is the marginal antigenicity of $S_j$ at the value $b_j$. We now use the linear change of coordinates

\[
H : \mathbb{R}^{2n} \to \mathbb{R}^{2n}, \qquad (S_j, T_j) \mapsto (\bar{S}_j, \bar{T}_j) := H_j(S_j, T_j) := (c_j S_j, \, p_j T_j).
\]

This gives the equations

\[
\begin{aligned}
\frac{d\bar{S}_j}{dt} &= \widehat{F}_j(\bar{S},\bar{T}) = \bar{r}_{j-1} f_{j-1} \bar{S}_{j-1} - a_j \bar{S}_j - f_j \bar{S}_j - \bar{S}_j \bar{T}_j \\
\frac{d\bar{T}_j}{dt} &= \widehat{G}_j(\bar{S},\bar{T}) = \bar{\varphi}_j(\bar{S}_j) \bar{T}_j - b \bar{T}_j \\
\bar{r}_j &= r_j \frac{c_{j+1}}{c_j}, \qquad \bar{\varphi}_j(\bar{S}_j) = \varphi_j\!\left(\frac{\bar{S}_j}{c_j}\right)
\end{aligned}
\tag{3}
\]

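Note that for each $j$, $\bar{\varphi}_j$ still enjoys the properties that it is differentiable, $\bar{\varphi}_j'(x) > 0$ for $x > 0$ and $\bar{\varphi}_j(0) = 0$. We now take $\bar{b}_j$ to be the unique solution to $\bar{\varphi}_j(\bar{b}_j) = b$, i.e., $\bar{b}_j := c_j b_j$, where such exists. Note that $\bar{\varphi}_j$ now enjoys the additional property that $\bar{\varphi}_j'(\bar{b}_j) = 1$. In the case studied in [6], $\varphi_j(S_j) = c_j S_j$ and thus $\bar{\varphi}_j(\bar{S}_j) = \bar{S}_j$, giving

\[
\begin{aligned}
\frac{d\bar{S}_j}{dt} &= \widehat{F}_j(\bar{S},\bar{T}) = \bar{r}_{j-1} f_{j-1} \bar{S}_{j-1} - a_j \bar{S}_j - f_j \bar{S}_j - \bar{S}_j \bar{T}_j \\
\frac{d\bar{T}_j}{dt} &= \widehat{G}_j(\bar{S},\bar{T}) = \bar{S}_j \bar{T}_j - b \bar{T}_j
\end{aligned}
\tag{4}
\]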
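To make the change of coordinates concrete, the following minimal Python sketch works it through for one stage. It assumes a Michaelis-Menten dose-response curve $\varphi(x) = m x/(k + x)$, which is our illustrative choice and not a form prescribed by the model; the names `mm`, `threshold_and_scale` and `rescaled` and the numerical values are hypothetical. The sketch computes $b_j$ and $c_j$, forms the rescaled curve, and checks numerically that the rescaled curve takes the value $b$ with unit slope at $c_j b_j$.

```python
import math

def mm(m, k):
    """Assumed saturating dose-response curve phi(x) = m*x/(k + x)."""
    return lambda x: m * x / (k + x)

def threshold_and_scale(m, k, b):
    """Return (b_j, c_j) for phi(x) = m*x/(k + x) and T-cell death rate b."""
    if m <= b:                      # phi never reaches b: immuno-incompetent stage
        return math.inf, 1.0
    b_j = b * k / (m - b)           # unique solution of m*x/(k + x) = b
    c_j = m * k / (k + b_j) ** 2    # phi'(b_j), the marginal antigenicity
    return b_j, c_j

def rescaled(phi, c_j):
    """Dose-response curve in the barred coordinates: x -> phi(x / c_j)."""
    return lambda x: phi(x / c_j)

# Hypothetical values for a single stage.
m, k, b = 3.0, 2.0, 1.0
phi = mm(m, k)
b_j, c_j = threshold_and_scale(m, k, b)
phi_bar = rescaled(phi, c_j)
b_bar = c_j * b_j

# Central finite difference for the slope of phi_bar at b_bar.
h = 1e-6
slope = (phi_bar(b_bar + h) - phi_bar(b_bar - h)) / (2 * h)
print(b_j, c_j, b_bar, phi_bar(b_bar), slope)
```

With $m \le b$ the same helper returns $b_j = \infty$ and $c_j = 1$, which is the immuno-incompetent case.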

We will henceforth drop the bars and assume our equations are given in the form (3). We take a parameter set $\theta$ to be a set of values for $b$, $r_j$, $f_j$, $a_j$, $m_j$ and $b_j$, $j = 0, \dots, n-1$. When we need to make an explicit comparison with the $\varphi_j$ of (2), we will refer to the latter as "biological $\varphi$", $\varphi_j^{\mathrm{bio}}$.

We will adopt the following notational conventions. Sets such as $[j,k]$ and $[j,k)$ are to be taken cyclically. That is to say, if $j < k$, then $[j,k] = \{j, \dots, k\}$, while if $j > k$, $[j,k] = \{j, \dots, n-1, 0, \dots, k\}$. We take $[j,j)$ to be the empty set so that any product taken over $[j,j)$ is equal to one. We abuse notation by taking $[0,n) = \{0, \dots, n-1\}$.

We now review and in some cases generalize the definitions of [6].

Definition 2.1. Given a fixed point $(S^*,T^*)$ of (3), the regulated and unregulated stages of $(S^*,T^*)$ are

\[
\mathrm{Reg}(S^*,T^*) = \{ j \mid T_j^* \neq 0 \}, \qquad \mathrm{Unreg}(S^*,T^*) = \{ j \mid T_j^* = 0 \}.
\]

Definition 2.2. $(S^*,T^*)$ is biologically meaningful if $S_j^* \ge 0$, $T_j^* \ge 0$ for $j = 0, \dots, n-1$. It is infected if for some (hence all; see observation 1) below) $j$, $S_j^* > 0$.

Definition 2.3. Given a parameter set $\theta$, the self-establishing stages are $\mathrm{SE}(\theta) = \{ j \mid a_j + f_j < 0 \}$.

Definition 2.4. The immuno-incompetent stages of $\theta$ are $\mathrm{Incomp}(\theta) = \{ j \mid m_j \le b \} = \{ j \mid b_j = \infty \}$.

Definition 2.5. If $j \in \mathrm{SE}(\theta) \cap \mathrm{Incomp}(\theta)$, we say that the stage $j$ and the parameter set $\theta$ are fatal.

We will assume that the host is capable of mounting a response to at least one stage, i.e., $\mathrm{Incomp}(\theta) \neq [0,n)$.

Definition 2.6. If $\mathrm{SE}(\theta) = \emptyset$, the follow-on constants of $\theta$ are

\[
M_j = \frac{r_j f_j}{a_{j+1} + f_{j+1}}, \qquad M_{jk} = \prod_{\ell \in [j,k)} M_\ell .
\]

In the case where $\mathrm{SE}(\theta) \neq \emptyset$, $M_j$ is only meaningful for our purposes for $j+1 \notin \mathrm{SE}(\theta)$. Accordingly, $M_{jk}$ is only meaningful if $(j,k] \cap \mathrm{SE}(\theta) = \emptyset$. Note that for every $k \in [j,\ell)$ we have $M_{j\ell} = M_{jk} M_{k\ell}$.
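The cyclic interval convention and the follow-on constants are easy to get wrong by one index, so a small Python sketch may help. The helper names (`cyclic_interval`, `follow_on`, `M_path`) and the three-stage parameter values are hypothetical; the sketch assumes $\mathrm{SE}(\theta) = \emptyset$, so that every $M_{jk}$ is meaningful.

```python
from math import prod

def cyclic_interval(j, k, n):
    """Half-open cyclic interval [j, k) in Z/n; [j, j) is the empty set."""
    return [(j + i) % n for i in range((k - j) % n)]

def follow_on(r, f, a):
    """M_j = r_j f_j / (a_{j+1} + f_{j+1}), with indices taken modulo n."""
    n = len(r)
    return [r[j] * f[j] / (a[(j + 1) % n] + f[(j + 1) % n]) for j in range(n)]

def M_path(M, j, k):
    """M_jk = product of M_l over l in [j, k); the empty product is 1."""
    return prod(M[l] for l in cyclic_interval(j, k, len(M)))

# Hypothetical three-stage parameter set with no self-establishing stage.
r = [4.0, 1.0, 4.0]
f = [1.0, 1.0, 1.0]
a = [1.0, 1.0, 1.0]

M = follow_on(r, f, a)
R0 = prod(M)          # product of all follow-on constants around the cycle
print(M, R0, M_path(M, 2, 1))
```

For these values the multiplicativity $M_{02} = M_{01} M_{12}$ holds, and going once around the cycle gives $M_{02} M_{20} = \prod_j M_j$.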

Definition 2.7. We say that $j$ starves $k$ and write $j \succ k$ if $b_j M_{jk} < b_k$. Here we assume that $M_{jk}$ is meaningful and that $b_j$ is finite, though $b_k$ need not be. In particular, if $M_{jk}$ is meaningful, $j \notin \mathrm{Incomp}(\theta)$, and $k \in \mathrm{Incomp}(\theta)$, then $j \succ k$.

Definition 2.8. The starvable stages of $\theta$ are $\mathrm{Str}(\theta) = \{ k \mid \text{there is } j \text{ so that } j \succ k \}$. The unstarvable stages $\mathrm{Unstr}(\theta)$ are the complement of these.

Definition 2.9. A biologically meaningful fixed point $(S^*,T^*)$ is saturated$^{a}$ if $\mathrm{Reg}(S^*,T^*) = \mathrm{Unstr}(\theta)$. It is moderated if for all $j \in \mathrm{Unreg}(S^*,T^*)$, $S_j^* < b_j$.

Definition 2.10. If $\mathrm{SE}(\theta) = \emptyset$, we define

\[
R_0 = \prod_{j=0}^{n-1} M_j .
\]

$R_0$ may be interpreted as the number of copies of the pathogen produced by a single copy entering a naive host [6]. It is not hard to see that $R_0$ is invariant under the transformation $H$, as befits a property of the organism being described.

Definition 2.11. When we say that $\theta$ is generic we will require the following:

• $R_0 \neq 1$.
• There is no $j$ so that $a_j + f_j = 0$.
• There is no pair $(j,k)$ so that $b_j < \infty$, $b_k < \infty$ and $b_j M_{jk} = b_k$.
• At the saturated biologically meaningful fixed point, there are $j$ and $k$ so that $T_j^* \neq T_k^*$.

It is not hard to see that each of these conditions fails only on a set of parameter values of measure zero, thus justifying the use of the term generic. The detailed motivation for these exclusions can be found in [6].

Definition 2.12. Suppose $\mathrm{Reg}(S^*,T^*) \neq \emptyset$. Given a stage $k$, we define $h_k$ to be the unique stage such that $h_k \in \mathrm{Reg}(S^*,T^*)$ and $(h_k, k) \subset \mathrm{Unreg}(S^*,T^*)$.

$^{a}$ There is an unfortunate collision here between the use of the term saturated to denote the host mounting a T-response to all stages capable of supporting one [6] and the meaning of the term used in the Introduction above, namely, a maximum proliferation rate, with no increase through further stimulation.
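The definitions of this section can be exercised on a small example. The sketch below, in Python, lists the pairs $j \succ k$ and the unstarvable stages for a hypothetical three-stage parameter set given directly by its follow-on constants `M` and thresholds `thr` (our name for the $b_j$). It assumes $\mathrm{SE}(\theta) = \emptyset$; the function names are ours.

```python
from math import prod, inf

def M_path(M, j, k):
    """M_jk = product of M_l over the cyclic half-open interval [j, k)."""
    n = len(M)
    return prod(M[(j + i) % n] for i in range((k - j) % n))

def starves(j, k, M, thr):
    """True when j starves k, i.e. b_j is finite and b_j * M_jk < b_k."""
    return thr[j] < inf and thr[j] * M_path(M, j, k) < thr[k]

def unstarvable(M, thr):
    """Stages that are not starved by any stage."""
    n = len(M)
    return [k for k in range(n)
            if not any(starves(j, k, M, thr) for j in range(n))]

# Hypothetical parameter set: follow-on constants and thresholds b_j.
M = [2.0, 0.5, 2.0]
thr = [1.0, 1.5, 3.0]

R0 = prod(M)
pairs = [(j, k) for j in range(3) for k in range(3) if starves(j, k, M, thr)]
print(R0, pairs, unstarvable(M, thr))
```

Replacing the last threshold by `inf` models an immuno-incompetent stage 2; it is then starved by every competent stage, in line with the remark in Definition 2.7.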

3. Results

We will start by assuming that $\mathrm{SE}(\theta) = \emptyset$. This will be a standing assumption until it is lifted in Section 3.1.

The linear stability analysis performed in [6] is possible because we were able to calculate the characteristic polynomial of the Jacobian matrix of the right hand side of (4), which corresponds to setting $\varphi_j(S_j) = S_j$ for each $j$ in (3) (recall that we are omitting the bars). Here we contemplate more general $\varphi_j : \mathbb{R} \to \mathbb{R}$, $j = 0, \dots, n-1$ (which is a consequence of contemplating more general $\varphi_j^{\mathrm{bio}} : \mathbb{R} \to \mathbb{R}$) with the properties mentioned above. Consequently, in order to make use of the results obtained in [6], we need to establish what changes are induced on the Jacobian matrix through the use of more general functions $\varphi_j$. The partial derivatives of the right hand side of (3) are given by

\[
\frac{\partial \widehat{F}_k}{\partial S_j} = \begin{cases} r_{k-1} f_{k-1} & \text{if } j = k-1 \\ -a_k - f_k - T_k & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\qquad
\frac{\partial \widehat{F}_k}{\partial T_j} = \begin{cases} -S_k & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\]

\[
\frac{\partial \widehat{G}_k}{\partial S_j} = \begin{cases} \varphi_k'(S_k) T_k & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\qquad
\frac{\partial \widehat{G}_k}{\partial T_j} = \begin{cases} \varphi_k(S_k) - b & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\]

Since the functions $\widehat{F}_j$ do not depend on any $\varphi_j$, only the partial derivatives $\partial \widehat{G}_k / \partial S_j$ and $\partial \widehat{G}_k / \partial T_j$ differ from the results obtained in [6]. As we shall see, most differences vanish when the functions are evaluated at a fixed point.

Proposition 3.1.
(1) $R_0$ is the basic reproductive number of the pathogen.
(2) If $R_0 < 1$, the pathogen fails to establish infection and $(S^*,T^*) = 0$ is a local attractor. If $R_0 > 1$, the pathogen is able to establish infection. In particular, this makes $(S^*,T^*) = 0$ an unstable fixed point.
(3) If $R_0 < 1$, $(S^*,T^*) = 0$ is a global attractor.

Proof. There are two ways to establish the first and second claims. One is
by using the interpretation of $R_0$ in terms of the lifespan and productivity of each stage. The other is by computing the eigenvalues of the Jacobian. We briefly sketch the first approach. In the absence of immune response, stage 0 has an expected lifespan of $\frac{1}{a_0 + f_0}$. During that time, it produces $\frac{r_0 f_0}{a_0 + f_0}$ copies of stage 1. These, in turn, produce $\frac{r_0 f_0}{a_0 + f_0} \cdot \frac{r_1 f_1}{a_1 + f_1}$ copies of stage 2. Continuing in this way produces $R_0$ copies of stage 0. If this is greater than 1, the pathogen can establish infection; if less than 1, it cannot.

To see these two claims using the eigenvalues of the Jacobian, notice that since $\varphi_j(0) = 0$, $j = 0, \dots, n-1$, the Jacobian matrix evaluated at $(S^*,T^*) = (\vec{0},\vec{0})$ is identical to the one obtained in [6]. Thus the claim follows from Propositions 1 and 2 of [6]. The third claim comes from showing that the reproductive number in the presence of immune response is no more than the reproductive number in the naive host, as in [6].

The following correspond to the numbered observations in Section 3 of [6] and follow from the fixed point equations

\[
\begin{aligned}
\dot{S}_j &= \widehat{F}_j(S^*,T^*) = r_{j-1} f_{j-1} S_{j-1}^* - S_j^* (a_j + f_j + T_j^*) = 0 \\
\dot{T}_j &= \widehat{G}_j(S^*,T^*) = (\varphi_j(S_j^*) - b)\, T_j^* = 0
\end{aligned}
\]

1) Given $(S^*,T^*)$, if there is $j$ such that $S_j^* = 0$ then $(S^*,T^*) = 0$.
2) If $j \in \mathrm{Reg}(S^*,T^*)$, then $S_j^* = b_j$.
3) If $j \in \mathrm{Unreg}(S^*,T^*)$, then $T_j^* = 0$.
4) If $j+1 \in \mathrm{Unreg}(S^*,T^*)$, then $S_{j+1}^* = S_j^* M_j$.
5) If $[j+1,k] \subset \mathrm{Unreg}(S^*,T^*)$ then $S_k^* = S_j^* M_{jk}$. This follows by induction on the previous observation.
6) Assume $\mathrm{Reg}(S^*,T^*) \neq \emptyset$. If $k \in \mathrm{Unreg}(S^*,T^*)$, then $S_k^* = b_{h_k} M_{h_k k}$. This follows from 2) and 5).
7) If $\theta$ is generic and $(S^*,T^*) \neq 0$ then $\mathrm{Reg}(S^*,T^*) \neq \emptyset$. Were $\mathrm{Reg}(S^*,T^*) = \emptyset$, then by 5) $S_0^* = S_0^* R_0$. Consequently $R_0 = 1$, contradicting our first genericity requirement.
8) If $j \in \mathrm{Reg}(S^*,T^*)$ then $T_j^* = \frac{r_{j-1} f_{j-1}}{b_j} S_{j-1}^* - (a_j + f_j)$.
9) If $j \in \mathrm{Reg}(S^*,T^*)$ then $T_j^* = r_{j-1} f_{j-1} \frac{b_{h_j}}{b_j} M_{h_j\, j-1} - (a_j + f_j)$. This follows from the fact that $S_{j-1}^* = S_{h_j}^* M_{h_j\, j-1}$ (which follows from 6) if $j-1 \in \mathrm{Unreg}(S^*,T^*)$, and holds trivially if $j-1 \in \mathrm{Reg}(S^*,T^*)$) and $S_{h_j}^* = b_{h_j}$.
10) If $j \in \mathrm{Reg}(S^*,T^*)$, then $T_j^* > 0$ if and only if $b_{h_j} M_{h_j j} > b_j$. This follows from the previous observation (recall that we have assumed $\mathrm{SE}(\theta) = \emptyset$).
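The observations above amount to a recipe for writing down a fixed point once its regulated set is chosen. The following Python sketch applies that recipe with $\mathrm{Reg} = \mathrm{Unstr}(\theta)$ and then checks the fixed point equations numerically. It is an illustration under stated assumptions only: $\mathrm{SE}(\theta) = \emptyset$, $\mathrm{Unstr}(\theta) \neq \emptyset$, hypothetical parameter values, and an assumed saturating curve $\varphi_j(x) = m x/(k_j + x)$ with $k_j$ chosen so that $\varphi_j(b_j) = b$ (the normalization $\varphi_j'(b_j) = 1$ is not enforced, since it does not affect the location of the fixed point). All function names are ours.

```python
from math import prod, inf

def saturated_fixed_point(r, f, a, thr):
    """Build (S*, T*) with Reg = Unstr(theta); thr[j] plays the role of b_j."""
    n = len(r)
    M = [r[j] * f[j] / (a[(j + 1) % n] + f[(j + 1) % n]) for j in range(n)]

    def M_path(j, k):
        return prod(M[(j + i) % n] for i in range((k - j) % n))

    starved = {k for j in range(n) for k in range(n)
               if thr[j] < inf and thr[j] * M_path(j, k) < thr[k]}
    reg = [j for j in range(n) if j not in starved]

    S, T = [0.0] * n, [0.0] * n
    for j in reg:
        S[j] = thr[j]                                   # observation 2)
    for i in range(1, n):                               # walk once around the cycle
        k = (reg[0] + i) % n
        if k not in reg:
            S[k] = S[(k - 1) % n] * M[(k - 1) % n]      # observation 4)
    for j in reg:
        inflow = r[(j - 1) % n] * f[(j - 1) % n] * S[(j - 1) % n]
        T[j] = inflow / thr[j] - (a[j] + f[j])          # observation 8)
    return S, T, reg

def make_phi(m, b, thr_j):
    """Assumed curve m*x/(k + x), with k chosen so that phi(thr_j) = b."""
    k = thr_j * (m - b) / b
    return lambda x: m * x / (k + x)

def residuals(S, T, r, f, a, b, phi):
    """Right hand sides of the fixed point equations at (S, T)."""
    n = len(S)
    dS = [r[(j - 1) % n] * f[(j - 1) % n] * S[(j - 1) % n]
          - (a[j] + f[j] + T[j]) * S[j] for j in range(n)]
    dT = [(phi[j](S[j]) - b) * T[j] for j in range(n)]
    return dS, dT

# Hypothetical three-stage parameter set.
r, f, a = [4.0, 1.0, 4.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
b, m = 1.0, 2.0
thr = [1.0, 1.5, 3.0]
phi = [make_phi(m, b, t) for t in thr]

S, T, reg = saturated_fixed_point(r, f, a, thr)
dS, dT = residuals(S, T, r, f, a, b, phi)
print(reg, S, T, dS, dT)
```

In this example stages 0 and 1 are regulated, stage 2 is unregulated with $S_2^*$ below its threshold, and both residual vectors vanish up to rounding.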

Proposition 3.2. Suppose $\theta$ is generic and $(S^*,T^*)$ is a biologically meaningful fixed point. Suppose further that $j \succ k$. If $j \in \mathrm{Reg}(S^*,T^*)$, then $k \in \mathrm{Unreg}(S^*,T^*)$.

Proof. Suppose $j \succ k$ and $j \in \mathrm{Reg}(S^*,T^*)$. Let $\{j = j_0, j_1, \dots, j_m\} = [j,k) \cap \mathrm{Reg}(S^*,T^*)$ (cyclically ordered as listed) and let $j_{m+1} = k$. If $m = 0$, then $h_k = j$ and, due to $j \succ k$, it holds that $b_{h_k} M_{h_k k} < b_k$. Thus, the claim follows by observation 10) above. Otherwise we have $j \neq j_m = h_k$ and it suffices to show that $S_{j_m}^* M_{j_m k} < b_k$. For $\ell = 0, \dots, m$, we have $j_\ell \in \mathrm{Reg}(S^*,T^*)$ so we must have $S_{j_\ell}^* = b_{j_\ell}$. For $\ell = 0, \dots, m-1$ we must also have $S_{j_{\ell+1}}^* < S_{j_\ell}^* M_{j_\ell j_{\ell+1}}$, because $h_{j_{\ell+1}} = j_\ell$. We then have

\[
\begin{aligned}
b_{j_1} &= S_{j_1}^* < S_{j_0}^* M_{j_0 j_1} \\
b_{j_2} &= S_{j_2}^* < S_{j_1}^* M_{j_1 j_2} < S_{j_0}^* M_{j_0 j_2} \\
&\;\;\vdots \\
b_{j_m} &= S_{j_m}^* < S_{j_0}^* M_{j_0 j_m} = b_{j_0} M_{j_0 j_m}
\end{aligned}
\]

so that $S_{h_k}^* M_{h_k k} = S_{j_m}^* M_{j_m k} < b_{j_0} M_{j_0 k} < b_k$. Thus $k \in \mathrm{Unreg}(S^*,T^*)$ as required.

Proposition 3.3. Let $\theta$ be a generic parameter set such that $R_0 > 1$. Then $\succ$ is a strict partial order.

Proof. We must show that $\succ$ is anti-reflexive, asymmetric and transitive. The first follows immediately from the fact that $M_{jj} = 1$. To see that $\succ$ is asymmetric, suppose we have $j \succ k$ and $k \succ j$. We then have $b_j M_{jk} < b_k$ and $b_k M_{kj} < b_j$. This gives $b_j > b_j M_{jk} M_{kj}$. But $M_{jk} M_{kj} = R_0$, contradicting $R_0 > 1$. To see the third we suppose that $j \succ k$ and $k \succ \ell$. We consider two cases, $k \in [j,\ell]$ and $\ell \in [j,k]$. In the first case, we have $M_{jk} M_{k\ell} = M_{j\ell}$. We then have $b_j M_{jk} < b_k$, $b_k M_{k\ell} < b_\ell$, giving $b_j M_{j\ell} = b_j M_{jk} M_{k\ell} < b_k M_{k\ell} < b_\ell$ as required. In the second case, we have $M_{jk} M_{k\ell} = R_0 M_{j\ell}$, so that $b_j R_0 M_{j\ell} = b_j M_{jk} M_{k\ell} < b_k M_{k\ell} < b_\ell$. Since $R_0 > 1$, this implies $j \succ \ell$ as required.

Remark 3.1. Since $\succ$ is a partial order, it is cycle-free, that is, there is no sequence of stages $j_0 \succ j_1 \succ \dots \succ j_0$. Consequently, we can define the
depth of a stage $k$, $d(k)$, to be the length of the longest chain $j_0 \succ \dots \succ j_{d(k)} = k$. It follows that $\mathrm{Unstr}(\theta)$ consists of the stages of depth 0. In particular, $\mathrm{Unstr}(\theta) \neq \emptyset$. Note that if $\theta$ is such that no two stages are comparable, then $\succ$ is empty and every stage is $\succ$-maximal, so $\mathrm{Unstr}(\theta) = [0,n)$. $\mathrm{Str}(\theta)$ consists of the stages of positive depth. If $\mathrm{Incomp}(\theta) \neq \emptyset$, $\mathrm{Incomp}(\theta)$ consists of the stages of maximal depth.

Proposition 3.4. Suppose that $\theta$ is generic. Furthermore, let $(S^*,T^*)$ be a biologically meaningful infected fixed point. Then the pathogen populations are moderated at $(S^*,T^*)$ if and only if the immune response is saturated at $(S^*,T^*)$.

Proof. We first show that if $(S^*,T^*)$ is moderated, then $(S^*,T^*)$ is saturated. We claim $\mathrm{Unstr}(\theta) \subseteq \mathrm{Reg}(S^*,T^*)$. Suppose to the contrary $j \in \mathrm{Unstr}(\theta) \cap \mathrm{Unreg}(S^*,T^*)$. By assumption $(S^*,T^*)$ is moderated, so $S_j^* < b_j$. Since $j \in \mathrm{Unreg}(S^*,T^*)$, by 7), 6) and 2) above, $S_j^* = S_{h_j}^* M_{h_j j} = b_{h_j} M_{h_j j}$. This gives $b_{h_j} M_{h_j j} < b_j$, i.e., $h_j \succ j$, contradicting the assumption that $j \in \mathrm{Unstr}(\theta)$. This proves the claim.

We claim that $\mathrm{Str}(\theta) \subseteq \mathrm{Unreg}(S^*,T^*)$. If $\mathrm{Str}(\theta) = \emptyset$, this holds trivially. Suppose $k \in \mathrm{Str}(\theta)$. Then there is a maximal $j$ so that $j \succ k$. Being maximal, $j \in \mathrm{Unstr}(\theta)$ and thus $j \in \mathrm{Reg}(S^*,T^*)$. It follows by Proposition 3.2 that $k \in \mathrm{Unreg}(S^*,T^*)$ as required.

We now show that if $(S^*,T^*)$ is saturated, then $(S^*,T^*)$ is moderated. If $\mathrm{Unreg}(S^*,T^*) = \emptyset$, the claim holds vacuously. Suppose that $k \in \mathrm{Unreg}(S^*,T^*)$. We must show that $S_k^* < b_k$. Again, we choose $j$ to be maximal so that $j \succ k$. Since $(S^*,T^*)$ is saturated, $j \in \mathrm{Reg}(S^*,T^*)$; thus, by 2), $S_j^* = b_j$. If $[j+1,k) \subseteq \mathrm{Unreg}(S^*,T^*)$, we are done, for then $S_k^* = S_j^* M_{jk} = b_j M_{jk} < b_k$. On the other hand, if $[j+1,k) \cap \mathrm{Reg}(S^*,T^*) \neq \emptyset$, choose $m \in [j+1,k) \cap \mathrm{Reg}(S^*,T^*)$ so that $m = h_k$. Since $m \in \mathrm{Reg}(S^*,T^*)$, by the assumed saturation $m \in \mathrm{Unstr}(\theta)$. Therefore $j \not\succ m$, in other words, $b_j M_{jm} \ge b_m$. If $b_m M_{mk} \ge b_k$, these two inequalities would yield $b_j M_{jk} = b_j M_{jm} M_{mk} \ge b_k$, contradicting $j \succ k$. Consequently $b_m M_{mk} < b_k$ must hold. Now we have $S_k^* = S_m^* M_{mk} = b_m M_{mk} < b_k$ as required.

Theorem 3.1. Suppose $\theta$ is generic and $(S^*,T^*)$ is a biologically meaningful infected fixed point which is not saturated. Then there is $j \in \mathrm{Unreg}(S^*,T^*)$ so that for any open neighborhood $U$ of $(S^*,T^*)$ there is a biologically meaningful point $x \in U$ so that $\left. \frac{dT_j}{dt} \right|_x > 0$. In particular $(S^*,T^*)$ is unstable.

Proof. Since $(S^*,T^*)$ is not saturated, it is not moderated. Thus, there is $j \in \mathrm{Unreg}(S^*,T^*)$ with $S_j^* \ge b_j$. Since $\theta$ is generic, $S_j^* > b_j$, for otherwise we would have $b_{h_j} M_{h_j j} = b_j$. In particular, $b_j < \infty$. It follows that $j \notin \mathrm{Incomp}(\theta)$. Since $S_j^* > b_j$, $\varphi_j(S_j^*) > b$, so $\left. \frac{\partial \widehat{G}_j}{\partial T_j} \right|_{(S^*,T^*)} = \varphi_j(S_j^*) - b > 0$. Let $e_{T_j}$ be the unit vector in the $T_j$ direction. Then, for any $\delta > 0$, $\left. \dot{T}_j \right|_{(S^*,T^*) + \delta e_{T_j}} > 0$. Thus, in any open neighborhood $U$ of $(S^*,T^*)$, there are biologically meaningful points whose orbits move away from $(S^*,T^*)$. In particular, $(S^*,T^*)$ is unstable as required.
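The argument can be checked by hand on a small example. The Python sketch below uses a hypothetical three-stage parameter set ($r = (4,1,4)$, $f = a = (1,1,1)$, $b = 1$, thresholds $b_j = (1, 1.5, 3)$) and an assumed saturating curve $\varphi_j(x) = m x/(k_j + x)$ with $\varphi_j(b_j) = b$. For these values the fixed point with only stage 1 regulated, obtained from observations 2), 4) and 8), is biologically meaningful but leaves stage 0 unregulated with $S_0^* > b_0$. The sketch confirms that it is a fixed point and that $\dot{T}_0 > 0$ after an arbitrarily small perturbation in the $T_0$ direction.

```python
def make_phi(m, b, thr_j):
    """Assumed curve m*x/(k + x), with k chosen so that phi(thr_j) = b."""
    k = thr_j * (m - b) / b
    return lambda x: m * x / (k + x)

def S_dot(S, T, r, f, a):
    """dS_j/dt for system (3); index j - 1 wraps around via negative indexing."""
    return [r[j - 1] * f[j - 1] * S[j - 1] - (a[j] + f[j] + T[j]) * S[j]
            for j in range(len(S))]

def T_dot(S, T, b, phi):
    """dT_j/dt for system (3)."""
    return [(phi[j](S[j]) - b) * T[j] for j in range(len(S))]

# Hypothetical parameter set.
r, f, a = [4.0, 1.0, 4.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
b, m = 1.0, 2.0
thr = [1.0, 1.5, 3.0]
phi = [make_phi(m, b, t) for t in thr]

# Fixed point with Reg = {1}: S_1 = b_1, S_2 = S_1 M_1, S_0 = S_2 M_2.
S_star = [1.5, 1.5, 0.75]
T_star = [0.0, 2.0, 0.0]

rates = []
for delta in (1e-1, 1e-3, 1e-6):
    T_pert = list(T_star)
    T_pert[0] += delta                      # perturb along e_{T_0}
    rates.append(T_dot(S_star, T_pert, b, phi)[0])

print(S_dot(S_star, T_star, r, f, a), T_dot(S_star, T_star, b, phi), rates)
```

Each entry of `rates` equals $(\varphi_0(S_0^*) - b)\,\delta$, which is positive for every $\delta > 0$ because $S_0^* = 1.5$ exceeds $b_0 = 1$.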

Theorem 3.2. Suppose that $\theta$ is generic and that $(S^*,T^*)$ is a biologically meaningful infected fixed point. In particular, not all $T_j^*$ are equal. If $(S^*,T^*)$ is moderated then $(S^*,T^*)$ is a locally asymptotically stable equilibrium. In particular, the eigenvalues of the Jacobian matrix $J(S^*,T^*)$ have strictly negative real part.

Corollary 3.1. For a generic parameter set, the system (3) (and hence (2)) has a unique biologically meaningful stable fixed point.

Proof. Since the sets of starvable and unstarvable stages depend only on $\theta$, there is exactly one saturated fixed point, hence exactly one moderated fixed point. The Corollary now follows from the Theorem.

Proof. [Theorem 3.2] The proof of the corresponding Theorem in [6] proceeds by showing that the Jacobian matrix of the system (4) has eigenvalues all of whose real parts are negative. It will therefore suffice to show that we can carry out the same computation on the Jacobian matrix of (3) evaluated at a moderated fixed point $(S^*,T^*)$. Since $F$ and $\widehat{F}$ are identical, we need only consider the partials of $G$ and $\widehat{G}$. We have

\[
\frac{\partial G_k}{\partial S_j} = \begin{cases} T_k & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\qquad
\frac{\partial \widehat{G}_k}{\partial S_j} = \begin{cases} \varphi_k'(S_k) T_k & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\]

Now for $k \in \mathrm{Unreg}(S^*,T^*)$, both of these partial derivatives vanish, while if $k \in \mathrm{Reg}(S^*,T^*)$, we have $S_k^* = b_k$ so that $\varphi_k'(S_k^*) = 1$, and once again, the two are identical.

Moreover, we have

\[
\frac{\partial G_k}{\partial T_j} = \begin{cases} S_k - b & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\qquad
\frac{\partial \widehat{G}_k}{\partial T_j} = \begin{cases} \varphi_k(S_k) - b & \text{if } j = k \\ 0 & \text{otherwise} \end{cases}
\]

Now if $k \in \mathrm{Reg}(S^*,T^*)$, we have $S_k^* = b$ so that $S_k^* - b = 0$ in the former case, while in the latter case we have $S_k^* = b_k$ so that $\varphi_k(S_k^*) - b = 0$. Finally, in the case where $k \in \mathrm{Unreg}(S^*,T^*)$, the proof in [6] appeals to the fact that $(S^*,T^*)$ is moderated, thus ensuring that $S_k^* - b < 0$. Here, the fact that $(S^*,T^*)$ is moderated implies that $S_k^* < b_k$ so that $\varphi_k(S_k^*) - b < 0$ and we can proceed as before.

3.1. Self-establishing stages

We now turn to the case where $\mathrm{SE}(\theta) \neq \emptyset$. In this case we need the assumption that $\theta$ is not fatal, that is, $\mathrm{SE}(\theta) \cap \mathrm{Incomp}(\theta) = \emptyset$ and $\mathrm{Incomp}(\theta) \neq [0,n)$. We start by observing that if $\mathrm{SE}(\theta) \neq \emptyset$, then the pathogen is viable. Accordingly, in place of Proposition 3.1, we have the following.

Proposition 3.5. If $\mathrm{SE}(\theta) \neq \emptyset$, then $(S^*,T^*) = (0,0)$ is an unstable equilibrium. In particular, the pathogen is able to infect the host.

Proof. Suppose that $j \in \mathrm{SE}(\theta)$. Then $\left. \frac{\partial \widehat{F}_j}{\partial S_j} \right|_{(0,0)} = -a_j - f_j > 0$. This gives orbits with positive and increasing $S_j$ inside any open set around $(0,0)$.

The numbered observations 1) through 9) listed above hold without change. Observation 10) now requires the additional hypothesis that $j \notin \mathrm{SE}(\theta)$, giving

10′) If $j \in \mathrm{Reg}(S^*,T^*)$ and $j \notin \mathrm{SE}(\theta)$, then $T_j^* > 0$ if and only if $b_{h_j} M_{h_j j} > b_j$.

As before, this follows from observation 9).

Proposition 3.6. Suppose that $\mathrm{SE}(\theta) \neq \emptyset$ and $(S^*,T^*)$ is a biologically meaningful infected fixed point. Then $\mathrm{SE}(\theta) \subseteq \mathrm{Reg}(S^*,T^*)$.

Proof. This follows from noting that $j \in \mathrm{SE}(\theta)$, $S_j^* > 0$ and $T_j^* = 0$ together imply $\dot{S}_j > 0$.

Proposition 3.7.$^{b}$ Suppose $\mathrm{SE}(\theta) \neq \emptyset$. Then $\succ$ is a strict partial order.

Proof. We must show that $\succ$ is anti-reflexive, asymmetric and transitive. The first follows immediately from the fact that $M_{jj} = 1$. To see that $\succ$ is asymmetric, suppose we have $j \succ k$ and $k \succ j$. This implies $(j,k] \cap \mathrm{SE}(\theta) = \emptyset$ and $(k,j] \cap \mathrm{SE}(\theta) = \emptyset$, contradicting $\mathrm{SE}(\theta) \neq \emptyset$. To see the third we suppose that $j \succ k$ and $k \succ \ell$. We consider two cases, $k \in [j,\ell]$ and $\ell \in [j,k]$. In the first case, we have $M_{jk} M_{k\ell} = M_{j\ell}$. We then have $b_j M_{jk} < b_k$, $b_k M_{k\ell} < b_\ell$, giving $b_j M_{j\ell} = b_j M_{jk} M_{k\ell} < b_k M_{k\ell} < b_\ell$ as required. The second case would imply $(j,k] \cup (k,\ell] = [0,n)$ as well as $(j,k] \cap \mathrm{SE}(\theta) = \emptyset$ and $(k,\ell] \cap \mathrm{SE}(\theta) = \emptyset$, contradicting $\mathrm{SE}(\theta) \neq \emptyset$.

$^{b}$ We take the opportunity to amend Proposition 7 of Section 8 in [6]. The condition $R_0 > 1$ in the statement of that proposition is not required, given that we take $\mathrm{SE}(\theta) \neq \emptyset$ as a standing assumption for the entire Section 8.

We define $\mathrm{Str}(\theta)$, $\mathrm{Unstr}(\theta)$, saturated and moderated as before. Proposition 3.2 holds in the case $\mathrm{SE}(\theta) \neq \emptyset$. However, there is a small change in the proof. In the case where $\mathrm{SE}(\theta) = \emptyset$, we appeal to observation 10). In the case where $\mathrm{SE}(\theta) \neq \emptyset$, we need to note that $j \succ k$ implies that $k \notin \mathrm{SE}(\theta)$ and we are thus able to appeal to observation 10′). The proof then proceeds as before. The equivalence of moderation and saturation (Proposition 3.4) and their necessity for stability (Theorem 3.1) can be proved as before, the result of Proposition 3.6 playing an important role. In order to prove sufficiency in the presence of self-establishing stages (Theorem 3.2 and its consequences), we rely on Lemma 1 of Section 8 in [6] and the argument provided there after the proof of the Lemma.

4. Discussion

In this paper, we have generalized the T-cell activation and proliferation model of [6] in order to make that model applicable to more realistic dose-response curves relating antigen dose to T-cell proliferation. We argue below that this is likely to confer theoretical advantages in trying to understand a more mechanistically accurate model of T-cell activation. We claim it also sheds light on PTLD.

A common result of post-transplant immuno-suppression is that levels of EBV-infected B-cell populations rise by a factor of approximately 50 (Thorley-Lawson lab, unpublished) and then settle into a new equilibrium.

This is the behavior predicted by the model of [6]: the effect of immuno-suppression is to reduce the values of the net antigenicity parameters $c_j$. In fact, if all of these are reduced by the same factor, it is not hard to see that the new equilibrium will have the same regulation profile as before immuno-suppression. However, in more unfortunate cases, the infected blast population launches into uncontrolled growth [5, 11]. It is possible that this behavior could arise under the model of [6] despite the fact that the system still has a unique stable equilibrium. We believe (but have not been able to prove) that the basin of attraction for this fixed point is essentially the entire phase space. How could this uncontrolled expansion arise? The answer is that while the mathematical trajectory of the system might still lead to this equilibrium, it passes through non-vital regions of phase space, i.e., the patient dies first.

However, the generalization we have analyzed here allows for a different understanding of this uncontrolled expansion. In this model, we can see immuno-suppression as lowering the dose-response curves $\varphi_j$. If none of these are suppressed to the point where the host becomes immuno-incompetent against any stage, the model predicts that the levels of infected populations would rise and reach a new equilibrium. Even if immuno-suppression drives a dose-response curve $\varphi_j$ down to the level of immuno-incompetence, the system will still have a unique stable equilibrium provided stage $j$ is not self-establishing. Now the criterion for a stage to be self-establishing is that $a_j + f_j < 0$, that is, that it proliferates faster than it differentiates. Thus if a cell undergoes a mutation or an epigenetic modification which either increases the proliferation rate (thus lowering $a_j$) or retards its ability to differentiate (thus lowering $f_j$), the cell could tip over into becoming self-establishing. This suggests that PTLD results from the conjunction of two factors, to wit, immuno-suppression to the point of immuno-incompetence, together with a mutation (or other transformation) that causes the cell to become self-establishing.

We also argue that the generalization presented here may allow us to apply these results to more mechanistic models of T-cell activation. While in most regimes we would expect T-cell proliferation to rise in response to increased antigen, this rate is not driven by the encounter of T-cells with infected target cells, but rather by the presentation of antigen to T-cells by dendritic cells [14]. Thus, a model of the mechanisms underlying T-cell proliferation must include additional cell populations and quite complicated cellular processes carried out by those cells [20]. Further, there is a widespread phenomenon that cannot be explained as
the fixed point of a system like (3), namely the existence of long-lived T-cell responses to multiple epitopes, either of a single pathogen or in our case to a single pathogen stage. Multiple T-cell responses are often modeled as competing for antigen in a predator-prey dynamic. The antigenicity of the T-cell's epitope functions as the T-cell's fitness and this leads to a winner-take-all dynamic where the response to the most antigenic epitope survives and the others become extinct [15, 19]. It seems likely that memory T-cells play an important role in the survival of multiple responses. However, the interactions between effector and memory populations are still not well understood [1, 4]. These populations exhibit differing longevity. Stated in terms of our model, they do not share a common value for $b$. In addition, high levels of antigen can lead to CTL exhaustion [13], a phenomenon that argues against a monotone increasing $\varphi$.

The upshot of this is that if we wish to refine the cyclic pathogen model to present an increasingly detailed picture of the cell populations and their mechanisms, we will need to include multiple immune cell populations at each stage. The dynamics of such a system could be quite complicated. However, there is a variable that summarizes the collective immune pressure against a given stage, namely the net kill rate of the effector populations. Thus, if $T_j^1, \dots, T_j^k$ are the effector populations against stage $S_j$, we can write $\tau_j = \tau_j(T_j^1, \dots, T_j^k)$ so that we now have

\[
\frac{dS_j}{dt} = r_{j-1} f_{j-1} S_{j-1} - (a_j + f_j + \tau_j) S_j .
\]

We cannot expect that the dynamics of such an expanded system can be mapped to the dynamics of (3) because we cannot necessarily expect $T_j^1, \dots, T_j^k$ (and any non-effector populations) to vary in a way which makes $\frac{d\tau_j}{dt}$ a function of $S_j$ and $\tau_j$. However, once the dynamics of these populations are understood, in the neighborhood of a fixed point, understanding the marginal response of $\tau_j$ to $S_j$ may allow us to use (3) to summarize these dynamics in a way which will allow us to establish the existence of a stable fixed point.
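The fatal scenario discussed in this section can be illustrated numerically. The Python sketch below integrates system (2) for two stages with a classical fourth-order Runge-Kutta step. Everything specific in it is an assumption made for illustration: the Michaelis-Menten curves $\varphi_j(x) = m_j x/(k_j + x)$, the parameter values, the initial conditions, the time horizon and the function names. Stage 1 is self-establishing ($a_1 + f_1 = -0.5$). With $m_1 = 0.5 \le b$ the host is immuno-incompetent against it, so the parameter set is fatal; with $m_1 = 3$ the host can respond.

```python
def make_phi(m, k):
    """Assumed saturating dose-response curve phi(x) = m*x/(k + x)."""
    return lambda x: m * x / (k + x)

def rhs(state, r, f, a, p, b, phi):
    """Right hand side of system (2); state = [S_0..S_{n-1}, T_0..T_{n-1}]."""
    n = len(state) // 2
    S, T = state[:n], state[n:]
    dS = [r[j - 1] * f[j - 1] * S[j - 1] - (a[j] + f[j] + p[j] * T[j]) * S[j]
          for j in range(n)]                      # index j - 1 wraps cyclically
    dT = [(phi[j](S[j]) - b) * T[j] for j in range(n)]
    return dS + dT

def rk4_step(state, dt, *params):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(state, *params)
    k2 = rhs([x + 0.5 * dt * d for x, d in zip(state, k1)], *params)
    k3 = rhs([x + 0.5 * dt * d for x, d in zip(state, k2)], *params)
    k4 = rhs([x + dt * d for x, d in zip(state, k3)], *params)
    return [x + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
            for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)]

def run(m1, t_end=10.0, dt=0.001):
    """Integrate from a small inoculum; m1 is the saturation level of phi_1."""
    r, f, a = [1.0, 1.0], [1.0, 0.5], [1.0, -1.0]   # a_1 + f_1 < 0
    p, b = [1.0, 1.0], 1.0
    phi = [make_phi(3.0, 1.0), make_phi(m1, 1.0)]
    state = [0.1, 0.1, 0.01, 0.01]                  # S_0, S_1, T_0, T_1
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, r, f, a, p, b, phi)
    return state

fatal = run(0.5)       # m_1 <= b: no response to the self-establishing stage
competent = run(3.0)   # m_1 > b: a response to stage 1 is possible
print("fatal     S_1, T_1:", fatal[1], fatal[3])
print("competent S_1, T_1:", competent[1], competent[3])
```

In the fatal run $T_1$ can only decay, since $\varphi_1 < b$ everywhere, so $S_1$ grows at least like $e^{0.49 t}$ from its initial value. In the competent run one would expect $T_1$ eventually to expand and check the growth of $S_1$, although this short sketch makes no claim about how quickly the trajectory approaches the equilibrium.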

Acknowledgements We wish to thank Dr. Jared Hawkins and Prof. David Thorley-Lawson for many conversations about the underlying biology which inspired this work. Dr. Hawkins was especially helpful in bringing many apropos references to our attention.

References

1. R. Antia, V. V. Ganusov, and R. Ahmed. The role of models in understanding CD8+ T-cell memory. Nat Rev Immunol, 5:1474–1733, 2005.
2. W. W. Chen, M. Niepel, and P. K. Sorger. Classic and contemporary approaches to modeling biochemical reactions. Genes & Development, 24(17):1861–1875, 2010.
3. W. W. Cleland. What limits the rate of an enzyme-catalyzed reaction. Accounts of Chemical Research, 8(5):145–151, 1975.
4. W. Cui and S. M. Kaech. Generation of effector CD8+ T cells and their conversion to memory T cells. Immunological Reviews, 236(1):151–166, 2010.
5. V. Dharnidharka and C. Araya. Post-transplant lymphoproliferative disease. Pediatric Nephrology, 24:4, 2009.
6. E. Delgado-Eckert and M. Shapiro. A model of host response to a multi-stage pathogen. J. Math. Biol., 63(2):201–227, 2011.
7. B. M. Greenwood, K. Bojang, C. J. Whitty, and G. A. Targett. Malaria. The Lancet, 365(9469):1487–1498, 2005.
8. S. Henrickson, T. Mempel, I. Mazo, B. Liu, M. Artyomov, H. Zheng, A. Peixoto, M. Flynn, B. Senman, T. Junt, H. Wong, A. Chakraborty, and U. von Andrian. T cell sensing of antigen dose governs interactive behavior with dendritic cells and sets a threshold for T cell activation. Nat Immunol, 9(3):282–91, Mar 2008.
9. D. Hochberg, T. Souza, M. Catalina, J. L. Sullivan, K. Luzuriaga, and D. A. Thorley-Lawson. Acute infection with Epstein-Barr virus targets and overwhelms the peripheral memory B-cell compartment with resting, latently infected cells. J Virol, 78(10):5194–204, 2004.
10. D. Hudrisier, J. Riond, L. Garidou, C. Duthoit, and E. Joly. T cell activation correlates with an increased proportion of antigen among the materials acquired from target cells. Eur J Immunol, 35(8):2284–94, Aug 2005.
11. H. A. H. Ibrahim and K. N. Naresh. Post-transplant lymphoproliferative disease. Pediatric Nephrology, 2012, 2012.
12. G. Khan, E. M. Miyashita, B. Yang, G. J. Babcock, and D. A. Thorley-Lawson. Is EBV persistence in vivo a model for B cell homeostasis? Immunity, 5(2):173–9, 1996.
13. S. Mueller and A. R. High antigen levels are the cause of T cell exhaustion during chronic viral infection. Proc Natl Acad Sci U S A, 106(21):8623–8, May 2009.
14. K. M. Murphy, P. Travers, and M. Walport. Janeway's Immunobiology (Immunobiology: The Immune System (Janeway)). Garland Science, 7th edition, Nov. 2007.
15. M. A. Nowak and R. M. May. Virus dynamics: Mathematical principles of immunology and virology. Oxford University Press, Oxford, 2000.
16. D. Thorley-Lawson. Epstein-Barr virus: exploiting the immune system. Nature Reviews Immunology, 1(1):75–82, 2001.
17. D. A. Thorley-Lawson, K. A. Duca, and M. Shapiro. Epstein-Barr virus: a paradigm for persistent infection - for real and in virtual reality. Trends in
Immunology, 29(4):195–201, 2008.
18. K. M. Tyler and D. M. Engman. The life cycle of Trypanosoma cruzi revisited. International Journal for Parasitology, 31(5-6):472–480, 2001.
19. D. Wodarz. Killer Cell Dynamics: Mathematical and Computational Approaches to Immunology. Springer Verlag, 2007.
20. N. Zhang and M. Bevan. CD8(+) T cells: foot soldiers of the immune system. Immunity, 35(2):161–8, Aug 2011.
21. H. Zheng, B. Jin, S. Henrickson, A. Perelson, U. von Andrian, and A. Chakraborty. How antigen quantity and quality determine T-cell decisions in lymphoid tissue. Mol Cell Biol, 28(12):4040–51, Jun 2008.

May 7, 2013

9:6

BC: 8846 - BIOMAt 2012

10˙floudas

ADVANCES IN DE NOVO PROTEIN DESIGN FOR MONOMERIC, MULTIMERIC, AND CONFORMATIONAL SWITCH PROTEINS

JAMES SMADBECK, GEORGE A. KHOURY, MEGHAN B. PETERSON, AND CHRISTODOULOS A. FLOUDAS Department of Chemical and Biological Engineering Princeton University, Princeton, NJ 08544 E-mail: [email protected]

The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of the process of drug design and discovery. Not only does protein design provides predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability, protein-peptide complexes and multimeric proteins for increased binding affinity and aggregation affinity, and the design of conformational switches with minimal number of mutations. Each method has enjoyed notable successes and experimental validation. For the dissemination of these methods for broader use we present Protein WISDOM (http://atlas.princeton.edu/proteinwisdom), a webtool that provides automated design methods for the wide range of protein design problems we are capable of addressing. Structural templates are submitted to a server and parsed into separate chains and sequences for design. All the design methods use the structural template to discern relevant mutation constraints based on solvent accessible surface area properties and biological constraints on charge and amino acid content. 
The first stage of all methods is an optimization-driven sequence selection stage that minimizes an objective function based on the desired design problem. Selected sequences are then run through a cascade of increasingly demanding and accurate validation steps to confirm the desired design. A rank-ordered list of the sequences for each step of the process provides the user with a comprehensive quantitative assessment of the design for all available methods. Here we provide the details of each design method as well as several notable successes attained through the use of the methods.

May 7, 2013

9:6

BC: 8846 - BIOMAt 2012

10˙floudas

1. Introduction

De novo protein design aims to identify sequences that will yield a desired three-dimensional (3D) structure with particular properties or functions. It is the inverse of the protein folding problem, where one aims to elucidate the 3D structure corresponding to a given amino acid sequence. Protein design exhibits degeneracy, as many sequences can fold into a particular structure. This complicates the problem of finding the optimal sequence for a particular property or function. Experimentally, proteins can be designed via random mutagenesis, rational design, or directed evolution 1 , but all of these methods are limited by the number of tractable sequences that can be identified and tested, due to time and cost constraints. For example, a protein of only 15 amino acids undergoing design has a sequence space of 20^15 (about 3.3 × 10^19) possible sequences, which would be experimentally expensive and time consuming to explore fully. The search space continues to grow astronomically with each additional amino acid position being considered for design. Thus, peptides and proteins with many design positions are particularly challenging for experimental protein design. Compounding this complexity is the tremendous and increasing number of post-translational modifications and non-canonical amino acids 2 being discovered, which can be considered as design choices. These can be difficult to incorporate site-specifically by means other than peptide synthesis, which imposes size limitations on the design. Systematic computational approaches have been created to address these challenges in protein design. These methods can be deterministic or probabilistic 3,4 and, as a result of technological advances in computing hardware and algorithms, can take backbone flexibility into account either directly, by using a flexible template 5–13 , or by allowing backbone relaxations in each iteration of design 14 .
Alternatively, one can use a single fixed template, which can save computational time 15–17 . Protein design is a rigorous test of our understanding of molecular recognition and has been applied to many systems of biological, physical, and medicinal interest. For success, it requires accurate modeling of the protein structures involved, maintaining stability of a protein at the required operating conditions, and proper elucidation and atomistic modeling of the key interactions between the different participants in a design problem 18 . In recent work, computational design has been used to design inhibitors against H1N1 influenza hemagglutinin 19 , to switch cofactor specificity of an enzyme 20 , for generalized antibody design for recognition of a target
epitope 21 , and to successfully increase the activity of an enzyme using a crowd-sourced online multiplayer game 22 . This list is not exhaustive, and the range of applications continues to grow. See Fung et al. 4 , Pantazes et al. 18 , and Samish et al. 23 for excellent reviews of the recent advances in the area. In this paper, we highlight the advances in the area of de novo protein design made in our lab and describe their integration into a unified webtool, Protein WISDOM. This webtool is capable of applying the methods of protein design in a user-friendly, interactive fashion, and is being made available to the academic community for use in the design of novel sequences for diverse functions.

2. Review of De Novo Design Framework and Successful Experimentally Validated Designs

The overall protein design framework has been successfully applied in a number of systems of medicinal interest, targeting proteins involved in chronic ailments such as HIV, cancer, innate immunity, and other autoimmune disorders. Notably, many of these fully computational predictions have been validated by external experimental collaborators. We summarize the major designs to date in Table 1.

Table 1. Summary of protein and peptide designs to date using the de novo protein design framework. The # of computational predictions is presented as the number of favorable predictions (i.e. fold specificities above a certain cutoff or approximate binding affinities greater than the native sequence). The # of experimental validations gives two numbers: the first is the number of predictions that were experimentally validated while the second is the total number of predictions that were tested experimentally.
Protein Design
Full sequence design of human beta-defensin-2
Compstatin inhibitors of human C3
Compstatin analogues that bind to rat C3c
C3a receptor agonists and antagonists
HIV-1 gp41 inhibitors
Bak inhibitors of Bcl-xL and Bcl-2
Inhibitors of EZH2
Inhibitors of LSD1 and LSD2
Inhibitors of HLA-DR1

Protein Length 41

# of Computational Predictions 340

# of Experimental Validations -

13 13 77 12 16 - 18 21 16 13

28 5 20 6 10 17 41 6

3/3 4/7 4/5 5/5 10/10 17/20 -

Reference 24 25,26 27 28 29 30 31

32

Table 2 summarizes experimentally validated peptide sequences with agonistic and antagonistic activity predicted using the de novo protein design framework. The approximate binding affinity metric was used to predict nine of the sequences (inhibitors of human C3c, HIV-1 gp41, EZH2,
LSD1, and LSD2), while the fold specificity metric was used to identify four of the sequences (agonists/antagonists of C3aR). These peptides highlight the success of the de novo protein design framework, particularly the added approximate binding affinity metric. The framework is extremely versatile in its applicability: designs against six different proteins linked to twenty-five different diseases have been experimentally validated.

Table 2. Computationally predicted and experimentally validated peptides targeting various diseases.

Name      IC50          EC50      Protein Target
SQ027     0.94 µM       -         human C3c
SQ086     1.98 µM       -         human C3c
SQ059     4.73 µM       -         human C3c
SQ110-4   -             15.2 nM   C3aR
SQ060-4   -             36.4 nM   C3aR
SQ007-5   15.4 nM       -         C3aR
SQ002-5   26.1 nM       -         C3aR
SQ435     29 - 253 µM   -         HIV-1 gp41
SQ037     13.57 µM      -         EZH2
SQ011-1   0.521 µM      -         LSD1
SQ016-1   0.249 µM      -         LSD1
SQ026-1   2.51 µM       -         LSD2
SQ015-1   1.332 µM      -         LSD2

Applicable Diseases
human C3c, C3aR: stroke, heart attack, Alzheimer’s disease, asthma, rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, psoriasis, diabetes type I, Crohn’s disease, pancreatitis, and cystic fibrosis
HIV-1 gp41: AIDS
EZH2, LSD1, LSD2: prostate, breast, lymphoma, myeloma, bladder, colon, skin, liver, endometrial, lung, and gastric cancers

3. Monomeric De Novo Protein Design Framework

3.1. Method Overview

The de novo protein design framework, which is the algorithmic driving force behind Protein WISDOM, is a two-stage approach, and is presented in Fig. 1. In the first stage, the method uses coarse-grained forcefields to elucidate low-energy sequences that will fold into a particular template structure. The second stage applies fold specificity calculations, approximate binding affinity calculations, and/or transition specificity calculations to re-rank the sequences produced in the first stage. These calculations are done to increase both the stability of the designed sequence in the template fold and the binding affinity of the designed sequence to a target receptor. Fold specificity is normally used when designing a single protein in isolation for improved specificity, approximate binding affinity is utilized when designing or redesigning a complex, and transition specificity is used to simultaneously design positively for one template structure and negatively for another template structure.

Design Inputs: Several inputs are required to begin protein design. An accurate 3D structure, which can be from the Protein Data Bank 33 , a homology model, or an ab initio predicted structure, is required. This structure can be a single structure, or an ensemble of structures derived from NMR spectroscopy, molecular dynamics simulations, or predicted docked poses. The latter defines a flexible template, and can increase the robustness of predictions by including the local structural changes experienced by a protein. Increasing the accuracy of the structures supplied for design increases the likelihood that predictions will succeed in subsequent experimental validation 18 .

Figure 1. Overview of the de novo protein design framework.

The template structure is used to generate the allowed mutations at each sequence position. This mutation set is based on the solvent accessible surface area (SASA) of each residue in the template. When a residue is more than 50% exposed to solvent, hydrophilic amino acids are allowed (D, E, G, H, K, N, P, Q, R, S, T). Conversely, if a residue is less than 20% solvent-exposed, hydrophobic amino acids are allowed as design choices (A, F, I, L, M, V, W, Y). If a residue has an intermediate solvent exposure, between 20% and 50%, all amino acids are candidates. Often, cysteine is intentionally disallowed as a design choice because of its
propensity to form disulfide bridges, unless external information deems it appropriate for inclusion. Biological constraints on charge and amino acid content can be added as well.

The next input is the choice of forcefield. This forcefield describes the pairwise interaction energies of the sequences in the design template. Two distance-dependent forcefields have been developed in-house and are used in the framework to describe the preference of amino acid pairs to be within certain distances of each other. In the forcefields, the distances are calculated either as the Cα-Cα distance 34 or as the distance between the centroids of the side-chains of a pair of residues 35 . Although both are applicable in design, we are typically more concerned with optimizing interactions of the side-chains for molecular recognition, so we have found more success with the centroid-centroid forcefield. The forcefields are discretized into distance bins to account for backbone flexibility. Once the mutation set, forcefield, and biological constraints are defined, one can proceed to performing design using the two-stage approach.

Stage One: Sequence Selection: Sequence selection aims to select and rank amino acid sequences according to those that minimize the energy defined by the selected forcefield. This stage takes into account 3D interactions between the amino acids, can account for structural flexibility, and in the overall framework is the first step in reducing the astronomically large combinatorial sequence complexity to a more tractable set.

Single Structure Model: The original form of the sequence selection model (Eq. 1), which is based on Integer Linear Optimization (ILP), was developed by Klepeis et al. 12,13 and was subsequently refined by Fung and coworkers 36 .

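$$
\begin{aligned}
\min_{y_i^j,\, y_k^l} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m_i} \sum_{k=i+1}^{n} \sum_{l=1}^{m_k} E_{ik}^{jl}(x_i, x_k)\, w_{ik}^{jl} \\
\text{subject to} \quad & \sum_{j=1}^{m_i} y_i^j = 1 \quad \forall\, i \\
& \sum_{j=1}^{m_i} w_{ik}^{jl} = y_k^l \quad \forall\, i,\ k > i,\ l \\
& \sum_{l=1}^{m_k} w_{ik}^{jl} = y_i^j \quad \forall\, i,\ k > i,\ j \\
& y_i^j,\ y_k^l,\ w_{ik}^{jl} \in \{0, 1\} \quad \forall\, i,\ j,\ k > i,\ l
\end{aligned}
\tag{1}
$$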

The set $i = 1, \ldots, n$ defines the residue positions in the design template. At each position $i$, mutations are represented by $j\{i\} = 1, \ldots, m_i$, where $m_i = 20$ if position $i$ is allowed to mutate to any of the twenty natural amino acids. The alias sets $k \equiv i$ and $l \equiv j$, with $k > i$, are employed to represent all unique pairwise interactions. Binary variables $y_i^j$ and $y_k^l$ are introduced to model amino acid mutations. The $y_i^j$ variable assumes the value of one if the model assigns amino acid $j$ to position $i$, and the value of zero otherwise (similarly for $y_k^l$). The objective function represents the sum of all pairwise energy interactions in the design template. Parameter $E_{ik}^{jl}(x_i, x_k)$, which is the energy interaction between position $i$ occupied by amino acid $j$ and position $k$ occupied by amino acid $l$, depends on the distance between the α-carbons or side chain centroids at the two positions $(x_i, x_k)$ as well as the type of amino acids $j$ and $l$. It only contributes to the objective function if both $y_i^j$ and $y_k^l$ are equal to one, in which case the binary variable $w_{ik}^{jl}$, a linearization of the product $y_i^j y_k^l$, is also equal to one. The formulation (1) for solving sequence selection was shown to be significantly more computationally efficient than twelve other equivalent quadratic assignment-like models 36,37 .

Weighted Average Model: Two models were developed by Fung et al. 36 for de novo protein design in which the design template is flexible. The Weighted Average Model (Eq. 2) uses a weighted average energy, $\sum_{d=1}^{b_m} E_{ik}^{jl}(x_i, x_k)\, wt(x_i, x_k, d)$, in place of the energy parameter $E_{ik}^{jl}(x_i, x_k)$ in the Single Structure Model (Eq. 1). Weights $wt(x_i, x_k, d)$ are based on the frequencies of the distance between $x_i$ and $x_k$ falling into distance bin $d$ in the ensemble of template structures.
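
The weighting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the bin edges, the ensemble distances, and the per-bin energies are made-up values, and the helper names (`bin_index`, `bin_weights`, `weighted_energy`) are hypothetical.

```python
from collections import Counter

def bin_index(distance, bin_edges):
    """Return the index d of the distance bin that contains `distance`."""
    for d in range(len(bin_edges) - 1):
        if bin_edges[d] <= distance < bin_edges[d + 1]:
            return d
    raise ValueError(f"distance {distance} lies outside the binned range")

def bin_weights(distances, bin_edges):
    """wt(x_i, x_k, d): fraction of ensemble structures whose i-k distance falls in bin d."""
    counts = Counter(bin_index(r, bin_edges) for r in distances)
    n_bins = len(bin_edges) - 1
    return [counts.get(d, 0) / len(distances) for d in range(n_bins)]

def weighted_energy(energy_by_bin, weights):
    """Weighted-average pair energy: sum over bins d of E(d) * wt(d)."""
    return sum(e * w for e, w in zip(energy_by_bin, weights))

# Hypothetical i-k distances (in Angstrom) observed in five template structures.
distances_ik = [5.2, 5.8, 6.1, 6.4, 7.3]
# Hypothetical bin edges defining four distance bins.
bin_edges = [3.0, 5.0, 6.0, 7.0, 8.0]
# Made-up energies of one amino acid pair (j, l) in each of the four bins.
energy_by_bin = [1.5, -0.4, -1.0, -0.2]

wt = bin_weights(distances_ik, bin_edges)
print("weights per bin:", wt)
print("weighted-average energy:", weighted_energy(energy_by_bin, wt))
```

The resulting number would take the place of the single-structure energy coefficient for that position pair and amino acid pair.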

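$$
\begin{aligned}
\min_{y_i^j,\, y_k^l} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m_i} \sum_{k=i+1}^{n} \sum_{l=1}^{m_k} \sum_{d=1}^{b_m} E_{ik}^{jl}(x_i, x_k)\, wt(x_i, x_k, d)\, w_{ik}^{jl} \\
\text{subject to} \quad & \sum_{j=1}^{m_i} y_i^j = 1 \quad \forall\, i \\
& \sum_{j=1}^{m_i} w_{ik}^{jl} = y_k^l \quad \forall\, i,\ k > i,\ l \\
& \sum_{l=1}^{m_k} w_{ik}^{jl} = y_i^j \quad \forall\, i,\ k > i,\ j \\
& y_i^j,\ y_k^l,\ w_{ik}^{jl} \in \{0, 1\} \quad \forall\, i,\ j,\ k > i,\ l
\end{aligned}
\tag{2}
$$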
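
Formulations (1) and (2) share the same constraint set and differ only in the energy coefficients, so a toy instance can illustrate both. The sketch below builds the mutation sets from the solvent-accessibility thresholds given under Design Inputs, enumerates every allowed sequence, and checks that the linearization constraints hold for the best one. It is an illustration only: exhaustive enumeration stands in for the integer linear optimization solver, `pair_energy` is a made-up potential rather than either in-house forcefield, and the SASA values are hypothetical.

```python
from itertools import product

HYDROPHILIC = set("DEGHKNPQRST")
HYDROPHOBIC = set("AFILMVWY")
ALL_AA = set("ACDEFGHIKLMNPQRSTVWY")
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

def mutation_set(sasa_fraction, allow_cys=False):
    """Allowed residues at one position, from its fractional solvent exposure."""
    if sasa_fraction > 0.50:
        allowed = set(HYDROPHILIC)
    elif sasa_fraction < 0.20:
        allowed = set(HYDROPHOBIC)
    else:
        allowed = set(ALL_AA)
    if not allow_cys:
        allowed.discard("C")  # cysteine is excluded unless explicitly requested
    return sorted(allowed)

def pair_energy(a, b):
    """Toy pair energy (an assumption, not a forcefield from the text)."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return -1.0
    # -2 for opposite charges, +2 for like charges, 0 otherwise
    return 2.0 * CHARGE.get(a, 0) * CHARGE.get(b, 0)

def total_energy(seq):
    """Objective value for a fixed sequence: sum over all pairs i < k."""
    n = len(seq)
    return sum(pair_energy(seq[i], seq[k]) for i in range(n) for k in range(i + 1, n))

def linearization_holds(seq, mutation_sets):
    """Check sum_j w = y_k^l and sum_l w = y_i^j when w is the product y_i^j * y_k^l."""
    n = len(seq)
    y = [{a: int(a == seq[i]) for a in mutation_sets[i]} for i in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            w = {(j, l): y[i][j] * y[k][l] for j in y[i] for l in y[k]}
            if any(sum(w[j, l] for j in y[i]) != y[k][l] for l in y[k]):
                return False
            if any(sum(w[j, l] for l in y[k]) != y[i][j] for j in y[i]):
                return False
    return True

def select_sequences(mutation_sets, top=5):
    """Enumerate every allowed sequence and return the `top` lowest-energy ones."""
    scored = [(total_energy(seq), "".join(seq)) for seq in product(*mutation_sets)]
    scored.sort()
    return scored[:top]

# Hypothetical fractional SASA values for a four-residue template.
sasa = [0.65, 0.10, 0.35, 0.05]
sets = [mutation_set(s) for s in sasa]
best = select_sequences(sets)
print(best)
print(linearization_holds(best[0][1], sets))
```

Enumeration is only feasible here because the toy template has four positions; the search space grows as the product of the mutation-set sizes, which is why the text relies on an optimization model instead.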

Distance Bin Model: The Distance Bin Model of sequence selection 36 (Eq. 3) utilizes the distance information from the flexible template by introducing a binary variable $b_{ikd}$. This binary variable is activated if the distance between $x_i$ and $x_k$ falls into distance bin $d$, and is set to zero otherwise. A parameter, $disbin(x_i, x_k, d)$, is introduced that is activated if the distance between $x_i$ and $x_k$ in any structure from the flexible template falls into distance bin $d$, and is zero otherwise. Only one distance bin per pair of positions $(i, k)$ is allowed to contribute to the total energy.

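$$
\begin{aligned}
\min_{y_i^j,\, y_k^l} \quad & \sum_{i=1}^{n} \sum_{j=1}^{m_i} \sum_{k=i+1}^{n} \sum_{l=1}^{m_k} \; \sum_{d:\, disbin(x_i, x_k, d) = 1} E_{ik}^{jl}(x_i, x_k)\, b_{ikd}\, w_{ik}^{jl} \\
\text{subject to} \quad & \sum_{j=1}^{m_i} y_i^j = 1 \quad \forall\, i \\
& \sum_{j=1}^{m_i} w_{ik}^{jl} = y_k^l \quad \forall\, i,\ k > i,\ l \\
& \sum_{l=1}^{m_k} w_{ik}^{jl} = y_i^j \quad \forall\, i,\ k > i,\ j \\
& \sum_{d:\, disbin(x_i, x_k, d) = 1} b_{ikd} = 1 \quad \forall\, i,\ k > i \\
& y_i^j,\ y_k^l,\ w_{ik}^{jl},\ b_{ikd} \in \{0, 1\} \quad \forall\, i,\ j,\ k > i,\ l,\ d
\end{aligned}
\tag{3}
$$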

Stage Two: Validation: Three different calculations may be employed to validate the designed sequences elucidated in Stage One: Fold Specificity, Approximate Binding Affinity, and Transition Specificity. Depending on the type of design being performed, one to three of these methods are required. The following sections describe each of these validation methods in detail.

Fold Specificity: Fold specificity is a metric used to rank the sequences from Stage One based on how well a designed sequence folds into the design template. The fold specificity calculations can take one of two approaches: the ASTRO-FOLD 34,35,38–54 approach and the TINKER/CYANA 55–57 approach. ASTRO-FOLD is a deterministic global optimization-based protein structure prediction framework. Its powerful conformational search and branch-and-bound approaches were those initially used for the fold specificity calculations 12,13 . In the initial approach, two conformational ensembles were generated: one in which the protein is constrained to a region around the flexible backbone template, and the other in which the protein is allowed to fold freely, while maintaining the secondary structure. The probability of an amino acid sequence assuming the target fold is calculated based on a Boltzmann-weighted probability. Since this approach utilizes global optimization with a highly non-convex objective function, it can become computationally demanding to perform fold specificity calculations in this way for hundreds of sequences. Therefore, an alternate approach for fold specificity is commonly used 24 . Given the template structure, an ensemble of hundreds of structures is generated via simulated annealing in CYANA 2.1 55,56 . The structure is allowed to vary by a given percentage or fixed distance, and the φ and ψ angles between residues are allowed to vary within a given range of degrees from the initial template structure. This is performed for the original native sequence, as well as for the predicted mutant sequences derived from Stage One, with side-chain substitutions made accordingly. Next, each structure in the ensemble is subjected to a local minimization by TINKER 3.6 57 , utilizing the AMBER forcefield 58 to evaluate the potential energy of each structure. Using these structures, the fold specificity of each mutant sequence to the fold of the template structure can be calculated relative to the native sequence (Eq. 4).

$$
f_{spec} = \frac{\sum_{i \in novel} e^{-\beta E_i}}{\sum_{i \in native} e^{-\beta E_i}}
\tag{4}
$$
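
A minimal sketch of how Eq. 4 could be evaluated from two lists of minimized energies is shown below. The energies, the temperature, and the energy unit (kcal/mol, so that $\beta = 1/(k_B T)$ uses the constant `K_B`) are assumptions made for the example; the text does not state them. The Boltzmann sums are evaluated in log space because the raw exponentials overflow for typical force-field energies.

```python
import math

K_B = 0.0019872041  # Boltzmann (gas) constant in kcal/(mol K); unit choice is an assumption

def log_boltzmann_sum(energies, beta):
    """Return ln(sum_i exp(-beta * E_i)) using the log-sum-exp trick."""
    scaled = [-beta * e for e in energies]
    m = max(scaled)
    return m + math.log(sum(math.exp(s - m) for s in scaled))

def fold_specificity(novel_energies, native_energies, temperature=298.15):
    """f_spec: ratio of the Boltzmann sum of the novel ensemble to that of the native one."""
    beta = 1.0 / (K_B * temperature)
    log_ratio = (log_boltzmann_sum(novel_energies, beta)
                 - log_boltzmann_sum(native_energies, beta))
    return math.exp(log_ratio)

# Hypothetical minimized energies (kcal/mol) for small ensembles of each sequence.
native = [-118.0, -117.5, -116.9]
mutant = [-120.0, -118.5, -119.2]

print("f_spec of mutant:", fold_specificity(mutant, native))
print("f_spec of native against itself:", fold_specificity(native, native))
```

A value above one indicates that the mutant ensemble is, in the Boltzmann-weighted sense, lower in energy in the template fold than the native ensemble.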

Approximate Binding Affinity: The approximate binding affinity is introduced for protein-peptide complexes. It aims to capture the equilibrium between a protein and ligand in the complex state and each of them in their free state. This calculation is done when one is performing design with the intention to improve the binding affinity of a peptide to a protein target. It can be done for sequences predicted out of Stage One, or on sequences that were re-ranked based on their Fold Specificity. Lilien et al. 59 originally proposed an approach to calculate the approximate binding affinity of protein-ligand complexes based on statistical mechanics. The approximate binding affinity is meant to approximate the true binding constant, $K_A$. It can be calculated by generating rotamerically-based ensembles of the protein, the ligand, and the protein-ligand complex and using those ensembles to calculate partition functions for each state. The partition functions of the protein-ligand complex $q_{PL}$, free protein $q_P$, and free ligand $q_L$ are defined in Eq. 5, where the sets $B$, $F$, and $L$ contain the rotamerically-based conformations of the bound complex, the free protein, and the free ligand, respectively. $E_n$ is the energy of conformation $n$, $R$ is the gas constant, and $T$ is the temperature.

$$
q_{PL} = \sum_{b \in B} e^{-E_b / RT}, \qquad
q_P = \sum_{f \in F} e^{-E_f / RT}, \qquad
q_L = \sum_{\ell \in L} e^{-E_\ell / RT}
\tag{5}
$$

The approximate binding affinity, denoted as $K^*$, can be defined as a ratio of the partition functions, as in Eq. 6,

$$
K^* = \frac{q_{PL}}{q_P\, q_L}
\tag{6}
$$
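
Eqs. 5 and 6 can be combined into a single log-space calculation, sketched below. The conformer energies, the temperature, and the use of kcal/mol are assumptions for the example, and the function names are hypothetical. Working with $\ln K^*$ avoids overflow and is sufficient for ranking sequences.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K); unit choice is an assumption

def log_partition(energies, temperature):
    """ln q = ln sum_n exp(-E_n / (R T)), computed with the log-sum-exp trick."""
    scaled = [-e / (R_KCAL * temperature) for e in energies]
    m = max(scaled)
    return m + math.log(sum(math.exp(s - m) for s in scaled))

def log_k_star(complex_energies, protein_energies, ligand_energies, temperature=298.15):
    """ln K* = ln q_PL - ln q_P - ln q_L."""
    return (log_partition(complex_energies, temperature)
            - log_partition(protein_energies, temperature)
            - log_partition(ligand_energies, temperature))

# Hypothetical conformer energies (kcal/mol) for the free protein (set F).
protein = [-250.0, -249.4, -248.9]
# Hypothetical bound-complex (set B) and free-peptide (set L) energies for two peptides.
candidates = {
    "peptide_1": {"complex": [-310.2, -309.8, -308.1], "ligand": [-52.0, -51.5]},
    "peptide_2": {"complex": [-306.0, -305.1, -304.7], "ligand": [-53.1, -52.4]},
}

ranking = sorted(
    candidates,
    key=lambda name: log_k_star(candidates[name]["complex"], protein, candidates[name]["ligand"]),
    reverse=True,
)
print(ranking)
```

Sequences are listed from the largest to the smallest $\ln K^*$, which is the order in which a re-ranking step would present them.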

Structure Prediction: A large ensemble of structures should obey statistical mechanics. Therefore, we need to generate ensembles of physically relevant structures for the bound and unbound states for each predicted sequence. Large ensembles of structures for each sequence are predicted using the Rosetta AbRelax function 60–62 , part of the Rosetta 3.4 software package. This method utilizes a fragment-based Monte Carlo search to replace local structures with sequence-derived fragments. This step produces
a set of fragment-based structure predictions for each of the predicted sequences, which may or may not have homology to fragments contained in the Protein Data Bank or the fragment libraries.

Clustering: The ensemble of structures from AbRelax for each sequence is then clustered based upon the φ and ψ angles using OREO 63,64 . The clustering method, OREO, elucidates representative structures having dissimilar backbones from the entire structural ensemble. The average structures from the ten largest clusters and the overall lowest energy structure are chosen for docking to the target protein. This provides 11 unique backbone structures for each peptide sequence, incorporating backbone flexibility into the ensemble generation.

Docking Prediction: Docked poses of these newly predicted sequences to the target protein are also required. Therefore, we utilize RosettaDock 65–67 to generate predicted docked poses to a target receptor protein. For each sequence, each of the 11 peptide backbone structures is docked against the target protein. When the binding site is known, the peptides are initially placed near the binding site and allowed to translate up to 3 Å normal to the binding site and 8 Å parallel to the binding site, and to rotate 8°. This docking step produces a large ensemble of complex structures. The ten lowest energy complexes in each of the 11 runs are used as starting structures in the final rotamerically-based conformation ensemble generation (110 starting structures per sequence).

Final Ensemble Generation: RosettaDesign 68 is used to generate the final structural ensemble for each sequence. RosettaDesign is chosen as it can generate a large number of structures by only adjusting the rotamers on the side chains through the fixbb function. To generate the peptide ensemble, which incorporates both backbone flexibility and rotamer substitutions, the ten lowest-energy peptide structures from each of the ten largest clusters plus the ten overall lowest-energy peptide structures are used as starting structures for RosettaDesign (110 total starting structures). For each starting structure, 200 rotamer conformers are generated, giving a final ensemble of 22,000 structures (set L in Eq. 5). The complex ensemble is generated similarly by taking the 110 starting structures from the docking prediction step and generating 200 rotamer conformers per starting structure for a total of 22,000 structures (set B in Eq. 5). Finally, the protein ensemble is produced by utilizing RosettaDesign on just the target protein structure to create 2000 rotamer conformations for the starting structure (set F in Eq. 5).
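
The ensemble sizes quoted above follow directly from the counts given in the text. As a worked check, writing $|L|$, $|B|$, and $|F|$ for the number of conformations in each set (a shorthand introduced here, not notation from the source):

```latex
\begin{align*}
|L| &= (10 \times 10 + 10) \times 200 = 110 \times 200 = 22{,}000 \\
|B| &= (11 \times 10) \times 200 = 110 \times 200 = 22{,}000 \\
|F| &= 1 \times 2000 = 2{,}000
\end{align*}
```

Each sequence therefore contributes 46,000 energy evaluations to the partition functions of Eq. 5, with the free-protein ensemble being shared when the target protein itself is not redesigned.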

4. Transition Specificity Design with Minimum Mutations

Determining the minimum number of mutations necessary to induce a change in the tertiary structure of a protein is a highly desirable design target. This design target challenges the idea that structure is more conserved than sequence by pushing the idea to extremes. Such “conformational switches” could be used in the development and understanding of highly sensitive biosensors, as well as in understanding the implication of a minimal, but significant, change in amino acid sequence during protein design. This objective of determining the minimum number of mutations for a conformational switch was first described in 1994 by George D. Rose and Trevor P. Creamer. They opened a competition, denoted as the Paracelsus challenge 69 , which offered a cash prize to the group able to develop two proteins of distinct tertiary structure with at least 50% sequence identity. Since the challenge was first met in 1997 70,71 , there have been many advances in the field. These include switches induced by experimental exploration of sequence space between proteins with determined tertiary structure 72–76 , by introducing small molecules or metals, and by changing hydrophobic-polar patterning 77,78 . However, there are no computational methods that have attained the design of conformational switches with minimum mutations. A three-stage method for the design of minimum mutation conformational switches has been developed 79 . This method has been tested with an untrained experimental test set and is currently undergoing experimental testing for predicted conformational switches. Once experimentally validated, the method will be included in Protein WISDOM. The first stage is an optimization model that selects for sequences which satisfy properties characteristic of conformational switches. As in monomeric protein design, the protein model used can be for either rigid or flexible templates, and several in-house forcefields can be used to represent potential energy 34,35 . The second stage uses the detailed energy function FAMBE-pH for free energy calculations 80 to quantify how likely the sequence is to switch from the native conformation to the target conformation. The conformational switch metric is denoted as the Transition Specificity 79 (TSpec ) of the sequence.

Stage One: Sequence Selection with Minimal Mutations

The first step in designing a conformational switch is to generate an ensemble of candidate mutant sequences that all fold into a similar “switched” structure, but may have different properties, such as stability or function. The quadratic assignment-like models first developed by Klepeis et al. 13 and extended by Fung et al. 36 were used as a basis for this work. The main question for a conformational switch is: for a given pair of conformations with distinctly different folds (e.g., α-helical and α/β protein structures), what is the minimum number of mutations necessary to move from one structure to the other? For this reason, the objective function of the optimization model minimizes the number of mutations. Using the rigid backbone model as a representative model, the conformational switch design model takes the form shown in Equation 7.

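$$
\begin{aligned}
\min_{y_i^j,\, y_k^l} \quad & \sum_{i=1}^{n} \; \sum_{j:\, \hat{y}_i^j = 0} y_i^j \\
\text{subject to} \quad & \sum_{j=1}^{m_i} y_i^j = 1 \quad \forall\, i \\
& \sum_{j=1}^{m_i} w_{ik}^{jl} = y_k^l \quad \forall\, i,\ k > i,\ l \\
& \sum_{l=1}^{m_k} w_{ik}^{jl} = y_i^j \quad \forall\, i,\ k > i,\ j \\
& E'_a = \sum_{i=1}^{n-1} \sum_{j=1}^{m_i} \sum_{k=i+1}^{n} \sum_{l=1}^{m_k} \sum_{d=1}^{b_m} E_{ik}^{jl}(x_i, x_k)\, w_{ik}^{jl} \quad \text{(positions } x_i, x_k \text{ from template } a\text{)} \\
& E'_b = \sum_{i=1}^{n-1} \sum_{j=1}^{m_i} \sum_{k=i+1}^{n} \sum_{l=1}^{m_k} \sum_{d=1}^{b_m} E_{ik}^{jl}(x_i, x_k)\, w_{ik}^{jl} \quad \text{(positions } x_i, x_k \text{ from template } b\text{)} \\
& E'_a \ge E_a - \Delta_a \\
& E'_b \le E_b + \Delta_b \\
& E'_b - E'_a \le E_b - E_a \\
& y_i^j,\ y_k^l,\ w_{ik}^{jl} \in \{0, 1\} \quad \forall\, i,\ j,\ k > i,\ l
\end{aligned}
\tag{7}
$$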

The set $i = 1, \ldots, n$ indexes the amino acid positions along the native and target protein backbone. The equivalent set $k$ is defined for all positions. For each position $i$ the set $j\{i\} = 1, \ldots, m_i$ is defined as the
possible amino acid mutations. For the general case this represents the 20 naturally occurring amino acids, such that $m_i = 20$. The equivalent set $l\{k\}$ is defined for the allowable mutations at position $k$. The binary variables $y_i^j$ and $y_k^l$ are defined as equal to 1 if position $i$ or $k$ is occupied by amino acid $j$ or $l$, respectively. The binary variable $w_{ik}^{jl}$ represents a linearization of $y_i^j \cdot y_k^l$, such that $w_{ik}^{jl}$ takes a value of 0 if $y_i^j$ and/or $y_k^l$ is equal to 0, and takes a value of 1 if both binary variables are equal to 1. The energy parameters $E_{ik}^{jl}$ are the pairwise interactions between amino acids $j$ and $l$ occupying positions $i$ and $k$. In order to optimize for minimum mutations, the parameter $\hat{y}_i^j$ takes a value of 1 if residue $j$ in position $i$ is part of the native sequence, and takes a value of 0 otherwise. The energy parameter $E_t$ is defined as the energy of the starting sequence in the template $t$, parameter $E'_t$ is the calculated energy of a given sequence in the template $t$, and parameter $\Delta_t$ is a tolerance value for the energy constraints. We define two templates in $t$: template $a$ is the native template and template $b$ is the target template. The model consists of an objective function defined to minimize the number of mutated positions. Several energy constraints are defined to guarantee that any sequence selected has energetics indicative of a potential conformational switch. These constraints are used to guarantee that the candidate sequence has more favorable energetics in the target template than the native sequence and less favorable energetics in the native template than the native sequence.
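
The structure of this model can be illustrated on a toy instance by exhaustive search, trying zero mutations first, then one, then two, and stopping at the first mutation count for which the three energy constraints hold. Everything below is an assumption made for illustration: the contact lists standing in for templates a and b, the made-up `pair_energy`, the native sequence, and the tolerance values. With non-negative tolerances the native sequence itself satisfies all three constraints as written, so the example passes a negative value for the target tolerance to demand a strict improvement in template b; that choice is ours and is not stated in the text.

```python
from itertools import combinations, product

AA = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AFILMVWY")
CONTACTS_A = [(0, 2), (2, 4)]  # hypothetical contacts in the native template a
CONTACTS_B = [(1, 3), (3, 5)]  # hypothetical contacts in the target template b

def pair_energy(a, b):
    """Made-up contact energy: hydrophobic pairs favourable, mixed pairs penalised."""
    ha, hb = a in HYDROPHOBIC, b in HYDROPHOBIC
    if ha and hb:
        return -1.0
    return 0.5 if ha != hb else 0.0

def template_energy(seq, contacts):
    """Energy of a sequence threaded onto a template, summed over its contacts."""
    return sum(pair_energy(seq[i], seq[k]) for i, k in contacts)

def min_mutation_switch(native, contacts_a, contacts_b, delta_a, delta_b):
    """Return (number of mutations, feasible sequences) for the smallest feasible count."""
    e_a = template_energy(native, contacts_a)
    e_b = template_energy(native, contacts_b)
    for n_mut in range(len(native) + 1):  # fewest mutations first
        hits = []
        for positions in combinations(range(len(native)), n_mut):
            options = [[c for c in AA if c != native[p]] for p in positions]
            for subs in product(*options):
                seq = list(native)
                for p, c in zip(positions, subs):
                    seq[p] = c
                ea_new = template_energy(seq, contacts_a)
                eb_new = template_energy(seq, contacts_b)
                if (ea_new >= e_a - delta_a
                        and eb_new <= e_b + delta_b
                        and eb_new - ea_new <= e_b - e_a):
                    hits.append("".join(seq))
        if hits:
            return n_mut, hits
    return None

native_seq = "LKLKAE"  # hypothetical native sequence
n_mut, hits = min_mutation_switch(native_seq, CONTACTS_A, CONTACTS_B, delta_a=0.0, delta_b=-0.5)
print(n_mut, len(hits), hits[:3])
```

In the optimization model the same search is carried out implicitly by the solver; enumeration is used here only because the toy sequence has six positions.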

Stage Two: Transition Specificity via FAMBE-pH

The second stage of the methodology takes the top sequences from Stage One and introduces the first principles FAMBE-pH free energy formulation 80 using a fixed backbone protein structure input. Similar to the Fold Specificity step of the monomeric protein design, CYANA 55,56 and TINKER 81 are used to generate large ensembles of structures. Both the native and switch structures produced by CYANA and TINKER are evaluated for free energy using FAMBE-pH. This method of protein energy evaluation not only takes into account the torsional free energy and intramolecular forces, but also accounts for the free energy of cavitation, van der Waals interactions, free energy of protein ionization, and solvation free energy. The FAMBE-pH free energy calculation is performed for both the native and target conformational ensembles produced for a given sequence. The Transition Specificity (TSpec ) is then calculated in order to assess the
likelihood that the sequence folds into the target structure. The first step in the TSpec calculation is the Ensemble Energy Calculation, which involves a Boltzmann-weighted energy summation for both the native and target conformational ensembles, as shown in Equation 8.

$$
S_{Native} = \sum_{i \in Native} e^{-\beta G_i}, \qquad
S_{Target} = \sum_{i \in Target} e^{-\beta G_i}
\tag{8}
$$

The set i consists of the conformations in a given conformational ensemble t, defined as either the native or target conformational ensemble. The Boltzmann weighted summation uses the free energy calculated by FAMBE-pH for a given structure i, Gi . The TSpec value is then calculated using these Ensemble Energy values as shown in Equation 9.

$$
T_{Spec} = \frac{S_{Target}}{S_{Native}} = \frac{\sum_{i \in Target} e^{-\beta G_i}}{\sum_{i \in Native} e^{-\beta G_i}}
\tag{9}
$$
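
One way to read this ratio is through an ensemble free energy for each conformational ensemble. The quantity $G_t$ below is notation introduced here for illustration, and $\beta = 1/(RT)$ is assumed since the text does not define it explicitly:

```latex
\begin{align*}
G_t &\equiv -\tfrac{1}{\beta} \ln S_t, \qquad t \in \{Native,\ Target\} \\
T_{Spec} &= \frac{S_{Target}}{S_{Native}} = e^{-\beta\,(G_{Target} - G_{Native})} \\
T_{Spec} > 1 \;&\Longleftrightarrow\; G_{Target} < G_{Native}
\end{align*}
```

Under these assumptions, at 298 K ($RT \approx 0.593$ kcal/mol) a target ensemble that is about 1.36 kcal/mol lower in free energy than the native ensemble corresponds to a TSpec of roughly 10.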

Using this formulation, sequences obtained from the first stage of the method can be quantitatively assessed for the likelihood that they would switch from the native conformation to the target conformation.

5. Multimeric Protein Design

In the protein design methodology described above, the objective is to design a single protein sequence. Even in the approximate binding affinity calculation step, this involves the design of either the protein or the peptide, independent of its partner. However, for many objectives (e.g. the design of aggregating peptides) it is highly desirable to enable the design of multimeric systems, whereby all proteins in the system can be designed simultaneously. In order to generalize this method for the design of multimeric systems, where all or some of the proteins are being designed, several important aspects of the method must be modified. The Sequence Selection Stage must take into account the possible complications of a system with multiple, identical proteins (e.g. in protein aggregation). A constraint must be added to allow for specification of a corresponding position in separate protein chains. The Fold Specificity step must be modified to allow for the production of flexible templates for many proteins in the environment of other proteins. Finally, the Approximate Binding Affinity Calculation stage
must be made general for higher order protein interactions. The addition of these considerations and changes are currently in the process of being included in future versions of Protein WISDOM. 6. Protein WISDOM Protein WISDOM, which stands for Protein Workbench for In Silico De novo design Of bioMolecules, is a unified tool that gives the protein design community access to the framework described above in a user-friendly way. The webtool currently can perform design for single protein chains to adopt a template fold, as well as designing peptides to bind to a target protein. Specifically, Protein WISDOM automates the calculation of Sequence Selection, Fold Specificity, and Approximate Binding Affinity. A manuscript describing Protein WISDOM and the detailed steps involved in submitting a particular sequence for design, as well as interpreting the output, is in preparation 82 . 7. Conclusion In this work, we presented and reviewed our recent mathematical and algorithmic advances, and their applications to the fundamental problem of de novo protein design. We summarized a large number of designed sequences targeting a variety of different receptors involved in various disease processes, which have been experimentally validated. In addition, we described our efforts towards extending this framework to apply it to designing conformational switches and the simultaneous design of multiple sequences via multimeric design. Protein design still remains a formidable challenge. Successful design can be achieved by addressing each of the degrees of freedom in the inputs. These include the determination and selection of input template structures to be designed, the selection of which positions and what possible amino acid content is allowed in each position, and the location of the binding pocket when designing for increased affinity. 
Additionally, the incorporation of a flexible template can take into account local structural changes, which increases accuracy with limited increases in computational expense. We created a unified web interface, Protein WISDOM, for the dissemination and application of our algorithms that is available to the academic community. The applicability of such a method cannot be overstated, and we anticipate that the automation of the design framework via Protein WISDOM will have broad applicability.

Acknowledgements

CAF gratefully acknowledges support from NSF, NIH (R01 GM52032; R24 GM069736), and the US Environmental Protection Agency, EPA (R 832721-010). A portion of this research was made possible with Government support by DoD, Air Force Office of Scientific Research. JS gratefully acknowledges support from NIH (P50GM071508-06). GAK gratefully acknowledges support from a National Science Foundation Graduate Research Fellowship under grant number DGE-1148900. MLBP gratefully acknowledges support from a National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a.

References

1. Farinas E. T., Bulter T., Arnold F. H. Directed enzyme evolution. Curr Opin Biotech 12(6):545–551 (2001). 2. Khoury G. A., Baliban R. C., Floudas C. A. Proteome-wide posttranslational modification statistics: Frequency analysis and curation of the swiss-prot database. Sci Rep 1(90) (2011). 3. Floudas C. A. Research challenges, opportunities and synergism in systems engineering and computational biology. AIChE J 51(7):1872–1884 (2005). 4. Fung H. K., Welsh W. J., Floudas C. A. Computational de novo peptide and protein design: Rigid templates versus flexible templates. Ind Eng Chem Res 47(4):993–1001 (2008). 5. Su A., Mayo S. L. Coupling backbone flexibility and amino acid sequence selection in protein design. Protein Sci 6(8):1701–1707 (1997). 6. Desjarlais J. R., Handel T. M. Side-chain and backbone flexibility in protein core design. J Mol Biol 290(1):305–318 (1999). 7. Farinas E., Regan L. The de novo design of a rubredoxin-like Fe site. Protein Sci 7(9):1939–1946 (1998). 8. Harbury P. B., Plecs J. J., Tidor B., Alber T., Kim P. S. High-resolution protein design with backbone freedom. Science 282(5398):1462–1467 (1998). 9. Koehl P., Levitt M. De novo protein design. I. In search of stability and specificity. J Mol Biol 293(5):1161–1181 (1999). 10. Koehl P., Levitt M. De novo protein design. II. Plasticity in sequence space. J Mol Biol 293(5):1183–1193 (1999).
11. Kuhlman B., Dantas G., Ireton G. C., Varani G., Stoddard B. L., Baker D. Design of a novel globular protein fold with atomic-level accuracy.

Science 302(5649):1364–1368 (2003). 12. Klepeis J. L., Floudas C. A., Morikis D., Tsokos C. G., Argyropoulos E., Spruce L., Lambris J. D. Integrated structural, computational and experimental approach for lead optimization: Design of compstatin variants with improved activity. J Am Chem Soc 125(28):8422–8423 (2003). 13. Klepeis J. L., Floudas C. A., Morikis D., Tsokos C. G., Lambris J. D. Design of peptide analogs with improved activity using a novel de novo protein design approach. Ind Eng Chem Res 43(14):3817–3826 (2004). 14. Saraf M. C., Moore G. L., Goodey N. M., Cao V. Y., Benkovic S. J., Maranas C. D. Ipro: An iterative computational protein library redesign and optimization procedure. Biophys J 90(11):4167–4180 (2006). 15. Ponder J. W., Richards F. M. Tertiary templates for proteins: Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193(4):775–791 (1987). 16. Dahiyat B. I., Mayo S. L. Protein design automation. Protein Sci 5(5):895–903 (1996). 17. Dahiyat B. I., Gordon D. B., Mayo S. L. Automated design of the surface positions of protein helices. Protein Sci 6:1333–1337 (1997). 18. Pantazes R. J., Grisewood M. J., Maranas C. D. Recent advances in computational protein design. Curr Opin Struc Biol 21(4):467–472 (2011). 19. Whitehead T. A., Chevalier A., Song Y., Dreyfus C., Fleishman S. J., De Mattos C., Myers C. A., Kamisetty H., Blair P., Wilson I. A., Baker D. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30(6):543–548 (2012). 20. Khoury G. A., Fazelinia H., Chin J. W., Pantazes R. J., Cirino P. C., Maranas C. D. Computational design of candida boidinii xylose reductase for altered cofactor specificity. Protein Sci 18(10):2125–2138 (2009). 21. Pantazes R. J., Maranas C. D. Optcdr: a general computational method for the design of antibody complementarity determining regions for targeted epitope binding.
Protein Eng Des Sel 23(11):849–858 (2010). 22. Eiben C. B., Siegel J. B., Bale J. B., Cooper S., Khatib F., Shen B. W., Players F., Stoddard B. L., Popovic Z., Baker D. Increased diels-alderase activity through backbone remodeling guided by foldit

players. Nat Biotechnol 30(2):190–192 (2012). 23. Samish I., MacDermaid C. M., Perez-Aguilar J. M., Saven J. G. Theoretical and computational protein design. Annu Rev Phys Chem 62(1):129–149 (2011). 24. Fung H. K., Floudas C. A., Taylor M. S., Zhang L., Morikis D. Toward full-sequence de novo protein design with flexible templates for human beta-defensin-2. Biophys J 94(2):584–599 (2008). 25. Bellows M. L., Fung H. K., Floudas C. A., López de Victoria A., Morikis D. New compstatin variants through two de novo protein design frameworks. Biophys J 98(10):2337–2346 (2010). 26. López de Victoria A., Gorham Jr R. D., Bellows-Peterson M. L., Ling J., Lo D. D., Floudas C. A., Morikis D. A new generation of potent complement inhibitors of the compstatin family. Chem Biol Drug Des 77(6):431–440 (2011). 27. Tamamis P., López de Victoria A., Gorham R. D., Bellows-Peterson M. L., Pierou P., Floudas C. A., Morikis D., Archontis G. Molecular dynamics in drug design: New generations of compstatin analogs. Chem Biol Drug Des 79(5):703–718 (2012). 28. Bellows-Peterson M. L., Fung H. K., Floudas C. A., Kieslich C. A., Zhang L., Morikis D., Wareham K. J., Monk P. N., Hawksworth O. A., Woodruff T. M. De novo peptide design with c3a receptor agonist and antagonist activities: Theoretical predictions and experimental validation. J Med Chem 55(9):4159–4168 (2012). 29. Bellows M. L., Taylor M. S., Cole P. A., Shen L., Siliciano R. F., Fung H. K., Floudas C. A. Discovery of entry inhibitors for HIV-1 via a new de novo protein design framework. Biophys J 99(10):3445–3453 (2010). 30. Sun J.-J., Abdeljabbar D. M., Clarke N. L., Bellows M. L., Floudas C. A., Link A. J. Reconstitution and engineering of apoptotic protein interactions on the bacterial cell surface. J Mol Biol 394(2):297–305 (2009). 31. Smadbeck J., Bellows-Peterson M. L., Zee B. M., Garapaty S., Mago A., Giannis A., Trojer P., Garcia B. A., Floudas C. A.
De novo protein design and validation of histone methyltransferase inhibitors. (2012) in preparation. 32. Bellows M. L., Fung H. K., Floudas C. A. in Molecular Systems Engineering, Process Systems Engineering, eds Adjiman C. S., Galindo A. (Wiley-VCH Verlag GmbH & Co. KGaA) Vol. 6, pp 207–232 (2010). 33. Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H., Shindyalov I. N., Bourne P. E. The protein data bank. Nucleic Acids

Res 28(1):235–242 (2000). 34. Rajgaria R., McAllister S. R., Floudas C. A. A novel high resolution Cα-Cα distance dependent force field based on a high quality decoy set. Proteins 65(3):726–741 (2006). 35. Rajgaria R., McAllister S. R., Floudas C. A. Distance dependent centroid to centroid force fields using high resolution decoys. Proteins 70(3):950–970 (2008). 36. Fung H. K., Taylor M. S., Floudas C. A. Novel formulations for the sequence selection problem in de novo protein design with flexible templates. Optim Method Softw 22(1):51–71 (2007). 37. Fung H. K., Rao S., Floudas C. A., Prokopyev O., Pardalos P. M., Rendl F. Computational comparison studies of quadratic assignment like formulations for the in silico sequence selection problem in de novo protein design. J Comb Optim 10(1):41–60 (2005). 38. Klepeis J. L., Floudas C. A. Free energy calculations for peptides via deterministic global optimization. J Chem Phys 110(15):7491–7512 (1999). 39. Klepeis J. L., Floudas C. A., Morikis D., Lambris J. D. Predicting peptide structures using NMR data and deterministic global optimization. J Comput Chem 20(13):1354–1370 (1999). 40. Klepeis J. L., Schafroth H. D., Westerberg K. M., Floudas C. A. Deterministic global optimization and ab initio approaches for the structure prediction of polypeptides, dynamics of protein folding and protein-protein interactions. Adv Chem Phys 120:265–457 (2002). 41. Klepeis J. L., Floudas C. A. Ab initio prediction of helical segments of polypeptides. J Comput Chem 23(2):246–266 (2002). 42. Klepeis J. L., Floudas C. A. Prediction of beta-sheet topology and disulfide bridges in polypeptides. J Comput Chem 24(2):191–208 (2003). 43. Klepeis J. L., Floudas C. A. ASTRO-FOLD: A combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence. Biophys J 85(4):2119–2146 (2003). 44. Klepeis J. L., Pieja M. T., Floudas C. A.
A new class of hybrid global optimization algorithms for peptide structure prediction: Integrated hybrids. Comput Phys Commun 151:121–140 (2003). 45. Klepeis J. L., Pieja M. T., Floudas C. A. Hybrid global optimization algorithms for protein structure prediction: Alternating hybrids. Biophys J 84(2):869–882 (2003b). 46. Klepeis J. L., Floudas C. A. Analysis and prediction of loop segments

in protein structures. Comput Chem Eng 29(3):423–436 (2005). 47. Mönnigmann M., Floudas C. A. Protein loop structure prediction with flexible stem geometries. Proteins 61(4):748–762 (2005). 48. McAllister S. R., Mickus B. E., Klepeis J. L., Floudas C. A. Novel approach for alpha-helical topology prediction in globular proteins: Generation of interhelical restraints. Proteins 65(4):930–952 (2006). 49. Floudas C. A., Fung H. K., McAllister S. R., Mönnigmann M., Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 61(3):966–988 (2006). 50. Subramani A., Wei Y., Floudas C. A. Astro-fold 2.0: An enhanced framework for protein structure prediction. AIChE J 58(5):1619–1637 (2012). 51. Wei Y., Thompson J., Floudas C. A. Concord: a consensus method for protein secondary structure prediction via mixed integer linear optimization. P Roy Soc A-Math Phy 468(2139):831–850 (2011). 52. Subramani A., Floudas C. A. β-sheet topology prediction with high precision and recall for β and mixed α/β proteins. PLoS ONE 7(3):e32461 (2012). 53. Rajgaria R., Wei Y., Floudas C. A. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3d structure prediction method astro-fold. Proteins 78(8):1825–1846 (2010). 54. Subramani A., Floudas C. A. Structure prediction of loops with fixed and flexible stems. J Phys Chem B 116(23):6670–6682 (2012). 55. Güntert P., Mumenthaler C., Wüthrich K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol 273(1):283–298 (1997). 56. Güntert P. Automated NMR structure calculation with CYANA. Methods Mol Biol 278:353–378 (2004). 57. Ponder J. W. TINKER, software tools for molecular design. 1998 (Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine: St. Louis, MO.) (1998). 58. Cornell W. D., Cieplak P., Bayly C. I., Gould I. R., Merz K. M., Ferguson D.
M., Spellmeyer D. C., Fox T., Caldwell J. W., Kollman P. A. A 2nd generation force-field for the simulation of proteins, nucleic-acids, and organic-molecules. J Am Chem Soc 117(19):5179–5197 (1995). 59. Lilien R. H., Stevens B. W., Anderson A. C., Donald B. R. A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the gramicidin synthetase a phenylalanine adenylation enzyme. J Comput Biol 12(6):740–761 (2005). 60. Lee M. R., Baker D., Kollman P. A. 2.1 and 1.8 Å Cα RMSD structure predictions on two small proteins, HP-36 and S15. J Am Chem Soc 123(6):1040–1046 (2001). 61. Rohl C. A., Baker D. De novo determination of protein backbone structure from residual dipolar couplings using rosetta. J Am Chem Soc 124(11):2723–2729 (2002). 62. Rohl C. A., Strauss C. E. M., Misura K. M. S., Baker D. Protein structure prediction using rosetta. Method Enzymol 383:66–93 (2004). 63. DiMaggio P. A., McAllister S. R., Floudas C. A., Feng X. J., Rabinowitz J. D., Rabitz H. A. Biclustering via optimal re-ordering of data matrices in systems biology: Rigorous methods and comparative studies. BMC Bioinformatics 9(458) (2008). 64. DiMaggio P. A., McAllister S. R., Floudas C. A., Feng X. J., Rabinowitz J. D., Rabitz H. A. A network flow model for biclustering via optimal re-ordering of data matrices. J Global Optim 47(3):343–354 (2010). 65. Daily M. D., Masica D., Sivasubramanian A., Somarouthu S., Gray J. J. CAPRI rounds 3-5 reveal promising successes and future challenges for RosettaDock. Proteins 60(2):181–186 (2005). 66. Gray J. J., Moughon S. E., Kortemme T., Schueler-Furman O., Misura K. M. S., Morozov A. V., Baker D. Protein-protein docking predictions for the CAPRI experiment. Proteins 52(1):118–122 (2003). 67. Gray J. J., Moughon S., Wang C., Schueler-Furman O., Kuhlman B., Rohl C. A., Baker D. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol 331(1):281–299 (2003). 68. Kuhlman B., Baker D. Native protein sequences are close to optimal for their structures. P Natl Acad Sci USA 97(19):10383–10388 (2000). 69. Rose G. D., Creamer T. P. Protein folding: Predicting predicting. Proteins 19(1):1–3 (1994). 70. Dalal S., Balasubramanian S., Regan L. Protein alchemy: Changing β-sheet into α-helix. Nat Struct Biol 4(7):548–552 (1997). 71. Dalal S., Regan L. Understanding the sequence determinants of conformational switching using protein design. Protein Sci 9(9):1651–1659 (2000). 72. Alexander P. A., He Y., Chen Y., Orban J., Bryan P. N. A minimal sequence code for switching protein structure and function. P Natl Acad Sci USA 106(50):21149–21154 (2009).

73. He Y., Chen Y., Alexander P. A., Bryan P. N., Orban J. NMR structures of two designed proteins with high sequence identity but different fold and function. P Natl Acad Sci USA 105(38):14412–14417 (2008). 74. Pandya M. J., Cerasoli E., Joseph A., Stoneman R. G., Waite E., Woolfson D. N. Sequence and structural duality: Designing peptides to adopt two stable conformations. J Am Chem Soc 126(51):17016–17024 (2004). 75. Alexander P. A., Rozak D. A., Orban J., Bryan P. N. Directed evolution of highly homologous proteins with different folds by phage display: implications for the protein folding code. Biochemistry-US 44(43):14045–14054 (2005). 76. Alexander P. A., He Y., Chen Y., Orban J., Bryan P. N. The design and characterization of two proteins with 88% sequence identity but different structure and function. P Natl Acad Sci USA 104(29):11963–11968 (2007). 77. Cordes M. H. J., Walsh N. P., McKnight C. J., Sauer R. T. Evolution of a protein fold in vitro. Science 284(5412):325–327 (1999). 78. Anderson T. A., Cordes M. H. J., Sauer R. T. Sequence determinants of a conformational switch in a protein structure. P Natl Acad Sci USA 102(51):18344–18349 (2005). 79. Smadbeck J., Peterson M. B., Floudas C. A. De novo protein design of conformational switches with minimum number of mutations. (2012) in preparation. 80. Vorobjev Y. N., Vila J. A., Scheraga H. A. FAMBE-pH: A fast and accurate method to compute the total solvation free energies of proteins. J Phys Chem B 112(35):11122–11136 (2008). 81. Ren P., Ponder J. W. Consistent treatment of inter- and intramolecular polarization in molecular mechanics calculations. J Comput Chem 23(16):1497–1506 (2002). 82. Smadbeck J., Peterson M. B., Khoury G. A., Taylor M. S., Floudas C. A. Protein WISDOM: A Workbench for In Silico De novo design Of bioMolecules. (2012) in preparation.

May 9, 2013

11:32

BC: 8846 - BIOMAT 2012

11˙xia

MATHEMATICAL MODELS AND TECHNIQUES FOR BIOMOLECULAR GEOMETRIC ANALYSIS∗

K. L. XIA
Department of Mathematics, Michigan State University, MI 48824, USA
E-mail: [email protected]

F. XIN
Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
E-mail: [email protected]

Y. TONG†
Department of Computer Science and Engineering, Michigan State University, MI 48824, USA
E-mail: [email protected]

G. W. WEI‡
Department of Mathematics, Department of Biochemistry & Molecular Biology, Michigan State University, MI 48824, USA
E-mail: [email protected]

∗ This work was supported in part by NSF grants CCF-0936830, IIS-0953096 and DMS-1160352, and NIH grant R01GM-090208. † Corresponding author. ‡ Corresponding author.

The understanding of biomolecular structure, function and dynamics depends on theoretical models established based on experimental observations. Geometric modeling interprets and translates experimental data into three-dimensional (3D) shapes and gives rise to geometric characterization of macromolecules. In this work, we explore geometric flow approaches for biomolecular surface construction and surface characterization. Specifically, we study surface generation using the variational principle to produce minimal molecular surfaces (MMS) and macromolecular surfaces that represent the minimization of both polar and nonpolar solvation free energies. We discuss the use of high order geometric flows for electron microscopy (EM) data processing, and propose a variety of new potential and curvature driven evolution equations based on general functions of principal curvatures. These new equations can be used for both surface construction and EM data analysis. Finally, with biomolecular surfaces generated from geometric flow approaches, which are free from geometric singularities and local-scale fluctuations, surface characterization is carried out by curvatures, such as Gaussian and mean curvatures, as well as minimum and maximum principal curvatures. The present curvature estimation, together with the electrostatic potential obtained from the Poisson-Boltzmann equation in the present variational analysis, has great potential for applications in drug design and protein design.

1. Introduction

Both biophysical modeling and geometric modeling are important for practical analysis of biomolecules. Biophysical modeling includes charge distribution, density distribution, electrostatic potential, flexibility, mobility, etc., while geometric modeling refers to surface generation and/or extraction, shape representation and characterization, curvature evaluation, mesh generation, etc. Geometric modeling often provides the basis for biophysical modeling. The geometric and topological features of biomolecules are closely associated with their functions. For example, protein surface curvature modulates the hydrophobic effect in protein folding; enzymes can differentiate functional groups based on shape recognition; membrane bending curvatures recruit certain proteins; and cloverleaf-shape RNAs and hairpin-shape RNAs can be differentiated ribozymes. Additionally, in ligand-protein and protein-protein interactions, binding mode and binding affinity are significantly affected by geometric complementarity. Geometric traits and features are widely used in drug design and protein design. For proteins, potential ligand binding sites are wells, caves, bowls, clefts, valleys, etc. on the protein surfaces. A common feature of these sites is their concave regions, or pockets. The thermodynamic properties of drug binding sites are often dictated by the geometric features of a binding pocket, such as pocket volume, pocket area, pocket mouth size and pocket depth 1. Molecular topology also determines molecular functions. For example, the genus number can be used as an index for the open and closed states of an ion channel. A major source of geometric and topological features of biomolecules is experimental measurements, such as macromolecular X-ray crystallography, NMR, cryo-electron microscopy (cryo-EM), multiangle light scattering, confocal laser-scanning microscopy, small angle scattering, ultrafast laser spectroscopy, etc., which have been continuously advanced in the past few decades to produce biomolecular structures with atomic resolution. Based on these experimental data, protein surfaces are constructed according to certain theoretical models and widely used in virtual screening and structure-based drug design.

Figure 1. Illustration of a geometric singularity on the solvent excluded surface of protein 2CND.

Protein shapes and surfaces are determined by their theoretical models. Based on the atom and bond model of molecules by Corey and Pauling 2, the van der Waals surface, the solvent-excluded surface (SES) (also known as the molecular surface), and the solvent-accessible surface have been proposed 3,4. These models give rise to molecular shapes and surfaces from input atomic coordinates, which can be obtained either from the Protein Data Bank (PDB), one of the major protein structure sources, or from simulations. However, these surfaces often have many defects. For instance, they are not smooth and contain geometric singularities 5,6,7, i.e., cusps, tips, and self-intersecting areas; see Fig. 1. Consequently, they are not differentiable, which leads to difficulties in the application of these models for computations. In particular, the calculation of curvatures is not straightforward, and excessive trivial local information is involved in these

models 8. Additionally, these surface models do not satisfy the physical laws of surface energy minimization, as these definitions are simply ad hoc divisions of biomolecules and their surroundings. To remove geometric singularities in biomolecules, curvature driven evolution partial differential equations (PDEs), or geometric flows, were first introduced to construct singularity free surfaces in 2005 9. This approach was embedded in a variational framework to minimize the molecular surface free energy, which results in the minimal molecular surface (MMS), the first variational surface definition for biomolecules 10,11. Similar geometric flow approaches were used to smooth Gaussian molecular surfaces 12. The geometric flow approach to macromolecular surfaces was devised to construct a variety of variational multiscale models 13. A special example is the class of differential geometry based solvation models, which consist of two essential components: nonpolar and polar terms 13,14,15. In the nonpolar term, one takes into consideration the surface free energy, the mechanical work of cavity creation, and the solvent-solute interactions. The electrostatic potential in the polar term is described by the Poisson-Boltzmann equation 16. Another major source of protein structures is the electron microscopy data bank (EMDataBank), which collects the data from cryo-EM. Three-dimensional (3D) structures are built from projection images collected by illuminating the sample, which can be in different functional or chemical states, from different directions. Unlike the protein data in the PDB, the data in cryo-EM are in a volumetric format, and usually suffer from a low signal-to-noise ratio (SNR). Laplace-Beltrami operators have been shown to perform well in improving the SNR of electron microscopy data (EMD) 17. In this paper, the high order geometric flow method 18,19 is applied to edge-preserved noise reduction of EMD.
Apart from surface construction, surface characterization is also a major task in geometric modeling. The evaluation of Gaussian curvature and mean curvature is important for drug and protein design, as discussed above. Based on the signs of Gaussian curvature and mean curvature, the surface geometry of a protein can be classified to indicate biological and biophysical properties and relevance. Additionally, based on Gaussian and mean curvatures, two principal curvatures can be obtained. As principal curvatures are the building blocks for other curvature functions, such as Gaussian curvature, mean curvature, shape index and curvedness, one can construct more functions for practical applications. The objective of the present work is to present a general framework for some aspects of biomolecular geometric modeling, including surface generation and curvature estimation. We discuss protein surface construction from PDB information sources using geometric flow approaches based on the variational principle. We also generate biomolecular surfaces from EMD information sources using high order geometric flows. We characterize the protein surface geometry as eight distinct types and calculate the related surface curvatures. These basic geometric features have potential for active site prediction. As our protein surfaces are free from geometric singularities and local fluctuations, binding site information can be evaluated in a natural manner. We discuss the construction of generalized principal curvature flows for surface construction and image processing. The rest of the paper is organized as follows. Section 2 is devoted to variational methods for biomolecular surface construction. The MMS generation and PB based solvation model are presented from a geometric modeling perspective. In Section 3, computational algorithms and techniques for surface generation, noise reduction and curvature evaluation are presented. Applications of these approaches are demonstrated with realistic examples. An extended discussion of geometric flows is given in Section 4. We present many novel principal curvature flows. This paper ends with a conclusion.

2. Theory and models

Minimal surfaces are common phenomena in nature. They are associated with surface energy minimization and thus attain stability against environmental perturbations. Geometric singularities occurring in molecular surfaces are unphysical and problematic in computational modeling. In this work, we discuss the MMS model, which is free of geometric singularities. Further, the Poisson-Boltzmann equation is coupled with our geometric flow equation to achieve the solvation free energy minimization of both polar and nonpolar components.
Electrostatic potential, an important quantity in biophysical modeling, is also obtained by solving the Poisson-Boltzmann equation.

2.1. Minimal molecular surface model

The starting point for MMS generation is to construct the free energy functional. Then, the governing equation of the surface is obtained through a variational procedure. To this end, a characteristic function S is defined, with value S representing the protein domain and 1 − S the environmental region. From Fig. 2, it is seen that there are overlapping regions between the protein and solvent surroundings. Thus two surface interpretations,

sharp interface and smeared interface, can be employed. The sharp interface, obtained by extracting an isovalue between 0 and the initial value $S_0$, is more suitable for surface visualization. The smeared surface, in contrast, is defined as the domain where the value of S goes from 0 to $S_0$. This continuous surface definition is preferred in Cartesian representation based modeling and computation. In a 3D representation, a mean surface area can be defined through the coarea formula 13:

$$\mathrm{Area} = \int_0^1 \int_{S^{-1}(c)\,\cap\,\Omega} d\sigma\, dc = \int_\Omega |\nabla S(\mathbf{r})|\, d\mathbf{r}, \qquad \mathbf{r} \in \mathbb{R}^3. \tag{1}$$

Figure 2. Characteristic function S in a 1D setting.

Here Ω is the whole computational domain. The surface free energy is given by

$$G_{\mathrm{surface}} = \gamma\,\mathrm{Area} = \int_\Omega \gamma\,|\nabla S|\, d\mathbf{r}, \tag{2}$$

where γ is the surface tension. By using the Euler-Lagrange equation, the variation with respect to S leads to the vanishing mean curvature

$$\nabla \cdot \left( \frac{\nabla S}{|\nabla S|} \right) = 0.$$

By means of an artificial time, we convert this elliptic equation into a parabolic equation

$$\frac{\partial S}{\partial t} = |\nabla S|\, \nabla \cdot \left( \frac{\nabla S}{|\nabla S|} \right). \tag{3}$$

This is the mean curvature flow equation. The steady state solution of Eq. (3) contains the minimal molecular surface information we need. To avoid confusion, this equation is solved under some biological constraints, that is,

in the iteration process, one needs to keep some inner volume unchanged. We will discuss computational details in Sec. 3.1.

2.1.1. PB based surface construction

The solvation process involves a variety of interactions. The solvation energy can usually be divided into two parts, a nonpolar one and a polar one. The polar energy is associated with electrostatic interactions. When there are charged ions in the solvent environment, their concentration distributions can be approximately described by the Boltzmann distribution

$$\rho_\alpha = \rho_{\alpha 0}\, e^{-\frac{q_\alpha \Phi}{k_B T}}, \tag{4}$$

where $k_B$ is the Boltzmann constant and T is the temperature. The terms $\rho_\alpha$ and $\rho_{\alpha 0}$ denote the concentration and reference bulk concentration of the αth charge species, respectively. Here $q_\alpha$ denotes the valence of the αth charge species. The use of the Boltzmann distribution gives a continuum description of the solvent domain. The polar solvation free energy can be represented as

$$G_{\mathrm{polar}} = \int_\Omega \left\{ S \left[ -\frac{\epsilon_m}{2} |\nabla \Phi|^2 + \Phi\, \rho_m \right] + (1 - S) \left[ -\frac{\epsilon_s}{2} |\nabla \Phi|^2 - k_B T \sum_\alpha \rho_{\alpha 0} \left( e^{-\frac{q_\alpha \Phi}{k_B T}} - 1 \right) \right] \right\} d\mathbf{r}, \tag{5}$$

where Φ is the electrostatic potential, the coefficients $\epsilon_s$ and $\epsilon_m$ are the dielectric constants of the solvent and solute, respectively, and $\rho_m$ represents the fixed charge density of the solute, which takes a discrete, atomistic description. For the nonpolar energy part, the free energy functional can be summarized as

$$G_{\mathrm{nonpolar}} = \gamma\,\mathrm{Area} + p\,\mathrm{Vol} + \int_{\Omega_s} \sum_\alpha \rho_\alpha U_\alpha\, d\mathbf{r}, \qquad \mathbf{r} \in \mathbb{R}^3, \tag{6}$$

where p is the hydrodynamic pressure, and $U_\alpha$ denotes the solvent-solute non-electrostatic interactions. The solvent domain is denoted as $\Omega_s$. Based on the characteristic function S defined in Sec. 2.1, the nonpolar energy expression can be rewritten as

$$G_{\mathrm{nonpolar}} = \int_\Omega \left\{ \gamma |\nabla S| + p S + (1 - S) \sum_\alpha \rho_\alpha U_\alpha \right\} d\mathbf{r}. \tag{7}$$

The |∇S| term is due to the coarea formula for the surface area and 1 − S characterizes the domain of the solvent.
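As a rough numerical illustration of how the |∇S| and S terms stand in for surface area and volume, the sketch below samples a smeared characteristic function of a single ball on a uniform grid and compares the integrals of |∇S| and S with the exact sphere area and ball volume. The tanh profile, the grid size and the smearing width are assumptions chosen only for this illustration (with S ranging from 0 to 1); they are not taken from the paper.

```python
import numpy as np

def smeared_ball(n=96, half_width=2.0, radius=1.0, smear=0.1):
    """Sample a smooth characteristic function S of a ball on an n^3 grid.

    The tanh transition profile is an illustrative assumption: S is close to 1
    inside the ball, close to 0 outside, and varies smoothly across the surface.
    """
    axis = np.linspace(-half_width, half_width, n)
    h = axis[1] - axis[0]
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    dist = np.sqrt(x**2 + y**2 + z**2)
    S = 0.5 * (1.0 - np.tanh((dist - radius) / smear))
    return S, h

def surface_area_and_volume(S, h):
    """Approximate the integrals of |grad S| (area) and S (volume) on the grid."""
    gx, gy, gz = np.gradient(S, h)           # central differences in the interior
    grad_norm = np.sqrt(gx**2 + gy**2 + gz**2)
    area = grad_norm.sum() * h**3            # coarea formula: integral of |grad S|
    volume = S.sum() * h**3                  # integral of S over the domain
    return area, volume

S, h = smeared_ball()
area, volume = surface_area_and_volume(S, h)
print("area estimate:", area, "exact:", 4.0 * np.pi)
print("volume estimate:", volume, "exact:", 4.0 * np.pi / 3.0)
```

The estimates should approach the exact values as the smearing width shrinks and the grid is refined; with a finite width, both integrals carry a small positive bias.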

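The total energy functional can be obtained by adding the polar and nonpolar parts together,

$$G_{\mathrm{total}}[S, \Phi] = \int_\Omega \left\{ \gamma |\nabla S| + p S + S \left[ -\frac{\epsilon_m}{2} |\nabla \Phi|^2 + \Phi\, \rho_m \right] + (1 - S) \left[ -\frac{\epsilon_s}{2} |\nabla \Phi|^2 - k_B T \sum_\alpha \rho_{\alpha 0} \left( e^{-\frac{q_\alpha \Phi}{k_B T}} \left( 1 - \frac{U_\alpha}{k_B T} \right) - 1 \right) \right] \right\} d\mathbf{r}. \tag{8}$$

Taking the variation with respect to the electrostatic potential Φ and the surface characteristic function S via the Euler-Lagrange equation, we derive two equations,

$$\frac{\delta G_{\mathrm{total}}}{\delta \Phi} \Rightarrow \nabla \cdot \left( [(1 - S)\epsilon_s + S \epsilon_m] \nabla \Phi \right) + S \rho_m + (1 - S) \sum_\alpha q_\alpha \rho_{\alpha 0}\, e^{-\frac{q_\alpha \Phi}{k_B T}} \left( 1 - \frac{U_\alpha}{k_B T} \right) = 0, \tag{9}$$

and

$$\frac{\delta G_{\mathrm{total}}}{\delta S} \Rightarrow -\nabla \cdot \left( \gamma \frac{\nabla S}{|\nabla S|} \right) + p - \frac{\epsilon_m}{2} |\nabla \Phi|^2 + \Phi\, \rho_m + \frac{\epsilon_s}{2} |\nabla \Phi|^2 + k_B T \sum_\alpha \rho_{\alpha 0} \left( e^{-\frac{q_\alpha \Phi}{k_B T}} \left( 1 - \frac{U_\alpha}{k_B T} \right) - 1 \right) = 0. \tag{10}$$

From Eq. (9), we define the generalized permittivity function as

$$\epsilon(S) = (1 - S)\epsilon_s + S \epsilon_m. \tag{11}$$

The generalized Poisson-Boltzmann equation is obtained as

$$-\nabla \cdot \left( \epsilon(S) \nabla \Phi \right) = S \rho_m + (1 - S) \sum_\alpha q_\alpha \rho_{\alpha 0}\, e^{-\frac{q_\alpha \Phi}{k_B T}} \left( 1 - \frac{U_\alpha}{k_B T} \right). \tag{12}$$

Through the introduction of an artificial time 18,14,15, the generalized mean curvature flow equation is obtained,

$$\frac{\partial S}{\partial t} = |\nabla S| \left[ \nabla \cdot \left( \gamma \frac{\nabla S}{|\nabla S|} \right) + V \right], \tag{13}$$

where the potential driven term V is given by

$$V = -p + \frac{\epsilon_m}{2} |\nabla \Phi|^2 - \Phi\, \rho_m - \frac{\epsilon_s}{2} |\nabla \Phi|^2 - k_B T \sum_\alpha \rho_{\alpha 0} \left( e^{-\frac{q_\alpha \Phi}{k_B T}} \left( 1 - \frac{U_\alpha}{k_B T} \right) - 1 \right). \tag{14}$$

Here the surface is generated under the geometric and potential driven flow, which can incorporate other related interactions if the biomolecular system is in a non-equilibrium state 20. The generalized PB equation and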


the generalized Laplace-Beltrami equation in Eqs. 12 and 13 describe the biomolecular surface and electrostatic potential in the solvation process. These coupled equations are solved iteratively to ensure the convergence of the solution.

3. Computational algorithms

In this section, we discuss computational algorithms and techniques for the generation and extraction of protein surfaces from two major protein data sources, the PDB and the EMDataBank. Based on the resulting protein surfaces, the geometric and topological features are evaluated and potential applications are briefly discussed.

3.1. Surface generation from PDB

The data downloaded from the PDB can be processed to deliver some basic atomic information about the protein, such as the total number of atoms, atom types, atom radii, and atom coordinates. We define the atom positions as $v_i = (v_{i,x}, v_{i,y}, v_{i,z})$, $i = 1, \cdots, n$, with $n$ the total number of atoms in the protein and $r_i$ the radius of atom $i$. The initial condition can then be given as

$$S(x, y, z, 0) = \begin{cases} S_0, & (x, y, z) \in D \\ 0, & \text{otherwise,} \end{cases} \qquad (15)$$

$$D = \bigcup_{i=1}^{n} \left\{ v : |v - v_i| < a r_i \right\}. \qquad (16)$$

The coefficient $a$ here is used to build up an inflated volume. It can be adjusted to generate surfaces of different scales, and $a = 1.3$ is used in the present work. The surface characteristic function $S$ is iterated under biological constraints,

$$\chi(x, y, z) = \begin{cases} 1, & (x, y, z) \in D_\chi \\ 0, & \text{otherwise,} \end{cases} \qquad (17)$$

$$D_\chi = \bigcup_{i=1}^{n} \left\{ v : |v - v_i| < r_i \right\}, \qquad (18)$$

where the function $\chi(x, y, z)$ is used to protect the volume enclosed by the van der Waals surface. Under these conditions, we can discretize Eq. (3). The minimal molecular surface (MMS) can be extracted as the level set of $S$ at a chosen isovalue between 0 and $S_0$. Fig. 3 demonstrates the resulting MMS. Compared with the solvent excluded surface, the MMS is much smoother and free of geometric singularities and undesirable local variations, which makes it a preferred form for geometric characterization.
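As a rough illustration of Eqs. (15)-(18), the following sketch samples the initial characteristic function and the van der Waals constraint on a uniform Cartesian grid. The function name `initial_fields`, the grid bounds, the spacing `h` and the two-atom toy input are assumptions made for this example only; the value a = 1.3 is the one quoted in the text.

```python
import numpy as np

def initial_fields(centers, radii, grid_min, grid_max, h, a=1.3, s0=1.0):
    """Return (S_init, chi) sampled on a uniform grid.

    S_init follows Eqs. (15)-(16): s0 inside the union of balls of radius a*r_i.
    chi follows Eqs. (17)-(18): 1 inside the union of van der Waals balls.
    """
    # Grid axes; the half-step margin makes the upper bound inclusive.
    axes = [np.arange(lo, hi + 0.5 * h, h) for lo, hi in zip(grid_min, grid_max)]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    inflated = np.zeros(X.shape, dtype=bool)
    vdw = np.zeros(X.shape, dtype=bool)
    for (cx, cy, cz), r in zip(centers, radii):
        dist2 = (X - cx) ** 2 + (Y - cy) ** 2 + (Z - cz) ** 2
        inflated |= dist2 < (a * r) ** 2   # domain D of Eq. (16)
        vdw |= dist2 < r ** 2              # domain D_chi of Eq. (18)
    return np.where(inflated, s0, 0.0), vdw.astype(float)

# Hypothetical two-atom "molecule" (coordinates and radii are made up).
centers = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
radii = [1.0, 1.2]
S_init, chi = initial_fields(centers, radii, (-4, -4, -4), (5, 4, 4), h=0.5)
```

In an actual computation the atom centers and radii would come from the processed PDB data, and the constraint `chi` would be enforced at every iteration of the flow.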


Figure 3. Comparison of geometric and electrostatic potential properties of protein 2CND. (a) Solvent excluded surface; (b) MMS; (c) Surface electrostatic potential on the solvent excluded surface; (d) Surface electrostatic potential on the MMS.

When there is an external potential, the surface formation and evolution is under the control of a geometric and potential driven flow. In particular, the electrostatic potential is usually described by the PB equation. The resulting governing equations are the coupled PB equation and the generalized Laplace-Beltrami equation. Two approaches are available for solving the coupled system 14,15 . One is to update the characteristic function S and


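electrostatic potential $\phi$ by a linear combination of the previous ones and the new ones,

$$S = \alpha S_{\rm new} + (1-\alpha) S_{\rm old}, \quad 0 < \alpha < 1; \qquad (19)$$

$$\phi = \alpha' \phi_{\rm new} + (1-\alpha') \phi_{\rm old}, \quad 0 < \alpha' < 1, \qquad (20)$$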

where $S_{\rm new}$ and $\phi_{\rm new}$ are the newly solved surface function and electrostatic potential, respectively, and $S_{\rm old}$ and $\phi_{\rm old}$ are the previous ones. This approach is more stable. The other approach takes into consideration the different convergence rates of the electrostatic potential and the surface function by refreshing the surface function more frequently. Usually, the PB equation is not updated until a number of initial iterations (e.g., 10 to 100 steps) of the generalized Laplace-Beltrami equation have been carried out. The resulting electrostatic potential can be projected onto the protein surface to analyze biomolecular functions and interactions, as shown in Fig. 3.

3.2. Surface generation from EMD

Figure 4. Noise reduction by geometric flows for EMD1275.

Data from the EMDataBank have a low signal-to-noise ratio (SNR). Therefore, a preprocessing stage for the surface data is almost mandatory. To this end, there are many algorithms and techniques in the literature, including wavelet transform techniques, bilateral filtering, and iterative median filtering 21,22,17,23,24,25 . Geometric flows are often applied to image and surface processing.


Perona and Malik’s anisotropic diffusion equation 26 , Osher and Sethian’s level set method 27 , and Willmore flow 28,29 are often used. The PDE approach to image analysis was pioneered by Witkin in 1984 30 . Perona and Malik used a nonlinear formulation to make it edge-preserving during noise reduction. The first family of arbitrarily high order nonlinear evolution PDEs was proposed by Wei for image processing in 1999 31 . High order geometric flows can more efficiently suppress the high-frequency components, as shown in recent work on protein surface analysis 19 . A specific form of arbitrarily high order geometric PDEs is given by 18

$$\frac{\partial S}{\partial t} = (-1)^q \sqrt{g(|\nabla\nabla^{2q} S|)}\; \nabla\cdot\left(\frac{\nabla(\nabla^{2q} S)}{\sqrt{g(|\nabla\nabla^{2q} S|)}}\right) + P(S, |\nabla S|), \qquad (21)$$

where $g(|\nabla\nabla^{2q} S|) = 1 + |\nabla\nabla^{2q} S|^2$ is the generalized Gram determinant. When $q = 0$, we arrive at the potential and curvature driven flow

$$\frac{\partial S}{\partial t} = |\nabla S|\, \nabla\cdot\left(\frac{\nabla S}{|\nabla S|}\right) + P(S, \nabla S). \qquad (22)$$
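To make the structure of Eq. (22) concrete, the sketch below performs a single explicit Euler step of the curvature driven flow with the potential term P set to zero. It is only a minimal illustration under assumed choices (central differences via `numpy.gradient`, a small regularization `eps` of |∇S|, a Gaussian test field); it is not the ADI scheme referred to in the text, and an explicit step of this kind requires a small time step to remain stable.

```python
import numpy as np

def curvature_flow_step(S, h, dt, eps=1e-8):
    """One explicit Euler step of dS/dt = |grad S| * div(grad S / |grad S|), P = 0."""
    gx, gy, gz = np.gradient(S, h)
    # eps keeps the normalization finite where the gradient vanishes.
    norm = np.sqrt(gx**2 + gy**2 + gz**2 + eps)
    div = (np.gradient(gx / norm, h, axis=0)
           + np.gradient(gy / norm, h, axis=1)
           + np.gradient(gz / norm, h, axis=2))
    return S + dt * norm * div

# Hypothetical test field: a smooth radially symmetric bump on [-3, 3]^3.
ax = np.arange(-3.0, 3.0 + 0.125, 0.25)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
S0 = np.exp(-(X**2 + Y**2 + Z**2))
S1 = curvature_flow_step(S0, h=0.25, dt=0.01)
```

For this test field the level sets are spheres, so the flow shrinks them and S decreases away from the center, which gives a quick sanity check of the sign conventions.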

Usually, alternating direction implicit (ADI) schemes are needed for solving high order PDEs for stability reasons 18 . The noise reduction effect of the high order geometric flow equation is demonstrated in Fig. 4 for EMD1275.

3.3. Geometric features

Figure 5. Illustration of surface types. From top left to bottom right: pit, valley, saddle valley, flat, minimal surface, saddle ridge, ridge and peak.

It is well known that geometric features such as concavity or convexity are of significant importance in predicting the potentially active


sites. Concave regions on protein surfaces are likely binding sites for small drugs 1 . Therefore, the classification of protein surfaces based on geometric and topological features can facilitate drug screening in drug design. Since protein surfaces generated with geometric flows are smooth and have fewer local variations (see, for example, Fig. 3), curvatures can be directly evaluated without numerical difficulties 8 . Based on the signs of the Gaussian curvature K and the mean curvature H, there are eight different surface types: pit, valley, saddle valley, flat, minimal surface, saddle ridge, ridge and peak 32 . These features are illustrated in Fig. 5 and listed in Table 1.

Table 1. Eight surface types in Fig. 5.

         H > 0          H = 0             H < 0
K > 0    peak           none              pit
K = 0    ridge          flat              valley
K < 0    saddle ridge   minimal surface   saddle valley

[...]

(ii) If R(1 − d) > 1 then the branching process is super-critical. That is, with positive probability, the virus population survives and grows indefinitely at an exponential rate proportional to $m^n$ when n → ∞.

(iii) If R(1 − d) = 1 then the branching process is critical. That is, with probability 1, the virus population becomes extinct, but this may take an infinite time to happen.

Proof. This is a straightforward consequence of equation (7) and of the classification of multitype branching processes as generalized by Sevastyanov28,32, a generalization that is necessary to include the simple phenotypic model.

Theorem 3.1 provides a partition of the parameter space of the simple phenotypic model {(d, R) : d ∈ [0, 1], R ∈ N} into two regions (see Figure 2): the survival region, defined by R > 1/(1 − d), and the extinction region, defined by R < 1/(1 − d). The curve R = 1/(1 − d) gives the extinction threshold.

[Figure 2: plot of R (vertical axis, 0 to 10) against d (horizontal axis, 0 to 1), with the two areas labelled "Survival Region" and "Extinction Region".]

Figure 2. Graph of the function R = 1/(1−d). The region below this curve corresponds to the sub-critical parameters (d, R) and the region above this curve corresponds to the super-critical parameters (d, R). The curve itself corresponds to the critical parameters (d, R).
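The partition of the parameter space can be expressed in a few lines. The helper names below (`malthusian_parameter`, `critical_d`, `regime`) and the floating-point tolerance are illustrative choices, not notation from the text.

```python
def malthusian_parameter(R, d):
    """Malthusian parameter m = R(1 - d) of the simple phenotypic model."""
    return R * (1.0 - d)

def critical_d(R):
    """Critical deleterious probability d_c = (R - 1)/R, where R(1 - d) = 1."""
    return (R - 1) / R

def regime(R, d, tol=1e-12):
    """Classify the parameter pair (d, R) according to the value of R(1 - d)."""
    m = malthusian_parameter(R, d)
    if abs(m - 1.0) <= tol:
        return "critical"
    return "super-critical" if m > 1.0 else "sub-critical"
```

For example, with R = 4 the threshold lies at d = 0.75: smaller values of d fall in the survival region and larger values in the extinction region.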


It is also important, especially in order to describe the asymptotic behaviour in the super-critical case, to know the left eigenvector v and the right eigenvector u corresponding to the eigenvalue $\lambda_R$. Let us write the left and right eigenvectors in components as $v = (v_0, v_1, \ldots, v_R)$ and $u = (u_0, u_1, \ldots, u_R)$ and assume that they are normalized in the following way:

$$v^t u = 1 \quad \text{and} \quad \mathbf{1}^t u = 1.$$

Then in the version “with zero class” the left eigenvector v is given by

$$v = \frac{1}{(1-d)^R}\,(0, \ldots, 0, 1)$$

and the right eigenvector u has components $u_k$ given by

$$u_k = \binom{R}{k}(1-d)^k d^{R-k} = \mathrm{binom}(k; R, 1-d).$$

It is interesting to note that the simple phenotypic model is a “completely solvable” branching process in the sense that we may explicitly solve the spectral problem for its mean matrix independently of the numerical values of the parameters. Next we turn to the computation of the extinction probabilities $\gamma_r$. In this case, it is necessary to solve a non-linear system of polynomial equations:

$$\begin{aligned} z_0 &= 1 \\ z_1 &= d z_0 + (1-d) z_1 \\ z_2 &= \left(d z_1 + (1-d) z_2\right)^2 \\ &\;\;\vdots \\ z_R &= \left(d z_{R-1} + (1-d) z_R\right)^R \end{aligned} \qquad (8)$$

This may be done in a recursive way, since the equation for $z_0$ is already solved ($z_0 = 1$) and the equation for $z_k$ depends only on $z_k$ and $z_{k-1}$. Thus we get for R = 0, 1, 2:

$$\gamma_0 = 1, \qquad \gamma_1 = 1, \qquad \gamma_2 = \begin{cases} d^2/(1-d)^2 & \text{for } 0 \le d \le \tfrac{1}{2} \\ 1 & \text{for } \tfrac{1}{2} \le d \le 1 \end{cases}$$
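The case R = 2 can be worked out by hand, which shows where the two branches of $\gamma_2$ come from. Since $z_1 = 1$, the equation for $z_2$ in (8) reduces to a quadratic. The factorization below is a check added here, using the standard fact that the extinction probability is the smallest non-negative root:

```latex
\begin{align*}
z_2 &= \bigl(d + (1-d)\,z_2\bigr)^2 \\
0 &= (1-d)^2 z_2^2 - \bigl(d^2 + (1-d)^2\bigr) z_2 + d^2 \\
  &= (z_2 - 1)\bigl((1-d)^2 z_2 - d^2\bigr) \\
z_2 &\in \Bigl\{\,1,\ \tfrac{d^2}{(1-d)^2}\Bigr\},
\qquad \tfrac{d^2}{(1-d)^2} \le 1 \iff d \le \tfrac12 .
\end{align*}
```

The second line uses the identity $1 - 2d(1-d) = d^2 + (1-d)^2$.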


When R ≥ 3 the formulas become very complicated and when R ≥ 5 the equation may not even be solvable by radicals, but in general one may write

$$\gamma_r = \begin{cases} f(d) & \text{for } 0 \le d \le d_c \\ 1 & \text{for } d_c < d \le 1 \end{cases}$$

where $d_c = \frac{r-1}{r}$ and f(d) is a strictly increasing smooth function on [0, 1[ satisfying: (i) $f(0) = 0$, (ii) $f(d_c) = 1$, (iii) $f(d) < 1$ for $0 \le d < d_c$ and (iv) $\lim_{d\to 1} f(d) = +\infty$. This expression suggests that the survival probabilities $\omega_r = 1 - \gamma_r$ can be interpreted as an order parameter associated with the occurrence of a phase transition when the deleterious probability d attains the critical point $d_c = \frac{r-1}{r}$, which marks the transition from super-criticality to sub-criticality,

$$\omega_r = \begin{cases} g(d) & \text{for } 0 \le d \le d_c \\ 0 & \text{for } d_c < d \le 1 \end{cases}$$

where $g(d) = 1 - f(d)$ and thus satisfies: (i) $g(0) = 1$, (ii) $g(d_c) = 0$, (iii) $g(d) > 0$ for $0 \le d < d_c$ and (iv) $\lim_{d\to 1} g(d) = -\infty$. Observe that for a fixed numerical value of d, the system of equations (8) can be easily solved by numerical approximation using Newton’s method. For instance, in Figure 3 we show some curves for the survival probability $\omega_r(d)$.
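The recursive structure of system (8) can be sketched numerically as follows. Instead of the Newton's method mentioned above, this sketch uses plain fixed-point iteration started from zero, which for a probability generating function increases monotonically to the smallest non-negative root, that is, to the extinction probability. The function name, tolerance and iteration cap are assumptions of the example, and convergence becomes slow close to the critical point.

```python
def extinction_probabilities(R, d, tol=1e-12, max_iter=1_000_000):
    """Return [gamma_0, ..., gamma_R] for system (8), solved class by class."""
    gammas = [1.0]                       # z_0 = 1
    for k in range(1, R + 1):
        z_prev = gammas[k - 1]           # already computed gamma_{k-1}
        z = 0.0                          # start below the smallest root
        for _ in range(max_iter):
            z_next = (d * z_prev + (1.0 - d) * z) ** k
            converged = abs(z_next - z) < tol
            z = z_next
            if converged:
                break
        gammas.append(z)
    return gammas

def survival_probabilities(R, d):
    """omega_r = 1 - gamma_r for r = 0, ..., R."""
    return [1.0 - g for g in extinction_probabilities(R, d)]
```

With d = 0.25 the third entry of `extinction_probabilities(3, 0.25)` agrees with the closed form $\gamma_2 = d^2/(1-d)^2 = 1/9$ up to the iteration tolerance.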

Figure 3. Curves for the surviving probability ωr as function of d for r = 2, . . . , 7.


The result shows that, with respect to $\omega_r$, the model has a critical behavior very similar to a second order phase transition (Figure 3). Therefore, the critical properties of the model can be characterized by means of relevant critical exponents. Finally, it is not difficult to see that for fixed d, the numbers $\gamma_r$ satisfy $1 \ge \gamma_2 \ge \gamma_3 \ge \ldots \ge \gamma_R$ and therefore the extinction probability for a general initial condition $Z_0 = (Z_0^0, \ldots, Z_0^R)$ may be estimated far from the critical deleterious probability $d_c = (R-1)/R$ by

$$P(Z_n = 0 \text{ for some } n \mid Z_0) \approx \gamma_2^{|Z_0'|}, \qquad (9)$$

where $|Z_0'| = Z_0^2 + \ldots + Z_0^R$, and near $d_c = (R-1)/R$ by

$$P(Z_n = 0 \text{ for some } n \mid Z_0) \approx \gamma_R^{Z_0^R}. \qquad (10)$$
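Estimate (9) makes the dependence on the initial population size explicit. The sketch below evaluates it using the closed form of $\gamma_2$ given earlier; the function names and the example population vector are hypothetical.

```python
def gamma_2(d):
    """Closed-form extinction probability of class 2."""
    if d >= 0.5:
        return 1.0
    return (d / (1.0 - d)) ** 2

def extinction_far_from_critical(z0, d):
    """Estimate (9): gamma_2 ** |Z0'|, with |Z0'| = Z0^2 + ... + Z0^R.

    z0 is the initial population (Z0^0, ..., Z0^R); classes 0 and 1 are
    ignored because they become extinct with probability 1.
    """
    n_prime = sum(z0[2:])
    return gamma_2(d) ** n_prime
```

For instance, with d = 0.25 an initial population with two particles in classes 2 and above gives an estimate of $(1/9)^2 = 1/81$, and each additional such particle multiplies the estimate by 1/9.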

It has been demonstrated that large population passages are able to increase the adaptability of virus populations38 . On the other hand, small population passages represented by bottleneck events are capable of increasing the risk of viral extinction. Among the consequences of abrupt population reductions are the exacerbated effects of drift which, coupled with the Muller’s ratchet principle43 , may lead to the random and progressive loss of the best adapted viruses in a population. It has also been suggested that large virus populations bearing a significant phenotypic diversity are more robust and more adaptable to environmental fluctuations. It is reasonable to assume that large initial virus populations colonizing new hosts may show better survival probabilities than populations recovering from bottlenecks. In this way the size of the viral inoculum may have an impact on the survival rates of different virus populations. It is important to note that the existence of a clear cut between regimes of surviving and non-surviving populations by means of a critical state is directly related to the problem of lethal mutagenesis for viral populations. From now on we shall split the analysis of the simple phenotypic model according to whether it is sub-critical, super-critical or critical.

3.1. The Sub-critical Regime: Lethal Mutagenesis

The first consequence of our results is a generalization, in the context of the phenotypic model (provided one assumes that all effects are of purely mutational nature), of the conjecture of lethal mutagenesis of Bull, Sanjuán and Wilke6 . Recall that Bull et al. 6 assume that all mutations are either neutral or deleterious and write the mutation rate $U = U_d + U_c$ where the


component $U_c$ comprises the purely neutral mutations and the component $U_d$ comprises the mutations with a deleterious fitness effect. Let $R_{\max}$ denote the maximum reproductive capacity among all particles in the viral population. The extinction criterion proposed by Bull et al. 6 states that a sufficient condition for extinction is

$$e^{-U_d} R_{\max} < 1. \qquad (11)$$

According to Bull et al. 6 , the factor $e^{-U_d}$ is both the mean fitness level and also the proportion of offspring with no non-neutral mutations. In the absence of beneficial mutations the only type of non-neutral mutations are the deleterious mutations and hence

$$e^{-U_d} = c = 1 - d. \qquad (12)$$

Since the maximum reproductive capacity among all particles in the viral population in our model is given by the maximum number of replicative classes R, it follows that $R_{\max} = R$. Therefore, the extinction criterion (11) is equivalent, in the context of the simple phenotypic model, to

$$(1 - d)R < 1, \qquad (13)$$

which is exactly the condition for the model to be sub-critical. In another work, Bull et al. 7 suggest a modification of the extinction threshold eq. (11) that accounts for beneficial effects as long as they are not coupled to the deleterious ones (see Antoneli et al. 2 for a more general result). The main conclusion here is that the existence of lethal mutagenesis depends on “genetic components” (mutational rates) and other additional deleterious effects (intensification of host-driven pressures), as well as on strictly “ecological components”, namely, the maximum replication capacity of the particles in the population and the initial population size. As a result, the viral population may reach extinction by increasing the number of deleterious mutations per replication cycle, by decreasing the value of R in the population, or by a combination of the two mechanisms. The mutational strategy is the basis of treatments using mutagenic drugs11 that induce errors in the generation of new viral particles, reducing their replication capacity. A straightforward consequence of the extinction criterion eq. (11) or eq. (13) is that a single particle showing the maximum replication capacity R is able to rescue a viral population driven to extinction by mutagenic drugs. If it is assumed that RNA virus populations correspond to a swarm of variants with distinct replication capacities, for a therapy to be effective it is important that it eliminates the classes represented by particles with the highest replication capacities. In conclusion, the


higher the replication capacity of the first particles infecting the organism, the larger should be the number of deleterious mutations (or effects) and therefore the larger should be the drug concentration. This can be a clear limitation for treatments based on mutagenic drugs.

3.2. The Super-critical Case: Relaxation and Equilibrium

In the super-critical regime, the population grows at a geometric pace indefinitely. Nevertheless, there are two distinct phases that occur during this growth: a transient phase (“relaxation” or “recovery time”) and a dynamical stationary phase.

3.2.1. Relaxation towards equilibrium

An important question concerning the adaptation process of a viral population to the host environment is the typical time needed to achieve the equilibrium state. As the equilibrium is characterized by a constant mean replication capacity, an obvious criterion to measure the time to achieve equilibrium would be the vanishing variation of this variable, as used in other studies (Aguirre et al. 1 ). Nevertheless, this method is clearly subject to the limitations of numerical accuracy, with evident disadvantages if one wants a sharp and universal criterion to differentiate populations according to how fast they typically stabilize in a host. Viral populations are commonly subjected to transient regimes. As pointed out earlier, the infection transmission process represents the passage of a small number of particles from one organism to another, in such a way that the viral population is subjected to a subsequent bottleneck effect during the spreading of viruses in the host population. In order to approach the problem of relaxation after a bottleneck process on a sounder basis, the natural quantity to be considered is the characteristic time derived from the decay of the mean auto-correlation function. The temporal correlation function C(n) is typically of the form exp(−αn) and the decay rate is given by the parameter α. The natural way to define a characteristic time T to achieve equilibrium is by setting T = 1/α. In order to find the characteristic decay rates one should consider the recursive application of the mean matrix M on the initial population: $Z_0^t M^n Z_0$. In fact, it is enough to consider the canonical initial population $Z_0 = e_R = (0, 0, \ldots, 1)$. By direct inspection it is easily verified that the decay of correlations is typically exponential and given by

$$C(n) = \exp\left(-\log(R(1-d))\, n\right),$$


where m = R(1 − d) is the malthusian parameter. The decay rate is therefore given by α = log(R(1 − d)). Among others, one possible application of this result relates to the very initial phase of the infection process. If we consider that during this phase the host immune system has not yet been stimulated against the virus, one can assume that the deleterious effects are solely represented by the viral intrinsic mutation rates. Therefore, the larger the value of R, i.e., the larger the replication capacity of the initial viral particle, the faster the progeny auto-correlation decays and reaches equilibrium, stabilizing the viral population; intuitively, the parameter R defines the degree of virulence of the infection during the early stage of the infective process. An increase of the deleterious effects plays the opposite role on the decay rates. In fact, as will be shown below, the closer the parameter d is to its critical value $d_c$, the more time is needed to achieve equilibrium.

3.2.2. The Dynamical Stationary State

When the simple phenotypic model is super-critical and is initialized with exactly one particle in the class r ($Z_0^r = 1$) the effective malthusian parameter is $m_e = \lambda_r = rc = r(1-d)$ with corresponding normalized right eigenvector $u(r) = (u_0(r), \ldots, u_r(r), 0, \ldots, 0)$, where the components $u_k(r)$, with k = 0, . . . , r, are

$$u_k(r) = \mathrm{binom}(k; r, 1-d) = \binom{r}{k}(1-d)^k d^{r-k}. \qquad (14)$$

Therefore, the simple phenotypic model has R − 1 distinct asymptotic distributions of types of particles, describing R − 1 distinct dynamical stationary states, characterized by their asymptotic distribution of classes given by (14) (up to a random scalar perturbation), each one of these being achieved when the branching process is initialized with exactly one particle in the class r ($Z_0^r = 1$) for r = 2, . . . , R, respectively. Note that when r = 0, 1 the process is always sub-critical.

Theorem 3.2. If the simple phenotypic model is super-critical with malthusian parameter m = R(1 − d) and starts with at least one particle of class R then, in the long run, the relative number of particles in each class reaches a stable stationary dynamical state and is (up to a random scalar perturbation) distributed according to the Binomial Distribution binom(k; R, 1 − d), where k = 0, . . . , R are the replication classes.


Proof. This is a consequence of the generalized Kesten-Stigum36,35,37 results about the asymptotic behaviour of decomposable (i.e., with non-primitive mean matrix) super-critical multitype branching processes and the computation of the normalized right eigenvector associated with the malthusian parameter m = R(1 − d) given by equation (14).

From Theorem 3.2 we immediately obtain:

• The mean replication capacity is E(u) = R(1 − d).
• The phenotypic diversity is Var(u) = Rd(1 − d).

It is well accepted that the phenotypic diversity is an important characteristic of the viral population, intuitively related to the idea of population robustness21 . In fact, a homogeneous population would be less flexible from the point of view of adaptation. The variance associated with the stationary state can be understood as a natural quantity to measure diversity. It shows that the maximum value of the phenotypic diversity, r/4, is reached if d = 1/2 for any value of r. If R > 2 the variation of the phenotypic diversity as a function of d shows that there are two different domains to be considered: below d = 1/2 the diversity is an increasing function of d. It implies that if the population has a typical value of d < 1/2, the effect of inducing an increment of d (for instance using mutagenic drugs) is to increase the phenotypic diversity. For 1/2 < d < $d_c$ this effect reverses and diversity decreases with increasing d. This result raises the question of whether, in normal conditions, the viral population adapts to the host environment guided by a principle of maximum phenotypic diversity or whether the environmental conditions simply contribute to fix one possible value of diversity for the population that may vary from one host organism to another. Interestingly enough, natural deleterious mutation rates have been measured for certain viruses and, as shown in Table 1, the corresponding values are close to d = 1/2. In the first case one could predict that the set point of the viral disease should be invariant (or vary only slightly) across all hosts. On the other hand, the second hypothesis leads to the idea of different responses to treatment depending on the initial value of d before the adoption of treatment strategies to increase d. At present the two scenarios may apply to different types of viruses, and this point clearly has to be decided experimentally.
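The statements above can be checked numerically. The sketch below builds a mean matrix for the simple phenotypic model and verifies that the binomial profile is a right eigenvector with eigenvalue R(1 − d), and that its mean and variance match the two bullet points. The layout of the matrix (entry M[i, j] is the expected number of class-i offspring of a class-j particle, with k(1 − d) offspring staying in class k and kd dropping to class k − 1) is an assumption inferred from system (8), as are the function names and the parameter values.

```python
import numpy as np
from math import comb

def mean_matrix(R, d):
    """Assumed mean matrix: M[i, j] = expected class-i offspring of a class-j particle."""
    M = np.zeros((R + 1, R + 1))
    for j in range(R + 1):
        M[j, j] = j * (1.0 - d)        # offspring that keep the parental class
        if j > 0:
            M[j - 1, j] = j * d        # offspring hit by a deleterious effect
    return M

def binomial_profile(R, d):
    """u_k = binom(k; R, 1 - d), k = 0, ..., R."""
    return np.array([comb(R, k) * (1.0 - d) ** k * d ** (R - k) for k in range(R + 1)])

R, d = 5, 0.3                          # illustrative parameter values
M = mean_matrix(R, d)
u = binomial_profile(R, d)
classes = np.arange(R + 1)
mean_capacity = float(classes @ u)                       # expected: R(1 - d)
diversity = float(((classes - mean_capacity) ** 2) @ u)  # expected: R d (1 - d)
```

With these values the mean replication capacity is 3.5 and the phenotypic diversity is 1.05, below the maximum R/4 = 1.25 attained at d = 1/2.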


Table 1. Experimental results of deleterious mutation rates: (VSV) vesicular stomatitis virus, (TEV) Tobacco etch virus and (ΦX174, Qβ) bacterial viruses.

Virus    Ud             (1 − d) = e^(−Ud)    Ref.
VSV      0.692          0.500                51
TEV      0.773          0.461                9
ΦX174    0.72 - 0.77    0.48 - 0.46          17
Qβ       0.74 - 0.86    0.47 - 0.42          17
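The third column of Table 1 follows from the second through relation (12), and the extinction criterion (11) can be evaluated in the same way. The sketch below reproduces two table entries and computes the threshold value of Ud for a given maximum replication capacity; the function names and the choice R_max = 2 in the usage note are illustrative assumptions, not data from the experiments.

```python
import math

def surviving_fraction(U_d):
    """Relation (12): 1 - d = exp(-U_d)."""
    return math.exp(-U_d)

def critical_U_d(R_max):
    """Value of U_d at which exp(-U_d) * R_max = 1, i.e. U_d = ln(R_max)."""
    return math.log(R_max)

def is_extinct(U_d, R_max):
    """Sufficient condition (11) for extinction: exp(-U_d) * R_max < 1."""
    return math.exp(-U_d) * R_max < 1.0
```

For a hypothetical population with R_max = 2, the threshold is Ud = ln 2 ≈ 0.693, so the VSV value Ud = 0.692 lies just on the survival side of criterion (11) while the TEV value Ud = 0.773 satisfies it.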

Another important consequence of the above results concerns the efficiency of the use of mutagenic drugs. In the region d < 1/(R + 1) < 1/2 the viral population’s most representative particle is the fittest one (class R). If we assume that the drug action is deeply influenced by drug transport coefficients in different host tissues, it is important to be assured that local drug concentrations will still eliminate the set of class R particles. If d increases beyond 1/(R + 1) the representative particle of the population is no longer the fittest one but a set of particles from different replication classes. Therefore the main drug target represents a group of average replicating particles of a population with higher phenotypic diversity in which drug-resistant mutants can be contained. In this case one would say that the viral population displays a kind of endogenous strategy to escape the deleterious action of the mutagenic drug. If we assume that deleterious effects are small in the early stage of the infection process, we should expect that at this stage the drug efficiency would be maximal, reinforcing the successful practice of post-exposure therapy currently adopted in the case of HIV infections33 .

3.3. The Critical Case: Extinction Threshold

The clearest way to characterize the time behavior of the viral population at or around the critical point is through the typical time T to approach equilibrium derived from the decay of correlations described above. The expression T = 1/ log(R(1 − d)) shows that at the critical point the equilibrium state is never reached, i.e., the decay to equilibrium is at least non-exponential. A scaling exponent characterizing the behavior of T in the neighborhood of the critical point $d_c$ can be easily obtained. The expansion around $d_c = (R-1)/R$ gives

$$T \approx (1 - d_c)\, |d - d_c|^{-1}.$$
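The expansion can be spelled out in two steps, using $R(1 - d_c) = 1$ and the first-order approximation $\log(1 + x) \approx x$. This is a short check added for the reader, valid on the super-critical side $d < d_c$:

```latex
\begin{align*}
R(1-d) &= R(1-d_c) + R(d_c - d) = 1 + R\,(d_c - d), \\
\log\bigl(R(1-d)\bigr) &\approx R\,(d_c - d) \quad (d \to d_c^-), \\
T = \frac{1}{\log\bigl(R(1-d)\bigr)} &\approx \frac{1}{R\,|d - d_c|} = (1-d_c)\,|d-d_c|^{-1}.
\end{align*}
```

The last equality uses $1 - d_c = 1/R$, so the characteristic time diverges with exponent $-1$ as $d$ approaches $d_c$.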


Although it is always possible to calculate intermediate distributions of progeny, it is quite easy to see that at the critical point the time evolution of densities never achieves an invariant density. Unlike in the super-critical regime, the relative number of particles in each class/sub-population is never stable. Nevertheless, our preliminary results concerning the dynamics of fluctuations show that the time variation of the numbers of particles in each separate class follows a pattern such that the variation observed in one class is exactly the same as that observed in all the others. This indicates a high level of correlation between the classes, in complete analogy with the critical phenomena of many physical systems. We conjecture that in the critical regime the highly correlated classes in the population behave as an inseparable whole, such that the notion of the population divided into separate classes becomes meaningless. In other words, the correlation between classes makes them behave as if they constituted one unique class, which is reminiscent of one of the basic properties of the error threshold in Eigen’s theory19 . In fact, according to Eigen, when mutational rates are increased beyond a threshold, infinite viral populations are no longer able to retain their best adapted variants. At this critical mutation level, selection is overruled by mutation and all variants share the same fitness status. Moreover, populations at Eigen’s error threshold do not become extinct, but well defined replication classes cease to exist, as particles wander at random over the surface of a flat landscape. If in the super-critical case the notion of the mean replication capacity, and therefore that of the “mean viral particle”, exists and defines a typical scale in the system, in the critical case this notion is absent. Therefore, in using branching processes to model the time behavior of viral populations, the concept of error threshold should be identified with that of criticality. Following the same line of reasoning, in terms of branching processes the existence of lethal mutagenesis should be identified with the critical behavior of the model. The critical behavior of the model can also be observed through the survival probability ω for $d \lesssim d_c$, as shown in Figure 3. Expansion of the survival probability around the critical point by means of the functional equation (8) gives directly

$$\omega \approx 2\,\frac{R}{d_c}\,|d - d_c|.$$

It is interesting to note that the critical exponents of T and ω are the same as those found in the critical behavior of a large class of dynamical random networks49 . Finally, it is noteworthy that here we talk about criticality of a process taking place in time, and therefore the term critical phenomenon (imported


from equilibrium statistical mechanics of space distributed systems) is used to highlight the fact that the survival probability behaves like an order parameter and the amount of deleterious effects quantified by d behaves as a control parameter that can be changed by external means.

4. Conclusions and Outlook

Using the previous theoretical model for virus evolution proposed by Lázaro et al. 38 and Aguirre et al. 1 as a starting point, we show that virus evolution can be described by an exactly solvable multivariate branching process. By applying our approach we are able to identify crucial aspects of the dynamics of replicating viral populations on a sound theoretical basis. Among these aspects, we are able to demonstrate that, as long as the beneficial effects are close to zero, the two main driving features of a virus population are the maximum replication capacity and the fraction of the population not affected by deleterious effects. Based on this result we show that, as proposed by Bull et al., if the product of the above mentioned parameters, m = R(1 − d), yields a value less than one, the population undergoes extinction. On the other hand, if m = R(1 − d) is greater than one and the environment is constant, we show that the population will reach an asymptotic stationary state characterized by the stability of the replicative classes. However, the time to reach the stationary equilibrium strictly depends on how intense the deleterious effect is: more precisely, the higher d, the longer the transient phase, and when d approaches its critical value $d_c$ the transient tends to infinity. We also demonstrate that, keeping the deleterious effects constant, the survival probability of a virus population depends on its initial population size. By increasing the population size at time zero we push the survival probability curves, in the region before the critical point, towards one (see Figure 3 and equations (9), (10)). According to this result it can be speculated that viruses with larger inocula have a better chance of surviving when colonizing new hosts. Interestingly enough, and in direct disagreement with the above observations, it has been shown that only a limited number of particles, and in some cases even one particle, is enough to start a new infectious process in a host34,60 . However, according to the model and as discussed before, the R parameter determines the success of an incoming virus population because the corresponding value of $d_c$ is uniquely given by R. The present work suggests that minimum inocula must have at least one particle with replicative capacity large enough in order to


survive in the new host. We speculate that those particles with maximum replicative capacity should constitute the effective inoculum described in Zwart et al. 60 . In fact, the experimental data about viral load in early HIV-infected patients strongly suggest that the host deleterious effects on the viral population are minimal and increase after the onset of the immunological response48 . We note that the characteristic form of these data can be easily reproduced by the model (see Castro10 ).

Acknowledgments

FA wishes to acknowledge the support of CNPq through the grant PQ313224/2009-9 and thanks FAP-UNIFESP and BIOMAT Consortium for the financial support to present this work at the “12th International Symposium on Mathematical and Computational Biology”. FB received support from the Brazilian agency FAPESP. DC received financial support from the Brazilian agency CAPES.

References

1. J. Aguirre, E. Lázaro, and S. C. Manrubia. A trade-off between neutrality and adaptability limits the optimization of viral quasispecies. J. Theor. Biol., 261:148–155, 2009.
2. F. Antoneli, F. Bosco, D. Castro, and L. M. Janini. Virus replication as a phenotypic version of polynucleotide evolution. Submitted, arXiv:1204.6353:0–22, 2012.
3. F. Antoneli, A. P. S. Dias, M. Golubitsky, and Y. Wang. Patterns of synchrony in lattice dynamical systems. Nonlinearity, 18(5):2193–2209, 2005.
4. K. B. Athreya and P. E. Ney. Branching Processes. Springer-Verlag, 1972.
5. E. Batschelet, E. Domingo, and C. Weissmann. The proportion of revertant and mutant phage in a growing population, as a function of mutation and growth rate. Gene, 1:27–32, 1976.
6. J. J. Bull, R. Sanjuán, and C. O. Wilke. Theory of lethal mutagenesis for viruses. J. Virology, 81(6):2930–2939, 2007.
7. J. J. Bull, R. Sanjuán, and C. O. Wilke. Lethal mutagenesis. In E. Domingo, C. R. Parrish, and J. J. Holland, editors, Origin and Evolution of Viruses, chapter 9, pages 207–218. Academic Press, London, second edition, 2008.
8. J. A. Capitán, J. A. Cuesta, S. C. Manrubia, and J. Aguirre. Severe hindrance of viral infection propagation in spatially extended hosts. PLoS One, 6(8):e23358, 2011.
9. P. Carrasco, F. de la Iglesia, and S. F. Elena. Distribution of fitness and virulence effects caused by single-nucleotide substitutions in Tobacco Etch Virus. J. Virology, 81(23):12979–12984, 2007.


10. D. Castro. Simulação computacional e análise de um modelo fenotípico de evolução viral. Master’s thesis, Universidade Federal de São Paulo (UNIFESP), São Paulo, 2011.
11. S. Crotty, C. E. Cameron, and R. Andino. RNA virus error catastrophe: direct molecular test by using ribavirin. Proc. Natl. Acad. Sci. U. S. A., 98:6895–6900, 2001.
12. J. A. Cuesta, J. Aguirre, J. A. Capitán, and S. C. Manrubia. Struggle for space: viral extinction through competition for cells. Phys. Rev. Lett., 106(2):028104, Jan 2011.
13. J. M. Cuevas, F. González-Candelas, A. Moya, and R. Sanjuán. The effect of ribavirin on the mutation rate and spectrum of Hepatitis C virus in vivo. J. Virology, 83:5760–5764, 2009.
14. L. Demetrius, P. Schuster, and K. Sigmund. Polynucleotide evolution and branching processes. Bull. Math. Biol., 47(2):239–262, 1985.
15. E. Domingo, E. Baranowski, C. M. Ruiz-Jarabo, A. M. Martín-Hernández, J. C. Sáiz, and C. Escarmís. Quasispecies structure and persistence of RNA viruses. Emerg. Infect. Dis., 4(4):521–527, 1998.
16. E. Domingo et al. The quasispecies (extremely heterogeneous) nature of viral RNA genome populations: biological relevance – a review. Gene, 40(1):1–8, 1985.
17. P. Domingo-Calap, J. M. Cuevas, and R. Sanjuán. The fitness effects of random mutations in single-stranded DNA and RNA bacteriophages. PLoS Genet, 5(11):e1000742, 2009.
18. J. W. Drake and J. J. Holland. Mutation rates among RNA viruses. Proc. Natl. Acad. Sci. U. S. A., 96:13910–13913, 1999.
19. M. Eigen. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften, 58:465–523, 1971.
20. S. F. Elena and R. Sanjuán. Adaptive value of high mutation rates of RNA viruses: separating causes from consequences. J. Virology, 79(18):11555–11558, Sep 2005.
21. S. F. Elena, C. O. Wilke, C. Ofria, and R. E. Lenski. Effects of population size and mutation rate on the evolution of mutational robustness. Evolution, 61(3):666–674, Mar 2007.
22. C. Escarmís, E. Lázaro, and S. C. Manrubia. Population bottlenecks in quasispecies dynamics. Curr. Top. Microbiol. Immunol., 299:141–170, 2006.
23. A. Eyre-Walker and P. D. Keightley. The distribution of fitness effects of new mutations. Nat. Rev. Genet., 8(8):610–618, Aug 2007.
24. W. Feller. An Introduction to Probability Theory and Its Applications, volume 1. Wiley, 1968.
25. A. Grande-Pérez, S. Sierra, M. G. Castro, E. Domingo, and P. R. Lowenstein. Molecular indetermination in the transition to error catastrophe: systematic elimination of lymphocytic choriomeningitis virus through mutagenesis does not correlate linearly with large increases in mutant spectrum complexity. Proc. Natl. Acad. Sci. U. S. A., 99:12938–12943, 2002.
26. R. M. Gray. Toeplitz and Circulant Matrices: A review. Now Publishers Inc., 2006.


27. U. Grenander and G. Szegö. Toeplitz Forms and Their Applications. University of California Press, 1958.
28. T. E. Harris. The Theory of Branching Processes. Springer-Verlag, 1963.
29. J. Hermisson, O. Redner, H. Wagner, and E. Baake. Mutation-selection balance: ancestry, load, and maximum principle. Theor. Popul. Biol., 62(1):9–46, Aug 2002.
30. J. J. Holland, E. Domingo, J. C. de la Torre, and D. A. Steinhauer. Mutation frequencies at defined single codon sites in vesicular stomatitis virus and poliovirus can be increased only slightly by chemical mutagenesis. J. Virology, 64:3960–3962, 1990.
31. M. Imhof and C. Schlotterer. Fitness effects of advantageous mutations in evolving Escherichia coli populations. Proc. Natl. Acad. Sci. U. S. A., 98(3):1113–1117, Jan 2001.
32. M. Jiřina. A simplified proof of the Sevastyanov theorem on branching processes. Annales de l’I. H. Poincaré, Section B, 6(1):1–7, 1970.
33. M. H. Katz and J. L. Gerberding. Postexposure treatment of people exposed to the human immunodeficiency virus through sexual contact or injection-drug use. N. Engl. J. Med., 336(15):1097–1100, Apr 1997.
34. B. F. Keele et al. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc. Natl. Acad. Sci. U. S. A., 105(21):7552–7557, 2008.
35. H. Kesten and B. P. Stigum. Additional limit theorems for indecomposable multidimensional Galton-Watson processes. Ann. Math. Stat., 37(6):1463–1481, 1966.
36. H. Kesten and B. P. Stigum. A limit theorem for multidimensional Galton-Watson processes. Ann. Math. Stat., 37(5):1211–1223, 1966.
37. H. Kesten and B. P. Stigum. Limit theorems for decomposable multidimensional Galton-Watson processes. J. Math. Anal. Appl., 17:309–338, 1967.
38. E. Lázaro, C. Escarmís, E. Domingo, and S. C. Manrubia. Modeling viral genome fitness evolution associated with serial bottleneck events: Evidence of stationary states of fitness. J. Virology, 76(17):8675–8681, 2002.
39. C. H. Lee, D. L. Gilbertson, I. S. Novella, R. Huerta, E. Domingo, and J. J. Holland. Negative effects of chemical mutagenesis on the adaptive behavior of vesicular stomatitis virus. J. Virology, 71:3636–3640, 1997.
40. L. A. Loeb, J. M. Essigmann, F. Kazazi, J. Zhang, K. D. Rose, and J. I. Mullins. Lethal mutagenesis of HIV with mutagenic nucleoside analogs. Proc. Natl. Acad. Sci. U. S. A., 96:1492–1497, 1999.
41. S. C. Manrubia, E. Lázaro, J. Pérez-Mercader, C. Escarmís, and E. Domingo. Fitness distributions in exponentially growing asexual populations. Phys. Rev. Lett., 90(18):188102, 2003.
42. R. Miralles, P. J. Gerrish, A. Moya, and S. F. Elena. Clonal interference and the evolution of RNA viruses. Science, 285(5434):1745–1747, Sep 1999.
43. H. J. Muller. The relation of recombination to mutational advance. Mutat. Res., 106:2–9, May 1964.
44. S. Ojosnegros, N. Beerenwinkel, T. Antal, M. A. Nowak, C. Escarmís, and E. Domingo. Competition-colonization dynamics in an RNA virus. Proc. Natl. Acad. Sci. U. S. A., 107(5):2108–2112, Feb 2010.
45. S. Ojosnegros, N. Beerenwinkel, and E. Domingo. Competition-colonization dynamics: An ecology approach to quasispecies dynamics and virulence evolution in RNA viruses. Commun. Integr. Biol., 3(4):333–336, Jul 2010.
46. H. A. Orr. The distribution of fitness effects among beneficial mutations. Genetics, 163(4):1519–1526, Apr 2003.
47. M. Parera, G. Fernàndez, B. Clotet, and M. A. Martínez. HIV-1 protease catalytic efficiency effects caused by random single amino acid substitutions. Mol. Biol. Evol., 24(2):382–387, Feb 2007.
48. R. M. Ribeiro, L. Qin, L. L. Chavez, D. Li, S. G. Self, and A. S. Perelson. Estimation of the initial viral growth rate and basic reproductive number during acute HIV-1 infection. J. Virology, 84(12):6096–6102, Jun 2010.
49. O. Riordan and L. Warnke. Explosive percolation is continuous. Science, 333:322–324, Jul 2011.
50. D. R. Rokyta, C. J. Beisel, P. Joyce, M. T. Ferris, C. L. Burch, and H. A. Wichman. Beneficial fitness effects are not exponential for two viruses. J. Mol. Evol., 67(4):368–376, Oct 2008.
51. R. Sanjuán, A. Moya, and S. F. Elena. The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc. Natl. Acad. Sci. U. S. A., 101:8396–8401, 2004.
52. R. Sanjuán, M. R. Nebot, N. Chirico, L. M. Mansky, and R. Belshaw. Viral mutation rates. J. Virology, 84(19):9733–9748, 2010.
53. W. E. Severson, C. S. Schmaljohn, A. Javadian, and C. B. Jonsson. Ribavirin causes error catastrophe during Hantaan virus replication. J. Virology, 77:481–488, 2003.
54. S. Sierra, M. Dávila, P. R. Lowenstein, and E. Domingo. Response of foot-and-mouth disease virus to increased mutagenesis: influence of viral load and fitness in loss of infectivity. J. Virology, 74:8316–8323, 2000.
55. D. A. Steinhauer, E. Domingo, and J. J. Holland. Lack of evidence for proofreading mechanisms associated with an RNA virus polymerase. Gene, 122(2):281–288, Dec 1992.
56. H. W. Watson and F. Galton. On the probability of the extinction of families. J. Anthropol. Inst. Great Britain and Ireland, 4:138–144, 1874.
57. C. O. Wilke. Probability of fixation of an advantageous mutant in a viral quasispecies. Genetics, 163(2):467–474, Feb 2003.
58. C. O. Wilke, J. L. Wang, C. Ofria, R. E. Lenski, and C. Adami. Evolution of digital organisms at high mutation rates leads to survival of the flattest. Nature, 412(6844):331–333, Jul 2001.
59. S. Zhou, R. Liu, B. M. Baroudy, B. A. Malcolm, and G. R. Reyes. The effect of ribavirin and IMPDH inhibitors on Hepatitis C virus subgenomic replicon RNA. Virology, 310:333–342, 2003.
60. M. P. Zwart, J.-A. Daròs, and S. F. Elena. One is enough: In vivo effective population size is dose-dependent for a plant RNA virus. PLoS Pathog, 7(7):e1002122, 2011.

May 7, 2013

11:44

BC: 8846 - BIOMAT 2012

14˙tilles

ASSOCIATIVE LEARNING OF A LEXICON IN A NOISY CROSS-SITUATIONAL SCENARIO

P. F. C. TILLES AND J. F. FONTANARI
Instituto de Física de São Carlos, Universidade de São Paulo
Caixa Postal 369, 13560-970 São Carlos SP, Brazil

A popular approach to learning an object-word mapping is the so-called cross-situational learning, which is nothing but a mechanism of sensitivity to covariation such that two events that occur at the same time become associated. Here we present extensive Monte Carlo simulations aimed at measuring the performance of an associative learning algorithm for acquiring a one-to-one mapping between $N$ objects and $N$ words based solely on the co-occurrence between objects and words. In particular, a learning trial in our cross-situational learning scenario consists of the presentation of $C + 1 < N$ objects together with a word, the target word, which refers to one of the objects in the context with probability $1 - \gamma$. Hence the parameter $\gamma$ measures the level of noise, since a confusing episode in which the target word does not refer to any object in the context occurs with probability $\gamma$. We find that there is a critical value of the noise parameter, $\gamma_c = 1 - (C+1)/N$, above which learning is impossible. We investigate the region close to the phase transition using finite-size scaling and find that the learning rate vanishes like $(\gamma_c - \gamma)^2$.

1. Introduction

The origin of human language is truly secret and marvelous, being probably the most fundamental unsolved problem of science [1]. We often hear the claim that language is the very feature that distinguishes humans from the other animals, and sometimes language is even set above thought, the hallmark of human rationality, as in this quote by Ferdinand de Saussure [2]: “Without language, thought is a vague, uncharted nebula. There are no pre-existing ideas, and nothing is distinct before the appearance of language”. Today few researchers subscribe to such a radical viewpoint, but the more established stances exhibit equally controversial claims, such as the position that language evolved from animal cognition, not from animal communication [3], to the despair of an entire generation of ethologists who


sought evidence of a continuous line of development between animal communication systems and language [4]. The current predominant view, at least among linguists, is that language and thought are distinct abilities of the mind [5], and a quick comparison between the cognitive and linguistic abilities of apes and parrots appears to be a convincing argument for many researchers. (See Bickerton’s book [1] for an ecological interpretation of this comparison.) We must mention, however, that the interplay between language and cognition is still a most controversial issue, at the center of which is the so-called Sapir-Whorf or language determinism hypothesis, which asserts that language determines thought or, in a weaker version, that language partially influences thought [6,7].

There are a few radically distinct approaches to studying the emergence of communication codes (i.e., meaning-signal or object-word mappings). On the one hand, there is the approach based on a direct analogy with biological evolution [8–12], which makes use of the explicit assumption that the communication codes are transmitted from parents to children (vertical transmission in the population genetics jargon) and that possessing an efficient communication code confers a fitness advantage on the individual. On the other hand, there is the cultural-based approach, of which the so-called Iterated Learning Framework (ILF) is the most important representative [13]. In the ILF there are (typically) only two agents involved: the teacher and an initially tabula rasa pupil. After learning, the pupil replaces the teacher and a fresh pupil is introduced in the learning scenario. This procedure is repeated until the communication code becomes fixed. There are other variants of this learning scheme which involve the repeated interaction between the same two individuals, who interchange the roles of teacher and pupil; these are generally known as guessing or naming games [14,15].

The lexicon acquisition scenario we consider here differs from the biological and cultural approaches in the sense that the concept of evolution, i.e., variations passed from generation to generation, plays no role. In fact, it involves a single agent (the pupil) who tries to learn a fixed vocabulary, modeled here by a one-to-one object-word mapping. In that sense, our approach bears relevance to the way children learn a vocabulary from the observation of language-proficient adults [16]. More specifically, here we use the cross-situational learning framework, which presupposes that one way that a learner can determine the meaning of a word is by finding something in common across all observed uses of that word [17–19]. Although cross-situational associative learning can be easily implemented and studied through numerical simulations [15,20–22], there were


only a few attempts to draw general mathematical conclusions about the validity of this learning strategy [23,24]. In this paper, we build on the work of Smith et al. [23], in the sense that we use the same minimal learning scenario (see Sect. 2), but we consider a more complex learning task due to the presence of noise in the scenario.

The rest of this paper is organized as follows. In Sect. 2 we describe the learning task as well as the linear associative algorithm [25] used by the learner. The behavior of the model in the case of learning with noise is described through Monte Carlo simulations in Sect. 3. These simulations revealed an unexpected finding, namely, the existence of a critical value of the noise parameter beyond which learning is impossible. The behavior of the learning algorithm close to this critical value is then investigated using finite-size scaling techniques [26] in Sect. 4. In particular, we were able to characterize the dependence on the number of learning trials $\tau$ in the critical region and to obtain explicit expressions for the scaling functions. Finally, in Sect. 5 we summarize our main results and present some concluding remarks.

2. Cross-situational learning scenario and associative learning algorithm

A cross-situational learning scenario is defined as a system with $N$ objects, $N$ words and a one-to-one mapping between words and objects. To describe the one-to-one word-object mapping we use the index $i = 1, \ldots, N$ to represent distinct objects and $h = 1, \ldots, N$ to represent distinct words. Without loss of generality, the correct mapping is defined by the associations of object $i = 1$ to the word $h = 1$, object $i = 2$ to the word $h = 2$, and so on. The problem faced by a learner is to determine the correct mapping among all words and objects through a sequence of learning events that allows the learner to exclude incorrect word-object associations. Each learning event consists of a single target word and a context of $C + 1$ distinct objects to be associated with that word. Under normal conditions (i.e., in the absence of noise), the context exhibits the correct object (i.e., the object named by the target word according to the object-word mapping) plus $C$ mismatching objects that are referred to as confounders. Noise is added to the scenario via a probability $\gamma \in [0, 1]$ that the correct object does not appear in the context.

In order to tackle this lexicon learning task the learner is equipped with a linear learning model [25]. The basic assumption is that learning can be


modeled as a change in the confidence with which the learner associates the target word to a certain object in the context. This confidence is represented by a matrix whose non-negative integer entries $p_{ih}$ yield a value for the confidence with which the word $h$ is associated to object $i$. Before any learning experience takes place all confidences are set to zero, i.e., $p_{ih} = 0$ for $i, h = 1, \ldots, N$, and whenever object $i^*$ appears in a context with target word $h^*$ the confidence $p_{i^* h^*}$ increases by one unit. To determine which object corresponds to word $h$ the learner simply chooses the object index $i$ for which $p_{ih}$ is maximum. In the case of ties the learner selects one object at random among those that maximize the confidence.

From the definition of the correct word-object mapping, the learning algorithm achieves a perfect performance when $p_{hh} > p_{ih}$ for all $h$ and $i \neq h$. To measure the learning efficiency we define a single-word learning error function $\epsilon_{sw}(\tau)$ as the fraction of wrong word-object associations at a given trial $\tau$: if $p_{hh} > p_{ih}$ for a given word and all objects $i$ distinct from $h$ then $\epsilon_{sw} = 0$; if $p_{hh}$ is maximal but $p_{hh} = p_{ih}$ for $n$ different values of $i \neq h$ then $\epsilon_{sw} = n/(n + 1)$. In the noiseless case ($\gamma = 0$) we have $p_{hh} \geq p_{ih}$ for $i \neq h$, since object $i = h$ must appear in the contexts of all learning events. But in the presence of noise ($\gamma > 0$) it is possible to have $p_{hh} < p_{ih}$ for some $i \neq h$, in which case the learning error is $\epsilon_{sw} = 1$.

A salient feature of the linear learning algorithm is the fact that words are learned independently, which allows the description of a simplified version of the problem in which a given word $h$ appears in all learning trials. Although a generalization to the problem of learning an entire lexicon is easily obtained given the sampling structure of different words [27], in this work we will focus on the single-word learning problem only.
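The update rule, the guessing rule and the single-word error defined in this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names are hypothetical, indices are zero-based (so object `i` is the correct referent of word `i`), and the confidence matrix is a plain list of lists `p[object][word]`.

```python
import random

def update_confidences(p, context, target_word):
    """Add one unit of confidence to every (object, target word) pair in the context."""
    for obj in context:
        p[obj][target_word] += 1

def guess_object(p, word, rng=random):
    """Choose an object of maximal confidence for `word`, breaking ties at random."""
    column = [row[word] for row in p]
    best = max(column)
    return rng.choice([i for i, value in enumerate(column) if value == best])

def single_word_error(p, word):
    """Return 0 if the correct object wins alone, n/(n+1) if it ties
    with n other objects, and 1 if some wrong object has higher confidence."""
    column = [row[word] for row in p]
    best = max(column)
    if column[word] < best:
        return 1.0
    n_ties = sum(1 for i, value in enumerate(column) if value == best and i != word)
    return n_ties / (n_ties + 1)

# Illustrative run with N = 4 objects and target word 0.
p = [[0] * 4 for _ in range(4)]
update_confidences(p, [0, 2], 0)   # correct object 0 plus confounder 2
update_confidences(p, [0, 3], 0)   # correct object 0 plus confounder 3
```

Before any trial every object is tied, so the error is $(N-1)/N$; after the two noiseless contexts in the illustrative run only the correct object has appeared twice, so the error drops to zero.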

3. Single-word learning

We begin our description of the single-word learning case with the noiseless case ($\gamma = 0$). The initial condition for the system prior to any learning event is such that the target word may be associated to any of the $N$ objects, so the learning error is equal to $(N - 1)/N$. The context in the first trial contains $C$ confounders, leading to the learning error $\epsilon_{sw}(\tau = 1) = C/(C + 1)$. As each confounding object has a probability $C/(N - 1)$ of being chosen to compose the context, for the next trial the evolution of the learning error is probabilistic: it may either be the same (when the same context is repeated) or decrease to a value $n/(n + 1)$ when only $n < C$ objects of the first context appear again. For subsequent trials

[Figure 1: plot of $\epsilon_{sw}$ (vertical axis, 0 to 1) versus $\tau$ (horizontal axis, 0 to 1000) for a single sample.]

Figure 1. Learning error as a function of the number of learning trials $\tau$ for a single run of the associative learning algorithm for $N = 20$, $C = 5$ and $\gamma = 0.69$. This figure illustrates the stochastic nature of the learning process. The lines are guides to the eye.

the probability of repeating the context becomes smaller, and the learning error behaves as a decreasing function of the number of trials. A mathematical description of the noiseless learning task becomes straightforward when one recognizes that the learning error at a given trial depends only on the last realization and may never exceed its previous value. The description is given by a Markov chain

$$W(\tau + 1) = W(\tau)\, T, \tag{1}$$

where $W(\tau)$ gives the probability of every error state $n/(n + 1)$, $n = 0, 1, \ldots, C$, at trial $\tau$ and $T$ is a $(C + 1) \times (C + 1)$ transition matrix whose entries $T_{mn}$ yield the probability that the error at a certain trial is $n/(n + 1)$ given that the error was $m/(m + 1)$ in the previous trial. Since the transition matrix is triangular, the average learning error at any given trial is a linear combination of exponential functions of the form $\exp(\tau \ln \lambda_n) = \lambda_n^{\tau}$, where $\lambda_n \leq 1$ are the eigenvalues of $T$ (the diagonal elements, which give the probabilities that the system remains in each error state in the subsequent trial).

When noise is added to the system, the fact that at each trial there is a probability $\gamma$ that the correct object does not appear in the context allows the occurrence of a situation where the confidence value is greater for a wrong object than for the correct one. As such an event may take place at any


given trial, the learning error is no longer a decreasing function of $\tau$, as can be seen in Fig. 1: while for the noiseless case the only allowed transitions are from higher to lower error values, in a sample with noise there are multiple transitions among different error values, even though the asymptotic behavior is $\langle \epsilon \rangle \to 0$ as $\tau \to \infty$ (as long as $\gamma$ does not exceed a certain threshold $\gamma_c$, as will be discussed later). This change in behavior due to the influence of noise not only affects the average learning error but also has an impact on the mathematical framework used to describe the system, since there is a major problem with the Markov chain description: knowledge of the current error value does not enable one to determine the next one, because it is necessary to know exactly how the confidences are distributed among the objects, i.e., there is a memory effect on the error states.

Given the differences between the noiseless and the noisy learning scenarios, it is necessary to begin the analysis of the noise effects through simulations, to gain insight into the possibility of an analytical description. As expected, the increase in the number of contexts in which the correct object is absent results in a slower decay to zero of the average learning error, as illustrated in Fig. 2. In addition, instead of the pure exponential decay exhibited in the noiseless case, the asymptotic behavior of the average learning error for $\gamma < \gamma_c$ takes the form

$$\epsilon_{sw} \asymp \tau^{-1/2} \exp(-\alpha \tau), \tag{2}$$

where $\alpha = \alpha(N, C, \gamma)$ can be loosely interpreted as a learning rate. The nonlinear way in which the noise affects the system indicates that as $\gamma$ increases the learning rate $\alpha$ tends to zero. This feature becomes evident when one looks at the average learning error for the whole range of $\gamma$, as shown in Fig. 3. Perfect learning ($\epsilon_{sw} = 0$) is eventually achieved for $\tau$ sufficiently large as long as $\gamma$ is smaller than a critical value $\gamma_c$, while above this threshold the probability of misguessing the object associated to the target word is high enough to account for $\epsilon_{sw} \to 1$ as $\tau \to \infty$. In order to determine $\gamma_c$ it is necessary to observe that at this critical noise parameter the average learning error is constant and equal to the error in the situation prior to the learning trials, i.e.,

$$\epsilon^{c}_{sw} = \epsilon_{sw}(\tau = 0) = \frac{N - 1}{N}. \tag{3}$$
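The noiseless Markov chain of Eq. (1) and the noisy simulations discussed above can be sketched together as follows. This is a minimal illustration, not the authors' code. Three things are assumptions made here: the hypergeometric form of the transition matrix (the number of still-tied confounders that reappear among the $C$ drawn from $N - 1$), the choice of filling a noisy context with $C + 1$ confounders, and all function names. Object 0 plays the role of the correct object.

```python
import random
from math import comb

def exact_noiseless_error(N, C, tau):
    """Mean error after tau >= 1 noiseless trials, iterating W(tau+1) = W(tau) T."""
    total = comb(N - 1, C)
    # T[m][n]: probability that n of the m tied confounders appear again
    # (assumed hypergeometric; comb(m, n) = 0 for n > m makes T triangular).
    T = [[comb(m, n) * comb(N - 1 - m, C - n) / total for n in range(C + 1)]
         for m in range(C + 1)]
    W = [0.0] * C + [1.0]  # after the first trial all C confounders are tied
    for _ in range(tau - 1):
        W = [sum(W[m] * T[m][n] for m in range(C + 1)) for n in range(C + 1)]
    return sum(W[n] * n / (n + 1) for n in range(C + 1))

def simulated_error(N, C, gamma, tau, runs, seed=0):
    """Monte Carlo estimate of the average single-word error after tau trials."""
    rng = random.Random(seed)
    confounders = list(range(1, N))
    total = 0.0
    for _ in range(runs):
        counts = [0] * N  # confidences of the N objects for the target word
        for _ in range(tau):
            if rng.random() < gamma:
                # Noisy trial: correct object absent (assumed C + 1 confounders).
                context = rng.sample(confounders, C + 1)
            else:
                context = [0] + rng.sample(confounders, C)
            for obj in context:
                counts[obj] += 1
        best = max(counts)
        if counts[0] < best:
            total += 1.0
        else:
            n_ties = counts.count(best) - 1
            total += n_ties / (n_ties + 1)
    return total / runs
```

For instance, `simulated_error(10, 2, 0.0, 5, 5000)` should agree with `exact_noiseless_error(10, 2, 5)` up to sampling noise, and for $N = 10$, $C = 2$ a run at $\gamma = 0.7$ should stay close to $(N-1)/N = 0.9$, the constant value stated in Eq. (3).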

[Figure 2: two panels of $\epsilon_{sw}$ (logarithmic vertical axis, $10^{0}$ to $10^{-4}$) versus $\tau$; upper panel $\tau$ from 0 to 200, lower panel $\tau$ from 0 to 10 000.]

Figure 2. Average learning error $\epsilon_{sw}$ as a function of the number of learning trials $\tau$ for $N = 10$ and $C = 2$. The upper panel shows the results for (left to right) $\gamma = 0.05, 0.1, \ldots, 0.6$, whereas the lower panel shows those for $\gamma = 0.65, 0.66, 0.67, 0.68$ and $0.69$. The slopes of the straight lines yield the learning rates $\alpha$ defined by Eq. (2).

When equating this result with the average learning error at $\tau = 1$, which is $(1 - \gamma)\, C/(C + 1) + \gamma$ because the correct object ties with the $C$ confounders with probability $1 - \gamma$ and is absent from the context with probability $\gamma$, one obtains

$$\gamma_c = 1 - \frac{C + 1}{N}. \tag{4}$$

Further inspection of Eqs. (3) and (4) shows that the borderline between learning and non-learning occurs when all objects have the same probability of

[Figure 3: plot of $\epsilon_{sw}$ (vertical axis, 0 to 1) versus $\tau$ (horizontal axis, 0 to 20).]

Figure 3. The average single-word learning error as a function of the number of learning trials for $N = 5$, $C = 1$ and (bottom to top) $\gamma = 0, 0.1, 0.2, \ldots, 0.9$. The critical value of the noise parameter is $\gamma_c = 0.6$, at which $\epsilon^{c}_{sw} = 0.8$. The symbols are the simulation results and the lines are guides to the eye.

being selected to compose the context, and although we have no compelling argument to prove its validity, the expressions for $\epsilon^{c}_{sw}$ and $\gamma_c$ proved correct for a vast selection of values of $N$ and $C$. The existence of a critical noise parameter $\gamma_c$ suggests that instead of looking at the asymptotic behavior of the average learning error we may focus our attention on the vicinity of the critical point for finite values of $\tau$ and use a finite-size scaling approach [26] to determine the scaling function and the critical exponent that governs the manner in which the learning rate $\alpha$ vanishes as the critical point is approached. This analysis is carried out in the next section.

4. Finite size scaling analysis

Our first goal is to determine how the average learning error behaves near the critical noise parameter for different values of $\tau$, as shown in Fig. 4. As the number of trials $\tau$ increases, the difference between the learnable and the unlearnable task regimes becomes evident: for $\gamma < \gamma_c$ we have $\epsilon_{sw} \to 0$, and for $\gamma > \gamma_c$ we have $\epsilon_{sw} \to 1$. All curves intersect at $\gamma = \gamma_c$, for which the average error is a constant given by Eq. (3). The key insight is obtained when one considers the average learning

[Figure 4: plot of $\epsilon_{sw}$ (vertical axis, 0 to 1) versus $\gamma_c - \gamma$ (horizontal axis, $-0.10$ to $0.10$).]

Figure 4. Average learning error as a function of $\gamma_c - \gamma$ for $N = 10$ and $C = 1$. In the positive region, from top to bottom, the symbols are the simulation results for $\tau = 100, 200, 300, 500, 700$ and $1000$. The lines are guides to the eye.

error as a function of the reduced variable $(\gamma_c - \gamma)\, \tau^{1/2}$, as exhibited in Fig. 5. Use of the reduced variable produces the collapse of the data for different $\tau$ into a single $C$- and $N$-dependent curve. These curves, termed scaling functions in the statistical mechanics jargon, are very well described by the ansatz

$$\epsilon_{sw}(\tau) = \frac{1}{2}\, \mathrm{erfc}\left[ A(N) + B\, (\gamma_c - \gamma)\, \tau^{1/2} \right], \tag{5}$$

which has a single fitting parameter $B$, because at $\gamma = \gamma_c$ we know that the average learning error is given by Eq. (3) and so we can use this information to obtain the parameter $A$. In fact, a simple algebraic manipulation yields

$$A(N) = \mathrm{erfc}^{-1}\left[ \frac{2 (N - 1)}{N} \right]. \tag{6}$$

Taking the limit $\tau \to \infty$ in Eq. (5) and using the asymptotic expansion $\mathrm{erfc}(x) \asymp e^{-x^{2}} / (\sqrt{\pi}\, x)$ we find the learning rate $\alpha \propto (\gamma_c - \gamma)^{2}$. The extremely good quality of the fitting can be appreciated in Fig. 5, which illustrates the power of the finite-size scaling technique to shed light on problems that are analytically intractable.
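The step from the ansatz (5) to the quadratic vanishing of the learning rate can be spelled out as follows. This is a sketch that assumes $B > 0$ and $\gamma < \gamma_c$, writes $x$ for the argument of the complementary error function, and keeps only the leading behavior for large $\tau$.

```latex
\begin{align*}
x &= A(N) + B\,(\gamma_c - \gamma)\,\tau^{1/2}, \\
\epsilon_{sw}(\tau) &= \tfrac{1}{2}\,\mathrm{erfc}(x)
   \asymp \frac{e^{-x^{2}}}{2\sqrt{\pi}\,x}, \qquad x \to \infty, \\
x^{2} &= B^{2}(\gamma_c - \gamma)^{2}\,\tau
        + 2AB\,(\gamma_c - \gamma)\,\tau^{1/2} + A^{2}, \\
\epsilon_{sw}(\tau) &\asymp
   \frac{\tau^{-1/2}}{2\sqrt{\pi}\,B\,(\gamma_c - \gamma)}
   \exp\!\left[-B^{2}(\gamma_c - \gamma)^{2}\,\tau
               - 2AB\,(\gamma_c - \gamma)\,\tau^{1/2} - A^{2}\right].
\end{align*}
```

Comparing the term linear in $\tau$ in the exponent with Eq. (2) gives $\alpha = B^{2} (\gamma_c - \gamma)^{2}$; the term proportional to $\tau^{1/2}$ is sub-leading and does not change the rate, and the prefactor reproduces the $\tau^{-1/2}$ of Eq. (2).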

[Figure 5: plot of $\epsilon_{sw}$ (vertical axis, 0 to 1) versus $(\gamma_c - \gamma)\, \tau^{1/2}$ (horizontal axis, $-3$ to $3$).]

Figure 5. Average learning error as a function of $(\gamma_c - \gamma)\, \tau^{1/2}$ for $N = 10$ and, from bottom to top, $C = 0, 1, 2$ and $3$. The symbols are the simulation results and the curves joining them are given by the function (5) with the parameter $B$ obtained from the fitting of the data.
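The scaling function of Eqs. (5) and (6) can be evaluated with the standard library alone, as in the sketch below. The function names are hypothetical, the inverse of erfc is obtained by a simple bisection chosen here for self-containment, and the value of `B` used in any call is a placeholder rather than a fitted value from the simulations.

```python
from math import erfc

def erfc_inverse(y, lo=-6.0, hi=6.0, iterations=100):
    """Solve erfc(x) = y for 0 < y < 2 by bisection (erfc is strictly decreasing)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if erfc(mid) > y:
            lo = mid  # erfc too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scaling_error(N, C, gamma, tau, B):
    """Ansatz of Eq. (5): 0.5 * erfc[A(N) + B (gamma_c - gamma) tau^(1/2)]."""
    gamma_c = 1.0 - (C + 1) / N                 # Eq. (4)
    A = erfc_inverse(2.0 * (N - 1) / N)         # Eq. (6)
    return 0.5 * erfc(A + B * (gamma_c - gamma) * tau ** 0.5)
```

By construction the function returns $(N-1)/N$ at $\gamma = \gamma_c$ for any $\tau$ and any $B$, which is the crossing point of the curves in Fig. 4; away from $\gamma_c$ it tends to 0 or 1 as $\tau$ grows, depending on the sign of $\gamma_c - \gamma$.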

5. Conclusion

The characteristic of our model that allowed extremely efficient Monte Carlo simulations (in all graphs the error bars were smaller than the symbol sizes) is the fact that words are learned independently of each other. In this limited context, the linear associative algorithm considered here corresponds to the optimal learning strategy. Clearly, more efficient learning strategies are available if one allows interactions between words. For example, the mutual exclusivity constraint [28] directs children to map novel words to unnamed referents: if there are two objects present and one has a known name, the child infers that the novel name refers to the second object. However, it is not clear what the effect of noise would be on such inflexible built-in principles. In any event, the results presented here can be viewed as upper bounds to the learning error of more sophisticated associative or hypothesis-testing learning mechanisms.

In summary, this paper offers a successful application of finite-size scaling, a method from statistical physics [26], to characterize the dependence on the number of learning trials near the critical noise parameter [29]. Similar rewarding uses of this technique have been made for a variety of problems


outside the realm of physics, such as combinatorial optimization30 , ecology31 and prebiotic evolution32 , to name just a few. Acknowledgments This research was supported by The Southern Office of Aerospace Research and Development (SOARD), Grant No. FA9550-10-1-0006, and Conselho Nacional de Desenvolvimento Cient´ıfico e Tecnol´ogico (CNPq). P.F.C.T. was supported by Funda¸ca˜o de Amparo `a Pesquisa do Estado de S˜ao Paulo (FAPESP). References 1. D. Bickerton, Adam’s Tongue: How Humans Made Language, How Language Made Humans (Hill and Wang, New York, 2009). 2. F. de Saussure, Course in General Linguistics (McGraw-Hill Book Company, New York, 1966). 3. I. Ulbaek, The origin of language and cognition, in Approaches to the Evolution of Language Eds.: J. R. Hurford, M. Studdert-Kennedy and C. Knight (Cambridge University Press, Cambridge, UK, 1998). 4. M. D. Hauser, The Evolution of Communication (MIT Press, Cambridge, MA, 1997). 5. S. Pinker, The Language Instinc (The Penguin Press, London, 1994). 6. P. Lee, The Whorf Theory Complex: A Critical Reconstruction (John Benjamins, New York, 1996). 7. J. F. Fontanari and L. I. Perlovsky, Neural Networks 21, 250 (2008). 8. J. R. Hurford, Lingua 77, 187 (1989). 9. M. A. Nowak and D. C. Krakauer, Proc. Natl. Acad. Sci. USA 96, 8028 (1999). 10. J. F. Fontanari and L. I. Perlovsky, Phys. Rev. E 70, 042901 (2004). 11. J. F. Fontanari and L. I. Perlovsky, IEEE Trans. Evol. Comput. 11, 758 (2007). 12. J. F. Fontanari and L. I. Perlovsky, Theory Biosci. 127, 205 (2008). 13. H. Brighton, K. Smith and S. Kirby, Phys. Life Rev. 2, 177 (2005). 14. V. Loreto and L. Steels, Nature Physics 3, 758 (2007). 15. J. F. Fontanari and A. Cangelosi, Interaction Studies 12, 119 (2011). 16. P. Bloom, How children learn the meaning of words (MIT Press, Cambridge, MA, 2000). 17. S. Pinker, Language learnability and language development (Harvard University Press, Cambridge, MA, 1984). 18. L. Gleitman, Language Acquisition 1, 1 (1990). 19. J. M. 
Siskind, Cognition 61, 39 (1996). 20. A. D. M. Smith, Lecture Notes in Artificial Intelligence 2801, 499 (2003). 21. A. D. M. Smith, Artificial Life 9, 557 (2003).

May 7, 2013

11:44

BC: 8846 - BIOMAT 2012

14˙tilles

255

22. J. F. Fontanari, V. Tikhanoff, A. Cangelosi, R. Ilin and L. I. Perlovsky, Neural Networks 22, 579 (2009).
23. K. Smith, A. D. M. Smith, R. A. Blythe and P. Vogt, Lecture Notes in Computer Science 4211, 31 (2006).
24. R. A. Blythe, K. Smith and A. D. M. Smith, Cognitive Science 34, 620 (2010).
25. R. R. Bush and F. Mosteller, Stochastic Models for Learning (Wiley, New York, 1955).
26. V. Privman, Finite-Size Scaling and Numerical Simulations of Statistical Systems (World Scientific, Singapore, 1990).
27. P. F. C. Tilles and J. F. Fontanari, Journal of Mathematical Psychology 56, 396 (2012).
28. E. M. Markman, Cognitive Science 14, 57 (1990).
29. P. F. C. Tilles and J. F. Fontanari, Europhysics Letters 99, 60001 (2012).
30. S. Kirkpatrick and B. Selman, Science 264, 1297 (1994).
31. J. R. Banavar, J. L. Green, J. Harte and A. Maritan, Phys. Rev. Lett. 83, 4212 (1999).
32. P. R. A. Campos and J. F. Fontanari, Phys. Rev. E 58, 2664 (1998).


RELATIONSHIP BETWEEN RAINFALL AND CONTROL EFFECTIVENESS OF THE AEDES AEGYPTI POPULATION THROUGH A NON-LINEAR DYNAMICAL MODEL: CASE OF LAVRAS CITY, BRAZIL

L. B. BARSANTE, R. T. N. CARDOSO, J. L. ACEBAL
Centro Federal de Educação Tecnológica de Minas Gerais
Av. Amazonas 7675, Nova Gameleira, Belo Horizonte, Minas Gerais, Brasil
E-mail: [email protected]

M. M. MORAIS, A. E. EIRAS
Laboratório Universidade Federal de Minas Gerais
Av. Presidente Antonio Carlos, 6.627, Pampulha, Belo Horizonte, Minas Gerais, Brasil

Despite the efforts of public managers and society to control the dengue vector, Aedes aegypti, cases of dengue are reported periodically around the world, mainly in tropical and even subtropical regions. Although the budgets of government agencies for the control of this vector have been growing every year worldwide, in most countries these resources are often scarce. It is therefore desirable to improve the effectiveness of the control actions that keep the Ae. aegypti population at acceptable levels in the environment. One way to do so is to determine the best period of the year in which to perform public health control actions. In Brazil, the Ministry of Health recommends starting control at the beginning of the wet season; experts conjecture instead that dengue vector control should be brought forward to the cold and dry seasons. Under this proposed approach, the number of dengue infections in the subsequent wet season is expected to decrease, at lower cost and with lower economic and social impact. To test this conjecture, we propose and analyze a mathematical model of four populations, expressed through non-linear differential equations, that describes the population dynamics of the life stages of Ae. aegypti. The coefficients of the model depend on the rainfall index. The model allows control to be applied in any week of the year, so that the relative efficacy of different weeks can be compared. The model was evaluated using actual rainfall data for the city of Lavras, Minas Gerais, Brazil, and was solved numerically with MATLAB R2009b.


1. Introduction

Dengue has become in recent years a major international public health concern. It is currently estimated that there may be 50 to 100 million dengue infections worldwide every year, occurring mainly in tropical and subtropical regions [1] because of the environmental and climatic dependence of the disease [2]. Several species of mosquitoes of the genus Aedes can transmit the dengue virus, but Ae. aegypti is the main dengue vector, mainly because of its adaptation to urban areas. The mosquito has a complete life cycle consisting of four development stages: egg, larva, pupa and adult. Although the budgets of government agencies for the control of this vector have been growing every year worldwide, in most countries these resources are often scarce. It is therefore essential to optimize the effectiveness of the control actions that keep the Ae. aegypti population at acceptable levels in the environment.

The dependence of the mosquito development stages on climate makes dengue fever a seasonal disease, whose incidence rises in summer, when increasing rainfall and high temperatures favour the reproduction and survival of the vector [3]. Usually, dengue vector control actions are not permanent throughout the year; they often take place in the periods in which the vector population increases with rainfall [4]. In Brazil, as recommended by the Ministry of Health, disease prevention efforts come into effect after the beginning of the rainy season. However, a set of urban conditions allows the dengue vector, Ae. aegypti, to find breeding sites outside the seasonal period [5], which makes it able to transmit the disease throughout the year. One of the authors, an expert in Ae. aegypti ecology and control, conjectures that public health dengue vector control should be brought forward to the cold and dry seasons of the year in order to reduce the number of annual infections, at lower cost and with less social impact, in contrast to the actions recommended by the Brazilian Ministry of Health.

In this work, we propose and analyze a computational mathematical model of four populations, expressed through non-linear differential equations, that describes the population dynamics of the stages of the life cycle of the vector. The model was evaluated using actual rainfall data for the city of Lavras, Minas Gerais, Brazil, and was solved numerically with MATLAB R2009b.


2. Formulation of the model

The present mathematical model is a dynamical system expressed by nonlinear differential equations. It describes the dynamics of four populations, corresponding to the development stages of Ae. aegypti, with rates that vary linearly with the weekly cumulative rainfall of a particular city. The egg population is represented by $E(t)$, the aquatic population (larvae + pupae) by $A(t)$, the population of pre-bloodmeal females by $F_1(t)$ and the population of mated, or post-bloodmeal, females by $F_2(t)$.

$$
\begin{cases}
\dfrac{dE}{dt} = \phi(p)\left(1 - \dfrac{E(t)}{C(p)}\right)F_2(t) - \sigma_A(p)E(t) - \mu_E(p)E(t) - c_E(t)E(t) \\[2ex]
\dfrac{dA}{dt} = \sigma_A(p)E(t) - \gamma(p)A(t) - \mu_A(p)A(t) - c_A(t)A(t) \\[2ex]
\dfrac{dF_1}{dt} = \gamma(p)A(t) - \beta(p)F_1(t) - \mu_{F_1}(p)F_1(t) - c_{F_1}(t)F_1(t) \\[2ex]
\dfrac{dF_2}{dt} = \beta(p)F_1(t) - \mu_{F_2}(p)F_2(t) - c_{F_2}(t)F_2(t)
\end{cases}
\tag{1}
$$

The parameters $t$ and $p$ stand, respectively, for time and rainfall index. The first equation of model (1) describes the rate of change of the egg population $E(t)$: the post-bloodmeal females $F_2(t)$ lay eggs at a per-capita oviposition rate given by $\phi(p)\left(1 - E(t)/C(p)\right)$. The factor in parentheses reduces $\phi(p)$ as the egg population approaches the carrying capacity $C(p)$, which represents the capacity of the environment to support individuals and is associated with the abundance of nutrients and the availability of breeding sites, among other factors. The factor $\sigma_A(p)$ is the per-capita rate at which eggs develop into the aquatic stage (larvae and pupae). The population $E(t)$ decreases through natural mortality at a per-capita rate $\mu_E(p)$ and through additional mortality at a per-capita rate $c_E(t)$. The second equation of model (1) describes the rate of change of the aquatic population $A(t)$. This population grows through the term with factor $\sigma_A(p)$. The term $\gamma(p)$ is the per-capita rate at which individuals develop from the aquatic stage into the winged stages represented by $F_1(t)$ and $F_2(t)$. The population $A(t)$ declines


due to natural mortality of the species at a per-capita rate $\mu_A(p)$ and to the additional mortality caused by control actions at a per-capita rate $c_A(t)$. The rate of change of the population of pre-bloodmeal females $F_1(t)$ is described by the third equation. This population grows at the per-capita rate $\gamma(p)$ and declines as females mate, take a blood meal and move to $F_2(t)$ at a per-capita rate $\beta(p)$. It also declines through natural mortality at a per-capita rate $\mu_{F_1}(p)$ and through the additional mortality caused by control actions at a per-capita rate $c_{F_1}(t)$. The last equation of model (1) represents the rate of change of the population of post-bloodmeal females $F_2(t)$. This population grows at the per-capita rate $\beta(p)$ and declines through natural mortality at a per-capita rate $\mu_{F_2}(p)$ and through the additional mortality caused by control actions at a per-capita rate $c_{F_2}(t)$.

Model (1) is essentially non-autonomous, since its coefficients depend on time and on rainfall, which is in turn a function of time. Nevertheless, because the model is discretized for numerical solution, it behaves as a succession of autonomous models whose coefficients are functions of the cumulative weekly rainfall. As a first modelling approach, we adopt a linear parametrization for the dependence of the model parameters on rainfall $p = p(t)$. That is, the natural parameters of the model, $\phi(p)$, $C(p)$, $\sigma_A(p)$, $\gamma(p)$, $\mu_{F_1}(p)$, $\mu_{F_2}(p)$, $\mu_A(p)$, $\mu_E(p)$ and $\beta(p)$, depend linearly on the rainfall index $p$, as exemplified for $\phi(p)$:

$$
\phi(p) = \phi_{\min} + \frac{(\phi_{\max} - \phi_{\min})}{(p_{\max} - p_{\min})}\,(p - p_{\min})
\tag{2}
$$

The values $p_{\max}$ and $p_{\min}$ were taken as climatic characteristics of the region. The ranges of the entomological parameters $\phi$, $C$, $\sigma_A$, $\gamma$, $\mu_{F_1}$, $\mu_{F_2}$, $\mu_A$, $\mu_E$ and $\beta$, whose units are days$^{-1}$ [6,7], and the control rates $c_E$, $c_A$, $c_{F_1}$ and $c_{F_2}$ adopted in model (1) are given in Table 1.
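As an illustration of the linear parametrization in equation (2), the following Python sketch interpolates each natural parameter between the bounds listed in Table 1. The function names, the dictionary layout and the example rainfall bounds (0 and 200 mm) are assumptions made for this illustration only; the original computations were done in MATLAB and are not reproduced here.

```python
# Bounds (value at p_min, value at p_max) read from Table 1.
RANGES = {
    "phi": (0.56, 11.2),
    "C": (0.03, 1.0),
    "sigma_A": (0.01, 0.5),
    "gamma": (0.06, 0.16),
    "beta": (0.2, 0.2),
    "mu_E": (0.01, 0.01),
    "mu_A": (0.164, 0.164),
    "mu_F1": (0.043, 0.17),
    "mu_F2": (0.057, 0.17),
}


def linear_rate(p, p_min, p_max, v_min, v_max):
    """Equation (2): value of a parameter at rainfall index p."""
    if p_max == p_min:
        raise ValueError("p_max must differ from p_min")
    return v_min + (v_max - v_min) * (p - p_min) / (p_max - p_min)


def rates_for_rainfall(p, p_min, p_max):
    """Apply equation (2) to every natural parameter of model (1)."""
    return {
        name: linear_rate(p, p_min, p_max, lo, hi)
        for name, (lo, hi) in RANGES.items()
    }


# Hypothetical rainfall bounds in mm, used only for this example.
dry = rates_for_rainfall(0.0, 0.0, 200.0)
mid = rates_for_rainfall(100.0, 0.0, 200.0)
wet = rates_for_rainfall(200.0, 0.0, 200.0)
```

Parameters whose two bounds coincide in Table 1, such as $\mu_A$ and $\beta$, stay constant under this rule regardless of rainfall.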

Table 1. Intervals of the model parameters. The maximum and minimum values are associated with the maximum and minimum weekly rainfall index, $p_{\max}$ and $p_{\min}$, respectively.

Rate                   Range            Rate     Range          Rate           Range
$\phi$                 0.56 - 11.2      $C$      0.03 - 1       $\sigma_A$     0.01 - 0.5
$\mu_E$                0.01 - 0.01      $c_E$    0.3 - 0.3      $\gamma$       0.06 - 0.16
$\mu_A$                0.164 - 0.164    $c_A$    0.3 - 0.3      $\mu_{F_1}$    0.043 - 0.17
$c_{F_1} = c_{F_2}$    0 - 0            $\beta$  0.2 - 0.2      $\mu_{F_2}$    0.057 - 0.17

3. Analysis of Equilibrium and Stability

The system (1) exhibits two equilibrium points:

$$
P_0(E^*, A^*, F_1^*, F_2^*) = (0, 0, 0, 0)
\tag{3}
$$

$$
P_1(E^*, A^*, F_1^*, F_2^*) = \left( C\left(1 - \frac{1}{R_M}\right),\ \frac{\sigma_A}{(\gamma + \mu_A + c_A)}E^*,\ \frac{\gamma}{(\beta + \mu_{F_1} + c_{F_1})}A^*,\ \frac{\beta}{(\mu_{F_2} + c_{F_2})}F_1^* \right)
\tag{4}
$$

where $R_M$ denotes the basic reproduction rate of the model for the Ae. aegypti population, given by

$$
R_M = \frac{\phi\,\sigma_A\,\gamma\,\beta}{(\sigma_A + \mu_E + c_E)(\gamma + \mu_A + c_A)(\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})}
\tag{5}
$$

From (5), $R_M \ge 0$ by definition, because all parameters are positive; from (4), we must have $R_M \ge 1$ for the model to be ecologically meaningful, with non-negative populations. In the case $R_M = 1$ we have $P_1 = P_0$, which is a bifurcation point. The condition $R_M > 1$ is therefore necessary for populations of the Ae. aegypti stages to exist. The stability of these equilibrium points can be assessed from the eigenvalues of the Jacobian matrix $B$ associated with the local linearization of system (1) around each equilibrium point. The Jacobian matrix $B$ evaluated at the equilibrium point (3) is given by

$$
B_{P_0} =
\begin{pmatrix}
-(\sigma_A + \mu_E + c_E) & 0 & 0 & \phi \\
\sigma_A & -(\gamma + \mu_A + c_A) & 0 & 0 \\
0 & \gamma & -(\beta + \mu_{F_1} + c_{F_1}) & 0 \\
0 & 0 & \beta & -(\mu_{F_2} + c_{F_2})
\end{pmatrix}
$$
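The threshold (5) and the non-trivial equilibrium (4) are simple to evaluate numerically. The Python sketch below computes $R_M$ and $P_1$ for one fixed set of rates and checks that $P_1$ makes the right-hand side of model (1) vanish. The function names are hypothetical, and the sample values are the upper bounds from Table 1 with no control, chosen only for illustration.

```python
def threshold_and_equilibrium(phi, C, sigma_A, gamma, beta,
                              mu_E, mu_A, mu_F1, mu_F2,
                              c_E=0.0, c_A=0.0, c_F1=0.0, c_F2=0.0):
    """Return R_M from equation (5) and P1 from equation (4)."""
    a = sigma_A + mu_E + c_E      # total loss rate of eggs
    b = gamma + mu_A + c_A        # total loss rate of the aquatic stage
    c = beta + mu_F1 + c_F1       # total loss rate of pre-bloodmeal females
    d = mu_F2 + c_F2              # total loss rate of post-bloodmeal females
    R_M = phi * sigma_A * gamma * beta / (a * b * c * d)
    if R_M <= 0.0:
        raise ValueError("R_M must be positive to evaluate P1")
    E = C * (1.0 - 1.0 / R_M)
    A = sigma_A / b * E
    F1 = gamma / c * A
    F2 = beta / d * F1
    return R_M, (E, A, F1, F2)


def rhs(state, phi, C, sigma_A, gamma, beta,
        mu_E, mu_A, mu_F1, mu_F2,
        c_E=0.0, c_A=0.0, c_F1=0.0, c_F2=0.0):
    """Right-hand side of model (1) for fixed rates."""
    E, A, F1, F2 = state
    dE = phi * (1.0 - E / C) * F2 - (sigma_A + mu_E + c_E) * E
    dA = sigma_A * E - (gamma + mu_A + c_A) * A
    dF1 = gamma * A - (beta + mu_F1 + c_F1) * F1
    dF2 = beta * F1 - (mu_F2 + c_F2) * F2
    return (dE, dA, dF1, dF2)


# Illustrative rates: upper bounds of Table 1, no control.
rates = dict(phi=11.2, C=1.0, sigma_A=0.5, gamma=0.16, beta=0.2,
             mu_E=0.01, mu_A=0.164, mu_F1=0.17, mu_F2=0.17)
R_M, P1 = threshold_and_equilibrium(**rates)
residual = rhs(P1, **rates)   # should be zero up to rounding
```

With these sample rates $R_M$ exceeds one, so all four components of $P_1$ are positive, in line with the condition stated above.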


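Calculating the characteristic polynomial, we obtain:

$$
\begin{aligned}
&\lambda^4 + \left[(\mu_{F_2} + c_{F_2}) + (\sigma_A + \mu_E + c_E) + (\gamma + \mu_A + c_A) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda^3 \\
&+ (\mu_{F_2} + c_{F_2})\left[(\sigma_A + \mu_E + c_E) + (\gamma + \mu_A + c_A) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda^2 \\
&+ (\gamma + \mu_A + c_A)\left[(\sigma_A + \mu_E + c_E) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda^2 \\
&+ (\beta + \mu_{F_1} + c_{F_1})(\sigma_A + \mu_E + c_E)\lambda^2 \\
&+ (\gamma + \mu_A + c_A)(\sigma_A + \mu_E + c_E)\left[(\mu_{F_2} + c_{F_2}) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda \\
&+ (\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})\left[(\gamma + \mu_A + c_A) + (\sigma_A + \mu_E + c_E)\right]\lambda \\
&+ (\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})(\gamma + \mu_A + c_A)(\sigma_A + \mu_E + c_E) - \sigma_A\phi\beta\gamma = 0
\end{aligned}
\tag{6}
$$

The Jacobian matrix $B$ evaluated at the equilibrium point (4) is given by

$$
B_{P_1} =
\begin{pmatrix}
\dfrac{-\phi\beta\gamma\sigma_A}{(\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})(\gamma + \mu_A + c_A)} & 0 & 0 & \dfrac{\phi}{R_M} \\
\sigma_A & -(\gamma + \mu_A + c_A) & 0 & 0 \\
0 & \gamma & -(\beta + \mu_{F_1} + c_{F_1}) & 0 \\
0 & 0 & \beta & -(\mu_{F_2} + c_{F_2})
\end{pmatrix}
$$

Calculating the characteristic polynomial, we obtain:

$$
\begin{aligned}
&\lambda^4 + \left[(\mu_{F_2} + c_{F_2}) + (\gamma + \mu_A + c_A) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda^3
+ \frac{\phi\beta\gamma\sigma_A}{(\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})(\gamma + \mu_A + c_A)}\lambda^3 \\
&+ (\mu_{F_2} + c_{F_2})\left[(\gamma + \mu_A + c_A) + (\beta + \mu_{F_1} + c_{F_1})\right]\lambda^2
+ (\gamma + \mu_A + c_A)(\beta + \mu_{F_1} + c_{F_1})\lambda^2 \\
&+ \frac{\phi\beta\gamma\sigma_A}{(\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})}\lambda^2
+ \frac{\phi\beta\gamma\sigma_A}{(\mu_{F_2} + c_{F_2})(\gamma + \mu_A + c_A)}\lambda^2
+ \frac{\phi\beta\gamma\sigma_A}{(\beta + \mu_{F_1} + c_{F_1})(\gamma + \mu_A + c_A)}\lambda^2 \\
&+ \left[\frac{\sigma_A\phi\beta\gamma}{(\mu_{F_2} + c_{F_2})} + \frac{\sigma_A\phi\beta\gamma}{(\beta + \mu_{F_1} + c_{F_1})} + \frac{\sigma_A\phi\beta\gamma}{(\gamma + \mu_A + c_A)}\right]\lambda
+ (\beta + \mu_{F_1} + c_{F_1})(\mu_{F_2} + c_{F_2})(\gamma + \mu_A + c_A)\lambda \\
&+ \sigma_A\phi\beta\gamma - \frac{\sigma_A\phi\beta\gamma}{R_M} = 0
\end{aligned}
\tag{7}
$$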

The characteristic polynomials (6) and (7) are of degree 4 in $\lambda$ and have the form $\lambda^4 + a_1\lambda^3 + a_2\lambda^2 + a_3\lambda + a_4 = 0$. According to the Routh-Hurwitz criterion, an equilibrium point is locally asymptotically stable if all roots of the characteristic polynomial associated with the Jacobian matrix have negative real part or, equivalently, if $a_1 > 0$, $a_3 > 0$, $a_4 > 0$ and $a_1 a_2 a_3 > a_3^2 + a_4 a_1^2$. Thus, we find that the equilibrium point (3) is locally asymptotically stable if $0 < R_M < 1$, and the equilibrium point (4) is locally asymptotically stable if $R_M > 1$.

4. Results

The model (1) was solved numerically with the fourth-order Runge-Kutta method implemented in MATLAB R2009b. The rainfall data used as input for the linearly dependent parameters was the weekly accumulated rainfall index of the city of Lavras, Minas Gerais, Brazil, obtained from the National Institute for Space Research (INPE). The data cover epidemiological weeks 9 to 52 of 2009 and epidemiological weeks 1 to 46 of 2010. The maximum environmental carrying capacity was set to one, $C_{\max} = 1$, so that population sizes are expressed as fractions of unity. The figures below show the evolution of the population sizes of Ae. aegypti without control, compared with the rainfall index for the city of Lavras, Minas Gerais, Brazil, over the period of study.
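The numerical procedure described above can be sketched in Python as follows: rates are held constant within each week according to equation (2), and model (1) is advanced with the classical fourth-order Runge-Kutta method. This is a minimal sketch and not the authors' MATLAB code; the rainfall series, the initial state, the step size and all function names are hypothetical, and the parameter bounds are those read from Table 1.

```python
# Bounds (value at p_min, value at p_max) read from Table 1.
RANGES = {
    "phi": (0.56, 11.2), "C": (0.03, 1.0), "sigma_A": (0.01, 0.5),
    "gamma": (0.06, 0.16), "beta": (0.2, 0.2), "mu_E": (0.01, 0.01),
    "mu_A": (0.164, 0.164), "mu_F1": (0.043, 0.17), "mu_F2": (0.057, 0.17),
}


def rates_for(p, p_min, p_max):
    """Equation (2) applied to every natural parameter."""
    w = (p - p_min) / (p_max - p_min)
    return {k: lo + (hi - lo) * w for k, (lo, hi) in RANGES.items()}


def rhs(y, r, c_E, c_A):
    """Right-hand side of model (1); control acts on E and A only."""
    E, A, F1, F2 = y
    return (
        r["phi"] * (1.0 - E / r["C"]) * F2
        - (r["sigma_A"] + r["mu_E"] + c_E) * E,
        r["sigma_A"] * E - (r["gamma"] + r["mu_A"] + c_A) * A,
        r["gamma"] * A - (r["beta"] + r["mu_F1"]) * F1,
        r["beta"] * F1 - r["mu_F2"] * F2,
    )


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k1)))
    k3 = f(tuple(v + 0.5 * h * k for v, k in zip(y, k2)))
    k4 = f(tuple(v + h * k for v, k in zip(y, k3)))
    return tuple(v + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for v, a, b, c, d in zip(y, k1, k2, k3, k4))


def simulate(weekly_rain, y0, control_week=None, control_rate=0.3,
             steps_per_day=10):
    """Return the state at the end of each week (index 0 is y0)."""
    p_min, p_max = min(weekly_rain), max(weekly_rain)
    h = 1.0 / steps_per_day            # time step in days
    y = tuple(y0)
    history = [y]
    for week, p in enumerate(weekly_rain):
        r = rates_for(p, p_min, p_max)  # rates frozen during the week
        c = control_rate if week == control_week else 0.0
        for _ in range(7 * steps_per_day):
            y = rk4_step(lambda s: rhs(s, r, c, c), y, h)
        history.append(y)
    return history


rain = [10.0, 50.0, 120.0, 200.0]      # hypothetical weekly rainfall (mm)
y0 = (0.01, 0.01, 0.01, 0.01)          # hypothetical initial state
free = simulate(rain, y0)
controlled = simulate(rain, y0, control_week=1)
```

Freezing the rates within each week mirrors the description of the model as a succession of autonomous systems; a finer rainfall resolution would only change how `rates_for` is called.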

[Figure 1: two panels showing E(t) and A(t) together with rainfall (mm), plotted against epidemiological week.]

Figure 1. Time evolution of the population sizes of the Ae. aegypti development stages E(t) and A(t) compared with the rainfall index for the city of Lavras, Minas Gerais, Brazil, over the period of study.

Note that the population sizes of the development stages $E(t)$, $A(t)$ and $F_1(t)$ follow the rainfall curve with a delay of approximately one week between peaks, while the population $F_2(t)$ shows a delay of approximately two weeks. The control action was applied to the populations $E(t)$ and $A(t)$, because public health programs usually perform control by mechanical removal of breeding sites. We focused our analysis on the population $F_2(t)$, because that population is responsible for dengue infection and because it is the population captured by adult traps placed as breeding sites. The control rates were estimated and held constant over a specified week, with $c_E(t) = c_A(t) = 0.3$. The model was evaluated in three ways: first, with control placed in a low rainfall index week (LRW); second, with control placed in a high rainfall index week (HRW); and finally, without a control week (WCW). For the evaluation, relative differences $N(t)$ and $M(t)$ were defined, comparing the LRW and WCW cases and the HRW and WCW cases, respectively.

$$
N(t) = \frac{[F_2(t)]_{\mathrm{LRW}} - [F_2(t)]_{\mathrm{WCW}}}{[F_2(t)]_{\mathrm{WCW}}}
$$

[Figure 2: two panels showing F1(t) and F2(t) together with rainfall (mm), plotted against epidemiological week.]

Figure 2. Time evolution of the population sizes of the Ae. aegypti development stages F1(t) and F2(t) compared with the rainfall index for the city of Lavras, Minas Gerais, Brazil, over the period of study.

$$
M(t) = \frac{[F_2(t)]_{\mathrm{HRW}} - [F_2(t)]_{\mathrm{WCW}}}{[F_2(t)]_{\mathrm{WCW}}}
$$

The figure below compares the effectiveness of control performed in a low rainfall index week, week 27 (LRW-27), and in a high rainfall index week, week 47 (HRW-47). The curves represent the relative differences between the time evolution of the model with control performed at the specified epidemiological week and the time evolution of the model without control (WCW). $M(t)$ compares HRW with WCW, and $N(t)$ compares LRW with WCW.

[Figure 3: curves M(t) and N(t), relative differences in percent, plotted against epidemiological weeks.]

Figure 3. Relative differences of the population F2(t) with control at week 27 (LRW) compared with the case without control week (WCW), and with control at week 47 (HRW) compared with the case without control week (WCW), for the city of Lavras, Minas Gerais, Brazil.

Next, the areas enclosed by the population-depletion curves were evaluated by integrals over the period of study $I$:

$$
I_{\mathrm{LRW}} = \int_I N(t)\,dt, \qquad I_{\mathrm{HRW}} = \int_I M(t)\,dt
$$

Finally, the relative percentage difference of the $F_2(t)$ depletion areas was defined in order to compare the effectiveness of the control actions performed in a LRW against those performed in a HRW:

$$
R = \frac{I_{\mathrm{LRW}} - I_{\mathrm{HRW}}}{I_{\mathrm{HRW}}} \times 100\%
\tag{8}
$$
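The comparison metric can be illustrated with a short Python sketch that builds the relative differences $N(t)$ and $M(t)$ from sampled $F_2(t)$ series, integrates them and evaluates equation (8). The trapezoidal rule and the function names are assumptions of this sketch, since the text does not state which quadrature was used, and the three series are made-up numbers rather than model output.

```python
def relative_difference(controlled, uncontrolled):
    """N(t) or M(t): relative change of F2 with respect to the WCW case."""
    return [(c - u) / u for c, u in zip(controlled, uncontrolled)]


def trapezoid(values, dt=1.0):
    """Trapezoidal approximation of the integral of equally spaced samples."""
    return dt * sum(0.5 * (a + b) for a, b in zip(values, values[1:]))


def relative_percentage(f2_lrw, f2_hrw, f2_wcw, dt=1.0):
    """Equation (8): R = (I_LRW - I_HRW) / I_HRW * 100."""
    i_lrw = trapezoid(relative_difference(f2_lrw, f2_wcw), dt)
    i_hrw = trapezoid(relative_difference(f2_hrw, f2_wcw), dt)
    return (i_lrw - i_hrw) / i_hrw * 100.0


# Made-up F2 samples: control in the LRW halves F2 at the middle sample,
# control in the HRW reduces it by a quarter.
f2_wcw = [1.0, 1.0, 1.0]
f2_lrw = [1.0, 0.5, 1.0]
f2_hrw = [1.0, 0.75, 1.0]
R = relative_percentage(f2_lrw, f2_hrw, f2_wcw)
```

In this toy case the LRW depletion area is twice the HRW one, so $R = 100\%$. Both integrals are negative when control reduces $F_2(t)$, and a positive $R$ then means that the LRW control depleted the population more.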

Table 2 shows the relative percentage difference, calculated by equation (8), between the areas of the $F_2(t)$ population dynamics of model (1) when control is applied in each of three weeks of low rainfall index (LRW) and in each of three weeks of high rainfall index (HRW). All possible pairings of these weeks were considered.

Table 2. Percentage difference of the area of the population F2(t) of model (1) between control actions performed in a LRW and in a HRW, for the city of Lavras, Minas Gerais, Brazil.

Population F2(t)

LRW \ HRW        38           47           51
24           -28.31 %     146.62 %      30.18 %
27           209.00 %     962.95 %     461.08 %
30            96.79 %     576.95 %     257.33 %

Analyzing this data set, we found that control performed in epidemiological weeks of low rainfall produces a different percentage area of the population $F_2(t)$ of model (1) than control performed in epidemiological weeks of heavy rain. The only pair for which control in a LRW was disadvantageous, in terms of the size of the post-bloodmeal female population $F_2(t)$, compared with control in a HRW was the pair of weeks (24, 38). Control in a LRW was significantly more advantageous than control in a HRW for the pairs of weeks (24, 51), (30, 38) and (24, 47), with values ranging from 30.18% to 146.62%. For the remaining pairs, control in a LRW was very favorable, with values ranging from 209% to 962.95%. In this model it is possible to implement the control in any week of the year and to compare the relative effectiveness of different weeks.

4.1. Conclusion

The modeled control actions were up to 962.95% more efficient when performed in an epidemiological week of low rainfall index than when performed in an epidemiological week of high rainfall index, considering the actual weekly accumulated rainfall data for 2009/2010 for the city of Lavras, Minas Gerais, Brazil. Therefore, for this case, the conjecture that suggests control in the dry season was verified. Future work should include, along with rainfall, the effects of temperature and humidity on the entomological parameters of model (1). Optimization methods can be used to refine the parameters of the model. Similarly, the linear dependence of the parameters of model (1) on the rainfall index could be replaced, through the optimization process, by suitable non-linear dependences. The time intervals at which control is applied can also be an object of study, seeking the


best time to perform control through public health policies. The parametrization should be successively optimized to approach the actual behaviour of the species under a varying climate.

Acknowledgments

The authors acknowledge the financial support of the Coordination for Enhancement of Higher Education Personnel (CAPES), the Minas Gerais State Research Foundation (FAPEMIG) and the Federal Center for Technological Education of Minas Gerais State (CEFET-MG).

References

1. World Health Organization, Dengue prevention and control, Report by the secretariat (2001).
2. World Health Organization, Dengue haemorrhagic fever prevention and control, Regional Office for South-East Asia (2003).
3. D. J. Gubler, The arboviruses epidemiology and ecology, In: Monath TP., Boca Raton, Florida, 2 (1989).
4. M. de F. Lenzi and L. C. Coura, Prevenção da dengue: a informação em foco, in Portuguese, Rev. Soc. Bras. Med. Trop., 37 (4) (2004).
5. D. M. Watts, D. S. Burke, B. A. Harrison, R. E. Whitmire and A. Nisalak, Effect of temperature on the vector effectiveness of Aedes aegypti for dengue 2 virus, Am. J. Trop. Med. Hyg., 36 (1987).
6. C. P. Ferreira and H. M. Yang, Estudo da transmissão da dengue entre os indivíduos em interação com a população de mosquitos Aedes aegypti, in Portuguese, Tend. Mat. Apl. Comput. Rev., 4 (3) (2003).
7. C. P. Ferreira, S. T. R. Pinho, L. Esteva, F. R. Barreto, V. C. Morato and Silva, and M. G. L. Teixeira, Modelling the dynamics of dengue real epidemics, CNMAC, 3 (2010).


SPATIOTEMPORAL DYNAMICS OF TELEGRAPH REACTION-DIFFUSION PREDATOR-PREY MODELS

ELISEO HERNANDEZ-MARTINEZ†, HECTOR PUEBLA‡∗, TERESA PEREZ-MUNOZ†, MARGARITA GONZALEZ-BRAMBILA‡ AND JORGE X. VELASCO-HERNANDEZ†

† Matemáticas Aplicadas y Computación, Instituto Mexicano del Petróleo, Eje Central Lázaro Cárdenas Norte 152, Col. San Bartolo Atepehuacan, Azcapotzalco, 07730, D.F. México
‡ Departamento de Energía, Universidad Autónoma Metropolitana, Azcapotzalco, Av. San Pablo No. 180, Reynosa-Tamaulipas, Azcapotzalco, 02200, D.F. México
∗ E-mail [email protected]

Reaction-diffusion (RD) equations are commonly used to describe propagation effects in population interactions in ecology. RD systems are described by parabolic partial differential equations (PDE) based on the Fick diffusion equation, and hence involve arbitrarily large population speeds. To avoid this unrealistic situation, we introduce Cattaneo's diffusion in spatiotemporal models of prey-predator interactions, which leads to more realistic and interesting interactions between populations and can help in understanding food-chain dynamics. The resulting model is the so-called telegraph RD model. Numerical simulations of a predator-prey model with Holling type II functional response and cross-diffusion show the effects of the relaxation time on the dynamics of population interactions.

1. Introduction

In the study of dispersive effects in population dynamics and the distribution of organisms, simple diffusion models are usually employed [1,2,3,4]. The diffusion model is obtained from the continuity equation and Fick's first law. With a pulse (i.e., a Dirac delta) as initial condition, it is well known that the solution of the diffusion equation is a Gaussian distribution, which implies that, even at very small times, a finite amount of the diffusing substance exists at large distances from the origin, and hence infinite velocities of propagation. For this reason, diffusive models based on Fick's law have been criticized. In ecological interactions in particular, Fickian diffusion implies unrealistic behavior that no organism exhibits [5].


Indeed, organisms cannot travel finite distances in an infinitely small amount of time. To avoid the unphysical effects produced by traditional diffusion, the use of Cattaneo's diffusion [6] in population interaction models has been proposed [7]. This generalization leads to a hyperbolic partial differential equation commonly called the telegraph reaction-diffusion (TRD) equation. In the TRD model the displacement of organisms or particles exhibits finite velocities and persistence trends, which indicates that TRD models can be a good alternative for describing the behavior of population interactions [7]. Unlike reaction-diffusion models based on Fick's law, TRD models have been little studied, and most of the work has focused on identifying mathematical properties. For example, Prajneshu [10] (1980) proposed the incorporation of a time delay in a model of two interacting species. He analyzed two special kernels, delay and random telegraph kernels, and discussed the effect of the time delay on the stability of the system. Holmes [7] (1993) analyzed the mathematical properties of invasion systems for both models (diffusion and telegraph), finding that both models predict very similar dispersal patterns for nonreproducing organisms. However, they predict grossly different rates of range expansion for all but a small range of parameter values. Ahmed et al. [11] (2001) introduced the two-dimensional TRD equation for interacting systems and, as an example, studied the predator-prey system, finding that under some conditions the system has homogeneous steady states. In addition to these theoretical aspects, there is considerable interest in the behavior of numerical approximations that exhibit spatial patterns. Fahmy and Abdusalam [12] (2009) used the factorization method to find explicit particular solutions for three population interaction models: a system of two reaction-diffusion equations, a simple epidemic model for spatial spread and a telegraph predator-prey model. The method they proposed is efficient and can be applied to solving spatiotemporal ecological models. We propose numerical schemes based on finite difference methods. Moreover, we study two important aspects of these patterns, the relaxation time and the cross-diffusion. Indeed, the study of stability mechanisms and bifurcation analysis of systems of interacting populations under the combined effect of delay time and cross-diffusion has become an important issue in ecology. For the numerical study, two interaction models are considered: (i) a predator-prey model with the Holling type II functional response and (ii) a TRD predator-prey model with cross-diffusion. We find that the TRD model can be a good alternative to the classical models for describing the dynamics of population interactions in which time delays are present.


The paper is organized as follows. In the next section, for the sake of completeness, a brief discussion of spatiotemporal prey-predator interactions is presented. In Section 3, the proposed spatiotemporal model of prey-predator interactions with Cattaneo's diffusion is presented. Details of the numerical schemes used to solve the TRD systems are then described. Next, numerical simulations for the case studies are presented. Finally, some concluding remarks are given in Section 5.

2. Spatiotemporal Prey-Predator Interactions

As one of the benchmark types of multi-species ecological interactions, prey-predator interactions have been studied through field observations, laboratory experiments and mathematical modeling. Prey-predator models describe the dynamics of two interacting populations: specifically, a population of prey that can be destroyed on contact with a population of predators. This interaction, together with competition and cooperation, is the basis of any ecosystem. The general Lotka-Volterra model is the starting point for a wide variety of models in ecology, biology, economics, chemistry, physics, etc. Several modifications of the classical Lotka-Volterra model have been used in different applications in order to add realism. For instance, Holling [13,14] suggested three different kinds of functional response for different kinds of species to model predation, incorporating not only a linear functional response but also satiation and switching behaviors. Another modification was introduced by Beddington [15] (1975) and DeAngelis et al. [16] (1975), which is a generalization of the type II functional response.

Ecological modelling of species dynamics where spatial effects are involved is important, practically useful and fascinating. Indeed, understanding the spatial behavior of interacting species in ecological systems is one of the central scientific problems in population ecology. Reaction-diffusion (RD) equations have been used to describe spatiotemporal dynamics in population dynamics. In general, a classical spatial predator-prey system can be written as an RD system with a simple Fickian diffusion mechanism for the spatial dispersal of species, describing self- and cross-diffusion. Self-diffusion means that the per capita diffusion rate of each species is influenced only by its own density, i.e., there is no response to the density of the other species. Cross-diffusion, on the other hand, means that the per capita diffusion rate of each species is influenced by the other one. Moreover, RD systems are described by parabolic partial differential equations (PDE),


which have a Gaussian distribution as solution; that is, at short times the RD equation propagates the individuals of a population with infinite velocity. However, population propagation takes a finite time, a fact that can be modeled using Cattaneo's diffusion model, which leads to a PDE model known as the reaction-diffusion telegraph equation. Spatiotemporal oscillatory dynamics can arise naturally in a class of RD models of predator-prey interactions. Indeed, since the seminal ideas of Segel and Jackson [17] (1972), a great amount of work has been carried out looking for biological mechanisms of pattern formation via Turing instabilities, in which a nonlinear system that is asymptotically stable in the absence of self- and cross-diffusion becomes unstable in their presence. These oscillations originate in a spatial gradient of the specific growth rate of the prey. The stability of a system of interacting populations in the presence of self- as well as cross-diffusion in prey-dependent predator-prey models has received much attention from both ecologists and mathematicians. However, in studies of spatiotemporal predator-prey systems with functional response, little attention has been paid to the effect of cross-diffusion. We propose and analyze numerically a spatiotemporal prey-predator model described by the TRD model. We consider a prey-dependent predator-prey interaction model with self- as well as cross-diffusion and investigate the role of the relaxation time.

3. Telegraph reaction-diffusion equation

In ecological systems, Fick's second law is used to describe standard diffusive processes. This law can be derived by combining the continuity equation

$$\frac{\partial u(x,t)}{\partial t} = -\frac{\partial J}{\partial x} \tag{1}$$

and a constitutive equation (Fick's first law),

$$J = -D\,\frac{\partial u(x,t)}{\partial x} \tag{2}$$

where $J$ denotes the flux, $u(x,t)$ the distribution function of the diffusing quantity, and $D$ the diffusion coefficient. Substituting Eq. (2) into Eq. (1), we obtain the traditional diffusion equation

$$\frac{\partial u(x,t)}{\partial t} = D\,\frac{\partial^2 u(x,t)}{\partial x^2} \tag{3}$$
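The derivation of Eq. (3) from Eqs. (1) and (2) can be mirrored numerically. The following is a minimal sketch, not the scheme used in this work: it evaluates Fick's first law at the cell faces and then applies the continuity equation, which makes mass conservation under zero-flux boundaries explicit. The grid, the time step, the explicit Euler update and the name `fick_step` are assumptions chosen for illustration.

```python
def fick_step(u, D, h, dt):
    """One explicit Euler step of Eq. (3), built from Eqs. (1) and (2)."""
    n = len(u)
    # Fick's first law, Eq. (2), at the n - 1 interior cell faces.
    interior = [-D * (u[i + 1] - u[i]) / h for i in range(n - 1)]
    # Assumed zero-flux boundaries: J = 0 at both ends of the domain.
    flux = [0.0] + interior + [0.0]
    # Continuity equation, Eq. (1): du/dt = -dJ/dx.
    return [u[i] - dt * (flux[i + 1] - flux[i]) / h for i in range(n)]


# Illustrative values only (grid and time step are assumptions).
D, h, dt = 0.01, 0.02, 0.01      # D * dt / h**2 = 0.25, below the 0.5 limit
u0 = [0.0] * 51
u0[25] = 1.0 / h                 # discrete unit pulse at the centre
u = u0
for _ in range(70):              # advance to t = 0.7
    u = fick_step(u, D, h, dt)
mass0 = sum(u0) * h
mass = sum(u) * h
```

Because the face fluxes cancel in pairs, `mass` stays equal to `mass0` up to round-off, which is a convenient sanity check for any implementation of the zero-flux condition.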


It is well known that, for constant $D$ and an initial delta distribution, the solution of Eq. (3) is a Gaussian distribution. Physically, this distribution implies that infinite propagation velocities exist at very short times. To avoid this unphysical phenomenon, Cattaneo [6] (1948) proposed a modification of Fick's first law,

$$J + \tau\,\frac{\partial J}{\partial t} = -D\,\frac{\partial u(x,t)}{\partial x} \tag{4}$$

where $\tau$ is a relaxation time. Combining Eq. (4) with Eq. (1) leads to

$$\frac{\partial u(x,t)}{\partial t} + \tau\,\frac{\partial^2 u(x,t)}{\partial t^2} = D\,\frac{\partial^2 u(x,t)}{\partial x^2} \tag{5}$$

Eq. (5) is Cattaneo's diffusion equation, also called the telegraph equation [18]. It is a generalization of Eq. (3): setting $\tau = 0$ recovers Fickian diffusion. Eq. (5) has been widely used to describe mass and heat transport phenomena [18,19,20]. Considering a Dirac delta as the initial distribution, $D = 0.01$ and $\tau = 0.1$, Figure 1 compares the distributions obtained with Fick and telegraph diffusion; it is interesting to note that the leading edge of the telegraph distribution is abrupt, unlike that of Fick's diffusion.

On the other hand, in ecological models the diffusive phenomena are coupled with reactive terms (i.e., population interactions). Therefore, for a reaction-diffusion system, the mass balance is described by

$$\frac{\partial u(x,t)}{\partial t} = -\frac{\partial J}{\partial x} + f(u(x,t)) \tag{6}$$

where $f(u)$ represents the interaction term. Using Fick's first law, Eq. (2), we obtain

$$\frac{\partial u(x,t)}{\partial t} = D\,\frac{\partial^2 u(x,t)}{\partial x^2} + f(u(x,t)) \tag{7}$$

and, using Eq. (4) in Eq. (6), the corresponding TRD equation is obtained:

$$\tau\,\frac{\partial^2 u(x,t)}{\partial t^2} + \left(1 - \tau\,\frac{\partial f(u(x,t))}{\partial u(x,t)}\right)\frac{\partial u(x,t)}{\partial t} = D\,\frac{\partial^2 u(x,t)}{\partial x^2} + f(u(x,t)) \tag{8}$$

Extending the TRD equation to prey-predator interactions with $n$ species, the general TRD model of prey-predator interactions is written as follows:

$$\tau\,\frac{\partial^2 u_j(x,t)}{\partial t^2} + \frac{\partial u_j(x,t)}{\partial t} - \tau\,\frac{\partial f_j}{\partial t} = \sum_{k=1}^{n} D_{jk}\,\frac{\partial^2 u_k(x,t)}{\partial x^2} + f_j(u_1(x,t), u_2(x,t), \ldots, u_n(x,t)) \tag{9}$$

where $j = 1, 2, \ldots, n$, and $D_{jk}$ denotes the self-diffusion ($k = j$) and cross-diffusion ($k \neq j$) coefficients.
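The abrupt leading edge mentioned for the telegraph model follows from the hyperbolic character of Eq. (5), whose characteristic speed is $\sqrt{D/\tau}$. The short sketch below evaluates that speed for the values quoted for Figure 1 and the interval a pulse released at $x = 0.5$ can have reached at the two times shown there. The function names are hypothetical and the calculation ignores reflections at the boundaries.

```python
import math


def telegraph_front_speed(D, tau):
    """Characteristic speed of tau*u_tt + u_t = D*u_xx, i.e. sqrt(D / tau)."""
    if tau <= 0:
        return math.inf          # Fickian limit: no finite front
    return math.sqrt(D / tau)


def front_interval(x0, t, D, tau):
    """Interval outside which a pulse released at x0 has not yet arrived."""
    reach = telegraph_front_speed(D, tau) * t
    return (x0 - reach, x0 + reach)


D, tau, x0 = 0.01, 0.1, 0.5      # values quoted for Figure 1
speed = telegraph_front_speed(D, tau)
intervals = {t: front_interval(x0, t, D, tau) for t in (0.4, 0.7)}
```

With these values the speed is about 0.316, so at $t = 0.4$ the pulse is confined to roughly $0.37 \le x \le 0.63$, whereas the Gaussian solution of Eq. (3) is nonzero everywhere for any $t > 0$.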


Figure 1. Comparison between telegraphic and diffusion models considering a pulse at x = 0.5 and zero-flux boundary conditions. [Panels at t = 0.4 and t = 0.7; curves: telegraph and diffusion; horizontal axis: x (space); vertical axis: concentration, u.]

4. Numerical Schemes

Several approaches have been taken to solve reaction-diffusion equations numerically. However, owing to the extensive results available on their stability and efficiency, finite difference (FD) methods are well suited to problems modeled with reaction-diffusion PDEs. In particular, robust and efficient FD schemes have been proposed for the numerical solution of ecological models [21]. In this work, we consider traditional FD schemes following some guidelines provided by Garvie [21] (2007). Considering an equidistant grid $X_{N+1} = \{x_a, x_1, \ldots, x_N, x_b\}$, where $x_a = a$ and $x_b = b$, with $x_i - x_{i-1} = h$ and $u_i(t) = u(x_i, t)$, one obtains the semi-discrete form of Eq. (7) as

$$\frac{du_i(t)}{dt} = D\,\frac{u_{i+1}(t) - 2u_i(t) + u_{i-1}(t)}{h^2} + f(u_i(t)) \tag{10}$$

where $i = 1, 2, \ldots, N$. The spatial operator (i.e., the second derivative) is approximated by a second-order central FD scheme. Numerical methods can be used for solving the $N$ ordinary differential equations (ODEs) given by


Eq. (10), such as Runge-Kutta methods. An alternative is to perform a full discretization that includes the temporal derivative. To this end, let $\delta = \Delta t$ be a time step, $t_k = t_0 + k\delta$ and $u_i^k = u_i(t_k)$. Assume an implicit Euler discretization of the time derivative in Eq. (10), that is,

$$\frac{du_i(t_k)}{dt} \approx \frac{u_i^k - u_i^{k-1}}{\delta}.$$

This leads to the following equation:

$$\frac{u_i^k - u_i^{k-1}}{\delta} = D\,\frac{u_{i+1}^k - 2u_i^k + u_{i-1}^k}{h^2} + f(u_i^k), \qquad k = 1, 2, \ldots \tag{11}$$

Eq. (11) is a set of algebraic nonlinear equations (ANEs), which for each time $t_k$ can be solved using iterative methods (e.g., Newton-Raphson, fixed-point). On the other hand, defining $w(x,t) = \partial u(x,t)/\partial t$ and $y(x,t) = \partial f(u(x,t))/\partial u(x,t)$, Eq. (8) can be written as the following system, which is first order in time:

$$\frac{\partial u(x,t)}{\partial t} = w(x,t), \qquad \tau\,\frac{\partial w(x,t)}{\partial t} = D\,\frac{\partial^2 u(x,t)}{\partial x^2} + f(u(x,t)) - w(x,t)\,(1 - \tau\,y(x,t)) \tag{12}$$

where $y$ is evaluated from the current value of $u$. Eqs. (12) have the same structure as Eq. (7); therefore, the numerical scheme given by Eq. (10) can be applied to the TRD model, which leads to

$$\frac{du_i(t)}{dt} = w_i(t), \qquad \tau\,\frac{dw_i(t)}{dt} = D\,\frac{u_{i+1}(t) - 2u_i(t) + u_{i-1}(t)}{h^2} + f(u_i(t)) - w_i(t)\,(1 - \tau\,y_i(t)) \tag{13}$$

and the corresponding implicit method can be obtained using a global discretization. Finally, to complete the spatial and temporal discretization, it is necessary to introduce the initial and boundary conditions. For Dirichlet-type boundary conditions, the incorporation of the boundary nodes is direct. For boundary conditions involving derivatives, on the other hand, the discretization is based on first-order FD schemes in order to avoid ghost nodes, which can lead to unphysical behavior.

5. Numerical Simulations

To appreciate the effect of the relaxation time on the spatiotemporal patterns, we present the solution dynamics of two population models based on the TRD


equation (predator-prey models with Holling type II functional response and with cross-diffusion). The numerical simulations were carried out using the FD schemes presented in the previous section.

5.1. Holling type II functional response

According to the functional response of prey-predator interactions, populations can be classified into three types [13,14]: linear (type I), concave downwards (type II), and sigmoid (type III). Type II functional responses are the most frequently studied and are well documented in empirical studies [22,23,24]. Here, we consider the predator-prey RD model given by [25]

$$\frac{\partial u(x,t)}{\partial t} = D_u\,\frac{\partial^2 u(x,t)}{\partial x^2} + f_u(x,t) \tag{14}$$

$$\frac{\partial v(x,t)}{\partial t} = D_v\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_v(x,t) \tag{15}$$

where

$$f_u(x,t) = u(x,t)\,(1 - u(x,t)) - \frac{\alpha\,u(x,t)\,v(x,t)}{\gamma + u(x,t)}, \qquad f_v(x,t) = \beta\,v(x,t)\left(1 - \frac{v(x,t)}{u(x,t)}\right),$$

$\alpha$, $\beta$ and $\gamma$ are positive parameters, and $u$ and $v$ are the population densities of predator and prey, respectively. Both boundary conditions are of zero-flux type. Eqs. (14) and (15) take into account the invasion of the prey species by predators but do not include stochastic effects or any influences from the environment. Nevertheless, reaction-diffusion equations modeling predator-prey interactions show a wide spectrum of ecologically relevant behavior resulting from intrinsic factors alone. Using the numerical scheme given by Eq. (10) and the parameters $\alpha = 3.0$, $\beta = 0.05$ and $\gamma = 0.2$, Figure 2 shows the dynamical interactions, in which a periodic oscillatory behavior is observed. Replacing Fick's diffusion by Cattaneo's diffusion, Eqs. (14)-(15) can be rewritten as

$$\tau\,\frac{\partial^2 u(x,t)}{\partial t^2} + \left(1 - \tau\,\frac{\partial f_u(x,t)}{\partial u(x,t)}\right)\frac{\partial u(x,t)}{\partial t} = D_u\,\frac{\partial^2 u(x,t)}{\partial x^2} + f_u(x,t) \tag{16}$$

$$\tau\,\frac{\partial^2 v(x,t)}{\partial t^2} + \left(1 - \tau\,\frac{\partial f_v(x,t)}{\partial v(x,t)}\right)\frac{\partial v(x,t)}{\partial t} = D_v\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_v(x,t) \tag{17}$$
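As an illustration of how Eqs. (16)-(17) can be prepared for a method-of-lines solver, the sketch below writes them in first-order form with $w_u = \partial u/\partial t$ and $w_v = \partial v/\partial t$ and evaluates the right-hand side on a grid. It is a sketch under stated assumptions rather than the implementation used for the figures: the diffusion coefficients, grid spacing and relaxation time are hypothetical, $\tau$ must be positive, the end nodes use a one-sided zero-flux difference, and the names `laplacian`, `kinetics` and `trd_rhs` are invented here.

```python
import math


def laplacian(u, h):
    """Second difference of u; the end nodes use a one-sided, zero-flux form."""
    n = len(u)
    inner = [(u[i + 1] - 2.0 * u[i] + u[i - 1]) / h**2 for i in range(1, n - 1)]
    left = (u[1] - u[0]) / h**2           # assumes du/dx = 0 at the left end
    right = (u[n - 2] - u[n - 1]) / h**2  # assumes du/dx = 0 at the right end
    return [left] + inner + [right]


def kinetics(u, v, alpha, beta, gamma):
    """Reaction terms of Eqs. (14)-(15) and the derivatives used in (16)-(17)."""
    fu = u * (1.0 - u) - alpha * u * v / (gamma + u)
    fv = beta * v * (1.0 - v / u)
    dfu_du = 1.0 - 2.0 * u - alpha * gamma * v / (gamma + u) ** 2
    dfv_dv = beta * (1.0 - 2.0 * v / u)
    return fu, fv, dfu_du, dfv_dv


def trd_rhs(u, v, wu, wv, p):
    """Time derivatives (du, dv, dwu, dwv) of the first-order form of (16)-(17)."""
    lap_u, lap_v = laplacian(u, p["h"]), laplacian(v, p["h"])
    dwu, dwv = [], []
    for i in range(len(u)):
        fu, fv, dfu, dfv = kinetics(u[i], v[i], p["alpha"], p["beta"], p["gamma"])
        dwu.append((p["Du"] * lap_u[i] + fu - wu[i] * (1.0 - p["tau"] * dfu)) / p["tau"])
        dwv.append((p["Dv"] * lap_v[i] + fv - wv[i] * (1.0 - p["tau"] * dfv)) / p["tau"])
    return list(wu), list(wv), dwu, dwv


p = {"alpha": 3.0, "beta": 0.05, "gamma": 0.2,    # values quoted in the text
     "Du": 1.0, "Dv": 1.0, "tau": 0.1, "h": 0.5}   # assumed for illustration
b = p["alpha"] + p["gamma"] - 1.0
u_star = 0.5 * (-b + math.sqrt(b * b + 4.0 * p["gamma"]))  # uniform state, v = u
n = 11
state = ([u_star] * n, [u_star] * n, [0.0] * n, [0.0] * n)
du, dv, dwu, dwv = trd_rhs(*state, p)
```

The spatially uniform coexistence state ($v = u = u^*$ with zero initial velocities) should give a vanishing right-hand side, which is a quick consistency check before passing `trd_rhs` to a time integrator.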


Figure 2. Population densities based on a reaction-diffusion model with Holling type II functional response: a) predator and b) prey species.

Using the method of lines, Eqs. (13), Figure 3 shows the spatiotemporal profiles of predator density for three values of $\tau$, where periodic oscillatory dynamics can be observed. When the relaxation time becomes more important in the population dynamics ($\tau > 0$), a significant delay is observed with respect to the process described by Fick's diffusion. For $\tau = 0$, the species distribution spreads with infinite velocity at short times, whereas for $\tau > 0$ this is avoided, which delays the propagation of the species. As a result of this delay, an increase in the population density is also observed (Figure 4).

5.2. Predator-prey model with cross-diffusion

The cross-diffusion term expresses the population flux of one species due to the presence of the other species. A positive cross-diffusion term denotes the movement of a species toward lower concentrations of the other species, and a negative value expresses a population flux toward higher concentrations of the other species. Cross-diffusion can induce pattern-forming instabilities in an ecological system [26,27,28]. Some mathematical models for population dynamics with the inclusion of


Figure 3. Effect of relaxation time in population dynamics.

cross-diffusion as well as self-diffusion have been developed, showing that cross-diffusion may give rise to the segregation of two species [29]. We consider a predator-prey model with cross-diffusion,

$$\frac{\partial u(x,t)}{\partial t} = D_{11}\,\frac{\partial^2 u(x,t)}{\partial x^2} + D_{12}\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_u(x,t) \tag{18}$$

$$\frac{\partial v(x,t)}{\partial t} = D_{21}\,\frac{\partial^2 u(x,t)}{\partial x^2} + D_{22}\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_v(x,t) \tag{19}$$

where

$$f_u(x,t) = u(x,t)\,(1 - u(x,t)) - \frac{v(x,t)\,u(x,t)}{u(x,t) + \alpha}, \qquad f_v(x,t) = \beta\,\frac{v(x,t)\,u(x,t)}{u(x,t) + \alpha} - \gamma\,v(x,t)$$


Figure 4. Dynamics of the predator-prey interactions at x = 5.0, where a time-delay effect in the population propagation can be observed. [Two panels: population density u and population density v versus time (0 to 200), for τ = 0, 0.1, 0.2 and 0.3.]

where $\alpha$, $\beta$ and $\gamma$ are positive constants; $u$ and $v$ represent the densities of the two competing species; and $D_{11}$ and $D_{22}$ are the diffusion coefficients for $u$ and $v$, respectively. The boundary conditions are of zero-flux type. For the numerical simulations, we consider the following set of parameters: $D_{11} = 1.0$, $D_{12} = D_{22} = 0.1$, $D_{21} = 0.01$, $\alpha = 0.2$, $\beta = 2.0$ and $\gamma = 4/5$. Figure 5 shows the spatiotemporal profiles of both predator and prey densities, where irregular periodic patterns are observed. The corresponding telegraphic reaction-diffusion model of Eqs. (18)-(19) is given by

$$\tau\,\frac{\partial^2 u(x,t)}{\partial t^2} + \left(1 - \tau\,\frac{\partial f_u(x,t)}{\partial u(x,t)}\right)\frac{\partial u(x,t)}{\partial t} = D_{11}\,\frac{\partial^2 u(x,t)}{\partial x^2} + D_{12}\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_u(x,t) \tag{20}$$

$$\tau\,\frac{\partial^2 v(x,t)}{\partial t^2} + \left(1 - \tau\,\frac{\partial f_v(x,t)}{\partial v(x,t)}\right)\frac{\partial v(x,t)}{\partial t} = D_{21}\,\frac{\partial^2 u(x,t)}{\partial x^2} + D_{22}\,\frac{\partial^2 v(x,t)}{\partial x^2} + f_v(x,t) \tag{21}$$

Figure 6 shows the spatiotemporal patterns of the telegraphic population interaction model with cross-diffusion. For $\tau > 0$, a rearrangement of the spatiotemporal patterns can be observed and, as in the previous case, the dynamical behavior exhibits a delay effect. This result is


Figure 5. Spatiotemporal patterns from the model given by Eqs. (10)-(11): a) predator and b) prey species.

in agreement with the report by Holmes [7] (1993), who showed that traveling waves in the telegraph model exhibit lower propagation velocities than in the classical diffusion model.

6. Conclusions

In this work, we have introduced a spatiotemporal prey-predator model based on Cattaneo's diffusion law, which leads to a telegraphic reaction-diffusion equation. Furthermore, we have presented numerical evidence of the effect of the relaxation time and cross-diffusion in prey-predator models with functional responses. The influence of the relaxation time on the spatiotemporal patterns indicates that the telegraph reaction-diffusion model can be a good alternative for describing the dynamics of population interactions in which time delays are present with respect to the classical models. A comparison with predator-prey models with time delay will be presented elsewhere.

Acknowledgements

The work was supported by the Mexican Petroleum Institute (IMP) and the Y.00114 SENER-CONACyT Project.


Figure 6. Effect of relaxation time in a predator-prey model with cross-diffusion.

References

1. Fisher R.A. Annals of Eugenics, 7, 353-369 (1937).
2. Skellam J.G. Biometrika, 38, 196-218 (1951).
3. Murray J.D. Biomathematics, Springer, Berlin (1989).
4. Okubo A., Levin S. Diffusion and Ecological Problems: Modern Perspective, Springer-Verlag (2001).
5. Turchin P. Comments on Theoretical Biology, 1, 65-83 (1989).
6. Cattaneo G. Atti. Sem. Mat. Fis. Univ. Modena, 3, 83 (1970).
7. Holmes E.E. American Naturalist, 142, 779-795 (1993).
8. Ahmed E., Abdusalam H.A., Fahmy E.S. Int. J. Modern Phys. C, 12(5), 717-726 (2001).
9. Abdusalam A.H., Fahmy E.S. Chaos Solitons and Fractals, 18, 259-266 (2003).
10. Prajneshu A. Math. Biosciences, 52, 217-226 (1980).
11. Ahmed E., Abdusalam H.A., Fahmy E.S. Int. J. Modern Phys. C, 12(10), 1525-1535 (2001).


12. Fahmy E.S., Abdusalam H.A. Appl. Math. Sci., 3(11), 533-540 (2009).
13. Holling C.S. Canadian Entomologist, 91, 293-320 (1959).
14. Holling C.S. Canadian Entomologist, 91, 385-398 (1959).
15. Beddington J.R. J. Animal Ecology, 44, 331-340 (1975).
16. DeAngelis D.L., Goldstein R.A., O'Neill R.V. Ecology, 56, 881-892 (1975).
17. Segel L., Jackson J. J. Theor. Biol., 37, 545-559 (1972).
18. Compte A., Metzler R. J. Phys. A, 30, 7277 (1997).
19. Camera-Roda G., Sarti G. Transp. Theor. Statis. Phys., 15, 1023-1050 (1986).
20. Mishra S., Shai H. Int. J. Heat and Mass Transfer, 55, 7015-7023 (2012).
21. Garvie M.R. Bull. of Math. Biol., 69(3), 931-956 (2007).
22. Skalski G., Gilliam J.F. Ecology, 82(11), 30831792 (2001).
23. Jeschke J., Kopp M., Tollrian R. Ecol. Monogr., 72(1), 95172 (2002).
24. Gentleman W., Leising A., Frost B., Strom S., Murray J. Deep Sea Res. II, 50, 28471775 (2003).
25. Sherratt J. IMA J. of Appl. Math., 73, 759-781 (2008).
26. Kerner E.H. Bull. Math. Biol., 21, 217-255 (1959).
27. McGehee E., Peacock-López E. Phys. Lett. A, 1-2, 342 (2005).
28. Wang W., Lin Y., Zhang L., Rao F., Tan Y. Commun. Nonlinear Sci. Numer. Simulat., 16, 2006-2015 (2011).
29. Gurtin M.E. Quart. J. Appl. Math., 32, 1 (1974).


POPULATION DYNAMICS OF SPIDER MONKEY (ATELES HYBRIDUS) IN A FRAGMENTED LANDSCAPE IN COLOMBIA

J.M. CORDOVEZ (1,*), J.R. ARTEAGA B. (2), M. MARIÑO (3,5), A.G. DE LUNA (4), A. LINK (4,5)

(1) Departamento de Ingeniería Biomédica, Universidad de los Andes
(2) Departamento de Matemáticas, Universidad de los Andes
(3) Departamento de Ingeniería Ambiental, Universidad de los Andes
(4) Fundación Proyecto Primates, Colombia
(5) Departamento de Biología, Universidad de los Andes
Bogotá D.C., Colombia
(*) Corresponding author: e-mail: [email protected]

Abstract. Mathematical models have been used as a novel approach to better understand the complex spatio-temporal dynamics of animal populations living in fragmented landscapes. In this study we developed a patch model in order to simulate the population dynamics and conservation status of one of the world's most endangered primates, the brown spider monkey (Ateles hybridus), in a fragmented landscape in Santander, Colombia. Spider monkeys are among the mammals most vulnerable to habitat destruction and fragmentation, given that they have very high ecological requirements and extremely slow reproductive cycles. Wild populations of Ateles hybridus have declined dramatically in the last 45 years due to habitat loss and hunting, and the fate of the remaining population largely depends on the conservation strategies and actions that may be implemented in the near future. The model divides the population into isolated patches. The operational population consists of adult males, adult females and sub-adult females in each patch. Parameters such as growth rate, death rate, carrying capacity and time to sexual maturity were estimated for A. hybridus based on the available data on the biology and demography of wild spider monkey populations. The model evaluated the effects of habitat degradation on population size by specifically modeling: (i) patch connectivity, (ii) number of patches and (iii) patch size. Our results suggest that, given the pervasive fragmentation of the brown spider monkey's habitat, the species is at risk of extinction, confirming its known sensitivity to habitat degradation. The results of the model point out that patch connectivity might be a key variable to be considered in conservation strategies for wild spider monkeys in fragmented landscapes, as it reduces the effects of habitat fragmentation on their population dynamics and thus on their ability to survive in a given patch.

Keywords: Ateles hybridus; patch model; conservation biology; population biology.


1. Introduction

Habitat destruction and fragmentation are considered some of the most important factors driving the rapid decline of wildlife [1]. The expansion of large-scale anthropogenic activities such as agriculture, cattle ranching and mining, amongst others, has resulted in the transformation of natural ecosystems, as well as in the progressive segmentation of previously continuous habitats [2]. Thus, the pervasive process of habitat fragmentation has been suggested to resemble the postulates of Island Biogeography theory [3]. This theory predicts that both the size of an island (or forest fragment) and its distance to other islands or the mainland largely determine the probability that species persist in these isolated islands or fragments [4]. Nonetheless, as the remaining forest fragments become more accessible to human populations and are still used for activities such as hunting and timber extraction, the wildlife within them is exposed to further challenges in avoiding local extinction [5,6]. The transformation of tropical forests into agricultural fields and pastures has been suggested to have major impacts on regional climate dynamics [7,8]. In Colombia, rates of deforestation in lowland forests have been estimated to be as high as 4.5% per year [9], and in particular the tropical lowland forests between the Andes Cordilleras have been estimated to have lost over 80% of their historical forest coverage [10,11]. The increasing human population growth and the rapid processes of globalization will certainly continue to put strong pressure on the remaining habitats and the wild populations within them [2]. Thus, understanding the dynamics of wild populations living in fragmented landscapes becomes a critical issue for the implementation of conservation strategies and actions that may increase their chances of survival, especially for critically endangered taxa.

Primates are one of the most threatened groups of animals in the world, with over half of their species considered to face a serious threat of extinction in the near future. Given that Neotropical primates are almost strictly arboreal, they are presumed to be more heavily impacted by forest fragmentation [12], especially when their ability to move across pastures and intervened areas is extremely low (e.g., spider monkeys: genus Ateles). Spider monkeys (Ateles spp.) are among the largest Neotropical primates, with a large geographical distribution in Central and South America [13]. They have been suggested to be one of the most vulnerable mammals to


anthropogenic disturbances due to their high ecological requirements and the fact that for many local populations they are preferred hunting targets [5,13,14]. Adult spider monkeys can weigh up to 10 kg and rely heavily on ripe fruits in their diet, which obliges them to use large forest areas in order to fulfill their energetic requirements [15]. Spider monkeys have been suggested to employ a behavioral strategy of "maximizing energy input" [16], by which they cover large daily paths in search of large fruiting trees [17]. Also, spider monkeys have extremely slow reproductive cycles and life history variables. Females begin to reproduce when they are around eight years old, and have inter-birth cycles that last approximately three years [18,19]. These ecological and developmental constraints make spider monkeys unable to cope with catastrophic events such as forest fragmentation and destruction. A comprehensive analysis of the determinants of local extinction for primates and carnivores showed that spider monkeys are extremely susceptible to the loss of forest connectivity and to habitat fragmentation [5]. Brown spider monkeys (Ateles hybridus) are one of the 25 most endangered primates in the world [20] and have been classified as critically endangered. They are endemic to Colombia and Venezuela, and their potential habitat has been experiencing a dramatic decline due to forest clearance and fragmentation [11]. Here we aim to assess the potential effects of habitat fragmentation on the viability of wild brown spider monkeys using mathematical modeling. We implemented a patch model in order to describe the population dynamics of Ateles hybridus, which takes into account female dispersal (migration) and the probability of success in the dispersal process. After validating the model, we evaluate the effects of (i) habitat connectivity, (ii) patch size, and (iii) the number of patches on the size and viability of the population of a critically endangered primate.

2. Mathematical Model

We implemented a patch model to study the population dynamics of Ateles hybridus. The different patches resemble a fragmented landscape in which the species has been studied over the past five years in Colombia. We divided the population into sex categories, males and females, and further subdivided females into young and adult, due to the fact that females are the dispersing sex in spider monkeys (Shimooka et al. 2008). A young female usually acquires sexual maturity at approximately seven years of age, at which point she disperses from her natal group in search of another


group where she will spend her reproductive life. In the model, the search for a new group requires that a young female select a target patch and subsequently cover the distance between the source patch and the target one. However, it is impossible for her to predict whether or not the target patch will have an operational sex ratio that is adequate for her to fit in. In addition, the target patch might also be close to the carrying capacity of the environment, and thus the female might have to continue her quest for a suitable patch. To include these observations in the model we considered the following ecological processes: 1) natural per-capita birth and death rates, 2) the average time for females to acquire sexual maturity and 3) a forced migration process, where young females unequivocally disperse from their natal groups after reaching sexual maturity. In the following sections we explain the model assumptions in further detail.

2.1. State variables

Each patch is composed of a single group of individuals divided into males and females (see figure 3). To be able to account for female sexual maturity and migration we divided the female population into young and adult classes. We assumed that sexual maturity is achieved at around 7 years of age, which is represented in the model through the rate $r_A$. Each one of these groups is represented by the following variables, where $i$ represents the patch number and $n$ is the total number of patches:

$M_i$ = number of male individuals in patch $i$.
$Y_i$ = number of young females in patch $i$.
$A_i$ = number of adult females in patch $i$.

2.2. Model parameters

Parameters for the model were estimated from previous studies and published data (see Table 1). We assumed that new individuals are the result of births at a per-capita birth rate $r$ [1/year]. Out of these new individuals a proportion $p$ are male at birth. The life expectancy of the population was considered different for each age-sex category. Thus, $1/\delta_M$ represents male life expectancy [years], $1/\delta_Y$ is young female life expectancy [years] and $1/\delta_A$ is adult female life expectancy [years]. Young females acquire sexual maturity in $1/r_A$ [years], which is equivalent to the average time for them to leave the patch. It has been


reported for these animals that they tend to segregate in a very conservative female-to-male ratio when they are in ideal ecological conditions (Symington, 1987). We called this ratio the ideal female-to-male ratio $I_R$, which was assumed to be 3 in the model. We continuously computed the actual ratio $R_i$ in patch $i$ as $R_i = A_i/M_i$ and used the difference $R_i - I_R$ to promote or prevent female arrivals into patch $i$ (see $R(R_i)$ below). We consider that a dispersing female moves only in the direction of other patches (directional movement), as opposed to a random walk. We therefore introduced a parameter $\rho$, an $n \times n$ matrix whose entries $\rho_{ij}$ represent the proportion of young females leaving patch $j$ that move in the direction of patch $i$. Note that $\rho_{ii} = 0$ and $\sum_{j=1}^{n} \rho_{ij} = 1$; in other words, females are not allowed to stay in the patch where they were born, and all females reaching sexual maturity must leave their current group. Once a female identifies a target patch, she has to cover the distance between the fragments. Since the habitat is fragmented, we speculated that the probability of that female reaching the target patch reflects how suitable the environment is for moving: the better the quality (i.e., forest cover as opposed to open fields), the higher the chances that the female will reach the target patch. This probability is represented in the model by $\tau$, an $n \times n$ matrix whose entry $\tau_{ij}$ gives the probability that a young female leaving patch $j$ arrives at patch $i$. The probability of reaching the current patch is equal to one ($\tau_{ii} = 1$). Young females that reach patch $i$ need to be accepted by the group. We assumed in the model that the acceptance of arriving females is determined by two conditions: 1) that the patch is not over its carrying capacity $K_i$ and 2) that the female-to-male ratio $R_i$ is not too far from the ideal one, $I_R$. For this purpose we included in our model modified Boltzmann inactivation functions of the form (see figure 1 for a graphical explanation)

$$f(x) = \frac{1}{1 + \exp[(x - a)/b]} \tag{1}$$

where $b$ is the slope factor and $a$ is the half-inactivation point. The effect of the female ratio on the acceptance/rejection probability for incoming females in patch $i$ was computed using

$$R(R_i) = \frac{1}{1 + \exp[(R_i - I_R)/S_R]} \tag{2}$$

where $S_R$ is the slope of the curve at $R_i = I_R$. Similarly, the effect of reaching carrying capacity was included in the


Fig. 1. Graph of the function $f(x)$ for $a = 3$ and $b = 0.2$ (dashed), $b = 0.4$ (solid thin black line) and $b = 0.6$ (solid gray thick line). Note that $y = f(x)$ differs from 1 or 0 only within a very limited range determined by $a$ and $b$.

model by:

$$N(N_i) = \frac{1}{1 + \exp[(N_i - K_i)/S_K]} \tag{3}$$

where $N_i = M_i + Y_i + A_i$ is the total population size of patch $i$ and $S_K$ is the slope of the curve at $N_i = K_i$. Both $R(R_i)$ and $N(N_i)$ give a high acceptance probability (close to 1) when the population is below the carrying capacity and/or has a favorable female-to-male ratio (less than 3 females per male). On the other hand, if the population is above the carrying capacity and/or has an unfavorable female-to-male ratio (too many females per male), then the probability of being accepted in the patch decreases almost exponentially. Thus the final probability of a female becoming part of patch $i$ is given by (see also Figure 2)

$$H(R_i, N_i) = R(R_i) \cdot N(N_i) \tag{4}$$

where $H(R_i, N_i)$ can be seen as the probability that an incoming female that left patch $j$ in the direction of patch $i$, having successfully covered the distance, is accepted by the group in patch $i$.

2.3. Model diagram and equations

A three-patch example of the model is illustrated in figure 3. Note that only young females move between patches, as denoted by the curved arrows in the figure. Mortality rates are different for each group but equal between patches. The proportion of males and females at birth is considered to be 1 : 1.
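The acceptance functions of Eqs. (1)-(4) are simple enough to evaluate directly. The following sketch shows one way to code them; the function names are hypothetical, the cap on the exponent is an implementation detail added here to avoid floating-point overflow far above the threshold, and a patch is assumed to contain at least one male so that $R_i$ is defined.

```python
import math


def boltzmann(x, a, b):
    """Eq. (1): decreasing sigmoid equal to 0.5 at x = a, with slope factor b."""
    z = min((x - a) / b, 700.0)      # cap keeps math.exp from overflowing
    return 1.0 / (1.0 + math.exp(z))


def acceptance(A_i, M_i, N_i, I_R, S_R, K_i, S_K):
    """Eq. (4): H = R(R_i) * N(N_i) for a female arriving at patch i."""
    ratio = A_i / M_i                # R_i = A_i / M_i (requires M_i > 0)
    return boltzmann(ratio, I_R, S_R) * boltzmann(N_i, K_i, S_K)


# A patch sitting exactly at both thresholds accepts with probability 0.25.
h_threshold = acceptance(A_i=8, M_i=2, N_i=25, I_R=4, S_R=0.5, K_i=25, S_K=0.5)
# A small patch with few females per male accepts almost every arrival.
h_open = acceptance(A_i=2, M_i=2, N_i=5, I_R=4, S_R=0.5, K_i=25, S_K=0.5)
```

Each factor equals 0.5 at its own threshold, so the product is 0.25 when the patch is simultaneously at carrying capacity and at the ideal ratio, and it approaches 1 only when both conditions are comfortably satisfied.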


Fig. 2. Function $H(R_i, K_i) = R(R_i) \cdot N(K_i)$ with $I_R = 3$, $K_i = 4$ and $S_R = 0.2$, $S_K = 0.3$.

Fig. 3. Patch model for Ateles hybridus in a fragmented landscape. Patches are represented by boxes and individual groups by circles. M denotes males, Y denotes young females and A denotes adult females. Note that only young females migrate between patches when they reach sexual maturity. Dashed arrows within boxes indicate that adult females are the only possible source of both males and young females.


The rate of change in population size is given by the following system of ordinary differential equations:

$$
\begin{cases}
\dfrac{dM_i}{dt} = p \cdot r \cdot A_i - \delta_M \cdot M_i \\[2mm]
\dfrac{dY_i}{dt} = (1 - p) \cdot r \cdot A_i - \delta_Y \cdot Y_i - r_A \cdot Y_i \\[2mm]
\dfrac{dA_i}{dt} = \displaystyle\sum_{\substack{j=1 \\ i \neq j}}^{n} \alpha_{ji} \cdot r_A \cdot Y_j - \delta_A \cdot A_i
\end{cases} \tag{5}
$$

where all variables and parameters have the meaning given in section 2.2. Note that the total population of adult females in the $i$th patch is the result of adding all the young females coming from other patches that reach the $i$th patch and are finally accepted. As before:

$$
\begin{cases}
\alpha_{ji} = \tau_{ij} \cdot \rho^{T}_{ji} \cdot R(R_i) \cdot N(N_i) \\[2mm]
R(R_i) = \dfrac{1}{1 + \exp[(R_i - I_R)/S_R]} \\[2mm]
N(N_i) = \dfrac{1}{1 + \exp[(N_i - K_i)/S_K]}
\end{cases} \tag{6}
$$

2.4. Steady state

Let us assume for each patch $i$ the coordinate system $(M_i, Y_i, A_i)$, which gives a point in the phase space. The steady states of the model in these coordinates are the zeros of the right-hand side (RHS) of (5), thus:

$$p \cdot r \cdot \bar{A}_i - \delta_M \cdot \bar{M}_i = 0 \tag{7}$$

$$(1 - p) \cdot r \cdot \bar{A}_i - (\delta_Y + r_A) \cdot \bar{Y}_i = 0 \tag{8}$$

$$\sum_{\substack{j=1 \\ i \neq j}}^{n} \alpha_{ji} \cdot r_A \cdot \bar{Y}_j - \delta_A \cdot \bar{A}_i = 0 \tag{9}$$

From (7) and (8) we have the following proportions of males and young females in terms of adult females at any patch at equilibrium:

$$\bar{M}_i = \frac{p \cdot r}{\delta_M}\,\bar{A}_i = \phi \cdot \bar{A}_i \tag{10}$$

$$\bar{Y}_i = \frac{(1 - p) \cdot r}{r_A + \delta_Y}\,\bar{A}_i = \sigma \cdot \bar{A}_i \tag{11}$$


or, equivalently, we can write a simple relation between males and young females (from (10) and (11)):

$$\bar{M}_i = \frac{p}{1 - p}\,\frac{r_A + \delta_Y}{\delta_M}\,\bar{Y}_i \tag{12}$$

From equations (10) and (11) we can conclude that:

(1) the ratio $R_i$ (i.e., the ratio between $A_i$ and $M_i$) may change depending on the population dynamics, but at equilibrium it is constant and given by

$$\bar{R}_i := \frac{\bar{A}_i}{\bar{M}_i} \;\Rightarrow\; \bar{R}_i = \frac{\delta_M}{p \cdot r} = \frac{1}{\phi} \tag{13}$$

(2) at equilibrium the total population at any given patch depends only on the adult females at that patch (equations (10) and (11)),

$$\bar{N}_i = \lambda \bar{A}_i, \quad \text{where} \quad \lambda = \phi + \sigma + 1 \tag{14}$$

(3) furthermore, at equilibrium, the total population in the $i$th patch depends strictly on the number of young females arriving from all patches at the $i$th patch. We can see this using (9) and (14),

$$\bar{A}_i = \frac{1}{\delta_A} \sum_{\substack{j=1 \\ i \neq j}}^{n} \alpha_{ji} \cdot r_A \cdot \bar{Y}_j \;\Rightarrow\; \bar{N}_i = \frac{\lambda \cdot r_A}{\delta_A} \sum_{\substack{j=1 \\ i \neq j}}^{n} \alpha_{ji} \cdot \bar{Y}_j \tag{15}$$

(4) and finally, the equilibrium points for the $i$th patch satisfy

$$(\bar{M}_i, \bar{Y}_i, \bar{A}_i) = \bar{A}_i\,(\phi, \sigma, 1), \qquad \phi \neq 0, \; \sigma \neq 0 \tag{16}$$

2.5. Stability

System (5) has a trivial steady state (where all groups are equal to zero in all patches) and a non-trivial one that can be found numerically. The stability of the equilibrium point is hard to determine analytically, and thus we were not able to prove that the non-trivial equilibrium points are stable. However, after extensive exploration with the model we believe that the biologically relevant equilibrium point is a stable node (because in all the simulations all the eigenvalues associated with the linearized system were less than zero; data not shown). The model was implemented in Matlab® and the system of ordinary differential equations was numerically integrated using built-in numerical methods (i.e., ode45).
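The equilibrium proportions of Eqs. (10)-(14) can be checked with a few lines of arithmetic. The sketch below uses the rates listed in Table 1; the function name is hypothetical and the number of adult females is an arbitrary illustrative value.

```python
def steady_state_factors(p, r, delta_M, delta_Y, r_A):
    """Return (phi, sigma, lambda) as defined in Eqs. (10), (11) and (14)."""
    phi = p * r / delta_M                      # males per adult female
    sigma = (1.0 - p) * r / (r_A + delta_Y)    # young females per adult female
    lam = phi + sigma + 1.0                    # total population per adult female
    return phi, sigma, lam


# Rates taken from Table 1.
p, r, delta_M, delta_Y, r_A = 0.5, 0.196, 0.14, 0.07, 0.13
phi, sigma, lam = steady_state_factors(p, r, delta_M, delta_Y, r_A)

A_bar = 10.0                                   # arbitrary number of adult females
M_bar, Y_bar = phi * A_bar, sigma * A_bar
residual_M = p * r * A_bar - delta_M * M_bar                   # Eq. (7)
residual_Y = (1.0 - p) * r * A_bar - (delta_Y + r_A) * Y_bar   # Eq. (8)
ratio_bar = 1.0 / phi                                          # Eq. (13)
```

With these rates $\phi = 0.7$, $\sigma = 0.49$ and $\lambda = 2.19$, so the equilibrium female-to-male ratio of Eq. (13) is about 1.43 regardless of patch size.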


2.6. Parameters for numerical simulations

We used the following parameters, estimated from the literature, to run the simulations described in the results section. We assumed that individuals in every patch were equal, so the same set of parameters was shared by the populations in different fragments.

Table 1. Model parameters.

Name                                     Symbol  Value  Units            Ref.
Male proportion at birth                 p       0.5    −                18
Intrinsic growth rate                    r       0.196  [1/year]         11
Male death rate                          δM      0.14   [1/year]         11
Young female death rate                  δY      0.07   [1/year]         11
Adult female death rate                  δA      0.01   [1/year]         11
Ideal Ratio                              IR      4      −                19
Slope factor for female to male ratio    SR      0.5    −
Slope factor for carrying capacity       SK      0.5    −
Rate to achieve sexual maturity          rA      0.13   [1/year]         18
Carrying capacity                        Ki      25     [ind./100 Ha.]   11
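As a quick arithmetic check of the equilibrium ratios, the following Python sketch evaluates φ, σ and λ from equations (10), (11) and (14) using the Table 1 values. It assumes that r takes its Table 1 value at equilibrium; if r in system (5) is density-dependent, its effective value at the non-trivial steady state will differ, and so will the ratios. The function and argument names are illustrative and are not taken from the original Matlab implementation.

```python
def equilibrium_ratios(p, r, delta_m, delta_y, r_a):
    """Return (phi, sigma, lam) as defined in equations (10), (11) and (14)."""
    phi = p * r / delta_m                   # males per adult female, eq. (10)
    sigma = (1 - p) * r / (r_a + delta_y)   # young females per adult female, eq. (11)
    lam = phi + sigma + 1                   # total population per adult female, eq. (14)
    return phi, sigma, lam


# Table 1 values; assumption: r is evaluated at its intrinsic (Table 1) value.
phi, sigma, lam = equilibrium_ratios(p=0.5, r=0.196, delta_m=0.14,
                                     delta_y=0.07, r_a=0.13)
print(f"phi = {phi:.2f}, sigma = {sigma:.2f}, lambda = {lam:.2f}")
```

Under that assumption the sketch gives φ ≈ 0.70, σ ≈ 0.49 and λ ≈ 2.19, that is, an equilibrium ratio of adult females to males of 1/φ ≈ 1.43.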

2.7. Model simulations

We used the model to investigate the effect of habitat fragmentation on the population dynamics of Ateles hybridus. Fragmentation can be understood as a biological disturbance in which an originally continuous landscape is degraded into a series of weakly connected patches of remnant forest. It has the obvious immediate effect that the remaining group of individuals experiences a reduction in available area and resources but, more importantly, it also has a separation effect that is particularly challenging for mobile species. Thus, the newly formed fragments can be characterized by the number of patches left, the area available for the species and the degree of connectivity between the patches. Connectivity is the result of the number of connections and the connection quality. For example, patches can all be connected to each other or only a few of them might be connected (number of connections). In addition, the connections present can differ in quality, measured in terms of the land cover in the degraded habitat between the patches, which affects the ability of subjects to reach new groups (connection quality). In the model, each one of the variables cited above can be investigated by varying a single parameter. For example, connection quality is captured


by the parameter matrix τ, where each entry τij gives the probability of successfully covering the distance between patch j and patch i. Thus, if τij = 1 the region between patches i and j is not disturbed, and if τij = 0 the two patches are not connected. Values in between give different levels of connectivity. Similarly, if one wants to vary the number of connections between patches (not only the quality of the connections), the number of entries in τ equal to 0 provides a way to account for the number of patches that are not connected. To investigate the effect of an increase or decrease in the number of patches on population size we just have to consider more groups by increasing n in the model. Likewise, patch area can be manipulated in the model by varying the carrying capacity Ki, which has a linear relation with patch area. To make sure that the population size had reached equilibrium, we ran the simulations long enough that no further changes occurred in the state variables; thus, the figures in the following section correspond to final population sizes. Note that some of the simulations take a very long time to reach steady state (e.g., 1000 years).

3. Results

The following scenarios were used to assess the effect of forest fragmentation on the viability of brown spider monkey populations:

• Control scenario
• Variable patch connection quality
• Variable number of patches
• Variable patch size
• Variable number of connections between patches

3.1. Control scenario

We simulated a continuous habitat composed of three groups inhabiting a forest of 300 Ha. (75 individuals). The groups have high connectivity (τij = 1). In this and all other simulations we consider ρij = a and ρii = 0, where a is 1/(No. of patches − 1) (i.e., females have the same probability of choosing any of the available patches when moving). The control scenario was used to compare the other simulations with an undisturbed habitat with the same available area.
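The movement and connectivity settings used in the scenarios can be written down explicitly. The sketch below builds ρ with ρij = 1/(n − 1) and ρii = 0, and a uniform τ with a single off-diagonal quality value. Setting τii = 1 follows the value stated in Section 3.3; applying it to the control scenario as well is an assumption. Plain Python lists are used and the helper names are hypothetical.

```python
def uniform_rho(n):
    """rho[i][j] = 1/(n - 1) for i != j and 0 on the diagonal (requires n >= 2)."""
    a = 1.0 / (n - 1)
    return [[0.0 if i == j else a for j in range(n)] for i in range(n)]


def uniform_tau(n, quality):
    """tau[i][j] = quality for i != j and 1 on the diagonal (assumed)."""
    return [[1.0 if i == j else quality for j in range(n)] for i in range(n)]


rho = uniform_rho(3)        # control scenario: three groups, a = 1/2
tau = uniform_tau(3, 1.0)   # undisturbed habitat, tau_ij = 1
```

Calling uniform_tau(3, q) for values of q between 1 and 0 gives the connectivity sweep of the connection-quality scenario.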


3.2. Variable patch connection quality

The habitat described in the control scenario was fragmented into three equally sized patches that together account for the original area. All patches were connected to each other (ρ was kept constant from control) but τij was varied continuously from 1 (control scenario) to 0 (all patches isolated) (Figure 4).

[Figure 4 (plot not reproduced). x axis: connectivity between patches, from 1 to 0; y axis: population size (for one patch), from 0 to 30; series: male, young female, adult female, total population fragmented, total population control.]

Fig. 4. Population size after t = 500 years for each age-sex category as a function of the quality of the connection between all patches, varying from 1 (no fragments) to zero (isolation). The values shown in the figure correspond to one patch; all patches are equal. The number of individuals obtained for the control scenario for one patch is shown for comparison.

3.3. Variable number of patches

To understand the effect of the number of fragments on population size, we initially fragmented the forest described in the control scenario into 3 patches, as in the previous simulation, but keeping τij = 0.5 and τii = 1. We then increased the number of patches one by one until we reached 15 equally sized patches. At every step the sum of the areas of the individual patches was the same as the total area of the control patch. ρ was kept constant, with the same values used in the control scenario (Figure 5).

[Figure 5 (plot not reproduced). x axis: number of patches, from 3 to 15; y axis: population size, from 0 to 80; series: male, young female, adult female, total population in a patch, total population control, total population fragmented.]

Fig. 5. Population size in each age-sex category after t = 1000 years as a function of number of patches varying from 3 to 15. Results are shown for one patch (all are equal) and for the total population size (all patches considered). The number of individuals for the control scenario for the whole population is shown for comparison.

3.4. Variable patch size

As in previous treatments, the control scenario was fragmented into three patches. In this case one of the patches was disproportionate in size compared to the remaining two. Patch 1 varied between 10% and 99% of the original area, while the other two patches shared the remaining proportion evenly. As before, τij was set equal to 0.5 and τii = 1, and ρ was kept constant from control (Figure 6).

3.5. Variable number of connections between patches

In this case we fragmented the control scenario into 5 equally sized fragments which together account for the original area. We varied the proportion of entries in τ set equal to 0 from 0% to 100% and kept the rest equal to 0.5. ρ was kept constant from control. When τij equals zero, individuals from patch j that are moving in the direction of patch i have no probability of arriving; in other words, patch j is not connected to patch i. Note that τ is not symmetric, meaning that individuals from patch i could still be connected to patch j (Figure 7).
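The two manipulations in Sections 3.4 and 3.5 can be sketched in a few lines of Python. The first helper splits a total carrying capacity between one large patch and the remaining patches, relying on the linear relation between carrying capacity and area stated in Section 2.7. The second sets a given proportion of the off-diagonal entries of τ to zero. The text does not say how the removed links were chosen, so drawing them at random with a fixed seed is an assumption, as are the helper names.

```python
import random


def patch_capacities(total_k, share_patch1, n=3):
    """Carrying capacities when patch 1 holds `share_patch1` of the area
    and the other n - 1 patches split the remainder evenly."""
    rest = total_k * (1.0 - share_patch1) / (n - 1)
    return [total_k * share_patch1] + [rest] * (n - 1)


def remove_links(n, proportion, quality=0.5, seed=0):
    """tau with a given proportion of off-diagonal entries set to 0.

    The zeroed entries are drawn at random (an assumption). tau[i][j] and
    tau[j][i] are treated independently, so tau need not be symmetric.
    """
    tau = [[1.0 if i == j else quality for j in range(n)] for i in range(n)]
    links = [(i, j) for i in range(n) for j in range(n) if i != j]
    k = round(proportion * len(links))
    for i, j in random.Random(seed).sample(links, k):
        tau[i][j] = 0.0
    return tau


capacities = patch_capacities(total_k=75, share_patch1=0.65)  # 75 ind. in 300 Ha.
tau = remove_links(n=5, proportion=0.3)                        # 6 of 20 links removed
```

With five patches there are 20 directed links, so each 5% step in the proportion removes exactly one additional link.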

[Figure 6 (plot not reproduced). x axis: proportion of the original area assigned to patch 1, from 0.1 to 1; y axis: population size, from 0 to 80; series: population in patch 1, population in patches 2 and 3, total population control, total population fragmented.]

Fig. 6. Population size in each age-sex category after t = 500 years as a function of area assigned to one patch varying from 10% to 99%. Results are shown for the whole population and for each patch (patch 2 and 3 are equal). The number of individuals at the control scenario for the whole population is shown for comparison. Note that when the proportion assigned to patch 1 is exactly 1/3 all patches have the same population size as expected.

[Figure 7 (plot not reproduced). x axis: proportion of links removed, from 0 to 1; y axis: population size, from 0 to 80; series: total population fragmented, total population control.]

Fig. 7. Population size in each age-sex category after t = 1000 years as a function of the proportion of possible links between patches varying from 0 to 1. Results are shown for the whole population. The number of individuals at the control scenario for the whole population is shown for comparison.


4. Discussion

Studies predicting the population viability of spider monkeys in fragmented habitats are still scarce, especially at the metapopulation level. Nonetheless, they have begun to be implemented for other primate taxa, especially those threatened with extinction.21–24 These studies are extremely useful for improving our current understanding of which variables might be crucial for their survival. Given the logistic and operational limitations of deriving these results from long-term studies of wild populations, we implemented a simple mathematical model aimed at predicting the population dynamics of one of the most endangered primates in the world (A. hybridus) under a scenario of habitat fragmentation. Although the model does not take into account micro-scale processes that characterize spider monkeys’ ecology and sociality (e.g., intra-group dynamics), it is able to predict the fluctuation of population size as a function of different non-exclusive processes, such as connectivity between fragments, fragment size and the number of remaining patches.

We initially evaluated the effects of connectivity quality on population size and found that there were only minor effects on population size when the quality of connectivity decreased by up to 80% relative to the initial scenario (Figure 4). Once the probability of reaching a target patch falls below 20% there is a sharp decline in population size, suggesting that only severely disrupted areas between fragments will induce species extinction. These results might be directly linked to key aspects of spider monkeys’ population dynamics, as females tend to disperse only once in their lifetime.19 Thus, dispersing females might be prone to use more degraded or unsuitable areas during their dispersal journey, including areas that may not be preferred for daily activities such as foraging and socializing. If this is the case, dispersing females would be able to move through degraded areas during their relatively short dispersal process as long as these areas have not been completely transformed into pastures and large agricultural fields. Furthermore, these results suggest that many fragmented areas that retain some degree of connectivity might still meet the basic requirements to maintain a viable population without completely restraining dispersal and gene flow. Given that less than 20% of the historical distribution of wild brown spider monkeys remains today, and that most of these areas consist of a complex network of forest patches,10 understanding how the quality of the connectivity between patches influences population dynamics and viability is relevant to the management of the remaining populations.25


Fragmentation processes can result in a variable (small or large) number of fragments. Starting with the initial conditions for the three patches in the model, and keeping the area constant and equal to that of the control scenario, we found that the number of fragments has a relevant effect on population size. When we divided the total suitable area into a larger number of fragments, we observed that population size decreases rapidly as the number of patches increases; in fact, for our study area, when the area is divided into nine or more fragments, population size tends to 0. The results of the model indicate that total population size also decreases, showing that the population has not simply been split into smaller, more numerous groups while remaining constant as a whole. In a natural scenario, only rarely will a fragmentation process yield equally sized patches. By changing the proportion of area assigned to one patch while keeping the total fragmented area constant, we showed that as the area (carrying capacity) of that patch increased, so did the population size in that patch, at the expense of the other two (Figure 6). Once the area allocated to patch 1 reaches 65%, population size in that patch, as well as total population size, decreases rapidly. When the area assigned to patch 1 is exactly 100%, total population decreases to zero (not shown in Figure 6). In fact, studies on other Neotropical primates (e.g., howler monkeys) have found that the probability of extinction increases exponentially as fragment size decreases.24 Connectivity, as we mentioned before, is a two-sided variable: on the one hand, connectivity increases as the quality of the link between two patches increases; on the other hand, connectivity decreases as links are removed from the system. In Figure 7 we removed the links between patches by making τij = 0. As we gradually increased the proportion of links removed from the system from 0 to 1, we found that this resulted in a catastrophic event. Initially, and up to 0.3 (30% of the links removed), very little changed in population size as a whole. Beyond 0.3, population size decreased slowly but steadily until 50% of the links were removed, at which point the population was essentially zero. This result suggests that degradation of the environment between patches (not necessarily large) can result in species extinction well before a dramatic change in the landscape has taken place. These results agree with the work of Swart and Lawes [1996] on blue monkeys (Cercopithecus mitis), in which they evaluated different management strategies designed to create corridors between forest patches containing wild C. mitis populations and concluded that, in the long term, connecting corridors would increase metapopulation


persistence. In summary, the model implemented in this study suggests that patch connectivity, at the levels of [1] connection quality and [2] number of connections, affects population size: population size decreases as connectivity decreases. However, the system seems to be more sensitive to changes in the number of connections than to changes in connection quality. In order to decrease population size by 85% we needed an 80% decrease in connection quality, but only a 30% reduction in the number of connections between patches was sufficient to produce roughly the same change in population size. In addition, the number of fragments is a very important factor impairing population growth. As the number of fragments increases (while the area remains constant), population size decreases in a quasi-exponential fashion. Finally, unequal fragment size, which is probably the more common case, has negative effects on population growth when most of the area (above 50%) goes into a single patch. Thus, when fragmenting, it is better to have an even distribution of patch sizes. The previous results suggest that, when habitat is being altered, the least damage to the population is achieved by leaving connections between all patches (even at the expense of lower connection quality), having patches of similar size and minimizing the number of patches. Finally, this study provides an example of how mathematical modeling might be used to answer biological questions and address conservation issues at relatively large spatio-temporal scales. Given the limited resources allocated to conservation and the rapid proliferation of anthropogenic activities driving the decline and extinction of wild populations, it becomes crucial to derive the most effective and efficient strategies aimed at increasing the probability of species survival.1 This study provides an example of the potential contributions of mathematical modeling to the field of conservation biology.
The model presented here represents an initial assessment of how forest fragmentation influences the population dynamics of brown spider monkeys; it has limitations that will need to be addressed in the near future. For example, the model does not include the effects of additional variables that may play a strong role in the survival of wild spider monkey populations in fragmented landscapes, such as the synergistic effects of habitat fragmentation, accessibility of previously remote areas and hunting pressure, as described by Michalski and Peres in 2005. Nonetheless, it provides the first assessment of the effects of (i) habitat connectivity, (ii) patch size, and (iii) patch number on wild brown spider monkey populations, and builds on our current understanding of the effects of fragmentation on the remaining


populations of the critically endangered brown spider monkey.

References

1. L. Fahrig, Ecological Applications, p. 346–353 (2002).
2. E. Barbier and J. Burgess, Journal of Economic Surveys 15, p. 413–433 (2001).
3. R. MacArthur and E. Wilson, The Theory of Island Biogeography (Princeton University Press, Princeton, New Jersey, 1967).
4. T. Lovejoy, R. Bierregaard Jr., A. Rylands, J. Malcolm, C. Quintela, L. Harper, K. Brown Jr., A. Powell, G. Powell, H. Schubart and M. Hays, Edge and other effects of isolation on Amazon forest fragments (Sinauer, Sunderland, Massachusetts, 1986), ch. Conservation Biology: the Science of Scarcity and Diversity, p. 257–285.
5. F. Michalski and C. A. Peres, Biological Conservation, p. 383–396 (2005).
6. C. A. Peres, Conservation Biology 15, p. 1490–1505 (2001).
7. W. Laurance, Biological Conservation, p. 109–117 (1999).
8. O. Phillips et al., Science 323, p. 1344 (2009).
9. A. Etter, C. McAlpine, K. Wilson, S. Phinn and H. Possingham, Agriculture, Ecosystems and Environment, p. 369–386 (2006).
10. A. Morales-Jimenez, Modeling distributions for Colombian spider monkeys (Ateles sp.) using GARP and GIS to find priority areas for conservation, Master’s thesis, Oxford Brookes University (2004).
11. A. Link, A. De Luna and J. Burbano-Giron, Estado de conservación en Colombia de uno de los primates mas amenazados con la extinción: El mono araña café (Ateles hybridus), ch. Primates amenazados de Colombia.
12. K. A. Gilbert, Primates and fragmentation of the Amazon forest (Kluwer Academic, New York, 2003), ch. Primates in fragments: Ecology and conservation, p. 145–157.
13. A. Di Fiore, A. Link and C. J. Campbell, The atelines: Behavioral and socioecological diversity in a New World radiation (Oxford University Press, Oxford), ch. Primates in perspective, p. 155–188.
14. M. Franzen, Environmental Conservation, p. 36–45 (2006).
15. A. Di Fiore, A. Link and J. L. Dew, Diets of wild spider monkeys (Cambridge University Press, New York, 2008), ch. Spider monkeys: Behavior, ecology and evolution of the genus Ateles, p. 81–137.
16. K. Strier, American Journal of Physical Anthropology 88, p. 515–524 (1992).
17. S. A. Suarez, International Journal of Primatology 27, p. 411–436 (2006).
18. M. M. Symington, Ecological and social correlates of party size in the black spider monkey, Ateles paniscus chamek, PhD thesis, Princeton University (Princeton, NJ, 1987).
19. Y. Shimooka, C. Campbell, A. Di Fiore, A. M. Felton, K. Izawa, A. Link, A. Nishimura, G. Ramos-Fernandez and R. Wallace, Demography and group composition of Ateles (Cambridge University Press, Cambridge, 2008), ch. Spider monkeys: Behavior, ecology and evolution of the genus Ateles, p. 329–348.
20. R. A. Mittermeier, J. Wallis, A. B. Rylands, J. U. Ganzhorn, J. F. Oates, E. A. Williamson, E. Palacios, E. W. Heymann, M. C. M. Kierulff, L. Yongcheng, J. Supriatna, C. Roos, S. Walker, L. Cortés Ortiz and C. Schwitzer, Primate Conservation, 1 (2009).
21. G. Cowlishaw and R. Dunbar, Primate Conservation Biology (Chicago University Press, Chicago, 2000).
22. J. Swart and M. J. Lawes, Ecological Modeling 93, p. 57–74 (1996).
23. C. A. Chapman, M. J. Lawes, L. Naughton-Treves and T. Gillespie, Primates in Fragments: Ecology and Conservation (Kluwer Academic/Plenum Publishers, New York, NY, 2003), ch. Primate survival in community-owned forest fragments: Are metapopulation models useful amidst intensive use?, pp. 63–78.
24. S. Mandujano, L. Escobedo-Morales and R. Palacios-Silva, Neotropical Primates 12, p. 126–131 (2004).
25. G. Ramos-Fernandez, J. Mateos, O. Miramontes, G. Cocho, H. Larralde and B. Ayala-Orozco, Behavioral Ecology and Sociobiology 55, p. 223–230 (2008).

May 7, 2013

17:21

BC: 8846 - BIOMAT 2012

18˙carels

THE CONTRIBUTION OF STOP CODON FREQUENCY AND PURINE BIAS TO THE CLASSIFICATION OF CODING SEQUENCES∗

N. CARELS
Laboratório de Genômica Funcional e Bioinformática, Fundação Oswaldo Cruz (FIOCRUZ), Instituto Oswaldo Cruz (IOC), Rio de Janeiro, RJ, Brazil. E-mail: [email protected]

D. FRIAS
Departamento de Ciências Exatas e da Terra, Universidade do Estado da Bahia (UNEB), Salvador, BA, Brazil.

In this report, we revisited the classification of coding sequences (CDS) based on nucleotide statistics using the Universal Feature Method (UFM). We show that the rules (i) G1>(1.1*G2) (G1 and G2 are the guanine levels in 1st and 2nd position of contiguous DNA triplets, respectively) and (ii) T1<A2 [...] G1>(1.1*G2) and T1<A2 [...] G1>G2 is a key rule of coding frame diagnosis 22,23. However, G1>G2 may also occur in the frame −1. Given the inherent variability of the base composition in CDSs 24, it is necessary to consider features that take the six frames into account. In this work, we revisit the main determinants of the coding frame and how they allow CDS classification using very simple features.

2. Material and Methods

2.1. Basic considerations on the base composition and their consequences

The linear relationship of GC2 (guanine or cytosine in 2nd position of codons) or GC1 (GC in 1st position of codons) vs. GC3 (GC in 3rd position of codons) over the complete interval of GC in the biosphere has been called the universal correlation 24,25. The regression lines of the universal correlation are GC3 = 3.07 GC1 − 116.21 and GC3 = 5.32 GC2 − 163.41; the intervals of variation are 35% < GC1 < 75%, 18% < GC2 < 55% and 5% < GC3 < 95% 25. According to its biological interpretation, the universal correlation is generated by the mutational bias, that is, the preferential incorporation of AT or GC by the DNA repair machinery according to the species 24. The largest interval of variation, that of GC3, is expected from the degeneracy of the genetic code in the 3rd position of codons. As expected from the codon redundancy for amino acid synthesis, the smallest GC interval of variation is associated with GC2. According to their intra-genomic variation in GC3, genomes can be classified as homogeneous (∆GC < 50%), which is the case for the majority of biological species, or heterogeneous (∆GC > 50%), which is the case for warm-blooded vertebrates (mammals and birds) and plant species from the Poaceae (Gramineae) family.
The G1 > G2 rule has been shown to occur in CDSs of organisms whose genomic base compositions cover the complete interval of GC (guanine + cytosine) known so far 25, such as Plasmodium falciparum (0% < GC3 < 30%), Arabidopsis thaliana (25% < GC3 < 65%), Oryza sativa (25% < GC3 < 100%), Drosophila melanogaster (40% < GC3 < 85%), Homo sapiens (30% < GC3 < 90%) and Chlamydomonas reinhardtii (60% < GC3 < 100%) 23. The fact that T1 < A2 in the coding frame is another rule that we introduce here and that also holds for the species just quoted, i.e., for the entire GC range of the universal correlation 23. In order to simplify the presentation, we discuss the CDS classification for only two species: Homo sapiens (the human) and Oryza sativa (rice).


Figure 1. The universal correlation. Since the universal correlation was drawn as the average GC2 vs. average GC3 of CDSs per genome, panel A (modified from [25]) represents the inter-genomic universal correlation (U.c.). Open circles are for prokaryotes and black circles for eukaryotes. Panels B and C are for the intra-genomic universal correlation of H. sapiens (Hs) (Fedorov’s group sample, see below) and O. sativa (Os) (TIGR sample, see below), respectively. In this case GC2 and GC3 are the averages per sequence.

We chose these two species because (i) they belong to two different phyla (warm-blooded vertebrates and plants) separated from their common ancestor by 1 billion years, (ii) they share roughly the same GC3 interval in their CDSs 26,27, which accounts for 75% of the GC3 range of the universal correlation (Fig. 1), (iii) they show a purine bias in their CDSs that follows a trend similar to that of C. reinhardtii and P. falciparum, which stand at the two extremities of the universal correlation, and (iv) their genomes are well described. These reasons are sufficient to generalize the classification methodology outlined below to the coding DNA of other living organisms.

2.2. Rice sequences

We selected a sample of 59,712 rice CDSs from the TIGR dataset (http://www.jcvi.org, 2005) and cleaned it to eliminate sequences annotated as transposons or retrotransposons. The sequences from the remaining dataset (n=47,314) were then translated into their corresponding amino acids and compared to PDB (January, 2010) for homology (E ≤ 0.0001) using BLASTp. We did this to ensure that our sample was based on experimental evidence, which warrants the elimination of possible false positives from in silico predictions. The homologous hits were then filtered to keep only the best hit for each rice accession (n=13,989). We further filtered the list to keep pairs with an identity ≥ 35% (n=8,547) in order to ensure functional similarity and used the accession identifiers to retrieve their


corresponding DNA sequences from the original CDSs.

2.3. Human sequences

We used the curated dataset of 23,366 human CDSs from Fedorov’s group listed in the file hs37p1.EID.tar.gz, which can be downloaded from http://www.utoledo.edu/med/depts/bioinfo/database.html 28,29. The protein sequences translated from these CDSs were compared with PDB for homology (E ≤ 0.0001), as above. The homologous hits were then filtered to keep only the best hit (n=13,672) for each human accession. We then filtered the list to keep the pairs with an identity ≥ 35% (n=10,892) and used the accession identifiers to retrieve their corresponding DNA sequences from the original CDS file (23,366). The curated sample of human intron sequences (hs35p1.intrEID) that we used to measure the success rate of CDS vs. non-CDS classification was also from Fedorov’s group. For the sake of comparison with CDSs, we considered only the first 10,348 sequences of this file that were larger than 300 bp. We did not consider rice introns here because a curated sample is not currently available. However, the CDS vs. non-CDS classification is less difficult in rice than in the human 22 because the difference in GC level between CDSs and introns is larger in the former 30,31 than in the latter 32,33.

2.4. Classification method

We (i) measured the relative frequencies, Pi(j), of the four nucleotides i (i = {A, C, G, T}) in the three positions j of triplets (j = {1, 2, 3}) over all six frames k (three on the plus strand “+”, i.e. Fr+1, Fr+2, Fr+3, and three on the minus strand “−”, i.e. Fr−1, Fr−2, Fr−3) of the rice and human CDSs using a Perl script; (ii) calculated UFMk 22 over the six frames, with UFM standing for the scoring function:

\[ f_k = \frac{P_{A(1)} \, P_{G(1)}}{P_{C(1)} \, P_{G(2)} \, P_{A(3)} + \mathrm{STOP} + 0.01} \tag{1} \]

where STOP is the number of stop codons in-frame with the frame k considered; (iii) tested, in Excel, whether the condition “UFM(k) > UFM(k+1), UFM(k) > UFM(k+2) together with G(k) > (1.1 ∗ G(k+1)) and T(k) < A(k+1)” was true (1) or false (0) according to the frame, i.e. [UFM1 > UFM2 & UFM1 > UFM3 & G1 > (1.1 ∗ G2) & T1 < A2]k. If the condition was not true, the sequence was considered non-coding. In order to


measure the contribution of the purine bias to the coding frame and coding status independently of the stop codon contribution, we considered two cases: (i) the algorithm just described with equation (1) and (ii) the same algorithm with equation (1) stripped of the “STOP” variable, i.e.:

\[ f_k = \frac{P_{A(1)} \, P_{G(1)}}{P_{C(1)} \, P_{G(2)} \, P_{A(3)} + 0.01} \tag{2} \]
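The scoring and the rule test of this section can be sketched in Python as follows. This is a minimal illustration, not the authors' Perl and Excel pipeline. It assumes that the Pi(j) are expressed as fractions between 0 and 1 and that the stop codons are those of the standard genetic code, and it applies the G1 > (1.1 ∗ G2) and T1 < A2 test once, to the frame with the highest score over the six frames. All function names are hypothetical.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code (assumption)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def codons(seq, offset):
    """Complete triplets of seq read from the given offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def position_freqs(triplets):
    """P[(base, j)]: fraction of triplets with `base` at position j (1..3)."""
    n = max(len(triplets), 1)  # avoid dividing by zero on very short input
    counts = Counter((t[j], j + 1) for t in triplets for j in range(3))
    return {(b, j): counts[(b, j)] / n for b in "ACGT" for j in (1, 2, 3)}


def ufm(triplets, use_stop=True):
    """Score of one frame: equation (1), or equation (2) if use_stop is False."""
    p = position_freqs(triplets)
    stops = sum(t in STOP_CODONS for t in triplets) if use_stop else 0
    numerator = p[("A", 1)] * p[("G", 1)]
    denominator = p[("C", 1)] * p[("G", 2)] * p[("A", 3)] + stops + 0.01
    return numerator / denominator


def classify(seq, use_stop=True):
    """Index of the putative coding frame (0-2 plus strand, 3-5 minus strand),
    or None if the best-scoring frame fails the G1/T1 rules."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    frames = [codons(s, k) for s in strands for k in range(3)]
    scores = [ufm(f, use_stop) for f in frames]
    best = max(range(6), key=scores.__getitem__)
    p = position_freqs(frames[best])
    passes = p[("G", 1)] > 1.1 * p[("G", 2)] and p[("T", 1)] < p[("A", 2)]
    return best if passes else None


toy = "GAAACA" * 5            # artificial sequence, used only for illustration
print(classify(toy), classify("C" + toy), classify("T" * 30))
```

On the artificial sequence the sketch selects the first plus-strand frame, follows the frame when one base is prepended, and rejects a run of T as non-coding. Real CDSs would of course need the curated samples described above.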

We also considered whether to implement the “G1 > (1.1 ∗ G2) & T1 < A2” condition (i) within the loop, in the process of frame scanning for the maximum of UFM (“in-loop filter”), or (ii) outside the loop, after choosing the coding frame candidate as the one associated with the maximum of UFM over the six frames (“end filter”). The difference is that the first case implies considering the two strands independently and choosing the putative coding frame in two steps: (i) among the three frames of each strand and (ii) between the two candidate frames of the two strands. In the second case, the choice is made over the six frames in one step. We scored the rates of true positives (considered coding and in the coding frame when they are), false positives (considered coding and in the coding frame when they are not), false negatives (considered not coding or not in the coding frame when they are) and true negatives (considered not coding or not in the coding frame when they are not) with the corresponding logical tests in Excel according to the features just outlined.

3. Results

3.1. Base composition constraints

The purine bias is characterized by a regularity of the G1 vs. G2 distribution such that most CDSs are restricted to a 22.5% ≤ G1 ≤ 50% vs. 10% ≤ G2 ≤ 30% domain in their coding frame. The bias of the G1 vs. G2 levels in the coding frame causes nucleotide compensations in the +2 and +3 frames in such a way that almost all of the points are excluded from the G1 vs. G2 domain in these two frames (Figs. 2B, 2C). The G1 vs. G2 domain corresponds roughly to the Boolean string “100”, for true in the frame “+1” (Fig. 2A), false in the frame “+2” (Fig. 2B) and false in the frame “+3” (Fig. 2C). The G1 vs. G2 pattern is very robust since the G1 vs. G2 domain is practically identical in the human and rice (see Fig. 3) despite a ~1 billion year separation from their common ancestor and varies only a little from P.


Figure 2. G1 vs. G2 scatter plots among the six frames (Fr) of H. sapiens CDSs (n=10,892). The level of G1 is generally larger than that of G2 in the coding frame (+1). In their coding frame (panel A), CDSs are restricted to a G1 vs. G2 domain (vertical and horizontal lines) whose level varies in the ranges of roughly 22.5% to 50% and 10% to 30%, respectively. This domain of G1 vs. G2 variation is generally avoided in the other two frames (+2 and +3) of the coding strand (panels B,C). However, this pattern can be imitated on the minus strand (panels D,E,F).

falciparum to C. reinhardtii (data not shown). Unfortunately, the “true, false, false” pattern of G1 vs. G2 may occasionally be imitated on the minus strand (Figs. 2D,E,F). UFM translates this pattern into numbers (Fig. 4A), but the UFM score also reflects the ambiguity on the minus strand (Fig. 4B). However, the UFM score in the coding strand is generally larger than the values that it takes in the minus strand (Fig. 5A,B,C). Therefore, when aiming to classify the coding frame, the best strategy is to eliminate the interference from the frames of the minus strand. Considering complementary bases, the minus strand is a mirror of the coding strand, which means that frame −1 has the same variability in nucleotide composition as frame +3 (Fig. 3). An example of interference is that the range of T variation is narrow in frame +1 but large in frame −1, with the consequence that T has no classification power in frame −1. Confusion between frame +1 and frame −1 or −2 is common and difficult to resolve. Figure 6 gives an example of this kind of imprecision. The


Figure 3. The pattern of base variation among frames is similar in H. sapiens (Hs) and O. sativa (Os). Base distributions on the coding strand (“+” strand) are indicated by a bold line for the 1st (+1) triplet position, by a dashed line for the 2nd (+2) triplet position and by a thin line for the 3rd (+3) triplet position. In the complementary strand (“−” strand), the bold line likewise indicates the 1st (−1) triplet position, the dashed line the 2nd (−2) triplet position and the thin line the 3rd (−3) triplet position. Adenine (A), guanine (G), cytosine (C) and thymine (T) distributions are represented in panels A,B,C,D; E,F,G,H; I,J,K,L; and M,N,O,P, respectively. Pi is the relative frequency of the base considered.

Figure 4. Frame scoring by UFM (formula 2) in H. sapiens in the coding strand (panel A) and in the complementary strand (panel B). Bold, dashed and thin lines are for the 1st, 2nd and 3rd frames (the small numbers) of the strand under consideration.

comparison of GC2 vs. GC3 scatter plots of CDSs from Fedorov’s group (Fig. 6A) and of the same CDSs, but whose coding frame was deduced by


Figure 5. Classification power of UFM in H. sapiens (Hs). The score value of UFM (formula 2) is generally higher in the coding frame (y axis) than in the concurrent frames of the minus strand (x axis).

alignment with the protein sequences of PDB (Fig. 6B) reveals doubtful sequences, marked by the gray circle in Fig. 6A. These sequences could be CDSs diagnosed in frame −2 rather than frame +1, or simply non-coding false positives, since they do not appear in Fig. 6B.

Figure 6. GC2 vs. GC3 scatter plots of CDSs in H. sapiens (Hs). The sequence sample curated by Fedorov’s group (panel A) is compared to the scatter plot of the same sequence sample after alignment with the protein sequences of PDB (panel B). The gray circle of panel A indicates sequences that are most probably in frame −2 rather than frame +1, or possibly non-coding sequences.

Among the consequences of the G1 vs. G2 pattern, we have that the statement “UFM1>UFM2” & “UFM1>UFM3” & “UFM1>1” & “G1 > (1.1∗G2)”22 & “T1 < A2” […] “G1 > (1.1∗G2)” & “T1 < A2” […] “UFM1>UFM2” & “UFM1>UFM3” is generally sufficient to filter out most false positives from the minus strand. Therefore, the false positive rate on the coding frame diagnosis is around 5% or less and the false negative rate is around 5%. The detailed investigation of the system variables needs the independent


analysis of stop codon and purine bias contributions. This can be assessed by implementing the UFM classification with or without the stop codon contribution in the formula. The other question is whether a difference is observed when filtering the UFM diagnosis (i) after processing all six frames (Fig. 8A) or (ii) after processing the three frames of each strand separately, before taking the final decision on which strand carries the putative coding frame (Fig. 8B).

Figure 7. T1 vs. A2 scatter plots among the six frames (Fr) of H. sapiens CDSs (n=10,892). The level of A2 is generally larger than that of T1 (panel A) in the coding frame (+1). The contrary occurs on the minus strand in the frame −2 (panel E).
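The compositional conditions used throughout this section, “G1 > (1.1 ∗ G2)” and “T1 < A2”, can be illustrated with a minimal Python sketch. It assumes that G1 and T1 are the relative frequencies of G and T in the first codon position, and G2 and A2 those of G and A in the second codon position, of the frame under test. The function names and the `ratio` parameter are hypothetical and are not taken from the original Excel implementation.

```python
from collections import Counter


def codon_position_freqs(seq, frame=0):
    """Relative base frequencies at codon positions 1, 2, 3 of one frame.

    `frame` is the 0-based offset (0, 1 or 2) at which reading starts.
    Returns a list of three dicts mapping 'A', 'C', 'G', 'T' to a frequency.
    """
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3          # ignore an incomplete last codon
    freqs = []
    for pos in range(3):
        bases = seq[pos:usable:3]             # every third base from `pos`
        counts = Counter(bases)
        total = len(bases) or 1               # avoid division by zero
        freqs.append({b: counts[b] / total for b in "ACGT"})
    return freqs


def passes_composition_filter(seq, frame=0, ratio=1.1):
    """True if G1 > ratio * G2 and T1 < A2 hold in the given frame."""
    p1, p2, _ = codon_position_freqs(seq, frame)
    return p1["G"] > ratio * p2["G"] and p1["T"] < p2["A"]
```

On the artificial repeat `"GAT" * 10`, for instance, the conditions hold when reading starts at offset 0 (G fills the first codon position and A the second) and fail at offsets 1 and 2.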

3.2. Coding frame classification

3.2.1. UFM without the stop codon contribution

The results of UFM classification of the coding frame among the six frames of human and rice CDSs without the stop codon contribution are given in Table 1. According to Table 1, the purine bias allows the classification of the coding frame with a false positive rate lower than 5%, on average. By contrast, the false negative rate is large, around 16%, when filtering occurs after UFM diagnosis among the six frames. The reduction


Figure 8. UFM classification can be implemented by filtering the frame that satisfies the maximum of UFM over the six frames with compositional conditions (panel A). We called this strategy “end filtering”. Alternatively, filtering by compositional conditions can be implemented at each decision step, with the consequence that the final decision occurs between the best candidates of each strand (panel B). We called this strategy “in-loop filtering”.

of the false positive rate by the filtering process was ∼10% in both rice and human. The corresponding cost in false negative rate increase was only ∼2%, which means a net increase of the success rate by ∼6% (Table 1). Because UFM has been designed to optimize the contribution of the purine bias, the rate of false negatives is low on the coding strand. As a consequence, if a higher value of the UFM scoring function exists on the minus strand, a solution will also exist on the actual plus strand. The “in-loop filtering” process gives a chance to filter out the solution of the minus strand and to select the solution of the actual coding strand if it satisfies the filter condition better (algorithm Fig. 8B).

Table 1. The contribution of the purine bias to the coding frame diagnosis in H. sapiens (n = 10,891) and O. sativa (n = 8,641). The classification was carried out with UFM without stop codon contribution (formula 2) using the “end filtering”. The values in the table are in %.


This would not be possible when filtering after deciding which frame corresponds to the maximum of the UFM scoring function (algorithm Fig. 8A). In that case, the solution of the minus strand would be selected and then eliminated by the filters. This is the reason why the false negative rate of the “in-loop filtering” (Fig. 8B) is lower by as much as ∼6% in human and by ∼8% in rice (Table 2) than that of the “end filtering” (Fig. 8A, Table 1). The reduction of the false negative rate comes at the cost of an increase of the false positive rate by a factor of ∼2, which means a net benefit of 4 to 6% on the global error.
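The contrast between the two strategies can be sketched in a few lines of Python. This is only one reading of the decision schemes of Fig. 8, under stated assumptions: the UFM scores of the six frames are taken as precomputed inputs (formulas 1 and 2 are not reproduced), the dictionary `passes` holds the outcome of the compositional conditions for each frame, and the frame keys and function names are hypothetical.

```python
def end_filter(ufm, passes):
    """Pick the frame with the highest UFM over all six frames, then filter it.

    Returns the selected frame, or None if that frame fails the filter.
    """
    best = max(ufm, key=ufm.get)
    return best if passes[best] else None


def in_loop_filter(ufm, passes):
    """Pick the best frame of each strand, filter each candidate, then
    decide between the candidates that survived the filter."""
    candidates = []
    for strand in ("+", "-"):
        frames = [f for f in ufm if f.startswith(strand)]
        best = max(frames, key=ufm.get)       # best frame of this strand
        if passes[best]:
            candidates.append(best)
    return max(candidates, key=ufm.get) if candidates else None


# Hypothetical scores: the minus strand imitates the coding pattern.
ufm = {"+1": 4.0, "+2": 1.0, "+3": 0.5, "-1": 1.2, "-2": 4.5, "-3": 0.8}
passes = {"+1": True, "+2": False, "+3": False,
          "-1": False, "-2": False, "-3": False}
```

With these made-up values, `end_filter` selects frame −2 and then rejects it, which yields a false negative, whereas `in_loop_filter` discards frame −2 within its own strand and returns frame +1. This mirrors the argument made above, without claiming to reproduce the original implementation.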

Table 2. The contribution of the purine bias to the coding frame diagnosis in H. sapiens (n = 10,891) and O. sativa (n = 8,641). The classification was carried out with UFM without stop codon contribution (formula 2) using the “in-loop filtering”. The values in the table are in %.

3.2.2. UFM with the stop codon contribution

The contribution of stop codons to the UFM success rate with “end filtering” is given in Table 3.

Table 3. The contribution of the purine bias to the coding frame diagnosis in H. sapiens (n = 10,891) and O. sativa (n = 8,641). The classification was carried out with UFM with stop codon contribution (formula 1) using the “end filtering”. The values in the table are in %.

At first glance, it appears obvious that stop codons present in the five non-coding frames decrease the information loss that results from false positives that would have a higher purine bias than that of the coding frame, but would be eliminated by compositional filters. This is obvious since the contribution of stop codons to UFM (formula 1) promotes the maximum of its score function in the coding frame. The direct consequence is that the rate of false negatives decreases by ∼10%, to the benefit of the success rate, and that the false positive rate decreases below 1%.

Table 4. The contribution of the purine bias to the coding frame diagnosis in H. sapiens (n = 10,891) and O. sativa (n = 8,641). The classification was carried out with UFM with stop codon contribution (formula 1) using the “in-loop filtering”. The values in the table are in %.

In the case of the “in-loop filtering”, the contribution of stop codons to the UFM success rate lies in the increased robustness with regard to sequence entropy. The comparison of Tables 3 and 4 shows that the false negative rate decreases upon introduction of the stop codon variable in UFM (formula 1) with, as a corollary, an increase of the success rate by ∼5% and a leveling of the false positive rate at 2-3%, independently of the species being H. sapiens or O. sativa (Table 1 shows that the noise in H. sapiens


CDSs is larger than in CDSs of O. sativa by about 4%). Compared to Table 3, Table 4 shows that even if the “in-loop filtering” strategy generates a larger false positive rate (∼2%) than the “end filtering”, it allows an increased robustness of UFM performance to CDS shortening or GC increase.

3.2.3. Stop codon frequency according to the size and GC level of CDSs

The attractive performances depicted in Tables 3 and 4 deserve additional comments: the classification rate presented here is measured on a sample of full-size CDSs statistically representative (∼50%) of the complete gene set. In real cases, CDSs are most of the time obtained as partial stretches (expressed sequence tags, ESTs) and can be very short (around 300 bp or below). When short stretches of DNA are considered, the probability of finding a stop codon (1 over 60 bp) becomes small, but their number is still between 5 and 20 over the five non-coding frames at 200 bp (Fig. 9). In addition, since stop codons (TAA, TAG and TGA) are rich in AT, their probability decreases in the non-coding frames of GC-rich CDSs (Fig. 10).

Figure 9. Frequency of stop codons in CDSs of H. sapiens (n = 10,891). Each dot of the plot is the sum of stop codons among the five non-coding frames (since the coding frame does not have any in-frame stop codon). “r” is the correlation coefficient. “P” is the probability. Stop codon counts scatter between two lines: y = 0.02x and y = 0.1x, with x being the size of CDSs in Kbp.
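The quantity plotted in Fig. 9, the number of stop codons summed over the five non-coding frames, can be computed along the following lines. The sketch assumes that the input sequence starts in its coding frame (+1) and contains only the letters A, C, G and T; the function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the minus strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def count_stops(seq, frame):
    """Number of stop codons read from 0-based offset `frame` of `seq`."""
    return sum(seq[i:i + 3] in STOPS for i in range(frame, len(seq) - 2, 3))


def noncoding_stop_count(cds):
    """Sum of stop codons over the five frames other than frame +1."""
    cds = cds.upper()
    minus = reverse_complement(cds)
    plus_count = count_stops(cds, 1) + count_stops(cds, 2)    # frames +2, +3
    minus_count = sum(count_stops(minus, f) for f in range(3))  # minus strand
    return plus_count + minus_count
```

Conventions differ on how the three frames of the minus strand are numbered relative to the coding frame, but the sum over all three does not depend on that choice.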

Thus, in GC-rich or short CDSs, the performances of UFM would be closer to those of Tables 1 and 2 than to those of Tables 3 and 4. However, UFM is


Figure 10. Relationship between GC3 and stop codon count in H. sapiens (n = 10,891). “r” is the correlation coefficient. “P” is the probability.

Figure 11. Relationship of the UFM score in the coding frame (+1) and the size of CDS in H. sapiens (n=10,891).

largely independent of (i) the CDS size, since its scoring value varies in the range of ∼2 to 8 for CDSs between 0.3 and 5 kb, with an average of 4.8 (Fig. 11) and (ii) the GC level or, in other terms, the codon usage (Fig. 12).

3.3. Coding status classification in non-coding DNA (introns)

The purpose of UFM is to be used for CDS classification in unknown sequence samples. Therefore, the success rate of the coding frame classification must be associated with that of the CDS vs. non-CDS classification. A first basic difference between CDSs and non-CDSs is the purine bias.


Figure 12. Relationship between GC3 of human CDSs (n = 10,891) and UFM score in the coding frame (+1). “r” is the correlation coefficient. “P” is the probability.

Non-CDSs do not show any purine bias. Therefore, their scatter plots of G1 vs. G2 match those of random sequences (Fig. 13). Because the diagonal of the G1 vs. G2 scatter plots of non-CDSs may overlap that of CDSs (Fig. 2A), it is necessary to test the success rate in introns with the same strategy as outlined above. Thus, we apply UFM without stop codon contribution (formula 2) or with stop codon contribution (formula 1) according to both the “in-loop filtering” and “end filtering” strategies previously described and with the condition that UFM > 1 (always satisfied in the coding frame of CDSs).

3.3.1. UFM without the stop codon contribution

Table 5 shows that non-CDSs (introns) can be classified as coding (false positives of CDS classification) in ∼5% and ∼10% of the cases according to the “end filtering” and “in-loop filtering” strategies, respectively. The table also shows that without the combination of G1 > (1.1*G2) and T1 < A2 […] G1 > G2 23 . Whatever the functional reason for this bias, it is conserved throughout living beings, which makes it extremely important, in particular for biotechnology applications. Because of the purine bias, CDSs are the only structural elements that show significant robustness to anchor a process of genome annotation. Promoters, splicing sites and other small signal sequences are too short to be searched on their own. The probability of their existence may possibly become significant in the light of prior knowledge on CDS location. CDS retrieval from transcriptome data is the best strategy since the proof of sequence expression is already given. The exercise is limited to the classification of coding and non-coding sequences in a context whose complexity is simplified.
ORFs can be easily extracted from assembled transcriptome sequences (contigs) and classified for coding potential by species-independent methods such as FT18, SRM (Spectral Rotation Measure)19, AMI20 and UFM22, with the last one allowing the confirmation that the ORF is effectively classified in the coding frame. The process can be easily automated in line with the training of HMM or support vector machine (SVM). This strategy should help the automated gene search in genomes without prior knowledge. A satisfactory success rate (> 94%) and error rate (< 5%) using UFM are warranted despite huge GC3 heterogeneity. UFM may also help to take a decision when the coding status of a CDS candidate is doubtful35. Accordingly, it could be necessary to verify the coding potential of putative CDSs generated by automated processes in the context of synthetic biology36. The minimal condition is, indeed, that the sequence agrees with the RNY16 (Rrr)23 pattern if one intends to use the cell machinery for its expression. Finally, it is expected that new protein functions could be found by sequencing bulk metagenomic samples37,38. If a function is new, by definition there is no reference for it. The question is therefore to validate the sequence as coding before assaying it and measuring its potential. Again, our work suggests that respecting the Rrr bias is a necessary condition for any sequence to be considered as coding. When this condition is fulfilled, the translation, modeling and expression


should be successful.

5. Conclusions

We believe that G1 vs. G2 is the primary determinant of coding DNA. This feature is rather independent of the GC composition of CDSs. UFM is a simple method that does not require sophisticated programming skills or theoretical background for testing. The in-loop filtering implementation of UFM with the conditions “G1 > (1.1 ∗ G2) & T1 < A2” and stop codon contribution is the best strategy to warrant robust classification of the coding frame as well as the coding status of CDSs as small as 300 bp over the whole GC3 range of organisms. UFM is largely independent of the codon usage and may serve for (i) prior knowledge acquisition via coding ORF extraction from transcriptome sequences, (ii) coding sequence extraction from bulk sequence samples from metagenomic sequencing, (iii) false positive filtering of CDSs classified by other algorithms and (iv) purine bias matching in automated processes of synthetic sequence design.

Acknowledgements

We thank the Núcleo de Biologia Computacional e Gestão de Informações Biotecnológicas (NBCGIB) of the Universidade Estadual de Santa Cruz (UESC, Ilhéus, Brazil) for allowing the use of its computing facilities.

References

1. J. W. Fickett, Nucleic Acids Res. 10, 5303 (1982).
2. R. Staden and A. D. McLachlan, Nucleic Acids Res. 10, 141 (1982).
3. O. White, T. Dunning, G. Sutton, M. Adams, J. C. Venter and C. Fields, Nucleic Acids Res. 21, 3829 (1993).
4. P. M. Sharp, E. Cowe, D. G. Higgins, D. C. Shield, K. H. Wolfe and F. Wright, Nucleic Acids Res. 16, 8207 (1988).
5. M. Borodovsky and J. McIninch, Comput. Chem. 17, 123 (1993).
6. A. S. Lapedes, C. Barnes, C. Burks, R. Farber and K. Sirotkin, in Computers and DNA, G. Bell and T. Marr, Eds., Addison-Wesley, Redwood City, CA, 157 (1990).
7. E. C. Uberbacher and R. J. Mural, Proc. Natl. Acad. Sci. USA 88, 11261 (1991).
8. R. B. Farber, A. S. Lapedes and K. M. Sirotkin, J. Mol. Biol. 226, 471 (1992).
9. Y. Xu, R. J. Mural and E. C. Uberbacher, Comput. Appl. Biosci. 10, 613 (1994).


10. E. E. Snyder and G. D. Stormo, J. Mol. Biol. 258, 1 (1995).
11. A. Krogh, I. S. Mian and D. Haussler, Nucleic Acids Res. 22, 4768 (1994).
12. P. Baldi and S. Brunak, Bioinformatics: The machine learning approach, 2nd ed., chapter 7. MIT Press, Cambridge, MA (2001).
13. A. Abu-Hanna, Artificial Intelligence in Medicine, Elsevier 16, 201 (1999).
14. S. Audic and J. M. Claverie, Proc. Natl. Acad. Sci. USA 95, 10026 (1998).
15. A. Lomsadze, V. Ter-Hovhannisyan, Y. O. Chernoff and M. Borodovsky, Nucleic Acids Res. 33, 6494 (2005).
16. J. C. W. Shepherd, Proc. Natl. Acad. Sci. USA 78, 1596 (1981).
17. E. N. Trifonov and J. L. Sussman, Proc. Natl. Acad. Sci. USA 77, 3816 (1980).
18. S. Tiwary, S. Ramchandran, A. Bhattacharya, S. Bhattacharya and R. Ramaswamy, CABIOS 13, 263 (1997).
19. D. Kotlar and Y. Lavner, Genome Res. 13, 1930 (2003).
20. I. Grosse, H. Herzel, V. H. Buldyrev and H. E. Stanley, Phys. Rev. E 61, 5624 (2000).
21. C. Nikolaou and Y. Almirantis, J. Mol. Evol. 59, 309 (2004).
22. N. Carels and D. Frias, Bioinform. Biol. Insights 3, 141 (2009).
23. N. Carels, R. Vidal and D. Frias, Bioinform. Biol. Insights 3, 37 (2009).
24. N. Sueoka, Proc. Natl. Acad. Sci. USA 47, 1141 (1961).
25. G. D’Onofrio, K. Jabbari, H. Musto and G. Bernardi, Gene 238, 3 (1999).
26. G. Bernardi, Gene 241, 3 (2000).
27. N. Carels, Research Signpost 3, 129 (2005).
28. S. Saxonov, I. Daizadeh, A. Fedorov and W. Gilbert, Nucleic Acids Res. 28, 185 (2000).
29. V. Shepelev and A. Fedorov, Brief. Bioinform. 7, 178 (2006).
30. N. Carels, P. Hatey, K. Jabbari and G. Bernardi, J. Mol. Evol. 46, 45 (1998).
31. N. Carels and G. Bernardi, Genetics 154, 1819 (2000).
32. O. Clay, S. Caccio, S. Zoubak, D. Mouchiroud and G. Bernardi, Mol. Phylogenet. Evol. 5, 2 (1996).
33. M. Costantini and G. Bernardi, Gene 410, 241 (2008).
34. M. L. Chiusano, F. Alvarez-Valin, M. Di Giulio, G. D’Onofrio, G. Ammirato, G. Colonna and G. Bernardi, Gene 261, 63 (2000).
35. N. Carels, R. Vidal, R. Mansilla and D. Frias, FEBS Lett. 568, 155 (2004).
36. A. Villalobos, J. E. Ness, C. Gustafsson, J. Minshull and S. Govindarajan, BMC Bioinformatics 7, 285 (2006).
37. K. J. Hoff, M. Tech, T. Lingner, R. Daniel, B. Morgenstern and P. Meinicke, BMC Bioinformatics 9, 217 (2008).
38. N. G. Yok and G. L. Rosen, BMC Bioinformatics 12, 20 (2011).

May 9, 2013

11:54

BC: 8846 - BIOMAT 2012

19˙ﬂesia

MULTICLASS CLASSIFICATION OF TREE STRUCTURED OBJECTS: THE K-NN CASE

ANA GEORGINA FLESIA
Conicet at UTN-Regional Córdoba and FaMAF-UNC
Ing. Medina Allende s/n, Ciudad Universitaria CP 5000, Córdoba, Argentina.
E-mail:[email protected]

In this paper, we consider the problem of supervised classification of tree structured objects. The tree structured population being included in a metric space, we define a k-nearest neighbors (k-nn) procedure and argue for its reliability by showing its statistical consistency. We assess finite sample classification errors on two different sets of trees. The first is an example from proteomics: we define a k-nn classification procedure based on Variable Length Markov Chain modeling of primary sequence protein families from the Pfam database, and compare its performance with the standard Hidden Markov Chain approach, obtaining competitive classification errors. The second is an example from phylogenetics: we define a k-nn classification procedure on binary labeled trees to study the differences introduced by several phylogenetic tree building methods and the bootstrap on flu virus data. Low classification errors imply significant differences between trees.

1. Introduction

Structured information, organized in a multilevel or hierarchical fashion, is often represented as a simple tree: a collection of nodes and edges, each edge connecting a pair of nodes. Trees can be rooted or not, grow unbounded or have a fixed number of children, have labels on the leaves or no labels at all, and have a vector of attributes assigned to each node or be simply a geometric structure. Sets of trees are very diverse, and each application implies a particular definition of tree. We are concerned with the construction of random elements in a set of trees, which allows the definition of classification rules. Metric spaces are then a natural choice 3,4. We now introduce some examples of metric spaces of trees.


Binary trees with labeled terminal nodes

Biologists are used to taxonomic hierarchies: species are grouped into genera, which are grouped into families and so on, building a graph called a phylogenetic tree. In this molecular genomic era, phylogenetic trees are usually constructed from amino acid discrepancies in a specific protein sampled from many species. When many related proteins are used, a random sample of descendant trees (cladograms) is obtained, each with labeled terminal nodes (the species) and unlabeled internal nodes. Information is then summarized in a central tree, or consensus tree; see Ref. 9 and references therein for a summary of specific methods. Problems appear when information from different sources is gathered, for example, when merging databases with data obtained at different time frames, or when genomic data is collected from a sample of different subjects. In these cases, a single consensus tree may prove unsatisfactory, since different populations of trees may have emerged. Information about the whole set of candidate trees is lost, including how trees are distributed on the set of all binary trees and how similar the trees are to each other. In Ref. 20, a reduced set of characteristic trees was proposed instead of a single consensus tree, by clustering the data using data mining techniques and computing the consensus tree of each cluster, with the Robinson-Foulds metric. In Flesia (2009) 16, the statistical consistency of the clustering procedures for trees used in Ref. 20 was studied, finding that metagenomic data may indeed generate different trees.

Trees with bounded number of children but unlabeled nodes

Another important problem in computational molecular biology is the determination of protein functionality using only information about its primary sequence. Large databases of protein sequences, which can be seen as populations of proteins, have been constructed with semi-supervised methods based on sequence alignment. To construct the Pfam database 1, a small sample of curated, well-known proteins was selected to build one Hidden Markov Model (HMM) for each family (population), assigning the rest of the proteins to the family of the model with the highest score. Recently, the authors of Ref. 11 mapped each protein sequence to a tree with up to 20 children per node (amino acid codes), fixed depth D, and unlabeled terminal nodes, transforming the population of proteins into populations of trees. They built a metric random space of trees for this tree topology, with a Hamming type of metric, and defined nonparametric hypothesis tests for the two-sample case. In Flesia (2011) 15, a simpler nonparametric test was


proposed for the same problem.

Trees with attributes on each node

Medical image analysis is motivating some interesting new developments in statistical analysis. These new developments are not in traditional imaging areas, such as the denoising, segmentation and enhancement of a single image, but instead concern the analysis of populations of images. Again, common goals include finding center points and variation about the center, as well as classification and clustering. One special example, introduced in Ref. 21, is the analysis of tree-shaped objects extracted from 3D images. These tree-shaped objects are segmentations of images of blood vessels of the brain from several patients, and the features collected from them are the tree topology of the blood vessels (connections between arteries) and attributes of the outgoing edges, such as length and 3D orientation, among others. The metric space of trees considered in Ref. 21 is an extension of the one considered in Ref. 11 for testing statistical hypotheses of differences between populations of trees. In a more recent paper, a large scale study of cerebrovascular trees 2, only the tree topology was recorded, and the same Hamming type of metric from Ref. 11 was used to detect principal lines in the random tree that modeled the sample.

In this paper, we address the problem of classifying a data set of trees from a statistical point of view. We assume that the tree data to be classified has been sampled from some unknown probability distribution, and we try to verify the intuitive idea of consistency: the more sample points we add to the training set, the more reliable the classification should be. This condition is close to the notion of mathematical consistency, or convergence of some type. An algorithm that does not converge produces rather unpredictable results on any given sample and thus is completely unreliable. We focus the discussion on the specific case of the k-nn nonparametric procedure.

The paper is organized as follows. First, Section 2 provides details about the metric spaces that model the examples in this introduction. Then, the k-nn procedure for such metric spaces of trees is discussed in Section 3. We provide some results on convergence properties and discuss computational aspects as well. Finally, Section 4 assesses the performance of k-nn with two real datasets. Some concluding remarks are deferred to Section 5.


2. Unsupervised Learning in the space of trees

To cast classification as a statistical problem we start by regarding the data t1, . . . , tn as an i.i.d. sample from some unknown probability distribution ν(t). The usual nonparametric approach to classification is based on variations of the k-nearest neighbor rule (k-nn), a distance-based rule introduced by Fix 12. In these rules, each of the k nearest neighbors (in the training sample) of a new observation votes in favor of the label of the group to which it belongs. The new observation is classified into the group that obtains the most votes. In spite of its simplicity, the performance of such rules with finite samples is remarkably good for many different types of data. In the case of Euclidean data, it is also a universally consistent set of rules, so it is no coincidence that performance improves when the training set increases in size. We will show that this is also the case in the spaces of trees we define now.

2.1. General specifications on metric spaces of trees

A metric space of trees consists of a set of tree-shaped objects along with a metric. There are two broad strategies to define a suitable metric: one strategy counts and weights the number of discrepancies between two trees, whereas the other maps the trees into alternative mathematical structures for which natural metrics already exist. In this paper we follow the first approach.

2.1.1. Example A: rooted topological trees

A rooted topological tree is a graph tree with a node called the root, in which every node has at most m different children. The set of all such trees is denoted by $\mathcal{T}_A$. In Ref. 17, Hamming introduced a metric that counts the number of edge discrepancies between two graphs. In Ref. 3, a refinement of this metric was introduced to show that the set of all unbounded rooted topological trees is an infinite compact metric space.
In Ref. 2, the same metric was introduced considering only binary trees with finite depth, in order to define principal lines on a random tree. Consider an alphabet A = {1, . . . , m}, with integer m ≥ 2 representing the maximum number of children of a given node of the tree. Let $\mathcal{V} = \bigcup_{s \ge 0} \left(\{1\} \times \{1, \dots, m\}^s\right) = \{1, 11, 12, \dots, 1m, 111, 112, \dots\}$ be the set of finite sequences of elements in A, all of them starting with the symbol 1. Elements


Figure 1. On the left, an example of a binary tree of depth 4. The drawing on the right shows the same tree with its leaves enumerated in breadth-first search order.

of $\mathcal{V}$ are called nodes, and the node 1 is called the root of the tree. The full tree is the oriented graph $\tilde{t} = (\mathcal{V}, \mathcal{E})$ with edges $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ given by $\mathcal{E} = \{(v, va) : v \in \mathcal{V}, a \in A\}$, where va is the sequence obtained by concatenation of v and a. In the full tree each node has exactly m outgoing edges to its offspring and one ingoing edge from its father, except for the root, which has only outgoing edges. The node $v = 1a_2 \dots a_k$ is said to belong to generation k; in this case we write gen(v) = k. Generation 1 has only one node, the root.

Definition 2.1. We define a tree as a function $t : \mathcal{V} \to \{0, 1\}$ satisfying

$$t(v) \ge t(va) \qquad (1)$$

for all $v \in \mathcal{V}$ and $a \in A$. If t(1) = 0 we get the empty tree. Let $\mathcal{T}_A$ be the set of trees. Abusing notation, a tree t is identified with the graph t = (V, E) with $V = \{v \in \mathcal{V} : t(v) = 1\}$ and $E = \{(v, va) \in \mathcal{E} : t(v) = t(va) = 1\}$.

The depth of the tree t is defined by max{gen(v) : v ∈ t}. Figure 1 shows a tree of depth 4.

Definition 2.2. Let $\phi : \mathcal{V} \to \mathbb{R}^+$ be a strictly positive function such that $\sum_{v \in \mathcal{V}} \phi(v) < \infty$ and consider the distance

$$d_A(t, y) = \sum_{v \in \mathcal{V}} \phi(v)\,\bigl(t(v) - y(v)\bigr)^2. \qquad (2)$$

With this distance $(\mathcal{T}_A, d_A)$ becomes a compact metric space; see Ref. 3 for details.
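To make Definitions 2.1 and 2.2 concrete, here is a minimal Python sketch for finite trees. It rests on several assumptions that are not part of the definitions: a tree is stored as the set of its nodes, each node is written as a string of single-character symbols (so m ≤ 9), and the weight is the geometric choice φ(v) = z^gen(v) with an illustrative value of z. The function names are hypothetical.

```python
def generation(node):
    """Generation of a node written as a string starting with '1'."""
    return len(node)


def is_tree(nodes, root="1"):
    """Check condition (1): every node other than the root has its father
    in the set. The empty set stands for the empty tree."""
    return all(v == root or v[:-1] in nodes for v in nodes)


def tree_distance(t, y, z=0.3):
    """Distance (2) between two finite trees with phi(v) = z ** gen(v).

    Since t(v) and y(v) take values in {0, 1}, the squared difference is 1
    exactly on the nodes that belong to one tree but not to the other.
    """
    return sum(z ** generation(v) for v in t ^ y)


t = {"1", "11", "12"}        # root with two children
y = {"1", "11", "111"}       # root, one child, one grandchild
```

Here the two trees differ in the nodes 12 and 111, so the distance is z² + z³ = 0.117 for z = 0.3. Representing a tree by its node set keeps the sum finite even though the full tree is infinite.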


In particular, we use the function $\phi(v) = z^{\mathrm{gen}(v)}$, for $z < m^{-3/2}$, as in Balding et al. (2009) 3. Other choices have been made in Flesia (2011) 15 and Busch et al. (2009) 11, working with sets of simulated trees with many generations. The distance $d_A$ measures the discrepancy between the topology of the trees node by node; see Ref. 3 for details. Two very different examples of such spaces can be cited as previous work. One is the metric space of trees with up to 20 children per node and depth 3, considered in Ref. 11 for modeling protein functionality, and the other is the metric space of binary trees with depth up to 15, considered in Ref. 2 for modeling cerebrovascular trees segmented from images.

2.1.2. Example B: Binary trees with labeled leaves

A (labeled) binary tree is a graph with a node called the root, all internal nodes of degree 3, and n labeled terminal nodes, with labels in L. The set of all labeled binary trees with n distinct terminal nodes is denoted by $\mathcal{T}_B$. When there is a label-preserving isomorphism between two trees they are considered identical; thus, if two nodes have the same father, it does not matter which is on the left and which is on the right. Note that a set of different labeled trees may have the same topology, so $\mathcal{T}_B$ is not a subset of $\mathcal{T}_A$. The space $\mathcal{T}_B$ is the space defined in Ref. 4 in order to compare different hierarchical clustering algorithms. It is also the typical representation of cladograms or evolutionary trees 9. Defining a distance on this space is a difficult problem because there is no reasonable metric that imposes a neighborhood structure which is the same for all trees. There are two main approaches considered in the literature: an extension, based on hypergraphs, of the Hamming metric strategy described before 8, and the symmetric distance, which is sometimes called the Robinson Foulds metric 19. A hypergraph generalizes traditional graphs by admitting edges that link more than two nodes.
Each edge in the hypergraph of a binary tree is a cluster of cardinality r greater than one in the hierarchy associated with the tree. In this framework the extension of the Hamming strategy is to count and weight the number of discrepancies in each type of r-edge. The symmetric distance is a popular choice among computational biologists. It counts the number of “crossover” operations needed to change

May 9, 2013

11:54

BC: 8846 - BIOMAT 2012

19˙ﬂesia

329

one tree into another. It is also the number of bi-partitions induced by one tree but not by the other. If we choose an internal node and cut the subtree hanging from that node out of the tree, we obtain a bi-partition of the set of taxa. If we take that subtree and exchange it with another subtree, we have a crossover operation. Crossover operations are easier to determine on unrooted trees, whereas bi-partitions are clearly defined and characterize rooted trees. We shall denote by $B(t)$ the set of bi-partitions of $L$ induced by $t$.

Definition 2.3. Given a set $L$ of taxa and two binary trees $t$ and $y$, the symmetric distance between $t$ and $y$ is

$$d_S(t, y) = \frac{1}{2}\left[\,|B(t) - B(y)| + |B(y) - B(t)|\,\right],$$

where $|\cdot|$ denotes the cardinality of the argument set.

The symmetric distance is a particular case of the Robinson Foulds distance (see [19]). The Robinson Foulds distance is a weighted symmetric distance, defined over the set of binary trees that have a length value attached to each edge. The weights are computed using the branch length information.

3. Supervised classification problems on tree space

Now we define random trees on general metric spaces of trees, and classification rules in such spaces.

Definition 3.1. Let $(\mathcal{T}, d)$ be a metric space of trees, in particular any of the preceding examples. A random tree with distribution $\nu$ is a measurable function $T : \Omega \to \mathcal{T}$ such that

$$P(T \in A) = \int_A \nu(dt) \qquad (3)$$

for any Borel set $A \in \mathcal{B}$, where $(\Omega, \mathcal{F}, P)$ is a probability space and $\nu$ a probability on $(\mathcal{T}, \mathcal{B})$, with $\mathcal{B}$ the Borel σ-algebra in $\mathcal{T}$.

Definition 3.2. Given a finite set $\{1, \dots, M\}$ of labels and a space of trees $\mathcal{T}$, we call a pair $(t, l) \in \mathcal{T} \times \{1, \dots, M\}$ an observation, where $t$ is known and $l$ is a class or label that denotes the unknown nature of the observed tree. A mapping $g : \mathcal{T} \to \{1, \dots, M\}$ is called a classifier and represents our guess of the class $l$ given its associated tree $t \in \mathcal{T}$. The classification is wrong if, given an observation $(t, l)$, it occurs that $g(t) \neq l$.
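The symmetric distance of Definition 2.3 can be illustrated with a short sketch. It is only a toy version under stated assumptions: a rooted tree is encoded as a nested tuple of leaf labels, and each bi-partition is represented by the cluster of leaves hanging from an internal node. The function names `clusters` and `symmetric_distance` are hypothetical and are not taken from the authors' software.

```python
def clusters(tree):
    """Return the leaf clusters hanging from the internal nodes of a rooted tree.

    A tree is a nested tuple of leaf labels, e.g. (("a", "b"), ("c", "d")).
    Each cluster stands for the bi-partition {cluster, remaining taxa}.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):          # a leaf: just its label
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_taxa = leaves(tree)
    found.discard(all_taxa)                      # the root gives only the trivial split
    return found


def symmetric_distance(t, y):
    """d_S(t, y) = (|B(t) - B(y)| + |B(y) - B(t)|) / 2, as in Definition 2.3."""
    bt, by = clusters(t), clusters(y)
    return 0.5 * (len(bt - by) + len(by - bt))


t = (("a", "b"), ("c", "d"))
y = (("a", "c"), ("b", "d"))
print(symmetric_distance(t, y))                        # 2.0: no cluster is shared
print(symmetric_distance(t, (("b", "a"), ("d", "c")))) # 0.0: child order is irrelevant
```

Because clusters are stored as sets, swapping the left and right children of a node does not change the distance, which matches the identification of isomorphic labeled trees in Example B.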

Definition 3.3. Let $(T, L) \in \mathcal{T} \times \{1, \dots, M\}$ be a random pair. Since an error occurs if $g(T) \neq L$, the probability of misclassification for $g$ is

$$L(g) = P[g(T) \neq L]. \qquad (4)$$

Then the best possible classifier is the function $g^*$ that minimizes (4). The minimum error probability (the Bayes error) is denoted by $L^* = L(g^*)$. In order to obtain $g^*$, the distribution of $(T, L)$ should be known, but typically this is not the case. One must build a classifier based on a training sample of independent pairs $\{(T_i, L_i) : 1 \le i \le p\}$, with the same distribution as the pair $(T, L)$ and known $L_1, \dots, L_p$ values.

Definition 3.4. A classifier is a function $g_p(\cdot\,; T_1, L_1, \dots, T_p, L_p) : \mathcal{T} \times (\mathcal{T} \times \{1, \dots, M\})^p \to \{1, \dots, M\}$ with probability of misclassification given by the conditional error probability

$$L_p(g_p) = P[g_p(T; T_1, L_1, \dots, T_p, L_p) \neq L \mid T_1, L_1, \dots, T_p, L_p].$$

A sequence of classifiers $\{g_p ; p \ge 1\}$ is called a rule. A rule is consistent when

$$\lim_{p \to \infty} L_p(g_p) = L^* \quad \text{almost surely}$$

and weakly consistent if

$$\lim_{p \to \infty} E(L_p(g_p)) = L^*.$$

3.1. k-nearest neighbor rules

The k-nearest neighbor rule is a well studied method on Euclidean data, which only depends on the distance between individuals. Formally, we define the k-nn rule by

Definition 3.5. k-nn rule

$$g_p(t) = \arg\max_{j = 1, \dots, M} \sum_{i=1}^{p} w_{ni}\, I_{\{L_i = j\}},$$

where

$$w_{ni} = \begin{cases} 1 & \text{if } T_i \text{ is among the } k \text{ nearest neighbors of } t, \\ 0 & \text{otherwise.} \end{cases}$$
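The k-nn rule of Definition 3.5 needs nothing but a distance between trees, which the following minimal sketch makes explicit. It is not the authors' implementation: the name `knn_classify` is hypothetical, distance ties are resolved by the smaller training index, and vote ties are resolved toward the smallest label, which is an assumption made only to keep the example deterministic.

```python
from collections import Counter


def knn_classify(t, training, distance, k=3):
    """Label t by majority vote among its k nearest training trees.

    training: list of (tree, label) pairs.
    distance: any function d(tree, tree) -> number.
    """
    # Rank training indices by (distance to t, index): equal distances
    # are resolved in favor of the smaller index.
    ranked = sorted(range(len(training)),
                    key=lambda i: (distance(training[i][0], t), i))
    votes = Counter(training[i][1] for i in ranked[:k])
    best = max(votes.values())
    # Vote ties go to the smallest label (an assumption of this sketch).
    return min(label for label, count in votes.items() if count == best)


# Toy data: trees as sets of node names, compared with an unweighted Hamming count.
training = [
    (frozenset({"r", "a"}), 1),
    (frozenset({"r", "a", "b"}), 1),
    (frozenset({"r", "c"}), 2),
    (frozenset({"r", "c", "d"}), 2),
    (frozenset({"r", "c", "e"}), 2),
]
hamming = lambda x, y: len(x ^ y)
print(knn_classify(frozenset({"r", "a", "b"}), training, hamming, k=3))  # 1
```

Any metric on the tree space can be passed as `distance`, so the same function would serve for both $d_A$ and the symmetric distance.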

Observation: $T_i$ is said to be the k-th nearest neighbor of $t$ when the distance $d(T_i, t)$ is the k-th smallest among $d(T_1, t), \dots, d(T_p, t)$. Distance ties are broken by comparing indexes; that is, if $d(T_i, t) = d(T_j, t)$, then $T_i$ is considered closer to $t$ when $i < j$.

Definition 3.6. The k-nearest neighbor classifier is
(1) weakly consistent if $p \to \infty$, $k \to \infty$, $k/p \to 0$ implies $\lim_p E(L_p(g_p)) = L^*$;
(2) strongly consistent if $p \to \infty$, $k \to \infty$, $k/p \to 0$ implies $\lim_p L_p(g_p) = L^*$.

Let us assume, for the sake of simplicity, that $M = 2$, that is, we have only two classes. Let also $\eta(t) = E(L \mid T = t)$ be the regression function, and $P_T$ the probability distribution of the random element $T \in \mathcal{T}$.

Theorem 3.7. The tree spaces $(\mathcal{T}_A, d_A)$ and $(\mathcal{T}_B, d_B)$ are both separable spaces where the Besicovich condition holds. Thus, by the main theorem of Cerou and Guyader (2006) [12], the k-nn classifier is weakly consistent.

Comments on the proof of this statement are given in the appendix. In the next section we study finite sample performance in a classical classification problem, related to protein functionality determination, and in a more sophisticated setting, related to the uniqueness of the phylogenetic tree constructed over a set of taxa.

4. Computational examples

4.1. Classification of protein sequences into known families

Functional genomics is the field of molecular biology involved in the determination of the functionality of identified genes and proteins. When proteins are studied in their primary structure, represented by a sequence of 20 different symbols called amino acids, their functional regions (commonly termed domains) are sought, since different combinations of domains give rise to the diverse range of proteins found in nature [5]. The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and Hidden Markov models

(HMMs). It is one of the largest databases of protein families and domains, composed of a subset of well curated sequences, called PfamA, and another set of related sequences labeled using the family HMM trained with PfamA sequences. Bejerano (2000) [7] worked out another Markov model for protein classification based only on linguistic related searches, without the need for a previous sequence alignment. These models are called Variable Length Markov Chains (VLMC), and were first studied in [18]. As in the case of the profile HMM (Hidden Markov Model) used in the construction of the Pfam families, the VLMC approach of Bejerano and Yona takes, for each family, a set of already classified protein domains and estimates a VLMC model. Then, the estimated VLMC model is used to classify other protein sequences into the family. The motivation for such an approach is related to the biological understanding of the evolution and composition of protein families. They suppose that a group of evolutionarily related protein sequences should exhibit many identical short segments (domains) which have been either preserved by selection or have not diverged long enough from their common single ancestral sequence. The variable memory model is well equipped to pick up these locally conserved segments, showing them in the architecture of the context tree [6].

Our approach is different from Bejerano's, since we see the context tree of an observed protein sequence as a realization of the random variable $T$ with a probability law $\nu$ on the space of trees. Other sources of randomness also absorbed by $\nu$ are the finiteness of the observed sequence, which may not correctly determine the true context tree, and random errors in the observations. The labels of the trees are governed by $L$, a random variable that denotes the unknown nature of the observed tree.
Observations are then computed sequence by sequence with the PST algorithm, while Bejerano computes only one tree per family, using all the training information available about the family. Thus, we have a training sample of independent pairs $\{(T_i, L_i) : 1 \le i \le p\}$, with the same distribution as the pair $(T, L)$ and known $L_1, \dots, L_p$ values. Given a new observation $t$, its label is computed by majority vote among the k nearest neighbors in the training sample, using the distance $d_A$.

4.1.1. Variable Length Markov Chain modeling of protein functionality

A Variable Length Markov Chain is a stochastic process introduced by Rissanen (1983) [18] in information theory; see also Bühlmann and Wyner (1999) [10]. In this model the probability of occurrence of each symbol at a given time depends on a finite number of preceding symbols. The number of relevant preceding symbols may vary and depends on each specific sub-sequence. More precisely, a VLMC is a stochastic process $(X_n)_{n \in \mathbb{Z}}$, with values in a finite alphabet $A$, such that

$$P[X_n = \cdot \mid X_{-\infty}^{n-1} = x_{-\infty}^{n-1}] = P[X_n = \cdot \mid X_{n-k}^{n-1} = x_{n-k}^{n-1}], \qquad (5)$$

where $x_s^r$ represents the sequence $x_s, x_{s+1}, \dots, x_r$ and $k$ is a stopping time that depends on the sequence $x_{n-k}, \dots, x_{n-1}$. As the process is homogeneous, the relevant past sequences $(x_{n-k}, \dots, x_{n-1})$ do not depend on $n$ and are denoted by $(x_{-k}, \dots, x_{-1})$. Each relevant past $(x_{-k}, \dots, x_{-1})$ is called a context. The set of contexts $\tau$ can be represented as a rooted tree $t$, where each complete path from the leaves to the root in $t$ represents a context. Calling $p$ the transition probabilities associated with each context in $\tau$ given by (5), the pair $(\tau, p)$, called a probabilistic context tree, has all the information relevant to the model; see [18] and [10]. The topological structure of a probabilistic context tree can be seen as a tree $t$ from the space of rooted trees $\mathcal{T}_A$ with distance $d_A$. In particular we use the function

$$\phi(v) = z^{\mathrm{gen}(v)}, \qquad (6)$$

where $z = 0.1$ and $\mathrm{gen}(v)$ stands for the generation of the node, that is, the number of symbols needed to reach the root.

4.1.2. Material and Methods

In the following examples we constructed a FASTA file database of 1700 protein sequences, divided into 50 families of different sizes, all of them extracted from the Pfam database, release 2000. The data may be considered old, since Pfam has evolved considerably in the last 10 years, but it is important to have published classification error rates to compare with [6]. Besides, this old data was well curated, its labels have not changed from this version to the 2010 version of Pfam, and it can be considered a benchmark database.
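The effect of the weight function (6) on the distance between two context trees can be sketched as follows. The sketch assumes that $d_A$ has the node-by-node form suggested by the text and by Balding et al. [3], adding $\phi(v)$ over the nodes present in exactly one of the two trees; the exact definition should be checked in [3]. Encoding a node by the string of symbols on its path from the root, and the names `phi` and `tree_distance`, are illustrative choices only.

```python
def phi(node, z=0.1):
    """Weight z**gen(v); a node is encoded by its path, so gen(v) = len(node)."""
    return z ** len(node)


def tree_distance(x, y, z=0.1):
    """Sum of phi(v) over the nodes that belong to exactly one of the two trees."""
    return sum(phi(v, z) for v in x ^ y)


# Two small context trees over the alphabet {A, C}; "" is the root.
t1 = {"", "A", "C", "AA", "CA"}
t2 = {"", "A", "C", "AA", "AC", "AAC"}

# The trees differ in CA, AC (generation 2) and AAC (generation 3),
# so the distance is about 0.01 + 0.01 + 0.001 = 0.021.
print(tree_distance(t1, t2))
```

With $z = 0.1$ a discrepancy at generation $g$ contributes only $10^{-g}$, so differences near the leaves of a depth 4 tree change the distance very little.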

Each string was transformed into a tree of depth 4 using the original PST algorithm for Variable Length Markov Chain modeling from Bejerano (2003). The additional modules for computing distances and the k-nn algorithm, necessary for protein sequence classification, can be obtained from the authors upon request. Overall accuracy and overall error rates were computed by training with 80% of the database and then scoring either all 1700 proteins (as in Bejerano (2003)) or only the 20% set aside. We also computed the relative improvement rate, defined as

$$\text{Relative improvement} = \frac{ER_{VLMC} - ER_{k\text{-}nn}}{ER_{VLMC}}.$$

4.1.3. Selection of k, the number of neighbors

In this first example we explore the size of the neighborhood to be used in the main classification procedure. We selected 11 families of proteins from the Pfam database, release 2000, named '7tm-1', 'actin', 'adh-short', 'adh-zinc', 'ank', 'ATP-synt-A', 'beta lactamase', 'cox2', 'cpn10', 'DNA-pol', 'efhand', keeping 80% of each family as training set. Figure 2 shows a plot of the percentage of true positives (accuracy) as a function of k, the number of neighbors, using a leave-one-out procedure. The blue plot is the accuracy computed with all 1700 trees (as reported by Bejerano in [6]), and the red plot is the one computed only with the 20% of sequences set aside as test set. We concluded that three neighbors are enough for classification, since trees are short and distances are very subtle. More neighbors will induce errors due to ties.

Figure 2. Recognition rates using K-NNR as a function of k (vertical axis: recognition rates (%); horizontal axis: k). In full line with circle markers, we plot the overall recognition rates of the k nearest neighbors rule as a function of k, considering a training set of 80% of the total of proteins and classifying the 100%. In dotted line with square markers, we plot the overall recognition rates computed classifying only the 20% set aside from the training set.

4.1.4. Classification of 50 families

We computed classification rates for the database studied by Bejerano in his PhD thesis in 2002. The database has 1700 protein sequences, divided into 50 families. Short sequences have been discarded. The results, reported in Table 1, are the name of the family, the size of the family, the accuracy rate reported by Bejerano for the VLMC approach, the accuracy rate training with 3 neighbors (3nn-a), computed using a training set of 80% random proteins and scoring 100%, averaging over 10-fold cross validation sets, and the accuracy rate training with 3 neighbors but computed only over the 20% of samples that are not part of the training set (3nn-b), a real 10-fold cross validation scheme. We have also summarized the table in the stem plot of Figure 3. Each stem represents a family, ordered from left to right as they appear in Table 1. Blue stems are the accuracy of Bejerano's VLMC approach, red stems the accuracy of 3nn-a and green stems the accuracy of 3nn-b. Very few families have a better VLMC accuracy than 3nn accuracy, and they are mostly the smaller ones. Trees are very short, only 4 generations, so distances are very subtle. Global overall accuracy and errors are given in Table 2, where it can be seen that the relative improvement of 3nn over the traditional VLMC approach was 56% when scoring only the 20% set aside from the training set, a more reliable performance index.

Figure 3. Accuracy by family (vertical axis: accuracy; horizontal axis: families; series 3-NNa, 3-NNb and VLMC). In blue: accuracy of the VLMC method; in red: accuracy of 3-NNa; in green: accuracy of 3-NNb.

Table 1. Percentage of true positives (accuracy) obtained by the PST method and by 3-nearest neighbors, training with 80% of the database and scoring 100% (3nn-a). The last column shows the accuracy obtained training with 80% of the database and scoring the 20% left aside with 3-nn (3nn-b).

Family            Size   PST    3nn-a  3nn-b
7tm_1             515    93.0   99.8   100.0
7tm_2             36     94.4   100    100.0
7tm_3             12     83.3   100    100.0
AAA               66     87.9   98.4   100.0
ABC_tran          269    83.6   95.5   88.4
actin             142    97.2   100    100.0
adh_short         180    88.9   95.5   89.9
adh_zinc          129    95.3   97.6   97.3
aldedh            69     87.0   98.5   92.8
alpha-amylase     114    87.7   98.2   92.9
aminotran         63     88.9   95.1   94.1
ank               83     88.0   92.6   88.3
arf               43     90.7   100    100.0
asp               72     83.3   97.1   97.1
ATP-synt_A        79     92.4   100    100.0
ATP-synt_ab       180    96.7   100    100.0
ATP-synt_C        62     91.9   100    100.0
beta-lactamase    51     86.3   98     92
bZIP              95     89.5   96.8   90.1
C2                78     92.3   94.8   93.2
cadherin          31     87.1   100    100
cellulase         40     85.0   92.3   86.9
cNMP_binding      42     92.9   100    100.0
COesterase        60     91.7   98.3   93.2
connexin          40     97.5   100    100
copper-bind       61     95.1   100    98.3
COX1              80     83.8   100    100
COX2              109    98.2   99.0   98.2
cpn10             57     93.0   100    100.0
cpn60             84     94.0   100    100
crystall          53     98.1   100    100
cyclin            80     88.8   94.9   89.8
Cys_knot          61     93.4   96.6   96.3
Cys-protease      91     87.9   95.5   90.4
cystatin          53     92.5   98.0   92.9
cytochrome_b_C    130    79.2   85.2   84.4
cytochrome_b_N    170    98.2   95.5   92.7
cytochrome_c      175    93.7   96.5   95.4
DAG_PE-bind       68     89.7   88.2   87.1
DNA_methylase     48     83.3   93.6   87.2
DNA_pol           46     80.4   87.7   81.1
dsrm              14     85.7   92.3   90.2
E1-E2_ATPase      102    93.1   96.0   90.0
efhand            320    92.2   95.9   93.6
EGF               169    89.3   92.8   90.5
enolase           40     100    97.4   97.4
fer2              88     94.5   97.7   94.4
fer4              152    88.2   97.3   90.7
fer4_NifH         49     95.9   100    100
FGF               39     97.4   100    100

Table 2. Global errors and global improvement.

                              VLMC     3nn-a    3nn-b
Overall Accuracy              0.9280   0.9915   0.9687
Overall Apparent Error Rate   0.0720   0.0085   0.0313
Overall Improvement                    88.24%   56.46%
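As a worked check of the relative improvement formula, the sketch below applies it to the rounded error rates printed in Table 2. The function name is hypothetical. The recomputed values, about 88.2% and 56.5%, differ in the second decimal from the 88.24% and 56.46% in the table, presumably because the table was computed from unrounded error rates.

```python
def relative_improvement(er_vlmc, er_knn):
    """Fraction of the VLMC error rate removed by the k-nn classifier."""
    return (er_vlmc - er_knn) / er_vlmc


er_vlmc = 0.0720  # overall apparent error rate of VLMC, from Table 2
for name, er_knn in [("3nn-a", 0.0085), ("3nn-b", 0.0313)]:
    print(f"{name}: {100 * relative_improvement(er_vlmc, er_knn):.2f}%")
```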

4.2. Phylogenetic classification: Bird Flu virus

In this section of the paper we attempt to widen the discussion from the effective construction of phylogenetic trees to the knowledge construction of phylogenetic networks or phylogenetic islands. Phylogenetic analysis often produces thousands of candidate trees. Biologists resolve the conflict by computing a single consensus tree. By using a single tree to summarize the result of a phylogenetic analysis, a lot of information is lost about the whole set of candidate trees, including how the trees are distributed in the space of all binary trees and how similar the trees are to each other. We start by proposing a simple analysis that regards the outcomes of the different algorithms as a random sample from an unknown distribution, as in Balding et al. (2009) [3] and Busch et al. (2009) [11]. This formulation treats the protein data as fixed and the clustering algorithms and distance models as random, which is unusual in statistics, but is more natural in consensus theory, where one often combines information from different trees built from a common dataset. Our approach is equivalent to assuming that the distribution of tree data contains several modes, and that the observed trees consist of the modes corrupted by errors capturing independent differences in the mechanics of the clustering algorithms, which are not trivially duplicative, and thus tend to increase the dispersion of the protein data. Our goal is to verify that the natural islands of trees are the ones generated by each of the tree construction methods. Stockham et al. (2002) [20] computed the islands using standard clustering algorithms modified to calculate the Robinson Foulds distance between phylogenetic trees. They looked at the best clustering for three different datasets and compared the strict consensus trees of the clusters to the strict consensus trees of the datasets, showing an improvement of the multi-tree consensus over the single strict consensus. Our empirical study proposes to randomly select 20% of each of our natural islands and classify them using the other combined 80% of trees as training data, within a ten-fold cross validation scheme. We also compute consensus trees for each one of the test set islands, and score them using the 80% set aside as training data.

4.3. Performing a Phylogenetic Analysis of the HA Protein

A phylogenetic analysis of the HA protein from H5N1 virus isolated from chickens at different times (years) in different regions of Asia and Africa has been produced in [13], in order to investigate the evolution of the virus in time and space. The sequences under study were HA primary sequences from poultry in Nigeria (2006), Afghanistan (2006), Sudan (2006), Hebei (2002), Hebei (2001), Hebei (2005), Kurgan (2005), Viet Nam (2004), Henan (2004), Oita (2004), Yamaguchi (2004), Guangdong (2004), Korea (2003), Jilin (2004), Hong Kong (2001), Hong Kong (1997). In [13], a phylogenetic tree was made using the Neighbor Joining method, using a pairwise distance matrix between sequences corrected following the Jukes Cantor model. The relationship between sequences may also be studied by means of multidimensional scaling (MDS), with the distances calculated for the phylogenetic tree. Emerging clusters corresponded to groupings in the phylogenetic tree. In [13], the more recent H5N1 viruses, Nigeria (2006), Afghanistan (2006), Sudan (2006), separated from the older viruses, indicating that the virus evolved over time. Also, most of the viruses from the same regions group together, indicating that virus types mutate when they arrive at a different place. In Figure 4 we observe two evolutionary trees over the same taxa obtained by the WPGMA (weighted average linkage) and UPGMA (average linkage) methods.

Figure 4. Left panel: WPGMA distance tree of bird flu virus using the Jukes-Cantor model. Right panel: UPGMA distance tree of bird flu virus using the Jukes-Cantor model. Second line: Neighbor Joining tree and Multidimensional Scaling plot of data ("MDS Plot of HA Sequences"), showing clusters determined by groupings in the phylogenetic tree.

4.3.1. Creation of the Island of trees

We tested the efficiency of the k-nn classification procedure using the virus sequences stated before, under the following hypotheses:

• the protein data is fixed and the clustering algorithms and distance models are random.

• the distribution of tree data contains several modes, and the observed trees consist of the modes corrupted by errors capturing independent differences in the mechanics of the clustering algorithms, which are not trivially duplicative, and thus tend to increase the dispersion of the protein data.

We call a constructor a combination of a tree reconstruction method and the evolutionary model used to estimate distances between sequences, and we used the 6 constructors shown in Table 3 to build collections of trees. The tree reconstruction methods considered besides Neighbor Joining were the UPGMA and WPGMA pairwise alignment methods, which are traditional methods for producing trees, and the evolutionary models were the Jukes Cantor model and the Gamma correction.

Table 3. Constructors.

(NG/JK)      Neighbor Joining with Jukes Cantor model
(NG/G)       Neighbor Joining with Gamma correction
(UPGMA/JK)   Unweighted Pair Group Method Average (UPGMA) method with Jukes Cantor model
(UPGMA/G)    Unweighted Pair Group Method Average (UPGMA) method with Gamma correction
(WPGMA/JK)   Weighted Pair Group Method Average (WPGMA) method with Jukes Cantor model
(WPGMA/G)    Weighted Pair Group Method Average (WPGMA) method with Gamma correction
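The Jukes Cantor correction used by half of the constructors in Table 3 can be illustrated with a small sketch. It uses the standard closed form for an alphabet of $s$ symbols, $d = -\frac{s-1}{s}\ln\left(1 - \frac{s}{s-1}\,p\right)$, where $p$ is the proportion of differing sites, with $s = 20$ for amino acids and $s = 4$ for nucleotides. The function names are hypothetical, gaps are not treated specially, and the toy sequences are not HA data; the original analysis was carried out in Matlab and may differ in such details.

```python
import math


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(a, b, alphabet_size=20):
    """Jukes Cantor corrected distance (alphabet_size: 20 proteins, 4 DNA)."""
    s = alphabet_size
    p = p_distance(a, b)
    if p >= (s - 1) / s:
        return math.inf  # the correction is undefined at or beyond saturation
    return -((s - 1) / s) * math.log(1 - s * p / (s - 1))


# One difference in four nucleotide sites: p = 0.25, corrected distance about 0.304.
print(jukes_cantor("ACGT", "ACGA", alphabet_size=4))
```

The corrected distance is always at least as large as the raw proportion $p$, since it accounts for multiple substitutions at the same site.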

Bootstrap is a usual tool in evolutionary biology, both for making inferences and for evaluating the robustness of methods. The method of bootstrapping is the multinomial nonparametric bootstrap as applied in the binomial setting. For each bootstrap simulation step, a new data matrix is assembled by choosing columns from the original data matrix at random with replacement, until there are as many columns in the new matrix as in the original data. Then a new tree is built from this dataset with the constructors. The analysis was performed using the Matlab software, and it is available from the first author upon request.

4.3.2. Results

As in the previous example, we made a small experiment to choose the number of neighbors, keeping it fixed at k = 3 over the whole experiment. Within a ten-fold cross validation scheme, we classified 20% of the samples, using 80% of the trees as training data, and we also computed the consensus tree of each one of the 20% sets (inside each island), and classified it using the 80% set aside as training data. Both results are shown in Table 4.

Table 4. First line: global accuracy classifying 20% of the samples, using 80% of trees as training data, within a 10-fold cross validation design. Second line: global accuracy classifying the consensus trees of 20% of the samples, using 80% of trees as training data.

                  (NG/JK)  (NG/G)  (UPGMA/JK)  (UPGMA/G)  (WPGMA/JK)  (WPGMA/G)
20% samples       0.9280   0.9015  0.8187      0.900      0.9015      0.9287
Consensus trees   0.90     0.8     0.9         0.8        0.7         0.7
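The column resampling step of the bootstrap used to build the islands can be sketched in a few lines. This is a minimal illustration, assuming the data matrix is stored as a list of aligned sequences, one per taxon; the function name `bootstrap_alignment` and the toy alignment are hypothetical, and the tree construction that follows each replicate is not shown.

```python
import random


def bootstrap_alignment(alignment, rng=random):
    """Resample the columns of an alignment with replacement.

    alignment: list of equal-length strings, one row per taxon.
    Returns a new alignment with the same number of rows and columns.
    """
    n_cols = len(alignment[0])
    # Draw as many column indices as the original alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same column choice to every row so that sites stay aligned.
    return ["".join(row[c] for c in cols) for row in alignment]


rng = random.Random(0)                     # fixed seed for reproducibility
original = ["MKTA", "MKSA", "MRTA"]        # toy alignment, not HA data
replicate = bootstrap_alignment(original, rng)
print(replicate)
```

Each replicate would then be passed to every constructor, so that each constructor yields its own collection (island) of bootstrap trees.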

Global accuracy is above 81% in all classes. We did not show a confusion matrix because trees that were not placed in their own island become scattered among all the other islands. We also computed the majority consensus tree for each one of the 20% samples, and classified them using the remaining 80%. We can see that the labeling of the consensus trees is less accurate, since they are more similar to each other and, in the case of ties, the k-nn procedure chooses the label at random. We would like to come back to the original conclusions from [13], where emerging geographic clusters corresponded to groupings in the phylogenetic tree, and the more recent H5N1 viruses, Nigeria (2006), Afghanistan (2006), Sudan (2006), separated from the older viruses, indicating that the virus has evolved over time. Also, most of the viruses from the same regions group together, indicating that virus types mutate when they arrive at a different place. The same gross analysis can be made with each of the original trees, built with all six different constructors, in the case of the most recent viruses, Sudan (2006), Kurgan (2005), Nigeria (2006) and Afghanistan (2006), but the older viruses show different phylogenetic relationships when the tree constructors are changed. The consensus trees are then poorly refined.

5. Discussions

We center the discussion of this paper on statistical aspects of supervised classification in a framework where the tree data to be classified have been sampled from some unknown probability distribution. We suggest a very simple strategy for non-parametric statistical classification on metric spaces of trees. The main idea is to redefine the well known k nearest neighbors classifier, arguing about its reliability with notions of statistical consistency. We also study finite sample training errors. As has been discussed in [9], a consensus tree is a summary of a set of trees, and not necessarily an optimal estimator of the phylogeny. In our example, clades in the phylogenetic trees should give evidence of virus mutations in time and space; a consensus tree made with all the tree data will show only mutations from Asia to Africa, whereas a detailed analysis of each of the original trees will show evidence of mutations in Asia year by year. How strong this evidence is remains a question with an ambiguous answer. There is no MANOVA test defined for populations of phylogenetic trees, and we created a classification scheme forced by this fact. If our hypotheses are correct, and the Besicovich condition applies, the weak consistency properties of k-nn give evidence that the natural islands created with the constructors are indeed different.

Acknowledgments

Work is partially supported by grants PICT 2008 00291 and PID SecytUNC. We would like to thank Florencia Leonardi for kindly providing the data we used in Section 4, and Ricardo Fraiman for many interesting conversations regarding the consistency of the k-nn procedure and the differences between the BFFS space [3] and the CAT(0) spaces.

6. Appendix A

6.1. Besicovich condition

A deep and insightful perspective on the supervised classification problem in vector spaces can be found in the book of Luc Devroye [14]. Establishing consistency on general metric spaces is a much more difficult problem. In [12], it has been proved that Stone's theorem on universal consistency is not valid in general metric spaces, but if $(F, d)$ is separable and the Besicovich condition is assumed, then the k-nearest neighbor classifier is weakly consistent. Let us assume, for the sake of simplicity, that $M = 2$, that is, we have only two classes. Let us also define the regression function $\eta(t) = E(Y \mid T = t) = P(Y = 1 \mid T = t)$, and $P_T$ the probability distribution of the random element $T \in \mathcal{T}$.

Theorem 6.1. The tree spaces $(\mathcal{T}, d_H)$ and $(G_s)$ are both separable spaces where the following condition, called the Besicovich condition, holds:

$$\lim_{\delta \to 0} \frac{1}{P_T(B_{T,\delta})} \int_{B_{T,\delta}} \eta(t)\, dP_T(t) = \eta(T) \quad \text{in probability,}$$

with $B_{T,\delta} = \{y \in \mathcal{T} : d(y, T) \le \delta\}$ the closed ball with center $T$ and radius $\delta$.
Thus, by the main theorem of Cerou and Guyader (2006) [12], the k-nn classifier is weakly consistent.

The proof of the theorem relies on the fact that the spaces have σ-finite dimensionality, which implies σ-finite dimensionality of the measures, a notion introduced by Preiss (1983) to characterize the spaces where all measures fulfill the Besicovich condition. Without delving into the details of this notion, we just note that the space of phylogenetic trees is finite, and the BFFS space is a countable sum of spaces of trees of finite depth. The BFFS space is, essentially, a subset of the space $\{0, 1\}^S$, for $S$ countable, which is a usual configuration in interacting particle systems. The product of the discrete topologies induces the BFFS metric, and under that metric, the convergence of a sequence $x_n$ to $x$ is equivalent to the convergence of $x_n(v)$ to $x(v)$ for every vertex $v$. The phylogenetic tree space with the Robinson Foulds distance can be seen as embedded in a connected graph (see Banks et al. (1994) [4] for details), where the length of the minimum path between each pair of nodes (trees) is the RF distance.

References

1. S. F. Altschul, T. L. Madden, A. A. Schaeffer, J. Zhang, Z. Zhang, W. Miller and D. J. Lipman. Gapped blast and psi blast: a new generation of protein database search programs. Nucleic Acids Research, 25, 3389 (1997).
2. B. Aydin, G. Pataki, H. Wang, E. Bullitt and J. S. Marron. A principal component analysis for trees. The Annals of Applied Statistics, 3 (4), 1597 (2009).
3. D. Balding, P. Ferrari, R. Fraiman and M. Sued. Limit theorems for sequences of random trees. Test, 18 (2), 302 (2009).
4. D. Banks and G. Constantine. Metric models for random graphs. Journal of Classification, 15, 199 (1998).
5. A. Bateman, L. Coin, R. Durbin, R. D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E. L. Sonnhammer, D. J. Studholme, C. Yeats and S. R. Eddy. Nucl. Acid Res., 32 (90001), 138 (2004).
6. G. Bejerano. Automata learning and stochastic modelling for bio sequence analysis. Ph.D. thesis, Hebrew University (2003).
7. G. Bejerano and G. Yona. Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics, 17 (1), 23 (2001).
8. C. Berge. Hypergraphs: combinatorics of finite sets. North Holland (1989).
9. D. Bryant. A classification of consensus methods for phylogenies. DIMACS, AMS, 163 (2003).
10. P. Bühlmann and A. J. Wyner. Variable Length Markov chains. Annals of Statistics, 27, 480 (1999).
11. J. Busch, P. Ferrari, A. G. Flesia, R. Fraiman, S. Grynberg and F. Leonardi. Testing statistical hypothesis on random trees, and applications to the protein classification problem. Annals of Applied Statistics, 3 (2), 542 (2009).
12. F. Cerou and A. Guyader. Nearest neighbor classification in infinite dimension. ESAIM: P&S, 10, 340 (2006).
13. N. Cristianini and M. W. Hahn. Introduction to Computational Genomics: A Case Studies Approach. Cambridge University Press (2007).
14. L. Devroye, L. Györfi and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer-Verlag (1996).
15. A. G. Flesia. A note on distinguishing random trees populations. Communications in Statistics Theory and Methods, in press (2011).
16. A. G. Flesia. Unsupervised classification of tree structured objects. BIOMAT 2008: International Symposium on Mathematical and Computational Biology, World Sci. Publ. Co., 280 (2009).
17. R. Hamming. Error detecting and error correcting codes. Bell System Technical Journal, 29, 147 (1950).
18. J. Rissanen. A universal data compression system. IEEE Trans. Inform. Theory, 29 (5), 656 (1983).
19. D. F. Robinson and L. R. Foulds. Comparison of phylogenetic trees. Mathematical Biosciences, 53, 131 (1981).
20. C. Stockham, L. Wang and T. Warnow. Statistically based postprocessing of phylogenetic analysis by clustering. Bioinformatics, 18, S285 (2002).
21. H. Wang and J. Marron. Object oriented data analysis: Sets of trees. Annals of Statistics, 35 (5), 1849 (2007).

May 9, 2013

9:51

BC: 8846 - BIOMAT 2012

20˙santina

REGULARITY OF OPTIMAL COST FUNCTIONAL APPLIED TO STUDY OF ENVIRONMENTAL POLLUTION∗

SANTINA ARANTES† and JAIME RIVERA‡
National Laboratory of Scientific Computation - LNCC, Av. Getúlio Vargas, 333 - Quitandinha, CEP: 25651-075, Petrópolis, Rio de Janeiro, Brazil

We study environmental pollution problems using optimal control theory applied to partial differential equations. We consider the problem of finding the optimal way to eliminate pollution over time, so that the concentration stays close to a standard level that does not affect the ecological equilibrium, when the source is pointwise. We consider a quadratic cost functional and prove the existence and uniqueness of an optimal control. We obtain a characterization that makes it possible to compute the optimal control. Additionally, we consider the problem of moving the pointwise source. To this end, we define a functional j(b) that associates with each point b in a region of space the value of the cost functional at the corresponding optimal control. We show that j(b) is differentiable, provided that the controls are taken in a convenient subset of admissible functions satisfying the cone properties. We also find the point in the region at which the cost functional is minimal.

1. Introduction

We study environmental pollution problems using optimal control theory applied to partial differential equations. We consider a pointwise source of pollution. We want to control the inflow of the pollutant over a time interval $[0, T]$ so that the corresponding concentration at time $T$ over the domain $\Omega$ is as close as possible to an objective function $z_d$, that is, to a standard level at which the ecological equilibrium is not affected. We denote the controls by $u, v, w, \dots$, taken in a closed convex subset $U_{ad}$ of the admissible space $U$, a subset of $L^2(0, T)$. Here, the incompressible viscous fluid is water and the pollutant is mercury (Hg). We consider the transient diffusive case together with convection, which is an important factor in the variation of the pollutant concentration. Optimal control problems for one-dimensional parabolic equations, applied to fluid flow through a pipe in which a contaminant is distributed, are studied, for example, in Joshi⁵. Our result is original in terms of its applications and in introducing the convection term into the model. In this paper, we follow the same procedure as in Rivera¹⁰ and use standard notation as in Adams¹ and Lions⁷.

∗ This work is supported by FAPERJ - Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro and CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico.
† [email protected] ‡ [email protected]

2. Model Formulation

Let $\Omega$ be an open bounded set of $\mathbb{R}^n$ ($n = 1, 2, 3$) with smooth boundary $\Gamma$. For numerical computation, we take $\Omega$ to be a rectangular region with boundary $\Gamma = \Gamma_1 \cup \Gamma_2 \cup \Gamma_3 \cup \Gamma_4$, as in Figure 1.

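[Figure 1. Sketch of the rectangular domain Ω in the (x, y) plane, with boundary parts Γ1, Γ2, Γ3, Γ4, velocity field β, the point b inside the subregion ω, and the source F(x, t) = v(t)δ(x − b).]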

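The state is defined by the solution $z$ of the system
\[
\begin{aligned}
\frac{\partial z}{\partial t} - \lambda\Delta z + \beta\cdot\nabla z &= v(t)\,\delta(x-b) && \text{in } \Omega\times(0,T),\\
z(x,0) &= 0 && \text{in } \Omega,\\
z(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial z}{\partial\nu}(x,t) &= f(x,t) && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{1}
\]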


where $z(x,t)$ stands for the pollutant concentration at $(x,t)\in\Omega\times[0,T]$, $\lambda>0$ is the diffusion coefficient, $\beta$ is the velocity vector with $\beta_i\in C^1(\Omega)$, $v(t)$ is the pointwise control, $\delta(x-b)$ is the Dirac mass at $b$, where $b$ is the point at which the pollutant enters, and $T>0$ is given. Condition (1)$_2$ states that there is no pollution at the initial time. Condition (1)$_3$ states that at any time $t\in(0,T)$ there is no pollution on the boundary $\Gamma_1$. Condition (1)$_4$ prescribes the diffusive flux of pollution through the boundaries $\Gamma_2\cup\Gamma_3\cup\Gamma_4$ by the function $f$, where $f$ is given in $L^2\big(0,T;H^{1/2}(\Gamma_2\cup\Gamma_3\cup\Gamma_4)\big)$ and $\partial z/\partial\nu=\nabla z\cdot\nu$, where $\nu$ is the exterior normal of $\Omega$. We want to minimize the cost functional
\[
J(v)=\int_\Omega \big|z(\cdot,T;v)-z_d\big|^2\,dx+N\int_0^T|v|^2\,dt,
\]

over a suitable set of admissible functions, where $z_d\in L^2(\Omega)$ defines the objective function and the cost constant $N>0$ is given. Note that the cost functional $J$ is well defined when $z(\cdot,T;v)\in L^2(\Omega)$. To analyse the problem, we decompose the solution of (1) as $z=y+w$, where $y$ and $w$ are, respectively, solutions of the systems
\[
\begin{aligned}
\frac{\partial y}{\partial t}-\lambda\Delta y+\beta\cdot\nabla y &= v(t)\,\delta(x-b) && \text{in } \Omega\times(0,T),\\
y(x,0) &= 0 && \text{in } \Omega,\\
y(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y}{\partial\nu}(x,t) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{2}
\]
and
\[
\begin{aligned}
\frac{\partial w}{\partial t}-\lambda\Delta w+\beta\cdot\nabla w &= 0 && \text{in } \Omega\times(0,T),\\
w(x,0) &= 0 && \text{in } \Omega,\\
w(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial w}{\partial\nu}(x,t) &= f(x,t) && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T).
\end{aligned}
\tag{3}
\]
The cost functional $J$ can be rewritten as
\[
J(v)=\int_\Omega\big|y(\cdot,T;v)-[z_d-w(\cdot,T)]\big|^2\,dx+N\int_0^T|v|^2\,dt,
\]
where $f\in L^2\big(0,T;H^{1/2}(\Gamma_2\cup\Gamma_3\cup\Gamma_4)\big)$ and $V=\{v\in H^1(\Omega);\ v=0 \text{ on } \Gamma_1\}$.

Problem (2) is not a standard one for $n=2,3$, because $\delta\in H^{-2}(\Omega)$, which makes the problem more delicate from the mathematical point of view. In contrast, (3) can be solved using standard methods for parabolic partial differential equations. Using the transposition method as in Lions⁶,⁸, we can prove that for any $v\in L^2(0,T)$ there exists a unique ultraweak solution $y(v)\in L^2(0,T;L^2(\Omega))$ of (2), satisfying
\[
\int_\Omega\int_0^T y\,\Psi\,dt\,dx=\int_0^T v(t)\,\phi(b,t)\,dt
\]
for any $\Psi\in L^2(0,T;L^2(\Omega))$, where $\phi$ is the solution of
\[
\begin{aligned}
-\frac{\partial\phi}{\partial t}-\lambda\Delta\phi-\beta\cdot\nabla\phi &= \Psi && \text{in } \Omega\times(0,T),\\
\phi(x,T) &= 0 && \text{in } \Omega,\\
\phi(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\lambda\frac{\partial\phi}{\partial\nu}(x,t)+\phi(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\]
and $\phi$ satisfies
\[
\int_0^T|\phi(b,t)|^2\,dt\le C\,\|\Psi\|^2_{L^2(0,T;L^2(\Omega))}.
\]

The main problem here is that $y\in C([0,T];H^{-1}(\Omega))$; therefore $y(\cdot,T;v)$ makes sense only in $H^{-1}(\Omega)$ and not in $L^2(\Omega)$, as required for the cost functional. Hence, we need to define the admissible set of controls as
\[
U=\{v\in L^2(0,T);\ y(\cdot,T;v)\in L^2(\Omega)\}.
\]
This space, together with the norm
\[
\|v\|_U^2=\int_0^T|v|^2\,dt+\int_\Omega|y(\cdot,T;v)|^2\,dx,
\tag{4}
\]
is a Hilbert space. Lions⁹ showed that $U$ is characterized as
\[
U=\Big\{v\in L^2(0,T);\ \int_0^T\!\!\int_0^T(2T-t-s)^{-n/2}\,v(t)\,v(s)\,dt\,ds<\infty\Big\},
\]
which is a Hilbert space with the norm
\[
|||v|||_U^2=\int_0^T|v|^2\,dt+\int_0^T\!\!\int_0^T(2T-t-s)^{-n/2}\,v(t)\,v(s)\,dt\,ds.
\tag{5}
\]
He also proved that the norms given in (4) and (5) are equivalent.


Here, we consider the closed convex subset $U_{ad}$ of the admissible controls $U$ given by
\[
U_{ad}=\{v\in U;\ v\ge\psi\ge0 \text{ a.e. in } (0,T)\},\qquad \psi \text{ given in } H^1(0,T).
\tag{6}
\]
Under these conditions, it is not difficult to show that there exists a unique $u\in U_{ad}$ solving
\[
J(u)=\inf\{J(v);\ v\in U_{ad}\},
\tag{7}
\]
which is characterized by the solution of the optimality system
\[
\begin{aligned}
\frac{\partial y}{\partial t}-\lambda\Delta y+\beta\cdot\nabla y &= u(t)\,\delta(x-b) && \text{in } \Omega\times(0,T),\\
-\frac{\partial q}{\partial t}-\lambda\Delta q-\beta\cdot\nabla q &= 0 && \text{in } \Omega\times(0,T),\\
y(x,0)=0;\quad q(x,T;u) &= y(x,T;u)-[z_d-w(x,T)] && \text{in } \Omega,\\
y(x,t)=q(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y}{\partial\nu}(x,t)=\lambda\frac{\partial q}{\partial\nu}(x,t)+q(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{8}
\]
with
\[
\int_0^T\big[q(b,t)+N\,u(t)\big]\big[v(t)-u(t)\big]\,dt\ge0,\qquad\forall\,v\in U_{ad},\quad u\in U_{ad},
\tag{9}
\]
which implies
\[
u(t)=\psi(t)+\Big[\frac1N\,q(b,t)+\psi(t)\Big]^-=P\Big(-\frac1N\,q(b,t)\Big).
\tag{10}
\]
In (8), $w$ is the solution of (3) and $q\in L^2(0,T;V)\cap C^0([0,T];L^2(\Omega))$ is the solution of the adjoint system and satisfies $q\in C^\infty([0,T)\times\Omega)$. Therefore, we can define $q(b,t)$ for $t<T$. The integral in (9) denotes the duality between $U'$ and $U$, $f^-=\max\{0,-f\}$, and $P$ is the projection operator of $L^2(0,T)$ onto $U_{ad}$. Note that for any $b\in\Omega$ we get an optimal control $u=u(b,\cdot)$, solution of (8). So, we can define the optimal cost functional $j(b)$ as the value of $J$ at the optimal control $u(b,\cdot)$, which is the solution of problem (7). That is, $j(b)$ is defined as
\[
j:\Omega\to\mathbb{R},\qquad b\mapsto j(b)=J(u(b,\cdot))=\int_\Omega\big|y(b,T;u)-[z_d-w(\cdot,T)]\big|^2\,dx+N\int_0^T|u(b,t)|^2\,dt.
\]
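The two expressions for the optimal control in (10) can be compared numerically. The following is only a minimal sketch: the sampled values of $q(b,t)$, the bound $\psi$ and the constant $N$ are hypothetical, and the projection $P$ is assumed to act pointwise in time as $\max\{\psi,\cdot\}$, which is how the constraint $v\ge\psi$ in (6) suggests it should be read.

```python
import numpy as np

def control_from_formula_10(q_b, psi, N):
    """u = psi + [q(b,t)/N + psi]^-  with  f^- = max{0, -f}, as written in (10)."""
    g = q_b / N + psi
    negative_part = np.maximum(0.0, -g)
    return psi + negative_part

# Hypothetical samples of the adjoint state q(b, t) at five times.
q_b = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
psi = np.full_like(q_b, 0.5)   # hypothetical lower bound psi(t)
N = 2.0                        # hypothetical cost constant

u = control_from_formula_10(q_b, psi, N)

# Assumed pointwise form of the projection P onto {v >= psi}.
u_projected = np.maximum(psi, -q_b / N)
```

Under these assumptions both arrays coincide, and the control equals the lower bound $\psi$ wherever $-q(b,t)/N$ falls below it.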

For applications, it is important to know the regularity of $j(b)$. Lions⁹ studies a similar problem with $\beta=0$ and homogeneous Dirichlet boundary conditions. He shows that when the closed convex subset $U_{ad}$ has no restrictions, that is, when $U_{ad}=U$, and $z_d\in H_0^1(\Omega)$, the optimal cost functional $j$ is differentiable with continuous partial derivatives given by
\[
\frac{\partial}{\partial b_k}j(b)=-\frac2N\int_0^T q(b,t)\,\frac{\partial}{\partial x_k}q(b,t)\,dt.
\]
Rivera¹⁰ studies the same problem as Lions⁹ and shows that $j$ is also differentiable when $U_{ad}$ is the positive cone $\{v\in U;\ v\ge0\}$, and that the partial derivatives of $j(b)$ are given by
\[
\frac{\partial}{\partial b_k}j(b)=2\int_0^T u(b,t)\,\frac{\partial}{\partial x_k}q(b,t)\,dt.
\tag{11}
\]
In this paper, we improve the results of Rivera and Lions in the sense that we consider $\beta\ne0$ and the convex set of admissible controls $U_{ad}$ given by (6). Under these conditions, we show that $j(b)\in C^1(\Omega)$, provided that $z_d\in H^1(\Omega)$, and that the partial derivatives of $j(b)$ are given by (11). Moreover, we introduce numerical computations for the problem.

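3. Regularized Problem

Let us denote by $\phi_0$ a $C^\infty(\Omega)$ function with $\operatorname{supp}\phi_0\subset B(0,\varepsilon)=\{x\in\Omega;\ \|x\|<\varepsilon\}$ for $\varepsilon>0$, such that the function $\phi$ defined as
\[
\phi:\Omega\times\Omega\to\mathbb{R},\qquad(x,b)\mapsto\phi(x,b)=\phi_0(x-b)
\]
is a $C^\infty(\Omega\times\Omega)$ function and has support in $B(b,\varepsilon)\subset\Omega$. Let us consider the regularized state equation
\[
\begin{aligned}
\frac{\partial y}{\partial t}-\lambda\Delta y+\beta\cdot\nabla y &= v(t)\,\phi(x,b) && \text{in } \Omega\times(0,T),\\
y(x,0) &= 0 && \text{in } \Omega,\\
y(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y}{\partial\nu}(x,t) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T).
\end{aligned}
\tag{12}
\]
It is not difficult to see that for any $v\in L^2(0,T)$ the solution $y$ satisfies
\[
y\in L^2\big(0,T;V\cap H^2(\Omega)\big)\cap C^0\big([0,T];H^1(\Omega)\big)\quad\text{and}\quad\frac{\partial y}{\partial t}\in L^2\big(0,T;L^2(\Omega)\big).
\]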


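Therefore, the solution of problem (7) associated with the state equation (12) is characterized by
\[
\begin{aligned}
\frac{\partial y}{\partial t}-\lambda\Delta y+\beta\cdot\nabla y &= u(t)\,\phi(x,b) && \text{in } \Omega\times(0,T),\\
-\frac{\partial q}{\partial t}-\lambda\Delta q-\beta\cdot\nabla q &= 0 && \text{in } \Omega\times(0,T),\\
y(x,0)=0;\quad q(x,T;u) &= y(x,T;u)-[z_d-w(x,T)] && \text{in } \Omega,\\
y(x,t)=q(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y}{\partial\nu}(x,t)=\lambda\frac{\partial q}{\partial\nu}(x,t)+q(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{13}
\]
with
\[
\int_0^T\Big[\int_\Omega q(\cdot,t)\,\phi\,dx+N\,u\Big](v-u)\,dt\ge0,\qquad\forall\,v\in U_{ad},\quad u\in U_{ad},
\tag{14}
\]
which implies
\[
u(t)=\psi+\Big[\frac1N\int_\Omega q(\cdot,t)\,\phi\,dx+\psi\Big]^-=P\Big(-\frac1N\int_\Omega q(\cdot,t)\,\phi\,dx\Big).
\tag{15}
\]
In (13), $w$ is the solution of (3) and $q$ is the solution of the adjoint system. For the regularized problem, we will show that the optimal cost functional $j$ is differentiable when $U_{ad}$ is given by (6), and that its partial derivatives satisfy
\[
\frac{\partial}{\partial b_k}j(b)=2\int_0^T u(b,t)\int_\Omega\frac{\partial}{\partial x_k}q(\cdot,t)\,\phi\,dx\,dt.
\tag{16}
\]
The next remark will be useful in what follows.

Remark 3.1. Since we are considering a viscous incompressible fluid, we have
\[
\beta\big|_{\Gamma_2\cup\Gamma_4}=0\quad\text{and}\quad\operatorname{div}(\beta)=0.
\tag{17}
\]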

The first condition comes from the viscosity of the fluid: fluid particles adhere to the solid surface, so the velocity vanishes there and reaches its maximum on the centre line of the domain (parallel to the $x$-axis). The second condition comes from the incompressibility of the fluid. See Schlichting¹². Using (17), we see that
\[
\int_\Omega q\,(\beta\cdot\nabla y)\,dx=\int_{\Gamma_3}y\,q\,(\beta\cdot\nu)\,d\Gamma-\int_\Omega y\,(\beta\cdot\nabla q)\,dx.
\]
In particular,
\[
\int_\Omega(\beta\cdot\nabla y)\,y\,dx=\frac12\int_{\Gamma_3}|y|^2|\beta|\,d\Gamma\quad\text{and}\quad\int_\Omega(\beta\cdot\nabla q)\,q\,dx=\frac12\int_{\Gamma_3}|q|^2|\beta|\,d\Gamma.
\tag{18}
\]
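For completeness, the first identity in (18) can be obtained by the following short computation. It is a sketch under one assumption that the text does not state explicitly, namely that $\Gamma_3$ is the outflow boundary, on which $\beta\cdot\nu=|\beta|$. The steps use $\operatorname{div}(\beta)=0$ and $\beta=0$ on $\Gamma_2\cup\Gamma_4$ from (17), together with $y=0$ on $\Gamma_1$:

\[
\int_\Omega(\beta\cdot\nabla y)\,y\,dx
=\frac12\int_\Omega\beta\cdot\nabla\big(y^2\big)\,dx
=\frac12\int_\Gamma y^2\,(\beta\cdot\nu)\,d\Gamma-\frac12\int_\Omega\operatorname{div}(\beta)\,y^2\,dx
=\frac12\int_{\Gamma_3}|y|^2\,|\beta|\,d\Gamma.
\]

The identity for $q$ follows in the same way, since $q$ also vanishes on $\Gamma_1$.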


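Lemma 3.1. Let us take $b\in\Omega$. Then $j(b)$ is bounded in $\Omega$.

Proof. We have that $u(b,t)\in L^2(0,T)$ and $z_d\in L^2(\Omega)$. Since $w(\cdot,T)\in L^2(\Omega)$ does not depend on $b$ and $v$, it is sufficient to show that
\[
\|y(b,T;u)\|^2_{L^2(\Omega)}\le c\int_0^T|u(b,t)|^2\,dt,\qquad\forall\,b\in\Omega.
\]
Multiplying (13)$_1$ by $y$, integrating over $\Omega\times(0,T)$ and using (18), we obtain
\[
\int_\Omega\int_0^T y_t\,y\,dt\,dx+\lambda\int_0^T\!\!\int_\Omega|\nabla y|^2\,dx\,dt+\frac12\int_0^T\!\!\int_{\Gamma_3}|y|^2|\beta|\,d\Gamma\,dt=\int_0^T\!\!\int_\Omega u\,\phi\,y\,dx\,dt.
\]
From Poincaré's and Young's inequalities, it follows that
\[
\begin{aligned}
\frac12\int_\Omega\int_0^T\frac{d}{dt}|y|^2\,dt\,dx+\lambda\int_\Omega\int_0^T|\nabla y|^2\,dt\,dx
&\le C_\varepsilon\int_\Omega|\phi|^2\,dx\int_0^T|u|^2\,dt+\varepsilon\int_\Omega\int_0^T|y|^2\,dt\,dx\\
&\le C_\varepsilon\,\|\phi\|^2_{L^2(\Omega)}\int_0^T|u|^2\,dt+c\,\varepsilon\int_0^T\!\!\int_\Omega|\nabla y|^2\,dx\,dt.
\end{aligned}
\]
This implies
\[
\frac12\int_\Omega|y(\cdot,T;u)|^2\,dx+(\lambda-c\,\varepsilon)\int_0^T\!\!\int_\Omega|\nabla y|^2\,dx\,dt\le C_1(\phi)\int_0^T|u|^2\,dt,
\]
with $\varepsilon$ small enough that $\lambda-c\,\varepsilon>0$. Therefore, the result follows.

We denote by $\phi_h$ the $h$-translation of $\phi$; that is, $\phi_h(x,b)=\phi(x,b+h)$. We denote the $h$-translations of $\{u,y,q\}$ and $w$ by $\{u_h,y_h,q_h\}$ and $w_h$, which are the solutions of (13) and (3), respectively, for $\phi=\phi_h$. Finally, let us introduce the notation
\[
D_hu=\frac1h(u_h-u),\qquad D_hy=\frac1h(y_h-y),\qquad D_hq=\frac1h(q_h-q).
\]
It is not difficult to see that $D_hu$, $D_hy$ and $D_hq$ satisfy the system
\[
\begin{aligned}
(D_hy)_t-\lambda\Delta(D_hy)+\beta\cdot\nabla(D_hy) &= (D_hu)\,\phi_h+u\,D_h\phi && \text{in } \Omega\times(0,T),\\
-(D_hq)_t-\lambda\Delta(D_hq)-\beta\cdot\nabla(D_hq) &= 0 && \text{in } \Omega\times(0,T),\\
D_hy(x,0)=0;\quad D_hq(x,T) &= D_hy(x,T) && \text{in } \Omega,\\
D_hy(x,t)=D_hq(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial(D_hy)}{\partial\nu}(x,t)=\lambda\frac{\partial(D_hq)}{\partial\nu}+D_hq\,(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T).
\end{aligned}
\]
Remark 3.2. Rivera¹⁰ showed that the function $t\to\|D_hq(\cdot,t)\|_{L^2(\Omega)}$ is non-decreasing.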


In what follows, we denote by $W$ an open set of $\mathbb{R}^n$ such that $W\subset\Omega$.

Lemma 3.2. Let $\phi$ be a $C^1$ function, $U_{ad}$ the closed convex subset given by (6) and $z_d\in L^2(\Omega)$. Then $D_hu$, $D_hy(\cdot,T)$ (and therefore $D_hq(\cdot,T)$) are bounded for $h<d(W,\Gamma)$.

Proof. See Arantes².

Corollary 3.1. Let $b\in W$. Then the functions
\[
b\to u(\cdot,b),\qquad b\to y(\cdot,b,T),\qquad b\to\frac{\partial q}{\partial x_k}(x,b,t)
\]
are continuous from $\Omega$ to $L^2(0,T)$, from $\Omega$ to $L^2(\Omega)$, and from $\Omega$ to $L^2(0,T;L^2(\Omega))$, respectively.

Proof. See Arantes².

Lemma 3.3. The solutions $\{u,y,q\}$ and $w$ of (13) and (3), respectively, satisfy
\[
\int_\Omega[y_h(\cdot,T)-y(\cdot,T)]\,[q_h(\cdot,T)+q(\cdot,T)]\,dx
=\int_0^T u_h\int_\Omega(\phi_h-\phi)(q_h+q)\,dx\,dt+\int_0^T(u_h-u)\int_\Omega\phi\,(q_h+q)\,dx\,dt,
\tag{19}
\]
\[
\int_\Omega[w_h(\cdot,T)-w(\cdot,T)]\,[q_h(\cdot,T)+q(\cdot,T)]\,dx=0.
\tag{20}
\]
Proof. See Arantes².

Now, we show the differentiability of the optimal cost functional $j$ of the regularized problem.

Theorem 3.1. For $U_{ad}$ given by (6), the optimal cost functional $j$ for the regularized problem is differentiable with continuous partial derivatives, and (16) holds for any $b$ in $\Omega$.

Proof. From the definition of the optimal cost functional $j$, we get
\[
j(b+h)-j(b)=\int_\Omega\big[y_h(\cdot,T)-y(\cdot,T)+w_h(\cdot,T)-w(\cdot,T)\big]\,[q_h(\cdot,T)+q(\cdot,T)]\,dx+N\int_0^T\big(u_h^2(t)-u^2(t)\big)\,dt.
\]
Using (19) and (20), we arrive at
\[
j(b+h)-j(b)=\int_0^T u_h\int_\Omega(\phi_h-\phi)(q_h+q)\,dx\,dt+\int_0^T(u_h-u)\Big[\int_\Omega\phi\,(q_h+q)\,dx+N(u_h+u)\Big]\,dt.
\tag{21}
\]
On the other hand,
\[
\int_\Omega(\phi_h-\phi)\,q_h\,dx=\int_\Omega[\phi_0(x-b-h)-\phi_0(x-b)]\,q_h\,dx=\int_\Omega\phi_0(x-b)\,[q_h(x+h,t)-q_h(x,t)]\,dx.
\tag{22}
\]
Substituting (22) into (21) yields
\[
\begin{aligned}
j(b+h)-j(b)
&=\int_0^T u_h\int_\Omega\phi(x,b)\big\{q_h(x+h,t)-q_h(x,t)+q(x+h,t)-q(x,t)\big\}\,dx\,dt\\
&\quad-\int_0^T(u_h-u)\int_\Omega\phi(x,b)\big[q_h(x+h,t)-q_h(x,t)\big]\,dx\,dt\\
&\quad+\int_0^T(u_h-u)\Big[\int_\Omega\phi(x,b)\big[q_h(x+h,t)+q(x,t)\big]\,dx+N(u_h+u)\Big]\,dt.
\end{aligned}
\]
From (14), the last term of the above equation is zero. Moreover, dividing the above equation by $h$, taking the limit as $h\to0$ and using Corollary 3.1, we have
\[
\frac{\partial}{\partial b_k}j(b)=2\int_0^T u(b,t)\int_\Omega\frac{\partial}{\partial x_k}q(\cdot,t)\,\phi\,dx\,dt.
\]
Thus, the proof is complete.

4. Differentiability of Optimal Cost Functional

In this section, we study the differentiability of the optimal cost functional $j$ for the non-regularized problem. To this end, we show the convergence of problem (13)-(15) to problem (8)-(10), using the properties of projections and the analysis of the translated problem given in Section 3. To regularize (8), we consider a sequence $(\phi_0^\eta)_{\eta\in\mathbb{N}}$ of $C^\infty(\Omega)$ functions satisfying $\operatorname{supp}\{\phi_0^\eta\}\subset B(0,\tfrac1\eta)=\{x\in\Omega;\ \|x\|<\tfrac1\eta\}$ and
\[
\phi_0^\eta\ge0,\qquad\int_{\mathbb{R}}\phi_0^\eta(\cdot)\,dx=1,\qquad\forall\,\eta\in\mathbb{N}.
\]


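Let us denote by $(\phi_\eta)_{\eta\in\mathbb{N}}$ the sequence of functions in $C^\infty(\Omega\times\Omega)$ defined as $\phi_\eta(x,b)=\phi_0^\eta(x-b)$. Consider $\varepsilon$ given with $0<\varepsilon<T$. Then, we rewrite (8) as the $\eta$-$\varepsilon$-approximated problem
\[
\begin{aligned}
\frac{\partial(y^\varepsilon)^\eta}{\partial t}-\lambda\Delta(y^\varepsilon)^\eta+\beta\cdot\nabla(y^\varepsilon)^\eta &= \chi_{[0,T-\varepsilon]}\,(u^\varepsilon)^\eta\,\phi_\eta && \text{in } \Omega\times(0,T),\\
-\frac{\partial(q^\varepsilon)^\eta}{\partial t}-\lambda\Delta(q^\varepsilon)^\eta-\beta\cdot\nabla(q^\varepsilon)^\eta &= 0 && \text{in } \Omega\times(0,T),\\
(y^\varepsilon)^\eta(x,0) &= 0 && \text{in } \Omega,\\
(q^\varepsilon)^\eta(x,T;(u^\varepsilon)^\eta) &= (y^\varepsilon)^\eta(x,T;(u^\varepsilon)^\eta)-[z_d-w(x,T)] && \text{in } \Omega,\\
(y^\varepsilon)^\eta(x,t)=(q^\varepsilon)^\eta(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial(y^\varepsilon)^\eta}{\partial\nu}(x,t)=\lambda\frac{\partial(q^\varepsilon)^\eta}{\partial\nu}(x,t)+(q^\varepsilon)^\eta(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{23}
\]
with
\[
(u^\varepsilon)^\eta(t)=(\psi^\varepsilon)^\eta+\Big[\frac1N\int_\Omega(q^\varepsilon)^\eta\,\phi_\eta\,dx+(\psi^\varepsilon)^\eta\Big]^-=P\Big(-\frac1N\int_\Omega(q^\varepsilon)^\eta\,\phi_\eta\,dx\Big),
\]
and
\[
(y^\varepsilon)^\eta(x,T;(u^\varepsilon)^\eta)\xrightarrow[\varepsilon\to0]{}y^\eta(x,T;u^\eta)\xrightarrow[\eta\to\infty]{}y(x,T;u)\quad\text{strongly in } H^1(\Omega)\cap V,
\tag{24}
\]
\[
\phi_\eta\to\delta(x-b)\quad\text{strongly in } H^{-2}(\Omega),\ \text{as } \eta\to\infty.
\tag{25}
\]
By virtue of Theorem 3.1, we can write the partial derivatives of $j_\eta^\varepsilon$ as
\[
\frac{\partial}{\partial b_k}j_\eta^\varepsilon(b)=2\int_0^T(u^\varepsilon)^\eta(b,t)\int_\Omega\frac{\partial}{\partial x_k}(q^\varepsilon)^\eta(\cdot,t)\,\phi_\eta\,dx\,dt.
\tag{26}
\]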

Theorem 4.1. The solution of problem (23) satisfies $(y^\varepsilon)^\eta(\cdot,T),\ (q^\varepsilon)^\eta(\cdot,T)\in H^1(\Omega)$, provided that $z_d\in H^1(\Omega)$.

Proof. See Arantes².

Theorem 4.2. Let $\{(u^\varepsilon)^\eta,(y^\varepsilon)^\eta,(q^\varepsilon)^\eta\}$ be the solution of problem (23). Then, as $\varepsilon\to0$, we have the convergences
\[
(q^\varepsilon)^\eta\to q^\eta,\qquad(y^\varepsilon)^\eta\to y^\eta,\qquad(u^\varepsilon)^\eta\to u^\eta,\qquad(\psi^\varepsilon)^\eta\to\psi^\eta,
\]
strongly in $L^2\big(0,T;H^2(\Omega)\cap V\big)$, $L^2\big(0,T;L^2(\Omega)\big)$, $L^2(0,T)$ and $H^1(0,T)$, respectively.

Proof. See Arantes².

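As a consequence of Theorem 4.2, the solution of (23) converges, as $\varepsilon\to0$, to the solution of the system
\[
\begin{aligned}
\frac{\partial y^\eta}{\partial t}-\lambda\Delta y^\eta+\beta\cdot\nabla y^\eta &= u^\eta\,\phi_\eta && \text{in } \Omega\times(0,T),\\
-\frac{\partial q^\eta}{\partial t}-\lambda\Delta q^\eta-\beta\cdot\nabla q^\eta &= 0 && \text{in } \Omega\times(0,T),\\
y^\eta(x,0)=0;\quad q^\eta(x,T) &= y^\eta(x,T;u^\eta)-[z_d-w(x,T)] && \text{in } \Omega,\\
y^\eta(x,t)=q^\eta(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y^\eta}{\partial\nu}(x,t)=\lambda\frac{\partial q^\eta}{\partial\nu}(x,t)+q^\eta(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),
\end{aligned}
\tag{27}
\]
where
\[
u^\eta(t)=\psi^\eta+\Big[\frac1N\int_\Omega q^\eta\,\phi_\eta\,dx+\psi^\eta\Big]^-=P\Big(-\frac1N\int_\Omega q^\eta\,\phi_\eta\,dx\Big).
\]
Therefore, from (26), we can write the partial derivatives of $j_\eta$ as
\[
\frac{\partial}{\partial b_k}j_\eta(b)=2\int_0^T u^\eta(b,t)\int_\Omega\frac{\partial}{\partial x_k}q^\eta(\cdot,t)\,\phi_\eta\,dx\,dt.
\tag{28}
\]
Lemma 4.1. Consider $b\in\Omega$. Then the sequence $(j_\eta(b))_{\eta\in\mathbb{N}}$ is uniformly bounded in $\Omega$.

Proof. The proof is a consequence of Theorem 4.2.

Lemma 4.2. Let $\theta$ be a $C^\infty(\Omega)$ function satisfying
\[
\operatorname{supp}\theta\subset\Omega\quad\text{and}\quad\theta(x)=1\ \text{in } W.
\]
If $z_d\in H^1(\Omega)$, then for any $b\in W$, $\partial q(b,t)/\partial x_k\in U'$ and the identities
\[
\int_0^T u^\eta(t)\int_\Omega\frac{\partial q^\eta}{\partial x_k}\,\phi_\eta\,dx\,dt=\int_\Omega y^\eta(\cdot,T)\,\frac{\partial(\theta q^\eta)}{\partial x_k}(\cdot,T)\,dx+\int_0^T\!\!\int_\Omega y^\eta\,\frac{\partial\xi^\eta}{\partial x_k}\,dx\,dt,
\]
\[
\int_0^T u(t)\,\frac{\partial q}{\partial x_k}(b,t)\,dt=\int_\Omega y(\cdot,T)\,\frac{\partial(\theta q)}{\partial x_k}(\cdot,T)\,dx+\int_0^T\!\!\int_\Omega y\,\frac{\partial\xi}{\partial x_k}\,dx\,dt
\]
hold. Here
\[
\xi^\eta=-2\lambda\sum_{k=1}^n\frac{\partial q^\eta}{\partial x_k}\frac{\partial\theta}{\partial x_k}-\lambda(\Delta\theta)\,q^\eta-(\beta\cdot\nabla\theta)\,q^\eta.
\]
Proof. See Arantes².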


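Finally, we prove the main result of this paper.

Theorem 4.3. Let $z_d\in H^1(\Omega)$ and $U_{ad}$ be given by (6). Then the solution of (27) converges to the solution of (8) as $\eta\to\infty$. Moreover, the partial derivatives of $j(b)$ are continuous and (11) holds.

Proof. From the bounds obtained in Theorem 4.1 and the convergences of Theorem 4.2, we conclude that the sequences $q^\eta$, $y^\eta$, $u^\eta$ and $\psi^\eta$, $\eta\in\mathbb{N}$, are bounded in $L^2(0,T;H^2(\Omega)\cap V)$, $L^2(0,T;L^2(\Omega))$, $L^2(0,T)$ and $H^1(0,T)$, respectively. Therefore, analogously to Theorem 4.2 and the convergence (24), we get the following convergences as $\eta\to\infty$:
\[
\begin{aligned}
q^\eta&\to q && \text{strongly in } L^2(0,T;H^2(\Omega)\cap V),\\
y^\eta&\to y && \text{strongly in } L^2(0,T;L^2(\Omega)),\\
u^\eta&\to u && \text{strongly in } L^2(0,T),\\
\psi^\eta&\to\psi && \text{strongly in } H^1(0,T),\\
y^\eta(\cdot,T;u^\eta)&\to y(\cdot,T;u) && \text{strongly in } H^1(\Omega)\cap V.
\end{aligned}
\]
From the above convergences, (25) and (28), we obtain
\[
\lim_{\eta\to\infty}\frac{\partial}{\partial b_k}j_\eta(b)=2\int_0^T u(t)\,\frac{\partial}{\partial x_k}q(b,t)\,dt.
\tag{29}
\]
To conclude, it remains to show that the sequence
\[
\Big(\frac{\partial}{\partial b_k}j_\eta(b)\Big)_{\eta\in\mathbb{N}}
\tag{30}
\]
is bounded and equicontinuous. In fact, from Lemma 4.2, it follows that
\[
\begin{aligned}
\Big|\frac{\partial}{\partial b_k}j_\eta(b)\Big|
&\le2\,\Big|\int_\Omega y^\eta(\cdot,T)\,\frac{\partial(\theta q^\eta)}{\partial x_k}(\cdot,T)\,dx\Big|+2\,\Big|\int_0^T\!\!\int_\Omega y^\eta\,\frac{\partial\xi^\eta}{\partial x_k}\,dx\,dt\Big|\\
&\le2\Big(\int_\Omega|y^\eta(\cdot,T)|^2\,dx\Big)^{1/2}\Big(\int_\Omega\Big|\frac{\partial(\theta q^\eta)}{\partial x_k}(\cdot,T)\Big|^2dx\Big)^{1/2}
+2\Big(\int_0^T\!\!\int_\Omega|y^\eta|^2\,dx\,dt\Big)^{1/2}\Big(\int_0^T\!\!\int_\Omega\Big|\frac{\partial\xi^\eta}{\partial x_k}\Big|^2dx\,dt\Big)^{1/2}<\ \dots
\end{aligned}
\]
[...] $\varepsilon>0$ there exists $\sigma>0$ such that
\[
\int_{T-\sigma}^T\!\!\int_\Omega|y^\eta|\,\Big|\frac{\partial\xi_h^\eta}{\partial x_k}-\frac{\partial\xi^\eta}{\partial x_k}\Big|\,dx\,dt<\frac\varepsilon4.
\tag{32}
\]
Using the regularity of $q$, we arrive at
\[
\begin{aligned}
\int_0^{T-\sigma}\!\!\int_\Omega|y^\eta|\,\Big|\frac{\partial\xi_h^\eta}{\partial x_k}-\frac{\partial\xi^\eta}{\partial x_k}\Big|\,dx\,dt
&=\int_0^{T-\sigma}\!\!\int_\Omega|y^\eta|\,\Big|\frac{\partial}{\partial x_k}\Big\{2\sum_{i=1}^n\frac{\partial\theta}{\partial x_i}\frac{\partial}{\partial x_i}(q_h^\eta-q^\eta)+(\Delta\theta)(q_h^\eta-q^\eta)-(\beta\cdot\nabla\theta)(q_h^\eta-q^\eta)\Big\}\Big|\,dx\,dt\\
&\le C(\sigma)\Big(\int_\Omega|y_h^\eta(\cdot,T)-y^\eta(\cdot,T)|^2\,dx\Big)^{1/2}\le C(\sigma)\,h.
\end{aligned}
\tag{33}
\]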


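Combining (32) and (33), we obtain
\[
\int_Q|y^\eta|\,\Big|\frac{\partial\xi_h^\eta}{\partial x_k}-\frac{\partial\xi^\eta}{\partial x_k}\Big|\,dx\,dt\le C(\sigma)\,h+\frac\varepsilon4.
\tag{34}
\]
Taking $\delta=\min\big\{\frac{\varepsilon}{8C},\frac{3\varepsilon}{8C(\sigma)}\big\}$, it follows from (31)-(34) that
\[
|h|=\|h\|<\delta\ \Rightarrow\ \Big|\frac{\partial}{\partial b_k}j_\eta(b+h)-\frac{\partial}{\partial b_k}j_\eta(b)\Big|\le\varepsilon,\qquad\forall\,\eta\in\mathbb{N}.
\]
Therefore, the sequence given in (30) is equicontinuous and bounded. Thus, there exists a subsequence that converges uniformly in $\Omega$ to a continuous function. On the other hand, following the same line of ideas as in the proof of Theorem 3.1 and using (22), we conclude that
\[
\frac{\partial}{\partial b_k}j(b)=2\int_0^T u(b,t)\,\frac{\partial}{\partial x_k}q(b,t)\,dt.
\tag{35}
\]
From (29) and (35), we have
\[
\lim_{\eta\to\infty}\frac{\partial}{\partial b_k}j_\eta(b)=\frac{\partial}{\partial b_k}j(b).
\]
Therefore, our result follows.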

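5. Numerical Approximation and Convergence

To solve (23) numerically, we uncouple the system as follows. For $n\in\mathbb{N}$, let
\[
\begin{aligned}
y_t^n-\lambda\Delta y^n+\beta\cdot\nabla y^n &= \chi_{[0,T-\varepsilon]}\Big[\psi+\Big(\frac1N\int_\Omega\phi\,q^n\,dx+\psi\Big)^-\Big]\phi && \text{in } \Omega\times(0,T),\\
-q_t^n-\lambda\Delta q^n-\beta\cdot\nabla q^n &= 0 && \text{in } \Omega\times(0,T),\\
y^n(x,0)=0;\quad q^n(x,T) &= y^{n-1}(x,T;u)-[z_d-w(x,T)] && \text{in } \Omega,\\
y^n(x,t)=q^n(x,t) &= 0 && \text{on } \Gamma_1\times(0,T),\\
\frac{\partial y^n}{\partial\nu}(x,t)=\lambda\frac{\partial q^n}{\partial\nu}(x,t)+q^n(x,t)(\beta\cdot\nu) &= 0 && \text{on } (\Gamma_2\cup\Gamma_3\cup\Gamma_4)\times(0,T),\\
y^0(\cdot,T) &\ \text{given in } H^1(\Omega).
\end{aligned}
\tag{36}
\]
Here, for simplicity, we have omitted the indices $\varepsilon$ and $\eta$. The next theorem shows the convergence of (36) to (23) as $n\to\infty$.

Theorem 5.1. Consider $z_d\in H^1(\Omega)$ and $N$ large enough. Then the solution of (36) converges to the solution of (23) as $n\to\infty$.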


Proof. See Arantes².

Remark 5.1. Varying the point $x=b$ at which the pollutant enters does not change the form of (1), and for each point $b$, among the controls $u$ (which depend on $b$) there exists only one optimal control $u_b\in U_{ad}$ such that
\[
J(u_b)=\inf\{J(u);\ u\in U_{ad}\}.
\]
This means that in $\Omega$ there exists only one point $b_o$ (called the strategic point or optimal point) where pollution is least harmful to the environment.

6. Numerical Problem

So that the numerical problem represents a practical situation, we take $f=0$ on the boundaries $\Gamma_2\cup\Gamma_3\cup\Gamma_4$. Thus, in (3) we have $w\equiv0$. For the numerical solution of (36), we use the finite element method combined with iterative methods. The variational formulation consists of finding $y^n,q^n\in V=\{v\in H^1(\Omega);\ v=0 \text{ on } \Gamma_1\}$ such that
\[
\Big(\frac{\partial y^n}{\partial t},v\Big)+\lambda\,(\nabla y^n,\nabla v)+(\beta\cdot\nabla y^n,v)=(F^n,v),
\tag{37}
\]
\[
-\Big(\frac{\partial q^n}{\partial t},v\Big)+\lambda\,(\nabla q^n,\nabla v)+(\beta\,q^n,\nabla v)=0,
\tag{38}
\]
for all $v\in V$, with the initial and final conditions and "final datum"
\[
y^n(x,0)=0;\qquad q^n(x,T)=y^{n-1}(x,T;u)-z_d;\qquad y^0(\cdot,T;u)\ \text{given in } H^1(\Omega),
\]
where
\[
F^n=\Big[\psi+\Big(\frac1N\int_\Omega q^n\,\phi\,dx+\psi\Big)^-\Big]\phi.
\]
The approximated problems, semidiscrete and fully discretized by finite elements, are defined in the usual way (see Hughes⁴). We consider the finite-dimensional subspace $V_h(\Omega)\subset C^0(\Omega)$ given by
\[
V_h=\{v_h\in V;\ v_h^e\in P^1(\Omega^e)\},
\]
where $v_h^e$ is the restriction of $v_h$ to the element "$e$", and $P^1(\Omega^e)$ is the set of linear polynomials defined on $\Omega^e$. We apply the Streamline-Upwind/Petrov-Galerkin (SUPG) formulation in this subspace. For the time discretization, we use the implicit Euler finite difference method to approximate the terms $y_t$ and $q_t$. We divide the interval $[0,T]$ into subintervals $[t_{j-1},t_j]$, where $t_j=j\,\Delta t$, $j=1,\dots,k$, with $t_0=0$ and $t_k=T$. The approximations of the terms $y_t$ and $q_t$ are given by
\[
\frac{\partial y(x,t_j)}{\partial t}=\frac{y_j-y_{j-1}}{\Delta t}\quad\text{and}\quad\frac{\partial q(x,t_j)}{\partial t}=\frac{q_{j+1}-q_j}{\Delta t}.
\]
Therefore, the fully discretized problem associated with (37)-(38) consists of: given $j=1,\dots,k$, find $y_{h,j}^n,\ q_{h,j}^n\in V_h\subset V$ such that
\[
\begin{aligned}
&\Big(\frac{y_{h,j}^n}{\Delta t},v_h\Big)+\lambda\,(\nabla y_{h,j}^n,\nabla v_h)+(\beta\cdot\nabla y_{h,j}^n,v_h)
+\sum_{e=1}^{N_e}\int_{\Omega^e}\Big(\frac{y_{h,j}^n}{\Delta t}-\lambda\Delta y_{h,j}^n+\beta\cdot\nabla y_{h,j}^n\Big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e\\
&\qquad=\Big(\frac{y_{h,j-1}^n}{\Delta t}+F_{h,j}^n,v_h\Big)+\sum_{e=1}^{N_e}\int_{\Omega^e}\Big(\frac{y_{h,j-1}^n}{\Delta t}+F_{h,j}^n\Big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e,\\[4pt]
&\Big(\frac{q_{h,j}^n}{\Delta t},v_h\Big)+\lambda\,(\nabla q_{h,j}^n,\nabla v_h)+(\beta\,q_{h,j}^n,\nabla v_h)
+\sum_{e=1}^{N_e}\int_{\Omega^e}\Big(\frac{q_{h,j}^n}{\Delta t}-\lambda\Delta q_{h,j}^n-\beta\cdot\nabla q_{h,j}^n\Big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e\\
&\qquad=\Big(\frac{q_{h,j+1}^n}{\Delta t},v_h\Big)+\sum_{e=1}^{N_e}\int_{\Omega^e}\frac{q_{h,j+1}^n}{\Delta t}\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e,
\end{aligned}
\]
for all $v_h\in V_h$, with the initial and final conditions and "final datum"
\[
y_{h,0}^n(x)=0,\qquad q_{h,k}^n(x)=y_{h,k}^{n-1}(x;u)-z_d,\qquad y_{h,k}^0(\cdot)\ \text{given in } H^1(\Omega),
\]
where the parameter $\tau$ is given by $\tau=\dfrac{h}{2|\beta|}$.

7. Computational Results, Analysis and Conclusion

The acceptable level $z_d$ of mercury (Hg) in drinking water is given in Table 1 (see CONAMA³ and Pichard¹¹).

Table 1: Values used in the graphs.

Quantity                                Value
Diffusion coefficient of Hg (cm²/s)     λ = 6 × 10⁻⁶
Acceptable level of Hg (mg/cm³)         z_d = 2 × 10⁻⁶ sin(πx/L) sin(πy/L)
Velocity field (cm/s)                   β = (10⁻⁴ [1 − ((y − L/2)/(L/2))²], 0)
Cost constant                           N = 15 × 10⁴
Final time (s)                          T = 8640000
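As a quick plausibility check on these data, the sketch below computes the SUPG parameter $\tau=h/(2|\beta|)$ at the peak velocity and the corresponding element Péclet number. The mesh size $h=1$ cm and the time step $\Delta t=21600$ s with 400 steps are the values quoted later in this section. The formula $|\beta|h/(2\lambda)$ for the element Péclet number is the standard definition and is an assumption here, since the text does not state it.

```python
# Values quoted in Table 1 and in the mesh description of this section.
lam = 6e-6          # diffusion coefficient of Hg, cm^2/s
beta_max = 1e-4     # peak convective speed, cm/s (on the centre line y = L/2)
h = 1.0             # element size, cm
dt = 21600.0        # time step, s
n_steps = 400       # number of time steps

tau = h / (2.0 * beta_max)             # SUPG parameter tau = h / (2|beta|), in s
peclet = beta_max * h / (2.0 * lam)    # element Peclet number (assumed standard definition)
T = n_steps * dt                       # final time, s
days = T / 86400.0                     # final time, days

print(f"tau = {tau:.1f} s, Peclet = {peclet:.2f}, T = {T:.0f} s = {days:.0f} days")
```

With these numbers the element Péclet number is well above 1 where the flow is fastest, which is the convection-dominated regime in which a stabilized formulation such as SUPG is typically preferred over the plain Galerkin method. The product of the time step and the number of steps also reproduces the final time of Table 1.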


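Resolution Algorithm

Given $y_{h,k}^0$.
For $n=1,\dots,N_p$:
  Set $q_{h,k}^n=y_{h,k}^{n-1}-z_d$.
  For $j=k,\dots,0$, find $q_{h,j}^n\in V_h$ such that
\[
\begin{aligned}
&(q_{h,j}^n,v_h)+\lambda\,\Delta t\,(\nabla q_{h,j}^n,\nabla v_h)+\Delta t\,(\beta\,q_{h,j}^n,\nabla v_h)
+\sum_{e=1}^{N_e}\int_{\Omega^e}\big(q_{h,j}^n-\lambda\,\Delta t\,\Delta q_{h,j}^n-\Delta t\,\beta\cdot\nabla q_{h,j}^n\big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e\\
&\qquad=(q_{h,j+1}^n,v_h)+\sum_{e=1}^{N_e}\int_{\Omega^e}q_{h,j+1}^n\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e,\qquad\forall\,v_h\in V_h.
\end{aligned}
\]
  Set $y_{h,0}^n=0$.
  For $j=1,\dots,k$, find $y_{h,j}^n\in V_h$ such that
\[
\begin{aligned}
&(y_{h,j}^n,v_h)+\lambda\,\Delta t\,(\nabla y_{h,j}^n,\nabla v_h)+\Delta t\,(\beta\cdot\nabla y_{h,j}^n,v_h)
+\sum_{e=1}^{N_e}\int_{\Omega^e}\big(y_{h,j}^n-\lambda\,\Delta t\,\Delta y_{h,j}^n+\Delta t\,\beta\cdot\nabla y_{h,j}^n\big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e\\
&\qquad=(\Delta t\,F_{h,j}^n+y_{h,j-1}^n,v_h)+\sum_{e=1}^{N_e}\int_{\Omega^e}\big(\Delta t\,F_{h,j}^n+y_{h,j-1}^n\big)\,\tau\,\beta\cdot\nabla v_h\,d\Omega^e,\qquad\forall\,v_h\in V_h,
\end{aligned}
\]
  where $F_{h,j}^n=\Big[\psi_j+\Big(\dfrac1N\displaystyle\int_\Omega q_{h,j}^n\,\phi_h\,dx+\psi_j\Big)^-\Big]\phi_h$.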
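The structure of this iteration can be illustrated on a much simpler problem. The sketch below is not the authors' Fortran implementation: it is a one-dimensional analogue that replaces the SUPG finite element discretization by central finite differences, uses homogeneous Dirichlet conditions at both ends instead of the mixed boundary conditions, takes $w\equiv0$ and $\psi\equiv0$, and uses hypothetical nondimensional data rather than the values of Table 1. Only the loop structure (adjoint backward in time, control update from the projection formula, state forward in time) follows the algorithm above.

```python
import numpy as np

# Hypothetical nondimensional data for a 1D analogue (not the values of Table 1).
L, T = 1.0, 1.0
lam, beta, N = 0.1, 0.5, 50.0
nx, nt = 49, 50                       # interior nodes, time steps
dx, dt = L / (nx + 1), T / nt
x = np.linspace(dx, L - dx, nx)

# Finite-difference operators with homogeneous Dirichlet data at both ends
# (a simplification of the mixed boundary conditions used in the paper).
K = (2.0 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)) / dx**2   # -d2/dx2
C = (np.eye(nx, k=1) - np.eye(nx, k=-1)) / (2.0 * dx)                 # d/dx
M_state = np.eye(nx) + dt * (lam * K + beta * C)     # implicit Euler, state equation
M_adjoint = np.eye(nx) + dt * (lam * K - beta * C)   # implicit Euler, adjoint equation

b, width = 0.5, 0.1
phi = np.where(np.abs(x - b) <= width / 2, 1.0 / width, 0.0)   # regularized source
z_d = np.sin(np.pi * x / L)                                    # hypothetical target
psi = 0.0                                                      # lower bound on the control

def solve_adjoint(q_final):
    """March q backward from t = T; returns q at t_0, ..., t_nt."""
    q = np.empty((nt + 1, nx))
    q[nt] = q_final
    for j in range(nt - 1, -1, -1):
        q[j] = np.linalg.solve(M_adjoint, q[j + 1])
    return q

def solve_state(u):
    """March y forward from y(0) = 0 with source u_j * phi; returns y(T)."""
    y = np.zeros(nx)
    for j in range(1, nt + 1):
        y = np.linalg.solve(M_state, y + dt * u[j] * phi)
    return y

y_T = np.zeros(nx)             # initial guess y^0(., T)
u = np.zeros(nt + 1)
history = []                   # max-norm change of the control per iteration
for n in range(20):
    q = solve_adjoint(y_T - z_d)              # final datum q^n(T) = y^{n-1}(T) - z_d
    g = (q @ phi) * dx / N + psi              # (1/N) * int q phi dx + psi
    u_new = psi + np.maximum(0.0, -g)         # control from the projection formula
    y_T = solve_state(u_new)
    history.append(np.max(np.abs(u_new - u)))
    u = u_new
```

In this toy setting the change in the control between successive iterations shrinks rapidly because $N$ is large relative to the size of the source term, which is in the spirit of the hypothesis "$N$ large enough" in Theorem 5.1. No claim is made that the rates observed here carry over to the two-dimensional SUPG computation.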

We consider a square domain $\Omega=(0,L)\times(0,L)$, where $L=100$ cm and the point $(0,0)$ corresponds to $\Gamma_1\cap\Gamma_2$ (see Figure 1). We use a mesh of 10000 equal-length quadratic elements with $\Delta x=\Delta y=1$ cm and 400 time steps with fixed $\Delta t=21600$ s. The sequence of functions $\phi_l$, with compact support in $\omega\subset\Omega\subset\mathbb{R}^2$, is chosen as
\[
\phi_l(x)=\begin{cases}\dfrac{1}{l^2} & \text{if } x\in\omega,\\[4pt] 0 & \text{if } x\notin\omega,\end{cases}
\]
where $\omega$, a square of side $l$ centred at the point $b\in\Omega$, represents the place where the pollutant enters. We plot the cases of an internal pollutant source $F(x,t)=v(t)\,\phi_l(x)$, and let $l\to0$ to obtain graphs for the pointwise pollutant source $F(x,t)=v(t)\,\delta(x-b)$. Without loss of generality, we take $\psi\equiv0$ in the optimal control $u_0(t)$. To obtain the graph of the optimal cost functional $j(b)$, we vary the point $b$ in the region $\omega$ with $l=0.2$ cm. By Remark 5.1, there exists only one optimal point of $J(u(b_o))$, where the pollution is least harmful to the environment. Here, we can locate it accurately through the graph of the optimal cost functional. The computational results were obtained with a code implemented in Fortran 90, and the graphs were produced in Maple. We analyse the case of a single pollutant source in still water and in moving water. In Figure 2, we consider $\beta=(0,0)$ cm/s, that is, zero convective velocity (still water), and in Figure 3 the water is in motion. Thus, we obtain the following graphs.
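The role of the factor $1/l^2$ in $\phi_l$ is to keep the total mass of the source equal to one as the square $\omega$ shrinks, so that $\phi_l$ approximates the Dirac mass at $b$. The following minimal sketch checks this with a midpoint rule; the grid resolution, the centre $b=(50,50)$ and the first two values of $l$ are hypothetical choices made for illustration, and only $l=0.2$ cm is taken from the text.

```python
import numpy as np

def phi_l(x, y, b, l):
    """Indicator of the square of side l centred at b, scaled by 1 / l^2."""
    inside = (np.abs(x - b[0]) <= l / 2) & (np.abs(y - b[1]) <= l / 2)
    return np.where(inside, 1.0 / l**2, 0.0)

# Midpoint-rule quadrature on a uniform grid of cell centres (hypothetical grid).
L, n = 100.0, 1000                 # domain side in cm, cells per direction
h = L / n
centres = (np.arange(n) + 0.5) * h
X, Y = np.meshgrid(centres, centres, indexing="ij")

b = (50.0, 50.0)                   # hypothetical source location
sides = (10.0, 2.0, 0.2)           # decreasing side lengths l, in cm
masses = [phi_l(X, Y, b, l).sum() * h * h for l in sides]
peaks = [phi_l(X, Y, b, l).max() for l in sides]
```

On this grid the computed mass stays equal to one for each side length, while the peak value grows like $1/l^2$, which is the behaviour expected of a sequence approximating $\delta(x-b)$.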

[Figure 2. j(b): Optimal Cost Functional. Surface plot of j(b) over (b_1, b_2), each ranging from 0 to 100; vertical axis from 9.7e–09 to 1e–08.]

[Figure 3. j(b): Optimal Cost Functional. Surface plot of j(b) over (b_1, b_2), each ranging from 0 to 100; vertical axis from 1.15e–08 to 1.21e–08.]


In Figure 2, the graph of the optimal cost functional $j(b)$ reaches its minimum at a point $b_o$ located in the central region of the domain $\Omega$, and in Figure 3 we see that $j$ reaches its minimum in the central region of the boundary $\Gamma_3$ (see Figure 1). These are the optimal points, where the environment is least affected.

References
1. R. A. Adams, "Sobolev Spaces", New York, Pure Appl. Math, (1975).
2. S. F. Arantes, Tese de Doutorado: "Controle Ótimo Aplicado a Problemas de Poluição e Estabilização de Vigas Termoelásticas com Condições de Signorini", Laboratório Nacional de Computação Científica - LNCC, (2006).
3. CONAMA, "Conselho Nacional do Meio Ambiente", Diário Oficial da União 30/07/1986, Res. N. 20 de 18 de junho, 1-20.
4. T. J. R. Hughes, "The Finite Element Method: Linear Static and Dynamic Finite Element Analysis", New York, Dover Publications, INC. Mineola, (2000).
5. H. R. Joshi, "Control of the Convective Velocity Coefficient in a Parabolic Problem", Elsevier, Nonlinear Analysis, 63, e1383-e1390, (2005).
6. J. L. Lions, "Optimal Control of Systems Governed by Partial Differential Equations", New York, Springer-Verlag, (1971).
7. J. L. Lions, "Some Aspects of the Optimal Control of Distributed Parameter Systems", Université de Paris and I.R.I.A., Reg. Conf. Series in Applied Mathematics, (1972).
8. J. L. Lions, "Function Spaces and Optimal Control of Distributed Systems", Rio de Janeiro, Universidade Federal do Rio de Janeiro - UFRJ, (1980).
9. J. L. Lions, "Some Methods in the Mathematical Analysis of Systems and their Control", Beijing - China, Science Press, (1981).
10. J. E. Muñoz Rivera, "Differentiability of the Optimal Cost Function in Pointwise Control", Differential and Integral Equations, 5, 871-889, (1992).
11. A. Pichard, "Mercury and its Derivatives", Institut National de L'Environnement Industriel et des Risques - INERIS, Version No. 1, 1-45, (2000).
12. H. Schlichting, "Boundary-Layer Theory", Seventh Edition, New York, McGraw-Hill, Series in Mechanical Engineering (1979).

May 10, 2013

9:4

BC: 8846 - BIOMAT 2012

21˙barbosa

A SENSITIVITY ANALYSIS OF GENE EXPRESSION MODEL

N. A. BARBOSA∗ Universidad Nacional de Colombia Carrera 45 No. 26-85, Building 411 Of. 203A E-mail: [email protected]

The interactions of the tetracycline repressor/activation system have been modeled successfully using differential equations. These interactions between different molecule populations depend on several parameters and on their relations with the model variables. This paper starts from a four-compartment model that describes the process of tetracycline-mediated gene expression and performs a sensitivity analysis in order to determine a set of parameters that are crucial to the dynamics of the process. It shows how to evaluate analytically the sensitivity of a continuous and differentiable system to its parameter values along a specific trajectory, using differential equations on an augmented system. These trajectory sensitivities provide insights into system behavior that cannot be obtained from traditional simulation.

1. Introduction

Tetracyclines have been widely used to induce gene expression²,⁴,⁸. Although this process is known to be effective, little progress has been made in modeling and understanding the role that each protein concentration plays in the gene repression/expression system. The National University of Colombia is working on achieving graded gene expression in a genetically modified cell line of monocytes. Previous work developed a mathematical model¹, and in this work we evaluate the sensitivity of the protein concentrations to the model parameters. Sensitivity analysis is a technique used to determine how different values of an independent variable (such as a parameter) affect a particular dependent variable under a given set of assumptions. This technique is used within specific boundaries that depend on one or more input variables and initial conditions. The outcome of this analysis is expected to provide insights on how to proceed with the experiment.

∗ Nathalie Andrea Barbosa Roa is an electronic engineer and PhD student at the National University of Colombia, and holds a master's degree (Magister) in industrial automation from the National University of Colombia (2011).

2. Mathematical Model of a graded gene expression system

Considerable effort has recently been made to build models of gene expression that could help to discover structures or to interpret its behavior. Over the past decades several gene activation models have been developed⁵,⁶,⁹,¹⁰. This work is part of a project that seeks to achieve, model and control gene expression in a dose-dependent, graded manner. Graded changes have already been achieved by several authors³,¹² but have not yet been modeled or controlled. In this work, we take the model proposed by Barbosa et al.¹. This model describes the process of tetracycline-controlled gene regulation in a genetically modified cell line of monocytes. These cells were modified to produce a constant amount of tetracycline repressor protein at steady state, denoted by TotTetR. To regulate gene expression, a pretranscriptional method can be used⁹. In this technique, a repressor is used to inhibit promoter detection and hence protein production. One of the common inducer-repressor pairs used for gene silencing is formed by tetracycline and the tetracycline repressor⁷,¹³. The cell line exhibits the following behavior. In the absence of inducer (tetracycline), the repressor TetR binds to the operator region tetO and shuts down the transcription of its own gene and of the adjacent promoter, which in this case is that of GFP. In the absence of mRNA, translation cannot proceed and therefore protein expression does not occur. Once added, the inducer promotes a conformational change in TetR, resulting in its dissociation from the operator tetO and its subsequent binding to tetracycline (Tet) molecules, allowing transcription². It should be noted that in the presence of tetracycline, TotTetR ≠ TetR, implying that the amount of active repressor (TetR) is less than the total amount of repressor produced by the cell.
Using the Hill function and simple mass law it was developed a simple gen expression mathematical model of the system. This model can be understood through the interactions of four cell groups: the tetracycline repressor (T etR), the tetracycline (T et), the green ﬂuorescent protein (GF P ) and the corresponding messenger RN A (mRN AGF P ). The interactions of (T etR), (T et), (GF P ) and (mRN AGF P ) are illustrated in ﬁgure 1. Protein and mRNA concentrations are treated as continuous dynamical variables

May 10, 2013

9:4

BC: 8846 - BIOMAT 2012

21˙barbosa

366

Figure 1. Tetracycline controlled activation/repression system diagram.

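and their interactions are modeled by the four-compartment model presented in equations (1) to (4).

$$\frac{dTetR}{dt} = -\delta_3\,TetR - \alpha\,TetR\,Tet + \eta\,mRNA_{TetR} + \nu\,(TotTetR - TetR) \tag{1}$$

$$\frac{dmRNA_{GFP}}{dt} = -\delta_1\,mRNA_{GFP} + \frac{\beta_1}{1 + \left(\frac{TetR}{K}\right)^{n}} \tag{2}$$

$$\frac{dGFP}{dt} = -\delta_2\,GFP + \gamma\,mRNA_{GFP} \tag{3}$$

$$\frac{dTet}{dt} = -\delta_{Tet}\,Tet - \alpha\,TetR\,Tet + \zeta u + (\delta_3 + \nu)\,(TotTetR - TetR) \tag{4}$$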
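As an illustration only, equations (1) to (4) can be integrated numerically with a few lines of standard-library Python. The sketch below uses the nominal values of Table 1 and a fixed-step Runge-Kutta scheme. Two things in it are assumptions rather than part of the original model description: mRNA_TetR is treated as a constant chosen so that TetR = TotTetR is an equilibrium when no tetracycline is present, and the repressed starting state is computed from the equations instead of being copied from the text. The function names (`rhs`, `repressed_state`, `simulate`) are hypothetical.

```python
# Nominal parameter values as listed in Table 1 (time in minutes).
P = dict(delta1=0.06931, delta2=0.00193, delta3=0.01733, delta_tet=0.00136,
         zeta=20000.0, alpha=1.0e-6, eta=3.0, K=299.7576, n=2.0,
         beta1=1.0, gamma=20.0, nu=0.1)
TOT_TETR = 2.9975761e3  # total repressor, taken from the initial conditions in the text


def rhs(state, u, p=P, tot=TOT_TETR):
    """Right-hand side of equations (1)-(4); state = (TetR, mRNA_GFP, GFP, Tet)."""
    tetr, mrna_gfp, gfp, tet = state
    # ASSUMPTION (not from the text): mRNA_TetR is a constant chosen so that
    # TetR = TotTetR is an equilibrium when no tetracycline is present.
    mrna_tetr = p["delta3"] * tot / p["eta"]
    d_tetr = (-p["delta3"] * tetr - p["alpha"] * tetr * tet
              + p["eta"] * mrna_tetr + p["nu"] * (tot - tetr))
    d_mrna = -p["delta1"] * mrna_gfp + p["beta1"] / (1.0 + (tetr / p["K"]) ** p["n"])
    d_gfp = -p["delta2"] * gfp + p["gamma"] * mrna_gfp
    d_tet = (-p["delta_tet"] * tet - p["alpha"] * tetr * tet + p["zeta"] * u
             + (p["delta3"] + p["nu"]) * (tot - tetr))
    return (d_tetr, d_mrna, d_gfp, d_tet)


def repressed_state(p=P, tot=TOT_TETR):
    """Equilibrium of the sketch for u = 0 and Tet = 0, derived from (1)-(4)."""
    mrna = p["beta1"] / (p["delta1"] * (1.0 + (tot / p["K"]) ** p["n"]))
    return (tot, mrna, p["gamma"] * mrna / p["delta2"], 0.0)


def simulate(state, u, t_end, dt=0.05):
    """Classical fixed-step RK4 integration with a constant input u."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, u)
        k2 = rhs(shift(state, k1, dt / 2), u)
        k3 = rhs(shift(state, k2, dt / 2), u)
        k4 = rhs(shift(state, k3, dt), u)
        state = tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state


if __name__ == "__main__":
    # Constant dose u = 1 applied to the repressed state for 600 minutes.
    final = simulate(repressed_state(), u=1.0, t_end=600.0)
    print(dict(zip(("TetR", "mRNA_GFP", "GFP", "Tet"), final)))
```

The step size of 0.05 min is a conservative choice for an explicit scheme with these rate constants; a stiff solver would be a reasonable alternative for long simulations.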

Equation (1) represents the dynamical behavior of the 'active' tetracycline repressor. The concentration of this protein increases in proportion to the amount of mRNA_TetR, with scale factor η. It decreases in proportion to the amount of tetracycline (Tet) available (at an encounter rate α) and also through protein degradation at rate δ3. The dynamical behavior of the mRNA_GFP concentration is modeled in equation (2). The production of this mRNA is repressed by TetR, as shown in the second term; in other words, TetR inhibits the transcription of GFP by diminishing the amount of mRNA_GFP produced. In the absence of TetR, the number of protein copies per cell produced from its promoter during continuous growth is β1; in the presence of TetR this amount decays to a minimal production rate, and the amount of GFP protein produced at this rate is called the basal expression level. GFP dynamics are modeled in equation (3). The concentration of GFP decays by degradation at rate δ2. Finally, equation (4) models how active tetracycline decays for two main reasons: first, degradation, and second, deactivation due to Tet binding to TetR. It is worth noting that all four equations consider only degradation


due to half-life, that is, the amount of time needed for the concentration to reach half of its asymptotic value due to natural degradation only.

3. Sensitivity Analysis

Sensitivity analysis is very useful for determining the impact that a change in a particular variable or parameter will have on the actual outcome, that is, when the variable or parameter differs from the value that was previously assumed. By creating a given set of scenarios, the analyst can determine how changes in one variable (or parameter) will impact the output variable. This analysis shows which parameters are critical to system behavior, so that these parameters can be given priority in laboratory identification experiments. Sensitivity analysis can also be used to validate a model, point out important assumptions, help formulate model structure, simplify a model, warn of unrealistic model behavior, suggest new experiments, guide future data collection efforts, suggest the accuracy needed when calculating parameters, and choose an operating point, among other critical actions.

To examine the sensitivity of the variables to the parameters, an analytic approach is chosen and a new differential equation system is used. In this new system, a set of differential equations is added to the original system equations; these new equations represent the change in the output variables due to parameter changes. In order to run the sensitivity analysis we have to build two matrices. The first matrix, denoted by W, represents the dynamic change in a population due to changes in the variables. The second matrix, denoted by V, represents the change in the output variables due to parameter changes. Matrices W and V are shown in equation (5).

$$W(t,\lambda) = \left.\frac{\partial f(t,x,\lambda)}{\partial x}\right|_{x=x(t,\lambda)}, \qquad V(t,\lambda) = \left.\frac{\partial f(t,x,\lambda)}{\partial \lambda}\right|_{x=x(t,\lambda)} \tag{5}$$

The procedure for calculating the sensitivity function S(t) for a continuous system, based on the methodology shown in [11], is the following. First, the Jacobian matrices V and W are evaluated. Second, an augmented equation system of the form shown in (6) is created. Finally, the augmented equation system is solved, which yields the sensitivity function and the nominal solution at the same time.

$$\begin{aligned} \dot{x} &= f(t,x,\lambda), & x(t_0) &= x_0 \\ \dot{x}_\lambda &= W(t,\lambda)\,x_\lambda(t,\lambda) + V(t,\lambda), & x_\lambda(t_0,\lambda) &= 0 \end{aligned} \tag{6}$$
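The three steps above can be sketched on a toy problem. The example below is not the gene expression model: it uses the scalar system ẋ = −λx, chosen only because its sensitivity is known in closed form, x_λ(t) = −t·x0·e^(−λt). Here W = ∂f/∂x = −λ and V = ∂f/∂λ = −x, and the augmented system (6) is integrated with a fixed-step Runge-Kutta scheme. The function names are hypothetical.

```python
import math


def augmented_rhs(state, lam):
    """Augmented system (6) for the toy model f(t, x, lam) = -lam * x."""
    x, x_lam = state
    w = -lam          # W = df/dx evaluated on the nominal trajectory
    v = -x            # V = df/dlam evaluated on the nominal trajectory
    return (-lam * x, w * x_lam + v)


def rk4(rhs, state, lam, t_end, dt=0.01):
    """Fixed-step RK4 for the nominal state and its sensitivity together."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, lam)
        k2 = rhs(shift(state, k1, dt / 2), lam)
        k3 = rhs(shift(state, k2, dt / 2), lam)
        k4 = rhs(shift(state, k3, dt), lam)
        state = tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state


if __name__ == "__main__":
    lam, x0, t_end = 0.5, 1.0, 2.0
    # Initial sensitivity is zero because x0 does not depend on lam.
    x, x_lam = rk4(augmented_rhs, (x0, 0.0), lam, t_end)
    print("numerical sensitivity:", x_lam)
    print("analytic sensitivity: ", -t_end * x0 * math.exp(-lam * t_end))
```

For a model with several states and parameters, x_λ becomes a matrix with one column per parameter, and each column obeys the same linear equation driven by the corresponding column of V.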

The function xλ is called the sensitivity function, and equation (6) is


Table 1. Simulation model parameters.

Parameter   Value         Units
δ1          0.06931       min⁻¹
δ2          0.00193       min⁻¹
δ3          0.01733       min⁻¹
δ_Tet       0.00136       min⁻¹
ζ           20000         molec/dose
α           1.0 × 10⁻⁶    (molec · min)⁻¹
η           3             molec/transcript
K           299.7576      molec
n           2             dimensionless
β1          1             molec
γ           20            molec/transcript
ν           0.1           min⁻¹

called the sensitivity equation. Sensitivity functions provide first-order estimates of the effect of parameter variations on solutions. Furthermore, each sensitivity function represents the effect of the variation of one parameter or variable on one particular population or output, for one specific trajectory. Note that trajectory sensitivities are generally time-varying quantities that depend on initial conditions and input values. The following section presents sensitivity analysis results for several trajectories associated with GFP tetracycline-activated expression.

4. Results

MATLAB® software is used to evaluate the sensitivity for several trajectories. The nominal parameters used in this analysis are shown in Table 1. Four different trajectories were evaluated:

(1) The change from GFP repression to expression (Tet dose)
(2) Steady GFP repression (Tet = 0)
(3) The change from GFP expression to repression (Tet wash out)
(4) Multiple changes between repression and expression (multiple Tet dosages)

The results for each of these trajectories are presented next.

Trajectory 1: The trajectory given by the transition from the repression state to the maximum production state. The nominal system is given by equations (1) to (4) with the parameter values shown in Table 1. The initial conditions are TetR = TotTetR = 2.9975761 × 10³, mRNA_GFP = 1.4426489 × 10⁻³, GFP = 14.9863983 and Tet = 0; these values correspond to the steady-state system values


Figure 2. Trajectory from GFP repression to expression due to Tet dosage. Trajectory 1.

with zero input. The system input is defined as u = 1, which represents a constant dose of tetracycline in the system. The trajectory associated with this test is shown in figure 2. After the augmented equation system is evaluated for the given conditions, the sensitivity charts are obtained. The time integral of the sensitivity evaluated along the given trajectory, for each variable and parameter, is shown in figure 3 as a three-variable bar chart. The x axis shows the variables of the model and the y axis shows the parameters. The z axis shows the sensitivity as a dimensionless quantity. From figure 3 it is deduced that, in the step from no protein production to production, the most sensitive parameters are the rate constant for the Tet-TetR binding forward reaction, α, and the degradation rates of the molecules. Among the latter, the most crucial is that associated with tetracycline, δ_Tet.

Trajectory 2: The trajectory given by steady GFP repression. When the nominal system is analyzed using the steady-state system values with zero input as initial conditions, as in the previous trajectory, and with no tetracycline dosage, the system keeps the initial values in a constant trajectory, as seen in figure 4. The sensitivity analysis in this case shows δ3, the protein degradation rate, to be the most influential parameter; see figure 5. This result says that a variation in the TetR degradation rate will change the steady-state values the most, first of the Tet concentration and second of the TetR concentration. GFP production and its mRNA concentration are in this case unresponsive to parameter variations.

Trajectory 3: The trajectory given by the transition from the production state to the repression state. Having a full expression system requires a continuous


Figure 3. Time integral of the sensitivity functions for each pair of parameter variation and variable repercussion, for trajectory 1.

Figure 4. Trajectory for GFP constant expression. Trajectory 2. Panels: (a) Tet, (b) GFP; vertical axes in molecules, horizontal axis time (hours).

Tet dose; if the inducer dosage stops, the system will migrate to its original equilibrium point, showing no GFP expression. The initial conditions for this


Figure 5. Time integral of the sensitivity functions for each pair of parameter variation and variable repercussion, for trajectory 2.

Figure 6. Trajectory from GFP expression to repression due to the Tet concentration falling to zero. Trajectory 3. Panels: (a) Tet, (b) GFP; vertical axes in molecules, horizontal axis time (hours).

analysis are TetR = TRTot = 23.709216, mRNA_GFP = 3.579701, GFP = 9.219942 × 10³ and Tet = 1.471649 × 10⁷; these values correspond to the steady-state system values with u = 1. The input is set to zero. The trajectory associated with this case is shown in figure 6, and the bar chart showing the absolute value of the


Figure 7. Time integral of the sensitivity functions for each pair of parameter variation and variable repercussion, for trajectory 3.

sensitivity values is shown in figure 7. In this case, as in case 1, the sensitivity analysis shows that α is the parameter whose variation affects the output the most, followed by the Tet degradation rate.

Trajectory 4: The trajectory given by multiple expression-repression changes due to an intermittent Tet dosage. In practice the repression-activation system will be switching between these two states, so multiple variations have to be tested too. This case shows a 2-hour Tet dose on day five (u = 1 in this period), followed by a wait of five and a half days until the next 2-hour dose. The initial conditions are TetR = TRTot = 2.9975761 × 10³, mRNA_GFP = 1.4426489 × 10⁻³, GFP = 14.9863983 and Tet = 0. The trajectory associated with this behavior is shown in figure 8. Sensitivity analysis over trajectory 4 shows that α is the parameter whose variation affects the output the most; a change in this parameter also affects the TetR and Tet concentrations. The latter variable is also substantially affected by changes in δ_Tet.

5. Conclusions

The sensitivity analysis performed on the tetracycline repression/activation system brought to light several findings:

• The parameter α, which represents the rate constant for the Tet-TetR binding forward reaction, is the key parameter in this model, since a small variation in it could trigger a huge variation in the output. As presumed


Figure 8. Trajectory from GFP repression to expression due to multiple Tet dosages. Trajectory 4. Panels: (a) Tet, (b) GFP; vertical axes in molecules, horizontal axis time (hours).

Figure 9. Time integral of the sensitivity functions for each pair of parameter variation and variable repercussion, for trajectory 4.


at the beginning, the high affinity between tetracycline, Tet, and the repressor, TetR, is the element that makes repression and subsequent activation possible. This suggests that experimental efforts should focus on collecting data to estimate this parameter and, if possible, on measuring it, in order to tune the model to the experimental case.

• Degradation rates should also be measured experimentally, since this measurement is not complicated and these parameters proved to be important as well. Measuring them directly will be quite helpful, so that the remaining parameters can be estimated from measurements of the output variable using approximation techniques.

• Parameter estimation should be more accurate for α and the degradation rates, and can be looser for the number of proteins produced per mRNA molecule (η, γ), the Hill coefficient (n) and the amount of repressor necessary to half-maximally repress a promoter (K).

• Sensitivity analysis showed that major changes in the size of the dose (ζ) are needed to produce changes in the output. This result is contrary to what was initially believed, which suggests including a dose-change experiment in order to verify it.

References

1. N. A. Barbosa, H. Díaz and A. Ramirez. Two different approaches for gene expression model control. In Rubem P. Mondaini, editor, BIOMAT 2011, chapter Optimal Control Techniques in Mathematical Modelling of Biological Phenomena, pages 153–177. World Scientific, 2011.
2. C. Berens and W. Hillen. Gene regulation by tetracyclines. European Journal of Biochemistry, 270(15):3109–3121, 2003.
3. S.R. Biggar and G.R. Crabtree. Cell signaling can direct either binary or graded transcriptional responses. Science's STKE, 20(12):3167, 2001.
4. S. Buates, X. Xuan, M. Igarashi, C. Sugimoto and N. Inoue. The influence of the regulation of toxoplasma gondii tgmic2 transgene on host cell infection. 2008.
5. D. Chandran, WB Copeland, SC Sleight and HM Sauro. Mathematical modeling and synthetic biology. Drug Discovery Today: Disease Models, 5(4):299–309, 2009.
6. Ting Chen, Hongyu L. He and George M. Church. Modeling gene expression with differential equations. In Pac. Symp. Biocomput, pages 29–40, 1999.
7. Michael B. Elowitz and Stanislas Leibler. A synthetic oscillatory network of transcriptional regulators. Letters to Nature, 443:335–338, 2000.
8. C. Gatz and P.H. Quail. Tn10-encoded tet repressor can regulate an operator-containing plant promoter. Proceedings of the National Academy of Sciences, 85(5):1394, 1988.
9. M. Gossen and H. Bujard. Tight control of gene expression in mammalian cells by tetracycline-responsive promoters. Proceedings of the National Academy of Sciences, 89(12):5547, 1992.


10. JL Hargrove and FH Schmidt. The role of mRNA and protein stability in gene expression. FASEB J., 3(12):2360–2370, October 1989.
11. H. Khalil. Nonlinear Systems. Prentice Hall, 1996.
12. H. Niwa, J. Miyazaki, A.G. Smith, et al. Quantitative expression of oct3/4 defines differentiation, dedifferentiation or self-renewal of es cells. Nature genetics, 24(4):372–376, 2000.
13. Kavita Iyer Ramalingam, Jonathan R. Tomshine, Jennifer A. Maynard and Yiannis N. Kaznessis. Forward engineering of synthetic biological and gates. Biochemical Engineering Journal, 47:38–47, 2009.

May 9, 2013

12:12

BC: 8846 - BIOMAT 2012

22˙salcedo

A WAVELET-BASED TIME-VARYING IRREGULAR VECTOR AUTOREGRESSIVE MODEL∗

G. E. SALCEDO AND O.E. MOLINA University of Quindio, Carrera 15, Calle 12N, Armenia, Colombia E-mail: [email protected] R. F. PORTO Bank of Brazil, SBS Quadra 1, Bloco C, Lote 32, Brasilia-DF, Brazil E-mail: [email protected]

In this paper we propose a wavelet-based time-varying irregular vector autoregressive (tv-IVAR) model in order to explain the relationships among a set of unequally spaced time series that can be stationary or not. The elements of the autoregression matrices are functions that depend on both the time and the data irregularity. The estimation procedure is made by least squares after applying a previous wavelet basis expansion of these functions and a truncation at a suitable resolution level. We also present some simulations in order to evaluate both the estimation method and the model behavior on finite samples. Applications to a set of irregularly observed biological data are provided as well.

1. Introduction

Theory, models and applications of time series have evolved considerably during the last decade due to the development of modern mathematical and statistical theories, new application challenges and especially the development of, and easy access to, new computational tools. When samples are regularly observed over time we say the time series is regular, but in many real situations it is not possible to obtain equally spaced observations. In these cases we say the time series is irregular or unequally spaced. Models for time series have mostly been aimed at univariate stationary time series equally spaced in time.

∗ This work is supported by Colciencias and the University of Quindio, Colombia.


On the other hand, for multivariate time series, when we want to study several time series, or a vector series, and to know the relationships among them, vector autoregressive (VAR) models have been a useful tool, in part because the estimation, interpretation and statistical properties of VAR models are reasonably easy to obtain. Furthermore, Granger causality [12] can be extracted from the structure of the VAR model. As in the univariate case, the VAR model was developed for stationary and regular time series. Usually, when time series are non-stationary, an appropriate transformation can be applied to the vector series through a differentiation matrix. However, when the non-stationarity is due to second-order structures that change in time, these transformations make no sense. Autoregressive models with functional parameters [1,2,7,15] are an alternative family of appropriate models for time series with some non-stationary behaviour.

There are just a few approaches to irregular time series modeling. Some focus only on deterministic trends and consider smoothing techniques, while other approaches consider irregular time series as data with missing observations (see [3,4,5,9,10,16], for instance). In [11] and [16,17] the authors used state space representations to fit continuous-time autoregressions to unequally spaced time series. If we have a non-stationary time series that is also irregularly spaced in time, the modeling is further complicated. For this situation Salcedo et al. [14] propose a wavelet-based time-varying irregular autoregressive model. This model incorporates both the non-stationarity and the irregularity through functional parameters and appropriate indexes, respectively, but for univariate series only.

In this paper we propose an autoregressive model with time-varying parameters in order to model the relationships within a non-stationary and irregular vector time series. As in [14], the idea is not to transform the data but instead to incorporate both phenomena in the model, where the non-stationarity is explained by the functional parameters and the irregularity is made explicit by the functional indexes. This model can be considered an extension of the model in [14] to the case of vector time series, as well as an extension of the model in [15] to irregular time series.

The paper is organized as follows. Section 2 briefly describes wavelet bases and functional wavelet expansions. Section 3 introduces the tv-IVAR model and the estimation procedure. A simulation study is given in Section 4 and an application is presented in Section 5. Finally, some conclusions are given in Section 6.


2. Wavelets and wavelet approximations

An orthonormal wavelet basis is generated from dilation and translation of a "father" wavelet φ and a "mother" wavelet ψ. These functions are assumed to be compactly supported in [0, T] and φ satisfies $\int \phi = 1$. A wavelet is r-regular if it has r vanishing moments and r continuous derivatives. A wavelet ψ satisfies the admissibility condition if it is 1-regular. Let

$$\phi_{j,k}(t) = 2^{j/2}\phi(2^{j}t - k) \quad \text{and} \quad \psi_{j,k}(t) = 2^{j/2}\psi(2^{j}t - k), \qquad j, k \in \mathbb{Z},$$

where $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$ are the scaling and the wavelet functions, respectively, at level j and translation index k. Thus, $\psi_{j,k}$ has support $[2^{-j}k,\ 2^{-j}(T + k)]$. Notice that the supports of the wavelets are translated by shifts of $2^{-j}$. Also, the periodized wavelets (see [6]) are given by

$$\tilde{\phi}_{j,k}(t) = \sum_{l \in \mathbb{Z}} \phi_{j,k}(t - l) \quad \text{and} \quad \tilde{\psi}_{j,k}(t) = \sum_{l \in \mathbb{Z}} \psi_{j,k}(t - l)$$

for t ∈ [0, 1]. These are the wavelets that we will use in this paper, and so the superscript "∼" will be suppressed hereafter. For a given $j_0 \in \mathbb{Z}$, the collection

$$\{\phi_{j_0,k},\ k = 0, \ldots, 2^{j_0} - 1;\ \psi_{j,k},\ j \ge j_0,\ k = 0, \ldots, 2^{j} - 1\}$$

constitutes an orthonormal basis of $L_2[0, 1]$, the space of square integrable functions. Notice that for each level j we have $2^{j}$ basis functions. Such an orthogonal wavelet basis has an associated multiresolution analysis on [0, 1] that enables one to analyze the data through a number of resolution scales. Let $V_j$ and $W_j$, $j \in \mathbb{Z}$, be the closed linear subspaces generated by $\{\phi_{j,k},\ k = 0, \ldots, 2^{j} - 1\}$ and $\{\psi_{j,k},\ k = 0, \ldots, 2^{j} - 1\}$, respectively. Then,

(i) $\cdots \subset V_{j_0 - 1} \subset V_{j_0} \subset V_{j_0 + 1} \subset \cdots \subset V_j \subset \cdots$;
(ii) $\overline{\bigcup_{j=-\infty}^{\infty} V_j} = L_2[0, 1]$;
(iii) $V_{j+1} = V_j \oplus W_j$;
(iv) $W_j \perp V_j$.

Denote the usual inner product by $\langle \cdot, \cdot \rangle$. For a given square-integrable function f(t), t ∈ [0, 1], the wavelet transform is given by $c_{j_0,k} = \langle f, \phi_{j_0,k} \rangle$ and $d_{j,k} = \langle f, \psi_{j,k} \rangle$, where $c_{j_0,k}$ and $d_{j,k}$, $j \ge j_0$ and $k = 0, \ldots, 2^{j} - 1$, are the wavelet coefficients of the coarse and the detail scales, respectively.
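The definitions above can be checked numerically on the simplest orthonormal wavelet, the Haar wavelet. The sketch below is an illustration under stated assumptions, not the basis used later in the paper: it takes φ as the indicator of [0, 1) and ψ as +1 on [0, 1/2) and −1 on [1/2, 1), for which no periodization is needed on [0, 1]. Inner products are approximated by a midpoint rule on a dyadic grid, and the function names are hypothetical.

```python
def phi(t):
    """Haar father wavelet: indicator of [0, 1)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def psi(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def psi_jk(j, k, t):
    """Dilated and translated wavelet psi_{j,k}(t) = 2^(j/2) psi(2^j t - k)."""
    return 2.0 ** (j / 2.0) * psi(2.0 ** j * t - k)


def inner(f, g, n=1024):
    """Midpoint-rule approximation of the L2[0, 1] inner product <f, g>."""
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n


def haar_basis(J):
    """Basis {phi_{0,0}} + {psi_{j,k}: 0 <= j < J, 0 <= k < 2^j}, 2^J functions."""
    basis = [((-1, 0), phi)]  # index (-1, 0) follows the convention psi_{-1,0} = phi_{0,0}
    for j in range(J):
        for k in range(2 ** j):
            basis.append(((j, k), lambda t, j=j, k=k: psi_jk(j, k, t)))
    return basis


if __name__ == "__main__":
    basis = haar_basis(3)
    # Gram matrix of the basis: should be close to the identity.
    for _, f in basis:
        print(" ".join(f"{inner(f, g):5.2f}" for _, g in basis))
    # Wavelet coefficients of f(t) = t at the coarsest levels.
    print("c_00 =", inner(lambda t: t, phi), " d_00 =", inner(lambda t: t, basis[1][1]))
```

With J = 3 the basis has 2³ = 8 functions, in agreement with the count of 2^j functions per level.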


So, for a given $j_0 \ge 0$, the function f(t) can be expanded, in the $L_2$-norm sense, into an infinite wavelet series as

$$f(t) = \sum_{k=0}^{2^{j_0}-1} c_{j_0,k}\,\phi_{j_0,k}(t) + \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^{j}-1} d_{j,k}\,\psi_{j,k}(t), \qquad t \in [0, 1], \tag{1}$$

i.e., the wavelet transform decomposes the function into different resolution components. In practice, $j_0 = 0$ and the expansion in (1) is approximated by the finite summation

$$f(t) = c_{0,0}\,\phi_{0,0}(t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^{j}-1} d_{j,k}\,\psi_{j,k}(t) = \sum_{j=-1}^{J-1} \sum_{k=0}^{2^{j}-1} d_{j,k}\,\psi_{j,k}(t), \qquad t \in [0, 1], \tag{2}$$

where $d_{-1,0} = c_{0,0}$ and $\psi_{-1,0}(t) = \phi_{0,0}(t)$, and J is chosen based on the expected degree of smoothness of the function f. "Daublets", "Symmlets" and "Coiflets" are the most used wavelet bases and were introduced by Daubechies in [8]. These wavelets are orthogonal and have compact support.

3. The tv-IVAR model and estimation procedure

Let $X_{t_i,T} = [X_{1t_i,T}\ X_{2t_i,T}\ \ldots\ X_{mt_i,T}]'$ be an m-dimensional real-valued vector consisting of m univariate series $X_{lt_i,T}$, l = 1, 2, ..., m, of the same length T that were sequentially but irregularly observed at times $t_i \in [0, 1]$; i = 1, 2, ..., T = $2^{J}$, J ∈ N, with N the set of natural numbers. The time series $X_{lt_i,T}$, l = 1, ..., m, represents the l-th irregular univariate time series with observations $\{X_{lt_1}, X_{lt_2}, \ldots, X_{lt_T}\}$. We do not assume that the time series are stationary, but rather that they are a portion of a locally stationary stochastic process. The time-varying irregular vector autoregressive model of order p (p ≥ 1), denoted by tv-IVAR(p), is given by

$$X_{t_i,T} = \mu(t_i) + A_1(t_i) X_{t_{i-1},T} + A_2(t_i) X_{t_{i-2},T} + \ldots + A_p(t_i) X_{t_{i-p},T} + \varepsilon_{t_i,T}, \tag{3}$$

where $\mu(t_i)$ is the vector of intercepts, $A_1(t_i), A_2(t_i), \ldots, A_p(t_i)$ represent the p matrices of autoregressive functionals and $\varepsilon_{t_i,T}$ is the innovations vector of the model. We assume that $\varepsilon_{t_i,T}$ is normally distributed with zero mean and variance-covariance matrix Σ. More specifically, the innovations vector is given by

$$\varepsilon_{t_i,T} = [\varepsilon_{1t_i,T}\ \varepsilon_{2t_i,T}\ \ldots\ \varepsilon_{mt_i,T}]',$$


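with

$$E(\varepsilon_{t_i,T}) = 0 \quad \text{and} \quad \Sigma = E(\varepsilon_{t_i,T}\,\varepsilon_{t_i,T}') = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2).$$

Model (3) can be written as

$$\begin{bmatrix} X_{1t_i,T} \\ X_{2t_i,T} \\ \vdots \\ X_{mt_i,T} \end{bmatrix} = \begin{bmatrix} \mu_1(t_i) \\ \mu_2(t_i) \\ \vdots \\ \mu_m(t_i) \end{bmatrix} + \begin{bmatrix} a^{1}_{11}(t_i) & a^{1}_{12}(t_i) & \cdots & a^{1}_{1m}(t_i) \\ a^{1}_{21}(t_i) & a^{1}_{22}(t_i) & \cdots & a^{1}_{2m}(t_i) \\ \vdots & \vdots & \ddots & \vdots \\ a^{1}_{m1}(t_i) & a^{1}_{m2}(t_i) & \cdots & a^{1}_{mm}(t_i) \end{bmatrix} \begin{bmatrix} X_{1t_{i-1},T} \\ X_{2t_{i-1},T} \\ \vdots \\ X_{mt_{i-1},T} \end{bmatrix} + \ldots + \begin{bmatrix} a^{p}_{11}(t_i) & a^{p}_{12}(t_i) & \cdots & a^{p}_{1m}(t_i) \\ a^{p}_{21}(t_i) & a^{p}_{22}(t_i) & \cdots & a^{p}_{2m}(t_i) \\ \vdots & \vdots & \ddots & \vdots \\ a^{p}_{m1}(t_i) & a^{p}_{m2}(t_i) & \cdots & a^{p}_{mm}(t_i) \end{bmatrix} \begin{bmatrix} X_{1t_{i-p},T} \\ X_{2t_{i-p},T} \\ \vdots \\ X_{mt_{i-p},T} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t_i,T} \\ \varepsilon_{2t_i,T} \\ \vdots \\ \varepsilon_{mt_i,T} \end{bmatrix}. \tag{4}$$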

The estimation problem is to estimate all functions $\mu_l(t_i)$ and $a^{s}_{ll'}(t_i)$, l, l' = 1, 2, ..., m, s = 1, ..., p, from the observed vector time series $X_{t_i,T}$. In order to deal with the irregularity, we suppose that $t_i = H^{-1}(i/T)$, i = 1, 2, ..., T, where H is a cumulative distribution function on [0, 1]. The points $t_i$ are assumed to be fixed, not randomly drawn from H. The idea is to initially consider the compositions

$$\mu_l(t_i) = \mu_l\!\left(H^{-1}\!\left(\tfrac{i}{T}\right)\right), \qquad i = 1, 2, \ldots, T;\ l = 1, \ldots, m,$$

and

$$a^{s}_{ll'}(t_i) = a^{s}_{ll'}\!\left(H^{-1}\!\left(\tfrac{i}{T}\right)\right), \qquad i = 1, 2, \ldots, T;\ l, l' = 1, \ldots, m,$$

and then to define the corresponding equispaced functions

$$h_l(i/T) = \mu_l\!\left(H^{-1}\!\left(\tfrac{i}{T}\right)\right), \qquad l = 1, \ldots, m,$$

and

$$g^{s}_{ll'}(i/T) = a^{s}_{ll'}\!\left(H^{-1}(i/T)\right), \qquad l, l' = 1, \ldots, m,$$

such that each value $\mu_l(t_i)$ and $a^{s}_{ll'}(t_i)$ at an unequally spaced point $t_i$ is mapped into a value of the composite functions $\mu_l \circ H^{-1}$ and $a^{s}_{ll'} \circ H^{-1}$, respectively, at the equally spaced point i/T. Finally, an equispaced model equivalent to (3) is given by

$$X_{t_i,T} = h(i/T) + G_1(i/T) X_{t_{i-1},T} + G_2(i/T) X_{t_{i-2},T} + \ldots + G_p(i/T) X_{t_{i-p},T} + \varepsilon_{t_i,T}, \tag{5}$$


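or equivalently to

$$\begin{bmatrix} X_{1t_i,T} \\ X_{2t_i,T} \\ \vdots \\ X_{mt_i,T} \end{bmatrix} = \begin{bmatrix} h_1(i/T) \\ h_2(i/T) \\ \vdots \\ h_m(i/T) \end{bmatrix} + \begin{bmatrix} g^{1}_{11}(i/T) & g^{1}_{12}(i/T) & \cdots & g^{1}_{1m}(i/T) \\ g^{1}_{21}(i/T) & g^{1}_{22}(i/T) & \cdots & g^{1}_{2m}(i/T) \\ \vdots & \vdots & \ddots & \vdots \\ g^{1}_{m1}(i/T) & g^{1}_{m2}(i/T) & \cdots & g^{1}_{mm}(i/T) \end{bmatrix} \begin{bmatrix} X_{1t_{i-1},T} \\ X_{2t_{i-1},T} \\ \vdots \\ X_{mt_{i-1},T} \end{bmatrix} + \ldots + \begin{bmatrix} g^{p}_{11}(i/T) & g^{p}_{12}(i/T) & \cdots & g^{p}_{1m}(i/T) \\ g^{p}_{21}(i/T) & g^{p}_{22}(i/T) & \cdots & g^{p}_{2m}(i/T) \\ \vdots & \vdots & \ddots & \vdots \\ g^{p}_{m1}(i/T) & g^{p}_{m2}(i/T) & \cdots & g^{p}_{mm}(i/T) \end{bmatrix} \begin{bmatrix} X_{1t_{i-p},T} \\ X_{2t_{i-p},T} \\ \vdots \\ X_{mt_{i-p},T} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t_i,T} \\ \varepsilon_{2t_i,T} \\ \vdots \\ \varepsilon_{mt_i,T} \end{bmatrix}. \tag{6}$$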

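Using only one wavelet basis $\{\psi_{j,k},\ j, k \in \mathbb{Z}\}$, and assuming that $h_l(i/T) \in L_2(\mathbb{R})$, l = 1, ..., m, and $g^{s}_{ll'}(i/T) \in L_2(\mathbb{R})$, l, l' = 1, ..., m; s = 1, ..., p; we can expand each function in the following way

$$h_l(i/T) = \sum_{j=-1}^{\infty} \sum_{k=0}^{\infty} d^{\,l}_{j,k}\,\psi_{j,k}(i/T), \qquad l = 1, \ldots, m;$$

$$g^{s}_{ll'}(i/T) = \sum_{j=-1}^{\infty} \sum_{k=0}^{\infty} d^{\,s,(l,l')}_{j,k}\,\psi_{j,k}(i/T), \qquad l, l' = 1, \ldots, m;\ s = 1, \ldots, p,$$

such that model (6) can be represented by

$$\begin{aligned}
X_{t_i,T} &= \sum_{j=-1}^{\infty} \sum_{k=0}^{\infty} h_{j,k}\,\psi_{j,k}(i/T) + \sum_{s=1}^{p} \left[ \sum_{j=-1}^{\infty} \sum_{k=0}^{\infty} G^{s}_{j,k}\,\psi_{j,k}(i/T) \right] X_{t_{i-s},T} + \varepsilon_{t_i,T} \\
&= \sum_{j=-1}^{J-1} \sum_{k=0}^{2^{j}-1} h_{j,k}\,\psi_{j,k}(i/T) + \sum_{s=1}^{p} \left[ \sum_{j=-1}^{J-1} \sum_{k=0}^{2^{j}-1} G^{s}_{j,k}\,\psi_{j,k}(i/T) \right] X_{t_{i-s},T} \\
&\quad + \sum_{j \ge J} \sum_{k=0}^{\infty} h_{j,k}\,\psi_{j,k}(i/T) + \sum_{s=1}^{p} \left[ \sum_{j \ge J} \sum_{k=0}^{\infty} G^{s}_{j,k}\,\psi_{j,k}(i/T) \right] X_{t_{i-s},T} + \varepsilon_{t_i,T} \\
&= \sum_{j=-1}^{J-1} \sum_{k=0}^{2^{j}-1} h_{j,k}\,\psi_{j,k}(i/T) + \sum_{s=1}^{p} \left[ \sum_{j=-1}^{J-1} \sum_{k=0}^{2^{j}-1} G^{s}_{j,k}\,\psi_{j,k}(i/T) \right] X_{t_{i-s},T} + \upsilon_{t_i,T} + \varepsilon_{t_i,T}
\end{aligned}$$

with

$$\upsilon_{t_i,T} = \sum_{j \ge J} \sum_{k=0}^{\infty} h_{j,k}\,\psi_{j,k}(i/T) + \sum_{s=1}^{p} \left[ \sum_{j \ge J} \sum_{k=0}^{\infty} G^{s}_{j,k}\,\psi_{j,k}(i/T) \right] X_{t_{i-s},T},$$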


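where $h_{j,k}$ and $G^{s}_{j,k}$, s = 1, ..., p, represent the vectors and matrices with the wavelet coefficients of the corresponding functions in vector h(i/T) and matrices $G_s$, respectively, from model (5). Vector $\upsilon_{t_i,T}$ contains the errors due to the truncation at resolution level J − 1. If we suppose that we have a matrix x with p additional observations of each time series, i.e. $x = [x_{1t_i}\ x_{2t_i}\ \ldots\ x_{mt_i}]$ for i = 0, −1, ..., −p + 1, where $x_{lt_i} = \{X_{lt_0}, X_{lt_{-1}}, \ldots, X_{lt_{-p+1}}\}$, l = 1, 2, ..., m, we can expand model (6) for i = 1, 2, ..., T and obtain $X = \Psi D + v + \varepsilon$, where

$$X = (X_{1t_1}, X_{1t_2}, \ldots, X_{1t_T}, X_{2t_1}, X_{2t_2}, \ldots, X_{2t_T}, \cdots, X_{mt_1}, X_{mt_2}, \ldots, X_{mt_T})',$$

$$v = (\upsilon_{1t_1}, \upsilon_{1t_2}, \ldots, \upsilon_{1t_T}, \upsilon_{2t_1}, \upsilon_{2t_2}, \ldots, \upsilon_{2t_T}, \cdots, \upsilon_{mt_1}, \upsilon_{mt_2}, \ldots, \upsilon_{mt_T})',$$

$$\varepsilon = (\varepsilon_{1t_1}, \varepsilon_{1t_2}, \ldots, \varepsilon_{1t_T}, \varepsilon_{2t_1}, \varepsilon_{2t_2}, \ldots, \varepsilon_{2t_T}, \cdots, \varepsilon_{mt_1}, \varepsilon_{mt_2}, \ldots, \varepsilon_{mt_T})',$$

$$D = (d_1'\ d_{11}'\ d_{12}'\ \ldots\ d_{1m}'\ \ d_2'\ d_{21}'\ d_{22}'\ \ldots\ d_{2m}'\ \cdots\ d_m'\ d_{m1}'\ d_{m2}'\ \ldots\ d_{mm}')'$$

with

$$d_l = \left(d^{\,l}_{-1,0}, d^{\,l}_{0,0}, d^{\,l}_{1,0}, d^{\,l}_{1,1}, \cdots, d^{\,l}_{\Delta,0}, d^{\,l}_{\Delta,1}, \ldots, d^{\,l}_{\Delta,2^{\Delta}-1}\right)', \qquad \Delta = J - 1,$$

and

$$d_{ll'} = \left(d^{\,ll'}_{-1,0}, d^{\,ll'}_{0,0}, d^{\,ll'}_{1,0}, d^{\,ll'}_{1,1}, \cdots, d^{\,ll'}_{\Delta,0}, d^{\,ll'}_{\Delta,1}, \ldots, d^{\,ll'}_{\Delta,2^{\Delta}-1}\right)',$$

the vectors with the respective coefficients of the wavelet expansions of functions $h_l(i/T)$, l = 1, 2, ..., m, and functions $g_{ll'}(i/T)$, l, l' = 1, 2, ..., m. Matrix Ψ is a block-diagonal matrix with m identical diagonal blocks, each of the form $(\Psi_0\ \vdots\ \Psi_1\ \vdots\ \ldots\ \vdots\ \Psi_m)$, where

$$\Psi_0 = \begin{bmatrix} \psi_{-1,0}\!\left(\tfrac{1}{T}\right) & \psi_{0,0}\!\left(\tfrac{1}{T}\right) & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{1}{T}\right) \\ \psi_{-1,0}\!\left(\tfrac{2}{T}\right) & \psi_{0,0}\!\left(\tfrac{2}{T}\right) & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{2}{T}\right) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{-1,0}\!\left(\tfrac{T-1}{T}\right) & \psi_{0,0}\!\left(\tfrac{T-1}{T}\right) & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{T-1}{T}\right) \\ \psi_{-1,0}\!\left(\tfrac{T}{T}\right) & \psi_{0,0}\!\left(\tfrac{T}{T}\right) & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{T}{T}\right) \end{bmatrix},$$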


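$$\Psi_l = \begin{bmatrix} \psi_{-1,0}\!\left(\tfrac{1}{T}\right) X_{lt_0} & \psi_{0,0}\!\left(\tfrac{1}{T}\right) X_{lt_0} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{1}{T}\right) X_{lt_0} \\ \psi_{-1,0}\!\left(\tfrac{2}{T}\right) X_{lt_1} & \psi_{0,0}\!\left(\tfrac{2}{T}\right) X_{lt_1} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{2}{T}\right) X_{lt_1} \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{-1,0}\!\left(\tfrac{T}{T}\right) X_{lt_{T-1}} & \psi_{0,0}\!\left(\tfrac{T}{T}\right) X_{lt_{T-1}} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{T}{T}\right) X_{lt_{T-1}} \end{bmatrix},$$

and each matrix $\Psi_l$ represents the matrix $\Psi_l = (\Psi^{1}_l\ \vdots\ \ldots\ \vdots\ \Psi^{p}_l)$ with

$$\Psi^{s}_l = \begin{bmatrix} \psi_{-1,0}\!\left(\tfrac{1}{T}\right) X_{lt_{1-s}} & \psi_{0,0}\!\left(\tfrac{1}{T}\right) X_{lt_{1-s}} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{1}{T}\right) X_{lt_{1-s}} \\ \psi_{-1,0}\!\left(\tfrac{2}{T}\right) X_{lt_{2-s}} & \psi_{0,0}\!\left(\tfrac{2}{T}\right) X_{lt_{2-s}} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{2}{T}\right) X_{lt_{2-s}} \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{-1,0}\!\left(\tfrac{T}{T}\right) X_{lt_{T-s}} & \psi_{0,0}\!\left(\tfrac{T}{T}\right) X_{lt_{T-s}} & \cdots & \psi_{\Delta,2^{\Delta}-1}\!\left(\tfrac{T}{T}\right) X_{lt_{T-s}} \end{bmatrix},$$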

for l = 1, 2, ..., m, and s = 1, ..., p. It is clear that the wavelet coefficients in D are now the parameters of interest, and from them we can estimate the functions $h_l(i/T)$, l = 1, ..., m, and $g_{ll'}(i/T)$, l, l' = 1, ..., m. The ordinary least squares estimator of D is given by $\hat{D} = (\Psi'\Psi)^{-1}\Psi' X$, and the predictor of X can be calculated by $\hat{X} = \Psi\hat{D}$. The variances of the errors can be estimated by the variances of the residuals $\hat{\varepsilon}_{lt_i} = X_{lt_i} - \hat{X}_{lt_i}$, l = 1, ..., m. Some asymptotic properties of the estimator $\hat{D}$, such as unbiasedness and consistency, are studied through a simulation study.

4. Simulation study

In this section we present some simulation results to evaluate the estimation procedure and the goodness of fit of the proposed tv-IVAR model for a vector irregular time series. The simulations were done in the R language using the Wavethresh package [13]. Initially, we generated 1000 irregular time series from the tv-IVAR model of order p = 1 with m = 2. In order to evaluate the consistency of the estimator, we considered series of length T = 64 and T = 128. For simplicity, we considered a fixed null mean vector and the autoregressive matrix

$$A(t) = \begin{bmatrix} -0.35 & 0.35 + \tfrac{1}{3}\sin\!\left(\tfrac{2\pi t}{T}\right) \\ 0.5\cos\!\left(\tfrac{2\pi t}{T}\right) & -0.85\sin\!\left(\tfrac{2\pi t}{T}\right) \end{bmatrix}.$$

The innovations were generated from a standard bivariate normal distribution. From each bivariate time series generated we estimated the functional matrix parameter A(t), and then we used the estimator $\hat{A}(t)$ in order


to reconstruct the simulated values through the model

$$\hat{X}_{t_i,T} = \hat{A}(t_i)\,\hat{X}_{t_{i-1},T}.$$

Figures 1 and 2 show the results of the simulations for the case T = 64. More specifically, Figures 1(a) - 1(d) show the estimates of each of the four functions in matrix A(t), with their respective confidence intervals. The dashed curves represent the theoretical functions, the continuous curves represent the mean of the 1000 estimates and the dotted curves correspond to the confidence intervals. Figures 2(a) and 2(b) show the two simulated series (dots) and the predicted series (lines) obtained through the model $\hat{X}_{t_i,T} = \hat{A}(t_i)\,\hat{X}_{t_{i-1},T}$.

Figure 1. Simulation results for $T = 64$: theoretical vs. estimated functions, plotted against time. (a) Estimator of $a_{11}(t)$, (b) Estimator of $a_{12}(t)$, (c) Estimator of $a_{21}(t)$, (d) Estimator of $a_{22}(t)$.

Figures 3 and 4 show the simulation results for the case $T = 128$. Figures 3(a)-3(d) show the estimates of each of the four functions in the matrix $A(t)$, with their respective confidence intervals. Figures 4(a) and 4(b) show the two simulated series (dots) and the predicted series (lines). Notice that the estimates are reasonably good and, as expected, they improve with increasing $T$. The simulated and reconstructed series match each other reasonably well even for the small sample size. For the estimation procedure we used the Daubechies 8 (Daublet) wavelets, which have 8 vanishing moments.
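To make the estimation step concrete, the following Python sketch fits a time-varying coefficient by ordinary least squares on a basis expansion, solving the normal equations $(\Psi'\Psi) d = \Psi' X$. It is a simplified stand-in and not the procedure used in the study: the series is univariate, $x_t = a(t/T)\,x_{t-1} + e_t$, and piecewise-constant Haar scaling functions at a single resolution level replace the Daubechies wavelets. All function names are hypothetical.

```python
def haar_scaling_row(u, n_basis):
    """Values at u in [0, 1] of n_basis piecewise-constant (Haar scaling) functions."""
    k = min(int(u * n_basis), n_basis - 1)
    return [1.0 if j == k else 0.0 for j in range(n_basis)]


def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def ols(Psi, x):
    """Ordinary least squares: solve (Psi' Psi) d = Psi' x."""
    p = len(Psi[0])
    gram = [[sum(row[i] * row[j] for row in Psi) for j in range(p)] for i in range(p)]
    rhs = [sum(row[i] * xi for row, xi in zip(Psi, x)) for i in range(p)]
    return solve_linear(gram, rhs)


def fit_tv_ar1(x, n_basis):
    """Estimate d_k in a(u) = sum_k d_k phi_k(u) for x_t = a(t/T) x_{t-1} + e_t."""
    T = len(x) - 1
    # Row t of the design matrix holds phi_k(t/T) * x_{t-1} for each basis function.
    Psi = [
        [phi * x[t - 1] for phi in haar_scaling_row(t / T, n_basis)]
        for t in range(1, T + 1)
    ]
    return ols(Psi, x[1:])
```

With this non-overlapping basis the matrix $\Psi'\Psi$ is diagonal, so the general solver is more than strictly needed. It is kept so that the same `ols` function would also work with an overlapping basis.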

Figure 2. Simulation results for $T = 64$: fitted vs. simulated series, plotted against time. (a) Simulated and fitted time series $X_{1,t_i}$, (b) Simulated and fitted time series $X_{2,t_i}$.

Figure 3. Simulation results for $T = 128$: theoretical vs. estimated functions, plotted against time. (a) Estimator of $a_{11}(t)$, (b) Estimator of $a_{12}(t)$, (c) Estimator of $a_{21}(t)$, (d) Estimator of $a_{22}(t)$.

Figure 4. Simulation results for $T = 128$: fitted vs. simulated series, plotted against time. (a) Simulated and fitted time series $X_{1,t_i}$, (b) Simulated and fitted time series $X_{2,t_i}$.

5. Application

The proposed model was applied to a vector irregular time series of 32 observations of nitrites and phosphates sampled from the waters of the Beagle Channel in Argentina. This channel separates Tierra del Fuego from the islands to its south, and monitoring its water quality is important, for instance, to farmers in the area. The data were collected from March 2005 to December 2006 at irregularly spaced dates due to weather and operational conditions. The claimed non-stationarity and irregular behavior can be seen in Figure 5 (top: nitrites, bottom: phosphates).

Figure 5. Series of nitrites (top) and phosphates (bottom), plotted against time in days.


Fitting the tv-IVAR(1) model to the series in Figure 5 and applying the proposed estimation procedure, we obtain the estimates shown in Figure 6. Notice that in the model for nitrites, the estimated intercept function $\hat{\mu}_1(t)$ and the function $\hat{a}_{12}(t)$ are very close to the null function. In this situation we can say that the nitrites at $t_i$ depend only on their own value at $t_{i-1}$. In the model for phosphates, all the estimated functional parameters are visibly different from the null function, suggesting that the phosphates at $t_i$ depend on the values of both phosphates and nitrites at $t_{i-1}$, and that the model also includes an intercept that varies in time. Notice that, if the bivariate time series had been generated from a stationary vector process, we would expect the estimated functions to be close to constant functions and, therefore, all the estimated functions in Figure 6 would be approximately parallel to the horizontal axis.
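The interpretation above can be read off the model written component by component. The display below is a sketch that assumes the bivariate tv-IVAR(1) model has an additive intercept and error term, with $X_{1,t_i}$ denoting nitrites and $X_{2,t_i}$ phosphates, matching the labels in Figure 6. The error symbols $\varepsilon_{l,t_i}$ are notation assumed here.

```latex
\begin{align*}
X_{1,t_i} &= \mu_1(t_i) + a_{11}(t_i)\,X_{1,t_{i-1}} + a_{12}(t_i)\,X_{2,t_{i-1}} + \varepsilon_{1,t_i},\\
X_{2,t_i} &= \mu_2(t_i) + a_{21}(t_i)\,X_{1,t_{i-1}} + a_{22}(t_i)\,X_{2,t_{i-1}} + \varepsilon_{2,t_i}.
\end{align*}
% With \hat{\mu}_1(t) and \hat{a}_{12}(t) both close to zero,
% the first equation reduces approximately to
\begin{equation*}
X_{1,t_i} \approx a_{11}(t_i)\,X_{1,t_{i-1}} + \varepsilon_{1,t_i}.
\end{equation*}
```

Under these assumptions the nitrite equation keeps only its own lagged value, while the phosphate equation retains all three terms.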

Figure 6. Estimated functions for nitrites and phosphates, plotted against time in days. (a) Estimator of $\mu_1(t_i)$, (b) Estimator of $a_{11}(t)$, (c) Estimator of $a_{12}(t)$, (d) Estimator of $\mu_2(t_i)$, (e) Estimator of $a_{21}(t)$, (f) Estimator of $a_{22}(t)$.

6. Conclusions

In this paper, we have proposed a tv-IVAR model for an irregular vector time series that may or may not be stationary. The irregularity and non-stationarity of the time series are incorporated in the model through functions that depend on time and that are contained in the functional intercept vector and autoregressive matrix. Estimation is done by ordinary least squares and is based on wavelet expansions of these functions. The finite-sample performance of the model and its usefulness are illustrated through simulations and an application to two time series of data collected from a shipping channel.

Acknowledgments

We would especially like to thank Dr. Marcelo Pablo Hernando from University of Moron (Argentina) and Dr. Fernando R. Momo from University of General Sarmiento (Argentina) for providing the irregular time series used in the application.

References

1. Z. Cai, J. Fan and Q. Yao, Functional-coefficient regression models for nonlinear time series, Journal of the American Statistical Association, 95, 941–956 (2000).
2. R. Chen and R.S. Tsay, Functional-coefficient autoregressive models, Journal of the American Statistical Association, 88, 298–308 (1993).
3. T. Cipra, Some problems of exponential smoothing, Applications of Mathematics, 34, 161–169 (1989).
4. T. Cipra, J. Trujillo and A. Rubio, Holt-Winters method with missing observations, Management Science, 41, 171–178 (1995).
5. T. Cipra, Exponential smoothing for irregular data, Applications of Mathematics, 6, 597–604 (2006).
6. A. Cohen, I. Daubechies and P. Vial, Wavelets on the interval and fast wavelet transforms, Applied and Computational Harmonic Analysis, 1, 54–81 (1993).
7. R. Dahlhaus, M.H. Neumann and R. von Sachs, Nonlinear wavelet estimation of time-varying autoregressive processes, Bernoulli, 5(5), 873–906 (1999).
8. I. Daubechies, Ten Lectures on Wavelets, CBMS Regional Conference Series in Applied Mathematics 61, SIAM, Philadelphia (1992).
9. R.H. Jones, Maximum likelihood fitting of ARMA models to time series with missing observations, Technometrics, 22, 389–395 (1980).
10. R.H. Jones, Fitting multivariate models to unequally spaced data, In: Time Series Analysis of Irregularly Observed Data, Proceedings, Lecture Notes in Statistics, 158–178 (1983).
11. G. Kitagawa, State space modeling of non-stationary time series and smoothing of unequally spaced data, In: Time Series Analysis of Irregularly Observed Data, Proceedings, Lecture Notes in Statistics, 189–210 (1983).
12. H. Lütkepohl, Introduction to Multiple Time Series Analysis, Springer-Verlag, Heidelberg (1991).
13. G. Nason, wavethresh: Wavelets statistics and transforms, R package version 4.5, http://CRAN.R-project.org/package=wavethresh (2010).
14. G.E. Salcedo, R.F. Porto, S.Y. Roa and F.R. Momo, A wavelet-based time-varying autoregressive model for non-stationary and irregular time series, Journal of Applied Statistics, to appear (2012).
15. J.R. Sato, P.A. Morettin, P.R. Arantes and JR. Amaro, Wavelet based time-varying vector autoregressive models, Computational Statistics and Data Analysis, 51, 5847–5866 (2006).
16. R.H. Shumway, Some applications of the EM algorithm to analyzing incomplete time series data, In: Time Series Analysis of Irregularly Observed Data, Proceedings, Lecture Notes in Statistics, 290–324 (1983).
17. R.H. Shumway and D.S. Stoffer, Time Series Analysis and Its Applications: With R Examples, Second Ed., Springer-Verlag, New York (2006).
