English Pages 121 [124] Year 1959
Catalytic Models in EPIDEMIOLOGY
by HUGO MUENCH, M.D., Dr.P.H.
Professor of Biostatistics
Harvard University School of Public Health
Boston, Massachusetts
HARVARD UNIVERSITY PRESS Cambridge, Massachusetts · 1959
© Copyright 1959 by the President and Fellows of Harvard College Distributed in Great Britain by Oxford University Press London
Printed in Great Britain Library of Congress Catalog Card 59-0000
To H. H. M.
Preface

The material in this book derives from the teaching of one part of a course in biostatistics which is aimed chiefly at the epidemiologist: an individual not entirely easy to define, but more apt than not to be a person with medical background and some interest in statistical methodology. The teaching, in turn, arose from an early interest in investigating the simplest possible hypotheses which might be set up to picture the interaction of populations with infective forces. Somewhat to my pleased surprise, these hypotheses seemed at times to lead to fruitful results.

In the beginning, simple catalytic models were applied to situations such as those measured by protection tests in yellow fever and the more labile tuberculin test. Soon after this I was fortunate enough to run across a small book on differential equations in chemistry which served to crystallize and amplify my ideas.

The models discussed here are deterministic rather than stochastic; that is, they deal with assumed definite relationships between forces of infection and the populations on which they act, and they do not introduce the factor of chance variation, which will modify the actual outcome observed in each case. The object of these models is to provide a measure of a force rather than to describe its predicted action in detail.

Such credit as may be available must be widely distributed. As may be supposed, the staff of our own department was heavily involved in criticism, the development of ideas and the reading of drafts. Also implicated were arguments with a number of now largely unidentifiable students.

H. M.
Contents

I. Introduction: Mathematical Rationalization in Epidemiology
II. Methodology
III. The Simple Catalytic Curve
IV. Simple Catalytic Curves: Special Types
V. The Reversible Catalytic Curve: Two-Way Reactions
VI. The Two-Stage Catalytic Curve: Successive Reactions
VII. Variable Rates: Survival in Chronic Disease
VIII. The Use of Historical Data: Sampling

Appendixes
A. Nomograms
B. Table of Negative Exponentials
C. Mathematical Notes

References
Index
Figures

1. Dependent variable y as a function of t
2. Simple catalytic curve
3. Yellow-fever protection tests (interior Amazonas, Brazil)
4. Positive histories of disease
5. Positive histories of measles
6. Yellow-fever protection tests (group of Brazilian towns)
7. Estimation of value of k
8. Schick tests giving negative results
9. Tuberculin positive reactors
10. Yaws in males in Jamaica
11. Yaws in females in Jamaica
12. Positive reaction to histoplasmin
13. Hypertension, constant rate of loss
14. Hypertension, variable rate of loss
15. Sampling variability
16. Truncated data
17. Compound data
18. Data y' as complement of catalytic curve
Tables

1. Yellow-fever protection tests (interior Amazonas, Brazil)
2. Positive histories of disease
3. Positive histories of measles
4. Yellow-fever protection tests (group of Brazilian cities)
5. Schick tests giving negative results
6. Yaws in males in Jamaica
7. Yaws in females in Jamaica
8. Positive reaction to histoplasmin
9. Survival ratio in hypertension
Charts

I. Moments of simple catalytic curve
II. Moments of two-stage catalytic curve
III. Moments of catalytic curve with variable rate
I. Introduction: Mathematical Rationalization in Epidemiology

In the application of mathematics to science, a chief aim is the creation of a mathematical "model" or picture which will contain the essential relations under study. These relations can then be examined by mathematical means and will frequently respond by yielding information not at all apparent in the original data. In return, a great deal of detail must be ignored when setting up a model of even a simple physical relation. The success of the model depends almost entirely on whether those factors which were included turn out to be those really essential to the explanation.

Biological entities are so complex that it is impossible ever to choose more than a small fraction of all the factors which might be of importance. Clearly, a great deal must always be neglected in any mathematical study and the interpreter of the results must always be aware of the simplifying assumptions originally introduced, which place a definite framework of limitations around his conclusions. On the other hand, there is a definite advantage to isolating a few factors from their surroundings in order to study their interactions when freed from the disturbing effects of the others. In this way it is sometimes possible to study the effects of single forces which are not found alone in nature but can be separated for purposes of mathematical consideration.

The discipline of epidemiology always deals with at least one highly variable aggregate: mankind. Usually it studies
interactions between two such aggregates, man and an infective organism which, even in the simplest form of "virus," displays an extraordinary and stubborn refusal to conform to a constant pattern. Occasionally a third or even a fourth population is drawn into the picture in the role of "intermediate host" or of "animal reservoir." When we consider that each group is anything but homogeneous in any one of innumerable variables, and that their interactions may occur in countless ways, it is clear that no possible method of analysis could present a complete picture of the entire situation. Nor would such an analysis, even if possible, be particularly fruitful; it is precisely the function of the mathematical model to present an outline of the forest without itemizing the trees and to disregard particulars which would interfere with seeing the underlying pattern. Clearly, in subjecting epidemiology to mathematical analysis we must employ concepts which merge a great many factors into one measurable whole. We shall therefore talk about "forces of infection" as though they were something like a force of gravitation (whatever that is). Certainly such a force of infection contains variables associated with the nature of the host, his method of life, his previous experience of infection; the nature of the infective agent, its variability in virulence and in mode of transmission; together with a plethora of others, of many of which we may not even be dimly aware. Yet we are examining an average effect of this complex and, on the average, in a large enough group of similar persons, a corresponding complex will exercise similar effects and so can conveniently be regarded as a unit for certain purposes of comparison. It is common, for instance, to split crude totals of forces of mortality into subgroups specific for age and sex. 
Each of these still is composed of a variety of factors but may usefully be regarded as a homogeneous unit in making comparisons between different subgroups. The epidemiologist admittedly faces situations much more complex than those encountered by the mathematical physicist
analyzing the forces which guide a half brick through a plate-glass window. The units in the biological sample are much fewer, for one thing, and vary more between themselves than the atoms in the dornick. Therefore the epidemiologist must reckon with a large sampling variability in addition to everything else and must employ statistical procedures in making his estimates. These take the form, for one thing, of methods of adjusting the results expected on a hypothetical basis to the data as actually observed, averaging out "sampling" deviations so that the result will approximate that which would have been observed had the sample been infinite instead of consisting, as it usually does, of a few or a few hundred individuals.

The epidemiologist is also somewhat more concerned than the physicist with finding or manufacturing the most valid measure of the effects of his postulated forces. An atomic nucleus struck by a subatomic projectile either reacts or it doesn't, with no nonsense about it. A human subjected to a force of infection may sicken and die, or just sicken typically, or present symptoms of something entirely different, or possibly shrug the whole thing off. He may or may not exhibit results which can be evaluated in a laboratory; if he does it will probably be at a wide variety of levels. Any histories of disease given by him must be interpreted in a broad humanitarian spirit with the use of every possible check. Among possible measures of the results of infection the best must be chosen: the most objective, the clearest and least ambiguous, the most practicable in execution. What this will be depends on the disease and on the particular circumstances prevailing, but care and thought must be devoted to its choice.

In building our mathematical model we may simplify our concept of a human population until it becomes, in our minds, a swarm of indistinguishable units like the molecules of an elemental gas.
We may then mentally subject this swarm to a simple force which is measurable in terms of the fraction of the total acted upon during a short interval of time. We then arrive
at a picture which has marked similarity to that used by a chemist in visualizing a catalytic process which converts a fraction of the molecules of A to molecules of B during each interval of time. Our human population is much smaller and less uniform and therefore subject to greater sampling variations, but the main characteristics of the reaction will be the same and can be measured in much the same terms: the amounts of A and B present at each instant, and the activity of the force producing the conversion. We simply let B represent the fraction of the entire population who have been subjected to infection and A the remainder who have not.

The chemist need not follow the process in order to measure the rate of change or the applied force of catalysis. If his assumptions are correct, he need only measure the relative prevalence of the new substance at different stages of the process. From a series of measured points it is easy enough to deduce the characteristics of the change and so to estimate the size of the catalytic force which was acting. These data are, in a sense, historical; they refer to events in the past. The measurements relate to a process which has already gone on; forces are estimated from the nature and size of their effects, in accordance with a hypothesis of their mode of action.

Similarly, the epidemiologist may deduce the size of a hypothetically constant force acting on a hypothetically uniform population by measuring, from time to time, the changes produced in the population as it moves through life. A close parallel can be drawn with the chemist's picture of catalysis, in which the catalyst has become a force of infection measurable in terms of the fraction of all individuals it strikes each year, and which it attacks if it finds them susceptible.

Clearly, in setting up such a model, we have neglected all but the very simplest and crudest concepts of the variables involved.
Yet very frequently these highly simplified assumptions lead to results which correspond closely to facts as observed. This is not to say that we have uncovered a law of nature to the effect
that human beings are essentially molecules, and forces of infection merely catalytic agents acting on them. We may, however, claim that we have managed to roll together a complex of forces in such a way that their over-all effect is roughly equivalent to that of a simple process, which can be measured, and the measure used for purposes of analysis and comparison.

The measure of the force has validity, naturally, only in the framework of the assumptions which went into our model. We may estimate one force to be twice the size of another; it remains to be found out just why, and owing to what specific components of the total complex. This may be no mean task for the epidemiologist to discover. However, as a starter, it helps him to know that an over-all difference of this size exists. Moreover, it is just in deviations of observations from the predicted results of simple hypotheses that there may be profitable clues leading to further investigations; it is often more useful to know that events do not follow a prescribed path than that they do. As Pareto seems to have said, "Give me a good fruitful error any time, full of seeds, bursting with its own corrections. You can keep your sterile truth for yourself." Whether a given hypothesis is "right" or not is often a meaningless question; it must, however, serve to build bridges from observed facts to better understanding and further analysis.

There are two main reasons why the construction of simple mathematical models of epidemiologic processes may be of value. First, a set of simple hypotheses leads to deduced relations which can be compared with conditions as they are found to exist. A hypothesis which is in accord with observation is at least a step toward understanding the characteristics of a disease and so points the way toward the next investigation. On the other hand, definite discrepancies between hypothesis and observation point to the inadequacy of the model and the need for revision of concepts.
The nature of such discrepancies may give a clue to the dimensions and direction of the revisions.
In the second place, hypotheses whose results do fit observation allow of the rationalization and measurement of constants which have been included in the hypothesis. The estimate of size of the simple forces implicated in the model gives a means of direct comparison between different groups in terms of their exposure to infection. Moreover, this estimate can be made on the basis of spot surveys of a population without the need of lengthy observations of successive events.

The mathematical model plays entirely different roles in the two aspects of its use. In the former it gives clues to the mode of action of a disease which may still be imperfectly understood. It will point toward epidemic or endemic characteristics, differences in age or sex incidence, the reliability of means of identification of infection, and like factors. However, if the model is to be used as a tool for measuring a force of infection, the nature of the disease must be thoroughly understood so that the components of the model can be correctly interpreted in terms of the epidemiologic picture.
II. Methodology
The branch of mathematics called calculus deals with changes and rates of change. If two variables are connected so that, for example, the one called y or dependent changes in some measurable manner when the independent variable called x or t increases at a steady pace, then the methods of calculus allow the determination of the rate of change of y for any value of x or t. Conversely, specifying or postulating a given rate of change it is possible (within limits) to describe the manner of connection of y with x or t (called a function) which will produce such a rate of change.

It is not the purpose of this chapter to furnish a course in elementary calculus; anyone with a basic knowledge of the subject will find the methods here used simple. On the other hand, sufficient elementary introduction to create this basic knowledge is rather outside the scope of this book. It is instead the aim of this chapter to present, to a reader innocent of calculus, a highly sketchy outline of the principles involved and how we go about applying them.

A basic concept of the calculus is the derivative, or differential coefficient, usually written as a fraction like dy/dt. This may be interpreted as the rate of change of y, when t increases regularly, at any instant of time or at any given point on the curve which pictures the relation between y and t. In this sense it represents the slope of the curve at a particular point. Suppose we demand a curve that has no slope at all, at any point. This can be stated as

$$\frac{dy}{dt} = 0, \tag{2.1}$$
which says the same thing in mathematical shorthand. This equation is a very simple instance of what is called a differential equation, that is, it expresses a relation which the rate of change must follow. The methods of calculus let us get a solution to this equation; in other words, we can determine just what relation between y and t will obey the specified conditions. As a solution for Eq. (2.1) we have

$$y = k, \tag{2.1a}$$

which merely says that y always maintains a constant value k for all values of t. Figure 1a illustrates this function.

Suppose then that we ask what relation between y and t will produce a curve which always has a constant slope, specified as b. We can write this condition

$$\frac{dy}{dt} = b, \tag{2.2}$$

which states that, whatever the value of t, y will change at that point at a constant rate of b units per unit of t. The solution

$$y = k + bt \tag{2.2a}$$

is shown in the graph of Fig. 1b. It appears as another straight line, but now at an angle with the baseline instead of parallel to it. The trigonometric tangent of this angle has the value b. It is clear that any number of parallel lines, at different distances from the baseline, can all have the same slope b; this is the reason for the insertion of the constant k, which can take any value without changing the fact that y will increase at a constant rate b as t increases.

To get a little more complicated, suppose we want to know the relation under which the value of y changes at a rate proportional to the value of t, that is, increases faster and faster as t increases. We can then write

$$\frac{dy}{dt} = ct, \tag{2.3}$$

the solution of which is

$$y = k + \tfrac{1}{2}ct^{2}, \tag{2.3a}$$
Fig. 1. The dependent variable y as a function of the independent variable t. (a) Constant: $y = k$. (b) Straight line: $y = k + bt$. (c) Parabola: $y = k + \tfrac{1}{2}ct^{2}$. (d) Exponential: $y = ke^{rt}$ (ascending); $y = ke^{-rt}$ (descending).
which describes a parabola. Again the constant of integration, k, shows that any one of a number of curves at various heights above the baseline can fulfill the requirements of the differential equation. Figure 1c illustrates the parabola.
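The link between a prescribed rate of change and its solution can be checked numerically. The following is a minimal sketch in Python, not a method used in the text: it steps each of Eqs. (2.1) to (2.3) forward in small increments (Euler's method) and compares the end value with the closed-form solutions (2.1a) to (2.3a). The function name `euler` and the values chosen for k, b, c and the end time are illustrative assumptions.

```python
def euler(rate, y0, t_end, steps=10_000):
    """Step dy/dt = rate(t, y) from t = 0 to t_end in small increments."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y += h * rate(t, y)   # change in y over one small step
        t += h
    return y

# Illustrative constants (assumptions, not taken from the text).
k, b, c, t_end = 2.0, 0.5, 0.3, 4.0

cases = {
    "(2.1) constant":      (lambda t, y: 0.0,   k),
    "(2.2) straight line": (lambda t, y: b,     k + b * t_end),
    "(2.3) parabola":      (lambda t, y: c * t, k + 0.5 * c * t_end ** 2),
}
for name, (rate, closed_form) in cases.items():
    print(name, euler(rate, k, t_end), closed_form)
```

In each case the starting value k plays the part of the constant of integration: changing it shifts the whole curve up or down without altering the slope at any point.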
It is of especial interest to investigate another type of relation, namely, that which causes y to vary in proportion to its own size. This sort of self-determination is common in biology and in many other fields. The rate of growth of populations and the rate of decay of radium depend partly or entirely on the amount there is to be changed. We then can write

$$\frac{dy}{dt} = ry \tag{2.4}$$

as an expression of the constant ratio of the rate of change to the size of y; then

$$y = ke^{rt} \tag{2.4a}$$

is the relation (called exponential) between y and t. Here e is the base of Napierian, or natural, logarithms, with the constant value of 2.71828 . . . The constant k indicates that the values of $e^{rt}$ can be multiplied by any constant number and still maintain the relation of inherent growth rates. A sum of money at compound interest will grow according to this law; if r is negative, it describes the proportional dying away of a gram of uranium or of the heat in a cup of coffee (see Fig. 1d).

The epidemiologic situations which we shall attempt to describe by analogies to catalytic processes all involve, in some way, a quantity which changes in direct relation to its size, or to an amount which remains to be changed. This exponential type of connection therefore appears regularly in one form or another. It is, in fact, a rather rare mathematical statement in biology which does not manage to introduce the constant e in some way.

In the following chapters, then, we try to find simple descriptions of various types of epidemiologic relations in time between populations and forces of infection. These relations are put into the shape of differential equations, whose solutions provide the form of connection which satisfies the conditions written into the description. To find the expression is only the first step; if the relations are to be of use, they must be adjusted to observed phenomena so that the values obtained for constants
will serve as measures of the forces assumed in the descriptions. This involves the process known as curve fitting: finding those values of the constants which produce the best adjustment of the curve to the data. In all honesty, the word "best" must be written in quotes. What is the best fit depends entirely on the criteria which a particular method must satisfy, and these vary in a number of methods. There are arguments pro and con regarding each, with good reasons on both sides. For our purposes we choose the method known as that of moments, since it gives an opportunity to develop rapid graphic solutions for estimating values of the necessary constants. Any other method would involve far too much complicated calculation for the ends we have in view, which are to get a good approximation, satisfactory for estimates and comparisons, at the least cost in time and trouble.

The word "moment" comes into statistics from physical science. Briefly, it is used for constants which are obtained by multiplying each measurement by some power of its distance from a given scale point. These constants are usually modified in some way to arrive at a useful figure. Thus the zero-order moment of a set of measures consists of the sum of its values each multiplied by the zero power of its place on the scale and so is simply the total of the measures. The area under a histogram representing the scatter of a series of measurements is a zero-order moment. The first-order moment is derived from the multiplication of each measure by its distance from the zero point on the measurement scale; when the sum of these products is divided by the zero-order moment we get the first-order moment, which is the mean or arithmetic average. Similarly, the multiplication of each measure by the square of its distance from the mean, and the division of the sum of these quantities by the zero-order moment, produces the second-order moment or variance.
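The three moments just described can be written out in a few lines. The sketch below, in Python, follows the definitions in the paragraph above; the function name `histogram_moments` and the small set of figures are illustrative assumptions, not data from the text.

```python
def histogram_moments(positions, measures):
    """Zero-order moment (total), mean, and variance of a set of measures.

    positions: place of each measure on the scale
    measures:  height (or count) recorded at each position
    """
    m0 = sum(measures)                                        # zero-order moment
    mean = sum(x * m for x, m in zip(positions, measures)) / m0
    variance = sum((x - mean) ** 2 * m
                   for x, m in zip(positions, measures)) / m0
    return m0, mean, variance

# Hypothetical measures at scale points 1 to 4.
m0, mean, variance = histogram_moments([1, 2, 3, 4], [1, 3, 3, 1])
print(m0, mean, variance)   # prints: 8 2.5 0.75
```

Only the first two of these, the total (or area) and the mean, are needed for the curves treated in this book.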
The method of moments consists in finding values for the constants of the curves which will produce the same values for their moments as were calculated from the data
which are to be fitted. That is, the area under the curve will be the same as that under the graph representing the data; the mean value of t for the curve will be the same as that of the graph, and so forth. One moment must be calculated for each constant needed in the curve formula. The curves discussed in this book never have more than two constants so that the basic arithmetic is confined to calculating the area and the mean t for the data.

The next step is to find the values of the curve constants which will reproduce the moment values of the data. This cannot generally be done by direct calculation; on the other hand it is not difficult to calculate a series of moment values for different values of constants and to make them into a nomogram, from which the required rates can be read off with little trouble for observed moment values, with accuracy sufficient for the purpose. The moments are obtained by another application of the calculus called integration, which allows us to calculate areas and means of curves if we know their formulas.

Nomograms are simply solutions of equations in graphic form, from which answers to specific questions can be obtained with fair accuracy by running lines from the known values of constants to the desired values of the unknown factors along a scale. The nomograms used here provide values of area and of mean t along horizontal and vertical base scales and give values for the corresponding curve constants in the body of the graph. A set of three nomograms for the solution of the functions used in this book is given in Appendix A.

In Appendix C are included notes on the solution of the various differential and other equations involved and on the calculation of their moments. Appendix C will be of interest to possessors of the rudiments of calculus who want to follow the mathematical part of the development. Perhaps it should be shunned by others.
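The procedure of tabulating moments for trial constants and then reading the constants back from observed moments can be imitated by machine. The following Python sketch is an illustration only and does not reproduce the charts of Appendix A. It assumes the descending exponential of Fig. 1d, y = k·exp(−rt), as a stand-in two-constant curve, replaces integration by a simple midpoint sum, and replaces the reading of a nomogram by a bisection search. All function names and the search range for r are assumptions.

```python
import math

def curve_moments(curve, t_max, steps=2000):
    """Area under curve(t) on [0, t_max] and its mean t, by the midpoint rule."""
    h = t_max / steps
    area = 0.0
    first = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        y = curve(t)
        area += y * h
        first += t * y * h
    return area, first / area

def decay(k, r):
    """Illustrative two-constant curve y = k * exp(-r t)."""
    return lambda t: k * math.exp(-r * t)

def fit_decay(area_obs, mean_obs, t_max):
    """Find (k, r) whose curve has the observed area and mean t on [0, t_max]."""
    lo, hi = 1e-6, 50.0            # assumed search range for r
    for _ in range(60):            # bisection: the mean t falls as r rises
        mid = 0.5 * (lo + hi)
        _, mean_mid = curve_moments(decay(1.0, mid), t_max)
        if mean_mid > mean_obs:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    unit_area, _ = curve_moments(decay(1.0, r), t_max)
    return area_obs / unit_area, r  # the area scales directly with k

# A small "nomogram" in table form: moments for a few trial values of r (k = 1).
for r in (0.1, 0.2, 0.5, 1.0):
    area, mean_t = curve_moments(decay(1.0, r), 10.0)
    print(f"r = {r:4.1f}   area = {area:7.4f}   mean t = {mean_t:6.4f}")
```

With two constants, two moments are matched: the mean t fixes r, since it does not depend on k, and the area then fixes k.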
III. The Simple Catalytic Curve

The simplest picture of a catalytic process in chemistry involves molecules of an original substance, which may be equated with the individuals in a population that has not yet been in contact with an infective force. In chemistry the original molecules are subjected to contact with molecules of a catalytic substance; a contact between the two implies the creation of another substance. Similarly, the uninfected individuals of a population are conceived as subjected to a force of infection which changes them to infected individuals.

The basic rate at which molecules are changed depends on (a) the relative number of molecules of catalyst and (b) the number of contacts made by each per unit time. Both together make a force which can be expressed as the number of effective contacts per unit time; it is the number of times a molecule of catalyst will make contact with a molecule of substrate so as to change it if it has not already been changed. This again can be measured as the number of effective contacts per unit time per molecule of original substance. The force of infection acting on the population can similarly be measured in terms of effective contacts per unit time (usually a year) per individual. "Effective contact" here has the meaning used by Wade Frost: a contact sufficient to produce infection if the subject is susceptible.

The number of molecules of catalyst is assumed to remain constant, as is the rate of contact. The speed at which the reaction takes place will therefore depend on the fraction of contacts which fall on unchanged molecules; this again is
determined by the fraction of the original substance remaining unchanged at the moment. As more and more is converted, the reaction becomes slower and slower. The same is true of the human population: effective contacts lead to infection only when they are made with susceptible individuals, and as the number of these lessens so does the rate of conversion.

The speed of the reaction therefore depends on the fraction remaining of an original population of molecules or people and so will presumably involve the exponential relation of Eq. (2.4) somehow. That is, the expression $e^{-rt}$ will almost certainly take part in the program. Actually $e^{-rt}$ is a so-called "continuous" or stepless function, while the reaction observed proceeds by units of one molecule or one person at a time. But if there are enough individuals in either population the continuous curve can be taken as a very close approximation to what happens and, in fact, furnishes the simplest possible explanation of the happenings. It is well to remember, however, that a continuous curve is never more than an approximation, however close, even in physics and chemistry.

HYPOTHESIS
We start out with a quantity of originally unchanged molecules or individuals. This quantity we shall make equal to 1 and deal with the fraction changed at any time t. This fraction we shall designate as y, so that 1 - y is the relative amount still left unchanged at time t. This then is the part on which the catalytic or infective force can still work, at the rate of r effective contacts per individual per unit of time. The speed at which the reaction goes on will then be measured by

$$\frac{dy}{dt} = r(1 - y), \tag{3.1}$$

or will depend on the product of the force and the remaining proportion on which it can act. This is a simple linear differential equation which has a general solution

$$y = 1 + ce^{-rt}, \tag{3.2}$$
which describes a whole family of curves acting in this way, differing only in the value of a constant, c. There is only one of these which starts at zero when time is zero, which is an additional condition we want to impose. If we substitute y = 0 and t = 0 in Eq. (3.2), we have

$$0 = 1 + ce^{-r \cdot 0} = 1 + c \cdot 1 \quad \text{or} \quad c = -1, \tag{3.3}$$

under which conditions Eq. (3.2) becomes

$$y = 1 - e^{-rt}. \tag{3.4}$$
This particular form of the equation therefore describes the expected behavior of a group of molecules or persons starting entirely unchanged or susceptible at the beginning of observation or at birth (when t = 0) and exposed to a continuous bombardment of catalysis or infection at the constant rate of r effective contacts per individual per unit time.

It may be well to emphasize that r is measured in terms of numbers of effective contacts per individual and not in numbers of individuals who have contact. That is, if r = 1, implying 1000 contacts per 1000 individuals, this does not mean that all will be exposed. Assuming a random scatter of contacts, some molecules or persons will receive two, three, or more during the period while others (estimated as 368) will escape altogether. The relative frequency of contacts is given by the Poisson series, whose terms are

$$e^{-r}\left(1 + r + \frac{r^{2}}{2!} + \frac{r^{3}}{3!} + \cdots\right).$$
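The figure of 368 escaping per 1000 can be verified directly. The short Python sketch below assumes the standard Poisson terms $e^{-r} r^{n}/n!$ for exactly n effective contacts in one unit of time; the function name `poisson_terms` is an illustrative choice. It also shows that the fraction receiving no contact is the complement of the catalytic curve of Eq. (3.4) at t = 1.

```python
import math

def poisson_terms(r, n_terms):
    """Relative frequency of exactly 0, 1, 2, ... effective contacts at rate r."""
    return [math.exp(-r) * r ** n / math.factorial(n) for n in range(n_terms)]

r = 1.0
terms = poisson_terms(r, 6)

escaped_per_1000 = 1000 * terms[0]            # no effective contact at all
infected_fraction = 1 - math.exp(-r * 1.0)    # Eq. (3.4) at t = 1

print(round(escaped_per_1000))                # prints: 368
print(terms[0] + infected_fraction)           # escaped plus infected make up the whole
```

The zero term of the series is thus the only one the catalytic curve needs: a person is changed by the first effective contact, and any further contacts fall on someone already infected.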
Equation (3.4) (as graphed in Fig. 2) describes a curve which starts at y = 0 and t = 0 and rises toward, but never quite meets, an upper limit (also called the asymptote) y = 1. The rate at which the curve rises depends on the value of r: the larger this is, the more steeply the curve goes up and the sooner it flattens out. This is the simplest form of the catalytic curve which
describes the form of reactions under the prescribed conditions.

In order to transfer the catalytic picture to a model of infection acting on a population it is necessary to start with a set of assumptions. These include:

(a) A population entirely susceptible to infection at birth, when t = 0, to correspond with the original chemical substrate.
Fig. 2. Simple catalytic curve, $y = 1 - e^{-rt}$, for values of t from 0 to 5 and for three values of r: r = 1; r = 0.5; r = 0.1.
(b) A constant force of infection to which this population is exposed, corresponding to the catalyst acting on the substrate. This force is to be measured in effective contacts per unit time, no matter how complex may be the events leading up to these contacts.

(c) Evidence which will show that infection has taken place, allowing the estimate of y, or the fraction infected, at any time
t. This may consist in positive histories, results of laboratory findings, or the like.

Given these conditions it is possible, by measuring fractions infected at various times after birth, to determine the rate of rise of the positive fraction with time and so to deduce the size of the effective contact rate. Now it is rarely practicable to follow a population over a sufficiently long period of years to determine the shape of the entire catalytic curve which depicts the progress of a lifetime of infection. On the other hand it is possible to perform a "spot survey" on an existing population which shows different levels of past infection for different age groups. We can set up a few more postulates:

(d) Migration is negligible in this population; alternatively, it is possible to get samples of individuals, at different ages, who have spent their entire lives in the community.

(e) Forces of infection have not varied greatly over a fairly long period, long enough to include the oldest age band used in the study.

(f) Immediate and later mortality due to the infection is negligible.

(g) Evidence of exposure to infection is definite and remains so during life, or at least through the oldest age band used.

Under such circumstances it is possible to say that the 2-year-old group represents essentially what the 1-year-olds will be like a year from now; the 20-24 group can be taken as the equivalent of the 10-14 group 10 years later, and so forth. It is then merely necessary to determine, from the best samples possible, the fraction giving evidence of infection in each age group. This will plot as a histogram rather than a smooth curve since we must give a single value to the fraction in an entire age group, which is usually several years wide. If, however, the postulates are fulfilled, the columns will show a regular progression of increase in height which closely resembles the catalytic curve.
T h e problem of fitting then is to find that value of r which produces a curve closest to the histogram values.
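As a rough illustration of what the histogram columns should look like when the postulates hold, the following Python sketch computes the average of the simple catalytic curve, y = 1 - exp(-rt), over each age band. The age bands and the value of r are hypothetical, chosen only for illustration; the function names are likewise inventions of this sketch and do not come from the text.

```python
import math

def catalytic_fraction(r, t):
    """Fraction infected by age t under a constant force r: y = 1 - exp(-r t)."""
    return 1.0 - math.exp(-r * t)

def band_mean_fraction(r, lo, hi):
    """Average of y over the age band [lo, hi]: the expected column height."""
    # Integral of 1 - exp(-r t) from lo to hi, divided by the band width.
    integral = (hi - lo) - (math.exp(-r * lo) - math.exp(-r * hi)) / r
    return integral / (hi - lo)

# Hypothetical age bands (years) and a hypothetical force of infection.
bands = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 30), (30, 40)]
r = 0.05  # assumed effective contacts per person per year

heights = [band_mean_fraction(r, lo, hi) for lo, hi in bands]
for (lo, hi), h in zip(bands, heights):
    print(f"{lo:2d}-{hi:2d} years: expected fraction positive = {h:.3f}")
```

Each column height lies between the curve's values at the two ends of its band, which is why a well-behaved histogram rises in steps that track the smooth curve.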
METHOD
The definition of "closest value" varies with the method of fitting chosen and, as has been stated, there are several methods in common use. Some of them lead to complicated procedures in the case of exponential curves such as the catalytic series. The method of moments has the advantage of permitting the construction of simple nomograms from which the values of the necessary constants can be read directly with sufficient accuracy for the purpose. The calculation of area and mean of the histograms, with which the nomograms are entered, is an uncomplicated procedure. Least-squares and maximum-likelihood methods, on the other hand, require rather laborious methods of successive approximations. It is true that, in the case of the simple catalytic curve $y = 1 - e^{-rt}$, it would be possible to fit a straight line to the natural logarithms of the values of $1 - y$. It is easier, however, to use the area of the histogram applied to the nomogram of Chart I (Appendix A), and this chart can also be used to fit more complicated forms of the catalytic series encountered later.

Using Chart I, we simply estimate the value of r which will produce, under its corresponding curve, for the same time limits, an area equal to that found under the histogram representing the data. The bottom scale of the nomogram gives areas while the divisions on the line immediately above indicate the corresponding values of r. These values of areas are figured for a time scale which runs from 0 to 100. Clearly, actual age bands available for observed data may vary considerably and it would be bothersome to prepare a nomogram for each possible one. It is, however, easy to transform observed data for any span for use with Chart I. For this, we simply divide any observed span of years, running from 0 to d, into 100 equal time units and estimate the area in terms of these units, each d/100 years wide. If, for instance, actual data cover the period from 0 to 45 years, the unit will
be 45/100 years long, or 0.45 year. The observed area, ΣA, is then divided by 0.45 or multiplied by 2.22... to get ΣA′, or the area in terms of the 100 units, which we apply to the bottom scale of Chart I. If we run a ruler up perpendicularly from this point, it intersects the diagonal line at the value of r′, or the estimate of the force of infection, still in terms of units of 0.45 year. Again, r′ must be divided by 0.45 or multiplied by 2.22... to find r, the force of infection in terms of effective contacts per person per entire year. For example, if there are 4.5 effective contacts per 1000 per unit of 0.45 year (r′), then there must be 10 per 1000 per year (r).

It is entirely proper to ask under what circumstances, if any, the postulates set forth above may be reasonably well fulfilled. They require a fairly stable population subjected over a long period to a steady force of infection producing a disease without high mortality but leaving uniform and permanent evidence of infection. Few diseases, probably, completely fulfill these requirements, but one of them is endemic yellow fever in a rural population. This disease involves low mortality (in fact, the great majority of infections do not produce clinically recognizable cases) but protection tests consistently yield clear-cut positive results, which seem to remain throughout life. The rural populations in large parts of Brazil are reasonably stable, at least within larger areas. Yellow fever is generally not present constantly in a given small zone; on the other hand, it is probably never entirely absent from a large zone, and it moves around sufficiently fast so that an estimated value of r may be taken as a sort of measure of average force of infection over a period of years.

EXAMPLE
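The area-matching and rescaling steps described under METHOD can be sketched numerically. The Python fragment below is an illustration only: it replaces the reading of Chart I with a simple bisection search (a swapped-in technique, not the nomogram itself), and the survey figures are hypothetical. It converts the histogram area to the 0 to 100 scale, solves for r′, and converts back to r per year.

```python
import math

def curve_area(r, span=100.0):
    """Area under y = 1 - exp(-r t) for t from 0 to span."""
    # span - (1 - exp(-r*span))/r, written with expm1 for accuracy at small r.
    return span + math.expm1(-r * span) / r

def solve_r_from_area(area, span=100.0, lo=1e-9, hi=10.0):
    """Bisection: find the r whose curve area over [0, span] equals `area`."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if curve_area(mid, span) < area:
            lo = mid  # curve area too small, so the force must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical survey: (age band in years, fraction positive).
survey = [((0, 5), 0.03), ((5, 15), 0.10), ((15, 30), 0.20), ((30, 45), 0.31)]
d = 45.0
unit = d / 100.0  # 0.45 year per unit of the 0-100 scale

area_years = sum((b - a) * y for (a, b), y in survey)  # observed area, in years
area_units = area_years / unit                         # area on the 0-100 scale
r_unit = solve_r_from_area(area_units)                 # r', per unit of 0.45 year
r_year = r_unit / unit                                 # r, per person per year

print(f"area = {area_years:.2f} years = {area_units:.2f} units")
print(f"r' = {r_unit:.5f} per unit, r = {r_year:.5f} per year")
```

Solving directly on the scale of years, with `solve_r_from_area(area_years, span=d)`, gives the same r; the rescaling matters only because a single chart has to serve every possible span.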
Table 1 and Fig. 3 represent fractions positive in a series of protection tests done, a good many years ago, on population samples covering a wide area of interior Amazonas, Brazil. Clearly the histogram representing fractions positive by
age groups shows a fairly regular progression upward with increasing age. The question arises whether the nature of the rise is such as would be produced by a constant force of infection acting on the population; if so, we wish to estimate the size of this force. The details of the fitting procedure are explained in
Fig. 3. Fraction of yellow-fever protection tests giving positive results, by age from 0 to 70 years, interior Amazonas, Brazil, fitted with simple catalytic curve, $y = 1 - e^{-0.«33t}$.

EXAMPLE
The $l_x$ ratio was applied to data on survival in several groups of patients suffering from various degrees of hypertension^® and subjected to "conservative" treatment. The patients were divided into four grades of severity on the basis of physical findings. Values of the ratio were calculated for each group on the basis of "U.S. White" life tables for 1940, using expected survivals for the appropriate age and sex of each patient. Table 9 and Figs. 13 and 14 show this $l_x$ ratio for

Table 9. Survival ratio in hypertension: $l_x$(observed)/$l_x$(expected)

                         Grade II                              Grade III
  t        y      y'_1   y - y'_1    y'_2   y - y'_2        y       y'
 0.5     0.962   0.960     0.002    0.957     0.005       0.885   0.884
 1.5     0.908   0.887     0.021    0.885     0.023       0.679   0.698
 2.5     0.889   0.826     0.063    0.825     0.064       0.549   0.560
 3.5     0.749   0.774    -0.025    0.776    -0.027       0.457   0.459
 4.5     0.704   0.729    -0.025    0.735    -0.031       0.393   0.385
 5.5     0.670   0.691    -0.021    0.699    -0.029       0.353   0.330
 6.5     0.661   0.659     0.002    0.667    -0.006       0.336   0.289
 7.5     0.600   0.632    -0.032    0.641    -0.041       0.293   0.259
 8.5     0.592   0.609    -0.017    0.616    -0.024       0.266   0.237
 9.5     0.584   0.589    -0.005    0.594    -0.010       0.218   0.220
10.5     0.589   0.572     0.017    0.574     0.015       0.180   0.209
11.5     0.574   0.557     0.017    0.556     0.018       0.179   0.200
12.5     0.566   0.545     0.021    0.540     0.026       0.162   0.193

Grade II: $\Sigma y = 9.048$; $\Sigma ty = 52.777$; $\bar{t} = 5.8330$; $\Sigma' y = 69.6$; $\Sigma'' y = 30.4$; $\bar{t}' = 44.87$; $\bar{t}'' = 61.75$
$\Sigma(y - y'_1) = 0.018$; $\Sigma(y - y'_1)^2 = 0.008466$
$\Sigma(y - y'_2) = -0.017$; $\Sigma(y - y'_2)^2 = 0.000770$
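The Grade II totals printed with Table 9 can be checked by direct arithmetic. The Python sketch below recomputes them from the observed ratios and the first fitted series. Two points are assumptions of this sketch rather than statements from the text: that the data span 0 to 13 years, so that one unit of the 0 to 100 scale is 0.13 year, and that the printed values 69.6, 44.87, 30.4 and 61.75 are the rescaled area and mean of the histogram and of its complement. The variable names are hypothetical.

```python
# Grade II, Table 9: interval midpoints, observed ratios, first fitted series.
t = [0.5 + k for k in range(13)]
y = [0.962, 0.908, 0.889, 0.749, 0.704, 0.670, 0.661,
     0.600, 0.592, 0.584, 0.589, 0.574, 0.566]
y1 = [0.960, 0.887, 0.826, 0.774, 0.729, 0.691, 0.659,
      0.632, 0.609, 0.589, 0.572, 0.557, 0.545]

sum_y = sum(y)                                   # area under the histogram
sum_ty = sum(ti * yi for ti, yi in zip(t, y))    # first moment about t = 0
t_bar = sum_ty / sum_y                           # mean of the histogram

unit = 13.0 / 100.0            # assumed: span of 13 years divided into 100 units
area_units = sum_y / unit      # area on the 0-100 scale
mean_units = t_bar / unit      # mean on the 0-100 scale

# Complement (1 - y): its area and mean on the same 0-100 scale.
comp_area_units = 100.0 - area_units
comp_mean_units = (sum(t) - sum_ty) / (len(t) - sum_y) / unit

resid_sum = sum(a - b for a, b in zip(y, y1))            # sum of residuals
resid_sq = sum((a - b) ** 2 for a, b in zip(y, y1))      # sum of squared residuals

print(f"sum y = {sum_y:.3f}, sum ty = {sum_ty:.3f}, mean = {t_bar:.4f}")
print(f"scaled area = {area_units:.1f}, scaled mean = {mean_units:.2f}")
print(f"complement: area = {comp_area_units:.1f}, mean = {comp_mean_units:.2f}")
print(f"sum(y - y1) = {resid_sum:.3f}, sum(y - y1)^2 = {resid_sq:.6f}")
```

Under these assumptions the recomputed figures agree with the printed ones to the number of decimals given, and the signed residuals sum to 0.018 only when the negative differences carry their signs.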
$y'_1 = 0.476 + 0.524\,e^{-\cdots}$