English Pages 269 Year 2014
Repeated Measurements and Cross-Over Designs
The Late Damaraju Raghavarao Laura H. Carnell Professor Department of Statistics Temple University Philadelphia, Pennsylvania
Lakshmi Padgett Janssen Research & Development, LLC Spring House, Pennsylvania
Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. 
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:
Raghavarao, Damaraju V.
Repeated measurements and cross-over designs / by Damaraju Raghavarao and Lakshmi Padgett.
pages cm
Includes bibliographical references.
ISBN 978-1-118-70925-2 (cloth)
1. Measure theory. 2. Statistics–Longitudinal method. I. Title.
QA325.P33 2013
519.5′2–dc23 2013034309

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Dedicated to the memory of my Dad, Damaraju Raghavarao, for his love of family, friends, students, and statistics
Contents

Preface

1. Introduction
   1.1 Introduction
   1.2 One-Sample RMD
   1.3 k-Sample RMD
   1.4 Split-Plot Designs
   1.5 Growth Curves
   1.6 Cross-Over Designs
   1.7 Two-Period Cross-Over Designs
   1.8 Modifications in Cross-Over Designs
   1.9 Nonparametric Methods
   References

2. One-Sample Repeated Measurement Designs
   2.1 Introduction
   2.2 Testing for Sphericity Condition
   2.3 Univariate ANOVA for One-Sample RMD
   2.4 Multivariate Methods for One-Sample RMD
   2.5 Univariate ANOVA Under Nonsphericity Condition
   2.6 Numerical Example
   2.7 Concordance Correlation Coefficient
   2.8 Multiresponse Concordance Correlation Coefficient
   2.9 Repeated Measurements with Binary Response
   References

3. k-Sample Repeated Measurements Design
   3.1 Introduction
   3.2 Test for the Equality of Dispersion Matrices and Sphericity Condition of k-Dispersion Matrices
   3.3 Univariate ANOVA for k-Sample RMD
   3.4 Multivariate Methods for k-Sample RMD
   3.5 Numerical Example
   3.6 Multivariate Methods with Unequal Dispersion Matrices
   3.7 Analysis with Ordered Categorical Response
   References

4. Growth Curve Models
   4.1 Introduction
   4.2 Sigmoidal Curves
   4.3 Analysis of Mixed Models
   4.4 Simple Linear Growth Curve Model
   4.5 Nonlinear Growth Curve Model
   4.6 Numerical Example
   4.7 Joint Action Models
   References

5. Cross-Over Designs without Residual Effects
   5.1 Introduction
   5.2 Fixed Effects Analysis of CODWOR
   5.3 Connectedness in CODWOR
   5.4 Orthogonality in CODWOR
   5.5 Latin Square Designs
   5.6 Youden Square Design and Generalization
   5.7 F-Squares
   5.8 Lattice Square Designs
   5.9 Analysis of CODWOR when the Units Effects Are Random
   5.10 Numerical Example
   5.11 Orthogonal Latin Squares
   References

6. Cross-Over Designs with Residual Effects
   6.1 Introduction
   6.2 Analysis of CODWR
   6.3 BRED
   6.4 PBCOD(m)
   6.5 Numerical Example
   6.6 Analysis with Unit (or Subject) Effects Random
   6.7 Concluding Remarks
   References

7. Two-Period Cross-Over Designs with Residual Effects
   7.1 Introduction
   7.2 Two-Period, Two-Treatment CODWR Analysis: Parametric Methods
       7.2.1 Analysis of the design based on the model (7.2.9)
       7.2.2 Decomposition of the model (7.2.9) into intra- and interunit components
       7.2.3 Estimating direct effects contrast using cross-over nature of the treatments
       7.2.4 Modified two-period, two-treatment design
       7.2.5 Cost analysis
   7.3 Two-Period, Two-Treatment CODWR Analysis: Nonparametric Methods
   7.4 Two-Period t Treatment Cross-Over Design
   7.5 Numerical Examples
   References

8. Other Cross-Over Designs with Residual Effects
   8.1 Introduction
   8.2 Extra-Period Designs
       8.2.1 Residual effect of a treatment effect on itself is the same as residual effect on other treatments
       8.2.2 Residual effect of a treatment on itself is different from the residual effect on other treatments
   8.3 Residual Effects Proportional to Direct Effects
   8.4 Undiminished Residual Effects Designs
   8.5 Treatment Balanced Residual Effects Designs
   8.6 A General Linear Model for CODWR
   8.7 Nested Design
   8.8 Split-Plot Type CODWR
   8.9 CODWR in Circular Arrangement
   8.10 Numerical Examples
   References

9. Some Constructions of Cross-Over Designs
   9.1 Introduction
   9.2 Galois Fields
   9.3 Generalized Youden Designs
   9.4 Williams’ Balanced Residual Effects Designs
   9.5 Other Balanced Residual Effects Designs
   9.6 Combinatorially Overall Balanced Residual Effects Designs
   9.7 Construction of Treatment Balanced Residual Effects Designs
   9.8 Some Construction of PBCOD(m)
   9.9 Construction of Complete Set of MOLS and Patterson’s BRED
   9.10 Balanced Circular Arrangements
   9.11 Concluding Remarks
   References

Index
Preface Repeated measurement designs and cross-over designs are considered synonyms by several researchers. However, we consider repeated measurements terminology to be appropriate for a setting where units do not receive different treatments in the course of the study, whereas crossover designs consider a setting where units receive different treatments in the experiment. In cross-over designs, residual effects are not used in the model when there is a washout period between the change of treatments. Such designs are also called row–column designs and two-way elimination of heterogeneity designs. When washout periods are not used between the change of treatments, the model considers residual effects; such cases may be called cross-over designs with residual effects. These two classes of designs will be called cross-over designs without residual effects (CODWOR) and cross-over designs with residual effects (CODWR). Repeated measurements analysis is usually covered in multivariate analysis books, CODWOR in standard design and analysis of experiments textbooks, and CODWR in specialized books on cross-over designs. The literature on repeated measurements and cross-over designs has grown rapidly in the last few decades in a wide spectrum of research disciplines. We strongly suggest that researchers and experimenters should be familiar with the basic concepts in repeated measurements and cross-over designs. With this objective in mind, this monograph provides an extensive but not exhaustive coverage of several topics of interest. The necessary mathematical results are provided along with SAS version 9.2 programming lines and the needed output to draw inferences. The specialty of this work is to bring together useful contributions in repeated measurements, CODWR, and CODWOR. This book will be
very useful for researchers and experimenters working on these topics. It will also be suitable as a graduate-level textbook dealing with special topics on experimental designs.

Acknowledgments

I would like to express my gratitude to Sharada, Venkatrayudu, Rhandi, and Chris for their love and support in encouraging me to complete this project that I had started with my Dad.

LAKSHMI PADGETT
JANUARY 2014
CHAPTER 1 Introduction
Repeated Measurements and Cross-Over Designs, First Edition. Damaraju Raghavarao and Lakshmi Padgett. © 2014 John Wiley & Sons, Inc. Published 2014 by John Wiley & Sons, Inc.

1.1 INTRODUCTION

In experimental work, one or more treatments are given to units and one or several observations are recorded from each unit. The experimental unit differs from problem to problem. In agricultural experiments, the unit is a plot of land; in preclinical trials, the unit is an animal; in clinical trials, the unit is a subject; in industrial experiments, the unit is a piece of equipment. Treatments are those introduced by the investigator into the experiment to study their effects.

In certain experiments, only one observation will be taken on each unit, while in other experiments, several readings will be taken from each unit. In cases where several measurements are made, either they will all be taken at the same time, as in a standard SAT consisting of essay/writing, critical reading, and math comprehension, or they will be taken over a period of time, as in several tests given in a course. In this monograph, we confine ourselves to the designs and analysis of experiments where several observations are taken from each unit.

While it is absolutely necessary to take several readings on a unit in some experiments, it is desirable to do so in other investigational settings. Consider an animal feeding experiment where four feeds, A, B, C, and D, are tested. One may plan an experiment using 16 cows in the total experiment in which each cow receives one of the four feeds, with four cows for each feed. Or the experiment may be planned with only four cows in the experiment with each cow receiving each of the four feeds at different time intervals. In the latter scenario, using only 4 cows rather than 16 cows is not only economical but also eliminates the cow-to-cow variability in testing the feeds. However, the experiment with four cows will take a longer time to complete.

The class of designs where several observations are taken on each unit can be broadly referred to as repeated measurement designs (RMD). These can be subclassified as

(i) One-sample RMD
(ii) k-Sample RMD (or profile analysis)
(iii) Cross-over designs (or change-over designs) without residual effects (CODWOR) of the treatments like Latin square designs, Youden square designs, and Lattice square designs
(iv) Cross-over designs with residual effects (CODWR) of the treatments like two-period cross-over designs of Grizzle (1965) and balanced residual effects designs (BRED) of Williams (1949)

The standard split-plot design in certain situations can also be considered as an RMD. We will elaborate on these designs in the remaining chapters.

1.2 ONE-SAMPLE RMD

In this setting, a random sample of N experimental units will be taken from a population and p responses will be taken at the same time or at different times on each experimental unit. Another scenario for this design is that N homogeneous units will be treated alike at the beginning of the experiment and p responses will be recorded on each unit at the same time or at different times.

Let $Y_\alpha' = (Y_{\alpha 1}, Y_{\alpha 2}, \ldots, Y_{\alpha p})$ be the vector of the p responses on the αth experimental unit for α = 1, 2, …, N. Let us assume that the $Y_\alpha$ are independently and identically distributed as multivariate normal with mean vector $\mu' = (\mu_1, \mu_2, \ldots, \mu_p)$ and positive definite dispersion matrix Σ. Both μ and Σ are unknown. The null hypothesis of interest in this case is

\[ H_0 : \mu_1 = \mu_2 = \cdots = \mu_p. \tag{1.2.1} \]
The matrix Σ is said to satisfy the circularity condition or sphericity condition if

\[ P_1 \Sigma P_1' = d\, I_{p-1}, \tag{1.2.2} \]

where d is a scalar, $I_{p-1}$ is an identity matrix of order p − 1, and $P_1$ is a (p − 1) × p matrix such that

\[ P = \begin{bmatrix} \dfrac{1}{\sqrt{p}}\, J_{1,p} \\[4pt] P_1 \end{bmatrix} \tag{1.2.3} \]

is an orthogonal matrix, $J_{m,n}$ being an m × n matrix with 1’s everywhere. If $\alpha' = (\alpha_1, \alpha_2, \ldots, \alpha_p)$, Σ of the form

\[ \Sigma = J_{p,1}\,\alpha' + \alpha\, J_{1,p} + \lambda I_p \tag{1.2.4} \]

clearly satisfies the sphericity condition. In particular, a completely symmetric matrix Σ of the form $aI_p + bJ_{p,p}$ satisfies the sphericity condition. The matrix Σ of Equation (1.2.4) is said to satisfy the Huynh–Feldt condition, which will be discussed in Section 2.5.

In Chapter 2, we will show that the null hypothesis (1.2.1) can be tested by the standard univariate procedures if Σ satisfies the sphericity condition. If Σ does not satisfy the sphericity condition, multivariate methods using Hotelling’s $T^2$ will be used to test the null hypothesis (1.2.1), and these methods will also be described in Chapter 2.

We will now provide three practical problems:

EXAMPLE 1.2.1
Three test scores were obtained for 10 randomly selected students in a large elementary statistics course. The methods to test the equality of performance in the three tests for a similar group of students are discussed in Chapter 2. ■

EXAMPLE 1.2.2
Rao (1973) discussed an example in which observations were taken on 28 trees for thickness of cork borings in four directions: North (N), East (E), South (S), and West (W). To test the null hypothesis that the mean thickness of cork borings is the same in the four directions, the methods discussed in Chapter 2 are used. ■
EXAMPLE 1.2.3
In a noisy industrial surrounding, one can test the possible loss of hearing due to the outside noise level. For this purpose, audiogram results can be taken of a homogeneous group of employees over specified time intervals and the data can be analyzed by one-sample RMD methods discussed in Chapter 2. ■

1.3 k-SAMPLE RMD

In this setting, we have k distinct populations and we draw k independent random samples from these populations. Let $N_i$ be the sample size of the sample taken from the ith population (i = 1, 2, …, k) and let $N = \sum_{i=1}^{k} N_i$. Let $Y_{ij}' = (Y_{ij1}, Y_{ij2}, \ldots, Y_{ijp})$ be the vector of p responses taken on the jth selected unit from the ith population (j = 1, 2, …, $N_i$; i = 1, 2, …, k).

Alternatively, this design arises by taking N homogeneous experimental units and applying the ith treatment to $N_i$ randomly selected units at the beginning of the experiment (i = 1, 2, …, k). The p-dimensional response vector $Y_{ij}' = (Y_{ij1}, Y_{ij2}, \ldots, Y_{ijp})$ can then be recorded on the jth unit receiving the ith treatment (j = 1, 2, …, $N_i$; i = 1, 2, …, k).

In each of these cases, we assume that the $Y_{ij}$ are independently and identically distributed multivariate normal with mean vector $\mu_i' = (\mu_{i1}, \mu_{i2}, \ldots, \mu_{ip})$ and positive definite dispersion matrix Σ, for j = 1, 2, …, $N_i$; i = 1, 2, …, k. Both $\mu_i$ and Σ are unknown. In this problem, there are three different null hypotheses of interest to the experimenter and they are

\[ H_{0c} : \begin{bmatrix} \mu_{11} - \mu_{12} \\ \mu_{12} - \mu_{13} \\ \vdots \\ \mu_{1,p-1} - \mu_{1p} \end{bmatrix} = \begin{bmatrix} \mu_{21} - \mu_{22} \\ \mu_{22} - \mu_{23} \\ \vdots \\ \mu_{2,p-1} - \mu_{2p} \end{bmatrix} = \cdots = \begin{bmatrix} \mu_{k1} - \mu_{k2} \\ \mu_{k2} - \mu_{k3} \\ \vdots \\ \mu_{k,p-1} - \mu_{kp} \end{bmatrix}, \tag{1.3.1} \]

\[ H_{0a} : \sum_{j=1}^{p} \mu_{1j} = \sum_{j=1}^{p} \mu_{2j} = \cdots = \sum_{j=1}^{p} \mu_{kj}, \tag{1.3.2} \]

\[ H_{0b} : \sum_{i=1}^{k} \mu_{i1} = \sum_{i=1}^{k} \mu_{i2} = \cdots = \sum_{i=1}^{k} \mu_{ip}. \tag{1.3.3} \]
Here, $\mu_i$ can be interpreted as the profile of the ith population (i = 1, 2, …, k). The null hypothesis $H_{0c}$ then implies that we are testing the parallelism of the k profiles. If $H_{0c}$ is retained, the parallelism hypothesis is not rejected and the profiles will appear as in Figure 1.3.1. When $H_{0c}$ is rejected, the profiles may be either intersecting one another (Figure 1.3.2) or the slopes may be different between the responses (Figure 1.3.3).

[FIGURE 1.3.1 Parallel profiles. Profiles of Population 1, Population 2, and Population k plotted against responses 1, 2, 3, …, p.]
[FIGURE 1.3.2 Intersecting nonparallel profiles. Profiles of Population 1, Population 2, and Population k plotted against responses 1, 2, 3, …, p.]
[FIGURE 1.3.3 Nonparallel profiles with different slopes. Profiles of Population 1, Population 2, and Population k plotted against responses 1, 2, 3, …, p.]

In experimental work, $H_{0c}$ is the null hypothesis of testing the interaction effects between the treatments and the responses. If $H_{0c}$ is not rejected, then one will be interested in testing $H_{0a}$ and/or $H_{0b}$. In $H_{0a}$, we are testing the average of p responses to be constant from population to population (or treatment to treatment). In $H_{0b}$, we are testing the average of the k populations (or treatments) to be the same for the responses. Testing $H_{0a}$ and $H_{0b}$ is, in essence, testing the main effects in a factorial experiment (see Padgett, 2011, for further details).

The analyses of these designs are discussed in Chapter 3. In this case, it is shown that the univariate analysis of variance (ANOVA) can be applied to make all inferences if Σ satisfies the sphericity condition, and multivariate methods are needed if Σ violates the sphericity condition. Univariate methods can also be used by adjusting the degrees of freedom when the sphericity assumption is not valid, and the necessary adjustment will also be given in Chapter 3.

We will close this section with some examples of k-sample RMD given in the literature:

EXAMPLE 1.3.1
Paape and Tucker (1969) considered a study of the influence of pregnancy on concurrent lactational performance of rats measured by litter weight gains. The two groups considered were pregnant and nonpregnant rats. The data were taken at four time intervals/periods: 8–12, 12–16, 16–20, and 20–24 days of lactation. ■

In this setting, one will be interested in testing the parallelism of weight gain profiles for both groups of rats and then testing for the differences of groups averaging over periods and for the differences of lactation periods averaging over the two groups, following the methods discussed in Chapter 3. Gill and Hafs (1971) discussed different types of statistical analyses for this problem.

EXAMPLE 1.3.2
Lee (1977), in a course project at Temple University, analyzed the Adaptive Behavior Scale (ABS) values of mentally challenged institutionalized people. There are four groups of individuals based on their mental ages, and the ABS values for three periods were recorded every 6 months. ■
Lee was interested in testing the hypothesis that all four groups are progressing equally and the hypothesis of no differences in ABS values from group to group and period to period. The numerical details of this type of analysis will be considered in Chapter 3.

EXAMPLE 1.3.3
Danford, Hughes, and McNee (1960) studied the effect of radiation therapy on 45 subjects suffering from cancerous lesions. The subjects were trained to operate a psychomotor testing device, and the average daily scores based on four trials on the day preceding radiation and on each of the 10 days after the therapy were taken as the responses. Six subjects were not given radiation and served as controls, while the remaining subjects were treated with dosages of 25–50, 75–100, or 125–250. The parallelism of group profiles, the differences of radiation levels, and the differences in daily progress can be tested by the methods given in Chapter 3. The dispersion matrices for the k groups of units may not be equal, and we will also discuss this aspect in the analysis in Chapter 3. ■
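To make the three hypotheses (1.3.1)–(1.3.3) concrete, the following sketch computes their sample analogues for the two rat groups of Example 1.3.1, using the artificial litter weight gains listed in Table 1.4.1. It is descriptive only and performs no test; the names `DATA`, `mean_profile`, and `successive_differences` are illustrative and do not come from the text.

```python
# Artificial data of Table 1.4.1: DATA[group][period] lists the 7 rats.
# Periods are 8-12, 12-16, 16-20 and 20-24 days of lactation.
DATA = {
    "pregnant": [
        [3.4, 1.6, 5.7, 7.3, 6.3, 8.1, 7.2],
        [8.1, 9.6, 12.9, 11.9, 9.8, 10.4, 9.4],
        [4.7, 7.8, 10.8, 9.2, 6.4, 7.7, 8.3],
        [1.1, 2.9, 3.6, 5.6, 0.6, 2.9, 3.4],
    ],
    "nonpregnant": [
        [12.1, 8.9, 9.8, 7.9, 8.6, 10.8, 11.7],
        [12.3, 9.4, 10.7, 7.9, 8.5, 10.6, 12.3],
        [12.4, 9.4, 13.2, 7.9, 8.3, 9.9, 9.8],
        [10.1, 7.3, 9.7, 4.6, 5.7, 7.5, 8.4],
    ],
}


def mean_profile(periods):
    """Sample estimate of the profile (mu_i1, ..., mu_ip) of one group."""
    return [sum(values) / len(values) for values in periods]


def successive_differences(profile):
    """Sample analogue of the vector compared across groups in H0c."""
    return [profile[j] - profile[j + 1] for j in range(len(profile) - 1)]


profiles = {g: mean_profile(periods) for g, periods in DATA.items()}

# H0c (parallelism): these difference vectors should agree across groups.
slopes = {g: successive_differences(p) for g, p in profiles.items()}

# H0a (groups): the profile sums over the p responses should agree.
levels = {g: sum(p) for g, p in profiles.items()}

# H0b (responses): the sums over the k groups should agree across responses.
response_totals = [sum(p[j] for p in profiles.values()) for j in range(4)]

for g in profiles:
    print(g, [round(x, 3) for x in profiles[g]], [round(x, 3) for x in slopes[g]])
print("levels:", {g: round(v, 3) for g, v in levels.items()})
print("response totals:", [round(x, 3) for x in response_totals])
```

Large discrepancies between the two difference vectors would point away from parallelism, but judging whether they are larger than chance requires the tests of Chapter 3.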
1.4 SPLIT-PLOT DESIGNS

Split-plot designs are widely used in agricultural experiments (see Gomez and Gomez, 1984; Raghavarao, 1983). The experimental material is first divided into main plots to accommodate main treatments. Each main plot is then subdivided into s subplots, and the s subplot treatments are randomly assigned to each main plot. The main plot treatments assigned to main plots can either form a randomized block design (RBD) or a completely randomized design (CRD). In the context of RMD, it is more appropriate to consider the main plot treatments to form a CRD. With three main plot treatments a0, a1, and a2 replicated on 3, 4, and 4 main plots and with four subplot treatments b0, b1, b2, and b3, the layout may appear as in Figure 1.4.1.

FIGURE 1.4.1 Split-plot layout.

Main plot treatment   a0  a2  a2  a1  a2  a0  a0  a1  a1  a2  a1
Subplot 1             b0  b1  b0  b0  b3  b3  b2  b1  b0  b0  b1
Subplot 2             b2  b0  b1  b1  b2  b1  b3  b2  b1  b1  b0
Subplot 3             b1  b3  b3  b2  b0  b2  b0  b3  b3  b2  b2
Subplot 4             b3  b2  b2  b3  b1  b0  b1  b0  b2  b3  b3

In the RMD setting, one can consider three groups of experimental units a0, a1, and a2, respectively, of sizes 3, 4, and 4. Ignoring the subplot treatments, one considers the sequence of four subplot observations as the four-period observations. The model assumes equal correlation structure of period observations on each experimental unit. Further, the systematic arrangement of the subplot data somewhat violates the assumptions of split-plot analysis. However, this design is also widely used as RMD. We will not formally discuss this design in this monograph as this design is discussed in detail in several books on experimental designs; however, for completeness, we will provide the SAS program in Example 1.4.1.

TABLE 1.4.1 Artificial data for profile analysis

Lactation          Pregnant rats                          Nonpregnant rats
period (days)      1     2     3     4     5     6     7     1     2     3     4     5     6     7
8–12              3.4   1.6   5.7   7.3   6.3   8.1   7.2  12.1   8.9   9.8   7.9   8.6  10.8  11.7
12–16             8.1   9.6  12.9  11.9   9.8  10.4   9.4  12.3   9.4  10.7   7.9   8.5  10.6  12.3
16–20             4.7   7.8  10.8   9.2   6.4   7.7   8.3  12.4   9.4  13.2   7.9   8.3   9.9   9.8
20–24             1.1   2.9   3.6   5.6   0.6   2.9   3.4  10.1   7.3   9.7   4.6   5.7   7.5   8.4
EXAMPLE 1.4.1
We will now consider artificial data given in Table 1.4.1 for the problem mentioned in Example 1.3.1. ■

The following SAS program provides the necessary output:
data a;
input days $ treatment ratnumber value @@;
cards;
8-12 1 1 3.4 8-12 1 2 1.6 8-12 1 3 5.7 8-12 1 4 7.3 8-12 1 5 6.3 8-12 1 6 8.1 8-12 1 7 7.2
8-12 2 1 12.1 8-12 2 2 8.9 8-12 2 3 9.8 8-12 2 4 7.9 8-12 2 5 8.6 8-12 2 6 10.8 8-12 2 7 11.7
12-16 1 1 8.1 12-16 1 2 9.6 12-16 1 3 12.9 12-16 1 4 11.9 12-16 1 5 9.8 12-16 1 6 10.4 12-16 1 7 9.4
12-16 2 1 12.3 12-16 2 2 9.4 12-16 2 3 10.7 12-16 2 4 7.9 12-16 2 5 8.5 12-16 2 6 10.6 12-16 2 7 12.3
16-20 1 1 4.7 16-20 1 2 7.8 16-20 1 3 10.8 16-20 1 4 9.2 16-20 1 5 6.4 16-20 1 6 7.7 16-20 1 7 8.3
16-20 2 1 12.4 16-20 2 2 9.4 16-20 2 3 13.2 16-20 2 4 7.9 16-20 2 5 8.3 16-20 2 6 9.9 16-20 2 7 9.8
20-24 1 1 1.1 20-24 1 2 2.9 20-24 1 3 3.6 20-24 1 4 5.6 20-24 1 5 0.6 20-24 1 6 2.9 20-24 1 7 3.4
20-24 2 1 10.1 20-24 2 2 7.3 20-24 2 3 9.7 20-24 2 4 4.6 20-24 2 5 5.7 20-24 2 6 7.5 20-24 2 7 8.4
;
data final; set a;
if days='8-12' then period=1;
else if days='12-16' then period=2;
else if days='16-20' then period=3;
else if days='20-24' then period=4;
proc sort; by days ratnumber;
proc glm;
class ratnumber treatment period;
model value=treatment treatment(ratnumber) period treatment*period;
* If the main treatments are arranged in a RBD, we will use 'blocks' and 'blocks*interaction' in the model statement and remove the treatment(ratnumber) term. We will also use 'blocks' instead of 'ratnumber' in the class statement.;
test h=treatment e=treatment(ratnumber);
* In the RBD case, we will use e=blocks*interaction;
run;
means period/snk;
means treatment/snk e=treatment(ratnumber);
run;

***
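As a cross-check on the program above, the split-plot sums of squares can also be computed directly from the data of Table 1.4.1. The sketch below is a plain-Python illustration that relies on the layout being balanced (7 rats per group, 4 periods per rat); `VALUES` and `split_plot_ss` are hypothetical names, not part of the SAS program.

```python
from collections import defaultdict

# Artificial data of Table 1.4.1: VALUES[treatment][period] lists rats 1..7.
# Treatment 1 = pregnant, 2 = nonpregnant; periods are 8-12, ..., 20-24 days.
VALUES = {
    1: [[3.4, 1.6, 5.7, 7.3, 6.3, 8.1, 7.2],
        [8.1, 9.6, 12.9, 11.9, 9.8, 10.4, 9.4],
        [4.7, 7.8, 10.8, 9.2, 6.4, 7.7, 8.3],
        [1.1, 2.9, 3.6, 5.6, 0.6, 2.9, 3.4]],
    2: [[12.1, 8.9, 9.8, 7.9, 8.6, 10.8, 11.7],
        [12.3, 9.4, 10.7, 7.9, 8.5, 10.6, 12.3],
        [12.4, 9.4, 13.2, 7.9, 8.3, 9.9, 9.8],
        [10.1, 7.3, 9.7, 4.6, 5.7, 7.5, 8.4]],
}


def split_plot_ss(values):
    """Sums of squares for a balanced split plot with a CRD on the main plots."""
    obs = [(t, r, p, y)
           for t, periods in values.items()
           for p, rats in enumerate(periods)
           for r, y in enumerate(rats)]
    n = len(obs)
    cf = sum(y for *_, y in obs) ** 2 / n  # correction factor

    def raw(key):
        # Sum over the classes defined by `key` of (class total)^2 / class size.
        totals, counts = defaultdict(float), defaultdict(int)
        for t, r, p, y in obs:
            totals[key(t, r, p)] += y
            counts[key(t, r, p)] += 1
        return sum(totals[c] ** 2 / counts[c] for c in totals)

    trt = raw(lambda t, r, p: t)
    rat = raw(lambda t, r, p: (t, r))
    per = raw(lambda t, r, p: p)
    cell = raw(lambda t, r, p: (t, p))
    ss = {
        "total": sum(y * y for *_, y in obs) - cf,
        "treatment": trt - cf,
        "rats within treatment": rat - trt,  # main-plot error
        "period": per - cf,
        "treatment*period": cell - trt - per + cf,
    }
    ss["model"] = sum(v for name, v in ss.items() if name != "total")
    ss["error"] = ss["total"] - ss["model"]  # subplot error
    return ss


ss = split_plot_ss(VALUES)
for name, value in ss.items():
    print(f"{name:24s}{value:12.4f}")
```

The model, error, and total sums of squares obtained this way can be compared with the Model, Error, and Corrected Total lines of the GLM output that follows.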
The GLM Procedure
Dependent Variable: value

Source             DF        Sum of Squares   Mean Square       F Value   Pr > F
Model              19        480.3192857      25.2799624        20.91     […]
Error              36 (a6)    43.5150000       1.2087500 (a4)
Corrected Total    55        523.8342857

R-Square    Coeff Var   Root MSE
0.916930    13.60923    1.099432

[…]

\[ [\ldots]/S_5 > F_\alpha\bigl((p-1),\,(p-1)(N-k)\bigr). \tag{3.3.6} \]

The analysis resulting from the earlier tests can be summarized in the k-sample RMD ANOVA (Table 3.3.1). One can perform standard multiple comparison tests like Scheffé’s to test individual contrasts of responses or groups. When the sphericity condition is not satisfied for the common dispersion matrix, one can conservatively test the F statistics for responses with numerator and denominator degrees of freedom 1 and (p–1)(N–k).

TABLE 3.3.1 k-Sample RMD ANOVA

Source               d.f.          S.S.   M.S.              F
Responses            p–1           S1     S1/(p–1)          (N–k)S1/S5
Groups               k–1           S2     S2/(k–1)          (N–k)S2/[(k–1)S3]
Units/groups         N–k           S3     S3/(N–k)
Groups × responses   (k–1)(p–1)    S4     S4/[(k–1)(p–1)]   (N–k)S4/[(k–1)S5]
Error                (p–1)(N–k)    S5     S5/[(p–1)(N–k)]
Total                Np–1          S6
3.4 MULTIVARIATE METHODS FOR k-SAMPLE RMD

Letting $Y_{i\alpha}$ be the p-component response vector on the αth observation (or unit) taken from the ith population (or treated with the ith treatment), it is assumed that $Y_{i\alpha} \sim IMN_p(\mu_i, \Sigma)$ for α = 1, 2, …, $N_i$; i = 1, 2, …, k. Let L be a (p–1) × p matrix as defined in (2.4.1) and let

\[ W_{i\alpha} = L Y_{i\alpha}, \quad \alpha = 1, 2, \ldots, N_i;\ i = 1, 2, \ldots, k. \tag{3.4.1} \]

Then, clearly

\[ W_{i\alpha} \sim IMN_{p-1}(\upsilon_i, \psi), \tag{3.4.2} \]

where

\[ \upsilon_i = L\mu_i, \quad i = 1, 2, \ldots, k \tag{3.4.3} \]

and

\[ \psi = L \Sigma L'. \tag{3.4.4} \]

The null hypothesis (3.1.1) is equivalent to

\[ H'_{0I} : \upsilon_1 = \upsilon_2 = \cdots = \upsilon_k. \tag{3.4.5} \]

This is the standard multivariate ANOVA problem discussed in Rao (1973), Morrison (1976), and Anderson (2003) and can be solved using test statistics involving determinants or maximum eigenvalues of functions of matrices representing the hypotheses and error. Let

\[ \bar{W}_i = \frac{1}{N_i} \sum_{\alpha=1}^{N_i} W_{i\alpha}, \quad i = 1, 2, \ldots, k \tag{3.4.6} \]

and

\[ \bar{W} = \frac{1}{N} \sum_{i=1}^{k} \sum_{\alpha=1}^{N_i} W_{i\alpha}. \tag{3.4.7} \]

Put

\[ H = \sum_{i=1}^{k} N_i (\bar{W}_i - \bar{W})(\bar{W}_i - \bar{W})' \tag{3.4.8} \]

and

\[ E = \sum_{i=1}^{k} \sum_{\alpha=1}^{N_i} (W_{i\alpha} - \bar{W}_i)(W_{i\alpha} - \bar{W}_i)'. \tag{3.4.9} \]

The matrices H and E are the hypothesis and error matrices for testing the null hypothesis (3.4.5). Let s = min(k – 1, p – 1) and let $c_s$ be the greatest eigenvalue of $HE^{-1}$. The upper percentage points of the greatest root distribution of the matrix $H(H + E)^{-1}$ have been computed by Heck (1960), Pillai and Bantegui (1959), and Pillai (1964, 1965, 1967). The maximum eigenvalue $c_s$ of $HE^{-1}$ is related to the maximum eigenvalue $\theta_s$ of $H(H + E)^{-1}$ by the relation

\[ \theta_s = \frac{c_s}{1 + c_s}. \tag{3.4.10} \]

Using $m = (|k - p| - 1)/2$ and $n = (N - k - p)/2$, the upper α percentile point $c_\alpha(s, m, n)$ of the distribution of $\theta_s$ can be obtained from Heck’s charts or Pillai’s tables, and the critical region for testing the null hypothesis (3.4.5) is

\[ \theta_s > c_\alpha(s, m, n). \tag{3.4.11} \]

Alternatively, one notes that

\[ \frac{|E|}{|H + E|} \sim U_{p-1,\,k-1,\,N-k} \tag{3.4.12} \]
and can construct critical regions accordingly for testing the null hypothesis (3.4.5).

Letting

\[ U_{i\alpha} = J_{1,p} Y_{i\alpha}, \quad \alpha = 1, 2, \ldots, N_i;\ i = 1, 2, \ldots, k, \tag{3.4.13} \]

it follows that the one-dimensional random variable $U_{i\alpha}$ has

\[ U_{i\alpha} \sim IN\bigl(J_{1,p}\,\mu_i,\ J_{1,p}\,\Sigma\, J_{p,1}\bigr), \tag{3.4.14} \]

and the test of the null hypothesis (3.1.2) is based on the F statistic with (k – 1) and (N – k) degrees of freedom given by

\[ F = \frac{MS_b}{MS_e}, \tag{3.4.15} \]

where

\[ \bar{U}_i = \frac{1}{N_i} \sum_{\alpha=1}^{N_i} U_{i\alpha}, \qquad \bar{U} = \frac{1}{N} \sum_{i=1}^{k} \sum_{\alpha=1}^{N_i} U_{i\alpha}, \]

\[ MS_b = \frac{1}{k-1} \sum_{i=1}^{k} N_i (\bar{U}_i - \bar{U})^2, \tag{3.4.16} \]

and

\[ MS_e = \frac{1}{N-k} \sum_{i=1}^{k} \sum_{\alpha=1}^{N_i} (U_{i\alpha} - \bar{U}_i)^2. \tag{3.4.17} \]

This can be summarized in Table 3.4.1.

TABLE 3.4.1 ANOVA for testing groups in a k-sample RMD

Source           d.f.   S.S.                                                        M.S.     F
Between groups   k–1    $\sum_{i=1}^{k} N_i(\bar{U}_i - \bar{U})^2$                  $MS_b$   $MS_b/MS_e$
Within groups    N–k    $\sum_{i=1}^{k}\sum_{\alpha=1}^{N_i}(U_{i\alpha} - \bar{U}_i)^2$   $MS_e$
Total            N–1    $\sum_{i=1}^{k}\sum_{\alpha=1}^{N_i}(U_{i\alpha} - \bar{U})^2$

The critical region for testing the null hypothesis (3.1.2) is thus

\[ F > F_\alpha\bigl((k-1),\,(N-k)\bigr), \tag{3.4.18} \]

where F is the F value from the ANOVA Table 3.4.1. This is the same as the test discussed for univariate ANOVA in Section 3.3.

When the null hypothesis (3.1.1) or equivalently (3.4.5) is tenable, to test the responses, let υ be the common value of $\upsilon_1 = \upsilon_2 = \cdots = \upsilon_k$. The null hypothesis (3.1.3) is then equivalent to

\[ H'_{0r} : \upsilon = 0. \tag{3.4.19} \]

Now $W_{i\alpha} \sim IMN_{p-1}(\upsilon, \psi)$ and testing (3.4.19) can be done as described in Section 2.4. Letting E be defined as in Equation (3.4.9), one notes that $E \sim W_{p-1}(N-k, \psi)$. Put

\[ S = \frac{1}{N-k}\, E \tag{3.4.20} \]

and

\[ T^2 = N\, \bar{W}' S^{-1} \bar{W}. \tag{3.4.21} \]

Under the null hypothesis (3.4.19),

\[ F = \frac{N-k-p+2}{(N-k)(p-1)}\, T^2 \tag{3.4.22} \]

is distributed as an F distribution, with degrees of freedom (p – 1) and (N – k – p + 2). Thus, the critical region for testing the null hypothesis (3.1.3) is

\[ \frac{N(N-k-p+2)}{(N-k)(p-1)}\, \bar{W}' S^{-1} \bar{W} > F_\alpha\bigl((p-1),\,(N-k-p+2)\bigr). \tag{3.4.23} \]
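The groups test of Equations (3.4.13)–(3.4.17) reduces each unit to the sum of its p responses and then runs a one-way ANOVA on those sums. A minimal sketch, assuming the artificial ABS scores of Table 3.5.1 as input; the names `ABS` and `groups_f_test` are illustrative and not from the text.

```python
# Artificial ABS scores of Table 3.5.1: ABS[group] lists (period1, period2, period3).
ABS = {
    1: [(109, 115, 125), (110, 112, 135), (105, 102, 100)],
    2: [(97, 120, 121), (126, 142, 148), (151, 197, 189)],
    3: [(122, 162, 138), (157, 152, 217), (212, 235, 213), (137, 183, 273)],
    4: [(225, 204, 197), (175, 181, 203)],
}


def groups_f_test(groups):
    """F statistic of (3.4.15) with its degrees of freedom (k - 1, N - k)."""
    # U_ia = J_{1,p} Y_ia: the sum of the p responses on each unit, as in (3.4.13).
    u = {g: [sum(y) for y in units] for g, units in groups.items()}
    k = len(u)
    n_total = sum(len(v) for v in u.values())
    grand_mean = sum(sum(v) for v in u.values()) / n_total
    group_means = {g: sum(v) / len(v) for g, v in u.items()}

    ss_between = sum(len(v) * (group_means[g] - grand_mean) ** 2
                     for g, v in u.items())
    ss_within = sum((x - group_means[g]) ** 2
                    for g, v in u.items() for x in v)

    ms_b = ss_between / (k - 1)        # (3.4.16)
    ms_e = ss_within / (n_total - k)   # (3.4.17)
    return ms_b / ms_e, k - 1, n_total - k


f_value, df1, df2 = groups_f_test(ABS)
print(f"F = {f_value:.2f} on ({df1}, {df2}) degrees of freedom")
```

Because the F ratio is unchanged when every $U_{i\alpha}$ is divided by the same constant, the value obtained here can be compared with the between-subjects F value reported at (a4) in the numerical example of Section 3.5.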
3.5 NUMERICAL EXAMPLE

In this section, the methods discussed in Sections 3.2–3.4 will be illustrated through an example with artificial data for the situation considered in Example 1.3.2.

TABLE 3.5.1 Artificial data of ABS scores

          Periods
Groups    1      2      3
1         109    115    125
          110    112    135
          105    102    100
2          97    120    121
          126    142    148
          151    197    189
3         122    162    138
          157    152    217
          212    235    213
          137    183    273
4         225    204    197
          175    181    203

EXAMPLE 1.3.2 (Continued)
Let us consider the data given in Table 3.5.1. ■
We assume that the dispersion matrices for the four groups are equal, as we cannot test the equality of dispersion matrices for this example because the sample size is not at least 1 more than the number of variables in every group. The following SAS program provides the necessary output:

data a;
input patnum group period1 period2 period3 @@;
cards;
1 1 109 115 125 2 1 110 112 135 3 1 105 102 100
4 2 97 120 121 5 2 126 142 148 6 2 151 197 189
7 3 122 162 138 8 3 157 152 217 9 3 212 235 213 10 3 137 183 273
11 4 225 204 197 12 4 175 181 203
;
proc glm; class group;
model period1 period2 period3=group/nouni;
* the nouni option suppresses the univariate ANOVA for each dependent variable;
repeated period 3 profile / printe uepsdef=HF;
run;
* The period indicates the label for the repeated measurements variable, 3 is the number of repeated measurements, profile is the type of transformation specified and printe provides a test for sphericity (see the SAS manual for further details);
data b; set a;
diff1=period1-period2; diff2=period2-period3;
* This is needed to study the linear and quadratic period effects across groups;
proc glm; class group;
model diff1 diff2=group/nouni;
repeated diff 2 profile;
run;

***
Sphericity Tests

Variables               DF   Mauchly's Criterion   Chi-Square   Pr > ChiSq
Transformed Variates    2    0.5882072             3.7147324    0.1561
Orthogonal Components   2    0.457953              5.4669214    0.0650 (a1)
The GLM Procedure
Repeated Measures Analysis of Variance

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no period Effect
H = Type III SSCP Matrix for period
E = Error SSCP Matrix
S=1 M=0 N=2.5

Statistic                Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.54657575   2.90      2        7        0.1207 (a2)
Pillai's Trace           0.45342425   2.90      2        7        0.1207
Hotelling-Lawley Trace   0.82957256   2.90      2        7        0.1207
Roy's Greatest Root      0.82957256   2.90      2        7        0.1207

MANOVA Test Criteria and F Approximations for the Hypothesis of no period*group Effect
H = Type III SSCP Matrix for period*group
E = Error SSCP Matrix
S=2 M=0 N=2.5

Statistic                Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.43419217   1.21      6        14       0.3584 (a3)
Pillai's Trace           0.62009838   1.20      6        16       0.3564
Hotelling-Lawley Trace   1.17808962   1.32      6        7.7895   0.3502
Roy's Greatest Root      1.06014535   2.83      3        8        0.1066

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source   DF   Type III SS   Mean Square   F Value   Pr > F
group    3    37607.02778   12535.67593   5.50      0.0240 (a4)
Error    8    18234.86111    2279.35764
The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

                                                                            Adj Pr > F
Source          DF   Type III SS    Mean Square    F Value   Pr > F        G-G           H-F
period           2   3070.676471    1535.338235    2.74      0.0949 (a5)   0.1230 (a6)   0.0949 (a7)
period*group     6   2958.555556     493.092593    0.88      0.5317 (a8)   0.5064 (a9)   0.5317 (a10)
Error(period)   16   8972.388889     560.774306

Greenhouse-Geisser Epsilon   0.6485 (a11)
Huynh-Feldt Epsilon          1.0118 (a12)
***

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no diff*group Effect
H = Type III SSCP Matrix for diff*group
E = Error SSCP Matrix
S=1 M=0.5 N=3

Statistic                 Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.82126201   0.58      3        8        0.6442 (a13)
Pillai's Trace            0.17873799   0.58      3        8        0.6442
Hotelling-Lawley Trace    0.21763820   0.58      3        8        0.6442
Roy's Greatest Root       0.21763820   0.58      3        8        0.6442

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source   DF   Type III SS    Mean Square   F Value   Pr > F
group     3   2442.750000    814.250000    0.99      0.4464 (a14)
Error     8   6602.375000    825.296875
***

In the output at (a1), we have the p-value for testing the sphericity condition on the dispersion matrix (Mauchly, 1940). If this p-value is more than 0.05, the sphericity assumption is not rejected. In our example, it is 0.065, so the sphericity condition is not rejected, and hence the univariate approach can be used to test the hypotheses. At (a5) and (a8), we have the p-values for testing the period effect and the period*group interaction effect. In our example, these are not significant. Had the sphericity condition been rejected, the p-values given at (a6) and (a9), adjusted by the method of Greenhouse and Geisser, or the p-values given at (a7) and (a10), adjusted by the method of Huynh and Feldt, could have been used. In our problem, these p-values are also not significant. The Greenhouse and Geisser (1959) adjustment $\hat{\varepsilon}$ is given at (a11) and is 0.6485. The Huynh and Feldt (1976) adjustment $\tilde{\varepsilon}$ is given at (a12) and is 1.0118.

Note that, for SAS releases from 9.22 onwards, the Lecoutre correction of the Huynh–Feldt epsilon is displayed by default instead of the Huynh–Feldt epsilon. The numerator of the Huynh–Feldt epsilon is precisely unbiased only when there are no between-subject effects, and Lecoutre (1991) made a correction to that numerator. However, one can still request the Huynh–Feldt epsilon by using the option UEPSDEF = HF in the REPEATED statement. In our example, the Huynh–Feldt–Lecoutre epsilon is 0.7215, and the p-values for period and period*group adjusted by the method of Huynh–Feldt–Lecoutre are 0.1167 and 0.5126.

With the multivariate methods, the p-value at (a2) is used to test the hypothesis of no period effect and the p-value at (a3) to test the interaction effect of periods and groups. The p-value at (a4) is used to test the group effect in both the univariate and the multivariate analyses. In our example, this p-value is significant, and hence the ABS scores differ in the four groups. If the periods are of equal time intervals, the linear and quadratic effects of periods can be tested by using the p-values at (a13) and (a14). The linear effect equality across groups is tested by the p-value given at (a14), and it is not significant in our example.
The quadratic effect equality across groups of the periods can be tested by the p-value given at (a13) and is not significant. For further programming details, see the SAS manual.
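As a quick check on how the univariate table is put together, the following Python sketch recomputes the F ratios from the printed mean squares and shows how an epsilon estimate rescales the degrees of freedom. It is only an illustration: the function names are hypothetical, the numbers are copied from the output above, and the truncation of an epsilon larger than 1 to 1 is stated here as an assumption about how the adjusted tests are formed (it is consistent with (a7) and (a10) coinciding with (a5) and (a8)).

```python
# Mean squares and degrees of freedom copied from the within-subject table.
ms_period, ms_interaction, ms_error = 1535.338235, 493.092593, 560.774306
df_period, df_interaction, df_error = 2, 6, 16
gg_epsilon = 0.6485   # Greenhouse-Geisser estimate at (a11)
hf_epsilon = 1.0118   # Huynh-Feldt estimate at (a12)


def f_ratio(ms_effect, ms_err):
    """F statistic as the ratio of an effect mean square to the error mean square."""
    return ms_effect / ms_err


def adjusted_df(df_effect, df_err, epsilon):
    """Scale both degrees of freedom by epsilon, truncating epsilon at 1 (assumption)."""
    eps = min(epsilon, 1.0)
    return eps * df_effect, eps * df_err


f_period = f_ratio(ms_period, ms_error)
f_interaction = f_ratio(ms_interaction, ms_error)
gg_df_period = adjusted_df(df_period, df_error, gg_epsilon)
hf_df_period = adjusted_df(df_period, df_error, hf_epsilon)

print(round(f_period, 2), round(f_interaction, 2))
print(gg_df_period, hf_df_period)
```

The adjusted p-values are then obtained by referring the same F ratios to an F distribution with the rescaled degrees of freedom.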
3.6 MULTIVARIATE METHODS WITH UNEQUAL DISPERSION MATRICES
Continuing the notation used in this chapter so far, let $Y_{i\alpha}$ be distributed $IMN_p(\mu_i, \Sigma_i)$ for $\alpha = 1, 2, \ldots, N_i$; $i = 1, 2, \ldots, k$. Put

$$\bar{Y}_i = \frac{1}{N_i}\sum_{\alpha=1}^{N_i} Y_{i\alpha}, \qquad S_i = \frac{1}{n_i}\sum_{\alpha=1}^{N_i} (Y_{i\alpha} - \bar{Y}_i)(Y_{i\alpha} - \bar{Y}_i)', \quad \text{where } n_i = N_i - 1,$$

$$S = D(S_1, S_2, \ldots, S_k), \qquad \bar{Y}' = (\bar{Y}_1', \bar{Y}_2', \ldots, \bar{Y}_k'), \qquad \mu' = (\mu_1', \mu_2', \ldots, \mu_k'),$$

$$L_i = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{bmatrix}, \quad i = 1, 2,$$

$$C_1 = L_1 \otimes L_2, \qquad C_2 = L_1 \otimes J_{1,p}, \qquad C_3 = J_{1,k} \otimes L_2,$$

where $L_1$ is a $(k-1) \times k$ matrix and $L_2$ is a $(p-1) \times p$ matrix. Further, $D(\ldots)$ is a block diagonal matrix with the listed entries in the diagonal positions. Note that the rows of $C_1\mu$ are independent contrasts of the group and period interaction, the rows of $C_2\mu$ are independent contrasts of groups, and the rows of $C_3\mu$ are independent contrasts of periods. Following Welch (1947, 1951) and James (1951, 1954), the three null hypotheses (3.1.1), (3.1.2), and (3.1.3) can be tested using the test statistics

$$T_\ell = (C_\ell \bar{Y})'(C_\ell S C_\ell')^{-1}(C_\ell \bar{Y}), \quad \ell = 1, 2, 3.$$

Let $q = kp$, $m_1 = (p-1)(k-1)$, $m_2 = k-1$, $m_3 = p-1$,

$$f_{1\ell} = q - m_\ell, \qquad f_{2\ell} = \frac{(q - m_\ell)(q - m_\ell + 2)}{3A_\ell}, \qquad e_\ell = q - m_\ell + 2A_\ell - \frac{6A_\ell}{q - m_\ell + 2},$$

$$A_\ell = \frac{1}{2}\sum_{i=1}^{k} \frac{\operatorname{tr}\left\{S C_\ell'(C_\ell S C_\ell')^{-1} C_\ell Q_i\right\}^2 + \left[\operatorname{tr}\left\{S C_\ell'(C_\ell S C_\ell')^{-1} C_\ell Q_i\right\}\right]^2}{n_i}$$

for $\ell = 1, 2, 3$, where $Q_i$ is a $kp \times kp$ block matrix whose $(u, w)$th block is $I_p$ for $u = w = i$ and 0 otherwise; $u, w = 1, 2, \ldots, k$. The critical regions for testing the hypotheses (3.1.1), (3.1.2), and (3.1.3) are, respectively,

$$\frac{T_1}{e_1} > F_\alpha(f_{11}, f_{21}), \qquad \frac{T_2}{e_2} > F_\alpha(f_{12}, f_{22}), \qquad \text{and} \qquad \frac{T_3}{e_3} > F_\alpha(f_{13}, f_{23}). \tag{3.6.1}$$

For some simulation results, see Keselman, Carriere, and Lix (1993). Also, see Keselman, Algina, and Kowalchuk (2001) for a review on repeated measurements designs.

Let us continue with the ABS scores data given in Table 3.5.1. Though the dispersion matrices for the four groups are not significantly different, we will still use the test discussed in this section.

data a; input patnum group period1 period2 period3 @@; cards;
1 1 109 115 125   2 1 110 112 135   3 1 105 102 100
4 2  97 120 121   5 2 126 142 148   6 2 151 197 189
7 3 122 162 138   8 3 157 152 217   9 3 212 235 213
10 3 137 183 273  11 4 225 204 197  12 4 175 181 203
;
proc sort; by group;
proc corr cov outp=stats noprint;
var period1 period2 period3; by group; run;
data stats; set stats;
proc iml; use stats;
read all into y var{period1 period2 period3};
meany = {108.00 109.67 120.0 124.67 153.00 152.67 157.00 183.00 210.25 200.00 192.50 200.00};
l1 = {1 -1 0 0, 1 0 -1 0, 1 0 0 -1};
l2 = {1 -1 0, 1 0 -1};
j1 = {1 1 1};
j2 = {1 1 1 1};
c1 = l1@l2; * l1 kronecker product l2;
c2 = l1@j1;
c3 = j2@l2;
/*** input data ***/
k = 4; * number of groups;
p = 3; * number of periods;
/*** end of input data ***/
q = k*p;
m1 = (p-1)*(k-1); m2 = k-1; m3 = p-1;
f11 = q - m1; f12 = q - m2; f13 = q - m3;
s1 = y[1:3,1:3]; s2 = y[10:12,1:3]; s3 = y[19:21,1:3]; s4 = y[28:30,1:3];
s = block(s1,s2,s3,s4); * S = D(s1,s2,s3,s4);
T1 = (C1*meany`)`*(inv(c1*s*C1`))*(c1*meany`);
T2 = (C2*meany`)`*(inv(c2*s*C2`))*(c2*meany`);
T3 = (C3*meany`)`*(inv(c3*s*C3`))*(c3*meany`);
zero = {0 0 0, 0 0 0, 0 0 0};
q1 = block(I(3),zero,zero,zero); * create a block matrix;
q2 = block(zero,I(3),zero,zero);
q3 = block(zero,zero,I(3),zero);
q4 = block(zero,zero,zero,I(3));
/** A1 **/
d11 = (s*c1`)*(inv(c1*s*c1`))*(c1*q1); a11 = (trace(d11*d11)+(trace(d11))**2)/2;
d12 = (s*c1`)*(inv(c1*s*c1`))*(c1*q2); a12 = (trace(d12*d12)+(trace(d12))**2)/2;
d13 = (s*c1`)*(inv(c1*s*c1`))*(c1*q3); a13 = (trace(d13*d13)+(trace(d13))**2)/3;
d14 = (s*c1`)*(inv(c1*s*c1`))*(c1*q4); a14 = (trace(d14*d14)+(trace(d14))**2)/1;
a1 = (a11+a12+a13+a14)/2;
/** A2 **/
d21 = (s*c2`)*(inv(c2*s*c2`))*(c2*q1); a21 = (trace(d21*d21)+(trace(d21))**2)/2;
d22 = (s*c2`)*(inv(c2*s*c2`))*(c2*q2); a22 = (trace(d22*d22)+(trace(d22))**2)/2;
* a23 and a24 use d23 and d24, the matrices built from c2, as in the definition of A2;
d23 = (s*c2`)*(inv(c2*s*c2`))*(c2*q3); a23 = (trace(d23*d23)+(trace(d23))**2)/3;
d24 = (s*c2`)*(inv(c2*s*c2`))*(c2*q4); a24 = (trace(d24*d24)+(trace(d24))**2)/1;
a2 = (a21+a22+a23+a24)/2;
/** A3 **/
d31 = (s*c3`)*(inv(c3*s*c3`))*(c3*q1); a31 = (trace(d31*d31)+(trace(d31))**2)/2;
d32 = (s*c3`)*(inv(c3*s*c3`))*(c3*q2); a32 = (trace(d32*d32)+(trace(d32))**2)/2;
d33 = (s*c3`)*(inv(c3*s*c3`))*(c3*q3); a33 = (trace(d33*d33)+(trace(d33))**2)/3;
d34 = (s*c3`)*(inv(c3*s*c3`))*(c3*q4); a34 = (trace(d34*d34)+(trace(d34))**2)/1;
a3 = (a31+a32+a33+a34)/2;
* the denominator 3*A is parenthesized so that f2 = (q-m)(q-m+2)/(3A);
f21 = ((q-m1)*(q-m1+2))/(3*a1);
f22 = ((q-m2)*(q-m2+2))/(3*a2);
f23 = ((q-m3)*(q-m3+2))/(3*a3);
e1 = q-m1+2*a1-((6*a1)/(q-m1+2));
e2 = q-m2+2*a2-((6*a2)/(q-m2+2));
e3 = q-m3+2*a3-((6*a3)/(q-m3+2));
p1 = 1-probf(t1/e1,f11,f21); * p-value for the interaction;
p2 = 1-probf(t2/e2,f12,f22); * p-value for the groups (T2 is scaled by e2);
p3 = 1-probf(t3/e3,f13,f23); * p-value for the responses;
print p1 p2 p3; quit; run;
p1           p2          p3
0.7137978    0.030452    0.9919797
From the output, the p-values p1, p2, and p3 are used to test the significance of the interaction, groups, and responses. Our values indicate the interaction is not significant. Hence, we can test for the groups and responses and we observe the groups p-value is significant, whereas the responses p-value is not significant.
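The contrast matrices and the constants that feed the critical regions (3.6.1) can also be sketched outside SAS. The Python fragment below is a minimal illustration for k = 4 groups and p = 3 periods; the helper names (`kron`, `contrast`, `welch_james_constants`) are hypothetical, and the value of A passed to the last function is an arbitrary placeholder rather than a quantity computed from the ABS data.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def contrast(n):
    """(n-1) x n matrix whose ith row compares level 1 with level i+1."""
    return [[1] + [-1 if j == i else 0 for j in range(n - 1)] for i in range(n - 1)]


k, p = 4, 3                       # groups and periods, as in the ABS example
L1, L2 = contrast(k), contrast(p)
J1p, J1k = [[1] * p], [[1] * k]   # row vectors of ones

C1 = kron(L1, L2)    # interaction contrasts, (k-1)(p-1) x kp
C2 = kron(L1, J1p)   # group contrasts, (k-1) x kp
C3 = kron(J1k, L2)   # period contrasts, (p-1) x kp


def welch_james_constants(q, m_l, A_l):
    """Return (f1, f2, e) as defined in the text for a given A."""
    f1 = q - m_l
    f2 = f1 * (f1 + 2) / (3 * A_l)
    e = f1 + 2 * A_l - 6 * A_l / (f1 + 2)
    return f1, f2, e


q = k * p
print(len(C1), len(C2), len(C3), len(C1[0]))
print(welch_james_constants(q, (p - 1) * (k - 1), 1.0))  # A = 1.0 is a placeholder
```

Each test statistic is then divided by its constant e and referred to an F distribution with (f1, f2) degrees of freedom.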
3.7 ANALYSIS WITH ORDERED CATEGORICAL RESPONSE
In some cases, the response $Y_{i\alpha\beta}$ taken at the βth period on the αth unit of the ith group takes any of c ordered categorical responses for β = 1, 2, …, p; α = 1, 2, …, Ni; i = 1, 2, …, k. To illustrate, let us consider the data given in Table 3.5.1 and convert them into ordered categorical data with three categories defined by a: […]

DF   Pr > ChiSq
3    0.7235 (b1)

***

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   28.4374      3    […]

Parameter      DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept 3    1    -7.1453    3.0235           5.5848            0.0181 (b5)
Intercept 2    1    -3.4946    2.6724           1.7101            0.1910 (b6)
period         1     0.6202    1.2543           0.2445            0.6210 (b7)
group          1     2.3951    1.2456           3.6973            0.0545 (b8)
period*group   1     0.0113    0.5766           0.0004            0.9843 (b9)

Association of Predicted Probabilities and Observed Responses

Percent Concordant   87.1    Somers' D   0.794
Percent Discordant    7.7    Gamma       0.837
Percent Tied          5.1    Tau-a       0.490
Pairs                 389    c           0.897 (b10)
From the p-value of 0.7235 given at (b1), we observe that the proportional odds model is valid in this setting. From the value 0.897 given at (b10) for c, we observe the model is a good fit. The overall significance of the parameters is tested by the p-values given at (b2), (b3), or (b4), and all of these p-values indicate strong significance for all of the parameters. The p-value for interaction given at (b9) is not significant. The p-value for the period effects given at (b7) is not significant. However, the p-value for the groups given at (b8) indicates a borderline nonsignificance. By using the DESCENDING option in PROC LOGISTIC, the ith intercept corresponds to the ith or higher category of the response versus the categories less than i, and the p-value corresponding to the intercept i can be used to test its significance. In our problem, the p-value corresponding to (b5) is significant, indicating that the intercept term corresponding to the third category versus the first and second categories is significant. Similarly, the p-value corresponding to (b6) is not significant, indicating that the intercept corresponding to the second and third categories versus the first category is not significant.
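To make the role of the two intercepts concrete, the following Python sketch turns the printed estimates into category probabilities under a proportional odds model. It is a hedged illustration only: it assumes that period and group enter as numeric scores (each has one degree of freedom in the output) and that, with the DESCENDING option, the logit of P(Y ≥ j) equals the jth intercept plus the common linear predictor; the function names are hypothetical.

```python
import math

# Estimates copied from the output at (b5)-(b9).
intercepts = {3: -7.1453, 2: -3.4946}
beta_period, beta_group, beta_interaction = 0.6202, 2.3951, 0.0113


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(period, group):
    """P(Y = 1), P(Y = 2), P(Y = 3) from the cumulative logits (assumed coding)."""
    eta = beta_period * period + beta_group * group + beta_interaction * period * group
    p_ge3 = logistic(intercepts[3] + eta)   # third category versus first and second
    p_ge2 = logistic(intercepts[2] + eta)   # second and third versus first
    return {1: 1.0 - p_ge2, 2: p_ge2 - p_ge3, 3: p_ge3}


probs_group1 = category_probabilities(period=1, group=1)
probs_group4 = category_probabilities(period=1, group=4)
print(probs_group1)
print(probs_group4)
```

Because the slopes are shared by both cumulative logits, moving from one group to another shifts both curves by the same amount, which is the proportional odds property tested at (b1).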
REFERENCES
ANDERSON TW. An Introduction to Multivariate Statistical Analysis. 2nd ed. New York: Wiley; 2003.
BARTLETT MS. Properties of sufficiency and statistical tests. Proc R Soc 1937;160A:268–282.
BOX GEP. A general distribution theory for a class of likelihood criteria. Biometrika 1949;36:317–346.
GREENHOUSE SW, GEISSER S. On methods in the analysis of profile data. Psychometrika 1959;24:95–112.
HECK DL. Charts of some upper percentage points of the distribution of the largest characteristic root. Ann Math Statist 1960;31:625–642.
HUYNH H, FELDT L. Estimation of the Box correction for degrees of freedom from sample data in randomized block and split plot designs. J Edu Statist 1976;1:69–82.
JAMES GS. The comparison of several groups of observations when the ratios of the population variances are unknown. Biometrika 1951;38:324–329.
JAMES GS. Tests of linear hypotheses in univariate and multivariate analysis when the ratios of population variances are unknown. Biometrika 1954;41:19–43.
KENWARD MG, JONES B. Alternative approaches to the analysis of binary and categorical repeated measurements. J Biopharm Statist 1992;2:137–170.
KESELMAN H, CARRIERE K, LIX LM. Testing repeated measures hypotheses when covariance matrices are heterogeneous. J Educ Behav Statist 1993;18:305–319.
KESELMAN H, ALGINA J, KOWALCHUK R. The analysis of repeated measurements designs: a review. Br J Math Statist Psychol 2001;54:1–20.
LECOUTRE B. A correction for the epsilon approximate test with repeated measures design with two or more independent groups. J Edu Statist 1991;16:371–372.
MAUCHLY JW. Significance test for sphericity of a normal n-variate distribution. Ann Math Statist 1940;11:204–209.
MORRISON DF. Multivariate Statistical Methods. 2nd ed. New York: McGraw Hill; 1976.
PILLAI KCS. On the distribution of the largest of seven roots of a matrix in multivariate analysis. Biometrika 1964;51:270–275.
PILLAI KCS. On the distribution of the largest characteristic root of a matrix in multivariate analysis. Biometrika 1965;52:405–414.
PILLAI KCS. Upper percentage points of the largest root of a matrix in multivariate analysis. Biometrika 1967;54:189–194.
PILLAI KCS, BANTEGUI CG. On the distribution of the largest of six roots of a matrix in multivariate analysis. Biometrika 1959;46:237–270.
RAO CR. Linear Statistical Inference and Its Applications. 2nd ed. New York: Wiley; 1973.
WELCH BL. The generalization of Student's problem when several different population variances are involved. Biometrika 1947;34:28–35.
WELCH BL. On the comparison of several mean values: An alternative approach. Biometrika 1951;38:330–336.
CHAPTER 4
Growth Curve Models

4.1 INTRODUCTION
In Chapters 2 and 3, we discussed the analysis of repeated measurements data with univariate and multivariate methods. These methods are not always useful and have several restrictions imposed on them. To use the univariate methods, we require the dispersion matrix for the repeated measurements on units to satisfy the sphericity condition. This condition can be relaxed for multivariate methods. However, the multivariate methods require an equal number of repeated measurements on each unit and the same covariates at each measurement time. Relaxing this requirement needs more broad-based linear and nonlinear models, and we will consider some such cases in this chapter.

In a dose–response curve, the responses are plotted against different dose levels (or log dose levels) of the drug. The response curve will be steep in a small window of the required dose level and nearly flat outside that window. Because of the slope pattern, these are known as sigmoidal (or S-shaped) curves. Some commonly used curves are based on logit, probit, and Gompertz models, and we will discuss them in Section 4.2. A simple growth curve model can be used accounting for time-invariant and time-dependent covariates. We will consider them in Section 4.4 as linear models and in Section 4.5 as nonlinear models. Numerical examples are provided in Section 4.6. We will briefly discuss the mixed model analysis in Section 4.3 and the joint action models in Section 4.7.

Repeated Measurements and Cross-Over Designs, First Edition. Damaraju Raghavarao and Lakshmi Padgett. © 2014 John Wiley & Sons, Inc. Published 2014 by John Wiley & Sons, Inc.
4.2 SIGMOIDAL CURVES
In some settings, the response variable Y is almost flat at the lower and higher values of the independent variable x and is steep in the effective domain of x. Such curves are known as sigmoidal curves and the underlying model is nonlinear. Some applications of such curves can be found in pharmacokinetics (Krug and Liebig, 1988; Liebig, 1988), toxicology (Becka and Urfer, 1996; Becka, Bolt, and Urfer, 1993), and other disciplines. Usually, x will be the dose (or log of the dose) level and Y will be the response for a drug. The general form of the nonlinear model can be written as

$$Y_i = f(x_i; \theta) + e_i, \quad i = 1, 2, \ldots, n, \tag{4.2.1}$$

where $Y_i$ is the ith response, $\theta$ is the parameter vector of dimensionality p, $x_i$ is the ith dose or log of the ith dose ($d_i$), and $e_i$ are random errors. A suitable assumption on the errors can be made depending on the nature of the dose–response curve.

While there is vast literature on finding the optimal choice of $x_i$ for a linear model (cf. Bose and Dey, 2009; Goos, 2002; Kiefer, 1958; Kunert, 1983; Pukelsheim, 1993; Shah and Sinha, 1989), limited work has been done on nonlinear models. The optimal designs for nonlinear models are given for minimal designs, where n = p; the interested reader is referred to Antonello and Raghavarao (2000), Biedermann, Dette, and Zhu (2006), Chaudhuri and Mykland (1995), Dette, Melas, and Pepelyshev (2004), Kalish (1990), Dette, Melas, and Wong (2005), and Sitter and Torsney (1995) for details.

In Equation (4.2.1), $\theta$ will be estimated iteratively. We start with a vector of initial values, $\theta^{(0)}$, and write

$$Y_i = f(x_i; \theta^{(a)}) + h_i^{(a)\prime}(\theta - \theta^{(a)}) + e_i, \quad i = 1, 2, \ldots, n, \tag{4.2.2}$$

where

$$h_i^{(a)\prime} = \left(\frac{\partial f(x_i; \theta)}{\partial \theta_1}, \frac{\partial f(x_i; \theta)}{\partial \theta_2}, \ldots, \frac{\partial f(x_i; \theta)}{\partial \theta_p}\right), \quad a = 0, 1, \ldots, \tag{4.2.3}$$

and where $\theta^{(a)}$ is substituted for $\theta$ in Equation (4.2.3). Taking V to be the dispersion matrix of the error vector $e' = (e_1, e_2, \ldots, e_n)$, by weighted least squares, we get

$$\hat{\theta} = \theta^{(a+1)} = \theta^{(a)} + \left(H^{(a)\prime} V^{-1} H^{(a)}\right)^{-1} H^{(a)\prime} V^{-1} (Y - f), \quad a = 0, 1, \ldots, \tag{4.2.4}$$

where

$$Y' = (y_1, y_2, \ldots, y_n), \qquad f' = \left[f(x_1, \theta^{(a)}), f(x_2, \theta^{(a)}), \ldots, f(x_n, \theta^{(a)})\right], \qquad H^{(a)} = \begin{bmatrix} h_1^{(a)\prime} \\ h_2^{(a)\prime} \\ \vdots \\ h_n^{(a)\prime} \end{bmatrix},$$

and the iteration is continued until $(\theta^{(a+1)} - \theta^{(a)})'(\theta^{(a+1)} - \theta^{(a)})$ is less than a small predetermined bound δ. Note that $H^{(a)}$ is like the design matrix X in linear models. The estimated dispersion matrix of $\hat{\theta}$ is

$$\left(H^{(a)\prime} V^{-1} H^{(a)}\right)^{-1}. \tag{4.2.5}$$

In general, we have four-parameter sigmoidal curves, where the parameters are denoted by a, b, c, and d. We take the following:
1. The plot of Y versus x is sigmoid.
2. When x = c, Y = (a + d)/2.
3. Y is a function of x through b(x − c).
4. When b > 0 (or […]
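The iteration in Equation (4.2.4) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' program: it takes V to be the identity matrix (ordinary least squares), uses a two-parameter logistic curve f(x; b, c) = 1/(1 + exp(−b(x − c))) as a stand-in for the response function, and runs on noise-free synthetic data generated from b = 1.5, c = 0.5. All names are hypothetical.

```python
import math


def f(x, theta):
    """Two-parameter logistic curve used as the response function (assumption)."""
    b, c = theta
    return 1.0 / (1.0 + math.exp(-b * (x - c)))


def gradient(x, theta):
    """Row h_i' of H: partial derivatives of f with respect to b and c."""
    b, c = theta
    fx = f(x, theta)
    slope = fx * (1.0 - fx)
    return (slope * (x - c), -slope * b)


def gauss_newton(xs, ys, theta0, delta=1e-12, max_iter=100):
    """Iterate (4.2.4) with V = I until the squared step length falls below delta."""
    theta = list(theta0)
    for _ in range(max_iter):
        H = [gradient(x, theta) for x in xs]
        r = [y - f(x, theta) for x, y in zip(xs, ys)]
        # Entries of H'H and H'(Y - f) for the 2 x 2 case.
        a11 = sum(h[0] * h[0] for h in H)
        a12 = sum(h[0] * h[1] for h in H)
        a22 = sum(h[1] * h[1] for h in H)
        g1 = sum(h[0] * ri for h, ri in zip(H, r))
        g2 = sum(h[1] * ri for h, ri in zip(H, r))
        det = a11 * a22 - a12 * a12
        step = ((a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)
        theta = [theta[0] + step[0], theta[1] + step[1]]
        if step[0] ** 2 + step[1] ** 2 < delta:
            break
    return theta


xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [f(x, (1.5, 0.5)) for x in xs]          # synthetic, noise-free responses
estimate = gauss_newton(xs, ys, theta0=(1.0, 0.0))
print(estimate)
```

With real data, a general V would replace the identity weighting, and the inverse of H'V⁻¹H at the final iterate gives the estimated dispersion matrix of Equation (4.2.5).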
DF   Chi-Square   Pr > ChiSq
3    7.01         0.0714

Solution for Fixed Effects

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   2.6111     0.1928           2    13.55     0.0054
method      0.1667     0.1167           2     1.43     0.2893

Solution for Random Effects

Effect      teacher   Estimate   Std Err Pred   DF   t Value   Pr > |t| (b7)   Alpha   Lower     Upper
Intercept   1          0.2103    0.1968         3     1.07     0.3635          0.05    -0.4159   0.8366
method      1         -0.1024    0.1186         3    -0.86     0.4514          0.05    -0.4798   0.2750
Intercept   2         -0.3587    0.1968         3    -1.82     0.1658          0.05    -0.9850   0.2675
method      2          0.2238    0.1186         3     1.89     0.1555          0.05    -0.1536   0.6012
Intercept   3          0.1484    0.1968         3     0.75     0.5055          0.05    -0.4778   0.7746
method      3         -0.1214    0.1186         3    -1.02     0.3812          0.05    -0.4988   0.2559
The Mixed Procedure
Type 3 Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
method   1        2        2.04      0.2893 (b6)
From the output in column (b1), we get the estimates of $\sigma_1^2$, $\sigma_{12}$, $\sigma_2^2$, and $\sigma^2$, and their standard errors are given in column (b2). The p-values at (b3), (b4), and (b5) can be used to test the significance of $\sigma_1^2$, $\sigma_{12}$, and $\sigma_2^2$. The p-value at (b6) is used to test the significance of the fixed effect parameter, and in our case, it is not significant. The p-values in column (b7) can be used to test the significance of the intercepts and slopes for the methods with each teacher. For more details, the interested reader is referred to Harville (1977), Searle, Casella, and McCulloch (1992), and Patterson and Thompson (1971).
4.4 SIMPLE LINEAR GROWTH CURVE MODEL
Let us consider n experimental units, where the maximum number of repeated measures on any unit is T. However, it is not necessary to have all T observations on every unit. Let $t_i$ (≤ T) be the number of repeated observations on the ith unit. Let $a_{ij}$ denote a time score for the jth observation on the ith unit. Let $X_1, X_2, \ldots, X_M$ be the M time-invariant covariates and let $W_1, W_2, \ldots, W_L$ be the L time-variant covariates. We will assume a two-stage linear model. Let $Y_{ij}$ be the response on the jth observation on the ith unit. We assume

$$Y_{ij} = \eta_{0i} + \eta_{1i} a_{ij} + \sum_{\ell=1}^{L} k_{i\ell} w_{ij\ell} + e_{ij}, \tag{4.4.1}$$

where $w_{ij\ell}$ is the value of the ℓth covariate $W_\ell$ at the jth observation on the ith unit, taken to be zero if no data are available; $e_{ij}$ are random errors; $\eta_{0i}$ is the intercept; and $\eta_{1i}$ and $k_{i\ell}$ are slopes. We model the parameters as follows using the time-invariant covariates:

$$\eta_{0i} = \alpha_0 + \sum_{m=1}^{M} \gamma_{0m} x_{im} + \xi_{0i}, \tag{4.4.2}$$

$$\eta_{1i} = \alpha_1 + \sum_{m=1}^{M} \gamma_{1m} x_{im} + \xi_{1i}, \tag{4.4.3}$$

$$k_{i\ell} = \alpha_{2\ell} + \sum_{m=1}^{M} \gamma_{2\ell m} x_{im} + \xi_{2i\ell}, \tag{4.4.4}$$

where $\xi_{0i}$, $\xi_{1i}$, and $\xi_{2i\ell}$ are random errors and the γ's are slopes of the time-invariant covariates. We will take $x_{im}$ to be zero if the data on $X_m$ are not available for the ith unit. Combining Equations (4.4.1)–(4.4.4), we write the response vector $Y_i$ of dimension $t_i$ as

$$Y_i = \alpha_0 1 + (x_i'\gamma_0) 1 + a_i(\alpha_1 + x_i'\gamma_1) + W_i\alpha_2 + W_i\Gamma_2 x_i + \xi_{0i} 1 + \xi_{1i} a_i + W_i\xi_{2i} + e_i, \tag{4.4.5}$$

where 1 is a column vector of 1's of suitable dimension (i.e., $J_{t_i,1}$), $x_i' = (x_{i1}, x_{i2}, \ldots, x_{iM})$, $\gamma_0' = (\gamma_{01}, \gamma_{02}, \ldots, \gamma_{0M})$, $a_i' = (a_{i1}, a_{i2}, \ldots, a_{it_i})$, $\gamma_1' = (\gamma_{11}, \gamma_{12}, \ldots, \gamma_{1M})$, $W_i = (w_{ij\ell})$ is a $t_i \times L$ matrix, $\alpha_2' = (\alpha_{21}, \alpha_{22}, \ldots, \alpha_{2L})$, $\Gamma_2 = (\gamma_{2\ell m})$ is an $L \times M$ matrix, $\xi_{2i}' = (\xi_{2i1}, \xi_{2i2}, \ldots, \xi_{2iL})$, and $e_i' = (e_{i1}, e_{i2}, \ldots, e_{it_i})$. In $x_i$ and $W_i$, we put 0 values when data are not collected on those variables at the indicated response time.

Let $N = \sum_{i=1}^{n} t_i$. We combine the expressions of the response vectors in Equation (4.4.5) to get an N-component response vector $Y' = (Y_1', Y_2', \ldots, Y_n')$ as

$$Y = \alpha_0 1 + X\gamma_0 + a\alpha_1 + A\gamma_1 + W\alpha_2 + \begin{bmatrix} W_1\Gamma_2 x_1 \\ W_2\Gamma_2 x_2 \\ \vdots \\ W_n\Gamma_2 x_n \end{bmatrix} + \xi_0 + \xi_1 + B + e. \tag{4.4.6}$$

In Equation (4.4.6), X is an $N \times M$ matrix,

$$X = \begin{bmatrix} 1_{t_1} x_1' \\ 1_{t_2} x_2' \\ \vdots \\ 1_{t_n} x_n' \end{bmatrix},$$

$a' = (a_1', a_2', \ldots, a_n')$; A is an $N \times M$ matrix,

$$A = \begin{bmatrix} a_1 x_1' \\ a_2 x_2' \\ \vdots \\ a_n x_n' \end{bmatrix};$$

W is an $N \times L$ matrix,

$$W = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_n \end{bmatrix};$$

$x' = (x_1', x_2', \ldots, x_n')$, $\xi_0' = (\xi_{01} 1_{t_1}', \xi_{02} 1_{t_2}', \ldots, \xi_{0n} 1_{t_n}')$, $\xi_1' = (\xi_{11} a_1', \xi_{12} a_2', \ldots, \xi_{1n} a_n')$; B is the N-component vector

$$B = \begin{bmatrix} W_1\xi_{21} \\ W_2\xi_{22} \\ \vdots \\ W_n\xi_{2n} \end{bmatrix};$$

and $e' = (e_1', e_2', \ldots, e_n')$.

Model (4.4.6) is a mixed effects model, where the last four terms are random effects and the remaining terms are fixed effects. The effect of the M time-invariant covariates $X_1, X_2, \ldots, X_M$ can be tested by testing the parameters $\gamma_{01}, \gamma_{02}, \ldots, \gamma_{0M}$. The effect of the L time-variant covariates $W_1, W_2, \ldots, W_L$ can be tested by testing the parameters $\alpha_{21}, \alpha_{22}, \ldots, \alpha_{2L}$. The effects of the time scores can be tested by testing the parameters $\gamma_{11}, \gamma_{12}, \ldots, \gamma_{1M}$. The interaction between the sets of covariates $\{X_m\}$ and $\{W_\ell\}$ can be tested by testing the parameters $\gamma_{2\ell m}$ for ℓ = 1, 2, …, L; m = 1, 2, …, M.
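The stacking that leads from the unit-level vectors to the matrices X, a, and A of Equation (4.4.6) can be sketched as follows. This Python fragment is a minimal illustration with unequal numbers of observations per unit; the function name and the two-unit example values are hypothetical.

```python
def stack_design(x_rows, time_scores):
    """Build X (rows x_i' repeated t_i times), a (stacked time scores),
    and A (rows a_ij * x_i') for units with possibly unequal t_i."""
    X, a, A = [], [], []
    for x_i, a_i in zip(x_rows, time_scores):
        for a_ij in a_i:
            X.append(list(x_i))
            a.append(a_ij)
            A.append([a_ij * x for x in x_i])
    return X, a, A


# Two units with M = 2 time-invariant covariates; the second unit has only
# two of the three possible observations (illustrative values).
x_rows = [[170, 60], [220, 55]]
time_scores = [[3, 6, 9], [3, 6]]

X, a, A = stack_design(x_rows, time_scores)
print(len(X), a)
print(A)
```

Here N = 5, so X and A are 5 × 2 and a has five components; the fixed-effects part of the model is then linear in the columns of X, a, A, and W.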
4.5 NONLINEAR GROWTH CURVE MODEL Continuing the notation of Section 4.4, for simplicity, we assume one time-variant covariate, W, and one time-invariant covariate, X. Let tij be the score indicating the repetition of the observation on the unit. We can assume any nonlinear model for repeated observations on a unit and also with a time-invariant covariate.
Let us assume the logistic model for repeated observations as

$$Y_{ij} = \frac{b_{i1} w_{ij} + b_{i4}}{1 + e^{-(t_{ij} - b_{i2})/b_{i3}}} + e_{ij}. \tag{4.5.1}$$

We model the coefficients $b_{i\ell}$ for ℓ = 1, 2, 3, 4 of Equation (4.5.1) as

$$b_{i1} = \alpha_1 + x_i\beta_1 + f_{1i}, \tag{4.5.2}$$

$$b_{i2} = \alpha_2 + x_i\beta_2 + f_{2i}, \tag{4.5.3}$$

$$b_{i3} = \alpha_3 + x_i\beta_3 + f_{3i}, \tag{4.5.4}$$

$$b_{i4} = \alpha_4 + x_i\beta_4 + f_{4i}. \tag{4.5.5}$$

Substituting Equations (4.5.2)–(4.5.5) in Equation (4.5.1), we obtain

$$Y_{ij} = \frac{w_{ij}\alpha_1 + w_{ij} x_i\beta_1 + w_{ij} f_{1i} + \alpha_4 + x_i\beta_4 + f_{4i}}{1 + e^{-(t_{ij} - \alpha_2 - x_i\beta_2 - f_{2i})/(\alpha_3 + x_i\beta_3 + f_{3i})}} + e_{ij}. \tag{4.5.6}$$
Usually, the covariate wij in Equation (4.5.1) is considered with coefficient 1; however, for generality, we consider its coefficient bi1.
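A small Python sketch may help in reading Equations (4.5.1)–(4.5.5): the first function evaluates the mean response for given unit-level coefficients, and the second builds those coefficients from the second-stage parameters. The function names and the numerical values are hypothetical, and the random terms are set to zero by default.

```python
import math


def logistic_growth_mean(w, t, b1, b2, b3, b4):
    """Mean of Y_ij in (4.5.1): (b1*w + b4) / (1 + exp(-(t - b2)/b3))."""
    return (b1 * w + b4) / (1.0 + math.exp(-(t - b2) / b3))


def unit_coefficients(x, alpha, beta, f=(0.0, 0.0, 0.0, 0.0)):
    """Second-stage model (4.5.2)-(4.5.5): b_l = alpha_l + x*beta_l + f_l."""
    return tuple(alpha[l] + x * beta[l] + f[l] for l in range(4))


# Illustrative values only: at t = b2 the denominator equals 2, so the mean
# is half of the asymptote b1*w + b4.
half_way = logistic_growth_mean(w=20, t=3, b1=0.1, b2=3, b3=0.5, b4=1)
late = logistic_growth_mean(w=20, t=30, b1=0.1, b2=3, b3=0.5, b4=1)
coefficients = unit_coefficients(2.0, alpha=(1.0, 2.0, 3.0, 4.0), beta=(0.5, 0.0, 0.0, 1.0))
print(half_way, late, coefficients)
```

The parameter b2 therefore locates the time at which half of the eventual response is reached, and b3 controls how quickly the curve rises around that time.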
4.6 NUMERICAL EXAMPLE
In this section, the linear and nonlinear growth curves will be illustrated. Example 4.6.1 illustrates fitting the linear growth curve model, and Example 4.6.2 illustrates fitting the nonlinear model.

EXAMPLE 4.6.1
A1C sugar levels were monitored on five patients. Two time-invariant covariates were used: X1 = weight and X2 = age. The sugar levels were taken at 3-month intervals, and the response is the change in A1C from the previously recorded level. The time-variant covariate during each interval is exercise time (W1). Artificial data are given in Table 4.6.1.

TABLE 4.6.1 Artificial data on change of A1C

                          3m             6m             9m
Patient   Xi1    Xi2    Wij1   yij     Wij1   yij     Wij1   yij
1         170    60     20     +0.2    30     -0.5    30     +0.1
2         220    55     40     -0.1    35     -0.6    30     -0.1
3         180    40     50     -0.5    40      0.1    45     -0.2
4         200    50     30     -0.1    40     -0.1    40     +0.1
5         185    60     10     -0.3    30     -0.1    30     -0.1
data a; input patient period xi1 xi2 aj wi1 y @@; cards;
1 1 170 60 3 20 .2   1 2 170 60 6 30 -.5   1 3 170 60 9 30 .1
2 1 220 55 3 40 -.1  2 2 220 55 6 35 -.6   2 3 220 55 9 30 -.1
3 1 180 40 3 50 -.5  3 2 180 40 6 40 .1    3 3 180 40 9 45 -.2
4 1 200 50 3 30 -.1  4 2 200 50 6 40 -.1   4 3 200 50 9 40 .1
5 1 185 60 3 10 -.3  5 2 185 60 6 30 -.1   5 3 185 60 9 30 -.1
;
data b; set a;
%macro est(in, patno);
data c; set a; if patient = &patno;
proc sort; by patient;
ods trace on;
proc glm outstat = z;
model y = aj wi1 / noint;
* no intercept is used because not enough period observations are taken;
ods output parameterestimates = all;
output out = aa; run;
data all1; set all; keep estimate parameter;
proc transpose out = allt; var estimate parameter;
data allt; set allt; patient = &patno;
if _name_ = 'Estimate';
ajhat = col1 + 0; wi1hat = col2 + 0;
drop col1 col2; patient = &patno;
proc sort; by patient; drop col1 col2 col3;
data all2; set all; keep stderr parameter;
proc transpose out = allt2; var stderr parameter;
data allt2; set allt2; patient = &patno;
if _name_ = 'StdErr';
seajhat = col1 + 0; sewi1hat = col2 + 0;
drop col1 col2; patient = &patno;
proc sort; by patient; drop col1 col2 col3;
data all3; set all; keep probt parameter;
proc transpose out = allt3; var probt parameter;
data allt3; set allt3; patient = &patno;
if _name_ = 'Probt';
ajpvalue = col1 + 0; wi1pvalue = col2 + 0;
drop col1 col2; patient = &patno;
proc sort; by patient; drop col1 col2 col3;
data z1; set z; if _source_ = 'ERROR'; patient = &patno;
keep ss patient;
proc sort; by patient;
data zz; merge allt c z1 allt2 allt3; by patient;
data &in; set zz;
keep ajhat seajhat wi1hat sewi1hat xi1 xi2 ss ajpvalue wi1pvalue;
if _n_ = 1;
%mend est;
%est(pat1,1); %est(pat2,2); %est(pat3,3); %est(pat4,4); %est(pat5,5);
data final; set pat1 pat2 pat3 pat4 pat5;
errorMS = SS; drop ss probt;
data final2; set final;
proc print; var ajhat seajhat ajpvalue wi1hat sewi1hat wi1pvalue errorms; run;
Output (c1):

Obs   ajhat       seajhat   ajpvalue   wi1hat      sewi1hat   wi1pvalue   errorMS
1      0.045614   0.18103   0.8429     -0.014211   0.043324   0.7982      0.25474
2     -0.009877   0.07535   0.9170     -0.005926   0.013858   0.7428      0.16667
3      0.052335   0.06301   0.5588     -0.011691   0.009038   0.4190      0.09058
4      0.051111   0.01423   0.1729     -0.009333   0.002494   0.1663      0.00200
5     -0.026667   0.11624   0.8564      0.002000   0.029933   0.9575      0.06400
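The first-stage estimates in (c1) can be reproduced by hand, since each patient contributes a no-intercept regression of y on the time score and the exercise time. The Python sketch below solves the 2 × 2 normal equations directly; it is an independent illustration using the data of Table 4.6.1, not a translation of the SAS macro, and the function name is hypothetical.

```python
# (time score a_ij, exercise time w_ij1, response y_ij) for each patient,
# copied from Table 4.6.1.
data = {
    1: [(3, 20, 0.2), (6, 30, -0.5), (9, 30, 0.1)],
    2: [(3, 40, -0.1), (6, 35, -0.6), (9, 30, -0.1)],
    3: [(3, 50, -0.5), (6, 40, 0.1), (9, 45, -0.2)],
    4: [(3, 30, -0.1), (6, 40, -0.1), (9, 40, 0.1)],
    5: [(3, 10, -0.3), (6, 30, -0.1), (9, 30, -0.1)],
}


def fit_no_intercept(rows):
    """Least squares fit of y = b_a * a + b_w * w (no intercept) for one patient."""
    saa = sum(a * a for a, w, y in rows)
    saw = sum(a * w for a, w, y in rows)
    sww = sum(w * w for a, w, y in rows)
    say = sum(a * y for a, w, y in rows)
    swy = sum(w * y for a, w, y in rows)
    det = saa * sww - saw * saw
    b_a = (sww * say - saw * swy) / det
    b_w = (saa * swy - saw * say) / det
    sse = sum((y - b_a * a - b_w * w) ** 2 for a, w, y in rows)
    return b_a, b_w, sse


fits = {patient: fit_no_intercept(rows) for patient, rows in data.items()}
for patient, (b_a, b_w, sse) in fits.items():
    print(patient, round(b_a, 6), round(b_w, 6), round(sse, 5))
```

With three observations and two coefficients, each fit has a single error degree of freedom, so the error sum of squares and the error mean square coincide.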
data final1; set final; proc glm; model ajhat = xi1 xi2; run; proc glm; model wi1hat = xi1 xi2; run;
The GLM Procedure
Dependent Variable: ajhat

Source            DF   Sum of Squares   Mean Square       F Value   Pr > F
Model             2    0.00286544       0.00143272        1.01      0.4980
Error             2    0.00284314       0.00142157 (c2)
Corrected Total   4    0.00570858

***

Parameter   Estimate (c3)    Standard Error (c4)   t Value   Pr > |t| (c5)
Intercept   0.3169810450     0.22294561             1.42     0.2910
xi1         -.0008153084     0.00096733            -0.84     0.4880
xi2         -.0026179955     0.00225382            -1.16     0.3653

The GLM Procedure
Dependent Variable: wi1hat

Source            DF   Sum of Squares   Mean Square       F Value   Pr > F
Model             2    0.00004202       0.00002101        0.36      0.7343
Error             2    0.00011611       0.00005806 (c6)
Corrected Total   4    0.00015813

***

Parameter   Estimate (c7)    Standard Error (c8)   t Value   Pr > |t| (c9)
Intercept   -.0443513365     0.04505511            -0.98     0.4287
xi1         0.0001076860     0.00019549             0.55     0.6370
xi2         0.0003009658     0.00045548             0.66     0.5767
In Part (c1) of the output, the analyses for the individual patients are given. One can determine the significance of the parameters from that output. In (c2) and (c6), the error mean squares are provided for the second-stage regressions of the coefficients of aj and wi1. The estimates, standard errors, and p-values are given for the second-stage parameters in columns (c3)–(c5) and (c7)–(c9).

EXAMPLE 4.6.2
In Table 4.6.2, artificial data are provided for weight reduction (Yij) using diet (Xi) as the time-invariant covariate and exercise time (Wij) as the time-variant covariate. The month (tij) at which each measurement was taken is also provided; the length of time for which the exercise regimen has been followed is used as the time score.

TABLE 4.6.2 Artificial data on weight loss

Volunteer   Xi     (Wij, tij, yij)
1           1200   (20, 3, 3)  (40, 6, 5)  (25, 9, 3.5)  (30, 12, 4)
2           1200   (20, 3, 4)  (25, 6, 5)  (60, 12, 9)   (40, 9, 7)
3           1200   (25, 6, 5)  (40, 9, 7)  (60, 12, 9)   (30, 3, 5)
4           1500   (20, 3, 4)  (25, 6, 5)  (40, 9, 7)    (60, 12, 10)
5           1500   (25, 6, 5)  (40, 9, 6)
6           1500   (25, 6, 4)  (40, 9, 6)  (60, 12, 8)
The following programming lines provide the necessary output to draw inferences:

data a; input patient x w t y @@; cards;
1 1200 20 3 3   1 1200 40 6 5   1 1200 25 9 3.5   1 1200 30 12 4
2 1200 20 3 4   2 1200 25 6 5   2 1200 60 12 9    2 1200 40 9 7
3 1200 25 6 5   3 1200 40 9 7   3 1200 60 12 9    3 1200 30 3 5
4 1500 20 3 4   4 1500 25 6 5   4 1500 40 9 7     4 1500 60 12 10
5 1500 25 6 5   5 1500 40 9 6
6 1500 25 6 4   6 1500 40 9 6   6 1500 60 12 8
;
proc sort; by patient;

Step 1: Summarize the parameters b1, b2, b3, and b4 for the six patients.

data b; set a;
%macro est(subject, out);
data c; set b; if patient = &subject;
proc nlin noprint;
parms b1 = .1 b2 = 3 b3 = .5 b4 = 1;
model y = ((b1*w) + b4)/(1 + exp(-(t - b2)/b3));
output out = &out parms = b1 b2 b3 b4; run;
data &out; set &out;
proc sort nodupkey; by patient;
%mend est;
%est(1,out1); %est(2,out2); %est(3,out3); %est(4,out4); %est(5,out5); %est(6,out6);
data final; set out1 out2 out3 out4 out5 out6;
data final1; set final; keep patient b1 b2 b3 b4;
proc print; run;
(d1)

Obs   patient   b1        b2          b3        b4
1     1         0.10000    -6.1601    0.61366   1.00000
2     2         0.07514    -8.7579    4.17192   3.20507
3     3         0.07329    -9.2748    3.44509   3.68037
4     4         0.14324     2.7534    0.08630   1.36486
5     5         0.12492   -48.3823    0.50000   1.00000
6     6         0.09991     4.9635    0.50000   2.00560
Step 2: We take the values of b1, b2, b3, and b4 given at (d1) of the output and regress each of them on the values of the time-invariant covariate X.

data final1; set final;
proc glm; model b1 = x; run;

***

Parameter   Estimate (d2)    Standard Error (d3)   t Value   Pr > |t| (d4)
Intercept   -.0575655521     0.06562982            -0.88     0.4299
x           0.0001201708     0.00004832             2.49     0.0677

proc glm; model b2 = x; run;

***

Parameter   Estimate (d5)    Standard Error (d6)   t Value   Pr > |t| (d7)
Intercept   4.634195173      79.74988058            0.06     0.9564
x           -0.012126211     0.05871267            -0.21     0.8465

proc glm; model b3 = x; run;

***

Parameter   Estimate (d8)    Standard Error (d9)   t Value   Pr > |t| (d10)
Intercept   11.89518939      4.85592335             2.45     0.0705
x           -0.00768873      0.00357498            -2.15     0.0979

proc glm; model b4 = x; run;

***

Parameter   Estimate (d11)   Standard Error (d12)   t Value   Pr > |t| (d13)
Intercept   6.298067264      3.49427421              1.80     0.1458
x           -0.003227497     0.00257252             -1.25     0.2779
The parameters b1, b2, b3, and b4 for the individual patients are summarized at (d1) of the output. The estimates from the second-stage linear regressions, including the slopes β1, β2, β3, and β4, are given in columns (d2), (d5), (d8), and (d11) of the output. The standard errors of these estimates are given in columns (d3), (d6), (d9), and (d12). The p-values for testing these parameters are given in columns (d4), (d7), (d10), and (d13). In our example, none of these parameters is significant. In Table 4.6.2, artificial data with different periods were given for the volunteers to show that this kind of general modeling can be used when data for all periods are not available on the subjects.
4.7 JOINT ACTION MODELS
In Section 4.2, we discussed some sigmoidal curves for single-compound dose–response models. The general logistic regression model with one compound X can be rewritten as

$$Y_i = y_{\min} + P_X(i)\,(y_{\max} - y_{\min}) + e_i. \tag{4.7.1}$$

Here,

$$P_X(i) = \frac{D_X}{1 + e^{-\beta(\ln d_i - \gamma)}}, \tag{4.7.2}$$
where β indicates the steepness of the dose–response function; γ is ln(dose) corresponding to a response proportion of 0.5, or ln(ED50); di is the ith dose level; and Dx is an indicator variable taking the value 1 (or 0) according to di > 0 (or di = 0), ymin and ymax being the minimum and maximum responses. Joint action refers to the response of an individual to a combination of compounds. The combination of two compounds X1 and X2 can produce an “additive” or a “nonadditive” response. When the response is additive, no interaction exists between the two compounds. When the response is nonadditive, an interaction exists between the two compounds. Synergism and antagonism are two terms used to describe the nonadditive joint action. Synergism is when the two compounds work together to produce an effect greater than that which would be expected by the two compounds working alone. Antagonism is when the two compounds work in competition with one another to produce an effect that is less than that which would be expected by the two compounds working alone (Ashford, 1981; Lupinacci, 2000). From the selection of the appropriate additive and nonadditive joint action model come the concepts of “similar joint action” and “independent joint action.” Compounds competing against one another to work in the same area of the body and in the same manner are known as similar joint action. Compounds working in different areas of the body and by different methods of action are known as independent joint action (Ashford, 1981; Lupinacci, 2000). In this monograph, we will restrict ourselves to the additive and nonadditive independent joint action models.
Let X1 and X2 be two compounds administered at levels d_{1i} and d_{2j}. Then the additive joint response under independent joint action, Y_{ij}, can be written (see Barton, Braunberg, and Friedman, 1993) as
\[
Y_{ij} = y_{\min} + \{P_{X_1}(i) + P_{X_2}(j) - P_{X_1}(i)P_{X_2}(j)\}(y_{\max} - y_{\min}) + e_{ij}, \tag{4.7.3}
\]
where y_min and y_max are defined as earlier and
\[
P_{X_1}(i) = \frac{D_{X_1}}{1 + e^{-\beta_1(\ln d_{1i} - \gamma_1)}}, \tag{4.7.4}
\]
\[
P_{X_2}(j) = \frac{D_{X_2}}{1 + e^{-\beta_2(\ln d_{2j} - \gamma_2)}}. \tag{4.7.5}
\]
The data can be analyzed under different assumptions on the error. By writing the model in a linear form using Taylor’s expansion, Antonello and Raghavarao (2000) gave the optimum dose levels. In the case of nonadditivity, Barton, Braunberg, and Friedman (1993) considered the model
\[
Y_{ij} = y_{\min} + \{P'_{X_1}(i) + P'_{X_2}(j) - P'_{X_1}(i)P'_{X_2}(j)\}(y_{\max} - y_{\min}) + e_{ij}, \tag{4.7.6}
\]
where
\[
P'_{X_1}(i) = \frac{D_{X_1}}{1 + e^{-\beta_1[\ln(1 + \lambda P_{X_2}(j)) + \ln d_{1i} - \gamma_1]}}, \tag{4.7.7}
\]
and
\[
P'_{X_2}(j) = \frac{D_{X_2}}{1 + e^{-\beta_2[\ln(1 + \lambda P_{X_1}(i)) + \ln d_{2j} - \gamma_2]}}. \tag{4.7.8}
\]
The optimal design points in the nonadditive case are also given by Antonello and Raghavarao (2000). In Equations (4.7.7) and (4.7.8), λ is known as the nonadditivity parameter. Clearly, λ = 0 for additive joint action.
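The two mean-response functions can be sketched in a few lines of Python. This is only an illustration of Equations (4.7.3) through (4.7.8) as written above: the function names and the parameter values in the usage lines are hypothetical, and the sketch assumes 1 + λP > 0 so that the logarithm is defined.

```python
import math

def p_single(dose, beta, gamma):
    """Single-compound proportion, as in (4.7.2), (4.7.4), (4.7.5)."""
    if dose <= 0:  # indicator D_X = 0 when the compound is absent
        return 0.0
    return 1.0 / (1.0 + math.exp(-beta * (math.log(dose) - gamma)))

def p_adjusted(dose, beta, gamma, lam, p_other):
    """Proportion adjusted for the other compound, as in (4.7.7), (4.7.8)."""
    if dose <= 0:
        return 0.0
    z = math.log(1.0 + lam * p_other) + math.log(dose) - gamma
    return 1.0 / (1.0 + math.exp(-beta * z))

def mean_response(d1, d2, beta1, gamma1, beta2, gamma2, ymin, ymax, lam=0.0):
    """Expected response under (4.7.6); lam = 0 gives the additive model (4.7.3)."""
    p1 = p_single(d1, beta1, gamma1)
    p2 = p_single(d2, beta2, gamma2)
    q1 = p_adjusted(d1, beta1, gamma1, lam, p2)
    q2 = p_adjusted(d2, beta2, gamma2, lam, p1)
    return ymin + (q1 + q2 - q1 * q2) * (ymax - ymin)

# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(beta1=1.0, gamma1=3.0, beta2=1.0, gamma2=3.0, ymin=0.0, ymax=15.0)
additive = mean_response(15, 25, **params)              # lam = 0
synergistic = mean_response(15, 25, lam=1.0, **params)  # lam > 0
print(round(additive, 3), round(synergistic, 3))
```

With positive slopes, a positive λ raises the expected response above the additive value, which corresponds to synergism in the sense described earlier; a dose of 0 switches the corresponding compound off through the indicator.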
EXAMPLE 4.7.1 Consider the problem of controlling BP by using two drugs X1 and X2. Five patients with similar demographic variables and the same baseline BP are included in the study. Artificial data on the percentage reduction of diastolic pressure after 3 months of medication are given, along with the dosage levels of the two drugs, in Table 4.7.1.
TABLE 4.7.1 Artificial data on the percentage reduction of diastolic BP
d1i (mg)    d2j (mg)    yij
15          25           9
15          75           9
50          25          12
50          75          11
25          50          10
The following programming lines fit the additive independent joint action model (4.7.3), assuming ymin = 0 and ymax = 15 to be known:

data a;
  input d1 d2 y @@;
  cards;
15 25 9 15 75 9 50 25 12 50 75 11 25 50 10
;
proc nlin;
  /* starting values for the four parameters */
  parms beta1 = 1 beta2 = 1.5 gamma1 = 5 gamma2 = 4;
  /* mean response {P1 + P2 - P1*P2}*(ymax - ymin) with ymin = 0, ymax = 15 */
  model y = ( (1/(1 + exp(-beta1*(log(d1)-gamma1))))
            + (1/(1 + exp(-beta2*(log(d2)-gamma2))))
            - (1/(1 + exp(-beta1*(log(d1)-gamma1))))
            * (1/(1 + exp(-beta2*(log(d2)-gamma2)))) )*15;
run;
***
NOTE: Convergence criterion met.
***
The NLIN Procedure
Source               DF    Sum of Squares    Mean Square    F Value    Approx Pr > F
Model                 4    526.6             131.7          353.38     0.0399 (e1)
Error                 1    0.3726            0.3726
Uncorrected Total     5    527.0

                             Approx        Approximate 95% Confidence Limits
Parameter    Estimate (e2)   Std Error     (e3)         (e4)
beta1         0.6573          0.1750        -1.5668       2.8814
beta2        -2.9541         37.8159      -483.4        477.5
gamma1        2.1875          0.5865        -5.2649       9.6399
gamma2        2.3591         10.2145      -127.4        132.1
***
From the output, the p-value given at (e1) tests the significance of the model, and in our example the model is significant at the 5% level (p = 0.0399). The estimated model parameters β1, β2, γ1, and γ2 are given in column (e2). The lower and upper limits of the approximate 95% confidence intervals for these parameters are given in columns (e3) and (e4); with only one error degree of freedom, all four intervals are wide and contain zero. A similar program can also be written for the model (4.7.6) in the case of nonadditive joint action.
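As a rough check on how the output fits together, the fitted additive model can be evaluated at the five design points using the estimates in column (e2). The Python sketch below is our illustration rather than part of the SAS analysis; it assumes model (4.7.3) with ymin = 0 and ymax = 15, as in the program, and the variable names are ours.

```python
import math

# Data of Table 4.7.1: (d1, d2, y).
data = [(15, 25, 9), (15, 75, 9), (50, 25, 12), (50, 75, 11), (25, 50, 10)]

# Estimates copied from column (e2) of the PROC NLIN output.
beta1, beta2, gamma1, gamma2 = 0.6573, -2.9541, 2.1875, 2.3591
YMIN, YMAX = 0.0, 15.0

def proportion(dose, beta, gamma):
    """Logistic proportion of (4.7.4)/(4.7.5); all doses here are positive."""
    return 1.0 / (1.0 + math.exp(-beta * (math.log(dose) - gamma)))

def fitted(d1, d2):
    """Fitted mean under the additive independent joint action model (4.7.3)."""
    p1 = proportion(d1, beta1, gamma1)
    p2 = proportion(d2, beta2, gamma2)
    return YMIN + (p1 + p2 - p1 * p2) * (YMAX - YMIN)

residuals = [y - fitted(d1, d2) for d1, d2, y in data]
sse = sum(e * e for e in residuals)          # error sum of squares
uncorrected_total = sum(y * y for _, _, y in data)

for (d1, d2, y), e in zip(data, residuals):
    print(f"d1={d1:3d} d2={d2:3d} y={y:3d} fitted={y - e:7.3f}")
print("error SS:", round(sse, 4), " uncorrected total SS:", uncorrected_total)
```

The uncorrected total is 527, and the error sum of squares comes out close to the 0.3726 shown in the output; small differences are expected because the estimates are rounded to four decimals.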
REFERENCES
ANTONELLO JM, RAGHAVARAO D. Optimal designs for the individual and joint exposure general logistic regression models. J Biopharm Statist 2000;10:351–367.
ASHFORD JR. General models for the joint action of mixtures of drugs. Biometrics 1981;37:457–474.
BARTON CN, BRAUNBERG RC, FRIEDMAN L. Nonlinear statistical models for the joint action of toxins. Biometrics 1993;49:95–105.
BECKA M, URFER W. Statistical aspects of inhalation toxicokinetics. Environ Ecol Statist 1996;3:51–64.
BECKA M, BOLT HM, URFER W. Statistical evaluation of toxicokinetic data. Environmetrics 1993;4:311–322.
BIEDERMANN S, DETTE H, ZHU W. Optimal designs for dose–response models with restricted design spaces. J Am Statist Assoc 2006;101:747–759.
BOSE M, DEY A. Optimal Cross-Over Designs. Singapore: World Scientific; 2009.
BOX MJ. Bias in nonlinear estimation. J R Statist Soc B 1971;33:171–201.
CHAUDHURI P, MYKLAND PA. On efficient designing of nonlinear experiment. Statist Sin 1995;5:421–440.
DETTE H, MELAS VB, PEPELYSHEV A. Optimal designs for a class of nonlinear regression models. Ann Statist 2004;32:2142–2167.
DETTE H, MELAS VB, WONG WK. Optimal designs for goodness-of-fit of the Michaelis–Menten enzyme kinetic function. J Am Statist Assoc 2005;100:1370–1381.
GOOS P. The Optimal Design of Blocked and Split-Plot Experiments. New York: Springer-Verlag; 2002.
HARVILLE DA. Maximum likelihood approaches to variance component estimation and to related problems. J Am Statist Assoc 1977;72:320–338.
KALISH LA. Efficient design for estimation of median lethal dose and quantal dose–response curves. Biometrics 1990;46:737–748.
KALISH LA, ROSENBERGER JL. Optimal designs for the estimation of the logistic function. Technical Report 33. University Park, PA: Pennsylvania State University; 1978.
KIEFER J. On the nonrandomized optimality and randomized nonoptimality of symmetric designs. Ann Math Statist 1958;29:675–699.
KRUG H, LIEBIG HP. Static regression models for planning greenhouse production. Acta Horticult 1988;230:427–433.
KUNERT J. Optimal design and refinement of the linear model with applications to repeated measurements designs. Ann Statist 1983;11:247–257.
LIEBIG HP. Temperature integration by kohlrabi growth. Acta Horticult 1988;230:427–433.
LUPINACCI PJ. D-Optimal designs for a class of nonlinear models [unpublished Ph.D. dissertation]. Philadelphia, PA: Temple University; 2000.
LUPINACCI PJ, RAGHAVARAO D. Designs for testing lack of fit for a nonlinear dose–response curve model. J Biopharm Statist 2000;10:45–53.
PATTERSON HD, THOMPSON R. Recovery of inter-block information when block sizes are unequal. Biometrika 1971;58:545–554.
PUKELSHEIM F. Optimal Design of Experiments. New York: Wiley; 1993.
SEARLE SR, CASELLA G, MCCULLOCH CE. Variance Components. New York: Wiley; 1992.
SHAH KR, SINHA BK. Theory of Optimal Designs. Lecture Notes in Statistics. New York: Springer; 1989.
SITTER RR, TORSNEY B. Optimal designs for binary response experiments with two design variables. Statist Sin 1995;5:405–419.
SU Y, RAGHAVARAO D. Minimal plus one point designs to test lack of fit for some sigmoidal curves. J Biopharm Statist 2013;23:281–293.
CHAPTER 5
Cross-Over Designs without Residual Effects
5.1 INTRODUCTION
When several treatments are tested in an experiment, it is economical and efficient to use sequences of treatments on each experimental unit. In such cases, the treatment differences can be estimated more accurately by eliminating the differences between experimental units. Consider an experiment in which v treatments are tested on b experimental units. Let the duration of the experiment be suitably divided into k periods for application of treatments. Each of the v treatments can be used on each experimental unit, or a subset of the v treatments can be used on each experimental unit, depending on the relative sizes of v and k. With v = k = b = 4 and treatments A, B, C, and D, the experimental layout may appear as in Table 5.1.1. In the layout of Table 5.1.1, unit 1 receives treatments A, B, C, and D, in that order, in the four periods; unit 2 receives treatments B, A, D, and C, in that order, in the four periods; etc. With v = 7, k = 3, b = 7, an experimental layout is given in Table 5.1.2. In the layout of Table 5.1.2, unit 1 receives treatments A, B, and D, in that order, out of the seven treatments A–G; unit 2 receives treatments B, C, and E, in that order; etc.
Repeated Measurements and Cross-Over Designs, First Edition. Damaraju Raghavarao and Lakshmi Padgett. © 2014 John Wiley & Sons, Inc. Published 2014 by John Wiley & Sons, Inc.
TABLE 5.1.1 Design for v = b = k = 4
            Experimental unit
Period      1    2    3    4
1           A    B    C    D
2           B    A    D    C
3           C    D    A    B
4           D    C    B    A
TABLE 5.1.2 Design for v = b = 7, k = 3
            Experimental unit
Period      1    2    3    4    5    6    7
1           A    B    C    D    E    F    G
2           B    C    D    E    F    G    A
3           D    E    F    G    A    B    C
Since a sequence of treatments is applied to each experimental unit and the treatments are changed during the course of the experiment, this class of designs is called cross-over designs, change-over designs, or switch-over designs in the literature. In this monograph, they will be referred to as cross-over designs. In cross-over designs, a treatment may produce a direct effect in the period of its application and residual effects in the periods after its application is discontinued. The residual effect of a treatment in the ith period after its application will be called the ith-order residual effect. Usually, though not always, the ith-order residual effect is smaller than the (i – 1)th-order residual effect, and thus, in many experimental situations, one ignores the second- and higher-order residual effects. Either the linear model used to analyze these designs should incorporate a term accounting for the residual effects, or the experimenter should conduct the experiment and collect the data in such a manner that there is a reasonable washout period between successive applications of
the treatments and the times of collecting the data for each treatment. In the washout period, the unit does not receive any treatment, so that the effect of the treatment used in the previous period washes out. Designs that use washout periods and therefore do not account for residual effects will be referred to as CODWOR, while cross-over designs in which no washout periods are provided between periods and residual effects are accounted for in the model will be referred to as CODWR. CODWOR are discussed in this chapter, and CODWR are discussed in Chapters 6–8.
5.2 FIXED EFFECTS ANALYSIS OF CODWOR
As mentioned in the introduction, CODWOR will now be considered using v treatments in k periods on b experimental units. The layout can be conveniently represented as a k × b array in v symbols, where the columns correspond to the units, the rows to the periods, and the symbols to the treatments. These designs are known in the statistical design literature as row–column designs, designs eliminating heterogeneity in two directions, or two-way elimination of heterogeneity designs. Designs eliminating heterogeneity are available in a more general setting where some cells are left untreated and/or some cells receive multiple treatments. However, such generalized two-way elimination of heterogeneity designs are usually not appropriate as repeated measurements designs, because there is no need to provide an extra rest period for an experimental unit when a washout period is already provided in the experimental protocol. Furthermore, if a unit drops out or dies in the course of the experiment, the resulting design does not fall into the framework of generalized two-way elimination of heterogeneity designs. Thus, attention is given in this chapter to designs where every cell is filled, and readers interested in generalized two-way elimination of heterogeneity designs are referred to Agrawal (1966a–c), Freeman and Jeffers (1962), and Pothoff (1962). Two-way Anova designs with multiple observations per cell may not be appropriate as repeated measurements designs. However, allowances are made for using a treatment more than once on a unit in different periods. The designs considered in this chapter will be written as a k × b array where each cell is filled with one of the v symbols {1, 2, …, v}. Here, the columns correspond to the experimental units, rows
to the periods, and symbols to the treatments. Let d(i, j) be the symbol used in the ith row and jth column. Let the ith symbol be replicated $r_i$ times and let $r' = (r_1, r_2, \ldots, r_v)$. If $Y_{ij}$ is the observation taken in the ith period on the jth unit, it is assumed that
\[
Y_{ij} = \mu + \rho_i + \gamma_j + \tau_{d(i,j)} + e_{ij}, \tag{5.2.1}
\]
where μ is the general mean, $\rho_i$ is the ith row (or period) effect, $\gamma_j$ is the jth column (or unit) effect, $\tau_{d(i,j)}$ is the effect of the treatment d(i, j), and the $e_{ij}$ are random errors assumed to be IIN(0, σ²). Note that the $\gamma_j$’s may be random effects if the experimental units are randomly selected from the population of experimental units; results in this direction will be discussed in Section 5.9. In this section, we restrict ourselves to the situation where the units are not randomly selected and the $\gamma_j$’s are fixed effects.
Let $\rho' = (\rho_1, \rho_2, \ldots, \rho_k)$, $\gamma' = (\gamma_1, \gamma_2, \ldots, \gamma_b)$, $\tau' = (\tau_1, \tau_2, \ldots, \tau_v)$, and $Y' = (Y_{11}, Y_{12}, \ldots, Y_{1b}, Y_{21}, \ldots, Y_{2b}, \ldots, Y_{k1}, \ldots, Y_{kb})$. Let U be a kb × v matrix whose (ij, ℓ) position is 1 if the ℓth treatment is applied in the (i, j) cell and 0 otherwise. Equation (5.2.1) can be rewritten as
\[
E(Y) = \mu J_{kb,1} + (I_k \otimes J_{b,1})\rho + (J_{k,1} \otimes I_b)\gamma + U\tau, \qquad \mathrm{Var}(Y) = \sigma^2 I_{kb}. \tag{5.2.2}
\]
Let $L_{v,k} = (\ell_{ij})$, where $\ell_{ij}$ is the number of times the ith treatment occurs in the jth row, and let $M_{v,b} = (m_{ij})$, where $m_{ij}$ is the number of times the ith treatment occurs in the jth column. The matrices L and M are, respectively, known as the treatment-row and treatment-column incidence matrices. The normal equations for the model (5.2.2) can be easily obtained and are
\[
\begin{bmatrix}
kb & bJ_{1,k} & kJ_{1,b} & r' \\
bJ_{k,1} & bI_k & J_{k,b} & L' \\
kJ_{b,1} & J_{b,k} & kI_b & M' \\
r & L & M & D(r)
\end{bmatrix}
\begin{bmatrix} \hat{\mu} \\ \hat{\rho} \\ \hat{\gamma} \\ \hat{\tau} \end{bmatrix}
=
\begin{bmatrix} G \\ P \\ C \\ T \end{bmatrix}, \tag{5.2.3}
\]
where D(r) denotes a diagonal matrix with diagonal entries $r_1, r_2, \ldots, r_v$; G is the grand total; P is the k × 1 vector of row totals $P_i$; C is the column vector of column totals $C_j$; T is the column vector of treatment totals $T_\ell$ of responses; and a hat (^) over a parameter denotes its least squares estimator.
The notation $C_{\alpha|\beta,\gamma,\ldots}$ will be used throughout this monograph to denote the coefficient matrix of $\hat{\alpha}$ in the reduced normal equations after eliminating the parameter vectors listed in the subscript after the bar and ignoring the parameter vectors that are present in the model but not listed in the subscript. The parameter μ is always eliminated, but it will not be written as a subscript of C. A similar interpretation will be given to the adjusted totals $Q_{\alpha|\beta,\gamma,\ldots}$.
Clearly, $LJ_{k,1} = r$, $J_{1,v}L = bJ_{1,k}$, $MJ_{b,1} = r$, $J_{1,v}M = kJ_{1,b}$, and $J_{1,k}P = J_{1,b}C = J_{1,v}T = G$. The second equation of (5.2.3) can be solved for $\hat{\rho}$ to get
\[
\hat{\rho} = \frac{1}{b}\left(P - bJ_{k,1}\hat{\mu} - J_{k,b}\hat{\gamma} - L'\hat{\tau}\right). \tag{5.2.4}
\]
Substituting for $\hat{\rho}$ in the third equation of (5.2.3) and simplifying, we obtain
\[
C_{\gamma|\rho}\hat{\gamma} + \left(M' - \frac{1}{b}J_{b,1}r'\right)\hat{\tau} = Q_{\gamma|\rho}, \tag{5.2.5}
\]
where
\[
C_{\gamma|\rho} = kI_b - \frac{k}{b}J_{b,b}, \qquad Q_{\gamma|\rho} = C - \frac{G}{b}J_{b,1}. \tag{5.2.6}
\]
Since $C^{-}_{\gamma|\rho} = (1/k)I_b$ is a generalized inverse of $C_{\gamma|\rho}$, from Equation (5.2.5), we have
\[
\hat{\gamma} = \frac{1}{k}\left\{C - \frac{G}{b}J_{b,1} - \left(M' - \frac{1}{b}J_{b,1}r'\right)\hat{\tau}\right\}. \tag{5.2.7}
\]
Substituting for $\hat{\rho}$ and $\hat{\gamma}$ in the fourth equation of (5.2.3) and simplifying, the reduced normal equations for estimating τ are given by
\[
C_{\tau|\rho,\gamma}\hat{\tau} = Q_{\tau|\rho,\gamma}, \tag{5.2.8}
\]
where
\[
C_{\tau|\rho,\gamma} = D(r) - \frac{1}{b}LL' - \frac{1}{k}MM' + \frac{1}{bk}rr' \tag{5.2.9}
\]
and
\[
Q_{\tau|\rho,\gamma} = T - \frac{1}{b}LP - \frac{1}{k}MC + \frac{G}{bk}r. \tag{5.2.10}
\]
The residual sum of squares for the model (5.2.2) is
\[
\begin{aligned}
R_0^2 &= Y'Y - \hat{\mu}G - \hat{\rho}'P - \hat{\gamma}'C - \hat{\tau}'T \\
&= \left\{\sum_{i=1}^{k}\sum_{j=1}^{b} Y_{ij}^2 - \frac{G^2}{bk}\right\}
 - \left\{\sum_{i=1}^{k}\frac{P_i^2}{b} - \frac{G^2}{bk}\right\}
 - \left\{\sum_{j=1}^{b}\frac{C_j^2}{k} - \frac{G^2}{bk}\right\}
 - \hat{\tau}'Q_{\tau|\rho,\gamma},
\end{aligned} \tag{5.2.11}
\]
with $(b - 1)(k - 1) - \mathrm{Rank}(C_{\tau|\rho,\gamma})$ degrees of freedom. When $\mathrm{Rank}(C_{\tau|\rho,\gamma}) = v - 1$, all elementary contrasts of treatment effects are estimable and the design is said to be connected. The connectedness property for CODWOR will be discussed in Section 5.3.
The null hypothesis of main interest to the experimenter is
\[
H_{0\tau}: \tau_1 = \tau_2 = \cdots = \tau_v \ (= a, \text{ say}). \tag{5.2.12}
\]
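Before turning to the test of this hypothesis, the incidence matrices L and M and the matrix in (5.2.9) can be computed directly for a small design. The Python sketch below is our illustration, assuming the layout of Table 5.1.2 (rows are periods, columns are units); the helper names are ours, and exact fractions are used so that the rank is not affected by rounding.

```python
from fractions import Fraction

# Table 5.1.2: one string per period, one character per experimental unit.
design = ["ABCDEFG", "BCDEFGA", "DEFGABC"]
treatments = sorted(set("".join(design)))
index = {t: i for i, t in enumerate(treatments)}
v, k, b = len(treatments), len(design), len(design[0])

# Treatment-row (L, v x k) and treatment-column (M, v x b) incidence matrices.
L = [[0] * k for _ in range(v)]
M = [[0] * b for _ in range(v)]
for i, row in enumerate(design):
    for j, t in enumerate(row):
        L[index[t]][i] += 1
        M[index[t]][j] += 1
r = [sum(row) for row in L]  # replication numbers

# C = D(r) - LL'/b - MM'/k + rr'/(bk), Equation (5.2.9).
C = [[(r[s] if s == t else 0)
      - Fraction(sum(L[s][i] * L[t][i] for i in range(k)), b)
      - Fraction(sum(M[s][j] * M[t][j] for j in range(b)), k)
      + Fraction(r[s] * r[t], b * k)
      for t in range(v)] for s in range(v)]

def rank(matrix):
    """Rank by exact Gauss-Jordan elimination on a copy of the matrix."""
    A = [list(row) for row in matrix]
    rk = 0
    for col in range(len(A[0])):
        pivot = next((i for i in range(rk, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for i in range(len(A)):
            if i != rk and A[i][col] != 0:
                factor = A[i][col] / A[rk][col]
                A[i] = [x - factor * y for x, y in zip(A[i], A[rk])]
        rk += 1
    return rk

print("r =", r)
print("diagonal of C:", C[0][0], " off-diagonal of C:", C[0][1])
print("rank of C:", rank(C), " v - 1 =", v - 1)
```

For this design every row of C sums to zero and the rank equals v − 1 = 6, so the design is connected in the sense just described and the error degrees of freedom are (b − 1)(k − 1) − 6 = 6.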
The restricted model then becomes
\[
E(Y) = \mu^{*}J_{kb,1} + (I_k \otimes J_{b,1})\rho + (J_{k,1} \otimes I_b)\gamma, \qquad \mathrm{Var}(Y) = \sigma^2 I_{kb}, \tag{5.2.13}
\]
where μ* = μ + a, and after some routine algebra, the residual sum of squares of the restricted model is given by
\[
R_{1\tau}^2 = \left\{\sum_{i=1}^{k}\sum_{j=1}^{b} Y_{ij}^2 - \frac{G^2}{bk}\right\}
 - \left\{\sum_{i=1}^{k}\frac{P_i^2}{b} - \frac{G^2}{bk}\right\}
 - \left\{\sum_{j=1}^{b}\frac{C_j^2}{k} - \frac{G^2}{bk}\right\}, \tag{5.2.14}
\]
with (b – 1)(k – 1) degrees of freedom. Thus, the sum of squares for testing the null hypothesis (5.2.12) is given by
\[
SS_{H_{0\tau}} = R_{1\tau}^2 - R_0^2 = \hat{\tau}'Q_{\tau|\rho,\gamma}, \tag{5.2.15}
\]
with $\mathrm{Rank}(C_{\tau|\rho,\gamma})$ degrees of freedom. From linear estimation theory (see Rao 1973, p. 191), we use
\[
\frac{\hat{\tau}'Q_{\tau|\rho,\gamma} / \mathrm{Rank}(C_{\tau|\rho,\gamma})}{R_0^2 / \{(b - 1)(k - 1) - \mathrm{Rank}(C_{\tau|\rho,\gamma})\}} > F_{\alpha}(\upsilon_1, \upsilon_2) \tag{5.2.16}
\]
as a critical region for testing $H_{0\tau}$, where $\upsilon_1 = \mathrm{Rank}(C_{\tau|\rho,\gamma})$, $\upsilon_2 = (b - 1)(k - 1) - \mathrm{Rank}(C_{\tau|\rho,\gamma})$, and $F_{\alpha}(\upsilon_1, \upsilon_2)$ is the upper α percentile point of the F distribution with $\upsilon_1$ and $\upsilon_2$ degrees of freedom. These results can be summarized in the Anova Table 5.2.1.

TABLE 5.2.1 Anova table for CODWOR
Source                        d.f.      S.S.                                          M.S.    F
Rows (ig. treat)              k – 1     $\sum_{i=1}^{k} P_i^2/b - G^2/(bk)$
Columns (ig. treat)           b – 1     $\sum_{j=1}^{b} C_j^2/k - G^2/(bk)$
Treatments (el. rows, col)    υ1        $\hat{\tau}'Q_{\tau|\rho,\gamma}$             MSt     MSt/MSe
Error                         υ2        By subtraction (= E, say)                     MSe
Total                         bk – 1    $\sum_{i=1}^{k}\sum_{j=1}^{b} Y_{ij}^2 - G^2/(bk)$
MSt = $\hat{\tau}'Q_{\tau|\rho,\gamma}/\upsilon_1$, MSe = $E/\upsilon_2$, $\upsilon_1 = \mathrm{Rank}(C_{\tau|\rho,\gamma})$, $\upsilon_2 = (b - 1)(k - 1) - \upsilon_1$.

As noted earlier, not all contrasts of treatment effects are estimable for every design. Since $C_{\tau|\rho,\gamma}J_{v,1} = O_{v,1}$, if ℓ′τ is estimable, then $\ell'J_{v,1} = 0$. It can be easily seen that ℓ′τ is estimable iff $C_{\tau|\rho,\gamma}\ell \neq 0$. If ℓ′τ is estimable, then its best linear unbiased estimate (b.l.u.e.) is
\[
\widehat{\ell'\tau} = \ell'\hat{\tau} = \ell'C^{-}_{\tau|\rho,\gamma}Q_{\tau|\rho,\gamma}, \tag{5.2.17}
\]
with estimated variance
\[
\widehat{\mathrm{Var}}\left(\widehat{\ell'\tau}\right) = \ell'C^{-}_{\tau|\rho,\gamma}\ell \; \mathrm{MSe}. \tag{5.2.18}
\]
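As a small worked illustration of (5.2.9), (5.2.17), and (5.2.18), which is ours and not part of the original text, consider the design of Table 5.1.2. The incidence matrices are read off that table: every treatment occurs once in every period, and every pair of treatments occurs together on exactly one unit. The generalized inverse shown is one convenient choice; the quadratic form does not depend on the choice when the contrast is estimable.

```latex
% Worked example (assumption: design of Table 5.1.2, v = b = 7, k = 3, r_i = 3).
\begin{align*}
L &= J_{7,3}, \qquad LL' = 3J_{7,7}, \qquad MM' = 2I_7 + J_{7,7}, \qquad rr' = 9J_{7,7}, \\
C_{\tau|\rho,\gamma}
  &= 3I_7 - \tfrac{1}{7}\,(3J_{7,7}) - \tfrac{1}{3}\,(2I_7 + J_{7,7}) + \tfrac{1}{21}\,(9J_{7,7})
   = \tfrac{7}{3}I_7 - \tfrac{1}{3}J_{7,7}, \\
C^{-}_{\tau|\rho,\gamma} &= \tfrac{3}{7}I_7
  \quad\text{since } C_{\tau|\rho,\gamma}^2 = \tfrac{7}{3}\,C_{\tau|\rho,\gamma}, \\
\ell &= (1, -1, 0, \ldots, 0)' :\qquad
\ell' C^{-}_{\tau|\rho,\gamma}\,\ell = \tfrac{3}{7}\cdot 2 = \tfrac{6}{7}.
\end{align*}
```

Under these assumptions the rank of the coefficient matrix is 6, so υ1 = 6 and υ2 = (7 − 1)(3 − 1) − 6 = 6, and the estimated variance of the b.l.u.e. of τ1 − τ2 is (6/7) MSe.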
Thus, the null hypothesis
\[
H_0: \ell'\tau = a, \tag{5.2.19}
\]
when ℓ′τ is estimable, can be tested for any real value a by using the t-statistic
\[
t = \frac{\ell'C^{-}_{\tau|\rho,\gamma}Q_{\tau|\rho,\gamma} - a}{\sqrt{\ell'C^{-}_{\tau|\rho,\gamma}\ell \; \mathrm{MSe}}}, \tag{5.2.20}
\]
with $\upsilon_2$ degrees of freedom. Scheffé’s method of multiple comparisons can be used for drawing inferences about several contrasts.
To test the null hypothesis of equality of column effects given by
\[
H_{0\gamma}: \gamma_1 = \gamma_2 = \cdots = \gamma_b, \tag{5.2.21}
\]
we eliminate $\hat{\mu}$, $\hat{\rho}$, and $\hat{\tau}$, in that order, in Equation (5.2.3) to get
\[
C_{\gamma|\rho,\tau}\hat{\gamma} = Q_{\gamma|\rho,\tau}, \tag{5.2.22}
\]
where
\[
C_{\gamma|\rho,\tau} = kI_b - \frac{k}{b}J_{b,b} - \left(M - \frac{1}{b}rJ_{1,b}\right)' C^{-}_{\tau|\rho}\left(M - \frac{1}{b}rJ_{1,b}\right), \tag{5.2.23}
\]
\[
Q_{\gamma|\rho,\tau} = C - \frac{G}{b}J_{b,1} - \left(M - \frac{1}{b}rJ_{1,b}\right)' C^{-}_{\tau|\rho}Q_{\tau|\rho}, \tag{5.2.24}
\]
$C_{\tau|\rho}$ and $Q_{\tau|\rho}$ being, respectively, given by
\[
C_{\tau|\rho} = D(r) - \frac{1}{b}LL', \qquad Q_{\tau|\rho} = T - \frac{1}{b}LP. \tag{5.2.25}
\]
The sum of squares for testing the null hypothesis (5.2.21) will then be given by
\[
SS_{H_{0\gamma}} = \hat{\gamma}'Q_{\gamma|\rho,\tau}, \tag{5.2.26}
\]
with $\mathrm{Rank}(C_{\gamma|\rho,\tau}) = \upsilon_3$ degrees of freedom. The critical region for testing $H_{0\gamma}$ is then given by
\[
\frac{SS_{H_{0\gamma}}/\upsilon_3}{R_0^2/\upsilon_2} > F_{\alpha}(\upsilon_3, \upsilon_2). \tag{5.2.27}
\]
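Putting
\[
C_{\tau|\gamma} = D(r) - \frac{1}{k}MM', \qquad Q_{\tau|\gamma} = T - \frac{1}{k}MC, \tag{5.2.28}
\]
\[
C_{\rho|\gamma,\tau} = bI_k - \frac{b}{k}J_{k,k} - \left(L - \frac{1}{k}rJ_{1,k}\right)' C^{-}_{\tau|\gamma}\left(L - \frac{1}{k}rJ_{1,k}\right), \tag{5.2.29}
\]
\[
Q_{\rho|\gamma,\tau} = P - \frac{G}{k}J_{k,1} - \left(L - \frac{1}{k}rJ_{1,k}\right)' C^{-}_{\tau|\gamma}Q_{\tau|\gamma}, \tag{5.2.30}
\]
the reduced normal equations for estimating ρ are
\[
C_{\rho|\gamma,\tau}\hat{\rho} = Q_{\rho|\gamma,\tau}, \tag{5.2.31}
\]
and the sum of squares for testing
\[
H_{0\rho}: \rho_1 = \rho_2 = \cdots = \rho_k \tag{5.2.32}
\]
is given by
\[
SS_{H_{0\rho}} = \hat{\rho}'Q_{\rho|\gamma,\tau}, \tag{5.2.33}
\]
with $\mathrm{Rank}(C_{\rho|\gamma,\tau}) = \upsilon_4$ degrees of freedom. The critical region for testing $H_{0\rho}$ is then given by
\[
\frac{SS_{H_{0\rho}}/\upsilon_4}{R_0^2/\upsilon_2} > F_{\alpha}(\upsilon_4, \upsilon_2). \tag{5.2.34}
\]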
5.3 CONNECTEDNESS IN CODWOR
The following definition holds in any multifactor experiment:
Definition 5.3.1. Two levels of a factor are said to be connected if the difference between their effects is estimable, and a factor is connected if each pair of its levels is connected.
A CODWOR with treatments connected is also known as a doubly connected design. The following theorem characterizes this property in terms of the rank of $C_{\tau|\rho,\gamma}$:
Theorem 5.3.1. A CODWOR is doubly connected iff $\mathrm{Rank}(C_{\tau|\rho,\gamma}) = v - 1$.
Proof. Let the design be doubly connected and let $\tau_1 - \tau_i = \ell_i'\tau$ for appropriate $\ell_i$, i = 2, 3, …, v. Let $m_i'Q_{\tau|\rho,\gamma}$ be the b.l.u.e. of $\ell_i'\tau$, implying that
\[
C_{\tau|\rho,\gamma}m_i = \ell_i, \qquad i = 2, 3, \ldots, v. \tag{5.3.1}
\]
The vectors $m_1 = J_{v,1}, m_2, \ldots, m_v$ can be easily verified to be independent. Put $M = [m_1 \; m_2 \; \cdots \; m_v]$ and $L = [O_{v,1} \; \ell_2 \; \cdots \; \ell_v]$. We have
\[
C_{\tau|\rho,\gamma}M = L \tag{5.3.2}
\]
and, as M is nonsingular,
\[
\mathrm{Rank}(C_{\tau|\rho,\gamma}) = \mathrm{Rank}(L) = v - 1. \tag{5.3.3}
\]
Conversely, assume $\mathrm{Rank}(C_{\tau|\rho,\gamma}) = v - 1$. Let $\xi_1, \xi_2, \ldots, \xi_{v-1}$ be a set of orthonormal eigenvectors corresponding to the nonzero eigenvalues $\theta_1, \theta_2, \ldots, \theta_{v-1}$ of $C_{\tau|\rho,\gamma}$. Then, if ℓ′τ is an elementary contrast, it can be shown that its b.l.u.e. is $\sum_{i=1}^{v-1} a_i\xi_i'Q_{\tau|\rho,\gamma}/\theta_i$, whenever $\ell = \sum_{i=1}^{v-1} a_i\xi_i$.
A chain condition characterizing the connectedness property in a general multifactor design was given by Srivastava and Anderson (1970) and can be specialized for doubly connected designs. However, this condition is not as easy to apply as the corresponding condition for block designs. If a CODWOR is doubly connected, by putting γ = 0 identically in the model, it can be easily verified that all the contrasts of treatment effects are estimable if the design eliminates heterogeneity through rows
only. Similarly, putting ρ = 0, the contrasts of treatment effects are estimable if the design eliminates heterogeneity through columns only. In looking at the converse of this problem, Federer and Zelen (1964) conjectured that an equireplicated CODWOR, in which the row design is connected and the column design is connected, is doubly connected. This conjecture was shown to be false by Shah and Khatri (1973), who gave the following counterexample:
\[
\begin{matrix}
1 & 2 & 5 & 6 \\
3 & 4 & 7 & 8 \\
8 & 6 & 1 & 3 \\
7 & 5 & 2 & 4
\end{matrix} \tag{5.3.4}
\]
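The counterexample can be checked numerically with the rank criterion of Theorem 5.3.1. The Python sketch below is our illustration: it assumes the array (5.3.4) is read with rows as periods and columns as units (because k = b = 4, transposing the array only swaps the roles of L and M and does not change the matrix of (5.2.9)), and the helper names are ours.

```python
from fractions import Fraction

# Array (5.3.4): rows are periods, columns are units, entries are treatments 1..8.
design = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [8, 6, 1, 3],
    [7, 5, 2, 4],
]
k, b, v = len(design), len(design[0]), 8

L = [[0] * k for _ in range(v)]  # treatment-row incidence
M = [[0] * b for _ in range(v)]  # treatment-column incidence
for i, row in enumerate(design):
    for j, t in enumerate(row):
        L[t - 1][i] += 1
        M[t - 1][j] += 1
r = [sum(row) for row in L]

def c_matrix(use_rows, use_cols):
    """D(r) minus the row and/or column adjustments of (5.2.9), (5.2.25), (5.2.28)."""
    C = [[Fraction(r[s] if s == t else 0) for t in range(v)] for s in range(v)]
    for s in range(v):
        for t in range(v):
            if use_rows:
                C[s][t] -= Fraction(sum(L[s][i] * L[t][i] for i in range(k)), b)
            if use_cols:
                C[s][t] -= Fraction(sum(M[s][j] * M[t][j] for j in range(b)), k)
            if use_rows and use_cols:
                C[s][t] += Fraction(r[s] * r[t], b * k)
    return C

def rank(matrix):
    """Rank by exact Gauss-Jordan elimination on a copy of the matrix."""
    A = [list(row) for row in matrix]
    rk = 0
    for col in range(len(A[0])):
        pivot = next((i for i in range(rk, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for i in range(len(A)):
            if i != rk and A[i][col] != 0:
                factor = A[i][col] / A[rk][col]
                A[i] = [x - factor * y for x, y in zip(A[i], A[rk])]
        rk += 1
    return rk

rank_rows = rank(c_matrix(True, False))   # row design only, C of (5.2.25)
rank_cols = rank(c_matrix(False, True))   # column design only, C of (5.2.28)
rank_both = rank(c_matrix(True, True))    # rows and columns, C of (5.2.9)
print(rank_rows, rank_cols, rank_both, "v - 1 =", v - 1)
```

Under this reading, the row design and the column design each give rank v − 1 = 7, while eliminating both rows and columns gives rank 6, so the design is equireplicated with connected row and column designs but is not doubly connected.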
While the usual method of showing that the design (5.3.4) is not doubly connected rests on calculating Cτ|ρ,γ and showing that its rank is