Nonparametric Trend Analysis: A Practical Guide for Research Workers 9780773592728



English Pages [66] Year 1965


Table of contents :
Cover
Title
Copyright
CONTENTS
Preface
1. Introduction
2. Sectionally Monotonic Functions
3. The Statistic S and Its Distribution
4. Monotonic Trend for Correlated Data: Large-Sample Procedure without Ties
5. Monotonic Trend for Correlated Data: Large-Sample Procedure with Ties
6. Monotonic Trend for Correlated Data: Small-Sample Procedure
7. Monotonic Trend for Independent Samples
8. Nonmonotonic Rank Correlation
9. Higher-Order Trend Analysis for Correlated Data
10. Higher-Order Trend Analysis for Independent Samples
11. Concluding Observations
References


GEORGE ANDREW FERGUSON

nonparametric trend analysis

McGILL UNIVERSITY PRESS

MONTREAL 1965

Copyright, Canada, 1965 by McGill University Press. All rights reserved. Printed in Canada.

Printed in Republic of Korea

PREFACE

This monograph is intended to serve as a practical guide to research workers in the nonparametric analysis of experimental data for trend. The procedures discussed have application to the data of experiments where the treatment variable either is, or may be represented as, an ordinal variable. Under these circumstances the investigator may wish to study the relation between the treatment and the experimental variables to see whether it is monotonically increasing or decreasing, whether it is a two- or a three-sided sectionally monotonic relation, and so on. The method of orthogonal polynomials is the analogous parametric procedure.

In the forms of analysis described in this monograph, use is made of the sampling distribution of the statistic S, as used by M. G. Kendall in the definition of the rank correlation coefficient tau. Indeed, the material in this monograph may be viewed simply as an application of the statistic S. Use is also made of ranks for orthogonal polynomials. Nonparametric trend tests other than those described in this monograph can very readily be devised.

In the preparation of this monograph numerous illustrative examples have been used. My purpose has been to present the material in a form readily usable by research workers.

I am indebted to M. G. Kendall, who read the manuscript and advanced a number of useful comments. I am indebted also to R. B. Malmo, whose research in the field of physiological psychology suggested to me the need for the development of the methods described in this monograph.

GEORGE A. FERGUSON


1 Introduction

Consider an experiment designed to compare the effects of k experimental treatments. Each treatment may be applied to a different group of subjects, the subjects being assigned to treatments by a randomization procedure. Such a design is sometimes called a randomized group design. Data resulting from this type of experiment are commonly analysed using an analysis of variance for one-way classification. An F ratio is used to test the null hypothesis

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k,$$

where $\mu_1, \mu_2, \ldots, \mu_k$ are the population means for the k groups. If grounds exist for the rejection of this hypothesis, the means for groups may be compared two at a time. Techniques for the making of such multiple comparisons have been suggested by Duncan (1955), Scheffé (1953, 1959), and others.

The F test for means, and the comparison of means two at a time, are techniques which are widely applied to experimental data, regardless of the nature of the treatment variable. Clearly such procedures are appropriate if the treatment variable is a nominal variable. Such, for example, would be the case if the experiment were concerned with the therapeutic effects of different drugs, the effects on learning of quite different learning conditions, the effects of brain lesions in different loci on the behaviour of experimental animals, or the influence of different sensory cues upon maze performance. If the levels of the treatment variable constitute a set of nominal categories, the investigator cannot proceed beyond an F test for means and, if the null hypothesis is rejected, the comparison of means two at a time and possibly, if meaningful, the comparison of one subset of means with another subset.

In many experiments the treatment variable is not nominal, but may be a variable of either the interval or ratio type. Such would be the case if the experiment were concerned with the behavioural effects of equally spaced dosages of a particular drug, the influence of equally spaced periods of practice on learning a task, or the influence of equally spaced numbers of reinforcements on conditioning or extinction. Under these circumstances the investigator may extend his analysis beyond an F test of the null hypothesis for means, or the making of multiple comparisons, and may study in detail the nature of the functional relation between the dependent, or experimental, variable and the independent, or treatment, variable. If the treatment and experimental variables are both of the interval or ratio type, questions about the shape of the relation between the two variables are meaningful. The appropriate form of analysis here is called trend analysis. Such analysis, when applied to the data of experiments, may attempt to show whether the dependent variable exhibits a systematic tendency to increase, or decrease, in a linear fashion with change in the treatment variable, or whether a more complex type of relation exists between the variables.

A procedure commonly used in the analysis of experimental data for trend is the method of orthogonal polynomials, developed by R. A. Fisher. This method partitions the between-group sum of squares into a number of orthogonal, or independent, regression components, which are linear, quadratic, cubic, quartic, and possibly of higher order. An F ratio may be used to test each component in turn for significance.
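As a rough illustration of the partition just described, the following Python sketch splits a between-group sum of squares into its linear and quadratic components for three equally spaced treatments. The scores and all function names are hypothetical, and equal group sizes are assumed; it is a minimal sketch of the parametric idea, not a full analysis of variance.

```python
# Hypothetical scores for k = 3 equally spaced treatments, n = 4 subjects each.
GROUPS = [
    [3.0, 5.0, 4.0, 4.0],
    [6.0, 7.0, 8.0, 7.0],
    [6.0, 5.0, 7.0, 6.0],
]

# Orthogonal polynomial coefficients for three equally spaced levels.
COEFFICIENTS = {"linear": (-1, 0, 1), "quadratic": (1, -2, 1)}


def group_means(groups):
    return [sum(g) / len(g) for g in groups]


def between_group_ss(groups):
    """Between-group sum of squares, assuming equal group sizes."""
    n = len(groups[0])
    means = group_means(groups)
    grand_mean = sum(means) / len(means)
    return n * sum((m - grand_mean) ** 2 for m in means)


def component_ss(groups, coeffs):
    """Sum of squares carried by one orthogonal contrast (equal group sizes)."""
    n = len(groups[0])
    contrast = sum(c * m for c, m in zip(coeffs, group_means(groups)))
    return n * contrast ** 2 / sum(c * c for c in coeffs)


components = {name: component_ss(GROUPS, c) for name, c in COEFFICIENTS.items()}
total = between_group_ss(GROUPS)
print(components, total)
```

With these made-up scores the linear and quadratic components add up to the between-group sum of squares, which is the sense in which the components are independent.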
For k = 3 the between-group sum of squares may be partitioned into two components, a linear component and a quadratic component. For k = 4 a partitioning into three components is possible: linear, quadratic, and cubic. In general, the total between-group sum of squares may be partitioned into k − 1 independent components. In practice the interest of the experimenter will rarely extend beyond the first three or four components.

The method of orthogonal polynomials assumes that both the experimental variable and the treatment variable can, for all practical purposes, be regarded as of the interval or ratio type. In the discussion above, and in the majority of applications, the treatments are equally spaced; that is, the intervals between treatments are equal. This is not a necessary condition for the use of orthogonal polynomials. The method has been adapted to situations with unequal intervals.

In the above discussion we have considered, first, the case where the treatment variable is a nominal variable and, second, the case where the treatment variable is of the interval or ratio type, the experimental variable, in both instances, being viewed as of interval or ratio type. Many of the variables used in the behavioural sciences are neither nominal, nor interval or ratio, but are ordinal. In many experiments the treatment variable is composed of a set of ordered categories. One treatment may be said to be greater than or less than another, but no meaningful assertions can be made about the equality of intervals between treatments or about how much greater one treatment is than another. Because of this the method of orthogonal polynomials, or any systematic curve-fitting procedure, may not, from a rigorously logical viewpoint, be applied appropriately to the data.

This method may also frequently be found, for other reasons, to be inappropriate in the analysis of much experimental data. The method of orthogonal polynomials is an application of the analysis of variance, and involves the usual assumptions of homogeneity and normality implicit in that method. In practice many sets of experimental data encountered in experiments that involve the behaviour of human and animal subjects violate these assumptions in greater or less degree.
The circumstances mentioned above suggest that a method of nonparametric trend analysis might find useful application in the analysis of many sets of data in the behavioural sciences. This monograph presents what appears to be a complete method for the conducting of such analysis. The method is based on ranks, and employs the sampling distribution of the statistic S as used in the definition of Kendall's coefficient of rank correlation tau. Methods of nonparametric trend analysis, and methods based on ranks, other than those described here, can readily be devised. For example, methods employing $\Sigma d^2$, as used in the definition of Spearman's rank correlation coefficient, can readily be formulated.

Several reasons exist for the choice of a ranking method involving the statistic S. Much is known about the sampling distribution of the statistic S. Its distribution rapidly approaches the normal form as the number of ranks increases. The variance of the distribution is known when one or both sets of ranks contain ties. In general, problems involving ties have been fully explored. Both the asymptotic normality and the power of S have been investigated by Mann (1945). The Wilcoxon-Mann-Whitney U test, one of the more powerful nonparametric tests, is, in effect, a particular application of the sampling distribution of S. This application arises where one set of paired ranks degenerates to a dichotomy.

The development of the methods of nonparametric trend analysis described in this monograph was the result of an interest in the analysis of data from experiments designed to test theories that postulate nonlinear, and sectionally monotonic, relations between constructs. Such theories are on the increase in the behavioural sciences, and are at present rather popular in psychological investigation at McGill University. An example is activation theory as discussed by Lindsley (1951), Hebb (1955), Malmo (1959), and others.
This theory postulates an inverted-U relation between activation and behavioural efficiency, which is described by Malmo as follows: "From low activation up to a point that is optimal for a given function, level of performance rises monotonically with increasing activation level, but beyond this optimal point the relation becomes nonmonotonic. Further increase in activation beyond this point produces a fall in performance level." Many of the experiments in this field of research involve physiological measurements, which are difficult to obtain and are often rather unstable. The investigator is more concerned with the sectionally monotonic properties of the data than with precise knowledge of a functional relation.

2 Sectionally Monotonic Functions

A function Y = f(X) is said to be a monotonically increasing function if any increment in X is associated with an increment in Y. Similarly, if any increment in X is associated with a decrement in Y, the function is said to be a monotonically decreasing function. The magnitude of the increment, or decrement, in X associated with the increment or decrement in Y is irrelevant to the concept of a monotonic function. Monotonicity is an order concept. Because this is so, the monotonic property of any monotonic parametric function is invariant under the transformation of the variables to an order form. This simply means that the monotonic property of the relation remains unchanged when the variate values are replaced by ranks.

There are functions which consist of two or more branches, each branch being a monotonically increasing or decreasing function. Such functions are sectionally monotonic. No terminology, it seems, exists to describe such functions. I propose, therefore, to speak of a function that consists of two branches, one monotonically increasing and the other monotonically decreasing, as a bitonic function. A function with three branches, two increasing and one decreasing or vice versa, will be spoken of as a tritonic function, and so on. The general class of function to which these particular varieties belong may be described as a polytonic function.

Although the analogy is weak, a monotonic function may for convenience be regarded as the order statistic analogue of a linear function, a bitonic as the analogue of a quadratic, a tritonic as the analogue of a cubic, and so on. In general, a polytonic function with k branches may be regarded as the order statistic analogue of a polynomial of degree k.
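The terminology above, and the invariance of the branch structure under ranking, can be illustrated with a short Python sketch. The helper names and the sample values are hypothetical, and the treatment of flat steps (ignored) is an assumption made here for simplicity rather than part of the definitions in the text.

```python
NAMES = {0: "constant", 1: "monotonic", 2: "bitonic", 3: "tritonic"}


def count_branches(values):
    """Count monotonic branches in a sequence of Y values ordered by X."""
    directions = []
    for prev, curr in zip(values, values[1:]):
        if curr == prev:
            continue  # assumption: a flat step neither ends nor starts a branch
        step = 1 if curr > prev else -1
        if not directions or directions[-1] != step:
            directions.append(step)  # the direction changed: a new branch begins
    return len(directions)


def to_ranks(values):
    """Replace untied values by their ranks 1..k."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def describe(values):
    return NAMES.get(count_branches(values), "polytonic")


inverted_u = [2.0, 3.5, 7.1, 6.0, 1.2]  # hypothetical Y values, ordered by X
print(describe(inverted_u), describe(to_ranks(inverted_u)))
```

Both the raw values and their ranks are described as bitonic, since replacing values by ranks leaves every pairwise order comparison unchanged.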


3 The Statistic S and Its Distribution

In the development and application of nonparametric tests of trend, use is made of the statistic S, as used by Kendall (1943, 1955) in the definition of a coefficient of rank correlation, tau. A brief discussion of S and its distribution is appropriate here.

S is descriptive of disarray in a set of ranks. Consider the following paired ranks:

    X:  1  2  3  4  5
    Y:  1  4  3  5  2

The X ranks are in their natural order; the Y ranks exhibit a degree of disarray. To calculate S we compare every rank on Y with every other rank, there being k(k − 1)/2 such comparisons for k ranks. If a pair is ranked in its natural order, say 1 and 4, a weight +1 is assigned. If a pair is ranked in an inverse order, say 4 and 3, a weight −1 is assigned. The statistic S is the sum of such weights over all k(k − 1)/2 comparisons. In the above example the weights are +1, +1, +1, +1, −1, +1, −1, +1, −1, −1, and S = 2.

S is a measure of trend. A positive value of S means that the Y ranks show a tendency to increase monotonically with increase in X, whereas a negative S means that the Y ranks show a tendency to decrease monotonically with increase in X.

If ties occur, the common convention is adopted of replacing the tied values by the average rank. A comparison of two tied values on Y receives a weight of zero. If ties occur on X, a comparison of the corresponding paired Y values will also receive a weight of zero, regardless of whether the paired Y values are tied. Consider the following example with tied values on both X and Y:

    X:  1.5  1.5  3    5    5  5
    Y:  2    3    4.5  4.5  1  6

Here the comparison on Y of 2 with 3 receives a weight of zero, because the order of the first two paired values on X is arbitrary. Similarly, comparisons on Y involving the last three values will receive weights of zero, because of the triplet of ties on X. In the above example the weights are 0, +1, +1, −1, +1, +1, +1, −1, +1, 0, −1, +1, 0, 0, 0, and S = 4.

The sampling distribution of S is obtained by considering the k factorial arrangements of Y in relation to X. A value S may be determined for each of the k! arrangements. The distribution of these k factorial values is the sampling distribution of S. This distribution is symmetrical. Frequencies taper off systematically from the maximum value towards the tails. The distribution of S rapidly approaches the normal form. For k ≥ 10 the normal approximation to the exact distribution is very close. For k = 10 the greatest discrepancy between the exact cumulative relative frequencies and the corresponding normal approximations, using a continuity correction, is of the order 0.005. At the critical 0.025 and 0.050 levels the discrepancies are of the order 0.001 and 0.0005. Table I shows probabilities calculated from the exact sampling distributions of S for k = 4 to k = 10. This table is reproduced from Kendall (1955).

The variance of the distribution of S without ties is

$$\sigma_S^2 = \frac{k(k-1)(2k+5)}{18}. \tag{1}$$
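The counting rule for S, including the zero weights for ties on either variable, and formula (1) can be sketched in Python as follows. The function names are hypothetical; the two data sets are the paired ranks used as examples in the text.

```python
from itertools import combinations


def sign(d):
    """Return +1, 0, or -1 according to the sign of d."""
    return (d > 0) - (d < 0)


def kendall_s(x, y):
    """Sum the weights over all k(k - 1)/2 pairs.

    A pair tied on X or on Y contributes zero, because the product of the
    two signs is zero whenever either difference is zero.
    """
    return sum(
        sign(xj - xi) * sign(yj - yi)
        for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
    )


def variance_untied(k):
    """Formula (1): sampling variance of S for k untied ranks."""
    return k * (k - 1) * (2 * k + 5) / 18


print(kendall_s([1, 2, 3, 4, 5], [1, 4, 3, 5, 2]))                        # 2
print(kendall_s([1.5, 1.5, 3, 5, 5, 5], [2, 3, 4.5, 4.5, 1, 6]))          # 4
print(round(variance_untied(4), 2), variance_untied(10))                  # 8.67 125.0
```

The sketch works equally well on the original measurements, since only the order of the values enters the comparison.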

TABLE I*. PROBABILITY THAT S ATTAINS OR EXCEEDS A SPECIFIED VALUE (SHOWN ONLY FOR POSITIVE VALUES; NEGATIVE VALUES OBTAINABLE BY SYMMETRY)

              Values of k
  S      4       5       8       9
  0    0.625   0.592   0.548   0.540
  2    0.375   0.408   0.452   0.460
  4    0.167   0.242   0.360   0.381
  6    0.042   0.117   0.274   0.306
  8            0.042   0.199   0.238
 10            0.0²83  0.138   0.179
 12                    0.089   0.130
 14                    0.054   0.090
 16                    0.031   0.060
 18                    0.016   0.038
 20                    0.0²71  0.022
 22                    0.0²28  0.012
 24                    0.0³87  0.0²63
 26                    0.0³19  0.0²29
 28                    0.0⁴25  0.0²12
 30                            0.0³43
 32                            0.0³12
 34                            0.0⁴25
 36                            0.0⁵28

          Values of k
  S      6       7       10
  1    0.500   0.500   0.500
  3    0.360   0.386   0.431
  5    0.235   0.281   0.364
  7    0.136   0.191   0.300
  9    0.068   0.119   0.242
 11    0.028   0.068   0.190
 13    0.0²83  0.035   0.146
 15    0.0²14  0.015   0.108
 17            0.0²54  0.078
 19            0.0²14  0.054
 21            0.0³20  0.036
 23                    0.023
 25                    0.014
 27                    0.0²83
 29                    0.0²46
 31                    0.0²23
 33                    0.0²11
 35                    0.0³47
 37                    0.0³18
 39                    0.0⁴58
 41                    0.0⁴15
 43                    0.0⁵28
 45                    0.0⁶28

Note: Repeated zeros are indicated by powers; e.g., 0.0³47 stands for 0.00047.
*Reproduced from M. G. Kendall, Rank Correlation Methods, with permission of Charles Griffin and Co. Ltd., London.
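The exact probabilities in Table I can be checked for small k by direct enumeration of the k! arrangements, as described in the text. The following Python sketch does this by brute force; the function names are hypothetical, and the approach is only practical for small k, since the number of arrangements grows factorially.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations


def s_statistic(y):
    """S for Y ranks paired with X ranks in natural order."""
    return sum((b > a) - (b < a) for a, b in combinations(y, 2))


def exact_distribution(k):
    """Frequency of each value of S over all k! arrangements of the Y ranks."""
    return Counter(s_statistic(p) for p in permutations(range(1, k + 1)))


def upper_tail(k, s):
    """Exact probability that S attains or exceeds s."""
    counts = exact_distribution(k)
    total = sum(counts.values())
    return Fraction(sum(c for value, c in counts.items() if value >= s), total)


for s in (0, 2, 4, 6):
    print(s, round(float(upper_tail(4, s)), 3))
```

For k = 4 this prints 0.625, 0.375, 0.167, and 0.042, which agree with the first column of Table I.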


The presence of ties affects the variance of the sampling distribution of S. If ties occur in one set of ranks, and not in the other, the variance of S becomes

$$\sigma_S^2 = \frac{1}{18}\Big[k(k-1)(2k+5) - \sum t(t-1)(2t+5)\Big]. \tag{2}$$

One set of ranks contains m sets of t ties. Note that the effect of ties is to reduce the variance $\sigma_S^2$. Examination of formula (2) shows that the effect of ties is to reduce the variance by unity for each tied pair, by 3.67 for each triplet of ties, and by 8.67 for each quadruplet of ties. Thus we have available a very convenient correction procedure.

If ties occur in both sets of ranks, the variance of S is given by

$$\sigma_S^2 = \frac{1}{18}\Big[k(k-1)(2k+5) - \sum t(t-1)(2t+5) - \sum u(u-1)(2u+5)\Big] + \frac{1}{9k(k-1)(k-2)}\Big[\sum t(t-1)(t-2)\Big]\Big[\sum u(u-1)(u-2)\Big] + \frac{1}{2k(k-1)}\Big[\sum t(t-1)\Big]\Big[\sum u(u-1)\Big]. \tag{3}$$

In this formula one set of ranks contains m sets of t ties, and the other set of ranks r sets of u ties. Formula (3) is the more general form for the variance of the distribution of S.

It may be of some computational interest to note that if one set of ranks contains m tied pairs, and the other set contains r tied pairs, a correction for ties may be made by subtracting from $\sigma_S^2$, as calculated by formula (1), the quantity

$$m + r - \frac{2mr}{k(k-1)}.$$

Thus for k = 10, if X contained 3 tied pairs and Y contained 4 tied pairs the correction factor would be 6.73. For k = 10 the quantity $\sigma_S^2$ without ties from formula (1) is 125.00, and the corrected value is 125.00 − 6.73 = 118.27. Table 2 shows values of $\sigma_S^2$ and $\sigma_S$ without ties calculated from formula (1) for k = 2 to k = 20.

TABLE 2. VALUES OF $\sigma_S^2$ AND $\sigma_S$ WITHOUT TIES* FOR k = 2 TO k = 20

   k     σ_S²       σ_S
   2      1.000     1.000
   3      3.667     1.915
   4      8.667     2.944
   5     16.667     4.082
   6     28.333     5.323
   7     44.333     6.658
   8     65.333     8.083
   9     92.000     9.592
  10    125.000    11.180
  11    165.000    12.845
  12    212.667    14.583
  13    268.667    16.391
  14    333.667    18.267
  15    408.333    20.207
  16    493.333    22.211
  17    589.333    24.276
  18    697.000    26.401
  19    817.000    28.583
  20    950.000    30.822

*Rules for ties:
Rule 1. When ties occur in one set of ranks only, subtract from $\sigma_S^2$ unity for each tied pair, 3.67 for each triplet of ties, and 8.67 for each quadruplet of ties.
Rule 2. When tied pairs occur in both sets of ranks, subtract from $\sigma_S^2$ the quantity $m + r - 2mr/[k(k-1)]$. Here m is the number of tied pairs in one set of ranks, and r the number of tied pairs in the other.

Situations arise where both the X and Y variables contain numerous ties. Under such circumstances it is appropriate to represent the data in the form of a bivariate frequency table with R rows for the Y variable, and C columns for the X variable. For illustrative purposes we consider the case where X and Y each contain three groups of tied values. The data may be arranged in the form of a bivariate frequency table as follows:

          X1   X2   X3
    Y1    a    b    c
    Y2    d    e    f
    Y3    g    h    i

Let a, b, c, ..., i represent the cell frequencies. The statistic S is calculated from such a bivariate table by summing all possible terms of the kind ae − bd. The number of such terms is given in general by

$$\frac{RC(R-1)(C-1)}{4}.$$

For the 3 × 3 table above S is given by

    S = (ae − bd) + (ah − bg) + (dh − eg)
      + (af − cd) + (ai − cg) + (di − fg)
      + (bf − ce) + (bi − ch) + (ei − fh).


The following is a numerical example in which X and Y are two ordered variables each containing three groups of tied values:

          X1   X2   X3
    Y1     4    3    1
    Y2     3   10    4
    Y3     1    4   12

The differences between all possible cross-products are as follows:

    31   13     2
    13   47    32
     2   32   104

The statistic S is the sum of these values. In this example S = 276. The sampling variance of S calculated from a bivariate frequency table may be obtained from formula (3).

One ranking may degenerate to a dichotomy. The variance of the sampling distribution of S when one ranking degenerates to a dichotomy with $n_1$ and $n_2$ members, respectively, and the other ranking contains m sets of t ties is given by

$$\sigma_S^2 = \frac{n_1 n_2}{3k(k-1)}\Big[k^3 - k - \sum (t^3 - t)\Big]. \tag{4}$$
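The cross-product rule for a bivariate frequency table and the variance formulas (3) and (4) can be sketched in Python as below. The function names are hypothetical. Treating the column and row totals of the table as the tie extents t and u in formula (3) is the reading assumed here; the text states only that formula (3) applies.

```python
def s_from_table(table):
    """Sum all terms of the kind ae - bd over pairs of rows and pairs of columns."""
    rows, cols = len(table), len(table[0])
    total = 0
    for r1 in range(rows):
        for r2 in range(r1 + 1, rows):
            for c1 in range(cols):
                for c2 in range(c1 + 1, cols):
                    total += (table[r1][c1] * table[r2][c2]
                              - table[r1][c2] * table[r2][c1])
    return total


def variance_general(k, t, u=()):
    """Formula (3): t and u are the tie extents in the two sets of ranks."""
    var = (k * (k - 1) * (2 * k + 5)
           - sum(x * (x - 1) * (2 * x + 5) for x in t)
           - sum(x * (x - 1) * (2 * x + 5) for x in u)) / 18
    if k > 2:  # guard against division by zero in the triplet term
        var += (sum(x * (x - 1) * (x - 2) for x in t)
                * sum(x * (x - 1) * (x - 2) for x in u)) / (9 * k * (k - 1) * (k - 2))
    var += (sum(x * (x - 1) for x in t)
            * sum(x * (x - 1) for x in u)) / (2 * k * (k - 1))
    return var


def variance_dichotomy(k, n1, n2, t=()):
    """Formula (4): one ranking is a dichotomy with n1 and n2 members."""
    return n1 * n2 / (3 * k * (k - 1)) * (k ** 3 - k - sum(x ** 3 - x for x in t))


TABLE = [[4, 3, 1], [3, 10, 4], [1, 4, 12]]  # numerical example from the text
k = sum(map(sum, TABLE))
column_totals = [sum(col) for col in zip(*TABLE)]
row_totals = [sum(row) for row in TABLE]
print(s_from_table(TABLE))                                # 276
print(variance_general(k, column_totals, row_totals))     # variance under the assumed reading
print(round(variance_general(10, [2] * 3, [2] * 4), 2))   # 118.27
```

As a consistency check, formula (3) evaluated with a dichotomy as one set of ties gives the same value as formula (4); for example, `variance_general(6, [3], [3, 3])` and `variance_dichotomy(6, 3, 3, [3])` both evaluate to 18.6.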

Occasions are found where both sets of rankings degenerate to dichotomies, and the data may be represented in the form of a 2 × 2 table. Let the frequencies in the 2 × 2 table be as follows:

      a        b       a + b
      c        d       c + d
    a + c    b + d

The variance of the sampling distribution of S for a 2 × 2 table is given by

$$\sigma_S^2 = \frac{(a+b)(c+d)(a+c)(b+d)}{k-1}. \tag{5}$$

Thus the variance is seen to be the product of the marginal totals divided by k − 1.

In tests of significance involving S we may use the probability distributions in Table I for k ≤ 10. For k > 10 the normal approximation to the distribution of S may be used, and the ratio $z = S/\sigma_S$ referred to standard tables of the normal distribution. In using the normal approximation the discontinuous distribution of S is replaced by the continuous normal distribution. Under these circumstances it is advisable to use a correction for continuity. This is accomplished by subtracting unity from S if it is positive, and adding unity to S if it is negative. Thus the absolute value of S is reduced by unity.

The reader may question why we reduce the absolute value of S by unity instead of 0.50 as is the case in using the continuous normal distribution in the estimation of binomial probabilities. The reason for this resides in the fact that adjacent frequencies in the distribution of S are separated from each other by two units, as can be seen from Table I, and not by one unit as is the case with the binomial distribution.

Reducing the absolute value of S by unity is an appropriate procedure when the ranks contain no ties, or when ties are not numerous in either the X or the Y ranks. In other situations, the following rules for continuity corrections may be applied:

Rule 1. When one variable is a dichotomy and the other variable is a set of untied ranks, reduce the absolute value of S by unity.
Rule 2. When one variable is a dichotomy and the other is a ranking of m sets of ties of the same extent t, reduce the absolute value of S by t.
Rule 3. When both variables are dichotomies, reduce the absolute value of S by k/2.
Rule 4. When one variable is a dichotomy and the other contains m groupings of values of extent $t_i$, where $t_i \geq 1$, reduce the absolute value of S by the correction

$$\frac{1}{m-1}\Big(k - \frac{t_1 + t_m}{2}\Big).$$

Here $t_1$ and $t_m$ are the numbers of tied values in the first and last groups.
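The continuity rules can be sketched in Python as follows. The function names are hypothetical. The sketch implements the Rule 4 expression and, as an observation made here rather than in the text, Rules 1 to 3 appear to follow from it as special cases when the group extents are chosen accordingly.

```python
def continuity_correction(extents):
    """Rule 4 correction for a dichotomy against groups of extent t_1, ..., t_m.

    `extents` lists the group sizes of the non-dichotomous variable in order;
    at least two groups are required.
    """
    k = sum(extents)
    m = len(extents)
    return (k - (extents[0] + extents[-1]) / 2) / (m - 1)


def corrected_z(s, sigma, correction=1.0):
    """Normal deviate after moving S towards zero by the continuity correction."""
    if s > 0:
        return (s - correction) / sigma
    if s < 0:
        return (s + correction) / sigma
    return 0.0


print(continuity_correction([1] * 10))       # untied ranks, as in Rule 1
print(continuity_correction([3, 3, 3, 3]))   # equal extents t = 3, as in Rule 2
print(continuity_correction([4, 6]))         # second dichotomy with k = 10, as in Rule 3
```

The three calls return 1.0, 3.0, and 5.0, matching unity, t, and k/2 respectively.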


4 Monotonic Trend for Correlated Data: Large-Sample Procedure without Ties

This test may be applied to data obtained by making measurements on N subjects each under k ordered conditions. We proceed by replacing the original measurements by ranks as in the Friedman (1937) two-way analysis of variance by ranks. This method is described by Siegel (1956). S is calculated for each of the N subjects. The values of S are summed over the N subjects to obtain $\Sigma S$. The quantity $\Sigma S$ is a measure of monotonic trend, and is descriptive of the increase, or decrease, in the experimental variable with increase in the treatment variable.

In the absence of ties the sampling variance of S for any subject is given by formula (1), or may be read directly from Table 2, and is the same for all subjects. The variance of $\Sigma S$ is the sum of the separate variances. Thus

$$\sigma_{\Sigma S}^2 = \Sigma \sigma_S^2 = N\sigma_S^2. \tag{6}$$

Assuming the normality of the distribution of $\Sigma S$, the normal deviate is

$$z = \frac{\Sigma S}{\sigma_{\Sigma S}}. \tag{7}$$


The normal deviate, z, takes the usual critical values 1.96 and 2.58 at the 0.05 and 0.01 levels, respectively, for a nondirectional test. In many experimental situations in which trend tests are used some prior basis exists for predicting the direction of the trend; consequently, a directional test will frequently be appropriate.

Estimates of probabilities obtained from the normal approximation to the distribution of $\Sigma S$ will be improved by using a correction for continuity. To apply this correction we subtract unity from $\Sigma S$ if it is positive, and add unity if it is negative. Thus the absolute value of $\Sigma S$ is reduced by unity.

The method described above is illustrated in Table 3, which shows hypothetical measurements for eight subjects under four treatments. These measurements have been ranked for each of the N subjects and a value of S calculated. The quantity $\Sigma S$ is found to be 14. For k = 4 the sampling variance $\sigma_S^2$ is 8.67. The variance $\sigma_{\Sigma S}^2 = 8 \times 8.67 = 69.33$, and $\sigma_{\Sigma S} = 8.33$. Reducing $\Sigma S$ by unity as a continuity correction results in z = 13/8.33 = 1.56, which falls short of significance at the 0.05 level.

TABLE 3. DATA ILLUSTRATING TREND TEST FOR CORRELATED DATA: LARGE-SAMPLE PROCEDURE WITHOUT TIES

              Measurements under         Ranks under
               four conditions         four conditions
  Subject     I    II   III    IV     I   II  III   IV      S
     1        4     5     9     3     2    3    4    1      0
     2        8     9    14     7     2    3    4    1      0
     3        7    13    14     6     2    3    4    1      0
     4       16    12    14    10     4    2    3    1     −4
     5        2     4     7     6     1    2    4    3     +4
     6        1     4     5     3     1    3    4    2     +2
     7        2     6     7     9     1    2    3    4     +6
     8        5     7     8     9     1    2    3    4     +6

$\Sigma S = 14$, $z = (|\Sigma S| - 1)/\sigma_{\Sigma S} = 13/8.33 = 1.56$, $p > 0.05$.

It is of incidental interest to note that for k = 2 the quantity $(|\Sigma S| - 1)^2/N$ is distributed approximately as $\chi^2$ with d.f. = 1, and the quantity $(|\Sigma S| - 1)/\sqrt{N}$ has an approximately normal distribution. In this case the present test is the same as the usual sign test for two correlated samples.

The steps involved in the above procedure may be summarized as follows:

1. Rank the scores for each subject from 1 to k.
2. Calculate S for each subject.
3. Sum S for all subjects to obtain $\Sigma S$.
4. Calculate the sampling variance of S, $\sigma_S^2$, using formula (1) or Table 2, and multiply this by N to obtain the sampling variance of $\Sigma S$, $\sigma_{\Sigma S}^2$. The square root of this quantity is the standard error.
5. Divide $|\Sigma S| - 1$ by the standard error, $\sigma_{\Sigma S}$, to obtain the normal deviate, z.
6. Reject the null hypothesis for a nondirectional test at the 0.05 level if z > 1.96, and at the 0.01 level if z > 2.58. The corresponding values of z for a directional test are 1.64 and 2.33.
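The steps of the large-sample procedure without ties can be sketched in Python using the measurements of Table 3. The function names are hypothetical. The explicit ranking step is skipped because, in the absence of ties, S depends only on the order of the scores; clamping the corrected numerator at zero is an assumption added here for the case where $\Sigma S = 0$.

```python
from math import sqrt

# Measurements for the eight subjects of Table 3 under conditions I to IV.
TABLE_3 = [
    (4, 5, 9, 3), (8, 9, 14, 7), (7, 13, 14, 6), (16, 12, 14, 10),
    (2, 4, 7, 6), (1, 4, 5, 3), (2, 6, 7, 9), (5, 7, 8, 9),
]


def s_statistic(scores):
    """S for one subject: +1 for each pair in natural order, -1 for each inversion."""
    k = len(scores)
    return sum(
        (scores[j] > scores[i]) - (scores[j] < scores[i])
        for i in range(k) for j in range(i + 1, k)
    )


def monotonic_trend_z(rows):
    """Return (sum of S, variance of the sum, continuity-corrected z)."""
    n, k = len(rows), len(rows[0])
    sum_s = sum(s_statistic(r) for r in rows)
    var_sum = n * k * (k - 1) * (2 * k + 5) / 18   # N times formula (1)
    z = max(abs(sum_s) - 1, 0) / sqrt(var_sum)     # assumption: clamp at zero
    return sum_s, var_sum, z


sum_s, var_sum, z = monotonic_trend_z(TABLE_3)
print(sum_s, round(var_sum, 2), round(z, 2))   # 14 69.33 1.56
```

The value of z is then compared with 1.96 or 2.58 for a nondirectional test, or with 1.64 or 2.33 for a directional test, as in step 6.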


5 Monotonic Trend for Correlated Data: Large-Sample Procedure with Ties

Ties may occur in the measurements obtained for each subject. Ties will presumably not ordinarily occur in the treatment variable, the values of that variable being controlled by the experimenter. Under these circumstances the value $\sigma_S^2$ may be calculated separately for each subject using formula (2) and the values of $\sigma_S^2$ summed to obtain $\sigma_{\Sigma S}^2$. A more convenient procedure is to calculate $\sigma_{\Sigma S}^2$ as in the untied case, and then subtract unity from this variance for each tied pair, 3.67 for each triplet of ties, and so on.

Illustrative data are shown in Table 4. This table shows measurements obtained for eight subjects under four conditions. The quantity $\Sigma S$ has been calculated, and is found to be −25. The data contain five tied pairs and one triplet of ties. For k = 4 without ties the sampling variance $\sigma_S^2$ is 8.67, and the variance $\sigma_{\Sigma S}^2 = 8 \times 8.67 = 69.33$. We subtract from the latter 5.00 for the five tied pairs and 3.67 for the triplet of ties to obtain a corrected value of the variance of 60.66 and a standard error of 7.79. Reducing the absolute value of $\Sigma S$ by unity as a continuity correction results in z = 24/7.79 = 3.08, which is significant at better than the 0.01 level for a nondirectional test.
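The tie correction can be sketched in Python by taking the per-subject values of S and the tie counts as given in the text and in the S column of Table 4. The function names are hypothetical, and the reduction for each set of ties is computed from the tie term of formula (2) rather than from the rounded constants.

```python
from math import sqrt

S_COLUMN = [-5, -1, 3, -3, -6, -3, -6, -4]   # S per subject, as printed in Table 4
TIE_EXTENTS = [2, 2, 2, 2, 2, 3]             # five tied pairs and one triplet


def tie_reduction(t):
    """Reduction in variance for one set of t ties: t(t - 1)(2t + 5)/18."""
    return t * (t - 1) * (2 * t + 5) / 18


def trend_z_with_ties(s_values, k, tie_extents):
    """Return (sum of S, tie-corrected variance, continuity-corrected z)."""
    n = len(s_values)
    sum_s = sum(s_values)
    variance = n * k * (k - 1) * (2 * k + 5) / 18       # untied case, formula (6)
    variance -= sum(tie_reduction(t) for t in tie_extents)
    z = max(abs(sum_s) - 1, 0) / sqrt(variance)         # assumption: clamp at zero
    return sum_s, variance, z


sum_s, variance, z = trend_z_with_ties(S_COLUMN, k=4, tie_extents=TIE_EXTENTS)
print(sum_s, round(sqrt(variance), 2), round(z, 2))   # -25 7.79 3.08
```

The unrounded variance is about 60.67; the value 60.66 in the text comes from subtracting the rounded constants from 69.33. The standard error and z agree with the text to two decimal places.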

TABLE 4. DATA ILLUSTRATING TREND TEST FOR CORRELATED DATA: LARGE-SAMPLE PROCEDURE WITH TIES

              Measurements under           Ranks under
               four conditions           four conditions
  Subject     I    II   III    IV      I     II    III    IV       S
     1        7     7     4     2     3.5   3.5    2      1       −5
     2        6     1     5     5     4     1      2.5    2.5     −1
     3        3     2     2     2     2     1      3.5    3.5     +3
     4        4     2     2     2     4     2      2      2       −3
     5        8     6     5     2     4     3      2      1       −6
     6        4     7     2     2     2     4      1.5    1.5     −3
     7        9     4     3     1     4     3      2      1       −6
     8       10     4     2     2     4     3      1.5    1.5     −4

$\Sigma S = -25$, $z = (|\Sigma S| - 1)/\sigma_{\Sigma S} = 24/7.79 = 3.08$,

p