Measuring Abundance: Methods for the Estimation of Population Size and Species Richness (Data in the Wild) 1784272329, 9781784272326



English Pages 229 [237] Year 2020


Table of contents :
Measuring Abundance
Cover
Half Title Page
Title
Copyright
Contents
Preface
Acknowledgements
Part I: Background
1. Statistical ideas
Part II: Stationary individuals
2. Quadrats and transects
3. Points and lines
4. Distance methods
5. Variable sized plots
Part III: Mobile individuals
6. Quadrats, transects, points, and lines – revisited
7. Capture-recapture methods
8. Distance methods
Part IV: Species
9. Species richness
10. Diversity
11. Species abundance distributions (SADS)
12. Other aspects of diversity
Appendix
Notes
Further reading
References
Index of Examples
General Index
Measuring Abundance

Measuring Abundance Methods for the Estimation of Population Size and Species Richness

Graham J. G. Upton

DATA IN THE WILD SERIES Pelagic Publishing | www.pelagicpublishing.com

Published by Pelagic Publishing
PO Box 874, Exeter EX3 9BR, UK
www.pelagicpublishing.com

Measuring Abundance: Methods for the Estimation of Population Size and Species Richness

ISBN 978-1-78427-232-6 (Hbk)
ISBN 978-1-78427-231-9 (Pbk)
ISBN 978-1-78427-233-3 (ePub)
ISBN 978-1-78427-234-0 (ePDF)

Copyright © 2020 Graham J. G. Upton

The moral rights of the author have been asserted. All rights reserved. Apart from short excerpts for use in research or for reviews, no part of this document may be printed or reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, now known or hereafter invented or otherwise without prior permission from the publisher.

A CIP record for this book is available from the British Library.

Cover image: Monarch butterfly (Danaus plexippus) migration (iStock/Jodi Jacobson)

Contents

Preface
Acknowledgements

Part I. Background

1. Statistical ideas
  1.1 Sampling
  1.2 Sample statistics
  1.3 Common continuous distributions
  1.4 Common discrete probability distributions
  1.5 Compound Poisson distributions
  1.6 Estimation and inference
  1.7 Types of model
  1.8 Testing the goodness of fit of a model
  1.9 AIC and related measures
  1.10 Quantile-quantile plots

Part II. Stationary individuals

2. Quadrats and transects
  2.1 What shape quadrats?
  2.2 How many quadrats?
  2.3 Quadrat placement
  2.4 Forestry sampling
  2.5 Quadrats for estimating frequency
  2.6 Nested quadrats
  2.7 Quadrats for estimating cover
  2.8 *Variation between and within quadrats

3. Points and lines
  3.1 The point quadrat frame
  3.2 Line-intercept sampling (LIS)
  3.3 Point-count transect sampling

4. Distance methods
  4.1 Spatial patterns
  4.2 Locations for sampling points
  4.3 Simple point-to-plant measures
  4.4 Using the distance to the kth nearest plant
  4.5 The point-centred quarter method (PCQM)
  4.6 Angle-order estimators
  4.7 Nearest-neighbour distances
  4.8 Combined point-to-plant and nearest-neighbour measures
  4.9 Wandering methods
  4.10 Handling mixtures of species
  4.11 Recommendations

5. Variable sized plots
  5.1 Variable area transect (VAT)
  5.2 3P sampling
  5.3 Bitterlich sampling
  5.4 Perpendicular distance sampling (PDS)

Part III. Mobile individuals

6. Quadrats, transects, points, and lines – revisited
  6.1 Box quadrats
  6.2 Strip transects
  6.3 Using frequency to estimate abundance
  6.4 Point counts (point transects)
  6.5 Double-observer sampling
  6.6 Double sampling
  6.7 Removal sampling

7. Capture-recapture methods
  7.1 Capture-recapture models for a closed population
  7.2 Capture-recapture models for an open population
  7.3 Pollock’s robust design
  7.4 Spatial capture-recapture models
  7.5 Mark-resight estimation

8. Distance methods
  8.1 The underlying idea
  8.2 Detection functions
  8.3 Point transects
  8.4 Using imprecise distance data
  8.5 Introducing covariates
  8.6 Multiple species
  8.7 Sightings of groups
  8.8 Line transects

Part IV. Species

9. Species richness
  9.1 Richness indices
  9.2 Rarefaction
  9.3 The dependence of richness on area
  9.4 Estimating the unobserved
  9.5 The limitation of using richness as a measure of diversity
  9.6 An occupation-detection model

10. Diversity
  10.1 Berger-Parker dominance
  10.2 Shannon entropy
  10.3 Simpson’s index
  10.4 Effective numbers
  10.5 Fisher’s α
  10.6 Taking account of differences between species
  10.7 Measuring β-diversity

11. Species abundance distributions (SADS)
  11.1 Illustrating abundance distributions
  11.2 The log-series distribution
  11.3 Truncated Poisson-lognormal distribution
  11.4 The gambin model
  11.5 Testing the goodness of fit of a model to a set of octave counts
  11.6 Determining the drivers for species abundance distributions

12. Other aspects of diversity
  12.1 Evenness
  12.2 Similarity and complementarity
  12.3 Turnover
  12.4 Rarity

Appendix
Notes
Further reading
References
Index of Examples
General Index

Preface

This book aims to bring together, for the first time, descriptions of all the most widely used methods for assessing the sizes of populations of living organisms. The papers referenced come from more than 100 different journals that cover many disciplines. However, both for their ubiquity and for the number of papers cited, two journals stand out: Biometrics and Ecology. Together they indicate that the subject of this book might be termed quantitative ecology.

Wherever possible, examples are used to illustrate the method being described. The methods selected are either those currently used or the earlier methods that underlay them. In a few cases I have suggested adjustments that appear to improve accuracy.

The techniques and problems associated with the measurement of plant cover are rather different from those used for assessing the amount of timber in a forest, and are very different from those used for counting birds or fishes. My hope is that specialists working with one type of organism may chance across a procedure, currently used in a different context, that they can adapt to their own purposes.

The descriptions of the many methods contained in this book are necessarily brief, and there will always be much more that could be written; to cover that deficiency there is a recommended reading section at the back, giving details of specialist books that constitute essential reading for the methods described.

Computer programmes are referenced where appropriate. My preference is for programmes based on R (because they are free), but reference is also made to other widely used programmes. Methods for the assessment of the size of mobile populations are particularly complex. As an example, the online manual for the programme MARK (which deals with capture-recapture data) has more than 1000 pages.

Following a brief synopsis of relevant statistical methods in Part I, Part II addresses methods for stationary items. Questions addressed here include the numbers of standing or fallen trees in a forest, the amount of timber in a forest, the amount of plant cover in a field, and the amount of coral in a coral bank. The methods in this section are relatively easy to describe and use.

It is no surprise that assessing the numbers of moving objects is much more challenging, both to describe and to carry out. Part III includes examples of the estimation of the numbers of reptiles (skinks), mammals (grizzly bears, marmots), amphibians (frogs), fish (darters), crustaceans (lobsters, crabs), and birds (ovenbirds). Most examples include computer code, though the analyses here would constitute no more than the preliminary stages of a proper analysis of the data.

Part IV is concerned with the many aspects of species richness and diversity.

There are a few sections marked with an asterisk. These are sections that may be read by the curious, but can be ignored without affecting the understanding of the remainder.

About 40 years ago I was co-author (with Bernard Fingleton) of the two volumes of Spatial Data Analysis by Example. One review welcomed the publication of the second of those volumes by looking forward to a third volume. We had nothing in mind at the time, but the current volume might have fulfilled that role, since the spatial arrangement of objects has a direct effect on how easy it is to count them.

Graham Upton
Wivenhoe, Essex
June 2020

Acknowledgements

I am very grateful to the following for the help they provided: Richard Barnes (elephant dung), Emery Boose (mapping trees), Nathalie Butt (plotting tree data), Rick Camp (bird data and its provenance), Richard Chandler (unmarked), Robert Colwell (Costa Rican ant data), Kevin Darras (audio bird counts), Rocio Duchesne and Ken Tape (Alaskan shrubs), Murray Efford (for advice on spatial capture-recapture), Gregory Gilbert (for permission to use the Californian tree data and for his advice concerning modern methods for mapping trees), Geoff Heard (Australian tree frog data), Klaus Hennenberg (for advice on plotless density estimators), Leanne Hepburn (reef fish protocols), Renske Hijbeek (for supplying the mangrove data and giving permission for its reproduction), Judith Lang and Kenneth Marks (for advice on the coral data they provided), Jeff Laake (advice on mark-recapture), Tom Matthews (ISAR and the Gambin model), Brett McClintock (mark-resight), Trent McDonald (mra), David Morrison (nested quadrats), Louis-Paul Rivest (Rcapture), Robert van Woesik (SFD), Robert Whittaker (octaves).

I am particularly grateful to Eric Rexstad for his patience and thoroughness in dealing with my barrage of inquiries while learning to use the Distance package.

Finally, it is a pleasure to thank Chris Reed and Nigel Massen for their assistance with the preparation of this manuscript.

G. J. G. U.

Part I. Background

Readers with any statistical training should probably move directly to another part of the book. This first part presents the barest possible introduction to the statistical terms that arise subsequently. A reader without statistical training should initially simply skim through this part, returning when the need arises.

1.  Statistical ideas

Manly and McDonald (1996) stated that ‘Statistical methods play a pivotal role in the process of gathering information to enable many of today’s important conservation problems to be solved.’ Bonar, Fehmi, and Mercado-Silva (2011) wrote ‘Just as a business executive needs the services of a good lawyer and a good accountant, a biologist needs a good statistician.’ This book therefore starts with a crash course introducing the statistical terms and methods that will be applied later in the book. It is intended as an aide-memoire, not as a statistics textbook.

1.1  Sampling

For key species it may be possible to conduct a census with the aim of counting nearly every individual. For example, the 2018 tiger census in India required 44,000 field staff, 600,000 human-days, and 523,000 km of foot surveys with 26,800 camera trap locations. The results were 35 million photographs of wildlife, of which 76,651 were of tigers. Comparison of markings suggested that 2461 tigers had been photographed. Combining these observed numbers with DNA analysis, and using some of the methods discussed later in this book, led to an estimate of 2967 tigers.¹

Although it would appear that a census of an entire population must be more accurate than a sample, Bonham (2013) noted that ‘sample-based data may be more reliable than a 100% inventory. This follows from the fact that samples are often taken with greater care than can be used in a complete census because more expertise can be used in sampling.’ While this comment does not apply to the Indian tiger census, which employed a lot of expertise and used statistical sampling techniques, it does emphasize the potential accuracy that can be obtained using samples.

Consider the modest task of determining the number of oak trees in a large wood. One could methodically walk back and forth through the wood, counting oak trees and marking them to avoid double counting. If the wood is really large, then this will take a very long time and cost a lot of money. A more sensible idea would be to use a map, divide the wood into 1000 equal-sized small areas, and then visit a sample of 10 of these, counting the oaks in each small area. Multiplying the total number of oaks seen by 100 (= 1000/10) will give an estimate of the total number in the wood. Noting the variation in numbers across the 10 small areas will give a good idea of the accuracy of that estimate. The numbers 10 and 1000 can be varied as appears appropriate.

The small areas chosen to be sampled must not be chosen because of any prior knowledge. Ideally, they should also not be close to one another, since neighbouring locations are likely to resemble one another without, necessarily, resembling those at the other end of the wood. Alternative sampling schemes (depending on context) will be discussed throughout the book. Underlying every scheme is the requirement that the sample should be representative of the population being sampled and should in no way be biased by the sampler. Often the latter requirement implies a random arrangement of samples, or a random sampling point, determined by the computer. Some discussion of where sampling should take place is given in Section 2.3.
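The oak-tree scheme can be sketched in a few lines of Python. This is a minimal illustration only: the ten counts are invented, the function name `estimate_total` is hypothetical, and the standard-error formula (with a finite-population correction for sampling without replacement) is a standard textbook choice rather than something stated in the text above.

```python
import random
import statistics

def estimate_total(sampled_counts, n_areas):
    """Scale up counts from a simple random sample of equal-sized areas."""
    n = len(sampled_counts)
    mean_per_area = statistics.mean(sampled_counts)
    total = n_areas * mean_per_area  # same as sum(counts) * (n_areas / n)
    # Standard error of the estimated total. The factor (1 - n / n_areas)
    # is a finite-population correction (an assumption of this sketch).
    s2 = statistics.variance(sampled_counts)
    se = n_areas * (s2 / n * (1 - n / n_areas)) ** 0.5
    return total, se

# Let the computer choose which 10 of the 1000 small areas to visit.
random.seed(1)
chosen_areas = sorted(random.sample(range(1, 1001), 10))

# Hypothetical oak counts in the 10 visited areas (not real data).
oak_counts = [3, 0, 2, 5, 1, 0, 4, 2, 1, 2]

total, se = estimate_total(oak_counts, n_areas=1000)
print("areas to visit:", chosen_areas)
print(f"estimated number of oaks: {total:.0f} (standard error about {se:.0f})")
```

With these invented counts, 20 oaks are seen, so the estimate is 20 × 100 = 2000. The large standard error shows how much the estimate depends on the variation between the ten areas.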

1.2  Sample statistics

Suppose that a sample consists of the $n$ values $x_1, x_2, \ldots, x_n$. These might be measurements, such as the distances from $n$ sampling points to the nearest trees. They might be counts, such as the numbers of organisms in randomly chosen equal-sized regions. Natural questions are ‘What is their average value?’ and ‘How variable are these numbers?’ The answers are provided by two statistics: the sample mean and the sample variance.

1.2.1  Sample mean

The sample mean, denoted by $\bar{x}$, is the average of the $n$ sample values:

$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i. \tag{1.1}$$

An equivalent formula is

$$\bar{x} = \frac{1}{n}\sum_{j=1}^{J} f_j x_j, \tag{1.2}$$

where $J$ is the number of distinct x-values and $f_j$ is the number of occurrences of the value $x_j$. If $n$ is reasonably large (> 30, say) then the distribution of the values of $\bar{x}$ from repeat samples is likely to be well approximated by a normal distribution (Section 1.3.1).

Example 1.1: Californian Douglas Firs

The left-hand diagram of Figure 1.1 shows the positions of Douglas Firs (Pseudotsuga menziesii) in a 40 m × 40 m corner of the 200 m × 300 m UC Santa Cruz Forest Ecology Research Plot, which forms part of the coastal forest in the Santa Cruz mountains of California. The data were collected between December 2006 and September 2007 and refer to alive main stems that have diameters of at least 1 cm at breast height.

The trees are not uniformly spread across the region surveyed, but have a distinct cluster towards the bottom of the region. The figure shows the region subdivided into a hundred 4 m × 4 m plots, with the counts in these regions reported in the right-hand diagram.

The mean number of firs per 4 m × 4 m plot is

$$\bar{x} = \frac{1}{100}(1 + 1 + 0 + 1 + \cdots + 0 + 0 + 0 + 0) = \frac{65}{100} = 0.65.$$

A more convenient procedure begins by summarizing the data in Table 1.1. The mean can now be calculated using Equation (1.2):

$$\bar{x} = \frac{1}{100}\{(63 \times 0) + (27 \times 1) + 2 + (5 \times 3) + (2 \times 4) + 5 + 8\} = 0.65.$$

Right-hand diagram of Figure 1.1 (counts of Douglas firs in each 4 m × 4 m square, row by row as extracted):

1 1 0 1 0 2 0 1 0 0
1 1 0 0 0 1 0 0 0 1
0 0 1 1 0 0 0 1 0 0
4 0 0 1 0 1 0 1 0 1
1 0 1 3 4 0 0 0 0 0
0 0 1 1 0 0 0 1 1 0
0 0 3 3 1 1 1 0 0 0
0 1 0 0 0 0 0 0 0 0
1 0 0 3 3 8 0 0 1 0
0 0 0 0 0 5 0 0 0 0

Figure 1.1  The left-hand diagram shows the positions of 65 Douglas Firs in one 40 m × 40 m corner of a Californian research plot. The plot is subdivided into a grid of 100 squares of side 4 m. The right-hand diagram reports the counts in each of these 100 squares.

Table 1.1  The numbers of Douglas firs in the hundred 4 m × 4 m quadrats shown in Figure 1.1.

Number of Douglas firs           0   1   2   3   4   5   6   7   8
Number of 4 m × 4 m quadrats    63  27   1   5   2   1   0   0   1
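As a check on Example 1.1, the short Python sketch below computes the mean both ways, from the 100 individual quadrat counts (Equation 1.1) and from the frequency table (Equation 1.2). The grid is the right-hand diagram of Figure 1.1 read row by row; the variable names are illustrative only.

```python
from collections import Counter

# Quadrat counts from the right-hand diagram of Figure 1.1, row by row.
grid = [
    [1, 1, 0, 1, 0, 2, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 0, 0, 1, 0, 0],
    [4, 0, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 3, 4, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 3, 3, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 3, 3, 8, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 5, 0, 0, 0, 0],
]
counts = [c for row in grid for c in row]
n = len(counts)

# Equation (1.1): add up every observation and divide by n.
mean_direct = sum(counts) / n

# Equation (1.2): tabulate the distinct values first (this rebuilds Table 1.1),
# then weight each distinct value x_j by its frequency f_j.
freq = Counter(counts)
mean_from_table = sum(f * x for x, f in freq.items()) / n

print(sorted(freq.items()))
print(mean_direct, mean_from_table)
```

Both routes give the same answer; the frequency-table route is simply less work by hand when many observations share the same value.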

1.2.2  The median, quartiles and inter-quartile range

Suppose there are 101 distinct sample values labelled $x_1, x_2, \ldots, x_{101}$ with $x_1 < x_2 < \cdots < x_{101}$. The median is the central value, $x_{51}$: there are 50 smaller values and there are 50 larger values. The lower quartile is $x_{26}$: there are 25 smaller values and 75 larger values. Correspondingly, the upper quartile is $x_{76}$: there are 75 smaller values and 25 larger values. The difference in the values of the two quartiles is termed the inter-quartile range.

In practice, the numbers of observations and their values are unlikely to be so convenient. But it will always be possible to arrange observations in order of magnitude and identify values that approximate the formal definitions.
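The idealized case of 101 distinct values can be checked directly. The sketch below uses an arbitrary, made-up increasing sequence; the helper `x` is hypothetical and simply maps the 1-based labels used in the text onto Python's 0-based list indices. The cross-check uses `statistics.quantiles` with `method="inclusive"`, which happens to pick out exactly these positions when there are 101 values.

```python
import statistics

# 101 distinct, hypothetical sample values (any strictly increasing list would do).
values = sorted(3 * i + 7 for i in range(101))

def x(label):
    """Return x_label, using the 1-based labelling x_1 < x_2 < ... < x_101."""
    return values[label - 1]

median, lower_q, upper_q = x(51), x(26), x(76)
iqr = upper_q - lower_q

# Count the values on either side of each statistic, as in the definitions.
smaller_than_median = sum(v < median for v in values)
larger_than_median = sum(v > median for v in values)
smaller_than_lower_q = sum(v < lower_q for v in values)
larger_than_lower_q = sum(v > lower_q for v in values)

# Cross-check against the standard library.
library_quartiles = statistics.quantiles(values, n=4, method="inclusive")

print(lower_q, median, upper_q, iqr)
print(library_quartiles)
```

For less convenient sample sizes, different software packages interpolate between neighbouring values in slightly different ways, so small discrepancies between programs are to be expected.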

1.2.3  Box-whisker plots

A box-whisker plot (also known as a box plot) is a way of illustrating the spread of a set of data by using the values of the quartiles and the median. The central box is bounded by the quartiles, with the position of the median indicated either by a point or by a bold line within the box. The whiskers extend from the quartiles (the edges of the box) towards the more extreme values. Depending on the computer programme used, these either extend all the way to the most extreme values, or they extend by some multiple of the inter-quartile range, with more extreme values being separately indicated.

Example 1.2: Californian Douglas Firs (cont.)

The Douglas Fir data are far removed from the idealized 101 observations discussed in introducing the quartiles, since so many observations have the same value. The result is Figure 1.2. In this case, after arranging the data in ascending numerical order, the minimum value, the lower quartile, and the median all have the value 0. The upper quartile is 1 and the values of 3 or more are indicated as outliers. More usual box-whisker plots will be found in Chapters 4 and 8.

Figure 1.2  The rather unusual box-whisker plot for the Douglas fir data of Table 1.1. (Axis: quadrat counts of Douglas firs, 0 to 8.)
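The numbers behind the box-whisker plot of Example 1.2 can be reproduced from Table 1.1. This is a sketch under stated assumptions: the whisker rule used here (1.5 times the inter-quartile range beyond the box) is a common software default, not something specified in the text, and `statistics.quantiles` is only one of several quartile conventions.

```python
import statistics

# Table 1.1: number of firs in a quadrat -> number of quadrats with that count.
table = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}
data = sorted(x for x, f in table.items() for _ in range(f))

lower_q, median, upper_q = statistics.quantiles(data, n=4, method="inclusive")
iqr = upper_q - lower_q

# Assumed whisker rule: points more than 1.5 * IQR above the box are outliers.
upper_fence = upper_q + 1.5 * iqr
outliers = [x for x in data if x > upper_fence]

print("minimum:", min(data))
print("quartiles and median:", lower_q, median, upper_q)
print("distinct outlying values:", sorted(set(outliers)))
```

Under this rule the fence sits at 2.5, so the single quadrat containing 2 firs marks the end of the upper whisker and every quadrat with 3 or more firs is flagged, in agreement with the description of Figure 1.2.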

1.2.4  Sample variance

The variability of the $n$ sample values is measured by the sample variance, $s^2$:

$$s^2 = \frac{1}{n-1}\left\{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right\}, \tag{1.3}$$

or, equivalently, by

$$s^2 = \frac{1}{n-1}\left\{\sum_{j=1}^{J} f_j x_j^2 - \frac{1}{n}\left(\sum_{j=1}^{J} f_j x_j\right)^2\right\}. \tag{1.4}$$

Example 1.3: Californian Douglas Firs (cont.)

Using Table 1.1, $\sum_j f_j x_j^2 = 27 + 4 + 45 + 32 + 25 + 64 = 197$, giving

$$s^2 = \frac{1}{99}\left(197 - \frac{65^2}{100}\right) = 1.563.$$

In this case, the sample variance is more than twice the size of the sample mean. This suggests that the firs are not distributed at random (see Section 1.4.2 below). This is principally a result of the cluster of firs towards the bottom of Figure 1.1.
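The arithmetic of Example 1.3 can be verified with a few lines of Python. The sketch applies Equation (1.4) to the frequencies of Table 1.1 and cross-checks the result against the standard library's `statistics.variance`, which uses the same $n - 1$ divisor; the variable names are illustrative only.

```python
import statistics

# Table 1.1: number of firs in a quadrat -> number of quadrats with that count.
table = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}

n = sum(table.values())                              # 100 quadrats
sum_fx = sum(f * x for x, f in table.items())        # total number of firs
sum_fx2 = sum(f * x * x for x, f in table.items())   # sum of f_j * x_j^2

# Equation (1.4).
s2 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
mean = sum_fx / n

# Cross-check: expand the table into 100 raw counts and use the library routine.
data = [x for x, f in table.items() for _ in range(f)]
s2_library = statistics.variance(data)

print(sum_fx, sum_fx2)
print(round(s2, 3), round(s2 / mean, 2))
```

The ratio of variance to mean printed at the end is the quantity that signals clumping here: it is well above 1.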

1.2.5  Sample standard deviation

This is $s$, the square root of the variance. The units of $s$ are the same as those of the original data. It is used in the construction of confidence intervals (Section 1.6.2).

1.2.6  Coefficient of variation

Large numbers will usually differ from one another by larger amounts than will small numbers. The coefficient of variation (often referred to as the cv) takes into account the magnitudes of the numbers involved by scaling the standard deviation by the mean:

$$\mathrm{cv} = \frac{s}{\bar{x}}.$$
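A brief sketch of the coefficient of variation, using the Douglas fir counts of Table 1.1 as input. The function name `coefficient_of_variation` is hypothetical. The second call multiplies every count by 10 to illustrate the point of the scaling: the standard deviation grows tenfold, but the cv does not change.

```python
import statistics

# Table 1.1 expanded into the 100 individual quadrat counts.
table = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}
data = [x for x, f in table.items() for _ in range(f)]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

cv = coefficient_of_variation(data)

# Rescaling the data (for example, a change of units) leaves the cv unchanged.
cv_rescaled = coefficient_of_variation([10 * v for v in data])

print(round(cv, 3), round(cv_rescaled, 3))
```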

1.3  Common continuous distributions

A continuous distribution is appropriate when the quantity being measured is not confined to a specific set of values such as 0, 1, 2, …, but may take any value in some specified range. It applies to measurements or averages, as opposed to counts.

The curve that shows how likely each possible value is to occur is called the probability density function, which is often shortened to pdf and written as f(x). A related function is the distribution function F(x), which is the probability of obtaining a value of x or less.

1.3.1  Normal distribution

The normal distribution has a shape that depends on the values of the two quantities $\mu$ and $\sigma^2$, which may be collectively referred to as the parameters of the distribution. The distribution is symmetric about the mode (the most likely value), $\mu$, which is therefore also both the mean and the median. The quantity $\sigma^2$ is the variance of the distribution. About 95% of the values of a normal distribution lie within $2\sigma$ of the mean (see Figure 1.3).³ Although every normal distribution has a theoretical range from $-\infty$ to $\infty$, about 99.7% of values lie within $3\sigma$ of the mean. This implies that if $\mu$ is much greater than $3\sigma$, then the possibility of obtaining a negative value can be ignored.

Figure 1.3  A normal distribution, $N(\mu, \sigma^2)$, centred on $\mu$. About 95% of values lie between $\mu - 2\sigma$ and $\mu + 2\sigma$.
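The proportions of a normal distribution lying within a given number of standard deviations of the mean can be computed directly from the error function in Python's `math` module. The function names below are illustrative; the identity used, $P(|X - \mu| < k\sigma) = \operatorname{erf}(k/\sqrt{2})$, is a standard result rather than something derived in the text.

```python
import math

def prob_within(k):
    """P(|X - mu| < k * sigma) for any normal distribution."""
    return math.erf(k / math.sqrt(2))

def prob_negative(mu, sigma):
    """P(X < 0) for a normal distribution with mean mu and s.d. sigma."""
    return 0.5 * (1 + math.erf((0 - mu) / (sigma * math.sqrt(2))))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))

# When the mean is three standard deviations above zero, negative
# values are already rare; for larger means they are rarer still.
print(round(prob_negative(mu=3.0, sigma=1.0), 5))
```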

1.3.2  Chi-squared distribution

This is a continuous distribution that can take any value between 0 and $\infty$. The shape of the distribution is determined by the value of the single parameter $\nu$, which is referred to as the degrees of freedom of the distribution. The distribution has mean $\nu$ and variance $2\nu$. Three examples are illustrated in Figure 1.4.

Figure 1.4  Chi-squared distributions with degrees of freedom equal to 2, 4, or 8. (Curves labelled $\chi^2_2$, $\chi^2_4$ and $\chi^2_8$; horizontal axis x from 0 to 20, vertical axis from 0.0 to 0.5.)
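The stated mean and variance of the chi-squared distribution can be checked by simulation. The sketch relies on a standard fact not stated in the text above: a chi-squared variable with $\nu$ degrees of freedom has the same distribution as the sum of the squares of $\nu$ independent standard normal variables. The seed and sample size are arbitrary choices.

```python
import random
import statistics

def chi_squared_draw(nu, rng):
    """One draw: the sum of nu squared independent standard normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))

rng = random.Random(42)   # fixed seed so that the run is repeatable
nu = 4
draws = [chi_squared_draw(nu, rng) for _ in range(20000)]

sim_mean = statistics.mean(draws)
sim_var = statistics.variance(draws)

print(f"simulated mean     {sim_mean:.2f}  (theory: {nu})")
print(f"simulated variance {sim_var:.2f}  (theory: {2 * nu})")
```

The simulated values will not match the theoretical ones exactly, but with 20,000 draws they should be close.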

1.4  Common discrete probability distributions

These are distributions that are appropriate when the quantities being measured are counts.

1.4.1  Bernoulli distribution

This simple distribution has just two values: 0 and 1. With the probability of the outcome 1 denoted by $p$, the distribution has mean $p$ and variance $p(1 - p)$. The Bernoulli distribution plays its part in presence/absence models.
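The Bernoulli mean and variance follow directly from the definitions of expectation and variance applied to the two possible outcomes, as this small sketch shows. The value $p = 0.3$ is an arbitrary illustration and the function name is hypothetical.

```python
def bernoulli_moments(p):
    """Mean and variance of a 0/1 variable, computed from first principles."""
    outcomes = {0: 1 - p, 1: p}   # value -> probability
    mean = sum(x * prob for x, prob in outcomes.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in outcomes.items())
    return mean, variance

p = 0.3   # for example, the probability that a species is present at a site
mean, variance = bernoulli_moments(p)
print(mean, variance, p * (1 - p))
```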

1.4.2  Poisson distribution When every point in space (or time, or space-time) is equally likely to contain an event, then the events constitute a Poisson process. In our case an ‘event’ means that an organism of interest (a plant, say) is present at that location. The events are said to occur at random. Suppose that plants occur at random in a region with a density of µ per unit area. In that case the numbers of plants in equal-area subregions are observations from a Poisson distribution. If the subregions have unit area, then the probability of a randomly chosen subregion containing exactly k plants is

\[ P(\mu; k) = \frac{\mu^{k} e^{-\mu}}{k!}, \]

where e is the base of the exponential function, and the quantity k!, which is referred to as k factorial,⁴ is given by

\[ k! = k(k-1)(k-2)\cdots 1. \]

The mean number of plants in a subregion of area A is µA. Since, for a Poisson process, the variance of the number of plants would also be µA, a comparison of the sample mean, x̄, with the sample variance, s², provides an indication of whether the plant distribution is random. If s² is appreciably greater than x̄, then this would indicate that the plants occur in clumps.
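The Poisson probabilities are easy to evaluate directly. The following is a minimal Python sketch, not taken from the book: the function name `poisson_pmf` and the density µ = 1 plant per unit area are illustrative assumptions.

```python
import math

def poisson_pmf(mu: float, k: int) -> float:
    """Probability of exactly k plants in a unit-area subregion
    when plants occur at random with density mu per unit area."""
    return mu ** k * math.exp(-mu) / math.factorial(k)

# Assumed density for illustration only: one plant per unit area.
mu = 1.0
for k in range(4):
    print(f"P(k = {k}) = {poisson_pmf(mu, k):.4f}")
```

For a subregion of area A, the same function can be called with µA in place of µ.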

Example 1.4: Random counts

Figure 1.5 illustrates a Poisson process. The plant positions were generated using pairs of random numbers. A summary of the counts in 100 subregions is given in Table 1.2. With 100 plants scattered over 100 subregions, the sample mean is 1. The sample variance is given by

\[ s^{2} = \frac{1}{99}\left\{(0 + 36 + 80 + 72) - \frac{1}{100}(100)^{2}\right\} = \frac{88}{99} = 0.89. \]

Here 0 + 36 + 80 + 72 is the sum of the squared counts (36 zeroes, 36 ones, 20 twos, and 8 threes), and 100 is the sum of the counts.


Figure 1.5  The left-hand diagram illustrates the positions of 100 randomly positioned points. The right-hand diagram reports the counts in the 100 subregions.

⁴ Here k is a whole number with a value obtained using the given expression. However, k factorial is also defined for non-integer positive k. Using the R programming language its value is obtained by writing factorial(k).

Table 1.2  A summary table of the numbers of plants in the 100 subregions of Figure 1.5.

Number of plants     0    1    2    3
Count               36   36   20    8

As anticipated, since the plant pattern is genuinely random, the sample mean and sample variance are approximately equal. A formal test is presented in Section 1.8.
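The comparison of mean and variance can be checked from the frequency table alone. This is a small Python sketch under the assumption that the data are supplied as a dictionary mapping each count to its frequency (the values are those of Table 1.2; the function name is illustrative).

```python
# Frequency table from Table 1.2: number of plants -> number of subregions.
counts = {0: 36, 1: 36, 2: 20, 3: 8}

def mean_and_variance(freq):
    """Sample mean and sample variance (divisor n - 1) of tabulated counts."""
    n = sum(freq.values())
    total = sum(k * f for k, f in freq.items())
    total_sq = sum(k * k * f for k, f in freq.items())
    mean = total / n
    variance = (total_sq - total ** 2 / n) / (n - 1)
    return mean, variance

mean, variance = mean_and_variance(counts)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, ratio = {variance / mean:.2f}")
```

A variance-to-mean ratio close to 1 is what a Poisson process would be expected to produce; a ratio well above 1 would point towards clumping.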

1.4.3  Binomial distribution

The binomial distribution is appropriate for situations with just two outcomes (e.g. ‘Success’ and ‘Failure’). The form of the distribution is determined by the two parameters n, the number of trials, and p, the probability of a success (assumed to be constant across all trials). The probability that there are exactly r successes among the n trials is P(n, p; r) given by

\[ P(n, p; r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{n-r}, \qquad r = 0, 1, \ldots, n, \tag{1.5} \]

with 0! defined to be equal to 1. The distribution has mean np and variance np(1 − p). Note that the case n = 1 corresponds to the Bernoulli distribution (Section 1.4.1).

In cases where there are more than two possible outcomes, the probabilities refer to all the possible outcomes and the distribution is called the multinomial distribution.

Example 1.5: Random counts (cont.)

The left-hand diagram of Figure 1.6 repeats the counts previously presented, but groups them into 25 sets of four counts. Defining a success as being a count greater than zero, the bottom-right set of four counts includes two successes (1 and 2) and two failures (zeroes). The observed numbers of successes are reported in the right-hand diagram of Figure 1.6.

In 36 of the 100 subregions there are no plants. The estimated probability of a success is therefore p̂ = (100 − 36)/100 = 0.64. With n = 4, and substituting p̂ for p in Equation (1.5), Table 1.3 compares the estimated numbers for the outcomes with those observed. In Section 1.8 it will be shown that the differences between the observed and estimated numbers are no more than would be expected by chance.


Figure 1.6  The left-hand diagram shows the counts in 100 subregions. The right-hand diagram reports the number of non-zero counts within each group of four counts.
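The estimated numbers of successes for this example follow from Equation (1.5) with n = 4 and p̂ = 0.64, multiplied by the 25 groups. The following Python sketch illustrates the arithmetic; the function name `binomial_pmf` is an assumption made for illustration, not something defined in the book.

```python
import math

def binomial_pmf(n: int, p: float, r: int) -> float:
    """Equation (1.5): probability of exactly r successes in n trials."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

n_trials = 4                 # four counts in each group
p_hat = (100 - 36) / 100     # estimated probability of a non-zero count
n_groups = 25                # number of groups of four counts

expected = [n_groups * binomial_pmf(n_trials, p_hat, r)
            for r in range(n_trials + 1)]
print([round(e, 1) for e in expected])
```

The rounded values agree with the estimated numbers 0.4, 3.0, 8.0, 9.4, and 4.2 reported for the binomial distribution with p = 0.64.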


Table 1.3  A summary of the numbers given in the right-hand diagram of Figure 1.6 compared to the corresponding figures for a binomial distribution with p = 0.64.

Number of successes                     0     1     2     3     4
Observed number                         1     4     6     8     6
Estimated number (using p = 0.64)     0.4   3.0   8.0   9.4   4.2

1.4.4  Negative binomial distribution

The assumption of a Poisson process is mathematically convenient. It underpins many of the distance methods discussed in Chapter 4. In practice, however, there are usually clusters of individuals. With clusters present, the variance of the counts in equal-area regions can be much greater than their mean. In such cases, a negative binomial distribution may provide a better description of the observed counts.

For the negative binomial distribution the probability that exactly k plants are observed is given by

\[ P(r, p; k) = \frac{(k+r-1)!}{k!\,(r-1)!}\, p^{r} (1-p)^{k}, \qquad k = 0, 1, 2, \ldots. \tag{1.6} \]

Expressed in terms of the distribution’s mean, µ, and variance, σ², the parameters p and r satisfy the following relations:

\[ p = \mu/\sigma^{2} \quad\text{and}\quad r = \mu^{2}/(\sigma^{2} - \mu). \tag{1.7} \]

Substitution of the sample mean x̄ for µ, and the sample variance s² for σ², gives the so-called method of moments estimates as

\[ \tilde{p} = \bar{x}/s^{2} \quad\text{and}\quad \tilde{r} = \bar{x}^{2}/(s^{2} - \bar{x}). \tag{1.8} \]

Example 1.6: Californian Douglas Firs (cont.)

For the 100 small plots illustrated in Figure 1.1 the sample mean and variance were x̄ = 0.65 and s² = 1.56. The sample variance is much greater than the sample mean, suggesting that a negative binomial distribution may prove useful. The method of moments estimates are p̃ = 0.65/1.56 = 0.416 and r̃ = 0.463. A comparison of the observed counts with those from the negative binomial distribution with parameters r̃ and p̃ is given in Table 1.4. An alternative approach, which often provides a better fit, is maximum likelihood estimation (see Section 1.6.4). The expected values resulting from the maximum likelihood estimates are given in the last line of the table. Neither set of expected values is convincingly superior.
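The method of moments calculation can be sketched in Python as follows. Two assumptions are made here that are not part of the example: the factorials in Equation (1.6) are evaluated through the log-gamma function so that a non-integer r̃ can be used, and the sample variance is taken as 1.563, the three-decimal value quoted in Example 1.7. The function names are illustrative.

```python
import math

def neg_binomial_pmf(r: float, p: float, k: int) -> float:
    """Equation (1.6), using (k + r - 1)! = Gamma(k + r) so that r need not be an integer."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1 - p))

def moment_estimates(mean: float, variance: float):
    """Equation (1.8); only meaningful when the variance exceeds the mean."""
    return mean / variance, mean ** 2 / (variance - mean)

x_bar, s2, n_plots = 0.65, 1.563, 100
p_tilde, r_tilde = moment_estimates(x_bar, s2)

# Expected numbers of plots containing 0, 1, 2, 3 firs, then "4 or more".
expected = [n_plots * neg_binomial_pmf(r_tilde, p_tilde, k) for k in range(4)]
expected.append(n_plots - sum(expected))

print(round(p_tilde, 3), round(r_tilde, 3))
print([round(e, 1) for e in expected])
```

Under these assumptions the sketch should reproduce, to rounding, the method of moments estimates and the corresponding expected numbers quoted for this example.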

Table 1.4  A comparison of the observed frequencies of Douglas Firs with the estimates obtained using a negative binomial distribution with parameter values being estimated by either the method of moments or the method of maximum likelihood (Section 1.6.4).

Number of Douglas Firs                      0      1      2      3   4 or more
Observed number                            63     27      1      5      4
Estimated number (method of moments)     66.6   18.0    7.7    3.7    4.0
Estimated number (maximum likelihood)    64.2   20.3    8.4    3.8    3.4

1.5  Compound Poisson distributions

When a population containing many species is repeatedly sampled, the number of individuals belonging to any particular species will vary from sample to sample according to a Poisson distribution. Suppose that, for species j, the mean number per sample is µ_j. If µ_j is regarded as an observation from a continuous distribution with probability density function f(µ), then P_k, the probability that there are k individuals in the sample, is given by the compound Poisson distribution:

\[ P_k \propto \int_0^{\infty} \frac{\mu^{k} e^{-\mu}}{k!}\, f(\mu)\, d\mu, \qquad k = 1, 2, 3, \ldots, \tag{1.9} \]

with the proportionality introduced to take account of the fact that the number of species contributing a zero count cannot be measured.

1.5.1  Log-series distribution

In Fisher, Corbet, and Williams (1943), Fisher’s choice of probability density function for the Poisson mean led to the log-series distribution, for which

\[ P_k = \frac{\alpha}{kS}\left(\frac{n}{n+\alpha}\right)^{k}, \qquad k = 1, 2, 3, \ldots. \tag{1.10} \]

Here n is the number of individuals in the sample and S is the number of species. The procedure for estimating the parameter α will be described in Section 10.5.

1.5.2  Poisson-lognormal distribution

Analysing patterns of observed frequencies in samples, Bulmer (1974) suggested that the distribution of the logarithm of the Poisson mean might be a normal distribution. The result is the Poisson-lognormal distribution:

\[ P_k = \frac{1}{k!\,\sigma\sqrt{2\pi}} \int_0^{\infty} \lambda^{k-1} e^{-\lambda}\, e^{-(\ln\lambda - \mu)^{2}/2\sigma^{2}}\, d\lambda, \tag{1.11} \]

where µ and σ² are the parameters of the normal distribution.
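The integral in Equation (1.11) has no closed form, but it can be approximated numerically. The sketch below is one possible approach and not the book’s method: it assumes the substitution z = ln λ (so that λ^(k−1) dλ becomes e^(kz) dz), applies the trapezoidal rule over µ ± 10σ, and uses the illustrative parameter values µ = 0 and σ = 1.

```python
import math

def poisson_lognormal_pmf(k, mu, sigma, half_width=10.0, steps=1000):
    """Trapezoidal approximation to Equation (1.11) after substituting z = ln(lambda)."""
    lower = mu - half_width * sigma
    step = 2.0 * half_width * sigma / steps
    log_const = -math.lgamma(k + 1) - math.log(sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps + 1):
        z = lower + i * step
        # Work on the log scale so that large k cannot cause overflow.
        log_term = k * z - math.exp(z) - (z - mu) ** 2 / (2.0 * sigma ** 2) + log_const
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(log_term)
    return total * step

mu, sigma = 0.0, 1.0   # assumed values, for illustration only
probs = [poisson_lognormal_pmf(k, mu, sigma) for k in range(151)]
total_probability = sum(probs)
mean_count = sum(k * p for k, p in enumerate(probs))
print(f"sum of P_k = {total_probability:.5f}, mean count = {mean_count:.4f}")
```

Two checks are available: the probabilities (including k = 0) should sum to very nearly 1, and the mean count should be close to exp(µ + σ²/2), the mean of the lognormal distribution of λ.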

1.6  Estimation and inference

The purpose of a sample, which consists of relatively few observations, is to deduce the properties of the unmeasured much larger population. As the size of a random sample is increased, so the sample mean, x̄, will become more reliable as an estimate of the population mean, µ, as a result of a reduction in the standard error (see below). At the same time, the sample variance, s², will become more reliable as an estimate of the population variance, σ².

1.6.1  Standard error

The standard error (commonly referred to as the s.e.) is the square root of the variance of some quantity of interest. It is closely related to the sample standard deviation, s. As an example, the standard error of the sample mean, x̄, is

\[ \sqrt{\operatorname{Var}(\bar{x})} = \frac{s}{\sqrt{n}}, \]

where n is the sample size, and s² is the sample variance. As the sample size increases, so the standard error reduces, implying less variable values for x̄, and increased precision in our knowledge concerning the population mean.

For many situations the normal distribution is relevant, so that on about 95% of occasions the population value lies in the interval (sample value ± 2 standard errors), and on about 99.7% of occasions the population value lies in the interval (sample value ± 3 standard errors). This is expressed formally as a confidence interval.

1.6.2  Confidence interval

An α% confidence interval for a population mean is an interval calculated from sample values using a formula that guarantees that α% of the intervals so calculated will include the true value. Of course, this also means that (100 − α)% will not include the true value! The interval takes the form:

\[ \left(\bar{x} - c\,\frac{s}{\sqrt{n}},\; \bar{x} + c\,\frac{s}{\sqrt{n}}\right). \tag{1.12} \]

To achieve a given value of α requires larger values of c for smaller sample sizes. For a fixed sample size, larger values of c lead to larger values for α. However, 100% confidence can only be achieved with c = ∞. If n ≥ 20 then the choice c = 2 will give an approximate 95% confidence interval. Computer output often includes confidence intervals for quantities of interest.

Example 1.7: Californian Douglas Firs (cont.)

For the n = 100 plots, previous calculations gave the summary values x̄ = 0.65 and s² = 1.563. The standard error is √(1.563/100) = 0.125. Taking c = 2, the approximate 95% confidence interval for the population mean is (0.40, 0.90).
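The standard error and the interval of Equation (1.12) take only a few lines to compute. This Python sketch uses the summary values of Example 1.7; the function name `mean_confidence_interval` is an illustrative assumption.

```python
import math

def mean_confidence_interval(mean: float, variance: float, n: int, c: float = 2.0):
    """Equation (1.12): mean -/+ c standard errors, with s.e. = sqrt(variance / n)."""
    se = math.sqrt(variance / n)
    return mean - c * se, mean + c * se, se

lower, upper, se = mean_confidence_interval(mean=0.65, variance=1.563, n=100)
print(f"s.e. = {se:.3f}, interval = ({lower:.2f}, {upper:.2f})")
```

With c = 2 this gives the approximate 95% interval; a different multiplier can be passed as `c` when another confidence level or a small sample size calls for it.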


1.6.3  Bootstrap interval

Equation (1.12) provides a convenient guide that can be calculated without the need for a computer. However, for small values of n, it relies on assumptions about the population that may not be valid. By contrast, the bootstrap approach makes heavy use of the computer, but does not rely on those assumptions.

Suppose that a sample of n observations is taken and denote the first observation by x_1, the second by x_2, and so on. The bootstrap assumption is that the value of any new observation will be equal to one of x_1, x_2, …, x_n with each value being equally probable. With this assumption one can answer questions about the properties of future samples from the population.

The procedure is to draw further samples, each of size n, with replacement, from this hypothetical distribution. Each new sample is created using random numbers in the range 1 to n. For example, if the first five random numbers are 11, 8, 3, 11, and 4, then the first five observations in the next ‘sample’ will have values equal to those of x_11, x_8, x_3, x_11, and x_4, respectively. Notice that, by chance, some random numbers may occur more than once, while other numbers will not occur at all.

For each new sample, the characteristic of interest (for example, the mean) can be calculated. The range of values obtained for that characteristic gives an indication of the uncertainty in its value.

Suppose 999 further samples are generated, so that there are 1000 values of the characteristic of interest (including the value actually observed). An approximate 95% bootstrap interval is

\[ (v_{25},\; v_{975}), \]

where v_i is the ith value when the 1000 values are arranged in increasing order. If required, greater precision can be obtained by increasing the number of new samples.

Example 1.8: Californian Douglas Firs (cont.)

The 100 values in the original sample are not distinct, but this does not affect the bootstrap approach, which treats x_1 as the number of Douglas Firs in the first subregion, x_2 as the number of Douglas Firs in the second subregion, and so forth. Table 1.5 summarizes the first three resamples.

Figure 1.7 is a histogram (a diagram in which the areas of rectangles represent counts) that illustrates the wide range of the means obtained from resampling the Douglas Fir data. The values ranged from 0.33 to 1.07. Arranging the values in order, the bootstrap 95% confidence interval was found to be (0.42, 0.92), which is in excellent agreement with the (0.40, 0.90) given using Equation (1.12) (with c = 2). Note that each new set of 999 resamples may lead to a slightly different interval.

Table 1.5  The numbers in the first three resamples of the hundred 4 m × 4 m Douglas Fir counts summarized in Table 1.1.

Number of Douglas Firs     0    1    2    3    4    5    8    Mean
Original data             63   27    1    5    2    1    1    0.65
First resample            56   31    1    7    2    3    0    0.77
Second resample           62   30    2    3    1    1    1    0.60
Third resample            60   30    0    6    1    2    1    0.70
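The resampling procedure can be sketched in Python as follows. The original data are rebuilt from the ‘Original data’ row of Table 1.5; the function name, the fixed seed, and the use of `random.Random.choices` for sampling with replacement are illustrative choices rather than the book’s implementation. Because the resamples are random, the interval obtained will differ slightly from the one reported in the example.

```python
import random

# 'Original data' row of Table 1.5: number of firs -> number of plots.
frequency = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}
data = [value for value, freq in frequency.items() for _ in range(freq)]

def bootstrap_interval(sample, n_resamples=999, seed=1):
    """Approximate 95% bootstrap interval for the mean, in the form (v25, v975)."""
    rng = random.Random(seed)   # seed fixed only to make the sketch repeatable
    n = len(sample)
    means = [sum(sample) / n]   # include the value actually observed
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)   # n draws with replacement
        means.append(sum(resample) / n)
    means.sort()
    total = len(means)
    # With 1000 values these are the 25th and 975th in increasing order.
    return means[total * 25 // 1000 - 1], means[total * 975 // 1000 - 1]

lower, upper = bootstrap_interval(data)
print(f"approximate 95% bootstrap interval: ({lower:.2f}, {upper:.2f})")
```

Changing `seed` gives a different set of 999 resamples and hence a slightly different interval, which is the behaviour noted at the end of Example 1.8.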


Figure 1.7 is a histogram (a diagram in which the areas of rectangles rep0.2 0.4 0.6 0.8 1.0 1.2 resent counts) that illustrates the wide range of the means obtained from Mean resampling the Douglas Fir data. The values ranged from 0.33 to 1.07. Arthe values in order, bootstrapof95% interval was found Figureranging 1.7  Histogram showing thethe distribution the confidence means of 999 bootstrap resamples to besummarized (0.42, 0.92),inwhich in excellent agreement with the (0.40, 0.90) given of the data Tableis1.1. using Equation (1.12) (with c = 2). Note that each new set of 999 resamples 18 STATISTICAL IDEAS may lead to a slightly different interval. 
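The resampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bootstrap_means`, the seed, and the choice of the 25th and 975th ordered means as the ends of the 95% interval are not taken from the text, and a different convention for picking the end points would give slightly different limits.

```python
import random

# Counts from the "Original data" row of Table 1.5: value -> number of quadrats.
counts = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}

# Expand the frequency table into the 100 individual quadrat counts.
sample = [value for value, freq in counts.items() for _ in range(freq)]


def bootstrap_means(data, n_resamples=999, seed=2020):
    """Return the sorted means of n_resamples resamples drawn with replacement.

    The seed is an arbitrary choice (an assumption) made only so that the
    sketch is reproducible.
    """
    rng = random.Random(seed)
    n = len(data)
    means = [sum(rng.choices(data, k=n)) / n for _ in range(n_resamples)]
    return sorted(means)


means = bootstrap_means(sample)

# Assumed percentile convention: with 999 ordered means, take the 25th and
# the 975th values as the limits of the 95% interval.
lower, upper = means[24], means[974]

print("sample mean:", sum(sample) / len(sample))
print("range of resample means:", means[0], "to", means[-1])
print("approximate 95% interval:", (lower, upper))
```

As the text notes, a different set of resamples (here, a different seed) may lead to a slightly different interval.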

1.6.4  The likelihood function and maximum likelihood estimates

For a random sample $x_1, x_2, \dots, x_n$ of observations on $X$, the likelihood is the product of the probabilities of their occurrence:

$$L = P(X = x_1) \times P(X = x_2) \times \cdots \times P(X = x_n). \tag{1.13}$$

Often these probabilities will be functions of one or more unknown parameters. The values of those parameters that maximize $L$ are called the maximum likelihood estimates.

In some cases there will be no need to use the computer to maximize $L$ as simple formulae exist (for example, for a Poisson distribution with parameter $\mu$, the maximum likelihood estimate of $\mu$ is just the sample mean, $\bar{x}$).

When the likelihood is a function of a single unknown parameter, a 95% confidence interval for that parameter consists of all values that lead to a value of the log(likelihood) that is within 1.92 of the maximum value.⁵ Finding these values requires a simple loop programme on the computer.

When the likelihood is a function of two unknown parameters, the joint 95% confidence region for the two parameters consists of all pairs of values that lead to a value of the log(likelihood) that lies within 3.00 of the maximum value.⁶ In this case, a contour plot of the likelihood will demonstrate the interdependence of the two parameter estimates.

Example 1.9: Californian Douglas Firs (cont.)

The full counts of the Douglas Fir data were given in Table 1.1. Assuming a negative binomial distribution, the likelihood for these data is given by

$$L = \{p^r\}^{63} \times \{r p^r (1-p)\}^{27} \times \cdots \times \left\{\frac{(r+7)!}{8!\,(r-1)!}\, p^r (1-p)^8\right\}^{1}.$$

The estimates arising from maximizing $L$ are $\hat{r} = 0.6154$ and $\hat{p} = 0.4863$. The corresponding expected values were given in Table 1.4.

Figure 1.8 shows contours of the likelihood for these data. The uncertainty in the estimation of the parameter values is apparent: this is not a sharp mountain but a rather flat-topped ridge. The location of the maximum is indicated by the filled dot. The hollow dot indicates the values of the method of moments estimator found previously. To convey the uncertainty in the estimates, they might be reported as $\hat{r} \approx 0.6$ and $\hat{p} \approx 0.5$.

Figure 1.8  Contour plot of the likelihood surface for the Douglas Fir data. The maximum likelihood estimates are indicated by the filled dot. The method of moments estimates are indicated by the empty circle which lies well inside the joint 90% confidence region indicated by the inner contour. [Axes: Estimated value of r, Estimated value of p.]
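The kind of 'simple loop' mentioned in this section can be sketched as follows for the negative binomial likelihood of Example 1.9. This is a hedged illustration, not the procedure used for the book: the function name `log_likelihood`, the grid limits, and the grid spacing of 0.01 are all assumptions, and the factorials are handled with the log-gamma function so that non-integer values of r are allowed. On such a flat-topped ridge the second decimal place of the maximizing values is sensitive to the search method, so the grid result should be close to, but need not coincide with, the estimates quoted above.

```python
import math

# Douglas Fir counts (value -> number of quadrats), as in the original data.
counts = {0: 63, 1: 27, 2: 1, 3: 5, 4: 2, 5: 1, 8: 1}


def log_likelihood(r, p):
    """Log of the negative binomial likelihood for the counts above."""
    total = 0.0
    for x, freq in counts.items():
        log_prob = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
                    + r * math.log(p) + x * math.log(1 - p))
        total += freq * log_prob
    return total


# Assumed search grid: r from 0.10 to 2.50 and p from 0.10 to 0.95.
grid_r = [0.01 * i for i in range(10, 251)]
grid_p = [0.01 * i for i in range(10, 96)]

# Evaluate the log-likelihood at every grid point, tracking the maximum.
surface = {}
max_ll, r_hat, p_hat = -math.inf, None, None
for r in grid_r:
    for p in grid_p:
        ll = log_likelihood(r, p)
        surface[(r, p)] = ll
        if ll > max_ll:
            max_ll, r_hat, p_hat = ll, r, p

# Joint 95% region: all pairs whose log-likelihood is within 3.00 of the maximum.
region = [pair for pair, ll in surface.items() if ll >= max_ll - 3.00]

print("grid maximum at r =", round(r_hat, 2), "and p =", round(p_hat, 2))
print("r values in the region run from",
      round(min(r for r, _ in region), 2), "to",
      round(max(r for r, _ in region), 2))
```

The spread of r values inside the region gives a numerical impression of the ridge that the contour plot displays.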

1.7  Types of model

Many models are extremely complex, involving information on many variables, and defying simple description. For example, when using capture-recapture methods to estimate the number of mobile animals in a study area, the model requires estimation of the probability that an animal is present, together with estimation of the probability that an animal that is present is detected. These probabilities may depend on a variety of characteristics, such as age and gender. This section briefly describes two simple types of model that may be incorporated in more sophisticated models.

Linear models are principally used to relate a variable of interest, $y$, to one or more explanatory variables. The simplest form is the linear regression model typified by

$$y = \beta_0 + \beta_1 x, \tag{1.14}$$

where $x$ is the value of an explanatory variable, and the $\beta$s are constants whose values are estimated using a sample of pairs of values of $x$ and $y$. When there are several relevant explanatory variables, the model is extended in a natural fashion:

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k. \tag{1.15}$$

Here there are $k$ explanatory variables and this is called a multiple regression model.

The early literature on these linear models assumed that the variable of interest had a normal distribution. However, modern computer programmes permit the assumption of many different distributions, and the resulting models are referred to as generalized linear models, or simply GLMs. Still more recently, these models have been extended, by replacing individual 'point' values for the explanatory variables by some form of average values for the variables. The resulting models are called generalized additive models or GAMs.

Now consider $p$, the proportion of individuals that are caught in a trap. Suppose that $p$ depends on the temperature, $x$. The simple replacement of $y$ by $p$ in Equation (1.14) is not appropriate, because it could lead to a value of $p$ lying outside the possible range (0 to 1). The solution is to use an equation such as:

$$\ln\left(\frac{p}{1-p}\right) = \alpha + \beta x, \tag{1.16}$$

where ln is the so-called natural logarithm.⁷ The expression on the left of this equation is called a logit and the model is described as being a logistic model. A logistic model is a special case of a wider class of models known as log-linear models. More details are given in, for example, Upton (2016).
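The two simple model types in this section can be illustrated with a short Python sketch. Several things here are assumptions rather than content of the text: least squares is assumed as the estimation method for Equation (1.14); the six data values are borrowed, purely for illustration, from the regularly spaced measurements that appear in Section 1.9; and the values of `alpha` and `beta` used with Equation (1.16) are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x, as in Equation (1.14)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def logit(p):
    """Left-hand side of Equation (1.16)."""
    return math.log(p / (1 - p))


def inverse_logit(eta):
    """Solve Equation (1.16) for p, given eta = alpha + beta * x."""
    return 1 / (1 + math.exp(-eta))


# Illustrative data: six measurements taken at regular intervals.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 10.4, 19.8, 32.0, 39.9, 50.3]
b0, b1 = fit_line(xs, ys)
print("intercept:", round(b0, 3), "slope:", round(b1, 3))

# Hypothetical logistic model for the proportion caught at temperature x.
alpha, beta = -4.0, 0.2
temperatures = [0, 10, 20, 30, 40]
proportions = [inverse_logit(alpha + beta * x) for x in temperatures]
print([round(p, 3) for p in proportions])
```

Whatever values are chosen for `alpha` and `beta`, the proportions returned by `inverse_logit` stay strictly between 0 and 1, which is the reason for preferring Equation (1.16) to a direct use of Equation (1.14).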

1.8  Testing the goodness of fit of a model

Although there are many specialized tests available for particular situations, a useful general-purpose test is Pearson's goodness of fit test. The test requires the calculation of $X^2$ given by

$$X^2 = \sum_{j=1}^{J} \frac{(O_j - E_j)^2}{E_j}, \tag{1.17}$$

where $O_j$ and $E_j$ are, respectively, the observed and estimated counts in category $j$, and $J$ is the number of categories. If the model under test is correct, then the value obtained for $X^2$ will be typical of a chi-squared distribution having $J - 1 - P$ degrees of freedom, where $P$ is the number of parameters with values estimated from the data. For the chi-squared approximation to be reliable, all the $E$ values should be at least 3. If the value obtained for $X^2$ does not lie in the upper tail of the distribution, then the model under test may be judged acceptable.

An alternative to $X^2$ is $G^2$, the likelihood ratio goodness of fit statistic, or deviance, which is given by

$$G^2 = 2 \sum_{j=1}^{J} O_j \ln\left(\frac{O_j}{E_j}\right). \tag{1.18}$$

In most cases the value of $G^2$ is close to the value of $X^2$ and, as with $X^2$, if the model under test is correct, then the value of $G^2$ will be typical of a chi-squared distribution having $J - 1 - P$ degrees of freedom.

With a hand calculator $X^2$ is easier to calculate, but, since $G^2$ has some theoretical advantages, when a computer programme is doing the hard work, it is usually the value of $G^2$ that is given in its output.

Example 1.10: Californian Douglas Firs (cont.)

The observed and estimated frequencies using a negative binomial distribution were given in Table 1.4. In this case $J = 5$ (the number of categories reported) and two parameters were estimated from the data. The reference distribution is therefore a chi-squared distribution with 5 − 1 − 2 = 2 degrees of freedom. The goodness of fit statistics are given by

$$X^2 = \frac{3.6^2}{66.6} + \frac{9^2}{18.0} + \frac{6.7^2}{7.7} + \frac{1.3^2}{3.7} + 0 = 10.86,$$

$$G^2 = 2\left\{63 \ln\left(\frac{63}{66.6}\right) + 27 \ln\left(\frac{27}{18.0}\right) + \cdots + 4 \ln\left(\frac{4}{4.0}\right)\right\} = 13.82.$$

The probability of obtaining a value of 10.86 or more, from a chi-squared distribution with 2 degrees of freedom, is about 0.004.⁸ The probability of a value greater than 13.82 is about 0.001. It seems that a negative binomial distribution (with the parameter values estimated using the method of moments) provides a very unlikely description of the data.

Example 1.11: Random counts (cont.)

Table 1.3 reported the observed and fitted counts of successes based on genuinely random data. In this case, the estimated number for the category '0 successes' was very small (0.4), so, to improve the accuracy of the $X^2$ test, that category is combined with the next to give the category '≤ 1 success'. The estimated frequency for the combined category is 3.41 and the observed frequency is 5. Thus

$$X^2 = \frac{1.59^2}{3.41} + \frac{1.96^2}{7.96} + \frac{1.44^2}{9.44} + \frac{1.81^2}{4.19} = 2.22.$$

After combining the categories, $J = 4$. The corresponding value for $G^2$ is 2.10. The data were used to estimate the value of one parameter ($p$), so the reference distribution is a chi-squared distribution with 4 − 1 − 1 = 2 degrees of freedom. These values are close to the average value for a chi-squared distribution with two degrees of freedom, so the hypothesis that the points were scattered randomly would be accepted.

⁷ The natural logarithm may also be written $\log_e$ since it expresses numbers as powers of e. Thus ln(e) = 1.

⁸ Using R, the appropriate command is 1 - pchisq(10.86, 2)
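Equations (1.17) and (1.18) can be checked numerically with a short Python sketch. The observed and estimated counts below are read off from the terms of the two sums in Example 1.10 (Table 1.4 itself is not reproduced here), and the function names are illustrative. Because those estimated counts are rounded to one decimal place, the value of X² computed from them comes out near 11 rather than reproducing the quoted 10.86 exactly. The tail probability uses the fact that, for exactly 2 degrees of freedom, the upper-tail probability of a chi-squared distribution is exp(−x/2); this shortcut does not apply to other degrees of freedom.

```python
import math

# Observed and (rounded) estimated counts for the five categories of Example 1.10.
observed = [63, 27, 1, 5, 4]
expected = [66.6, 18.0, 7.7, 3.7, 4.0]


def pearson_x2(obs, exp):
    """Pearson's X^2 statistic, Equation (1.17)."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def deviance_g2(obs, exp):
    """Likelihood ratio statistic G^2, Equation (1.18).

    A category with an observed count of zero contributes nothing to the sum.
    """
    return 2 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)


def upper_tail_2df(value):
    """P(chi-squared with 2 degrees of freedom >= value); valid for 2 df only."""
    return math.exp(-value / 2)


x2 = pearson_x2(observed, expected)
g2 = deviance_g2(observed, expected)
print("X^2 from rounded counts:", round(x2, 2))
print("G^2 from rounded counts:", round(g2, 2))
print("tail probability for 10.86:", round(upper_tail_2df(10.86), 4))
print("tail probability for 13.82:", round(upper_tail_2df(13.82), 4))
```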

1.9  AIC and related measures

This book is concerned with inferring the size of a population from information provided by samples from that population. It is usually true that, with two competing models that describe the sampled data, it is the more complicated model that provides the more accurate description. As an extreme example, suppose that a description is required for the following set of measurements, which have been taken at regular intervals:

0, 10.4, 19.8, 32.0, 39.9, 50.3.

One possible description is ‘the first observation is 0, the second is 10.4, the third is 19.8, the fourth is 32.0, the fifth is 39.9, and the sixth is 50.3’. This description is 100% accurate, but it is very long-winded and could not be described as a summary. By contrast the statement that ‘each observation is about 10 more than its predecessor with the sequence starting at zero’, while not a perfect description, is usefully concise and effectively describes the values.

AIC (short for the Akaike Information Criterion) is a measure, introduced by the Japanese statistician Akaike in 1973, that balances model complexity against goodness of fit. AIC values are reported by many computer programmes. The value reported for a particular model is not in itself important; what matters is how large that value is compared to the values reported for alternative models. The best model (providing that it makes sense to the investigator) is the model with the smallest AIC value. If it were simply a matter of comparing AIC values, then the investigator would be replaced by the computer! A discussion of this point is provided by Mac Nally et al. (2017) who ask the question ‘is the “best” model any good?’

For logistic or log-linear models (Section 1.7) the value of AIC is equal to a constant (that depends on underlying distributional assumptions) plus the value of the deviance $G^2$ (Equation (1.18)) reduced by $2d$, where $d$ is the degrees of freedom associated with the $G^2$ value. Adding extra information into a model (for example, whether a creature is a male or a female) will never increase the value of $G^2$, but it will decrease $d$. If the information is not sufficiently valuable, then AIC will increase.

Some programmes report the value of AICc, which incorporates a correction for sample size. The difference between AIC and AICc is usually very small.

The Bayesian Information Criterion (BIC) works in the same way as AIC, but associates a greater penalty to the loss of a degree of freedom. The two measures have subtly different motivations: AIC seeks to select that model, from those available, that most closely resembles the true model (which will be governed by a myriad of unmeasured considerations and will not be among those considered), whereas BIC assumes that the correct model is among those on offer and seeks to identify that optimal model. Since, in the context of the measurement of abundance, none of the models considered are likely to provide a perfect description, AIC should be the measure used.
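The AIC trade-off described in this section can be made explicit with a short worked calculation. The symbols $c$ (the constant) and $\delta$ (the reduction in deviance) are introduced here only for illustration. Suppose that adding one parameter to a logistic or log-linear model reduces the deviance by $\delta \ge 0$ and therefore reduces $d$ by 1:

```latex
\begin{align*}
\mathrm{AIC}_{\text{old}} &= c + G^2 - 2d,\\
\mathrm{AIC}_{\text{new}} &= c + (G^2 - \delta) - 2(d - 1),\\
\mathrm{AIC}_{\text{new}} - \mathrm{AIC}_{\text{old}} &= 2 - \delta .
\end{align*}
```

On this reading, the extra parameter lowers AIC only when it reduces the deviance by more than 2; a smaller reduction leaves AIC higher than before.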

1.10  Quantile-quantile plots

In distance sampling (Chapter 8) it is necessary to fit curves to a histogram of frequencies. A difficulty with histograms is that they require data to be grouped into ranges of values: different groupings give rise to different histograms and, potentially, to different impressions concerning how well the curve fits the data. An alternative, that avoids grouping the data, is the quantile-quantile plot, which is often called the q-q plot.

Suppose that the observed values are numbered in increasing order, so that $x_1 \le x_2 \le \dots \le x_n$. So (roughly) a proportion $1/n$ of the observed values are less than or equal to $x_1$, while $2/n$ are less than or equal to $x_2$, and so forth. In a q-q plot these proportions⁹ are plotted against the corresponding theoretical tail probabilities $F(x_1), F(x_2), \dots, F(x_n)$, where $F(x)$ is the theoretical distribution function (Section 1.3) under test.

Figure 1.9 shows two examples of q-q plots. For a good-fitting model the plots lie close to the line of equality. In a poor-fitting model there will be noticeable divergences.

Figure 1.9  Two q-q plots. (a) An example of a model that fits acceptably. (b) An example of an unacceptable fit. [Axes in both panels: Empirical cdf, Fitted cdf.]

1.10.1  Cramér-von Mises test

This test was suggested by the Swede, Cramér, in 1928, and, independently, by the Austrian, von Mises, in 1931. Denoting the observed distribution function by $\hat{F}(x)$ and the corresponding theoretical function by $F(x)$, it collects together all the discrepancies in the q-q plot into a single statistic $W^2$, given by

$$W^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left\{\hat{F}(x_i) - F(x_i)\right\}^2. \tag{1.19}$$

Large values of $W^2$ suggest that the theoretical distribution is incorrect.

⁹ In practice the values used are 1/2n, 3/2n, …
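A minimal Python sketch of the q-q plot coordinates and of the statistic in Equation (1.19) is given below. It assumes the usual convention that the observed proportion attached to the i-th ordered value is (2i − 1)/(2n); the function names, the hypothetical sample, and the two candidate distribution functions are illustrative choices and do not come from the text.

```python
def qq_points(data, cdf):
    """Pairs (observed proportion, theoretical probability) for a q-q plot.

    Assumed convention: the i-th ordered value is given the observed
    proportion (2i - 1) / (2n).
    """
    xs = sorted(data)
    n = len(xs)
    return [((2 * i - 1) / (2 * n), cdf(x)) for i, x in enumerate(xs, start=1)]


def cramer_von_mises(data, cdf):
    """W^2 statistic: 1/(12n) plus the sum of squared q-q discrepancies."""
    n = len(data)
    return 1 / (12 * n) + sum((emp - theo) ** 2 for emp, theo in qq_points(data, cdf))


def uniform_cdf(x):
    """Distribution function of a uniform distribution on (0, 1)."""
    return min(max(x, 0.0), 1.0)


def squared_cdf(x):
    """A deliberately different candidate: F(x) = x^2 on (0, 1)."""
    return uniform_cdf(x) ** 2


# Hypothetical sample lying exactly on the line of equality for the uniform model.
sample = [0.1, 0.3, 0.5, 0.7, 0.9]

w2_good = cramer_von_mises(sample, uniform_cdf)
w2_poor = cramer_von_mises(sample, squared_cdf)
print("W^2 under the uniform model:", w2_good)
print("W^2 under the alternative model:", w2_poor)
```

For the sample shown, every point of the q-q plot under the uniform model lies on the line of equality, so only the 1/(12n) term remains; the alternative model gives a larger value.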

Part II

Stationary individuals

It might seem easy to count stationary objects, but the length of this part of the book is an indication that life is not that simple. The questions that arise are where to count, what to count, and how to count. If the intention is to count all the individuals in a given region, then rules are required for individuals that overlap the region’s boundary. The counting procedure must be both practical and reliable. Much of the discussion in the next few chapters centres on finding efficient methods to achieve objectives. Before continuing, the reader might ponder on the problems associated with counting daisies in a playing field, and then contrast the chosen method with counting oak trees in a wood. One size does not fit all.

2.  Quadrats and transects

The term ‘quadrat’ originally referred to a square wooden sampling frame (typically of area 1 m²). It now refers to a region of any shape and size (but usually rectangular or square) that will be used to take repeated samples from a population of interest. Quadrats might be used to estimate species abundance (richness), to determine which species are present, or to determine the amount of ground cover (or canopy cover) attributable to particular species. A narrow rectangular quadrat may be described as a transect.

In this book it will be assumed that there is a random element to the placement of a quadrat. However, there is an alternative procedure in which the region examined (known as a relevé) is carefully chosen to be fully representative of the wider area of interest. That choice must be made by someone with a deep knowledge of the general region. Workers using relevés are particularly interested in the interactions between species. Their aim might be to create a taxonomy of vegetation types. This branch of study has been termed phytosociology.

2.1  What shape quadrats?

Whatever shape quadrat is used, there must be clear rules relating to plants that overlap the quadrat boundary. For a rectangular or square quadrat, a typical rule would be that the plant is counted if it overlaps the north or east boundaries, but not if it overlaps the south or west boundaries. This avoids double counting or overestimation of species abundance. For further discussion of problems caused by the edge see Section 2.4. With small quadrats it will be useful to take a photograph in case of doubt when collating results.

2.1.1  Advantages of rectangular quadrats

• In Section 9.3.1 it is shown that, on average, more species are likely to be found in rectangular plots than in square plots of the same area.
• It is easy to keep track of which parts of the quadrat have already been searched. With a square quadrat it might be necessary to mark out subdivisions to avoid double counting or missed sections.
• Narrow rectangular quadrats (also called belt transects, strip transects, or simply transects) can easily be monitored by a single observer travelling down the plot centre (for example, a diver monitoring the ocean bottom), or by two observers travelling in tandem down either edge of the plot.¹
• Narrow rectangular quadrats result when remote underwater video (RUV) technology is used to monitor cover on the sea bed. A comprehensive review of alternative approaches is provided by Mallet and Pelletier (2014).
• Narrow rectangular quadrats are convenient if it is necessary to avoid entering the quadrat (in order not to tread on, or otherwise damage, the individuals of interest).

Advice on data collection

For rare plants, Elzinga et al. (1998) suggest using a quadrat with a width of 0.25 m or 0.5 m.

2.1.2  Advantages of square quadrats

• Square quadrats are the natural choice when a grid of contiguous quadrats is used.
• Boundary overlap is less of a problem with a square quadrat than it would be for a rectangular quadrat of the same area, because of the smaller perimeter of a square. For example, a square 1 m² quadrat has a perimeter of 4 m, whereas a rectangular 2 m × 0.5 m quadrat has a 5 m perimeter.
• Squares are a natural choice when a square grid is already in existence for other purposes. For example, in the UK, there is a well-defined 1 km square National Grid created for mapping purposes. In the USA, the Public Land Survey System of the Bureau of Land Management uses 1-mile-square regions.

2.1.3  Advantages of circular quadrats

• Circular quadrats are easily defined with a central pole and a rope in clear ground. Bibby et al. (2000) recommended circular quadrats in the context of burrow-nesting seabirds.
• Kershaw et al. (2016) provide a useful discussion of the relative merits of circular and rectangular (or square) plots. They observe that since circular plots have the smallest possible perimeter for a given area, their use minimizes boundary problems.

2.2  How many quadrats?

Any increase in the number of quadrats sampled will result in an increase in accuracy. However, since resources are finite, a practical approach is to use the smallest number that gives the desired accuracy. Suppose that there are n quadrats, each of area a, being used to estimate the total number of individuals in a region with area A. Let the sample mean number of individuals be denoted by $\bar{x}$, with the sample variance being $s^2$. The estimate of the number of individuals for the entire region, $\hat{N}$, is given by

$$\hat{N} = \frac{A}{a}\,\bar{x}. \tag{2.1}$$

Comparison of the values of $\bar{x}$ and $s^2$ gives an idea of the spatial pattern of the locations of the species concerned. If $s^2$ is much less than $\bar{x}$ then the individuals are rather regularly placed in the study region. If $s^2$ is much greater than $\bar{x}$ then the individuals occur in clusters. The intermediate case is a random pattern (which typically appears visually to be a mix of regular and clustered components).
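As a rough illustration of Equation (2.1) and of the comparison between the sample variance and the sample mean, the following Python sketch may help. The function names, the quadrat counts, and the areas are all hypothetical, chosen only to show the arithmetic.

```python
from statistics import mean, variance

def estimate_total(counts, a, A):
    """Equation (2.1): scale the mean count per quadrat (area a) up to the region (area A)."""
    return A / a * mean(counts)

def variance_to_mean(counts):
    """Ratio of sample variance to sample mean.

    Values well below 1 suggest a regular pattern; values well above 1 suggest clustering.
    """
    return variance(counts) / mean(counts)

# Hypothetical counts from eight quadrats, each of area 4, in a region of area 1000.
counts = [2, 3, 1, 2, 2, 3, 2, 1]
print(estimate_total(counts, a=4, A=1000))
print(round(variance_to_mean(counts), 2))
```

Here the ratio is well below 1, which would point towards a rather regular arrangement. No formal threshold is implied; the ratio is only a descriptive guide.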

An approximate 95% confidence interval (see Section 1.6.2) for the number of individuals in the region is

$$\left(\hat{N} - \frac{2As}{a\sqrt{n}},\ \hat{N} + \frac{2As}{a\sqrt{n}}\right). \tag{2.2}$$

To gain an idea of the implications of these formulae, suppose that the objects of interest are randomly located in space with an average of λ per unit area. The randomness assumption implies that the objects form what Matheron (1989) termed a Poisson forest. In this case, with quadrats of unit area, both the mean and the variance of the observed counts will be λ. Since a = 1, the approximate 95% confidence interval for the number of individuals in an area A will then be

$$\left(\lambda A - 2A\sqrt{\frac{\lambda}{n}},\ \lambda A + 2A\sqrt{\frac{\lambda}{n}}\right). \tag{2.3}$$

Expressing the width of the interval as a proportion of the value being measured gives

$$4\sqrt{\frac{1}{n\lambda}}.$$

If this is to be no greater than 25%, then $\sqrt{n\lambda}$ must be at least 16, so that the required number of quadrats is 256/λ. Thus eight quadrats would be sufficient if they were large enough to contain on average 32 individuals, but far more quadrats would be required if the average per quadrat was much less.

With regularly spaced individuals, fewer quadrats are required than would be the case for a Poisson forest, but with clustered individuals far more sampling may be necessary. In the absence of clear information on the spatial distribution, a pilot study will be useful.

To see the effect of the spatial distribution of the sampled individuals, consider the three cases illustrated in Figure 2.1. In each case there are 100 individuals within a region of 100 square units being sampled by the same eight quadrats of size 1 square unit. The extreme case (a) has a rectangular grid of individuals, with the grid spacing exactly matching the quadrat size, so that every quadrat contains exactly one individual. The resulting estimate of 100 individuals in the sampled region is exactly correct and without doubt. Just one quadrat would have been enough!

Figure 2.1  Three equal-sized regions each containing 100 individuals of interest. The three patterns are (a) regular, (b) random, and (c) clustered.

Case (b) is an example of a random pattern (a Poisson forest). The eight quadrats have counts of 0, 1, 1, 1, 1, 2, 2, and 2, giving $\bar{x} = 1.25$, $s^2 = 0.5$, and an approximate 95% confidence interval of (75, 175). The confidence interval includes the correct value, but it is a wide one, indicating that more quadrats are needed to give a useful interval.

Case (c) illustrates a highly clustered pattern with quadrat counts of 0, 0, 0, 0, 1, 2, 3, and 9, giving $\bar{x} = 1.875$, $s^2 = 9.55$. Note that $s^2$ is much greater than $\bar{x}$. In this case Equation (2.2) gives the approximate 95% confidence interval as (−31, 406). The lower limit is clearly nonsense: it points to the need to gather more information. Increasing the number of quadrats to 32 does provide a feasible interval of (63, 212), but this is still very wide, and yet more sampling would be required before a usefully precise estimate could be obtained.
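The intervals quoted for cases (b) and (c), and the 256/λ rule for a Poisson forest, can be checked with a short Python sketch. The function names are illustrative assumptions; the counts, the quadrat area (1), and the region area (100) are those given in the text.

```python
from math import ceil, sqrt
from statistics import mean, stdev

def abundance_ci(counts, a, A):
    """Approximate 95% interval of Equation (2.2): N-hat +/- 2*A*s / (a*sqrt(n))."""
    n = len(counts)
    n_hat = A / a * mean(counts)
    half_width = 2 * A * stdev(counts) / (a * sqrt(n))
    return n_hat - half_width, n_hat + half_width

def quadrats_needed(lam, rel_width=0.25):
    """Poisson forest, unit-area quadrats: smallest n with 4 / sqrt(n * lam) <= rel_width."""
    return ceil((4 / rel_width) ** 2 / lam)

case_b = [0, 1, 1, 1, 1, 2, 2, 2]   # random pattern
case_c = [0, 0, 0, 0, 1, 2, 3, 9]   # clustered pattern
for counts in (case_b, case_c):
    low, high = abundance_ci(counts, a=1, A=100)
    print(round(low), round(high))
print(quadrats_needed(32))
```

After rounding, the sketch gives (75, 175) for case (b) and (−31, 406) for case (c), and it returns eight quadrats when the average count per quadrat is 32.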

Example 2.1: Californian trees (cont.)

The counts for the trees of Figure 1.1 were summarized in Table 1.1. For these counts $\bar{x} = 0.65$ and $s^2 = 1.54$, so that the approximate 95% confidence interval for the number of Douglas firs in the entire plot is:

$$\frac{200 \times 300}{16}\left(0.65 \pm 2\sqrt{\frac{1.54}{100}}\right) = 2437.5 \pm 931.6 = (1506, 3369).$$

The estimate $\hat{N} = 2437.5$ is reassuringly close to the true number of trees (2183), though the confidence interval is so wide that it would probably be felt that further sampling was required.

Advice on data collection

Historically, trees were mapped using measuring tapes and compasses, though the latter were somewhat imprecise, and the resulting maps might contain serious errors. As a result, Rohlf and Archie (1978) suggested methods based only on distances. When these were distances between two trees, any error in one tree position would lead to an error in the position deduced for the next tree. Boose, Boose, and Lezberg (1998) proposed an alternative distance-only method that minimized the possibility of error amplification.

A more straightforward procedure, which eliminates the possibility of accumulating errors, begins by setting out a carefully checked rectangular grid of marker posts at fixed distances across the study area. Subsequently, for each tree, its location can be found by determining the direction of, and distance to, a nearby marker post (taking account of the diameter of the tree concerned). Typically, the measuring instruments are lasers and sighting compasses. This was the procedure used with the Californian survey plot.

Currently, mapping often involves the use of GPS. However, the accuracy of GPS varies across the Earth’s surface, depending on the number of satellites for which there is a line-of-sight view. The accuracy has increased over time, with an increase in the number of satellites, and improvements in software and hardware. In June 2016, according to the US government, 95% of positions were reported as being measured correctly to within 2 m horizontally and 3 m vertically. In some locations, the GPS information can be augmented by other systems, to give much improved accuracy. Rayburn, Schiffers, and Schupp (2011) claimed accuracy to within 2 cm when locating shrubs by using a survey-grade base unit, together with a rover unit mounted on a fixed height pole equipped with a bubble level.

2.3  Quadrat placement

In the previous example, the 100 quadrats were arranged in a contiguous 10 × 10 square. In practice, if the idea is to survey a much larger area, then this would be a poor choice, since neighbouring locations tend to be similar to one another, and may give a poor guide to the nature of distant locations within the area of interest. With scattered quadrats some form of random allocation is required to avoid conscious or unconscious bias. However, centring quadrats at randomly selected points, as illustrated in Figure 2.2 (a), may result in overlapping quadrats and an uneven coverage.

Figure 2.2  (a) A random arrangement of quadrats. (b) An arrangement having one quadrat randomly positioned in each column. (c) An arrangement in which each row and each column contains exactly one quadrat. (d) A systematic arrangement with fixed gaps along rows and columns.

Overlapping quadrats are easily avoided by using some form of stratification. Figure 2.2 (b) illustrates an allocation in which each column contains one randomly placed quadrat. However, there still appears to be some clustering. This is reduced in Figure 2.2 (c) where both rows and columns contain a single quadrat.²

Figure 2.2 (d) illustrates systematic sampling. In this case a value of x is randomly chosen in the range 1 to 5, with the next sets of five values having x co-ordinates at intervals of 5 from the first. For each set, the same procedure is used to determine the y co-ordinates. This process guarantees that every quadrat has an equal chance of selection, with an even coverage for the entire study region.

In a large-scale survey there may well be different terrain types or ecosystems (for example, urban, farmland, woodland, etc.). To ensure that all are represented in an appropriate way, random (or systematic) sampling should take place within each. This is called stratified sampling.
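The arrangements of Figure 2.2 (b), (c), and (d) can be sketched in Python as follows. This is only an illustration: the grid dimensions, the function names, and the use of zero-based cell indices (rather than the range 1 to 5 mentioned above) are assumptions, not details taken from the figure.

```python
import random

def one_per_column(n_cols, n_rows, rng):
    """Figure 2.2 (b) style: one quadrat in each column, at a random row."""
    return [(col, rng.randrange(n_rows)) for col in range(n_cols)]

def one_per_row_and_column(n, rng):
    """Figure 2.2 (c) style: a random permutation puts one quadrat in each row and column."""
    rows = list(range(n))
    rng.shuffle(rows)
    return list(enumerate(rows))

def systematic(size, step, rng):
    """Figure 2.2 (d) style: random start in the first step-by-step block, then fixed gaps."""
    x0, y0 = rng.randrange(step), rng.randrange(step)
    return [(x, y) for x in range(x0, size, step) for y in range(y0, size, step)]

rng = random.Random(1)   # fixed seed so that the sketch is repeatable
print(one_per_column(10, 10, rng))
print(one_per_row_and_column(10, rng))
print(systematic(25, 5, rng))
```

For stratified sampling, the same functions could be applied separately within each terrain type.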

2.3.1  Generalized random-tessellation stratified design (GRTS)

A problem with the idealized arrangements illustrated in Figure 2.2 (c) and (d) is that they may not work! For example, the chosen quadrats may miss important features of the study region; they may be inaccessible or they may have inappropriate habitat. Stevens and Olsen (2004) developed a procedure that provides a spatially balanced random sample while nevertheless avoiding these problems. The stages of their procedure are:

1. Establish a labelled grid of quadrats across the study region.
2. Arrange the labels in order along a ‘line’, giving each an appropriate ‘length’. Suppose the length of the ‘line’ is N.
3. Suppose n quadrats are required. Let N/n be denoted by k and let m be a value chosen at random in the range 1 to k.
4. The quadrats chosen are those corresponding to the locations (along the line) m, m + k, m + 2k, etc.

Figure 2.3 illustrates the first stages in the labelling of quadrats for GRTS. The study region is divided into four sections; each section is divided into four subsections; each subsection is further divided. The process continues until the smallest quadrat size is deemed appropriate for the application.

Figure 2.3  The study region is divided into 4 sections. Each section is broken into 4 subsections. Each subsection is again split. This continues until quadrats of the required size are obtained.

The quadrats are then arranged in order using reverse hierarchical ordering. For the case illustrated that order is:

d4D, d4C, d4B, d4A, d3D, d3C, …, d1B, d1A, c4D, c4C, …, a1A.

In the example there are 64 quadrats. Suppose that three central quadrats (c1D, a3B, and d4A) are unusable (length 0), and that the remainder are equally important (length 1), so that the ‘line’ has length N = 64 − 3 = 61. The resulting ‘line’ is illustrated in the upper part of Figure 2.4. Suppose we want to choose n = 10 quadrats. Thus k = 61/10 = 6.1. Suppose the randomly chosen start is m = 4.31. The chosen quadrats are those corresponding to the points 4.31, 10.41, 16.51, …, 59.21. The process and the outcome are illustrated in Figure 2.4.

Figure 2.4  The upper part shows the line formed by the ordered 61 quadrats of Figure 2.3. The lower part shows the ten selected quadrats (d3D, d2B, c4D, c3B, c1C, b4A, b2C, a4D, a3A, and a1C).

In this example all 61 usable quadrats were given equal weight. On some occasions, however, there will be a minority of quadrats that correspond to habitats that the investigator is particularly keen to sample. For example, if quadrat c3D is assigned a ‘length’ 10, rather than 1, then its probability of inclusion will be 10 times that of the other quadrats.
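The selection along the ‘line’ can be sketched in Python. This is a minimal illustration rather than the Stevens and Olsen implementation: the function names are hypothetical, and the sketch assumes that the line starts at position 1, so that the first unit-length quadrat covers positions 1 to 2. That convention is not stated in the text; it is adopted here because it reproduces the ten quadrats shown as selected in Figure 2.4.

```python
from itertools import product

def reverse_hierarchical_labels():
    """Labels in the order quoted in the text: d4D, d4C, d4B, d4A, d3D, ..., a1A."""
    return [fine + mid + top for fine, mid, top in product("dcba", "4321", "DCBA")]

def select_along_line(labels, lengths, n, m):
    """Pick the quadrats whose stretch of the line contains m, m + k, m + 2k, ..."""
    k = sum(lengths) / n
    targets = [m + j * k for j in range(n)]
    chosen, start, t = [], 1.0, 0      # assumption: the line starts at position 1
    for label, length in zip(labels, lengths):
        end = start + length           # zero-length quadrats can never be hit
        while t < n and start <= targets[t] < end:
            chosen.append(label)       # a long quadrat could be hit more than once
            t += 1
        start = end
    return chosen

labels = reverse_hierarchical_labels()
unusable = {"c1D", "a3B", "d4A"}
lengths = [0 if lab in unusable else 1 for lab in labels]
print(select_along_line(labels, lengths, n=10, m=4.31))
```

Giving a quadrat a larger entry in `lengths` widens its stretch of the line and so raises its chance of selection in proportion.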

2.4  Forestry sampling

Although there are some specific challenges that face foresters, much of this section has wider application.

2.4.1  Cruising

In forestry, timber stocks may be assessed by examining the trees in a sequence of sample plots, often arranged along a series of parallel paths, in the study region (this is referred to as fixed-plot sampling). Foresters refer to the process of travelling from one plot to the next as cruising. An example cruise is illustrated in Figure 2.5. Note that some sample plots overlap the edge; the subsequent analysis will need to take this into account (forest edges should be sampled since they are likely to have a different composition from forest centres).

Figure 2.5  Forestry sampling may consist of a ‘cruise’ with samples taken at regular intervals (here using circular plots) on a series of parallel paths through the forest. Sample plots that overlap the edge need special treatment.

2.4.2  Cluster sampling

When sampling forests, a major cost is likely to be associated with the journey to a sampling point. Once that point has been reached, it will therefore be cost-effective to take several samples before moving to the next sampling point. The number of subplots, and the distances between them, must be decided without reference to the local terrain. Some examples of possible configurations and routes are shown in Figure 2.6. According to Yim et al. (2015), configurations (a) and (d) have been used in national forest inventories in South Korea, with (b) being used in Germany. Their study considered the times associated with the stages of the sampling process and they also considered the effects of spatial correlation on the accuracy of estimates. They recommended (d) for future use in South Korean forests. The impact of quadrat placement on apparent species diversity is discussed in Section 9.3.1.

Figure 2.6 Example configurations of forestry subplots, with possible connecting routes (dotted lines).


Figure 2.7  A, B, and C are locations in the study region. Circular quadrats of radius r are being used. A would be included in the sample if the quadrat centre lay anywhere within the dotted circle shown. B would be included by any quadrat whose centre lay within the semicircle shown. C is more likely to be included than B, but is less likely than A.

2.4.3  Slopover bias

In Figure 2.7 circular quadrats of radius r are in use. Locations A, B, and C lie within the region of interest. Whether or not they are included in a sample will depend upon the location of the sampling points. A sampling point centred anywhere on, or within, the dotted circle of area $\pi r^2$ surrounding A will result in A being included. For B, however, the relevant region is not a circle, but a semicircle of area $\pi r^2/2$. Thus A is twice as likely to be selected as B. Any location, such as C, that is within r of the edge will have a reduced chance of being selected.

In forestry, this bias, which results from quadrats potentially overlapping the edge of the study region, is termed slopover bias. Note that slopover bias exists whether or not any quadrats actually overlap the edge and irrespective of the shape of the quadrat. The bias applies wherever quadrats are used, but may be particularly important in forestry, where the species composition and plant density at the edge may be distinctly different from that in the centre of the study region.

One way of eliminating slopover bias is to allow the quadrat centre to be placed up to a distance r outside the study region (Masuyama, 1954). However, this would not be feasible if, for example, the study region bordered a motorway, or a large body of water. If the proportion of a quadrat that lies within the study region is known, then one solution is to scale up the plant count. For example, if 3/5 of the quadrat lies inside the study region, then the count made for that quadrat should be multiplied by 5/3. Here are two other methods of eliminating the bias.
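Before turning to those methods, the geometry behind the comparison of A, B, and C can be made concrete. The sketch below computes the area within which a quadrat centre must fall for a location to be sampled, under assumptions that are not spelled out in the text: the edge is straight, the location is far from any corner, and quadrat centres are placed only inside the study region. The function name is illustrative.

```python
from math import acos, pi, sqrt

def inclusion_area(d, r):
    """Area of admissible quadrat centres for a location at distance d (>= 0) from a straight edge."""
    if d >= r:
        return pi * r ** 2                      # like location A: the full circle
    # Circular segment cut off by the edge (it lies outside the study region).
    segment = r ** 2 * acos(d / r) - d * sqrt(r ** 2 - d ** 2)
    return pi * r ** 2 - segment

r = 10.0
for d in (0.0, 5.0, 10.0):
    print(d, round(inclusion_area(d, r) / (pi * r ** 2), 2))
```

A location on the edge (d = 0, like B) has half the inclusion area of an interior location, while intermediate distances (like C) fall between one half and one.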

2.4.3.1  The mirage method

This method, introduced by Schmid-Haas (1969), can be used when the study region has a straight edge. If a quadrat overlaps the edge, then the overlap is notionally folded back into the study region with plants in the folded area being added to those already counted. For a circular quadrat this will mean that some plants are counted twice. This is illustrated in Figure 2.8 (a). With a square or rectangular plot, if the quadrat is at an angle to the edge of the plot, then new plants may be counted. This is illustrated in Figure 2.8 (b). Gregoire (1982) demonstrated that the procedure results in unbiased estimates of abundance.

Figure 2.8  The mirage method for correcting slopover bias. (a) With a circular quadrat, plants in the reflected area get weight 2. (b) With a rectangular quadrat having a different orientation from the study region, some plants will get weight 2, but the reflected region may also result in new plants being selected.
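For a circular quadrat, the folding operation of the mirage method can be sketched in Python. The sketch assumes that the study region is the half-plane x ≥ 0, so that the edge is the line x = 0; folding the overlap back into the region is then equivalent to laying out a second quadrat centred at the mirror image of the original centre. The function name and the co-ordinates are hypothetical.

```python
from math import hypot

def mirage_weight(plant, centre, r):
    """Weight (0, 1 or 2) of a plant for a circular quadrat near the edge x = 0.

    Assumes the study region is the half-plane x >= 0 and that both the plant
    and the quadrat centre lie inside it.
    """
    px, py = plant
    cx, cy = centre
    in_quadrat = hypot(px - cx, py - cy) <= r
    in_mirage = hypot(px + cx, py - cy) <= r    # mirror centre is (-cx, cy)
    return int(in_quadrat) + int(in_mirage)

# Quadrat of radius 2 centred 1 unit inside the edge.
for plant in [(0.5, 0.0), (2.5, 0.0), (4.0, 0.0)]:
    print(plant, mirage_weight(plant, centre=(1.0, 0.0), r=2.0))
```

The plant nearest the edge receives weight 2, the second receives weight 1, and the third lies outside the quadrat altogether.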

2.4.3.2  The walkthrough method

Ducey, Gove, and Valentine (2004) suggested a variant on the mirage method that treats each near-edge plant individually. Crucially, it does not require a straight edge to the boundary. The decision on whether a plant should be doubly weighted is based on whether a point at twice the distance from the sampling point, in the same direction, would fall outside the region of interest (double weighting) or inside the study region (single weighting). Examples are shown in Figure 2.9. Valentine et al. (2006) suggest variations of the procedure for use with cluster sampling.


Figure 2.9  The walkthrough method. From sampling point P, the plants at A and D are doubly weighted, whereas those at B and C do not get counted twice because, at twice the distance, the paths from P fall within the study region.
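The walkthrough rule reduces to a single geometric test, sketched below in Python. The study region is supplied as a function that reports whether a point lies inside it; the unit square used here is a hypothetical example, as are the function names and co-ordinates.

```python
def walkthrough_weight(sample_point, plant, inside):
    """Weight 2 if the point at twice the distance lies outside the region, else 1."""
    px, py = sample_point
    tx, ty = plant
    walk_x, walk_y = 2 * tx - px, 2 * ty - py   # same direction, twice the distance
    return 1 if inside(walk_x, walk_y) else 2

def in_unit_square(x, y):
    """Hypothetical study region: the square with 0 <= x <= 1 and 0 <= y <= 1."""
    return 0 <= x <= 1 and 0 <= y <= 1

sample_point = (0.2, 0.5)
print(walkthrough_weight(sample_point, (0.05, 0.5), in_unit_square))  # walk ends outside
print(walkthrough_weight(sample_point, (0.4, 0.5), in_unit_square))   # walk ends inside
```

Because the test uses only the `inside` function, the boundary need not be straight, which is the point made in the text.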


2.4.4  Circular quadrats for sampling coarse woody debris

The term ‘coarse woody debris’, often shortened to CWD, refers to the fallen tree trunks (or large branches) lying on the floor of a forest. Gove and van Deusen (2011) identify three methods that may be used when a fallen tree straddles the boundary of a circular quadrat of radius R.

2.4.4.1  The stand-up method

For this method the fallen tree is imagined as standing up on the spot where its base currently rests (which may not be where it originally grew). If that spot lies within the quadrat then the entire fallen tree is included in the sample. Otherwise, the tree is ignored.

Suppose sampling is performed using N quadrats of radius r. Denote the number of fallen trees in quadrat j by $n_j$, with the relevant measurement (e.g. the tree’s length, volume, or surface area) for tree i being denoted by $y_{ji}$. Using only quadrat j, the estimate for the entire region (of area A) being sampled is

$$t_j = \frac{A}{\pi r^2} \sum_{i=1}^{n_j} y_{ji}. \tag{2.4}$$

Combining information from all N quadrats gives an overall estimate as

$$\hat{t} = \frac{1}{N} \sum_{j=1}^{N} t_j. \tag{2.5}$$
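Equations (2.4) and (2.5) amount to scaling each quadrat total by the ratio of the region area to the quadrat area and then averaging. A minimal Python sketch follows; the function name and the data (log volumes, quadrat radius, and region area) are hypothetical.

```python
from math import pi

def standup_estimate(plots, r, A):
    """Equations (2.4) and (2.5): scale each quadrat total by A / (pi r^2), then average."""
    t = [A / (pi * r ** 2) * sum(y_values) for y_values in plots]
    return sum(t) / len(t)

# Hypothetical data: volumes (m^3) of the fallen trees whose bases lie in each of
# four quadrats of radius 10 m, within a region of area 50 000 m^2.
plots = [[0.8, 1.1], [], [2.4], [0.5, 0.3, 0.9]]
print(round(standup_estimate(plots, r=10.0, A=50_000.0), 1))
```

A quadrat containing no qualifying trees contributes zero to the average but still counts towards N.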



2.4.4.2  The chainsaw method

If the tree trunk is judged to be straddling the plot boundary then the amount included is that which lies within the plot (having hypothetically been cut out of the entire tree using a chainsaw). Gove and van Deusen (2011) describe three possible protocols for judging whether a fallen tree straddles the plot boundary:

1. If any part of the trunk falls within the plot.
2. If any part of the trunk’s centre line (from top to bottom, through the core of the trunk) falls within the plot.
3. If any cross-section falls within the plot.

Of these, only the first is unbiased (Gove and van Deusen, 2011).

2.4.4.3  The sausage method

If any part of the trunk's centre line (from top to bottom, through the core of the trunk) falls within the plot, then the entire tree is included in the sample. For a trunk to be included, the centre of the quadrat must lie within r of the trunk's centre line. The resulting 'catchment region', illustrated in Figure 2.10, was likened to a sausage by Gove and van Deusen (2011).

[Figure 2.10 diagram, with labels 'LOG' and 'Catchment region']

Figure 2.10  The sausage method: the log is included in the sample if the quadrat centre lies within the catchment region. Quadrats centred on the edge of the catchment region touch the centre line of the log.

Gove and van Deusen (2011) discuss the relative merits of the three methods. They note that the sausage method is likely to measure more trees than the other methods, thereby causing more work, but giving less variable estimates. They note that a fallen tree that has broken into pieces will cause problems when using the stand-up or sausage methods. However, calculating the value of interest (e.g. the volume) is more difficult with the chainsaw method. An alternative approach to the measurement of CWD is discussed later in Section 5.4. A comprehensive review of alternative methods is provided by Russell et al. (2015).
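The inclusion rule for the sausage method can be sketched in Python as a point-to-segment distance test. This is an illustration only: it assumes that the trunk's centre line is a straight segment, and the function names and coordinates are hypothetical.

```python
import math

def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from the point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate log of zero length
        return math.hypot(px - ax, py - ay)
    # Position of the foot of the perpendicular, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def sausage_includes(centre, radius, base, top):
    """True if the quadrat centre lies within the log's catchment region."""
    return distance_to_segment(*centre, *base, *top) <= radius

def catchment_area(length, radius):
    """Area of the sausage: a rectangle plus two semicircular ends."""
    return 2.0 * radius * length + math.pi * radius ** 2

# Hypothetical log lying from (0, 0) to (8, 0), sampled with quadrats of radius 5 m.
print(sausage_includes((4.0, 4.0), 5.0, (0.0, 0.0), (8.0, 0.0)))   # centre beside the log
print(sausage_includes((12.0, 4.0), 5.0, (0.0, 0.0), (8.0, 0.0)))  # centre beyond the top end
```

Under the straight-line assumption the catchment area grows linearly with the length of the log, so longer logs are more likely to be sampled.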

2.5  Quadrats for estimating frequency

In the present context, frequency is defined as the proportion of equal-sized quadrats that contain the species of interest. As Figure 2.11 indicates, frequency should not be confused with quantity. Since, when measuring frequency, the exact number of individuals is not required, results are obtained speedily with excellent agreement to be expected between independent observers.

Using a grid of contiguous quadrats of an appropriate size, frequency can be a useful guide to the regions where the species is scarce or abundant. However, if the quadrats are too large, then the species may appear to be omnipresent, while, if the quadrats are too small, there will be much unnecessary recording. A good size for a quadrat is therefore one such that between 25% and 75% of quadrats contain the species of interest. If several species are of interest, and some are scarce, then it may be necessary to combine contiguous quadrats, or use nested quadrats of differing sizes (see Section 2.6). Note that comparisons involving frequency can only be valid if they are made using data from quadrats of identical size and shape.

[Figure 2.11 diagram: quadrats containing individuals labelled A and B]

Figure 2.11  Species A has a greater frequency (2/3) than species B (1/3), but there is a greater quantity of species B (3 individuals) than species A (2 individuals).

Frequency maps can be a fast and effective means for monitoring change in species occurrence, though there needs to be a clear definition of what is meant by 'occurrence'. For a plant, 'occurrence' might be defined as having roots within the quadrat (so-called root frequency); for a tree that might mean having its trunk in the quadrat; for an animal or a bird the equivalent would be that the creature has its home in the quadrat. To avoid double-counting, rules are required for organisms that straddle boundaries. For example, an organism may be counted if it straddles a 'south' or 'west' boundary, but not if it straddles the 'north' or 'east' boundaries.

A more liberal definition of occurrence in a quadrat simply requires that some part of the object occurs in the quadrat. In the context of shrubs, this is referred to as shoot frequency. For mobile creatures, the equivalent would be that the creature is simply observed in (or passing through) the quadrat. Whatever definition of frequency is used, the critical requirement is that it is clearly specified, and is the same for any comparisons across space and time.

If a species is present in r of the n quadrats, then the frequency is defined as r/n. If the intention is to use these results to estimate the frequency for the entire region sampled, then some idea of the uncertainty in the estimate will be required. Confidence intervals for proportions are not straightforward propositions (see Upton (2016) for details), and the most appropriate approximate 95% confidence interval appears to be that suggested by Agresti and Coull (1998), which is

$$\left( \tilde{p} - 2\sqrt{\frac{1}{n}\,\tilde{p}(1 - \tilde{p})},\ \ \tilde{p} + 2\sqrt{\frac{1}{n}\,\tilde{p}(1 - \tilde{p})} \right) \quad \text{where} \quad \tilde{p} = \frac{r + 2}{n + 4}. \tag{2.6}$$

Example 2.2: Californian trees (cont.)

Figure 2.12 once again shows the positions of the Douglas firs (Pseudotsuga menziesii) in the corner of the Californian tree plot. Since the species occurs in each of the 10 m × 10 m squares, quadrats of that size would be ineffective in monitoring the frequency of the species. The smaller 4 m × 4 m quadrats are much more effective: 37% of these include the species.

The approximate 95% confidence interval for the frequency in the entire Research Plot is

$$\frac{39}{104} \pm 2\sqrt{\frac{1}{100} \times \frac{39}{104} \times \frac{65}{104}} = 0.375 \pm 0.097 = (0.28, 0.47).$$

[Figure 2.12 plot: locations of the Douglas Firs on a 40 m × 40 m plot, with both axes marked from 0 to 40]

Figure 2.12  The locations of 65 Douglas Firs in one 40 m × 40 m corner of the Californian Research plot. Quadrats of side 10 m are ineffective for showing frequency, since Douglas Firs occur in each. The smaller quadrats, of side 4 m, are more appropriate. Douglas Firs occur in 37% of these.

2.5.1  Estimating abundance from frequency data

Suppose that N individuals are randomly distributed across a region that is subdivided into Q quadrats. The number occurring in a quadrat is therefore an observation from a Poisson distribution with mean N/Q. The probability of a quadrat containing at least one individual is $1 - e^{-N/Q}$. Suppose that q quadrats contain at least one individual. Then q/Q is an estimate of $1 - e^{-N/Q}$. Taking natural logarithms and rearranging gives an estimate of N as

$$-Q \ln(1 - q/Q).$$
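A minimal sketch of this argument, with illustrative function names: the first function gives the expected proportion of occupied quadrats under the Poisson model, and the second inverts it to estimate N from an observed count q. The values of q and Q are chosen purely for illustration.

```python
import math

def expected_occupied_fraction(n_individuals, n_quadrats):
    """Probability that a quadrat holds at least one individual: 1 - exp(-N/Q)."""
    return 1 - math.exp(-n_individuals / n_quadrats)

def poisson_estimate(q, n_quadrats):
    """Estimate of N obtained by inverting the expression above: -Q ln(1 - q/Q)."""
    if q >= n_quadrats:
        raise ValueError("the estimate is undefined when every quadrat is occupied")
    return -n_quadrats * math.log(1 - q / n_quadrats)

# Illustrative values: 37 occupied quadrats out of 100.
n_est = poisson_estimate(37, 100)
print(round(n_est, 1))  # 46.2
# Feeding the estimate back in recovers the observed proportion q/Q.
print(round(expected_occupied_fraction(n_est, 100), 2))  # 0.37
```

The guard on q reflects a practical limit of the approach: once every quadrat is occupied, occupancy carries no information about how many individuals are present.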



He and Gaston (2000) used a similar argument to arrive at the more accurate estimate:

$$\hat{N} = \frac{\ln(1 - q/Q)}{\ln(1 - 1/Q)}. \tag{2.7}$$

Acknowledging that randomness was uncommon, Yin and He (2014) proposed assessing the departures from randomness, by using a grid of quadrats, and counting the number of occasions on which adjacent quadrats were occupied. If individuals are randomly distributed in an a × b grid of quadrats so that q quadrats are occupied, then the probability of a pair of quadrats both being occupied is

$$P = \frac{q(q - 1)}{ab(ab - 1)}.$$

For an a × b grid of quadrats there are (2ab − a − b) possible neighbours (see Figure 2.13). Thus the expected number of neighbouring pairs (assuming a random distribution) is

$$E = P(2ab - a - b).$$

Denoting the observed number of neighbouring occupied quadrats by O, the ratio I, given by

$$I = O/E,$$

provides an indication of departures from randomness. A value for I less than 1 indicates some degree of clustering, whereas a value greater than 1 indicates some degree of regularity. Yin and He (2014) suggested adjusting the estimate $\hat{N}$ by taking account of the value of I and using

$$\tilde{N} = \begin{cases} \hat{N} / I^{\hat{N}^c} & \text{if } I < 1, \\ \hat{N}\, I^{\hat{N}^c} & \text{if } I > 1, \end{cases} \tag{2.8}$$

where c is a constant with a value close to zero. They suggested evaluating $\tilde{N}$ using both c = 0.01 and c = 0.1 to give an indication of the uncertainty in the estimate. Note that $\tilde{N}$ is never less than $\hat{N}$ and that the values chosen for c are based only on empirical evidence.

The following example suggests that the values obtained for $\hat{N}$ and $\tilde{N}$ should only be regarded as indications of the unobserved abundance.

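[Figure 2.13 diagram: a grid of quadrats with sides labelled a and b, each cell showing its number of neighbours]

2  3  3  3  3  2
3  4  4  4  4  3
3  4  4  4  4  3
3  4  4  4  4  3
3  4  4  4  4  3
2  3  3  3  3  2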

Figure 2.13  Entries are the numbers of neighbouring quadrats. Central quadrats each have four neighbours. Edge quadrats lose a neighbour. Corner quadrats lose 2 neighbours. Since each link joins two quadrats, there are (2ab – a – b) distinct links.
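The count of (2ab − a − b) links can be confirmed by listing them directly. The Python sketch below is an illustration under assumed names: it enumerates every pair of edge-sharing quadrats and then uses that list to count the occupied neighbouring pairs (the quantity O) in a hypothetical occupancy map.

```python
def neighbour_links(a, b):
    """All pairs of quadrats sharing an edge in a grid with a columns and b rows."""
    links = []
    for row in range(b):
        for col in range(a):
            if col + 1 < a:   # link to the quadrat on the right
                links.append(((row, col), (row, col + 1)))
            if row + 1 < b:   # link to the quadrat below
                links.append(((row, col), (row + 1, col)))
    return links

def occupied_pairs(occupied):
    """Number of links whose two quadrats are both occupied (the count O).

    occupied: list of rows, each a list of booleans.
    """
    b, a = len(occupied), len(occupied[0])
    return sum(1 for (r1, c1), (r2, c2) in neighbour_links(a, b)
               if occupied[r1][c1] and occupied[r2][c2])

print(len(neighbour_links(6, 6)))    # 60, which equals 2ab - a - b for a 6 x 6 grid
print(len(neighbour_links(10, 10)))  # 180

# Hypothetical 3 x 3 occupancy map.
demo = [[True, True, False],
        [False, True, False],
        [False, False, True]]
print(occupied_pairs(demo))          # 2
```

Each link is generated once, from its left-hand or upper quadrat, which avoids counting any pair twice.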


Example 2.3: Californian trees (cont.)

In Figure 2.12, a 10 × 10 grid of 4 m × 4 m quadrats was used, so a = b = 10 and there were 2ab − a − b = 180 neighbouring pairs of quadrats. There are q = 37 quadrats containing at least one Douglas Fir. Using Equation (2.7), the estimated number of Firs under the hypothesis of randomness is

$$\hat{N} = \frac{\ln(1 - 0.37)}{\ln(1 - 0.01)} = 46.0.$$

The value of P is 37 × 36/9900 = 0.1345, which gives E = 24.2. The observed number of pairs, O, is 25 which suggests very slight clustering. The ratio I = O/E is found to be 1.0322. Since I > 1, the revised estimate of the number of Douglas Firs is given by $\tilde{N} = \hat{N} I^{\hat{N}^c}$. Using c = 0.01 gives $\tilde{N}$ = 47.5, while the choice c = 0.1 gives the estimate 48.2. Since the true number is 65, for these data the method has fared poorly.

Suppose that the top and bottom halves of the illustrated region differ materially in habitat. With that knowledge each half should be separately examined. The upper half has 21 occupied quadrats with 12 occupied neighbouring pairs giving $\hat{N}$ = 27.0 and the range for $\tilde{N}$ as (33.0, 35.3). The true count was 30. The lower half has 16 occupied quadrats and 11 neighbouring pairs giving $\hat{N}$ = 19.1 and the range for $\tilde{N}$ as (25.4, 27.7). The true count was 35. In both cases there has been an appreciable underestimate.
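The calculations in this example can be retraced with a short Python sketch. The function name and its arguments are illustrative assumptions, and the two branches follow the reading of Equation (2.8) in which the estimate from Equation (2.7) is divided by I raised to the power N̂^c when I < 1 and multiplied by it otherwise.

```python
import math

def yin_he_adjusted(q, a, b, observed_pairs, c):
    """Return (N_hat, I, N_tilde) for an a x b grid with q occupied quadrats."""
    Q = a * b
    n_hat = math.log(1 - q / Q) / math.log(1 - 1 / Q)  # Equation (2.7)
    p_pair = q * (q - 1) / (Q * (Q - 1))               # P: both quadrats of a pair occupied
    expected = p_pair * (2 * a * b - a - b)            # E: expected occupied pairs
    ratio = observed_pairs / expected                  # I = O / E
    power = n_hat ** c
    if ratio < 1:
        n_tilde = n_hat / ratio ** power               # Equation (2.8), I < 1
    else:
        n_tilde = n_hat * ratio ** power               # Equation (2.8), I > 1 (equal at I = 1)
    return n_hat, ratio, n_tilde

# Whole plot of Example 2.3: q = 37, a = b = 10, O = 25.
for c in (0.01, 0.1):
    n_hat, ratio, n_tilde = yin_he_adjusted(37, 10, 10, 25, c)
    print(c, round(n_hat, 1), round(n_tilde, 1))

# Upper half of the plot, treated as a 10 x 5 grid: q = 21, O = 12.
for c in (0.01, 0.1):
    print(c, round(yin_he_adjusted(21, 10, 5, 12, c)[2], 1))
```

With these inputs the sketch gives values that agree, to the precision quoted, with the 46.0, 47.5 and 48.2 reported for the whole plot and with the range (33.0, 35.3) reported for the upper half.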

2.6  Nested quadrats

2.6.1  Nested quadrats for general use

When several species are of interest, with some being common and others scarce, it will be difficult to find a single quadrat size that works well for all species. One solution, suggested by Whittaker (1977), is to use a design that incorporates a variety of quadrat sizes.

Two possible rectangular designs based on a 20 m × 50 m plot are illustrated in Figure 2.14. The first design, used by the Carolina Vegetation Survey, is derived from that suggested by Peet, Wentworth, and White (1998). The second design, which is used in the United States by the National Institute of Invasive Species Science (NIISS), is the so-called Modified-Whittaker plot recommended by Stohlgren, Falkner, and Schell (1995). The plot should be oriented to maximise the vegetation gradient so as to collect the largest possible variety of species.

Figure 2.15 (a) shows a 100 m² design advocated by the Food and Agriculture Organization of the United Nations. Information on different aspects of the ecosystem is collected at different scales (1 m² for herbs, 25 m² for shrubs, and 100 m² for trees). Figure 2.15 (b) shows the implementation by Vicharnakorn et al. (2014) of a modified form of the design as part of larger 1600 m² quadrats.


Figure 2.14  Two arrangements for nested subplots within a 20 m × 50 m plot. (a) The design used by the Carolina Vegetation Survey. (b) The design recommended by the National Institute of Invasive Species Science.

[Figure 2.15 diagrams: panels (a) and (b); panel (a) marks nested squares of side 1 m, 5 m, and 10 m]

Figure 2.15  (a) An arrangement for nested subplots within a square plot of side 10 m suggested by the Food and Agriculture Organization of the United Nations. (b) An implementation of a modified form of (a) within 1600 m2 quadrats used by Vicharnakorn et al. (2014) for the remote sensing of an area bordering Thailand.


2.6.2  Nested quadrats for estimating frequency

Outhred (1984) suggested an alternative approach to the measurement of frequency using concentric square quadrats. The aim is to simultaneously obtain information on rare and common species. The design consists of concentric square quadrats, with areas that roughly double in magnitude, as illustrated in Figure 2.16.

Outhred suggested using two measures, that he termed frequency score and importance score. The frequency score is defined as the proportion of the regions in which a plant occurs. In the figure, since the plants occur in four of the seven regions, the frequency score is 4/7. The importance score is determined by noting the innermost region within which the plant occurs. In the figure the two central regions are devoid of plants, so the innermost plant occurs in the fifth region, counting from the outside. The importance score is therefore 5/7.

Morrison, Le Brocque, and Clarke (1995) carried out an extensive investigation of Outhred's suggestions. They concluded:

The importance-score method involves no more sampling effort than does standard qualitative (presence-absence) sampling, and it can therefore be used to sample a larger quadrat area than would normally be used for frequency sampling. This makes the method much more cost-effective as a means of estimating abundance, and it allows a greater number of the rarer species to be included in the sampling. The frequency-score method is more time-consuming, but it is capable of detecting more subtle community patterns. This means that it is particularly useful for the study of species-poor communities or where small variations in composition need to be detected.

Figure 2.16  A sampling design consisting of seven concentric square quadrats having areas of 1, 2, 5, 10, 20, 50, and 100 units. Twenty randomly placed plants are shown. Since they occur in four of the seven regions the frequency score is 4/7. Since the innermost plant is in the fifth region, starting from the outermost ring, the importance score is 5/7.
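The two scores can be computed from a simple record of which regions are occupied. The Python sketch below is illustrative: the function name is hypothetical, and because the text does not say which of the five outer regions in Figure 2.16 is the empty one, the pattern used here is an assumption.

```python
def outhred_scores(region_occupied):
    """Frequency score and importance score for one nested design.

    region_occupied: booleans ordered from the innermost region outwards,
    True where the species occurs in that region.
    """
    k = len(region_occupied)
    frequency_score = sum(region_occupied) / k
    importance_score = 0.0                      # species absent from every region
    for index, present in enumerate(region_occupied):
        if present:                             # innermost occupied region found
            importance_score = (k - index) / k  # position counted from the outside
            break
    return frequency_score, importance_score

# Two empty central regions and four of the seven regions occupied, as in Figure 2.16.
# The position of the empty outer region is assumed for illustration.
pattern = [False, False, True, True, False, True, True]
frequency_score, importance_score = outhred_scores(pattern)
print(frequency_score, importance_score)
```

Only the position of the innermost occupied region matters for the importance score, which is why it needs no more field effort than a presence-absence record.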

[Figure 2.17 plot: mean score (0.0 to 1.0) against density of points per unit area (logarithmic scale, 10⁻⁴ to 10¹), with curves for raw frequency, importance score, and frequency score]

Figure 2.17  The average frequency, importance score, and frequency score for randomly positioned plants shown as a function of the density of the plants. Whereas frequency is linearly related to density over a single order of magnitude, the two scores are linearly related over at least two orders of magnitude, making them better at distinguishing between the relative frequencies of different species.

Figure 2.17 illustrates the improved sensitivity achieved by using these scores for the case of randomly placed points. In the figure, the mean score (or mean frequency) is plotted against the true point density. For this example it is assumed that the seven concentric square quadrats have areas of 1, 2, 5, 10, 20, 50, and 100 units. All three measures have similarly shaped curves, but, crucially, the range of densities over which the measure lies in the range (0.1, 0.9) is much greater for the new measures. The ratio of the density corresponding to a score of 0.9, to that corresponding to a score of 0.1, is about 20 for frequency, but nearly 140 for the frequency score, and 180 for the importance score. In Figure 2.17 this is shown by the much steeper slope for the midsection of the raw frequency curve.
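These density ratios can be approximated without simulation if the plants are assumed to follow a Poisson process, so that a region of area a is empty with probability exp(−density × a). The Python sketch below rests on that assumption. It also uses the observation that the importance score equals the proportion of the seven nested squares that contain at least one plant. The function names and the bisection routine are illustrative.

```python
import math

AREAS = [1, 2, 5, 10, 20, 50, 100]  # cumulative areas of the nested squares

def raw_frequency(density, area=1.0):
    """Probability that a single quadrat of the given area is occupied."""
    return 1 - math.exp(-density * area)

def mean_frequency_score(density):
    """Expected proportion of the seven regions (central square and rings) occupied."""
    region_areas = [AREAS[0]] + [AREAS[i] - AREAS[i - 1] for i in range(1, len(AREAS))]
    return sum(1 - math.exp(-density * a) for a in region_areas) / len(AREAS)

def mean_importance_score(density):
    """Expected proportion of the nested squares that contain at least one plant."""
    return sum(1 - math.exp(-density * a) for a in AREAS) / len(AREAS)

def density_for(score_function, target, low=1e-8, high=1e3):
    """Bisection on a logarithmic scale for the density giving the target mean score."""
    for _ in range(200):
        mid = math.sqrt(low * high)
        if score_function(mid) < target:
            low = mid
        else:
            high = mid
    return math.sqrt(low * high)

measures = [("frequency", raw_frequency),
            ("frequency score", mean_frequency_score),
            ("importance score", mean_importance_score)]
ratios = {name: density_for(f, 0.9) / density_for(f, 0.1) for name, f in measures}
for name, ratio in ratios.items():
    print(name, round(ratio))
```

Under these assumptions the ratios come out at roughly 22, 140 and 180, in line with the figures quoted above. The ratio for raw frequency is ln(0.1)/ln(0.9) whatever the quadrat area.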

Advice on data collection

‘The key thing is to construct stringlines that define the diagonals of the nested squares, with appropriate markers along them, to mark the corners of the squares. We used a central metal pole, with stringlines extending from there to the four corners. The markers defined the corners of squares of 0.25, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500 and 1000 sq m. The users then need to identify the edges of each of the squares by eye, imagining the edges joining the equivalent stringline markers. If there are problems, then an extra stringline, placed successively along each edge of each nested square, should deal with it.’ (Morrison, pers. comm.)


2.7  Quadrats for estimating cover

Cover (also called coverage) is a measure of the influence of an organism. However, while there is a clear idea of what cover means, its definition may vary from one study to another, and its measurement will rarely be precise.

2.7.1  Alternative definitions of plant cover

Fehmi (2010) suggests that three definitions of plant cover are in frequent use, with different names being used by different authors. The definitions (with Fehmi’s descriptive names) are:

1. Aerial cover: Data refer only to species directly viewable from above. The maximum percentage for any one species is 100%, and the total coverage for detected species cannot exceed 100%. Cover deduced from satellite images is of this type. The total aerial cover may be described as the extent of foliar cover, or crown cover for the region. In sparsely vegetated areas it may be more appropriately termed ground cover.
2. Species cover: For each species this is the proportion of the region covered by that species. The maximum for any one species is 100%, but, since species may overlap, the total species coverage may exceed 100%.
3. Leaf cover: When point quadrats are used and every distinct overlap of a sampling point is separately counted, the value calculated for a single species may exceed 100%.

While aerial cover may be relatively easy to determine from high quality satellite images, it is not so easy to determine from the ground, where an observer will be viewing tree tops at an angle. The proportion of the sky that is visible to a stationary observer may be termed canopy closure (Jennings, Brown, and Sheil, 1999).

The term canopy cover has been used with each of the above definitions, and for a variant of aerial cover that uses circumscribing circles or polygons to represent individual plants. This latter definition leads to greater coverage values than those for the aerial cover described above. The UK Forestry Commission uses the term net canopy cover for measurements to the drip line of an individual tree. It uses gross canopy cover to include the additional regions between the canopies of neighbouring trees together with small glades.
To understand any set of coverage values the ‘small print’ must therefore be studied to determine exactly what has been measured, particularly when comparing data from independent sources. Basal cover is the proportion of ground area occupied by plant bases (for example, tree trunks). Since this varies little from year to year, it is useful when looking at time series. It is also used in judging the relative importance of the species present.
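The difference between aerial, species, and leaf cover can be made concrete with point-quadrat data. The following sketch assumes that each sampling point (pin) records its contacts from the top of the canopy downwards; the data, species labels, and function name are hypothetical and serve only to show how the three totals can differ.

```python
from collections import Counter


def cover_measures(pins):
    """Return (aerial, species, leaf) cover percentages per species.

    `pins` is a list of sampling points; each point is a list of the
    species contacted, ordered from the topmost contact downwards.
    """
    n = len(pins)
    aerial, species, leaf = Counter(), Counter(), Counter()
    for contacts in pins:
        if contacts:
            aerial[contacts[0]] += 1      # only the species seen from above
        for sp in set(contacts):
            species[sp] += 1              # species present at this point
        for sp in contacts:
            leaf[sp] += 1                 # every distinct contact counted

    def to_percent(counts):
        return {sp: 100.0 * k / n for sp, k in counts.items()}

    return to_percent(aerial), to_percent(species), to_percent(leaf)


# Hypothetical contacts at four pins.
pins = [
    ["fern", "fern", "grass"],
    ["fern", "fern"],
    ["grass"],
    ["fern", "fern", "grass"],
]
aerial, species, leaf = cover_measures(pins)
print(aerial)   # totals cannot exceed 100%
print(species)  # totals may exceed 100%
print(leaf)     # a single species may exceed 100%
```

With these made-up data the aerial cover values sum to 100%, the species cover values sum to 150%, and the leaf cover for the fern alone is 150%.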

Example 2.4: Alaskan shrubs

As part of a NASA-funded research project to map changes in the abundance of tall shrubs (heights greater than 0.5 m) in Arctic tundra, a field campaign was carried out in 2010 and 2011 on the North Slope of Alaska. Details are given by Duchesne, Chopping, and Tape (2016). The aim was to match ground truth with available high-resolution panchromatic satellite imagery, in order to arrive at an estimate of the overall abundance of shrubs in that area. This example concentrates on a single 250 m × 250 m site (number 5) bordering the Chandler River.


The data, which are downloadable from the Distributed Active Archive Center for Biogeochemical Dynamics,³ include crown height (height from ground to topmost branch), crown radius (half the distance from the leftmost branch to the rightmost branch as encountered by the observer), and the spatial co-ordinates determined by GPS. Canopy cover is estimated as the area of a circle of the calculated crown radius.

The sampling protocol consisted of surveying five parallel equi-spaced transects, 5 m wide, spanning the 250 m square plot. Figure 2.18 shows that, given the practical difficulties, the survey team did an excellent job at following the protocol: the dotted lines indicate the apparent positions of the transects (deduced from the recorded shrub positions). The figure illustrates the locations of 99 shrubs (all alders) at 80 distinct locations (in cases where shrubs were very close together a single GPS reading was used for all), with the crown dimensions indicated. Cases of apparently off-transect shrubs can be attributed to unreliable GPS readings. The figure suggests that the shrub density varies considerably across the plot.

The observed canopy cover proportions (truncating radii to 2.5 m for shrubs wider than the 5 m transect) for the five transects are 0.1075, 0.0817, 0.0471, 0.0555, and 0.0297, giving $\bar{c} = 0.064$, $s^2 = 0.000934$, and an approximate confidence interval for the cover for this plot of (0.037, 0.092).
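The summary statistics for the five transects can be reproduced as follows. The text does not say how the interval was formed; the sketch assumes the mean plus or minus two standard errors, which matches the quoted limits (an interval based on the t distribution with four degrees of freedom would be wider). The function name is hypothetical.

```python
import math
import statistics

# Canopy cover proportions for the five transects, as given in the text.
transect_cover = [0.1075, 0.0817, 0.0471, 0.0555, 0.0297]


def cover_interval(values, multiplier=2.0):
    """Mean, sample variance, and mean +/- multiplier * standard error.

    The multiplier of 2 is an assumption chosen to match the quoted interval.
    """
    n = len(values)
    mean = statistics.mean(values)
    s2 = statistics.variance(values)          # divisor n - 1
    half_width = multiplier * math.sqrt(s2 / n)
    return mean, s2, (mean - half_width, mean + half_width)


mean, s2, (lower, upper) = cover_interval(transect_cover)
print(f"mean = {mean:.3f}, s2 = {s2:.6f}, interval = ({lower:.3f}, {upper:.3f})")
```

Because only five transects were surveyed, the interval is best read as a rough indication of uncertainty rather than an exact 95% statement.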


Figure 2.18  The recorded locations of Alders (Alnus) in five transects forming part of a 250 m × 250 m site bordering the Chandler River in the North Slope of Alaska. The circles reflect the shrub crown radii with concentric circles indicating cases where a single GPS reading was assigned to several shrubs. The dotted lines indicate the approximate transect paths deduced from the recorded data.


How the data were obtained

To establish the positions of the shrubs, the observers walked along the centre line of each transect carrying a 5-metre rod. Shrubs were included if they intersected the rod and had their base within the transect. When a shrub was encountered, the shrub was measured and photographs were taken both of the shrub and of the GPS record. The shrub radius was measured as half the distance from the leftmost branch to the rightmost branch as encountered by the observers. The canopy height was taken to be the height from ground level to the topmost branch. At the time of the survey the GPS records were known to be occasionally grossly misleading, but generally accurate to within 10 m. Given the expected GPS accuracy, when several shrubs were close together, a single GPS record was used.

2.7.2  Correcting edge effects for cover assessment

When the plants providing cover are relatively large compared to the study region, the rules concerning which plants should be included can make a considerable difference to the cover assessment. Two possible rules are:

1. Consider only plants lying wholly in the quadrat.
2. Consider both plants lying wholly within the quadrat and also those (of any size) that overlap the quadrat.

In Figure 2.19, which illustrates these rules, the dotted region indicates the region within which the quadrat centres must lie. For Rule 1 (cases A and B in the figure) larger plants are disadvantaged relative to smaller plants because there is a smaller region within which the centres of the large plants can lie. For Rule 2 (cases C and D) the reverse is true, with large plants being easier to detect than small plants.

Figure 2.19  In each case quadrat centres must lie within the dotted region. If, to be recorded, plants must lie wholly within a quadrat, then larger plants are disadvantaged (compare cases A and B). If plants overlapping the edge are included, then smaller plants are disadvantaged (compare cases C and D).

Correction requires a scaling of the area of the dotted regions in Figure 2.19 to that of the original quadrat. Suppose that the quadrat is rectangular with sides of length $a$ and $b$ and that plant $i$ may be approximated by a circle with diameter $d_i$. Then plant $i$ should be assigned weight $w_i$, where, for Rule 1:

$$w_i = \frac{ab}{(a - d_i)(b - d_i)}, \tag{2.9}$$

and, for Rule 2:

$$w_i = \frac{4ab}{4(a + d_i)(b + d_i) - (4 - \pi)d_i^2}. \tag{2.10}$$

With $n$ circular plants detected in a quadrat the proportion of the quadrat covered by plants is estimated by

$$p = \frac{1}{4ab}\sum_{i=1}^{n} w_i \pi d_i^2. \tag{2.11}$$
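One way to see where the weights in Equations (2.9) and (2.10) come from is to write each weight as the quadrat area $ab$ divided by the area of the region in which the centre of plant $i$ may lie while the plant is still recorded. The working below is a sketch under that reading of Figure 2.19; the symbols $A_1$ and $A_2$ are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
% Rule 1: the circle lies wholly inside the quadrat, so its centre must be
% at least d_i/2 from every edge.
A_1 &= (a - d_i)(b - d_i), & w_i &= \frac{ab}{A_1}, \\
% Rule 2: the circle overlaps the quadrat, so its centre lies within d_i/2
% of the rectangle: the rectangle, four edge strips, and four quarter-circles.
A_2 &= ab + (a + b)d_i + \frac{\pi d_i^2}{4}
     = \frac{4(a + d_i)(b + d_i) - (4 - \pi)d_i^2}{4}, & w_i &= \frac{ab}{A_2}.
\end{align*}
```

Expanding $4(a + d_i)(b + d_i) = 4ab + 4(a + b)d_i + 4d_i^2$ and subtracting $(4 - \pi)d_i^2$ confirms that the two expressions for $A_2$ agree.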



Advice on data collection

For an irregularly shaped plant that is roughly circular (for example, the crown of a tree), an appropriate value for d may be the average of the length and the breadth, where the length is defined as the maximum diameter of the object and the breadth is the length of the diameter at right angles to the length.

Example 2.5: Bahamian coral

Table 2.1 uses data derived from the online database⁴ made available by AGRRA (Atlantic and Gulf Rapid Reef Assessment). I am grateful to their contributors and data managers for making their extensive data available. The data in the table refer to a transect, 1 m wide and 10 m long, traversed on 29 April 2011, at Anguilla Cays (part of the Cay Sal Bank in the Bahamas). There were six corals noted with maximum diameters (lengths) of at least 4 cm (the minimum recordable size). The coral measurements and associated cover calculations are given in the table.

The total effective coral cover is 1565 cm² in the study region of 100,000 cm²: thus coral is estimated to cover less than 2% of the total area at this location.

Table 2.1  Corals of at least 4 cm in diameter recorded (using Rule 2) on a 1 m × 10 m Bahamian transect.

                                                                                       Total
Length (cm)                          6       8       15      16      18      50
Breadth (at 90° to the length; cm)   3       6       11      12      18      40
Estimated value for d (cm)           4.5     7       13      14      18      45
Weight, w                            0.9527  0.9282  0.8739  0.8654  0.8330  0.6619
Weighted cover = wπd²/4 (cm²)        15      36      116     133     212     1053      1565

4 http://www.agrra.org/data-explorer/
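The entries of Table 2.1 can be recomputed from Equations (2.10) and (2.11). In this sketch the transect is taken as a rectangle with a = 100 cm and b = 1000 cm, and d is the average of length and breadth as advised above; the function names are hypothetical.

```python
import math


def rule1_weight(a, b, d):
    """Equation (2.9): only plants lying wholly within the quadrat count."""
    return a * b / ((a - d) * (b - d))


def rule2_weight(a, b, d):
    """Equation (2.10): plants overlapping the quadrat edge also count."""
    return 4 * a * b / (4 * (a + d) * (b + d) - (4 - math.pi) * d ** 2)


def estimated_cover(a, b, diameters, weight_fn):
    """Weighted cover of each plant and the proportion in Equation (2.11)."""
    weighted = [weight_fn(a, b, d) * math.pi * d ** 2 / 4 for d in diameters]
    return weighted, sum(weighted) / (a * b)


# Coral measurements from Table 2.1 (cm).
lengths = [6, 8, 15, 16, 18, 50]
breadths = [3, 6, 11, 12, 18, 40]
diameters = [(l + w) / 2 for l, w in zip(lengths, breadths)]

# Transect 1 m wide and 10 m long, expressed in cm.
weighted, proportion = estimated_cover(100, 1000, diameters, rule2_weight)

for d, c in zip(diameters, weighted):
    print(f"d = {d:>4}: w = {rule2_weight(100, 1000, d):.4f}, weighted cover = {c:.0f}")
print(f"total = {sum(weighted):.0f} cm2, proportion = {proportion:.4f}")
```

Passing the weight function as an argument means that the same calculation can be rerun with `rule1_weight` when only plants lying wholly within the quadrat have been recorded.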


2.7.2.1  Size-frequency distribution (SFD) Zvuloni et al. (2008) note that ‘Many areas of ecological research aim toward characterizing the size-frequency distribution (SFD) of populations to assess change across space and across time.’ If some individuals are large relative to the size of a quadrat, then one of Equations (2.9) and (2.10) should be used to correct the apparent size-frequency distribution for biases due to edge effects.

2.7.3  The visual assessment of cover

Visual assessment is unreliable, since observers will inevitably report round numbers (10%, 20%, …) or simple fractions (25%, 33%, …) rather than amounts such as 7% or 26%.⁵ Another problem is that the totals of visual assessments of the cover provided by individual species may well exceed 100%. An example of these difficulties follows.

Example 2.6: Dartmoor quadrats

Kent and Coker (1992, Table 3.4) report the assessments of ground cover made by students for 25 quadrats in Dartmoor. Table 2.2 summarizes the totals reported. The most extreme result was a total of 195%. For that quadrat, according to the students, Pteridium aquilinum (a fern) covered nearly the entire quadrat while Festuca ovina (a grass) and Galium saxatile (Heath bedstraw) each covered about half the quadrat. The counts of the cover percentages reported for individual species are illustrated in Figure 2.20. The strong preference for 5%, 10%, and 20% is evident.

Table 2.2  Totals of the cover assessments made by students for 25 Dartmoor quadrats.

Cover total       100–119  120–139  140–159  160–179  180–199
No. of quadrats   5        9        7        2        2

[Figure 2.20 plot: number of reports (0 to 30) against cover (percentage, 0 to 100).]

Figure 2.20  The cover percentages reported for the individual species observed in 25 Dartmoor quadrats.


The use of cover scales for plants

In the field, accurate cover measurements are well-nigh impossible, and any attempt at great accuracy would be very time-consuming. For these reasons, in 1885, the Finnish botanist Ragnar Hult suggested using a simple five-category geometric scale, now referred to as the Hult-Sernander-Du Rietz cover scale. A scale providing more detail for dominant plants is due to Daubenmire (1959), who suggested using a portable rectangular quadrat with sides of lengths 20 cm and 50 cm, coloured so as to make the assessment of coverage using his six-point scale particularly quick and easy. Another alternative, which places more emphasis on the rarer species, is the New Zealand scale, which is a slight modification of the Braun-Blanquet scale given later in Table 2.5. The three cover scales are summarized in Table 2.3.

Cover abundance is sometimes described using the ACFOR scale, where A, C, F, O, and R are shorthand for Abundant, Common, Frequent, Occasional, and Rare. The descriptions may be used either informally, or may be guided by a scale such as that given in Table 2.4 (in which case an organism accounting for more than 75% of cover might be described as Dominant).

Braun-Blanquet (1928) and Domin (1928) independently suggested scales that measured plant abundance using a combination of cover and richness. Both scales have subsequently been modified to provide finer details, especially for rarer species, with the modifications being due to Barkman, Doing, and Segal (1964) and Krajina (1933), respectively. These scales are given in Table 2.5, along with the simpler North Carolina scale suggested by Peet, Wentworth, and White (1998).

For the visual estimate of cover to be reasonably accurate, a rectangular quadrat should not be too long (since that would make estimation of a proportion difficult), nor too narrow (since then the proportion of plants overlapping an edge might be large).
Cover may be regarded as a surrogate for biomass, since the direct measurement of biomass is time-consuming and, by definition, destroys the plant being measured. Chiarucci et al. (1999) compare results from measurements of cover and biomass; it appears that cover is generally an effective substitute.

Table 2.3  Alternative cover scales.

Hult-Sernander-Du Rietz cover scale
Scale                    1     2     3    4    5
Cover: upper limit (%)   6.25  12.5  25   50   100

Daubenmire cover scale
Scale                    1     2     3    4    5    6
Cover: upper limit (%)   5     25    50   75   95   100

New Zealand cover scale
Scale                    1     2     3    4    5    6
Cover: upper limit (%)   1     5     25   50   75   100

Table 2.4  The ACFOR scale.

Description              Abundant  Common  Frequent  Occasional  Rare
Cover: upper limit (%)   75        50      25        5           1

Table 2.5  Alternative cover-abundance scales.

Braun-Blanquet cover-abundance scale
Scale                    r    +    1    2    3    4    5
Cover: upper limit (%)   5    5    5    25   50   75   100
Number of individuals    1    F    N    -    -    -    -

Extended Braun-Blanquet cover-abundance scale
Scale                    r    +    1    2m   2a    2b   3    4    5
Cover: upper limit (%)   5    5    5    5    12.5  25   50   75   100
Number of individuals    1–3  F    A    VA   -     -    -    -    -

Domin cover-abundance scale
Scale                    1    2    3    4    5    6    7    8    9    10
Cover: upper limit (%)   4    4    4    10   25   33   50   75   90   100
Number of individuals    F    S    M    -    -    -    -    -    -    -

Domin-Krajina cover-abundance scale
Scale                    +    1    2    3    4    5    6    7    8    9    10
Cover: upper limit (%)   1    1    1    5    10   25   33   50   75   99   ≈ 100
Number of individuals    1    F    N    -    -    -    -    -    -    -    -

North Carolina cover-abundance scale
Scale                    1    2    3    4    5    6    7    8    9    10
Cover: upper limit (%)   -    1    2    5    10   25   50   75   95   100
Number of individuals    F    -    -    -    -    -    -    -    -    -

Key: F = Few; N = Numerous; A = Abundant; VA = Very abundant; S = Several; M = Many; a hyphen marks an empty cell

Example 2.7: Dartmoor quadrats (cont.)

It might be thought that using a cover scale rather than a visual estimate would greatly reduce accuracy. However, a study by Damgaard (2014) found that ‘the rather rough Braun-Blanquet sampling procedure provided cover estimates that were comparable in accuracy’ to other sampling methods. Given their similarity, the same would be true for other cover scales. Figure 2.21 demonstrates this for the Daubenmire scale applied to the Dartmoor data where the observed percentages have been replaced by the midpoints of the Daubenmire classes (2.5%, 15%, …, 97.5%).

[Figure 2.21 plot: total of Daubenmire mid-points (0 to 300) against total of observed cover values (0 to 300).]

Figure 2.21  The total Dartmoor cover percentages for the 23 commonest species plotted against the estimates that would have been obtained if the cover observations had been recorded using the Daubenmire cover scale.
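The replacement of observed percentages by Daubenmire class midpoints, as used for Figure 2.21, can be sketched as follows. The class limits are those of the Daubenmire scale in Table 2.3; treating each upper limit as belonging to the lower class is an assumption, the function names are hypothetical, and the example percentages are invented rather than taken from the Dartmoor data.

```python
from bisect import bisect_left

# Upper limits (%) of the six Daubenmire classes, from Table 2.3.
DAUBENMIRE_UPPER = [5, 25, 50, 75, 95, 100]


def daubenmire_class(cover_pct):
    """Return the Daubenmire class (1 to 6) for a cover percentage.

    Assumption: a value equal to an upper limit falls in the lower class.
    """
    if not 0 < cover_pct <= 100:
        raise ValueError("cover must be greater than 0 and at most 100")
    return bisect_left(DAUBENMIRE_UPPER, cover_pct) + 1


def daubenmire_midpoint(cover_pct):
    """Return the midpoint (%) of the class containing the cover value."""
    k = daubenmire_class(cover_pct)
    lower = 0 if k == 1 else DAUBENMIRE_UPPER[k - 2]
    return (lower + DAUBENMIRE_UPPER[k - 1]) / 2


# Hypothetical visual estimates for the species in one quadrat.
observed = [90, 50, 45, 10]
midpoints = [daubenmire_midpoint(c) for c in observed]
print(sum(observed), sum(midpoints))
```

Summing the midpoints over the species in a quadrat gives the kind of total plotted on the vertical axis of Figure 2.21.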

2.7.4  The SACFORL cover scale

The plant cover scales given in Tables 2.3 to 2.5 take no account of the size of individual organisms, since they refer only to total cover. A more sensitive scale is provided by the Marine Biological Association of the UK. Their scale, used since 1990, takes account of the sizes of individual specimens (or their colonies). An extract is given in Table 2.6.

Table 2.6  Extract from the abundance scales proposed by the Marine Nature Conservation Review for both littoral and sublittoral taxa from 1990 onwards.

                                       Size of individuals (or colonies)
% cover                                <1 cm  1–3 cm  3–15 cm  >15 cm   Density per m²
≥ 80                                   S      -       -        -        ≥ 10,000
40–79                                  A      S       -        -        1000–9999
20–39                                  C      A       S        -        100–999
10–19                                  F      C       A        S        10–99
5–9                                    O      F       C        A        1–9
1–5                                    R      O       F        C        0.1–0.9
< 1 and density 0.01–0.09 per m²       L      R       O        F        0.01–0.09
< 1 and density 0.001–0.009 per m²     -      L       R        O        0.001–0.009
< 1 and density 0.0001–0.0009 per m²   -      -       L        R        0.0001–0.0009
< 1 and density < 0.0001 per m²        -      -       -        L        < 0.0001

Key: S = Superabundant; A = Abundant; C = Common; F = Frequent; O = Occasional; R = Rare; L = Less than rare; a hyphen marks an empty cell

Since the plants are distributed at random, the probability of there being no plants in a circle of radius $x$, $P(X > x)$, is the probability of an observation of zero from a Poisson distribution with mean $\rho\pi x^2$. This probability is $\exp(-\rho\pi x^2)$. The corresponding probability density function is

$$f(x) = 2\rho\pi x \exp(-\rho\pi x^2),$$

and the expected values of $X$ and $X^2$ are respectively

$$E(X) = 1/(2\sqrt{\rho}), \tag{4.1}$$

and

$$E(X^2) = 1/(\rho\pi). \tag{4.2}$$

Using $n$ sampling points, and denoting the distance from sampling point $i$ to its nearest plant by $x_i$ ($i = 1, 2, \ldots, n$), $E(X)$ is estimated by $\sum_{i=1}^{n} x_i/n$. Substitution in Equation (4.1), and rearrangement, leads to the estimate of $\rho$ suggested by Clark and Evans (1954):

$$\hat{\rho}_x = \frac{n^2}{4\left(\sum_{i=1}^{n} x_i\right)^2}. \tag{4.3}$$

An alternative is to use $\sum_{i=1}^{n} x_i^2/n$ as an estimate of $E(X^2)$. This suggests estimating $\rho$ by $n/\left(\pi\sum_{i=1}^{n} x_i^2\right)$. Moore (1954) showed that the bias in this estimator is easily corrected by multiplying it by $(n - 1)/n$ to give:

$$\hat{\rho}_{x2} = \frac{n - 1}{\pi\sum_{i=1}^{n} x_i^2}, \tag{4.4}$$

with (for a random plant pattern) variance equal to $(\hat{\rho}_{x2})^2/(n - 2)$.
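The two nearest-plant density estimators, Equations (4.3) and (4.4), translate directly into code. The sketch below assumes that the distances are measured from randomly located sampling points to the nearest plant; the function names and the example distances are hypothetical.

```python
import math


def clark_evans_density(distances):
    """Equation (4.3): density estimate based on the mean nearest-plant distance."""
    n = len(distances)
    return n ** 2 / (4 * sum(distances) ** 2)


def moore_density(distances):
    """Equation (4.4): bias-corrected estimate based on squared distances.

    Returns the estimate and its variance under a random plant pattern.
    """
    n = len(distances)
    if n < 3:
        raise ValueError("at least three distances are needed for the variance")
    rho = (n - 1) / (math.pi * sum(x * x for x in distances))
    variance = rho ** 2 / (n - 2)
    return rho, variance


# Hypothetical point-to-nearest-plant distances (in metres).
distances = [0.42, 0.61, 0.18, 0.95, 0.33, 0.57]
rho_x = clark_evans_density(distances)
rho_x2, var_x2 = moore_density(distances)
print(f"Clark-Evans: {rho_x:.3f} plants per m2")
print(f"Moore: {rho_x2:.3f} plants per m2 (s.e. {math.sqrt(var_x2):.3f})")
```

Both functions rest on the assumption of a random plant pattern made in the derivation.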
A lower more Theusual previous results assume thatplaced the plants are at random. Aareas moreof The previous results assume that theresults plants assume are atthe random. A more The previous that plants are placed at A more 2 random. Theresults previous results assume that the plants are placed at random. A more usual case The previous assume that theplant plants are placed at random. A ( more than average density, and small areas of higher than average density. Any than average density, and small areas of higher than average density. Any with (for a random pattern) variance equal to ρ ) /(n − 2) . usual case is that plants occur in clusters, so that there are large areas of lower usual case is that plants occur in clusters, so that there are large areas of lower x2 usual case is that plantsusual occurcase in clusters, so that thereinare large areas of there lower are large areas of lower isin that plants occur clusters, so that usual caseisisthat thatplants plants occur in clusters, so that there are large areas of lower arrangement of sampling points will therefore usually result in the majority occur clusters, so that there are large areas of lower than average density, than average density, and small areas of higher than average density. Any arrangement of sampling points will therefore result in density. the majority The previous results assume that the plants are placed at random. A more than average density, and small areas of higher thanusually average density. Any than average density, and small areas of higher than average density. Any than average density, small areas of higher than average Any than average density, and small areas ofand higher than average density. Any of sampling points falling in the regions of low density. With clustered plants and small areas of higher than average density. Any arrangement of sampling points of sampling points falling in the regions of low density. 
With clustered plants usual case is that plants occur in clusters, so that there are large areas of lower arrangement of sampling points will therefore usually result in the majority of sampling pointsusually will therefore result in the in majority arrangement of arrangement sampling points willoftherefore result inusually the majority arrangement sampling points will therefore usually result the majority arrangement of sampling points will therefore usually result in the majority the result is points therefore likely to beregions anlow underestimate of clustered plant density. will therefore usually result in the majority of sampling points falling in the regions of the result is therefore likely to be an underestimate of plant density. than average density, and small areas of higher than average density. Any of sampling falling in the of low density. With clustered plants of sampling points falling in the regions of density. With plants of sampling points falling in the regions offalling low density. With clustered plants With clustered plants sampling points in density. the regions of clustered low density. of sampling points of falling in the regions oflikely low plants the result is therefore toofunderestimate be anWith underestimate ofdensity. plant density. arrangement of sampling points will therefore usually result in the majority thedensity. result is therefore likely to be an of plant low With clustered plants the result is therefore likely to be an under-estimate the result is therefore likely to be an underestimate plant density. the likely result to is be therefore likely to be of anplant underestimate of plant density. the result is therefore an underestimate of sampling points falling in the regions ofdensity. low density. With clustered plants of plant density. k th nearest plant 4.4 Using the distance to the th nearest plant 4.4 Using the distance theankunderestimate the result is therefore likely to to be x of plant density. 
x plant kth thxnearest nearest 4.4 to Using the distance to thenearest kth plant Using the to theplant k th distance nearest 4.4 Using the4.4 distance thedistance k plant 4.4 Using the to the nearest 4.4 Using the distance the Equation to (4.4) is k special case plant of aa general general class class of of estimators estimators suggested suggested by by Equation (4.4) is aa th special case of 4.4  Using the distance to the kth nearest plant Morisita (1957): k th nearest plant 4.4 Using the distance to the Morisita (1957): Equation (4.4) is a special case of a general class of estimators suggested by   Equation (4.4) is a special case of a general class of estimators suggested by Equation (4.4) is a special case of a general class case of estimators estimators suggested (4.4) special of estimators a general class ofby suggested by n Equation (4.4) is aEquation special case ofisa ageneral class of suggested by n  Morisita (1957):  Morisita (1957): Equation (4.4) is a special case of a general class of estimators suggested by Morisita 2   Morisita (1957): (1957): (k) (4.5) = (kn (kn − 1) 1)  π nn x x2(k)i  ,,   ρρ Morisita (1957): Morisita (4.5) − π n n(k) =case (k)i  special of a  general of estimators suggested by nclass  (1957): Equation (4.4) is a 2  n 2 i=1 x 2 i=1 2= 1) ρ(kn (k) (4.5) (kn 1) (k) , 2(k)i (4.5) (4.5) − π xπ (k)i ρ(k) , − 1) (4.5) = (kn (1957): − 1) ρ(k) π= ρ x Morisita (k)i x  ,, = (kn (k) π  (k)i ρ(k) x=(k)i , i=1 (4.5) (knis−the 1) distance π (k)i x2(k)i− sampling i=1 where from point i to its kth nearest plant n i=1 where x(k)i is the distance from sampling point i to its kth nearest plant  i=1 i=1 written as x i=1 2 is(k)now now (Figure 4.3). Thus xiimethods Figure Some for estimating x(2) ,plant the  (4.5) ρ =sampling (kn − 1) π ipoint (4.5) (1)i is written as xnearest .. toxplant (Figure 4.3). 
Thus where x4.3 is the x distance from sampling to,density itsnearest kth use nearest (k)isampling (k)i where x(k)i is the distance from point its plant (1)i (k)i where x(k)i is the distance from point i to its kth plant where x is the distance from sampling point ii kth to its kth nearest plant (k)i where x(k)i is the distance from sampling point i to its kth nearest plant distance to the second nearest neighbour, x , the distance to the third i=1 (3) is now written as x . (Figure 4.3). Thus x i . written as x(1)i . (1)i is now (Figure 4.3). Thus x i (1)i i now written as x (Figure 4.3). Thus xi is(Figure written as x(1)i . 4.3). Thus(1)i xand i xis now written . on. (Figure 4.3). Thusnearest xi is now (1)iso where xneighbour, is the asdistance from sampling point i to its kth nearest plant (2)

(1)

(3)

where x (k)i is the 4.3). distance point i to its kth nearest plant (Figure 4.3). Thus now written as x (Figure Thusfrom xi issampling (1)i . x i is now written as x (1)i. For a random plant pattern, the variance this estimator For a random plant pattern, the variance of thisofestimator is is (k)i





ρ(k)

2

/(kn − 2) .


Figure 4.3  Some methods for estimating plant density use $x_{(2)}$, the distance to the second nearest neighbour, $x_{(3)}$, the distance to the third nearest neighbour, and so on.

Pollard (1971) noted that the k in the denominator implies that the variance of the estimate is approximately halved by using the second nearest tree rather than the nearest, with further reductions gained by using larger values of k. Using $x_{(k)}$ is often referred to as k-tree sampling. It is particularly convenient when the experimenter needs to record several characteristics of the ‘plants’ under investigation (e.g. species, size, etc.) and not simply their location. If fixed-radius circular quadrats were used instead, then the radius might be too large in a clumped region or too small in a sparse region. In either case the experimenter might feel that resources were being wasted.

To investigate the accuracy of the various distance methods, 1000 patterns of each of the four types were generated within a square study region, with each containing 500 plants. A regular grid of 25 sampling points was used (as illustrated in Figure 4.1). Figure 4.4 summarizes the results using box-whisker plots (Section 1.2.3). Here a logarithmic scale is used, so that 0.5 corresponds to an estimate of 250, 1 corresponds to an estimate of 500, 2 to an estimate of 1000, etc. The ideal estimate would therefore be represented by a narrow box with centre 1 and with short whiskers.

For random patterns, all seven estimators are unbiased (i.e. they are centred on 1), but with variability that decreases steadily as k increases (ever narrower boxes and shorter whiskers). Prodan (1968) advocated the use of the distance to the sixth nearest plant. The results summarized in Figure 4.4 support this suggestion, since the boxplot for k = 6 is close to ideal. If the choice of k = 6 is impractically large, then the largest practical value should be used.

For regular patterns, both $\hat{\rho}_x$ and $\hat{\rho}_{x2}$ generally overestimate the number of plants, but, as k increases, so the extent of overestimation by $\hat{\rho}_{(k)}$ reduces. For patterns with a gradient, and for clustered patterns, the reverse occurs, with considerable under-estimation by $\hat{\rho}_x$ and by $\hat{\rho}_{(k)}$ for low values of k.
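The random-pattern part of this experiment can be imitated with a small simulation. The sketch below is an assumption-laden illustration rather than the procedure used for Figure 4.4: it uses a unit square, an evenly spaced 5 × 5 grid whose exact placement is a guess, fewer replicates than the 1000 described above, and hypothetical function names.

```python
import math
import random


def kth_nearest_distance(point, plants, k):
    """Distance from a sampling point to its kth nearest plant."""
    px, py = point
    dists = sorted(math.hypot(px - x, py - y) for x, y in plants)
    return dists[k - 1]


def rho_k(xk, k):
    """Equation (4.5)."""
    n = len(xk)
    return (k * n - 1) / (math.pi * sum(d * d for d in xk))


def simulate(k, n_plants=500, grid=5, reps=60, seed=1):
    """Sorted ratios (estimate / true count) for random patterns in a unit square."""
    rng = random.Random(seed)
    # Assumed layout: a regular grid of sampling points kept away from the edges.
    points = [((i + 1) / (grid + 1), (j + 1) / (grid + 1))
              for i in range(grid) for j in range(grid)]
    ratios = []
    for _ in range(reps):
        plants = [(rng.random(), rng.random()) for _ in range(n_plants)]
        xk = [kth_nearest_distance(p, plants, k) for p in points]
        # The region has unit area, so the density estimate is also a count.
        ratios.append(rho_k(xk, k) / n_plants)
    return sorted(ratios)


for k in (1, 2, 6):
    r = simulate(k)
    median = r[len(r) // 2]
    spread = r[3 * len(r) // 4] - r[len(r) // 4]
    print(f"k={k}: median ratio {median:.2f}, interquartile range {spread:.2f}")
```

A ratio of 1 corresponds to the value 1 on the logarithmic scale of Figure 4.4. Under these assumptions the spread of the ratios should shrink as k increases, in line with the variance formula.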

Figure 4.4  Box-whisker plots comparing the accuracy of alternative estimators of plant density using the distance from a sampling point to a nearby plant. ‘(k)’ signifies the use of the kth nearest neighbour. Panels: random pattern, random gradient, regular pattern, and clustered pattern, each with 500 plants; estimators x, x2, (2), (3), (4), (5), (6); logarithmic vertical scale from 0.25 to 4.

4.4.1  Non-random patterns

Morisita (1957) addressed the problem of patterns with varying density. He observed³ that ‘the distribution of biological individuals . . . tends to be aggregatedly distributed’ rather than random. He suggested that the study area ‘can be divided theoretically into several small fractions in which no aggregated distribution can be observed’ (ideally with the plants being randomly distributed within each fraction). He noted that ‘the density is not necessarily the same in the different fractions’ and therefore proposed estimating the density separately for each fraction using a single observation in each case.

³ Quotations are from the translation of the original Japanese.

Substituting 1 for n in Equation (4.5) gives the density estimate for the single fraction surrounding sampling point $P_i$ as

$$\hat{\rho}_{P_i} = \frac{k-1}{\pi x_{(k)i}^2}.$$

Averaging over the n separate fractions gives the estimate for the entire region as

$$\hat{\rho}_{(k)a} = \frac{1}{n}\sum_{i=1}^{n}\hat{\rho}_{P_i}. \tag{4.6}$$

An indication of the overall applicability of the estimate is provided by examining the variability of the n individual estimates. Their variance is estimated by

$$s_\rho^2 = \frac{1}{n-1}\left\{\sum_{i=1}^{n}(\hat{\rho}_{P_i})^2 - n\,\hat{\rho}_{(k)a}^{\,2}\right\}, \tag{4.7}$$

so that an approximate 95% confidence interval (which reflects both sampling uncertainty and the extent of the variations in plant density across the area studied) is provided by

$$\left(\hat{\rho}_{(k)a} - 2s_\rho/\sqrt{n},\ \hat{\rho}_{(k)a} + 2s_\rho/\sqrt{n}\right). \tag{4.8}$$
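The averaged estimator and its interval can be computed directly from the kth-nearest distances. The following is a minimal sketch of Equations (4.6) to (4.8); the function name and the two example distances are hypothetical.

```python
import math


def rho_ka(xk, k):
    """Averaged estimate (4.6), s_rho from (4.7) and the approximate 95% interval (4.8).

    xk holds one kth-nearest distance per sampling point; this needs k >= 2
    and at least two sampling points.
    """
    n = len(xk)
    per_point = [(k - 1) / (math.pi * d * d) for d in xk]   # one estimate per fraction
    rho = sum(per_point) / n                                 # Equation (4.6)
    s2 = (sum(r * r for r in per_point) - n * rho * rho) / (n - 1)   # Equation (4.7)
    s = math.sqrt(max(s2, 0.0))      # guard against tiny negative rounding error
    half_width = 2.0 * s / math.sqrt(n)
    return rho, s, (rho - half_width, rho + half_width)     # Equation (4.8)


# Hypothetical distances to the second nearest plant at two sampling points.
rho, s, interval = rho_ka([1.0, 2.0], k=2)
print(rho, s, interval)
```

Because each sampling point contributes its own estimate, a wide interval signals either few sampling points or marked variation in density across the region.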

The box-whisker plots summarizing the computer results using $\hat{\rho}_{(k)a}$ are illustrated in Figure 4.5. Comparison with Figure 4.4 shows that, for patterns with varying local density (the gradient and clustered patterns), the median value for the $\hat{\rho}_{(k)a}$ estimators may be, on average, closer to the target value. However, this advantage is offset by the fact that this group gives more variable results than the $\hat{\rho}_{(k)}$ estimators. Whichever of $\hat{\rho}_{(k)}$ and $\hat{\rho}_{(k)a}$ is chosen, it is evident that k should be chosen to be as large as is practicable. A bonus when using $\hat{\rho}_{(k)a}$ is that the precision of the estimate and the variability in the plant density across the region are made explicit using Equation (4.8).

Figure 4.5  Box-whisker plots comparing the accuracy of alternative estimators of plant density using the average distance from a sampling point to a nearby plant (ignoring edge effects). Panels: random pattern, random gradient, regular pattern, and clustered pattern, each with 500 plants; estimators (2)a, (3)a, (4)a, (5)a, (6)a; logarithmic vertical scale from 0.25 to 4.

Example 4.1: Mangroves in Kenya

According to Hijbeek et al. (2013) ‘progress [in mangrove forests] is particularly difficult when trying to get deeper into the forest as often climbing over tree roots is required. Laying plots in a mangrove forest can range from being extremely time consuming up to unfeasible.’ For this reason, Cintrón and Schaeffer-Novelli (1984) had suggested that a plotless method should be used in preference to quadrats.

Hijbeek et al. (2013) compared several distance estimators using four sets of mangrove data (illustrated in Figure 4.6)⁴ and various types of simulated data. Sites 1 and 2, which include several species of mangrove, show a marked variation in plant density along their lengths. Sites 3 and 4, which were populated by a single mangrove species (Avicennia marina, the white mangrove), differ by their canopy type (closed in Site 3; open in Site 4).

Hijbeek et al. (2013) found that both $\hat{\rho}_x$ and $\hat{\rho}_{x2}$ substantially under-estimated the number of mangroves present. For the artificial data sets, for which they extended their investigation to include $\hat{\rho}_{(k)}$ and $\hat{\rho}_{(k,4)}$ with k = 2 or 3, they again found no satisfactory estimator.

The results in Table 4.1 confirm that the $\hat{\rho}_{(k)}$ series of estimators does provide extreme under-estimates for the heterogeneous Sites 1 and 2.⁵ The estimates using the $\hat{\rho}_{(k)a}$ series are reasonably accurate. For Site 1, Figure 4.7 compares the estimate provided by each sampling point with the actual number of mangroves in the subregion sampled by that point. There are gross errors in individual estimates, but there is no overall bias.

Figure 4.6  The locations of mangroves in four sites in a mangrove forest in Gazi Bay, Kenya. Sites 1 and 2 include several species with a marked variation in intensity across each. The mangroves in Sites 3 and 4 are all Avicennia marina (the white mangrove), with Site 3 being a region with a closed canopy, whereas Site 4 has an open canopy. Panels 1 to 4 show the four sites, with axes in metres.

Figure 4.7  Comparison of the estimated counts using $\hat{\rho}_{(6)a}$ with the true values in each section of mangrove Site 1. The line of equality is shown, together with dotted lines indicating 20% deviations from the true number.

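Table 4.1  Estimates of abundance for the four mangrove sites illustrated in Figure 4.6. Point estimates within 20% of the true value are shown in bold. Those approximate 95% confidence intervals for the $\hat{\rho}_{(k)a}$ series that enclose the true value are given in bold.

Estimate

| Site | $\hat{\rho}_{(3)}$ | $\hat{\rho}_{(4)}$ | $\hat{\rho}_{(6)}$ | $\hat{\rho}_{(3)a}$ | $\hat{\rho}_{(4)a}$ | $\hat{\rho}_{(6)a}$ | No. of trees |
|---|---|---|---|---|---|---|---|
| 1 | 103 | 124 | 149 | 350 | **404** | **461** | 472 |
| 2 | 450 | 474 | 474 | 1679 | **1140** | **1157** | 990 |
| 3 | **75** | **79** | **74** | 66 | **76** | **81** | 85 |
| 4 | 170 | 177 | **183** | **193** | **192** | **198** | 227 |

Approx. 95% confidence interval

| Site | $\hat{\rho}_{(3)a}$ | $\hat{\rho}_{(4)a}$ | $\hat{\rho}_{(6)a}$ | No. of trees |
|---|---|---|---|---|
| 1 | **(224, 475)** | **(259, 548)** | **(285, 638)** | 472 |
| 2 | **(210, 3149)** | **(610, 1670)** | **(655, 1659)** | 990 |
| 3 | (50, 82) | **(59, 92)** | **(58, 103)** | 85 |
| 4 | **(129, 257)** | **(139, 246)** | **(158, 239)** | 227 |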
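The two highlighting rules in the caption of Table 4.1 (a point estimate within 20% of the true count, and an interval that encloses it) are easy to check mechanically. The sketch below transcribes the values of Table 4.1; the variable and function names are hypothetical.

```python
# Values transcribed from Table 4.1.
TRUE_COUNT = {1: 472, 2: 990, 3: 85, 4: 227}
LABELS = ["(3)", "(4)", "(6)", "(3)a", "(4)a", "(6)a"]
ESTIMATES = {
    1: [103, 124, 149, 350, 404, 461],
    2: [450, 474, 474, 1679, 1140, 1157],
    3: [75, 79, 74, 66, 76, 81],
    4: [170, 177, 183, 193, 192, 198],
}
INTERVALS = {  # approximate 95% intervals for (3)a, (4)a, (6)a
    1: [(224, 475), (259, 548), (285, 638)],
    2: [(210, 3149), (610, 1670), (655, 1659)],
    3: [(50, 82), (59, 92), (58, 103)],
    4: [(129, 257), (139, 246), (158, 239)],
}


def within_20_percent(estimate, truth):
    """True when the estimate differs from the true count by at most 20%."""
    return abs(estimate - truth) <= 0.2 * truth


def encloses(interval, truth):
    """True when the interval contains the true count."""
    low, high = interval
    return low <= truth <= high


def close_estimators(site):
    """Labels of the estimators within 20% of the true count at a site."""
    truth = TRUE_COUNT[site]
    return [label for label, est in zip(LABELS, ESTIMATES[site])
            if within_20_percent(est, truth)]


for site in sorted(TRUE_COUNT):
    covered = [encloses(ci, TRUE_COUNT[site]) for ci in INTERVALS[site]]
    print(site, close_estimators(site), covered)
```

For the heterogeneous Sites 1 and 2 only members of the averaged series pass the 20% check, which is the pattern described in Example 4.1.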

4.5  The point-centred quarter method (PCQM)

With this method four measurements are obtained for each sampling point. These are the distances to the nearest plant in each quarter-plane (NE, NW, SE, SW) as illustrated in Figure 4.8.

Figure 4.8  The point-centred quarter method. A sampling point (o) is randomly placed in a study area containing several plants (•). The distances to the nearest plant in each quarter are recorded.

An unbiased estimate of $\rho$ was shown by Morisita (1957) to be given by

$$\hat{\rho}_{pc} = \frac{4(4n-1)}{\pi\sum_{i=1}^{n}\sum_{j=1}^{4} x_{ij}^2}, \tag{4.9}$$

where $x_{i1}, \ldots, x_{i4}$ are the nearest-neighbour distances in the four quadrants surrounding sampling point $i$. For randomly distributed plants the estimate has variance $(\hat{\rho}_{pc})^2/(4n-2)$.

In their simulations, Hijbeek et al. (2013) found that $\hat{\rho}_{pc}$ seriously overestimated the density of regular patterns, but under-estimated the number of plants when pattern density varied across the study region. Jost (1993) observed that $\hat{\rho}_{pc}$ was unreliable when used with plants whose density varied across the study region. Apparently unaware that Morisita (1954) had also suggested its use, Jost proposed using instead the average-based estimator

$$\hat{\rho}_{pc2} = \frac{12}{n\pi}\sum_{i=1}^{n}\left(\frac{1}{\sum_{j=1}^{4} x_{ij}^2}\right). \tag{4.10}$$

With randomly distributed plants, Morisita (1954) proved the result, previously demonstrated empirically by Cottam, Curtis, and Hale (1953), that $\bar{x}$, the average of the four distances, is an unbiased estimate of $\sqrt{1/\rho}$. From this it follows that, with n sampling points, an estimate of $\rho$ is provided by

$$1/\bar{x}^2 = 16n^2\Big/\left(\sum_{i=1}^{n}\sum_{j=1}^{4} x_{ij}\right)^2.$$

This estimate is not used with complete data as it is known to be biased, but it underlies the method proposed by Warde and Petranka (1981) for dealing with the problem when there are empty quarters. With $n_0$ missing values, they suggested calculating the average of the available $(4n - n_0)$ distances and incorporating a scaling constant, C, with a value dependent on the proportion of missing values. Their table of values for C is closely approximated by the simple equation

$$C = 1 - 2.4p + 2.8p^2, \tag{4.11}$$

where $p = n_0/(4n)$. The resulting estimator is

$$\hat{\rho}_{pc3} = C/\bar{x}^2 = C(4n - n_0)^2\Big/\left(\sum x\right)^2, \tag{4.12}$$

where the summation is over the $(4n - n_0)$ measured distances.

Figure 4.9 suggests that the most successful of this group of estimators is $\hat{\rho}_{pc2}$.

Figure 4.9  Box-whisker plots for the point-centred quarter method. Panels: random pattern, random gradient, regular pattern, and clustered pattern, each with 500 plants; estimators pc, pc2, pc3; logarithmic vertical scale from 0.25 to 4.
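The three point-centred quarter estimators of Equations (4.9) to (4.12) can be sketched as follows. This is an illustrative sketch only: the data layout (one row of four distances per sampling point, with `None` marking an empty quarter) and the function names are assumptions, not part of the source.

```python
import math


def rho_pc(x):
    """Equation (4.9); x is a list of n rows, each with four quarter distances."""
    n = len(x)
    total = sum(d * d for row in x for d in row)
    return 4 * (4 * n - 1) / (math.pi * total)


def rho_pc2(x):
    """Equation (4.10): average of the per-point estimates 12 / (pi * sum_j x_ij^2)."""
    n = len(x)
    return 12.0 / (n * math.pi) * sum(1.0 / sum(d * d for d in row) for row in x)


def rho_pc3(x):
    """Equations (4.11) and (4.12); None marks an empty quarter."""
    n = len(x)
    measured = [d for row in x for d in row if d is not None]
    n0 = 4 * n - len(measured)          # number of missing distances
    p = n0 / (4 * n)                    # proportion of missing values
    c = 1 - 2.4 * p + 2.8 * p * p       # Equation (4.11)
    return c * len(measured) ** 2 / sum(measured) ** 2


# Hypothetical distances at two sampling points.
complete = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
with_gap = [[1.0, 1.0, 1.0, None], [1.0, 1.0, 1.0, 1.0]]
print(rho_pc(complete), rho_pc2(complete), rho_pc3(with_gap))
```

Only `rho_pc3` accepts missing values here; the other two functions assume that all four quarters were measured at every sampling point.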

Example 4.2: Mangroves in Kenya (cont.)

According to Hijbeek et al. (2013), the point-centred quarter method is the standard method for assessing numbers of mangroves. However, when they used $\hat{\rho}_{pc}$ with random sampling points for Sites 3 and 4 (they did not use it with Sites 1 and 2), they found it substantially under-estimated mangrove density. Table 4.2 gives estimates for all four mangrove sites, for all three point-centred quarter methods. The sampling points were those detailed previously. The results confirm the severe under-estimation by $\hat{\rho}_{pc}$, and show the expected superiority of the average-based $\hat{\rho}_{pc2}$. Note that the latter is arguably not as effective as $\hat{\rho}_{(4)a}$, which also involves measuring distances to four trees.

Table 4.2  Estimates of abundance obtained using the point-centred quarter method for the four mangrove sites illustrated in Figure 4.6. Point estimates within 20% of the true value are shown in bold.

| | Site 1 | Site 2 | Site 3 | Site 4 |
|---|---|---|---|---|
| $\hat{\rho}_{pc}$ | 259 | 569 | 52 | **187** |
| $\hat{\rho}_{pc2}$ | 315 | **1022** | 51 | **227** |
| $\hat{\rho}_{pc3}$ | 153 | 564 | 45 | 159 |
| True value | 472 | 990 | 85 | 227 |

4.6  Angle-order estimators
The estimators $\hat{\rho}_{x2}$, $\hat{\rho}_{(k)}$, and $\hat{\rho}_{pc}$ are all special cases of a general class of estimators suggested by Morisita (1957). When the area around the sampling point is divided into q equal sections, with the distance being measured to the kth nearest plant in each section, then the resulting estimator is

$$\hat{\rho}_{(k,q)} = \frac{q(kqn - 1)}{\pi \sum_{i=1}^{n} \sum_{j=1}^{q} x_{(k)ij}^{2}}, \qquad (4.13)$$

where $x_{(k)ij}$ is the distance from the ith sampling point to the kth nearest plant in the jth section. The estimator has variance $(\hat{\rho}_{(k,q)})^{2}/(kqn - 2)$ for a random plant pattern. The case k = q = 1 corresponds to $\hat{\rho}_{x2}$, the case q = 1 corresponds to $\hat{\rho}_{(k)}$, and the case k = 1 and q = 4 corresponds to $\hat{\rho}_{pc}$.

Engeman et al. (1994) report excellent results when using $\hat{\rho}_{(3,4)}$. However, White et al. (2008) remark that ‘in practice much time is spent deciding which is the third closest individual and into which quadrant an individual lies’. Khan et al. (2016) note that the formula for $\hat{\rho}_{(k,q)}$ is often given incorrectly. The principal error, which is the replacement of q(kqn − 1) by nq(kq − 1) in Equation (4.5), is also noted in the excellent discussion of the point-centred quarter method by Mitchell (2015).
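As an illustration of Equation (4.13), the sketch below computes the angle-order estimate and its random-pattern variance from a table of distances. The function names and the nested-list layout (one row per sampling point, one entry per section) are assumptions made for this example, not part of the original text.

```python
import math

# Sketch of the angle-order estimator of Equation (4.13).
# Function names and the data layout are illustrative assumptions.

def angle_order_density(distances, k):
    """distances[i][j] is the distance from sampling point i to the
    k-th nearest plant in section j; every row must hold q values."""
    n = len(distances)
    q = len(distances[0])
    if any(len(row) != q for row in distances):
        raise ValueError("every sampling point needs the same number of sections")
    sum_sq = sum(x * x for row in distances for x in row)
    return q * (k * q * n - 1) / (math.pi * sum_sq)

def angle_order_variance(rho_hat, n, k, q):
    """Variance under a random pattern: rho_hat^2 / (kqn - 2)."""
    return rho_hat ** 2 / (k * q * n - 2)

# Hypothetical data: 3 sampling points, 4 quadrants, nearest plant (k = 1).
example = [
    [1.2, 0.8, 1.5, 2.0],
    [0.9, 1.1, 1.7, 1.3],
    [1.4, 0.6, 1.0, 1.8],
]
rho_pc = angle_order_density(example, k=1)
print(rho_pc, angle_order_variance(rho_pc, n=3, k=1, q=4))
```

With q = 4 and k = 1 the function reproduces the point-centred quarter form, and with one distance per row (q = 1, k = 1) it reduces to (n − 1)/(π Σ x²).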

4.7  Nearest-neighbour distances
If it were possible to choose a plant at random, and measure the distance to its nearest neighbour, then the results of Section 4.3 would apply to those distances. However, a plant cannot properly be selected at random without having a list of all the plants, in which case there would be no need to estimate the plant density! Instead, as an approximation to the selection of a random plant, a sampling point is placed at random in the study region and the nearest plant is identified. The distance then recorded, y, is the distance between that plant and its nearest neighbour. Figure 4.10 illustrates the procedure.

The figure also illustrates why the results of Section 4.3 are not appropriate. As previously, it is known that the circular region of area $\pi y^{2}$ that surrounds the chosen plant contains no other plants. However, it is already known that the shaded area around the sampling point contained no other plants. This overlap, which will always occur, is at its greatest when y ≥ 2x.

Cottam and Curtis (1956) examined a set of hypothetical random populations; they concluded that, by analogy with Equation (4.3), an approximately unbiased estimate of plant density was provided by scaling up by about 1.44 (= $1.2^{2}$) to give:

$$\hat{\rho}_{y} = \frac{0.36\,n^{2}}{\left(\sum_{i=1}^{n} y_{i}\right)^{2}}, \qquad (4.14)$$

where $y_{i}$ is the nearest-neighbour distance corresponding to sampling point i. For a random pattern the variance of this estimator is given by $(\hat{\rho}_{y})^{2}/(n - 2)$.

Figure 4.10  The nearest-neighbour method. A sampling point (o) is randomly placed in a study area containing several plants (•). The nearest plant is identified and the distance, y, from that plant to its nearest neighbour is determined. (Labelled distances: x, from the sampling point to the nearest plant, and y.)

In the same way, the equivalent of Equation (4.4) for a nearest-neighbour estimate is

$$\hat{\rho}_{y2} = \frac{1.44\,(n - 1)}{\pi \sum_{i=1}^{n} y_{i}^{2}}. \qquad (4.15)$$

Note that the version of this estimator that appears in the literature (often falsely attributed to Byth and Ripley (1973)) has 1.44(n − 1) replaced by n.

In a regular pattern (a pattern where plants must keep their distance from one another), once a plant has been chosen, its neighbour is likely to be further away than would be expected with a random pattern of the same plant density. For this reason, estimators using only y-values are likely to seriously under-estimate the number of individuals in such a pattern. The extent of the under-estimation is illustrated in Figure 4.11.

Figure 4.11  Box-whisker plots investigating the accuracy of the nearest-neighbour estimators. (Four panels, each with 500 plants: random pattern, random gradient, regular pattern, clustered pattern. Estimators shown: y and y2.)
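The two nearest-neighbour estimators, Equations (4.14) and (4.15), can be sketched as follows. The function names are hypothetical, and the distances in the usage lines are invented purely to show the calls.

```python
import math

# Sketch of the nearest-neighbour estimators; names are illustrative assumptions.

def rho_y(y):
    """Equation (4.14): 0.36 n^2 / (sum of y_i)^2."""
    n = len(y)
    return 0.36 * n ** 2 / sum(y) ** 2

def rho_y_variance(y):
    """Random-pattern variance of rho_y: rho_y^2 / (n - 2)."""
    return rho_y(y) ** 2 / (len(y) - 2)

def rho_y2(y):
    """Equation (4.15): 1.44 (n - 1) / (pi * sum of y_i^2)."""
    n = len(y)
    return 1.44 * (n - 1) / (math.pi * sum(v * v for v in y))

# Invented nearest-neighbour distances (same length unit throughout).
y_values = [0.8, 1.1, 0.5, 1.6, 0.9]
print(rho_y(y_values), rho_y_variance(y_values), rho_y2(y_values))
```

Both functions return a density per unit area in the squared unit of the distances supplied; multiplying by the area of the study region gives an abundance estimate.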

4.8  Combined point-to-plant and nearest-neighbour measures
Diggle (1975) observed that, since x-based estimators under-estimate numbers in a clustered pattern, while y-based estimators often overestimate those numbers, an estimate that might be robust to deviations from randomness would be one that combined information from both x-values and y-values. He suggested combining Equations (4.4) and (4.15) to give⁶:

$$\hat{\rho}_{xy} = \frac{1.2\,(n - 1)}{\pi \sqrt{\sum_{i=1}^{n} x_{i}^{2} \, \sum_{i=1}^{n} y_{i}^{2}}}. \qquad (4.16)$$

In a similar fashion Byth (1982) suggested combining Equations (4.3) and (4.14) to give⁷:

$$\hat{\rho}_{xy2} = \frac{0.3\,n^{2}}{\sum_{i=1}^{n} x_{i} \, \sum_{i=1}^{n} y_{i}}. \qquad (4.17)$$

⁶ Diggle used n rather than 1.2(n − 1), but the simulations suggest that the latter is appropriate.
⁷ Byth used 0.25 rather than 0.3 (omitting the 1.2 scaling).
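Equations (4.16) and (4.17) translate directly into code. The sketch below uses hypothetical function names and invented distances; x holds point-to-plant distances and y the matching nearest-neighbour distances.

```python
import math

# Sketch of the combined estimators; names and data are illustrative assumptions.

def rho_xy(x, y):
    """Equation (4.16): 1.2 (n - 1) / (pi * sqrt(sum x_i^2 * sum y_i^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have one value per sampling point")
    n = len(x)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    return 1.2 * (n - 1) / (math.pi * math.sqrt(sum_x2 * sum_y2))

def rho_xy2(x, y):
    """Equation (4.17): 0.3 n^2 / (sum x_i * sum y_i)."""
    if len(x) != len(y):
        raise ValueError("x and y must have one value per sampling point")
    n = len(x)
    return 0.3 * n ** 2 / (sum(x) * sum(y))

# Invented distances for five sampling points.
x_values = [1.0, 0.7, 1.3, 0.9, 1.1]
y_values = [0.8, 1.1, 0.5, 1.6, 0.9]
print(rho_xy(x_values, y_values), rho_xy2(x_values, y_values))
```

Written this way, (4.16) is the geometric mean of (n − 1)/(π Σ x²), the k = q = 1 case of Equation (4.13), and the estimate in Equation (4.15).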

Figure 4.12 demonstrates that while these estimators are accurate for random or regular patterns, they under-estimate the average plant density when there are variations in density across the region of interest.

Figure 4.12  Box-whisker plots investigating the accuracy of measures combining point-to-plant and nearest-neighbour distances. (Four panels, each with 500 plants: random pattern, random gradient, regular pattern, clustered pattern. Estimators shown: xy and xy2.)

4.8.1  T-square estimators
As has been demonstrated, a difficulty with estimators that use y is caused by the variation in the overlap between the circle surrounding the sampling point, and the circle surrounding the nearest neighbour. Besag and Gleaves (1973) suggested avoiding this uncertainty by restricting attention to the half-plane away from the original sampling point, so that no overlap occurs. This is illustrated in Figure 4.13.

Figure 4.13  The T-square method. A sampling point (o) is randomly placed in a study area containing several plants (•). The nearest plant is identified. The distance, z, from that plant to its nearest neighbour in the half-plane away from the sampling point is determined.

Besag and Gleaves (1973) proposed using $\hat{\rho}_{z}$ given by:

$$\hat{\rho}_{z} = \frac{2n}{\pi \sum_{i=1}^{n} z_{i}^{2}}. \qquad (4.18)$$

However, since estimates using z-values alone will, like those based on y-values, overestimate plant numbers in clustered patterns, Diggle (1975) suggested using estimators that combined information from both x and z:

$$\hat{\rho}_{xz} = \frac{2n}{\pi \left( \sum_{i=1}^{n} x_{i}^{2} + \frac{1}{2} \sum_{i=1}^{n} z_{i}^{2} \right)}, \qquad (4.19)$$

$$\hat{\rho}_{xz2} = \frac{n}{\pi \sqrt{\sum_{i=1}^{n} x_{i}^{2} \cdot \frac{1}{2} \sum_{i=1}^{n} z_{i}^{2}}}. \qquad (4.20)$$

Byth (1982) examined many possible combinations of the various distances and concluded that the preferable combination was $\hat{\rho}_{xz3}$ given by:

$$\hat{\rho}_{xz3} = \frac{n^{2}}{2\sqrt{2} \, \sum_{i=1}^{n} x_{i} \, \sum_{i=1}^{n} z_{i}}. \qquad (4.21)$$

Silva et al. (2017) compared the T-square and point-centred quarter estimators with variant k-tree estimators in the context of forests in the Azores. They used the quadrat estimate as their ‘benchmark’. They found that the T-square estimators outperformed the others, with the best performing being $\hat{\rho}_{xz3}$. Figure 4.14 confirms $\hat{\rho}_{xz3}$ as being the most reliable, but suggests that it may seriously under-estimate plant density in clustered patterns.
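The four T-square estimators of Equations (4.18) to (4.21) differ only in how the x and z distances are pooled, which the following sketch makes explicit. The function names are assumptions made for this illustration.

```python
import math

# Sketch of the T-square estimators; function names are illustrative assumptions.
# x[i]: distance from sampling point i to its nearest plant.
# z[i]: distance from that plant to its nearest neighbour in the half-plane
#       away from the sampling point.

def rho_z(z):
    """Equation (4.18): 2n / (pi * sum z_i^2)."""
    return 2 * len(z) / (math.pi * sum(v * v for v in z))

def rho_xz(x, z):
    """Equation (4.19): 2n / (pi * (sum x_i^2 + 0.5 * sum z_i^2))."""
    sum_x2 = sum(v * v for v in x)
    sum_z2 = sum(v * v for v in z)
    return 2 * len(x) / (math.pi * (sum_x2 + 0.5 * sum_z2))

def rho_xz2(x, z):
    """Equation (4.20): n / (pi * sqrt(sum x_i^2 * 0.5 * sum z_i^2))."""
    sum_x2 = sum(v * v for v in x)
    sum_z2 = sum(v * v for v in z)
    return len(x) / (math.pi * math.sqrt(sum_x2 * 0.5 * sum_z2))

def rho_xz3(x, z):
    """Equation (4.21): n^2 / (2 * sqrt(2) * sum x_i * sum z_i)."""
    n = len(x)
    return n ** 2 / (2 * math.sqrt(2) * sum(x) * sum(z))

# Invented distances for four sampling points.
x_values = [1.0, 0.7, 1.3, 0.9]
z_values = [1.5, 1.0, 1.9, 1.2]
for estimator in (rho_xz, rho_xz2, rho_xz3):
    print(estimator.__name__, estimator(x_values, z_values))
print("rho_z", rho_z(z_values))
```

The factor of one half attached to the z terms reflects the fact that each z search covers a half-circle rather than a full circle.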

Figure 4.14  Box-whisker plots investigating the accuracy of the T-square measures. (Four panels, each with 500 plants: random pattern, random gradient, regular pattern, clustered pattern. Estimators shown: z, xz, xz2, and xz3.)

4.9  Wandering methods
Catana (1953) observed that plants are more likely to be distributed in clumps rather than at random. He proposed a sampling procedure that, starting from a single sampling point, traverses the region of interest from plant to plant and thereby extracts a sequence of inter-plant distances. Catana proposed rules based on the mean and median of these distances to decide whether there was appreciable plant clumping. If clumps were diagnosed, then his procedure calculates average within-clump distances, average between-clump distances, clump sizes, and numbers of clumps, to arrive at an overall density estimate.

Catana’s sampling procedure (illustrated in Figure 4.15) was as follows. From a randomly chosen sampling point on the edge of the study region, set a direction of interest (e.g. north) and identify the nearest plant in the quarter circle centred on this direction. Proceed to the identified plant and, treating the plant as a point, repeat the procedure. The result is now referred to as Catana’s ‘wandering-quarters’ sampling procedure.

Figure 4.15  An example of Catana’s wandering-quarters procedure (θ = π/4).

Diggle (1983) described the wandering-quarters method as ‘an ingenious sampling procedure whose statistical potential appears not to have been tapped’. Hall, Melville, and Welsh (2001) subsequently generalised the idea to search segments of angle 2θ. They proposed the estimator

$$\hat{\rho}_{w1} = \frac{n - 1.5}{\theta \sum_{i=1}^{n} w_{i}^{2}}, \qquad (4.22)$$

where $w_{i}$ is a plant-to-plant distance and, for the wandering-quarters method, θ = π/4. They outlined bootstrap-based corrections for bias based on prior information concerning the nature of the non-random pattern.

In the spirit of the wandering-quarters method, a ‘wandering-forward’ procedure is also presented, in which θ = π/2. This procedure, which is illustrated in Figure 4.16, results in a shorter walk to obtain the same number of measurements. Simulations of random patterns suggest the following:

$$\hat{\rho}_{w2} = \frac{n - 1}{1.4 \sum_{i=1}^{n} w_{i}^{2}}. \qquad (4.23)$$

Figure 4.16  An example of the ‘wandering-forward’ procedure (θ = π/2).

One practical advantage of the wandering procedures over those mentioned previously is that the data collector is required to identify only one random location; furthermore, that location will be relatively easy to find as it will be on the perimeter of the region of interest. In the simulations reported here the wanderer was allowed to proceed from one side of the region of interest to the other. On average this resulted in 24 measurements for the wandering-quarters procedure (comparable to the 25 used with sampling points), but 43 measurements for the wandering-forward method. Figure 4.17 suggests that the procedures are reasonably unbiased for all pattern types. However, their variability is greater than for most of the alternatives considered. Unsurprisingly, since more measurements are taken, the wandering-forward procedure is the less variable.
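Both wandering estimators are simple functions of the sequence of plant-to-plant distances. The following sketch of Equations (4.22) and (4.23) uses hypothetical function names; the half-angle θ is passed explicitly so that the same function serves the wandering-quarters (θ = π/4) and wandering-forward (θ = π/2) walks.

```python
import math

# Sketch of the wandering estimators; names are illustrative assumptions.

def rho_w1(w, theta=math.pi / 4):
    """Equation (4.22): (n - 1.5) / (theta * sum w_i^2),
    where the search segment has angle 2 * theta."""
    n = len(w)
    return (n - 1.5) / (theta * sum(v * v for v in w))

def rho_w2(w):
    """Equation (4.23), for the wandering-forward walk:
    (n - 1) / (1.4 * sum w_i^2)."""
    n = len(w)
    return (n - 1) / (1.4 * sum(v * v for v in w))

# Invented plant-to-plant distances from a single walk.
walk = [0.6, 1.2, 0.9, 1.4, 0.7, 1.1]
print(rho_w1(walk))                     # wandering quarters
print(rho_w1(walk, theta=math.pi / 2))  # wandering forward, Equation (4.22)
print(rho_w2(walk))                     # wandering forward, Equation (4.23)
```

For the same forward walk the two formulas differ only by a factor that depends on n: with n = 25 measurements, Equation (4.23) gives a value about 1.15 times that of Equation (4.22) with θ = π/2.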

Figure 4.17  Box-whisker plots investigating the accuracy of the wandering procedures. (Four panels, each with 500 plants: random pattern, random gradient, regular pattern, clustered pattern. Estimators shown: w1 and w2.)

Example 4.3: Mangroves in Kenya (cont.)
The combination of variable plant density and the shape of the long narrow Sites 1 and 2 suggests that the wandering methods might struggle to obtain acceptable estimates. In Table 4.3 routes F and B indicate routes going forward from one end (the short edge in the case of Sites 1 and 2) or backwards from the other, in each case stopping after 25 trees have been encountered. The T routes traverse the sites from one end of the site to the other end, with no limit on the number of trees.

Superficially, the results are encouraging, with many of the estimates lying within 20% of the true value. However, for Site 1, the wandering-forward procedure is shown to be unacceptable, since, for the forward path, it gives rise to estimates that are three or four times the true value. For the backward path at Site 2, both wandering methods give unacceptably low estimates. The reason for all these poor estimates is the great variation in plant density from one end to the other. The limit of 25 plants for the F and B paths implies that only a small proportion of the overall site is being sampled. Because the wandering-forward procedure collects data faster, it samples a smaller proportion than the wandering-quarters procedure and hence is liable to give less representative estimates. When the methods are not limited to 25 plants their estimates are reasonably accurate. The results underline the need to sample throughout the region of interest.

Table 4.3  Estimates of abundance obtained using the wandering methods for the four mangrove sites illustrated in Figure 4.6. Point estimates within 20% of the true value are marked with an asterisk (*). F and B refer to alternative 25-tree routes through the sites. The T routes traverse the sites from one end to the other.

                                                       Site 1               Site 2         Site 3 Site 4
                                                     F      B      T      F      B      T
Wandering quarters, $\hat{\rho}_{w1}$, θ = π/4     434*   540*   628   1128*   606   1016*    73*   178
Wandering forward, $\hat{\rho}_{w1}$, θ = π/2     1486    341    528*  1021*   690    823*    88*   308
Wandering forward, $\hat{\rho}_{w2}$              1703    391*   596   1170*   791    926*   104    353
True value                                         472    472    472    990    990    990     85    227
No. of trees sampled by wandering quarters          25     25     35     25     25    102      8     16
No. of trees sampled by wandering forward           25     25     79     25     25    174     10     25

4.10  Handling mixtures of species
Suppose a region consists of two competing types of plant, with one being common and the other rare. The object of an investigation is to assess the abundance of the rare plant. Two possible distance-based strategies are as follows:

1. Ignore the common plants and work only with distances from sampling points to rare plants.
2. Work with all the plants, assessing both the overall abundance and the proportion belonging to the rare type.

A problem for both strategies is that, for most of the sampling points, there may be no rare plants visible, or the nearest rare plant may be infeasibly distant. Since the chance of locating a rare plant increases when more plants are examined, this suggests using a method that involves the examination of many plants. The best performing of the methods examined in this chapter was conveniently one that involved many plants: $\hat{\rho}_{(6)a}$.

Example 4.4: Mangroves in Kenya (cont.)
Sites 1 and 2, taken together, formed a complete section of the forest, from low tide shore to terrestrial vegetation. Site 1 contained four mangrove species: 122 Avicennia marina (A), 7 Bruguiera gymnorrhiza (B), 324 Ceriops tagal (C), and 19 Xylocarpus granatum (X). Their locations, which are illustrated separately in Figure 4.18, show considerable variations in intensity for each species.

Figure 4.18  Site 1 contained four mangrove species (A, B, C, and X) with the locations indicated (the x-axis and the y-axis are on different scales). The larger symbols identify the locations of those mangroves selected when species is ignored in identifying nearest neighbours and $\hat{\rho}_{(6)a}$ is used with a central row of sampling points. (One panel per species: A, B, C, X.)

Using $\hat{\rho}_{(6)a}$ with 25 sampling points and the strategy of ignoring species when identifying the nearest neighbours, the data gatherer might use information from a total of 150 trees. In practice, however, it is likely that fewer trees will be identified, since some trees may be one of the six nearest neighbours for more than one sampling point. Using a central line of 25 sampling points in Site 1, it turns out that just 111 trees are identified. These trees are identified by using larger symbols in Figure 4.18.

Table 4.1 reported that, using $\hat{\rho}_{(6)a}$, Site 1 was estimated as containing 461 trees. The upper part of Table 4.4 refers to this estimate. Since 29 of the 111 trees identified as part of the estimation process were of species A, the number estimated for the entire site is 461 × 29/111 = 120. The estimates for the common species are in excellent agreement with the true numbers. Those for the rarer species are uncertain, since small changes in the numbers identified will make a proportionally large change in the estimate. For example, if there had been two individuals of species B then the estimate would have doubled, whereas if the sampling had failed to include any individuals of that species then its presence would have passed unnoticed.

The second half of the table displays the results obtained if each species is separately analysed. The results are marginally better, but this is misleading. In a site with dimensions 16 m × 100 m, for some sample points, the nearest individuals of species B were more than 50 m distant and would not have been detected.

Table 4.4  The observed number, true number, and estimated number, using $\hat{\rho}_{(6)a}$, for the four mangrove species at Site 1. Estimates within 20% of the true value are marked with an asterisk (*).

Species                                                        A      B      C      X
Considering all species together
  Number used in estimation process                           29      1     74      7
  Estimated number based on proportion of overall estimate   120*     4    307*    29
Considering each species separately
  Number used in estimation process                           65      7     93     16
  Estimated number                                           120*    10    345*    17*
Number actually present                                      122      7    324     19
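The proportional allocation used in the upper part of Table 4.4 is a single line of arithmetic per species. The sketch below reproduces it; the function name `allocate_by_proportion` is a hypothetical choice, while the overall estimate (461) and the species counts (29, 1, 74, 7 out of 111) are those quoted in the example.

```python
# Sketch of allocating an overall abundance estimate to species in proportion
# to their share of the sampled trees. The function name is an assumption.

def allocate_by_proportion(total_estimate, counts):
    """Split total_estimate according to each species' share of the sample."""
    sampled = sum(counts.values())
    return {species: total_estimate * c / sampled for species, c in counts.items()}

sampled_counts = {"A": 29, "B": 1, "C": 74, "X": 7}  # 111 trees in total
per_species = allocate_by_proportion(461, sampled_counts)

for species, value in per_species.items():
    print(species, round(value))
```

Doubling the count for species B from 1 to 2 would roughly double its estimate, which is the sensitivity for rare species that the example describes.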

4.11  Recommendations
The results reported in previous sections show that all the methods work well with random patterns, while most struggle when plant density is low, or when there are marked variations in plant density across the region of interest. The choice of method will depend upon what is practical in a particular context. A trial run may be helpful in making decisions about the method to be used. Some general observations follow.

• Sample points should be chosen so that the entire study area is covered, with roughly equal areas being ‘allocated’ to each point.
• If the results from individual points show marked variations in intensity across the study area, then a more accurate estimate is likely to be obtained by using the average of the separate estimates, rather than an estimate based on pooling information.
• Methods that involve measurements to relatively large numbers of neighbouring plants will usually give more accurate estimates than those based on single distances.
• Sheil’s variant of the VAT procedure discussed in the next chapter is recommended if the methods in this chapter seem impractical.
• The best procedure appears to be to use $\hat{\rho}_{(k)a}$ with k as large as is practical.

5. Variable sized plots

5.1 Variable area transect (VAT)

Parker (1979) suggested a procedure that attempted to combine the speed of distance sampling with the accuracy of enumerating plants in a specified region. The procedure is illustrated in Figure 5.1.

Figure 5.1 The variable area transect procedure: from each sampling point (marked with an o) the observer walks in a specified direction until k (here k = 3) plants have been observed within a transect of width w. The distance walked is recorded.

The method uses n sampling points, with the observer walking in a specified direction from each point, until k plants have been located that are within a distance w/2 from the observer (see Figure 5.1). The estimator is

$$\hat{\rho}_{v(k)} = \frac{kn - 1}{w \sum_{i=1}^{n} d_i}, \tag{5.1}$$

where $d_i$ is the distance walked from the ith sampling point. For a random pattern the estimator has variance $\rho_{v(k)}^2/(kn - 2)$. With k taken to be 3, Engeman et al. (1994) judged the VAT estimator to be one of the most practical. In a subsequent investigation, Engeman, Nielson and Sugihara (2005) concluded that a slightly larger value of k (say 5 to 7) was optimal. They also recommended that ‘the transect be as wide as can be readily accommodated in a single pass’. In a follow-up investigation, comparing the VAT method with the most promising distance methods, White et al. (2008) concluded that ‘the VAT method would seem the most straightforward to utilize in most field situations’. Dobrowski and Murphy (2006) suggest that the best choice for w is a value similar to the width of the objects being sampled.

However, when plant density varies noticeably across the study region, the approach of using the average of individual transect estimates might be preferable. Denoting the n separate transect density estimates by $t_1, \ldots, t_n$, the estimate from sampling point i is given by

$$t_i = \frac{k - 1}{w d_i}, \tag{5.2}$$

and the overall estimate is

$$\hat{\rho}_{v(k)a} = \frac{1}{n} \sum_{i=1}^{n} t_i. \tag{5.3}$$

The standard deviation of the mean of the n separate estimates (which measures a combination of sample uncertainty and the variation in plant density) is

$$s = \sqrt{\frac{1}{n(n - 1)} \left\{ \sum_{i=1}^{n} t_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} t_i \right)^2 \right\}}. \tag{5.4}$$

An approximate 95% confidence interval for the overall plant density is $(\hat{\rho}_{v(k)a} - 2s,\ \hat{\rho}_{v(k)a} + 2s)$.
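The pooled estimator (5.1) and the averaged estimator (5.2) to (5.4) can be compared side by side with a minimal sketch. The distances walked, the strip width w, and the function names below are hypothetical values chosen for illustration; they are not taken from any data set in the text.

```python
import math

def vat_pooled(distances, k, w):
    """Pooled VAT estimate, Equation (5.1): (kn - 1) / (w * sum of d_i)."""
    n = len(distances)
    return (k * n - 1) / (w * sum(distances))

def vat_transect_estimates(distances, k, w):
    """Separate transect estimates t_i = (k - 1) / (w * d_i), Equation (5.2)."""
    return [(k - 1) / (w * d) for d in distances]

def mean_and_se(t):
    """Average of the t_i (5.3) and the standard deviation of that mean (5.4)."""
    n = len(t)
    total = sum(t)
    total_sq = sum(x * x for x in t)
    mean = total / n
    s = math.sqrt((total_sq - total * total / n) / (n * (n - 1)))
    return mean, s

# Hypothetical distances walked from n = 5 sampling points (same unit as w).
distances = [0.10, 0.25, 0.20, 0.40, 0.05]
k, w = 3, 0.05

rho_pooled = vat_pooled(distances, k, w)
t = vat_transect_estimates(distances, k, w)
rho_avg, s = mean_and_se(t)
ci = (rho_avg - 2 * s, rho_avg + 2 * s)   # approximate 95% interval

print(f"pooled {rho_pooled:.1f}, averaged {rho_avg:.1f}, s {s:.1f}, CI {ci}")
```

With these made-up distances the two estimates differ noticeably, because the short transects (high local density) dominate the average of the $t_i$ but carry little weight in the pooled total distance.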

Sheil et al. (2003) noted that ‘. . . in low density [plant] cover, [in order to obtain the required k plants] the sample may ultimately extend far from its origin, [and] cross vegetation and site types . . .’. To avoid this, they defined three transect types with different formulae for $t_i$ depending on the numbers of individuals present.

If no individuals have been observed by a minimum distance, $d_{\min}$, then the observer stops walking. This is a type 1 transect. If individuals are present, but scarce, then the observer walks the maximum distance, $d_{\max}$, and records $k_{\text{obs}}$, the number of individuals observed. This is a type 2 transect. If individuals are plentiful, then the distance walked to the kth plant is recorded. This is a type 3 transect. The density estimates for the three transect types are given as the last column in Table 5.1. The overall density estimate, $\hat{\rho}_{v(k)s}$, is the average of the separate transect estimates. If every transect were of type 3 then $\hat{\rho}_{v(k)s}$ would equal $\hat{\rho}_{v(k)a}$.

Table 5.1 The rules suggested by Sheil et al. (2003) for use with the VAT procedure when estimating plant density in heterogeneous regions. The minimum and maximum search distances from a sampling point are $d_{\min}$ and $d_{\max}$, respectively.

| Type | No. of plants in $d_{\min}$ | No. of plants in $d_{\max}$ | Distance, d, to kth plant | Value for $t_i$ |
|---|---|---|---|---|
| 1 | 0 | | | 0 |
| 2 | >0 | $k_{\text{obs}}$ (< k) | > $d_{\max}$ | $k_{\text{obs}}/(w d_{\max})$ |
| 3 | >0 | At least k | < $d_{\max}$ | $(k - 1)/(wd)$ |

As the required value of k is increased, so there will be fewer transects of type 3 and more of type 2. Thus, although increasing k reduces the variance for type 3 transects, the reduction of the number of this type of transect will increase the overall variability. A preliminary trial would be required to determine a suitable value for k.

Sheil’s procedure is illustrated in Figure 5.2 for a case where k = 5. Four empty transects are curtailed at $d_{\min}$. Four transects with 0 < j < 5 are of length $d_{\max}$. The remaining two transects have lengths defined by the location of the fifth plant. In this case, with 51 plants in the study region, the separate transect estimates are 0, 0, 66.7, 66.7, 100, 0, 33.3, 263.2, 0, and 178.6. These values give $\hat{\rho}_{v(5)s}$ = 70.8 and s = 28.1 so that the approximate 95% confidence interval is (15, 127). This is too wide to be useful and indicates the need for further sampling.

Figure 5.2 A compact example of Sheil’s procedure. The empty transects 1, 2, 6, and 9 are truncated at $d_{\min}$. Transects 8 and 10 are truncated at the fifth plant encountered. Transects 3, 4, 5, and 7, each containing from 1 to 4 plants, have length $d_{\max}$.

Figure 5.3 compares $\hat{\rho}_{v(k)}$, $\hat{\rho}_{v(k)a}$, and $\hat{\rho}_{v(k)s}$ for the cases k = 3 and k = 6. The results are based on the usual 25 sampling points, with w = 0.05, $d_{\min}$ = 0.15, and $d_{\max}$ = 0.2. As expected, the results using k = 6 are less variable than those with k = 3. In clustered patterns $\hat{\rho}_{v(k)a}$ tends to overestimate. This is a consequence of edge effects (sample points for which k plants have not been found are ignored using this method). Sheil’s procedure appears to work well with every pattern type.

Figure 5.4 emphasizes the problems that may occur when searching for rare plants. For $\hat{\rho}_{v(k)}$ and $\hat{\rho}_{v(k)a}$ only a few sample points (in the relatively plant-rich regions) are providing usable information, and these estimators therefore considerably overestimate. Although Sheil’s method may give an inaccurate estimate, on average it does very well. The simplicity of the method suggests using more sampling points to reduce variability. Of course, when plants are this scarce a complete census would be preferable.
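The arithmetic of the Figure 5.2 example can be checked with a few lines of Python. The sketch below takes the ten transect estimates quoted in the text and recomputes their mean, the standard deviation of that mean as in Equation (5.4), and the approximate 95% interval. It uses the standard-library `statistics` module, relying on the fact that (5.4) is the sample standard deviation divided by $\sqrt{n}$.

```python
import math
import statistics

# The ten separate transect estimates quoted for the Figure 5.2 example.
t = [0, 0, 66.7, 66.7, 100, 0, 33.3, 263.2, 0, 178.6]
n = len(t)

rho = statistics.mean(t)                    # Sheil's overall estimate
s = statistics.stdev(t) / math.sqrt(n)      # same quantity as Equation (5.4)
lower, upper = rho - 2 * s, rho + 2 * s     # approximate 95% interval

print(f"estimate {rho:.2f}, s {s:.2f}, interval ({lower:.1f}, {upper:.1f})")
```

The results agree, up to rounding, with the values 70.8, 28.1, and (15, 127) reported for the example.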

Figure 5.3 Box-whisker plots investigating the accuracy of the variable area transect methods when plants are abundant. [Four panels: random pattern, random gradient, regular pattern, and clustered pattern, each with 500 plants. Each panel shows the estimators v3, v3a, v3s, v6, v6a, and v6s against a vertical axis with tick marks at 0.25, 0.5, 1, 2, and 4.]

Figure 5.4 Box-whisker plots investigating the accuracy of the variable area transect methods when plants are scarce. [Four panels: random pattern, random gradient, regular pattern, and clustered pattern, each with 50 plants. Each panel shows the estimators v3, v3a, v3s, v6, v6a, and v6s against a vertical axis with tick marks at 0.25, 0.5, 1, 2, and 4.]

Advice on data collection
For an Indonesian forest, Sheil (2002) suggested using transects of 10 m width, with k = 5, $d_{\min}$ = 15 m, and $d_{\max}$ = 20 m.
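Sheil’s three transect rules can be written as a short function. The sketch below uses the transect width and search distances suggested above for an Indonesian forest (w = 10 m, k = 5, minimum distance 15 m, maximum distance 20 m); the plant positions along each transect are hypothetical, as is the function name. Plants recorded exactly at the maximum distance are treated here as inside the transect, which is an assumption about a boundary case the rules do not spell out.

```python
def sheil_transect_estimate(plant_distances, k, w, d_min, d_max):
    """Density estimate t_i for one transect, following the three transect types.

    plant_distances: distances from the sampling point, along the transect,
    of the plants lying inside the strip of width w.
    """
    found = sorted(d for d in plant_distances if d <= d_max)
    if not any(d <= d_min for d in found):
        return 0.0                           # type 1: nothing seen by d_min
    if len(found) < k:
        return len(found) / (w * d_max)      # type 2: k_obs plants within d_max
    return (k - 1) / (w * found[k - 1])      # type 3: distance to the kth plant

# Parameters suggested by Sheil (2002); distances in metres.
k, w, d_min, d_max = 5, 10.0, 15.0, 20.0

# Hypothetical plant positions on four transects.
transects = [
    [],                                  # empty: type 1
    [17.0, 19.0],                        # nothing within d_min: type 1
    [3.0, 12.0, 18.5],                   # scarce: type 2
    [2.0, 4.5, 7.0, 9.0, 12.5, 14.0],    # plentiful: type 3
]

estimates = [sheil_transect_estimate(p, k, w, d_min, d_max) for p in transects]
rho_s = sum(estimates) / len(estimates)  # plants per square metre
print(estimates, rho_s)
```

Because every sampling point contributes a value (possibly zero), no point is discarded, which is the feature that distinguishes this variant from the averaged estimator.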

Example 5.1: Mangroves in Kenya (cont.)
Using the same sampling points as in Chapter 4, and noting that each sampling point must be visited, $d_{\max}$ was chosen to be the distance between successive sampling points (4 m for Sites 1, 3, and 4, but 3 m for Site 2). For each site, $d_{\min}$ was chosen to be $d_{\max}/2$. The results are summarized in Table 5.2. None of the estimators provide acceptable results for every site. With the given values of $d_{\min}$ and $d_{\max}$, there is little difference between $\hat{\rho}_{v(k)a}$ and $\hat{\rho}_{v(k)s}$. The worst performance is that of $\hat{\rho}_{v(3)}$ for Site 1. None of the estimators achieve the accuracy found using the $\hat{\rho}_{(k)a}$ series given by Equation (4.6).

Table 5.2 The estimates obtained for the four mangrove sites using various VAT estimators with k = 3 or k = 6. Estimates within 20% of the true value are shown in bold.

| Site | True no. | $\hat{\rho}_{v(3)}$ | $\hat{\rho}_{v(3)a}$ | $\hat{\rho}_{v(3)s}$ | $\hat{\rho}_{v(6)}$ | $\hat{\rho}_{v(6)a}$ | $\hat{\rho}_{v(6)s}$ |
|---|---|---|---|---|---|---|---|
| 1 | 472 | 239 | 371 | 373 | 279 | **394** | **401** |
| 2 | 990 | 648 | **1144** | **1145** | 608 | 1191 | 1193 |
| 3 | 85 | **88** | 64 | 65 | 110 | 50 | 54 |
| 4 | 227 | **196** | **200** | **201** | **218** | **198** | **200** |


5.2 3P sampling

The phrase ‘3P sampling’ is a shorthand for ‘sampling with Probability Proportional to Prediction’, a method of estimation introduced to the statistics community by Horvitz and Thompson (1952), and to forestry by Grosenbaugh (1964). From a forest containing N trees, an initial sample of n trees is selected, with the aim of estimating some characteristic of interest. Most commonly this is V, the total volume of timber present. Let x denote some other characteristic that is highly correlated with the characteristic of interest. For volume, examples might be height (measured or estimated), basal area, or a visual estimate of the tree volume. Let $x_{\max}$ and $x_{\min}$ be the (possibly hypothetical) maximum and minimum values anticipated for this characteristic.

In a process sometimes referred to as double sampling, random numbers are used to select a subsample of the initial sample. Let r be a random number and let $x_i$ be the value of the characteristic for the ith individual. The selection rule is that this individual is included in the second sample if

$$r < (x_i - x_{\min})/(x_{\max} - x_{\min}).$$

This procedure implies that individuals with large x-values are more likely to be selected than those with small x-values. Since the next step is to accurately measure the characteristic of interest (in this case, volume), the implication is that the most important trees are given most attention.

Suppose that m of the initial sample of n trees are selected and denote the volume of the ith tree by $v_i$. The estimate of the total volume, V, is given by

$$\hat{V} = \sum_{i=1}^{N} x_i \times \frac{\sum_{i=1}^{m} v_i/(x_i - x_{\min})}{\sum_{i=1}^{m} x_i/(x_i - x_{\min})}.$$

Usually the minimum value for x will be zero, in which case the expression simplifies to

$$\hat{V} = \frac{1}{m} \sum_{i=1}^{N} x_i \times \sum_{i=1}^{m} \frac{v_i}{x_i}. \tag{5.5}$$

Although primarily used to estimate volume of wood, 3P sampling has a far wider application. A useful discussion is provided by West (2011).
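The selection rule and the simplified estimator (5.5) can be illustrated with a small sketch. Everything numerical here is hypothetical: the predicted values x, the maximum anticipated value, and the ‘measured’ volumes, which are generated as exactly 1.1 times the predictions so that the behaviour of the estimator is easy to verify. The function names are illustrative only.

```python
import random

def select_3p(x, x_min, x_max, rng):
    """Indices of individuals kept by the rule r < (x_i - x_min) / (x_max - x_min)."""
    return [i for i, xi in enumerate(x)
            if rng.random() < (xi - x_min) / (x_max - x_min)]

def volume_estimate_3p(x_all, x_sel, v_sel):
    """Equation (5.5), valid when x_min = 0.

    x_all: the x-values entering the first sum of (5.5)
    x_sel, v_sel: predictions and measured volumes of the m selected trees
    """
    m = len(x_sel)
    if m == 0:
        raise ValueError("no trees were selected")
    return sum(x_all) / m * sum(v / x for v, x in zip(v_sel, x_sel))

rng = random.Random(1)               # seeded only to make the run repeatable
x = [0.4, 1.2, 2.0, 0.8, 1.6]        # hypothetical visual volume estimates
x_min, x_max = 0.0, 2.0

selected = select_3p(x, x_min, x_max, rng)
x_sel = [x[i] for i in selected]
v_sel = [1.1 * xi for xi in x_sel]   # pretend measurement: always 10% above prediction

v_hat = volume_estimate_3p(x, x_sel, v_sel)
print(selected, v_hat)
```

The tree whose prediction equals the anticipated maximum is always selected, since a random number in [0, 1) is always less than 1. Because every measured volume is 1.1 times its prediction, the estimate equals 1.1 times the sum of the predictions whichever trees happen to be chosen, which shows why a well-correlated x makes the method efficient.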

5.3 Bitterlich sampling

Most species of tree are characterized by having a trunk that can be regarded as a reasonably straight line that is either approximately vertical (if still growing), or approximately horizontal (if felled). Furthermore, trees tend to be in proportion: wider trees are taller and have greater volume. The methods in this section, and Section 5.4, exploit these characteristics.

The method introduced by Bitterlich (1948) is for use with standing trees. At each sampling point, the number of trees that have a trunk that subtends an angle greater than some pre-specified angle, θ, is recorded. Special instruments called relascopes are available to assist with the angle judgement. The method goes under several alternative names, including the angle-count method, and point sampling. The procedure may be regarded as a special case of 3P sampling (West, 2011).


Figure 5.5 A tree of diameter d at a distance r from a sampling point subtending an angle θ at the sampling point P.

Suppose that r is the maximum distance at which a tree with basal diameter¹ d subtends an angle of θ or more (see Figure 5.5). To be included in the sample the tree must therefore be located somewhere in a ‘catchment’ area of $\pi r^2$ surrounding the sampling point. Since a tree of basal diameter d has a basal area of $\pi d^2/4$, the ratio of basal area to catchment area, β, is given by $d^2/(4r^2)$.

Now consider another tree, of diameter kd. This tree will subtend an angle θ or more, up to a distance kr from the sampling point (see Figure 5.6). Since this tree, which has basal area $\pi k^2 d^2/4$, is located somewhere in a catchment area $\pi k^2 r^2$, the ratio of basal area to catchment area is again $\beta = d^2/(4r^2)$. For fixed θ, the quantity β, known to foresters as the basal area factor or BAF, will be the same for all distances.

Suppose that, in a region of area A, there are n sampling points, with $t_i$ trees of a particular species subtending an angle of at least θ at sampling point i. Using only the information from sampling point i, the total basal area of the trees of that species in the region will be estimated by $A\beta t_i$. Since each sampling point gives an equally valid estimate, the pooled estimate of the total basal area is given by

$$\hat{B} = \frac{A\beta}{n} \sum_{i=1}^{n} t_i = A\beta\bar{t}, \tag{5.6}$$

where $\bar{t}$ is the average number of trees counted at a sampling point. A confidence interval could be based on the variability of the individual estimates. An intriguing feature is that $\hat{B}$ is calculated without any actual measurements of individual trees.

Advice on data collection
One question concerns the choice of value for θ. Too small a value would result in large numbers of eligible trees, some of which might be obscured by nearer vegetation. A recommended choice (Marshall, Iles and Bell, 2004) is a value for θ that results in about 4 to 10 trees being selected from each sampling point.
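To see how the angle θ feeds into the basal area estimate (5.6), the following sketch may help. It assumes a circular trunk cross-section and a distance r measured to the centre of the trunk, in which case the limiting distance satisfies $\sin(\theta/2) = (d/2)/r$ and hence $\beta = d^2/(4r^2) = \sin^2(\theta/2)$. That geometric step is an addition to the text rather than something stated in it, and the angle, region area, and tree counts are hypothetical.

```python
import math

def basal_area_factor(theta):
    """beta = d^2 / (4 r^2); with r = d / (2 sin(theta / 2)) this is sin^2(theta / 2).

    theta is in radians. Assumes a circular trunk and r measured to its centre.
    """
    return math.sin(theta / 2) ** 2

def total_basal_area(counts, area, beta):
    """Equation (5.6): A * beta * (mean count per sampling point)."""
    return area * beta * sum(counts) / len(counts)

theta = 0.04                  # hypothetical critical angle, in radians
area = 10_000.0               # hypothetical region of one hectare, in m^2
counts = [6, 9, 4, 7, 5]      # hypothetical tree counts at n = 5 sampling points

beta = basal_area_factor(theta)
b_hat = total_basal_area(counts, area, beta)
print(f"beta = {beta:.6f}, estimated basal area = {b_hat:.1f} m^2")
```

In practice θ would be tuned in the field until the counts fall in the recommended range of about 4 to 10 trees per point; a larger angle gives a larger β and therefore fewer qualifying trees.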

Figure 5.6 A tree of diameter d at a distance r from a sampling point subtends the same angle as a tree of diameter kd at a distance kr from the point.


With a fixed-plot scheme, the catchment region leading to inclusion of a tree is the same for every tree, but with Bitterlich sampling the catchment area for a large tree is greater than for a small tree (see Figure 5.7). For this reason the method is also called variable-plot sampling or sampling with probability proportional to size. Given that foresters refer to the transition between sampling points as a ‘cruise’, yet another description is plotless cruising.

Figure 5.7 For tree X to be sampled, the sampling point must lie within the left-hand circle shown. Because tree Y is larger, the circle that surrounds it is larger. Thus Y has a greater probability of being selected by a randomly positioned sampling point.

Although an estimate of the total basal area can be obtained without making measurements on individual trees, this is not the case for volume, which requires measurement of basal area and height, together with the use of species-specific equations or look-up tables. Denoting the basal area of the jth tree at the ith sampling point by $b_{ij}$, with $v_{ij}$ being the corresponding volume, the total volume of timber, V, is estimated by

$$\hat{V} = \frac{A\beta}{n} \sum_{i=1}^{n} \sum_{j=1}^{t_i} \frac{v_{ij}}{b_{ij}}. \tag{5.7}$$

This is the average over the n sampling points of their separate estimates of total volume, with the information from each tree being inversely weighted according to the catchment area of the tree. In the forestry literature the ratio of volume to basal area is referred to as VBAR.

The estimated total basal area varies considerably from one sample point to another (reflecting, in part, the tendency for trees to appear in clusters). By contrast VBAR does not vary greatly from one tree to another. This suggests that, rather than using a relatively small number of sampling points, with every tree being individually measured, it will be more efficient to use a larger number of sampling points while only measuring a subsample of the trees encountered (an application of the 3P procedure of Section 5.2).

Using L as a subscript for the large sample and S as a subscript for the subsample, an estimate of volume, $\hat{V}_S$, can be calculated using Equation (5.7) applied to the measurements of the trees in the subsample. Using Equation (5.6), two estimates of the basal area can be calculated: one from all the data, $\hat{B}_L$, and one from the subset alone, $\hat{B}_S$. To take account of the possibility that the subsample was slightly unrepresentative, an adjusted estimate of volume, $\hat{V}_{\text{adj}}$, is then given by

$$\hat{V}_{\text{adj}} = \frac{\hat{B}_L}{\hat{B}_S} \hat{V}_S. \tag{5.8}$$

In order to select the subsample, Marshall, Iles and Bell (2004) suggested that, at each sampling point, in addition to selecting trees that subtended the angle θ, a second selection should be made using the larger angle Θ. This procedure is referred to as ‘big BAF’. They suggested that the size of Θ should be chosen so that about 10 to 15 trees would be selected across the entire set of $n_L$ sampling points. A detailed account that explores the interrelation between variability, cost and precision is provided by Yang et al. (2017).
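Equations (5.6) to (5.8) can be chained together in a short sketch of the ‘big BAF’ calculation. All of the numbers are hypothetical, as are the function names. One reading is assumed and should be treated as such: the subsample quantities $\hat{B}_S$ and $\hat{V}_S$ are computed with the basal area factor belonging to the larger angle Θ, since that is the angle used to pick the measured trees.

```python
def basal_area(counts, area, beta):
    """Equation (5.6): A * beta * (mean count per sampling point)."""
    return area * beta * sum(counts) / len(counts)

def volume(vbars, area, beta):
    """Equation (5.7); vbars[i] lists v_ij / b_ij for the trees measured at point i."""
    return area * beta * sum(sum(point) for point in vbars) / len(vbars)

area = 10_000.0             # hypothetical region of one hectare, in m^2
beta_count = 4e-4           # hypothetical BAF for the counting angle theta
beta_big = 16e-4            # hypothetical BAF for the larger angle Theta

counts_large = [6, 9, 4, 7]                  # trees counted with theta at 4 points
vbars_sub = [[8.0], [9.0, 7.5], [], [8.5]]   # VBAR (m) of trees picked with Theta

b_large = basal_area(counts_large, area, beta_count)
b_sub = basal_area([len(p) for p in vbars_sub], area, beta_big)
v_sub = volume(vbars_sub, area, beta_big)

v_adj = b_large / b_sub * v_sub              # Equation (5.8)
mean_vbar = sum(sum(p) for p in vbars_sub) / sum(len(p) for p in vbars_sub)
print(v_adj, b_large * mean_vbar)
```

Under this reading the factor for Θ and the number of points cancel in the ratio $\hat{V}_S/\hat{B}_S$, so the adjusted volume is simply the large-sample basal area multiplied by the mean VBAR of the measured trees. The two printed values coincide for that reason.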

5.4 Perpendicular distance sampling (PDS)

Williams and Gove (2003) adapted the ideas underlying Bitterlich sampling in order to estimate the total volume of ‘coarse woody debris’ (CWD) in an area of interest. CWD principally refers to fallen tree trunks, or logs of wood, with diameters greater than some minimum quantity (typically 7.5 cm). When using PDS two measurements are required. One is the perpendicular distance, d, of the log from the sampling point (nearby logs are ignored if no perpendicular distance exists: see Figure 5.8). When estimating volume, the second measurement is s, the cross-sectional area of the log at the point where the perpendicular meets the log. The log is selected if d is less than cs, for some specified value of c (which has dimensions of length$^{-1}$). In effect, each log is surrounded by its own catchment region which has an area proportional to the log’s volume. A log is selected only if the sampling point lies in its catchment. Figure 5.9 shows a typical catchment.

Figure 5.8  Three logs (A, B and C) lie near P, a sampling point. Log A is not selected as no perpendicular distance exists. Whether logs B and C are selected will depend on their cross-sectional areas at the feet of the respective perpendiculars.

Figure 5.9  The shaded region represents a tapering log. The log will be selected only if the sampling point lies within the dotted catchment region. In this case the log would be selected by sampling points at P and Q, but not by a sampling point at R.

The size of the catchment area is directly proportional to the volume of the log. To see that this is so, suppose that log $j$ is divided into $N$ thin sections, each of thickness $h/N$ (so that $h$ is the length of the log). Suppose that the cross-sectional area of the $i$th section is $s_i$. The volume of the log, $v_j$, is therefore given by

$$v_j = \sum_{i=1}^{N} s_i \frac{h}{N}.$$

Now consider the catchment area for this log. Since the cross-sectional area of the $i$th section is $s_i$, the width of the corresponding catchment section is $cs_i$ on either side of this point of the log. The total catchment area, $a_j$, is therefore given by

$$a_j = 2c \sum_{i=1}^{N} s_i \frac{h}{N},$$

which is directly proportional to the volume of the log since $a_j = 2cv_j$.
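The proportionality between catchment area and volume can be checked numerically. The following base R sketch divides a hypothetical tapering log into thin sections and compares the summed catchment area with the summed volume. The log dimensions and the value of c are illustrative assumptions, not values from the text.

```r
# Hypothetical tapering log: length h, diameter falling linearly
# from 0.40 m to 0.20 m. All values are illustrative assumptions.
h  <- 6       # log length (m)
N  <- 1000    # number of thin sections
cc <- 50      # the PDS constant c (m^-1); named cc to avoid masking base::c

diam <- seq(0.40, 0.20, length.out = N)  # diameter of each section (m)
s    <- pi * (diam / 2)^2                # cross-sectional areas s_i (m^2)

v_j <- sum(s * h / N)           # volume: sum of s_i * h/N
a_j <- sum(2 * cc * s * h / N)  # catchment: strips of width c * s_i on each side

ratio <- a_j / v_j              # equals 2c whatever the taper
```

Changing the taper (the `diam` vector) changes both `v_j` and `a_j`, but leaves their ratio at 2c.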



The probability that, in a study region of size $A$, a randomly placed sampling point lies in the catchment for log $j$ is $p_j = a_j/A$. Suppose that, for sampling point $k$, a total of $m$ logs are selected. An unbiased estimate of the total volume is

$$\hat{V}_k = \sum_{j=1}^{m} \frac{v_j}{p_j} = A \sum_{j=1}^{m} \frac{v_j}{a_j} = A \sum_{j=1}^{m} \frac{v_j}{2cv_j} = \frac{mA}{2c}. \tag{5.9}$$

With $n$ sampling points the pooled estimate is given by

$$\hat{V} = \frac{1}{n} \sum_{k=1}^{n} \hat{V}_k. \tag{5.10}$$
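Equations (5.9) and (5.10) need only the tally of selected logs at each sampling point. A minimal base R sketch follows, in which the tallies, the region area and the value of c are all hypothetical. The t-based interval is one simple way of using the between-point variability and is an assumption here, not a method prescribed by the text.

```r
# Hypothetical tallies: number of logs selected at each of n = 8 points.
m  <- c(5, 7, 4, 9, 6, 5, 8, 6)
A  <- 10000   # area of the study region (m^2); assumed
cc <- 200     # the PDS constant c (m^-1); assumed

V_k   <- m * A / (2 * cc)   # Equation (5.9): one estimate per sampling point
V_hat <- mean(V_k)          # Equation (5.10): pooled estimate

# Precision from the variability of the separate estimates
n  <- length(V_k)
se <- sd(V_k) / sqrt(n)
ci <- V_hat + c(-1, 1) * qt(0.975, df = n - 1) * se  # approximate 95% interval
```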

A confidence interval for the overall volume would be based on the variability of the n separate estimates.

Williams, Ducey, and Gove (2005) adapted the original procedure to measure surface area (which is relevant since it may provide shelter or habitat for other organisms). The procedure is identical to that for volume, except that the critical measurement is the circumference of the log, rather than its cross-sectional area. From a sampling point, a log is counted if its perpendicular distance is less than some predetermined multiple, C, of the circumference. Note that, unlike c, C is a dimensionless quantity. For a sampling point that counts M logs, the total surface area is then, by analogy with Equation (5.9), estimated as MA/(2C). An overall estimate of surface area is provided by the average of the n sample estimates, with its precision determined by the variance of the separate estimates.

Advice on data collection
Using PDS there is no need to measure the diameters of every piece of CWD. Only borderline cases will need precise measurement. The values of c and C should be chosen so that between 4 and 10 logs are selected at most sampling points.

Gove et al. (2013) provide a comprehensive comparison of PDS methods (including the distance-limited version introduced in the next section).

5.4.1  Distance-limited PDS

One potential problem with the PDS procedure is that a relevant log might be a considerable distance from the sampling point. The answer suggested by Ducey et al. (2013) is to impose an upper limit, $d_{\max}$, on that distance, as illustrated in Figure 5.10.

Define $s_{\max}$ by $s_{\max} = d_{\max}/c$, so that $s_{\max}$ is the cross-sectional area corresponding to $d_{\max}$. The log is selected if either of two situations occurs:

Case 1: $s \le s_{\max}$ and $d \le cs$.
Case 2: $s > s_{\max}$ and $d \le d_{\max}$.

Suppose that at sampling point $k$ there are $m_1$ case 1 selections, and that the $m_2$ case 2 selections have cross-sectional areas $s_1, s_2, \ldots, s_{m_2}$. For that sampling point the estimate of volume (Ducey et al., 2013) is

$$\hat{V}_k = \frac{m_1 A}{2c} + \frac{A}{2d_{\max}} \sum_{i=1}^{m_2} s_i, \tag{5.11}$$

with the overall estimate of volume being the average of the n separate estimates.

If s refers to circumference rather than area, and c is replaced by the dimensionless C, then the equivalent formula provides a distance-limited estimate of total surface area.

An alternative approach to the measurement of CWD was discussed earlier in Section 2.4.4. A comprehensive review of alternative methods is provided by Russell et al. (2015).

Figure 5.10  As previously, the shaded region represents a tapering log. With distance-limited PDS, the log will be selected only if the sampling point lies within the truncated dotted catchment region.
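The two selection cases and Equation (5.11) can be sketched in base R as follows. The function name `dl_pds_volume`, the five logs and all parameter values are hypothetical and serve only to illustrate the arithmetic.

```r
# Distance-limited PDS estimate for one sampling point, Equation (5.11).
# d: perpendicular distances (m); s: cross-sectional areas (m^2) at the
# foot of each perpendicular; A: region area (m^2); cc: the constant c (m^-1).
dl_pds_volume <- function(d, s, A, cc, d_max) {
  s_max <- d_max / cc
  case1 <- s <= s_max & d <= cc * s   # Case 1: ordinary PDS selection
  case2 <- s >  s_max & d <= d_max    # Case 2: large log, truncated catchment
  sum(case1) * A / (2 * cc) + A / (2 * d_max) * sum(s[case2])
}

# Hypothetical logs near one sampling point
d <- c(2.0, 9.0, 4.0, 12.0, 1.0)
s <- c(0.02, 0.03, 0.08, 0.10, 0.01)
V_k <- dl_pds_volume(d, s, A = 10000, cc = 200, d_max = 10)
```

With these values two logs are case 1 selections, one is a case 2 selection, and two are rejected, so the estimate is 2 × 10000/400 + (10000/20) × 0.08 = 90.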

Part III

Mobile individuals

With moving objects there are two principal reasons why the number counted is likely to be less than the number present. One reason is simply that a creature that might have been counted is elsewhere in its territory at the time of counting. The other reason is simply that the creature is not seen! Birds are particularly problematical: there is the bird that flies behind the observer; the bird that dives under water just before the observer looks in that direction; the bird that vanishes into the foliage before it can be identified. Animals also cause difficulties: a small animal may be concealed behind a large bush, or behind the other creatures between it and the observer. Creatures under water also present obvious difficulties.

To estimate the number of moving targets it is therefore necessary to make assumptions. This requires mathematical models. Models will include unknowns that must be estimated. The estimation process is rarely simple, so the methods in this Part make relatively heavy use of computers and the examples include computer code.

6.  Quadrats, transects, points, and lines – revisited

6.1  Box quadrats

The quadrats discussed at length in Chapter 2 were essentially two-dimensional, and not useful for a mobile individual that can ‘escape’ from the quadrat before being counted. For small ground-living creatures, one solution is to add vertical sides to the quadrat to form a ‘box’ that will trap the individuals of interest. The resulting data would be analysed in the same way as for ordinary quadrat data.

6.2  Strip transects

A strip transect is simply a narrow rectangular quadrat. Typically, the observer travels along the centre of the strip, recording all individuals as they are encountered. If it is believed that every individual is counted, then the methods discussed in Chapter 2 are appropriate. However, in most cases, the probability of detecting an individual reduces with increasing distance between the individual and the observer, so that the number recorded is less than the number present.

If, for each detected individual, the approximate distance from the observer to that individual is recorded, then the methods of Chapter 8 are appropriate. If the distance is not recorded, but it can be assumed that the proportion of individuals that elude detection is the same from year to year (or site to site), then the numbers provide an index of relative abundance. This is the basis of the many examples of citizen science, where untrained observers record what they see on particular days or particular walks. Their aggregated findings can provide reliable year-to-year trends for the species concerned.
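An index of relative abundance of this kind is usually expressed relative to a base year. A minimal base R sketch with hypothetical yearly counts (the years and numbers are invented for illustration):

```r
# Hypothetical yearly strip-transect counts for one species.
counts <- c("2015" = 120, "2016" = 132, "2017" = 108, "2018" = 96)

# Index relative to the first year (set to 100). The index tracks true
# abundance only if the undetected proportion is the same in every year.
index <- 100 * counts / counts[1]
```

Here the index falls from 100 to 80 over the four years, which would be read as a 20% decline provided that detectability has not changed.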

6.2.1  Bats

In the UK, the National Bat Monitoring Programme was introduced in 1996. Volunteer walkers use ultrasonic detectors to detect bats by their calls. Depending on the target species, the transects are either triangular within a pre-specified randomly chosen 1 km square, or follow the path of a waterway. The transect counts could be used to provide an index of change.¹

For North America, Britzke and Herzog (2009) suggested using mobile (driving) transects where the observer detects bats using a slow-moving car with an ultrasonic detector mounted on the roof. Their suggestions were incorporated into the protocol proposed by Loeb et al. (2015), which is applied to selected 10 km × 10 km quadrats across the continent. Key points are as follows:

Advice on data collection
• Drive slowly and steadily at 20 mph with headlights and hazard lights on.
• Avoid main roads, gated roads, rough gravel roads, or roads through dense forest.
• To avoid double counting, choose reasonably straight roads.
• Using a 10 km × 10 km square grid for mapping, the route should remain within a single square.
• Two surveys should be conducted within a fortnight during the bats’ maternity season.
• Surveys should begin 45 minutes after sunset on clear dry nights.

6.2.2  Marine fish

A trawl is in effect a line transect. Methods based on changes in the amount of fish caught are considered later in Section 6.7.2. Less invasive approaches to the measurement of fish abundance are introduced in Section 6.4.3.

6.2.3  Reef fish

Strip transects, in which a diver records what is seen, are the most common method used for sampling on coral reefs (Caldwell et al., 2016). For an underwater visual census (UVC), divers should follow a well-defined procedure, such as those described by Halford and Thompson (1994) and Labrosse, Kulbicki, and Ferraris (2002).

Advice on data collection
• Divers should work in pairs.
• One end of a (typically) 50 m tape is fixed to the sea bed.
• Before starting to record, allow fish time to acclimatize to the presence of the divers.
• Proceed steadily along the tape, unwinding the tape and working on one short section (say 3 m) at a time. Halford and Thompson (1994) recommend counting the fastest moving species first; Labrosse, Kulbicki, and Ferraris (2002) advise counting the most abundant species first.

Remote underwater video (RUV) techniques avoid the disturbance caused by divers, and effectively give extra time for the observer. A comprehensive review of the methods used, showing the increased sophistication of the equipment available, is given by Mallet and Pelletier (2014).

Well-camouflaged species may be massively under-reported by a UVC. For example, a study by Willis (2001) showed that just 26 spectacled triplefin (Ruanoho whero) were observed at a location where 292 were present. Zarco-Perello and Enríquez (2019) showed that the use of RUV could more than double the number of species reported using UVC.

An alternative approach to underwater monitoring has been suggested by Widmer et al. (2019). Their scheme uses a series of small waterproof cameras, fixed at intervals in a line along the edge of a transect. All the cameras are similarly oriented and simultaneously take a sequence of pictures, so that the entire transect is being constantly monitored. The authors term the procedure point-combination transect sampling.

6.3  Using frequency to estimate abundance

In Chapter 2, it was shown that frequency (a record of either 0 for absent, or 1 for present) might be used to estimate abundance. Royle and Nichols (2003) suggested that a similar approach could be used with bird records, if, at each sampling point, a series of counts were made during a period in which the population size could be assumed to be constant. Each point count should have the same duration and refer to the same size region.

The underlying idea is simple. At a site where a species is abundant, individuals will be observed on most visits; where it is rare, it will be rarely observed. Thus variations in the probability of detection will reflect variations in abundance.

The model proposed by Royle and Nichols (2003) (hereafter the RN model) has two components: one concerns the number of birds present at a site, and the other concerns the number observed at a site. For example, if birds occur at random across a region, then the number present around sampling point $i$, $N_i$, will be an observation from a Poisson distribution (see Section 1.4.2). The parameter of the distribution, $\lambda_i$, will reflect the density of the species in the neighbourhood of the site. If all the sites are similar then $\lambda_i = \lambda$, for all $i$. If there are $d$ distinct groups of sites, then there would be $d$ different $\lambda$-values.

Suppose that, for each of the $N_i$ individuals present at site $i$, the probability that it is detected on occasion $j$ is $p_{ij}$, independent of the detection of other individuals. The species will not be recorded at site $i$ only if all $N_i$ individuals remain undetected. The probability that the species is observed on visit $j$ is therefore $P_{ij}$, given by

$$P_{ij} = 1 - (1 - p_{ij})^{N_i}. \tag{6.1}$$

The value of the parameter $\lambda_i$ may depend on the characteristics of the sampling site. The value of $p_{ij}$ is likely to depend upon factors such as the weather, the time of day, the experience of the observer, and so forth. Models of this type have also been referred to as abundance-induced heterogeneity (AIH) models. The estimate of the overall abundance may be obtained by scaling up from the area of the sampling site to the area of the larger region being sampled.

Fiske and Chandler (2011) presented the versatile R package unmarked which can be used to fit many abundance models including the RN model described above. To be sure of obtaining valid parameter estimates, the estimation procedure works with $\log(\lambda)$ and $\log(p_{ij}/(1 - p_{ij}))$, so that some manipulation is required to make sense of the results obtained.
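To see how Equation (6.1) links detection to abundance, the following base R sketch evaluates it for a few values of N and then averages it over a Poisson distribution for N. The values of p and λ are illustrative assumptions. The closed form 1 − exp(−λp) follows from the Poisson probability generating function and is checked against a direct sum.

```r
# Equation (6.1): probability that the species is recorded on a visit,
# given N individuals present and per-individual detection probability p.
detect_prob <- function(N, p) 1 - (1 - p)^N

p <- 0.2                        # illustrative value
P_by_N <- detect_prob(0:5, p)   # rises from 0 towards 1 as N increases

# If N ~ Poisson(lambda), averaging Equation (6.1) over N gives
# 1 - exp(-lambda * p); compare with a direct (truncated) sum.
lambda <- 2                     # illustrative value
marginal_sum    <- sum(dpois(0:200, lambda) * detect_prob(0:200, p))
marginal_closed <- 1 - exp(-lambda * p)
```

The monotone increase of `P_by_N` is the reason why sites with more individuals yield detections on more visits.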


Example 6.1: Wood Thrushes in New Hampshire

Royle and Nichols (2003) examined the fit of the RN model to several sets of data including a set of observations of Wood Thrush (Hylocichla mustelina) at 50 sites in New Hampshire. A total of 11 visits were made to each site during a period of 30 days in 1991 as part of the North American Breeding Bird Survey. The data for the Wood Thrushes are provided as part of the unmarked package.

Table 6.1  (a) Summary of the numbers of occasions (out of 11) on which Wood Thrushes were observed at 50 sites. (b) Summary of the numbers of sites (out of 50) on which Wood Thrushes were observed on the 11 visits.

(a)
Number of occasions    0   1   2   3   4   5   6   7   8   9  10  11
Number of sites        5   8   7  11   2   1   1   4   4   1   3   3

(b)
Visit                  1   2   3   4   5   6   7   8   9  10  11
Number of sites       12  16  11  17  21  21  23  21  22  22  20

Table 6.1 summarizes the survey results. Part (a) shows that there were three sites where Wood Thrushes were always detected and five where they were never noted. Part (b) suggests that the numbers increased in the later part of the observation period.

Using unmarked, the analysis begins with the rather unlikely model in which the expected abundance is the same at each site (constant λ), and the probability of detection is the same at each sampling period (constant p).

    library(unmarked)   # Calls the appropriate R library
    data(birds)         # Assumed step: loads woodthrush.bin from the package

    # woodthrush.bin is the 50 × 11 matrix of 0s and 1s
    # that correspond to 'not detected' and 'detected'
    woodthrushUMF <- unmarkedFrameOccu(woodthrush.bin)

    # Assumed call: RN model with constant detection and constant abundance
    occuRN(~1 ~1, woodthrushUMF)

    Abundance:
     Estimate    SE    z P(>|z|)
        0.792 0.158 5.03   5e-07

    Detection:
     Estimate   SE     z  P(>|z|)
        -1.21 0.17 -7.14 9.41e-13

    AIC: 633.9534

The estimate of λ is exp(0.792) = 2.21 and the estimate of p is exp(–1.21)/(1 + exp(–1.21)) = 0.23.
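With constant λ and constant p, the RN likelihood depends on the data only through the number of detections at each site, so the fit can be mimicked from Table 6.1(a) without the package. The following base R sketch is an illustration of the model under that assumption, not the unmarked implementation; the upper limit K for the sum over N and the function name `rn_nll` are assumptions. The resulting estimates can be compared with the values quoted in the text.

```r
# Number of sites (out of 50) with 0, 1, ..., 11 detections: Table 6.1(a).
y      <- 0:11
n_site <- c(5, 8, 7, 11, 2, 1, 1, 4, 4, 1, 3, 3)
J <- 11   # visits per site
K <- 50   # upper limit for the sum over N (an assumption)

# Negative log-likelihood, parameterized by log(lambda) and logit(p)
rn_nll <- function(par) {
  lambda <- exp(par[1])
  p <- plogis(par[2])
  N <- 0:K
  P <- 1 - (1 - p)^N                     # Equation (6.1) for each possible N
  site_lik <- sapply(y, function(yy) {
    sum(dpois(N, lambda) * P^yy * (1 - P)^(J - yy))
  })
  -sum(n_site * log(site_lik))
}

fit <- optim(c(0, 0), rn_nll, method = "BFGS")
lambda_hat <- exp(fit$par[1])    # back-transform from the log scale
p_hat      <- plogis(fit$par[2]) # back-transform from the logit scale
aic        <- 2 * fit$value + 2 * 2   # two estimated parameters
```

The function `plogis(x)` computes exp(x)/(1 + exp(x)), which is the back-transformation used for p in the text.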

This is the start of the analysis, not the end! Table 6.1 (b) showed that the number of sites where Wood Thrushes were detected was much higher during the final seven visits than during the first four. This might reflect a change in behaviour of the birds brought on by increased visibility as a result of the need to feed young. The analysis could proceed as follows:

    earlylate <- [...]

    Detection:
                Estimate    SE     z  P(>|z|)
    (Intercept)    -1.63 0.205 -7.98 1.52e-15
    earlylate2      0.65 0.181  3.58 3.39e-04

    AIC: 622.4348

The first entry, (Intercept), refers to the first value of the vector earlylate and gives an estimate of p as exp(–1.63)/(1 + exp(–1.63)) = 0.16. The second entry refers to the difference between the two periods. Since –1.63 + 0.65 = –0.98, the probability of detection in the second period is estimated to have increased to exp(–0.98)/(1 + exp(–0.98)) = 0.27. The lower AIC value for this model suggests that this model should be preferred to its predecessor.

In reality, the 50 sampling sites are unlikely to be equally suitable for Wood Thrushes. If there is information about the sites, then it is possible to model the effect of these background factors. To illustrate the approach, suppose that sites 1 to 12 are near water, and that sites 11, 12, and 41 to 50 have many trees. Then the analysis would continue:

    water <- [...]

    Detection:
                Estimate    SE     z  P(>|z|)
    (Intercept)   -2.153 0.271 -7.94 1.97e-15
    earlylate2     0.607 0.178  3.41 6.55e-04

    AIC: 581.5511

This model considers four types of site, each with its own estimate of abundance. For sites without trees, but near water: exp(2.349) = 10.5. For sites without trees and not near water: exp(2.349 – 1.391) = 2.6. For sites with trees and near water: exp(2.349 – 0.924) = 4.2. Finally, for sites with trees but not near water: exp(2.349 – 1.391 – 0.924) = 1.0. Both tail probabilities (the entries in the P(>|z|) column) are very small, suggesting that both background factors are relevant. The much smaller tail probability associated with proximity to water would indicate that this was the dominant feature.² Since each point count refers to an area of π/16 square miles, multiplying the λ-values by 16/π would give the estimated density per square mile.
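The back-transformations for the four site types, and the conversion to a density, can be collected in a few lines of base R. The coefficient values are those quoted in the text; the variable names are hypothetical labels chosen for this sketch.

```r
# Abundance coefficients on the log scale, as quoted in the text
b0      <- 2.349    # baseline: no trees, near water
b_dry   <- -1.391   # effect of not being near water
b_trees <- -0.924   # effect of having many trees

lambda <- exp(c(
  no_trees_water = b0,
  no_trees_dry   = b0 + b_dry,
  trees_water    = b0 + b_trees,
  trees_dry      = b0 + b_dry + b_trees
))

# Each point count covers pi/16 square miles (a circle of radius 1/4 mile),
# so dividing by that area gives birds per square mile.
density <- lambda * 16 / pi

# Detection probabilities for the same model, from the logit scale
p_early <- plogis(-2.153)
p_late  <- plogis(-2.153 + 0.607)
```

Rounding `lambda` to one decimal place reproduces the four abundance values given in the text.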

6.4  Point counts (point transects)

Point-count sampling generally requires one or more observers to go to a specified point, and report what is there. It is not expected that a point count will detect all individuals present, but its magnitude will depend upon the conditions under which it is made. For example, a novice observer, with poor sight, counting birds in eclipse plumage on a misty morning, is likely to count fewer birds than a sharp-sighted experienced observer counting the same birds in their breeding plumage on a sunny day. Also, if twice as many individuals of species A are detected, compared to the number for species B, then this does not imply that species A is twice as abundant as species B. That may be true, but it may also be true that species A is more easily detected.

However, a comparison of the counts for a particular species made in different years can provide a valid index of change in population size, provided that the counts are made under comparable conditions. Comparability is also likely if the counts have been aggregated across many sampling points, as is the case for national surveys.

6.4.1  Birds

In order for the information from different observers at different locations to be comparable, there needs to be a clear set of instructions that are followed by all. Ralph, Droege, and Sauer (1995) (henceforth RDS) set out a list of 28 recommendations that covered all aspects of the counting process. However, when Matsuoka et al. (2014) (henceforth M) studied 20 years of bird count data from Canada and Alaska, they found that fewer than 3% of the point counts had exactly followed the recommendations concerning the length of the count and the count radius. They endorsed the earlier recommendations, which are given below, together with some elaborations suggested by M.


Advice on data collection
• Count radius: Infinite. RDS specified that the numbers observed within 50 m, and the numbers recorded at greater distances, should both be recorded. M suggested that the ideal would be exact distances (so that the methods of Chapter 8 could be employed). Failing that, they suggest using at least four distance ranges, with 50 m being one of the end points.
• Count period: RDS suggested counting for either 5 or 10 minutes. Numbers should be recorded separately for the periods 0–3 minutes, and 3–5 minutes (and 5–10 minutes, if the longer period is used). M suggested subdividing the 5–10-minute interval into two, with the division at 8 minutes. M suggested that consideration might be given to recording subsequent detections of a bird previously detected (so that the methods of Chapter 7 might be used).
• Number of observers: RDS had in mind a single observer. However, using two observers allows the methods of Section 6.5 to be used.
• Number of visits: RDS anticipated single visits per season, but multiple visits may give improved estimates. Thus, MacLeod et al. (2012) found that ‘A single five-minute count had c. 60% chance of detecting bellbirds [Anthornis melanura] at a location where they were present, while the cumulative detection probability increased to almost one after five repeat counts per survey.’ Note that one visit is likely to be sufficient if distance methods (Chapter 8) are used.

The recommendations by RDS were put together so as to allow some degree of comparability across different studies, but, as M noted, there are considerable variations. Thus the North American Breeding Bird Survey (BBS), which has been running since 1966, uses 3-minute counts. The New Zealand forest counts in the 1980s were for 5 minutes (Hartley, 2012). Gibbons and Gregory (2006) recommend normally using 5 or 10 minutes, with longer counts in forests. However, the use of an extended time interval may lead to overestimation, since some individuals may be double-counted and others may be included that were simply passing through the region. For this reason, Smith et al. (1998) recommended a maximum duration of 10 minutes.

The arrival of the observer at the counting point may scare individuals away, so it is advisable to avoid under-estimation by waiting for a fixed time before commencing the count. Gibbons and Gregory (2006) suggest one minute for birds.

For the BBS, each observer is assigned a 24.5 mile route, with sampling points at either end, and at half-mile intervals between. There are more than 4000 such routes, with start points and travel directions chosen at random, with the aim of providing an even covering across the entirety of North America. In most cases the routes follow minor roads, so that the sampler will not be disturbed by traffic noise. Each route is sampled once a year at the height of the breeding season. The guidelines given above are not precisely followed, since an upper distance bound of 400 m is used. Here are some more general BBS instructions:


Advice on data collection
• Avoid sampling on a day with poor visibility, rain, or strong winds.
• Count during the early morning.
• A full 360° scan should be used.
• Keep track of the locations of individuals, noting if several are simultaneously present.
• It may help to record positions on paper, but avoid looking down at your notes for extended periods.

Buckland, Marsden, and Green (2008) warned that

‘In many bird monitoring surveys, no attempt is made to estimate bird densities or abundance. Instead, counts of one form or another are made, and these are assumed to correlate with bird density. Unless complete counts on sample plots are feasible, this approach can easily lead to false conclusions, because detectability of birds varies by species, habitat, observer and many other factors. Trends in time of counts often reflect trends in detectability, rather than trends in abundance. Conclusions are further compromised when surveys are conducted at unrepresentative sites.’

Because of these problems, they advise using distance methods (see Chapter 8). Their criticism is supported by the results of the seven-year study reported by Norvell, Howe, and Parrish (2003).

6.4.2  Butterflies

The North American Butterfly Association has co-ordinated point counts since 1993. It requires teams of volunteer observers to count all butterflies seen within a single day, within circles of radius 7.5 miles around pre-specified points. Comparisons of the results across years are used to monitor changes in butterfly populations, and to study the effects of weather and habitat change.

6.4.3  Remote underwater surveying
According to the manual produced by Johannesson and Mitson (1983), the first scientific record of the use of an acoustic method for detecting fish was a Japanese publication in 1929. A very readable account of the subsequent development of acoustics for the assessment of the abundance and ecology of fish, and other marine life, is provided by Fernandes et al. (2002). The quantity measured is amount of fish, rather than number of fish. Underwater photography can provide more detailed information. In an overview of the use of videos of fish coming to bait, Cappo, Harvey, and Shortis (2006) claim that ‘The use of remote, baited “video fishing” techniques offer standardized, non-extractive, methodologies for estimating relative abundance of a range of marine vertebrates and invertebrates, with the option of very precise and accurate length and biomass estimates when stereo-camera pairs are used.’ Harvey et al. (2007) assessed the number of fish present by determining the value of maxN, defined as the maximum number of fish seen at any one time. Despite its name, maxN provides a lower bound on the number of fish present. Denney et al. (2017) suggest that improved accuracy is gained by using a rotating camera. Marini et al. (2018) report promising results using genetic programming to automate the counting of fish in video footage. They find that the most important requirement is to avoid bio-fouling of the camera.
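The definition of maxN given above can be illustrated with a minimal Python sketch. The function name and the frame counts are hypothetical, chosen only for illustration; they are not taken from Harvey et al. (2007).

```python
def max_n(counts_per_frame):
    """Return maxN: the largest number of fish visible in any single frame.

    An empty sequence of frames gives 0.
    """
    return max(counts_per_frame, default=0)


# Hypothetical counts of one species in successive video frames.
frame_counts = [0, 2, 3, 1, 4, 4, 2, 0]

# At least this many fish must have been present, since they were
# all in view at the same moment.
print(max_n(frame_counts))
```

Summing the frame counts would count the same fish repeatedly, which is why the maximum, rather than the total, is used as the bound.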

6.4.4  Camera traps
A camera trap is a remotely activated camera, equipped with an appropriate sensor, that collects a series of images of creatures that pass. Burton et al. (2015) provide a comprehensive review of previous studies that used camera traps. Niedballa et al. (2016) introduced the R package camtrapR which provides management of camera trap data, and an interface to relevant R packages.

Camera trap data are often summarized using a relative abundance index (RAI), which may be calculated either as the proportion of photographs containing the species of interest, the average number of individuals per photograph, or the number of distinct animals (if the creatures in question are distinctively marked).

A camera situated by a known trail will be more successful than one in a thicket, while a camera with high resolution and wide range will be more successful than a poor quality camera with a narrow range of vision. Therefore, comparisons of two sets of RAI values will only be reasonably reliable if the equipment and sampling procedures are comparable. However, for a permanent array of cameras, variations in the numbers of individuals observed from year to year should provide information concerning changes in abundance.

6.4.4.1  Trapping region
The region effectively scanned by a camera depends upon its distance above ground and the size of the targeted species. However, the regions scanned by identical cameras set at identical heights may still differ. This is because there may be variations in the inclination of the local terrain relative to the camera, and also in the amount of cover in the vicinity of the camera.

Rowcliffe et al. (2008) proposed a random encounter model where the creatures of interest were supposedly distributed at random, and moving in random directions. They showed that the estimated density, λ, was then given by

$$\lambda = \frac{\alpha\pi}{vr(2 + \theta)}, \tag{6.2}$$

where α is the number of pictures taken by a camera in unit time, v is the speed of the target, and r and θ define the camera detection zone. In practice, individuals are likely to be moving along well-worn paths, but the model is reasonable providing that the cameras have not been positioned specifically to cover these paths.

If individuals are moving in groups, then the total number present would be estimated as

$$\lambda \bar{g} A, \tag{6.3}$$

where $\bar{g}$ is the average group size and A is the area of the target region.
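Equation (6.2) and expression (6.3) are simple enough to evaluate directly. The following Python sketch is an illustration only: the function names and all numerical inputs are hypothetical, and θ is assumed here to be measured in radians, with v and r expressed in the same distance unit.

```python
import math


def rem_density(alpha, v, r, theta):
    """Equation (6.2): density = alpha * pi / (v * r * (2 + theta)).

    alpha : photographs per camera per unit time
    v     : speed of the target (distance per unit time)
    r     : radius of the detection zone (same distance unit as v)
    theta : angle of the detection zone, assumed to be in radians
    The result is a density per unit area (distance unit squared).
    """
    return alpha * math.pi / (v * r * (2.0 + theta))


def total_present(density, mean_group_size, area):
    """Expression (6.3): density of groups x mean group size x area."""
    return density * mean_group_size * area


# Hypothetical inputs: 0.05 photographs per camera-day, animals moving
# 1.5 km per day, and a detection zone of radius 10 m and angle 40 degrees.
density = rem_density(alpha=0.05, v=1.5, r=0.010, theta=math.radians(40))
print(f"groups per km^2: {density:.2f}")
print(f"individuals in 30 km^2: {total_present(density, 1.7, 30.0):.0f}")
```

Because r and θ enter only through the product r(2 + θ), any uncertainty about the effective detection zone carries straight through to the density estimate.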




Figure 6.1  Detection of a passing creature depends on its speed, direction and proximity to the camera. It also depends upon the frame speed of the camera. The diagram indicates the paths of four individuals. Those at locations A and B are more likely to be detected than those on the parallel paths at C and D. Similar results occur whatever the paths.

Figure 6.1 shows that a target is more likely to be detected if it passes near the centre of the field of view because, in effect, the camera has a longer time in which to detect the individual’s presence. The following example suggests that the effective field of view of a camera may be best determined after a study of the results.

Example 6.2: Bawean warty pigs

Rademaker et al. (2016) conducted a study of the warty pig Sus blouchi on Bawean island, Indonesia. Their data, which are available at the public repository http://data.4tu.nl, provide information concerning the positions of pigs when first sighted.

Figure 6.2  The locations of groups of pigs relative to the camera when first detected (horizontal axis: distance in front, m; vertical axis: distance to side, m). The results for all cameras are superimposed. Numbers are group sizes.

Figure 6.2 superimposes the results from 34 cameras, with sightings at angles up to 60° and distances up to 14 m. The figure suggests that the region covered by the cameras may not be a simple sector: at 60° the most distant pig recorded was 5 m away; at 25° the most distant pig recorded was about 9 m away; the three most distant records refer to pigs noted at angles of 10° or less. A reasonable approximation for the catchment region for these pigs might be a rectangle of dimensions 10 m × 15 m, with an area of 150 m².³ Rowcliffe et al. (2011) examined camera trap records of agoutis (Dasyprocta punctata) and also found ‘a relative lack of records with both large radius and large angle’.

6.4.4.2  Using the RN model
Section 6.3 introduced the RN model of Royle and Nichols (2003). The model assumes that the individuals of interest are randomly distributed across the region of interest with a density of λ per unit area. At each sampling point, the model requires there to be a sequence of independent samples. The time period over which sampling occurs must be sufficiently short that it can be assumed that the local population is unchanged. Crucially, the model also assumes that no individual is observed at more than one sampling point.

In the present application each ‘sampling point’ is a camera. What is not so straightforward is the definition of a sample. This must be a section of time during which the camera either records (1), or fails to record (0), an individual of interest. Too small a time interval will give too few 1s. Too long an interval will give too few 0s. The problem is equivalent to that faced when choosing the quadrat size to determine plant frequency (Section 2.5). There is also the question of the size of the gap between successive records. If there is a zero gap, then double counting must be avoided for individuals that remain in view across the time boundary between intervals. If there is a non-zero gap, then there is the danger that valuable data will have been ignored.

The example that follows demonstrates the overestimates that may arise if several cameras monitor the same territory.
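The conversion of one camera's record into a sequence of 0/1 samples, as described above, might be sketched as follows. This is a hypothetical Python illustration: the function name, its arguments, and the detection times are assumptions made for the example, not part of the RN model software.

```python
def detection_history(detection_times, start, interval, n_intervals, gap=0.0):
    """Convert detection times into a 0/1 record, one entry per interval.

    Interval k covers [lo, lo + interval), where
    lo = start + k * (interval + gap). An entry is 1 if at least one
    detection falls inside the interval, otherwise 0. Detections that
    fall inside a gap are ignored.
    """
    history = []
    for k in range(n_intervals):
        lo = start + k * (interval + gap)
        hi = lo + interval
        seen = any(lo <= t < hi for t in detection_times)
        history.append(int(seen))
    return history


# Hypothetical detection times for one camera, in days since deployment.
times = [0.4, 0.5, 2.9, 6.1]

# Seven one-day samples with no gap between them.
print(detection_history(times, start=0.0, interval=1.0, n_intervals=7))

# A single seven-day sample: the same data collapse to one '1'.
print(detection_history(times, start=0.0, interval=7.0, n_intervals=1))
```

Repeating the call with different values of `interval` and `gap` shows how quickly the balance of 0s and 1s changes with the choice of sample length.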

Example 6.3: Bawean warty pigs (cont.)
This example concentrates on the data obtained within the 32.6 km² Blok Gunung forest in the centre of Bawean island. Cameras were installed five at a time, with each set being in position for seven days. The locations of the first eight deployments are illustrated in Figure 6.3, with each camera set approximately 300 m from the previous camera. In addition to the camera locations, Figure 6.3 shows 1 km circles centred around each group of five cameras. Since wild pigs typically have territories of around 4 km, a pig family viewed from one camera in a group is likely to be the same family as that viewed by other cameras in the group.

The unmarked routine can be used to fit the model in precisely the fashion described previously in Section 6.3. If the proximity of the cameras to one another is ignored, then the 40 × 7 presence/absence matrix includes 19 separate sightings with an average group size of 1.7. The model output included:

Figure 6.3  The locations of the first eight deployments of five cameras (axes: east-west and north-south, in km). The circles have radius 1 km and are centred on the midpoints of each group of five cameras.

Abundance:
 Estimate    SE     z  P(>|z|)
    0.466  1.04  0.45    0.653

Assuming that the (very imprecise) estimate for log(λ) relates to a region of 150 m², this corresponds to the infeasible estimate of around 11,000 pigs per km²! A more reasonable approach is to regard each set of five cameras as a single ‘observer’, with each observer covering about 3 km².⁴ The basic data is now an 8 × 7 array, with a total of 15 sightings. The resulting output is

Abundance:
 Estimate     SE     z  P(>|z|)
     2.55  0.896  2.85  0.00435

The estimate of abundance is now about 4 pigs per km², with an approximate 95% confidence interval from 2 to 75 km². Given the overlap between deployments 3 and 4, the estimate is still likely to be an overestimate. Using Equation (6.2), Rademaker et al. (2016) gave an estimate of around 6 per km².
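The step from the reported estimates to densities can be checked with a few lines of arithmetic. The Python sketch below assumes, as the text states, that the reported values are estimates of log(λ) for a single sampling region; the function names are hypothetical, and the interval is a simple Wald interval computed on the log scale for a single observer region.

```python
import math


def density_per_km2(log_estimate, region_km2):
    """Back-transform a log-scale abundance estimate to a density per km^2."""
    return math.exp(log_estimate) / region_km2


def wald_interval(log_estimate, se, z=1.96):
    """Approximate 95% interval for abundance in one sampling region."""
    return (math.exp(log_estimate - z * se), math.exp(log_estimate + z * se))


# Each camera treated separately: region assumed to be 150 m^2 = 150e-6 km^2.
single = density_per_km2(0.466, 150e-6)

# Each set of five cameras treated as one observer covering about 3 km^2.
grouped = density_per_km2(2.55, 3.0)

lower, upper = wald_interval(2.55, 0.896)

print(f"cameras separate: {single:,.0f} per km^2")
print(f"cameras grouped:  {grouped:.1f} per km^2")
print(f"interval per observer region: {lower:.1f} to {upper:.1f}")
```

The contrast between the two densities comes almost entirely from the assumed size of the sampled region, which is why the choice of ‘observer’ matters so much here.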


6.5  Double-observer sampling


Equation Equation (6.2), (6.2), Rademaker Rademaker et alet(2016) al (2016) givegive an estimate an estimate of around of around 6 per 6 per km km . . There are  at  least three variants of what is recorded when there are two observers at a   sampling site. 6.5 6.5Double-observer Double-observer sampling sampling

6.5.1  Dependent double-observer sampling (DDS)
Suppose that it is known that, for every individual present, the probability of being observed is p. Then, if n individuals were observed, the estimated number present would be n/p. Nichols et al. (2000) presented a method adapted from one proposed by Cook and Jacobson (1979) for estimating bias in aerial surveys. The method, which uses two observers, results in two counts being recorded for each sampling point. The first count is the number of individuals observed by the ‘primary’ recorder; the second count is the number of individuals observed by the ‘secondary’ recorder that were not observed by the primary recorder. During a sequence of point counts, the two observers should alternate the primary and secondary roles.

Advice on data collection
In the context of bird populations, Forcey and Anderson (2002) made a number of recommendations that included
• ‘Unlimited radius counts produce the greatest number of detections and are useful for recording rare species’,
• ‘Longer counts [10 minutes] reduce observer bias because investigators have more time to record individual birds and ensures data are accurate’

Let $p_i$ ($i = 1, 2$) be the detection probability for observer i. The probability that observer j sees an individual not seen by observer i is therefore $(1 - p_i)p_j$. Let $n_{ij}$ be the number of individuals (totalled over all sampling sites) counted by observer j when observer i is the primary observer. Thus, for a site with M individuals, on average

$$n_{11} = Mp_1 \quad\text{and}\quad n_{12} = M(1 - p_1)p_2. \tag{6.4}$$

Thus, on average, $n_{12}/n_{11} = (1 - p_1)p_2/p_1$. Similarly $n_{21}/n_{22} = (1 - p_2)p_1/p_2$. Solving these simultaneous equations gives estimates of the detection probabilities for the two observers as

$$\hat{p}_1 = \frac{n_{11}n_{22} - n_{12}n_{21}}{n_{22}n_{10}} \quad\text{and}\quad \hat{p}_2 = \frac{n_{11}n_{22} - n_{12}n_{21}}{n_{11}n_{20}}, \tag{6.5}$$

where $n_{i0} = n_{i1} + n_{i2}$ (for $i = 1, 2$). The probability that an individual is observed by at least one observer is $p = 1 - (1 - p_1)(1 - p_2)$. Using Equations (6.5) this is estimated by

$$\hat{p} = 1 - \frac{n_{12}n_{21}}{n_{11}n_{22}}. \tag{6.6}$$

The total number of individuals seen across all the sampling points is $n_{00} = n_{10} + n_{20}$, so that the total number of individuals present is given by

$$\hat{N} = n_{00}/\hat{p}.$$

Formulae for the variances of the estimated probabilities were given by Cook and Jacobson (1979). The estimated variance for $\hat{p}_1$ is given by

$$\widehat{\mathrm{Var}}(\hat{p}_1) = \frac{\hat{p}_1(1 - \hat{p})(n_{00} - \hat{p}n_{10})}{\hat{p}_2(1 - \hat{p}_2)n_{10}n_{20}}, \tag{6.7}$$

with a corresponding result for $\widehat{\mathrm{Var}}(\hat{p}_2)$. Also

$$\widehat{\mathrm{Var}}(\hat{p}) = (1 - \hat{p})^2\hat{p}\left\{\frac{1}{\hat{p}_1 n_{10}} + \frac{1}{\hat{p}_2 n_{20}} + \frac{1}{\hat{p}_2(1 - \hat{p}_1)n_{10}} + \frac{1}{\hat{p}_1(1 - \hat{p}_2)n_{20}}\right\}. \tag{6.8}$$

Nichols et al. (2000) showed that

$$\widehat{\mathrm{Var}}(\hat{N}) = \frac{n_{00}^2\,\widehat{\mathrm{Var}}(\hat{p})}{\hat{p}^4} + \frac{n_{00}(1 - \hat{p})}{\hat{p}^2}. \tag{6.9}$$
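Equations (6.5) to (6.9) can be evaluated directly from the four totals. The Python sketch below is an illustration under stated assumptions rather than a substitute for DOBSERV: the function name and the returned dictionary keys are hypothetical, and the counts used are the four totals quoted for the Alpine marmot data in Example 6.4.

```python
def dds_estimates(n11, n12, n21, n22):
    """Dependent double-observer estimates, Equations (6.5) to (6.9).

    n_ij is the total counted by observer j when observer i is primary.
    The formulae need n11 > 0 and n22 > 0, and break down (division by
    zero) if an estimated probability for a single observer equals 0 or 1.
    """
    n10 = n11 + n12
    n20 = n21 + n22
    n00 = n10 + n20
    cross = n11 * n22 - n12 * n21

    p1 = cross / (n22 * n10)                      # Equation (6.5)
    p2 = cross / (n11 * n20)                      # Equation (6.5)
    p = 1.0 - (n12 * n21) / (n11 * n22)           # Equation (6.6)
    n_hat = n00 / p                               # estimated number present

    # Equation (6.7)
    var_p1 = p1 * (1 - p) * (n00 - p * n10) / (p2 * (1 - p2) * n10 * n20)
    # Equation (6.8)
    var_p = (1 - p) ** 2 * p * (
        1 / (p1 * n10)
        + 1 / (p2 * n20)
        + 1 / (p2 * (1 - p1) * n10)
        + 1 / (p1 * (1 - p2) * n20)
    )
    # Equation (6.9)
    var_n = n00 ** 2 * var_p / p ** 4 + n00 * (1 - p) / p ** 2

    return {"p1": p1, "p2": p2, "p": p, "N": n_hat,
            "var_p1": var_p1, "var_p": var_p, "var_N": var_n}


est = dds_estimates(n11=10, n12=1, n21=1, n22=3)
for name, value in est.items():
    print(f"{name}: {value:.4f}")
```

With so few individuals the variance formulae are large-sample approximations at best, so the printed standard errors should be read as rough guides only.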

The DOBSERV (available an R implements their apThe program DOBSERV (available withwith anwith R implements theirtheir apThe program program DOBSERV (available with anversion) R version) version) implements their apThe programme DOBSERV (available an version) R implements approach. proach. proach. The methods in this section are most effective when there are many sampling points, The American methods this section areSurvey. most effective when there arebeen manyused sam-in other methods in this in section are most effective when there are many samas in The the North Breeding Bird However, they have pling points, as North in the American North American Breeding Bird Survey. However, pling points, as in the Breeding Bird Survey. However, they they situationshave as the following example illustrates. been used in other situations as the following example illustrates. haveused beeninused other situations as the following example illustrates. have been otherinsituations as the following example illustrates. Example 6.4 : Alpine marmots Example 6.4 : Alpine marmots Example 6.4 : Alpine marmots

Example 6.4: Alpine marmots et al (2016) compared four methods for estimating the population CorlattiCorlatti et al (2016) compared four methods for estimating the population size of Alpine marmot (Marmota marmota) in a study area of the Stelvio

(Marmota marmota) in a area studyofarea of the Stelvio size of size Alpine marmot (Marmota marmota) in a study the Stelvio Corlatti et ofal.Alpine (2016)marmot compared four methods for estimating the population size National Italy. test double-observer sampling theyfour used four sites National Park inPark Italy.inTo testTo double-observer sampling they used sites of Alpine marmot (Marmota marmota) in a study area of the Stelvio National Park the 2016 the (primary, secondary): in the in summer of 2016of the results (primary, secondary): (8, 0), (8, in the summer summer ofobtaining 2016 obtaining obtaining the results results (primary, secondary): (8, 0), 0), in sampling they used four sites in=the summer of 0), (2, 1), 1).two The two observers alternated roles, son11 that n11 10, (2, Italy. 0), (2, (2,To 1),test (1, double-observer 1). (1, The observers alternated roles, so that 11 = 10, 2016 results (primary, secondary): (8, p0), (2,=1),0.88, (1,=1).0.88, The two n1,12 = 1,n21and 1. nThus p0), n3,22 =the = n1.21 Thus =(2,29/33 n22 =obtaining 00 1 12 3, 00 = n15, 1 15, 22 n= 12 and 21 = 00 = 1 = 29/33 observers alternated roles, so that n = 10, n = 3, n = 1, and n = 1. Thus n 00 = 15, p = 29/40 = 0.725, and p = 1−1/30 giving the estimated number of marmots p2 = 29/40 0.725, p =and 1−1/30 giving the estimated of 21 marmots 11 22 12 numbernumber p22 == 29/40 = and 0.725, p = 1−1/30 giving the estimated of marmots pas1 15×30/29 = 29/33 = 0.88, p 2≈=≈15.51 29/40 0.725, and p =since 1low, – capture-recapture 1/30 the estimated number of as 15×30/29 ≈= 16. is very sincegiving capture-recapture methods ≈ 15.51 16. This is This very low, methods (discussed in Chapter 7) ≈showed at54least 54since marmots were present the (discussed in 15 Chapter 7)≈ showed that atthat least marmots were present in the in methods marmots as × 30/29 15.51 16. This is very low, capture-recapture 5 area.5 5 study study area. study study (discussed inarea. 
Chapter 7) showed that at least 54 marmots were presentin the area.5 Modelling site-to-site variability Royle (2004) assumed the num6.5.1.1 6.5.1.1 Modelling site-to-site variability Royle (2004) assumed that thethat numof at sampling sites varied to bers of bers individuals presentpresent at equal-sized sampling sites varied to bers of individuals individuals present at equal-sized equal-sized sampling sites according varied according according to

5 The underestimate was attributed to unhelpful weather conditions. 5 5 The underestimate 5 was attributed to unhelpful weather weather conditions. The underestimate was attributed to unhelpful conditions. 6.5.1.1  Modelling site-to-site variability

Royle (2004) assumed that the numbers of individuals present at equal-sized sampling sites varied according to an underlying Poisson distribution (Section 1.4.2) with parameter λ. The R package unmarked of Fiske and Chandler (2011) can be used to fit the model and assess the relevance of background information concerning the sites (for example, in the case of birds, the value of λ might vary according to the amount of foliage present). Comparing the AIC values (Section 1.9) of alternative models provides information concerning the relevance of the background variables included in the analysis.


Example 6.5: Alpine marmots (cont.)
Using unmarked the following commands produce the results for the model that ignores the differences between observers (denoted here as a and b): library(unmarked); Obs

Example 12.1: Costa Rican ants (cont.)
The ants in Costa Rica were trapped in various ways with $S_1 = 165$ species ($N_1 = 3262$ individuals) being found using fogging, and $S_2 = 197$ species ($N_2 = 1415$ individuals) being found using the Winkler sifter. There were $S_{12} = 38$ species that were trapped by both methods. Thus

$$J = \frac{38}{165 + 197 - 38} = 0.12.$$

The 38 species observed by both methods included 870 individuals observed using fogging and 385 observed using the Winkler sifter, so that

$$U = \frac{870}{3262} = 0.2667, \qquad V = \frac{385}{1415} = 0.2721,$$

and hence

$$J^* = \frac{0.2667 \times 0.2721}{0.2667 + 0.2721 - (0.2667 \times 0.2721)} = 0.16.$$

Amongst the 38 shared species, there were $f_{1+} = 11$ species represented by one individual in the fogging data. The frequencies for these species in the second collection were 18, 3, 11, 21, 1, 26, 29, 18, 3, 5, and 4, so that $n_2 = 139$. The other relevant values are $f_{2+} = 5$, $f_{+1} = 10$, $f_{+2} = 2$, and $n_1 = 219$. Adjusting for unobserved species gives: $\hat{U} = 0.4344$, $\hat{V} = 0.3801$, and hence

$$\hat{J} = \frac{0.4344 \times 0.3801}{0.4344 + 0.3801 - (0.4344 \times 0.3801)} = 0.25.$$

The final value is roughly double that given by the Jaccard index, which takes no account of the numbers of individuals.

12.3  Turnover
Yuan et al. (2016) suggested that a comparison of the species mixture at one
21, 1,The 26, 29, 4, sorepresented that n2 = Among the 38 shared species, there were f 1+ = 11 species by one 0.4344 × 0.3801 one individual in were the fogging The frequencies for = these species in139. the = second collection 18, 3,are 11,data. 21, 1, 26, 29, 18, 3, 5, and 4, so that n 2 The other relevant values f = 5, f = 10, f 2, and n = 219. = 0.25. J 2+ frequencies +1 +2 1 the second individual in the fogging data. The for these species in 0.4344 +11, 0.3801 (0.4344 0.3801) = 139. second collection were 18, 3,are 1, 3, 5,f+2 and= so that The other relevant values f21, =−26, 5, f29, = 10, 2, and nn12 = 219. 2+ gives: +1 18, =×so 0.4344, V=4,139. == 0.3801, and Adjusting for unobserved species U collection were 18, 3,unobserved 11, 21, 1, 26, 29,f2+ 18, 3, 5, and that n 2= The other relevant  4,  2, and The other relevant values are = 5, f = 10, f n = 219. 0.4344, V == 0.3801, and Adjusting for species gives: U The final value is roughly double that given by the Jaccard index, which takes +1 +2 1 hence  Adjusting for  valueshence are f = 5, f = 10, f = 2, and n = 219. unobserved species 0.4344 × 0.3801 Adjusting gives: U = 0.4344, V == 0.3801, and 2+ for +1 +2 species 1 no account ofJunobserved the individuals.   = numbers of0.4344 ×(0.4344 0.3801 × 0.3801) = 0.25.  = 0.3801, gives: hence U = 0.4344, V and hence 0.4344 + 0.3801 − = 0.25. J = 0.4344−×(0.4344 0.3801 × 0.3801) 0.4344 + 0.3801  = 0.25. The final valueJ is=roughly double that given by the Jaccard index, which takes 0.4344 + 0.3801 − (0.4344 0.3801) 12.3 Turnover The final value is roughly that given by×the Jaccard index, which takes no account of the numbersdouble of individuals.  The final value is roughly that given by the Jaccard index, which takes no account of the numbersdouble of individuals.  Yuan et al of (2016) suggested a comparison of the species mixture at one no account the numbers of that individuals.  
location with that at a nearby location within the same region would provide 12.3 Turnover 12.3 Turnover 12.3 1 Yuan etTurnover al (2016) suggested that a comparison of the species mixture at one +







Amongst the 38 shared species, there were f1+ = 11 species represented by one individual in the fogging data. The frequencies for these species in the second collection were 18, 3, 11, 21, 1, 26, 29, 18, 3, 5, and 4, so that n2 = 139. 198  |  MEASURING ABUNDANCE The other relevant values are f2+ = 5, f+1 = 10, f+2 = 2, and n1 = 219. Adjusting for unobserved species gives: U  = 0.4344, V  == 0.3801, and hence 0.4344 × 0.3801 = 0.25. J = 0.4344 + 0.3801 − (0.4344 × 0.3801) The final value is roughly double that given by the Jaccard index, which takes The final value isofroughly double that given by the Jaccard index, which takes no no account the numbers of individuals. 

account of the numbers of individuals. 12.3
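The arithmetic of the Costa Rican ants example can be checked with a short Python sketch. The two index functions follow the expressions used in the example. The adjustment for unobserved species is an assumed form (the abundance-based correction of Chao et al., 2005), adopted here because it reproduces the quoted values of 0.4344 and 0.3801; the reading of n1 as the fogging total for shared species seen once in the Winkler sample is likewise an assumption, and the variable names are illustrative.

```python
def jaccard(s1, s2, s12):
    """Incidence-based Jaccard index from species counts."""
    return s12 / (s1 + s2 - s12)

def abundance_jaccard(u, v):
    """Abundance-based index from the shared-individual proportions u, v."""
    return u * v / (u + v - u * v)

N1, N2 = 3262, 1415          # individuals found by fogging / Winkler sifter
j = jaccard(165, 197, 38)

u = 870 / N1                 # proportion of fogging individuals in shared species
v = 385 / N2                 # proportion of Winkler individuals in shared species
j_star = abundance_jaccard(u, v)

# Assumed adjustment for unobserved shared species
f1p, f2p = 11, 5             # shared species with 1 or 2 individuals (fogging)
fp1, fp2 = 10, 2             # shared species with 1 or 2 individuals (Winkler)
n1, n2 = 219, 139            # totals quoted in the example
u_hat = u + (N2 - 1) / N2 * fp1 / (2 * fp2) * n1 / N1
v_hat = v + (N1 - 1) / N1 * f1p / (2 * f2p) * n2 / N2
j_hat = abundance_jaccard(u_hat, v_hat)

print(round(j, 2), round(j_star, 2), round(u_hat, 4), round(v_hat, 4), round(j_hat, 2))
```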

12.3  Turnover

Yuan et al. (2016) suggested that a comparison of the species mixture at one location with that at a nearby location within the same region would provide a measure of diversity. They refer to this as spatial turnover and suggested several possible measures. A second type of turnover is temporal turnover, which refers to changes in species composition at a single location over time. A simple and effective measure uses the so-called Manhattan distance given by:

$$d_M = \frac{1}{2}\sum_{k=1}^{K} |p_{1k} - p_{2k}|, \tag{12.9}$$

where $p_{ik}$ (which may be zero) is the proportion of the total count at time $i$ ($i = 1, 2$) which consists of species $k$. The bounds on $d_M$ are 0 (no change) and 1 (complete change).

Example 12.2: Indian trees (cont.)

The Appendix lists the trees counted in the Mudumalai Forest Dynamics Plot in 1988, 1992, and 2000. The values of $d_M$ for the changes from 1988 to 1992, from 1992 to 2000, and from 1988 to 2000, are 5.3%, 7.0%, and 12.1%, respectively.
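As a minimal sketch of Equation (12.9), the following Python function computes the Manhattan distance between two lists of species counts taken at different times. The function name and the small set of counts are hypothetical and are not taken from the Mudumalai data.

```python
def manhattan_turnover(counts1, counts2):
    """Half the summed absolute difference between species proportions.

    counts1, counts2: counts for the same K species at the two times,
    listed in the same order (a count of zero is allowed).
    """
    total1, total2 = sum(counts1), sum(counts2)
    return 0.5 * sum(abs(a / total1 - b / total2)
                     for a, b in zip(counts1, counts2))

# Hypothetical counts for three species at two times
d_m = manhattan_turnover([50, 30, 20], [40, 30, 30])
print(d_m)
```

Working with proportions rather than raw counts means that a uniform decline in all species leaves the measure at zero, so that the value reflects changes in composition only.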

12.4  Rarity

In a fascinating and influential paper, Rabinowitz (1981) suggested that there were three relevant aspects to be considered in determining whether a species should be regarded as rare. These were:

• Geographic range – large or small;
• Habitat specificity – wide or narrow;
• Local population size – large and occasionally dominant, or small and never dominant.

Cross-classifying these aspects gives eight combinations, with seven of them describing some aspect of rarity; the rarest being a small population occupying a narrow habitat in a restricted location. Yu and Dobson (2001), who considered more than a thousand species of mammals, categorized more than one-third as belonging to this rarest category. Similarly, Espeland and Emam (2011), using published studies of plants, found that 30% of the plants studied fell in the rarest category.

Location in Britain can be specified by reference to a grid of 1 km squares. A simple method for quantifying geographic range is therefore provided by counting the number of grid squares within which the species of interest has been recorded. However, Hartley and Kunin (2003) demonstrated that the relative abundance of two species can be critically dependent on the size of square used. They also noted that the number of occupied squares is dependent on the precise placement of the grid.
Table 12.1  The categorization used by Rabinowitz (1981) to determine the rarity of a species, with the values of the vulnerability index used by Fattorini et al. (2012).

| Range (W, Widespread; R, Restricted) | W | W | W | R | W | R | R | R |
|---|---|---|---|---|---|---|---|---|
| Habitat (V, Varied; S, Specific) | V | V | S | V | S | V | S | S |
| Population (A, Abundant; L, Low) | A | L | A | A | L | L | A | L |
| Vulnerability index, α | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |

Using Rabinowitz’s categories, Fattorini et al. (2012) assigned a vulnerability index (α; see Table 12.1) to each species in order to identify locations of prime conservation concern. They determined two quantities:

$$\beta_C = \frac{\sum_{i=1}^{L} (\alpha_i - \alpha_{\min})}{L(\alpha_{\max} - \alpha_{\min})}, \tag{12.10}$$

$$\beta_W = \frac{\sum_{i=1}^{L} (\alpha_i - \alpha_{\min})}{\sum_{i=1}^{S} (\alpha_i - \alpha_{\min})}, \tag{12.11}$$

where $L$ is the number of species at the local site under consideration, $S$ is the total number of species across all sites, $\alpha_i$ is the vulnerability score for species $i$ (see Table 12.1), and $\alpha_{\max}$ and $\alpha_{\min}$ are the maximum and minimum values across all species. The $\beta_C$ index helps identify sites with a high proportion of threatened species, while $\beta_W$ identifies sites with large numbers of threatened species.

Example 12.3: Californian trees (cont.)

The Californian study region contained 8372 trees representing 31 tree species. As usual the counts presented a highly skewed distribution. In this case there were five species represented by a single tree, and one species (Pseudotsuga menziesii) accounting for 2180 trees. A species was defined to be abundant if there were more than the median number of individuals.

The study region was divided into 600 square quadrats with side 10 m, with the number of quadrats containing a species being recorded for each species. There were five species recorded in more than 200 of the quadrats, with Pseudotsuga menziesii being the most widespread (293 quadrats). A species was regarded as widespread if it occurred in more than the median number of quadrats occupied by a species, which was 10.

In the absence of information on habitat, a species was regarded as having specific needs if the quadrat counts for that species had a coefficient of variation (see Section 1.2.6) that was greater than the median for the 31 species.

With these definitions, 14 of the 31 species were assigned an α-value of 1. One species (Pinus ponderosa) was assigned the value 3, one species (Umbellularia californica) was given the value 6, and the remainder were given the value 8.

Figure 12.2 represents the $\beta_C$ values for the 600 quadrats, with the darkest squares having the highest values. There were 518 quadrats with zero values: each of the trees present in these quadrats was judged to have α = 1. The numbers of vulnerable species (using the criteria outlined) are shown in the figure for each quadrat. A comparison of local characteristics (soil, rainfall, etc.) might help to identify the conditions required for the scarcer species to flourish. Varying the criteria used for each characteristic might make any such relationship more evident.

Figure 12.2  The variation in βC values across the 300 m × 200 m Californian plot. Higher values are represented by darker squares. The number of more vulnerable species is indicated for each quadrat.
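The two indices of Equations (12.10) and (12.11) can be sketched in Python as follows. The α-values for the 31 Californian species follow the allocation described in the example (14 species with α = 1, one with 3, one with 6, and the remaining 15 with 8); the quadrat contents are hypothetical, the function names are illustrative, and the sketch assumes that a quadrat contains at least one species.

```python
def beta_c(local_alphas, alpha_min=1, alpha_max=8):
    """Equation (12.10): mean scaled vulnerability of the species at a site."""
    n_local = len(local_alphas)
    return (sum(a - alpha_min for a in local_alphas)
            / (n_local * (alpha_max - alpha_min)))

def beta_w(local_alphas, all_alphas, alpha_min=1):
    """Equation (12.11): the site's share of the total vulnerability."""
    return (sum(a - alpha_min for a in local_alphas)
            / sum(a - alpha_min for a in all_alphas))

# Vulnerability scores for the 31 species, as allocated in Example 12.3
all_alphas = [1] * 14 + [3] + [6] + [8] * 15

# Hypothetical quadrat holding two common species and one of the rarest
quadrat = [1, 1, 8]
print(beta_c(quadrat), beta_w(quadrat, all_alphas))

# A quadrat in which every species has alpha = 1 gives a zero value
print(beta_c([1, 1, 1]))
```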

Appendix

The table lists species of trees in the 50-ha (500 m × 1 km) Mudumalai Forest Dynamics Plot, which forms part of the Mudumalai Wildlife Sanctuary in the Western Ghats of India. The trees listed have diameters of at least 10 cm at breast height. With the notable exception of Indian laburnum (Cassia fistula), there has been a decline in the numbers of most species, while the number of species has also declined from 63 in 1988 and 1992 to 61 in 2000. The most frequent are axlewood (Anogeissus latifolia), laurel (Terminalia crenulata), myrtle (Lagerstroemia microcarpa), and teak (Tectona grandis).

| Order | Family | Genus | Species | 1988 | 1992 | 2000 |
|---|---|---|---|---|---|---|
| B | Boraginaceae | Cordia | obliqua | 18 | 20 | 23 |
| B | Boraginaceae | Cordia | wallichii | 37 | 38 | 38 |
| C | Celastraceae | Cassine | glauca | 4 | 3 | 1 |
| E | Ebenaceae | Diospyros | montana | 130 | 117 | 114 |
| E | Lecythidaceae | Careya | arborea | 28 | 25 | 20 |
| E | Sapotaceae | Madhuca | neriifolia | 3 | 3 | 2 |
| F | Fabaceae | Albizia | odoratissima | 4 | 3 | 2 |
| F | Fabaceae | Bauhinia | malabarica | 27 | 26 | 25 |
| F | Fabaceae | Bauhinia | racemosa | 9 | 9 | 7 |
| F | Fabaceae | Cassia | fistula | 179 | 246 | 289 |
| F | Fabaceae | Dalbergia | latifolia | 33 | 35 | 30 |
| F | Fabaceae | Erythrina | indica | 5 | 5 | 3 |
| F | Fabaceae | Ougeinia | oojeinensis | 109 | 94 | 57 |
| F | Fabaceae | Pterocarpus | marsupium | 17 | 17 | 12 |
| F | Faboideae | Butea | monosperma | 32 | 31 | 29 |
| G | Apocynaceae | Wrightia | tinctoria | 1 | 1 | 3 |
| G | Rubiaceae | Canthium | dicoccum | 19 | 16 | 16 |
| G | Rubiaceae | Hymenodictyon | orixense | 10 | 10 | 10 |
| G | Rubiaceae | Xeromphis | spinosa | 558 | 521 | 387 |
| G | Rubiaceae | Mitragyna | parvifolia | 15 | 14 | 14 |
| L | Bignoniaceae | Radermachera | xylocarpa | 309 | 308 | 293 |
| L | Bignoniaceae | Stereospermum | colais | 89 | 90 | 89 |
| L | Lamiaceae | Gmelina | arborea | 58 | 51 | 34 |
| L | Lamiaceae | Premna | tomentosa | 2 | 2 | 1 |
| L | Lamiaceae | Tectona | grandis | 1805 | 1779 | 1716 |
| L | Oleaceae | Olea | dioica | 4 | 4 | 4 |
| L | Oleaceae | Schrebera | swietenioides | 69 | 57 | 32 |
| L | Lamiaceae | Vitex | altissima | 1 | 1 | 1 |
| Mp | Euphorbiaceae | Antidesma | diandrum | 1 | 1 | 0 |
| Mp | Euphorbiaceae | Mallotus | philippensis | 8 | 9 | 4 |
| Mp | Phyllanthaceae | Bischofia | javanica | 1 | 1 | 0 |
| Mp | Phyllanthaceae | Bridelia | retusa | 40 | 30 | 23 |
| Mp | Phyllanthaceae | Emblica | officinalis | 518 | 464 | 387 |
| Mp | Salicaceae | Casearia | esculenta | 39 | 40 | 37 |
| Mp | Salicaceae | Flacourtia | indica | 7 | 7 | 2 |
| Mv | Bombacaceae | Bombax | ceiba | 35 | 35 | 35 |
| Mv | Dipterocarpaceae | Shorea | roxburghii | 6 | 6 | 1 |
| Mv | Malvaceae | Eriolaena | quinquelocularis | 242 | 184 | 26 |
| Mv | Malvaceae | Grewia | tiliifolia | 493 | 432 | 353 |
| Mv | Malvaceae | Helicteres | isora | 7 | 8 | 2 |
| Mv | Malvaceae | Kydia | calycina | 1328 | 635 | 84 |
| My | Combretaceae | Anogeissus | latifolia | 2180 | 2158 | 2103 |
| My | Combretaceae | Terminalia | bellirica | 34 | 33 | 31 |
| My | Combretaceae | Terminalia | chebula | 59 | 51 | 43 |
| My | Combretaceae | Terminalia | crenulata | 2628 | 2584 | 2482 |
| My | Lythraceae | Lagerstroemia | microcarpa | 3160 | 3189 | 3092 |
| My | Lythraceae | Lagerstroemia | parviflora | 82 | 75 | 67 |
| My | Myrtaceae | Syzygium | cumini | 398 | 393 | 383 |
| R | Moraceae | Artocarpus | gomezianus | 1 | 1 | 1 |
| R | Moraceae | Ficus | benghalensis | 3 | 3 | 3 |
| R | Moraceae | Ficus | drupacea | 4 | 4 | 4 |
| R | Moraceae | Ficus | hirsuta | 1 | 0 | 0 |
| R | Moraceae | Ficus | religiosa | 7 | 6 | 8 |
| R | Moraceae | Ficus | tsjakela | 10 | 10 | 10 |
| R | Moraceae | Ficus | virens | 11 | 13 | 10 |
| R | Rhamnaceae | Ziziphus | rugosa | 8 | 7 | 8 |
| R | Rhamnaceae | Ziziphus | xylopyrus | 20 | 14 | 6 |
| S | Anacardiaceae | Lannea | coromandelica | 13 | 13 | 13 |
| S | Anacardiaceae | Mangifera | indica | 4 | 4 | 1 |
| S | Anacardiaceae | Semecarpus | anacardium | 12 | 12 | 9 |
| S | Burseraceae | Garuga | pinnata | 29 | 26 | 23 |
| S | Meliaceae | Chukrasia | tabularis | 1 | 1 | 1 |
| S | Sapindaceae | Allophylus | cobbe | 0 | 2 | 1 |
| S | Sapindaceae | Schleichera | oleosa | 72 | 69 | 69 |
| TOTAL | | | | 15,037 | 14,046 | 12,574 |

Key: B = Boraginales; C = Celastrales; E = Ericales; F = Fabales; G = Gentianales; L = Lamiales; Mp = Malpighiales; Mv = Malvales; My = Myrtales; R = Rosales; S = Sapindales

Source: Center for Tropical Forest Science, Smithsonian Tropical Research Institute

Notes

Chapter 1: Statistical ideas

1. Details taken from the online report by Jhala, Y. V., Qureshi, Q., and Nayak, A. K. (eds). (2019) Status of tigers, co-predators and prey in India 2018. Summary Report. National Tiger Conservation Authority, Government of India, New Delhi & Wildlife Institute of India, Dehradun. TR No./2019/05.
2. The data were obtained via the website https://forestgeo.si.edu/sites/north-america/university-california-santa-cruz.
3. For a more precise 95% interval the value is 1.96σ, but, for the measurement of abundance (given other uncertainties), using 2σ should give appropriate accuracy.
4. Here k is a whole number with a value obtained using the given expression. However, k factorial exists for all non-integer positive k. Using the R programming language its value is obtained by writing factorial(k).
5. The value of 1.92 is one-half of the upper 5% value of a chi-squared distribution with 1 degree of freedom. For a 99% interval replace 1.92 by 3.32.
6. Here 3.00 is an approximation to one-half of 5.991, where 5.991 is the upper 5% point of a chi-squared distribution with 2 degrees of freedom. For the 99% region replace 3.00 by 4.61.
7. The natural logarithm may also be written loge since it expresses numbers as powers of e. Thus ln(e) = 1.
8. Using R, the appropriate command is 1 – pchisq(10.86,2).
9. In practice the values used are 1/2n, 3/2n, …, (1 – 1/2n).
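The constants quoted in notes 4 to 6 and the tail probability requested in note 8 can be checked without specialist software. The Python sketch below relies on two standard facts rather than on anything in the text: a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, and the upper tail probability of a chi-squared variable with 2 degrees of freedom is exp(−x/2). The function names are illustrative.

```python
import math
from statistics import NormalDist

def chi2_upper_point_1df(alpha):
    """Upper 100*alpha% point of chi-squared with 1 degree of freedom."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z

def chi2_upper_point_2df(alpha):
    """Upper 100*alpha% point of chi-squared with 2 degrees of freedom."""
    return -2 * math.log(alpha)

def chi2_upper_tail_2df(x):
    """Counterpart of 1 - pchisq(x, 2) in R."""
    return math.exp(-x / 2)

def factorial_real(k):
    """k factorial for non-integer positive k, via the gamma function."""
    return math.gamma(k + 1)

print(chi2_upper_point_1df(0.05) / 2, chi2_upper_point_1df(0.01) / 2)
print(chi2_upper_point_2df(0.05) / 2, chi2_upper_point_2df(0.01) / 2)
print(chi2_upper_tail_2df(10.86))
print(factorial_real(2.5))
```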

Chapter 2: Quadrats and transects

1. In this chapter it is assumed that the entire contents of the quadrat or transect are observed. Buckland et al. (2001) use the term strip transect to distinguish this case from the case where the probability of an item being observed depends upon that item’s distance from the observer. For that case, which is considered in Chapter 6, Buckland et al. (2001) used the term line transect.
2. For a 25 × 25 study region, with 25 quadrats, this type of design is easily achieved as follows. The 25 x co-ordinates are 1, 2, …, 25, while the corresponding y co-ordinates are a random reordering (e.g. using balls drawn from a bag, or a computer’s random number generator) of 1, 2, …, 25.
3. From website https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1365. The site is Colville201005.
4. http://www.agrra.org/data-explorer/.
5. Such quantities could arise if a 10 × 10 grid were superimposed on a photograph of the study region.
6. A formal test, assuming a normal (Gaussian) distribution for the individual estimates, would compare the ratio with the upper tail of an F-distribution having (Q – 1) and Q(n – 1) degrees of freedom.
7. Note that the table gives more decimal places than may seem sensible. The reason is that, in calculations, premature rounding may lead to wildly inaccurate final results. Fortunately, computers do not round prematurely!
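The design described in note 2 can be sketched in Python using the standard random number generator in place of balls drawn from a bag. The function name and the seed are illustrative choices.

```python
import random

def one_per_row_and_column(n=25, seed=None):
    """Quadrat positions with each x and each y co-ordinate used once."""
    rng = random.Random(seed)
    ys = list(range(1, n + 1))
    rng.shuffle(ys)                      # random reordering of 1, 2, ..., n
    return list(zip(range(1, n + 1), ys))

design = one_per_row_and_column(25, seed=1)
print(design[:5])
```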


Chapter 3: Points and lines

1. A consequence of adding constants to the numerator and denominator is that estimates of 0 or 1 (presumed to be impossible values) are avoided.

Chapter 4: Distance methods

1. Here ‘plants’ is used as an all-embracing term for the individuals being counted, which might be stationary items of any type.
2. The study area is taken to be the unit square. For the random pattern, each plant location is defined by taking a pair of uniform random numbers as the plant co-ordinates. For the gradient pattern a third random number is generated. If this exceeds the second number, then all three numbers are discarded and a new set is considered. For the regular pattern, potential plant positions were again generated using a pair of random numbers, but the pair were only accepted if every previously generated location was at least a distance of 0.025 away from the new position. The first step in generating a clustered pattern was to randomly locate a cluster centre in an inner square with sides at 0.1 and 0.9. A random integer between 1 and 20 determined the number of individuals in this cluster. The location of each individual was then random within a square of side 0.2 centred on the cluster centre.
3. Quotations are from the translation of the original Japanese.
4. Data kindly supplied by Dr Hijbeek.
5. The values used for k are 3 (the largest value used in many comparative studies), 4 (to match the 4 distances used by the point-centred quarter methods discussed later), and 6, the value recommended by Prodan (1968). For Sites 1 and 2, the 25 sampling points were arranged centrally in a single row at intervals of 4 m (Set 1) or 3 m (Set 2). For Sites 3 and 4, a 5 × 5 lattice of sampling points was used, centrally placed in each site, with 4 m between successive points in a row or column.
6. Diggle used n rather than 1.2(n – 1), but the simulations suggest that the latter is appropriate.
7. Byth used 0.25 rather than 0.3 (omitting the 1.2 scaling).
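The four patterns described in note 2 can be generated along the following lines. This Python sketch follows the verbal description, with some assumptions of its own: the function names and the seed are illustrative, the cap on the number of attempts for the regular pattern is added to guarantee termination, and clusters are generated until the required number of plants is reached (the note describes only the first step for the clustered pattern).

```python
import math
import random

def random_pattern(n, rng):
    """n plants placed uniformly at random in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def gradient_pattern(n, rng):
    """Plant density increasing with the second co-ordinate."""
    plants = []
    while len(plants) < n:
        x, y, u = rng.random(), rng.random(), rng.random()
        if u > y:                 # third number exceeds the second: discard
            continue
        plants.append((x, y))
    return plants

def regular_pattern(n, rng, min_dist=0.025, max_tries=100_000):
    """Accept a position only if no earlier plant lies within min_dist."""
    plants, tries = [], 0
    while len(plants) < n and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, p) >= min_dist for p in plants):
            plants.append(candidate)
    return plants

def clustered_pattern(n, rng):
    """Clusters of 1 to 20 plants in squares of side 0.2 around each centre."""
    plants = []
    while len(plants) < n:
        cx, cy = rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)
        for _ in range(rng.randint(1, 20)):
            plants.append((cx + rng.uniform(-0.1, 0.1),
                           cy + rng.uniform(-0.1, 0.1)))
    return plants[:n]             # assumption: trim the final cluster

rng = random.Random(2020)
patterns = {
    "random": random_pattern(100, rng),
    "gradient": gradient_pattern(100, rng),
    "regular": regular_pattern(100, rng),
    "clustered": clustered_pattern(100, rng),
}
print({name: len(points) for name, points in patterns.items()})
```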

Chapter 5: Variable sized plots

1. Tree diameters are measured at ‘breast height’ which is typically taken to be 4.5 ft (1.37 m) above ground level. This is familiarly referred to as DBH.

Chapter 6: Quadrats, transects, points, and lines – revisited

1. To obtain estimates of abundance, Mathews et al. (2018) estimated bat densities by ‘multiplying the typical maternity roost density in an average quality landscape by twice the typical number of adult females per roost’.
2. Lovers of Wood Thrushes should note that the characteristics of the sites as given here are fictitious.
3. For comparison, a 120° sector with radius 15 m would have an area of about 240 m².
4. Note that the choice of 3 km² is arbitrary, being based on an inspection of Figure 6.3.
5. The under-estimate was attributed to unhelpful weather conditions.
6. 2 × (5.63 + 2.05) = 15.36.


Chapter 7: Capture-recapture methods

1. Although Jolly and Cormack had rooms in the same corridor in Aberdeen University (and played each other at chess) they were unaware of each other’s research until their papers were published.
2. The corresponding 99% interval is obtained by replacing –1.6 by –2.2 and 2.4 by 3.0 in Equation (7.25).
3. BUGS code may be run under R using the rjags package.

Chapter 8: Distance methods

1. The first two Hermite polynomials used in the Distance programme are $H_4(z) = 16z^4 - 48z^2 + 12$ and $H_6(z) = 64z^6 - 480z^4 + 720z^2 - 120$.
2. Their further suggestion is to ignore observations for which p(d) < 0.10 for point transects, or < 0.15 for line transects.
3. I am particularly grateful to Dr Eric Rexstad for his patient assistance with my use of Distance.
4. The probability density function is simply the product of the distance d and the detection function p(d).
5. In 2003 the corresponding figure was much higher at about 49.5 per ha.
6. Distance does not permit simultaneous use of covariates and extensions and does not allow a uniform key with covariates.
7. In Distance these cases require NAs for the distance entry.
8. These are the numbers after truncation of the distance data.
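The two polynomials quoted in note 1 can be checked against the standard three-term recurrence for Hermite polynomials, H(n+1)(z) = 2z H(n)(z) − 2n H(n−1)(z). The Python sketch below is a check of the arithmetic only and makes no claim about how the Distance programme evaluates these terms; the function names are illustrative.

```python
def hermite(n, z):
    """Hermite polynomial of order n at z, by the three-term recurrence."""
    h_prev, h = 1.0, 2.0 * z          # orders 0 and 1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * z * h - 2.0 * k * h_prev
    return h

def h4(z):
    """Order-4 polynomial as quoted in note 1."""
    return 16 * z**4 - 48 * z**2 + 12

def h6(z):
    """Order-6 polynomial as quoted in note 1."""
    return 64 * z**6 - 480 * z**4 + 720 * z**2 - 120

for z in (0.0, 0.5, 1.3):
    print(z, hermite(4, z), h4(z), hermite(6, z), h6(z))
```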

Chapter 9: Species richness

1. Details are given in the Appendix.
2. In reality, because of spatial clustering, the required number of trees would be greater, but probably no more than a thousand.
3. This is the estimator introduced with different notation in Equation (7.14) in the discussion of capture-recapture methods.
4. For package details, downloads, etc., see http://chao.stat.nthu.edu.tw/wordpress/software_download/.
5. The ant data are freely available at http://esapubs.org/archive/ecol/E083/011/suppl-1.htm.

Chapter 10: Diversity

1. The solution is to work with exp(H) rather than H itself.
2. Data are available via http://www.agrra.org/data-explorer/.

Chapter 11: Species abundance distributions (SADS)

1. Truncated because a count of zero cannot occur.

Chapter 12: Other aspects of diversity

1. Details of the methods used are given in the appendix to Longino, Coddington, and Colwell (2002).

Further reading

Wide-ranging books

Henderson, P. A., and Southwood, T. R. E. (2016) Ecological Methods, 4th edn. Chichester: Wiley.
Now in its 4th edition, this is 600 pages of useful information.

Krebs, C. J. (1998) Ecological Methodology, 2nd edn. Pearson: Cambridge.
The 2nd edition is still available, while some chapters of a possible 3rd edition may be downloaded from the author’s website (http://www.zoology.ubc.ca/~krebs/books.html).

Sutherland, W. J. (ed.) (2006) Ecological Census Techniques, 2nd edn. Cambridge: Cambridge University Press.
In addition to discussing methods from most of the chapters in this present book, this authoritative compilation presents separate chapters relating to amphibians, birds, fish, invertebrates, mammals, plants, and reptiles. Each chapter is written by specialists in their field.

Chapter 1: Statistical ideas

There are many introductory statistics books, and many all-embracing ecology books will include a statistical introduction.

Schreuder, H. T., Ernst, R., and Ramirez-Maldonado, H. (2004) Statistical Techniques for Sampling and Monitoring Natural Resources, General Technical Report RMRS-GTR-126. Rocky Mountain Research Station, Fort Collins, CO: US Department of Agriculture, Forest Service.
A useful example, available online, and aimed especially at foresters.

Chapter 3: Points and lines

Gregoire, T. G., and Valentine, H. T. (2007) Sampling Strategies for Natural Resources and the Environment. Boca Raton, FL: Taylor & Francis.
An authoritative text on sampling, with emphasis on sampling in forestry.

Herrick, J. E., Van Zee, J. W., McCord, S. E., Courtright, E. M., Karl, J. W., and Burkett, L. M. (2017) Monitoring Manual for Grassland, Shrubland, and Savanna Ecosystems. Las Cruces, NM: USDA-ARS Jornada Experimental Range.
Available online, this describes the practicalities of setting up transects on grassland.

Hill, J., and Wilkinson, C. (2004) Methods for Ecological Monitoring of Coral Reefs. Townsville: Australian Institute of Marine Science.
Available online, this provides very detailed guidance on all aspects of monitoring coral reefs. It includes descriptions of ten applications of transects.

Lutes, D. C., Keane, R. E., Caratti, J. F., Key, C. H., Benson, N. C., Sutherland, S., and Gangi, L. J. (2006) FIREMON: Fire Effects Monitoring and Inventory System. Rocky Mountain Research Station, Fort Collins, CO: US Department of Agriculture.
Available online, this provides a comprehensive account of the implementation of the line-intercept and point-intercept methods.


Chapter 6: Quadrats, transects, points, and lines – revisited

Gibbs, J. P. (2000) Monitoring populations. In Boitani, L., and Fuller, T. K. (eds), Research Techniques in Animal Ecology. New York: Columbia University Press. ch. 7.
Available online, this provides a discussion of the use of indices.

Kéry, M., and Royle, J. A. (2015) Applied Hierarchical Modeling in Ecology: Analysis of Distribution, Abundance and Species Richness in R and BUGS: Volume 1: Prelude and Static Models; (2020) Volume 2: Dynamic and Advanced Models. London: Academic Press.
More suitable for statisticians, these two volumes (well over 1000 pages) provide the definitive guide to hierarchical models, including use of the unmarked package.

Ralph, C. J., Sauer, J. R., and Droege, S. (1995) Monitoring Bird Populations by Point Counts. Albany, CA: US Forest Service.
Freely available online.

The European Bird Census Council has provided a useful summary of methods used in Europe which is available at https://pecbms.info/best-practice-guide/.

Chapter 7: Capture-recapture methods Otis, D. L., Burnham, K. P., White, G. C., and Anderson, D. R. (1978) Statistical inference from capture data on closed animal populations, Wildlife Monographs, 62, 3–135. The classic survey of types of capture-recapture model in closed populations. The MARK manual (with more than 1000 pages) freely available from http://www.phidot.org/software/mark/docs/book/ is probably the most useful guide for any novice wishing to analyse capture-recapture data. The length is an indication of the effort required for a full understanding, and an indication also of the extent to which this chapter has done no more than graze the surface of the topic. For statisticians, some specialist books are: McCrea, R. S., and Morgan, B. J. T. (2014) Analysis of Capture-Recapture Data. Boca Raton, FL: Chapman and Hall/CRC. Royle, J. A., Chandler, R. B., Sollmann, R., and Gardner, B. (2013) Spatial Capture-Recapture. Waltham, MA: Academic Press. Amstrup, S. C., McDonald, T. L., and Manly, B. F. J. (2005) Handbook of Capture-Recapture Analysis. Princeton, NJ: Princeton University Press. Two books recommended by the authors of the MARK manual are: Burnham, K. P., and Anderson, D. R. (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd edn. New York: Springer-Verlag. Williams, B. K., Nichols, J. D., and Conroy, M. J. (2002) Analysis and Management of Animal Populations. San Diego, CA: Academic Press.

Chapter 8: Distance methods While the online webpages for the Distance programme are a treasure trove of worked examples and videos, with a free online course, the book that underlies the Distance approach is: Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L., and Thomas, L. (2001) Introduction to Distance Sampling. Oxford: Oxford University Press. The same authors edited a follow-up book that addresses various specialist topics: Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L., and Thomas, L. (2004) Advanced Distance Sampling. Oxford: Oxford University Press.

An important paper that suggests strategies for obtaining more accurate estimates is given by: Buckland, S. T., Marsden, S. J., and Green, R. E. (2008) Estimating bird abundance: making methods work, Bird Conservation International, 18, S91–S108.

Part IV: Species Kindt, R., and Coe, R. (2005) Tree Diversity Analysis: A Manual and Software for Common Statistical Methods for Ecological and Biodiversity Studies. Nairobi: World Agroforestry Centre (ICRAF). Available online, a book that concentrates on the measurement of species richness and diversity in the context of trees, and also includes some advanced statistical techniques. Magurran, A. E., and McGill, B. J. (eds) (2010) Biological Diversity: Frontiers in Measurement and Assessment. Oxford: Oxford University Press. Twenty chapters dealing with diversity, SADs, and a variety of applications. Kondratyeva, A., Grandcolas, P., and Pavoine, S. (2019) Reconciling the concepts and measures of diversity, rarity and originality in ecology and evolution, Biological Reviews, 94, 1317–1337. A detailed and thoughtful paper with contents that match its title.

References Affleck, D. L. R., Gregoire, T. G., and Valentine, H. T. (2005) Design unbiased estimation in line intersect sampling using segmented transects, Environmental and Ecological Statistics, 12, 139–154. Agresti, A., and Coull, B. A. (1998) Approximate is better than exact for interval estimation of binomial parameters. American Statistician, 52, 119–126. Alatalo, R. V. (1981) Problems in the measurement of evenness in ecology, Oikos, 37, 199–204. Alldredge, M. W., Pollock, K. H., Simons, T. R., Collazo, J. A., and Shriner, S. A. (2007) Time-of-detection method for estimating abundance from point-count surveys, The Auk, 124, 653–664. Anderson, M. J., Crist, T. O., Chase, J. M., Vellend, M., Inouye, B. D., Freestone, A. L., Sanders, N. J., Cornell, H. V., Comita, L. S., Davies, K. F., Harrison, S. P., Kraft, N. J. B., Stegen, J. C., and Swenson, N. G. (2011) Navigating the multiple meanings of β-diversity: a roadmap for the practicing ecologist, Ecology Letters, 14, 19–28. Arnason, A. N., Schwarz, C. J., and Gerrard, J. M. (1991) Estimating closed population size and number of marked animals from sighting data, Journal of Wildlife Management, 55, 718–730. Arrhenius, O. (1921) Species and area, Journal of Ecology, 9, 95–99. Baillargeon, S., and Rivest, L.-P. (2007) Rcapture: loglinear models for capture-recapture in R, Journal of Statistical Software, 19, 1–31. Baldridge, E., Harris, D. J., Xiao, X., and White, E. P. (2016) An extensive comparison of species-abundance distribution models, PeerJ, 4, e2823. Barkman, J. J., Doing, H., and Segal, S. (1964) Kritische bemerkungen und vorschläge zur quantitativen vegetationsanalyse, Acta Botanica Neerlandica, 13, 394–419. Bart, J., and Earnst, S. (2002) Double sampling to estimate density and population trends in birds, The Auk, 119, 36–45. Bart, J., Droege, S., Geissler, P., Peterjohn, B., and Ralph, C. J. (2004) Density estimation in wildlife surveys, Wildlife Society Bulletin, 32, 1242–1247. Barwell, L. J., Isaac, N. 
J. B., and Kunin, W. E. (2015) Measuring β-diversity with species abundance data, Journal of Animal Ecology, 84, 1112–1122. Beenaerts, N., and Vanden Berghe, E. (2005) Comparative study of three transect methods to assess coral cover, richness and diversity, Indian Journal of Geo-Marine Sciences, 4, 29–37. Berger, W. H., and Parker, F. L. (1970) Diversity of planktonic foraminifera in deep sea sediments, Science, 168, 1345–1347. Besag, J. E., and Gleaves, J. T. (1973) On the detection of spatial pattern in plant communities, Bulletin of the International Statistical Institute, 45, 153–158. Bibby, C. J., Burgess, N. D., Hill, D. A., and Mustoe, S. (2000) Bird Census Techniques. London: Academic Press. Bitterlich, W. (1948) Die Winkelzählprobe. Allgemeine Forst- und Holzwirtschaftliche Zeitung, 59, 45. Böhning, D. (2008) A simple variance formula for population size estimators by conditioning, Statistical Methodology, 5, 410–423. Bonar, A. A., Fehmi, J. S., and Mercado-Silva, N. (2011) An overview of sampling issues in species diversity and abundance surveys. In A. E. Magurran and B. J. McGill (eds), Biological Diversity. Oxford: Oxford University Press. pp. 11–24. Bonham, C. D. (2013) Measurements for Terrestrial Vegetation, 2nd edn. New York: Wiley.

Boose, E. R., Boose, E. F., and Lezberg, A. L. (1998) A practical method for mapping trees using distance measurements, Ecology, 79, 819–827. Borchers, D. L., and Efford, M. G. (2008) Spatially explicit maximum likelihood methods for capture-recapture studies, Biometrics, 64, 377–385. Borchers, D. L., and Fewster, R. M. (2016) Spatial capture-recapture models, Statistical Science, 31, 219–232. Botta-Dukát, Z. (2005) Rao’s quadratic entropy as a measure of functional diversity based on multiple traits, Journal of Vegetation Science, 16, 533–540. Braun-Blanquet, J. (1928) Pflanzensoziologie: Grundzüge der Vegetationskunde. Berlin: Springer-Verlag. Brillouin, L. (1962) Science and Information Theory, 2nd edn. New York: Dover. Britzke, E. R., and Herzog, C. (2009) Using Acoustic Surveys to Monitor Population Trends in Bats. Vicksburg, MS: US Army Engineer Research and Development Center. Buckland, S. T., Marsden, S. J., and Green, R. (2008) Estimating bird abundance: making methods work, Bird Conservation International, 18, 91–108. Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L., and Thomas, L. (2001) Introduction to Distance Sampling. Oxford: Oxford University Press. Bulmer, M. G. (1974) On fitting the Poisson lognormal distribution to species-abundance data, Biometrics, 30, 101–110. Bunge, J. A. (2013) A survey of software for fitting capture-recapture models, WIREs Computational Statistics, 5, 114–120. Burton, A. C., Neilson, E. W., Moreira, D., Ladle, A., Steenweg, R., Fisher, J. T., Bayne, E., and Boutin, S. (2015) Wildlife camera trapping: a review and recommendations for linking surveys to ecological processes, Journal of Applied Ecology, 52, 675–685. Byth, K. (1982) On robust distance-based intensity estimators, Biometrics, 38, 127–135. Byth, K., and Ripley, B. D. (1980) On sampling spatial patterns by distance methods, Biometrics, 36, 279–284. Caldwell, Z. R., Zgliczynski, B. J., Williams, G. 
J., and Sandin, S. A. (2016) Reef fish survey techniques: assessing the potential for standardizing methodologies, PLoS ONE, 11, e0153066. Camp, R. J., LaPointe, D. A., Hart, P. J., Sedgwick, D. E., and Canale, L. K. (2019) Large-scale tree mortality from Rapid Ohia Death negatively influences avifauna in lower Puna, Hawaii Island, USA, The Condor, 121, duz007. Canfield, R. H. (1941) Application of the line interception method in sampling range vegetation, Journal of Forestry, 39, 388–394. Cappo, M., Harvey, E. S., and Shortis, M. R. (2006) Counting and measuring fish with baited video techniques – an overview, Australian Society for Fish Biology 2006 Workshop Proceedings, 101–114. Catana, A. J. (1953) The wandering quarter method of estimating population density, Ecology, 44, 349–360. Chao, A. (1984) Nonparametric estimation of the number of classes in a population, Scandinavian Journal of Statistics, 11, 265–270. Chao, A. (1987) Estimating the population size for capture-recapture data with unequal capture probabilities, Biometrics, 43, 783–791. Chao, A. (1989) Estimating population size for sparse data in capture-recapture experiments, Biometrics, 45, 427–438. Chao, A. (2001) An overview of closed capture-recapture models, Journal of Agricultural, Biological and Environmental Statistics, 6, 158–175. Chao, A. (2005) Species richness estimation. In N. Balakrishnan, C. B. Read, and B. Vidakovic (eds), Encyclopaedia of Statistical Sciences, 2nd edn. New York: Wiley. vol. 12, pp. 7907–7916. Chao, A., and Lee, S.-M. (1992) Estimating the number of classes via sample coverage, Journal of the American Statistical Association, 87, 210–217.

Chao, A., and Shen, T.-J. (2010) Program SPADE (Species Prediction And Diversity Estimation). Program and Users Guide. Accessed at: http://chao.stat.nthu.edu.tw. Chao, A., Lee, S.-M., and Jeng, S.-L. (1992) Estimating population size for capture-recapture data when capture probabilities vary by time and individual animal, Biometrics, 48, 201–216. Chao, A., Wang, Y. T., and Jost, L. (2013) Entropy and the species accumulation curve: a novel entropy estimator via discovery rates of new species, Methods in Ecology and Evolution, 4, 1091–1100. Chao, A., Chazdon, R. L., Colwell, R. K., and Shen, T.-J. (2006) Abundance-based similarity indices and their estimation when there are unseen species in samples, Biometrics, 62, 361–371. Chao, A., Gotelli, N. J., Hsieh, T. C., Sander, E. L., Ma, K. H., Colwell, R. K., and Ellison, A. M. (2014) Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies, Ecological Monographs, 84, 45–67. Chao, A., Hsieh, T. C., Chazdon, R. L., Colwell, R. K., and Gotelli, N. J. (2015) Unveiling the species-rank abundance distribution by generalizing the Good-Turing sample coverage theory, Ecology, 96, 1189–1201. Chapman, D. G. (1951) Some properties of the hypergeometric distribution with applications to zoological censuses, University of California Publications in Statistics, 1, 131–160. Chapman, D. G. (1954) The estimation of biological populations, The Annals of Mathematical Statistics, 25, 1–15. Chiarucci, A., Wilson, J. B., Anderson, B. J., and de Dominicis, V. (1999) Cover versus biomass as an estimate of species abundance: does it make a difference to the conclusions? Journal of Vegetation Science, 10, 35–42. Chiu, C.-H., Wang, Y. T., Walther, B. A., and Chao, A. (2014) An improved nonparametric lower bound of species richness via a modified Good-Turing frequency formula, Biometrics, 70, 671–682. Cintrón, G., and Schaeffer-Novelli, Y. 
(1984) Methods for studying mangrove structure. In S. C. Snedaker and J. G. Snedaker (eds), The Mangrove Ecosystem: Research Methods. Paris: UNESCO. ch. 6. Clark, P. J., and Evans, F. C. (1954) Distance to nearest neighbor as a measure of spatial relationships in populations, Ecology, 35, 445–453. Clarke, K. R. (1990) Comparisons of dominance curves, Journal of Experimental Marine Biology and Ecology, 138, 143–157. Clarke, K. R., and Warwick, R. M. (1999) The taxonomic distinctness measure of biodiversity: weighting of step lengths between hierarchical levels, Marine Ecology Progress Series, 184, 21–29. Colwell, R. K., and Coddington, J. A. (1994) Estimating terrestrial biodiversity through extrapolation, Philosophical Transactions of the Royal Society of London, Series B, 345, 101–118. Cook, R. D., and Jacobson, J. O. (1979) A design for estimating visibility bias in aerial surveys, Biometrics, 35, 735–742. Corlatti, L., Nelli, L., Bertolini, M., Zibordi, F., and Pedrotti, L. (2017) A comparison of four different methods to estimate population size of Alpine marmot (Marmota marmota), Hystrix, 28, 1–7. Cormack, R. M. (1964) Estimates of survival from sighting of marked animals, Biometrika, 51, 429–438. Cottam, G., and Curtis, J. T. (1956) The use of distance measures in phytosociological sampling, Ecology, 37, 451–460. Cottam, G., Curtis, J. T., and Hale, B. W. (1953) Some sampling characteristics of a population of randomly dispersed individuals, Ecology, 34, 741–757. Damgaard, C. (2014) Estimating mean plant cover from different types of cover data: a coherent statistical framework, Ecosphere, 5(2), 20.

Darroch, J. N., and Ratcliff, D. (1980) A note on capture-recapture estimation, Biometrics, 36, 149–153. Daubenmire, R. F. (1959) A canopy-cover method of vegetational analysis, Northwest Science, 33, 43–46. Dawe, E. G., Hoenig, J. M., and Xu, X. (1993) Change-in-ratio and index-removal methods for population assessment and their application to snow crab (Chionoecetes opilio), Canadian Journal of Fisheries and Aquatic Sciences, 50, 1467–1476. Duchesne, R. R., Chopping, M. J., and Tape, K. D. (2016) Capability of the CANAPI algorithm to derive shrub structural parameters from satellite imagery in the Alaskan Arctic, Polar Record, 52, 124–133. de Vries, P. G. (1973) A General Theory on Line Intersect Sampling with Application to Logging Residue Inventory. Wageningen, Netherlands: Mededelingen Landbouwhogeschool. DeLury, D. B. (1947) On the estimation of biological populations, Biometrics, 3, 145–167. Denney, C., Fields, R., Gleason, M., and Starr, R. (2017) Development of new methods for quantifying fish density using underwater stereo-video tools, Journal of Visualized Experiments, 129, 56635. Diggle, P. J. (1975) Robust density estimation using distance methods, Biometrika, 62, 39–48. Diggle, P. J. (1983) Statistical Analysis of Spatial Point Patterns. London: Academic Press. Dobrowski, S. Z., and Murphy, S. K. (2006) A practical look at the variable area transect, Ecology, 87, 1856–1860. Dollar, S. J. (1982) Wave stress and coral community structure in Hawaii, Coral Reefs, 1, 71–81. Domin, K. (1928) The relations of the Tatra mountain vegetation to the edaphic factors of the habitat: a synecological study, Acta Botanica Bohemica, 6/7, 133–164. Dorazio, R. M., Jelks, H. L., and Jordan, F. (2005) Improving removal-based estimates of abundance by sampling a population of spatially distinct subpopulations, Biometrics, 61, 1093–1101. Ducey, M. J., Gove, J. H., and Valentine, H. T. 
(2004) A walkthrough solution to the boundary overlap problem, Forest Science, 50, 427–435. Ducey, M. J., Williams, M. S., Gove, J. H., Roberge, S., and Kenning, R. S. (2013) Distance-limited perpendicular distance sampling for coarse woody debris: theory and field results, Forestry, 86, 119–128. Duchamp, J. E., Yates, M., Muzika, R.-M., and Swihart, R. K. (2006) Estimating probabilities of detection for bat echolocation calls: an application of the double-observer method, Wildlife Society Bulletin, 34, 408–412. Efford, M. G. (2004) Density estimation in live-trapping studies, Oikos, 106, 598–610. Efford, M. G. (2011) Estimation of population density by spatially explicit capture-recapture with area searches, Ecology, 92, 2202–2207. Efford, M. G. (2019) Non-circular home ranges and the estimation of population density, Ecology, 100, e02580. doi:10.1002/ecy.2580. Elzinga, C. L., Salzer, D. W., and Willoughby, J. W. (1998) Measuring and Monitoring Plant Populations. Denver, CO: Bureau of Land Management. Engeman, R. M., Nielsen, R. M., and Sugihara, R. T. (2005) Evaluation of optimized variable area transect sampling using totally enumerated field data sets, Environmetrics, 16, 767–772. Engeman, R. M., Sugihara, R. T., Pank, L. F., and Dusenberry, W. E. (1994) A comparison of plotless density estimators using Monte Carlo simulation, Ecology, 75, 1769–1779. Enquist, B. J., Feng, X., Boyle, B., Maltner, B., Newman, E. A., Jørgensen, P. M., Roehrdanz, P. R., Thiers, B. M., Burger, J. R., Corlett, R. T., Couvreur, L. P., Dauby, G., Donoghue, J. C., Foden, W., Lovett, J. C., Marquet, P. A., Merow, C., Midgley, G., Morueta-Holme, N., Neves, D. M., Oliveira-Filho, A. T., Kraft, N. J. B., Park, D. S., Peet, R. K., Pillet, M., Serra-Diaz, J. M., Sandel, B., Schildhauer, M., Šimová, I., Violle, C., Wieringa, J. J., Wiser, S. K., Hannah, L., Svenning, J.-C., and McGill, B. J. 
(2019) The commonness of rarity: global and future distribution of rarity across land plants, Science Advances, 5, eaaz0414.

Espeland, E. K., and Emam, T. M. (2011) The value of structuring rarity: the seven types and links to reproductive ecology, Biodiversity and Conservation, 20, 963–985. Evans, R. A., and Love, R. M. (1957) The step-point method of sampling: a practical tool in range research, Journal of Range Management, 10, 208–212. Farnsworth, G. I., Pollock, K. H., Nichols, J. D., Simons, T. R., Hines, J. E., and Sauer, J. R. (2002) A removal model for estimating detection probabilities from point-count surveys, The Auk, 119, 414–425. Fattorini, S., Cardoso, P., Rigal, F., and Borges, P. A. V. (2012) Use of arthropod rarity for area prioritisation: insights from the Azorean islands, PLoS ONE, 7, e33995. Fehmi, J. S. (2010) Confusion among three common plant cover definitions may result in data unsuited for comparison, Journal of Vegetation Science, 21, 273–279. Fernandes, P. G., Ferlotto, F., Holliday, D. V., Nakken, O., and Simmonds, J. (2002) Acoustic applications in fisheries science: the ICES contribution, ICES Marine Symposia, 215, 483–492. Fisher, R. A., Corbet, A. S., and Williams, C. B. (1943) The relation between the number of species and the number of individuals in a random sample of an animal population, Journal of Animal Ecology, 12, 42–58. Fiske, I. J., and Chandler, R. B. (2011) unmarked: an R package for fitting hierarchical models of wildlife occurrence and abundance, Journal of Statistical Software, 43, 1–23. Fletcher, R. J., Jr, and Hutto, R. L. (2006) Estimating detection probabilities of river birds using double surveys, The Auk, 123, 695–707. Forcey, G. M., and Anderson, J. T. (2002) Variation in bird detection probabilities and abundances among different point count durations and plot sizes, Proceedings of the Annual Conference of the Southeastern Association of Fish and Wildlife Agencies, 56, 331–342. Forcey, G. M., Anderson, J. T., Ammer, F. K., and Whitmore, R. C. 
(2006) Comparison of two double-observer point-count approaches for estimating breeding bird abundance, Journal of Wildlife Management, 70, 1674–1681. Gibbons, D. W., and Gregory, R. D. (2006) Birds. In W. J. Sutherland (ed.), Ecological Census Techniques. Cambridge: Cambridge University Press. ch. 9. Good, I. J. (1953) The population frequencies of species and the estimation of population parameters, Biometrika, 40, 237–264. Goodall, D. W. (1952) Some considerations in the use of point quadrats for the analysis of vegetation, Australian Journal of Scientific Research, Series B, 5, 1–41. Gove, J. H., and van Deusen, P. C. (2011) On fixed-area plot sampling for downed coarse woody debris, Forestry, 84, 109–117. Gove, J. H., Ducey, M. J., Valentine, H. T., and Williams, M. S. (2013) A comprehensive comparison of perpendicular distance methods for sampling downed coarse wood debris, Forestry, 86, 129–143. Gray, J. S., Bjørgesæter, M. K., and Ugland, K. I. (2006) On plotting species abundance distributions, Journal of Animal Ecology, 75, 752–756. Gregoire, T. G. (1982) The unbiasedness of the mirage correction procedure for boundary overlap, Forest Science, 28, 504–508. Gregoire, T. G., and Valentine, H. T. (2003) Line intersect sampling: ell-shaped transects and multiple intersections, Environmental and Ecological Statistics, 10, 263–279. Grimm, A., Gruber, B., and Henle, K. (2014) Reliability of different mark-recapture methods for population size estimation tested against reference population sizes constructed from field data, PLoS ONE, 9, e98840. Grosenbaugh, L. R. (1964) Some suggestions for better sample-tree-measurement. In Proceedings of the Society of American Foresters Annual Meeting, 20–23 October 1963, Boston. pp. 36–42. Guillera-Arroita, G., Kéry, M., and Lahoz-Monfort, J. J. (2019) Inferring species richness using multispecies occupancy modeling: estimation performance and interpretation, Ecology and Evolution, 9, 780–792.

Halford, A. R., and Thompson, A. A. (1994) Visual Census Surveys of Reef Fish. Townsville: Australian Institute of Marine Science. Hall, P. G., Melville, G. J., and Welsh, A. H. (2001) Bias correction and bootstrap methods for a spatial sampling scheme, Bernoulli, 7, 829–846. Harley, S. J., Myers, R. A., and Dunn, A. (2001) Is catch-per-unit-effort proportional to abundance? Canadian Journal of Fisheries and Aquatic Sciences, 58, 1760–1772. Harrison, S. P., Ross, S. J., and Lawton, J. H. (1992) Beta diversity on geographic gradients in Britain, Journal of Animal Ecology, 61, 151–158. Hartley, L. J. (2012) Five-minute bird counts in New Zealand, New Zealand Journal of Ecology, 36, 268–278. Hartley, S., and Kunin, W. E. (2003) Scale dependency of rarity, extinction risk, and conservation priority, Conservation Biology, 17, 1559–1570. Harvey, E. S., Cappo, M., Butler, J. J., Hall, N., and Kendrick, G. A. (2007) Bait attraction affects the performance of remote underwater video stations in assessment of demersal fish community structure, Marine Ecology Progress Series, 350, 245–254. He, F., and Gaston, K. J. (2000) Estimating species abundance from occurrence, The American Naturalist, 156, 553–559. Heard, G. W., Scroggie, M. P., Clemann, N., and Ramsey, D. S. L. (2014) Wetland characteristics influence disease risk for a threatened amphibian, Ecological Applications, 24, 250–262. Heip, C. (1974) A new index measuring evenness, Journal of the Marine Biological Association of the United Kingdom, 54, 555–557. Hijbeek, R., Koedam, N., Khan, M. N. I., Kairo, J. G., Schoukens, J., and Dahdouh-Guebas, F. (2013) An evaluation of plotless sampling using vegetation simulations and field data from a mangrove forest, PLoS ONE, 8, e67201. Hill, M. O. (1973) Diversity and evenness: a unifying notation and its consequences, Ecology, 54, 427–432. Horvitz, D. G., and Thompson, D. J. 
(1952) A generalization of sampling without replacement from a finite universe, Journal of the American Statistical Association, 47, 663–685. Huggins, R. M. (1989) On the statistical analysis of capture experiments, Biometrika, 76, 133–140. Jennings, S. B., Brown, N. D., and Sheil, D. (1999) Assessing forest canopies and understorey illumination: canopy closure, canopy cover and other measures, Forestry, 72, 59–74. Johannesson, K. A., and Mitson, R. B. (1983) Fisheries acoustics: a practical manual for aquatic biomass estimation, FAO Fisheries Technical Paper, 240, 1–249. Jolly, G. M. (1963) Estimates of population parameters from multiple recapture data with both death and dilution – deterministic model, Biometrika, 50, 113–128. Jolly, G. M. (1965) Explicit estimates from capture-recapture data with both death and immigration-stochastic model, Biometrika, 52, 225–247. Jost, L. (1993) A simple distance estimator for plant density in nonuniform stands: mathematical appendix. Accessed at: http://www.loujost.com/Statistics%20and%20Physics/PCQ/PCQJournalArticle.htm. Kaiser, L. (1983) Unbiased estimation in line-intercept sampling, Biometrics, 39, 965–976. Keating, K. A., Schwartz, C. C., Haroldson, M. A., and Moody, D. (2002) Estimating numbers of females with cubs-of-the-year in the Yellowstone grizzly bear population, Ursus, 13, 161–174. Keeley, J. E., and Fotheringham, C. J. (2005) Plot shape effects on plant species diversity measurements, Journal of Vegetation Science, 16, 249–256. Kelker, G. H. (1940) Estimating deer populations by a differential hunting loss in the sexes, Proceedings of the Utah Academy of Science, 17, 65–69. Kemp, C. D., and Kemp, A. W. (1956) The analysis of point quadrat data, Australian Journal of Botany, 4, 167–174. Kendall, W. L., and Bjorkland, R. (2001) Using open robust design models to estimate temporary emigration from capture-recapture data, Biometrics, 57, 1113–1122.

Kendall, W. L., Nichols, J. D., and Hines, J. E. (1997) Estimating temporary emigration using capture-recapture data with Pollock’s robust design, Ecology, 78, 563–578. Kendall, W. L., Pollock, K. H., and Brownie, C. (1995) A likelihood-based approach to capture-recapture estimation of demographic parameters under the robust design, Biometrics, 51, 293–308. Kent, M., and Coker, P. (1994) Vegetation Description and Analysis. Chichester: Wiley. Kershaw, J. A., Jr, Ducey, M. J., Beers, T. W., and Husch, B. (2016) Forest Mensuration, 5th edn. New York: Wiley-Blackwell. Khan, M. N. I., Hijbeek, R., Berger, U., Koedam, N., Grueters, U., Islam, S. M. Z., Hasan, M. A., and Dahdouh-Guebas, F. (2016) An evaluation of the plant density estimator the Point-Centred Quarter Method (PCQM) using Monte Carlo simulation, PLoS ONE, 11(6), e0157985. Kiani, B., Fallah, A., Tabari, M., Hosseini, S. M., and In Parizi, M.-H. (2013) A comparison of distance sampling methods in Saxaul (Halloxylon Ammodendron c. a. Mey Bunge) shrublands, Polish Journal of Ecology, 61, 207–219. Koleff, P., Gaston, K. J., and Lennon, J. J. (2003) Measuring beta diversity for presence-absence data, Journal of Animal Ecology, 72, 367–382. Krajina, V. J. (1933) Die pflanzengesellschaften des Mlynica-Tales in den Visoke Tatry (Hohe Tatra), Beihefte zum Botanischen Centralblatt, 50, 774–957; 51, 1–224. Kuehl, R. O., McClaran, M. P., and van Zee, J. (2001) Detecting fragmentation of cover in desert grasslands using line intercept, Journal of Range Management, 54, 61–66. Kvålseth, T. O. (2015) Evenness indices once again: critical analysis of properties, SpringerPlus, 4, 232. Leinster, T., and Cobbold, C. A. (2012) Measuring diversity: the importance of species similarity, Ecology, 93, 477–489. Lennon, J. J., Koleff, P., Greenwood, J. J. D., and Gaston, K. J. (2001) The geographical structure of British bird distributions: diversity, spatial turnover and scale, Journal of Animal Ecology, 70, 966–979. 
Leslie, P. H., and Davis, D. H. S. (1939) An attempt to determine the absolute number of rats on a given area, Journal of Animal Ecology, 8, 94–113. Leujak, W., and Ormond, R. F. G. (2007) Comparative accuracy and efficiency of six coral community survey methods, Journal of Experimental Marine Biology and Ecology, 351, 168–187. Levy, E. B., and Madden, E. A. (1933) The point method of pasture analysis, New Zealand Journal of Agriculture, 46, 267–279. Lincoln, F. C. (1930) Calculating Waterfowl Abundance on the Basis of Banding Returns. Circular 118. Washington, DC: US Department of Agriculture. Loeb, S. C., Rodhouse, T. J., Ellison, L. E., Lausen, C. L., Reichard, J. D., Irvine, K. M., Ingersoll, T. E., Coleman, J. T. H., Thogmartin, W. E., Sauer, J. R., Francis, C. M., Bayless, M. L., Stanley, T. T., and Johnson, D. H. (2015) A Plan for the North American Bat Monitoring Program (NABat). Asheville, NC: USDA. Longino, J. T., Coddington, J., and Colwell, R. K. (2002) The ant fauna of a tropical rain forest: estimating species richness three different ways, Ecology, 83, 689–702. Lucas, H. A., and Seber, G. A. F. (1977) Estimating coverage and particle density using the line intercept method, Biometrika, 64, 618–622. Mac Nally, R., Duncan, R. P., Thomson, J. R., and Yen, J. D. L. (2017) Model selection using information criteria, but is the ‘best’ model any good? Journal of Applied Ecology, 55, 1441–1444. MacArthur, R. H. (1965) Patterns of species diversity, Biological Reviews, 40, 510–533. Magurran, A. E., Queiroz, H. L., and Hercos, A. P. (2013) Relationship between evenness and body size in species rich assemblages, Biology Letters, 9. Accessed at: https://doi.org/10.1098/rsbl.2013.0856.

Mallet, D., and Pelletier, D. (2014) Underwater video techniques for observing coastal marine biodiversity: a review of sixty years of publications (1952–2012), Fisheries Research, 154, 44–62. Manly, B. F. J. (1984) Obtaining confidence limits on parameters of the Jolly-Seber model for capture-recapture data, Biometrics, 40, 749–758. Manly, B. F. J., and McDonald, L. L. (1996) Sampling wildlife populations, Chance, 9, 9–20. Margalef, R. (1958) Information theory in ecology, International Journal of General Systems, 3, 36–71. Marini, S., Fanelli, E., Sbragaglia, V., Azzurro, E., del Rio Fernandez, J., and Aguzzi, J. (2018) Tracking fish abundance by underwater image recognition, Scientific Reports, 8, 137–148. Mark, A. F., and Esler, A. E. (1970) An assessment of the point-centred quarter method of plotless sampling in some New Zealand forests, Proceedings of the New Zealand Ecological Society, 17, 106–110. Marshall, D. D., Iles, K., and Bell, J. F. (2004) Using a large-angle gauge to select trees for measurement in variable plot sampling, Canadian Journal of Forestry Research, 34, 840–845. Masuyama, M. (1954) On the error in crop cutting experiment due to the bias on the border of the grid, Sankhya, 14, 181–186. Matheron, G. (1989) Estimating and Choosing. Berlin: Springer. Mathews, F., Kubasiewicz, L. M., Gurnell, J., Harrower, C. A., McDonald, R. A., and Shore, R. F. (2018) A Review of the Population and Conservation Status of British Mammals: Technical Summary. A report by the Mammal Society under contract to Natural England, Natural Resources Wales and Scottish Natural Heritage. Peterborough: Natural England. Matsuoka, S. M., Mahon, C. L., Handel, C. M., Sólymos, P., Bayne, E. M., Fontaine, P. C., and Ralph, C. J. (2014) Reviving common standards in point-count surveys for broad inference across studies, The Condor, 116, 599–608. Matthews, T. J., Borregaard, M. K., Ugland, K. I., Borges, P. A. 
V., Rigal, F., Cardoso, P., and Whittaker, R. J. (2014) The gambin model provides a superior fit to species abundance distributions with a single free parameter: evidence, implementation and interpretation, Ecography, 37, 1002–1011. Matthews, T. J., Triantis, K. A., Rigal, F., Borregaard, M. K., Guilhaumon, F., and Whittaker, R. J. (2016) Island species-area relationships and species accumulation curves are not equivalent: an analysis of habitat island datasets, Global Ecology and Biogeography, 25, 607–618. Matthews, T. J., Borges, P. A. V., Azevedo, E. B., and Whittaker, R. J. (2017) A biogeographical perspective on species abundance distributions: recent advances and opportunities for future research, Journal of Biogeography, 44, 1705–1710. Matthews, T. J., Borregaard, M. K., Gillespie, C. S., Rigal, F., Ugland, K. I., Krüger, R. F., Marques, R., Sadler, J. P., Borges, P. A. V., Kubota, Y., and Whittaker, R. J. (2019) Extension of the gambin model to multimodal species abundance distributions, Methods in Ecology and Evolution, 10, 432–437. Maunder, M. N., Sibert, J. R., Fonteneau, A., Hampton, J., Kleiber, P., and Harley, S. J. (2006) Interpreting catch per unit effort data to assess the status of individual stocks and communities, ICES Journal of Marine Science, 63, 1373–1385. McClintock, B. T., White, G. C., and Pryde, M. A. (2009) Improved methods for estimating abundance and related demographic parameters from mark-resight data, Biometrics, 75, 1–11. McClintock, B. T., White, G. C., Antolin, M. F., and Tripp, D. W. (2009) Estimating abundance using mark-resight when sampling is with replacement or the number of marked individuals is unknown, Biometrics, 65, 237–246. McDonald, L. L. (1980) Line-intercept sampling for attributes other than coverage and density, Journal of Wildlife Management, 44, 530–533. McGill, B. J., Etienne, R. S., Gray, J. S., Alonso, D., Anderson, M. J., Benecha, H. K., Dornelas, M., Enquist, B. J., Green, J. L., He, F., Hurlbert, A. 
H., Magurran, A. E., Marquet, P. A.,

REFERENCES  | 217 Maurer, B. A., Ostling, A., Soykan, C. U., Ugland, K. I., and White, E. P. (2007) Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework, Ecology Letters, 10, 995–1015. Menhinick, E. F. (1964) A comparison of some species-individuals diversity indices applied to samples of field insects, Ecology, 45, 859–861. Mitchell, K. (2015) Quantitative Analysis by the Point-Centered Quarter Method. Accessed at: https://arxiv.org/abs/1010.3303v2. Moore, P. G. (1954) Spacing in plant populations, Ecology, 35, 222–227. Moran, P. A. P. (1951) A mathematical theory of animal trapping, Biometrika, 38, 307–311. Morisita, M. (1954) Estimation of population density by spacing method, Memoirs of the Faculty of Science, Kyushu University, Series E, 1, 187–197. Morisita, M. (1957) A new method for the estimation of density by the spacing method applicable to non-randomly distributed populations (translated by USDA, Forest Service, 1960), Physiology and Ecology (in Japanese), 7, 134–144. Morisita, M. (1959) Measuring of the dispersion and analysis of distribution patterns, Memoires of the Faculty of Science, Kyushu University, Series E. Biology, 2, 215–235. Morrison, D. A., Le Brocque, A. F., and Clarke, P. J. (1995) An assessment of some improved techniques for estimating the abundance (frequency) of sedentary organisms, Vegetatio, 120, 131–145. Nemec, A. F. L., and Davis, G. (2002) Efficiency of six line intersect sampling designs for estimating volume and density of coarse woody debris. Research Section, Vancouver Forest Region, B.C. Ministry of Forests, Nanaimo, B.C., Technical Report, TR-021/2002. Nichols, J. D., Hines, J. E., Sauer, J. R., Fallon, F. W., Fallon, J. E., and Heglund, P. J. (2000) A double-observer approach for estimating detection probability and abundance from point counts, The Auk, 117, 393–408. Niedballa, J., Sollmann, R., Courtiol, A., and Wilting, A. 
(2016) camtrapR: an R package for efficient camera trap data management, Methods in Ecology and Evolution, 7, 1457–1462. Norvell, R. E., Howe, F. P., and Parrish, J. R. (2003) A seven-year comparison of relativeabundance and distance-sampling methods, The Auk, 120, 1013–1028. Otis, D. L., Burnham, K. P., White, G. C., and Anderson, D. R. (1978) Statistical inference from capture data on closed animal populations, Wildlife Monographs, 62, 11–35. Outhred, R. K. (1984) Semi-quantitative sampling in vegetation survey. In K. Myers, C. R. Margules and I. Musto (eds), Survey Methods for Nature Conservation. Canberra: CSIRO. pp. 87–100. Palmer, M. W., and White, P. S. (1994) Scale dependence and the species-area relationship, The American Naturalist, 144, 717–740. Parker, K. R. (1979) Density estimation by variable area transect, Journal of Wildlife Management, 43, 484–492. Peet, R. K., Wentworth, T. R., and White, P. S. (1998) A flexible, multipurpose method for recording composition and structure, Castanea, 63, 262–274. Petersen, C. G. J. (1896) The yearly immigration of young plaice into the Limfjord from the German Sea, Report of the Danish Biological Station, 6, 5–84. Pielou, E. C. (1966) The measurement of diversity in different types of biological collections, Journal of Theoretical Biology, 13, 131–144. Pielou, E. C. (1979) Mathematical Ecology. New York: Wiley. Pollard, J. H. (1971) On distance estimators of density in randomly distributed forests, Biometrics, 27, 991–1002. Pollock, K. H. (1982) A capture-recapture design robust to unequal probability of capture, Journal of Wildlife Management, 46, 757–760. Pollock, K. H., Hines, J. E., and Nichols, J. D. (1984) The use of auxiliary variables in capturerecapture and removal experiments, Biometrics, 40, 329–340. Preston, F. W. (1948) The commonness, and rarity, of species, Ecology, 29, 254–283.

218  |  MEASURING ABUNDANCE Prodan, M. (1968) Punktstichprobe für die Forsteinrichtung, Der Forst- und Holzwirt, 23, 225–226. Rabinowitz, D. (1981) Seven forms of rarity. In H. Synge (ed.), The Biological Aspects of Rare Plant Conservation. Chichester: Wiley. pp. 205–217. Rademaker, M., Meijaard, E., Semiadi, G., Blokland, S., Neilson, E. W., and Rode-Margono, E. J. (2016) First ecological study of the Bawean Warty Pig (Sus blouchi), one of the rarest pigs on Earth, PLoS ONE, 11, e0151732. Ralph, C. J., Droege, S., and Sauer, J. R. (1995) Managing and monitoring birds using point counts: standards and applications. In C. J. Ralph, J. R. Sauer, and S. Droege (eds), Monitoring Bird Populations by Point Counts. Albany, CA: US Forest Service. pp. 161–175. Rao, C. R. (1982) Diversity and dissimilarity coefficients: a unified approach, Theoretical Population Biology, 21, 24–43. Rao, C. R. (2010) Quadratic entropy and analysis of diversity, Sankhyā, 72-A, 70–80. Rayburn, A. P., Schiffers, K., and Schupp, E. W. (2011) Use of precise spatial data for describing spatial patterns and plant interactions in a diverse Great Basin shrub community, Plant Ecology, 212, 585–594. Riddle, J. D., Pollock, K. H., and Simons, T. R. (2010) An unreconciled double-observer method for estimating detection probability and abundance, The Auk, 127, 841–849. Rivest, L.-P., and Baillargeon, S. (2007) Applications and extensions of Chao’s moment estimator for the size of a closed population, Biometrics, 63, 999–1006. Roberts, T. E., Bridge, T. C., Caley, M. J., Baird, A. H. (2016) The point count transect method for estimates of biodiversity on coral reefs: improving the sampling of rare species. PLoS ONE, 11, e0152335. Rohlf, F. J., and Archie, J. W. (1978) Least-squares mapping using interpoint distances, Ecology, 59, 126–132. Rothery, P. (1974) The number of pins in a point quadrat frame, Journal of Applied Ecology, 11, 745–754. Rowcliffe, J. M., Field, J., Turvey, S. T., and Carbone, C. 
(2008) Estimating animal density using camera traps without the need for individual recognition, Journal of Animal Ecology, 45, 1228–1236. Rowcliffe, J. M., Carbone, C., Jansen, P. A., Kays, R., and Kranstauber, B. (2011) Quantifying the sensitivity of camera traps: an adapted distance sampling approach, Methods in Ecology and Evolution, 2, 464–476. Royle, J. A. (2004) Generalized estimators of avian abundance from count survey data, Animal Biodiversity and Conservation, 27, 375–386. Royle, J. A., and Nichols, J. D. (2003) Estimating abundance from repeated presence-absence data or point counts, Ecology, 84, 777–790. Royle, J. A., and Young, K. V. (2008) A hierarchical model for spatial capture-recapture data, Ecology, 89, 2281–2289. Royle, J. A., Fuller, A. K., and Sutherland, C. (2016) Spatial capture-recapture models allowing Markovian transience or dispersal, Population Ecology, 58, 53–62. Russell, M. B., Fraver, S., Aakala, T., Gove, J. H., Woodall, C. W., D’Amato, A. W., and Ducey, M. J. (2015), Quantifying carbon stores and decomposition in dead wood: A review, Forest Ecology and Management, 350, 107–128. Sadinle, M. (2009) Transformed logit confidence intervals for small populations in single capture-recapture estimation, Communications in Statistics – Simulation and Computation, 38, 1909–1924. Schmid-Haas, P. (1969) Stichproben am waldrand, Mitteilungen der Schweizerische Anstalt für das Forstliche Versuchswesen, 45, 234–303. Schmidt, J. H., and Rattenbury, K. L. (2017) An open-population distance sampling framework for assessing population dynamics in group-dwelling species, Methods in Ecology and Evolution, 9, 936–945.

REFERENCES  | 219 Schnabel, Z. E. (1938) The estimation of total fish population of a lake, The American Mathematical Monthly, 45, 348–352. Seber, G. A. F. (1965) A note on the multiple recapture census, Biometrika, 52, 249–259. Seber, G. A. F. (1982) The Estimation of Animal Abundance and Related Parameters, 2nd edn. London: Griffin. Shannon, C. E. (1948) A mathematical theory of communication, The Bell System Technical Journal, 27, 379–423. Sheldon, A. L. (1969) Equitability indices: dependence on the species count, Ecology, 50, 466–467. Sheil, D. (2002) Biodiversity research in Malinau. In CIFOR, ITTO Project PD 12/97 Rev. I (F): Forest, science and sustainability: the Bulungan model forest. Technical Report Phase 1, 1997–2001. Bogor, Indonesia: CIFOR and ITTO. pp. 57–107. Sheil, D., Ducey, M. J., Sidiyasa, K., and Samsoedin, I. (2003) A new type of sample unit for the efficient assessment of diverse tree communities in complex forest landscapes, Journal of Tropical Forest Science, 15, 117–135. Silva, L. B., Alves, M., Elias, R. B., and Silva, L. (2017) Comparison of T-square, pointcentered quarter, and N-tree sampling methods in Pittosporum undulatum invaded woodlands, International Journal of Forestry Research, 2818132. Accessed at: https://doi. org/10.1155/2017/2818132. Simpson, E. H. (1949) Measurement of diversity, Nature, 163, 688. Smith, W. P., Twedt, D. J., Hamel, P. B., Ford, R. P., Weidenfeld, D. A., and Cooper, R. J. (1998) Increasing point-count duration increases standard error, Journal of Field Ornithology, 69, 450–456. Stevens, D. L., Jr, and Olsen, A. R. (2004) Spatially balanced sampling of natural resources, Journal of the American Statistician Association, 99, 262–278. Stohlgren, T. J., Falkner, M. B., and Schell, L. D. (1995) A Modified-Whittaker nested vegetation sampling method, Vegetatio, 117, 113–121. Strauss, D., and Neal, D. L. (1963) Biases in the step-point method on bunchgrass ranges, Journal of Range Management, 36, 623–626. 
Tidmarsh, C. E. M., and Havenga, C. M. (1955) The Wheel-Point Method of Survey and Measurement of Semi-Open Grasslands and Karoo Vegetation in South Africa. Pretoria: Botanical Survey of South Africa, Memoir 29. Ugland, K. I., Lambshead, P. J. D., McGill, B. J., Gray, J. S., O’Dea, N., Ladle, R. J., and Whittaker, R. J. (2007) Modelling dimensionality in species abundance distributions: description and evaluation of the gambin model, Evolutionary Ecology Research, 9, 313–324. Upton, G. J. G. (2016) Categorical Data Analysis by Example. New York: Wiley. Upton, G. J. G., and Fingleton, B. (1985) Spatial Data Analysis by Example. Vol 1: Point Pattern and Quantitative Data. Chichester: Wiley. Valentine, H. T., Ducey, M. J., Gove, J. H., Lanz, A., and Affleck, D. L. R. (2006) Corrections for cluster-plot slop, Forest Science, 52, 55–66. Vicharnakorn, P., Shrestha, R. P., Nagai, M., Salam, A. P., and Kiratiprayoon, S. (2014) Carbon stock assessment using remote sensing and forest inventory data in Savannakhet, Lao PDR, Remote Sensing, 6, 5452–5479. Warde, W., and Petranka, J. W. (1981) A correction factor table for missing point-center quarter data, Ecology, 62, 491–494. Warwick, R. M., and Clarke, K. R. (1995) New ‘biodiversity’ measures reveal a decrease in taxonomic distinctness with increasing stress, Marine Ecology Progress Series, 129, 301–305. West, P. W. (2011) Potential for wider application of 3P sampling in forest inventory, Canadian Journal of Forestry Research, 41, 1500–1508. White, G. C., and Burnham, K. P. (1999) Program MARK: survival estimation from populations of marked animals, Bird Study, 46, sup1, S120–S139.

220  |  MEASURING ABUNDANCE White, N. A., Engeman, R. M., Sugihara, R. T., and Krupa, H. W. (2008) A comparison of plotless density estimators using Monte Carlo simulation on totally enumerated field data sets, BMC Ecology, 8, 6. Whittaker, R. H. (1960) Vegetation of the Siskiyou mountains, Oregon and California, Ecological Monographs, 30, 279–338. Whittaker, R. H. (1965) Dominance and diversity in land plant communities, Science, 147, 250–260. Whittaker, R. H. (1972) Evolution and measurement of species diversity, Taxon, 21, 213–251. Whittaker, R. H. (1977) Evolution of species diversity on land communities, Journal of Evolutionary Biology, 10, 1–67. Whittington, J., and Sawaya, M. A. (2015) A comparison of grizzly bear demographic parameters estimated from non-spatial and spatial open population capture-recapture models, PLoS ONE, 10, e0134446. Widmer, L., Heule, E., Colombo, M., Ruegg, A., Indermaur, A., Ronco, F., and Salzburger, W. (2019) Point-Combination Transect (PCT): incorporation of small underwater cameras to study fish communities, Methods in Ecology and Evolution, 10, 891–901. Willers, J. L., Yatham, S. R., Williams, M. R., and Akins, D. C. (1992) Utilization of the lineintercept method to estimate the coverage, density, and average length of row skips in cotton and other row crops, Annual Conference on Applied Statistics in Agriculture, Proceedings, 48–59. Williams, F. M. (1977) Model-free evenness: an alternative to diversity measures. Paper S23-6 presented 8/8/1977 to Second International Ecological Congress, Satellite program in Statistical Ecology, Satellite A: NATO-Advanced Study Institute and ISEP Research Workshop, Jerusalem. Williams, M. S., and Gove, J. H. (2003) Perpendicular distance sampling, another method of sampling coarse woody debris, Canadian Journal of Forestry Research, 33, 1564–1579. Williams, M. S., Ducey, M. J., and Gove, J. H. 
(2005) Assessing surface area of coarse woody debris with line intersect and perpendicular distance sampling, Canadian Journal of Forestry Research, 35, 949–960. Williamson, M., and Gaston, K. J. (2005) The lognormal distribution is not an appropriate null hypothesis for the species-abundance distribution, Journal of Animal Ecology, 74, 409–422. Willis, T. J. (2001) Visual census methods underestimate density and diversity of cryptic reef fishes, Journal of Fish Biology, 59, 1408–1411. Wilson, D. J., Mulvey, R. L., Clarke, D. A., and Reardon, J. T. (2017) Assessing and comparing population densities and indices of skinks under three predator management regimes, New Zealand Journal of Ecology, 41, 84–97. Wilson, J. W. (1960) Inclined point quadrats, New Phytologist, 59, 1–7. Wilson, J. W. (1963) Errors resulting from thickness of point quadrats, Australian Journal of Botany, 11, 178–188. Wilson, M. V., and Shmida, A. (1984) Measuring beta diversity with presence-absence data, Journal of Ecology, 72, 1055–1064. Wilson, R. M., and Collins, M. F. (1992) Capture-recapture estimation with samples of size one using frequency data, Biometrika, 79, 543–553. Yang, T.-R., Hsu, Y.-H., Kershaw, J. A., Jr, McGarrigle, E., and Kilham, D. (2017) Big BAF sampling in mixed species forest structures of northeastern North America: influence of count and measure BAF under cost constraints, Forestry, 90, 649–660. Yarranton, G. A. (1966) A plotless method of sampling vegetation, Journal of Ecology, 54, 229–237. Yee, T. W., Stoklosa, J., and Huggins, R. M. (2015) The VGAM package for capture-recapture data using the conditional likelihood, Journal of Statistical Software, 65, 1–33. Yen, J. D. L., Thomson, J. R., and Mac Nally, R. (2012) Is there an ecological basis for species abundance distributions?, Oecologia, 171, 517–525.

REFERENCES  | 221 Yen, J. D. L., Thomson, J. R., Paganin, D. M., Keith, J. M., and Mac Nally, R. (2014) Function regression in ecology and evolution: FREE, Methods in Ecology and Evolution, 6, 17–26. Yim, J.-S., Shin, M.-Y., Son, Y., and Kleinn, C. (2015) Cluster plot optimization for a large area forest resource inventory in Korea, Forest Science and Technology, 11, 139–146. Yin, D., and He, F. (2014) A simple method for estimating species abundance from occurrence maps, Methods in Ecology and Evolution, 5, 336–343. Yu, J., and Dobson, F. S. (2001) Seven forms of rarity in mammals, Journal of Biogeography, 27, 131–139. Yuan, Y., Buckland, S. T., Harrison, P. J., Foss, S., and Johnston, A. (2016) Using species proportions to quantify turnover in diversity, Journal of Agricultural, Biological and Environmental Statistics, 21, 363–381. Zarco-Perello, S., and Enríquez, S. (2019) Remote underwater video reveals higher fish diversity and abundance in seagrass meadows, and habitat differences in trophic interactions, Scientific Reports, 9, 6596. Zvuloni, A., Artzy-Randrup, Y., Stone, L., van Woesik, R., and Loya, Y. (2008) Ecological size-frequency distributions: how to prevent biases in spatial sampling, Limnology and Oceanography: Methods, 6, 144–152.

Index of Examples

Alaskan shrubs 41, 50
Alpine marmots 103, 104, 106, 107
Australian tree frogs 132
Bahamian coral 44, 184, 185, 187
Bawean warty pigs 99, 100
Californian grassland 56
Californian trees 3, 5, 10, 12–14, 17, 25, 34, 37, 163, 165–7, 182, 186, 188, 199
Costa Rican ants 170, 172, 197
Dall’s sheep in Alaska 154
Dartmoor plant cover 45, 47
Grizzly bears in Banff National Park 131, 138
Indian trees 160, 161, 163, 176, 179, 180, 182, 183, 188, 190, 191, 193, 194, 198
Japanese white-eyes in Hawai‘i 144, 145, 149
Mangroves in Kenya 65, 69, 76, 82
New Zealand skinks 117, 119, 121, 122, 125–8, 140
Newfoundland snow crabs 114
Nightingale reed warblers in Alamagan 150
Okaloosa darters in Florida 108, 111
Ovenbirds in the Great Smoky Mountains 112
Prince Edward Island lobsters 110
Wood Thrushes in New Hampshire 93
Yellow-fronted canaries in Hawai‘i 152
Yellowstone grizzly bears 129

General Index

3P sampling 83
Abundance-induced heterogeneity model 92
ACFOR scale 46
Aerial cover 41
AIC 18
AICc 18
AIH model 92
Akaike information criterion 18
Angle-count method 83
Angle-order estimators 69
BAF 84
Basal area factor 84
Basal cover 41
Bayesian information criterion 18
Belt transect 22
Berger-Parker dominance 175
Bernoulli distribution 7
β-diversity 183
Beta overlap 186
Between-quadrat variance 49
BIC 18
Big BAF 86
Binomial distribution 9
Bitterlich sampling 83
Bootstrap interval 13
Box plot 4
Box quadrat 90
Box-whisker plot 4, 63
Braun-Blanquet cover scale 47
Camera trap 98
Canopy closure 41
Canopy cover 41
Capture-recapture method 115
Catch per unit effort 109
Catchability coefficient 110
Chain intercept transect 54
Chainsaw method 32
Change-in-ratio 113
Chao’s lower bound 124, 128
Chi-squared distribution 9
Circular quadrat 23
Citizen science 90
CJS model 130
Closed population 115
Coarse woody debris 32, 86
Coefficient of variation 6
Collector’s curve 160
Complementarity 196
Compound Poisson distribution 11
Confidence interval 12
Continuous distribution 6
Contour plot 15
Cormack-Jolly-Seber model 130
Covariate 140, 150
Cover 41
Cover scale 46
Coverage 41, 52
CPUE 109
Cramér-von Mises test 19
Crown cover 41
Cruising 29
  plotless 85
CWD 32, 86
Daubenmire cover scale 46
DBH 204
DDS 102
Degrees of freedom 7
Dependent double-observer sampling 102
Detection function 137, 142
Deviance 16
Distance methods 59, 142
Distribution
  Bernoulli 7
  binomial 9
  chi-squared 7
  compound Poisson 11
  continuous 6
  log-series 11, 179
  multinomial 9
  negative binomial 10
  normal 6
  Poisson 7
  Poisson-lognormal 11
  size-frequency 45
  species-abundance 188
Distribution function 6
Diversity 159, 175
  beta 183
  functional 182
  genetic 182
  taxonomic 181
Diversity profile 182
Domin cover scale 47
Dominance 178
  partial 176
Double sampling 83, 107
Driving transect 90
Edge effects 61
Effective number 178, 182
Entropy 177
Equitability 195
Estimation 12
Evenness 195
Explanatory variable 16
Factorial 8
Fetch 52
Fisher’s α 179
Fixed-plot sampling 29
Foliar cover 41
Frequency (presence/absence) 33
  root 34
  shoot 34
Frequency score 39
Function
  detection 137
  hazard-rate 143
  key 143
Functional data analysis 194
Functional diversity 182
GAM 16
Gambin model 192
Generalized additive model 16
Generalized linear model 16
Generalized random-tessellation stratified design 27
Genetic diversity 182
Gini-Simpson index 178
GLM 16
Goodness-of-fit test 16
Gross canopy cover 41
Ground cover 41
GRTS 27
Hazard-rate function 143
Hermite polynomial 143
Hill number 178
Histogram 13
Hult-Sernander-Du Rietz cover scale 46
IDS 105
Importance score 39
Independent double-observer sampling 105
Index
  Gini-Simpson 178
  Jaccard’s 187, 196
  Margalef richness 161
  Menhinick richness 161
  Simpson 178
Inference 12
Inter-quartile range 4
ISAR 167
Island species-area relationship 167
Jaccard’s index 187, 196
Jolly-Seber model 130
k-dominance curve 175
k-tree sampling 63
Key function 143
Leaf cover 41
Levy bridge 51
Likelihood 14
Likelihood ratio goodness-of-fit test 16
Lincoln index 116
Line-intercept sampling 51
Line-intersect sampling 51
Line transect 28, 187
Linear regression model 15
LIS 51
Log-linear model 16
Log-series distribution 10, 179
Logistic model 16, 120
Logit 16, 120
Lower quartile 4
Manhattan distance 198
Margalef richness index 161
Mark-recapture 115
Maximum likelihood 14
maxN 97
Mean, sample 3
Median 4
Menhinick richness index 161
Method of moments 10
Mirage method 30
Mobile transect 90
Mode 6
Model
  abundance-induced heterogeneity 92
  AIH 92
  CJS 130
  Cormack-Jolly-Seber 130
  gambin 192
  generalized additive 16
  generalized linear 16
  Jolly-Seber 130
  linear regression 15
  log-linear 16
  logistic 16, 120
  multiple regression 16
  random encounter 98
  RN 92
  spatial capture-recapture 137
  species-area 164
Modified-Whittaker plot 37
Multinomial distribution 9
Multiple regression model 16
Natural logarithm 20
Nearest-neighbour distances 70
Negative binomial distribution 10
Nested quadrat 37
Net canopy cover 41
New Zealand cover scale 46
Normal distribution 9
North Carolina cover-abundance scale 47
Occupancy probability 173
Octave 189
Open population 115
Outlier 5
Parameter 6
Partial dominance 176
PCQM 67
PDS 86
Pearson’s goodness-of-fit test 16
Perpendicular distance sampling 86
Petersen mark-recapture estimator 116
Phytosociology 22
Plot
  box-whisker 4
  contour 15
  Modified-Whittaker 37
  quantile-quantile 18
Plotless cruising 85
Point-centered quarter method 67
Point-combination transect sampling 92
Point counts 95
Point-intercept sampling 51, 58
Point quadrat frame 51
Point sampling 83
Point-to-plant measures 61
Point transect 95, 144
Poisson distribution 7
Poisson forest 25
Poisson-lognormal distribution 11
Poisson process 7
Population
  closed 115
  open 115
Probability density function 6
Probability of interspecific encounter 178
Probability proportional to prediction 83
Q-q plot 18
Quadrat 22
  box 90
  circular 23
  nested 37
Quantile-quantile plot 18
Quartile
  lower 4
  upper 4
RAI 98
Random encounter model 98
Rank-abundance plot 188
Rarefaction 162
Rarity 198
Relascope 83
Relative abundance index 98
Relevé 22
Remote underwater video 91
Removal sampling 107
Reverse hierarchical ordering 28
Richness index
  Margalef 161
  Menhinick 161
RN model 92
Robust design 135
Root frequency 34
Rugosity 54
SAC 160
SACFORL cover scale 48
SAD 187
Sample mean 3
Sample variance 5
Sampling
  3P 83
  Bitterlich 83
  chain-intercept 52
  dependent double-observer 102
  double 83, 107
  fixed-plot 29
  independent double-observer 105
  k-tree 63
  line-intercept 51
  line-intersect 51
  perpendicular distance 86
  point 83
  point-combination 92
  point-intercept 51
  removal 107
  stratified 27
  systematic 27
  unreconciled double-observer 106
  variable-plot 85
Sausage method 32
Score
  frequency 39
  importance 39
Series expansions 143
SFD 45
Shannon entropy 177
Shoot frequency 34
Similarity 196
Simpson index 178
Size-frequency distribution 45
Skip 52
Slopover bias 30
SPAR 164
Spatial capture-recapture model 137
Spatial pattern 59
Species-abundance distribution 187
Species accumulation curve 160
SPecies-ARea (SPAR) model 164
Species cover 41
Species richness 160
Stand loss 52
Stand-up method 32
Standard deviation 6
Standard error 12
Step-point method 55
Stratified sampling 27
Strip transect 28, 90
Systematic sampling 27
T-square estimators 72
Taxonomic diversity 181
Test, Cramér-von Mises 19
Transect
  belt 22
  chain-intercept 54
  driving 90
  line 91, 153
  mobile 90
  point 95, 144
  strip 22, 90
  variable area 79
Turnover 198
UDS 106
Underwater visual census 91
Unreconciled double-observer sampling 106
Upper quartile 4
Variable, explanatory 16
Variable area transect 79
Variable plot sampling 85
Variance
  between-quadrat 49
  sample 7
  within-quadrat 49
VAT 79
VBAR 85
Walkthrough method 31
Wandering forward 74
Wandering quarters 74
Wheel-point method 55
Within-quadrat variance 49