# a/s/m Study Manual for Exam C/Exam 4: Construction and Evaluation of Actuarial Models [17 ed.]

English · 1,664 pages [1684] · 2014 · 9 MB

I Severity, Frequency, and Aggregate Loss
1 Basic Probability
1.1 Functions and moments
1.2 Percentiles
1.3 Conditional probability and expectation
1.4 Moment and probability generating functions
1.5 The empirical distribution
Exercises
Solutions
2 Parametric Distributions
2.1 Scaling
2.2 Transformations
2.3 Common parametric distributions
2.3.1 Uniform
2.3.2 Beta
2.3.3 Exponential
2.3.4 Weibull
2.3.5 Gamma
2.3.6 Pareto
2.3.7 Single-parameter Pareto
2.3.8 Lognormal
2.4 The linear exponential family
2.5 Limiting distributions
Exercises
Solutions
3 Variance
3.1 Additivity
3.2 Normal approximation
3.3 Bernoulli shortcut
Exercises
Solutions
4 Mixtures and Splices
4.1 Mixtures
4.1.1 Discrete mixtures
4.1.2 Continuous mixtures
4.1.3 Frailty models
4.2 Conditional Variance
4.3 Splices
Exercises
Solutions
5 Policy Limits
Exercises
Solutions
6 Deductibles
6.1 Ordinary and franchise deductibles
6.2 Payment per loss with deductible
6.3 Payment per payment with deductible
Exercises
Solutions
7 Loss Elimination Ratio
Exercises
Solutions
8 Risk Measures and Tail Weight
8.1 Coherent risk measures
8.2 Value-at-Risk (VaR)
8.3 Tail-Value-at-Risk (TVaR)
8.4 Tail Weight
8.5 Extreme value distributions
Exercises
Solutions
9 Other Topics in Severity Coverage Modifications
Exercises
Solutions
10 Bonuses
Exercises
Solutions
11 Discrete Distributions
11.1 The (a,b,0) class
11.2 The (a,b,1) class
Exercises
Solutions
12 Poisson/Gamma
Exercises
Solutions
13 Frequency—Exposure & Coverage Modifications
13.1 Exposure modifications
13.2 Coverage modifications
Exercises
Solutions
14 Aggregate Loss Models: Compound Variance
14.1 Introduction
14.2 Compound variance
Exercises
Solutions
15 Aggregate Loss Models: Approximating Distribution
Exercises
Solutions
16 Aggregate Losses: Severity Modifications
Exercises
Solutions
17 Aggregate Loss Models: The Recursive Formula
Exercises
Solutions
18 Aggregate Losses—Aggregate Deductible
Exercises
Solutions
19 Aggregate Losses: Miscellaneous Topics
19.1 Exact Calculation of Aggregate Loss Distribution
19.1.1 Normal distribution
19.1.2 Exponential and gamma distributions
19.1.3 Compound Poisson models
19.2 Discretizing
19.2.1 Method of rounding
19.2.2 Method of local moment matching
Exercises
Solutions
20 Supplementary Questions: Severity, Frequency, and Aggregate Loss
Solutions
II Empirical Models
21 Review of Mathematical Statistics
21.1 Estimator quality
21.1.1 Bias
21.1.2 Consistency
21.1.3 Variance and mean square error
21.2 Hypothesis testing
21.3 Confidence intervals
Exercises
Solutions
22 The Empirical Distribution for Complete Data
22.1 Individual data
22.2 Grouped data
Exercises
Solutions
23 Variance of Empirical Estimators with Complete Data
23.1 Individual data
23.2 Grouped data
Exercises
Solutions
24 Kaplan-Meier and Nelson-Åalen Estimators
24.1 Kaplan-Meier Product Limit Estimator
24.2 Nelson-Åalen Estimator
Exercises
Solutions
25 Estimation of Related Quantities
25.1 Moments
25.1.1 Complete individual data
25.1.2 Grouped data
25.1.3 Incomplete data
25.2 Range probabilities
25.3 Deductibles and limits
25.4 Inflation
Exercises
Solutions
26 Variance of Kaplan-Meier and Nelson-Åalen Estimators
Exercises
Solutions
27 Kernel Smoothing
27.1 Density and distribution
27.1.1 Uniform kernel
27.1.2 Triangular kernel
27.1.3 Other symmetric kernels
27.1.4 Kernels using two-parameter distributions
27.2 Moments of kernel-smoothed distributions
Exercises
Solutions
28 Mortality Table Construction
28.1 Individual data based methods
28.1.1 Variance of estimators
28.2 Interval-based methods
Exercises
Solutions
29 Supplementary Questions: Empirical Models
Solutions
III Parametric Models
30 Method of Moments
30.1 Introductory remarks
30.2 The method of moments for various distributions
30.2.1 Exponential
30.2.2 Gamma
30.2.3 Pareto
30.2.4 Lognormal
30.2.5 Uniform
30.2.6 Other distributions
30.3 Fitting other moments, and incomplete data
Exercises
Solutions
31 Percentile Matching
31.1 Smoothed empirical percentile
31.2 Percentile matching for various distributions
31.2.1 Exponential
31.2.2 Weibull
31.2.3 Lognormal
31.2.4 Other distributions
31.3 Percentile matching with incomplete data
31.4 Matching a percentile and a moment
Exercises
Solutions
32 Maximum Likelihood Estimators
32.1 Defining the likelihood
32.1.1 Individual data
32.1.2 Grouped data
32.1.3 Censoring
32.1.4 Truncation
32.1.5 Combination of censoring and truncation
Exercises
Solutions
33 Maximum Likelihood Estimators—Special Techniques
33.1 Cases for which the Maximum Likelihood Estimator equals the Method of Moments Estimator
33.1.1 Exponential distribution
33.2 Parametrization and Shifting
33.2.1 Parametrization
33.2.2 Shifting
33.3 Transformations
33.3.1 Lognormal distribution
33.3.2 Inverse exponential distribution
33.3.3 Weibull distribution
33.4 Special distributions
33.4.1 Uniform distribution
33.4.2 Pareto distribution
33.4.3 Beta distribution
33.5 Bernoulli technique
33.6 Estimating qx
Exercises
Solutions
34 Variance Of Maximum Likelihood Estimators
34.1 Information matrix
34.1.1 Calculating variance using the information matrix
34.1.2 Asymptotic variance of MLE for common distributions
34.1.3 True information and observed information
34.2 The delta method
34.3 Confidence Intervals
34.3.1 Normal Confidence Intervals
34.3.2 Non-Normal Confidence Intervals
34.4 Variance of Exact Exposure Estimate of q̂j
Exercises
Solutions
35 Fitting Discrete Distributions
35.1 Poisson distribution
35.2 Negative binomial
35.3 Binomial
35.4 Fitting (a,b,1) class distributions
35.5 Adjusting for exposure
35.6 Choosing between distributions in the (a,b,0) class
Exercises
Solutions
36 Hypothesis Tests: Graphic Comparison
36.1 D(x) plots
36.2 p-p plots
Exercises
Solutions
37 Hypothesis Tests: Kolmogorov-Smirnov
37.1 Individual data
37.2 Grouped data
Exercises
Solutions
38 Hypothesis Tests: Anderson-Darling
Exercises
Solutions
39 Hypothesis Tests: Chi-square
39.1 Introduction
39.2 Definition of chi-square statistic
39.3 Degrees of freedom
39.4 Other requirements for the chi-square test
39.5 Data from several periods
Exercises
Solutions
40 Likelihood Ratio Test and Algorithm, Schwarz Bayesian Criterion
40.1 Likelihood Ratio Test and Algorithm
40.2 Schwarz Bayesian Criterion
Exercises
Solutions
41 Supplementary Questions: Parametric Models
Solutions
IV Credibility
42 Limited Fluctuation Credibility: Poisson Frequency
Exercises
Solutions
43 Limited Fluctuation Credibility: Non-Poisson Frequency
Exercises
Solutions
44 Limited Fluctuation Credibility: Partial Credibility
Exercises
Solutions
45 Bayesian Methods—Discrete Prior
Exercises
Solutions
46 Bayesian Methods—Continuous Prior
46.1 Calculating posterior and predictive distributions
46.2 Recognizing the posterior distribution
46.3 Loss functions
46.4 Interval estimation
46.5 The linear exponential family and conjugate priors
Exercises
Solutions
47 Bayesian Credibility: Poisson/Gamma
Exercises
Solutions
48 Bayesian Credibility: Normal/Normal
Exercises
Solutions
49 Bayesian Credibility: Bernoulli/Beta
49.1 Bernoulli/beta
49.2 Negative binomial/beta
Exercises
Solutions
50 Bayesian Credibility: Exponential/Inverse Gamma
Exercises
Solutions
51 Bühlmann Credibility: Basics
Exercises
Solutions
52 Bühlmann Credibility: Discrete Prior
Exercises
Solutions
53 Bühlmann Credibility: Continuous Prior
Exercises
Solutions
54 Bühlmann-Straub Credibility
54.1 Bühlmann-Straub model: Varying exposure
54.2 Hewitt model: Generalized variance of observations
Exercises
Solutions
55 Exact Credibility
Exercises
Solutions
56 Bühlmann As Least Squares Estimate of Bayes
56.1 Regression
56.2 Graphic questions
56.3 Cov(Xi,Xj)
Exercises
Solutions
57 Empirical Bayes Non-Parametric Methods
57.1 Uniform exposures
57.2 Non-uniform exposures
Exercises
Solutions
58 Empirical Bayes Semi-Parametric Methods
58.1 Poisson model
58.2 Non-Poisson models
58.3 Which Bühlmann method should be used?
Exercises
Solutions
59 Supplementary Questions: Credibility
Solutions
V Simulation
60 Simulation—Inversion Method
Exercises
Solutions
61 Simulation—Special Techniques
61.1 Mixtures
61.2 Multiple decrements
61.3 Simulating (a,b,0) distributions
61.4 Normal random variables: the polar method
Exercises
Solutions
62 Number of Data Values to Generate
Exercises
Solutions
63 Simulation—Applications
63.1 Actuarial applications
63.2 Statistical analysis
63.3 Risk measures
Exercises
Solutions
64 Bootstrap Approximation
Exercises
Solutions
65 Supplementary Questions: Simulation
Solutions
VI Practice Exams
1 Practice Exam 1
2 Practice Exam 2
3 Practice Exam 3
4 Practice Exam 4
5 Practice Exam 5
6 Practice Exam 6
7 Practice Exam 7
8 Practice Exam 8
9 Practice Exam 9
10 Practice Exam 10
11 Practice Exam 11
12 Practice Exam 12
13 Practice Exam 13
Appendices
A Solutions to the Practice Exams
Solutions for Practice Exam 1
Solutions for Practice Exam 2
Solutions for Practice Exam 3
Solutions for Practice Exam 4
Solutions for Practice Exam 5
Solutions for Practice Exam 6
Solutions for Practice Exam 7
Solutions for Practice Exam 8
Solutions for Practice Exam 9
Solutions for Practice Exam 10
Solutions for Practice Exam 11
Solutions for Practice Exam 12
Solutions for Practice Exam 13
B Solutions to Old Exams
B.1 Solutions to CAS Exam 3, Spring 2005
B.2 Solutions to SOA Exam M, Spring 2005
B.3 Solutions to CAS Exam 3, Fall 2005
B.4 Solutions to SOA Exam M, Fall 2005
B.5 Solutions to Exam C/4, Fall 2005
B.6 Solutions to CAS Exam 3, Spring 2006
B.7 Solutions to CAS Exam 3, Fall 2006
B.8 Solutions to SOA Exam M, Fall 2006
B.9 Solutions to Exam C/4, Fall 2006
B.10 Solutions to Exam C/4, Spring 2007
C Cross Reference from Loss Models
D Exam Question Index

Study Manual for Exam C/Exam 4: Construction and Evaluation of Actuarial Models, Seventeenth Edition

by Abraham Weishaus, Ph.D., F.S.A., CFA, M.A.A.A.

Note: NO RETURN IF OPENED

TO OUR READERS: Please check A.S.M.’s web site at www.studymanuals.com for errata and updates. If you have any comments or reports of errata, please e-mail us at [email protected].

©Copyright 2014 by Actuarial Study Materials (A.S.M.), PO Box 69, Greenland, NH 03840. All rights reserved. Reproduction in whole or in part without express written permission from the publisher is strictly prohibited.


. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

725 725 730 732 740

38 Hypothesis Tests: Anderson-Darling Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

749 750 752

39 Hypothesis Tests: Chi-square 39.1 Introduction . . . . . . . . . . . . . . . . . 39.2 Definition of chi-square statistic . . . . . . 39.3 Degrees of freedom . . . . . . . . . . . . . 39.4 Other requirements for the chi-square test 39.5 Data from several periods . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . .

757 757 760 763 765 767 769 785

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

CONTENTS

ix

40 Likelihood Ratio Test and Algorithm, Schwarz Bayesian Criterion 40.1 Likelihood Ratio Test and Algorithm . . . . . . . . . . . . . . . . 40.2 Schwarz Bayesian Criterion . . . . . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . .

795 795 800 801 805

41 Supplementary Questions: Parametric Models Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

811 816

IV

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

Credibility

823

42 Limited Fluctuation Credibility: Poisson Frequency Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

827 833 842

43 Limited Fluctuation Credibility: Non-Poisson Frequency Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

849 852 856

44 Limited Fluctuation Credibility: Partial Credibility Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

861 862 868

45 Bayesian Methods—Discrete Prior Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

873 877 890

46 Bayesian Methods—Continuous Prior 46.1 Calculating posterior and predictive distributions 46.2 Recognizing the posterior distribution . . . . . . . 46.3 Loss functions . . . . . . . . . . . . . . . . . . . . . 46.4 Interval estimation . . . . . . . . . . . . . . . . . . 46.5 The linear exponential family and conjugate priors Exercises . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . .

909 909 914 915 916 917 918 925

47 Bayesian Credibility: Poisson/Gamma Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

939 940 948

48 Bayesian Credibility: Normal/Normal Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

953 957 958

49 Bayesian Credibility: Bernoulli/Beta 49.1 Bernoulli/beta . . . . . . . . . . . 49.2 Negative binomial/beta . . . . . Exercises . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . .

961 961 964 965 968

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . . . . .

. . . .

. . . .

CONTENTS

x

50 Bayesian Credibility: Exponential/Inverse Gamma Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

971 975 978

51 Bühlmann Credibility: Basics Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

981 986 992

52 Bühlmann Credibility: Discrete Prior 1001 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1006 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1025 53 Bühlmann Credibility: Continuous Prior 1045 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1049 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1061 54 Bühlmann-Straub Credibility 54.1 Bühlmann-Straub model: Varying exposure . . . . . 54.2 Hewitt model: Generalized variance of observations Exercises . . . . . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

1073 1073 1074 1078 1083

55 Exact Credibility 1091 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1093 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1097 56 Bühlmann As Least Squares Estimate of Bayes 56.1 Regression . . . . . . . . . . . . . . . . . . . 56.2 Graphic questions . . . . . . . . . . . . . . . 56.3 Cov ( X i , X j ) . . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

1101 1101 1103 1105 1106 1111

57 Empirical Bayes Non-Parametric Methods 57.1 Uniform exposures . . . . . . . . . . . 57.2 Non-uniform exposures . . . . . . . . 57.2.1 No manual premium . . . . . . 57.2.2 Manual premium . . . . . . . . Exercises . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

1113 1114 1116 1116 1123 1124 1131

58 Empirical Bayes Semi-Parametric Methods 58.1 Poisson model . . . . . . . . . . . . . . . . . 58.2 Non-Poisson models . . . . . . . . . . . . . 58.3 Which Bühlmann method should be used? Exercises . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

1143 1143 1147 1150 1152 1159

. . . . . .

. . . . . .

59 Supplementary Questions: Credibility 1165 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1170 C/4 Study Manual—17th edition Copyright ©2014 ASM

CONTENTS

V

xi

Simulation

1177

60 Simulation—Inversion Method 1179 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1184 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1193 61 Simulation—Special Techniques 61.1 Mixtures . . . . . . . . . . . . . . . . . . . . . 61.2 Multiple decrements . . . . . . . . . . . . . . 61.3 Simulating ( a, b, 0) distributions . . . . . . . 61.4 Normal random variables: the polar method Exercises . . . . . . . . . . . . . . . . . . . . . Solutions . . . . . . . . . . . . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

1203 1203 1204 1207 1209 1212 1218

62 Number of Data Values to Generate 1225 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1230 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1233 63 Simulation—Applications 63.1 Actuarial applications 63.2 Statistical analysis . . . 63.3 Risk measures . . . . . Exercises . . . . . . . . Solutions . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

1237 1237 1239 1239 1241 1251

64 Bootstrap Approximation 1261 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1266 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1269 65 Supplementary Questions: Simulation 1275 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1278

VI

Practice Exams

1283

1

Practice Exam 1

1285

2

Practice Exam 2

1297

3

Practice Exam 3

1307

4

Practice Exam 4

1317

5

Practice Exam 5

1327

6

Practice Exam 6

1337

7

Practice Exam 7

1349

8

Practice Exam 8

1359

9

Practice Exam 9

1369

CONTENTS

xii

10 Practice Exam 10

1381

11 Practice Exam 11

1391

12 Practice Exam 12

1403

13 Practice Exam 13

1415

Appendices A Solutions to the Practice Exams Solutions for Practice Exam 1 . . Solutions for Practice Exam 2 . . Solutions for Practice Exam 3 . . Solutions for Practice Exam 4 . . Solutions for Practice Exam 5 . . Solutions for Practice Exam 6 . . Solutions for Practice Exam 7 . . Solutions for Practice Exam 8 . . Solutions for Practice Exam 9 . . Solutions for Practice Exam 10 . . Solutions for Practice Exam 11 . . Solutions for Practice Exam 12 . . Solutions for Practice Exam 13 . .

1425 . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

1427 1427 1439 1452 1465 1478 1489 1501 1514 1527 1540 1552 1565 1579

B Solutions to Old Exams B.1 Solutions to CAS Exam 3, Spring 2005 . B.2 Solutions to SOA Exam M, Spring 2005 . B.3 Solutions to CAS Exam 3, Fall 2005 . . . B.4 Solutions to SOA Exam M, Fall 2005 . . B.5 Solutions to Exam C/4, Fall 2005 . . . . B.6 Solutions to CAS Exam 3, Spring 2006 . B.7 Solutions to CAS Exam 3, Fall 2006 . . . B.8 Solutions to SOA Exam M, Fall 2006 . . B.9 Solutions to Exam C/4, Fall 2006 . . . . B.10 Solutions to Exam C/4, Spring 2007 . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

1597 1597 1601 1604 1608 1612 1622 1626 1629 1632 1641

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

C Cross Reference from Loss Models

1651

D Exam Question Index

1653

Preface

Exam C/4 is the catch-all exam containing all mathematical material that doesn't fit easily into one of the other exams. You will study models for property/casualty insurance: models for size of loss, number of losses, and aggregate losses. Along the way, you'll learn about risk measures: possible measures for how much surplus a company should hold, based on the risk characteristics of its business.

Then you will switch gears and study statistics. You will learn how to estimate mortality rates, and distribution functions for loss sizes and counts, and how to evaluate the quality of the estimates. After that, you will study credibility: adjusting estimates based on experience. Finally, you will learn the basics of stochastic simulation, a valuable tool for actuarial modeling.

Prerequisites for most of the material are few beyond knowing probability (and calculus, of course). Occasionally we will refer to mortality rates as q_x (something you learn about in Exam MLC/LC), and we'll even mention double decrement models in Lesson 28, but overall, Exam MLC/LC plays very little role. Regression is helpful at one point for one topic within Bühlmann credibility, but questions on that particular topic are rare. The CAS website provides some guidance on the relationship between exams, and considers Exam P/1 the only prerequisite to this exam.

This manual

The exercises in this manual

I've provided lots of my own exercises, as well as relevant exercises from pre-2000 exams, which are not that easy to get. Though the style of exam questions has changed a little, these are still very useful practice exercises which cover the same material—don't dismiss them as obsolete! CAS 4B had 1-point, 2-point, and 3-point questions. Current exam questions are approximately as difficult as the 2-point questions.

All questions in this manual from exams given in 2000 and later, with solutions, are also available on the web from the SOA. When the 2000 syllabus was established in 1999, sample exams 3 and 4 were created, consisting partially of questions from older exams and partially of new questions, not all multiple choice. These sample exams were not real exams, and some questions were inappropriate or defective. These exams are no longer posted on the web. I have included appropriate questions, labeled "1999 C3 Sample" or "1999 C4 Sample". These refer to these 1999 sample exams, not to the 306 sample questions currently posted on the web, which are discussed later in this introduction.

Questions from old exams are marked xxx:yy, where xxx is the time the exam was given, with S for spring and F for fall followed by a 2-digit year, and yy is the question number. Sometimes xxx is preceded with SOA or CAS to indicate the sponsoring organization. From about 1986 to 2000, SOA exams had 3-digit numbers (like 160) and CAS exams were a number and a letter (like 4B). From 2000 to Spring 2003, Exam 3 was jointly sponsored, so I do not indicate "SOA" or "CAS" for Exam 3 questions from that period. There was a period in the 1990s when the SOA, while allowing use of its old exam questions, did not want people to reveal which exam they came from. As a result, I sometimes cannot identify the source exam for questions from this period. In such a case, I mark the question aaa-bb-cc:yy, where aaa-bb-cc is the study note number and yy is the question number. Generally aaa is the exam number (like 160), and cc is the 2-digit year the study note was published.

No exercises in this manual are taken from the Fall 2005, Fall 2006, or Spring 2007 exams, which you may use as a dress rehearsal. However, Appendix B has the solutions to all of these exams. While in most cases these are the same as the official solutions, in a couple of cases I use the shortcuts which you learn in this manual. That appendix also provides solutions to relevant questions from pre-2007 CAS Exam 3's.


Other Useful Features of This Manual

The SOA site has a set of 306 sample questions and solutions.¹ Almost all of these questions are from released exams that are readily available; nevertheless, many students prefer to use this list since non-syllabus material has been removed. Appendix D has a complete cross reference between these questions and the exams they come from, as well as the page in this manual having either the question or the solution.

This manual has an index. Whenever you remember some topic in this manual but can't remember where you saw it, check the index. If it isn't in the index but you're sure it's in the manual and an index listing would be appropriate, contact the author.

Tables

Download the tables you will be given on the exam. They will often be needed for the examples and the exercises; I take the information in these tables for granted. If you see something in the text like "That distribution is a Pareto and therefore the variance is . . . ", you should know that I am getting this from the tables; you are not expected to know this by heart. So please download the tables.

Go to www.soa.org. Click on "EDUCATION", then "EXAMS AND REQUIREMENTS", then "ASA", and then the Exam C box. Download the syllabus, which is the first bullet under "Syllabus and Study Materials". At the bottom of the syllabus under "Other Resources", click on "Tables for Exam C". The direct address of the tables at this writing (October 2014) is http://www.soa.org/files/pdf/edu-2009-fall-exam-c-table.pdf

The tables include distribution tables and the following statistical tables: the normal distribution function and chi-square critical values. The distribution tables are an abbreviated version of the Loss Models appendix. Whenever I refer to the tables from the Loss Models appendix in this manual, the abbreviated version will be sufficient.

At this writing, the tables (on the second page) specify rules for using the normal distribution table that is supplied: do not interpolate in the table; simply use the nearest value. If you are looking for Φ(0.0244), use Φ(0.02). If you are given the cumulative probability Φ(x) = 0.8860 and need x, use 1.21, the nearest x available. The examples, exercises, and quizzes in this manual use this rounding method. On real exams, they will try to avoid ambiguous situations, so borderline situations won't occur, but my interpretation of the rules (used for problems in this manual) is that if the third place is 5, round up the absolute value. So I round 0.125 to 0.13 and −0.125 to −0.13.
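This nearest-value rule is mechanical enough to sketch in code. The function below is my own illustration of the rounding convention described above (the name `nearest_table_arg` is not from the tables or the manual); it assumes a table tabulated in steps of 0.01.

```python
import math

def nearest_table_arg(x):
    """Nearest two-decimal table argument, rounding halves away from zero.

    Mirrors the convention described above: do not interpolate, use the
    nearest tabulated value, and when the third decimal is exactly 5,
    round the absolute value up (0.125 -> 0.13, -0.125 -> -0.13).
    """
    sign = -1 if x < 0 else 1
    # Work in hundredths of the absolute value so that negative
    # halves also round away from zero.
    hundredths = math.floor(abs(x) * 100 + 0.5)
    return sign * hundredths / 100

# The examples from the text:
print(nearest_table_arg(0.0244))   # 0.02
print(nearest_table_arg(0.125))    # 0.13
print(nearest_table_arg(-0.125))   # -0.13
```

Looking up Φ at `nearest_table_arg(x)` then reproduces the table answers used in this manual's examples.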

New for this Edition

The 17th edition features Practice Exam 7, a new practice exam. I tried to make this exam less computation-intensive and more conceptual. Students tell me that current exams are more conceptual, giving you unfamiliar contexts in which to apply the principles of this course.

Flashcards

Many students find flashcards a useful tool for learning key formulas and concepts. ASM flashcards, available from the same distributors that sell this manual, contain the formulas and concepts from this manual in a convenient deck of cards. The cards have cross-references, usually by page, to the manual.

¹Actually fewer than 306. Some questions were mistakenly put on the original list and deleted, and some questions pertaining to topics dropped from the syllabus in Fall 2009 were deleted.


Notes About the Exam

Released Exams

You may wonder what the relative weights are for each of the subjects on the exam. Table 1 lists the number of questions on each of the Exam C/4 topics that are still on the current syllabus.

Table 1: The number of questions in released Exams C/4 on each exam subject*

Syllabus Topic                         Lessons  M00 N00 M01 N01 N02 N03 N04 M05 N05 N06 M07  Weight
Severity, Frequency, Aggregate Loss      1–19     1   2   1   2   1   1   0   0   0   0   5  15–20%
Empirical Estimation                    21–28     4   2   2   4   4   5   4   6   5   9   7  20–25%
Parametric Fitting                      30–35     6   6   6   4   7   8  11   6   8   7   6  } 20–25%
Testing Fit                             36–40     2   0   3   2   2   2   3   4   4   1   3  }
Limited Fluctuation Credibility         42–44     1   1   0   1   1   2   1   1   1   1   0  }
Bayesian Credibility                    45–50     4   5   5   4   3   5   3   3   3   5   3  } 25–30%
Bühlmann Credibility                    51–56     2   2   4   5   5   3   4   5   3   3   5  }
Empirical Bayes                         57–58     2   3   1   1   1   1   2   2   3   2   2  }
Simulation                              60–64     1   1   0   1   1   1   1   3   3   4   3   5–10%
Total                                            23  22  22  24  25  28  29  30  30  32  34

The column headings M00 through M07 abbreviate the exam sittings May 2000, Nov. 2000, May 2001, Nov. 2001, Nov. 2002, Nov. 2003, Nov. 2004, May 2005, Nov. 2005, Nov. 2006, and May 2007. Topics joined by a brace share a combined syllabus weight.

*For the purpose of this table, F03:13 was classified as a probability-Lesson 1 question and F03:30 was classified as a parametric fit question, but neither question was based on the syllabus material when the exam was given.


• Starting with the October 2013 exam, minor changes were made to the syllabus. A short passage on extreme value distributions was added, material on mortality table construction was expanded, and special simulation methods were added. So this table can only give a general idea of the question distribution; the syllabus is different from what it was when these exams were given. The final column, taken from the syllabus, may be the most reliable guide.

A more detailed listing of exam questions and which lesson they correspond to can be found in Appendix D. Please use that list to identify questions no longer on the syllabus. Note that many questions require knowing more than one topic; I classify these based on the latest topic (based on the order in this manual) you need. Thus a question requiring knowledge of Bayesian and Bühlmann credibility would be classified as a Bühlmann credibility problem.

Guessing penalty

There is no guessing penalty on this exam. So fill in every answer—you may be lucky! Leave yourself a couple of seconds to do this. If you have a calculator that can generate random numbers, and some time, you can use a formal method for generating answers; see Example 60D on page 1183. Otherwise, filling in a B for every question you don't know the answer to is just as good.

Calculators

A wide variety of calculators are permitted: the TI-30X (or TI-30Xa, or TI-30X II, battery or solar, or TI-30XS or TI-30XB MultiView), the BA-35 (battery powered or solar), and the BA II Plus (or BA II Plus Professional Edition). You may bring several calculators into the exam.

The MultiView calculator is considered the best one, due to its data tables, which allow fast statistical calculations. The data table is a very restricted spreadsheet; despite its limitations, it is useful. I've provided several examples of using the data table of the MultiView calculator to speed up calculations. Another feature of the MultiView is storage of previous calculations: they can be recalled and edited. Other features which may be of use, although I do not use them in the calculator tips provided, are the K constant and the table feature, which allows calculation of a function at selected values or at values in an arithmetic progression.

Financial calculations do not occur on this exam; interest is almost never considered. You will not miss the lack of financial functions on the MultiView.

Changes to Syllabus

There have been no changes to the syllabus since October 2013.

Study Schedule

Different students will have different speeds and different constraints, so it's hard to create a study schedule useful for everybody. However, I offer a sample 13-week study schedule, Table 2, as a guide. The last column lists rarely tested material, so you can skip those sections if you are behind in your schedule. Italicized sections in this column are, in my opinion, extremely unlikely exam topics.


Table 2: Thirteen Week Study Schedule for Exam C/4

Week  Subject                                                              Lessons  Rarely Tested
  1   Probability basics                                                   1–5      1.4, 2.2, 4.1.3
  2   Risk measures and severity                                           6–10     8.4
  3   Frequency and aggregate loss                                         11–14    11.2, 13.1
  4   Aggregate loss (continued) and statistics                            15–21    17, 19.2
  5   Empirical estimators                                                 22–25    23
  6   Variance of KM, NA estimators, kernel smoothing,                     26–28
        mortality table construction
  7   Method of moments & percentile matching                              30–31
  8   Maximum likelihood                                                   32–33
  9   Maximum likelihood (continued) and hypothesis testing                34–40    34.1.3, 35.4, 38
 10   Limited fluctuation and discrete Bayesian credibility                42–45    43
 11   Continuous Bayesian credibility                                      46–50    48
 12   Bühlmann credibility                                                 51–54    54.2
 13   Bühlmann credibility (continued) and simulation                      55–64    56, 57.2.2

Errata

Please report any errors you find. Reports may be sent to the publisher ([email protected]) or directly to me ([email protected]). When reporting errata, please indicate which manual and which edition you are referring to! This manual is the 17th edition of the Exam C/4 manual. An errata list will be posted at errata.aceyourexams.net

Acknowledgements

I wish to thank the Society of Actuaries and the Casualty Actuarial Society for permission to use their old exam questions. These questions are the backbone of this manual.

I wish to thank Donald Knuth, the creator of TeX, Leslie Lamport, the creator of LaTeX, and the many package writers and maintainers, for providing a typesetting system which allows such beautiful typesetting of mathematics and figures. I hope you agree, after looking at mathematical material apparently typed with Word (e.g., the Dean study note), that there's no comparison in appearance.

I wish to thank the many readers who have sent in errata, or who have reported them anonymously at the Actuarial Discussion Forum. A partial list of students who sent in errata for the previous editions is: Kyle Allen, Casey Anderson, Carter Angell, Jason Ard, Opoku Archampong, April Ayres, Madhuri Bajaj, George Barnidge, Austin Barrington, Brian Basiaga, Michael Baznik, Michael Beck, Aaron Beaudoin, Marc Beaudoin, Aaron Beharelle, Yann Bernard, Shiri Bernstein, Elie Bochner, Karl Boettcher, Batya Bogopulsky, Kirsten Boyd, Andrew Brady, Kelsey Bridges, Ken Burton, Eric Buzby, Anna Buzueva, Emily Byrnes, Joshua Carlsen, Todd Carpino, Michael Castellano, Christi Cavalieri, Aaron Chase, Steve Cheung, Jonathan Choi, Julie Cholet, Albert Chua, Bryn Clarke, Darren Costello, Laura Cremerius, Jessica Culhane, Marco Dattilo, Gordon Davis, William Derech, Connie Di Pierro, Feng Dong, Ryan Dood, Jacob Efron, Jason Elleman, Sean Fakete, Amarya Feinberg, Sterling Felsted, Drew Fendler, Nick Fiechter, Gail Flamenbaum, Matthew Flanagan, Erin Flickinger, John Fries, Cory Fujimoto, Brad Fuxa, Meihua Gao, Yoram Gilboa, Sean Gingery, Shikha Goel, Lindsey Gohn, Aaron Hendrickson-Gracie, Joseph Gracyalny, Karen Grote, Brian Gugat, Zhoujie Guo, Amanda Hamala, Thomas Haggerty, Josh Harwood, David Hibbard, Jay Hines, Jennifer Ho, Martin Ho, Dean Guo, Dennis Huang, Jonathon Huber, Wallace Hui, Professor Natalia Humphrey, Kenneth Hung, John Hutson, Andrew Ie, Anthony Ippolito, Matthew Iseler, Naqi Jaffery, Dennis Jerry, Merrick Johnson, Nathan Johnson, Jason Jurgill, Michael Kalina, Ethan Kang, Allen Katz, Patrick Kavanagh, Ben Kester, Anand Khare, Cory Kientoff, Geo Kini, Carol Ko, Bradley Koenen, Emily Kozlowski, Boris Krant, Reuvain Krasner, Stephanie Krob, Brian Kum, Takehiro Kumazawa, Brian Lake, Eric Lam, Gary Larson, Shivanie Latchman, Olivier Le Courtois, Charles Lee, Jacob Lee, Seung Lee, York Lee, Justin Lengermann, Theodore Leonard, Aw Yong Chor Leong, David Levy, Luyao Li, Tony Litterer, William Logan, Allison Louie, Sheryn Low, Yitzy Lowy, Grant Luloff, Grover MacTaggart, Sohini Mahapatra, Matthew Malkus, Kandice Marcacci, Grant Martin, Jason Mastrogiacomo, Patrick McCormack, Jacob McDougle, Maria Melguizo, Albert Miao, Jeremy Mills, Andy Moriarty, Daniel Moskala, Greg Moyer, Michael Nasti, Don Neville, Joseph Ng, Raymond Ng, Ryan Nolan, Stephen Nyamapfumba, Adam Okun, Saravuth Olunsiri, Kevin Owens, Gino Pagano, Christopher Palmer, Kong Foo Pang, Kamila Paszek, Tanya Pazitny, Jonathan Peters, James Pilarski, Amanda Popham, Forrest Preston, Andrew Rallis, Claudio Rebelo, Denise Reed, Jeremiah Reinkoester, Adam Rich, Christopher Roberts, Vanessa Robinson, Andrew Roggy, Maria Rutkowski, Heather Samuelson, Megan Scott, Colin Scheriff, Eric Schumann, Simon Schurr, Andy Shapiro, David Sidney, Phil Silverman, Carl Simon, Rajesh Singh, Betty Siu, Stephen Smith, Ian Spafford, Mark Spinozz, Erica Stead, Sebastian Strohmayr, Alison Stroop, David Stulman, Jonathan Szerszen, Susan Szpakowski, Jenny Tam, Todd Tauzer, Amy Thompson, Geoff Tims,
David Tong, Mayer Toplan, Dustin Turner, Linh Van, Greg Vesper, Lei Wang, Joan Wei, Philip Welford, Caleb Wetherell, Patrick Wiese, Mendy Wenger, Adam Williams, Garrett Williams, Wilson Wong, Jeff Wood, Thomas Woodard, Serina Wu, Ziyan Xie, Bo Xu, Hoe Yan, Jason Yeung, Xue Ying, Rodrigo Zafra, Aaron Zeigler, Jenny Zhang, Moshe Zucker. I thank Professor Homer White for help with Example 34D.

Part I

Severity, Frequency, and Aggregate Loss


This part is labeled “Severity, Frequency, and Aggregate Loss”. Let’s define these terms and one other term.

Severity is the (average) size of a loss. If the average auto liability loss size is 27,000, then 27,000 is the severity.

Frequency is the (average) number of claims per time period, usually per year. If a group of 100 policyholders submits an average of 5 claims per year, frequency is 0.05.

Aggregate loss is the total losses paid per time period, usually per year. If a claim of 200 and a claim of 500 are submitted in a year, aggregate losses are 700.

Pure premium is the (expected) aggregate loss per policyholder per time period, usually per year. If on the average 0.05 claims are submitted per year and each claim averages 10,000, and frequency and severity are independent, then pure premium is (0.05)(10,000) = 500.

In the above definitions, the words “average” and “expected” are in parentheses. The above terms are not that precise; sometimes they refer to a random variable and sometimes they refer to the expected value. It is OK to say pure premium is 500 (as we said above), but it is also OK to speak about the variance of pure premium. You will have to figure out the precise meaning from the context.
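The pure premium arithmetic can be checked directly. A minimal sketch, using the illustrative numbers from the paragraph above (not real data) and assuming frequency and severity are independent:

```python
# Pure premium = expected frequency x expected severity, assuming
# frequency and severity are independent. These are the illustrative
# values from the text, not real data.
frequency = 0.05      # expected claims per policyholder per year
severity = 10_000     # expected size of one claim

pure_premium = frequency * severity   # roughly 500 per policyholder per year
print(pure_premium)
```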

Lesson 1

Basic Probability

Reading: Loss Models Fourth Edition 3.1–3.3

Before you start this lesson . . .

Have you read the preface? You may be familiar with textbooks with long prefaces which assume that you’re already familiar with all the material. These prefaces contain reflections of the author on the course, stories about how the book came to be, acknowledgements to the author’s family for having patience while the author was ignoring them and working on the book, etc.

The preface to this manual is nothing like that! The preface is short and has information you need immediately to use this manual. It will answer questions such as:

• How the hell am I supposed to know all the moments of the gamma distribution? (First example in this lesson)

• The author’s solution to his exercise looks wrong! Is there an errata list?

• The author’s solution to his exercise looks wrong, and there’s nothing in the errata list! What do I do now?

• The author’s solution to his exercise looks right. Which of my friends should I thank for that?

• I remember reading someplace in the manual about a Fisher information matrix, but can’t remember where it is. What do I do now?

The preface also has some information which you don’t need immediately, but will be of interest eventually. For example, “What is the distribution of exam questions by topic?” So please, take 5 minutes of your valuable time to read the preface.

Loss Models begins with a review of probability. This lesson is a very brief summary of probability. If you aren’t familiar with probability already, and find this summary inadequate, you can read chapters 2 and 3 in Loss Models. If that isn’t enough, you’ll have to study a probability textbook. (Did you ever take Exam P/Exam 1?)

1.1

Functions and moments

The cumulative distribution function of a random variable X, usually just called the distribution function, is the probability F(x) = Pr(X ≤ x). It defines X, and is right-continuous, meaning lim_{h→0} F(x + h) = F(x) for h positive.

C/4 Study Manual—17th edition Copyright ©2014 ASM


[Figure: a schematic relating the probability functions, with arrows labeled S = 1 − F, f = dF/dx, S = exp(−H), H = −ln S, and h = dH/dx. Also: f(x) = −dS(x)/dx and h(x) = f(x)/S(x).]

Figure 1.1: Relationships between probability functions

Some random variables are discrete (there are isolated points x_i at which Pr(X = x_i) is nonzero) and some are continuous (meaning F(x) is continuous, and differentiable except at a countable number of points). Some are mixed: they are continuous except at a countable number of points.

Some probability functions are:

• S(x) is the survival function, the complement of F(x), the probability of surviving longer than x, Pr(X > x).

• For a continuous random variable, f(x) is the probability density function. f(x) = dF(x)/dx.

• For a discrete random variable, p(x) is the probability mass function. p(x) = Pr(X = x). Often, f(x) satisfies the same relations for continuous variables as p(x) does for discrete variables.

• h(x) is the hazard rate function. h(x) = f(x)/S(x) = −d ln S(x)/dx. In International Actuarial Notation, µ_x is used for this. h(x) is like a conditional density function, the conditional density given survival to time x.

• H(x) is the cumulative hazard rate function.

H(x) = ∫_{−∞}^{x} h(t) dt = −ln S(x)

The distributions we will use will almost always assume nonnegative values only; in other words, Pr(X < 0) = 0. When the probability of a negative number is 0, we can set the lower bound of the integral to 0 instead of −∞.
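The relationships among F, S, f, h, and H can be spot-checked numerically. A minimal sketch, using an exponential distribution with an arbitrary mean θ because every one of these functions then has a simple closed form (the names mirror the notation above):

```python
import math

theta = 2.0   # illustrative mean; any positive value works

def F(x): return 1 - math.exp(-x / theta)       # distribution function
def S(x): return math.exp(-x / theta)           # survival function, 1 - F
def f(x): return math.exp(-x / theta) / theta   # density, dF/dx
def h(x): return f(x) / S(x)                    # hazard rate
def H(x): return -math.log(S(x))                # cumulative hazard

x = 1.3
assert abs(F(x) + S(x) - 1) < 1e-12             # S = 1 - F
assert abs(math.exp(-H(x)) - S(x)) < 1e-12      # S = exp(-H)
assert abs(h(x) - 1 / theta) < 1e-12            # exponential has constant hazard
print("identities hold at x =", x)
```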

A schematic relating these probability functions is shown in Figure 1.1.

Why do we bother differentiating F to obtain f? Because the density is needed for calculating moments. Moments of a random variable measure its center and dispersion. The expected value of X is defined by

E[X] = ∫_{−∞}^{∞} x f(x) dx

and more generally the expected value of a function of a random variable is defined by

E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx


For discrete variables, the integrals are replaced with sums.

The n th raw moment of X is defined as µ′_n = E[X^n]. µ = µ′₁ is the mean. The n th central moment of X (n ≠ 1) is defined as µ_n = E[(X − µ)^n].¹ Usually n is a positive integer, but it need not be. Expectation is linear, so the central moments can be calculated from the raw moments by binomial expansion. In the binomial expansion, the last two terms always merge, so we have

µ₂ = µ′₂ − µ²   instead of µ′₂ − 2µ′₁µ + µ²   (1.1)
µ₃ = µ′₃ − 3µ′₂µ + 2µ³   instead of µ′₃ − 3µ′₂µ + 3µ′₁µ² − µ³
µ₄ = µ′₄ − 4µ′₃µ + 6µ′₂µ² − 3µ⁴   instead of µ′₄ − 4µ′₃µ + 6µ′₂µ² − 4µ′₁µ³ + µ⁴   (1.2)
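The merged expansions above can be verified mechanically. A quick sketch that compares them against the defining expectations on a small made-up discrete distribution:

```python
# Verify the central-moment shortcuts against direct computation on a
# small discrete distribution. Values and probabilities are made up
# for illustration only.
xs = [1.0, 2.0, 4.0, 7.0]
ps = [0.1, 0.4, 0.3, 0.2]

def raw(n):
    """n-th raw moment E[X^n]."""
    return sum(p * x**n for x, p in zip(xs, ps))

def central(n):
    """n-th central moment E[(X - mu)^n]."""
    mu = raw(1)
    return sum(p * (x - mu)**n for x, p in zip(xs, ps))

mu = raw(1)
assert abs(central(2) - (raw(2) - mu**2)) < 1e-12
assert abs(central(3) - (raw(3) - 3*raw(2)*mu + 2*mu**3)) < 1e-12
assert abs(central(4) - (raw(4) - 4*raw(3)*mu + 6*raw(2)*mu**2 - 3*mu**4)) < 1e-12
print("central-moment shortcuts verified")
```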

Special functions of moments are:

• The variance is Var(X) = µ₂, and is denoted by σ².

• The standard deviation σ is the positive square root of the variance.

• The skewness is γ₁ = µ₃/σ³.

• The kurtosis is γ₂ = µ₄/σ⁴.

• The coefficient of variation is σ/µ.

Skewness measures how weighted a distribution is. A distribution with more weight on higher numbers has positive skewness and a distribution with more weight on lower numbers has negative skewness. A normal distribution has skewness of 0. Kurtosis measures how flat a distribution is. A distribution with more values further away from the mean has higher kurtosis. A normal distribution has kurtosis of 3.

Skewness, kurtosis, and coefficient of variation are dimensionless. This means that if a random variable is multiplied by a positive constant, these three quantities are unchanged.

We will discuss important things you should know about variance in Lesson 3. For the meantime, let’s repeat formula (1.1) using different notation, since it’s so important:

Var(X) = E[X²] − E[X]²

Many times this is the best way to calculate variance.

For two random variables X and Y:

• The covariance is defined by Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)].

• The correlation coefficient is defined by ρ_XY = Cov(X, Y)/(σ_X σ_Y).

As with the variance, another formula for covariance is

Cov(X, Y) = E[XY] − E[X] E[Y]

For independent random variables, Cov(X, Y) = 0. Also, the covariance of a variable with itself is its variance: Cov(X, X) = Var(X).

When there are two random variables, one can extract single variable distributions by summing (discrete) or integrating (continuous) over the other. These single-variable distributions are known as marginal distributions.

¹This µ_n has no connection to µ_x, the force of mortality, part of International Actuarial Notation used in the study of life contingencies.


A 100p th percentile is a number π_p such that F(π_p) ≥ p and F(π_p−) ≤ p. If F is strictly increasing, it is the unique point at which F(π_p) = p. A median is a 50th percentile. We’ll say more about percentiles later in this lesson.

A mode is x such that f(x) (or Pr(X = x) for a discrete distribution) is maximized.

The moment generating function is M_X(t) = E[e^(tX)] and the probability generating function is P_X(t) = E[t^X]. One useful thing proved with moment generating functions is that a sum of n independent exponential random variables with mean θ is a gamma random variable with parameters α = n and θ. We will discuss generating functions later in this lesson.

Example 1A For the gamma distribution, as defined in the Loss Models Appendix:²

1. Calculate the coefficient of variation.

2. Calculate the skewness.

3. Calculate the limit of the kurtosis as α → ∞.

4. If X has a gamma distribution with α = 5 and θ = 0.1, calculate E[e^X].

Answer:

1. The appendix indicates that E[X^k] = (α + k − 1)(α + k − 2) · · · (α)θ^k. Hence

E[X²] = (α + 1)αθ²
E[X]² = α²θ²
σ² = αθ²

So the coefficient of variation is √(αθ²)/(αθ) = 1/√α.

2. Note that all terms in the numerator and denominator have a factor of θ³, which cancels and therefore may be ignored. So without loss of generality we will set θ = 1. The numerator of the skewness fraction is

E[X³] − 3 E[X²]µ + 2µ³ = (α + 2)(α + 1)α − 3(α + 1)α² + 2α³
= α³ + 3α² + 2α − 3α³ − 3α² + 2α³
= 2α

The denominator is α^(3/2), so the skewness is 2/√α. This goes to 0 as α goes to ∞.

3. Once again, θ may be ignored since θ⁴ appears in both numerator and denominator. Setting θ = 1, the variance is α and the denominator of the kurtosis fraction is σ⁴ = α². The numerator is

E[X⁴] − 4 E[X³]µ + 6 E[X²]µ² − 3µ⁴ = (α + 3)(α + 2)(α + 1)(α) − 4(α + 2)(α + 1)α² + 6(α + 1)α³ − 3α⁴

We only need the highest degree non-zero term, since this will dominate as α → ∞. The coefficient of α⁴ (1 − 4 + 6 − 3) is zero, as is the coefficient of α³ (6 − 12 + 6), leaving α², whose coefficient is 11 − 8 = 3. The denominator is α², so the kurtosis goes to 3 as α → ∞.

4. This is the moment generating function of the gamma distribution evaluated at 1, M(1), which you can look up in the appendix:

M(1) = (1 − θ(1))^(−α) = 0.9⁻⁵

²Have you downloaded the tables from the SOA website yet? If not, please download them now, so that you will understand the solution to this example. See page xiv for instructions on where to find them.


However, we’ll carry out the calculation directly to illustrate expected values.

E[e^X] = ∫₀^∞ e^x f(x) dx
= ∫₀^∞ e^x x⁴ e^(−10x) / (Γ(5) 0.1⁵) dx
= ∫₀^∞ (10⁵/Γ(5)) x⁴ e^(−9x) dx
= (10/9)⁵ ∫₀^∞ (9⁵/Γ(5)) x⁴ e^(−9x) dx
= (10/9)⁵

because the final integral is the integral of a gamma density with α = 5 and θ = 1/9, which must integrate to 1.
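As a numerical cross-check of Example 1A (not part of the original solution), the raw-moment formula quoted from the tables reproduces the same coefficient of variation, skewness, and MGF value:

```python
import math

# Cross-check Example 1A from the raw-moment formula in the tables:
# E[X^k] = (alpha + k - 1)...(alpha) * theta^k for the gamma distribution.
alpha, theta = 5.0, 0.1

def raw(k):
    """k-th raw moment of the gamma distribution."""
    prod = 1.0
    for i in range(k):
        prod *= alpha + i
    return prod * theta**k

mu = raw(1)
var = raw(2) - mu**2
cv = math.sqrt(var) / mu                                  # should be 1/sqrt(alpha)
skew = (raw(3) - 3*raw(2)*mu + 2*mu**3) / var**1.5        # should be 2/sqrt(alpha)

assert abs(cv - 1 / math.sqrt(alpha)) < 1e-12
assert abs(skew - 2 / math.sqrt(alpha)) < 1e-12

# MGF at t = 1: (1 - theta*t)^(-alpha) = 0.9^(-5)
mgf1 = (1 - theta * 1) ** (-alpha)
assert abs(mgf1 - 0.9**-5) < 1e-12
print(round(mgf1, 4))
```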

Example 1B For an auto liability coverage, claim size follows a two-parameter Pareto distribution with parameters θ = 10,000 and α. Median claim size is 5000.

Determine the probability of a claim being greater than 25,000.

Answer: By definition of median, F(5000) = 0.5. But F(x) = 1 − (θ/(θ + x))^α, so we have

1 − (10,000/(10,000 + 5000))^α = 0.5
α ln(2/3) = ln 0.5
α = 1.7096

The probability of a claim being greater than 25,000 is 1 − F(25,000).

1 − F(25,000) = (10,000/(10,000 + 25,000))^1.7096 = 0.1175
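Example 1B can also be reproduced numerically. A short sketch that solves for α from the median condition and then evaluates the tail probability (using the Pareto survival function S(x) = (θ/(θ + x))^α from the tables):

```python
import math

# Two-parameter Pareto: S(x) = (theta / (theta + x)) ** alpha.
theta = 10_000
median = 5_000

# F(median) = 0.5  =>  alpha * ln(theta / (theta + median)) = ln 0.5
alpha = math.log(0.5) / math.log(theta / (theta + median))
S_25000 = (theta / (theta + 25_000)) ** alpha   # Pr(claim > 25,000)

assert abs(alpha - 1.7095) < 0.001
assert abs(S_25000 - 0.1175) < 0.001
print(round(alpha, 4), round(S_25000, 4))
```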

1.2

Percentiles

Percentiles will play many roles in this course:

1. They will be used to fit curves using percentile matching (Lesson 31).

2. They play a role in the p–p plot (Lesson 36); in fact, both p’s in p–p probably stand for percentile.

3. The inversion method of simulation selects random percentiles of a distribution as random numbers (Lesson 60).

4. The Value-at-Risk risk measure (Lesson 8) is a glorified percentile.

Many times, instead of saying “the 100p th percentile” (and having to remember to multiply by 100), we prefer to say “the p th quantile”, which means the same thing. Percentiles are essentially an inverse function. Roughly speaking, if F(x) is the cumulative distribution for X, a q th quantile is a number x such that F(x) = q, or in other words it is F⁻¹(q). Here’s the precise definition again:


A 100p th percentile of a random variable X is a number π_p satisfying these two properties:

1. Pr(X ≤ π_p) ≥ p

2. Pr(X < π_p) ≤ p

If the cumulative distribution function F is continuous and strictly increasing, it is the unique point at which Pr(X ≤ π_p) = p. In other words, the 100p th percentile is the x for which the graph of the cumulative distribution function F(x) equals or crosses the vertical level p. Common continuous distributions have a cumulative distribution function that is strictly increasing except when equal to 0 or 1. For these functions, the quantile (other than the 0 and 100 percentiles) is the inverse function, which is one-to-one. On the other hand, for a discrete distribution, or any distribution with point masses, the inverse may not be defined or well-defined. At points where the inverse is not defined, a single number will be a q th quantile for many q’s; at points where the inverse is not well-defined, many numbers will qualify as the q th quantile. Consider the following example:

Example 1C A random variable X has the following distribution:

F(x) = 0.2x for 0 ≤ x ≤ 1
Pr(X = 2) = 0.35
Pr(X = 3) = 0.35
Pr(X = 4) = 0.10
Pr(X = x) = 0 otherwise

Calculate the 15th, 50th, and 90th percentiles of X.

Answer: A graph of the distribution function makes it easier to understand what is going on. On a graph, the inverse function consists of starting on the y-axis, going to the right until you hit the function, then going straight down. A graph of F(x) is shown in Figure 1.2.

The 15th percentile has a unique well-defined inverse, since F(x) is continuous in the area where it is equal to 0.15. The inverse is 0.75; F(0.75) = 0.15.

F(1) = 0.2 and F(2) = 0.55; there is no x such that F(x) = 0.5. However, the arrow from 0.5 hits a wall at x = 2, so 2 is the unique 50th percentile. We can verify that 2 is the unique 50th percentile according to the definition given above: Pr(X < 2) is no greater than 0.5 (it is 0.2), and Pr(X ≤ 2) is at least 0.5 (it is 0.55). 2 is also every percentile from the 20th to the 55th.

The arrow from 0.9 doesn’t hit a wall; it hits a horizontal line going from 3 to 4. There is no unique 90th percentile; every number from 3 to 4 is a 90th percentile.

For some purposes, it is desirable to have a smoothed percentile. One method of smoothing percentiles will be discussed in Lesson 31.

1.3

Conditional probability and expectation

The probability of event A given B, assuming Pr(B) ≠ 0, is

Pr(A | B) = Pr(A ∩ B)/Pr(B)


[Figure: graph of F(x) from Example 1C, rising linearly to 0.2 at x = 1, jumping to 0.55 at x = 2 and to 0.90 at x = 3, and reaching 1 at x = 4.]

Figure 1.2: Plot of F(x) in example 1C, illustrating 15th, 50th, and 90th percentiles.

where Pr(A ∩ B) is the probability of both A and B occurring. A corresponding definition for continuous distributions uses the density function f instead of Pr:

f_X(x | y) = f(x, y)/f(y)

where f(y) = ∫ f(x, y) dx ≠ 0.

Two important theorems are Bayes’ Theorem and the Law of Total Probability:

Theorem 1.1 (Bayes’ Theorem)

Pr(A | B) = Pr(B | A) Pr(A)/Pr(B)

Correspondingly for continuous distributions

f_X(x | y) = f_Y(y | x) f_X(x)/f_Y(y)

Theorem 1.2 (Law of Total Probability) If B_i is a set of exhaustive (in other words, Pr(∪_i B_i) = 1) and mutually exclusive (in other words Pr(B_i ∩ B_j) = 0 for i ≠ j) events, then for any event A,

Pr(A) = Σ_i Pr(A ∩ B_i) = Σ_i Pr(B_i) Pr(A | B_i)

Correspondingly for continuous distributions,

Pr(A) = ∫ Pr(A | x) f(x) dx

Expected values can be factored through conditions too:

Conditional Mean Formula

E_X[X] = E_Y[E_X[X | Y]]   (1.3)


This formula is one of the double expectation formulas.³ More generally, for any function g,

E_X[g(X)] = E_Y[E_X[g(X) | Y]]

Here are examples of this important theorem.

Example 1D There are two types of actuarial students, bright and not-so-bright. For each exam, the probability that a bright student passes it is 80%, and the probability that a not-so-bright student passes it is 40%. All students start with Exam 1 and take the exams in sequence, and drop out as soon as they fail one exam. An equal number of bright and not-so-bright students take Exam 1.

Determine the probability that a randomly selected student taking Exam 3 will pass.

Answer: A common wrong answer to this question is 0.5(0.8) + 0.5(0.4) = 0.6. This is an incorrect application of the Law of Total Probability. The probability that a student taking Exam 3 is bright is more than 0.5, because of the elimination of the earlier exams. A correct way to calculate the probability is to first calculate the probability that a student is taking Exam 3 given the two types of students. Let I1 be the event of being bright initially (before taking Exam 1) and I2 the event of not being bright initially. Let E be the event of taking Exam 3. Then by Bayes’ Theorem and the Law of Total Probability,

Pr(I1 | E) = Pr(E | I1) Pr(I1)/Pr(E)
Pr(E) = Pr(E | I1) Pr(I1) + Pr(E | I2) Pr(I2)

Now, the probability that one takes Exam 3 if bright is the probability of passing the first two exams, or 0.8² = 0.64. If not-so-bright, the probability is 0.4² = 0.16. So we have

Pr(E) = 0.64(0.5) + 0.16(0.5) = 0.4
Pr(I1 | E) = (0.64)(0.5)/0.4 = 0.8

and Pr(I2 | E) = 1 − 0.8 = 0.2 (or you could go through the above derivation with I2 instead of I1). Now we’re ready to apply the Law of Total Probability to the conditional distributions given E to answer the question. Let P be the event of passing Exam 3. Then

Pr(P | E) = Pr(P | I1 & E) Pr(I1 | E) + Pr(P | I2 & E) Pr(I2 | E) = (0.8)(0.8) + (0.4)(0.2) = 0.72
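The computation in Example 1D can be restated in a few lines of arithmetic (this is the same conditioning argument, just written out directly):

```python
# Example 1D restated: half of entering students are bright (pass
# probability 0.8), half not-so-bright (pass probability 0.4), and a
# student reaches Exam 3 only by passing the first two exams.
p_bright, p_slow = 0.8, 0.4

reach_bright = 0.5 * p_bright**2   # Pr(bright and reaches Exam 3) = 0.32
reach_slow = 0.5 * p_slow**2       # Pr(slow and reaches Exam 3)   = 0.08

# Condition on reaching Exam 3, then apply total probability.
pr_reach = reach_bright + reach_slow
p_pass3 = (reach_bright * p_bright + reach_slow * p_slow) / pr_reach

assert abs(p_pass3 - 0.72) < 1e-9
print(round(p_pass3, 2))
```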

Now let’s do a continuous example.

Example 1E Claim sizes follow an exponential distribution with mean θ. The parameter θ varies by insured. Over all insureds, θ has a distribution with the following density function:

f(θ) = 1/θ²,  1 ≤ θ < ∞

Calculate the probability that a claim is greater than 0.5.

Answer: Given θ, the probability that a claim is greater than 0.5 is Pr(X > 0.5 | θ) = e^(−0.5/θ). By the Law of Total Probability,

Pr(X > 0.5) = ∫₁^∞ e^(−0.5/θ) (1/θ²) dθ

This integral is calculated using the substitution u = −0.5/θ. If you need a refresher on how to carry out the substitution, see the sidebar.

∫₁^∞ e^(−0.5/θ) (1/θ²) dθ = 2e^(−0.5/θ) |₁^∞ = 2(1 − e^(−0.5)) = 2(1 − 0.606531) = 0.786939

Incidentally, the distribution of θ is a single-parameter Pareto distribution with parameters θ = 1 (a different θ) and α = 1. Recognizing distributions is helpful, since then you can look up the moments in the table if you need them, rather than calculating them.

The unconditional distribution of this example is a continuous mixture. Section 4.1 will discuss mixtures.
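If you want to double-check the integral in Example 1E without the antiderivative, a crude numerical integration gives the same value. Substituting u = 1/θ turns the integral into ∫₀¹ e^(−0.5u) du, which the sketch below approximates with a midpoint rule (the grid size is arbitrary):

```python
import math

# Midpoint-rule approximation of the integral from Example 1E after the
# substitution u = 1/theta:  Pr(X > 0.5) = integral of e^(-0.5 u) over (0, 1].
n = 20_000          # number of subintervals (arbitrary, just needs to be large)
h = 1.0 / n
total = sum(math.exp(-0.5 * (i + 0.5) * h) for i in range(n)) * h

exact = 2 * (1 - math.exp(-0.5))   # the closed form from the text
assert abs(total - exact) < 1e-9
assert abs(total - 0.786939) < 1e-5
print(round(total, 6))
```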

1.4

Moment and probability generating functions

A generating function for a (usually infinite) sequence {a₀, a₁, a₂, . . .} is a function f(z) of the form

f(z) = Σ_{n=0}^{∞} a_n z^n


The idea of a generating function is that if you differentiate this function n times, divide by n!, and evaluate at 0, you will recover a_n:

f^(n)(0)/n! = a_n

where f^(n) indicates the n th derivative. With this in mind, let’s discuss moment and probability generating functions.

The moment generating function (MGF) is defined by

M_X(t) = E[e^(tX)]

It has the property that M^(n)(0), the n th derivative evaluated at 0, is the n th raw moment. Unlike for other generating functions, the n th derivative is already the n th moment and is not divided by n! to get the n th moment. Another useful property of the moment generating function is, if X is the sum of independent random variables, its moment generating function is the product of the moment generating functions of those variables. It’s usually difficult to add random variables, but if you multiply their moment generating functions and recognize the result, that tells you the distribution of the sum of the random variables.

The probability generating function (PGF) is defined by

P(z) = E[z^X] = M(ln z)

The textbook uses it for discrete distributions, and the tables you get at the exam list it (using the notation P(z)) for those distributions. It lives up to its name: the n th derivative at 0, divided by n!, is the probability that the random variable equals n, or

p_n = P^(n)(0)/n!

where P^(n) denotes the n th derivative. Another useful property of the pgf is that the n th derivative of the pgf evaluated at 1 is the n th factorial moment. The n th factorial moment is µ_(n) = E[X(X − 1) · · · (X − n + 1)]. Examples of factorial moments calculated with the pgf are

P′(1) = E[X]
P″(1) = E[X(X − 1)]
P‴(1) = E[X(X − 1)(X − 2)]

and in general, using f^(n) for the n th derivative of f, and the textbook’s notation µ_(n) for the n th factorial moment,

P^(n)(1) = µ_(n)   (1.4)

With some algebra you can derive the central or raw moments from the factorial moments. For example, since µ_(2) = E[X(X − 1)] = E[X²] − E[X], it follows that µ′₂ = µ_(2) + µ.⁴ Like the moment generating function, if X is the sum of independent random variables, its probability generating function is a product of the probability generating functions of those variables.
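For a distribution with finite support the pgf is a polynomial whose coefficients are the probabilities, so the derivative properties above can be checked exactly. A sketch using a made-up three-point distribution:

```python
# For a finite discrete distribution, P(z) is a polynomial whose
# coefficients are the probabilities, so its derivatives can be taken
# exactly. The distribution below is made up for illustration.
p = {0: 0.3, 1: 0.5, 2: 0.2}
coeffs = [p.get(n, 0.0) for n in range(3)]   # P(z) = 0.3 + 0.5 z + 0.2 z^2

def deriv(c):
    """Coefficients of the derivative of the polynomial with coefficients c."""
    return [k * c[k] for k in range(1, len(c))]

def value(c, z):
    """Evaluate the polynomial with coefficients c at z."""
    return sum(ck * z**k for k, ck in enumerate(c))

c1 = deriv(coeffs)                  # P'(z)
c2 = deriv(c1)                      # P''(z)

# P^(1)(0)/1! recovers p_1
assert abs(value(c1, 0) / 1 - p[1]) < 1e-12

# P'(1) = E[X] and P''(1) = E[X(X-1)], the factorial moments
mean = sum(n * pn for n, pn in p.items())
fact2 = sum(n * (n - 1) * pn for n, pn in p.items())
assert abs(value(c1, 1) - mean) < 1e-12
assert abs(value(c2, 1) - fact2) < 1e-12
print("pgf properties verified")
```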

P 000 (1)  E[X ( X − 1)( X − 2) ] and in general, using f ( n ) for the n th derivative of f , and the textbook’s notation µ ( n ) for the n th factorial moment P ( n ) (1)  µ ( n ) (1.4) With some algebra you can derive the central or raw moments from the factorial moments. For example, since µ (2)  E[X ( X − 1) ]  E[X 2 ] − E[X], it follows that µ02  µ (2) + µ.4 Like the moment generating function, if X is the sum of independent random variables, its probability generating function is a product of the probability generating functions of those variables. Example 1F Calculate the third raw moment of a negative binomial distribution with parameters r and β.

⁴The textbook only mentions that P′(1) = E[X] and P″(1) = E[X(X − 1)], but not the generalization to higher derivatives of the pgf. The textbook mentions factorial moments only in the appendix that has the distribution tables.


Answer: The tables give us the mean, variance, and pgf, and that is where the next line’s expansion of the pgf is coming from.

P(z) = (1 − β(z − 1))^(−r)
P‴(z) = (−β)³(−r)(−(r + 1))(−(r + 2))(1 − β(z − 1))^(−(r+3)) = β³ r(r + 1)(r + 2)(1 − β(z − 1))^(−(r+3))
P‴(1) = β³ r(r + 1)(r + 2)

Also,

P‴(1) = E[X(X − 1)(X − 2)] = E[X³] − 3 E[X²] + 2 E[X]

and E[X] = rβ while E[X²] = Var(X) + E[X]² = rβ(1 + β) + r²β², so

E[X³] = r(r + 1)(r + 2)β³ + 3(rβ + rβ² + r²β²) − 2rβ
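The closed form in Example 1F can be checked against a brute-force sum of the negative binomial pmf. In the sketch below r is taken to be an integer so that math.comb applies, and the parameter values are arbitrary:

```python
from math import comb

# Negative binomial pmf (as parameterized in the exam tables):
#   Pr(N = k) = C(r + k - 1, k) * (1/(1+beta))^r * (beta/(1+beta))^k
r, beta = 3, 0.7          # arbitrary illustrative values; r integral for comb()
q = beta / (1 + beta)

def pmf(k):
    return comb(r + k - 1, k) * (1 / (1 + beta))**r * q**k

# Brute-force third raw moment; the tail beyond k = 200 is negligible here.
third = sum(k**3 * pmf(k) for k in range(200))

# Closed form derived in Example 1F.
closed = (r*(r + 1)*(r + 2)*beta**3
          + 3*(r*beta + r*beta**2 + r**2 * beta**2)
          - 2*r*beta)

assert abs(third - closed) < 1e-9
print(round(closed, 6))
```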

1.5



The empirical distribution

There are many continuous and discrete probability distributions in the tables you get at the exam. However, every time you have a sample, you can create a probability distribution based on it. Given a sample x₁, . . . , x_n, the empirical distribution is the probability distribution assigning a probability of 1/n to each item in the sample. It is a discrete distribution.

Example 1G You are given the sample 1, 1, 2, 3, 5. Calculate:

1. The empirical mean.

2. The empirical variance.

3. The empirical skewness.

4. The empirical 80th percentile.

5. The empirical probability generating function.

Answer: The empirical distribution assigns a probability of 1/5 to each point, so we have

x          1    2    3    5
Pr(X = x)  0.4  0.2  0.2  0.2

1. The mean is 0.4(1) + 0.2(2) + 0.2(3) + 0.2(5) = 2.4.

2. The variance is

σ² = 0.4(1 − 2.4)² + 0.2(2 − 2.4)² + 0.2(3 − 2.4)² + 0.2(5 − 2.4)² = 2.24

Alternatively, you could calculate the second raw moment and subtract the square of the mean:

µ′₂ = 0.4(1²) + 0.2(2²) + 0.2(3²) + 0.2(5²) = 8
σ² = 8 − 2.4² = 2.24


3. The raw third moment is

µ′₃ = 0.4(1³) + 0.2(2³) + 0.2(3³) + 0.2(5³) = 32.4

The coefficient of skewness is

γ₁ = (32.4 − 3(8)(2.4) + 2(2.4³))/2.24^(3/2) = 0.730196

4. Any number x such that Pr(X < x) ≤ 0.8 and Pr(X ≤ x) ≥ 0.8 is an 80th percentile. This is true for 3 ≤ x ≤ 5. In fact, the graph of the distribution is horizontal between 3 and 5. So the set of 80th percentiles is {x : 3 ≤ x ≤ 5}.

5. P(z) = E[z^X] = 0.4z + 0.2z² + 0.2z³ + 0.2z⁵
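The empirical calculations in Example 1G take only a few lines to reproduce from the raw sample:

```python
# Recompute Example 1G's empirical moments directly from the sample.
sample = [1, 1, 2, 3, 5]
n = len(sample)

mean = sum(sample) / n
var = sum((x - mean)**2 for x in sample) / n      # empirical (biased) variance
third = sum((x - mean)**3 for x in sample) / n    # third central moment
skew = third / var**1.5

assert abs(mean - 2.4) < 1e-12
assert abs(var - 2.24) < 1e-12
assert abs(skew - 0.730196) < 1e-6
print(round(mean, 2), round(var, 2), round(skew, 6))
```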

Exercises

1.1. The random variable X has a uniform distribution on [0, 1]. Let h(x) be its hazard rate function.

Calculate h(0.75).

1.2. For a random variable X you are given that

(i) The mean is 4.
(ii) The variance is 2.
(iii) The raw third moment is 3.

Determine the skewness of X.

1.3. A random variable X has a gamma distribution with parameters α = 2, θ = 100.

Determine the kurtosis of X.

1.4. [4B-S93:34] (1 point) Claim severity has the following distribution:

Claim Size   Probability
100          0.05
200          0.20
300          0.50
400          0.20
500          0.05

Determine the distribution’s skewness.

(A) −0.25  (B) 0  (C) 0.15  (D) 0.35  (E) Cannot be determined



Table 1.1: Summary of Probability Concepts

Probability Functions

F(x) = Pr(X ≤ x)
S(x) = 1 − F(x)
f(x) = dF(x)/dx
H(x) = −ln S(x)
h(x) = dH(x)/dx = f(x)/S(x)

Expected value: E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

n th raw moment: µ′_n = E[X^n]
n th central moment: µ_n = E[(X − µ)^n]
Variance: σ² = E[(X − µ)²] = E[X²] − µ²
Skewness: γ₁ = µ₃/σ³ = (µ′₃ − 3µ′₂µ + 2µ³)/σ³
Kurtosis: γ₂ = µ₄/σ⁴ = (µ′₄ − 4µ′₃µ + 6µ′₂µ² − 3µ⁴)/σ⁴
Moment generating function: M_X(t) = E[e^(tX)]
Probability generating function: P(z) = E[z^X]

• Standard deviation (σ) is the positive square root of the variance.

• Coefficient of variation is σ/µ.

• Median is the 50th percentile; n th quartile is the 25n th percentile.

• Mode is x which maximizes f(x).

• M_X^(n)(0) = E[X^n], where M^(n) is the n th derivative.

• P_X^(n)(0)/n! = Pr(X = n).

• P_X^(n)(1) is the n th factorial moment of X.

• The 100p th percentile π_p is any point satisfying F(π_p−) ≤ p and F(π_p) ≥ p. If F is continuous, it is the unique point satisfying F(π_p) = p.

• Bayes’ Theorem:

Pr(A | B) = Pr(B | A) Pr(A)/Pr(B)
f_X(x | y) = f_Y(y | x) f_X(x)/f_Y(y)

• Law of Total Probability: If B_i is a set of exhaustive (in other words, Pr(∪_i B_i) = 1) and mutually exclusive (in other words Pr(B_i ∩ B_j) = 0 for i ≠ j) events, then for any event A,

Pr(A) = Σ_i Pr(A ∩ B_i) = Σ_i Pr(B_i) Pr(A | B_i)

Correspondingly for continuous distributions,

Pr(A) = ∫ Pr(A | x) f(x) dx

• Conditional Expectation Formula: E_X[X] = E_Y[E_X[X | Y]]   (1.3)



1.5. [4B-F98:27] (2 points) Determine the skewness of a gamma distribution with a coefficient of variation of 1.

Hint: The skewness of a distribution is defined to be the third central moment divided by the cube of the standard deviation.

(A) 0  (B) 1  (C) 2  (D) 4  (E) 6

1.6. You are given the following joint distribution of two random variables X and Y:

(x, y)   Pr((X, Y) = (x, y))
(0,0)    0.45
(0,1)    0.10
(0,2)    0.05
(1,0)    0.20
(1,1)    0.15
(2,1)    0.05

Calculate the correlation of X and Y.

1.7. You are given the following joint distribution of two random variables X and Y:

(x, y)   Pr((X, Y) = (x, y))
(1,1)    0.32
(2,1)    0.24
(1,2)    0.22
(2,3)    0.10
(1,3)    0.12

Calculate the third central moment of X.

1.8. [4B-S95:28] (2 points) You are given the following:

• For any random variable X with finite first three moments, the skewness of the distribution of X is denoted Sk(X).

• X and Y are independent, identically distributed random variables with mean = 0 and finite second and third moments.

Which of the following statements must be true?

1. 2 Sk(X) = Sk(2X)
2. −Sk(Y) = Sk(−Y)
3. |Sk(X)| ≥ |Sk(X + Y)|

(A) 2  (B) 3  (C) 1,2  (D) 2,3  (E) None of A, B, C, or D

1.9. [4B-S97:21] (2 points) You are given the following:

• Both the mean and the coefficient of variation of a particular distribution are 2.

• The third moment of this distribution about the origin is 136.

Determine the skewness of this distribution.

Hint: The skewness of a distribution is defined to be the third central moment divided by the cube of the standard deviation.

(A) 1/4  (B) 1/2  (C) 1  (D) 4  (E) 17


1.10. [4-S01:3] You are given the following times of first claim for five randomly selected auto insurance policies observed from time t = 0:

1  2  3  4  5

Calculate the kurtosis of this sample.

(A) 0.0  (B) 0.5  (C) 1.7  (D) 3.4  (E) 6.8

1.11. [4B-S97:24] (2 points) The random variable X has the density function

f(x) = 4x/(1 + x²)³,  0 < x < ∞.

Determine the mode of X.

(A) 0
(B) Greater than 0, but less than 0.25
(C) At least 0.25, but less than 0.50
(D) At least 0.50, but less than 0.75
(E) At least 0.75

1.12. [3-F01:37] For watches produced by a certain manufacturer:

(i) Lifetimes follow a single-parameter Pareto distribution with α > 1 and θ = 4.
(ii) The expected lifetime of a watch is 8 years.

Calculate the probability that the lifetime of a watch is at least 6 years.

(A) 0.44  (B) 0.50  (C) 0.56  (D) 0.61  (E) 0.67

1.13. [4B-F99:29] (2 points) You are given the following:

• A is a random variable with mean 5 and coefficient of variation 1.
• B is a random variable with mean 5 and coefficient of variation 1.
• C is a random variable with mean 20 and coefficient of variation 1/2.
• A, B, and C are independent.
• X = A + B.
• Y = A + C.

Determine the correlation coefficient between X and Y.

(A) −2/√10  (B) −1/√10  (C) 0  (D) 1/√10  (E) 2/√10


1.14. [CAS3-F03:17] Losses have an Inverse Exponential distribution. The mode is 10,000.

Calculate the median.

(A) Less than 10,000
(B) At least 10,000, but less than 15,000
(C) At least 15,000, but less than 20,000
(D) At least 20,000, but less than 25,000
(E) At least 25,000

1.15. [CAS3-F03:19] For a loss distribution where x ≥ 2, you are given:

(i) The hazard rate function: h(x) = z²/2x, for x ≥ 2
(ii) A value of the distribution function: F(5) = 0.84

Calculate z.

(A) 2  (B) 3  (C) 4  (D) 5  (E) 6

1.16. A Pareto distribution has parameters α = 4 and θ = 1.

Determine its skewness.

(A) Less than 7.0
(B) At least 7.0, but less than 7.5
(C) At least 7.5, but less than 8.0
(D) At least 8.0, but less than 8.5
(E) At least 8.5

1.17. [CAS3-S04:28] A pizza delivery company has purchased an automobile liability policy for its delivery drivers from the same insurance company for the past five years. The number of claims filed by the pizza delivery company as the result of at-fault accidents caused by its drivers is shown below:

Year   Claims
2002   4
2001   1
2000   3
1999   2
1998   15

Calculate the skewness of the empirical distribution of the number of claims per year.

(A) Less than 0.50
(B) At least 0.50, but less than 0.75
(C) At least 0.75, but less than 1.00
(D) At least 1.00, but less than 1.25
(E) At least 1.25


EXERCISES FOR LESSON 1

19

1.18. [CAS3-F04:28] A large retailer of personal computers issues a warranty contract with each computer that it sells. The warranty covers any cost to repair or replace a defective computer within the first 30 days of purchase. 40% of all claims are easily resolved with minor technical help and do not involve any cost to replace or repair. If a claim involves some cost to replace or repair, the claim size is distributed as a Weibull with parameters τ = 1/2 and θ = 30.

Which of the following statements are true?

1. The expected cost of a claim is $60.
2. The survival function at $60 is 0.243.
3. The hazard rate at $60 is 0.012.

(A) 1 only.   (B) 2 only.   (C) 3 only.   (D) 1 and 2 only.   (E) 2 and 3 only.

1.19. You are given for the random variable X:

(i) E[X] = 3
(ii) Var(X) = 100
(iii) E[X³] = 30

Calculate the skewness of X.

(A) Less than −1
(B) At least −1, but less than −0.5
(C) At least −0.5, but less than 0
(D) At least 0, but less than 0.5
(E) At least 0.5

1.20. The variable X follows a normal distribution with mean 15 and variance 100.

Calculate the fourth central moment of X.

1.21. You are given the following:

• X is a random variable with probability density function

  f(x) = (α/β)(β/x)^(α+1)    x ≥ β, α > 0, β > 0

• E[X] = 7500.
• E[X²] = 75,000,000.
• m is the median of X.

Determine the value of f(m).

(A) Less than 0.00020
(B) At least 0.00020, but less than 0.00025
(C) At least 0.00025, but less than 0.00030
(D) At least 0.00030, but less than 0.00035
(E) At least 0.00035


1.22. [4-F00:32] You are given the following for a sample of five observations from a bivariate distribution:

(i)
x    y
1    4
2    2
4    3
5    6
6    4

(ii) x̄ = 3.6, ȳ = 3.8.

A is the covariance of the empirical distribution Fe as defined by these five observations. B is the maximum possible covariance of an empirical distribution with identical marginal distributions to Fe.

Determine B − A.

(A) 0.9   (B) 1.0   (C) 1.1   (D) 1.2   (E) 1.3

1.23. [CAS3-F04:24] A pharmaceutical company must decide how many experiments to run in order to maximize its profits.

• The company will receive a grant of $1 million if one or more of its experiments is successful.
• Each experiment costs $2,900.
• Each experiment has a 2% probability of success, independent of the other experiments.
• All experiments run simultaneously.
• Fixed expenses are $500,000.
• Ignore investment income.

The company performs the number of experiments that maximizes its expected profit. Determine the company's expected profit before it starts the experiments.

(A) 77,818   (B) 77,829   (C) 77,840   (D) 77,851   (E) 77,862

1.24. Claim size for an insurance coverage follows a lognormal distribution with mean 1000 and median 800.

Determine the probability that a claim will be greater than 1200.

1.25. Claim sizes for Kevin follow an exponential distribution with mean 6. Claim sizes for Kira follow an exponential distribution with mean 12. Claim sizes for Kevin and Kira are independent. Kevin and Kira submit one claim apiece.

Calculate the probability that the sum of the two claims is greater than 20.

Additional released exam questions: CAS3-S06:25, CAS3-F06:25

EXERCISE SOLUTIONS FOR LESSON 1

Solutions

1.1. For a uniform distribution, F(x) = x for 0 ≤ x ≤ 1. We then calculate S, f, and then h = f/S:

S(x) = 1 − F(x) = 1 − x
f(x) = dF/dx = 1
h(x) = f(x)/S(x) = 1/(1 − x)
h(0.75) = 1/(1 − 0.75) = 4

1.2. We will use formula (1.2) for the numerator, µ₃.

µ₂′ = 2 + 4² = 18
µ₃ = µ₃′ − 3µ₂′µ + 2µ³ = 3 − 3(18)(4) + 2(4³) = −85
σ³ = 2√2
γ₁ = −85/(2√2) = −30.052

1.3. Using the Loss Models tables, we have E[X^k] = θ^k(α + k − 1) · · · α. So

µ = 200
µ₂′ = 60,000    σ² = 60,000 − 200² = 20,000
µ₃′ = 24 · 10⁶
µ₄′ = 120 · 10⁸
µ₄ = µ₄′ − 4µ₃′µ + 6µ₂′µ² − 3µ⁴ = 10⁸(120 − 4(24)(2) + 6(6)(2²) − 3(2⁴)) = 24 · 10⁸
γ₂ = µ₄/σ⁴ = 24 · 10⁸/(4 · 10⁸) = 6

1.4. The distribution is symmetric, so its skewness is 0. (B) If this is not obvious to you, calculate the mean, which is 300. Then note that

µ₃ = 0.05(100 − 300)³ + 0.20(200 − 300)³ + 0.20(400 − 300)³ + 0.05(500 − 300)³ = 0

so the coefficient of skewness, which is µ₃ divided by σ³, is 0.

1.5.

µ = αθ    σ² = αθ²
σ/µ = 1/√α = 1 ⇒ α = 1
γ₁ = [θ³(1)(2)(3) − 3θ³(1)(2) + 2θ³(1³)]/((1)θ²)^(3/2) = (6 − 6 + 2)/1 = 2  (C)


1.6. For the covariance, use Cov(X, Y) = E[XY] − E[X]E[Y].

E[X] = (0.45 + 0.10 + 0.05)(0) + (0.20 + 0.15)(1) + 0.05(2) = 0.45
E[Y] = (0.45 + 0.20)(0) + (0.10 + 0.15 + 0.05)(1) + 0.05(2) = 0.4
E[XY] = (0.45 + 0.20 + 0.10 + 0.05)(0) + (0.15)(1) + 0.05(2) = 0.25
Cov(X, Y) = 0.25 − (0.45)(0.4) = 0.07

Calculate the raw second moments, and then the variances, of X and Y.

E[X²] = (0.20 + 0.15)(1²) + 0.05(2²) = 0.55    Var(X) = 0.55 − 0.45² = 0.3475
E[Y²] = (0.10 + 0.15 + 0.05)(1²) + 0.05(2²) = 0.5    Var(Y) = 0.5 − 0.4² = 0.34
ρ_XY = 0.07/√((0.3475)(0.34)) = 0.2036

1.7. Ignore Y and use the marginal distribution of X. From the given data, p₁ = 0.32 + 0.22 + 0.12 = 0.66 and p₂ = 0.34. So µ = E[X] = 0.66(1) + 0.34(2) = 1.34 and E[(X − µ)³] = 0.66(1 − 1.34)³ + 0.34(2 − 1.34)³ = 0.071808.

1.8. Skewness is dimensionless; doubling a random variable has no effect on skewness, since it multiplies the numerator and denominator by 2³ = 8. So 1 is false. Negating a random variable negates the numerator, without affecting the denominator since σ is always positive, so 2 is true. One would expect statement 3 to be true, since as more identical random variables get added together, the distribution becomes more and more normal (which has skewness 0). To demonstrate statement 3: in the denominator, Var(X + Y) = Var(X) + Var(Y) = 2σ_X², so σ³_{X+Y} = 2^(3/2) σ_X³. In the numerator,

E[(X + Y)³] = E[X³] + 3E[X²]E[Y] + 3E[X]E[Y²] + E[Y³] = 2E[X³]

where the last equality results from the fact that E[X] = E[Y] = 0. So

Sk(X + Y) = 2E[X³]/(2^(3/2) σ_X³) = Sk(X)/√2

making 3 true. (D)

1.9. From the coefficient of variation, we have

σ/µ = 2, so σ = 4 and µ = 2
σ² = 16    E[X²] = 16 + µ² = 20
σ³ = 64
γ₁ = (136 − 3(20)(2) + 2(8))/64 = 1/2  (B)


1.10. The variance is

σ² = [(1 − 3)² + (2 − 3)² + (3 − 3)² + (4 − 3)² + (5 − 3)²]/5 = 2

The fourth central moment is

µ₄ = [(1 − 3)⁴ + (2 − 3)⁴ + (3 − 3)⁴ + (4 − 3)⁴ + (5 − 3)⁴]/5 = 6.8

Kurtosis is

γ₂ = µ₄/σ⁴ = 6.8/2² = 1.7  (C)

1.11. This is a Burr distribution with γ = 2, α = 2, θ = 1. According to the tables, the mode is

θ((γ − 1)/(αγ + 1))^(1/γ) = (1/5)^(1/2) = 0.4472  (C)

If you wish to do the exercise directly: differentiate. We don't need the denominator of the derivative, since all we want to do is set the entire expression equal to zero.

numerator of f′(x) = (1 + x²)³(4) − (4x)(3)(1 + x²)²(2x) = (1 + x²)²(4)(1 + x² − 6x²)
1 − 5x² = 0
x = √0.2 = 0.4472  (C)

1.12. For a single-parameter Pareto, E[X] = αθ/(α − 1), so

4α/(α − 1) = 8 ⇒ α = 2

Then S(6) = (4/6)² = 4/9. (A)

1.13. Using the means and coefficients of variation, we have

σ_A = σ_B = 5    Var(A) = Var(B) = 25    E[A²] = E[B²] = 5² + 25 = 50
σ_C = 10    Var(C) = 100

Also

Cov(A + B, A + C) = Cov(A, A) + Cov(A, C) + Cov(B, A) + Cov(B, C) = Var(A) + 0 + 0 + 0 = 25

because A, B, and C are independent. Therefore,

σ_{A+B} = √(25 + 25) = √50
σ_{A+C} = √(25 + 100) = √125
ρ = 25/√((50)(125)) = 25/(25√10) = 1/√10  (D)


1.14. For the inverse exponential distribution, the mode is θ/2, so θ = 20,000. The median is x such that e^(−θ/x) = 0.5. Then

−20,000/x = ln 0.5 = −ln 2
x = 20,000/ln 2 = 28,854  (E)

1.15. The survival function is S(5) = exp(−∫₂⁵ h(u) du), so

1 − 0.84 = exp(−∫₂⁵ (z²/2u) du)
0.16 = exp(−(z²/2)(ln 5 − ln 2)) = exp((z²/2) ln(2/5)) = (2/5)^(z²/2)
(z²/2) ln 0.4 = ln 0.16
z² = 2 ln 0.16/ln 0.4 = 4
z = 2  (A)

z could also be −2, but that was not one of the five answer choices. Fortunately for the CAS, they didn't give range choices with a range including −2 and not 2, since they'd have to accept two answers then.
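Both answers above are easy to confirm numerically; the snippet below is a cross-check I've added, not part of the original solutions.

```python
import math

# Solution 1.14: inverse exponential with mode theta/2 = 10,000, so
# theta = 20,000; the median solves exp(-theta/x) = 0.5.
theta = 20_000
median = theta / math.log(2)

# Solution 1.15: z^2 = 2 ln 0.16 / ln 0.4
z = math.sqrt(2 * math.log(0.16) / math.log(0.4))
```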

1.16. Using the tables, with α = 4 and θ = 1, we have

E[X] = θ/(α − 1) = 1/3
E[X²] = 2θ²/((3)(2)) = 1/3
E[X³] = 6θ³/((3)(2)(1)) = 1
Var(X) = 1/3 − 1/9 = 2/9
γ₁ = [1 − 3(1/3)(1/3) + 2(1/3)³]/(2/9)^(3/2) = 7.0711  (B)

1.17. For the empirical distribution, each observation is treated as having probability 1/n = 1/5, so we calculate empirical first, second, and third raw moments:

µ = (4 + 1 + 3 + 2 + 15)/5 = 5
µ₂′ = (16 + 1 + 9 + 4 + 225)/5 = 51
µ₃′ = (64 + 1 + 27 + 8 + 3375)/5 = 695
µ₂ = 51 − 5² = 26

Then the skewness is

γ₁ = (695 − 3(51)(5) + 2(5³))/26^(1.5) = 1.3577  (E)
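The empirical moments above can be computed directly from the claim counts; this check (mine, not the manual's) uses biased (divide-by-n) moments, as the empirical distribution requires.

```python
# Empirical skewness of the claim counts in solution 1.17.
claims = [4, 1, 3, 2, 15]
n = len(claims)
mu = sum(claims) / n
mu2 = sum((x - mu) ** 2 for x in claims) / n   # empirical (biased) variance
mu3 = sum((x - mu) ** 3 for x in claims) / n
skewness = mu3 / mu2 ** 1.5
```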

1.18. For the Weibull, the expected value is (using the tables)

E[X] = 30Γ(1 + 2) = 30(2) = 60

However, there is a 40% chance of not paying a claim, so the expected cost of a claim is less than 60 (in fact, it is 0.6(60)), making 1 false.

The survival function is Pr(X > 0) Pr(X > 60 | X > 0). The first factor is 60%. The second factor is the survival function of a Weibull at 60, which is

e^(−(x/θ)^τ) = e^(−(60/30)^0.5) = e^(−√2) = 0.243

so the survival function of the cost is 0.6(0.243), making 2 false.

By the choices given, you already know the answer has to be (C), since that is the only choice that has 1 and 2 false, but let's discuss why. The hazard rate function of the Weibull is the quotient of the density over the survival:

h(x) = f(x)/S(x) = [τ(x/θ)^τ e^(−(x/θ)^τ)/x]/e^(−(x/θ)^τ) = τ(x/θ)^τ/x = (1/2)x^(−1/2)/30^(1/2)
h(60) = 0.5/√((60)(30)) = 0.011785

The 60% probability of a cost affects f(x) and S(x) equally. Both functions are multiplied by 0.60, since there is a 0.40 probability of 0. Thus the hazard rate we are interested in involves multiplying and dividing the Weibull hazard rate by 0.6. These two operations cancel out, leaving us with 0.011785 as the hazard rate. Another way to see this is to look at the alternative formula for h(x):

h(x) = −d ln S(x)/dx

S(x) is 0.6 of the Weibull survival function. When you log it, ln S(x) is the log of the Weibull survival plus ln 0.6. When you differentiate with respect to x, the constant ln 0.6 drops out.

1.19.

γ₁ = (30 − 3(100 + 3²)(3) + 2(3³))/100^(1.5) = −897/1000 = −0.897  (B)

1.20. The kurtosis of a normal distribution is 3. By definition, the kurtosis is the fourth central moment divided by the variance squared. So the fourth central moment is 3(100²) = 30,000.


1.21. You could recognize that this is a single-parameter Pareto and look it up in the tables. It isn't too hard to integrate either.

F(x) = ∫_β^x (α/β)(β/u)^(α+1) du = 1 − (β/x)^α
E[X] = ∫_β^∞ x(α/β)(β/x)^(α+1) dx = αβ/(α − 1)
E[X²] = ∫_β^∞ x²(α/β)(β/x)^(α+1) dx = αβ²/(α − 2)

So we have

αβ/(α − 1) = 7500    (*)
αβ²/(α − 2) = 75,000,000

Dividing the second expression by the square of the first,

(α − 1)²/((α − 2)α) = 4/3
3(α − 1)² = 4α(α − 2)
3α² − 6α + 3 = 4α² − 8α
α² − 2α − 3 = 0
α = 3, −1

and we reject −1 since α > 0. Then from (*), (3/2)β = 7500, so β = (2/3)(7500) = 5000. The median is determined from F(m) = 0.5, or

(5000/m)³ = 0.5
m = 5000/∛0.5 = 6299.61
f(m) = (3/5000)(5000/6299.61)⁴ = 0.0002381  (B)
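As a quick end-to-end check of this solution (not part of the original), with α = 3 and β = 5000 the two moment conditions hold and f at the median comes out near 0.0002381:

```python
# Single-parameter Pareto from solution 1.21.
alpha, beta = 3, 5000
mean = alpha * beta / (alpha - 1)                  # should be 7500
second_moment = alpha * beta ** 2 / (alpha - 2)    # should be 75,000,000
m = beta / 0.5 ** (1 / alpha)                      # median: (beta/m)^alpha = 0.5
f_m = (alpha / beta) * (beta / m) ** (alpha + 1)
```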

1.22. The empirical distribution assigns a probability of 1/n to each of n observations. The covariance is the sum of the products of the observations minus the product of the means, times the probabilities, which are all 1/n = 1/5. To maximize the covariance, the y's should be in the same order as the x's. The sum of the products is (1)(4) + (2)(2) + (4)(3) + (5)(6) + (6)(4) = 74. If the y's were ordered in increasing order, the sum would be (1)(2) + (2)(3) + (4)(4) + (5)(4) + (6)(6) = 80. We must subtract x̄ȳ from each and divide by 5, but since we're just comparing A and B, we won't bother subtracting x̄ȳ. Then B − A = 0.2(80 − 74) = 1.2. (D)

1.23. The probability of success for n experiments is 1 − 0.98ⁿ, so the profit, ignoring fixed expenses, is

1,000,000(1 − 0.98ⁿ) − 2900n

Differentiating this and setting it equal to 0:

−10⁶(0.98ⁿ)(ln 0.98) − 2900 = 0
0.98ⁿ = −2900/(10⁶ ln 0.98)
n = ln(−2900/(10⁶ ln 0.98))/ln 0.98 = 96.0815

Thus either 96 or 97 experiments are needed. Plugging those numbers into the original expression g(n) = 1,000,000(1 − 0.98ⁿ) − 2900n gets g(96) = 577,818.4 and g(97) = 577,794.0, so 96 is best, and the expected profit is 577,818.4 − 500,000 = 77,818.4. (A)

An alternative to calculus that is more appropriate for this discrete problem is to note that as n increases, at first expected profit goes up and then it goes down. Let Xₙ be the expected profit with n experiments. Then

Xₙ = 10⁶(1 − 0.98ⁿ) − 2900n − 500,000

and the incremental profit generated by experiment #n is

Xₙ − Xₙ₋₁ = 10⁶(0.98ⁿ⁻¹ − 0.98ⁿ) − 2900.

We want this difference to be greater than 0, which occurs when

10⁶(0.98ⁿ⁻¹ − 0.98ⁿ) > 2900
0.98ⁿ⁻¹(0.02) > 0.0029
0.98ⁿ⁻¹ > 0.0029/0.02 = 0.145
(n − 1) ln 0.98 > ln 0.145
n − 1 < ln 0.145/ln 0.98 = −1.93102/−0.02020 = 95.582

On the last line, the inequality got reversed because we divided by ln 0.98, a negative number. We conclude that the nth experiment increases profit only when n < 96.582, or n ≤ 96, the same conclusion as above.
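Since the problem is discrete, the cleanest machine check is a brute-force scan over n; this sketch (mine, not the manual's) confirms that 96 experiments maximize expected profit.

```python
# Brute-force check of solution 1.23.
def expected_profit(n):
    return 1_000_000 * (1 - 0.98 ** n) - 2_900 * n - 500_000

best_n = max(range(1, 500), key=expected_profit)
best_profit = expected_profit(best_n)
```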

1.24. Since the median is 800, e^µ = 800 and µ = ln 800. Since the mean is 1000, e^(µ+σ²/2) = 1000. Therefore:

800 e^(σ²/2) = 1000
e^(σ²/2) = 1.25
σ² = 0.4463
σ = 0.6680

The probability of a claim greater than 1200 is

1 − F(1200) = 1 − Φ((ln 1200 − ln 800)/0.6680) = 1 − Φ(0.61) = 0.2709


1.25. By the Law of Total Probability, the probability that the sum of two claims is greater than 20 is the integral over all x of the probability that the sum is greater than 20 given that Kira's claim is x, times the density of Kira's distribution. (The problem can also be done integrating over Kevin's distribution.) If X is Kira's claim and Y is Kevin's claim, then

Pr(X + Y > 20) = ∫₀^∞ Pr(X + Y > 20 | x) f(x) dx = ∫₀^∞ Pr(X + Y > 20 | x) (1/12)e^(−x/12) dx

Now, Pr(X + Y > 20) = 1 if X > 20 since Y can't be negative. If X < 20, then we need Y > 20 − X, and the probability of that under an exponential distribution with mean 6 is

Pr(Y > 20 − x) = e^(−(20−x)/6)

So we split the integral up into x ≥ 20 and 0 ≤ x ≤ 20.

Pr(X + Y > 20) = ∫₂₀^∞ (1/12)e^(−x/12) dx + ∫₀²⁰ e^(−(20−x)/6) (1/12)e^(−x/12) dx

The first integral, the probability of an exponential random variable with mean 12 being greater than 20, is 1 − F_X(20) = e^(−20/12). The second integral is

(1/12) ∫₀²⁰ e^(−(20−x)/6) e^(−x/12) dx = (1/12) ∫₀²⁰ e^(−(40−x)/12) dx = (1/12)(12) e^(−(40−x)/12) |₀²⁰ = e^(−20/12) − e^(−40/12)

So the final answer is

Pr(X + Y > 20) = e^(−20/12) + e^(−20/12) − e^(−40/12) = 0.188876 + 0.188876 − 0.035674 = 0.34208
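The convolution integral above can be checked two ways: with the closed form, and by direct numerical integration over Kira's claim. This is an added cross-check, not part of the original solution.

```python
import math

# Closed form from solution 1.25.
closed_form = 2 * math.exp(-20 / 12) - math.exp(-40 / 12)

# Midpoint-rule evaluation of the integral over X ~ exponential(mean 12).
steps = 100_000
width = 400 / steps          # integrate x from 0 to 400; the tail beyond is negligible
total = 0.0
for i in range(steps):
    x = (i + 0.5) * width
    tail = 1.0 if x >= 20 else math.exp(-(20 - x) / 6)   # Pr(Y > 20 - x)
    total += tail * math.exp(-x / 12) / 12 * width
```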

Lesson 2

Parametric Distributions

Reading: Loss Models Fourth Edition 4, 5.1, 5.2.1–5.2.3, 5.3–5.4

A parametric distribution is one that is defined by a fixed number of parameters. Examples of parametric distributions are the exponential distribution (parameter θ) and the Pareto distribution (parameters α, θ). Any distribution listed in the Loss Models appendix is parametric.

The alternative to a parametric distribution is a data-dependent distribution. A data-dependent distribution is one where the specification requires at least as many "parameters" as the number of data points in the sample used to create it; the bigger the sample, the more "parameters". Examples of data-dependent distributions are:

1. The empirical distribution based on a sample of size n, as defined in Section 1.5.
2. A kernel-smoothed distribution, as defined in Lesson 27.

It is traditional to use parametric distributions for claim counts (frequency) and loss size (severity). Parametric distributions have many advantages. One of the advantages of parametric distributions which makes them so useful for severity is that they handle inflation easily.

2.1 Scaling

A parametric distribution is a member of a scale family if any positive multiple of the random variable has the same form. In other words, the distribution function of cX, for c a positive constant, is of the same form as the distribution function of X, but with different values for the parameters. Sometimes the distribution can be parametrized in such a way that only one parameter of cX has a value different from the parameters of X. If the distribution is parametrized in this fashion, so that the only parameter of cX having a different value from X is θ, and the value of θ for cX is c times the value of θ for X, then θ is called a scale parameter. All of the continuous distributions in the tables (Appendix A) are scale families. The parametrizations given in the tables are often different from those you would find in other sources, such as your probability textbook. They are parametrized so that θ is the scale parameter. Thus when you are given that a random variable has any distribution in the appendix and you are given the parameters, it is easy to determine the distribution of a multiple of the random variable. The only distributions not parametrized with a scale parameter are the lognormal and the inverse Gaussian. Even though the inverse Gaussian has θ as a parameter, it is not a scale parameter. The parametrization for the lognormal given in the tables is the traditional one. If you need to scale a lognormal, proceed as follows: if X is lognormal with parameters ( µ, σ ) , then cX is lognormal with parameters ( µ + ln c, σ ) . To scale a random variable not in the tables, you’d reason as follows. Let Y  cX, c > 0. Then



F_Y(y) = Pr(Y ≤ y) = Pr(cX ≤ y) = Pr(X ≤ y/c) = F_X(y/c)

One use of scaling is in handling inflation. In fact, handling inflation is the only topic in this lesson that is commonly tested directly. If loss sizes are inflated by 100r%, the inflated loss variable Y will be (1 + r)X, where X is the pre-inflation loss variable. For a scale family with a scale parameter, you just multiply θ by (1 + r) to obtain the new distribution.

C/4 Study Manual—17th edition Copyright ©2014 ASM



Example 2A Claim sizes expressed in dollars follow a two-parameter Pareto distribution with parameters α = 5 and θ = 90. A euro is worth $1.50. Calculate the probability that a claim will be for 20 euros or less.

Answer: If claim sizes in dollars are X, then claim sizes in euros are Y = X/1.5. The resulting euro-based random variable Y for claim size will be Pareto with α = 5, θ = 90/1.5 = 60. The probability that a claim will be no more than 20 euros is

Pr(Y ≤ 20) = F_Y(20) = 1 − (60/(60 + 20))⁵ = 0.7627
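The scale-parameter idea in Example 2A can be verified by pricing the same event in either currency; the two answers must agree. This sketch is illustrative only.

```python
# Scaling Pareto(alpha = 5, theta = 90) dollars into euros (divide by 1.5)
# gives Pareto(alpha = 5, theta = 60).
def pareto_cdf(x, alpha, theta):
    return 1 - (theta / (theta + x)) ** alpha

p_euros = pareto_cdf(20, 5, 60)    # Pr(claim <= 20 euros)
p_dollars = pareto_cdf(30, 5, 90)  # same event in dollars: 20 euros = $30
```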

Example 2B Claim sizes in 2010 follow a lognormal distribution with parameters µ = 4.5 and σ = 2. Claim sizes grow at 6% uniform inflation during 2011 and 2012. Calculate f(1000), the probability density function at 1000, of the claim size distribution in 2012.

Answer: If X is the claim size random variable in 2010, then Y = 1.06²X is the revised variable in 2012. The revised lognormal distribution of Y has parameters µ = 4.5 + 2 ln 1.06 and σ = 2. The probability density function at 1000 is

f_Y(1000) = [1/(σ(1000)√(2π))] e^(−(ln 1000 − µ)²/2σ²)
= [1/((2)(1000)√(2π))] e^(−[ln 1000 − (4.5 + 2 ln 1.06)]²/2(2²))
= (0.000199471)(0.518814) = 0.0001035

Example 2C Claim sizes expressed in dollars follow a lognormal distribution with parameters µ = 3 and σ = 2. A euro is worth $1.50. Calculate the probability that a claim will be for 100 euros or less.

Answer: If claim sizes in dollars are X, then claim sizes in euros are Y = X/1.5. As discussed above, the distribution of claim sizes in euros is lognormal with parameters µ = 3 − ln 1.5 and σ = 2. Then

F_Y(100) = Φ((ln 100 − 3 + ln 1.5)/2) = Φ(1.01) = 0.8438

Example 2D Claim sizes X initially follow a distribution with distribution function:

F_X(x) = 1 − √x/(1 + x)    x > 0

Claim sizes are inflated by 50% uniformly. Calculate the probability that a claim will be for 60 or less after inflation.

Answer: Let Y be the increased claim size. Then Y = 1.5X, so Pr(Y ≤ 60) = Pr(X ≤ 60/1.5) = F_X(40).

F_X(40) = 1 − √40/41 = 0.8457

2.2 Transformations

Students report that there have been questions on transformations of random variables on recent exams. However, you only need to know the simplest case: how to transform a single random variable using a monotonic function.

If Y = g(X), with g(x) a one-to-one monotonically increasing function, then

F_Y(y) = Pr(Y ≤ y) = Pr(X ≤ g⁻¹(y)) = F_X(g⁻¹(y))    (2.1)

and differentiating,

f_Y(y) = f_X(g⁻¹(y)) · dg⁻¹(y)/dy

If g(x) is one-to-one monotonically decreasing, then

F_Y(y) = Pr(Y ≤ y) = Pr(X ≥ g⁻¹(y)) = S_X(g⁻¹(y))    (2.2)

and differentiating,

f_Y(y) = −f_X(g⁻¹(y)) · dg⁻¹(y)/dy

Putting both cases (monotonically increasing and monotonically decreasing) together:

f_Y(y) = f_X(g⁻¹(y)) |dg⁻¹(y)/dy|    (2.3)

Example 2E X follows a two-parameter Pareto distribution with parameters α and θ. You are given

Y = ln(X/θ + 1)

Determine the distribution of Y.

Answer:

y = ln(x/θ + 1)
e^y − 1 = x/θ
x = θ(e^y − 1)

F_Y(y) = F_X(θ(e^y − 1)) = 1 − (θ/(θ + θ(e^y − 1)))^α = 1 − (θ/(θe^y))^α = 1 − e^(−αy)

So Y's distribution is exponential with parameter θ = 1/α. We see in this example that an exponential can be obtained by transforming a Pareto.

There are a few specific transformations that are used to create distributions:



1. If the transformation Y = X^τ is applied to a random variable X, with τ a positive real number, then the distribution of Y is called transformed. Thus when we talk about transforming a distribution we may be talking about any transformation, but if we talk about a transformed Pareto, say, then we are talking specifically about raising the random variable to a positive power.

2. If the transformation Y = X⁻¹ is applied to a random variable X, then the distribution of Y is prefaced with the word inverse. Some examples you will find in the tables are inverse exponential, inverse Weibull, and inverse Pareto.

3. If the transformation Y = X^τ is applied to a random variable X, with τ a negative real number, then the distribution of Y is called inverse transformed.

4. If the transformation Y = e^X is applied to a random variable X, we name Y with the name of X preceded with "log". The lognormal distribution is an example.

As an example, let's develop the distribution and density functions of an inverse exponential. Start with an exponential with parameter θ:

F(x) = 1 − e^(−x/θ)
f(x) = e^(−x/θ)/θ

and let y = 1/x. Notice that this is a one-to-one monotonically decreasing transformation, so when transforming the density function, we will multiply by the negative of the derivative. Then

F_Y(y) = Pr(Y ≤ y) = Pr(X ≥ 1/y) = S_X(1/y) = e^(−1/(yθ))
f_Y(y) = f_X(1/y)(−dx/dy) = [e^(−1/(yθ))/θ](1/y²) = e^(−1/(yθ))/(θy²)

However, θ is no longer a scale parameter after this transformation. Therefore, the tables in the appendix use the reciprocal of θ as the parameter and call it θ:

F_Y(y) = e^(−θ/y)
f_Y(y) = θe^(−θ/y)/y²

As a result of the change in parametrization, the negative moments of the inverse exponential, as listed in the tables, are different from the corresponding positive moments of the exponential. Even though Y = X⁻¹, the formula for E[Y⁻¹] is different from the one for E[X] because the θ's are not the same. To preserve the scale parameters,¹ the transformation should be done after the random variable is divided by its scale parameter. In other words:

1. Set Y/θ = (X/θ)^τ for a transformed random variable.
2. Set Y/θ = (X/θ)⁻¹ for an inverse random variable.
3. Set Y/θ = (X/θ)^(−τ) for an inverse transformed random variable.
4. Set Y/θ = e^(X/θ) for a logged random variable.

¹This method was shown to me by Ken Burton
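The reparametrized inverse exponential functions derived above can be sanity-checked numerically: the density θe^(−θ/y)/y² should match a finite difference of F_Y(y) = e^(−θ/y). This check is mine, not part of the manual.

```python
import math

theta = 3.0   # arbitrary parameter chosen for the check

def F(y):
    return math.exp(-theta / y)

def f(y):
    return theta * math.exp(-theta / y) / y ** 2

# Centered finite difference of F should approximate f.
h = 1e-6
max_err = max(abs((F(y + h) - F(y - h)) / (2 * h) - f(y))
              for y in (0.5, 1.0, 2.0, 10.0))
```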

Table 2.1: Summary of Scaling and Transformation Concepts

• If a distribution has a scale parameter θ and X has that distribution with parameter θ, then cX has the same distribution with parameter cθ.

• All continuous distributions in the exam tables have scale parameter θ except for the lognormal and inverse Gaussian.

• If X is lognormal with parameters µ and σ, then cX is lognormal with parameters µ + ln c and σ.

• If Y = g(X) and g is monotonically increasing, then

  F_Y(y) = Pr(Y ≤ y) = Pr(X ≤ g⁻¹(y)) = F_X(g⁻¹(y))    (2.1)

• If Y = g(X) and g is monotonically decreasing, then

  F_Y(y) = Pr(Y ≤ y) = Pr(X ≥ g⁻¹(y)) = S_X(g⁻¹(y))    (2.2)

• If Y = g(X) and g is monotonically increasing or decreasing, then

  f_Y(y) = f_X(g⁻¹(y)) |dg⁻¹(y)/dy|    (2.3)

Let's redo the inverse exponential example this way.

Y/θ = (X/θ)⁻¹
F_Y(y) = Pr(Y ≤ y) = Pr(Y/θ ≤ y/θ) = Pr(X/θ ≥ θ/y) = Pr(X ≥ θ²/y) = e^(−(θ²/y)/θ) = e^(−θ/y)

2.3 Common parametric distributions

The tables provide a lot of information about the distributions, but if you don't recognize the distribution, you won't know to use the table. Therefore, it is a good idea to be familiar with the common distributions. You should familiarize yourself with the form of each distribution, but not necessarily the constants. The constant is forced so that the density function will integrate to 1. If you know which distribution you are dealing with, you can figure out the constant. To emphasize this point, in the following discussion, we will use the letter c for constants rather than spelling out what the constants are. You are not trying to recognize the constant; you are trying to recognize the form.


The gamma function

The gamma function Γ(x) is a generalization to real numbers of the factorial function, defined by

Γ(x) = ∫₀^∞ u^(x−1) e^(−u) du

For positive integers n,

Γ(n) = (n − 1)!

The most important relationship for Γ(x) that you should know is Γ(x + 1) = xΓ(x) for any real number x.

Example 2F Evaluate Γ(8.5)/Γ(6.5).

Answer:

Γ(8.5)/Γ(6.5) = (Γ(8.5)/Γ(7.5)) · (Γ(7.5)/Γ(6.5)) = (7.5)(6.5) = 48.75
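Example 2F can be confirmed with the standard library's gamma function; this one-liner is an added check.

```python
import math

# Gamma(8.5)/Gamma(6.5) = (7.5)(6.5) by the recurrence Gamma(x+1) = x Gamma(x).
ratio = math.gamma(8.5) / math.gamma(6.5)
```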

We will mention the means and variances or second moments of the distributions. You need not memorize any of these. The tables give you the raw moments. You can calculate the variance as E[X 2 ] − E[X]2 . However, for frequently used distributions, you may want to memorize the mean and variance to save yourself some time when working out questions. We will graph the distributions. You are not responsible for graphs, but they may help you understand the distributions. The tables occasionally use the gamma function Γ ( x ) in the formulas for the moments. You should have a basic knowledge of the gamma function; if you are not familiar with this function, see the sidebar. The tables also use the incomplete gamma and beta functions, and define them, but you can get by without knowing them.

2.3.1 Uniform

A uniform distribution has a constant density on [d, u]:

f(x; d, u) = 1/(u − d)    d ≤ x ≤ u

F(x; d, u) = 0 for x ≤ d, (x − d)/(u − d) for d ≤ x ≤ u, and 1 for x ≥ u

You recognize a uniform distribution both by its finite support and by the lack of an x in the density function.

Its moments are

E[X] = (d + u)/2
Var(X) = (u − d)²/12

Its mean, median, and midrange are equal. The best way to calculate the second moment is to add up the variance and the square of the mean. However, some students prefer to use the following easy-to-derive formula:

E[X²] = (1/(u − d)) ∫_d^u x² dx = (u³ − d³)/(3(u − d)) = (u² + ud + d²)/3    (2.4)

If d = 0, then the formula reduces to u²/3. The uniform distribution is not directly in the tables, so I recommend you memorize the formulas for mean and variance. However, if d = 0, then the uniform distribution is a special case of a beta distribution with θ = u, a = 1, b = 1.
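Equation (2.4) is easy to confirm against a direct numerical integral; this sketch (mine, with an arbitrary interval) compares the closed form to a midpoint sum.

```python
# E[X^2] for a uniform on [d, u]: closed form versus midpoint integration.
d, u = 2.0, 10.0
formula = (u ** 2 + u * d + d ** 2) / 3

steps = 100_000
width = (u - d) / steps
riemann = sum((d + (i + 0.5) * width) ** 2 for i in range(steps)) * width / (u - d)
```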

2.3.2 Beta

The probability density function of a beta distribution with θ = 1 has the form

f(x; a, b) = cx^(a−1)(1 − x)^(b−1)    0 ≤ x ≤ 1

The parameters a and b must be positive. They may equal 1, in which case the corresponding factor is missing from the density function. Thus if a = b = 1, the beta distribution is a uniform distribution. You recognize a beta distribution both by its finite support—it's the only common distribution with finite support—and by factors with x and 1 − x raised to powers and no other use of x in the density function. If θ is arbitrary, then the form of the probability density function is

f(x; a, b, θ) = cx^(a−1)(θ − x)^(b−1)    0 ≤ x ≤ θ

The distribution function can be evaluated if a or b is an integer. The moments are

E[X] = θa/(a + b)
Var(X) = θ²ab/((a + b)²(a + b + 1))

The mode is θ(a − 1)/(a + b − 2) when a and b are both greater than 1, but you are not responsible for this fact. Figure 2.1 graphs four beta distributions with θ = 1 all having mean 2/3. You can see how the distribution becomes more peaked and normal looking as a and b increase.

2.3.3 Exponential

The probability density function of an exponential distribution has the form

f(x; θ) = ce^(−x/θ)    x ≥ 0

θ must be positive.


Figure 2.1: Probability density function of four beta distributions with θ = 1 and mean 2/3 (a = 1, b = 0.5; a = 2, b = 1; a = 6, b = 3; a = 18, b = 9)

You recognize an exponential distribution when the density function has e raised to a multiple of x, and no other use of x. The distribution function is easily evaluated. The moments are:

E[X] = θ
Var(X) = θ²

Figure 2.2 graphs three exponential distributions. The higher the parameter, the more weight placed on higher numbers.

2.3.4 Weibull

A Weibull distribution is a transformed exponential distribution. If Y is exponential with mean µ, then X = Y^(1/τ) is Weibull with parameters θ = µ^(1/τ) and τ. An exponential is a special case of a Weibull with τ = 1. The form of the density function is

f(x; τ, θ) = cx^(τ−1) e^(−(x/θ)^τ)    x ≥ 0

Both parameters must be positive. You recognize a Weibull distribution when the density function has e raised to a multiple of a power of x, and in addition has a corresponding power of x, one lower than the power in the exponential, as a factor. The distribution function is easily evaluated, but the moments require evaluating the gamma function, which usually requires numerical techniques. The moments are

E[X] = θΓ(1 + 1/τ)
E[X²] = θ²Γ(1 + 2/τ)

Figure 2.2: Probability density function of three exponential distributions (θ = 25, 50, 100)

Figure 2.3: Probability density function of three Weibull distributions with mean 50 (τ = 0.5, θ = 25; τ = 1, θ = 50; τ = 2, θ = 100/√π)

Figure 2.3 graphs three Weibull distributions with mean 50. The distribution has a non-zero mode when τ > 1. Notice that the distribution with τ = 0.5 puts a lot of weight on small numbers. To make up

for this, it will also have to put higher weight than the other two distributions on very large numbers, so although it’s not shown, its graph will cross the other two graphs for high x

2.3.5

Gamma

The form of the density function of a gamma distribution is f ( x; α, θ )  cx α−1 e −x/θ

x≥0

Both parameters must be positive. When α is an integer, a gamma random variable with parameters α and θ is the sum of α independent exponential random variables with parameter θ. In particular, when α  1, the gamma random variable is exponential. The gamma distribution is called an Erlang distribution when α is an integer. We’ll discuss this more in Subsection 19.1.2. You recognize a gamma distribution when the density function has e raised to a multiple of x, and in C/4 Study Manual—17th edition Copyright ©2014 ASM

2. PARAMETRIC DISTRIBUTIONS

38

[Figure 2.4: Probability density function of three gamma distributions with mean 50 (α = 0.5, θ = 100; α = 5, θ = 10; α = 50, θ = 1)]

addition has x raised to a power. Contrast this with a Weibull, where e is raised to a multiple of a power of x. The distribution function may be evaluated if α is an integer; otherwise numerical techniques are needed. However, the moments are easily evaluated:

E[X] = αθ        Var(X) = αθ²

Figure 2.4 graphs three gamma distributions with mean 50. As α goes to infinity, the graph's peak narrows and the distribution converges to a normal distribution. The gamma distribution is one of the few for which the moment generating function has a closed form. In particular, the moment generating function of an exponential has a closed form. The only other distributions in the tables with closed-form moment generating functions are the normal distribution (not actually in the tables, but the formula for the lognormal moments is the MGF of a normal) and the inverse Gaussian.
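These gamma moment formulas are easy to confirm numerically. The sketch below (illustrative Python, not from the manual) integrates the density for the α = 5, θ = 10 curve of Figure 2.4:

```python
import math

def gamma_pdf(x, alpha, theta):
    # f(x) = x^(alpha-1) e^(-x/theta) / (Gamma(alpha) theta^alpha)
    return x ** (alpha - 1) * math.exp(-x / theta) / (math.gamma(alpha) * theta ** alpha)

def moment(k, alpha, theta, upper=400.0, steps=200_000):
    # midpoint-rule integration of x^k f(x); the tail beyond x = 400 is negligible here
    h = upper / steps
    return sum(((i + 0.5) * h) ** k * gamma_pdf((i + 0.5) * h, alpha, theta) * h
               for i in range(steps))

alpha, theta = 5.0, 10.0                       # mean 50, as in Figure 2.4
mean = moment(1, alpha, theta)
variance = moment(2, alpha, theta) - mean ** 2
print(mean, variance)    # close to alpha*theta = 50 and alpha*theta^2 = 500
```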

2.3.6 Pareto

When we say "Pareto", we mean a two-parameter Pareto. On recent exams, they write out "two-parameter" to make it clear, but on older exams, you will often find the word "Pareto" with no qualifier. It always refers to a two-parameter Pareto, not a single-parameter Pareto. The form of the density function of a two-parameter Pareto is

f(x) = c/(θ + x)^(α+1),    x ≥ 0

Both parameters must be positive.


[Figure 2.5: Probability density function of three Pareto distributions (α = 0.5, θ = 5; α = 2, θ = 50; α = 5, θ = 200)]

You recognize a Pareto when the density function has a denominator with x plus a constant raised to a power. The distribution function is easily evaluated. The moments are

E[X] = θ/(α − 1),    α > 1
E[X²] = 2θ²/((α − 1)(α − 2)),    α > 2

When α does not satisfy these conditions, the corresponding moments don't exist. A shortcut formula for the variance of a Pareto is

Var(X) = E[X]² (α/(α − 2))

Figure 2.5 graphs three Pareto distributions, one with α < 1 and the other two with mean 50. Although the one with α = 0.5 puts higher weight on small numbers than the other two, its mean is infinite; it puts higher weight on large numbers than the other two, and its graph eventually crosses the other two as x → ∞.
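The shortcut agrees with computing E[X²] − E[X]² directly; a quick check (illustrative Python, not from the manual) for the α = 5, θ = 200 Pareto of Figure 2.5:

```python
# Pareto moments: E[X] = theta/(alpha-1), E[X^2] = 2 theta^2/((alpha-1)(alpha-2))
alpha, theta = 5.0, 200.0          # the alpha = 5 Pareto in Figure 2.5 (mean 50)
ex = theta / (alpha - 1)
ex2 = 2 * theta ** 2 / ((alpha - 1) * (alpha - 2))
var_direct = ex2 - ex ** 2                     # E[X^2] - E[X]^2
var_shortcut = ex ** 2 * alpha / (alpha - 2)   # the shortcut formula
print(ex, var_direct, var_shortcut)            # 50, then two equal variances
```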

2.3.7 Single-parameter Pareto

The form of the density function of a single-parameter Pareto is

f(x) = c/x^(α+1),    x ≥ θ

α must be positive. θ is not considered a parameter, since it must be selected in advance, based on what you want the range to be. You recognize a single-parameter Pareto by its having support not starting at 0, and by the density function having a denominator with x raised to a power. A beta distribution may also have x raised to a negative power, but it would have finite support.


A single-parameter Pareto X is a two-parameter Pareto Y shifted by θ: X = Y + θ. Thus it has the same variance, and its mean is θ greater than the mean of a two-parameter Pareto with the same parameters.

E[X] = αθ/(α − 1),    α > 1
E[X²] = αθ²/(α − 2),    α > 2

2.3.8 Lognormal

The form of the density function of a lognormal distribution is

f(x) = (c/x) e^(−(ln x − µ)²/2σ²),    x > 0

σ must be nonnegative. You recognize a lognormal by the ln x in the exponent. If Y is normal, then X = e^Y is lognormal with the same parameters µ and σ. Thus, to calculate the distribution function, use

F_X(x) = F_Y(ln x) = Φ((ln x − µ)/σ)

where Φ(x) is the standard normal distribution function, for which you are given tables. The moments of a lognormal are

E[X] = e^(µ+0.5σ²)        E[X²] = e^(2µ+2σ²)

More generally, E[X^k] = E[e^(kY)] = M_Y(k), where M_Y(k) is the moment generating function of the corresponding normal distribution. Figure 2.6 graphs three lognormals with mean 50. The mode is exp(µ − σ²), as stated in the tables. For µ = 2, the mode is off the graph. As σ gets lower, the distribution flattens out. Table 2.2 is a summary of the forms of probability density functions for common distributions.
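A small check (illustrative Python, not from the manual, using math.erf for Φ) of these formulas for the µ = 3, σ = 1.3506 curve of Figure 2.6, which was chosen to have mean 50:

```python
import math

def Phi(z):
    # standard normal CDF: Phi(z) = (1 + erf(z/sqrt(2)))/2
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 3.0, 1.3506                     # one of the lognormals in Figure 2.6
mean = math.exp(mu + 0.5 * sigma ** 2)      # E[X] = e^(mu + 0.5 sigma^2)
second = math.exp(2 * mu + 2 * sigma ** 2)  # E[X^2] = e^(2mu + 2 sigma^2)
median = math.exp(mu)                       # F(e^mu) = Phi(0) = 0.5
print(mean)                                  # about 50
print(Phi((math.log(median) - mu) / sigma))  # exactly 0.5
```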

2.4 The linear exponential family

The following material, based on Loss Models 5.4, which is on the syllabus, is background for something we'll learn later in credibility. However, I doubt anything in this section will be tested on directly, so you may skip it. A set of parametric distributions is in the linear exponential family if it can be parametrized with a parameter θ in such a way that in its density function, the only interaction between θ and x is in the exponent of e, which is x times a function of θ. In other words, its density function f(x; θ) can be expressed as

f(x; θ) = p(x) e^(r(θ)x) / q(θ)

The set may have other parameters. q(θ) is the normalizing constant which makes the integral of f equal to 1. Examples of the linear exponential family are:


[Figure 2.6: Probability density function of three lognormal distributions with mean 50 (µ = 2, σ = 1.9555; µ = 3, σ = 1.3506; µ = 3.5, σ = 0.9078)]

Table 2.2: Forms of probability density functions for common distributions

Distribution               Probability density function
Uniform                    c,                               d ≤ x ≤ u
Beta                       c x^(a−1) (θ − x)^(b−1),         0 ≤ x ≤ θ
Exponential                c e^(−x/θ),                      x ≥ 0
Weibull                    c x^(τ−1) e^(−x^τ/θ^τ),          x ≥ 0
Gamma                      c x^(α−1) e^(−x/θ),              x ≥ 0
Pareto                     c/(x + θ)^(α+1),                 x ≥ 0
Single-parameter Pareto    c/x^(α+1),                       x ≥ θ
Lognormal                  (c/x) e^(−(ln x − µ)²/2σ²),      x > 0



Gamma distribution The pdf is

f(x; α, θ) = x^(α−1) e^(−x/θ) / (Γ(α) θ^α)

Let r(θ) = −1/θ, p(x) = x^(α−1), and q(θ) = Γ(α) θ^α.

Normal distribution The pdf is

f(x; θ) = e^(−(x−µ)²/2σ²) / (σ√2π)

Let θ = µ. The denominator of the pdf does not have x or θ, so it can go into q(θ) or into p(x). The exponent can be expanded into

−x²/2σ² + xθ/σ² − θ²/2σ²

and only the second summand involves both x and θ, and x appears to the first power. Thus we can set p(x) = e^(−x²/2σ²), r(θ) = θ/σ², and q(θ) = e^(θ²/2σ²) σ√2π.

Discrete distributions are in the linear exponential family if we can express the probability function in the linear exponential form.

Poisson distribution For a Poisson distribution, the probability function is

f(x; λ) = e^(−λ) λ^x / x! = e^(−λ) e^(x ln λ) / x!

We can let θ = λ, and then p(x) = 1/x!, r(θ) = ln θ, and q(θ) = e^θ. The textbook develops the following formulas for the mean and variance of a distribution from the linear exponential family:

E[X] = µ(θ) = q′(θ)/(r′(θ) q(θ)) = (ln q(θ))′/r′(θ)
Var(X) = v(θ) = µ′(θ)/r′(θ)

Thus, in the above examples:

Gamma distribution

(ln q(θ))′ = α/θ
r′(θ) = 1/θ²
E[X] = (α/θ)/(1/θ²) = αθ
Var(X) = α/(1/θ²) = αθ²


Normal distribution

(ln q(θ))′ = 2θ/2σ² = θ/σ²
r′(θ) = 1/σ²
E[X] = (θ/σ²)/(1/σ²) = θ
Var(X) = 1/(1/σ²) = σ²

Poisson distribution

(ln q(θ))′ = 1
r′(θ) = 1/θ
E[X] = 1/(1/θ) = θ
Var(X) = 1/(1/θ) = θ

2.5 Limiting distributions

The following material is based on Loss Models 5.3.3. I don't think it has ever appeared on the exam and doubt it ever will. In some cases, as the parameters of a distribution go to infinity, the distribution converges to another distribution. To demonstrate this, we will usually have to use the identity



lim_(α→∞) (1 + r/α)^α = e^r

Equivalently, if c is a constant (not dependent on α), then

lim_(α→∞) (1 + r/α)^(α+c) = e^r

since we can set α′ = α + c, and r/(α′ − c) → r/α′ as α′ → ∞. As a simple example (not in the textbook) of a limiting distribution, consider a gamma distribution with a fixed mean µ, and let α → ∞. Then θ = µ/α. The moment generating function is

M(t) = (1 − θt)^(−α) = 1/(1 − µt/α)^α

and as α → ∞, the denominator goes to e^(−µt), so M(t) → e^(µt), which is the moment generating function of the constant µ. So as α → ∞, the limiting distribution of a gamma is a distribution equal to the mean with probability 1. As another example, let's carry out textbook exercise 5.21, which asks you to demonstrate that the limiting distribution of a Pareto with θ/α constant as α → ∞ is an exponential. Let k = θ/α. The density


Table 2.3: Summary of Parametric Distribution Concepts

• If X is a member of a scale family with scale parameter θ with value s, then cX is in the same family and has the same parameter values as X except that the scale parameter θ has value cs.
• All distributions in the tables are scale families with scale parameter θ except for lognormal and inverse Gaussian.
• If X is lognormal with parameters µ and σ, then cX is lognormal with parameters µ + ln c and σ.
• If X is normal with parameters µ and σ², then e^X is lognormal with parameters µ and σ.
• See Table 2.2 to learn the forms of commonly occurring distributions. Useful facts are

  Uniform on [d, u]:    E[X] = (d + u)/2,    Var(X) = (u − d)²/12
  Uniform on [0, u]:    E[X²] = u²/3
  Gamma:                Var(X) = αθ²

• If Y is a single-parameter Pareto with parameters α and θ, then Y − θ is a two-parameter Pareto with the same parameters.
• X is in the linear exponential family if its probability density function can be expressed as

  f(x; θ) = p(x) e^(r(θ)x) / q(θ)

function of a Pareto is

f(x; α, θ) = αθ^α/(θ + x)^(α+1)
           = α(αk)^α/(αk + x)^(α+1)
           = k^α/(k + x/α)^(α+1)
           = (1/k) · 1/(1 + (x/k)/α)^(α+1)

and the limit as α → ∞ is (1/k) e^(−x/k). That is the density function of an exponential with mean k. Notice that as α → ∞, the mean of the Pareto converges to k.
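This convergence is visible numerically; an illustrative Python check (not part of the manual) compares the Pareto density with θ = αk against the limiting exponential density at one point as α grows:

```python
import math

def pareto_pdf(x, alpha, theta):
    # two-parameter Pareto density, computed via logs to avoid overflow for large alpha
    return math.exp(math.log(alpha) + alpha * math.log(theta)
                    - (alpha + 1) * math.log(theta + x))

k, x = 500.0, 800.0
target = math.exp(-x / k) / k      # exponential density with mean k
for alpha in (2.0, 20.0, 2000.0):
    print(alpha, pareto_pdf(x, alpha, alpha * k), target)
```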

EXERCISES FOR LESSON 2


Exercises

2.1. For a commercial fire coverage:
• In 2009, loss sizes follow a two-parameter Pareto distribution with parameters α = 4 and θ.
• In 2010, there is uniform inflation at rate r.
• The 65th percentile of loss size in 2010 equals the mean loss size in 2009.

Determine r.

2.2. [4B-S90:37] (2 points) Liability claim severity follows a Pareto distribution with a mean of 25,000 and parameter α = 3. If inflation increases all claims by 20%, the probability of a claim exceeding 100,000 increases by what amount?
(A) Less than 0.02
(B) At least 0.02, but less than 0.03
(C) At least 0.03, but less than 0.04
(D) At least 0.04, but less than 0.05
(E) At least 0.05

2.3. [4B-F97:26] (3 points) You are given the following:
• In 1996, losses follow a lognormal distribution with parameters µ and σ.
• In 1997, losses follow a lognormal distribution with parameters µ + ln k and σ, where k is greater than 1.
• In 1996, 100p% of the losses exceed the mean of the losses in 1997.

Determine σ. Note: z_p is the 100p-th percentile of a normal distribution with mean 0 and variance 1.

(A) 2 ln k
(B) −z_p ± √(z_p² − 2 ln k)
(C) z_p ± √(z_p² − 2 ln k)
(D) −z_p ± √(z_p² + 2 ln k)
(E) z_p ± √(z_p² + 2 ln k)



2.4. [4B-S94:16] (1 point) You are given the following:
• Losses in 1993 follow the density function

f(x) = 3x⁻⁴,    x ≥ 1,

where x = losses in millions of dollars.
• Inflation of 10% impacts all claims uniformly from 1993 to 1994.

Determine the probability that losses in 1994 exceed 2.2 million.
(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20

2.5. [4B-F95:6] (2 points) You are given the following:
• In 1994, losses follow a Pareto distribution with parameters θ = 500 and α = 1.5.
• Inflation of 5% impacts all losses uniformly from 1994 to 1995.

What is the median of the portion of the 1995 loss distribution above 200?
(A) Less than 600
(B) At least 600, but less than 620
(C) At least 620, but less than 640
(D) At least 640, but less than 660
(E) At least 660

2.6. [CAS3-S04:34] Claim severities are modeled using a continuous distribution, and inflation impacts claims uniformly at an annual rate of i. Which of the following are true statements regarding the distribution of claim severities after the effect of inflation?
1. An Exponential distribution will have scale parameter (1 + i)θ.
2. A 2-parameter Pareto distribution will have scale parameters (1 + i)α and (1 + i)θ.
3. A Paralogistic distribution will have scale parameter θ/(1 + i).
(A) 1 only
(B) 3 only
(C) 1 and 2 only
(D) 2 and 3 only
(E) 1, 2, and 3




2.7. [4B-S99:17] You are given the following:
• In 1998, claim sizes follow a Pareto distribution with parameters θ (unknown) and α = 2.
• Inflation of 6% affects all claims uniformly from 1998 to 1999.
• r is the ratio of the proportion of claims that exceed d in 1999 to the proportion of claims that exceed d in 1998.

Determine the limit of r as d goes to infinity.
(A) Less than 1.05
(B) At least 1.05, but less than 1.10
(C) At least 1.10, but less than 1.15
(D) At least 1.15, but less than 1.20
(E) At least 1.20

2.8. [4B-F94:28] (2 points) You are given the following:
• In 1993, the claim amounts for a certain line of business were normally distributed with mean µ = 1000 and variance σ² = 10,000:

f(x) = (1/(σ√2π)) exp(−(1/2)((x − µ)/σ)²),    −∞ < x < ∞,    µ = 1000, σ = 100.

• Inflation of 5% impacted all claims uniformly from 1993 to 1994.

What is the distribution for claim amounts in 1994?
(A) No longer a normal distribution
(B) Normal with µ = 1000 and σ = 102.5
(C) Normal with µ = 1000 and σ = 105.0
(D) Normal with µ = 1050 and σ = 102.5
(E) Normal with µ = 1050 and σ = 105.0

2.9. [4B-S93:11] (1 point) You are given the following:
(i) The underlying distribution for 1992 losses is given by a lognormal distribution with parameters µ = 17.953 and σ = 1.6028.
(ii) Inflation of 10% impacts all claims uniformly the next year.

What is the underlying loss distribution after one year of inflation?
(A) Lognormal with µ′ = 19.748 and σ′ = 1.6028
(B) Lognormal with µ′ = 18.048 and σ′ = 1.6028
(C) Lognormal with µ′ = 17.953 and σ′ = 1.7631
(D) Lognormal with µ′ = 17.953 and σ′ = 1.4571
(E) No longer a lognormal distribution

2.10. X follows an exponential distribution with mean 10.

Determine the mean of X⁴.



2.11. You are given:
(i) X is exponential with mean 2.
(ii) Y = X^1.5.

Calculate E[Y²].

2.12. X follows a gamma distribution with parameters α = 2.5 and θ = 10. Y = 1/X.

Evaluate Var(Y).

Additional released exam questions: CAS3-F05:19,21, CAS3-S06:26,27

Solutions

2.1. The mean in 2009 is θ/3. By definition, the 65th percentile is the number π₆₅ such that F(π₆₅) = 0.65, so F(θ/3) = 0.65 for the 2010 version of F. In 2010, F is two-parameter Pareto with inflated parameter θ′ = (1 + r)θ and α = 4, so

1 − (θ′/(θ′ + θ/3))⁴ = 0.65
((1 + r)θ/((1 + r)θ + θ/3))⁴ = 0.35
(1 + r)/(4/3 + r) = 0.35^(1/4)
1 + r = (4/3 + r)(0.35^(1/4))
r(1 − 0.35^(1/4)) = (4/3)(0.35^(1/4)) − 1
r = ((4/3)(0.35^(1/4)) − 1)/(1 − 0.35^(1/4)) = 0.1107

2.2. Let X be the original variable, Z = 1.2X. Since the mean is 25,000, the parameter θ is 25,000(α − 1) = 50,000.

Pr(X > 100,000) = (50/150)³ = 1/27
Pr(Z > 100,000) = (60/160)³ = 27/512
27/512 − 1/27 = 0.0157    (A)

2.3. The key is to understand (iii). If (for example) 30% of losses exceed $10,000, what percentage does not exceed $10,000? (Answer: 70%) And what percentile of the distribution of losses is $10,000? (Answer: the 70th.) So statement (iii) is saying that the 100(1 − p)th percentile of losses in 1996 equals the mean of losses in 1997. Got it?
The mean of 1997 losses is exp(µ + ln k + σ²/2). The 100(1 − p)th percentile is exp(µ − z_p σ). So:

µ − z_p σ = µ + ln k + σ²/2

EXERCISE SOLUTIONS FOR LESSON 2

σ²/2 + σz_p + ln k = 0
σ = −z_p ± √(z_p² − 2 ln k)    (B)

Notice that p must be less than 0.5, by the following reasoning. In general, the median of a lognormal (e^µ) is less than (or equal to, if σ = 0) the mean (e^(µ+σ²/2)), so the median of losses in 1996 is no more than the mean of losses in 1996, which in turn is less than the mean of losses in 1997 since k > 1, so 100p must be less than 50. Since p is less than 0.5, it follows that z_p will be negative, and σ is therefore positive, as it should be.

2.4. We recognize the 1993 distribution as a single-parameter Pareto with θ = 1, α = 3. The inflated parameters are θ = 1.1, α = 3. Pr(X > 2.2) = (1.1/2.2)³ = 0.125. (C)

2.5. Let X be the inflated variable, with θ = 525, α = 1.5. Pr(X > 200) = (525/(525 + 200))^1.5 = 0.6162. Let F be the distribution function of X, and F* the distribution of X | X > 200. Then F(200) = 1 − 0.6162 = 0.3838 and

F*(x) = Pr(X ≤ x | X > 200) = Pr(200 < X ≤ x)/Pr(X > 200) = (F(x) − F(200))/(1 − F(200))

So to calculate the median, we set F*(x) = 0.5, which means

(F(x) − F(200))/(1 − F(200)) = 0.5
(F(x) − 0.3838)/0.6162 = 0.5
F(x) = 0.5(0.6162) + 0.3838 = 0.6919

We must find x such that F(x) = 0.6919.

1 − (525/(525 + x))^1.5 = 0.6919
(525/(525 + x))^1.5 = 0.3081
525/(525 + x) = 0.4562
x = 525(1 − 0.4562)/0.4562 = 625.87    (C)
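The answer can be verified by plugging back in (illustrative Python, not part of the solution):

```python
theta, alpha = 525.0, 1.5                 # inflated 1995 Pareto parameters
F = lambda x: 1 - (theta / (theta + x)) ** alpha
x = 625.87
# the conditional distribution above 200 should equal 0.5 at the median
cond = (F(x) - F(200)) / (1 - F(200))
print(cond)    # approximately 0.5
```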

2.6. All the distributions are parameterized so that θ is the scale parameter and is multiplied by 1 + i; no other parameters change, and you should never divide by 1 + i. Therefore only 1 is correct. (A)

2.7. This is:

r = (1.06θ/(1.06θ + d))² / (θ/(θ + d))² = 1.06²(θ + d)²/(1.06θ + d)² → 1.06² = 1.1236    (C)

2.8. If X is normal, then aX + b is normal as well. In particular, 1.05X is normal. So the distribution of claims after 5% uniform inflation is normal. For any distribution, multiplying the distribution by a constant multiplies the mean and standard deviation by that same constant. Thus in this case, the new mean is 1050 and the new standard deviation is 105. (E)

2.9. Add ln 1.1 to µ: 17.953 + ln 1.1 = 18.048. σ does not change. (B)


2.10. The kth moment for an exponential is given in the tables: E[X^k] = k!θ^k. For k = 4 and mean θ = 10, this is 4!(10⁴) = 240,000.

2.11. While Y is Weibull, you don't need to know that. It's simpler to use Y² = X³ and look up the third moment of an exponential. E[X³] = 3!θ³ = 6(2³) = 48.

2.12. We calculate E[Y] and E[Y²], or E[X⁻¹] and E[X⁻²]. Note that the special formula in the tables for integral moments of a gamma, E[X^k] = θ^k(α + k − 1)···α, only applies when k is a positive integer, so it cannot be used for the −1 and −2 moments. Instead, we must use the general formula for moments given in the tables,

E[X^k] = θ^k Γ(α + k)/Γ(α)

For k = −1, this is

E[X⁻¹] = θ⁻¹ Γ(α − 1)/Γ(α) = 1/(θ(α − 1))

since Γ(α) = (α − 1)Γ(α − 1). For k = −2,

E[X⁻²] = 1/(θ²(α − 1)(α − 2))

Therefore,

Var(Y) = 1/(10²(1.5)(0.5)) − (1/(10(1.5)))² = 0.00888889
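For readers who want to confirm the negative moments numerically, a crude integration sketch (illustrative Python, not part of the solution):

```python
import math

alpha, theta = 2.5, 10.0

def pdf(x):
    # gamma density
    return x ** (alpha - 1) * math.exp(-x / theta) / (math.gamma(alpha) * theta ** alpha)

def moment(k, upper=600.0, steps=600_000):
    # midpoint-rule integration of x^k f(x)
    h = upper / steps
    return sum(((i + 0.5) * h) ** k * pdf((i + 0.5) * h) * h for i in range(steps))

m1, m2 = moment(-1), moment(-2)
print(m1, 1 / (theta * (alpha - 1)))                     # both about 1/15
print(m2, 1 / (theta ** 2 * (alpha - 1) * (alpha - 2)))  # both about 1/75
print(m2 - m1 ** 2)                                      # about 0.00888889
```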

Lesson 3

Variance

You learned everything in this lesson in your probability course. Nevertheless, many students miss a lot of these points. They are very important. There won't necessarily be any exam questions testing you directly on this material (although there could be). Rather, this material is background needed for the rest of the course.

3.1

Expected value is linear, meaning that E[aX + bY] = a E[X] + b E[Y], regardless of whether X and Y are independent or not. Thus E[(X + Y)²] = E[X²] + 2 E[XY] + E[Y²], for example. This means that E[(X + Y)²] is not equal to E[X²] + E[Y²], unless E[XY] = 0.

Also, it is not true in general that E[g(X)] = g(E[X]). So E[X²] ≠ E[X]².

Since variance can be expressed in terms of expected value as Var(X) = E[X²] − E[X]², this allows us to develop a formula for Var(aX + bY). If you work it out, you get

Var(aX + bY) = a² Var(X) + 2ab Cov(X, Y) + b² Var(Y)    (3.1)

In particular, if Cov(X, Y) = 0 (which is true if X and Y are independent), then Var(X + Y) = Var(X) + Var(Y), and generalizing to n independent variables,

Var(X₁ + X₂ + ··· + X_n) = Var(X₁) + Var(X₂) + ··· + Var(X_n)

If all the Xᵢ's are independent and have identical distributions, and we set X = Xᵢ for all i, then

Var(X₁ + X₂ + ··· + X_n) = n Var(X)

However, Var(nX) = n² Var(X), not n Var(X). You must distinguish between these two situations, which are quite different. Think of the following example. The stock market goes up or down randomly each day. We will assume that each day's change is independent of the previous day's, and has the same distribution. Compare the variance of the following possibilities:

1. You put $1 in the market, and leave it there for 10 days.
2. You put $10 in the market, and leave it there for 1 day.

In the first case, there are going to be potential ups and downs each day, and the variance of the change of your investment will be 10 times the variance of one day's change because of this averaging. In the second case, however, you are magnifying a single day's change by 10—there's no dampening of the change by


10 different independent random events; the change depends on a single random event. Therefore the variance is multiplied by 100.

In the more general case where the variables are not independent, you need to know the covariance. This can be provided in a covariance matrix. If you have n random variables X₁, ..., X_n, this n × n matrix A has a_ij = Cov(X_i, X_j) for i ≠ j. For i = j, a_ii = Var(X_i). This matrix is symmetric and positive semidefinite. However, the covariance of two random variables may be negative.

Example 3A For a loss X on an insurance policy, let X₁ be the loss amount and X₂ the loss adjustment expenses, so that X = X₁ + X₂. The covariance matrix for these random variables is

( 25  5 )
(  5  2 )

Calculate the variance in total cost of a loss including loss adjustment expenses.

Answer: In formula (3.1), a = b = 1, so 25 + 2(5) + 2 = 37.
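Formula (3.1) with a covariance matrix generalizes to Var(Σ aᵢXᵢ) = Σᵢ Σⱼ aᵢaⱼ Cov(Xᵢ, Xⱼ); a tiny sketch (illustrative Python) reproducing Example 3A:

```python
# variance of a weighted sum from a covariance matrix
cov = [[25.0, 5.0],
       [5.0, 2.0]]          # the matrix from Example 3A
a = [1.0, 1.0]              # X = X1 + X2
var = sum(a[i] * a[j] * cov[i][j]
          for i in range(len(a)) for j in range(len(a)))
print(var)    # 25 + 2(5) + 2 = 37
```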

A sample is a set of observations from n independent identically distributed random variables. The sample mean X̄ is the sum of the observations divided by n. The variance of the sample mean of X₁, ..., X_n, which are observations from the random variable X, is

Var(X̄) = Var((X₁ + ··· + X_n)/n) = n Var(X)/n² = Var(X)/n    (3.2)

3.2 Normal approximation

The Central Limit Theorem says that for any distribution with finite variance, the sample mean of a set of independent identically distributed random variables approaches a normal distribution. By the previous section, the mean of the sample mean of observations of X is E[X] and the variance is σ²/n. These parameters uniquely determine the normal distribution that the sample mean converges to.

A random variable Y with normal distribution with mean µ and variance σ² can be expressed in terms of a standard normal random variable Z in the following way: Y = µ + σZ, and you can look up the distribution of Z in a table of the standard normal distribution function that you get at the exam.

The normal approximation consists of calculating percentiles of a random variable by assuming that it has a normal distribution. Let Φ(x) be the cumulative distribution function of the standard normal distribution. (The standard normal distribution has µ = 0, σ = 1. Φ is the symbol generally used for this distribution function.) Suppose we are given that X is a normal random variable with mean µ, variance σ²; we will write X ∼ n(µ, σ²) to describe X. And suppose we want to calculate the 95th percentile of X; in other words, we want a number x such that Pr(X ≤ x) = 0.95. We would reason as follows:

Pr(X ≤ x) = 0.95
Pr((X − µ)/σ ≤ (x − µ)/σ) = 0.95
Φ((x − µ)/σ) = 0.95
(x − µ)/σ = Φ⁻¹(0.95)
x = µ + σΦ⁻¹(0.95)


Note that Φ⁻¹(0.95) = 1.645 is a commonly used percentile of the normal distribution, and is listed at the bottom of the table you get at the exam. You should internalize the above reasoning so you don't have to write it out each time. Namely, to calculate a percentile of a random variable being approximated normally, find the value of x such that Φ(x) is that percentile. Then scale x: multiply by the standard deviation, and then translate x: add the mean. This method will be used repeatedly throughout the course.

Example 3B A big fire destroyed a building in which 100 of your insureds live. Each insured has a fire insurance policy. The losses on this policy follow a Pareto distribution with α = 3, θ = 2000. Even though all the insureds live in the same building, the losses are independent. You are now setting up a reserve for the cost of these losses. Using the normal approximation, calculate the size of the reserve you should put up if you want to have a 95% probability of having enough money in the reserve to pay all the claims.

Answer: The mean of each loss is 2000/2 = 1000 and the variance is

E[X²] − E[X]² = 2(2000²)/2 − 1000² = 3,000,000

The mean of the sum is the sum of the means, or (100)(1000) = 100,000. The variance of the sum is the sum of the variances, or 100(3,000,000) = 3 × 10⁸. σ = √(3 × 10⁸) = 17,320.51. For a standard normal distribution, the 95th percentile is 1.645. We scale this by 17,320.51 and translate it by 100,000: 100,000 + 17,320.51(1.645) = 128,492.24.

The normal approximation is sometimes called the large sample estimate, since it is based on the Central Limit Theorem, which describes the behavior of the distribution of a sample as its size goes to infinity.
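The arithmetic of Example 3B can be scripted (illustrative Python, not part of the manual):

```python
import math

alpha, theta, n = 3.0, 2000.0, 100          # Example 3B assumptions
mean = theta / (alpha - 1)                                # 1000 per loss
second = 2 * theta ** 2 / ((alpha - 1) * (alpha - 2))     # Pareto second moment
variance = second - mean ** 2                             # 3,000,000
total_sd = math.sqrt(n * variance)
reserve = n * mean + 1.645 * total_sd       # 95th percentile, normal approximation
print(reserve)    # about 128,492.24
```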

Continuity correction

When a discrete distribution is approximated with the normal distribution, a continuity correction is required. If the discrete distribution can assume values a and b but cannot assume values in between a and b, and you want the probability of being strictly above a, you estimate it with the probability that the normal variable is greater than (a + b)/2. Use the complement of that if you want the probability of being less than or equal to a. The same goes for b; if you want the probability of being strictly less than b (notice that this is identical to the probability that the variable is less than or equal to a), use the probability that the normal variable is less than (a + b)/2. If you want the probability that the variable is greater than or equal to b (which is identical to the probability that the variable is strictly greater than a), use the probability that the normal random variable is greater than (a + b)/2.

Example 3C The distribution of loss sizes is

Size     Probability
1000     0.50
1500     0.25
2500     0.25

Calculate the probability that the average of 100 losses is less than 1550 using the normal approximation.

Answer: The mean is 1500. The variance is the second moment minus the mean squared:

σ² = 0.5(1000²) + 0.25(1500² + 2500²) − 1500² = 375,000


The variance of the sample mean is the variance of the distribution divided by 100, or 3750. Losses are always multiples of 500. Adding up 100 of them and dividing by 100, the average is a multiple of 5. Therefore, to calculate the probability of the average being less than 1550, we calculate the probability that the normal variable is less than 1547.5, the midpoint between the possible values for the mean, 1545 and 1550.

Pr(X̄ < 1547.5) = Φ((1547.5 − 1500)/√3750) = 0.7823
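A scripted version of Example 3C (illustrative Python); note the exact Φ value is closer to 0.781, while the manual's 0.7823 comes from rounding z to 0.78 before using the normal table:

```python
import math

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

sizes = [1000, 1500, 2500]
probs = [0.50, 0.25, 0.25]
mean = sum(s * p for s, p in zip(sizes, probs))                 # 1500
var = sum(s * s * p for s, p in zip(sizes, probs)) - mean ** 2  # 375,000
var_mean = var / 100                                            # 3,750
# the average is a multiple of 5, so "less than 1550" uses midpoint 1547.5
p = Phi((1547.5 - mean) / math.sqrt(var_mean))
print(p)    # about 0.781
```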

3.3 Bernoulli shortcut

A Bernoulli distribution is one where the random variable is either 0 or 1. It is 1 with probability q and 0 with probability 1 − q. Its mean is q, and its variance is q(1 − q). These are in your tables, but you'll use these so often you shouldn't have to look them up. However, any random variable which can only assume two values is a scaled and translated Bernoulli. If X is Bernoulli and Y can only assume the values a and b, with a having probability q, then Y = (a − b)X + b. This means that the variance of Y is (a − b)² Var(X) = (a − b)² q(1 − q). Remember this! To repeat, for any random variable which assumes only two values, the variance is the squared difference between the two values, times the product of the probabilities of the two values. This is faster than slogging through a calculation of E[X] and E[X²]. OK, so quickly—if a random variable is equal to 20 with probability 0.7 and 120 with probability 0.3, what's its variance? (Answer below¹) This shortcut will be used repeatedly throughout the course.

Example 3D For a one-year term life insurance policy of 1000:
(i) The premium is 30.
(ii) The probability of death during the year is 0.02.
(iii) The company has expenses of 2.
(iv) If the insured survives to the end of the year, the company pays a dividend of 3.
Ignore interest. Calculate the variance in the amount of profit the company makes on this policy.

Answer: There are only two possibilities—either the insured dies or he doesn't—so we have a Bernoulli here. We can ignore premium and expenses, since these don't vary, so they generate no variance. Either the company pays 1000 (probability 0.02) or it pays 3 (probability 0.98). The variance is therefore

(1000 − 3)²(0.02)(0.98) = 19,482.5764



A random variable which can only assume one of two values is called a two-point mixture. We will learn about mixtures in Section 4.1.
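The shortcut is a one-liner (illustrative Python); it reproduces both the in-text quiz and Example 3D:

```python
def two_point_var(a, b, q):
    # variance of a variable equal to a with probability q, else b:
    # (a - b)^2 q (1 - q)
    return (a - b) ** 2 * q * (1 - q)

print(two_point_var(20, 120, 0.7))      # 2100 (the in-text quiz)
print(two_point_var(1000, 3, 0.02))     # 19482.5764 (Example 3D)
```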



¹ (120 − 20)²(0.7)(0.3) = 2100



EXERCISES FOR LESSON 3


Exercises

3.1. [4B-S93:9] (1 point) If X and Y are independent random variables, which of the following statements are true?
1. Var(X + Y) = Var(X) + Var(Y)
2. Var(X − Y) = Var(X) + Var(Y)
3. Var(aX + bY) = a² E[X²] − a(E[X])² + b² E[Y²] − b(E[Y])²

(A) 1

(B) 1,2

(C) 1,3

(D) 2,3

(E) 1,2,3

3.2. [4B-F95:28] (2 points) Two numbers are drawn independently from a uniform distribution on [0,1]. What is the variance of their product? (A) 1/144

(B) 3/144

(C) 4/144

(D) 7/144

(E) 9/144

3.3. [4B-F99:7] (2 points) A player in a game may select one of two fair, six-sided dice. Die A has faces marked with 1, 2, 3, 4, 5 and 6. Die B has faces marked with 1, 1, 1, 6, 6, and 6. If the player selects Die A, the payoff is equal to the result of one roll of Die A. If the player selects Die B, the payoff is equal to the mean of the results of n rolls of Die B. The player would like the variance of the payoff to be as small as possible. Determine the smallest value of n for which the player should select Die B. (A) 1

(B) 2

(C) 3

(D) 4

(E) 5

3.4. [151-82-92:4] A company sells group travel-accident life insurance with b payable in the event of a covered individual's death in a travel accident. The gross premium for a group is set equal to the expected value plus the standard deviation of the group's aggregate claims. The standard premium is based on the following assumptions:
(i) All individual claims within the group are mutually independent; and
(ii) b²q(1 − q) = 2500, where q is the probability of death by travel accident for an individual.

In a certain group of 100 lives, the independence assumption fails because three specific individuals always travel together. If one dies in an accident, all three are assumed to die. Determine the difference between this group’s premium and the standard premium. (A) 0 3.5.

(B) 15

(C) 30

(D) 45

(E) 60

You are given the following information about the random variables X and Y:

(i) Var(X) = 9
(ii) Var(Y) = 4
(iii) Var(2X − Y) = 22

Determine the correlation coefficient of X and Y.

(A) 0

(B) 0.25

(C) 0.50

(D) 0.75

(E) 1

Exercises continue on the next page . . .

3. VARIANCE

56

3.6. [151-82-93:9] (1 point) For a health insurance policy, trended claims will be equal to the product of the claims random variable X and a trend random variable Y. You are given:

(i) E[X] = 10
(ii) Var(X) = 100
(iii) E[Y] = 1.20
(iv) Var(Y) = 0.01
(v) X and Y are independent

Determine the variance of trended claims.

(A) 144 (B) 145 (C) 146 (D) 147 (E) 148

3.7. X and Y are two independent exponentially distributed random variables. You are given that Var(X) = 25 and Var(XY) = 7500. Determine Var(Y).

(A) 25 (B) 50 (C) 100 (D) 200 (E) 300

Solutions

3.1. The first and second statements are true by formula (3.1). The third statement should have squares on the second a and second b, since, for example,

Var(aX) = E[(aX)²] − E[aX]² = a²E[X²] − a²E[X]²

(B)

3.2. The mean of the uniform distribution is 1/2 and the second moment is 1/3. So

Var(XY) = E[X²Y²] − E[X]²E[Y]² = (1/3)(1/3) − (1/4)(1/4) = 1/9 − 1/16 = 7/144

(D)

3.3. The variance of Die A is

(1/6)[(1 − 7/2)² + (2 − 7/2)² + (3 − 7/2)² + (4 − 7/2)² + (5 − 7/2)² + (6 − 7/2)²] = 35/12

Die B is Bernoulli, with only two possible values, 1 and 6, each with probability 1/2, so the variance of one toss is 5²(1/2)(1/2) = 25/4. The variance of the mean is the variance of one toss over n (equation (3.2)). So we need

25/(4n) < 35/12
140n > 300
n > 300/140 ≈ 2.14

The smallest such n is 3. (C)


3.4. The number of fatal accidents for each life, N, has a Bernoulli distribution with mean q and variance q(1 − q), so the variance in one life's aggregate claims is the variance of bN: Var(bN) = b²Var(N) = b²q(1 − q) = 2500. For 100 independent lives, the variance of aggregate claims is 100 Var(bN) = 100(2500). For three lives always traveling together, aggregate claims are 3bN with variance 3² Var(bN) = 9(2500). If we add this to the variance of aggregate claims for the other 97 independent lives, the variance is 9(2500) + 97(2500) = 106(2500). The expected value of aggregate claims, however, is no different from the expected value of the totally independent group's aggregate claims. The difference in premiums is therefore

√(106(2500)) − √(100(2500)) = 514.78 − 500 = 14.7815

(B)

3.5.

22 = Var(2X − Y) = 4(9) + 4 − 2(2) Cov(X, Y)
Cov(X, Y) = 4.5
ρ_XY = 4.5/(√9 √4) = 0.75

(D)

3.6.

E[XY] = (10)(1.20) = 12
E[(XY)²] = E[X²]E[Y²] = (10² + 100)(1.20² + 0.01) = 290
Var(XY) = 290 − 12² = 146

(C)

3.7. For an exponential variable, the variance is the square of the mean. Let θ be the parameter for Y.

Var(XY) = E[X²]E[Y²] − E[X]²E[Y]²
7500 = (25 + 25)(2θ²) − 25θ² = 75θ²
θ = 10
Var(Y) = θ² = 100

(C)


Lesson 4

Mixtures and Splices

Reading: Loss Models Fourth Edition 5.2.4–5.2.6, and Loss Models Fourth Edition 18.2 or SN C-21-01 2.4 and SN C-24-05 Appendix B or Introduction to Credibility Theory 5.2

4.1 Mixtures

4.1.1 Discrete mixtures

A (finite) mixture distribution is a random variable X whose distribution function can be expressed as a weighted average of n distribution functions of random variables X_i, i = 1, . . . , n. In other words,

F_X(x) = Σ_{i=1}^n w_i F_{X_i}(x)

with the weights w_i ≥ 0 adding up to 1. Since the density function is the derivative of the distribution function, the density function is the same weighted average of the individual density functions:

f_X(x) = Σ_{i=1}^n w_i f_{X_i}(x)

If discrete variables are mixed, the probabilities of the mixture are the weighted averages of the component probabilities. For example, suppose X is a mixture of an exponential distribution with mean 100 and weight 60% and an exponential distribution with mean 200 and weight 40%. Then the probability that X ≤ 100 is

Pr(X ≤ 100) = 0.6(1 − e^{−100/100}) + 0.4(1 − e^{−100/200}) = 0.6(0.6321) + 0.4(0.3935) = 0.5367

A mixture is not the same as a sum of random variables! The distribution function for a sum of random variables—even when they are identically distributed—is usually difficult to calculate. It is important not to confuse the two. Let's consider the situations where each would be appropriate.

A sum of random variables is an appropriate model for a situation where several distinct events occur, and you are interested in the sum. Each event may have the same distribution, or may not. Examples of sums of random variables are:

1. The total of a random sample of n items is a sum of random variables. For a random sample, the items are independent and identically distributed.

2. Aggregate loss on a policy with multiple coverages is a sum of random variables, one for each coverage. If a homeowner's policy has coverage for fire, windstorm, and theft, the aggregate loss for each of these three coverages could have its own distribution X_i, and then the aggregate loss for the entire policy would be X₁ + X₂ + X₃. These distributions would be different, and may or may not be independent.

C/4 Study Manual—17th edition Copyright ©2014 ASM
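The weighted-average computation above is easy to verify numerically. A minimal sketch (the function name is illustrative; weights and means are taken from the example):

```python
import math

# CDF of a finite mixture of exponentials: a weighted average of the
# component CDFs. Weights 0.6/0.4 and means 100/200 are from the text.
def mixture_cdf(x, weights, means):
    return sum(w * (1 - math.exp(-x / m)) for w, m in zip(weights, means))

print(round(mixture_cdf(100, [0.6, 0.4], [100, 200]), 4))  # 0.5367
```

The same pattern applies to any finite mixture: mix the component CDFs (or densities), never the random variables themselves.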


A mixture distribution is an appropriate model for a situation where a single event occurs. However, the single event may be of many different types, and the type is random. For example, let X be the cost of a dental claim. This is a single claim and X has a distribution function. However, this claim could be for preventative work (cleaning and scaling), basic services (fillings), or major services (crowns). Each type of work has a distribution X_i. If 40% of the claims are for preventative work, 35% for basic services, and 25% for major services, then the distribution of X will be a weighted average of the distributions of the 3 X_i's:

F_X(x) = 0.4F_{X₁}(x) + 0.35F_{X₂}(x) + 0.25F_{X₃}(x)

It is not true that X = 0.4X₁ + 0.35X₂ + 0.25X₃. In fact, the type of work that occurred is random; it is not true that every claim is 40% preventative, 35% basic, and 25% major. If that were true, there would be less variance in claim size!

Since a mixture is a single random variable, it can be used as a model even when there is no justification as given in the last paragraph, if it fits the data well.

For calculating means, the mean of a mixture is the weighted average of the means of the components. Since the densities are weighted averages and the expected values are integrals of densities, the expected value of a mixture is the weighted average of the expected values of the components. This is true for any raw moment, not just the first moment. But this immediately implies that the variance of a mixture is not the weighted average of the variances of its components. You must compute the variance by computing the second moment and then subtracting the square of the mean.

Example 4A Losses on an auto liability coverage follow a distribution that is a mixture of two Paretos. Each distribution in the mixture has equal weight. One distribution has parameters α = 3 and θ = 1000, and the other has parameters α = 3 and θ = 10,000. Calculate the variance of a loss.
Answer: Let X be loss size. We have

E[X] = 0.5(1000/2) + 0.5(10,000/2) = 2750
E[X²] = 0.5(1000²) + 0.5(10,000²) = 50,500,000
Var(X) = E[X²] − E[X]² = 50,500,000 − 2750² = 42,937,500
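Since raw moments of a mixture are weighted averages of component raw moments, the calculation can be sketched in a few lines. The Pareto moment formulas are the standard ones from the distribution tables; the variable names are illustrative:

```python
# Raw moments of a two-parameter Pareto: E[X] = theta/(alpha - 1),
# E[X^2] = 2*theta^2/((alpha - 1)*(alpha - 2)). A mixture's raw moments
# are the weighted averages of the component raw moments.
def pareto_m1(alpha, theta):
    return theta / (alpha - 1)

def pareto_m2(alpha, theta):
    return 2 * theta**2 / ((alpha - 1) * (alpha - 2))

components = [(0.5, 3, 1000), (0.5, 3, 10_000)]   # (weight, alpha, theta), Example 4A
m1 = sum(w * pareto_m1(a, t) for w, a, t in components)
m2 = sum(w * pareto_m2(a, t) for w, a, t in components)
print(m1, m2 - m1**2)  # 2750.0 42937500.0
```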

Example 4B The severity distribution for losses on an auto collision coverage is as follows:

F(x) = 1 − 0.7(2000/(2000 + x))³ − 0.3(7500/(7500 + x))⁴,  x ≥ 0

Calculate the coefficient of variation of loss size.

Answer: This distribution is a mixture of two Pareto distributions, the first with parameters α = 3, θ = 2000 and the second with α = 4, θ = 7500. We calculate the first two moments:

E[X] = 0.7(2000/2) + 0.3(7500/3) = 1450
E[X²] = 0.7(2(2000²)/((2)(1))) + 0.3(2(7500²)/((3)(2))) = 8,425,000
Var(X) = E[X²] − E[X]² = 8,425,000 − 1450² = 6,322,500

It then follows that the coefficient of variation is

CV = √6,322,500 / 1450 = 1.7341


Example 4C On an auto collision coverage, there are two classes of policyholders, A and B. 70% of drivers are in class A and 30% in class B. The means and variances of losses for the drivers are:

Class   Mean   Variance
A       300    30,000
B       800    50,000

A claim is submitted by a randomly selected driver. Calculate the variance of the size of the claim.

Answer: This is a mixture situation—a single claim, with probabilities of being one type or another. Let X be claim size.

E[X] = 0.7(300) + 0.3(800) = 450
E[X²] = 0.7(30,000 + 300²) + 0.3(50,000 + 800²) = 291,000
Var(X) = 291,000 − 450² = 88,500
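The same two-step pattern (mix the second moments, then subtract the squared mixed mean) can be sketched as:

```python
# (weight, mean, variance) per class, from Example 4C. Each class's second
# moment is variance + mean^2; the mixture mixes raw moments, not variances.
classes = [(0.7, 300, 30_000), (0.3, 800, 50_000)]
m1 = sum(w * mu for w, mu, v in classes)
m2 = sum(w * (v + mu * mu) for w, mu, v in classes)
print(m1, m2 - m1 * m1)  # 450.0 88500.0
```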

4.1.2 Continuous mixtures

So far we've discussed discrete mixtures. It is also possible for mixtures to be continuous. This means that the distribution function of the mixture is an integral of parametric distribution functions of random variables, and a parameter varies according to a distribution function. The latter distribution is called a mixing distribution. One example which we'll discuss in detail in Lesson 12 is a random variable based on a Poisson distribution with parameter λ, where λ varies according to a gamma distribution with parameters α and θ. Here is another example.

Example 4D The number of losses on a homeowner's policy is binomially distributed with parameters m = 5 and q. The parameter q varies by policyholder uniformly between 0 and 0.4. Calculate the probability of 2 or more losses for a policyholder.

Answer: For a single policyholder,

Pr(X = 0 | q) = (1 − q)⁵
Pr(X = 1 | q) = 5q(1 − q)⁴

To calculate the probability for a randomly selected policyholder, we integrate over q using the uniform density function, which here is 1/0.4, as the weight.

Pr(X = 0) = (1/0.4) ∫₀^0.4 (1 − q)⁵ dq = (1/2.4)(1 − 0.6⁶) = (5/12)(1 − 0.6⁶)

Pr(X = 1) = (5/0.4) ∫₀^0.4 q(1 − q)⁴ dq

This is easier to integrate by substituting u = 1 − q:

Pr(X = 1) = 12.5 ∫_{0.6}^1 (1 − u)u⁴ du
          = 12.5 [u⁵/5 − u⁶/6]_{0.6}^1
          = 12.5 ((1 − 0.6⁵)/5 − (1 − 0.6⁶)/6)
          = 2.5(1 − 0.6⁵) − (25/12)(1 − 0.6⁶)

Pr(X = 0) + Pr(X = 1) = 2.5(1 − 0.6⁵) − (20/12)(1 − 0.6⁶) = 2.3056 − 1.5889 = 0.7167

Therefore, the probability of 2 or more losses is the complement of 0.7167, which is 1 − 0.7167 = 0.2833.
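The integral above can be checked numerically. A midpoint-rule sketch (the grid size is arbitrary):

```python
# Pr(at least 2 losses) for X ~ binomial(5, q) with q ~ Uniform(0, 0.4),
# by integrating Pr(X <= 1 | q) against the uniform density (midpoint rule).
n = 100_000
acc = 0.0
for i in range(n):
    q = 0.4 * (i + 0.5) / n          # midpoint of the i-th subinterval of [0, 0.4]
    acc += ((1 - q) ** 5 + 5 * q * (1 - q) ** 4) / n
print(round(1 - acc, 4))  # 0.2833
```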

4.1.3 Frailty models

A special type of continuous mixture is a frailty model. These models can be used to model loss sizes or survival times. However, the following discusses frailty models only in the context of survival times.

Suppose the hazard rate for each individual is h(x | Λ) = Λa(x), where a(x) is some continuous function and the multiplier Λ varies by individual. Thus the shape of the hazard rate function curve does not vary by individual. If you are given that A's hazard rate is twice B's at time 1, that implies Λ for A is twice Λ for B. That in turn implies that A's hazard rate is twice B's hazard rate at all times.

Assume that h(x) = 0 for x < 0. Recall from page 4 that H(x) = ∫₀ˣ h(t) dt, and that the survival function can be expressed as S(x) = e^{−H(x)}. Now let A(x) = ∫₀ˣ a(t) dt. Then H(x | Λ) = ∫₀ˣ Λa(t) dt = ΛA(x) and

S(x | Λ) = e^{−H(x|Λ)} = e^{−ΛA(x)}    (∗)

By definition, S(x) = Pr(X > x), so by the Law of Total Probability (page 9)

S(x) = Pr(X > x) = ∫₀^∞ Pr(X > x | λ) f(λ) dλ = E_Λ[Pr(X > x | Λ)] = E[S(x | Λ)]

Plugging in S(x | Λ) from (∗), the unconditional or marginal survival rate S(x) is

S_X(x) = E_Λ[S(x | Λ)] = E_Λ[e^{−ΛA(x)}] = M_Λ(−A(x))    (4.1)

where M(t) is the moment generating function.

In a frailty model, typical choices for the conditional hazard rate given Λ are:

• Constant hazard rate, or exponential. This can be arranged by setting a(x) = 1 (or a(x) = k for any constant k).
• Weibull, which can be arranged by setting a(x) = γx^{γ−1}.

Typical choices for the distribution of Λ are gamma and inverse Gaussian, the only distributions (other than exponential, which is a special case of gamma) for which the distribution tables list the moment generating function.

Frailty models rarely appear on exams. If they do appear, I would not expect it to be labeled as a "frailty model", nor would I expect the specific notation (such as a(x)) to be used. Instead, you would be given an appropriate hazard rate conditional on a parameter and a distribution for the parameter.


Example 4E For a population following a frailty model, you are given:

(i) a(x) = 1
(ii) Λ has a gamma distribution with α = 0.2 and θ = 0.1.

For a randomly selected individual from the population:

1. Calculate the probability of surviving to 70.
2. Calculate mean survival time and the variance of survival time.

Answer:

1. Use of a(x) = 1 leads to an exponential model. We have

A(x) = ∫₀ˣ 1 dt = x
S(x | Λ) = e^{−ΛA(x)} = e^{−Λx}

The moment generating function for a gamma (which appears in the distribution tables) is M(t) = (1 − θt)^{−α}. In this example, M_Λ(t) = (1 − 0.1t)^{−0.2}. Then

S(70) = M_Λ(−70) = (1 − 0.1(−70))^{−0.2} = 8^{−0.2} = 0.659754

2. By equation (4.1), the survival function is

S(x) = M(−x) = (1/(1 + 0.1x))^{0.2} = (10/(10 + x))^{0.2}

which we recognize as a two-parameter Pareto with α = 0.2, θ = 10. For such a distribution, all the moments are infinite, so in particular the mean and variance are infinite.

If a gamma Λ is used in conjunction with a Weibull a(x) (instead of an exponential, as used in the previous example), then the model has a Burr (instead of a Pareto) distribution.
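Equation (4.1) with the gamma MGF gives the closed form used above; a minimal numerical sketch (parameters from Example 4E):

```python
# Frailty model of Example 4E: a(x) = 1, Lambda ~ gamma(alpha=0.2, theta=0.1),
# so S(x) = M_Lambda(-x) = (1 + 0.1x)^(-0.2), a Pareto survival function.
def survival(x, alpha=0.2, theta=0.1):
    return (1 + theta * x) ** (-alpha)

print(round(survival(70), 6))  # 0.659754
```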

4.2 Conditional Variance

In Section 1.3, we discussed the conditional mean formula. A useful formula for conditional variance can be developed by calculating the second moment and subtracting the first moment squared using that formula:

Var(X) = E[X²] − E[X]²
       = E[E[X² | I]] − E[E[X | I]]²
       = E[E[X² | I] − E[X | I]²] + E[E[X | I]²] − E[E[X | I]]²
       = E[Var(X | I)] + Var(E[X | I])

We've derived the conditional variance formula.

Conditional Variance Formula

Var_X(X) = Var_I(E_X[X | I]) + E_I[Var_X(X | I)]    (4.2)

This formula is also known as a double expectation formula. This is a very important equation which will be used repeatedly throughout the course. It is especially useful for calculating variance of mixtures. In a mixture, the condition I will indicate the component of the mixture. Let's apply conditional variance to Example 4C.

Example 4C (Repeated for convenience) On an auto collision coverage, there are two classes of policyholders, A and B. 70% of drivers are in class A and 30% in class B. The means and variances of losses for the drivers are:

Class   Mean   Variance
A       300    30,000
B       800    50,000

A claim is submitted by a randomly selected driver. Calculate the variance of the size of the claim.

Answer: Let X be claim size. Let the indicator variable I be the class. It has a probability of 0.7 of being class A and 0.3 of being class B. I is Bernoulli, so we'll apply the previous section's shortcut.

E[Var(X | I)] = 0.7(30,000) + 0.3(50,000) = 36,000
Var(E[X | I]) = (0.7)(0.3)(800 − 300)² = 52,500
Var(X) = 36,000 + 52,500 = 88,500

Example 4F Claim sizes range between 0 and 1500. The probability that a claim is no greater than 500 is 0.8. Claim sizes are uniformly distributed on (0, 500] and on (500, 1500]. Calculate the coefficient of variation of claim sizes.

Answer: Let X be claim size. We will condition on the claim size being greater than or less than 500. Let I be 0 if claim size is in (0, 500] and 1 if claim size is in (500, 1500]. The mean claim size given that it is no greater than 500 is 250, since the mean of a uniform is the midrange. Similarly, mean claim size given that it is greater than 500 is 1000. By the conditional mean formula,

E[X] = E[E[X | I]] = 0.8(250) + 0.2(1000) = 400

For a uniform distribution, the variance is the range squared divided by 12. So the variance of claim size given that it is no greater than 500 is 500²/12 and the variance of claim size given that it is greater than 500 is 1000²/12. Therefore

E[Var(X | I)] = 0.8(500²/12) + 0.2(1000²/12) = 33,333⅓

The variance of the expected values can be calculated with the Bernoulli shortcut.

Var(E[X | I]) = (0.8)(0.2)(1000 − 250)² = 90,000

We conclude that Var(X) = 33,333⅓ + 90,000 = 123,333⅓, and the coefficient of variation is

√123,333⅓ / 400 = 0.8780

You could also calculate the variance as the second moment minus the first moment squared. The second moment given that claims are in (0, 500] is 500²/3, and the second moment given that claims are in (500, 1500] is the variance of the uniform plus the mean squared, or 1000²/12 + 1000². Therefore

E[X²] = 0.8(500²/3) + 0.2(1000²/12 + 1000²) = 283,333⅓

The variance is 283,333⅓ − 400² = 123,333⅓, the same as calculated above.
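Both routes fit in a few lines of code; a sketch using the conditional variance decomposition (the tuple layout is illustrative):

```python
# (weight, conditional mean, conditional variance) for the two uniform pieces.
pieces = [(0.8, 250, 500**2 / 12), (0.2, 1000, 1000**2 / 12)]
mean = sum(w * m for w, m, v in pieces)
e_var = sum(w * v for w, m, v in pieces)                 # E[Var(X | I)]
var_e = sum(w * (m - mean) ** 2 for w, m, v in pieces)   # Var(E[X | I])
cv = (e_var + var_e) ** 0.5 / mean
print(mean, round(cv, 4))  # 400.0 0.878
```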


Let’s do an example where I is continuous.

Example 4G (Same situation as Example 4D.) The number of losses on a homeowner's policy is binomially distributed with parameters m = 5 and Q. Q varies by policyholder uniformly between 0 and 0.4. Calculate the variance of the number of losses for a randomly selected policyholder.

Answer: The indicator variable here is Q. Using the Loss Models appendix tables, given Q, the expected number of losses is 5Q and the variance is 5Q(1 − Q). Then Var(5Q) = 25 Var(Q), and E[5Q(1 − Q)] = 5E[Q] − 5E[Q²]. For a random variable U having a uniform distribution on [0, 1], the mean is 1/2, the second moment is 1/3, and the variance is 1/12. Since Q = 0.4U,

E[Q] = E[0.4U] = 0.4(1/2) = 0.2
E[Q²] = E[(0.4U)²] = 0.16(1/3) = 16/300
Var(Q) = Var(0.4U) = 0.16/12 = 4/300

So, letting N be the number of losses random variable,

E[Var(N | Q)] = 5E[Q] − 5E[Q²] = 1 − 80/300 = 11/15
Var(E[N | Q]) = 25(4/300) = 1/3
Var(N) = 11/15 + 1/3 = 16/15
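The pieces of the decomposition can be verified with exact rational arithmetic:

```python
from fractions import Fraction

# Q = 0.4U with U ~ Uniform(0,1): E[Q] = 1/5, E[Q^2] = (4/25)(1/3),
# Var(Q) = (4/25)(1/12), as in Example 4G.
eq = Fraction(1, 5)
eq2 = Fraction(4, 25) * Fraction(1, 3)
var_q = Fraction(4, 25) * Fraction(1, 12)
e_var = 5 * eq - 5 * eq2    # E[Var(N | Q)] = E[5Q(1 - Q)]
var_e = 25 * var_q          # Var(E[N | Q]) = Var(5Q)
print(e_var + var_e)        # 16/15
```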

Quiz 4-1 The random variable S is defined by

S = Σ_{i=1}^{50} X_i

The variables X_i follow a Poisson distribution with mean θ, and are independent given θ. The parameter θ is uniformly distributed on [1, 3]. Calculate the variance of S.


4.3 Splices

Another way of creating distributions is by splicing them. This means using different probability distributions on different intervals in such a way that the total probability adds up to 1.

For example, suppose larger loss sizes appear "Pareto"-ish, but smaller loss sizes are pretty uniform. You would like to build a model with a level density function for losses below 100, and then a declining density function starting at 100 which looks like a Pareto. Think a little bit about how you could do this.

You are going to use a distribution with a constant density function below 100, and a distribution function which looks like a Pareto above 100. By "looking like a Pareto", we mean it'll be of the form

f(x) = b₂ αθ^α / (θ + x)^{α+1}

where b₂ is a constant that will make things work out right. You may have decided on what α and θ should be. Let's say α = 2 and θ = 500. Then the spliced distribution you will use has the form

f(x) = b₁                          x < 100
f(x) = b₂ (2)(500²)/(500 + x)³     x > 100

How would you pick b₁ and b₂? One thing is absolutely necessary: the total probability, the probability of being less than 100 plus the probability of being greater than 100, must add up to 1.

For a spliced distribution, the sum of the probabilities of being in each splice must add up to 1.

In our example, let's say that one third of all losses are below 100. Then you would want F(100) = 1/3. You would set b₁ = 1/300 so that ∫₀¹⁰⁰ b₁ dx = 1/3. Without b₂, the Pareto distribution would have

1 − F(100) = (500/(500 + 100))² = 25/36

However, you want 1 − F(100) = 1 − 1/3 = 2/3. Therefore, you would scale the Pareto distribution down, setting b₂ = (2/3)/(25/36) = 0.96. The spliced distribution has density function and distribution function

f(x) = 1/300                         x < 100
f(x) = 0.96 (2)(500²)/(500 + x)³     x > 100

F(x) = x/300                         x < 100
F(x) = 1 − 0.96 (500/(500 + x))²     x > 100

These are graphed in Figure 4.1. Notice that the density function is not continuous. Continuity of the density function is not a requirement of splicing. It may, however, be desirable. In the above example, suppose that we want the density function to be continuous. To allow this to happen, we will not specify the percentage of losses below 100, but will select it to make the density continuous. Then we would have to equate f(100) for the uniform distribution, which is b₁, to f(100) for the Pareto distribution, which is 2b₂(500²)/600³. The other condition on b₁ and b₂ was developed above: since 100b₁ is the probability of being below 100, we have

100b₁ = 1 − (25/36)b₂

Figure 4.1: Spliced distribution with 1/3 weight below 100. (a) Density function; (b) Distribution function.

Substituting the density equality for b₁ and solving, we get

100(2b₂(500²)/600³) = 1 − (25/36)b₂
(200/600)(25/36)b₂ = 1 − (25/36)b₂
(25/36)(4/3)b₂ = 1
b₂ = 1.08

It follows that b₁ = 2(1.08)(500²/600³) = 1/400 and F(100) = 1/4. The density and distribution functions are

f(x) = 1/400                         x < 100
f(x) = 1.08 (2)(500²)/(500 + x)³     x > 100

F(x) = x/400                         x < 100
F(x) = 1 − 1.08 (500/(500 + x))²     x > 100

These are graphed in Figure 4.2.

Figure 4.2: Continuous spliced distribution. (a) Density function; (b) Distribution function.
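The two constraints on the continuous splice (total probability 1, density continuity at 100) can be checked directly; a sketch with the fitted constants:

```python
# Continuous splice: f(x) = 1/400 on (0, 100), and 1.08 times a
# Pareto(alpha=2, theta=500) density above 100.
b1, b2 = 1 / 400, 1.08

def f(x):
    return b1 if x < 100 else b2 * 2 * 500**2 / (500 + x) ** 3

below = 100 * b1                # Pr(X < 100) = 1/4
above = b2 * (500 / 600) ** 2   # Pr(X > 100) = 1.08 * 25/36 = 3/4
print(abs(f(100) - b1) < 1e-12, round(below + above, 12))  # True 1.0
```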

Quiz 4-2 The distribution of X is a spliced distribution. You are given:

(i) Pr(X ≤ 0) = 0.
(ii) For 0 < x ≤ 100, the distribution of X is uniform.
(iii) F(100) = 1/3.
(iv) For x > 100, the density function of X is

f(x) = θ/(θ + x)²

Determine θ.

The textbook gives a formal definition of a spliced distribution as one whose density function is a weighted sum of density functions, with density function j having support (that is, nonzero) only on interval (c_{j−1}, c_j), with the intervals (c_{j−1}, c_j) disjoint, and weights a_j adding up to 1. Thus, in our first example with a uniform distribution below 100 and Pr(X < 100) = 1/3, and a Pareto distribution above 100, the textbook would say that the interval (0, 100) had a uniform distribution with density f₁(x) = 1/100 and weight a₁ = 1/3, and a distribution defined by

f₂(x) = [2(500²)/(500 + x)³] / (25/36)    x > 100

with weight a₂ = 2/3 on (100, ∞). This means that every splice is a discrete mixture! It's a mixture of functions defined on disjoint intervals. If the functions are familiar, you may be able to use the tables or your knowledge of the functions to evaluate moments.


Example 4H X follows a spliced distribution with the following density and distribution functions:

f(x) = 1/400                         x < 100
f(x) = 1.08 (2)(500²)/(500 + x)³     x > 100

F(x) = x/400                         x < 100
F(x) = 1 − 1.08 (500/(500 + x))²     x > 100

Calculate the mean of X.

Answer: X can be considered a mixture of a uniform distribution on [0, 100] with weight 1/4 and a shifted Pareto distribution with weight 3/4. The shifted Pareto distribution is shifted by 100, so set y = x − 100. Since S(x) = 1.08(500/(500 + x))² for x > 100, the conditional survival function is this survival function divided by 3/4.

S(x | X > 100) = (4/3)(1.08)(500/(500 + x))² = 1.44(5/6)²(600/(600 + (x − 100)))² = (600/(600 + y))²

which is a Pareto with θ = 600, α = 2. The mean of the shifted Pareto is 100 plus the mean of the unshifted Pareto, so

E[X] = 0.25(50) + 0.75(100 + 600/(2 − 1)) = 537.5

Notice that Example 4F is a splice of two uniform distributions.
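The mean of the splice is just a weighted average of the two conditional means; as a quick sketch:

```python
# Splice mean: uniform on [0, 100] (weight 1/4, conditional mean 50) plus a
# Pareto shifted by 100 (weight 3/4; unshifted Pareto has alpha=2, theta=600,
# mean theta/(alpha - 1) = 600).
mean = 0.25 * 50 + 0.75 * (100 + 600 / (2 - 1))
print(mean)  # 537.5
```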

Exercises

Mixtures

4.1. The random variable X is a mixture of an exponential distribution with mean 5, with weight 2/3, and an inverse exponential distribution with θ = 5, with weight 1/3. Let H(x) be the cumulative hazard function of X.

Calculate H(3).



4.2. You are given the following information about a portfolio of insurance risks:

• There are three classes of risks: A, B, and C.
• The number of risks in each class, and the mean and standard deviation of claim frequency for each class, are given in the following chart:

                          Claim Frequency
Class   Number of Risks   Mean   Standard Deviation
A       500               0.10   0.20
B       300               0.12   0.25
C       200               0.15   0.35

Determine the standard deviation of claim frequency for a risk randomly selected from the portfolio.

(A) Less than 0.240
(B) At least 0.240, but less than 0.244
(C) At least 0.244, but less than 0.248
(D) At least 0.248, but less than 0.252
(E) At least 0.252

Use the following information for questions 4.3 and 4.4:

For a group of 1000 policyholders in three classes, you are given:

Number of policyholders   Mean loss   Standard deviation of loss
500                       10          12
300                       20          30
200                       30          60

4.3. The number of claims submitted by each policyholder is identically distributed for all policyholders. 1000 claims are submitted from this group. Using the normal approximation, calculate x such that there is a 95% probability that the sum of the claims is less than x.

4.4. Each policyholder submits one claim. Using the normal approximation, calculate x such that there is a 95% probability that the sum of the claims is less than x.


4.5. You are given a portfolio of 100 risks in two classes, A and B, each having 50 risks. The losses of the risks in class A have a mean of 10 and a standard deviation of 5. For the entire portfolio, the mean loss is 20 and the standard deviation is 15.

Calculate the standard deviation of losses for risks in class B.

(A) Less than 9
(B) At least 9, but less than 13
(C) At least 13, but less than 17
(D) At least 17, but less than 21
(E) At least 21

4.6. Losses for an insurance coverage follow a distribution which is a mixture of an exponential distribution with mean 10 with 75% weight and an exponential distribution with mean 100 with 25% weight.

Calculate the probability that a loss is greater than 50.

4.7. Losses for an insurance coverage follow a distribution which is a mixture of an exponential distribution with mean 5 and an exponential distribution with mean θ. The mean loss size is 7.5. The variance of loss size is 75.

Determine the coefficient of skewness of the loss distribution.

4.8. For a liability coverage, you are given:

(i) Losses for each insured follow an exponential distribution with mean γ.
(ii) γ varies by insured.
(iii) γ follows a single-parameter Pareto distribution with parameters α = 1 and θ = 1000.

Calculate the probability that a loss will be less than 500.

4.9. [151-82-93:11] (2 points) A population is equally divided into two classes of drivers. The number of accidents per individual driver is Poisson for all drivers. For a driver selected at random from Class 1, the expected number of accidents is uniformly distributed over (0.2, 1.0). For a driver selected at random from Class 2, the expected number of accidents is uniformly distributed over (0.4, 2.0).

For a driver selected at random from this population, determine the probability of zero accidents.

(A) 0.41 (B) 0.42 (C) 0.43 (D) 0.44 (E) 0.45

Frailty models

4.10. For a random variable X, you are given:

(i) h(x | Λ) = 2Λx
(ii) Λ has an exponential distribution with mean 0.5.

Calculate E[X² | Λ = 0.49].

4. MIXTURES AND SPLICES

72

4.11. (i) (ii)

For a random variable X, you are given: h ( x | Λ)  Λ Λ follows an inverse Gaussian distribution with µ  2, θ  1.

Calculate F (0.5) . 4.12. (i) (ii)

Survival time X for a population follows a distribution having the following properties: h ( x | Λ)  Λx 2 Λ follows an exponential distribution with mean 0.05.

Calculate median survival time for the population. 4.13. Survival time X for a population of 100-year olds follows a Weibull distribution with the following hazard rate function: h ( x | Λ)  31 Λx −2/3 Λ varies over the population with a gamma distribution having parameters α  5, θ  0.5. Calculate the marginal expected future lifetime for the population. Conditional variance

4.14.

The size of a loss has mean 3λ and variance λ 2 . λ has the following density function: f ( x )  0.00125



4000 x

6

x ≥ 4000.

Calculate the variance of the loss. (A) (B) (C) (D) (E)

Less than 22,000,000 At least 22,000,000, but less than 27,000,000 At least 27,000,000, but less than 32,000,000 At least 32,000,000, but less than 37,000,000 At least 37,000,000

Exercises continue on the next page . . .

EXERCISES FOR LESSON 4

73

[4B-S95:14] (3 points) You are given the following:

4.15. •

For a given risk, the number of claims for a single exposure period will be 1, with probability 3/4; or 2, with probability 1/4.

If only one claim is incurred, the size of the claim will be 80, with probability 2/3; or 160, with probability 1/3.

If two claims are incurred, the size of each claim, independent of the other, will be 80, with probability 1/2; or 160 with probability 1/2. Determine the variance of the pure premium1 for this risk.

(A) (B) (C) (D) (E)

Less than 3600 At least 3600, but less than 4300 At least 4300, but less than 5000 At least 5000, but less than 5700 At least 5700 [4B-F98:8] (2 points) You are given the following:

4.16. •

A portfolio consists of 75 liability risks and 25 property risks.

The risks have identical claim count distributions.

Loss sizes for liability risks follow a Pareto distribution with parameters θ  300 and α  4.

Loss sizes for property risks follow a Pareto distribution with parameters θ  1,000 and α  3. Determine the variance of the claim size distribution for this portfolio for a single claim.

(A) (B) (C) (D) (E)

Less than 150,000 At least 150,000, but less than 225,000 At least 225,000, but less than 300,000 At least 300,000, but less than 375,000 At least 375,000

4.17. The number of claims on a policy has a Poisson distribution with mean P. P varies by policyholder. P is uniformly distributed on [1, 2]. Calculate the variance of the number of claims. (A) 3/2

(B) 19/12

(C) 5/3

(D) 7/4

(E) 23/12

4.18. Claim size is exponentially distributed with mean λ. λ varies by insured, and follows a Pareto distribution with parameters α  5 and θ. Variance of claim size is 9.75. Determine θ. (A) 4

(B) 5

(C) 6

(D) 7

(E) 8

Exercises continue on the next page . . .

4. MIXTURES AND SPLICES

74

[4B-F92:23] (2 points) You are given the following:

4.19. •

A portfolio of risks consists of 2 classes, A and B.

For an individual risk in either class, the number of claims follows a Poisson distribution.

Class A B Total Portfolio

Number of Exposures 500 500 1,000

Distribution of Claim Frequency Standard Mean Deviation 0.050 0.227 0.210 0.561

Determine the standard deviation of the claim frequency for the total portfolio. (A) (B) (C) (D) (E)

Less than 0.390 At least 0.390, but less than 0.410 At least 0.410, but less than 0.430 At least 0.430, but less than 0.450 At least 0.450

4.20. [1999 C3 Sample:10] An insurance company is negotiating to settle a liability claim. If a settlement is not reached, the claim will be decided in the courts 3 years from now. You are given: •

There is a 50% probability that the courts will require the insurance company to make a payment. The amount of the payment, if there is one, has a lognormal distribution with mean 10 and standard deviation 20.

In either case, if the claim is not settled now, the insurance company will have to pay 5 in legal expenses, which will be paid when the claim is decided, 3 years from now.

The most that the insurance company is willing to pay to settle the claim is the expected present value of the claim and legal expenses plus 0.02 times the variance of the present value.

Present values are calculated using i  0.04. Calculate the insurance company’s maximum settlement value for this claim.

(A) 8.89 4.21. (i) (ii) (iii) (iv) (v)

(B) 9.93

(C) 12.45

(D) 12.89

(E) 13.53

4.21. [151-83-94:6] (2 points) For number of claims N and aggregate claims S, you are given:

(i) Pr(N = i) = 1/3, i = 0, 1, 2
(ii) E[S | N = 1] = 3
(iii) E[S | N = 2] = 9
(iv) Var(S | N = 1) = 9
(v) Var(S | N = 2) = 18

Determine Var(S).

(A) 19  (B) 21  (C) 23  (D) 25  (E) 27


EXERCISES FOR LESSON 4


Splices

4.22. Loss sizes follow a spliced distribution. The probability density function of the spliced distribution below 500 is the same as the probability density function of an exponential distribution with parameter θ = 250. The probability density function of the spliced distribution above 500 is a multiple, a, of the probability density function of a Weibull distribution with parameters τ = 2, θ = 400. Determine a.

4.23. Loss sizes follow a spliced distribution. Losses below 200 are uniformly distributed over (0, 200]. The probability density function of the spliced distribution above 200 is a multiple of the probability density function of an exponential distribution with parameter θ = 400. The probability density function is continuous at 200. Calculate the probability that a loss will be below 200.

4.24. Loss sizes follow a spliced distribution. The probability density function of this distribution below 200 is a multiple a of the probability density function of an exponential distribution with θ = 300. The probability density function above 200 is the same as for an exponential distribution with θ = 400. Let X be loss size. Calculate Pr(X < 100).

4.25. Loss sizes follow a spliced distribution. The probability density function of the spliced distribution below 100 is the same as that of a lognormal distribution with parameters µ = 3, σ = 2. The probability density function of the spliced distribution above 100 is a times the probability density function of a two-parameter Pareto distribution with parameters α = 2, θ = 300. Calculate the probability that a loss will be greater than 200.

4.26. Loss sizes follow a spliced distribution. The probability density function of this distribution below 500 is a multiple of the probability density function of an exponential with θ = 250. The probability density function of the spliced distribution above 500 is a multiple of the probability density function for a single-parameter Pareto distribution with α = 3, θ = 500. Half of the losses are below 500. Calculate the expected value of a loss.

4.27. The random variable X has the following spliced distribution:

F(x) = x/160, 0 ≤ x ≤ 100
F(x) = 1 − 0.375e^{−(x−100)/200}, x > 100

Calculate Var(X).

Additional released exam questions: SOA M-S05:34, SOA M-F05:35, CAS3-F06:18,19,20, SOA M-F06:39, C-S07:3


Solutions

4.1. We first write down the survival function:

S(x) = (2/3)e^{−x/5} + (1/3)(1 − e^{−5/x})

and then calculate the cumulative hazard at 3:

H(3) = −ln S(3) = −ln[(2/3)e^{−3/5} + (1/3)(1 − e^{−5/3})] = −ln 0.6362 = 0.4522

4.2. This is a mixture distribution. The first moment is

E[X] = 0.5(0.10) + 0.3(0.12) + 0.2(0.15) = 0.116

The second moment is

E[X²] = 0.5(0.10² + 0.20²) + 0.3(0.12² + 0.25²) + 0.2(0.15² + 0.35²) = 0.07707

The variance is 0.07707 − 0.116² = 0.063614. The standard deviation is √0.063614 = 0.252218. (E)

4.3. Since there are 1000 random claims "from the group", this is a sum of 1000 mixture random variables. The random variable for a single claim is a mixture with mean

0.5(10) + 0.3(20) + 0.2(30) = 17

and second moment

0.5(244) + 0.3(1300) + 0.2(4500) = 1412

so the variance is 1412 − 17² = 1123. The mean and variance are multiplied by 1000 for 1000 claims, and the normal approximation requires

x = 1000(17) + 1.645√(1000(1123)) = 18,743.23

4.4. Now we have a sum, not a mixture. The mean of the 1000 claims is the same as before, although technically it's calculated as

E[X] = 500(10) + 300(20) + 200(30) = 17,000

The variance is the sum of the variances, or

Var(X) = 500(12²) + 300(30²) + 200(60²) = 1,062,000

The normal approximation requires

x = 17,000 + 1.645√1,062,000 = 18,695.23

The variance is lower than in the previous exercise (where it is 1123 per claim, or 1,123,000 for 1000 claims) because the uncertainty about who is submitting the claims has been removed.
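The contrast between the two normal approximations can be verified numerically (a sketch using the numbers from solutions 4.3 and 4.4):

```python
import math

w = [0.5, 0.3, 0.2]          # mixing weights / class proportions
mu = [10, 20, 30]            # class means
m2 = [244, 1300, 4500]       # class second moments

# Exercise 4.3: 1000 draws from the mixture
mean_one = sum(wi * mi for wi, mi in zip(w, mu))                    # 17
var_one = sum(wi * m2i for wi, m2i in zip(w, m2)) - mean_one ** 2   # 1123
x_mixture = 1000 * mean_one + 1.645 * math.sqrt(1000 * var_one)

# Exercise 4.4: fixed counts 500/300/200, so only within-class variance remains
var_by_class = [m2i - mi ** 2 for mi, m2i in zip(mu, m2)]           # 144, 900, 3600
var_sum = 500 * var_by_class[0] + 300 * var_by_class[1] + 200 * var_by_class[2]
x_fixed = 1000 * mean_one + 1.645 * math.sqrt(var_sum)
```

This confirms the mixture requires 18,743.23 while the fixed-count sum requires only 18,695.23.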

EXERCISE SOLUTIONS FOR LESSON 4


4.5. Let µ_B be the mean loss for class B, and let σ²_B be the variance of loss for class B. Since the classes are equal in size, the raw moments of loss for the entire portfolio are equally weighted raw moments of the losses for each class. The first moment of losses for the entire portfolio is 20 = (1/2)(10 + µ_B), so µ_B = 30. The second moment of losses for the entire portfolio is

20² + 15² = 625 = (1/2)[(10² + 5²) + (30² + σ²_B)] = (1/2)(1025 + σ²_B)

from which it follows that σ_B = 15. (C)

4.6. The probability that a loss is greater than 50 is the weighted average of the probability that each component of the mixture is greater than 50. Let X be a loss. If X₁ is exponential with mean 10 and X₂ is exponential with mean 100, then

Pr(X₁ > 50) = e^{−50/10} = 0.006738
Pr(X₂ > 50) = e^{−50/100} = 0.606531
Pr(X > 50) = 0.75(0.006738) + 0.25(0.606531) = 0.1567

4.7. Let w be the weight of the exponential with mean θ. The second moment is 75 + 7.5² = 131.25. From equating the first moments,

5(1 − w) + θw = 7.5
w(θ − 5) = 2.5   (*)

From equating the second moments,

50(1 − w) + 2wθ² = 131.25
2w(θ² − 25) = 81.25

Dividing the first-moment equation into the second-moment equation eliminates w:

2(θ + 5) = 81.25/2.5 = 32.5
θ = 11.25

Plugging into (*),

w(6.25) = 2.5
w = 2.5/6.25 = 0.4

To calculate skewness, we only need E[X³], since we already know the first and second moments and the variance. For an exponential, E[X³] = 6θ³, so

E[X³] = 6[0.6(5³) + 0.4(11.25³)] = 3867.1875






γ₁ = (E[X³] − 3 E[X²]µ + 2µ³)/σ³ = (3867.1875 − 3(131.25)(7.5) + 2(7.5³))/75^{3/2} = (3867.1875 − 2953.125 + 843.75)/75^{3/2} = 2.70633

4.8. The conditional probability of a loss less than 500 given γ is F(500 | γ) = 1 − e^{−500/γ}. We integrate this over γ, using the density function of a single-parameter Pareto:

Pr(X < 500) = ∫₁₀₀₀^∞ (1 − e^{−500/γ})(1000/γ²) dγ

We evaluate the integral:

Pr(X < 500) = [−1000/γ − 2e^{−500/γ}]₁₀₀₀^∞ = −2 + 1000/1000 + 2e^{−500/1000} = −1 + 2e^{−1/2} = 0.2131
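The closed form −1 + 2e^{−1/2} can be checked by crude numerical integration (a sketch; the truncation point and grid size are arbitrary choices, and the tail beyond the truncation contributes on the order of 1e-7):

```python
import math

def integrand(g):
    # (1 - e^{-500/g}) times the single-parameter Pareto density 1000/g^2
    return (1 - math.exp(-500 / g)) * 1000 / g ** 2

# midpoint rule on gamma in (1000, 1e6)
lo, hi, n = 1000.0, 1_000_000.0, 100_000
h = (hi - lo) / n
approx = h * sum(integrand(lo + (i + 0.5) * h) for i in range(n))
exact = -1 + 2 * math.exp(-0.5)
```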

4.9. The probability of 0 accidents is e^{−λ}. We integrate this over the uniform distribution for each class:

Class 1: (1/0.8) ∫ from 0.2 to 1 of e^{−λ} dλ = (e^{−0.2} − e^{−1})/0.8 = 0.5636
Class 2: (1/1.6) ∫ from 0.4 to 2 of e^{−λ} dλ = (e^{−0.4} − e^{−2})/1.6 = 0.3344

The average is (1/2)(0.5636 + 0.3344) = 0.4490. (E)

4.10. Since a(x) = 2x here, A(x) = x² and S(x | Λ = 0.49) = e^{−0.49x²} = e^{−(0.7x)²}, so this is a Weibull with θ = 1/0.7 and τ = 2. According to the tables, the second moment is

θ²Γ(1 + 2/τ) = (1/0.7²)Γ(2) = 2.0408

4.11. Since a(x) = 1 here, A(x) = x, and

S(0.5) = M_Λ(−0.5) = exp[(θ/µ)(1 − √(1 − 2µ²(−0.5)/θ))] = exp[(1/2)(1 − √(1 − 2(2²)(−0.5)/1))] = exp(0.5(1 − √5)) = e^{−0.61803} = 0.53900

The cumulative distribution function is F(0.5) = 1 − S(0.5) = 0.46100.

C/4 Study Manual—17th edition Copyright ©2014 ASM


4.12.


H(x | Λ) = Λx³/3, so S(x | Λ) = e^{−Λx³/3} and

S(x) = E_Λ[e^{−Λx³/3}] = M_Λ(−x³/3) = 1/(1 + 0.05(x³/3)) = 1/(1 + x³/60)

The unconditional distribution of X is Burr with α = 1, γ = 3, θ = ∛60. Later on (Lesson 8), we'll learn that the tables list percentiles for the distributions under VaR_p (this is the 100p-th percentile), so we can look up VaR₀.₅(X). Otherwise, calculate the median m from first principles:

S(m) = 1/(1 + m³/60) = 0.5
m³ = 60
m = ∛60 = 3.9149

4.13. This is a frailty model with a(x) = (1/3)x^{−2/3}, or a Weibull model, and

A(x) = ∫₀^x (1/3)t^{−2/3} dt = x^{1/3}

By equation (4.1), the marginal survival function is

S(x) = E_Λ[S(x | Λ)] = E_Λ[e^{−Λx^{1/3}}] = M_Λ(−x^{1/3}) = (1 + θx^{1/3})^{−α} = (1 + 0.5x^{1/3})^{−5} = (1/(1 + (0.125x)^{1/3}))⁵

This is the survival function for a Burr distribution with θ = 8, α = 5, γ = 1/3. Based on the distribution tables, the expected value is

E[X] = θΓ(1 + 1/γ)Γ(α − 1/γ)/Γ(α) = 8Γ(1 + 3)Γ(5 − 3)/Γ(5) = 8(3!)(1!)/4! = 2


4.14. We recognize the distribution as a single-parameter Pareto with θ = 4000 and α = 5. The mean is E[λ] = 5(4000)/4 = 5000, and the second moment is E[λ²] = 5(4000²)/3 = 80,000,000/3. Therefore, the variance of the distribution is Var(λ) = 80,000,000/3 − 5000² = 5,000,000/3. We will need this information for the formula:

Var(X) = E[Var(X | λ)] + Var(E[X | λ]) = E[λ²] + Var(3λ) = 80,000,000/3 + 9(5,000,000/3) = 41,666,667

(E)

4.15. Let PP be the pure premium.

Var(PP) = Var(E[PP | N]) + E[Var(PP | N)]

We will use the Bernoulli shortcut discussed in Section 3.3 on page 54. E[PP | N] is a random variable. If N = 1, it assumes the value (2/3)(80) + (1/3)(160) = 320/3. When N = 2, each claim has expected value (1/2)(80) + (1/2)(160) = 120, so PP, which is the sum of 2 claims, has expected value 2(120) = 240. The variance of this random variable is the product of the probabilities of N = 1 and N = 2 times the square of the difference between 320/3 and 240:

Var(E[PP | N]) = (3/4)(1/4)(320/3 − 240)² = 3333 1/3

Var(PP | N) is a random variable. When N = 1, the conditional variance is the product of the probabilities of 80 and 160 times the square of the difference between 160 and 80:

Var(PP | N = 1) = (2/3)(1/3)(160 − 80)² = 12,800/9

When N = 2, the conditional variance of each loss is the product of the probabilities of 80 and 160 times the square of the difference between 160 and 80. Since there are two losses, this is multiplied by 2:

Var(PP | N = 2) = 2(1/2)(1/2)(160 − 80)² = 3200

The expected value of Var(PP | N) is then

E[Var(PP | N)] = (3/4)(12,800/9) + (1/4)(3200) = 1866 2/3

Putting it all together,

Var(PP) = 3333 1/3 + 1866 2/3 = 5200 (D)
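Solution 4.15 can be confirmed by enumerating every outcome directly (a sketch using the exercise's distributions):

```python
from itertools import product

# N = 1 with prob 3/4 (claim: 80 w.p. 2/3, 160 w.p. 1/3);
# N = 2 with prob 1/4 (each claim: 80 or 160 w.p. 1/2, independent).
dist = {}
for x, p in [(80, 2 / 3), (160, 1 / 3)]:
    dist[x] = dist.get(x, 0) + 0.75 * p
for (x1, p1), (x2, p2) in product([(80, 0.5), (160, 0.5)], repeat=2):
    dist[x1 + x2] = dist.get(x1 + x2, 0) + 0.25 * p1 * p2

mean = sum(s * p for s, p in dist.items())
var = sum(s * s * p for s, p in dist.items()) - mean ** 2
```

The brute-force variance agrees with the conditional-variance decomposition: 5200.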

4.16. Let I be the indicator of whether the risk is liability or property.

Var(X) = Var(E[X | I]) + E[Var(X | I)]

E[X | liability] = 300/(4 − 1) = 100
Var(X | liability) = 2·300²/(3·2) − 100² = 20,000
E[X | property] = 1,000/(3 − 1) = 500
Var(X | property) = 2·1,000²/(2·1) − 500² = 750,000

By the Bernoulli shortcut (Section 3.3, page 54), since the two expected values are 400 apart and have probabilities 3/4 and 1/4,

Var(E[X | I]) = (3/4)(1/4)(500 − 100)² = 30,000

Also,

E[Var(X | I)] = (3/4)(20,000) + (1/4)(750,000) = 202,500
Var(X) = 30,000 + 202,500 = 232,500 (C)

4.17.

Var(N) = Var(E[N | P]) + E[Var(N | P)] = Var(P) + E[P] = 1/12 + 3/2 = 19/12 (B)
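The conditional-variance split in solution 4.16 can be reproduced from the two-parameter Pareto moment formulas (sketch):

```python
def pareto_mean_var(alpha, theta):
    # two-parameter Pareto: E[X] = theta/(alpha-1), E[X^2] = 2 theta^2/((alpha-1)(alpha-2))
    m1 = theta / (alpha - 1)
    m2 = 2 * theta ** 2 / ((alpha - 1) * (alpha - 2))
    return m1, m2 - m1 ** 2

m_liab, v_liab = pareto_mean_var(4, 300)             # 100 and 20,000
m_prop, v_prop = pareto_mean_var(3, 1000)            # 500 and 750,000
var_between = 0.75 * 0.25 * (m_prop - m_liab) ** 2   # Bernoulli shortcut: 30,000
var_within = 0.75 * v_liab + 0.25 * v_prop           # 202,500
total = var_between + var_within                     # 232,500
```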

4.18.

9.75 = E[Var(X | λ)] + Var(E[X | λ]) = E[λ²] + Var(λ) = 2θ²/12 + (2θ²/12 − θ²/16) = (13/48)θ²
θ² = 36
θ = 6 (C)

4.19.

Let I  the random variable indicating the class. Var ( N )  Var E[N | I] + E Var ( N | I )





f

g

E[N | I] is a random variable which assumes the value 0.050 half the time and 0.210 half the time. The probabilities of 0.050 and 0.210 are each 0.50, and the difference of the two values is 0.210 − 0.050  0.160, so by the Bernoulli shortcut, the variance of E[N | I] is (0.50)(0.50)(0.1602 )  0.0064. Similarly, Var ( N | I ) is a random variable which assumes two values, 0.2272 and 0.5612 , each one half the time. The expected value of this random variable is 21 (0.2272 + 0.5612 )  0.183125. Putting it all together Var ( N )  0.0064 + 0.183125  0.189525 √ σN  0.189525  0.4353 (D) This exercise can also be worked out by calculating first and second moments. E[N]  0.5 (0.05) + 0.5 (0.21)  0.13 E[N 2 ]  0.5 (0.052 + 0.2272 ) + 0.5 (0.212 + 0.5612 )  0.206425 Var ( N )  0.206425 − 0.132  0.189525 √ (D) σN  0.189525  0.4353


4.20. The expected present value of the claim is 0.5(10/1.04³), and the present value of legal fees is 5/1.04³, for a total of 10/1.04³ = 8.89. We will compute the variance using the conditional variance formula. The legal expenses are not random and have no variance, so we'll ignore them. Let I be the indicator variable for whether a payment is required, and X the settlement value.

Var(X) = Var(E[X | I]) + E[Var(X | I)]

The expected value of the claim is 0 with probability 50% and 10/1.04³ with probability 50%. Thus the expected value can only have one of two values: it is a Bernoulli random variable. The Bernoulli shortcut says that its variance is

Var(E[X | I]) = (0.5)(0.5)(10/1.04³)² = 19.7579

The variance of the claim is 0 with probability 50% and (20/1.04³)² with probability 50%. The expected value of the variance is therefore

E[Var(X | I)] = 0.5(0 + (20/1.04³)²) = 158.0629

Therefore, Var(X) = 19.7579 + 158.0629 = 177.8208. The answer is

8.89 + 0.02(177.8208) = 12.4464 (C)
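Solution 4.20's arithmetic, as a sketch:

```python
v = 1 / 1.04 ** 3                          # 3-year discount factor
epv = 0.5 * 10 * v + 5 * v                 # expected PV of claim plus legal fees = 8.89
var_between = 0.5 * 0.5 * (10 * v) ** 2    # Bernoulli shortcut on E[X | I]
var_within = 0.5 * (20 * v) ** 2           # E[Var(X | I)]
settlement = epv + 0.02 * (var_between + var_within)
```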

4.21. Naturally, if N = 0, both mean and variance are zero. Using the conditional variance formula,

E[Var(S | N)] = (0 + 9 + 18)/3 = 9
E[E[S | N]] = (0 + 3 + 9)/3 = 4
E[E[S | N]²] = (0 + 9 + 81)/3 = 30
Var(E[S | N]) = 30 − 4² = 14
Var(S) = 9 + 14 = 23 (C)
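And solution 4.21 in a few lines (sketch):

```python
cond_means = [0, 3, 9]     # E[S | N] for N = 0, 1, 2, each with probability 1/3
cond_vars = [0, 9, 18]     # Var(S | N)

e_var = sum(cond_vars) / 3
var_e = sum(m * m for m in cond_means) / 3 - (sum(cond_means) / 3) ** 2
var_s = e_var + var_e
```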

4.22. The probability of a loss below 500 is 1 − e^{−500/250} = 1 − e^{−2}. Therefore, the probability of a loss above 500 is e^{−2}. Equating this to a times the Weibull survival function at 500:

e^{−2} = a e^{−(500/400)²}
a = e^{25/16 − 2} = e^{−7/16} = 0.645649

4.23. Let p be the probability, and a the multiple of the exponential distribution. Then F(200) = p, and that plus Pr(X > 200) must equal 1, so

p + a e^{−200/400} = 1

Since the density of the uniform distribution is p/200 and equals the exponential density at 200,

p/200 = a e^{−200/400}/400


[Figure 4.3: Spliced distribution in exercise 4.24. (a) density function; (b) distribution function. Plots not reproduced.]

a  2pe 1/2 Substituting for a in the other expression: p + 2p  1 p

1 3

Since S (200)  e −200/400  e −1/2 , it follows that a 1 − e −200/300  1 − e −1/2 . But then



4.24.

a and



1 − e −1/2 0.393469   0.808638 1 − e −2/3 0.486583

Pr ( X < 100)  0.808638 1 − e −100/300  0.229224





The density and distribution functions are shown in Figure 4.3. Note that this is a discontinuous density function. Splicing two exponentials does not produce a single exponential. 4.25.

We need to calculate a. First we calculate F(100):

F(100) = Φ((ln 100 − 3)/2) = Φ(0.80) = 0.7881

For the given Pareto, S(100) = (300/400)² = 0.5625. Therefore, a must be (1 − 0.7881)/0.5625 = 0.3767. Then

Pr(X > 200) = 0.3767(300/(300 + 200))² = (0.3767)(0.36) = 0.1356

4.26. For the exponential, F(500) = 1 − e^{−500/250} = 1 − e^{−2}, and we are given that F(500) = 0.5, so the constant for the exponential density is 0.5/(1 − e^{−2}). The single-parameter Pareto has its entire support over 500—in fact, S(500) = 1 for that distribution—so the constant for the Pareto to make S(500) = 0.5 is 0.5. Let's treat this as an equally weighted mixture of a truncated exponential divided by 1 − e^{−2} and a single-parameter Pareto. The mean of the single-parameter Pareto is αθ/(α − 1) = 750. The mean of the


exponential can be calculated by integration:

∫₀⁵⁰⁰ x (e^{−x/250}/250) dx = −x e^{−x/250} evaluated from 0 to 500, plus ∫₀⁵⁰⁰ e^{−x/250} dx
= −500e^{−2} + 250(1 − e^{−2}) = 250 − 750e^{−2}

and dividing by 1 − e^{−2}, we get (250 − 750e^{−2})/(1 − e^{−2}) = 171.7412. Thus the expected value of a loss is 0.5(750 + 171.7412) = 460.8706.

4.27. X is a mixture with 5/8 weight on a uniform on [0, 100] and 3/8 weight on a shifted exponential. The uniform has mean 50 and variance 100²/12, and the exponential has θ = 200 and shift 100, so its mean is 300 and its variance is 200². By conditional variance and the Bernoulli shortcut,

Var(X) = (5/8)(100²/12) + (3/8)(200²) + (5/8)(3/8)(300 − 50)² = 30,169.27
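Solution 4.27's decomposition, checked numerically (sketch):

```python
w_unif, w_exp = 5 / 8, 3 / 8
mean_unif, var_unif = 50, 100 ** 2 / 12        # uniform on [0, 100]
mean_exp, var_exp = 300, 200 ** 2              # exponential with theta = 200, shifted by 100

var_within = w_unif * var_unif + w_exp * var_exp
var_between = w_unif * w_exp * (mean_exp - mean_unif) ** 2   # Bernoulli shortcut
var_x = var_within + var_between
```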

Quiz Solutions

4-1. The X_i's are independent given θ but are not unconditionally independent, so we can't just add up the individual variances. Instead, we'll use conditional variance, with θ being the condition.

E[S | θ] = 50 E[X_i | θ] = 50θ
Var(S | θ) = 50 Var(X_i | θ) = 50θ

Using conditional variance,

Var(S) = E[Var(S | θ)] + Var(E[S | θ]) = E_θ[50θ] + Var_θ(50θ) = 50 E[θ] + 2500 Var(θ) = 50(2) + 2500(2²/12) = 933 1/3

4-2. Equate 1 − F(100) with the integral of f(x) for x > 100.

1 − F(100) = ∫₁₀₀^∞ θ/(θ + x)² dx = θ/(θ + 100)
2/3 = θ/(θ + 100)
3θ = 200 + 2θ
θ = 200

Lesson 5

Policy Limits

Reading: Loss Models Fourth Edition 3.1, 8.1, 8.4

Many insurance coverages limit the amount paid per loss. A policy limit is the maximum amount that insurance will pay for a single loss. To model insurance payments, define the limited loss variable X ∧ u = min(X, u) in terms of the loss X as follows: X ∧ u equals X when X < u, and equals u when X ≥ u.

For a single-parameter Pareto, the tabulated formula for E[X ∧ u] applies when u > θ; otherwise, E[X ∧ u] = u, since Pr(X ≥ u) = 1. The formula for the limited expected value for a single-parameter Pareto with α = 1 and d > θ is missing from the tables. It is easily derived:

E[X ∧ d] = ∫₀^d S(x) dx = ∫₀^θ 1 dx + ∫ from θ to d of (θ/x) dx = θ + θ(ln d − ln θ) = θ(1 + ln(d/θ))

For exponentials and two-parameter Paretos, the formulas aren't so good for moments higher than the first, since they require evaluating incomplete gammas or betas.

EXERCISES FOR LESSON 5


Example 5B Loss amounts follow a Pareto distribution with α = 0.25, θ = 100. An insurance coverage on the losses has a policy limit of 60.

Calculate the expected insurance payment per loss.

Answer: The tables say

E[X ∧ x] = (θ/(α − 1))(1 − (θ/(x + θ))^{α−1})

so

E[X ∧ 60] = (100/(−0.75))(1 − (100/160)^{−0.75}) = 56.35
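Example 5B can be evaluated directly from the tabulated formula (sketch):

```python
# Two-parameter Pareto limited expected value at the policy limit (Example 5B)
alpha, theta, u = 0.25, 100, 60
lev = theta / (alpha - 1) * (1 - (theta / (u + theta)) ** (alpha - 1))
```

Note the formula works even for α < 1, since the negative factors cancel.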



Quiz 5-1 Loss amounts follow an exponential distribution with θ = 100. An insurance coverage on the losses has a policy limit of 80. Calculate the expected insurance payment per loss.

Note on inflation Many of the exercises combine policy limits with inflation. Suppose X is the original variable and Y is the inflated variable: Y = (1 + r)X. Then (1 + r) can be factored out of E[Y ∧ u] as follows:

E[Y ∧ u] = E[(1 + r)X ∧ u] = (1 + r) E[X ∧ u/(1 + r)]   (5.7)

If X is from a scale distribution, you may instead modify the distribution (usually by multiplying the scale factor by 1 + r) and then work with the modified distribution.

Exercises

5.1. [4B-F97:8 and CAS3-F04:26] (2 points) You are given the following:

• A sample of 2,000 claims contains 1,700 that are no greater than \$6,000, 30 that are greater than \$6,000 but no greater than \$7,000, and 270 that are greater than \$7,000.
• The total amount of the 30 claims that are greater than \$6,000 but no greater than \$7,000 is \$200,000.
• The empirical limited expected value function for this sample evaluated at \$6,000 is \$1,810.

Determine the empirical limited expected value function for this sample evaluated at \$7,000.

(A) Less than \$1,910
(B) At least \$1,910, but less than \$1,930
(C) At least \$1,930, but less than \$1,950
(D) At least \$1,950, but less than \$1,970
(E) At least \$1,970


5. POLICY LIMITS


Table 5.1: (Limited) Expected Value Formulas

All formulas assume Pr(X < 0) = 0.

E[X^k] = ∫₀^∞ k x^{k−1} S(x) dx   (5.1)
E[X] = ∫₀^∞ S(x) dx   (5.2)
E[(X ∧ u)^k] = ∫₀^u x^k f(x) dx + u^k (1 − F(u))   (5.3)
E[X ∧ u] = ∫₀^u x f(x) dx + u(1 − F(u))   (5.4)
E[(X ∧ u)^k] = ∫₀^u k x^{k−1} S(x) dx   (5.5)
E[X ∧ u] = ∫₀^u S(x) dx   (5.6)

If Y = (1 + r)X, then

E[Y ∧ u] = (1 + r) E[X ∧ u/(1 + r)]   (5.7)

5.2. [4B-S91:27] (3 points) The Pareto distribution with parameters θ = 12,500 and α = 2 appears to be a good fit to 1985 policy year liability claims.

What is the estimated claim severity for a policy issued in 1992 with a 200,000 limit of liability? Assume that inflation has been a constant 10% per year.

(A) Less than 22,000
(B) At least 22,000, but less than 23,000
(C) At least 23,000, but less than 24,000
(D) At least 24,000, but less than 25,000
(E) At least 25,000



5.3. [4B-F93:5] (3 points) You are given the following:

• The underlying distribution for 1993 losses is given by f(x) = e^{−x}, x > 0, where losses are expressed in millions of dollars.
• Inflation of 5% impacts all claims uniformly from 1993 to 1994.
• Under a basic limits policy, individual losses are capped at \$1.0 million in each year.

What is the inflation rate from 1993 to 1994 on the capped losses?

(A) Less than 1.5%
(B) At least 1.5%, but less than 2.5%
(C) At least 2.5%, but less than 3.5%
(D) At least 3.5%, but less than 4.5%
(E) At least 4.5%

5.4. Losses in 2008 follow a two-parameter Pareto distribution with parameters α = 1 and θ = 1000. Insurance pays losses up to a maximum of 100,000. Annual inflation of 5% increases loss sizes uniformly in 2009 and 2010.

Determine the ratio of the average payment per loss in 2010 to the average payment per loss in 2008.

5.5. [4B-F94:8] (3 points) You are given the following:

• In 1993, an insurance company's underlying loss distribution for an individual claim amount is lognormal with parameters µ = 10.0 and σ² = 5.0.
• From 1993 to 1994, an inflation rate of 10% impacts all claims uniformly.
• In 1994, the insurance company purchases excess-of-loss reinsurance that caps the insurer's loss at 2,000,000 for any individual claim.

Determine the insurer's 1994 expected net claim amount for a single claim after application of the 2,000,000 reinsurance cap.

(A) Less than 150,000
(B) At least 150,000, but less than 175,000
(C) At least 175,000, but less than 200,000
(D) At least 200,000, but less than 225,000
(E) At least 225,000

5.6. An insurance company's underlying loss distribution for individual claim amounts is single-parameter Pareto with α = 2.5, θ = 1000.

Calculate the expected payment per loss for an insurance coverage with a policy limit of 5000.

5.7. An insurance company's underlying loss distribution for individual claim amounts is single-parameter Pareto with α = 1, θ = 1000.

Calculate the expected payment per loss for an insurance coverage with a policy limit of 3000.



5.8. [3-F01:28] The unlimited severity distribution for claim amounts under an auto liability insurance policy is given by the cumulative distribution function:

F(x) = 1 − 0.8e^{−0.02x} − 0.2e^{−0.001x},  x ≥ 0

The insurance policy pays amounts up to a limit of 1000 per claim.

Calculate the expected payment under this policy for one claim.

(A) 57  (B) 108  (C) 166  (D) 205  (E) 240

5.9. The claim size distribution for an insurance coverage is modeled as a mixture of a two-parameter Pareto with parameters α = 2, θ = 1000 with weight 1/2 and a two-parameter Pareto with parameters α = 1, θ = 2000 with weight 1/2.

Calculate the limited expected value at 3000 of the claim sizes.

5.10. For an insurance coverage, claim sizes follow a distribution which is a mixture of a uniform distribution on [0, 10] with weight 0.5 and a uniform distribution on [5, 13] with weight 0.5. The limited expected value at a of claim sizes is 6.11875.

Determine a.

5.11. [4B-S93:12] (3 points) You are given the following:

(i) The underlying distribution for 1992 losses is given by f(x) = e^{−x}, x > 0, where losses are expressed in millions of dollars.
(ii) Inflation of 10% impacts all claims uniformly from 1992 to 1993.
(iii) The policy limit is 1.0 (million).

Determine the inflation rate from 1992 to 1993 on payments on the losses.

(A) Less than 2%
(B) At least 2%, but less than 3%
(C) At least 3%, but less than 4%
(D) At least 4%, but less than 5%
(E) At least 5%

5.12. For an insurance coverage, you are given:

• Before inflation, E[X ∧ x] = 500 − 1,000,000/x for x > 2000.
• Claims are subject to a policy limit of 10,000.
• Inflation of 25% uniformly affects all losses.

Calculate the ratio of expected payment size after inflation to expected payment size before inflation.

5.13. Losses follow a uniform distribution on (0, 5000]. An insurance coverage has a policy limit of 3000.

Calculate the variance of the payment per loss on the coverage.



5.14. [3-F00:31] For an industry-wide study of patients admitted to hospitals for treatment of cardiovascular illness in 1998, you are given:

(i)
Duration in Days    Number of Patients Remaining Hospitalized
0                   4,386,000
5                   1,461,554
10                  486,739
15                  161,805
20                  53,488
25                  17,384
30                  5,349
35                  1,337
40                  0

(ii) Discharges from the hospital are uniformly distributed between the durations shown in the table.

Calculate the mean residual time remaining hospitalized, in days, for a patient who has been hospitalized for 21 days.

(A) 4.4  (B) 4.9  (C) 5.3  (D) 5.8  (E) 6.3

Solutions

5.1. Let X be the empirical loss variable. How do E[X ∧ 6000] and E[X ∧ 7000] differ?

1. For claims below 6000, the full amount of the claim enters both, so they don't differ.
2. For claims above 7000, E[X ∧ 6000] includes 6000 and E[X ∧ 7000] includes 7000.
3. For claims between 6000 and 7000, E[X ∧ 6000] includes 6000, whereas E[X ∧ 7000] includes the full amount of the claim.

Since the limited-at-6000 average of 2000 claims, E[X ∧ 6000], is 1810, the sum of the claims limited at 6000 is 2000(1810). To this, we add the differences in categories 2 and 3. The difference in category 2 is 1000 times the number of claims in this category, or 1000(270) = 270,000. The difference in category 3 is that E[X ∧ 6000] includes 30(6000) = 180,000 for these 30 claims, whereas E[X ∧ 7000] includes the full amount of 200,000. The difference is 200,000 − 180,000 = 20,000. The answer is therefore

E[X ∧ 7000] = (2000(1810) + 270,000 + 20,000)/2000 = 1955 (D)
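The bookkeeping in solution 5.1, as arithmetic (sketch):

```python
n = 2000
total_limited_6000 = n * 1810            # sum of claims limited at 6000
add_over_7000 = 270 * (7000 - 6000)      # cap rises by 1000 for each claim above 7000
add_mid_band = 200_000 - 30 * 6000       # full amounts replace the 6000 cap
lev_7000 = (total_limited_6000 + add_over_7000 + add_mid_band) / n
```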

5.2. Let X be the original variable, and let Z = 1.1⁷X be the inflated variable. For Z, θ = 12,500(1.1)⁷ = 24,358.96.

E[Z ∧ 200,000] = 24,358.96(1 − 24,358.96/224,358.96) = 21,714.28 (A)


5.3.

E[X₁₉₉₃ ∧ 1] = 1(1 − e^{−1}) = 0.6321

In 1994, θ = 1.05, so

E[X₁₉₉₄ ∧ 1] = 1.05(1 − e^{−1/1.05}) = 0.6449
E[X₁₉₉₄ ∧ 1]/E[X₁₉₉₃ ∧ 1] − 1 = 0.0202 (B)

5.4. Let X be the original loss variable, and let Y = 1.05²X be the inflated loss random variable. The parameters of Y are α = 1 and θ = 1000(1.05²) = 1102.50.

Average payment per loss for X: E[X ∧ 100,000] = −1000 ln(1000/101,000) = 4615.12
Average payment per loss for Y: E[Y ∧ 100,000] = −1102.50 ln(1102.50/101,102.50) = 4981.71

The ratio is 4981.71/4615.12 = 1.07943.

5.5. To scale a lognormal random variable by r, add ln r to µ and do not change σ (see Table 2.3). So in this exercise, the parameters of the lognormal after inflation are µ = 10 + ln 1.1 = 10.0953 and σ = √5. Let X be the inflated variable. We are asked to calculate E[X ∧ 2,000,000]. Using the tables,

E[X ∧ 2,000,000] = exp(µ + σ²/2) Φ((ln 2,000,000 − µ − σ²)/σ) + 2,000,000(1 − F(2,000,000))

Let's compute the three components of this formula:

exp(µ + σ²/2) = exp(10 + ln 1.1 + 5/2) = 295,171
Φ((ln 2,000,000 − 10 − ln 1.1 − 5)/√5) = Φ(−0.26) = 0.3974
F(2,000,000) = Φ((ln 2,000,000 − 10 − ln 1.1)/√5) = Φ(1.97) = 0.9756

The answer is

E[X ∧ 2,000,000] = 295,171(0.3974) + 2,000,000(1 − 0.9756) = 166,101 (B)

5.6.

E[X ∧ 5000] = 2.5(1000)/1.5 − 1000^{2.5}/(1.5 · 5000^{1.5}) = 1607.04

5.7. The formula in the tables cannot be used for α = 1, so we integrate the survival function. The survival function must be integrated from 0, even though the support of the random variable starts at 1000.

E[X ∧ 3000] = ∫₀³⁰⁰⁰ S(x) dx = ∫₀¹⁰⁰⁰ 1 dx + ∫ from 1000 to 3000 of (1000/x) dx = 1000 + 1000(ln 3000 − ln 1000) = 2098.61

EXERCISE SOLUTIONS FOR LESSON 5


5.8. The limited expected value is additive, so the LEV of a mixture is the weighted average of the LEVs of the components. (You can also integrate if you wish.) The distribution is an 80/20 mixture of exponentials with θ = 50 and θ = 1000. The limited expected values at 1000 are

50(1 − e^{−1000/50}) = 50  and  1000(1 − e^{−1000/1000}) = 632.12

The answer is 0.8(50) + 0.2(632.12) = 166.42. (C)
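Solution 5.8's weighted average, verified (sketch):

```python
import math

# 80/20 mixture of exponentials, limited expected value at 1000
lev = 0.8 * 50 * (1 - math.exp(-1000 / 50)) + 0.2 * 1000 * (1 - math.exp(-1))
```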

5.9. Use the formulas for the limited expected value of a two-parameter Pareto given in the Loss Models appendix. The limited expected value of the mixture is the weighted average of the limited expected values of the two components.

E[X ∧ 3000] = 0.5[1000(1 − 1000/4000)] + 0.5[−2000 ln(2000/5000)] = 375 − 1000 ln 0.4 = 1291.29

5.10. This can be done using the definition of E[X ∧ a] or equation (5.6). First we use the definition. In the following, we assume a is between 5 and 10; if the result were not in this range, we would have to adjust the integrals.

2 E[X ∧ a] = (1/10)∫₀^a x dx + (1/10)a(10 − a) + (1/8)∫₅^a x dx + (1/8)a(13 − a)
= a²/20 + (1/10)(10a − a²) + a²/16 − 25/16 + (13/8)a − a²/8
−a²(1/20 + 1/16) + (21/8)a − 25/16 = 2(6.11875)

We multiply through by 80 to clear the denominators:

9a² − 210a + 1104 = 0
a = (210 ± √(210² − 4(9)(1104)))/18 = 8 or 15 1/3

The second solution, 15 1/3, is rejected because it isn't in the range [5, 10]. (The limited expected value at 15 1/3 is the full expected value, which is 7, not 6.11875.) So a = 8.

If using equation (5.6), you must be careful to integrate S(x) from zero, not from 5, even though the second distribution in the mixture has support starting at 5. Thus you get

(1/20)∫₀^a (10 − x) dx + (1/2)∫₀⁵ 1 dx + (1/16)∫₅^a (13 − x) dx = 6.11875
(1/40)[100 − (10 − a)²] + 2.5 + (1/32)[(13 − 5)² − (13 − a)²] = 6.11875
(1/40)(20a − a²) + 2.5 + (1/32)(−105 + 26a − a²) = 6.11875

Gathering terms,

−(9/160)a² + (21/16)a − 25/32 = 6.11875

Multiplying by 160,

−9a² + 210a − 1104 = 0

which is the same quadratic as above.
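Solving the quadratic from solution 5.10 and checking the limited expected value at the root (sketch):

```python
import math

disc = math.sqrt(210 ** 2 - 4 * 9 * 1104)            # discriminant of 9a^2 - 210a + 1104
roots = sorted(((210 - disc) / 18, (210 + disc) / 18))
a = roots[0]                                         # 8; the other root exceeds 10

# E[X ^ a] of the 50/50 mixture, valid for 5 <= a <= 10
lev = (0.5 * (a ** 2 / 20 + a * (10 - a) / 10)
       + 0.5 * ((a ** 2 - 25) / 16 + a * (13 - a) / 8))
```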


5.11. The original θ = 1 and the inflated θ′ = 1.1. Then

E[X ∧ 1] = 1(1 − e^{−1/1}) = 0.6321
E[X′ ∧ 1] = 1.1(1 − e^{−1/1.1}) = 0.6568
0.6568/0.6321 − 1 = 0.0391 (C)

5.12. Before inflation, the expected claim size is

E[X ∧ 10,000] = 500 − 1,000,000/10,000 = 400

After inflation, letting the loss variable be X′,

E[X′ ∧ 10,000] = 1.25 E[X ∧ 10,000/1.25] = 1.25(500 − 1,000,000/8000) = 468.75

The ratio is 468.75/400 = 1.171875.

5.13. This could be worked out from first principles, but let's instead work it out as a mixture. Let X be the random variable for the loss and Y the payment random variable. Then Y is a mixture of a uniform random variable on (0, 3000] with weight 0.6 and the constant 3000 with weight 0.4. The expected value of Y is

E[Y] = 0.6(1500) + 0.4(3000) = 2100

The second moment of a uniform distribution on (0, u] is u²/3. The second moment of Y is

E[Y²] = 0.6(3000²/3) + 0.4(3000²) = 5,400,000

The variance of Y is

Var(Y) = 5,400,000 − 2100² = 990,000

Var ( Y )  5,400,000 − 21002  990,000

5.14. The total number of patients hospitalized 21 days or longer is obtained by linear interpolation between the counts at durations 20 and 25: 0.8(53,488) + 0.2(17,384) = 46,267.2. That will be the denominator. The numerator is the number of days hospitalized past day 21, summed over patients. Within each interval of durations, the average patient released during that interval is hospitalized for half the period. So 46,267.2 − 17,384 = 28,883.2 patients are hospitalized for 2 days after day 21; 17,384 − 5,349 = 12,035 for 4 + 2.5 = 6.5 days; 5,349 − 1,337 = 4,012 for 11.5 days; and 1,337 for 16.5 days. Add it up:

28,883.2(2) + 12,035(6.5) + 4,012(11.5) + 1,337(16.5) = 204,192.4

The mean residual time is 204,192.4/46,267.2 = 4.41333. (A)
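Solution 5.14's arithmetic (sketch):

```python
# Interpolated count still hospitalized at day 21, then expected remaining days
remaining_21 = 0.8 * 53488 + 0.2 * 17384
future_days = ((remaining_21 - 17384) * 2 + (17384 - 5349) * 6.5
               + (5349 - 1337) * 11.5 + 1337 * 16.5)
mean_residual = future_days / remaining_21
```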

Quiz Solutions

5-1. The tables say E[X ∧ x] = θ(1 − e^(−x/θ)), so

E[X ∧ 80] = 100(1 − e^(−80/100)) = 55.07

Lesson 6

Deductibles

Reading: Loss Models Fourth Edition 3.1, 8.1–8.2

6.1 Ordinary and franchise deductibles

Insurance coverages often have provisions to not pay small claims. A policy with an ordinary deductible of d is one that pays the greater of 0 and X − d for a loss of X. For example, a policy with an ordinary deductible of 500 pays nothing if a loss is 500 or less, and pays 200 for a loss of 700.

A policy with a franchise deductible of d is one that pays nothing if the loss is no greater than d, and pays the full amount of the loss if it is greater than d. For example, a policy with a franchise deductible of 500 pays nothing if a loss is 500 or less, and pays 700 for a loss of 700.

Assume that a deductible is an ordinary deductible unless stated otherwise.

If a policy has a deductible, and the probability of a loss below the deductible is not zero, then not every loss is paid. Thus we must distinguish between payment per loss and payment per payment. The expected payment per loss is less than the expected payment per payment, since payments of 0 are averaged into the former but not into the latter.
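The two payment rules differ only in whether the deductible is added back once the loss exceeds it; a minimal Python sketch (function names are mine, not the manual's):

```python
# Ordinary vs. franchise deductible payment rules.

def ordinary_payment(loss: float, d: float) -> float:
    """Pays (loss - d)+: nothing at or below d, the excess above d otherwise."""
    return max(loss - d, 0.0)

def franchise_payment(loss: float, d: float) -> float:
    """Pays nothing at or below d, the FULL loss once it exceeds d."""
    return loss if loss > d else 0.0

# The examples from the text: deductible 500, loss 700.
print(ordinary_payment(700, 500))   # 200.0
print(franchise_payment(700, 500))  # 700
```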

6.2 Payment per loss with deductible

Let X be the random variable for loss size. The random variable for the payment per loss with a deductible d is Y^L = (X − d)+. The symbol (X − d)+ means the positive part of X − d: in other words, max(0, X − d). It is usually easy enough to calculate probabilities for Y^L by reasoning, but let's write out the distribution function of Y^L:

F_{Y^L}(x) = Pr(Y^L ≤ x) = Pr(X − d ≤ x) = Pr(X ≤ x + d) = F_X(x + d)   (6.1)

The expected value of (X − d)+ can be obtained from the definition of expected value:

E[(X − d)+] = ∫_d^∞ (x − d) f(x) dx   (6.2)

which roughly means that you take the probability of a loss being x and multiply it by x − d, and sum up over all x. Higher moments can be calculated using powers of x − d in the integral. An alternative formula for the first moment, derived by integration by parts, is often easier to use:

E[(X − d)+] = ∫_d^∞ S(x) dx   (6.3)

Example 6A Loss amounts have a distribution whose density function is

f(x) = 4(100 − x)³/100⁴,  0 < x ≤ 100

An insurance coverage for these losses has an ordinary deductible of 20. Calculate the expected insurance payment per loss.

Answer: Using the definition of E[(X − 20)+],

E[(X − 20)+] = ∫_20^100 (x − 20) · 4(100 − x)³/100⁴ dx

Let u = 100 − x.

E[(X − 20)+] = ∫_0^80 (80 − u) · 4u³/100⁴ du
             = (4/100⁴)(20u⁴ − u⁵/5) evaluated from 0 to 80
             = (4/100⁴)(20(80⁴) − 80⁵/5)
             = 6.5536

Alternatively, using equation (6.3),

F(x) = ∫_0^x 4(100 − t)³/100⁴ dt = 1 − (100 − x)⁴/100⁴

E[(X − 20)+] = ∫_20^100 (100 − x)⁴/100⁴ dx = −(100 − x)⁵/(5 · 100⁴) evaluated from 20 to 100 = 80⁵/(5 · 100⁴) = 6.5536

The expected payment for a franchise deductible is

E[(X − d)+] + d S(d)

Example 6B Loss amounts have a discrete distribution with the following probabilities:

Loss Amount    Probability
    100           0.4
    500           0.3
   1000           0.2
   2000           0.1

An insurance coverage for these losses has a franchise deductible of 500. Calculate the expected insurance payment per loss.

Answer: The coverage pays 1000 if the loss is 1000 and 2000 if the loss is 2000; otherwise it pays nothing, since the loss is below the deductible. The expected payment is therefore 0.2(1000) + 0.1(2000) = 400.

The random variable (X − d)+ is said to be shifted by d and censored. Censored means that you have some, but incomplete, information about certain losses. In this case, you are aware of losses below d, but don't know the amounts of such losses.


If you combine a policy with ordinary deductible d and a policy with policy limit d, the combination covers every loss entirely. In other words:

E[X] = E[X ∧ d] + E[(X − d)+]   (6.4)

Thus for distributions in the tables, you can evaluate the expected payment per loss with a deductible by the formula

E[(X − d)+] = E[X] − E[X ∧ d]

Example 6C Losses follow a two-parameter Pareto distribution with α = 2, θ = 2000. Calculate the expected payment per loss on a coverage with ordinary deductible 500.

Answer: For a Pareto with α > 1,

E[X] − E[X ∧ d] = (θ/(α − 1)) (θ/(θ + d))^(α−1)

In our case,

E[X] − E[X ∧ 500] = 2000(2000/2500) = 1600

6.3 Payment per payment with deductible

The random variable for payment per payment on an insurance with an ordinary deductible is the payment per loss random variable conditioned on X > d, or Y^P = (X − d)+ | X > d. Let's write out the distribution function of Y^P. Since it is conditioned on X > d, only losses above the deductible enter; for x ≥ 0,

F_{Y^P}(x) = Pr(Y^P ≤ x)
           = Pr(X − d ≤ x | X > d)
           = Pr(X ≤ x + d | X > d)
           = Pr(d < X ≤ x + d)/Pr(X > d)
           = (F_X(x + d) − F_X(d))/(1 − F_X(d))   (6.5)

Notice the need to subtract F_X(d) in the numerator, because of the joint condition X > d and X ≤ x + d. A common error made by students is to use F_X(x + d)/(1 − F_X(d)) and to forget to subtract F_X(d). The survival function doesn't have the extra term:

S_{Y^P}(x) = S_X(x + d)/S_X(d)   (6.6)

because the joint condition X > d and X > x + d reduces to X > x + d. For this reason, working with survival functions is often easier.

Example 6D Losses follow a single-parameter Pareto with α = 2, θ = 400. Y^P is the payment per payment random variable for a coverage with a deductible of 1000. Calculate Pr(Y^P ≤ 600).


Answer: Let X be the loss random variable. We'll use formula (6.5).

F_X(1000) = 1 − (400/1000)² = 0.84
F_X(1600) = 1 − (400/1600)² = 0.9375
F_{Y^P}(600) = (0.9375 − 0.84)/(1 − 0.84) = 0.609375
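Example 6D can be verified directly from formula (6.5); a minimal sketch:

```python
# Pr(Y^P <= 600) when X is single-parameter Pareto, alpha = 2, theta = 400,
# and the deductible is 1000 (Example 6D).

def sp_pareto_cdf(x: float, alpha: float = 2.0, theta: float = 400.0) -> float:
    """Single-parameter Pareto CDF: F(x) = 1 - (theta/x)^alpha for x >= theta."""
    return 1 - (theta / x) ** alpha

d = 1000
F_d = sp_pareto_cdf(d)           # 0.84
F_xd = sp_pareto_cdf(600 + d)    # 0.9375
print((F_xd - F_d) / (1 - F_d))  # ~0.609375, equation (6.5)
```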

The expected value of Y^P is E[(X − d)+]/S(d). It is called the mean excess loss and is denoted by e_X(d). In life contingencies, it is called mean residual life or the complete life expectancy and is denoted by e̊_d; the symbol without a circle on the e has a somewhat different meaning. Based on the definition, a formula for e(d) is

e(d) = E[(X − d)+]/S(d)   (6.7)

Adapting formulas (6.2) and (6.3), the formulas for e(d) are

e(d) = ∫_d^∞ (x − d) f(x) dx / S(d)   or   e(d) = ∫_d^∞ S(x) dx / S(d)

Higher moments of Y^P can be calculated by raising x − d to powers in the first integral. Combining (6.4) and (6.7), we get

E[X] = E[X ∧ d] + e(d)(1 − F(d))   (6.8)

The random variable Y^P is said to be shifted by d and truncated. Truncated means you have absolutely no information about losses in a certain range. Therefore, all your information is conditional—you know the existence of a loss only when it is above d; otherwise you know nothing.

Example 6E (Same data as Example 6A) Loss amounts have a distribution whose density function is

f(x) = 4(100 − x)³/100⁴,  0 < x ≤ 100

An insurance coverage for these losses has an ordinary deductible of 20. Calculate the expected insurance payment per payment.

Answer: Above, we calculated the payment per loss as 6.5536. We also derived S(x) = (100 − x)⁴/100⁴. So S(20) = 0.8⁴ = 0.4096, and the expected payment per payment is 6.5536/0.4096 = 16.

For a franchise deductible, the payment made is d higher than the corresponding payment under an ordinary deductible, so the expected payment per payment is e(d) + d.

Example 6F (Same data as Example 6B) Loss amounts have a discrete distribution with the following probabilities:


Loss Amount    Probability
    100           0.4
    500           0.3
   1000           0.2
   2000           0.1

An insurance coverage for these losses has a franchise deductible of 500. Calculate the expected insurance payment per payment.

Answer: First of all, E[(X − 500)+] = 0.2(1000 − 500) + 0.1(2000 − 500) = 250. Then e(500) = 250/0.3 = 833⅓, which would be the payment per payment under an ordinary deductible. The payment per payment under a franchise deductible is 500 higher, or 1333⅓.
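Both discrete-distribution examples (6B and 6F) reduce to sums over the loss table; a sketch:

```python
# Franchise deductible of 500 on the discrete loss table of Examples 6B/6F.

losses = [(100, 0.4), (500, 0.3), (1000, 0.2), (2000, 0.1)]
d = 500

# A franchise deductible pays the full loss once it exceeds d.
per_loss = sum(p * x for x, p in losses if x > d)       # Example 6B
prob_payment = sum(p for x, p in losses if x > d)
per_payment = per_loss / prob_payment                    # Example 6F

print(per_loss)     # 400.0
print(per_payment)  # ~1333.33
```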

Quiz 6-1 Losses follow a distribution that is a mixture of two exponentials, with weight 0.75 on an exponential with mean 1000 and weight 0.25 on an exponential with mean 2000. An insurance coverage has an ordinary deductible of 500. Calculate the expected payment per payment on this coverage.

Special cases

Exponential

An exponential distribution has no memory. This means that Y^P has the same distribution as X: it is exponential with mean θ. Therefore

e(d) = θ   (6.9)

Example 6G Losses under an insurance coverage are exponentially distributed with mean 1000. Calculate the expected payment per payment for an insurance coverage with franchise deductible 200.

Answer: e(200) = 1000, so the answer is e(d) + d = 1200.

Uniform

If X has a uniform distribution on (0, θ], then (X − d)+ | X > d has a uniform distribution on (0, θ − d].

Example 6H Losses under an insurance coverage follow a uniform distribution on (0, 1000]. Calculate the expected payment per payment for an insurance coverage with ordinary deductible 200.

Answer: Payment per payment is uniformly distributed on (0, 800], so the answer is 400.

The same shortcut applies to any beta distribution with a = 1. If X is beta with parameters a = 1, b, and θ, then (X − d)+ | X > d is beta with parameters a = 1, b, and θ − d. This generalization would help us with Example 6E. In that example, a = 1, b = 4, and θ = 100. After the deductible, the revised θ parameter for the payment distribution is θ − d = 80, and the expected payment per payment is (θ − d) a/(a + b) = 80/5 = 16.


Pareto

If X has a 2-parameter Pareto distribution with parameters α and θ, then (X − d)+ | X > d has a Pareto distribution with parameters α and θ + d. This means that we can easily calculate the moments and percentiles of (X − d)+ | X > d. In particular, the mean is

e(d) = (θ + d)/(α − 1),  α > 1   (6.10)

You can easily write down a formula for the second moment and calculate the variance as well.

Example 6I An insurance coverage has an ordinary deductible of 20. The loss distribution follows a two-parameter Pareto with α = 3 and θ = 100. Calculate

1. The average payment per payment.
2. The variance of positive payment amounts.
3. The probability that a nonzero payment is greater than 10.

Answer: The distribution of nonzero payments is two-parameter Pareto with α = 3 and θ = 100 + 20 = 120. Therefore, with Y = (X − 20)+ | X > 20,

1. e_Y(20) = 120/(3 − 1) = 60
2. E[Y²] = 2(120²)/((2)(1)) = 14,400 and Var(Y) = 14,400 − 60² = 10,800
3. Pr(Y > 10) = (120/(120 + 10))³ = 0.78653
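Example 6I's three quantities follow from the shortcut that the excess over d of a two-parameter Pareto (α, θ) is again Pareto with parameters (α, θ + d); a sketch using the tables' moment formulas:

```python
# Excess-loss distribution of a two-parameter Pareto: alpha stays 3,
# theta becomes theta + d = 120 (Example 6I).

alpha, theta, d = 3, 100, 20
theta_y = theta + d                                      # 120

mean = theta_y / (alpha - 1)                             # 60.0
second = 2 * theta_y ** 2 / ((alpha - 1) * (alpha - 2))  # 14,400.0
var = second - mean ** 2                                 # 10,800.0
pr_gt_10 = (theta_y / (theta_y + 10)) ** alpha           # ~0.78653

print(mean, var, pr_gt_10)
```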

If X has a single-parameter Pareto distribution with parameters α and θ, and d ≥ θ, then (X − d)+ | X > d has a 2-parameter Pareto distribution with parameters α and d, so for α > 1,

e(d) = d/(α − 1),  d ≥ θ   (6.11)

If d < θ, then (X − d)+ | X > d is a shifted single-parameter Pareto distribution. It is the original distribution shifted by −d. The mean excess loss is

e(d) = (α(θ − d) + d)/(α − 1)

Example 6J Calculate the payment per payment for an insurance coverage with an ordinary deductible of 5 if the loss distribution is

1. exponential with mean 10


2. Pareto with parameters α = 3, θ = 20
3. single-parameter Pareto with parameters α = 2, θ = 1.

Answer:

1. e(5) = E[X] = 10.
2. e(5) = (20 + 5)/2 = 12.5.
3. e(5) = 5/1 = 5.
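The special cases of Example 6J can be collected into one helper; a sketch implementing formulas (6.9)–(6.11) and the shifted single-parameter case from the text (the function name and interface are mine):

```python
# Mean excess loss e(d) for the three special cases discussed above.

def mean_excess(dist: str, d: float, **p) -> float:
    if dist == "exponential":          # memoryless: e(d) = theta, eq. (6.9)
        return p["theta"]
    if dist == "pareto":               # two-parameter, alpha > 1, eq. (6.10)
        return (p["theta"] + d) / (p["alpha"] - 1)
    if dist == "sp_pareto":            # single-parameter, alpha > 1
        if d >= p["theta"]:            # eq. (6.11)
            return d / (p["alpha"] - 1)
        return (p["alpha"] * (p["theta"] - d) + d) / (p["alpha"] - 1)
    raise ValueError(dist)

print(mean_excess("exponential", 5, theta=10))        # 10
print(mean_excess("pareto", 5, alpha=3, theta=20))    # 12.5
print(mean_excess("sp_pareto", 5, alpha=2, theta=1))  # 5.0
```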

Terminology note

Exam questions are usually pretty clear about whether you are to calculate payment per payment or payment per loss. However, if you encounter a question asking for "expected payment", think about it this way: the expected value of a random variable X is the total of the X's divided by the number of X's. Thus the "expected payment" is the total of the payments divided by the number of payments.

Another type of question asks you for "expected annual payment". This is neither payment per loss nor payment per payment. "Annual" means "per year"! We will learn more about this in Lesson 16.

Exercises

Mean excess loss

6.1. [151-83-94:1] (1 point) For a random loss X:

Pr(X = 3) = Pr(X = 12) = 0.5 and E[(X − d)+] = 3

Determine d.

(A) 4.5  (B) 5.0  (C) 5.5  (D) 6.0  (E) 6.5

6.2. [4B-F99:19] (2 points) You are given the following:

(i) Claim sizes for Risk A follow a two-parameter Pareto distribution with parameters θ = 10,000 and α = 2.
(ii) Claim sizes for Risk B follow a Burr distribution with parameters θ = √20,000, α = 2, and γ = 2.
(iii) r is the ratio of the proportion of Risk A's claims (in number) that exceed d to the proportion of Risk B's claims (in number) that exceed d.

Determine the limit of r as d goes to infinity.

(A) 0  (B) 1  (C) 2  (D) 4  (E) ∞



Table 6.1: Summary of Deductible Formulas

F_{Y^L}(x) = F_X(x + d)   (6.1)   if Y^L = (X − d)+
E[(X − d)+] = ∫_d^∞ (x − d) f(x) dx   (6.2)
E[(X − d)+] = ∫_d^∞ S(x) dx   (6.3)
F_{Y^P}(x) = (F_X(x + d) − F_X(d))/(1 − F_X(d))   (6.5)   if Y^P = (X − d)+ | X > d
S_{Y^P}(x) = S_X(x + d)/S_X(d)   (6.6)   if Y^P = (X − d)+ | X > d
e(d) = (E[X] − E[X ∧ d])/S(d)   (version of 6.7)
e(d) = ∫_d^∞ (x − d) f(x) dx / S(d)
e(d) = ∫_d^∞ S(x) dx / S(d)
E[X] = E[X ∧ d] + e(d)(1 − F(d))   (6.8)
e(d) = θ   for exponential   (6.9)
e(d) = (θ − d)/2   for uniform on (0, θ]
e(d) = (θ − d)/(1 + b)   for beta with a = 1
e(d) = (θ + d)/(α − 1)   for two-parameter Pareto, α > 1   (6.10)
e(d) = d/(α − 1)   for single-parameter Pareto, d ≥ θ   (6.11)

6.4.

3. For every d > 0, the limited expected value of Z at d equals (1 + r) times the limited expected value of X at d.

(A) 2  (B) 3  (C) 2,3  (D) 1,2,3  (E) The correct answer is not given by (A), (B), (C), or (D).

6.5. [4B-S95:21] (3 points) Losses follow a Pareto distribution with parameters θ and α > 1. Determine the ratio of the mean excess loss function at x = 2θ to the mean excess loss function at x = θ.

(A) 1/2  (B) 1  (C) 3/2  (D) 2  (E) Cannot be determined from the given information

6.6. [4B-F98:6] (2 points) Claim sizes follow a Pareto distribution with parameters α = 0.5 and θ = 10,000. Determine the mean excess loss at 10,000.

(A) 5,000  (B) 10,000  (C) 20,000  (D) 40,000  (E) ∞

6.7. [4B-F94:16] (1 point) A random sample of auto glass claims has yielded the following five observed claim amounts:

100, 125, 200, 250, 300.

What is the value of the empirical mean excess loss function at x = 150?

(A) 75  (B) 100  (C) 200  (D) 225  (E) 250



Use the following information for questions 6.8 and 6.9:

The following random sample has been observed:

2.0, 10.3, 4.8, 16.4, 21.6, 3.7, 21.4, 34.4

The underlying distribution function is assumed to be the following:

F(x) = 1 − e^(−x/10),  x ≥ 0.

6.8. [4B-S93:24] (2 points) Calculate the value of the mean excess loss function e(x) for x = 8.

(A) Less than 7.00
(B) At least 7.00, but less than 9.00
(C) At least 9.00, but less than 11.00
(D) At least 11.00, but less than 13.00
(E) At least 13.00

6.9. [4B-S93:25] (2 points) Calculate the value of the empirical mean excess loss function e_n(x), for x = 8.

(A) Less than 7.00
(B) At least 7.00, but less than 9.00
(C) At least 9.00, but less than 11.00
(D) At least 11.00, but less than 13.00
(E) At least 13.00

6.10. [4B-S99:10] (2 points) You are given the following:

• One hundred claims greater than 3,000 have been recorded as follows:

Interval            Number of Claims
( 3,000,  5,000]           6
( 5,000, 10,000]          29
(10,000, 25,000]          39
(25,000, ∞)               26

• Claims of 3,000 or less have not been recorded.
• Claim sizes follow a Pareto distribution with parameters α = 2 and θ = 25,000.

Determine the expected claim size for claims in the interval (25,000, ∞).

(A) 12,500  (B) 25,000  (C) 50,000  (D) 75,000  (E) 100,000

6.11. [4B-S98:3] (3 points) The random variables X and Y have joint density function

f(x, y) = e^(−2x − y/2),  0 < x < ∞, 0 < y < ∞.

Determine the mean excess loss function for the marginal distribution of X evaluated at X = 4.

(A) 1/4  (B) 1/2  (C) 1  (D) 2  (E) 4


6.12. [4B-F96:22] (2 points) The random variable X has the density function

f(x) = (1/θ) e^(−x/θ),  0 < x < ∞, θ > 0.

Determine e(θ), the mean excess loss function evaluated at θ.

(A) 1  (B) θ  (C) 1/θ  (D) θ/e  (E) e/θ

Use the following information for questions 6.13 through 6.15:

You are given the following:

• The random variable X follows a two-parameter Pareto distribution with parameters θ = 100 and α = 2.
• The mean excess loss function, e_X(k), is defined to be E[X − k | X ≥ k].

6.13. [4B-F99:25] (2 points) Determine the range of e_X(k) over its domain of [0, ∞).

(A) [0, 100]  (B) [0, ∞]  (C) 100  (D) [100, ∞)  (E) ∞

6.14. [4B-F99:26] (1 point)

Y = 1.10X

Determine the range of the function e_Y(k)/e_X(k) over its domain [0, ∞).

(A) (1, 1.10]  (B) (1, ∞)  (C) 1.10  (D) [1.10, ∞)  (E) ∞

6.15. [4B-F99:27] (2 points)

Z = min(X, 500)

Determine the range of e_Z(k) over its domain of [0, 500].

(A) [0, 150]  (B) [0, ∞)  (C) [100, 150]  (D) [100, ∞)  (E) [150, ∞)

6.16. [3-F01:35] The random variable for a loss, X, has the following characteristics:

x        F(x)    E(X ∧ x)
0        0.0         0
100      0.2        91
200      0.6       153
1000     1.0       331

Calculate the mean excess loss for a deductible of 100.

(A) 250  (B) 300  (C) 350  (D) 400  (E) 450

6.17. For an insurance coverage, you are given:

(i) The policy limit is 10,000.
(ii) The expected value of a loss before considering the policy limit is 9,000.
(iii) The probability that a loss is at least 10,000 is 0.1.
(iv) The mean excess loss at 10,000, if the policy limit is ignored, is 20,000.

Determine the average payment per loss for losses less than 10,000.



6.18. [4B-S98:25] (2 points) You are given the following:

• 100 observed claims occurring in 1995 for a group of risks have been recorded and are grouped as follows:

Interval         Number of Claims
(   0,  250)          36
[ 250,  300)           6
[ 300,  350)           3
[ 350,  400)           5
[ 400,  450)           5
[ 450,  500)           0
[ 500,  600)           5
[ 600,  700)           5
[ 700,  800)           6
[ 800,  900)           1
[ 900, 1000)           3
[1000,   ∞ )          25

• Inflation of 10% per year affects all claims uniformly from 1995 to 1998.

Using the above information, determine a range for the expected proportion of claims for this group of risks that will be greater than 500 in 1998.

(A) Between 35% and 40%
(B) Between 40% and 45%
(C) Between 45% and 50%
(D) Between 50% and 55%
(E) Between 55% and 60%

Deductibles

6.19. Losses follow a uniform distribution on [0, 50,000]. There is a deductible of 1,000 per loss. Determine the average payment per loss.

6.20. A policy covers losses subject to a franchise deductible of 500. Losses follow an exponential distribution with mean 1000. Determine the average payment per loss.

6.21. Losses follow a Pareto distribution with α = 3.5, θ = 5000. A policy covers losses subject to a 500 franchise deductible. Determine the average payment per loss.


6.22. [4B-S99:7] (2 points) You are given the following:

• Losses follow a distribution (prior to the application of any deductible) with cumulative distribution function and limited expected values as follows:

Loss Size (x)    F(x)    E[X ∧ x]
10,000           0.60      6,000
15,000           0.70      7,700
22,500           0.80      9,500
32,500           0.90     11,000
∞                1.00     20,000

• There is a deductible of 10,000 per loss and no policy limit.
• The insurer makes a payment on a loss only if the loss exceeds the deductible.

The deductible is raised so that half the number of losses exceed the new deductible compared to the old deductible of 10,000. Determine the percentage change in the expected size of a nonzero payment made by the insurer.

(A) Less than −37.5%
(B) At least −37.5%, but less than −12.5%
(C) At least −12.5%, but less than 12.5%
(D) At least 12.5%, but less than 37.5%
(E) At least 37.5%

6.23. [4B-S94:24] (3 points) You are given the following:

• X is a random variable for 1993 losses, having the density function f(x) = 0.1e^(−0.1x), x > 0.
• Inflation of 10% impacts all losses uniformly from 1993 to 1994.
• For 1994, a deductible, d, is applied to all losses.
• P is a random variable representing payments of losses truncated and shifted by the deductible amount.

Determine the value of the cumulative distribution function at P = 5, F_P(5), in 1994.

(A) 1 − e^(−0.1((5+d)/1.1))
(B) (e^(−0.1(5/1.1)) − e^(−0.1((5+d)/1.1))) / (1 − e^(−0.1(5/1.1)))
(C) 0
(D) At least 0.25 but less than 0.35
(E) At least 0.35 but less than 0.45



Use the following information for questions 6.24 and 6.25:

You are given the following:

• Losses follow a distribution (prior to the application of any deductible) with cumulative distribution function and limited expected values as follows:

Loss Size (x)    F(x)    E[X ∧ x]
10,000           0.60      6,000
15,000           0.70      7,700
22,500           0.80      9,500
∞                1.00     20,000

• There is a deductible of 15,000 per loss and no policy limit.
• The insurer makes a nonzero payment p.

6.24. [4B-F98:12] (2 points) Determine the expected value of p.

(A) Less than 15,000
(B) At least 15,000, but less than 30,000
(C) At least 30,000, but less than 45,000
(D) At least 45,000, but less than 60,000
(E) At least 60,000

6.25. [4B-F98:13] (2 points) After several years of inflation, all losses have increased in size by 50%, but the deductible has remained the same. Determine the expected value of p.

(A) Less than 15,000
(B) At least 15,000, but less than 30,000
(C) At least 30,000, but less than 45,000
(D) At least 45,000, but less than 60,000
(E) At least 60,000


6.26. [4B-S94:21] (2 points) You are given the following:

• For 1993 the amount of a single claim has the following distribution:

Amount    Probability
$1000        1/6
 2000        1/6
 3000        1/6
 4000        1/6
 5000        1/6
 6000        1/6

• An insurer pays all losses after applying a $1500 deductible to each loss.
• Inflation of 5% impacts all claims uniformly from 1993 to 1994.

Assuming no change in the deductible, what is the inflationary impact on losses paid by the insurer in 1994 as compared to the losses the insurer paid in 1993?

(A) Less than 5.5%
(B) At least 5.5%, but less than 6.5%
(C) At least 6.5%, but less than 7.5%
(D) At least 7.5%, but less than 8.5%
(E) At least 8.5%

6.27. [4B-F94:17] (2 points) You are given the following:

• Losses follow a Weibull distribution with parameters θ = 20 and τ = 1.0.
• The insurance coverage has an ordinary deductible of 10.

If the insurer makes a payment, what is the probability that an insurer's payment is less than or equal to 25?

(A) Less than 0.65
(B) At least 0.65, but less than 0.70
(C) At least 0.70, but less than 0.75
(D) At least 0.75, but less than 0.80
(E) At least 0.80

6.28. Losses follow a lognormal distribution with parameters µ = 5, σ = 2. Losses are subject to a 1000 franchise deductible. 10% inflation affects the losses. Calculate the revised franchise deductible so that the expected aggregate cost of claims after inflation with the deductible is the same as it was before inflation with the 1000 franchise deductible.

6.29. Losses follow a Pareto distribution with α = 3, θ = 5000. Insurance pays the amount of the loss minus a deductible, but not less than zero. The deductible is 100, minus 25% of the excess of the loss over 100, but not less than zero. Calculate the expected payment per payment.

6.30. X is a random variable representing loss sizes. You are given that E[X ∧ d] = 100(1 − e^(−d/100)). Loss sizes are affected by 10% inflation. Determine the average payment per loss under a policy with a 500 ordinary deductible after inflation.



6.31. [CAS3-S04:21] Auto liability losses for a group of insureds (Group R) follow a Pareto distribution with α = 2 and θ = 2,000. Losses from a second group (Group S) follow a Pareto distribution with α = 2 and θ = 3,000. Group R has an ordinary deductible of 500, while Group S has a franchise deductible of 200. Calculate the amount that the expected cost per payment for Group S exceeds that for Group R.

(A) Less than 350
(B) At least 350, but less than 650
(C) At least 650, but less than 950
(D) At least 950, but less than 1,250
(E) At least 1,250

6.32. [CAS3-S04:29] Claim sizes this year are described by a 2-parameter Pareto distribution with parameters θ = 1,500 and α = 4. What is the expected claim size per loss next year after 20% inflation and the introduction of a $100 deductible?

(A) Less than $490
(B) At least $490, but less than $500
(C) At least $500, but less than $510
(D) At least $510, but less than $520
(E) At least $520

6.33. For an automobile collision coverage, you are given:

• Loss sizes, before application of any deductible or limit, follow a distribution which is a mixture of a two-parameter Pareto distribution with parameters α = 2, θ = 1000 and a two-parameter Pareto distribution with parameters α = 3, θ = 2000.
• For coverage with an ordinary deductible of 500, the average amount paid per claim for each claim above the deductible is 1471.63.
• For coverage with an ordinary deductible of 600, the average amount paid per claim for each claim above the deductible is x.

Determine x.

6.34. [CAS3-F04:29] High-Roller Insurance Company insures the cost of injuries to the employees of ACME Dynamic Manufacturing, Inc.

• 30% of injuries are "Fatal" and the rest are "Permanent Total" (PT). There are no other injury types.
• Fatal injuries follow a loglogistic distribution with θ = 400 and γ = 2.
• PT injuries follow a loglogistic distribution with θ = 600 and γ = 2.
• There is a $750 deductible per injury.

Calculate the probability that an injury will result in a claim to High-Roller.

(A) Less than 30%
(B) At least 30%, but less than 35%
(C) At least 35%, but less than 40%
(D) At least 40%, but less than 45%
(E) At least 45%



6.35. [151-82-93:6] (2 points) A company has 50 employees whose dental expenses are mutually independent. For each employee, the company reimburses 100% of dental expenses in excess of a $100 deductible. The dental expense for each employee is distributed as follows:

Expense ($)    Probability
     0            0.20
    50            0.30
   200            0.30
   500            0.10
 1,000            0.10

Determine, by normal approximation, the 95th percentile of the cost to the company.

(A) $8,000  (B) $9,000  (C) $10,000  (D) $11,000  (E) $12,000

6.36. [CAS3-F04:25] Let X be the random variable representing aggregate losses for an insured. X follows a gamma distribution with mean of $1 million and coefficient of variation 1. An insurance policy pays for aggregate losses that exceed twice the expected value of X. Calculate the expected loss for the policy.

(A) Less than $100,000
(B) At least $100,000, but less than $200,000
(C) At least $200,000, but less than $300,000
(D) At least $300,000, but less than $400,000
(E) At least $400,000

6.37. [4B-S92:23] (2 points) You are given the following information:

• A large risk has a lognormal claim size distribution with parameters µ = 8.443 and σ = 1.239.
• The insurance agent for the risk settles all claims under 5000. (Claims of 5000 or more are settled by the insurer, not the agent.)

Determine the expected value of a claim settled by the insurance agent.

(A) Less than 500
(B) At least 500, but less than 1000
(C) At least 1000, but less than 1500
(D) At least 1500, but less than 2000
(E) At least 2000

Additional released exam questions: CAS3-S05:4,35, SOA M-S05:9,32, CAS3-F05:20, SOA M-F05:14,26, CAS3-S06:39

Solutions

6.1. From the five choices, we see that d > 3, but you can also deduce this as follows: If d < 3, then since X ≥ 3, it follows that (X − d)+ = X − d and E[(X − d)+] = E[X − d] = E[X] − d, but E[X] = 0.5(3 + 12) = 7.5, so E[X] − d > 4.5, contradicting E[(X − d)+] = 3.


It is also clear that d < 12, or else E[(X − d)+] = 0 since (X − d)+ = 0 for both possible values of X. From d > 3 it follows that when X = 3, (X − d)+ = 0, so

E[(X − d)+] = 0.5(0) + 0.5(12 − d)

Setting this equal to 3, we obtain d = 6. (D)

6.2. The probability that a claim for Risk A exceeds d is

S_A(d) = (θ/(θ + d))² = (10,000/(10,000 + d))²

The probability that a claim for Risk B exceeds d is

S_B(d) = (1/(1 + (d/θ)^γ))^α = (1/(1 + d²/20,000))² = (20,000/(20,000 + d²))²

The ratio is

r = S_A(d)/S_B(d) = (10,000/(10,000 + d))² ((20,000 + d²)/20,000)²

As d goes to infinity, the d² term will dominate, so the ratio will go to ∞. (E)

6.3. See equation (6.8). (C)

6.4. Both the standard deviation and the mean are multiplied by 1 + r, so the coefficient of variation doesn't change, and 1 is false. 2 is true, but 3 is false; the limited expected value of Z at d(1 + r) is 1 + r times the limited expected value of X at d. (A)

6.5. From formula (6.10), the mean excess loss at x = θ is (θ + θ)/(α − 1) and the mean excess loss at x = 2θ is (θ + 2θ)/(α − 1). The ratio of the latter to the former is 3θ/2θ = 3/2. (C) Why was this simple problem worth 3 points?

6.6. When α < 1, the expected value of a Pareto, and therefore the mean excess loss, is infinite. (E)

6.7. The empirical distribution assigns a probability of 1/5 to each of the five observations. Since the mean excess loss is calculated at 150, observations less than or equal to 150 are ignored. The excess loss over 150 is 50 for 200, 100 for 250, and 150 for 300.

(50 + 100 + 150)/3 = 100 (B)

6.8. The distribution of X is exponential, so the mean excess loss is the mean, 10. (C)

6.9. The empirical mean excess loss is the mean excess loss based on the empirical distribution. The empirical distribution assigns probability 1/n to each observation. The empirical mean excess loss at x may be computed by summing up all observations greater than x, dividing by the number of such observations, and then subtracting x. Thus

e_n(8) = (1/5)(10.3 + 16.4 + 21.6 + 21.4 + 34.4) − 8 = 12.82 (D)

The subscript of n is a common way to indicate an empirical function based on n observations, as we'll learn in Lesson 22.
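The empirical mean excess loss used here is easy to compute directly; a sketch reproducing Solutions 6.7 and 6.9:

```python
# Empirical mean excess loss: average of the observations exceeding x,
# minus x.

def empirical_mean_excess(sample, x):
    exceed = [v for v in sample if v > x]
    return sum(exceed) / len(exceed) - x

print(empirical_mean_excess([100, 125, 200, 250, 300], 150))  # 100.0 (Sol. 6.7)
print(empirical_mean_excess(
    [2.0, 10.3, 4.8, 16.4, 21.6, 3.7, 21.4, 34.4], 8))        # ~12.82 (Sol. 6.9)
```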


6.10. You are being asked for total loss size given that a loss is greater than 25,000. The mean excess loss e(25,000) is the excess of a loss over 25,000 given that it is greater than 25,000. So you are being asked for the mean excess loss at 25,000 plus 25,000 (the total loss size, not just the excess). For a Pareto, e(d) = (θ + d)/(α − 1), so

25,000 + e(25,000) = 25,000 + (θ + 25,000)/(α − 1) = 25,000 + (25,000 + 25,000)/1 = 75,000 (D)

6.11. The marginal distribution of X is

∫_0^∞ e^(−2x − y/2) dy = (e^(−2x))(−2e^(−y/2)) evaluated from 0 to ∞ = 2e^(−2x)

which is exponential with θ = 1/2. The mean excess loss is the mean, or 1/2. (B)

6.12. This is an exponential distribution with mean θ. By equation (6.9), e(d) = θ for any d, in particular for d = θ. (B)

6.13. By equation (6.10), the mean excess loss is an increasing linear function for a Pareto:

e_X(k) = (θ + k)/(α − 1) = 100 + k

which is 100 at k = 0 and goes to infinity as k goes to infinity. (D)

6.14. By equation (6.10), e_X(k) = 100 + k. The inflated variable Y = 1.1X is a two-parameter Pareto with parameters θ = 1.1(100) = 110 and α = 2, as we discussed in Section 2.1. Therefore e_Y(k) = 110 + k. The quotient (110 + k)/(100 + k) is monotonic, equal to 1.1 when k = 0 and decreasing asymptotically to 1 as k → ∞. (A)

6.15. This is hard! I wonder whether the exam setters expected you to really work this out, or to select the only choice of the five that could work. Notice that when k = 500, e_Z(k) = 0, and e_Z(k) is bounded (it certainly cannot go above 500), so (A) is the only choice of the five that can possibly be correct. Proving it is correct, though, requires work. We are trying to find the maximum of

e_Z(k) = (E[X ∧ 500] − E[X ∧ k]) / (1 − F_X(k))

and

E[X ∧ k] = 100 − 100²/(100 + k)
E[X ∧ 500] = 100 − 100²/600

e_Z(k) = (100²/(100 + k) − 100²/600) / (100²/(100 + k)²) = (100 + k) − (100 + k)²/600 = 500/6 + 2k/3 − k²/600

The maximum occurs at k = −(2/3)/(−2(1/600)) = (2/3)(300) = 200, and e_Z(200) = 100 + 200 − 300²/600 = 150.

6.16. We use equation (6.8). Since F(1000) = 1, E[X ∧ 1000] = E[X].

E[X] = E[X ∧ 100] + e(100)(1 − F(100))
331 = 91 + e(100)(1 − 0.2)
e(100) = 240/0.8 = 300  (B)

C/4 Study Manual—17th edition Copyright ©2014 ASM

6. DEDUCTIBLES


6.17. By equation (6.8),

E[X] = E[X ∧ 10,000] + e(10,000) Pr(X > 10,000)
9,000 = E[X ∧ 10,000] + 20,000(0.1)
E[X ∧ 10,000] = 9,000 − 2,000 = 7,000

However, by the Law of Total Probability, the limited expected value at 10,000 can be decomposed into the portion for losses below 10,000 and the portion for losses 10,000 and higher:

E[X ∧ 10,000] = Pr(X < 10,000) E[X ∧ 10,000 | X < 10,000] + Pr(X ≥ 10,000) E[X ∧ 10,000 | X ≥ 10,000]
7,000 = 0.9 E[X | X < 10,000] + 0.1(10,000)

since X ∧ 10,000 = X for X < 10,000 and 10,000 for X > 10,000. Therefore,

E[X | X < 10,000] = (7,000 − 1,000)/0.9 = 6,666⅔

6.18. 500/1.1³ = 375.66. Somewhere between 36 + 6 + 3 = 45 and 36 + 6 + 3 + 5 = 50 claims are below 375.66, out of a total of 100 claims, so between 50 and 55 percent are above. (D)

6.19.

E[X] = 25,000
E[X ∧ 1000] = (1/50,000) ∫₀^1000 x dx + ((50,000 − 1000)/50,000)(1000) = 10 + 980 = 990
E[X] − E[X ∧ 1000] = 25,000 − 990 = 24,010

6.20. Average payment per payment for an ordinary deductible is e(x) = 1000. For a franchise deductible, we add 500, making the average payment per payment 1500. The average payment per loss is 1500e^(−500/1000) = 909.80.

6.21. e(500) = 5500/2.5 = 2200 = average payment per payment for an ordinary deductible. For a franchise deductible, the average payment per payment is 2200 + 500 = 2700. The average payment per loss is

2700(1 − F(500)) = 2700(5000/5500)^3.5 = 1934.15

6.22. Under the old deductible, F(d) = F(10,000) = 0.60 and Pr(X > d) = 1 − F(10,000) = 0.40. You would like to select d′ such that Pr(X > d′) = ½(0.40) = 0.20, or F(d′) = 0.80. From the table, we see that d′ = 22,500. Under the old deductible of d = 10,000, the expected payment per payment is

(E[X] − E[X ∧ 10,000]) / (1 − F(10,000)) = (20,000 − 6,000)/0.4 = 35,000

Under the new deductible of d′ = 22,500, it is (20,000 − 9,500)/0.2 = 52,500. This is an increase of 50.0%. (E)

6.23. X follows an exponential distribution with mean 10. Let X* be the inflated variable. Then X* follows an exponential distribution with mean 1.1(10) = 11. P is the conditional random variable X* − d given that X* > d. Since X* is exponential and has no memory, the distribution of P is the same as the distribution of X*: exponential with mean 11. Then

F_P(5) = 1 − e^(−5/11) = 0.3653  (E)

6.24. The expected payment per loss is E[X] − E[X ∧ 15,000] = 20,000 − 7,700 = 12,300. p is the expected payment per payment, so we divide by 1 − F(15,000) = 0.3 and get 12,300/0.3 = 41,000. (C)


6.25.


Let X ∗  1.5X be the inflated variable. Then E[X ∗ ]  1.5 E[X]  1.5 (20,000)  30,000 E[X ∗ ∧ 15,000]  E[1.5X ∧ 15,000]  1.5 E[X ∧ 10,000]  1.5 (6,000)  9,000

1 − F ∗ (15,000)  1 − F (10,000)  1 − 0.60  0.40 E[X ∗ ] − E[X ∗ ∧ 15,000] 30,000 − 9,000   52,500 (D) 1 − F ∗ (15,000) 0.4

6.26. Let Y L be the payment per loss. Then E[Y L ]  16 (500 + 1500 + 2500 + 3500 + 4500)  2083 13 . After inflation, with Z  1.05X, the loss amounts are 1050, 2100, 3150, . . . , 6300, and if we let Z L be the payment per loss after a 1500 deductible, E[Z L ]  61 (600+1650+2700+3750+4800)  2250. The increase in expected payment per loss (the “impact”) is 2250 − 1  0.08 (D) 2083 13 6.27. A Weibull with τ  1 is an exponential with mean θ. An exponential has no memory, so the payment distribution is the same as the underlying loss distribution, which is exponential with mean 20. FY P (25)  FX (25)  1 − e −25/20  1 − e −1.25  0.713

(C)

6.28. The aggregate claim costs will be the same if the average payment per loss is the same, since the number of losses is not affected by inflation. If Y^L is the average payment per loss variable, and X is the loss variable, then for a franchise deductible of 1000,

E[Y^L] = E[X] − E[X ∧ 1000] + 1000(1 − F(1000))
= e^(5+2²/2) − e^(5+2²/2) Φ((ln 1000 − 5 − 2²)/2)
= e⁷ − e⁷ Φ(−1.05) = 935.54

Notice that the last term in the expression for E[X ∧ 1000] given in the Loss Models appendix is the same as 1000(1 − F(1000)), and therefore cancels out.

We now equate the inflated payment per loss to 935.54. Let x be the new deductible. For the inflated variable, µ = 5 + ln 1.1 and σ = 2. Notice that e^(7+ln 1.1) = 1.1e⁷.

935.54 = 1.1e⁷ − 1.1e⁷ Φ((ln x − 5 − ln 1.1 − 2²)/2)
Φ((ln x − ln 1.1 − 9)/2) = (935.54 − 1.1e⁷)/(−1.1e⁷) = 0.2245
(ln x − ln 1.1 − 9)/2 = Φ⁻¹(0.2245) = −0.76
ln x = 2(−0.76) + 9 + ln 1.1 = 7.575
x = e^7.575 = 1949

6.29. This is an example of a disappearing deductible. You must understand how much is paid for a loss:

• If the loss is 100 or less, nothing is paid, since the deductible is 100.

• If the loss is 500 or more, the entire loss is paid, since the deductible is 0. (100 − 0.25(500 − 100) = 0)

• For losses x in between 100 and 500, the deductible is 100 − 0.25(x − 100), or 125 − 0.25x. This is a linear function equal to 100 at 100 and 0 at 500. Hence the amount paid is x − (125 − 0.25x) = 1.25x − 125.

Thus the amount the company pays for a loss is:

• 0% of the first 100,

• 125% of the next 400,

• 100% of the excess over 500.

In effect, the company pays 1.25 times the part of the loss between 100 and 500, plus the part of the loss above 500. In other words, if the loss variable is X and the payment per loss variable Y^L,

Y^L = 1.25(X ∧ 500 − X ∧ 100) + (X − X ∧ 500) = X + 0.25(X ∧ 500) − 1.25(X ∧ 100)

Therefore

E[Y^L] = E[X] + 0.25 E[X ∧ 500] − 1.25 E[X ∧ 100]
= 2500 + 0.25(2500)(1 − (50/55)²) − 1.25(2500)(1 − (50/51)²)
= 2500 + 108.47 − 121.35 = 2487.12

To get the payment per payment, we divide by S(100):

S(100) = (5000/5100)³ = 0.942322
2487.12/0.942322 = 2639.36

6.30. We use E[1.1X ∧ d] = 1.1 E[X ∧ d/1.1].

1.1 E[X] − 1.1 E[X ∧ 500/1.1] = 110 − 110(1 − e^(−500/110)) = 110e^(−500/110) = 1.16769

You can also recognize the distribution as exponential and use the tables.

6.31. For Group R, we want e(500). Using formula (6.10), this is (2000 + 500)/1 = 2500. For Group S, we want 200 + e(200) = 200 + (3000 + 200)/1 = 3400. The difference is 3400 − 2500 = 900. (C)

6.32. The new θ is 1.2(1500) = 1800. Using the tables,

E[X ∧ 100] = (1800/3)(1 − (1800/1900)³) = 89.8382
E[X] = 1800/3 = 600
E[(X − 100)₊] = 600 − 89.8382 = 510.1618  (D)


6.33. The weighted average of the payment per payment is not the payment per payment for the mixture! What is true is that the weighted average of the payment per loss is the payment per loss of the mixture, and the weighted average of the distribution function is the distribution function of the mixture. We must compute the average payment per payment of the mixture as the quotient of the average payment per loss over the complement of the distribution function.

We use the fact that for a Pareto, e(d) = (θ + d)/(α − 1), where e(d) is the average payment per payment with a deductible d. Therefore, the average payment per loss with a deductible d is

e(d)(1 − F(d)) = ((θ + d)/(α − 1)) (θ/(θ + d))^α

Let w be the weight on the first distribution. We have that the payment per loss with d = 500 for the mixture is

w((1000 + 500)/(2 − 1))(1000/1500)² + (1 − w)((2000 + 500)/(3 − 1))(2000/2500)³ = 666⅔w + 640(1 − w)

and 1 − F(d) for the mixture is

w(1000/1500)² + (1 − w)(2000/2500)³ = (4/9)w + 0.512(1 − w)

The quotient of the first of these over the second is the payment per payment for the mixture. Let's equate this to 1471.63 and solve for w.

666⅔w + 640(1 − w) = 1471.63((4/9)w + 0.512(1 − w))
26⅔w + 640 = 1471.63((4/9 − 0.512)w + 0.512) = −99.41678w + 753.47456
126.0834w = 113.4746
w = 113.4746/126.0834 = 0.9

Let's calculate the quotient with a deductible of 600:

(1600(5/8)²(0.9) + 1300(10/13)³(0.1)) / ((5/8)²(0.9) + (10/13)³(0.1)) = 1565.61

6.34. This is a mixture distribution, and we want Pr(X > 750). Probabilities are weighted averages of the probabilities of the component distributions. For θ = 400,

Pr(X₁ > 750) = 1 − (750/400)² / (1 + (750/400)²) = 0.221453

For θ = 600,

Pr(X₂ > 750) = 1 − (750/600)² / (1 + (750/600)²) = 0.390244

Then 0.3(0.221453) + 0.7(0.390244) = 0.339607. (B)

6.35. The payments after the deductible have the following distribution:

    Payment   Probability
       0         0.50
     100         0.30
     400         0.10
     900         0.10

Let Y be the payment for one employee.

E[Y] = 0.30(100) + 0.10(400) + 0.10(900) = 160
E[Y²] = 0.30(100²) + 0.10(400²) + 0.10(900²) = 100,000
Var(Y) = 100,000 − 160² = 74,400

The 95th percentile is 50(160) + 1.645√(50(74,400)) = 11,173. (D)
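Solution 6.33 can be verified numerically; this sketch recomputes the mixture's payment per payment at deductibles 500 and 600 for the weight w = 0.9 found above:

```python
# 90%/10% mixture of Pareto(alpha=2, theta=1000) and Pareto(alpha=3, theta=2000).
def pareto_ppl(alpha, theta, d):
    """Expected payment per loss with deductible d: e(d) * S(d)."""
    return ((theta + d) / (alpha - 1)) * (theta / (theta + d)) ** alpha

def pareto_S(alpha, theta, d):
    """Pareto survival function at d."""
    return (theta / (theta + d)) ** alpha

def mixture_ppp(w, d):
    """Payment per payment of the mixture = (per-loss mean) / (survival)."""
    num = w * pareto_ppl(2, 1000, d) + (1 - w) * pareto_ppl(3, 2000, d)
    den = w * pareto_S(2, 1000, d) + (1 - w) * pareto_S(3, 2000, d)
    return num / den

print(round(mixture_ppp(0.9, 500), 2))  # close to 1471.63
print(round(mixture_ppp(0.9, 600), 2))  # close to 1565.61
```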

6.36. For a gamma distribution, the coefficient of variation is √(αθ²)/(αθ) = 1/√α. Thus α = 1 and θ = 1,000,000. The distribution is exponential since α = 1. For an exponential, e_X(x) = E[X] for any x, so e(2,000,000) = 1,000,000. Then

E[(X − 2,000,000)₊] = 1,000,000(1 − F(2,000,000)) = 1,000,000e⁻² = 135,335.28  (B)

6.37. We want the expected payment per payment. Notice that the payment per loss is x if x < 5000 but 0 if x ≥ 5000. What you have in your tables, E[X ∧ 5000], represents the expectation of a payment per loss of x if x < 5000 and 5000 if x ≥ 5000. The expectation we need is therefore E[X ∧ 5000] minus 5000 times the probability that the loss is greater than 5000, or E[X ∧ 5000] − 5000(1 − F(5000)), which is

e^(8.443+1.239²/2) Φ((ln 5000 − 8.443 − 1.239²)/1.239) = 10,000Φ(−1.18) = 1190

The expected payment per payment is 1190 divided by F(5000) = Φ((ln 5000 − 8.443)/1.239) = Φ(0.06) = 0.5239. This quotient is 1190/0.5239 = 2271. (E)

Quiz Solutions

6-1. The expected value of losses, X, is 0.75(1000) + 0.25(2000) = 1250. The limited expected value at 500 is a weighted average of E[X ∧ 500] for each coverage.

E[X ∧ 500] = 0.75(1000)(1 − e^(−500/1000)) + 0.25(2000)(1 − e^(−500/2000)) = 0.75(393.47) + 0.25(442.40) = 405.70

The survival function at 500 is

S(500) = 0.75e^(−500/1000) + 0.25e^(−500/2000) = 0.649598

Therefore, the expected payment per payment is (1250 − 405.70)/0.649598 = 1299.72.
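A short sketch confirming Quiz 6-1 numerically:

```python
from math import exp

# 75%/25% mixture of exponentials with means 1000 and 2000; deductible 500.
weights = [(0.75, 1000), (0.25, 2000)]
EX = sum(w * th for w, th in weights)                            # mean loss
EX500 = sum(w * th * (1 - exp(-500 / th)) for w, th in weights)  # E[X ∧ 500]
S500 = sum(w * exp(-500 / th) for w, th in weights)              # survival at 500
ppp = (EX - EX500) / S500                                        # payment per payment
print(round(ppp, 2))  # 1299.72
```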

Lesson 7

Loss Elimination Ratio

Reading: Loss Models Fourth Edition 8.3

The Loss Elimination Ratio is defined as the proportion of the expected loss which the insurer doesn't pay as a result of an ordinary deductible. In other words, for an ordinary deductible of d, it is

LER(d) = E[X ∧ d] / E[X]

The textbook defines loss elimination ratio only for ordinary deductibles.

Example 7A You are given the following information for an auto collision coverage:

    Ordinary     Average payment    Loss elimination
    deductible   per payment        ratio
        0             800                0
     1000             720                0.8

A new version of the coverage with a 1000 franchise deductible is introduced. Determine the average payment per loss for this coverage.

Answer: We have two ways to calculate the average payment per loss with the ordinary deductible of 1000. One of them is that the average payment per loss without the deductible is given as 800, and the loss elimination ratio of the 1000 deductible is 0.8, so the average payment per loss with the 1000 ordinary deductible is 800(1 − 0.8) = 160. The other way is that the average payment per payment is 720 and the probability of a payment is 1 − F(1000), so the average payment per loss is 720(1 − F(1000)). Equating these two:

160 = 720(1 − F(1000))
1 − F(1000) = 2/9

Then the average payment per loss for the franchise deductible is 160 plus the additional payment of 1000 whenever the loss is above the deductible, or

160 + 1000(1 − F(1000)) = 160 + 1000(2/9) = 382 2/9

Alternatively, the average payment per payment under the franchise deductible is 1000 more than it would be with an ordinary deductible, or 720 + 1000 = 1720. The average payment per loss is the average payment per payment times the probability that a claim will be above 1000, 1 − F(1000) = 2/9, or 1720(2/9), which comes out to 382 2/9, the same as above.

The next example combines inflation with LER.

Example 7B An insurance coverage has an ordinary deductible of 500. Losses follow a two-parameter Pareto distribution with α = 3, θ = 1000.

Calculate the reduction in the loss elimination ratio after 10% inflation as compared to the original loss elimination ratio.


7. LOSS ELIMINATION RATIO


Answer: The loss elimination ratio is E[X ∧ 500]/E[X]. Checking the tables, we see that the formula for the limited expected value is

E[X ∧ d] = E[X](1 − (θ/(θ + d))^(α−1))

so the loss elimination ratio in our case is 1 − (θ/(θ + 500))². For the original variable, this is

1 − (1000/1500)² = 5/9

For the inflated variable with θ = 1000(1.1) = 1100, this is

1 − (1100/1600)² = 135/256

The reduction is (5/9) − (135/256) = 0.02821.

For exponentials and Paretos, the formula for E[X ∧ d] includes E[X] as a factor, which cancels out when calculating the LER, so

LER(d) = 1 − e^(−d/θ)              for an exponential
LER(d) = 1 − (θ/(d + θ))^(α−1)     for a Pareto with α > 1

You can also write a decent formula for the LER of a single-parameter Pareto:

LER(d) = 1 − (θ/d)^(α−1)/α         for α > 1, d ≥ θ

but for a lognormal, the formula isn't so good.
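The three closed-form LER formulas above can be written as small helper functions; the Pareto helper reproduces Example 7B (a sketch, not part of the manual's tables):

```python
from math import exp

def ler_exponential(d, theta):
    return 1 - exp(-d / theta)

def ler_pareto(d, theta, alpha):       # two-parameter Pareto, alpha > 1
    return 1 - (theta / (d + theta)) ** (alpha - 1)

def ler_sp_pareto(d, theta, alpha):    # single-parameter Pareto, d >= theta
    return 1 - (theta / d) ** (alpha - 1) / alpha

# Example 7B: reduction in LER after 10% inflation (theta goes 1000 -> 1100)
reduction = ler_pareto(500, 1000, 3) - ler_pareto(500, 1100, 3)
print(round(reduction, 5))  # 0.02821
```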

Exercises

7.1. [4B-S92:25] (2 points) You are given the following information:

• Deductible: 250
• Expected size of claim with no deductible: 2500
• Probability of a loss exceeding deductible: 0.95
• Mean excess loss of the deductible: 2375

Determine the loss elimination ratio.

(A) Less than 0.035
(B) At least 0.035, but less than 0.070
(C) At least 0.070, but less than 0.105
(D) At least 0.105, but less than 0.140
(E) At least 0.140

EXERCISES FOR LESSON 7


7.2. [4B-F92:18] (2 points) You are given the following information:

• Deductible, d: 500
• Expected value limited to d, E[X ∧ d]: 465
• Probability of a loss exceeding deductible, 1 − F(d): 0.86
• Mean excess loss of the deductible, e(d): 5250

Determine the loss elimination ratio.

(A) Less than 0.035
(B) At least 0.035, but less than 0.055
(C) At least 0.055, but less than 0.075
(D) At least 0.075, but less than 0.095
(E) At least 0.095

7.3. [4B-F99:1] (2 points) You are given the following:

• Losses follow a distribution (prior to the application of any deductible) with mean 2,000.
• The loss elimination ratio (LER) at a deductible of 1,000 is 0.30.
• 60 percent of the losses (in number) are less than the deductible of 1,000.

Determine the average size of a loss that is less than the deductible of 1,000.

(A) Less than 350
(B) At least 350, but less than 550
(C) At least 550, but less than 750
(D) At least 750, but less than 950
(E) At least 950

7.4. You are given:

(i) The average loss below the deductible is 500.
(ii) 60% of the number of losses is below the deductible.
(iii) The loss elimination ratio at the deductible is 31%.
(iv) Mean loss is 2000.

Determine the deductible.

7.5. [4B-S94:10] (2 points) You are given the following:

• The amount of a single claim has a Pareto distribution with parameters α = 2 and θ = 2000.

Calculate the Loss Elimination Ratio (LER) for a $500 deductible.

(A) Less than 0.18
(B) At least 0.18, but less than 0.23
(C) At least 0.23, but less than 0.28
(D) At least 0.28, but less than 0.33
(E) At least 0.33


7.6. [4B-S93:28] (3 points) You are given the following:

• The underlying loss distribution function for a certain line of business in 1991 is F(x) = 1 − x⁻⁵, x > 1.
• From 1991 to 1992, 10% inflation impacts all claims uniformly.

Determine the 1992 Loss Elimination Ratio for a deductible of 1.2.

(A) Less than 0.850
(B) At least 0.850, but less than 0.870
(C) At least 0.870, but less than 0.890
(D) At least 0.890, but less than 0.910
(E) At least 0.910

Use the following information for questions 7.7 and 7.8:

You are given the following:

• Losses follow a Pareto distribution with parameters θ = k and α = 2, where k is a constant.
• There is a deductible of 2k.

7.7. [4B-F96:13] (2 points) What is the loss elimination ratio (LER)? (A) 1/3

(B) 1/2

(C) 2/3

(D) 4/5

(E) 1

7.8. [4B-F96:14] (2 points) Over a period of time, inflation has uniformly affected all losses, causing them to double, but the deductible remains the same. Calculate the new loss elimination ratio (LER). (A) 1/6

(B) 1/3

(C) 2/5

(D) 1/2

(E) 2/3

7.9. Losses follow a single-parameter Pareto distribution with α = 3, θ = 500.

Determine the deductible d needed to achieve a loss elimination ratio of 20%.

7.10. Losses follow a single-parameter Pareto distribution with α = 3, θ = 500.

Determine the deductible d needed to achieve a loss elimination ratio of 80%.

7.11. [4B-F93:27] (3 points) You are given the following:

• Losses for 1991 are uniformly distributed on [0, 10,000].
• Inflation of 5% impacts all losses uniformly from 1991 to 1992 and from 1992 to 1993 (5% each year).

Determine the 1993 Loss Elimination Ratio for a deductible of $500.

(A) Less than 0.085
(B) At least 0.085, but less than 0.090
(C) At least 0.090, but less than 0.095
(D) At least 0.095, but less than 0.100
(E) At least 0.100


Use the following information for questions 7.12 and 7.13:

You are given the following:

• Losses follow a distribution with density function f(x) = (1/1000)e^(−x/1000), 0 < x < ∞.
• There is a deductible of 500.
• 10 losses are expected to exceed the deductible each year.

7.12. [4B-S97:19] (3 points) Determine the amount to which the deductible would have to be raised to double the loss elimination ratio (LER).

(A) Less than 550
(B) At least 550, but less than 850
(C) At least 850, but less than 1150
(D) At least 1150, but less than 1450
(E) At least 1450

7.13. [4B-S97:20] (2 points) Determine the expected number of losses that would exceed the deductible each year if all loss amounts doubled, but the deductible remained at 500.

(A) Less than 10
(B) At least 10, but less than 12
(C) At least 12, but less than 14
(D) At least 14, but less than 16
(E) At least 16

Use the following information for questions 7.14 and 7.15:

Losses follow a lognormal distribution with parameters µ = 6.9078 and σ = 1.5174.

7.14. [4B-S99:20] (2 points) Determine the ratio of the loss elimination ratio (LER) at 10,000 to the loss elimination ratio (LER) at 1,000.

(A) Less than 2
(B) At least 2, but less than 4
(C) At least 4, but less than 6
(D) At least 6, but less than 8
(E) At least 8

7.15. [4B-S99:21] (2 points) Determine the percentage increase in the number of losses that exceed 1,000 that would result if all losses increased in value by 10%.

(A) Less than 2%
(B) At least 2%, but less than 4%
(C) At least 4%, but less than 6%
(D) At least 6%, but less than 8%
(E) At least 8%


Use the following information for questions 7.16 and 7.17:

You are given the following:

• Losses follow a lognormal distribution with parameters µ = 7 and σ = 2.
• There is a deductible of 2,000.
• 10 losses are expected each year.
• The number of losses and the individual loss amounts are independent.

7.16. [4B-S96:9 and 1999 C3 Sample:17] (2 points) Determine the loss elimination ratio (LER) for the deductible.

(A) Less than 0.10
(B) At least 0.10, but less than 0.15
(C) At least 0.15, but less than 0.20
(D) At least 0.20, but less than 0.25
(E) At least 0.25

7.17. [4B-S96:10 and 1999 C3 Sample:18] (2 points) Determine the expected number of annual losses that exceed the deductible if all loss amounts are increased uniformly by 20%, but the deductible remained the same.

(A) Less than 4.0
(B) At least 4.0, but less than 5.0
(C) At least 5.0, but less than 6.0
(D) At least 6.0, but less than 7.0
(E) At least 7.0

7.18. [4B-S95:6] (3 points) You are given the following:

• For 1994, loss sizes follow a uniform distribution on [0, 2500].
• In 1994, the insurer pays 100% of all losses.
• Inflation of 3.0% impacts all losses uniformly from 1994 to 1995.
• In 1995, a deductible of 100 is applied to all losses.

Determine the Loss Elimination Ratio (L.E.R.) of the deductible of 100 on 1995 losses.

(A) Less than 7.3%
(B) At least 7.3%, but less than 7.5%
(C) At least 7.5%, but less than 7.7%
(D) At least 7.7%, but less than 7.9%
(E) At least 7.9%


7.19.


Losses in 2008 follow a distribution with density function

f(x) = (1/500,000)(1 − x/1,000,000),  0 ≤ x ≤ 1,000,000.

Reinsurance pays the excess of each loss over 100,000. 10% inflation impacts all losses in 2009. Let LER2008 be the reinsurance loss elimination ratio in 2008, and LER2009 the reinsurance loss elimination ratio in 2009.

Determine LER2008 − LER2009.

7.20.

[SOA3-F03:29] The graph of the density function for losses is:

[Figure: f(x) = 0.010 for 0 ≤ x ≤ 80, then decreasing linearly to 0 at x = 120; horizontal axis labeled "Loss amount, x", vertical axis f(x) running from 0.000 to 0.012.]

Calculate the loss elimination ratio for an ordinary deductible of 20.

(A) 0.20  (B) 0.24  (C) 0.28  (D) 0.32  (E) 0.36

7.21. [SOA3-F03:34] You are given:

(i) Losses follow an exponential distribution with the same mean in all years.
(ii) The loss elimination ratio this year is 70%.
(iii) The ordinary deductible for the coming year is 4/3 of the current deductible.

Compute the loss elimination ratio for the coming year.

(A) 70%  (B) 75%  (C) 80%  (D) 85%  (E) 90%

7.22. [CAS3-S04:20] Losses have an exponential distribution with a mean of 1,000. There is a deductible of 500. The insurer wants to double the loss elimination ratio. Determine the new deductible that achieves this.

(A) 219  (B) 693  (C) 1,046  (D) 1,193  (E) 1,546

7.23. [SOA3-F04:18] Losses in 2003 follow a two-parameter Pareto distribution with α = 2 and θ = 5. Losses in 2004 are uniformly 20% higher than in 2003. An insurance covers each loss subject to an ordinary deductible of 10. Calculate the Loss Elimination Ratio in 2004.

(A) 5/9  (B) 5/8  (C) 2/3  (D) 3/4  (E) 4/5


Use the following information for questions 7.24 and 7.25:

Losses have the following distribution:

F(x) = 1 − 0.4e^(−x/20) − 0.6e^(−x/2000)

Insurance coverage is subject to an ordinary deductible of 100.

7.24. Calculate the average payment per payment.

7.25. Losses increase uniformly by 20% inflation.

Calculate the Loss Elimination Ratio after inflation.

7.26. For an insurance coverage, losses (before application of any deductible) follow a 2-parameter Pareto with parameters α = 3 and θ = 5000. The coverage is subject to a deductible of 500.

Calculate the deductible needed to double the loss elimination ratio.

Additional released exam questions: CAS3-F05:33, SOA M-F05:28

Solutions

7.1. We'll use formula (6.8) to evaluate E[X ∧ 250].

LER(d) = E[X ∧ d]/E[X] = (E[X] − e(d)(1 − F(d)))/E[X] = 1 − (2375)(0.95)/2500 = 0.0975  (C)

7.2. We'll use formula (6.8) to evaluate E[X].

E[X] = E[X ∧ d] + e(d)(1 − F(d)) = 465 + 5250(0.86) = 4980
LER(d) = 465/4980 = 0.0934  (D)

7.3. E[X ∧ 1000] = 0.3(2000) = 600. Let x be the answer. Then 600 = 0.6x + 0.4(1000), so x = 333⅓. (A)

7.4. We are given that 0.31 = LER = E[X ∧ d]/E[X] and E[X] = 2000, so E[X ∧ d] = 620. On the other hand, we can decompose E[X ∧ d] into

E[X ∧ d] = E[X ∧ d | X < d] Pr(X < d) + E[X ∧ d | X ≥ d] Pr(X ≥ d)
= (average loss below d) Pr(X < d) + d Pr(X ≥ d)
= 500(0.6) + d(0.4)

from which it follows that 620 = 300 + 0.4d, so d = 800.

7.5. E[X] = 2000. For E[X ∧ 500] you can use the formula:

E[X ∧ 500] = 2000(1 − 2000/2500) = 400
LER = 400/2000 = 0.2  (B)

EXERCISE SOLUTIONS FOR LESSON 7

If you wanted to back it out of e_X(500), the calculations would go:

e(500) = (2000 + 500)/1 = 2500
1 − F(500) = (2000/2500)² = 0.64
E[X ∧ 500] = E[X] − e(500)(1 − F(500)) = 2000 − 2500(0.64) = 400.

7.6. Losses X follow a single-parameter Pareto with θ = 1, α = 5. Let Z = 1.1X. The new θ for Z is 1.1. The LER is the quotient of E[Z ∧ 1.2] over E[Z], and using the tables:

E[Z] = αθ/(α − 1)
E[Z ∧ d] = αθ/(α − 1) − θ^α/((α − 1)d^(α−1)),  d ≥ θ

LER(d) = 1 − (θ^α/((α − 1)d^(α−1))) / (αθ/(α − 1)) = 1 − θ^(α−1)/(αd^(α−1))

So with θ = 1.1, α = 5,

LER(1.2) = 1 − 1.1⁴/(5 × 1.2⁴) = 0.8588  (B)

Of course, you could also work out the integrals for E[Z] and E[Z ∧ 1.2] using the survival function, and this was probably expected by the exam setters, since they gave 3 points for this problem. Then you would get, since the survival function is 1 below 1.1 (Pr(X < 1.1) = 0),

E[Z] = ∫₀^∞ S(x) dx = ∫₀^1.1 1 dx + ∫_1.1^∞ (1.1/x)⁵ dx = 1.1 + 1.1/4 = 1.375

E[Z ∧ 1.2] = ∫₀^1.2 S(x) dx = ∫₀^1.1 1 dx + ∫_1.1^1.2 (1.1/x)⁵ dx = 1.1 + (1.1⁵/4)(1/1.1⁴ − 1/1.2⁴) = 1.180832

LER = 1.180832/1.375 = 0.8588
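Solution 7.6 can be cross-checked by brute-force numeric integration of the survival function of Z = 1.1X (a sketch; the upper limit 50 stands in for infinity, since the tail beyond it is negligible):

```python
def S(x):
    """Survival function of Z = 1.1X, X a single-parameter Pareto(alpha=5, theta=1)."""
    return 1.0 if x < 1.1 else (1.1 / x) ** 5

def integral(f, a, b, n=200000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

EZ = integral(S, 0, 50)        # approximates E[Z] = 1.375
EZ12 = integral(S, 0, 1.2)     # approximates E[Z ∧ 1.2] = 1.180832
print(round(EZ12 / EZ, 4))  # ~0.8588
```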


7.7. Using the formulas in the tables:

E[X] = k
E[X ∧ 2k] = k(1 − k/3k) = 2k/3
LER = (2k/3)/k = 2/3  (C)

7.8. Double the scale parameter, so the new θ = 2k.

E[X] = 2k
E[X ∧ 2k] = 2k(1 − 2k/4k) = k
LER = k/2k = 1/2  (D)

7.9. Let X be loss size. We calculate the expected value of the loss, E[X]. We then calculate the deductible d such that E[X ∧ d] = 0.2 E[X]; this is the deductible which achieves the required loss elimination ratio E[X ∧ d]/E[X] = 0.2. For a single-parameter Pareto,

E[X] = αθ/(α − 1) = 3(500)/2 = 750

The probability that a single-parameter Pareto loss is less than θ is 0. This is more or less the same as saying that every loss is greater than 500. (While it is possible for a loss to be less than 500 with probability 0, events with probability 0 do not affect expected values.) Therefore, if we set d < 500, E[X ∧ d] = E[min(X, d)] = d, because the minimum of X and d is then the constant d. (Every loss is greater than 500, and therefore certainly greater than d.) To achieve E[X ∧ d] = 0.2(750) = 150, we set d = 150.

If you tried using the formula for E[X ∧ d] in the tables to back out d, you would obtain a d less than θ = 500. But the formula in the tables only works for d ≥ θ, so the d you would get this way would not be correct.

7.10.

As in the previous exercise,

E[X] = αθ/(α − 1) = 3(500)/2 = 750

We can't use the method of the last exercise, though, since 0.8(750) > θ, so we use the formula from the tables:

E[X ∧ d] = αθ/(α − 1) − θ^α/((α − 1)d^(α−1)) = 750 − 500³/(2d²)
LER = 1 − 500³/((2d²)(750)) = 0.8
500³/(1500d²) = 0.2
500²/(3d²) = 0.2
500²/d² = 0.6
d = 500/√0.6 = 645.50

7.11.

For the inflated variable X′ in terms of the original variable X,

E[X′] = 1.05² E[X] = 1.05²(5000)

We can evaluate E[X′ ∧ 500] by conditioning on X′ < 500. If X′ is greater, X′ ∧ 500 = 500; if it's less, by uniformity it is 250 on the average. So

E[X′ ∧ 500] = 250 Pr(X′ ≤ 500) + 500 Pr(X′ ≥ 500) = (250(500) + 500(11,025 − 500)) / (10,000(1.05²)) = 5,387,500/(10,000(1.05²))

The quotient is

LER_X′(500) = E[X′ ∧ 500]/E[X′] = 5,387,500/(50,000,000(1.05⁴)) = 0.088646  (B)

7.12.

Losses are exponential. Let X be loss size. The formula for the loss elimination ratio is

LER(d) = E[X ∧ d]/E[X] = θ(1 − e^(−d/θ))/θ = 1 − e^(−d/θ) = 1 − e^(−d/1000)

For a deductible of 500,

LER(500) = 1 − e^(−500/1000) = 1 − e^(−1/2) = 0.39347

We must compute d such that LER(d) = 2(0.39347) = 0.78694.

1 − e^(−d/1000) = 0.78694
e^(−d/1000) = 1 − 0.78694 = 0.21306
d = −1000 ln 0.21306 = 1546.2  (E)

(E)

Note: the question is asking for the amount to which the deductible is raised; this means the amount of the new deductible. If the question had asked "by how much is the deductible raised", then the answer would be the difference between the new deductible and the old deductible.

7.13. For an exponential, Pr(X > x) = e^(−x/θ). Since there are 10 expected losses above the deductible of 500, the total expected number of losses is 10/Pr(X > 500) = 10/e^(−500/1000) = 10e^(1/2). When losses are doubled, θ is doubled and becomes 2000. The expected number of losses above the deductible of 500 is then

10e^(1/2) Pr(X > 500) = 10e^(1/2)e^(−500/2000) = 10e^(1/4) = 12.84  (C)

7.14. By definition, the loss elimination ratio is E[X ∧ d]/E[X], so the ratio of loss elimination ratios is the ratio of E[X ∧ d]'s. This exercise is asking for

E[X ∧ 10,000] / E[X ∧ 1000]
= (exp(6.9078 + 1.5174²/2) Φ((ln 10,000 − 6.9078 − 1.5174²)/1.5174) + 10,000(1 − F(10,000)))
/ (exp(6.9078 + 1.5174²/2) Φ((ln 1000 − 6.9078 − 1.5174²)/1.5174) + 1000(1 − F(1000)))  (*)

Let's evaluate the normal distribution functions we need.

Φ((ln 10,000 − 6.9078 − 1.5174²)/1.5174) = Φ(0) = 0.5
1 − F(10,000) = 1 − Φ((ln 10,000 − 6.9078)/1.5174) = 1 − Φ(1.52) = 0.0643
Φ((ln 1000 − 6.9078 − 1.5174²)/1.5174) = Φ(−1.52) = 0.0643
1 − F(1000) = 1 − Φ((ln 1000 − 6.9078)/1.5174) = 1 − Φ(0) = 0.5

Plugging these into (*), and also plugging in exp(6.9078 + 1.5174²/2) = 3162.29, we have

E[X ∧ 10,000]/E[X ∧ 1000] = (3162.29(0.5) + 10,000(0.0643)) / (3162.29(0.0643) + 1000(0.5)) = 2224.15/703.34 = 3.16  (B)

7.15. Let X be the original loss variable and let Z = 1.1X be the inflated loss variable.

1 − F_X(1000) = 0.5, from the previous exercise
1 − F_Z(1000) = 1 − Φ((ln 1000 − 6.9078 − ln 1.1)/1.5174) = 1 − Φ(−0.06) = 0.5239
0.5239/0.5 − 1 = 0.0478  (C)

7.16.

E[X] = exp(7 + 2²/2) = e⁹ = 8103.08
E[X ∧ 2000] = e⁹ Φ((ln 2000 − 7 − 4)/2) + 2000(1 − Φ((ln 2000 − 7)/2))
= e⁹ Φ(−1.7) + 2000(1 − Φ(0.3))
= e⁹(0.0446) + 2000(0.3821) = 1125.60
LER = 1125.60/8103.08 = 0.1389  (B)

7.17. The expected number of annual losses that exceed the deductible is the product of the expected number of losses and the probability that a single loss exceeds the deductible. (For example, if the probability that a loss exceeds the deductible is 0.2 and there are 5 losses, the expected number of losses exceeding the deductible is 5(0.2) = 1.) In this question, the expected number of losses is 10 and the probability of exceeding the deductible of 2000 is 1 − F(2000), where F is the distribution function of the inflated variable. For a lognormal distribution, inflating a variable by 20% is achieved by adding ln 1.2 to µ and not changing σ, as discussed in Section 2.1, page 29. Therefore, the expected number of losses above the deductible is

10(1 − F(2000)) = 10(1 − Φ((ln 2000 − 7 − ln 1.2)/2)) = 10(1 − Φ(0.21)) = 4.2  (B)

7.18. The inflated variable X is uniform on [0, 2575], so E[X] = 2575/2. To calculate E[X ∧ 100] for a uniform random variable, treat it as a mixture: the probability that it is under 100 times the midpoint 50, plus the probability that it is over 100 times 100.

E[X ∧ 100] = (100/2575)(50) + (2475/2575)(100) = 98.058

The loss elimination ratio is

LER_X(100) = 98.058/(2575/2) = 0.0762  (C)

7.19. Let's substitute y = x/1,000,000. Then f_Y(y) = 2(1 − y), 0 ≤ y ≤ 1, a beta distribution with a = 1, b = 2, so E[Y] = a/(a + b) = 1/3. (You can also calculate this from basic principles.)

E[Y ∧ 0.1] = ∫₀^0.1 2y(1 − y) dy + 0.1(1 − F(0.1))

∫₀^0.1 2y(1 − y) dy = [y² − (2/3)y³]₀^0.1 = 0.01 − (2/3)(0.001) = 0.028/3 = 28/3000

F(0.1) = ∫₀^0.1 2(1 − y) dy = [2y − y²]₀^0.1 = 0.19

E[Y ∧ 0.1] = 28/3000 + 0.1(1 − 0.19) = 271/3000

LER2008 = (271/3000)/(1/3) = 0.271

After inflation, with Z = 1.1Y,

E[Z] = 1.1 E[Y] = 11/30
E[Z ∧ 0.1] = 1.1 E[Y ∧ 0.1/1.1]

∫₀^(1/11) 2y(1 − y) dy = [y² − (2/3)y³]₀^(1/11) = 1/121 − 2/(3 · 1331) = 31/3993

F(1/11) = ∫₀^(1/11) 2(1 − y) dy = [2y − y²]₀^(1/11) = 2/11 − 1/121 = 21/121

E[Z ∧ 0.1] = 1.1(31/3993 + (1/11)(1 − 21/121)) = 31/3630 + 300/3630 = 331/3630

LER2009 = (331/3630)/(11/30) = 331/1331

LER_Y(0.1) = LER_X(100,000), because multiplying a random variable by a constant and multiplying the deductible by the same constant does not affect the LER. So the final answer is 0.271 − 331/1331 = 0.02231.

7. LOSS ELIMINATION RATIO


7.20. f(x) = 0.01 for 0 ≤ x ≤ 80. The line from 80 to 120 has slope −0.01/40 and equals 0 at 120, so an equation for it is 0.01(120 − x)/40. If you have good intuition, you can calculate E[X] using

E[X] = Pr(X ≤ 80) E[X | X ≤ 80] + Pr(X > 80) E[X | X > 80]

X is uniform below 80, so its expected value given that it is below 80 is 40. Given that X > 80, X has a beta distribution with a = 1, b = 2, θ = 40, shifted 80; recall that such a beta distribution is of the form c(40 − x), and the graph is a decreasing line, so it fits this description. Such a beta has mean 80 + 40/3. Pr(X ≤ 80) is the area of the graph up to 80, or 80(0.01) = 0.8. So

E[X] = 0.8(40) + 0.2(80 + 40/3) = 50 2/3

If you couldn't do it that way, the straightforward way of calculating E[X] is the definition:

E[X] = ∫₀^120 x f(x) dx
     = ∫₀^80 0.01x dx + ∫_80^120 [0.01(120 − x)/40] x dx
     = 0.01(80²/2) + (0.01/40)[60x² − x³/3] evaluated from 80 to 120
     = 32 + (0.01/40)[60(120² − 80²) − (120³ − 80³)/3]
     = 32 + 18 2/3 = 50 2/3

The limited expected value at the deductible of 20 is (since X is uniform on [0, 20]) 10 times Pr(X ≤ 20) plus 20 times Pr(X > 20), or

E[X ∧ 20] = 10(0.2) + 20(0.8) = 18

So the LER is 18/(50 2/3) = 0.355263. (E)

7.21. For an exponential, the loss elimination ratio is

E[X ∧ d]/E[X] = 1 − e^{−d/θ}

We have

e^{−d/θ} = 0.3
d = −θ ln 0.3

Using (4/3)d as the deductible leads to an LER of 1 − e^{(4/3) ln 0.3} = 1 − 0.3^{4/3} = 0.79917. (C)


7.22. Yes, you guessed it, the answer is A!!! More seriously, the LER for an exponential is

LER(d) = 1 − e^{−d/θ} = 1 − e^{−500/1000} = 0.393469

Doubling this, we have

1 − e^{−d/1000} = 0.786938
e^{−d/1000} = 0.213062
d = −1000 ln 0.213062 = 1546.18 (E)

7.23. The new θ is 1.2(5) = 6. Then, for the Pareto with α = 2,

E[X] = 6/(2 − 1) = 6
E[X ∧ 10] = 6[1 − 6/(6 + 10)] = 3.75
LER = 3.75/6 = 5/8 (B)
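The solve-for-the-deductible step in 7.22 is a one-line inversion of the exponential LER; a quick sketch of mine:

```python
from math import exp, log

theta, d = 1000.0, 500.0
ler = 1 - exp(-d / theta)        # LER(500) = 1 - e^{-500/1000}
target = 2 * ler                 # doubled LER
d2 = -theta * log(1 - target)    # invert 1 - e^{-d/theta} = target
```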

7.24. The calculations for (limited) expected value of the mixture are done by taking weighted averages of the corresponding amounts for the two exponentials.

E[X] = 0.4(20) + 0.6(2000) = 1208
E[X ∧ 100] = 0.4[20(1 − e^{−100/20})] + 0.6[2000(1 − e^{−100/2000})] = 66.4707
S(100) = 0.4e^{−100/20} + 0.6e^{−100/2000} = 0.5734

So the expected payment per payment is

(E[X] − E[X ∧ 100])/S(100) = (1208 − 66.4707)/0.5734 = 1990.70

7.25. Inflation of the mixture is performed by inflating each component separately; the first exponential's parameter becomes 20(1.2) = 24 and the second exponential's parameter becomes 2000(1.2) = 2400. If the new loss variable is X′, then

E[X′] = 0.4(24) + 0.6(2400) = 1449.60
E[X′ ∧ 100] = 0.4[24(1 − e^{−100/24})] + 0.6[2400(1 − e^{−100/2400})] = 68.218
LER_{X′}(100) = 68.218/1449.60 = 0.0471

7.26. To double the LER, it suffices to double E[X ∧ 500], since the denominator E[X] doesn't change.

E[X ∧ 500] = 2500[1 − (5000/5500)²] = 433.884
E[X ∧ x] = 2500[1 − (5000/(5000 + x))²] = 2(433.884) = 867.77
5000/(5000 + x) = √(1 − 867.77/2500) = 0.808018
x = 1187.98

Alternatively, note that the loss elimination ratio is

E[X ∧ x]/E[X] = 1 − (θ/(θ + x))^{α−1}

and calculate

LER(500) = 1 − (5000/5500)² = 0.173554
LER(x) = 1 − (5000/(5000 + x))² = 2(0.173554) = 0.347108
5000/(5000 + x) = √(1 − 0.347108) = 0.808018
x = 1187.98
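The Pareto limited-expected-value inversion in 7.26 can be sketched directly from the tables formula; this check is mine, not the manual's.

```python
theta, alpha = 5000.0, 3.0

def lev(d):
    """E[X ^ d] for a two-parameter Pareto (tables formula, alpha > 1)."""
    return theta / (alpha - 1) * (1 - (theta / (theta + d)) ** (alpha - 1))

target = 2 * lev(500)                              # doubled limited expected value
x = theta / (1 - target / 2500) ** 0.5 - theta     # closed-form inversion for alpha = 3
```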

Lesson 8

Risk Measures and Tail Weight

Reading: Loss Models Fourth Edition 3.4–3.5, 5.3.4

Broadly speaking, a risk measure is a real-valued function of a random variable. We use the letter ρ for a risk measure; ρ(X) is the risk measure of X. You can probably think of several real-valued functions of random variables:

• Moments. E[X], Var(X), etc.
• Percentiles. For example, the median is a real-valued function of X.
• Premium principles. For example, the premium may be set equal to the expected loss plus a constant times the standard deviation of the loss, or ρ(X) = µ_X + kσ_X. This is called the standard deviation principle.

However, the risk measures we are interested in are measures for the solvency of a company. In the insurance context, they are high positive numbers indicating how high company reserves or surplus should be to give comfort to regulators and the public that the company can cover losses in an adverse scenario. Among the functions listed above, high percentiles, or the premium principle ρ(X) = µ_X + cσ_X with a suitable c, may qualify as such a risk measure.

8.1 Coherent risk measures

Let's list four desirable properties of risk measures:

1. Translation invariance. Adding a positive¹ constant to the random variable should add the same constant to the risk measure:

ρ(X + c) = ρ(X) + c

This is reasonable, since the amount of reserves or surplus needed for a fixed loss equals that loss, no more and no less. A company faced with having to pay the random amount X + c could break up into two companies, one with the obligation to pay X and another with the obligation to pay c. The second company would have a reserve of c and the first company would have a reserve equal to the appropriate risk measure for X.

2. Positive homogeneity. Multiplying the random variable by a positive constant should multiply the risk measure by the same constant:

ρ(cX) = cρ(X)

This is reasonable, since expressing the random variable in a different currency (for example) should not affect the surplus or reserve needed.

3. Subadditivity. For any two random losses X and Y, the risk measure for X + Y should not be greater than the sum of the risk measures for X and Y separately:

ρ(X + Y) ≤ ρ(X) + ρ(Y)

¹I'm not sure why Loss Models says positive, since this holds for positive constants if and only if it holds for all constants.

This is reasonable, since combining losses may result in diversification and reduce the total risk measure, but it should not be possible to reduce the total risk measure by breaking a risk into two sub-risks.

4. Monotonicity. For any two random losses X and Y, if X is always less than Y, or even if the probability that X is less than or equal to Y is 1, then the risk measure for X should be no greater than the risk measure for Y:

ρ(X) ≤ ρ(Y) if Pr(X ≤ Y) = 1

This is reasonable, since the reserves or surplus needed to cover adverse scenarios of Y will be adequate to cover X as well with probability 1.

Risk measures satisfying all four of these properties are called coherent.

Example 8A Which of the properties of coherence are satisfied by each of the following premium principles?

1. Equivalence principle: ρ(X) = E[X]
2. Expected value principle: ρ(X) = k E[X]
3. Variance principle: ρ(X) = E[X] + k Var(X)
4. Standard deviation principle: ρ(X) = µ_X + kσ_X

Answer: 1. The equivalence principle satisfies all four properties and is therefore coherent. By the properties of expected value,

Translation invariance: E[X + c] = E[X] + c
Positive homogeneity: E[cX] = c E[X]
Subadditivity: E[X + Y] = E[X] + E[Y]
Monotonicity: If Pr(X ≤ Y) = 1, then Pr(X − Y ≤ 0) = 1, so E[X − Y] ≤ 0, which implies E[X] ≤ E[Y].

2. The expected value principle fails translation invariance, since

ρ(X + c) = k E[X + c] = k E[X] + kc ≠ ρ(X) + c = k E[X] + c

However, the other three properties are satisfied:

Positive homogeneity: k E[cX] = ck E[X]
Subadditivity: k E[X + Y] = k E[X] + k E[Y]
Monotonicity: If Pr(X ≤ Y) = 1, then Pr(X − Y ≤ 0) = 1, so k E[X − Y] ≤ 0, which implies k E[X] ≤ k E[Y].

3. The variance principle only satisfies translation invariance.

Translation invariance: ρ(X + c) = E[X + c] + k Var(X + c) = E[X] + c + k Var(X) = ρ(X) + c
Positive homogeneity: ρ(cX) = c E[X] + kc² Var(X) ≠ cρ(X) = c E[X] + kc Var(X)
Subadditivity: The variance principle fails subadditivity, since

Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

so if Cov(X, Y) > 0 the variance of the sum will be greater than the sum of the variances.


Monotonicity: The variance principle fails monotonicity. For example, let Y be a constant loss of 100 and let X be a loss with mean 99, variance 2/k, and maximum 99.9. We can arrange for X to have this mean and variance, regardless of how small k > 0 is, by setting X equal to some small (possibly negative) number x with probability p and equal to 99.9 with probability 1 − p, so as to make its variance large enough yet make its mean 99. X is always less than Y, yet has a higher risk measure.

4. The standard deviation principle satisfies all properties except monotonicity.

Translation invariance: ρ(X + c) = µ_{X+c} + kσ_{X+c} = µ_X + c + kσ_X = ρ(X) + c.
Positive homogeneity: ρ(cX) = cµ_X + kcσ_X = cρ(X).
Subadditivity: The correlation is always no greater than 1, so

Var(X + Y) = Var(X) + Var(Y) + 2 Corr(X, Y)√(Var(X) Var(Y)) ≤ (√Var(X) + √Var(Y))²

Monotonicity: The standard deviation principle fails for the same reason as the variance principle fails: a variable X may always be lower than a constant Y and yet have an arbitrarily high standard deviation.

Table 8.1 summarizes these results.

Table 8.1: Coherence properties of four premium principles

                          Equivalence   Expected value   Standard deviation   Variance
Translation invariance        Yes             No                Yes              Yes
Positive homogeneity          Yes             Yes               Yes              No
Subadditivity                 Yes             Yes               Yes              No
Monotonicity                  Yes             Yes               No               No

8.2 Value-at-Risk (VaR)

As indicated above, any percentile (or quantile) is a risk measure. This risk measure has a fancy name: Value-at-Risk, or VaR.

Definition 1 The Value-at-Risk at security level p for a random variable X, denoted VaR_p(X), is the 100p-th percentile of X:

VaR_p(X) = π_p = F_X^{−1}(p)

In practice, p is selected to be close to 1: 95% or 99% or 99.5%. For simplicity, the textbook only deals with continuous X, for which percentiles are well-defined. The tables list VaR for almost any distribution for which it can be calculated in closed form. The only distributions for which VaR is not listed are lognormal and normal. We will calculate VaR for these two distributions, and for educational purposes we'll also calculate VaR for other distributions.

VaR for normal and lognormal distributions: Let z_p be the 100p-th percentile of a standard normal distribution. Then, if X is normal, VaR_p(X) = µ + z_p σ; VaR reduces to a standard deviation principle. If X is lognormal, then VaR_p(X) = e^{µ + z_p σ}.

Example 8B Losses have a lognormal distribution with mean 10 and variance 300. Calculate the VaR at security levels 95% and 99%.

Answer: Let X be the loss random variable. We back out parameters µ and σ by matching moments. The second moment is Var(X) + E[X]² = 300 + 10² = 400.

e^{µ + 0.5σ²} = 10
e^{2µ + 2σ²} = 400
µ + 0.5σ² = ln 10
2µ + 2σ² = ln 400

Subtracting twice the first equation from the second equation,

σ² = ln 400 − 2 ln 10 = 5.9915 − 2(2.3026) = 1.3863
σ = √1.3863 = 1.1774
µ = ln 10 − 0.5σ² = 2.3026 − 0.5(1.3863) = 1.6094

The 95th percentile is VaR_0.95 = e^{µ + 1.645σ} = e^{1.6094 + (1.645)(1.1774)} = 34.68. The 99th percentile is VaR_0.99 = e^{1.6094 + (2.326)(1.1774)} = 77.36.
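The moment matching and percentile lookup in Example 8B can be sketched in a few lines; this check is mine, with `NormalDist.inv_cdf` playing the role of the normal table.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()
mean, var = 10.0, 300.0
sigma2 = log(var + mean**2) - 2 * log(mean)   # sigma^2 = ln(second moment) - 2 ln(mean)
sigma = sqrt(sigma2)
mu = log(mean) - sigma2 / 2
var95 = exp(mu + N.inv_cdf(0.95) * sigma)     # lognormal VaR = e^{mu + z_p sigma}
var99 = exp(mu + N.inv_cdf(0.99) * sigma)
```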



VaR for exponential distribution: For X exponential with mean θ, if F(x) = p, then

e^{−x/θ} = 1 − p
x = −θ ln(1 − p)

so VaR_p(X) = −θ ln(1 − p).

VaR for Pareto distribution: For X following a two-parameter Pareto distribution with parameters α and θ, if F(x) = p, then

1 − (θ/(θ + x))^α = p
(θ/(θ + x))^α = 1 − p
θ/(θ + x) = (1 − p)^{1/α}

so

VaR_p(X) = θ[1 − (1 − p)^{1/α}]/(1 − p)^{1/α}

Example 8C Losses follow a Pareto distribution with mean 10 and variance 300. Calculate the VaR at security levels 95% and 99%.

C/4 Study Manual—17th edition Copyright ©2014 ASM

Answer: Let X be the loss random variable. We back out parameters α and θ by matching moments.

E[X] = θ/(α − 1) = 10
E[X²] = 2θ²/[(α − 1)(α − 2)] = 400

We divide the square of the first equation into the second:

2(α − 1)/(α − 2) = 4
2α − 2 = 4α − 8
α = 3

Plugging this into the equation for E[X], we get θ = 20. The 95th percentile of the Pareto distribution is x such that S(x) = 0.05.

S(x) = (20/(20 + x))³ = 0.05
20/(20 + x) = ∛0.05 = 0.368403
20 + x = 20/0.368403 = 54.2884
VaR_0.95 = x = 34.29

Similarly, the 99th percentile is x such that S(x) = 0.01.

(20/(20 + x))³ = 0.01
20/(20 + x) = ∛0.01 = 0.215443
VaR_0.99 = x = 20/0.215443 − 20 = 72.83
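The closed-form Pareto VaR reproduces Example 8C's answers directly; a minimal sketch of mine, assuming the backed-out parameters α = 3, θ = 20:

```python
alpha, theta = 3.0, 20.0   # backed out from mean 10, variance 300

def pareto_var(p):
    u = (1 - p) ** (1 / alpha)      # (1-p)^{1/alpha}
    return theta * (1 - u) / u      # VaR_p = theta(1-u)/u

var95 = pareto_var(0.95)
var99 = pareto_var(0.99)
```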

Quiz 8-1 Losses X follow a Pareto distribution with parameters α = 2 and θ = 1000. An insurance company pays Y = max(0, X − 2000) for these losses. Calculate VaR_0.99(Y).

Quiz 8-2 Losses X follow a paralogistic distribution with α = 2 and θ = 1000. Calculate VaR_0.99(X).

VaR is not coherent. It satisfies translation invariance, positive homogeneity, and monotonicity, but not subadditivity. To see that it does not satisfy subadditivity, assume that X and Y are two mutually exclusive losses. Each one has a 3% probability of occurring, and each loss size is 1000 if it occurs. Then the 95th percentiles of X and Y are 0, while the 95th percentile of X + Y is 1000, since the loss of 1000 has a 6% probability of occurring. While X and Y are not continuous random variables, the 95th percentile is well-defined for X, Y, and X + Y in this example.

This example is relevant for the insurance industry, particularly the segment insuring against catastrophes. It would be absurd for an insurance company to hold no reserve or surplus for a catastrophe whose probability of occurring is less than the VaR threshold. Therefore, VaR is not considered a good risk measure for insurance.
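The mutually-exclusive-losses counterexample can be verified exactly on the discrete distributions; the helper below is my own sketch of the setup described above.

```python
def percentile_95(dist):
    """dist: list of (value, probability); returns smallest v with F(v) >= 0.95."""
    total = 0.0
    for v, pr in sorted(dist):
        total += pr
        if total >= 0.95:
            return v

# X and Y are mutually exclusive, each equal to 1000 with probability 3%.
x_dist = [(1000, 0.03), (0, 0.97)]
y_dist = [(1000, 0.03), (0, 0.97)]
sum_dist = [(1000, 0.06), (0, 0.94)]    # X + Y is 1000 with probability 6%

var_x = percentile_95(x_dist)
var_y = percentile_95(y_dist)
var_sum = percentile_95(sum_dist)
```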

8.3 Tail-Value-at-Risk (TVaR)

Definition 2 The tail-value-at-risk of a continuous random variable X at security level p, denoted TVaR_p(X), is the expectation of the variable given that it is above its 100p-th percentile:

TVaR_p(X) = E[X | X > VaR_p(X)]

We will not discuss TVaR for discrete random variables, although if the 100p-th percentile is well-defined the above definition may be used. This measure is also called Conditional Tail Expectation (CTE), Tail Conditional Expectation (TCE), and Expected Shortfall (ES). On later life exams, as well as in life insurance regulations and literature, it is called CTE; however, on the P/C side it is often called TVaR.

TVaR can be calculated directly from the definition as

TVaR_p(X) = ∫_{VaR_p(X)}^∞ x f(x) dx / [1 − F(VaR_p(X))]

The numerator is called the partial expectation of X given that X is greater than VaR_p(X). The term "partial expectation" is not used in the textbook and you are not responsible for it, but it is a convenient description of the numerator and I will use it. Recall that VaR_p(X) = F_X^{−1}(p), so the above equation can be rewritten as

TVaR_p(X) = ∫_{F_X^{−1}(p)}^∞ x f(x) dx / (1 − p)    (8.1)

If we substitute y = F(x), then x = F^{−1}(y) = VaR_y(X) and dy = F′(x) dx = f(x) dx. The lower limit of the integral becomes F(F^{−1}(p)) = p, and the upper limit becomes F(∞) = 1, so we get²

TVaR_p(X) = ∫_p^1 VaR_y(X) dy / (1 − p)    (8.2)
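Equation (8.2) can be sanity-checked numerically: integrating the percentile function of an exponential over [p, 1] reproduces its closed-form TVaR, θ(1 − ln(1 − p)). This midpoint-rule sketch is mine.

```python
from math import log

theta, p = 1000.0, 0.95

def var_y(y):
    return -theta * log(1 - y)    # exponential percentile VaR_y(X)

n = 100000                         # midpoint-rule integration over [p, 1]
h = (1 - p) / n
integral = h * sum(var_y(p + (i + 0.5) * h) for i in range(n))
tvar_numeric = integral / (1 - p)
tvar_closed = theta * (1 - log(1 - p))
```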

So TVaR can be calculated by integrating percentiles. However, I do not find this equation useful for calculating TVaR, since for most distributions percentiles are difficult to integrate. The only distributions for which percentiles are easy to integrate are beta distributions in which either a or b is 1, and it's easy enough to calculate partial expectations for those using the original formula.

Example 8D X is a uniform distribution on [0, 100]. Calculate TVaR_0.95(X).

²The textbook says that this equation is derived by integration by parts and substitution, but I do not see why integration by parts is needed.

[Figure 8.1: Illustration of TVaR_0.95 for a continuous loss distribution. The shaded area under S(x) is (1 − 0.95) TVaR_0.95, and consists of (1 − 0.95) e_X(VaR_0.95(X)) plus (1 − 0.95) VaR_0.95(X).]

Answer: We will use equation (8.2). For X, the 100p-th percentile is 100p. (This is rather obvious, but if you must do this with algebra, note that F_X(x) = 0.01x and then solve for the x that makes F_X(x) = p.) So

TVaR_0.95(X) = ∫_{0.95}^1 100y dy / (1 − 0.95)
             = (100y²/2) evaluated from 0.95 to 1, divided by 0.05
             = (50 − 45.125)/0.05 = 97.5

However, this result is intuitively obvious; the conditional expectation of a uniform given that it is between 95 and 100 is the midpoint.

A more useful equation comes from noting that the difference between the mean excess loss at VaR_p(X), e_X(VaR_p(X)), and TVaR_p(X) is that the former averages only the excess over VaR_p(X) whereas the latter averages the entire X. Therefore

TVaR_p(X) = VaR_p(X) + e_X(VaR_p(X))    (8.3)

Figure 8.1 illustrates this equation. The area under the curve is the integral of S(x), the total expected value. The shaded region at the right is the partial expectation above VaR_p(X), or (1 − p) e_X(VaR_p(X)). The shaded rectangle at the left is (1 − p) VaR_p(X). Formula (8.3) is especially useful for distributions where e(x) has a simple formula, such as exponential and Pareto distributions.

The tables list TVaR for any distribution for which this can be calculated, except for normal and lognormal distributions. Thus there is no need for you to calculate it for other distributions. However, for educational purposes, we'll calculate TVaR for other distributions as well.

TVaR for exponential distribution: We derived above that VaR_p(X) = −θ ln(1 − p), and the mean excess loss of an exponential is θ. Therefore

TVaR_p(X) = −θ ln(1 − p) + θ = θ[1 − ln(1 − p)]

Example 8E X is exponentially distributed with mean 1000. Calculate TVaR_0.95(X) and TVaR_0.99(X).

Answer: Using the formula we just derived,

TVaR_0.95 = 1000(1 − ln 0.05) = 3996
TVaR_0.99 = 1000(1 − ln 0.01) = 5605
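Both answers follow from TVaR_p = VaR_p + θ; a minimal check of mine:

```python
from math import log

theta = 1000.0

def exp_tvar(p):
    return theta * (1 - log(1 - p))   # VaR_p + mean excess loss theta

tvar95 = exp_tvar(0.95)
tvar99 = exp_tvar(0.99)
```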



TVaR for Pareto distribution: We derived the following formula for VaR_p(X) above:

VaR_p(X) = θ[1 − (1 − p)^{1/α}]/(1 − p)^{1/α}

For a Pareto, e(x) = (θ + x)/(α − 1) (equation (6.10) on page 100). Thus

TVaR_p(X) = VaR_p(X) + [θ + VaR_p(X)]/(α − 1)
          = [α/(α − 1)] VaR_p(X) + θ/(α − 1)
          = E[X] (1 + α[1 − (1 − p)^{1/α}]/(1 − p)^{1/α})    (8.4)

Example 8F Losses follow a two-parameter Pareto distribution with mean 10 and variance 300. Calculate the tail-value-at-risk for the losses at security levels 95% and 99%.

Answer: In example 8C, we calculated θ = 20 and α = 3. Using the above formula,

TVaR_0.95(X) = 10[1 + 3(1 − ∛0.05)/∛0.05] = 61.433
TVaR_0.99(X) = 10[1 + 3(1 − ∛0.01)/∛0.01] = 119.248
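Equations (8.3) and (8.4) should agree for the Pareto; a sketch of mine confirming both against Example 8F:

```python
alpha, theta = 3.0, 20.0
mean = theta / (alpha - 1)

def pareto_tvar(p):
    u = (1 - p) ** (1 / alpha)
    return mean * (1 + alpha * (1 - u) / u)        # equation (8.4)

def pareto_tvar_check(p):
    u = (1 - p) ** (1 / alpha)
    v = theta * (1 - u) / u                        # VaR_p
    return v + (theta + v) / (alpha - 1)           # VaR_p + e(VaR_p), equation (8.3)

tvar95 = pareto_tvar(0.95)
```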

Quiz 8-3 Losses X follow a single-parameter Pareto with θ = 10 and α = 3. Calculate TVaR_0.65(X).

For random variables X following other distributions for which the tables give E[X ∧ x], we can translate equation (8.3) into

TVaR_p(X) = VaR_p(X) + (E[X] − E[X ∧ VaR_p(X)])/(1 − p)    (8.5)

TVaR for lognormal distribution: For a lognormal distribution with parameters µ and σ, the tables have

E[X ∧ x] = E[X] Φ((ln x − µ − σ²)/σ) + x[1 − F(x)]

Since F(VaR_p(X)) = p,

E[X ∧ VaR_p(X)] = E[X] Φ((ln VaR_p(X) − µ − σ²)/σ) + VaR_p(X)(1 − p)

In equation (8.5), this expression is divided by 1 − p and subtracted. The last summand of this expression, VaR_p(X)(1 − p), when divided by 1 − p, cancels against the first summand of equation (8.5), VaR_p(X). Also, 1 − Φ(x) = Φ(−x). So the formula for TVaR for a lognormal reduces to

TVaR_p(X) = E[X] [1 − Φ((ln VaR_p(X) − µ − σ²)/σ)]/(1 − p)
          = E[X] [1 − Φ((ln exp(µ + z_p σ) − µ − σ²)/σ)]/(1 − p)
          = E[X] Φ(σ − z_p)/(1 − p)    (8.6)

where, as usual, z_p = Φ^{−1}(p) is the 100p-th percentile of a standard normal distribution. Since TVaR is not listed in the tables for the lognormal distribution, you'll have to decide whether to memorize equation (8.6) or to be prepared to derive it as needed.

where, as usual, z p  Φ−1 ( p ) is the 100p th percentile of a standard normal distribution. Since TVaR is not listed in the tables for the lognormal distribution, you’ll have to decide whether to memorize equation (8.6) or to be prepared to derive it as needed. Example 8G Losses have a lognormal distribution with mean 10 and variance 300. Calculate the Tail-Value-at-Risk for these losses at the 95% and 99% security levels. Answer: In example 8B, we backed out µ  1.6094 and σ  1.1774. Therefore TVaR0.95 ( X )  10

Φ (−0.47) (10)(0.3192) Φ (1.174 − 1.645)  10   63.84 0.05 0.05 0.05

!

!

(10)(0.1251) Φ (−1.15) Φ (1.174 − 2.326)  10   125.1 TVaR0.99 ( X )  10 0.01 0.01 0.01 !

!
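Equation (8.6) evaluated without table rounding gives nearly the same answers; the small drift from 63.84 and 125.1 is only from rounding Φ. The sketch is mine.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist()
mean = 10.0
sigma = sqrt(log(400.0) - 2 * log(10.0))   # sigma = 1.1774, from example 8B

def lognormal_tvar(p):
    return mean * N.cdf(sigma - N.inv_cdf(p)) / (1 - p)   # equation (8.6)

tvar95 = lognormal_tvar(0.95)
tvar99 = lognormal_tvar(0.99)
```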



TVaR for normal distribution: The tables do not include the normal distribution. For a normal random variable X with mean µ and variance σ², X = µ + σZ where Z is standard normal, so

E[X | X > x] = E[µ + σZ | Z > (x − µ)/σ] = µ + σ E[Z | Z > (x − µ)/σ]

so if we calculate the TVaR for a standard normal random variable, we can transform it to get the TVaR for any normal random variable. We will calculate TVaR using the definition, equation (8.1). In the following derivation, VaR_p(Z) is the 100p-th percentile of a standard normal distribution, not of X. The numerator is

(1 − p) TVaR_p(Z) = (1/√2π) ∫_{VaR_p(Z)}^∞ x e^{−x²/2} dx
                  = (1/√2π) (−e^{−x²/2}) evaluated from VaR_p(Z) to ∞
                  = e^{−VaR_p(Z)²/2}/√2π

so TVaR_p(Z) is the final expression divided by 1 − p. Also, VaR_p(Z) = Φ^{−1}(p) = z_p. So for X following a general normal distribution,

TVaR_p(X) = µ + [σ/(1 − p)] e^{−z_p²/2}/√2π
          = µ + σ φ(z_p)/(1 − p)    (8.7)

This is a standard deviation principle.³ Since TVaR is not listed in the tables for the normal distribution, you'll have to decide whether to memorize equation (8.7) or to be prepared to derive it as needed.

Example 8H Losses have a normal distribution with mean 10 and variance 300. Calculate the Tail-Value-at-Risk at the 95% and 99% security levels for the loss distribution.

Answer: Using formula (8.7),

TVaR_0.95(X) = 10 + √300 exp(−1.645²/2)/(0.05√2π) = 10 + 17.3205(0.1031)/0.05 = 45.73
TVaR_0.99(X) = 10 + √300 exp(−2.326²/2)/(0.01√2π) = 10 + 17.3205(0.02665)/0.01 = 56.16

Note the following properties of TVaR:

1. TVaR is coherent.
2. TVaR_0(X) = E[X].
3. TVaR_p(X) ≥ VaR_p(X), with equality holding only if VaR_p(X) = max(X).
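Equation (8.7) is one line of code; this check of Example 8H is my own sketch, using `NormalDist.pdf` for φ.

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()
mu, sigma = 10.0, sqrt(300.0)

def normal_tvar(p):
    z = N.inv_cdf(p)
    return mu + sigma * N.pdf(z) / (1 - p)   # equation (8.7): mu + sigma*phi(z_p)/(1-p)

tvar95 = normal_tvar(0.95)
tvar99 = normal_tvar(0.99)
```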

8.4 Tail Weight

Questions on the material in this section appear in pre-2007 CAS 3 exams. To my knowledge there haven't been any questions on this material since 2007.

Parametric distributions are often used to model loss size. Parametric distributions vary in the degree to which they allow for very large claims. Tail weight describes how much weight is placed on the tail of the distribution. The bigger the tail weight of the distribution, the more provision for high claims. The following quantitative measures of tail weight are available:

1. The more positive raw or central moments exist, the less the tail weight. For a gamma distribution, all positive raw moments exist, but for a Pareto, only the k-th moment for k < α exists. Thus the lower the α of the Pareto, the fatter the tail.

³Notice that φ(x) is not the same as Φ(x). The former is the probability density function of a standard normal distribution: φ(x) = e^{−x²/2}/√2π. The standard normal distribution function Φ(x) is the integral from −∞ to x of the density function.

[Figure 8.2: Comparison of densities for Pareto and gamma with equal means and variances, plotted on a logarithmic vertical scale for x from 0 to 200.]

2. To compare two distributions, the limits of the ratios of the survival functions, or equivalently the ratios of the density functions, can be examined as x → ∞. A ratio going to infinity implies the function in the numerator has heavier tail weight. For example, compare an exponential with parameter θ to a gamma distribution with parameters α and θ/α (so that they have the same mean). If we divide the density of the exponential by the density of the gamma and ignore constants, the quotient is

e^{−x/θ}/(x^{α−1} e^{−αx/θ}) = x^{1−α} e^{−(1−α)x/θ}

If α > 1, the factor e^{(α−1)x/θ} will go to infinity faster than a power of x goes to zero (use L'Hospital if necessary), so the exponential distribution has the fatter tail, and the opposite holds if α < 1.

The textbook demonstrates that a Pareto's density function is higher than a gamma's density function as x goes to ∞. Its example is a Pareto with parameters α = 3 and θ = 10 compared to a gamma with parameters α = 1/3 and θ = 15. Both of these have the same mean (5) and variance (75). To make the difference sharper, I'm providing my own graph where I use a logarithmic scale (instead of the book's linear scale); see Figure 8.2. The gamma swoops down in a straight line, just like an exponential (for high x, a gamma behaves like an exponential), while the Pareto's descent slows down. We will soon see that the lower the α for a gamma distribution, the heavier the tail weight, yet even with α = 1/3 for the gamma, its tail weight is less than that of a Pareto.

3. An increasing hazard rate function means a lighter tail and a decreasing one means a heavier tail. An exponential distribution has a constant hazard rate function, so it has a medium tail weight. The textbook shows that for a gamma distribution, lim_{x→∞} h(x) = 1/θ, the same as the hazard rate of an exponential. If α > 1, the hazard rate increases; if α < 1, the hazard rate decreases. For a two-parameter Pareto distribution,

h(x) = f(x)/S(x) = [αθ^α/(x + θ)^{α+1}] [(x + θ)/θ]^α = α/(x + θ)

which decreases as x → ∞.


For a Weibull distribution, the hazard rate is h(x) = τx^{τ−1}/θ^τ, so if τ > 1 this increases and vice versa; the higher the τ, the lighter the tail.

So far, we only have three classifications: light tail (increasing hazard rate function), medium tail (constant hazard rate function) and heavy tail (decreasing hazard rate function). The textbook then says that to compare the tail weights of two distributions, one should compare the rates of increase of the hazard rate functions. The higher the rate of increase, the lighter the tail. This implies that you should differentiate the hazard rate function. If you do this for a two-parameter Pareto, you find

h′(x) = −α/(x + θ)²

Larger α will make h′(x) lower, which by the above logic implies a heavier tail. Yet all of the other measures imply that the higher the α, the lighter the tail. In fact, holding the mean fixed, the limiting distribution as α → ∞ is an exponential. In private correspondence, Klugman agreed that this measure of tail weight (rate of increase of the hazard rate function) does not work for comparing Paretos. I think the correct measure of which distribution has a heavier tail weight would be the difference in hazard rate functions as x → ∞. If you compare two Paretos, one with parameter α₁ and the other with parameter α₂ > α₁, the ratio of mean excess losses is

[(θ + x)/(α₁ − 1)] / [(θ + x)/(α₂ − 1)] = (α₂ − 1)/(α₁ − 1) > 1

implying that the Pareto with the lower α has the higher tail weight. Using this measure of tail weight, higher tail weight is equivalent to higher TVaR.

A decreasing hazard rate function implies an increasing mean excess loss and an increasing hazard rate function implies a decreasing mean excess loss, but the converse isn't necessarily true: an increasing mean excess loss doesn't necessarily imply a decreasing hazard rate function. Also,

lim_{d→∞} e(d) = lim_{d→∞} 1/h(d)

Since for a gamma distribution lim_{d→∞} h(d) = 1/θ, we have lim_{d→∞} e(d) = θ, but e(0) = αθ, so the mean excess loss increases if and only if α is less than 1.

The textbook has a discussion of the equilibrium distribution F_e(x), which is defined by the probability density function

f_e(x) = S(x)/E[X]

which is a legitimate density function since ∫₀^∞ S(x) dx = E[X]. The distribution function is

F_e(x) = ∫₀^x S(u) du / E[X]

and the expected value of the equilibrium random variable X_e is

E[X_e] = E[X²]/(2 E[X])

Using this distribution, it proves that if e(x) ≥ e(0) for all x, which would be true for a decreasing hazard rate, then the coefficient of variation is at least 1, and if e(x) ≤ e(0) for all x, then the coefficient of variation is at most 1. I doubt the equilibrium distribution will be tested on.

Example 8I X follows an exponential distribution with mean θ. Determine the distribution function of the equilibrium distribution for X, or X_e.

Answer: S(x) = e^{−x/θ}. So

f_e(x) = e^{−x/θ}/θ

which is the density function of an exponential with mean θ. So X_e has the same distribution as X.

Example 8J X follows a two-parameter Pareto distribution with parameters α and θ. Determine the distribution function of the equilibrium distribution for X, or X_e.

Answer: S(x) = [θ/(θ + x)]^α. So

f_e(x) = [θ/(θ + x)]^α / [θ/(α − 1)] = (α − 1)θ^{α−1}/(θ + x)^α

which is the density function of a Pareto distribution with parameters α − 1 and θ.
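Example 8J's conclusion, that S(x)/E[X] for a Pareto(α, θ) is exactly a Pareto(α − 1, θ) density, checks out pointwise; this numerical sketch is mine.

```python
alpha, theta = 3.0, 20.0
mean = theta / (alpha - 1)

def S(x):
    return (theta / (theta + x)) ** alpha          # Pareto survival function

def f_e(x):
    return S(x) / mean                             # equilibrium density

def pareto_pdf(x, a, t):
    return a * t**a / (t + x) ** (a + 1)           # two-parameter Pareto density

diffs = [abs(f_e(x) - pareto_pdf(x, alpha - 1, theta)) for x in (0, 5, 50, 500)]
```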

8.5 Extreme value distributions

The following material was added to the syllabus for October 2013. It is purely descriptive (no calculations), so exam coverage of it is bound to be light. We will mention two heavy-tailed distributions used for risk management.

The first distribution arises as the limit of the maximum of a set of observations as the size of the set goes to infinity. The Fisher-Tippett Theorem states that this limit has only one of three possible distributions. Of the three, we will mention only one of them, the Fréchet distribution. This is the same as the inverse Weibull distribution.

The second distribution arises as the limit of the excess loss random variable as the truncation point (the deductible) goes to infinity. The Balkema-de Haan-Pickands Theorem states that this limit has only one of three possible distributions. We will mention only two of them. You could guess what these two are. We've learned that the excess loss of an exponential is exponential and the excess loss of a Pareto is Pareto, so these two are certainly limiting distributions. And in fact, an exponential distribution is the limiting distribution for excess loss for lighter-tailed distributions, while the Pareto distribution is a limiting distribution for excess loss for other distributions. Thus a Pareto distribution may be a good model for high-deductible insurances. (The exponential distribution is not heavy-tailed, so we didn't count it when we said that we'll mention two heavy-tailed distributions.)

Extreme value theory calls the set of limiting distributions for mean excess loss "generalized Pareto", but this is not the same distribution as the generalized Pareto in the Loss Models appendix.

8. RISK MEASURES AND TAIL WEIGHT


Table 8.2: Summary of Risk Measures

Coherent Risk Measures
1. Translation invariant: ρ(X + c) = ρ(X) + c
2. Positive homogeneous: ρ(cX) = cρ(X)
3. Subadditive: ρ(X + Y) ≤ ρ(X) + ρ(Y)
4. Monotonic: ρ(X) ≤ ρ(Y) if Pr(X ≤ Y) = 1

Value-at-Risk: VaR_p(X) = π_p = F_X^(−1)(p)

Tail-Value-at-Risk:

TVaR_p(X) = E[X | X > VaR_p(X)] = ∫_{VaR_p(X)}^∞ x f(x) dx / (1 − F(VaR_p(X)))
          = (1/(1 − p)) ∫_p^1 VaR_y(X) dy    (8.2)
          = VaR_p(X) + e_X(VaR_p(X))    (8.3)
          = VaR_p(X) + (E[X] − E[X ∧ VaR_p(X)])/(1 − p)

Distribution    VaR_p(X)                    TVaR_p(X)
Exponential     −θ ln(1 − p)                θ(1 − ln(1 − p))
Pareto          θ((1 − p)^(−1/α) − 1)       E[X](1 + α(1 − (1 − p)^(1/α))/(1 − p)^(1/α))
Normal          µ + z_p σ                   µ + σ φ(z_p)/(1 − p)
Lognormal       e^(µ + σ z_p)               E[X] Φ(σ − z_p)/(1 − p)

The tables you get at the exam list the exponential and Pareto VaRs and TVaRs, but not the normal or lognormal ones.
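The closed-form entries in the table can be checked against the quantile-averaging definition of TVaR, equation (8.2). A sketch, with illustrative parameter values:

```python
import math

# Check two rows of Table 8.2 against the definition
# TVaR_p(X) = (1/(1-p)) * integral of VaR_y(X) dy over y in (p, 1),
# approximated with a midpoint rule.
def tvar_from_var(var, p, n=100000):
    h = (1.0 - p) / n
    return sum(var(p + (i + 0.5) * h) for i in range(n)) * h / (1.0 - p)

theta, p = 1000.0, 0.95

# exponential row
var_exp = lambda y: -theta * math.log(1.0 - y)
tvar_exp = theta * (1.0 - math.log(1.0 - p))
assert abs(tvar_from_var(var_exp, p) - tvar_exp) < 1.0

# Pareto row
alpha = 3.0
var_par = lambda y: theta * ((1.0 - y) ** (-1.0 / alpha) - 1.0)
mean = theta / (alpha - 1.0)
u = (1.0 - p) ** (1.0 / alpha)
tvar_par = mean * (1.0 + alpha * (1.0 - u) / u)
assert abs(tvar_from_var(var_par, p) - tvar_par) < 5.0
```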


Table 8.3: Summary of Tail Weight and Extreme Value Distributions

Tail Weight Measures
1. The more positive moments, the lower the tail weight.
2. Ratios of survival or density functions: if the limit at infinity is greater than 1, the numerator has higher tail weight.
3. An increasing hazard rate function implies a lighter tail.
4. An increasing mean excess loss implies a heavier tail.

The equilibrium distribution is defined by f_e(x) = S(x)/E[X]. Its mean is E[X_e] = E[X²]/(2 E[X]).

Extreme Value Distributions
A limiting distribution function for the maximum of a sample is the inverse Weibull (or Fréchet) distribution.
A limiting distribution function for excess loss as the deductible goes to infinity is the Pareto distribution. (Exponential for light-tailed distributions.)

Exercises

Risk measures

8.1. Consider the exponential premium principle:

ρ(X) = ln E[e^(αX)] / α,    α > 0

Which of the four coherence properties does the exponential premium principle satisfy?

8.2. Losses follow a Weibull distribution with τ = 2, θ = 500. Calculate the Value-at-Risk of losses at the 95% security level, VaR_0.95(X).

(A) Less than 875
(B) At least 875, but less than 900
(C) At least 900, but less than 925
(D) At least 925, but less than 950
(E) At least 950

8.3. Losses follow an inverse exponential distribution with θ = 1000. Determine the Value at Risk at 99%.

8.4. Losses follow a single-parameter Pareto distribution. Let X be the random variable for losses. You are given:

(i) VaR_0.95(X) = 11,052
(ii) VaR_0.99(X) = 32,317

Determine VaR_0.995(X).



8.5. For an insurance company, losses follow a mixture of two Pareto distributions with equal weights. For the first Pareto distribution, α = 1 and θ = 1000. For the second Pareto distribution, α = 2 and θ = 1000. Calculate the Value at Risk at 99% for the mixture.

8.6. X is a random variable for losses. X follows a beta distribution with θ = 1000, a = 2, b = 1. Calculate TVaR_0.90(X).

8.7. For an insurance company, losses follow a lognormal distribution with parameters µ = 5, σ = 2. Calculate the Tail-Value-at-Risk at the 90% security level.

8.8. Annual losses follow a Pareto distribution with mean 100, variance 20,000. Calculate the Tail-Value-at-Risk at the 65% security level.

8.9. Annual losses follow a Pareto distribution with parameters α = 2 and θ = 100. The Tail-Value-at-Risk at a certain security level is 1900. Determine the security level.

8.10. Annual losses follow a normal distribution with mean 100, variance 900. A company calculates its risk measure as the Tail-Value-at-Risk of losses at the 90% security level. It would calculate the same risk measure if it used Value-at-Risk at the p security level. Determine p.

Use the following information for questions 8.11 and 8.12:
Annual aggregate losses follow an exponential distribution with mean 1000. Let X be the random variable for annual aggregate losses.

8.11. Calculate the difference between TVaR_0.95(X) and VaR_0.95(X).

8.12. Calculate the absolute difference between TVaR_0.99(X) and TVaR_0.95(X).

8.13. Losses X follow a normal distribution. You are given

(i) TVaR_0.5(X) = 67.55
(ii) TVaR_0.8(X) = 80.79

Determine TVaR_0.9(X).

Tail weight

8.14. Random variable X_1 with distribution function F_1 and probability density function f_1 has a heavier tail than random variable X_2 with distribution function F_2 and probability density function f_2. Which of the following statements is/are true? (More than one may be true.)

I. X_1 will tend to have fewer positive moments than X_2.
II. The limiting ratio of the density functions, f_1/f_2, will go to infinity.
III. The hazard rate of X_1 will increase more rapidly than the hazard rate of X_2.
IV. The mean residual life of X_1 will increase more rapidly than the mean residual life of X_2.


8.15. [CAS3-F03:16] Which of the following are true based on the existence of moments test?

I. The Loglogistic Distribution has a heavier tail than the Gamma Distribution.
II. The Paralogistic Distribution has a heavier tail than the Lognormal Distribution.
III. The Inverse Exponential has a heavier tail than the Exponential Distribution.

(A) I only (B) I and II only (C) I and III only (D) II and III only (E) I, II, and III

8.16. [CAS3-F04:27] You are given:

• X has density f(x), where f(x) = 500,000/x³, for x > 500 (single-parameter Pareto with α = 2).
• Y has density g(y), where g(y) = ye^(−y/500)/250,000 (gamma with α = 2 and θ = 500).

Which of the following are true for sufficiently high x and y?

1. X has an increasing mean residual life function.
2. Y has an increasing hazard rate.
3. X has a heavier tail than Y based on the hazard rate test.

(A) 1 only. (B) 2 only. (C) 3 only. (D) 2 and 3 only. (E) All of 1, 2, and 3.

8.17. A catastrophe reinsurance policy has a high deductible. You are modeling payments per loss for this policy. Based on extreme value theory, which of the following probability density functions may be appropriate for this model?

(A) f(x) = (5x⁴/(5×10⁵)) e^(−x⁵/(5×10⁵))
(B) f(x) = 5×10⁵ e^(−10⁵/x⁵) / x⁶
(C) f(x) = 5(100⁵) / (100 + x)⁶
(D) f(x) = 500x⁴ / (100 + x)⁶
(E) f(x) = 100⁵ e^(−100/x) / (24x⁶)

Solutions

8.1.

• Translation invariance:

ρ(X + c) = ln E[e^(α(X + c))]/α = (ln e^(αc) + ln E[e^(αX)])/α = c + ρ(X) ✓

• Positive homogeneity:

ρ(cX) = ln E[e^(cαX)]/α

A simple counterexample for α = 1 is if X only assumes the values 0 and 1, each with probability 0.5, and c = 2. Then ρ(2X) = ln 0.5(1 + e²) ≠ 2ρ(X) = 2 ln 0.5(1 + e). ✗

• Subadditivity:

ρ(X + Y) = ln E[e^(α(X + Y))]/α

If X and Y are independent, then ρ(X + Y) = ρ(X) + ρ(Y). However, if X and Y are not independent, there is no reason ρ(X + Y) ≤ ρ(X) + ρ(Y). For example, if we use the counterexample for positive homogeneity for X and let Y = X, then ρ(2X) = ln 0.5(1 + e²) = 1.4338 > 2ρ(X) = 2 ln 0.5(1 + e) = 1.2402. ✗

• Monotonicity: If X ≤ Y always, then e^(αX) ≤ e^(αY), from which it follows that E[e^(αX)] ≤ E[e^(αY)]. ✓

Only translation invariance and monotonicity are satisfied.

8.2.

You can look this up in the tables.

VaR_0.95(X) = θ(−ln(1 − 0.95))^(1/τ) = 500(−ln 0.05)^(1/2) = 865.41    (A)

8.3. We need the 99th percentile of the loss distribution. Let it be x. Then

e^(−1000/x) = 0.99
1000/x = −ln 0.99
x = −1000/ln 0.99 = 99,499

8.4. The formula for VaR, from the tables, is

VaR_p(X) = θ(1 − p)^(−1/α)

Therefore,

θ(0.05)^(−1/α) = 11,052
θ(0.01)^(−1/α) = 32,317

Dividing the second into the first,

5^(−1/α) = 0.341987
−(1/α) ln 5 = ln 0.341987
−(1/α) ln 2 = (ln 2/ln 5) ln 0.341987
2^(−1/α) = 0.341987^(ln 2/ln 5) = 0.629954

It follows that the VaR at 99.5% is

VaR_0.995(X) = θ(0.005)^(−1/α) = θ(0.01)^(−1/α)/2^(−1/α) = 32,317/0.629954 = 51,301
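The algebra in solution 8.4 can be reproduced in a few lines of code (a sketch of the same steps):

```python
import math

# Back out the single-parameter Pareto tail index alpha from two VaR
# levels, then extrapolate to 99.5%.  VaR_p(X) = theta * (1 - p)**(-1/alpha).
v95, v99 = 11052.0, 32317.0

# v95 / v99 = (0.05 / 0.01)**(-1/alpha) = 5**(-1/alpha)
alpha = -math.log(5.0) / math.log(v95 / v99)

# VaR_0.995 = VaR_0.99 * (0.005 / 0.01)**(-1/alpha) = VaR_0.99 * 2**(1/alpha)
v995 = v99 * 2.0 ** (1.0 / alpha)
# v995 comes out within a unit or so of the manual's 51,301
```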


8.5. We need the 99th percentile of the mixture. The survival function of the mixture is the weighted average of the survival functions of the components. The survival function is 0.01 when the cumulative distribution function is 0.99. Letting x be the 99th percentile,

S(x) = 0.5(1000/(1000 + x)) + 0.5(1000/(1000 + x))² = 0.01

For convenience, let y = 1000/(1000 + x).

0.5y² + 0.5y = 0.01
y² + y − 0.02 = 0
y = (−1 + √(1 + 0.08))/2 = 0.01961524

y must be positive, so we reject the negative solution to the quadratic.

1000/(1000 + x) = 0.01961524
x = 1000/0.01961524 − 1000 = 49,980.76

8.6. The density function for this beta is f(x) = 2x/1000², 0 ≤ x ≤ 1000. First we calculate the 90th percentile.

F(x) = ∫_0^x 2u/1000² du = (x/1000)² = 0.9
x = 1000√0.9

The partial expectation above x = 1000√0.9 is

(1 − p) TVaR_0.9(X) = ∫_x^1000 u (2u/1000²) du = 2u³/(3 · 1000²) |_x^1000 = (2000/3)(1 − 0.9^(3/2)) = 97.4567

Dividing by 1 − p = 0.1, we get 974.567.

The same result could be obtained using equation (8.1). The 100y-th percentile is found from F(x) = (x/1000)² = y, so x = 1000√y, and integrating this,

(1 − p) TVaR_0.9(X) = ∫_0.9^1 1000√y dy = 1000 (2/3)y^(3/2) |_0.9^1 = (2000/3)(1 − 0.9^(3/2))
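As a numeric check of this solution (a sketch), integrating x f(x) above the 90th percentile gives the same TVaR:

```python
import math

# Numeric check of solution 8.6: integrate x * f(x) above the 90th
# percentile for the beta density f(x) = 2x/1000**2 on [0, 1000].
x90 = 1000.0 * math.sqrt(0.9)
n = 100000
h = (1000.0 - x90) / n
partial = 0.0
for i in range(n):
    u = x90 + (i + 0.5) * h
    partial += u * (2.0 * u / 1000.0 ** 2) * h
tvar = partial / 0.1          # about 974.567
assert abs(tvar - 974.567) < 0.01
```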


8.7. By formula (8.6),

TVaR_0.9(X) = e^(µ + 0.5σ²) Φ(σ − z_0.9)/0.1 = 10e^(5 + 0.5(2²)) Φ(2 − 1.282) = 10,966 Φ(0.72) = 8380

8.8. Let X be annual losses. If we wish to do this by first principles, our first task is to calculate the 65th percentile of X. The second moment is E[X²] = Var(X) + E[X]² = 20,000 + 100² = 30,000. We back out the Pareto parameters.

E[X] = θ/(α − 1) = 100
E[X²] = 2θ²/((α − 1)(α − 2)) = 30,000

Dividing the square of the first into the second,

2(α − 1)/(α − 2) = 3
2α − 2 = 3α − 6
α = 4

From the equation for E[X], θ/(4 − 1) = 100, so θ = 300. The 65th percentile is x such that F(x) = 0.65, or S(x) = 0.35.

S(x) = (300/(300 + x))⁴ = 0.35
300/(300 + x) = 0.35^(1/4)
x = 300/0.35^(1/4) − 300 = 90.0356

For a Pareto, the mean excess loss is e(x) = (θ + x)/(α − 1). The TVaR is x + e(x). Here we have

TVaR_0.65(X) = 90.0356 + (300 + 90.0356)/3 = 220.047

To do this using formula (8.4), we would back out α and then have

TVaR_0.65(X) = 100(1 + 4(1 − ⁴√0.35)/⁴√0.35) = 220.047
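The steps of this solution can be scripted as a check (a sketch):

```python
# Solution 8.8 in code: back out the Pareto parameters from the mean
# and variance, then compute TVaR_0.65 as VaR + mean excess loss.
mean, var = 100.0, 20000.0
second = var + mean * mean                 # E[X^2] = 30,000
ratio = second / mean ** 2                 # = 2(alpha-1)/(alpha-2) = 3
alpha = (2.0 - 2.0 * ratio) / (2.0 - ratio)
theta = mean * (alpha - 1.0)

p = 0.65
var_p = theta * ((1.0 - p) ** (-1.0 / alpha) - 1.0)   # 65th percentile
tvar = var_p + (theta + var_p) / (alpha - 1.0)        # VaR + e(VaR)
```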


8.9. Let’s call the security level p. By equation (8.4), 2 (1 − 1 − p )

p

100 *1 +

,

+  1900 p 2 (1 − 1 − p )  18 p p

1−p

1−p

1− 1−p 9 1−p

p

p

p

1 − p  0.1

1 − p  0.12  0.01 p  0.99

8.10. By equation (8.7),

TVaR_0.9(X) = µ + σ φ(z_0.9)/0.1

while VaR_p(X) = µ + σ z_p. So we can ignore µ and σ and set z_p = φ(z_0.9)/0.1.

z_p = φ(z_0.9)/0.1 = e^(−1.282²/2)/(0.1√(2π)) = 0.4397/0.2507 = 1.7540
p = Φ(1.75) = 0.9599

8.11. TVaR_0.95(X) = VaR_0.95(X) + e(VaR_0.95(X)). But for an exponential, the mean excess loss e(x) = θ regardless of x. So TVaR_0.95(X) = VaR_0.95(X) + θ and the difference is θ = 1000.

8.12. As we saw in the previous exercise, we can equivalently calculate the absolute difference between VaR_0.99(X) and VaR_0.95(X), since TVaR_p(X) is always the p-th quantile plus 1000. The p-th quantile VaR_p(X) is given by

e^(−VaR_p(X)/1000) = 1 − p
VaR_p(X) = −1000 ln(1 − p)
VaR_0.95(X) = −1000 ln 0.05 = 2995.73
VaR_0.99(X) = −1000 ln 0.01 = 4605.17

The difference is 4605.17 − 2995.73 = 1609.44.

8.13.

We calculate k_p = φ(z_p)/(1 − p) for p = 0.5, 0.8, 0.9.

k_0.5 = φ(z_0.5)/0.5 = e^(−0²/2)/(0.5√(2π)) = 0.7979
k_0.8 = φ(z_0.8)/0.2 = e^(−0.842²/2)/(0.2√(2π)) = 1.3994
k_0.9 = φ(z_0.9)/0.1 = e^(−1.282²/2)/(0.1√(2π)) = 1.7540


From equation (8.7), TVaR_p(X) = µ + k_p σ. It follows that

σ = (TVaR_0.9(X) − TVaR_0.5(X))/(k_0.9 − k_0.5) = (TVaR_0.8(X) − TVaR_0.5(X))/(k_0.8 − k_0.5)

so

TVaR_0.9(X) = TVaR_0.5(X) + (TVaR_0.8(X) − TVaR_0.5(X))(k_0.9 − k_0.5)/(k_0.8 − k_0.5)
            = 67.55 + (80.79 − 67.55)(1.7540 − 0.7979)/(1.3994 − 0.7979) = 88.60

8.14.

As discussed in the lesson, all of these statements are true except III.
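The interpolation in solution 8.13 can be re-run with unrounded normal density values (a sketch); it supports an answer near 88.6:

```python
import math

# Solution 8.13 as code: TVaR_p(X) = mu + k_p * sigma with
# k_p = phi(z_p)/(1 - p), so two known TVaRs pin down mu and sigma.
def phi(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

k50 = phi(0.0) / 0.5
k80 = phi(0.842) / 0.2
k90 = phi(1.282) / 0.1

sigma = (80.79 - 67.55) / (k80 - k50)
mu = 67.55 - k50 * sigma
tvar90 = mu + k90 * sigma      # about 88.6
```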

8.15.

I. The loglogistic distribution only has some positive moments and the gamma has all, so I is true. ✓
II. The paralogistic distribution only has some positive moments and the lognormal has all, so II is true. ✓
III. The inverse exponential distribution has no k-th moments for k ≥ 1 and the exponential has all, so III is true. ✓

(E)

8.16.

1. For a single-parameter Pareto, the mean residual life is an increasing linear function, so 1 is true. ✓
2. The hazard rate of a gamma is difficult to compute because of the Γ(α) in the denominator of S(y). However, the textbook provides a clever proof that it increases when α > 1, so 2 is true. ✓
3. The hazard rate of a single-parameter Pareto is easy to compute; use the formula

h(x) = −d ln S(x)/dx

Then ln S(x) = α(ln θ − ln x), so h(x) = α/x. It decreases. Thus 3 is true. ✓

(E)

8.17. A Pareto distribution would be appropriate for the limit of the mean excess loss as the deductible goes to infinity. (C) is a Pareto density with α = 5 and θ = 100.

Quiz Solutions

8-1. By the Pareto formula that we developed, the 99th percentile of the Pareto is 1000(1 − √0.01)/√0.01 = 9000. After deducting 2000, the VaR is 7000.

8-2. From the tables,

VaR_p(X) = θ((1 − p)^(−1/α) − 1)^(1/α)

Substituting p = 0.99, α = 2, θ = 1000,

VaR_0.99(X) = 1000(0.01^(−1/2) − 1)^(1/2) = 3000

Or from basic principles: We want the 99th percentile, so set F(x) = 0.99 and solve for x.

1 − (1/(1 + (x/1000)²))² = 0.99
1/(1 + (x/1000)²) = √0.01 = 0.1
1 + (x/1000)² = 10
(x/1000)² = 9
x = 3000

8-3. From the tables,

TVaR_p(X) = αθ(1 − p)^(−1/α)/(α − 1)

Substituting p = 0.65, α = 3, and θ = 10,

TVaR_0.65(X) = 30(0.35)^(−1/3)/2 = 21.2847

Or from first principles: The 65th percentile is x such that

(10/x)³ = 0.35
x = 10/³√0.35 = 14.1898

For a single-parameter Pareto, e(x) = x/(α − 1), which is 14.1898/2 = 7.0949 here. So TVaR_0.65(X) = 14.1898 + 7.0949 = 21.2847.
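Quiz solution 8-3 can be cross-checked by computing E[X | X > VaR] directly from the single-parameter Pareto density (a sketch):

```python
# Single-parameter Pareto with alpha = 3, theta = 10: compare the
# tables' TVaR formula with the partial expectation above VaR_p,
# computed from the density alpha * theta**alpha / x**(alpha + 1).
alpha, theta, p = 3.0, 10.0, 0.65
var_p = theta * (1.0 - p) ** (-1.0 / alpha)
tvar_formula = alpha * var_p / (alpha - 1.0)

# integral of x * 3 * 10**3 / x**4 dx from var_p to infinity
partial = 1.5 * theta ** 3 / var_p ** 2
tvar_direct = partial / (1.0 - p)
assert abs(tvar_formula - tvar_direct) < 1e-9
```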


Lesson 9

Other Topics in Severity Coverage Modifications

Reading: Loss Models Fourth Edition 8.5

A coverage may have both a policy limit and a deductible, and we then need to specify the order of the modifications. We distinguish the policy limit from the maximum covered loss:

Policy limit is the maximum amount that the coverage will pay. In the presence of a deductible or other modifications, perform the other modifications, then apply the policy limit. If a coverage has a policy limit of 10,000 and an ordinary deductible of 500, then it pays 10,000 for a loss of 10,500 or higher, and it pays the loss minus 500 for losses between 500 and 10,500.

Maximum covered loss is the stipulated amount considered in calculating the payment. Apply this limit first, and then the deductible. If a coverage has a maximum covered loss of 10,000 and an ordinary deductible of 500, then it pays 9,500 for a loss of 10,000 or higher, and it pays the loss minus 500 for losses between 500 and 10,000.

The payment per loss random variable in the presence of a maximum covered loss of u and an ordinary deductible of d is Y^L = X ∧ u − X ∧ d. The payment per payment is Y^L | X > d.

Coinsurance of α means that a portion, α, of each loss is reimbursed by insurance. For example, 80% coinsurance means that insurance will pay 80% of the loss.¹ The expected payment per loss if there is α coinsurance, d deductible, and u maximum covered loss is

E[Y^L] = α(E[X ∧ u] − E[X ∧ d])

If there is inflation of r, you multiply X by (1 + r), then pull the (1 + r) factor out to get

E[Y^L] = α(1 + r)(E[X ∧ u/(1 + r)] − E[X ∧ d/(1 + r)])
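A numeric sanity check of the inflation formula (a sketch; the exponential severity with θ = 1000 and all policy values are illustrative assumptions, not from the text):

```python
import math

# For an exponential severity, E[X ^ m] = theta * (1 - exp(-m/theta)).
# Compare the closed form a*(1+r)*(E[X ^ u/(1+r)] - E[X ^ d/(1+r)])
# with brute-force integration of the payment against the density.
theta, d, u, a, r = 1000.0, 500.0, 2500.0, 0.8, 0.10

def lim_ev(m):
    return theta * (1.0 - math.exp(-m / theta))

closed = a * (1.0 + r) * (lim_ev(u / (1.0 + r)) - lim_ev(d / (1.0 + r)))

# brute force: E[ a * ( min((1+r)X, u) - min((1+r)X, d) ) ]
n, top = 200000, 20000.0
h = top / n
brute = 0.0
for i in range(n):
    x = (i + 0.5) * h
    pay = a * (min((1.0 + r) * x, u) - min((1.0 + r) * x, d))
    brute += pay * math.exp(-x / theta) / theta * h
assert abs(closed - brute) < 0.5
```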

You need not memorize these formulas; rather, you should understand how they are derived. Here is an example combining several of the modifications listed above.

Example 9A Losses for an insurance coverage have the following density function:

f(x) = (1/5)(1 − x/10)    0 ≤ x ≤ 10
f(x) = 0                  otherwise

Insurance reimburses losses subject to an ordinary deductible of 2 and a maximum covered loss of 7. Inflation increases all loss sizes uniformly by 10%. Calculate the expected amount paid per payment after inflation.

¹That is the definition as far as this course is concerned. Unfortunately, other courses use the same term for the complementary meaning: the proportion paid by the policyholder.


Answer: If we let X be the original loss variable and Y the inflated payment per loss variable, then Y = (1.1X) ∧ 7 − (1.1X) ∧ 2. The expected value of Y (which is the expected payment per loss, not per payment) is

E[Y] = E[(1.1X) ∧ 7] − E[(1.1X) ∧ 2] = 1.1(E[X ∧ 7/1.1] − E[X ∧ 2/1.1])

The expected payment per payment is this expression divided by 1 − F_Y(0) = 1 − F_X(2/1.1). We see that a formula for E[X ∧ d] would be useful. We will have to calculate F_X(x) anyway in order to get F_X(2/1.1), so it is easiest to use formula (5.6).

F(x) = (1/5)∫_0^x (1 − u/10) du = (1/5)(u − u²/20) |_0^x = (1/5)(x − x²/20)

E[X ∧ d] = ∫_0^d (1 − F(u)) du = ∫_0^d (1 − (1/5)(u − u²/20)) du = (u − u²/10 + u³/300) |_0^d = d − d²/10 + d³/300

We are now ready to evaluate E[Y] and 1 − F_Y(0).

E[X ∧ 7/1.1] = 70/11 − 70²/(10 · 11²) + 70³/(300 · 11³) = 3.17305
E[X ∧ 2/1.1] = 20/11 − 20²/(10 · 11²) + 20³/(300 · 11³) = 1.50764
F_X(2/1.1) = (1/5)(20/11 − 20²/(20 · 11²)) = 0.33058

E[Y]/(1 − F_Y(0)) = 1.1(3.17305 − 1.50764)/(1 − 0.33058) = 2.7366
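The answer can be confirmed by brute-force numeric integration against the density (a sketch):

```python
# Numeric check of Example 9A: integrate the inflated payment
# directly against f(x) = (1/5)(1 - x/10) on [0, 10].
n = 200000
h = 10.0 / n
num = den = 0.0
for i in range(n):
    x = (i + 0.5) * h
    fx = (1.0 - x / 10.0) / 5.0
    pay = min(1.1 * x, 7.0) - min(1.1 * x, 2.0)
    num += pay * fx * h                 # E[Y], payment per loss
    if 1.1 * x > 2.0:
        den += fx * h                   # Pr(payment > 0)
per_payment = num / den                 # about 2.7366
assert abs(per_payment - 2.7366) < 0.001
```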

Quiz 9-1 Losses follow an exponential distribution with mean 1000. An insurance coverage on the losses has an ordinary deductible of 500 and a policy limit of 2000. Calculate the expected payment per payment for this coverage.

Calculating the variance of payment per loss and payment per payment in the presence of an ordinary deductible is not so straightforward. We can temporarily ignore inflation and coinsurance, since each of these multiplies the random variable by a factor, and we can adjust the variance by multiplying by that factor squared. So let's calculate the variance of the payment per loss random variable Y^L, defined by

Y^L = X ∧ u* − X ∧ d*


where u* and d* are the maximum covered loss and the deductible respectively, adjusted for inflation rate r by dividing by 1 + r. We can calculate the second moment of Y^L as follows:

(Y^L)² = (X ∧ u* − X ∧ d*)² = (X ∧ u*)² − 2(X ∧ u*)(X ∧ d*) + (X ∧ d*)²

Now, we would prefer a formula starting with (X ∧ u*)² − (X ∧ d*)², so we'll subtract and add 2(X ∧ d*)².

(Y^L)² = (X ∧ u*)² − (X ∧ d*)² + 2(X ∧ d*)² − 2(X ∧ u*)(X ∧ d*)
       = (X ∧ u*)² − (X ∧ d*)² + 2(X ∧ d*)(X ∧ d* − X ∧ u*)

We can replace the first X ∧ d* factor in the last term with d*. Because if X < d*, then X ∧ d* − X ∧ u* = 0, since both X ∧ d* and X ∧ u* are X, so the factor doesn't matter. And if X ≥ d*, then X ∧ d* = d*. Making this replacement and taking expectations on both sides, we get the final formula

E[(Y^L)²] = E[(X ∧ u*)²] − E[(X ∧ d*)²] − 2d*(E[X ∧ u*] − E[X ∧ d*])

I doubt that you have to memorize this formula. Exam questions on variance are rare. It is hard to use this formula for most distributions, since calculating limited second moments such as E[(X ∧ d)²] usually requires evaluating incomplete gamma or beta distributions, which you cannot be expected to do on an exam. There have been some questions on exams to calculate the variance of payment per loss in the presence of a deductible. These questions involved simple loss distributions such as exponentials, and could be solved by alternative methods, such as:

1. Treat the payment per loss random variable as a mixture distribution, a mixture of the constant 0 (with weight equal to the probability of being below the deductible) and the excess loss random variable. For an exponential, the excess loss random variable is the same as the original random variable. Then calculate the second moment of the mixture.

2. Treat the payment per loss random variable as a compound distribution. The primary distribution is Bernoulli: either the loss is higher than the deductible or it isn't. The secondary distribution is the excess loss random variable. Then use the compound variance formula, equation (14.2) on page 236. The concepts "primary distribution" and "secondary distribution", as well as the compound variance formula, are discussed in Lesson 14.

Example 9B The loss severity random variable follows an exponential distribution with mean 1000. A coverage for this loss has a deductible of 500. Calculate the variance of the payment per loss random variable.

Answer: We'll treat payment per loss as a mixture distribution. Let X be the loss random variable and Y^L the payment per loss random variable. Also let p = Pr(X > 500). Then Y^L is a mixture of the constant 0 with probability 1 − p and an exponential random variable with mean 1000 with probability p. The second moment of an exponential is 2θ², so

E[(Y^L)²] = (1 − p)(0²) + p(2)(1000²) = 2 · 10⁶ p

Here, p = e^(−1/2) and E[Y^L] = 1000p. So Var(Y^L) = 2 · 10⁶ p − 10⁶ p² = 2 · 10⁶ e^(−1/2) − 10⁶ e^(−1) = 845,182.


Example 9C Losses for an insurance coverage have the following density function:

f(x) = (1/5)(1 − x/10)    0 ≤ x ≤ 10
f(x) = 0                  otherwise

Insurance reimburses losses subject to an ordinary deductible of 2 and a maximum covered loss of 7. Calculate the variance of the average payment, taking into account payments of zero on losses below the deductible.

The payment size is 0 when X < 2, X − 2 for X between 2 and 7, and 5 when X > 7. So E[Y 2 ]  FX (2)(02 ) +

7

Z 2

  ( x − 2) 2 f X ( x ) dx + 1 − FX (7) (52 ) .

Let’s evaluate the integral. We will substitute y  x − 2 1 5

7

Z 2

( x − 2)

2



1 1 1 − x dx  10 5



5

Z

1  50

0

 5

Z 0

1−

1 ( y + 2) y 2 dy 10



1 (8 − y ) y dy  50 2

5

5

Z 0

(8y 2 − y 3 ) dy

1 8y 3 y 4  − 50 3 4 0   1 1000 625  −  3.54167 50 3 4

!

Now we evaluate FX (7) , using the formula for FX ( x ) developed in the preceding problem. 1 72 FX ( 7 )  7−  0.91 5 20

!

E[Y 2 ]  3.54167 + (1 − 0.91)(25)  5.79167 So Var ( Y )  5.79167 − 1.616672  3.1781


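A brute-force numeric check of this example (a sketch):

```python
# Numeric check of Example 9C: compute E[Y] and E[Y^2] by integrating
# against f(x) = (1/5)(1 - x/10) on [0, 10], with deductible 2 and
# maximum covered loss 7.
n = 200000
h = 10.0 / n
m1 = m2 = 0.0
for i in range(n):
    x = (i + 0.5) * h
    fx = (1.0 - x / 10.0) / 5.0
    y = min(x, 7.0) - min(x, 2.0)
    m1 += y * fx * h
    m2 += y * y * fx * h
variance = m2 - m1 * m1                 # about 3.1781
assert abs(variance - 3.1781) < 0.001
```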


Exercises

9.1. [4B-F92:3] (1 point) You are given the following:

• Based on observed data truncated from above at 10,000, the probability of a claim exceeding 3000 is 0.30.
• Based on the underlying distribution of losses, the probability of a claim exceeding 10,000 is 0.02.

Determine the probability that a claim exceeds 3000.

(A) Less than 0.28
(B) At least 0.28, but less than 0.30
(C) At least 0.30, but less than 0.32
(D) At least 0.32, but less than 0.34
(E) At least 0.34

9.2. [4B-S93:33] (3 points) The distribution for claim severity follows a single-parameter Pareto distribution of the following form:

f(x) = (3/1000)(x/1000)^(−4),    x > 1000

Determine the average size of a claim between 10,000 and 100,000, given that the claim is between 10,000 and 100,000.

(A) Less than 18,000
(B) At least 18,000, but less than 28,000
(C) At least 28,000, but less than 38,000
(D) At least 38,000, but less than 48,000
(E) At least 48,000

9.3. [CAS3-F03:21] The cumulative loss distribution for a risk is F(x) = 1 − 10⁶/(x + 10³)².

An insurance policy pays the loss subject to a deductible of 1000 and a maximum covered loss of 10,000. Calculate the percentage of expected aggregate losses that are paid.

(A) 10% (B) 12% (C) 17% (D) 34% (E) 41%

9.4. Losses follow a lognormal distribution with µ = 6.9078, σ = 2.6283. A policy covers losses subject to a 1000 franchise deductible and a 100,000 policy limit. Determine the average payment per paid claim.

9.5. An insurer pays losses subject to an ordinary deductible of $1000 and a coinsurance factor of 80%. The coinsurance factor is applied before the deductible, so that nothing is paid for losses below $1250. You are given:

(i) Losses follow a two-parameter Pareto distribution with α = 2.
(ii) Average payment per loss is $2500.

Determine the average loss.



9.6. An insurer pays losses subject to an ordinary deductible of $1000 and a coinsurance factor of 80%. The coinsurance factor is applied before the deductible, so that nothing is paid for losses below $1250. You are given:

(i) Losses follow a two-parameter Pareto distribution with α = 2.
(ii) Average payment per paid claim is $2500.

Determine the average loss.

9.7. [151-82-92:6] The probability density function of the loss, Y, is

f(y) = 0.02(1 − y/100)    0 < y < 100
f(y) = 0                  otherwise

The amount paid, Z, is 80 percent of that portion of the loss that exceeds a deductible of 10. Determine E[Z].

(A) 17 (B) 19 (C) 21 (D) 23 (E) 25

9.8. Losses follow a two-parameter Pareto distribution with parameters α = 0.5 and θ = 2000. Insurance pays claims subject to a deductible of 500 and a maximum covered loss of 20,000, with 75% coinsurance. Determine the size of the average claim payment.

9.9. You are given the following:

(i) Losses follow a single-parameter Pareto distribution with parameters α = 4, θ = 5000.
(ii) For each loss, insurance covers 80% of the amount of the loss up to 10,000 and 100% of the amount of the loss above that level.

Calculate the expected payment per loss.

9.10. [4B-F95:13] [4B-S98:9] (3 points) You are given the following:

• Losses follow a uniform distribution on the interval from 0 to 50,000.
• The maximum covered loss is 25,000.
• There is a deductible of 5,000 per loss.
• The insurer makes a nonzero payment P.

Determine the expected value of P.

(A) Less than 15,000
(B) At least 15,000, but less than 17,000
(C) At least 17,000, but less than 19,000
(D) At least 19,000, but less than 21,000
(E) At least 21,000


9.11. You are given:

1. The amount of a single claim X has a continuous distribution.
2. Some values from the distribution are given in the following table:

x         F(x)    E[X ∧ x]
100       0.6     75
500       0.7     200
1,000     0.9     300
10,000    1.0     800

Calculate the average payment per payment under a coverage with franchise deductible 100 and maximum covered loss of 1000.

9.12. [3-S00:30] X is a random variable for a loss. Losses in the year 2000 have a distribution such that:

E[X ∧ d] = −0.025d² + 1.475d − 2.25,    d = 10, 11, 12, . . . , 26

Losses are uniformly 10% higher in 2001. An insurance policy reimburses 100% of losses subject to a deductible of 11 up to a maximum reimbursement of 11. Calculate the ratio of expected reimbursements in 2001 over expected reimbursements in the year 2000.

(A) 110.0% (B) 110.5% (C) 111.0% (D) 111.5% (E) 112.0%

9.13. [CAS3-F03:22] The severity distribution function of claims data for automobile property damage coverage for Le Behemoth Insurance Company is given by an exponential distribution:

F(x) = 1 − exp(−x/5000)

To improve profitability of this portfolio of policies, Le Behemoth institutes the following policy modifications:

(i) It imposes a per-claim deductible of 500.
(ii) It imposes a per-claim maximum covered loss of 25,000.

Previously, there was no deductible and no maximum covered loss. Calculate the average savings per (old) claim if the new deductible and maximum covered loss had been in place.

(A) 490 (B) 500 (C) 510 (D) 520 (E) 530



9.14. [SOA3-F04:7] Annual prescription drug costs are modeled by a two-parameter Pareto distribution with θ = 2000 and α = 2. A prescription drug plan pays annual drug costs for an insured member subject to the following provisions:

(i) The insured pays 100% of costs up to the ordinary annual deductible of 250.
(ii) The insured then pays 25% of the costs between 250 and 2250.
(iii) The insured pays 100% of the costs above 2250 until the insured has paid 3600 in total.
(iv) The insured then pays 5% of the remaining costs.

Determine the expected annual plan payment.

(A) 1120 (B) 1140 (C) 1160 (D) 1180 (E) 1200

9.15. Annual losses follow a two-parameter Pareto distribution with θ = 500 and α = 2. An insurance plan has the following provisions:

(i) The insured pays 100% of the costs up to an ordinary annual deductible of 250.
(ii) The insurance pays 80% of the costs between 250 and 2250.
(iii) The insurance pays 95% of the costs above 2250.

Calculate the Tail-Value-at-Risk for the annual payments of the insurance plan at the 90% security level.

9.16. [CAS3-S04:35] The XYZ Insurance Company sells property insurance policies with a deductible of $5,000, policy limit of $500,000, and a coinsurance factor of 80%. Let X_i be the individual loss amount of the i-th claim and Y_i be the claims payment of the i-th claim. Which of the following represents the relationship between X_i and Y_i?

(A) Y_i = 0 for X_i ≤ 5,000; 0.80(X_i − 5,000) for 5,000 < X_i ≤ 625,000; 500,000 for X_i > 625,000
(B) Y_i = 0 for X_i ≤ 4,000; 0.80(X_i − 4,000) for 4,000 < X_i ≤ 500,000; 500,000 for X_i > 500,000
(C) Y_i = 0 for X_i ≤ 5,000; 0.80(X_i − 5,000) for 5,000 < X_i ≤ 630,000; 500,000 for X_i > 630,000
(D) Y_i = 0 for X_i ≤ 6,250; 0.80(X_i − 6,250) for 6,250 < X_i ≤ 631,250; 500,000 for X_i > 631,250
(E) Y_i = 0 for X_i ≤ 5,000; 0.80(X_i − 5,000) for 5,000 < X_i ≤ 505,000; 500,000 for X_i > 505,000


9.17. [CAS3-F04:33] Losses for a line of insurance follow a Pareto distribution with θ = 2,000 and α = 2.

An insurer sells policies that pay 100% of each loss up to $5,000. The next year the insurer changes the policy terms so that it will pay 80% of each loss after applying a $100 deductible. The $5,000 limit continues to apply to the original loss amount. That is, the insurer will pay 80% of the loss amount between $100 and $5,000. Inflation will be 4%. Calculate the decrease in the insurer's expected payment per loss.

(A) Less than 23%
(B) At least 23%, but less than 24%
(C) At least 24%, but less than 25%
(D) At least 25%, but less than 26%
(E) At least 26%

9.18. For an insurance coverage, you are given:

• Losses, before application of any deductible or limit, follow a Pareto distribution with α = 2.
• The coverage is subject to a franchise deductible of 500 and a maximum covered loss of 10,000.
• Average payment per paid claim on this coverage is 2500.

Determine the average loss size, before application of deductible and limit, for this insurance coverage.

Use the following information for questions 9.19 and 9.20:

A jewelry store has obtained two separate policies that together provide full coverage. You are given:

(i) The average ground-up loss is 11,100.
(ii) Policy A has an ordinary deductible of 5,000 with no policy limit.
(iii) Under policy A, the expected amount paid per loss is 6,500.
(iv) Under policy A, the expected amount paid per payment is 10,000.
(v) Policy B has no deductible and a policy limit of 5,000.

9.19. [4-F00:18] Given that a loss has occurred, determine the probability that the payment under policy B is 5,000. (A) (B) (C) (D) (E)

Less than 0.3 At least 0.3, but less than 0.4 At least 0.4, but less than 0.5 At least 0.5, but less than 0.6 At least 0.6

9.20. [4-S00:6] Given that a loss less than or equal to 5,000 has occurred, what is the expected payment under policy B? (A) (B) (C) (D) (E)

Less than 2,500 At least 2,500, but less than 3,000 At least 3,000, but less than 3,500 At least 3,500, but less than 4,000 At least 4,000


9. OTHER TOPICS IN SEVERITY COVERAGE MODIFICATIONS


9.21. Losses follow a uniform distribution on [0, 10,000]. Insurance has a deductible of 1000 and 80% coinsurance. The coinsurance is applied after the deductible, so that a positive payment is made on any loss above 1000. Calculate the variance of the amount paid per loss.

9.22. Losses follow a two-parameter Pareto distribution with parameters α = 3 and θ = 1000. An insurance coverage has an ordinary deductible of 1500. Calculate the variance of the payment per loss on the coverage.

9.23. For losses X, you are given

    x          800   1000   1250   4000   5000   6250
    E[X ∧ x]   300    380    440    792    828    900

Inflation of 25% affects these losses. Calculate the expected payment per loss after inflation on a coverage with a 1000 ordinary deductible and a 5000 maximum covered loss.

Additional released exam questions: CAS3-F06:30, SOA M-F06:6,20,29,31, C-S07:13

Solutions

9.1. Truncation means that claims are observed only if they're in the untruncated range. In other words, the first bullet is saying

Pr(X > 3000 | X ≤ 10,000) = 0.3

Combining the first two bullets,

Pr(X ≤ 3000) = 0.7 Pr(X ≤ 10,000) = 0.7(0.98) = 0.686
Pr(X > 3000) = 1 − 0.686 = 0.314   (C)

9.2. Note that we are being asked for the average size of the total claim, including the amount below 10,000, not just the amount between 10,000 and 100,000. The intended solution was probably the following: For a claim, the expected value of the amount of the claim between 10,000 and 100,000 (this is ignoring the condition that the claim is between 10,000 and 100,000) is

∫_{10,000}^{100,000} x f(x) dx = ∫_0^{100,000} x f(x) dx − ∫_0^{10,000} x f(x) dx

However, E[X ∧ d] = ∫_0^d x f(x) dx + d S(d). Therefore

∫_{10,000}^{100,000} x f(x) dx = (E[X ∧ 100,000] − 100,000 S(100,000)) − (E[X ∧ 10,000] − 10,000 S(10,000))

EXERCISE SOLUTIONS FOR LESSON 9


That is before the condition that the claim is between 10,000 and 100,000. The probability of that condition is F(100,000) − F(10,000). Thus the answer to the question is

[(E[X ∧ 100,000] − 100,000 S(100,000)) − (E[X ∧ 10,000] − 10,000 S(10,000))] / [F(100,000) − F(10,000)]

We now proceed to calculate the needed limited expected values and distribution functions. The single-parameter Pareto has parameters α = 3, θ = 1000. We'll use the tables in the appendix for the limited expected values.

E[X ∧ 10,000] = 3(1000)/2 − 1000³/(2(10,000²)) = 1500 − 5 = 1495
E[X ∧ 100,000] = 3(1000)/2 − 1000³/(2(100,000²)) = 1500 − 0.05 = 1499.95
F(10,000) = 1 − (1000/10,000)³ = 0.999
F(100,000) = 1 − (1000/100,000)³ = 0.999999

[(E[X ∧ 100,000] − 100,000 S(100,000)) − (E[X ∧ 10,000] − 10,000 S(10,000))] / [F(100,000) − F(10,000)]
  = [(1499.95 − 0.1) − (1495 − 10)] / (0.999999 − 0.999)
  = 14.85/0.000999 = 14,864.86   (A)
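If you want to verify the arithmetic, the computation above can be reproduced in a few lines. This is a sketch of mine, not part of the original solution; the helper names are my own, using the single-parameter Pareto formulas quoted above.

```python
# Numerical check of solution 9.2: average size of a claim between
# 10,000 and 100,000 for a single-parameter Pareto, alpha = 3, theta = 1000.

ALPHA, THETA = 3, 1000

def lev(d):
    """E[X ^ d] = alpha*theta/(alpha-1) - theta^alpha/((alpha-1) d^(alpha-1))."""
    return ALPHA * THETA / (ALPHA - 1) - THETA**ALPHA / ((ALPHA - 1) * d**(ALPHA - 1))

def surv(d):
    """S(d) = (theta/d)^alpha for d >= theta."""
    return (THETA / d) ** ALPHA

# Partial expectation over (10,000, 100,000], then divide by the
# probability that the claim falls in that interval.
numerator = (lev(100_000) - 100_000 * surv(100_000)) - (lev(10_000) - 10_000 * surv(10_000))
denominator = surv(10_000) - surv(100_000)
answer = numerator / denominator
print(round(answer, 2))
```

This reproduces the 14,864.86 computed above.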

It turns out that the correct answer choice could be identified more easily.2 Roughly speaking, the average size of a claim over 10,000 is higher than the average size of a claim between 10,000 and 100,000, since the former brings higher claims into the average. However, the average size of a claim over 10,000 is 10,000 + e (10,000)  15,000. Even after adjusting for the smaller base (the average we need is divided by only the number of claims between 10,000 and 100,000, whereas e ( x ) is divided by the number of claims between 10,000 and ∞), the result is well below 18,000. So (A) must be correct. Let’s carry out this idea formally. The problem is asking for

∫_{10,000}^{100,000} x f(x) dx / [F(100,000) − F(10,000)]     (9.1)

where f(x) is the single-parameter Pareto's density function. We can relate this quantity to the mean excess loss, which is easy to calculate. We know by formula (6.11) that

e(10,000) = d/(α − 1) = 10,000/(3 − 1) = 5000

However, by definition,

e(10,000) = ∫_{10,000}^∞ (x − 10,000) f(x) dx / [1 − F(10,000)]
  = ∫_{10,000}^∞ x f(x) dx / [1 − F(10,000)] − 10,000 ∫_{10,000}^∞ f(x) dx / [1 − F(10,000)]
  = ∫_{10,000}^∞ x f(x) dx / [1 − F(10,000)] − 10,000

Let's compare the integral in (9.1) to this integral.

∫_{10,000}^{100,000} x f(x) dx / [F(100,000) − F(10,000)]
  = (∫_{10,000}^{100,000} x f(x) dx / [1 − F(10,000)]) · ([1 − F(10,000)] / [F(100,000) − F(10,000)])
  = (∫_{10,000}^{100,000} x f(x) dx / [1 − F(10,000)]) · (1 − 0.999)/(0.999999 − 0.999)
  = 1.001001 (∫_{10,000}^{100,000} x f(x) dx / [1 − F(10,000)])

where we've used our calculations of F(10,000) and F(100,000) from above. The integral is no greater than an integral going to ∞ instead of to 100,000. But

∫_{10,000}^∞ x f(x) dx / [1 − F(10,000)] = 10,000 + e(10,000) = 15,000

²This idea was shown to me by Rick Lassow.

C/4 Study Manual—17th edition Copyright ©2014 ASM

We've shown that the answer is no greater than 15,000(1.001001) = 15,015. Thus it is certainly less than 18,000, and the answer must therefore be (A).

9.3. This is a Pareto distribution with α = 2, θ = 1000. We must calculate

(E[X ∧ 10,000] − E[X ∧ 1000]) / E[X]

We'll use the tables in the appendix.

E[X] = 1000
E[X ∧ 1000] = 1000(1 − 1000/2000) = 500
E[X ∧ 10,000] = 1000(1 − 1000/11,000) = 10,000/11

(10,000/11 − 500)/1000 = 0.4091   (E)

9.4. Did you notice that the deductible is a franchise deductible? The insurer pays the entire loss if it is above 1000. If the loss is 100,000, the insurer pays 100,000. If the loss is greater than 100,000, the payment is capped at 100,000 by the policy limit. Hence we are interested in E[X ∧ 100,000]. The answer is

E[X ∧ 100,000] = exp(6.9078 + 2.6283²/2) Φ((ln 100,000 − 6.9078 − 2.6283²)/2.6283) + 100,000(1 − Φ((ln 100,000 − 6.9078)/2.6283))
  = 31,627 Φ(−0.88) + 100,000(1 − Φ(1.75))
  = 31,627(0.1894) + 100,000(1 − 0.9599) = 10,000


E[X ∧ 1000] = exp(6.9078 + 2.6283²/2) Φ((ln 1000 − 6.9078 − 2.6283²)/2.6283) + 1000(1 − Φ((ln 1000 − 6.9078)/2.6283))
  = 31,627 Φ(−2.63) + 1000(1 − Φ(0))
  = 31,627(0.0043) + 500 = 636

F(1000) = Φ(0) = 0.5

The average payment per paid claim for an ordinary deductible would be

Average per claim = (10,000 − 636)/(1 − F(1000)) = 18,728

For a franchise deductible, each paid claim gets 1000 more. The answer is 19,728.

9.5. The average amount paid per loss is 80% of the average amount by which a loss exceeds 1250. Before multiplying by 80%, the expected payment per loss is E[X] − E[X ∧ 1250]. By equation (6.8), this is equal to e(1250)(1 − F(1250)). Now multiply this by 0.8 and equate to 2500.

2500 = 0.8 e(1250)(1 − F(1250))     (*)

For our two-parameter Pareto, by equation (6.10),

e(1250) = (θ + 1250)/(α − 1) = θ + 1250

and from the tables

1 − F(1250) = (θ/(θ + 1250))²

Substituting these expressions into (*), and dividing both sides by 0.8,

3125 = (θ + 1250)(θ/(θ + 1250))² = θ²/(θ + 1250)
θ² − 3125θ − (3125)(1250) = 0
θ = (3125 + √(3125² + 4(3,906,250)))/2 = 4081.9555
E[X] = θ/(α − 1) = 4081.9555

9.6. The average payment per paid claim is 80% of the mean excess loss at 1250. As we mentioned in the previous solution, for our Pareto, e(1250) = θ + 1250.

2500 = 0.8 e(1250) = 0.8(θ + 1250)
0.8θ = 1500
θ = 1875
E[X] = θ/(α − 1) = 1875


9.7. We can apply the 80% factor after calculating E[(Y − 10)+]. To calculate E[(Y − 10)+], while equation (6.2) could be used, it's easier to use equation (6.3). First we calculate F(y).

F(y) = ∫_0^y 0.02(1 − u/100) du = −(1 − u/100)² |_0^y = 1 − (1 − y/100)²
S(y) = (1 − y/100)²

E[(Y − 10)+] = ∫_{10}^{100} (1 − y/100)² dy = −(100/3)(1 − y/100)³ |_{10}^{100} = (100/3)(0.9³) = 24.3

E[Z] = 0.8 E[(Y − 10)+] = (0.8)(24.3) = 19.44   (B)

9.8. Average claim payment per loss is 0.75(E[X ∧ 20,000] − E[X ∧ 500]).

E[X ∧ 20,000] = (2000/(−0.5))(1 − (2000/22,000)^(−0.5)) = 9266.50
E[X ∧ 500] = (2000/(−0.5))(1 − (2000/2500)^(−0.5)) = 472.14

We must divide average claim payment per loss by 1 − F(500) = (2000/2500)^(0.5) = 0.8944 to get claim payment per claim. Average claim payment is 0.75(9266.50 − 472.14)/0.8944 = 7374.30.
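The α = 0.5 case trips people up because θ/(α − 1) is negative; the limited expected value still comes out positive because the other factor is negative too. A quick numerical check (my own sketch, not part of the manual):

```python
# Check of solution 9.8: two-parameter Pareto with alpha = 0.5, theta = 2000.
# The limited expected value formula holds for any alpha != 1; for alpha < 1
# both factors below are negative, so the product is positive.

def pareto_lev(d, alpha, theta):
    """E[X ^ d] = theta/(alpha-1) * (1 - (theta/(theta+d))^(alpha-1))."""
    return theta / (alpha - 1) * (1 - (theta / (theta + d)) ** (alpha - 1))

def pareto_surv(d, alpha, theta):
    """S(d) = (theta/(theta+d))^alpha."""
    return (theta / (theta + d)) ** alpha

per_loss = 0.75 * (pareto_lev(20_000, 0.5, 2000) - pareto_lev(500, 0.5, 2000))
per_claim = per_loss / pareto_surv(500, 0.5, 2000)
print(round(per_claim, 1))
```

This agrees with the 7374.30 above to within rounding of the intermediate values.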

9.9. Did you understand (ii) properly? Statement (ii) does not say that the insurance covers 100% of the loss for losses above 10,000. It says that insurance pays 100% of the amount of the loss above 10,000. For a loss of 11,000, the company would pay 80% of the first 10,000 and 100% of the next 1000, resulting in a payment of 9000, not 11,000. If the loss variable is X, the payment per loss variable X^L is equal to X − 0.2(X ∧ 10,000), since 20% of the loss below 10,000 is deducted from the loss. For a single-parameter Pareto, the formulas for E[X] and E[X ∧ d] are

E[X] = αθ/(α − 1)
E[X ∧ d] = E[X] − θ^α/((α − 1) d^(α−1))

Therefore

E[X] = (4)(5000)/3 = 20,000/3
E[X ∧ 10,000] = 20,000/3 − 5000⁴/(3(10,000³)) = 20,000/3 − 625/3 = 19,375/3
E[X^L] = 20,000/3 − 0.2(19,375/3) = 5375

9.10. The question asks for the expected value of the payment random variable P, or the expected payment per payment. That is

E[P] = (E[X ∧ 25,000] − E[X ∧ 5,000]) / (1 − F(5,000))

When X has a uniform distribution on [0, θ], the limited expected value at u can be calculated as a weighted average: the probability of X ≤ u times u/2, plus the probability of X > u times u:

E[X ∧ u] = (u/θ)(u/2) + (1 − u/θ)(u)

Applying this formula,

E[X ∧ 25,000] = 0.5(12,500) + 0.5(25,000) = 18,750
E[X ∧ 5,000] = 0.1(2,500) + 0.9(5,000) = 4,750

The denominator is 1 − F(5,000) = 0.9. Therefore, the expected payment per payment is

E[P] = (18,750 − 4,750)/(1 − 0.1) = 15,555 5/9   (B)

9.11. The payment per loss is X ∧ 1000 − X ∧ 100 + 100(1 − F(100)). The payment per payment is the payment per loss divided by S(100). So we compute

(E[X ∧ 1000] − E[X ∧ 100])/S(100) + 100 = (300 − 75)/(1 − 0.6) + 100 = 662.5

9.12. In 2000, we need E[X ∧ 22] − E[X ∧ 11]. In 2001, we need

E[1.1X ∧ 22] − E[1.1X ∧ 11] = 1.1(E[X ∧ 20] − E[X ∧ 10])

We therefore proceed to calculate the four limited expected values.

E[X ∧ 22] = −0.025(22²) + 1.475(22) − 2.25 = 18.1
E[X ∧ 11] = −0.025(11²) + 1.475(11) − 2.25 = 10.95
E[X ∧ 20] = −0.025(20²) + 1.475(20) − 2.25 = 17.25
E[X ∧ 10] = −0.025(10²) + 1.475(10) − 2.25 = 10

The ratio is

1.1(E[X ∧ 20] − E[X ∧ 10]) / (E[X ∧ 22] − E[X ∧ 11]) = 1.1(17.25 − 10)/(18.1 − 10.95) = 1.115385   (D)

The calculations can be shortened a bit by leaving out −2.25 from each calculation, since it's a constant.

9.13.

θ for this exponential is 5000.

E[X ∧ 25,000] − E[X ∧ 500] = 5000((1 − e^(−5)) − (1 − e^(−0.1))) = 4490
5000 − 4490 = 510   (C)
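The inflation step in solution 9.12 relied on the identity E[cX ∧ d] = c E[X ∧ d/c]. Here is a quick numerical check of the resulting ratio (a sketch of mine, not part of the manual), using the quadratic limited expected values given in exercise 9.12:

```python
# Check of solution 9.12: the 2001 layer equals 1.1 times the layer at the
# deflated limits, i.e. E[1.1X ^ d] = 1.1 E[X ^ (d/1.1)].

def lev(x):
    # Limited expected value as given in exercise 9.12.
    return -0.025 * x**2 + 1.475 * x - 2.25

ratio = 1.1 * (lev(20) - lev(10)) / (lev(22) - lev(11))
print(round(ratio, 6))
```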



Figure 9.1: Plan payments as a function of loss in exercise 9.14

9.14. The four components (i)–(iv) of the formula have the following expected plan payments:

(i) Insured pays 100% of the costs up to 250: 0 (the plan pays nothing for these losses).
(ii) Insured pays 25% of the costs between 250 and 2250: 0.75(E[X ∧ 2250] − E[X ∧ 250]).
(iii) Insured pays 100% of the costs between 2250 and u, where u is the loss for which the insured's payment is 3600: 0.
(iv) Insured pays 5% of the excess over u: 0.95(E[X] − E[X ∧ u]).

Let's determine u. u is clearly above 2250. For a loss of 2250, the insured pays 250 + 0.25(2250 − 250) = 750. The insured then pays 100% until the insured's cumulative payment is 3600, so the insured's payment for a loss 2250 ≤ x ≤ u is 750 + 100%(x − 2250) = x − 1500, and the payment is 3600 when x − 1500 = 3600, or x = 5100, so we conclude u = 5100. A graph of plan payments as a function of loss is shown in Figure 9.1. Combining the four components of expected plan payments, we have

0.75(E[X ∧ 2250] − E[X ∧ 250]) + 0.95(E[X] − E[X ∧ 5100])

Let's calculate the needed limited and unlimited expected values.

E[X] = 2000
E[X ∧ 250] = 2000(1 − 2000/2250) = 222.22
E[X ∧ 2250] = 2000(1 − 2000/4250) = 1058.82
E[X ∧ 5100] = 2000(1 − 2000/7100) = 1436.62

The insurance pays

0.75(E[X ∧ 2250] − E[X ∧ 250]) + 0.95(E[X] − E[X ∧ 5100]) = 0.75(1058.82 − 222.22) + 0.95(2000 − 1436.62) = 1162.66   (C)


9.15. The 90th percentile of the Pareto for the ground-up losses may be read off the tables (VaR_0.9(X)), or may be calculated; it is the x determined from:

(500/(500 + x))² = 1 − 0.9 = 0.1
500 + x = 500/√0.1 = 1581.14
x = 1081.14

We need the partial expectation of payments for losses above 1081.14. The partial expectation of the amount of a payment for the loss above 1081.14, given that the loss is greater than 1081.14, is

0.8(E[X ∧ 2250] − E[X ∧ 1081.14]) + 0.95(E[X] − E[X ∧ 2250])

We compute the needed limited expected values.

E[X] = 500
E[X ∧ 1081.14] = 500(1 − 500/1581.14) = 341.89
E[X ∧ 2250] = 500(1 − 500/2750) = 409.09

so the partial expectation of the amount above 1081.14 is 0.8(409.09 − 341.89) + 0.95(500 − 409.09) = 140.1245. Dividing by 1 − p = 0.1, the conditional expectation of the payment given that a loss is greater than 1081.14 is 1401.245. In addition, the insurer pays 0.8(1081.14 − 250) = 664.91 on each loss above 1081.14. Therefore, the Tail-Value-at-Risk of annual payments is 1401.245 + 664.91 = 2066.16.

9.16. The policy limit is the most that will be paid. To pay 500,000, claims of 500,000/0.8 + 5000 = 630,000 of covered losses are needed, so the answer is (C).

9.17.

In the first year, the payment per loss is

E[X ∧ 5000] = 2000(1 − 2000/7000) = 1428.57

In the second year, the new θ is 1.04(2000) = 2080 and the payment per loss, considering 80% coinsurance, is

0.8(E[X ∧ 5000] − E[X ∧ 100]) = 0.8(2080)(2080/2180 − 2080/7080) = 1098.81

The ratio is 1098.81/1428.57 = 0.7692. The decrease is 1 − 0.7692 = 0.2308.   (B)

9.18. The payment per payment for a franchise deductible is the deductible (500) plus the payment per payment for an ordinary deductible, so we will subtract 500 and equate the payment per payment to 2000.

(E[X ∧ 10,000] − E[X ∧ 500]) / (1 − F(500)) = 2000
(θ(10,000)/(10,000 + θ) − θ(500)/(500 + θ)) ((θ + 500)²/θ²) = 2000
(500 + θ)²(10,000/(10,000 + θ) − 500/(500 + θ)) = 2000θ
(500 + θ)(10,000(500 + θ) − 500(10,000 + θ)) = 2000θ(10,000 + θ)
(500 + θ)(10,000θ − 500θ) = 2000θ(10,000 + θ)
9500θ(500 + θ) = 2000θ(10,000 + θ)
95(500 + θ) = 20(10,000 + θ)
75θ = 200,000 − 500(95) = 152,500
θ = 2033 1/3

Since α = 2, the average loss size is E[X] = θ/(α − 1) = 2033 1/3.
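Instead of the algebra, θ can also be found numerically; this sketch (mine, not part of the original solution) bisects on the same per-payment equation:

```python
# Check of solution 9.18: solve for theta numerically, Pareto alpha = 2,
# ordinary deductible 500, maximum covered loss 10,000, target 2000.

def per_payment_minus_target(theta):
    lev_10k = theta * 10_000 / (theta + 10_000)   # E[X ^ 10,000]
    lev_500 = theta * 500 / (theta + 500)         # E[X ^ 500]
    surv_500 = (theta / (theta + 500)) ** 2       # 1 - F(500)
    return (lev_10k - lev_500) / surv_500 - 2000

lo, hi = 1000.0, 3000.0          # this interval brackets the root
for _ in range(80):
    mid = (lo + hi) / 2
    if per_payment_minus_target(lo) * per_payment_minus_target(mid) <= 0:
        hi = mid
    else:
        lo = mid
theta = (lo + hi) / 2
print(round(theta, 2))
```

This converges to the 2033 1/3 found algebraically.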

9.19. It's strange that this question is easier than the similar one from the previous exam (the next question). Expected amount per payment is expected amount per loss divided by the probability of a payment, so from (ii) and (iii), the probability of a payment on policy A is

10,000 = 6,500/Pr(X > 5000)
Pr(X > 5000) = 6,500/10,000 = 0.65   (E)

X > 5000 and policy B paying exactly 5000 are equivalent.

9.20. From the previous question's solution, we know that Pr(X > 5000) = 0.65. Since policy A and policy B cover the entire loss, their expected payments per loss must add up to 11,100. Since the expected payment per loss under policy A is 6,500, the expected payment per loss under policy B is 11,100 − 6,500 = 4,600. Let x be the payment per loss given that the loss is less than 5000. By double expectation,

Pr(X ≤ 5000)x + Pr(X > 5000)(5000) = 4600

Then

0.35x + 0.65(5000) = 4600
x = 1350/0.35 = 3857 1/7   (D)

9.21. Let X be the loss, Y the payment, and Z the payment before coinsurance. Since the payment is Y = 0.8Z, its variance is Var(Y) = 0.8² Var(Z). The straightforward way of doing this problem is calculating first and second moments of Z. Since X is uniform on [0, 10,000], Z is a mixture of a variable which is the constant 0 with weight 0.1 and uniform on [0, 9000] with weight 0.9. The probability that a loss leads to a payment is 0.9, and the average payment given that a payment is made is the midpoint of the range, or 4500. Therefore the mean is

E[Z] = 0.1(0) + 0.9(4500) = 4050

The second moment of a uniform random variable on [0, x] is x²/3, so the second moment of the mixture is

E[Z²] = 0.1(0²) + 0.9(9000²/3) = 24,300,000

Then Var(Z) = 24,300,000 − 4050² = 7,897,500 and Var(Y) = 0.8²(7,897,500) = 5,054,400.


An alternative is to use the conditional variance formula (formula (4.2) on page 64). Let I be the indicator variable on whether the loss is greater than 1000 or not. Then Z | X < 1000 = 0, and Z | X > 1000 is a uniform random variable on (0, 9000) with mean 4500 and variance 9000²/12. So

Var(Z) = E[Var(Z | I)] + Var(E[Z | I])

and Var(Z | I) is Bernoulli with value 0 or 9000²/12, so

E[Var(Z | I)] = 0.1(0) + 0.9(9000²/12) = 6,075,000

while E[Z | I] is either 0 or 4500, so using the Bernoulli variance formula,

Var(E[Z | I]) = 4500²(0.9)(0.1) = 1,822,500
Var(Y) = 0.8²(6,075,000 + 1,822,500) = 5,054,400
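Both routes for solution 9.21 can be checked against each other in a few lines (a sketch of mine, not from the manual):

```python
# Check of solution 9.21: direct moments vs. conditional variance.

# Direct: Z is 0 with probability 0.1, uniform on [0, 9000] with probability 0.9.
ez = 0.9 * 4500
ez2 = 0.9 * 9000**2 / 3            # second moment of U[0, x] is x^2/3
var_direct = ez2 - ez**2

# Conditional variance on the indicator of a loss above 1000.
var_cond = 0.9 * 9000**2 / 12 + 4500**2 * 0.9 * 0.1

var_payment = 0.8**2 * var_direct  # apply the 80% coinsurance
print(round(var_direct), round(var_cond), round(var_payment))
```

Both approaches give Var(Z) = 7,897,500, hence Var(Y) = 5,054,400.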

9.22. We mentioned on page 100 that (X − d) | X > d has a Pareto distribution with parameters α and θ + d. Here, that means the parameters are α = 3 and θ = 1000 + 1500 = 2500. Let X be the random variable for loss, Y^L the random variable for payment per loss, and p = Pr(X > 1500) the probability of a loss above the deductible. The payment per loss is a mixture of 0 with weight 1 − p and a Pareto with weight p. The Pareto for payment per payment, which we'll call Y^P, has parameters α = 3 and θ = 2500. The probability of a loss above the deductible is p = (θ/(θ + d))^α = (1000/2500)³ = 0.064. The expected value of Y^L is

E[Y^L] = p E[Y^P] = 0.064(2500/(3 − 1)) = 0.064(1250) = 80

The second moment of Y^L is

E[(Y^L)²] = p E[(Y^P)²] = 0.064(2(2500²)/((2)(1))) = 400,000

The variance of Y^L is

Var(Y^L) = 400,000 − 80² = 393,600

9.23.

If X′ is the inflated loss, then X′ = 1.25X, and the payment per loss is

E[X′ ∧ 5000] − E[X′ ∧ 1000] = E[1.25X ∧ 5000] − E[1.25X ∧ 1000]
  = 1.25 E[X ∧ 4000] − 1.25 E[X ∧ 800] = 1.25(792 − 300) = 615
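The moments in solution 9.22 can be reproduced directly from the shifted-Pareto facts it quotes; a quick check of mine, not part of the original solution:

```python
# Check of solution 9.22: payment per loss for a Pareto alpha = 3,
# theta = 1000 loss with a 1500 ordinary deductible.

alpha, theta, d = 3, 1000, 1500
p = (theta / (theta + d)) ** alpha      # Pr(X > d) = 0.064
theta_pp = theta + d                    # per-payment Pareto parameter, 2500

mean_pp = theta_pp / (alpha - 1)                            # E[Y^P]
second_pp = 2 * theta_pp**2 / ((alpha - 1) * (alpha - 2))   # E[(Y^P)^2]

mean_pl = p * mean_pp
var_pl = p * second_pp - mean_pl**2
print(round(mean_pl, 2), round(var_pl))
```

This reproduces E[Y^L] = 80 and Var(Y^L) = 393,600.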

Quiz Solutions

9-1. Let Y^P be the payment per payment random variable.

E[X ∧ 500] = 1000(1 − e^(−0.5)) = 393.47
E[X ∧ 2500] = 1000(1 − e^(−2.5)) = 917.92
1 − F(500) = e^(−0.5) = 0.606531
E[Y^P] = (917.92 − 393.47)/0.606531 = 864.66
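The quiz answer can be confirmed with the exponential limited expected value E[X ∧ d] = θ(1 − e^(−d/θ)); a quick check of mine, not part of the original solution:

```python
import math

# Check of quiz solution 9-1: exponential mean 1000, ordinary deductible 500,
# maximum covered loss 2500; expected payment per payment.

theta = 1000
lev = lambda d: theta * (1 - math.exp(-d / theta))   # E[X ^ d]

eyp = (lev(2500) - lev(500)) / math.exp(-500 / theta)
print(round(eyp, 2))
```

By the memoryless property this simplifies to E[X ∧ 2000] = 1000(1 − e^(−2)) = 864.66, agreeing with the answer above.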


Lesson 10

Bonuses

Reading: Loss Models Fourth Edition 8.2–8.5

Exams in the early 2000s used to have bonus questions. To my knowledge, they no longer appear on exams, but these questions give you some practice in manipulating limited expected losses. However, you may skip this lesson. A bonus question goes like this: an agent receives a bonus if his loss ratio is below a certain amount, or a hospital receives a bonus if it doesn't submit too many claims. Usually the losses in these questions follow a two-parameter Pareto distribution with α = 2, to make the calculations simple. As I mentioned earlier, the formulas for E[X ∧ d] are difficult to use for distributions other than exponential, Pareto, and lognormal. To work out these questions, express the bonus in terms of the earned premium; it'll always be something like max(0, c(rP − X)), where r is the loss ratio, P is earned premium, X is losses, and c is some constant. Then you can pull out crP and write it as crP − c min(rP, X) = crP − c(X ∧ rP). You then use the tables to calculate the expected value. Since the Pareto with α = 2 is used so often for this type of problem, let's write the formula for E[X ∧ d] down for this distribution:

E[X ∧ d] = θd/(d + θ)

Example 10A Aggregate losses on an insurance coverage follow a Pareto distribution with parameters α = 2, θ = 800. Premiums for this coverage are 500. The loss ratio, R, is the proportion of aggregate losses to premiums. If the loss ratio is less than 70%, the insurance company pays a dividend of 80% of premiums times the excess of 70% over the loss ratio. Calculate the expected dividend.

Answer: If we let X be aggregate losses, then R = X/500 and the dividend is

(0.8)(500)(0.7 − R) = 0.8(500)(0.7 − X/500) = 0.8(350 − X)

when this is above 0. But we have

max(0, 0.8(350 − X)) = 0.8 max(0, 350 − X)
  = 0.8 max(350 − 350, 350 − X)
  = 0.8(350 − min(350, X))
  = 280 − 0.8(X ∧ 350)

So we calculate E[X ∧ 350], using the formulas from the Loss Models appendix.

E[X ∧ 350] = (θ/(α − 1))(1 − (θ/(θ + 350))^(α−1))
  = 800(1 − 800/1150)
  = (800)(350)/1150 = 243.48

The expected dividend is then 280 − 0.8(243.48) = 85.22. The dividend can be graphed as a function of aggregate losses by noting that

1. If aggregate losses are 0, the loss ratio is 0 and the dividend is (0.8)(0.7 − 0)(500) = 280.
2. In order for the formula to develop a dividend of zero (before flooring at 0), the loss ratio must be 0.7, which means losses must be (0.7)(500) = 350.
3. The dividend formula is linear in aggregate losses, so the graph is a straight line between these two points.

[The graph is a straight line from a dividend of 280 at zero aggregate losses down to 0 at aggregate losses of 350.]

You may be able to write down the dividend formula, 280 − 0.8(X ∧ 350), just by drawing the graph. If not, use the algebraic approach. □
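Example 10A's answer can be checked in a few lines; this sketch (mine, not from the manual) just re-derives 280 − 0.8 E[X ∧ 350] numerically:

```python
# Check of Example 10A: Pareto aggregate losses, alpha = 2, theta = 800;
# premium 500; dividend is 80% of premium times the excess of 70% over
# the loss ratio, floored at 0.

theta, premium, c, r = 800, 500, 0.8, 0.7

cap = r * premium                     # dividend reaches 0 at losses of 350
lev = theta * cap / (theta + cap)     # E[X ^ 350] for a Pareto with alpha = 2
expected_dividend = c * (cap - lev)   # E[c(350 - X ^ 350)] = 280 - 0.8 E[X ^ 350]
print(round(expected_dividend, 2))
```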

Exercises

10.1. An insurance company purchases reinsurance. Reinsurance premiums are 800,000. The reinsurance company reimburses the insurance company for aggregate reinsured losses and also pays an experience refund to the insurance company if experience is good. Aggregate reinsured losses follow a Pareto distribution with parameters α = 2, θ = 500,000. The experience refund is equal to 80% of the excess of reinsurance premiums over aggregate reinsured losses, minus a charge of 200,000. The charge is applied after multiplying the excess by 80%. However, the experience refund is never less than 0. The net cost of the reinsurance contract to the insurance company is the excess of reinsurance premiums over reinsurance payment of losses and experience refunds. Determine the expected net cost to the insurance company of the reinsurance contract.

10.2. An agent gets a bonus based on product performance. His earned premium is 500,000. Losses follow an exponential distribution with mean 350,000. His bonus is equal to earned premium times a proportion of the excess of 70% over his loss ratio, but not less than zero. The loss ratio is defined as losses divided by earned premium. The proportion is set so that the expected bonus is 20,000. Determine the proportion.


EXERCISES FOR LESSON 10

10.3. [3-S00:25] An insurance agent will receive a bonus if his loss ratio is less than 70%. You are given:

(i) His loss ratio is calculated as incurred losses divided by earned premium on his block of business.
(ii) The agent will receive a percentage of earned premium equal to 1/3 of the difference between 70% and his loss ratio.
(iii) The agent receives no bonus if his loss ratio is greater than 70%.
(iv) His earned premium is 500,000.
(v) His incurred losses have the following distribution:

F(x) = 1 − (600,000/(x + 600,000))³,   x > 0

Calculate the expected value of his bonus.

(A) 16,700

(B) 31,500

(C) 48,300

(D) 50,000

(E) 56,600

10.4. [3-F00:27] Total hospital claims for a health plan were previously modeled by a two-parameter Pareto distribution with α = 2 and θ = 500. The health plan begins to provide financial incentives to physicians by paying a bonus of 50% of the amount by which total hospital claims are less than 500. No bonus is paid if total claims exceed 500. Total hospital claims for the health plan are now modeled by a new Pareto distribution with α = 2 and θ = K. The expected claims plus the expected bonus under the revised model equals expected claims under the previous model. Calculate K.

(A) 250

(B) 300

(C) 350

(D) 400

(E) 450

10.5. [3-F02:37] Insurance agent Hunt N. Quotum will receive no annual bonus if the ratio of incurred losses to earned premiums for his book of business is 60% or more for the year. If the ratio is less than 60%, Hunt's bonus will be a percentage of his earned premium equal to 15% of the difference between his ratio and 60%. Hunt's annual earned premium is 800,000. Incurred losses are distributed according to the Pareto distribution, with θ = 500,000 and α = 2. Calculate the expected value of Hunt's bonus.

(A) 13,000

(B) 17,000

(C) 24,000

(D) 29,000

(E) 35,000

10.6. [SOA3-F03:3] A health plan implements an incentive to physicians to control hospitalization under which the physicians will be paid a bonus B equal to c times the amount by which total hospital claims are under 400 (0 ≤ c ≤ 1).

The effect the incentive plan will have on underlying hospital claims is modeled by assuming that the new total hospital claims will follow a two-parameter Pareto distribution with α = 2 and θ = 300. E(B) = 100. Calculate c.

(A) 0.44

(B) 0.48

(C) 0.52

(D) 0.56

(E) 0.60


Solutions

10.1. It may help to draw a graph showing the reinsurance company's payment as a function of aggregate losses. If aggregate losses are zero, the experience refund would be 0.8(800,000 − 0) − 200,000 = 440,000. In order for the experience refund formula to generate 0, 80% of the difference between premiums (800,000) and claims would have to be equal to the 200,000 charge, or

0.8(800,000 − x) = 200,000
640,000 − 0.8x = 200,000
0.8x = 440,000
x = 550,000

The experience refund is a linear function, so we can draw a straight line between x = 0 and x = 550,000. Figure 10.1 plots the reinsurance company's payment, namely x plus the experience refund, with the experience refund shaded. We will now derive this algebraically. If aggregate reinsured losses are X, then the experience refund is the maximum of 0 and 80% of the excess of 800,000 over X minus 200,000, or

max(0, 0.8(800,000 − X) − 200,000)

If we let R be the expected value of the experience refund, then R is

R = E[max(0, 0.8(800,000 − X) − 200,000)]
  = E[max(0, 640,000 − 0.8X − 200,000)]
  = E[max(0, 440,000 − 0.8X)]
  = E[max(440,000 − 440,000, 440,000 − 0.8X)]

EXERCISE SOLUTIONS FOR LESSON 10


Figure 10.1: Reinsurance payment in exercise 10.1 (payment in thousands against losses in thousands)

  = E[440,000 − min(440,000, 0.8X)]
  = 440,000 − E[0.8X ∧ 440,000]     by linearity of expected value
  = 440,000 − 0.8 E[X ∧ 550,000]

and this looks like our graph. Using the Pareto formulas from the Loss Models appendix, we have

E[X] = θ/(α − 1) = 500,000/1 = 500,000
E[X ∧ 550,000] = (θ/(α − 1))(1 − (θ/(θ + x))^(α−1))
  = 500,000(1 − 500,000/1,050,000)
  = 500,000(550,000/1,050,000) = 261,905

R = 440,000 − 0.8(261,905) = 230,476

The reinsurance company receives 800,000 and pays X plus the experience refund, so the expected net cost is

800,000 − E[X] − R = 800,000 − 500,000 − 230,476 = 69,524

10.2. Let the loss variable be X. The base on which the agent is paid is max(350,000 − X, 0) = 350,000 − X ∧ 350,000.

350,000 − E[X ∧ 350,000] = 350,000 − 350,000 + 350,000e^(−1) = 128,757.80

The proportion is then 20,000/128,757.80 = 0.1553.


10.3. The distribution is Pareto with α = 3, θ = 600,000. The bonus is

(1/3) max(0, 350,000 − X) = (1/3)(350,000 − X ∧ 350,000)

We calculate E[X ∧ 350,000].

E[X ∧ 350,000] = (600,000/2)(1 − (600,000/950,000)²) = 180,332.41
E[bonus] = (1/3)(350,000 − 180,332.41) = 56,555.86   (E)

10.4. The bonus is (1/2) max(0, 500 − X) = 250 − (1/2)(X ∧ 500). Current expected claims are 500. We have

500 = K + 250 − (1/2) E[X ∧ 500] = K + 250 − (1/2)(500K/(500 + K))
250 = K − 250K/(500 + K)
(500 + K)(K − 250) = 250K
K² + 250K − 250(500) = 250K
K² = 250(500) = 125,000
K = 353.55   (C)

10.5. The bonus is 0.15 max(0, 480,000 − X) = 0.15(480,000 − X ∧ 480,000).

E[X ∧ 480,000] = 500,000(480,000/980,000) = 244,898
0.15(480,000 − 244,898) = 35,265   (E)

10.6. The bonus is B = c(400 − X ∧ 400). We have

100 = E(B) = c(400 − E[X ∧ 400])
100 = c(400 − 300(400/700)) = 1600c/7
c = 7/16 = 0.4375   (A)

10.7. Let your bonus be X.

X = 500 max(P − 90, 0) = 500(P − min(P, 90))
E[X] = 500(E[P] − E[P ∧ 90]) = 500(8(80)/7 − (8(80)/7 − 80⁸/(7(90⁷)))) = 500(80⁸/(7(90⁷))) = 2505.50


Figure 10.2: Bonus in exercise 10.8 (bonus in thousands against losses in thousands)

10.8. The bonus is 60,000 if losses are below 180,000, and 0.25(420,000 − X) if losses, X, are between 180,000 and 420,000. The bonus is graphed in Figure 10.2. The best way to get an expression for the value of the bonus is to observe that the bonus is 60,000 minus one-quarter of the amount of the loss between 180,000 and 420,000. The expected value of the amount of the loss between 180,000 and 420,000 (this is the amount that an insurance policy with a deductible of 180,000 and a maximum covered loss of 420,000 would pay) is E[X ∧ 420,000] − E[X ∧ 180,000], so the value of the bonus is

60,000 − 0.25 E[X ∧ 420,000] + 0.25 E[X ∧ 180,000]

If you did not observe this, you could derive it algebraically. The expected value of the bonus is

60,000 F(180,000) + 0.25 ∫_{180,000}^{420,000} (420,000 − x) f(x) dx

In working this out, an important observation is that

E[X ∧ d] = ∫_0^d x f(x) dx + d(1 − F(d))

and therefore

∫_0^d x f(x) dx = E[X ∧ d] − d(1 − F(d))
∫_a^b x f(x) dx = E[X ∧ b] − E[X ∧ a] − b(1 − F(b)) + a(1 − F(a))

Answer = 60,000 F(180,000) + 0.25(420,000)(F(420,000) − F(180,000))
  − 0.25(E[X ∧ 420,000] − 420,000(1 − F(420,000)))
  + 0.25(E[X ∧ 180,000] − 180,000(1 − F(180,000)))
  = 60,000 − 0.25 E[X ∧ 420,000] + 0.25 E[X ∧ 180,000]

Now we evaluate this expression.

= 60,000 − 0.25(500,000)(420,000/920,000) + 0.25(500,000)(180,000/680,000)
= 60,000 − 0.25(228,261) + 0.25(132,353) = 36,023

Lesson 11

Discrete Distributions

Reading: Loss Models Fourth Edition 6

11.1 The (a, b, 0) class

Discrete distributions are useful for modeling frequency. Three basic distributions are the Poisson, the negative binomial, and the binomial. Probabilities, moments, and similar information for them are in Loss Models Appendix B, which you get at the exam. Let p_n = Pr(N = n).

A Poisson distribution with parameter λ > 0 is defined by

    p_n = e^(−λ) λ^n / n!

The mean and variance are both λ. A sum of n independent Poisson random variables N_1, …, N_n with parameters λ_1, …, λ_n has a Poisson distribution whose parameter is Σ_{i=1}^n λ_i.

A negative binomial distribution with parameters r and β is defined by

    p_n = C(n−1+r, n) (1/(1+β))^r (β/(1+β))^n,    β > 0, r > 0

The mean is rβ and the variance is rβ(1+β). Note that the variance is always greater than the mean. The above parametrization is unusual; you may have learned a (k, p) parametrization in your probability course. However, this parametrization is convenient when we obtain the negative binomial as a gamma mixture of Poissons in the next lesson. Also note that r must be greater than 0 but need not be an integer. The general definition of a binomial coefficient, for any real numerator and a non-negative integer denominator, is

    C(x, n) = x(x−1)···(x−n+1) / n!

The appendix to the textbook (which you get on the exam) writes out this product rather than using a binomial coefficient, but I will occasionally use the convenient binomial coefficient notation.

A special case of the negative binomial distribution is the geometric distribution, which is a negative binomial distribution with r = 1. By the above formulas,

    p_0 = 1/(1+β)    and    p_n = (β/(1+β)) p_{n−1}


so the probabilities follow a geometric progression with ratio β/(1+β). To calculate Pr(N ≥ n), we can sum up the geometric series:

    Pr(N ≥ n) = Σ_{i=n}^∞ (1/(1+β)) (β/(1+β))^i = (β/(1+β))^n    (11.1)

Therefore, the geometric distribution is the discrete counterpart of the exponential distribution, and has a similar memoryless property. Namely,

    Pr(N ≥ n + k | N ≥ n) = Pr(N ≥ k)

A sum of n independent negative binomial random variables N_1, …, N_n having the same β and parameters r_1, …, r_n has a negative binomial distribution with parameters β and Σ_{i=1}^n r_i. One way to remember that you need the same β's is that the variance of the sum has to be the sum of the variances, and this wouldn't work with summing β's, since the variance rβ(1 + β) is quadratic in β.

A binomial distribution with parameters m and q is defined by

    p_n = C(m, n) q^n (1 − q)^(m−n),    m a positive integer, 0 < q < 1

and has mean mq and variance mq(1 − q). Thus its variance is less than its mean. Also, it has finite support; p_n is zero for n > m. A sum of n independent binomial random variables N_1, …, N_n having the same q and parameters m_1, …, m_n has a binomial distribution with parameters q and Σ_{i=1}^n m_i. One way to remember that you need the same q's is that the variance of the sum has to be the sum of the variances, and this wouldn't work with summing q's, since the variance mq(1 − q) is quadratic in q.

Given the mean and variance of a distribution, you can back out the parameters. Moreover, you can tell which distribution to use by seeing whether the variance is greater than the mean (negative binomial), equal to the mean (Poisson), or less than the mean (binomial). So quickly, if one of these distributions has mean 10 and variance 30, which distribution is it and what are its parameters? (Answer below¹)

These three frequency distributions are the complete set of distributions in the (a, b, 0) class. This class is defined by the following property: if we let p_k = Pr(X = k), then

    p_k / p_{k−1} = a + b/k

a is positive for the negative binomial distribution (how ironic!) and equal to β/(1+β); it is 0 for the Poisson; and it is negative for the binomial distribution. Thus, given that a random variable follows a distribution from the (a, b, 0) class, the sign of a determines which distribution it follows. For the Poisson distribution, b = λ. For the geometric distribution, which is a special case of the negative binomial distribution, b = 0. The formulas for a and b all appear in the tables you are given on the exam, so you don't have to memorize them. For convenience, they are duplicated in Table 11.1 on page 190.

For a distribution in the (a, b, 0) class, the values of p_k for three consecutive values of k allow you to determine the exact distribution, since you can set up two equations in two unknowns (a and b). Similarly, knowing p_k for two pairs of consecutive values allows determination of a and b.

Example 11A Claim frequency follows a distribution in the (a, b, 0) class. You are given that
(i) The probability of 4 claims is 0.066116.
(ii) The probability of 5 claims is 0.068761.
(iii) The probability of 6 claims is 0.068761.

¹(Answer: Negative binomial, r = 5, β = 2)


Calculate the probability of no claims.

Answer: First we solve for a and b.

    a + b/5 = 0.068761/0.066116 = 1.0400
    a + b/6 = 0.068761/0.068761 = 1
    b/30 = 0.068761/0.066116 − 1 = 0.0400
    b = 1.2
    a = 1 − b/6 = 0.8

Positive a means that this is a negative binomial distribution. We now solve for r and β.

    a = 0.8 = β/(1+β)  ⇒  β = 4
    b = (r − 1) β/(1+β) = (r − 1)a  ⇒  r = 1 + b/a = 2.5
    p_0 = (1/(1+β))^r = 0.2^2.5 = 0.01789

Quiz 11-1 For a discrete distribution in the (a, b, 0) class, you are given
(i) The probability of 2 is 0.258428.
(ii) The probability of 3 is 0.137828.
(iii) The probability of 4 is 0.055131.
Determine the probability of 0.

Sometimes the (a, b, 0) formula will be more convenient for computing successive probabilities than computing them directly. This is particularly true for the Poisson distribution, for which a = 0 and b = λ; for a Poisson with λ = 6, if you already know that p_3 = 0.089235, then compute p_4 = p_3(λ/4) = 1.5(0.089235) = 0.133853.
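The two-equations-in-two-unknowns idea of Example 11A can be automated. This is a sketch, not textbook code; the helper `ab_from_probs` and its conventions are my own:

```python
# Recover a and b from three consecutive (a,b,0) probabilities
# p_k, p_{k+1}, p_{k+2}, then identify the distribution by the sign of a.

def ab_from_probs(k, pk, pk1, pk2):
    # p_{k+1}/p_k = a + b/(k+1) and p_{k+2}/p_{k+1} = a + b/(k+2)
    r1, r2 = pk1 / pk, pk2 / pk1
    b = (r1 - r2) / (1.0 / (k + 1) - 1.0 / (k + 2))
    return r1 - b / (k + 1), b

a, b = ab_from_probs(4, 0.066116, 0.068761, 0.068761)   # Example 11A data
beta = a / (1 - a)          # a = β/(1+β) for a negative binomial (a > 0)
r = 1 + b / a               # b = (r − 1)a
p0 = (1 / (1 + beta)) ** r
print(a, b, beta, r, p0)    # ≈ 0.8, 1.2, 4, 2.5, 0.01789
```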

Moments

The tables you get on the exam provide means and variances for the discrete distributions we've discussed. If you need to calculate higher moments, the tables recommend using the probability generating function (the pgf). The factorial moments are derivatives of the pgf. See page 12 for the definition of factorial moments. For an example of the calculation of moments using the pgf, see Example 1F, in which we calculated the third raw moment of a negative binomial using the pgf. Here's another example:

Example 11B Calculate the coefficient of skewness of a Poisson distribution with mean λ.

Answer: The probability generating function for a Poisson is, from the tables,

    P(z) = e^(λ(z−1))

C/4 Study Manual—17th edition Copyright ©2014 ASM


Table 11.1: Formula summary for members of the (a, b, 0) class

                Poisson          Binomial                  Negative binomial                      Geometric
    p_n         e^(−λ) λ^n/n!    C(m,n) q^n (1−q)^(m−n)    C(n+r−1,n) (1/(1+β))^r (β/(1+β))^n     β^n/(1+β)^(n+1)
    Mean        λ                mq                        rβ                                     β
    Variance    λ                mq(1−q)                   rβ(1+β)                                β(1+β)
    a           0                −q/(1−q)                  β/(1+β)                                β/(1+β)
    b           λ                (m+1) q/(1−q)             (r−1) β/(1+β)                          0

Differentiating three times, which means bringing a λ down each time,

    P′′′(z) = λ³ e^(λ(z−1))
    P′′′(1) = λ³

That is the third factorial moment. The second raw moment is the mean squared plus the variance, or λ² + λ. Therefore,

    E[X(X − 1)(X − 2)] = λ³
    E[X³] − 3 E[X²] + 2 E[X] = λ³
    E[X³] = λ³ + 3(λ² + λ) − 2λ = λ³ + 3λ² + λ

By formula (1.2), the third central moment is

    E[(X − λ)³] = E[X³] − 3 E[X²]λ + 2λ³
                = λ³ + 3λ² + λ − 3(λ² + λ)(λ) + 2λ³ = λ

The coefficient of skewness, the third central moment over the variance raised to the 1.5 power, is λ/λ^1.5 = 1/√λ.

The textbook's distribution tables mention the following recursive formula for factorial moments for any (a, b, 0) class distribution in terms of a and b:

    μ_(j) = (aj + b) μ_(j−1) / (1 − a)

Setting j = 1 results in a formula for the mean. If N is the random variable having the (a, b, 0) distribution,

    E[N] = (a + b)/(1 − a)          (11.2)
    Var(N) = (a + b)/(1 − a)²       (11.3)
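Formulas (11.2) and (11.3) are easy to verify by direct summation. The sketch below uses a = 0.8, b = 1.6 as an arbitrary test case (a negative binomial with β = 4, r = 3, so the mean should be 12 and the variance 60):

```python
# Direct check of formulas (11.2) and (11.3): build p_k recursively from
# p_k = (a + b/k) p_{k−1} and compare summed moments with the formulas.
# a = 0.8, b = 1.6 is an arbitrary test case (negative binomial, β = 4, r = 3).

a, b = 0.8, 1.6
p = [(1 - a) ** (1 + b / a)]      # p_0 = (1/(1+β))^r with β = a/(1−a), r = 1 + b/a
for k in range(1, 500):
    p.append((a + b / k) * p[-1])

mean = sum(k * pk for k, pk in enumerate(p))
second = sum(k * k * pk for k, pk in enumerate(p))
var = second - mean ** 2
print(mean, (a + b) / (1 - a))        # formula (11.2)
print(var, (a + b) / (1 - a) ** 2)    # formula (11.3)
```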


These formulas are not directly provided in the distribution tables you get at the exam. Instead, more general formulas are provided on page 10, in the first paragraph of Subsection B.3.1. If you set p_0 = 0 in those formulas, you get the formulas mentioned here. If you're going to memorize any of them, formula (11.2) is the more important, in that it may save a little work in a question like the following.

Example 11C For a random variable N following a distribution in the (a, b, 0) class, you are given
(i) Pr(N = 2) = 0.03072
(ii) Pr(N = 3) = 0.04096
(iii) Pr(N = 4) = 0.049152
Calculate E[N].

Answer: Back out a and b.

    a + b/3 = 0.04096/0.03072 = 4/3
    a + b/4 = 0.049152/0.04096 = 1.2
    b/12 = 4/3 − 1.2 = 2/15
    b = (12)(2/15) = 1.6
    a = 1.2 − 1.6/4 = 0.8

Then the mean is

    E[N] = (a + b)/(1 − a) = (0.8 + 1.6)/(1 − 0.8) = 12

Quiz 11-2 For a distribution from the (a, b, 0) class, you are given that p_1 = 0.256, p_2 = 0.0768, and p_3 = 0.02048. Determine the mean of the distribution.
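Before moving on, Example 11B's skewness result can also be confirmed numerically. A minimal sketch (λ = 3 is an arbitrary test value; the probabilities are built with the (a, b, 0) recursion p_n = p_{n−1} λ/n):

```python
# Numerical check of Example 11B: the Poisson coefficient of skewness
# is 1/√λ.  λ = 3 is an arbitrary test value.

import math

lam = 3.0
probs = [math.exp(-lam)]          # p_0
for n in range(1, 200):           # tail beyond n = 200 is negligible here
    probs.append(probs[-1] * lam / n)

mean = sum(n * p for n, p in enumerate(probs))
m2 = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
m3 = sum((n - mean) ** 3 * p for n, p in enumerate(probs))
print(m3 / m2 ** 1.5, 1 / math.sqrt(lam))
```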

11.2 The (a, b, 1) class

Material on the (a, b, 1) class was originally in Course 3 in the 2000 syllabus, but was removed in 2002. When they moved severity, frequency, and aggregate loss material to Exam C/4 in 2007, they added material on the (a, b, 1) class back to the syllabus. They've expanded the formula tables you get at the exam to include information about (a, b, 1) distributions. There were no questions on the (a, b, 1) class (or the (a, b, 0) class, for that matter) on the Spring 2007 exam, the only released exam after the syllabus change. However, students have reported at least two distinct (a, b, 1) questions from the question bank used in recent exams. So this topic has some probability of appearing on your exam.

Often, special treatment must be given to the probability of zero claims, and the three discrete distributions of the (a, b, 0) class are inadequate because they do not give appropriate probability to 0 claims. The (a, b, 1) class consists of distributions for which p_0 is arbitrary, but the (a, b, 0) relationship holds above 1; in other words,

    p_k / p_{k−1} = a + b/k    for k = 2, 3, 4, . . .

but not necessarily for k = 1.


One way to obtain a distribution in this class is to take one from the (a, b, 0) class and truncate it at 0, i.e., make the probability of 0 equal to zero, and then scale all the other probabilities so that they add up to 1. To avoid confusion, let p_n be the probabilities of the (a, b, 0) class distribution we start with and p_n^T be the probabilities of the new distribution. We let p_0^T = 0, and multiply the probabilities p_n, n ≥ 1, by 1/(1 − p_0) so that they sum to 1; in other words,

    p_n^T = p_n / (1 − p_0)    for n > 0.

These distributions are called zero-truncated distributions.

Now let's use the notation p_k^M for the probabilities of the new distribution. We can generalize the above by assigning p_0^M some constant 1 − c and then multiplying p_n, n ≥ 1, by c/(1 − p_0) so that they add up to 1. Such a distribution is called a zero-modified distribution, with the zero-truncated distributions the special case where c = 1. Comparing truncated and modified probabilities:

    p_n^M = (1 − p_0^M) p_n^T,    n > 0
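The truncation and modification recipe can be written out directly. The sketch below starts from a Poisson with λ = 0.8 and a modified zero probability of 0.3, both arbitrary choices, and checks that the rescaled probabilities still sum to 1:

```python
# Build zero-truncated and zero-modified versions of a Poisson(0.8).

import math

lam = 0.8
p = [math.exp(-lam) * lam ** n / math.factorial(n) for n in range(60)]

# p_n^T = p_n/(1 − p_0) for n > 0, with p_0^T = 0
p_trunc = [0.0] + [pn / (1 - p[0]) for pn in p[1:]]

# p_0^M = 1 − c and p_n^M = (1 − p_0^M) p_n^T for n > 0
p0m = 0.3
p_mod = [p0m] + [(1 - p0m) * pt for pt in p_trunc[1:]]

print(sum(p_trunc), sum(p_mod))   # both ≈ 1 (up to the truncated tail)
```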

Example 11D For a distribution from the (a, b, 1) class, p_1^M = 0.4, p_2^M = 0.2, p_3^M = 0.1.

Determine p_0^M.

Answer: We will use superscript M for the probabilities of this distribution, and unsuperscripted variables will denote probabilities of the related (a, b, 0) distribution. Since p_2^M/p_1^M = 0.5 = a + b/2 and p_3^M/p_2^M = 0.5 = a + b/3, we conclude that b = 0 and a = 0.5. This is a zero-modified geometric distribution, since b = 0 implies the distribution is geometric. For a non-modified geometric distribution, a = β/(1+β), so β = 1, and for the geometric p_0 = 1/(1+β) = 0.5 and p_1 = β/(1+β)² = 0.25. Then the ratio of each modified probability to the (a, b, 0) probability is c/(1 − p_0), so

    c/(1 − p_0) = p_1^M/p_1 = 0.4/0.25 = 8/5

By the definition of c as 1 − p_0^M,

    1 − p_0^M = (1 − p_0)(8/5) = (0.5)(8/5) = 0.8
    p_0^M = 0.2

The zero-truncated geometric distribution has the special property that it is the same as an unmodified geometric distribution shifted by 1. This means that its mean is 1 more than the mean of an unmodified distribution, or 1 + β, and its variance is the same as the variance of an unmodified distribution, or β(1 + β).

Quiz 11-3 A random variable N with a zero-truncated geometric distribution has p_1^T = 0.6. Calculate Pr(N > 10).

In addition to modifications of (a, b, 0) distributions, the (a, b, 1) class includes an extended negative binomial distribution with −1 < r < 0; the negative binomial distribution only allows r > 0. This extended distribution is called the ETNB (extended truncated negative binomial), even though it may be zero-modified rather than zero-truncated.


Example 11E A random variable N follows an extended truncated negative binomial distribution. You are given:

    Pr(N = 0) = 0
    Pr(N = 1) = 0.6195
    Pr(N = 2) = 0.1858
    Pr(N = 3) = 0.0836

Calculate the mean of the distribution.

Answer: Set up the equations for a and b.

    a + b/2 = 0.1858/0.6195 = 0.3
    a + b/3 = 0.0836/0.1858 = 0.45
    b/6 = −0.15
    b = −0.9
    a = 0.75

Then solve for r and β, using the formulas in the tables.

    a = 0.75 = β/(1+β)  ⇒  β = 0.75/(1 − 0.75) = 3
    b = −0.9 = (r − 1)a  ⇒  r = 1 + (−0.9/0.75) = −0.2

Using the formula for the mean in the tables,

    E[N] = rβ/(1 − (1+β)^(−r)) = (−0.2)(3)/(1 − 4^0.2) = 1.8779
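Example 11E can be double-checked by running the (a, b, 1) recursion forward from the given p_1 and summing; the brute-force mean should match 1.8779 up to the rounding in the given probabilities:

```python
# Brute-force check of Example 11E: a = 0.75, b = −0.9, start from the
# given p_1 = 0.6195 and recurse p_k = (a + b/k) p_{k−1}.

a, b = 0.75, -0.9
p = {1: 0.6195}
for k in range(2, 3000):
    p[k] = (a + b / k) * p[k - 1]

total = sum(p.values())
mean = sum(k * pk for k, pk in p.items())
print(total, mean)   # total ≈ 1, mean ≈ 1.878
```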

Taking the limit as r → 0, we get the logarithmic distribution; the zero-truncated form has one parameter β and probability function

    p_n^T = (β/(1+β))^n / (n ln(1+β))    for n = 1, 2, 3, . . .

and the zero-modified form is obtained by setting p_0^M = 1 − c and multiplying all other probabilities by c. This distribution is called logarithmic for good reason; if we let u = β/(1+β), then each probability p_n^T is proportional to u^n/n. This sequence has a logarithmic pattern (think of the Taylor expansion of −ln(1 − u)).

For the ETNB with −1 < r < 0 and taking the limit as β → ∞, the distribution is sometimes called the Sibuya distribution. No moments exist for it. This distribution is not listed separately in the tables, so let's discuss the truncated version of it a little. To calculate p_1^T, the formula in the tables, factoring 1 + β out of the denominator, is

    p_1^T = rβ / ((1+β)((1+β)^r − 1))

When β → ∞, then β/(1+β) → 1. Since r < 0, it follows that (1+β)^r → 0. So we have p_1^T = −r for a Sibuya. Also note that a = 1 and b = r − 1.


Example 11F A zero-truncated discrete random variable follows a Sibuya distribution with r = −0.5. Calculate p_k^T for k = 1, . . . , 5.

Answer: We have p_1^T = −r = 0.5, a = 1, and b = r − 1 = −1.5, so

    p_2^T = 0.5(1 − 1.5/2) = 0.5(0.25) = 0.125
    p_3^T = 0.125(1 − 1.5/3) = 0.125(0.5) = 0.0625
    p_4^T = 0.0625(1 − 1.5/4) = 0.0625(0.625) = 0.0390625
    p_5^T = 0.0390625(1 − 1.5/5) = 0.0390625(0.7) = 0.02734375
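The same recursion also illustrates why the Sibuya distribution has no moments: p_k decays roughly like k^(−(1−r)), so slowly that Σ k·p_k diverges. A quick sketch with r = −0.5, as in Example 11F:

```python
# Sibuya probabilities via p_1^T = −r and p_k = (1 + (r − 1)/k) p_{k−1},
# plus a look at the diverging partial sums of k·p_k (no mean exists).

r = -0.5
p = [0.0, -r]                               # index 0 unused; p_1^T = −r
for k in range(2, 200001):
    p.append((1 + (r - 1) / k) * p[-1])     # a = 1, b = r − 1

print([round(x, 8) for x in p[1:6]])        # matches Example 11F
partial = [sum(n * p[n] for n in range(1, m)) for m in (100, 10_000, 200_000)]
print(partial)    # keeps growing (roughly like √m) instead of converging
```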

You should be able to go back and forth between a and b and the parameters of the distribution, using the tables.

The means and variances of the zero-truncated distributions are given in the tables. Zero-modified distributions are a mixture of a zero-truncated distribution, with weight c, and a degenerate distribution at zero (one which has p_0 = 1), with mean and variance 0, with weight 1 − c. Therefore, if the mean of the zero-truncated distribution is m and the variance is v, the mean of the zero-modified distribution is cm. Letting I be the indicator of which component of the mixture we are in, the variance, by the conditional variance formula, is

    Var(N) = Var(E[N | I]) + E[Var(N | I)]
           = Var(0, m) + E[0, v]
           = c(1 − c) m² + cv

where Var(0, m) was computed by the Bernoulli shortcut (page 54). Here is a summary of the mean and variance formulas for zero-modified distributions. For N a zero-modified random variable,

    E[N] = cm                        (11.4)
    Var(N) = c(1 − c) m² + cv        (11.5)

where

• c is 1 − p_0^M.
• m is the mean of the corresponding zero-truncated distribution.
• v is the variance of the corresponding zero-truncated distribution.

Anyhow, the tables include instructions for calculating the mean and variance of zero-modified distributions.

Example 11G For a discrete random variable N, you are given
(i) p_k = Pr(N = k).
(ii) p_0 = 0.8
(iii) p_k = p_{k−1}/(4k) for k > 1
Calculate the mean and variance of the distribution.

Answer: a = 0, making this a zero-modified Poisson, and b = λ = 1/4. For the zero-truncated distribution, the mean is

    λ/(1 − e^(−λ)) = 0.25/(1 − e^(−0.25)) = 1.130203


and the variance is

    λ(1 − (λ + 1)e^(−λ)) / (1 − e^(−λ))² = 0.25(1 − 1.25e^(−0.25)) / (1 − e^(−0.25))² = 0.00662476/0.0489291 = 0.135395

The mean of the zero-modified Poisson is

    E[N] = (1 − p_0)(1.130203) = 0.2(1.130203) = 0.22604

and the variance is

    Var(N) = (0.2)(0.8)(1.130203²) + (0.2)(0.135395) = 0.23146
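Example 11G is a good candidate for a brute-force check of formulas (11.4) and (11.5):

```python
# Direct check of Example 11G: zero-modified Poisson, p_0^M = 0.8, λ = 0.25.

import math

lam, p0m = 0.25, 0.8
c = 1 - p0m
probs = {0: p0m}
pn = math.exp(-lam)                  # unmodified Poisson p_0
for n in range(1, 60):
    pn *= lam / n                    # Poisson recursion p_n = p_{n−1} λ/n
    probs[n] = c * pn / (1 - math.exp(-lam))

mean = sum(n * q for n, q in probs.items())
var = sum(n * n * q for n, q in probs.items()) - mean ** 2
print(round(mean, 5), round(var, 5))   # 0.22604 and 0.23146, as computed above
```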

Exercises

(a, b, 0) class

11.1. [4B-S92:1] (1 point) Which of the following are true?
I. The mean of the binomial distribution is less than the variance.
II. The mean of the Poisson distribution is equal to the variance.
III. The mean of the negative binomial distribution is greater than the variance.
(A) I only  (B) II only  (C) III only  (D) I, II, and III
(E) The correct answer is not given by (A), (B), (C), or (D).

11.2. For a distribution in the (a, b, 0) class, you are given that p_1 = 0.6p_0 and p_2 = 0.4p_1.

Determine p_0.

11.3. For a distribution from the (a, b, 0) class you are given p_2/p_1 = 0.25 and p_4/p_3 = 0.225.

Determine p_2.

11.4. [4B-F98:2] (2 points) The random variable X has a Poisson distribution with mean n − 1/2, where n is a positive integer greater than 1. Determine the mode of X.
(A) n − 2  (B) n − 1  (C) n  (D) n + 1  (E) n + 2

11.5. For a distribution from the (a, b, 0) class, you are given that p_1 = 0.02835, p_2 = 0.1323, and p_3 = 0.3087.

Determine p_4.

11.6. For an auto collision coverage, frequency of claims follows a distribution from the (a, b, 0) class. You are given:
(i) The probability of 1 claim is 0.19245.
(ii) The probability of 2 claims is 0.09623.
(iii) The probability of 3 claims is 0.05346.
Determine the average number of claims.


11.7. [3-S01:25] For a discrete probability distribution, you are given the recursion relation

    p(k) = (2/k) p(k − 1),    k = 1, 2, . . .

Determine p(4).
(A) 0.07  (B) 0.08  (C) 0.09  (D) 0.10  (E) 0.11

11.8. [3-F02:28] X is a discrete random variable with a probability function which is a member of the (a, b, 0) class of distributions. You are given:
(i) P(X = 0) = P(X = 1) = 0.25
(ii) P(X = 2) = 0.1875
Calculate P(X = 3).
(A) 0.120  (B) 0.125  (C) 0.130  (D) 0.135  (E) 0.140

11.9. [CAS3-F03:14] The Independent Insurance Company insures 25 risks, each with a 4% probability of loss. The probabilities of loss are independent.

On average, how often would 4 or more risks have losses in the same year?
(A) Once in 13 years
(B) Once in 17 years
(C) Once in 39 years
(D) Once in 60 years
(E) Once in 72 years

11.10. [CAS3-F04:22] An insurer covers 60 independent risks. Each risk has a 4% probability of loss in a year.

Calculate how often 5 or more risks would be expected to have losses in the same year.
(A) Once in 3 years
(B) Once in 7 years
(C) Once in 11 years
(D) Once in 14 years
(E) Once in 17 years

11.11. [CAS3-F03:12] A driver is selected at random. If the driver is a "good" driver, he is from a Poisson population with a mean of 1 claim per year. If the driver is a "bad" driver, he is from a Poisson population with a mean of 5 claims per year. There is equal probability that the driver is either a "good" driver or a "bad" driver.

If the driver had 3 claims last year, calculate the probability that the driver is a "good" driver.
(A) Less than 0.325
(B) At least 0.325, but less than 0.375
(C) At least 0.375, but less than 0.425
(D) At least 0.425, but less than 0.475
(E) At least 0.475


11.12. [CAS3-F03:18] A new actuarial student analyzed the claim frequencies of a group of drivers and concluded that they were distributed according to a negative binomial distribution and that the two parameters, r and β, were equal. An experienced actuary reviewed the analysis and pointed out the following: "Yes, it is a negative binomial distribution. The r parameter is fine, but the value of β is wrong. Your parameters indicate that 1/9 of the drivers should be claim-free, but in fact, 4/9 of them are claim-free."

Based on this information, calculate the variance of the corrected negative binomial distribution.
(A) 0.50  (B) 1.00  (C) 1.50  (D) 2.00  (E) 2.50

11.13. [CAS3-S04:32] Which of the following statements are true about the sums of discrete, independent random variables?
1. The sum of two Poisson random variables is always a Poisson random variable.
2. The sum of two negative binomial random variables with parameters (r, β) and (r′, β′) is a negative binomial random variable if r = r′.
3. The sum of two binomial random variables with parameters (m, q) and (m′, q′) is a binomial random variable if q = q′.
(A) None of 1, 2, or 3 is true.
(B) 1 and 2 only
(C) 1 and 3 only
(D) 2 and 3 only
(E) 1, 2, and 3

11.14. [CAS3-F04:23] Dental Insurance Company sells a policy that covers two types of dental procedures: root canals and fillings. There is a limit of 1 root canal per year and a separate limit of 2 fillings per year. The number of root canals a person needs in a year follows a Poisson distribution with λ = 1, and the number of fillings a person needs in a year is Poisson with λ = 2. The company is considering replacing the single limits with a combined limit of 3 claims per year, regardless of the type of claim.

Determine the change in the expected number of claims per year if the combined limit is adopted.
(A) No change
(B) More than 0.00, but less than 0.20 claims
(C) At least 0.20, but less than 0.25 claims
(D) At least 0.25, but less than 0.30 claims
(E) At least 0.30 claims


11.15. [4-F01:36] For an insurance policy, you are given:
(i) The policy limit is 1,000,000 per loss, with no deductible.
(ii) Expected aggregate losses are 2,000,000 annually.
(iii) The number of losses exceeding 500,000 follows a Poisson distribution.
(iv) The claim severity distribution has
    Pr(Loss > 500,000) = 0.0106
    E[min(Loss, 500,000)] = 20,133
    E[min(Loss, 1,000,000)] = 23,759
Determine the probability that no losses will exceed 500,000 during 5 years.
(A) 0.01  (B) 0.02  (C) 0.03  (D) 0.04  (E) 0.05

11.16.

For a discrete probability distribution, you are given that

    p_k = p_{k−1} (1/k − 1/10),    k > 0

Calculate the mean of the distribution. 11.17. For a frequency distribution in the ( a, b, 0) class, you are given (i) (ii) (iii)

p k  0.0768 p k+1  p k+2  0.08192 p k+3  0.0786432

Determine the mean of this distribution. 11.18. [3-F00:13] A claim count distribution can be expressed as a mixed Poisson distribution. The mean of the Poisson distribution is uniformly distributed over the interval [0, 5]. Calculate the probability that there are 2 or more claims. (A) 0.61

(B) 0.66

(C) 0.71

(D) 0.76

(E) 0.81

11.19. The number of insurance applications arriving per hour varies according to a negative binomial distribution with parameters r  6 and β. The parameter β varies by hour according to a Poisson distribution with mean 4. Within any hour, β is constant. Using the normal approximation, estimate the probability of more than 600 applications arriving in 24 hours.

( a, b, 1) class 11.20. The random variable N follows a zero-modified Poisson distribution. You are given: (i) Pr ( N  1)  0.25. (ii) Pr ( N  2)  0.10. Calculate the probability of 0. 11.21. A zero-truncated geometric distribution has a mean of 3. Calculate the probability of 5. C/4 Study Manual—17th edition Copyright ©2014 ASM

Exercises continue on the next page . . .

EXERCISES FOR LESSON 11

199

11.22. A claim count distribution can be expressed as an equally weighted mixture of two logarithmic distributions, one with β  0.2 and one with β  0.4. Determine the variance of claim counts. 11.23. A claim count distribution has a zero-truncated binomial distribution with m  4, q  0.2. Determine the probability of 2 or more claims. Use the following information for questions 11.24 and 11.25: For a zero-modified ETNB distribution, you are given (i) (ii) (iii)

p1M  0.72 p2M  0.06

p3M  0.01.

11.24. Determine the probability of 0. 11.25. Determine the variance of the distribution. 11.26. For a zero-modified geometric random variable N, (i) (ii)

E[N]  3 E[N 2 ]  63

Calculate Pr ( N  0) . 11.27. A random variable follows a zero-truncated Poisson distribution with λ  0.8. Calculate the third raw moment of the distribution. 11.28. For a zero-modified logarithmic distribution, p2M  0.2 and p3M  0.1. Determine p0M .

11.29. A discrete random variable follows a zero-modified Sibuya distribution. You are given that p1M  0.15 and p 2M  0.06. Determine p0M .

11.30. [3-S00:37] Given: (i) (ii)

p k denotes the probability that the number of claims equals k for k  0, 1, 2, . . . pn m!  , m ≥ 0, n ≥ 0 pm n!

Using the corresponding zero-modified claim count distribution with p0M  0.1, calculate p1M . (A) 0.1

(B) 0.3

(C) 0.5

(D) 0.7

(E) 0.9

Additional released exam questions: CAS3-S05:15,16,28 SOA M-S05:19, CAS3-S06:31, CAS3-F06:23

Solutions 11.1. For a binomial, E[N]  mq > Var ( N )  mq (1 − q ) since q < 1. For a Poisson, mean and variance are both λ. For a negative binomial, E[N]  rβ < Var ( N )  rβ (1 + β ) , since β > 0. (B) C/4 Study Manual—17th edition Copyright ©2014 ASM

11. DISCRETE DISTRIBUTIONS

200

11.2.

Plug into the ( a, b, 0) class defining formula: a + b  0.6 b a +  0.4 2 a  0.2 b  0.4

a is positive so the distribution is negative binomial.

β 1+β

 0.2, so β  0.25. Then plugging into the formula

for b, r  3. p0  (1.25) −3  0.512 . 11.3.

We solve for a and b. b 2 b 0.225  a + 4 b  0.1 0.25  a +

a  0.2

Positive a means the distribution is negative binomial. b  0.5 a r  1.5

r−1

β  0.2 1+β 2.5 p2  (0.8) 1.5 (0.2) 2  0.05367 2

!

11.4. For a Poisson distribution, a  0 and b  λ, so p n  ( λ/n ) p n−1 . We want to know when λ/n < 1, since then p n < p n−1 and n − 1 is the mode. But λ/n < 1 means n > λ. So the mode will occur at the greatest integer less than λ. Here, λ  n − 12 , and the greatest integer less than n − 21 is n − 1 . (B) 11.5.

We solve for a and b.

p2 14  a+ p1 3 p3 7  a+ p2 3 b  14

b 2 b 3

a−

7 3

Negative a means the distribution is binomial q 7  3 1−q

b  − ( m + 1) a

q  0.7 ⇒

m

p4  5 (0.74 )(0.3)  0.36015 C/4 Study Manual—17th edition Copyright ©2014 ASM

14 −15 7/3

EXERCISE SOLUTIONS FOR LESSON 11

201

Actually, we do not have to back out m and q. We can use the ( a, b, 0) relationship to get p4 : p4  a + 11.6.

b 7 14 p3  − + (0.3087)  0.36015 4 3 4

!





We solve for a and b. b p2 0.09623 1    2 p1 0.19245 2 b p3 0.05346 5  a+   3 p2 0.09623 9 b 1 5 1  − − 6 2 9 18 6 1 b− − 18 3 1 b 2 a −  2 2 3 a+

By formula (11.2), E[N]  11.7.

Since a  0, this is a Poisson distribution, and b  λ  2. Then p (4)  e −2

11.8.

2/3 − 1/3  1 1 − 2/3

24  0.0902 4!

(C)

We have 0.25 1 0.25 b 0.1875 a+   0.75 2 0.25 1 1 a b 2 2 a+b

It is negative binomial, but we don’t even have to calculate r and β. We can proceed with the ( a, b, 0) class relationship. ! b 2 p3  a + p2  p2  0.125 (B) 3 3 11.9. This is a binomial distribution with m  25, q  0.04, so p 0  0.9625  0.360397. You could calculate the other p k ’s directly, or use the ( a, b, 0) class formula: a−

q  −0.041667 1−q

b  − ( m + 1) a  1.083333

p1  ( a + b ) p0  0.375413

b p2  a + p1  0.187707 2

!

b p2  0.059962 3

!

11. DISCRETE DISTRIBUTIONS

202

1 − p0 − p1 − p2 − p3  0.016522 So 4 or more risks would have losses in the same year once in 0.9660

1 0.016522

 60.53 years. (D)

11.10. This is binomial with m  60, q  0.04, p 0   0.086352. As in the previous problem, we can use the ( a, b, 0) class rather than calculating the probabilities directly. a−

q  −0.041667 1−q

b  − ( m + 1) a  2.541667

p1  ( a + b )(0.086352)  0.215881 p2  a +

b (0.215881)  0.265353 2

!

b p3  a + (0.265353)  0.213757 3

!

b (0.215881)  0.126918 p4  a + 4

!

Pr ( N ≥ 5)  1 − 0.086352 − 0.215881 − 0.265353 − 0.213757 − 0.126918  0.091738 The answer is once every 11.11.

1 0.091738

 10.9006 years. (C)

By Bayes’ Theorem, since Pr (good)  Pr (bad)  0.5, Pr (good | 3 claims) 

Pr (good) Pr (3 claims | good) Pr (good) Pr (3 claims | good) + Pr (bad) Pr (3 claims | bad)

Pr (3 claims | good)  e

−1

13  0.06131 3!

!

53 Pr (3 claims | bad)  e  0.14037 3! 0.06131 Pr (good | 3 claims)   0.3040 0.06131 + 0.14037 −5

!

(A)

r

1 11.12. p 0  1+β  19 . “By inspection”, since r  β, they both equal 2—that’s what the official solution says. You could try different values of r to find r  2 by trial and error. Then, since r doesn’t change, we need

!2

1 4  1+β 9 1 2  1+β 3 1 β 2 Var ( N )  rβ (1 + β )  1.5

(C)

EXERCISE SOLUTIONS FOR LESSON 11

203

11.13. If you didn’t remember which ones are true, you could look at the probability generating functions. For a Poisson, P ( z )  e λ ( z−1) . When you multiply two of these together, you get P ( z )  e ( λ1 +λ2 )( z−1) , which has the same form. For a negative binomial, P ( z )  [1 − β ( z − 1) ]−r . If the β’s are the same, then multiplying two of these together means summing up r’s, but you get the same form. On the other hand, if the β’s are different, multiplying 2 of these together does not get you a function of the same form, so 2 is false. For a binomial, P ( z )  [1 + q ( z − 1) ]m , so multiplying two with the same q results in adding the m’s and you get something of the same form. (C) 11.14. Before the change, the probability of 0 root canals is e −1 , so the expected number of root canals paid for is 1 − e −1  0.6321. The probability of 0 fillings is e −2 and the probability of 1 filling is 2e −2 , so the expected number of fillings paid for is p1 + 2 Pr ( N > 1)  2e −2 + 2 1 − e −2 − 2e −2  2 − 4e −2  1.4587





After the change, the combined annual number of fillings and root canals follows a Poisson distribution with mean 3. The expected number of paid claims is the probability of one event, plus twice the probability of two events, plus three times the probability of more than two events, where an event can be either a root canal or a filling and occurs at the rate of 3 per year: p 1 + 2p 2 + 3 Pr ( N > 2)  3e −3 + 2 (4.5e −3 ) + 3 1 − e −3 − 3e −3 − 4.5e −3  3 − 13.5e −3  2.3279





The change in the expected number of claims per year is 2.3279 − 0.6321 − 1.4587  0.2371 . (C)

11.15. Since aggregate losses are 2,000,000 and the average loss size is 23,759 (20,133 is irrelevant), the average number of losses per year is 2,000,000 23,759  84.1786. Since the probability of a loss above 500,000 is 0.0106, the average number of such losses is 84.1786 (0.0106)  0.8923. In 5 years, the average number of such losses is 5 (0.8923)  4.4615. They have a Poisson distribution, so the probability of none is e −4.4615  0.0115 . (A) 11.16.

pk p k−1

1  a + bk , so a  − 10 making this a binomial, and b  1. By formula (11.2),

E[N] 

−0.1 + 1 9  1 + 0.1 11

11.17. Dividing subsequent p k ’s, we have 16 b a+ 15 k+1 b 1a+ k+2 24 b a+ 25 k+3 Subtracting the second equation from the first and the third equation from the second 1 b  15 ( k + 1)( k + 2)  k+1 k+3  15 25 25k + 25  15k + 45 C/4 Study Manual—17th edition Copyright ©2014 ASM

1 25 ( k

+ 2)( k + 3)

11. DISCRETE DISTRIBUTIONS

204

k2 1 15 (3)(4)

 0.8 b 0.8 a 1− 1−  0.8 k+2 2+2

b

By formula (11.2),

0.8 + 0.8  8 1 − 0.8 11.18. We’ll calculate the probability of 0 claims or 1 claim. For a Poisson distribution with mean λ, this is e −λ (1 + λ ) . Now we integrate this over the uniform distribution: E[N] 

Z 0 5

Z 0

5

5

Z

Pr ( N < 2)  0.2

0

e −λ dλ +

5

Z

λe −λ dλ

0

!

e −λ dλ  1 − e −5

λe

−λ

dλ 

5 −λe −λ 0

+

5

Z

e −λ dλ

0 −5

 −5e −5 + 1 − e

 1 − 6e −5

Pr ( N < 2)  0.2 (1 − e −5 ) + (1 − 6e −5 )  0.2 2 − 7e −5  0.3906





Pr ( N ≥ 2)  1 − 0.3906  0.6094





(A)

11.19. Let S be the number of applications in 24 hours. The expected value of S is

E[S] = 24(6)(4) = 576

To calculate the variance of S, we use the conditional variance formula, equation 4.2 on page 64, for one hour's applications, by conditioning on β. Let S_1 be the number of applications in one hour.

Var(S_1) = Var(E[S_1 | β]) + E[Var(S_1 | β)]

Given β, the number of applications in one hour is negative binomial with mean rβ = 6β and variance rβ(1 + β) = 6β(1 + β).

Var(S_1) = Var(6β) + E[6β(1 + β)] = 6² Var(β) + 6 (E[β] + E[β²])

The parameter β is Poisson with mean 4 and variance 4:

E[β] = Var(β) = 4
E[β²] = Var(β) + E[β]² = 4 + 4² = 20
Var(S_1) = 36(4) + 6(4 + 20) = 288

The variance of S is 24 times the variance of S_1:

Var(S) = 24(6²)(4) + 24(6)(4 + 20) = 24(288) = 6912

Using the normal approximation with continuity correction:

Pr(S > 600) = 1 − Φ((600.5 − 576)/√((288)(24))) = 1 − Φ(0.29) = 1 − 0.6141 = 0.3859
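The conditional-variance arithmetic in solution 11.19 can be verified in a few lines; a sketch using only the quantities named in the solution:

```python
from math import erf, sqrt

# beta is Poisson with mean 4; one hour is negative binomial with r = 6
e_beta, var_beta = 4, 4
e_beta2 = var_beta + e_beta**2                        # 20

var_one_hour = 36 * var_beta + 6 * (e_beta + e_beta2) # Var(6b) + E[6b(1+b)]
mean_s = 24 * 6 * e_beta                              # 576
var_s = 24 * var_one_hour                             # 6912

def phi(x):  # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

z = (600.5 - mean_s) / sqrt(var_s)                    # about 0.2947
pr_more_than_600 = 1 - phi(z)
```

The unrounded z-score gives 0.3841; the manual rounds z to 0.29 before the table lookup, hence its 0.3859.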

11.20. Since p_k^M = p_(k−1)^M (λ/k) for a zero-modified Poisson, we have

0.1 = 0.25(λ/2)
λ = 0.8

For the corresponding zero-truncated distribution,

p_1^T = λ/(e^λ − 1) = 0.8/(e^0.8 − 1) = 0.652773

so

p_1^M = (1 − p_0^M) p_1^T
0.25 = (1 − p_0^M)(0.652773)
1 − p_0^M = 0.25/0.652773 = 0.382982
p_0^M = 0.617018

11.21. A zero-truncated geometric is a geometric shifted over by 1, or starting at 1 instead of at 0. Some authors consider this to be the true geometric distribution. (This is not true for the other truncated distributions. A zero-truncated negative binomial is not, in general, the same as a negative binomial distribution shifted by 1.) Therefore, the mean is 1 more than the mean of a geometric with the same parameter. The parameter of a geometric is its mean, so the parameter of the unshifted distribution here is 3 − 1 = 2. Then

p_5^T = p_4 = β⁴/(1 + β)⁵ = 16/243 = 0.06584

11.22. The means and variances of the two components of the mixture (N_1 being the component with β = 0.2 and N_2 being the component with β = 0.4) are

E[N_1] = 0.2/ln 1.2 = 1.096963        Var(N_1) = 0.2(1.2 − 0.2/ln 1.2)/ln 1.2 = 0.113028
E[N_2] = 0.4/ln 1.4 = 1.188805        Var(N_2) = 0.4(1.4 − 0.4/ln 1.4)/ln 1.4 = 0.251069

So the second moments are E[N_1²] = 0.113028 + 1.096963² = 1.316356 and E[N_2²] = 0.251069 + 1.188805² = 1.664328. The mean of the mixture is

0.5(1.096963 + 1.188805) = 1.142884

and the second moment is

0.5(1.316356 + 1.664328) = 1.490342.

The variance is 1.490342 − 1.142884² = 0.184157.
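The moment bookkeeping in solution 11.22 is easy to get wrong by hand; a numeric sketch, assuming the logarithmic mean and variance formulas quoted in the solution:

```python
from math import log

# logarithmic distribution: mean = b/ln(1+b),
# var = b*(1 + b - b/ln(1+b)) / ln(1+b)
def log_dist_moments(beta):
    mean = beta / log(1 + beta)
    var = beta * (1 + beta - beta / log(1 + beta)) / log(1 + beta)
    return mean, var

m1, v1 = log_dist_moments(0.2)
m2, v2 = log_dist_moments(0.4)

mix_mean = 0.5 * (m1 + m2)
mix_second = 0.5 * ((v1 + m1**2) + (v2 + m2**2))   # mix second moments, not variances
mix_var = mix_second - mix_mean**2
```

Mixing the second moments rather than the variances is the key step; the result reproduces 0.184157.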

11.23. The probability of 1 claim is 4(0.2)(0.8)³/(1 − 0.8⁴) = 0.693767. This is easy enough to figure out even without the tables if you know the binomial, since the numerator is the non-truncated binomial probability of 1 claim and the denominator is the complement of the non-truncated binomial's probability of 0. Therefore the probability of 2 or more claims is 1 − 0.693767 = 0.306233.


11.24. We back out a and b.

0.06 = 0.72(a + b/2)        a + b/2 = 0.06/0.72 = 1/12
0.01 = 0.06(a + b/3)        a + b/3 = 0.01/0.06 = 1/6

Subtracting the second equation from the first,

b/2 − b/3 = b/6 = 1/12 − 1/6 = −1/12
b = −1/2
a = 1/12 + (1/2)/2 = 1/3

Then r − 1 = b/a = −3/2, so r = −1/2. Also a = β/(1 + β) = 1/3, so β = 1/2. This implies that

p_1^T = rβ/((1 + β)^(r+1) − (1 + β)) = −0.25/(√(3/2) − 3/2) = −0.25/(−0.275255) = 0.908249

Since p_1^M = 0.72, we have c = 0.72/0.908249 = 0.79274, so p_0^M = 1 − 0.79274 = 0.20726.

11.25. The distribution is a mixture of a degenerate distribution at 0, weight 0.20726 (as determined in the previous problem), which has mean 0 and variance 0, and a zero-truncated distribution with weight 0.79274. Using conditional variance, the variance will be 0.79274 times the variance of the zero-truncated distribution, plus the variance of the means of the two components, which is (Bernoulli shortcut) (0.79274)(0.20726) times the mean of the zero-truncated distribution squared. The mean and variance of the zero-truncated distribution N are

E[N] = rβ/(1 − (1 + β)^(−r)) = (−0.5)(0.5)/(1 − √1.5) = −0.25/(−0.224745) = 1.11237

Var(N) = rβ((1 + β) − (1 + β + rβ)(1 + β)^(−r))/(1 − (1 + β)^(−r))²
       = −0.5(0.5)(1.5 − 1.25√1.5)/(1 − √1.5)²
       = 0.0077328/0.050510 = 0.15309

The answer is (0.79274)(0.15309) + (0.79274)(0.20726)(1.11237²) = 0.32467.

11.26. Using the given information about the first two moments, Var(N) = 63 − 3² = 54. The zero-truncated geometric distribution corresponding to our zero-modified distribution is equivalent to an unmodified geometric distribution shifted 1, and the mean of an unshifted geometric distribution is β, so the mean of a zero-truncated geometric distribution is 1 + β, and its variance is the same as the variance of an unshifted geometric distribution, or β(1 + β). For a zero-modified distribution with p_0^M = 1 − c, the mean is cm and the variance is c(1 − c)m² + cv, using m and v of the zero-truncated distribution. Here, m = 1 + β and v = β(1 + β), so the two equations


for mean and variance are:

c(1 + β) = 3
c(1 − c)(1 + β)² + cβ(1 + β) = 54

We now solve for c in terms of β from the first equation and substitute into the second equation:

3(1 − 3/(1 + β))(1 + β) + 3β = 54
(1 − 3/(1 + β))(1 + β) + β = 18
1 + β − 3 + β = 18
2β = 20
β = 10
c = 3/11

The answer is p_0^M = 1 − c = 8/11.

11.27. The probability generating function of a zero-truncated Poisson, from the tables, is

P(z) = (e^(λz) − 1)/(e^λ − 1)

Setting λ = 0.8 and differentiating three times,

P′(z) = 0.8e^(0.8z)/(e^0.8 − 1)
P′(1) = 1.452773
P″(1) = 0.8P′(1) = 1.162218
P‴(1) = 0.8P″(1) = 0.929775

These are the first three factorial moments. If N is the zero-truncated Poisson variable, then it follows that E[N] = 1.452773 and E[N(N − 1)] = 1.162218, so E[N²] = 1.452773 + 1.162218 = 2.614991. Then

E[N(N − 1)(N − 2)] = 0.929775
E[N³] − 3 E[N²] + 2 E[N] = 0.929775
E[N³] = 0.929775 + 3(2.614991) − 2(1.452773) = 5.8692

11.28. For a zero-modified logarithmic distribution, p_k^M is proportional to u^k/k for some u, so p_3 = (2/3)u p_2, which implies

0.1 = (2/3)(u)(0.2)

and u = 3/4. Then u = β/(1 + β), so β = 3. We then have

p_2^T = β²/(2(1 + β)² ln(1 + β)) = 9/(2(16) ln 4) = 0.202879

implying that c = p_2^M/p_2^T = 0.2/0.202879 = 0.985809 and p_0^M = 1 − c = 0.014191.


11.29. Since a = 1 in a Sibuya, the (a, b, 1) equation is

0.06 = 0.15(1 + b/2)

so b = −1.2. Then since b = r − 1 in a Sibuya, it follows that r = −0.2. In a Sibuya p_1^T = −r, so p_1^T = 0.2. We are given that p_1^M = 0.15, so c = 0.75 and p_0^M = 1 − c = 0.25.

11.30. You may recognize this as a zero-modified Poisson with λ = 1, because the pattern 1/1!, 1/2!, 1/3! satisfies condition (ii). If not, just take m = n − 1 and you have p_n/p_(n−1) = (n−1)!/n! = 1/n, so a = 0 and λ = b = 1. Since p_0^M = 0.1, c = 1 − p_0^M = 0.9. Then (letting p_1 be the probability of 1 for a non-modified Poisson)

p_1^T = p_1/(1 − e^(−1)) = e^(−1)/(1 − e^(−1)) = 0.3679/0.6321 = 0.5820

and

p_1^M = c p_1^T = 0.9(0.5820) = 0.5238 (C)

Quiz Solutions

11-1. The ratios are

p_3/p_2 = 0.137828/0.258428 = 0.533333 = a + b/3
p_4/p_3 = 0.055131/0.137828 = 0.4 = a + b/4
b/12 = 0.533333 − 0.4 = 0.133333
b = 12(0.133333) = 1.6
a = 0.4 − 1.6/4 = 0

Therefore the distribution is Poisson with λ = b = 1.6, and p_0 = e^(−1.6) = 0.201897.

11-2.

a + b/2 = 0.0768/0.256 = 0.3
a + b/3 = 0.02048/0.0768 = 8/30
b/6 = 0.3 − 8/30 = 1/30
b = 0.2
a = 0.2

The mean is

E[N] = (a + b)/(1 − a) = (0.2 + 0.2)/(1 − 0.2) = 0.5
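The back-out procedure used in these quiz solutions can be sketched as a small generic routine: from three consecutive (a, b, 0) probabilities, solve p_k/p_(k−1) = a + b/k for a and b, then identify the member of the class from a.

```python
def fit_ab0(p2, p3, p4):
    # ratios of consecutive probabilities at k = 3 and k = 4
    r3 = p3 / p2            # a + b/3
    r4 = p4 / p3            # a + b/4
    b = (r3 - r4) * 12      # since b/3 - b/4 = b/12
    a = r4 - b / 4
    return a, b

a, b = fit_ab0(0.258428, 0.137828, 0.055131)   # data from Quiz 11-1
# a = 0 identifies a Poisson, with lambda = b
```

With the Quiz 11-1 data this returns a = 0, b = 1.6, matching the Poisson identification above.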


11-3. Setting M = N − 1, this is equivalent to calculating Pr(M > 9) given that p_0 = 0.6 for M. However, M is an unmodified geometric distribution, so p_0 = 1/(1 + β) = 0.6. It follows that β/(1 + β) = 0.4. By formula (11.1),

Pr(M > 9) = Pr(M ≥ 10) = (β/(1 + β))^10 = 0.4^10 = 0.000104858


Lesson 12

Poisson/Gamma

Reading: Loss Models Fourth Edition 6.3

The negative binomial can be derived as a gamma mixture of Poissons. Assume that in a portfolio of insureds, loss frequency follows a Poisson distribution with parameter λ, but λ is not fixed and varies by insured. Suppose λ varies according to a gamma distribution over the portfolio of insureds. The conditional loss frequency of an insured, if you are given who the insured is, is Poisson with parameter λ. The unconditional loss frequency for an insured picked at random is a negative binomial. The parameters of the negative binomial (r, β) are the same as the parameters of the gamma distribution (α, θ); that is, r = α and β = θ. Loss Models uses an unusual parametrization of the negative binomial in order to make this work. The special case of α = 1, the exponential distribution, corresponds to the special case of r = 1, the geometric distribution.

Since the sum of negative binomials with parameters r_i and β (β the same for all of them) is a negative binomial whose parameters are r = Σ r_i and β, if the portfolio of insureds has n exposures, the distribution of total number of claims for the entire portfolio will be negative binomial with parameters nr and β.

There used to be an average of one question per exam on the Poisson/gamma model. This topic is now too simple for the exam, and doesn't appear as often. Still, you should be ready to quickly get the Poisson/gamma question out of the way if it appears. This information may also be useful in connection with Bayesian estimation and credibility (Lesson 47).

To make things a little harder, you probably won't be given the parameters, but instead will be given the mean and variance. The Loss Models appendix has the means and variances of the distributions (actually, the second moment of the gamma, from which you can derive the variance), but you may want to memorize them anyway to save yourself lookup time.
For a gamma distribution with parameters α and θ, the mean is αθ and the variance is αθ². For a negative binomial distribution, the mean is rβ and the variance is rβ(1 + β).

Example 12A The number of claims for a glass insurance policy follows a negative binomial distribution with mean 0.5 and variance 1.5. For each insured, the number of claims has a Poisson distribution with mean λ. The parameter λ varies by insured according to a gamma distribution.

Determine the variance of this gamma distribution.

Answer: Since rβ = 0.5 and rβ(1 + β) = 1.5, it follows that 1 + β = 3, β = 2, and r = 0.25. Then the gamma distribution has parameters α = 0.25 and θ = 2. The variance of the gamma is αθ² = 0.25(2²) = 1.

However, a faster way to work this out is to use the conditional variance formula. If we let X be the unconditional number of claims, then

Var(X) = E[Var(X | λ)] + Var(E[X | λ])

X | λ is Poisson, with mean and variance λ, so

Var(X) = E[λ] + Var(λ)

where the moments are over the gamma distribution. But

E[X] = E[E[X | λ]] = E[λ]

so

Var(X) = E[X] + Var(λ)

or

Var(λ) = Var(X) − E[X]        (12.1)

In our case, Var(λ) = 1.5 − 0.5 = 1.
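The lesson's main fact, that a gamma mixture of Poissons is negative binomial with r = α and β = θ, can be confirmed numerically. A sketch using α = 2, θ = 1 (a smooth integrand; Example 12A's α = 0.25 has a mixing density unbounded at 0, which the naive midpoint rule below would handle poorly):

```python
from math import exp, gamma

alpha, theta = 2.0, 1.0

def nb_pmf(k, r, beta):
    # negative binomial pmf, Loss Models parametrization
    coef = gamma(r + k) / (gamma(r) * gamma(k + 1))
    return coef * (1 / (1 + beta)) ** r * (beta / (1 + beta)) ** k

def mixture_pmf(k, alpha, theta, upper=40.0, n=100_000):
    # midpoint-rule integral of Poisson(k; lam) times the gamma density
    h = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        poisson = exp(-lam) * lam ** k / gamma(k + 1)
        gpdf = lam ** (alpha - 1) * exp(-lam / theta) / (gamma(alpha) * theta ** alpha)
        total += poisson * gpdf * h
    return total
```

For each k, the mixing integral matches the negative binomial pmf with r = 2, β = 1.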



Quiz 12-1 The number of insurance applications arriving per hour follows a Poisson distribution with mean λ. The distribution of λ over all hours is a gamma distribution with α = 0.2 and θ = 20. Within any hour, λ is constant. Calculate the probability of receiving exactly 2 applications in one hour.

Exercises

12.1. [3-F01:27] On his walk to work, Lucky Tom finds coins on the ground at a Poisson rate. The Poisson rate, expressed in coins per minute, is constant during any one day, but varies from day to day according to a gamma distribution with mean 2 and variance 4. Calculate the probability that Lucky Tom finds exactly one coin during the sixth minute of today's walk.

(A) 0.22 (B) 0.24 (C) 0.26 (D) 0.28 (E) 0.30

Use the following information for questions 12.2 and 12.3:

Customers arrive at a bank at a Poisson rate of λ per minute. The parameter λ varies by day. The distribution of λ over all days is exponential with mean 2.

12.2. Calculate the probability of at least 2 customers arriving in a minute on a random day.

12.3.

Calculate the probability of at least 4 customers arriving in 2 minutes on a random day.

12.4. Customers arrive at a bank at a Poisson rate of λ per minute. The parameter λ varies by minute. The distribution of λ over all minutes is exponential with mean 2. Calculate the probability of at least 4 customers arriving in 2 minutes.

12.5. [4B-S91:31] (2 points) The number of claims a particular policyholder makes in a year has a Poisson distribution with mean λ. λ follows a gamma distribution with variance equal to 0.2. The resulting distribution of policyholders by number of claims is a negative binomial with parameters r and β such that the variance is equal to 0.5. What is the value of r(1 + β)?

(A) Less than 0.6
(B) At least 0.6, but less than 0.8
(C) At least 0.8, but less than 1.0
(D) At least 1.0, but less than 1.2
(E) At least 1.2

12.6. [4B-S95:24] (2 points) You are given the following:

The random variable representing the number of claims for a single policyholder follows a Poisson distribution.

For each class of policyholders, the Poisson parameters follow a gamma distribution representing the heterogeneity of risks within that class.

For four distinct classes of risks, the random variable representing the number of claims of a policyholder, chosen at random, follows a negative binomial distribution with parameters r and β, as follows:

        Class 1    Class 2    Class 3    Class 4
  r      5.88       1.26      10.89       2.47
  β      1/49       1/9        1/99       1/19

The lower the standard deviation of the gamma distribution, the more homogeneous the class. Which of the four classes is most homogeneous?

(A) Class 1 (B) Class 2 (C) Class 3 (D) Class 4
(E) Cannot be determined from the given information

12.7. [3-S01:15] An actuary for an automobile insurance company determines that the distribution of the annual number of claims for an insured chosen at random is modeled by the negative binomial distribution with mean 0.2 and variance 0.4. The number of claims for each individual insured has a Poisson distribution and the means of these Poisson distributions are gamma distributed over the population of insureds. Calculate the variance of this gamma distribution.

(A) 0.20 (B) 0.25 (C) 0.30 (D) 0.35 (E) 0.40

12.8. [4B-F96:15] (2 points) You are given the following:

The number of claims for a single policyholder follows a Poisson distribution with mean λ.

λ follows a gamma distribution.

The number of claims for a policyholder chosen at random follows a distribution with mean 0.10 and variance 0.15. Determine the variance of the gamma distribution.

(A) 0.05 (B) 0.10 (C) 0.15 (D) 0.25 (E) 0.30

12.9. For a fire insurance coverage, you are given

• Claim frequency per insured on the coverage over the entire portfolio of insureds follows a negative binomial distribution with mean 0.2 and variance 0.5.
• For each insured, claim frequency follows a Poisson distribution with mean λ.
• λ varies by insured according to a gamma distribution.

Calculate the variance of λ over the portfolio of insureds.


12.10. For an insurance coverage, the number of claims per insured follows a negative binomial. You are given the following information on probabilities of claims:

  Number of claims    Probability
         0               0.8
         1               0.144
         2               0.03888

Claims for each insured follow a Poisson distribution with mean λ. The distribution of λ over the insureds is a gamma distribution. Determine the variance of the gamma distribution.

12.11. The number of claims on an insurance coverage follows a Poisson distribution with mean λ for each insured. The means λ vary by insured and overall follow a gamma distribution. You are given:

(i) The probability of 0 claims for a randomly selected insured is 0.04.
(ii) The probability of 1 claim for a randomly selected insured is 0.064.
(iii) The probability of 2 claims for a randomly selected insured is 0.0768.

Determine the variance of the gamma distribution.

12.12. [4B-F96:26] (2 points) You are given the following:

The probability that a single insured will produce 0 claims during the next exposure period is e −θ .

θ varies by insured and follows a distribution with density function

f(θ) = 36θe^(−6θ),  0 < θ < ∞.

Determine the probability that a randomly selected insured will produce 0 claims during the next exposure period.

(A) Less than 0.72
(B) At least 0.72, but less than 0.77
(C) At least 0.77, but less than 0.82
(D) At least 0.82, but less than 0.87
(E) At least 0.87

12.13. You are given:

(i) The number of claims on an auto comprehensive policy follows a geometric distribution with mean 0.6.
(ii) The number of claims for each insured follows a Poisson distribution with parameter λ.
(iii) λ varies by insured according to a gamma distribution.

Calculate the proportion of insureds for which the expected number of claims per year is less than 1.


12.14. [1999 C3 Sample:12] The annual number of accidents for an individual driver has a Poisson distribution with mean λ. The Poisson means, λ, of a heterogeneous population of drivers have a gamma distribution with mean 0.1 and variance 0.01. Calculate the probability that a driver selected at random from the population will have 2 or more accidents in one year.

(A) 1/121 (B) 1/110 (C) 1/100 (D) 1/90 (E) 1/81

12.15. [3-S00:4] You are given:

(i) The claim count N has a Poisson distribution with mean Λ.
(ii) Λ has a gamma distribution with mean 1 and variance 2.

Calculate the probability that N = 1.

(A) 0.19 (B) 0.24 (C) 0.31 (D) 0.34 (E) 0.37

12.16. [3-S01:3] Glen is practicing his simulation skills. He generates 1000 values of the random variable X as follows:

(i) He generates the observed value λ from the gamma distribution with α = 2 and θ = 1 (hence with mean 2 and variance 2).
(ii) He then generates x from the Poisson distribution with mean λ.
(iii) He repeats the process 999 more times: first generating a value λ, then generating x from the Poisson distribution with mean λ.
(iv) The repetitions are mutually independent.

Calculate the expected number of times that his simulated value of X is 3.

(A) 75 (B) 100 (C) 125 (D) 150 (E) 175


12.17. [CAS3-F03:15] Two actuaries are simulating the number of automobile claims for a book of business. For the population they are studying:

(i) The claim frequency for each individual driver has a Poisson distribution.
(ii) The means of the Poisson distributions are distributed as a random variable, Λ.
(iii) Λ has a gamma distribution.

In the first actuary's simulation, a driver is selected and one year's experience is generated. This process of selecting a driver and simulating one year is repeated N times. In the second actuary's simulation, a driver is selected and N years of experience are generated for that driver.

Which of the following is/are true?

I. The ratio of the number of claims the first actuary simulates to the number of claims the second actuary simulates should tend towards 1 as N tends to infinity.
II. The ratio of the number of claims the first actuary simulates to the number of claims the second actuary simulates will equal 1, provided that the same uniform random numbers are used.
III. When the variances of the two sequences of claim counts are compared the first actuary's sequence will have a smaller variance because more random numbers are used in computing it.

(A) I only
(B) I and II only
(C) I and III only
(D) II and III only
(E) None of I, II, or III is true

12.18. [3-F02:5] Actuaries have modeled auto windshield claim frequencies. They have concluded that the number of windshield claims filed per year per driver follows the Poisson distribution with parameter λ, where λ follows the gamma distribution with mean 3 and variance 3. Calculate the probability that a driver selected at random will file no more than 1 windshield claim next year. (A) 0.15

(B) 0.19

(C) 0.20

(D) 0.24

(E) 0.31

12.19. The number of customers arriving in a store in an hour follows a Poisson distribution with mean λ. The parameter λ varies by day, and has a Weibull distribution with parameters τ  1, θ  8.5. Determine the probability of 8 or more customers arriving in an hour. Additional released exam questions: CAS3-S05:10

Solutions

12.1. Since αθ = 2 and αθ² = 4, β = θ = 2 and r = α = 1. Then the probability of finding a coin in one minute (nothing special about the sixth minute) is

p_1 = rβ/(1!(1 + β)^(r+1)) = β/(1 + β)² = 2/9 = 0.2222 (A)


12.2. An exponential distribution is a gamma distribution with α = 1. It follows that the corresponding negative binomial distribution is a geometric distribution (r = 1) with β = θ = 2. Using formula (11.1),

Pr(N ≥ 2) = (β/(1 + β))² = (2/3)² = 4/9

12.3. Since λ only varies by day, it will not vary over the 2-minute period. Thus conditional on λ, the customer count over 2 minutes is Poisson with mean µ = 2λ. Since λ is exponential with mean 2 and exponentials are a scale family, µ is exponential with mean 4, and the exponential mixture has a geometric distribution with parameter β = 4. Using formula (11.1),

Pr(N ≥ 4) = (β/(1 + β))⁴ = (4/5)⁴ = 0.4096

12.4. The number of customers arriving in one minute has a geometric distribution with β = 2, as derived two exercises ago. Each minute is independent, and the sum of two of these geometric random variables is a negative binomial with parameters r = 2, β = 2. (In general, the sum of negative binomial random variables having the same β is obtained by summing the r's.) The probability of at least 4 is

Pr(N ≥ 4) = 1 − p_0 − p_1 − p_2 − p_3
          = 1 − (1/3)² − 2(1/3)²(2/3) − 3(1/3)²(2/3)² − 4(1/3)²(2/3)³
          = 1 − 1/9 − 4/27 − 4/27 − 32/243 = 0.4609

12.5. From the variance of the gamma, we have αθ² = 0.2, so

rβ² = 0.2        (*)

We want the variance of the negative binomial to equal 0.5, or

rβ(1 + β) = rβ² + rβ = 0.5        (**)

Combining these two equations, rβ = 0.3. Plugging this into (*), we get β = 2/3. Then from (**), r(1 + β) = 0.5/(2/3) = 0.75. (B)

12.6. We don't have to take the square root of the variance; we can just compare the variances. The variance of the gamma distribution is αθ² = rβ². The variances are:

  Class 1: 0.002449    Class 2: 0.015556    Class 3: 0.001111    Class 4: 0.006842

Class 3 is lowest. (C)

12.7. As discussed in Example 12A, the variance of the gamma is the overall variance minus the overall mean (equation (12.1)), or 0.4 − 0.2 = 0.2. (A)

12.8. As discussed in Example 12A, the variance of the gamma is the overall variance minus the overall mean (equation (12.1)), or 0.15 − 0.10 = 0.05. (A)

12.9. By formula (12.1), the answer is 0.5 − 0.2 = 0.3.


12.10. Dividing 0.144 by 0.8 and 0.03888 by 0.144, we have

a + b = 0.18
a + b/2 = 0.27

so b = −0.18 and a = 0.36. Then

β/(1 + β) = a = 0.36
0.36 = 0.64β
β = 0.5625 = θ
r = 1 + b/a = 0.5 = α

The variance is 0.5(0.5625²) = 0.158203125.

12.11. First we calculate a and b.

0.0768/0.064 = 1.2 = a + b/2
0.064/0.04 = 1.6 = a + b

So a = 0.8, b = 0.8. We now calculate r and β of the negative binomial distribution.

β/(1 + β) = a = 0.8 ⇒ β = 4
b = (r − 1)(β/(1 + β)) = 0.8 ⇒ r = 2

The parameter α of the gamma distribution is r of the negative binomial distribution, and the parameter θ of the gamma distribution is β of the negative binomial distribution. So α = 2, θ = 4, and the variance of the gamma is αθ² = 2(4²) = 32.

12.12. Although we are not given that the model is Poisson, it could be Poisson with parameter θ, so we might as well cheat and assume it. The mixing density f(θ) is a gamma with α = 2 and θ = 1/6, so r = 2 and β = 1/6 for the negative binomial, and

p_0 = (1/(1 + β))^r = (6/7)² = 36/49 = 0.7347 (B)

If we didn't want to cheat, we would have to calculate the probability directly:

Pr(X = 0) = ∫_0^∞ e^(−θ) · 36θe^(−6θ) dθ = 36 ∫_0^∞ θe^(−7θ) dθ

We recognize the integrand as proportional to a gamma density with parameters α = 2 and θ = 1/7. The constant to make it integrate to 1 is 7²/Γ(2) = 49. So we have

Pr(X = 0) = (36/49) ∫_0^∞ (1/Γ(2)) 7² θe^(−7θ) dθ = 36/49 = 0.7347


12.13. r = 1 = α and β = 0.6 = θ. The parameter λ therefore has an exponential distribution with mean 0.6. The probability that λ is below 1 is F(1) = 1 − e^(−1/0.6) = 0.8111.

12.14. Since for the gamma distribution αθ = 0.1 and αθ² = 0.01, it follows that θ = 0.1 and α = 1. Then for the corresponding negative binomial, r = 1 and β = 0.1. It's a geometric distribution. By formula (11.1), the probability of at least 2 accidents is

Pr(N ≥ 2) = (β/(1 + β))² = (0.1/1.1)² = 1/121 (A)

12.15. Since for the gamma distribution αθ = 1 and αθ² = 2, it follows that β = θ = 2 and r = α = 1/2 for the negative binomial distribution. Then

Pr(N = 1) = C(0.5, 1)(1/(1 + β))^0.5 (β/(1 + β)) = 0.5(1/3)^0.5 (2/3) = 0.1925 (A)

12.16. The resulting distribution of the simulated values is negative binomial with r = 2, β = 1. Then

Pr(X = 3) = C(4, 3)(1/2)²(1/2)³ = 4/32 = 1/8

Hence the expected number of 3's in 1000 trials is 1000/8 = 125. (C)

12.17. The first actuary is simulating a negative binomial distribution with a mean of E[Λ]. The second actuary is simulating a Poisson distribution with a random mean, namely whatever the mean of the selected driver is. That mean is not necessarily the same as E[Λ] (after all, it is selected randomly), so I is false. Statement II is also false; in fact, it's hard to say what "the same uniform random numbers are used" means, since the first actuary uses almost twice as many numbers to generate the same number of years. The variances of the random sequences are random numbers, so there's no way to make a statement like III in general. However, to the extent a statement can be made, one would expect the variance of the first model to be larger, because the negative binomial variance includes both the variance of the driver and the variance of the parameter Λ; as the conditional variance theorem states, the variance of the first model is E[Λ] + Var(Λ), whereas the variance of the second model is just Var(Λ). (E)

12.18. Since αθ = 3 and αθ² = 3, it follows that β = θ = 1 and r = α = 3. Then

p_0 = (1/(1 + 1))³ = 1/8
p_1 = 3(1/2)³(1/2) = 3/16
p_0 + p_1 = 5/16 (E)

12. POISSON/GAMMA

220

Quiz Solutions 12-1.

The negative binomial parameters are r  0.2 and β  20, so the probability of 2 applications is1 1.2 p2  2

!

1 1 + 20

! 0.2

20 1 + 20

!2

 0.12 (0.543946)(0.907029)  0.0592

1See page 187 for a discussion on how to evaluate a binomial coefficient in which the numerator is not an integer. C/4 Study Manual—17th edition Copyright ©2014 ASM

Lesson 13

Frequency Distributions—Exposure and Coverage Modifications Reading: Loss Models Fourth Edition 8.6

13.1

Exposure modifications

Exposure modifications are off the syllabus. But they are easy enough to handle, and you should be able to distinguish between them and coverage modifications, so I will discuss them briefly. Exposure can refer to the number of members of the insured group or the number of time units (e.g., years) they are insured for. Doubling the size of the group or the period of coverage will double the number of claims. If the number of exposures change, how can we adapt our frequency model? Suppose the model is based on n1 exposures, and you now want a model for n2 exposures. If the original model is Poisson with parameter λ, the new model is Poisson with parameter λn2 /n1 . If the original model is negative binomial, the new model is negative binomial with the first parameter, r, multiplied by n2 /n1 . If the original model is binomial, you are supposed to multiply the first parameter, m, by n 2 /n1 , but this is only acceptable if the revised m is still an integer. Otherwise, the binomial model cannot be used with the new exposures.

13.2

Coverage modifications

The topic of coverage modifications is on the syllabus, and exams frequently have questions about this topic. The most common type of coverage modification is changing the deductible, so that the number of claims for amounts greater than zero changes. Another example would be uniform inflation, which would affect frequency if there’s a deductible. If you need to calculate aggregate losses, and don’t care about payment frequency, one way to handle a coverage modification is to model the number of losses (rather than the number of paid claims), in which case no modification is needed for the frequency model. Instead, use the payment per loss random variable, Y L . This variable is adjusted, and will be zero with positive probability. For distributions of the ( a, b, 0) and ( a, b, 1) classes, however, an alternative method is available. The frequency distribution is modified. The modified frequency—the frequency of positive claims—has the same form as the original frequency, but with different parameters. Suppose the probability of paying a claim, i.e., severity being greater than the deductible, is v. If the model for loss frequency is Poisson with parameter λ, the new parameter for frequency of paid claims is vλ. (Having only one parameter makes things simple.) If the original model is negative binomial, the modified frequency is negative binomial with the same parameter r but with β multiplied by v. Notice how this contrasts with exposure modification. If the original model is binomial, once again the modified frequency is binomial with the same parameter m but with q multiplied by v. This time, since q doesn’t have to be an integer, the binomial model can always be used with the new severity distribution. If you need to calculate aggregate losses, C/4 Study Manual—17th edition Copyright ©2014 ASM

221

13. FREQUENCY— EXPOSURE & COVERAGE MODIFICATIONS

222

the modified frequency is used in conjunction with the modified payment per payment random variable, YP . Example 13A Losses on a major medical coverage on a group of 50 follow a Pareto distribution with parameters α  1 and θ  1000. The number of claims submitted by the group in a year has a negative binomial distribution with mean 16 and variance 20. The group then expands to 60, and a deductible of 250 per loss is imposed. Calculate the variance of claim frequency for the group for the revised coverage. Answer: The original negative binomial parameters are β  0.25 and r  64. The exposure modification 60/50 sends r to r 0  64 (60/50)  76.8. The probability that a loss will be greater than 250 is 1 − FX (250) 

1000  0.8. 1250

The deductible sends β to β0  0.25 (0.8)  0.2. The variance of claim frequency is then 76.8 (0.2)(1.2)  18.432



As the next example demonstrates, it is also possible to adjust claim frequency based on an adjustment to the modification. Example 13B For an auto collision coverage with ordinary deductible 500, ground-up loss amounts follow a Weibull distribution with τ  0.2 and θ  2000. The number of losses follows a geometric distribution. The expected number of claims for non-zero payment amounts per year is 0.3. Calculate the probability of exactly one claim for a non-zero payment amount in one year if the deductible is changed to 1000. Answer: The number of claims over 500 per year is geometric with β  0.3. The number of claims over 1000 per year is also geometric, with a revised β. The revised β can be computed by scaling down the β for a 500 deductible with the ratio of S (1000) /S (500) , where the survival function is for the Weibull distribution. S (500)  e − (500/2000) S (1000)  e

0.2

− (1000/2000) 0.2

Revised β  0.3

 0.468669  0.418721

0.418721  0.268028 0.468669

!

The revised probability of exactly one claim in a year is p1 

?

1 1.268028

!

0.268028  0.1667 1.268028

!

Quiz 13-1 For an insurance coverage, you are given: (i) Losses in 2009 follow an inverse exponential distribution with θ  1000. (ii) Losses are subject to a 500 deductible. (iii) The annual number of losses follows a binomial distribution with m  5. (iv) The expected number of paid claims in 2009 is 0.5. (v) Losses in 2010 are inflated by 10% over losses in 2009. Calculate the variance of the number of paid claims in 2010. C/4 Study Manual—17th edition Copyright ©2014 ASM



EXERCISES FOR LESSON 13

223

Let's discuss the (a, b, 1) class now. Please note that in the following discussion, the word "modified" is used in the sense of an (a, b, 1) class zero-modified distribution with p_k^M; we will say "revised" when referring to the severity modification.

The same parameter that gets multiplied by v in the (a, b, 0) class gets multiplied by v in the (a, b, 1) class. The balancing item is then p_0^M = 1 − Σ_{k≥1} p_k^M. The textbook (Table 8.3) gives formulas for directly calculating p_0^{M*} in all cases. Rather than memorizing that table, use the following formula:

1 − p_0^{M*} = (1 − p_0^M) (1 − p_0^*)/(1 − p_0)    (13.1)

where asterisks indicate distributions with revised parameters. In other words, Pr(N > 0) in the revised distribution equals the original Pr(N > 0) times the ratio of the Pr(N > 0)'s for the corresponding unmodified (a, b, 0) class distributions. This formula works even when the unmodified distribution is improper (so that unmodified probabilities are negative or greater than 1), as in the ETNB family. For the logarithmic distribution, use the following:

1 − p_0^{M*} = (1 − p_0^M) ln(1 + vβ)/ln(1 + β)    (13.2)

The use of formula (13.1) for an improper distribution is illustrated in the following example:

Example 13C Frequency of claims per year follows a zero-modified negative binomial distribution with r = −0.5, β = 1, and p_0^M = 0.7. Claim size follows a Pareto with α = 1, θ = 1000, and is independent of claim frequency. A deductible of 500 is imposed.

Calculate the probability of no claims payments in a year.

Answer: The probability of a payment given a claim is the Pareto survival function at 500:

S(500) = θ/(θ + 500) = 1000/1500 = 2/3

The revised negative binomial parameters are r* = −0.5, β* = 2/3. By equation (13.1),

1 − p_0^M = 0.3
1 − p_0 = 1 − (1/(1 + β))^r = 1 − (1/2)^−0.5 = −0.4142
1 − p_0^* = 1 − (1/(5/3))^−0.5 = −0.2910
1 − p_0^{M*} = 0.3 (−0.2910/−0.4142) = 0.2108
p_0^{M*} = 1 − 0.2108 = 0.7892



Exercises

13.1. The losses on an auto comprehensive coverage follow a Pareto distribution with parameters α = 2 and θ = 1000. The number of losses follows a Bernoulli distribution with an average of 0.2 losses per year. Loss sizes are affected by 10% inflation. A 250 deductible is imposed.

Calculate the variance of the frequency of paid losses after inflation and the deductible.

Exercises continue on the next page . . .

13. FREQUENCY— EXPOSURE & COVERAGE MODIFICATIONS

224

Table 13.1: Formula summary—Exposure and coverage modifications

                       Original Parameters   Exposure Modification   Coverage Modification
                       Exposure n1           Exposure n2             Exposure n1
  Model                Pr(X > 0) = 1         Pr(X > 0) = 1           Pr(X > 0) = v
  Poisson              λ                     (n2/n1)λ                vλ
  Binomial^a           m, q                  (n2/n1)m, q             m, vq
  Negative binomial    r, β                  (n2/n1)r, β             r, vβ

These adjustments work for (a, b, 1) distributions as well as (a, b, 0) distributions. For (a, b, 1) distributions, p_0^M is adjusted as follows:

1 − p_0^{M*} = (1 − p_0^M) (1 − p_0^*)/(1 − p_0)    (13.1)

^a (n2/n1)m must be an integer for the exposure modification formula to work.

13.2. The losses on an auto comprehensive coverage follow a Pareto distribution with parameters α = 2 and θ = 1000. The number of losses follows a geometric distribution with an average of 0.2 losses per year. Loss sizes are affected by 10% inflation. A 250 deductible is imposed.

Calculate the variance of the frequency of paid losses after inflation and the deductible.

13.3. Aggregate claim frequency for an employee dental coverage covering 10 individuals follows a negative binomial distribution with mean 2 and variance 5. Loss size follows an exponential distribution with mean 500. The group expands to 20 individuals and a deductible of 100 is imposed.

Calculate the probability of 2 or more paid claims from the group after these revisions.

13.4. The number of losses for the insurer follows a Poisson distribution with parameter λ. λ varies by year according to a gamma distribution with parameters α = 6, θ = 1/3. Loss sizes are independent of the number of losses and are lognormally distributed with parameters µ = 10, σ = 2. A reinsurance agreement provides that the reinsurer reimburses the insurer for the excess of each loss over 1,000,000.

Determine the probability that the reinsurer will pay exactly one loss in a year.

13.5. The number of students taking an actuarial exam has a negative binomial distribution with parameters r = 10, β = 1.5. Each student has a 0.4 probability of passing.

Determine the probability that 3 or more students pass.

13.6.

You are given the following information regarding windstorms:

(i) The frequency of windstorms causing more than 1,000,000 damage before inflation follows a negative binomial distribution with mean 0.2 and variance 0.4.
(ii) Uniform inflation of 5% affects the amount of damage.
(iii) The severity of windstorms before inflation follows an exponential distribution with mean 1,000,000.

Calculate the probability that there will be exactly 1 windstorm in a year causing more than 2,000,000 damage after one year's inflation.


13.7. [SOA3-F03:19] Aggregate losses for a portfolio of policies are modeled as follows:

(i) The number of losses before any coverage modifications follows a Poisson distribution with mean λ.
(ii) The severity of each loss before any coverage modifications is uniformly distributed between 0 and b.

The insurer would like to model the impact of imposing an ordinary deductible, d (0 < d < b), on each loss and reimbursing only a percentage, c (0 < c ≤ 1), of each loss in excess of the deductible. It is assumed that the coverage modifications will not affect the loss distribution.

The insurer models its claims with modified frequency and severity distributions. The modified claim amount is uniformly distributed on the interval [0, c(b − d)].

Determine the mean of the modified frequency distribution.

(A) λ    (B) λc    (C) λ b/d    (D) λ (b − d)/b    (E) λc (b − d)/b

13.8. [CAS3-F04:17] You are given:

• Claims are reported at a Poisson rate of 5 per year.
• The probability that a claim will settle for less than $100,000 is 0.9.

What is the probability that no claim of $100,000 or more is reported during the next 3 years?

(A) 20.59%    (B) 22.31%    (C) 59.06%    (D) 60.63%    (E) 74.08%

13.9. [CAS3-S04:17] Payfast Auto insures sub-standard drivers.

• Each driver has the same non-zero probability of having an accident.
• Each accident does damage that is exponentially distributed with θ = 200.
• There is a $100 per accident deductible and insureds only report claims that are larger than the deductible.
• Next year each individual accident will cost 20% more.
• Next year Payfast will insure 10% more drivers.

Determine the percentage increase in the number of reported claims next year.

(A) Less than 15%
(B) At least 15%, but less than 20%
(C) At least 20%, but less than 25%
(D) At least 25%, but less than 30%
(E) At least 30%


13.10. [SOA3-F04:8] For a tyrannosaur with a taste for scientists:

(i) The number of scientists eaten has a binomial distribution with q = 0.6 and m = 8.
(ii) The number of calories of a scientist is uniformly distributed on (7000, 9000).
(iii) The numbers of calories of scientists eaten are independent, and are independent of the number of scientists eaten.

Calculate the probability that two or more scientists are eaten and exactly two of those eaten have at least 8000 calories each.

(A) 0.23    (B) 0.25    (C) 0.27    (D) 0.30    (E) 0.35

13.11. An insurance coverage is subject to an ordinary deductible of 500. You are given:

(i) The number of losses above the deductible follows a binomial distribution with m = 10, q = 0.05.
(ii) Payment sizes follow a paralogistic distribution with α = 3 and θ = 800.

The deductible is raised to 1000.

Calculate the probability of making exactly one payment with the revised deductible.

13.12. You are given:

(i) An insurance coverage is subject to an ordinary deductible of 500.
(ii) The number of losses per year above the deductible follows a negative binomial distribution with r = 2, β = 0.8.
(iii) The severity of each loss before coverage modifications is uniform on (0, 2500].

The deductible is raised to x. The probability of zero claims under the new deductible is 0.390625.

Determine x.

13.13. An insurance coverage is subject to an ordinary deductible of 100 during 2010. You are given:

(i) The annual number of losses above 100 in 2010 follows a negative binomial distribution with r = 1.5, β = 1.
(ii) Ground-up loss sizes in 2010 follow a lognormal distribution with µ = 4, σ = 2.
(iii) Losses are subject to 10% uniform inflation in 2011.
(iv) The deductible is raised to 150 in 2011.

Calculate the probability of 3 or more nonzero payments on this insurance coverage in 2011.

13.14. You are given:

(i) An automobile collision coverage has a 500 deductible.
(ii) The number of paid claims follows a negative binomial distribution with parameters r = 0.5 and β = 0.4.
(iii) Ground-up loss sizes follow a single-parameter Pareto distribution with parameters θ = 100 and α = 1.

Determine the deductible needed in order to reduce the variance of paid claim counts to 0.2.


13.15. [151-82-92:9] N2 is the number of claims of size 2 in a compound negative binomial distribution in which:

(i) The primary distribution is negative binomial with parameters r = 5 and β = 1/2.
(ii) The claim size distribution is:

      x    Pr(x)
      1    1/2
      2    1/4
      3    1/8
      4    1/8

Determine Var(N2).

(A) 5/64    (B) 5/16    (C) 5/8    (D) 45/64    (E) 15/4

13.16. You are given:

(i) The number of losses follows a zero-modified Poisson distribution with λ = 0.2 and p_0^M = 0.4.
(ii) Loss size follows a single-parameter Pareto distribution with θ = 100, α = 0.5.
(iii) An insurance coverage has a deductible of 250.

Calculate the probability of one paid claim.

13.17. You are given:

(i) The number of losses follows a zero-modified negative binomial distribution with r = 2, β = 0.5, and p_0^M = 0.25.
(ii) Loss size follows a loglogistic distribution with γ = 0.2, θ = 1000.
(iii) An insurance coverage has a deductible of 300.

Calculate the probability of no paid claims.

13.18. For an insurance coverage:

(i) Each loss is subject to a 250 deductible.
(ii) The annual number of nonzero payments follows a zero-modified geometric distribution with β = 4.
(iii) The average number of nonzero payments per year is 2.
(iv) Ground-up severity follows an inverse exponential distribution with θ = 100.

The deductible is raised to 300.

Calculate the probability of 3 or more nonzero payments with the revised deductible.


13.19. Losses follow an inverse Pareto distribution with τ = 2 and θ = 300. The number of nonzero payments N on a coverage with ordinary deductible of 200 has the following distribution:

      n    Pr(N = n)
      0    0.60
      1    0.30
      2    0.05
      3    0.05

The deductible is raised to 500.

Calculate the probability of exactly one nonzero payment on the coverage with the revised deductible.

13.20. Losses follow a Weibull distribution with τ = 0.5 and θ = 1000. The number of nonzero payments N on a coverage with ordinary deductible 100 follows a negative binomial distribution with r = 3 and β = 0.1. The deductible is increased to 200.

Calculate the probability of exactly one nonzero payment on the coverage with the revised deductible.

13.21. Losses follow a uniform distribution on (0, 1000). The number of losses greater than 200 follows a binomial distribution with m = 10, q = 0.1.

Calculate the variance of the number of losses above 400.

13.22. Losses follow a uniform distribution on (0, 1000). The number of losses greater than 200 follows a zero-modified logarithmic distribution with p_0^M = 0.6, β = 2.

Calculate the probability that the number of losses greater than 500 is 0.

Additional released exam questions: CAS3-F05:24, CAS3-S06:32, CAS3-F06:24,31, C-S07:39

Solutions 2

13.1. The inflated Pareto has parameters α  2 and θ  1000 (1.1)  1100. S (250)  1100  0.6639. 1350 The revised Bernoulli parameter is then q  (0.6639)(0.2)  0.1328. The variance is (0.1328)(1 − 0.1328)  0.1151 . 13.2. The Pareto parameters are the same as in the previous exercise. We calculated there that S (250)  0.6639. Therefore, the revised geometric parameter β is 0.2 (0.6639)  0.1328. The variance of the geometric is β (1 + β )  (0.1328)(1.1328)  0.1504 . 13.3. The original parameters of the negative binomial are r  4/3, β  3/2. The probability that a loss is greater than 100 is S (100)  e −100/500  0.8187. Doubling the size of the group doubles r so that the new r is 8/3. Eliminating claims with the deductible multiplies β by the probability that a loss will be greater than 100, so the new β is 32 (0.8187)  1.2281. Using these new parameters, and letting N be the number of claims for the group, 1 Pr ( N  0)  2.2281

! 8/3

 0.11807

EXERCISE SOLUTIONS FOR LESSON 13

229

1 8 Pr ( N  1)  3 2.2281

! 8/3

1.2281  0.17356 2.2281

!

Pr ( N ≥ 2)  1 − 0.11807 − 0.17356  0.7084 13.4.

The probability that a loss will be greater than 1,000,000 is

1 − Φ((ln 1,000,000 − 10)/2) = 1 − Φ((13.8155 − 10)/2) = 1 − Φ(1.91) = 1 − 0.9719 = 0.0281

The parameters of the negative binomial distribution for the number of all losses are r = 6, β = 1/3. For losses above 1,000,000, the revised parameters are r = 6, β = (1/3)(0.0281) = 0.0093667. We then have

p1 = 6 (1/1.0093667)^6 (0.0093667/1.0093667) = 0.0526.

13.5. The modified negative binomial distribution has r = 10 and β = 1.5(0.4) = 0.6. The probabilities of 0, 1, and 2 students passing, p0, p1, p2, are

p0 = (1/1.6)^10 = 0.009095
p1 = 10 (1/1.6)^10 (0.6/1.6) = 0.034106
p2 = 55 (1/1.6)^10 (0.6/1.6)^2 = 0.070344

The probability that 3 or more students pass is 1 − 0.009095 − 0.034106 − 0.070344 = 0.886455.

13.6. The inflated variable has θ = 1,050,000. Also, the frequency of windstorms causing inflated damage of 1,050,000 or more is a negative binomial with mean 0.2 and variance 0.4, which translates into r = 0.2 and β = 1. The probability of a windstorm greater than 2,000,000 divided by the probability of a windstorm greater than 1,050,000 is S(2,000,000)/S(1,050,000) = e^(−2/1.05)/e^(−1.05/1.05) = e^(−19/21). So the revised β is e^(−19/21). The probability of exactly 1 windstorm is

p1 = rβ/(1 + β)^1.2 = 0.2e^(−19/21)/(1 + e^(−19/21))^1.2 = 0.05383

13.7. Coinsurance (c) does not affect claim frequency, but the deductible does. Only (b − d)/b of losses will result in claims. So the answer is (D).

13.8. The modified Poisson parameter for claims of $100,000 or more is 0.1(5) = 0.5. The probability of no claims over $100,000 in 1 year is e^−0.5, and the probability for three years is e^−1.5 = 0.2231. (B)

13.9. For each driver, the expected number of reported accidents is proportional to S(100) = e^(−100/200). Next year, it will be proportional to e^(−100/240), and there will be 1.1 times as many drivers, so the ratio is

1.1 e^(−100/240)/e^(−100/200) = 1.1956

There will be a 19.56% increase. (B)


13.10. The probability that a scientist has 8000 calories or more is 1/2, so the number of such scientists eaten is a binomial distribution with modified q = 0.3. Then

28 (0.3^2)(0.7^6) = 0.296475  (D)

If exactly two such scientists are eaten, it is automatically true that at least two scientists were eaten, so the fact that at least two scientists are eaten does not add any additional condition.

13.11. Since both frequency and severity pertain to payments rather than to losses, this situation is equivalent to an insurance coverage having no deductible originally, having the same frequency and severity distributions for ground-up losses, and then imposing a 500 deductible. The probability that a payment is greater than 500 under the original 500 deductible is (see the distribution tables for the definition of u)

S(500) = u^α = (1/(1 + (500/θ)^α))^α = (1/(1 + (500/800)^3))^3 = 0.803768^3 = 0.519268

Therefore, the revised frequency for a 1000 deductible has m = 10 and q = 0.05(0.519268) = 0.025963. The revised probability of one payment is

Pr(N = 1) = 10 (0.025963)(1 − 0.025963)^9 = 0.2049

13.12. Let primes indicate values modified for a deductible of x. Then

p0′ = (1/(1 + β′))^2 = 0.390625
1/(1 + β′) = 0.625
β′ = 1/0.625 − 1 = 0.6

Since β′/β = 0.6/0.8 = 0.75, the probability that a loss is above x is 0.75 times the probability that it is over 500. Under the uniform distribution, S(x) = 1 − x/2500, so S(500) = 0.8 and S(x) = 0.6. We conclude that x = 1000.

13.13. Let's calculate the probability that a loss is above 100 in 2010 and the probability that a loss is above 150 in 2011. For the latter, remember that to scale a lognormal distribution, you add ln(1 + r) to µ, as mentioned on page 29. In the following, primes indicate inflated variables.

Pr(X > 100) = 1 − Φ((ln 100 − 4)/2) = 1 − Φ(0.30) = 0.3821
Pr(X′ > 150) = 1 − Φ((ln 150 − ln 1.1 − 4)/2) = 1 − Φ(0.46) = 0.3228

The relative probability is 0.3228/0.3821 = 0.844805. We multiply β by this number, resulting in β′ = 0.844805. Then the probabilities of 0, 1, and 2 are

p0 = (1/1.844805)^1.5 = 0.399093
p1 = 1.5 (1/1.844805)^1.5 (0.844805/1.844805) = 0.274139
p2 = ((1.5)(2.5)/2) (1/1.844805)^1.5 (0.844805/1.844805)^2 = 0.156923

The probability of 3 or more nonzero payments is 1 − 0.399093 − 0.274139 − 0.156923 = 0.1698.

13.14. We want r(vβ)(1 + vβ) = 0.2. Let's solve for v.

0.4v(1 + 0.4v) = 0.4
v(1 + 0.4v) = 1
0.4v^2 + v − 1 = 0
v = (−1 + √(1 + 1.6))/0.8 = 0.765564

Now we calculate the deductible d such that the probability of a loss above d is 0.765564 times the current probability. For the single-parameter Pareto,

v = (100/d)/(100/500) = 500/d
d = 500/v = 500/0.765564 = 653.11

13.15. Counting only claims of size 2 is a severity modification. Thus we multiply β by p2 = 1/4 to obtain β′ = (1/2)(1/4) = 1/8. The variance of the modified negative binomial distribution is

rβ′(1 + β′) = 5 (1/8)(9/8) = 45/64  (D)

13.16. The probability of a loss greater than 250 is (100/250)^0.5 = 0.632456. The modified parameter of the Poisson is λ* = 0.632456(0.2) = 0.126491. The modified value of the probability of at least one payment is

1 − p0^{M*} = (1 − p0^M)(1 − p0^*)/(1 − p0) = 0.6 (1 − e^−0.126491)/(1 − e^−0.2) = 0.393287

The probability of one claim is

Pr(N* = 1) = 0.393287 p1^{T*} = 0.393287 (λ*/(e^{λ*} − 1)) = 0.393287 (0.126491/(e^0.126491 − 1)) = 0.3689

13.17. The probability of a loss greater than 300 is

Pr(X > 300) = 1 − (300/1000)^0.2/(1 + (300/1000)^0.2) = 0.559909

so the revised β is β* = 0.5(0.559909) = 0.279955. Thus the revised p0^M is calculated from the following:

p0 = (1/1.5)^2 = 0.444444
p0^* = (1/1.279955)^2 = 0.610395
1 − p0^{M*} = (1 − 0.25)(1 − 0.610395)/(1 − 0.444444) = 0.525967
p0^{M*} = 1 − 0.525967 = 0.474033

13.18. Let's calculate the probabilities that a loss is above 250 and 300. For the inverse exponential,

Pr(X > 250) = 1 − e^(−100/250) = 0.329680
Pr(X > 300) = 1 − e^(−100/300) = 0.283469

The revised β is β* = 4(0.283469/0.329680) = 3.439320. The mean of a zero-truncated geometric with β = 4 is β + 1 = 5. The mean of our geometric is 2, which is 1 − p0^M times the mean of a zero-truncated geometric. It follows that 1 − p0^M = 0.4. By formula (13.1), the revised probability that the frequency is nonzero is

p0 = 1/5 = 0.2
p0^* = 1/(1 + 3.439320) = 0.225260
1 − p0^{M*} = 0.4 (1 − 0.225260)/(1 − 0.2) = 0.387370

For a geometric distribution, Pr(N ≥ n) = (β/(1 + β))^n (see formula (11.1)). A zero-truncated geometric distribution is a geometric distribution shifted 1, so for zero-truncated N^T, and n ≥ 1, Pr(N^T ≥ n) = (β/(1 + β))^(n−1). For a zero-modified distribution, these probabilities are multiplied by 1 − p0^M. Therefore, in our case, the probability of 3 or more nonzero payments is

Pr(N^{M*} ≥ 3) = (1 − p0^{M*})(β*/(1 + β*))^2 = (0.387370)(3.439320/4.439320)^2 = 0.2325
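The chain of steps in this solution (severity scaling, formula (13.1), then the geometric tail) can be verified numerically. This is a sketch with my own variable names, not the manual's notation.

```python
import math

# Numeric check of the solution to exercise 13.18 (a sketch; names are mine).
theta = 100

def inv_exp_sf(x):
    """Inverse exponential survival function: Pr(X > x) = 1 - exp(-theta/x)."""
    return 1 - math.exp(-theta / x)

beta_star = 4 * inv_exp_sf(300) / inv_exp_sf(250)   # revised beta
one_minus_p0M = 2 / (4 + 1)        # mean 2 = (1 - p0M)(beta + 1) with beta = 4
p0 = 1 / (1 + 4)                   # geometric p0 for beta = 4
p0_star = 1 / (1 + beta_star)
one_minus_p0M_star = one_minus_p0M * (1 - p0_star) / (1 - p0)
prob_ge_3 = one_minus_p0M_star * (beta_star / (1 + beta_star)) ** 2
print(round(prob_ge_3, 4))         # 0.2325
```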

13.19. The probability of exactly one loss above 500 is the sum of the probabilities of:

1. 1 loss above 200 which is also above 500.
2. 2 losses above 200, one below 500 and one above 500.
3. 3 losses above 200, two below 500 and one above 500.

Using the tables for an inverse Pareto, the probability that a loss is greater than 500 given that it is greater than 200 is

Pr(X > 500)/Pr(X > 200) = (1 − (500/800)^2)/(1 − (200/500)^2) = 0.609375/0.84 = 0.725446

Now we'll evaluate the probabilities of the three cases enumerated above. Let N1 be the number of losses above 200 and N2 the number of losses above 500.

Pr(N1 = 1 & N2 = 1) = 0.30 (0.725446) = 0.217634
Pr(N1 = 2 & N2 = 1) = 0.05 · 2 (0.725446)(1 − 0.725446) = 0.019917
Pr(N1 = 3 & N2 = 1) = 0.05 · 3 (0.725446)(1 − 0.725446)^2 = 0.008203

The answer, the sum of the three probabilities, is 0.217634 + 0.019917 + 0.008203 = 0.24575.

13.20. Calculate the relative probability of a loss greater than 200, given that it is greater than 100.

Pr(X > 200 | X > 100) = S(200)/S(100) = e^(−0.2^0.5)/e^(−0.1^0.5) = 0.877230

Thus the modified β of the negative binomial is β′ = 0.1(0.877230) = 0.087723. Then the probability of one nonzero payment is

Pr(N = 1) = 3 (1/1.087723)^3 (0.087723/1.087723) = 0.18800

13.21. The probability of a loss greater than 400 given that it is greater than 200 is (1 − 0.4)/(1 − 0.2) = 0.75. Therefore the modified q in the binomial is (0.1)(0.75) = 0.075, and the variance of the modified distribution is 10(0.075)(0.925) = 0.69375.

13.22. The probability of a loss greater than 500 given that it is greater than 200 is (1 − 0.5)/(1 − 0.2) = 0.625. Thus the revised β is 2(0.625) = 1.25. For a logarithmic distribution, use formula (13.2):

1 − p0^{M*} = (1 − p0^M) ln(1 + vβ)/ln(1 + β) = 0.4 (ln 2.25/ln 3) = 0.295256
p0^{M*} = 0.704744

Quiz Solutions

13-1. The probability of a loss over 500 in 2009 is 1 − e^(−1000/500) = 0.864665. In 2010, the new θ = 1.1(1000) = 1100, and the probability of a loss over 500 is 1 − e^(−1100/500) = 0.889197. The original q for the binomial distribution of paid claims was 0.5/5 = 0.1. Therefore, the q for the inflated binomial distribution of paid claims is

0.1 (0.889197/0.864665) = 0.102837

and the variance of the inflated distribution of paid claims is 5(0.102837)(1 − 0.102837) = 0.461308.


Lesson 14

Aggregate Loss Models: Compound Variance

Reading: Loss Models Fourth Edition 9.1–9.3, 9.5, 9.8.1

14.1 Introduction

Aggregate losses are the total losses paid by an insurer for a defined set of insureds in one period, say a year. They are the sum of the individual losses for the year. There are two ways to model aggregate losses.

One way is to consider only the number of claims and the size of each claim. In other words, the size of the group is relevant only to the extent it affects the number of claims. In this model, aggregate losses S can be expressed as

S = Σ_{i=1}^N X_i

where N is the number of claims and X_i is the size of each claim. We make the following assumptions:

1. The X_i's are independent identically distributed random variables. In other words, every claim size has the same probability distribution and is independent of any other claim size.
2. The X_i's are independent of N. The claim counts are independent of the claim sizes.

This model is called the collective risk model. S is a compound distribution: a distribution formed by summing up a random number of identical random variables. For a compound distribution, N is called the primary distribution and X is called the secondary distribution.

The alternative is to let n be the number of insureds in the group, and X_i be the aggregate claims of each individual member. We assume

1. The X_i's are independent, but not necessarily identically distributed random variables. Different insureds could have different distributions of aggregate losses. Typically, Pr(X_i = 0) > 0, since an insured may not submit any claims. This is unlike the collective risk model, where X_i is a claim and therefore not equal to 0.
2. There is no random variable N. Instead, n is a fixed number, the size of the group.

The equation for aggregate losses is then

S = Σ_{i=1}^n X_i

This model is called the individual risk model. For certain severity distributions, the individual risk model's aggregate losses have a familiar parametric distribution. For example, if the X_i's are all exponential with the same θ, then S has a gamma distribution with parameters n and θ. Calculating the distribution function for the aggregate distribution


Proof of Compound Variance Formula

Condition S on the number of claims N. By equation (4.2), the conditional variance formula,

Var(S) = E_N[Var_S(S | N)] + Var_N(E_S[S | N])
       = E_N[Var(Σ_{i=1}^N X_i | N)] + Var_N(E[Σ_{i=1}^N X_i | N])

By mutual independence of the X_i, we can calculate the variance of the sum as the sum of the variances. We can always calculate the expected value of the sum as the sum of the expected values. We will drop the subscript on X since the X_i are identically distributed, and we will drop the condition on N, since the X_i's are independent of N.

E_N[Var(Σ_{i=1}^N X_i | N)] + Var_N(E[Σ_{i=1}^N X_i | N]) = E_N[Σ_{i=1}^N Var(X)] + Var_N(Σ_{i=1}^N E[X])
       = E_N[N Var(X)] + Var_N(N E[X])

Since Var(X) and E[X] are constants, they can be factored from the expression. When factoring a constant from a variance, it gets squared.

E_N[N Var(X)] + Var_N(N E[X]) = E[N] Var(X) + Var(N) E[X]^2

We're done.

in a collective risk model, on the other hand, is difficult. An alternative is to use an approximating distribution. In order to do so, we need the mean and variance of the aggregate loss distribution. We proceed to discuss how to calculate them.

14.2 Compound variance

This topic appears frequently on exams, and it's easy. Assume we have a collective risk model. We assume that aggregate losses have a compound distribution, with frequency being the primary distribution and severity being the secondary distribution. If N is the frequency random variable, X the severity random variable, and S = Σ_{n=1}^N X_n, and the X_n's are identically distributed and independent of each other and of N, then

E[S] = E[N] E[X]    (14.1)
Var(S) = E[N] Var(X) + Var(N) E[X]^2    (14.2)
E[(S − E[S])^3] = E[N] E[(X − E[X])^3] + 3 Var(N) E[X] Var(X) + E[(N − E[N])^3] E[X]^3    (14.3)

A proof of equation (14.2) is given in a sidebar. Equation (14.2) is important, so let's repeat it:

Compound Variance Formula

Var(S) = E[N] Var(X) + Var(N) E[X]^2    (14.2)

For a compound Poisson distribution, one where the primary distribution is Poisson with parameter λ, the compound variance formula reduces to

Compound Variance Formula for Poisson Primary

Var(S) = λ E[X^2]    (14.4)

Equation (14.3), which enables calculating skewness, is unlikely to be required on an exam, especially since the formulas for the third moments of discrete distributions are not in the Loss Models appendix.

Example 14A For a group of 100 insureds, the number of losses per insured follows a negative binomial distribution with r = 3, β = 0.01. Claim sizes follow an inverse gamma distribution with α = 6, θ = 1000. The number of losses is independent of claim sizes, and claim sizes are independent of each other.

Determine the mean and variance of aggregate losses.

Answer: To model frequency of losses on 100 insureds, we will use a negative binomial with r = 300, β = 0.01. This is not strictly necessary; an alternative would be to calculate the mean and variance for one insured, and then multiply them by 100. We have

E[N] = rβ = 300(0.01) = 3
Var(N) = rβ(1 + β) = 300(0.01)(1.01) = 3.03
E[X] = θ/(α − 1) = 1000/5 = 200
E[X^2] = θ^2/((α − 1)(α − 2)) = 1000^2/((5)(4)) = 50,000
Var(X) = 50,000 − 200^2 = 10,000
E[S] = E[N] E[X] = (3)(200) = 600
Var(S) = E[N] Var(X) + Var(N) E[X]^2 = (3)(10,000) + (3.03)(200^2) = 151,200



In Lesson 9, we mentioned that we can sometimes use the compound variance formula to calculate the variance of a coverage with a deductible. To use it, we set the frequency random variable to be Bernoulli with probability equal to the probability of a claim being greater than 0, and the severity random variable is the loss variable left-shifted and truncated at the deductible, or the payment amount for nonzero payments. This method is especially useful if severity is exponential, since left-shifting has no effect on the memoryless exponential distribution or its variance.

Example 14B (Repeat of Example 9B) The loss severity random variable follows an exponential distribution with mean 1000. A coverage for this loss has a deductible of 500.

Calculate the variance of the payment per loss random variable.

Answer: Let X be the loss random variable and Y^L the payment per loss random variable. Also let p = Pr(X > 500). Then Y^L is a compound distribution with primary Bernoulli with parameter p and secondary exponential with mean 1000, since the excess payment per payment is exponential with mean 1000. The variance of Y^L is therefore

Var(Y^L) = p Var(X) + p(1 − p) E[X]^2 = e^−0.5 (1000^2) + e^−0.5 (1 − e^−0.5)(1000^2) = 845,182
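The Bernoulli-primary trick in Example 14B can be verified numerically, exploiting the memoryless property of the exponential. This is an illustrative sketch; the names are mine.

```python
import math

# Numeric check of Example 14B: payment per loss as a Bernoulli/exponential
# compound distribution (a sketch; names are mine).
p = math.exp(-500 / 1000)          # Pr(X > 500) for an exponential with mean 1000
EX, VarX = 1000, 1000 ** 2         # memoryless: excess severity is unchanged
VarYL = p * VarX + p * (1 - p) * EX ** 2   # Bernoulli: E[N] = p, Var(N) = p(1-p)
print(round(VarYL))                # 845182
```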





The following example goes further and uses what we learned in Lesson 13 as well.

Example 14C [1999 C3 Sample:20] You are given:

• An insured's claim severity distribution is described by an exponential distribution: F(x) = 1 − e^(−x/1000).
• The insured's number of claims is described by a negative binomial distribution with β = 2 and r = 2.
• A 500 per claim deductible is in effect.

Calculate the standard deviation of the aggregate losses in excess of the deductible.

Answer: In this question, the word "claim" is used synonymously with "loss", and is before the deductible is removed. The frequency of losses has a negative binomial distribution with β = 2 and r = 2. However, only losses above 500 are paid. The probability that a loss will be paid is e^(−500/1000) = e^−0.5. The distribution of the number of paid claims is then negative binomial with parameters β = 2e^−0.5 and r = 2. The distribution of individual losses above the deductible is exponential with mean 1000, since the exponential distribution is memoryless. Let N be the distribution of paid claims, and X the distribution of individual losses above the deductible. Then

E[X] = 1000
Var(X) = 1000^2
E[N] = 2(2e^−0.5) = 2.42612
Var(N) = 2(2e^−0.5)(1 + 2e^−0.5) = 5.36916
Var(S) = 2.42612(1000^2) + 5.36916(1000^2) = 7,795,281

The standard deviation of S is √7,795,281 = 2792.



The compound variance formula can only be used when N and the X_i are independent. If N | θ and X | θ are conditionally independent, then they are probably not unconditionally independent. However, the compound variance formula may be used on N | θ and X | θ to evaluate Var(S | θ), and then the conditional variance formula can be used to evaluate Var(S). The next example illustrates this.

Example 14D You are given:

(i) Claim counts, N, follow a Poisson distribution with mean 2.
(ii) Claim sizes, X, are exponential with mean θ, and are independent given θ.
(iii) θ varies by insured, and is uniform on [0, 12].
(iv) Claim counts and claim sizes are independent.

Calculate the variance of aggregate losses.

Wrong answer:

Var(S) = E[N] Var(X) + Var(N) E[X]^2 = 2(Var(X) + E[X]^2)
E[X] = E[E[X | θ]] = 6
Var(X) = E[Var(X | θ)] + Var(E[X | θ]) = E[θ^2] + Var(θ) = 12^2/3 + 12^2/12 = 60

because for a uniform distribution, the second moment is the range squared over 3 and the variance is the range squared over 12.

Everything above is correct except for the first line. That formula may not be used, since X is only independent of N given θ.

C/4 Study Manual—17th edition Copyright ©2014 ASM

The correct answer is:

Answer: First use the conditional variance formula.

Var(S) = E[Var(S | θ)] + Var(E[S | θ])
E[S | θ] = 2θ
Var(S | θ) = E[N | θ] Var(X | θ) + Var(N | θ) E[X | θ]² = 2θ² + 2θ² = 4θ²

This is the right place to use the compound variance formula. Then

Var(S) = E[4θ²] + Var(2θ) = 4(12²/3) + 4(12²/12) = 192 + 48 = 240



If frequency and severity are conditioned on a parameter that varies by insured, do not use the compound variance formula on the overall compound distribution. You may use the compound variance formula on the conditional distribution, and then use the conditional variance formula to calculate the overall variance.
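As a numeric check of Example 14D, the two-step calculation can be reproduced with exact rational arithmetic (a sketch; the variable names are mine):

```python
from fractions import Fraction

# N | θ ~ Poisson(2), X | θ ~ Exponential(θ), θ ~ Uniform(0, 12).
# Conditionally, Var(S | θ) = 2θ² + 2θ² = 4θ² and E[S | θ] = 2θ.
b = 12
E_theta2 = Fraction(b**2, 3)    # E[θ²] for Uniform(0, b)
Var_theta = Fraction(b**2, 12)  # Var(θ) for Uniform(0, b)

VarS = 4 * E_theta2 + 4 * Var_theta  # E[Var(S | θ)] + Var(E[S | θ])
print(VarS)  # 240
```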


Quiz 14-1 The number of losses on an automobile comprehensive coverage has the following distribution:

Number of losses    Probability
       0                0.4
       1                0.3
       2                0.2
       3                0.1

Loss sizes follow a Pareto distribution with parameters α = 5 and θ = 1200 and are independent of loss counts and of each other. Calculate the variance of aggregate losses.

Exercises

14.1. The number of claims on a homeowner’s policy has a binomial distribution with parameters m = 3 and q. The parameter q varies by policyholder and has a uniform distribution on [0, 1/2]. Calculate the probability of no claims for a policy.

14.2. The number of claims on an insurance policy has a Poisson distribution with mean λ. λ varies by insured according to a uniform distribution on [0, 3]. Calculate the probability of 2 or more claims for a policy.

Exercises continue on the next page . . .


14.3. [4B-S99:8] (2 points) You are given the following:

(i) Each loss event is either an aircraft loss or a marine loss.
(ii) The number of aircraft losses has a Poisson distribution with a mean of 0.1 per year. Each loss is always 10,000,000.
(iii) The number of marine losses has a Poisson distribution with a mean of 0.2 per year. Each loss is always 20,000,000.
(iv) Aircraft losses occur independently of marine losses.
(v) From the first two events each year, the insurer pays the portion of the combined losses that exceeds 10,000,000.

Determine the insurer’s expected annual payments.

(A) Less than 1,300,000
(B) At least 1,300,000, but less than 1,800,000
(C) At least 1,800,000, but less than 2,300,000
(D) At least 2,300,000, but less than 2,800,000
(E) At least 2,800,000

14.4. [SOA3-F04:32] Bob is a carnival operator of a game in which a player receives a prize worth W = 2^N if the player has N successes, N = 0, 1, 2, 3, . . . . Bob models the probability of success for a player as follows:

(i) N has a Poisson distribution with mean Λ.
(ii) Λ has a uniform distribution on the interval (0, 4).

Calculate E[W].

(A) 5

(B) 7

(C) 9

(D) 11

(E) 13

14.5. [151-81-96:15] (2 points) An insurer issues a portfolio of 100 automobile insurance policies. Of these 100 policies, one-half have a deductible of 10 and the other half have a deductible of zero. The insurance policy pays the amount of damage in excess of the deductible subject to a maximum of 125 per accident. Assume:

(i) the number of automobile accidents per year per policy has a Poisson distribution with mean 0.03; and
(ii) given that an accident occurs, the amount of vehicle damage has the distribution:

     x     Pr(X = x)
    30       1/3
   150       1/3
   200       1/3

Compute the total amount of claims the insurer expects to pay in a single year. (A) 270

(B) 275

(C) 280

(D) 285

(E) 290


14.6. [4B-S92:31] (2 points) You are given that N and X are independent random variables where:

• N is the number of claims, and has a binomial distribution with parameters m = 3 and q = 1/6.
• X is the size of claim and has the following distribution:

Pr(X = 100) = 2/3
Pr(X = 1100) = 1/6
Pr(X = 2100) = 1/6

Determine the coefficient of variation of the aggregate loss distribution.

(A) Less than 1.5
(B) At least 1.5, but less than 2.5
(C) At least 2.5, but less than 3.5
(D) At least 3.5, but less than 4.5
(E) At least 4.5

14.7. [151-81-96:4] (1 point) For an insurance portfolio:

(i) the number of claims has the probability distribution

    n    Pr(N = n)
    0       0.4
    1       0.3
    2       0.2
    3       0.1

(ii) each claim amount has a Poisson distribution with mean 4; and
(iii) the number of claims and claim amounts are mutually independent.

Determine the variance of aggregate claims.

(A) 8    (B) 12    (C) 16    (D) 20    (E) 24

14.8. [151-82-98:13] (2 points) For aggregate claims S = X1 + X2 + · · · + XN, you are given:

(i) X_i has distribution

    x    Pr(X = x)
    1        p
    2      1 − p

(ii) Λ is a Poisson random variable with parameter 1/p;
(iii) given Λ = λ, N is Poisson with parameter λ;
(iv) the number of claims and claim amounts are mutually independent; and
(v) Var(S) = 19/2.

Determine p.

(A) 1/6    (B) 1/5    (C) 1/4    (D) 1/3    (E) 1/2


14.9. [4B-S90:43] (2 points) You are given:

(i) N is a random variable for the claim count with Pr(N = 4) = 1/4, Pr(N = 5) = 1/2, and Pr(N = 6) = 1/4.
(ii) X is a random variable for claim severity with probability density function f(x) = 3x^(−4) for 1 ≤ x < ∞.

Determine the coefficient of variation, R, of the aggregate loss distribution, assuming that claim severity and frequency are independent.

(A) R < 0.35
(B) 0.35 ≤ R < 0.50
(C) 0.50 ≤ R < 0.65
(D) 0.65 ≤ R < 0.70
(E) 0.70 ≤ R

14.10. [151-82-93:7] (2 points) For an insured, Y is the total time spent in the hospital in a year. The distribution of the number of hospital admissions in a year is:

Number of Admissions    Probability
         0                 0.60
         1                 0.30
         2                 0.10

The distribution of the length of stay for each admission is gamma with α = 1 and θ = 5. Determine the variance of Y.

(A) 20    (B) 24    (C) 28    (D) 32    (E) 36

14.11. For an auto bodily injury coverage, the number of accidents each year has a binomial distribution with parameters m = 5, q = 0.02. The number of people injured per accident has the following distribution:

P(0) = 0.7
P(1) = 0.2
P(2) = 0.1

The number of people injured in an accident is independent of the number of accidents. The loss size for each person injured has a lognormal distribution with parameters µ = 10, σ = 2. Loss size is independent of the number of people injured. Calculate the variance in aggregate losses for a year.


14.12. [4-F02:36] You are given:

Number of Claims    Probability
       0               1/5
       1               3/5
       2               1/5

If there is 1 claim, the claim size is 25 with probability 1/3 or 150 with probability 2/3. If there are 2 claims, each claim size is 50 with probability 2/3 or 200 with probability 1/3. Claim sizes are independent.

Determine the variance of the aggregate loss.

(A) 4,050    (B) 8,100    (C) 10,500    (D) 12,510    (E) 15,612

14.13. [3-S00:19] An insurance company sold 300 fire insurance policies as follows:

Number of    Policy     Probability of
Policies     Maximum    Claims Per Policy
  100          400          0.05
  200          300          0.06

You are given:
(i) The claim amount for each policy is uniformly distributed between 0 and the policy maximum.
(ii) The probability of more than one claim per policy is 0.
(iii) Claim occurrences are independent.

Calculate the variance of the aggregate claims.

(A) 150,000    (B) 300,000    (C) 450,000    (D) 600,000    (E) 750,000

14.14. [3-F00:8] The number of claims, N, made on an insurance portfolio follows the following distribution:

    n    Pr(N = n)
    0       0.7
    2       0.2
    3       0.1

If a claim occurs, the benefit is 0 or 10 with probability 0.8 and 0.2, respectively. The number of claims and the benefit for each claim are independent. Calculate the probability that aggregate benefits will exceed expected benefits by more than 2 standard deviations.

(A) 0.02    (B) 0.05    (C) 0.07    (D) 0.09    (E) 0.12


14.15. [3-S01:29] You are the producer of a television quiz show that gives cash prizes. The number of prizes, N, and the prize amounts, X, have the following distributions:

    n    Pr(N = n)        x     Pr(X = x)
    1       0.8           0        0.2
    2       0.2          100       0.7
                        1000       0.1

Your budget for prizes equals the expected prizes plus the standard deviation of prizes. Calculate your budget.

(A) 306    (B) 316    (C) 416    (D) 510    (E) 518

14.16. [3-S01:36] The number of accidents follows a Poisson distribution with mean 12. Each accident generates 1, 2, or 3 claimants with probabilities 1/2, 1/3, 1/6, respectively. Calculate the variance in the total number of claimants.

(A) 20

(B) 25

(C) 30

(D) 35

(E) 40

14.17. [CAS3-F04:31] The mean annual number of claims is 103 for a group of 10,000 insureds. The individual losses have an observed mean and standard deviation of 6,382 and 1,781, respectively. The standard deviation of the aggregate claims is 22,874. Calculate the standard deviation for the annual number of claims. (A) 1.47

(B) 2.17

(C) 4.72

(D) 21.73

(E) 47.23

14.18. For an insurance coverage, you are given:

(i) Claim frequency follows a geometric distribution with mean 0.15.
(ii) Claim severity follows a distribution that is a mixture of two lognormal distributions, the first with parameters µ = 3 and σ = 1 and the second with parameters µ = 5 and σ = 2. The first distribution is given 70% weight.
(iii) Claim frequency and severity are independent.

Calculate the variance of aggregate claims.

14.19. [CAS3-S04:22] An actuary determines that claim counts follow a negative binomial distribution with unknown β and r. It is also determined that individual claim amounts are independent and identically distributed with mean 700 and variance 1,300. Aggregate losses have mean 48,000 and variance 80 million. Calculate the values for β and r.

(A) β = 1.20, r = 57.19
(B) β = 1.38, r = 49.75
(C) β = 2.38, r = 28.83
(D) β = 1,663.81, r = 0.04
(E) β = 1,664.81, r = 0.04


14.20. [CAS3-F03:24] Zoom Buy Tire Store, a nationwide chain of retail tire stores, sells 2,000,000 tires per year of various sizes and models. Zoom Buy offers the following road hazard warranty: “If a tire sold by us is irreparably damaged in the first year after purchase, we’ll replace it free, regardless of the cause.” The average annual cost of honoring this warranty is \$10,000,000, with a standard deviation of \$40,000. Individual claim counts follow a binomial distribution, and the average cost to replace a tire is \$100. All tires are equally likely to fail in the first year, and tire failures are independent. Calculate the standard deviation of the replacement cost per tire.

(A) Less than \$60
(B) At least \$60, but less than \$65
(C) At least \$65, but less than \$70
(D) At least \$70, but less than \$75
(E) At least \$75

14.21. [CAS3-F03:25] Daily claim counts are modeled by the negative binomial distribution with mean 8 and variance 15. Severities have mean 100 and variance 40,000. Severities are independent of each other and of the number of claims. Let σ be the standard deviation of a day’s aggregate losses. On a certain day, 13 claims occurred, but you have no knowledge of their severities. Let σ′ be the standard deviation of that day’s aggregate losses, given that 13 claims occurred. Calculate σ/σ′ − 1.

(A) Less than −7.5%
(B) At least −7.5%, but less than 0
(C) 0
(D) More than 0, but less than 7.5%
(E) At least 7.5%

14.22. [3-F00:21] A claim severity distribution is exponential with mean 1000. An insurance company will pay the amount of each claim in excess of a deductible of 100. Calculate the variance of the amount paid by the insurance company for one claim, including the possibility that the amount paid is 0. (A) 810,000

(B) 860,000

(C) 900,000

(D) 990,000

(E) 1,000,000

14.23. Loss sizes for an insurance coverage, before taking any deductible into account, are uniformly distributed on [0, 100]. Coverage is subject to an ordinary deductible of 5. Calculate the variance of payments after the deductible, taking into account payments of 0 on losses at or below the deductible.

14.24. Loss sizes for an insurance coverage, before taking any deductible into account, are exponentially distributed with a mean of 50. Coverage is subject to an ordinary deductible of 5. Calculate the variance of payments after the deductible, taking into account payments of 0 on losses at or below the deductible.


14.25. [151-81-96:7] (2 points) For a certain insurance, individual losses in 1994 were uniformly distributed over (0, 1000) . A deductible of 100 is applied to each loss. In 1995, individual losses have increased 5%, and are still uniformly distributed. A deductible of 100 is still applied to each loss. Determine the percentage increase in the standard deviation of amount paid per loss. (A) 5.00%

(B) 5.25%

(C) 5.50%

(D) 5.75%

(E) 6.00%

14.26. [4-F01:29] In order to simplify an actuarial analysis Actuary A uses an aggregate distribution S = X1 + · · · + XN, where N has a Poisson distribution with mean 10 and X_i = 1.5 for all i. Actuary A’s work is criticized because the actual severity distribution is given by

Pr(Y_i = 1) = Pr(Y_i = 2) = 0.5,    for all i,

where the Y_i’s are independent. Actuary A counters this criticism by claiming that the correlation coefficient between S and S* = Y1 + · · · + YN is high. Calculate the correlation coefficient between S and S*.

(A) 0.75

(B) 0.80

(C) 0.85

(D) 0.90

(E) 0.95

Use the following information for questions 14.27 and 14.28:

The probability function of claims per year for an individual risk is Poisson with a mean of 0.10. There are four types of claims. The annual number of claims of each type has a Poisson distribution. The table below describes the characteristics of the four types of claims. Mean frequency indicates the average number of claims of that type per year.

                                      Severity
Type of Claim    Mean Frequency    Mean      Variance
     W                0.02           200        2,500
     X                0.03         1,000    1,000,000
     Y                0.04           100            0
     Z                0.01         1,500    2,000,000

You are also given:

• Claim sizes and claim counts are independent for each type of claim.
• Claim sizes are independent of each other.

14.27. Calculate the variance of a single claim whose type is unknown.

14.28. [4B-S91:26] (2 points) Calculate the variance of annual aggregate losses.

(A) Less than 70,000
(B) At least 70,000, but less than 80,000
(C) At least 80,000, but less than 90,000
(D) At least 90,000, but less than 100,000
(E) At least 100,000


14.29. For an insurance coverage, there are four types of policyholder: W, X, Y, and Z. For each type of policyholder, the number of claims per year has a Poisson distribution with a mean of 0.10. The probability that a policyholder is of a specific type and the mean and variance of claim severity for each type of policyholder are described in the following table:

Policyholder                    Severity
   Type       Probability    Mean      Variance
    W            0.2           200        2,500
    X            0.3         1,000    1,000,000
    Y            0.4           100            0
    Z            0.1         1,500    2,000,000

Calculate the variance of aggregate losses for a policyholder whose type is unknown.

Additional released exam questions: SOA M-S05:17,31, CAS3-S05:7,8,9, SOA M-F05:38,39, CAS3-F06:29

Solutions

14.1. If N is the number of claims,

Pr(N = 0) = 2 ∫_0^(1/2) (1 − q)³ dq = −0.5(1 − q)⁴ |_0^(1/2) = 0.5(1 − (1/2)⁴) = 15/32

14.2. Let N be the number of claims.

Pr(N = 0) = (1/3) ∫_0^3 e^(−λ) dλ = (1/3)(1 − e^(−3)) = 0.3167
Pr(N = 1) = (1/3) ∫_0^3 λe^(−λ) dλ
          = (1/3)(−λe^(−λ) |_0^3 + ∫_0^3 e^(−λ) dλ)
          = (1/3)(−3e^(−3) + 1 − e^(−3))
          = (1/3)(1 − 4e^(−3)) = 0.2670
Pr(N ≥ 2) = 1 − 0.3167 − 0.2670 = 0.4163

14.3. Losses are a mixture distribution with a 1/3 weight on aircraft losses and a 2/3 weight on marine losses. The expected value of each loss is (1/3)(10,000,000) + (2/3)(20,000,000) = 16,666,666 2/3. If there is one loss, the insurer’s expected annual payment is 6,666,666 2/3. If there are two or more losses, the insurer’s expected annual payment is 2(16,666,666 2/3) − 10,000,000 = 23,333,333 1/3. The number of losses is Poisson with λ = 0.3. Let p_n be the probability of n losses. Expected annual losses are

p_1 (6,666,666 2/3) + (1 − p_0 − p_1)(23,333,333 1/3)

with p_1 = 0.3e^(−0.3) = 0.222245 and 1 − p_0 − p_1 = 1 − e^(−0.3) − 0.3e^(−0.3) = 1 − 1.3e^(−0.3) = 0.036936. Expected annual payments are

0.222245(6,666,666 2/3) + 0.036936(23,333,333 1/3) = 2,343,484  (D)

14.4. We use the conditional expectation formula:

E[W] = E[2^N] = E_Λ[E[2^N | Λ]]

The inner expectation, E[2^N | Λ], is the probability generating function evaluated at 2. From the tables, the pgf of a Poisson with mean Λ is P(z) = e^(Λ(z−1)), so P(2) = e^Λ. The expectation of W is therefore

E[W] = E[e^Λ] = 0.25 ∫_0^4 e^λ dλ = 0.25(e^4 − 1) = 0.25(53.5982) = 13.3995  (E)
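The pgf shortcut in solution 14.4 reduces to a closed form, E[W] = (e^4 − 1)/4, which is quick to confirm (a sketch; the variable name is mine):

```python
import math

# E[2^N | Λ] = P_N(2) = e^Λ for a Poisson pgf, so E[W] = E[e^Λ]
# with Λ ~ Uniform(0, 4): (1/4) times the integral of e^λ from 0 to 4.
EW = (math.exp(4) - 1) / 4
print(round(EW, 4))  # 13.3995
```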

14.5. We can calculate the expectation for the two halves (with and without the deductible) separately. For each of the 50 policies without a deductible, E[X] = 30(1/3) + 125(2/3) = 280/3. The way the question is phrased, the policies with the deductible can get as much as 125; 125 is the policy limit, not the maximum covered loss. So for policies with a deductible, E[X] = 20(1/3) + 125(2/3) = 90. We then add everything up:

50(0.03)(280/3) + 50(0.03)(90) = 275  (B)

14.6. Let S be the aggregate loss random variable. Then

E[N] = mq = 3(1/6) = 1/2
Var(N) = mq(1 − q) = (1/2)(5/6) = 5/12
E[X] = (2/3)(100) + (1/6)(1100 + 2100) = 600
Var(X) = (2/3)(100 − 600)² + (1/6)((1100 − 600)² + (2100 − 600)²) = 1,750,000/3

Using the compound variance formula, equation (14.2),

E[S] = E[N] E[X] = (1/2)(600) = 300
Var(S) = E[N] Var(X) + Var(N) E[X]² = (1/2)(1,750,000/3) + (5/12)(360,000) = 5,300,000/12
√Var(S) / E[S] = √(5,300,000/12) / 300 = 2.2153  (B)
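Solution 14.6 can be checked with exact fractions (a sketch; the variable names are mine):

```python
import math
from fractions import Fraction

m, q = 3, Fraction(1, 6)
EN, VarN = m * q, m * q * (1 - q)

sizes = {100: Fraction(2, 3), 1100: Fraction(1, 6), 2100: Fraction(1, 6)}
EX = sum(x * p for x, p in sizes.items())
VarX = sum((x - EX)**2 * p for x, p in sizes.items())

ES = EN * EX
VarS = EN * VarX + VarN * EX**2   # compound variance formula
cv = math.sqrt(VarS) / ES         # coefficient of variation
print(round(cv, 4))               # 2.2153
```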


14.7. The only tricky thing here is that the secondary, rather than the primary, distribution is Poisson. We calculate

E[N] = 0.3(1) + 0.2(2) + 0.1(3) = 1
E[N²] = 0.3(1²) + 0.2(2²) + 0.1(3²) = 2
Var(N) = 2 − 1² = 1
Var(S) = E[N] Var(X) + Var(N) E[X]² = 1(4) + 1(4²) = 20  (D)

14.8. In this exercise, the frequency distribution is a compound distribution. For the frequency distribution,

E[N] = E[E[N | Λ]] = E[Λ] = 1/p

We evaluate variance using the conditional variance formula:

Var(N) = Var(E[N | Λ]) + E[Var(N | Λ)] = Var(Λ) + E[Λ] = 1/p + 1/p = 2/p

The severity distribution has mean (p)(1) + (1 − p)(2) = 2 − p. It is Bernoulli (shifted by 1), so its variance is p(1 − p). Frequency and severity are unconditionally independent, so we can use the compound variance formula.

19/2 = Var(S) = (1/p)(p)(1 − p) + (2/p)(2 − p)²
19p = 2p − 2p² + 16 − 16p + 4p²
2p² − 33p + 16 = 0
(2p − 1)(p − 16) = 0

p = 16 is impossible, leaving the other possibility, p = 1/2.  (E)

14.9. N is a binomial variable, shifted 4. Shifting does not affect variance.

E[N] = 5
Var(N) = 2(1/2)(1/2) = 1/2

X has a single-parameter Pareto distribution with α = 3, θ = 1.

E[X] = 3/2
E[X²] = 3
Var(X) = 3 − (3/2)² = 3/4
E[S] = 5(3/2) = 15/2
Var(S) = 5(3/4) + (1/2)(3/2)² = 39/8
R = √(39/8) / (15/2) = 0.2944  (A)
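The quadratic at the end of solution 14.8 can be double-checked directly (a sketch; the variable names are mine):

```python
import math

# Roots of 2p² − 33p + 16 = 0 from solution 14.8
a, b, c = 2, -33, 16
disc = math.sqrt(b * b - 4 * a * c)   # √961 = 31
roots = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
print(roots)  # [0.5, 16.0]; only p = 1/2 is a valid probability
```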


14.10. Use the compound variance formula.

E[N] = 0.3(1) + 0.1(2) = 0.5
E[N²] = 0.3(1) + 0.1(4) = 0.7
Var(N) = 0.7 − 0.5² = 0.45
E[X] = 5
Var(X) = 25
Var(Y) = 0.5(25) + 0.45(5²) = 23.75  (B)

14.11. The number of losses is itself a compound distribution with primary distribution (number of accidents, which we’ll call L) binomial and secondary distribution (number of people injured in an accident, which we’ll call M) having the specified discrete distribution. If we let N be the number of people injured per year, then

E[L] = mq = 5(0.02) = 0.1
Var(L) = mq(1 − q) = 5(0.02)(0.98) = 0.098
E[M] = 0.2(1) + 0.1(2) = 0.4
E[N] = E[L] E[M] = (0.1)(0.4) = 0.04
E[M²] = 0.2(1²) + 0.1(2²) = 0.6
Var(M) = 0.6 − 0.4² = 0.44
Var(N) = E[L] Var(M) + Var(L) E[M]² = (0.1)(0.44) + (0.098)(0.4²) = 0.05968

Loss size, X, has the following mean and variance:

E[X] = exp(µ + σ²/2) = e^12
E[X²] = exp(2µ + 2σ²) = e^28
Var(X) = e^28 − (e^12)² = e^28 − e^24

We use the compound variance formula once again to calculate the variance of aggregate losses:

Var(S) = E[N] Var(X) + Var(N) E[X]² = 0.04(e^28 − e^24) + 0.05968(e^24) = 58,371,588,495

14.12. Claim size and number of claims are not independent, so the compound variance formula cannot be used. Either the conditional variance formula (4.2) can be used, or the problem can be done directly by calculating the first and second moments. The official solution does the latter, so we shall do the former so you can look at both solutions and decide which method you prefer. The variance of aggregate claims given the number of claims is, by the Bernoulli shortcut (see Section 3.3 on page 54):

• If there are 0 claims, 0.
• If there is 1 claim, (150 − 25)²(1/3)(2/3) = 2(125²)/9.
• If there are 2 claims, 2(200 − 50)²(2/3)(1/3) = 10,000. (We have to multiply the variance of each claim by 2, since there are 2 claims and we want the variance of aggregate claims, not of claim size.)

The expected value of the variance is

(3/5)(2(125²)/9) + (1/5)(10,000) = 4083 1/3

The average aggregate claims is 0 if zero claims, 325/3 if one claim, and 200 if there are two claims. The variance (no Bernoulli shortcut since there are three possibilities) is calculated by calculating the mean and the second moment. The mean is

(3/5)(325/3) + (1/5)(200) = 105

and the second moment

(3/5)(325/3)² + (1/5)(200²) = 15,041 2/3

so the variance of the means is 15,041 2/3 − 105² = 4016 2/3. So the variance of aggregate losses is 4083 1/3 + 4016 2/3 = 8100.  (B)

14.13. The number of claims is Bernoulli, variance q(1 − q). For a uniform distribution, the mean is the maximum over 2 and the variance is the maximum squared over 12. Therefore, for the first 100 policies, if S is aggregate claims for one policy,

Var(S) = E[N] Var(X) + Var(N) E[X]² = (0.05)(400²/12) + (0.05)(0.95)(200²) = 2566 2/3

and for one policy from the second 200 policies:

Var(S) = (0.06)(300²/12) + (0.06)(0.94)(150²) = 1719

For all 300 policies, add up the variances: 100(2566 2/3) + 200(1719) = 600,467.  (D)

14.14. Let claim size be X and aggregate benefits S. X is Bernoulli.

E[N] = 0.2(2) + 0.1(3) = 0.7
E[N²] = 0.2(4) + 0.1(9) = 1.7
Var(N) = 1.7 − 0.7² = 1.21
E[X] = 2
Var(X) = 10²(0.2)(0.8) = 16
E[S] = (0.7)(2) = 1.4
Var(S) = 0.7(16) + 1.21(2²) = 16.04
E[S] + 2√Var(S) = 9.4100

The only possibility for claims less than 10 is 0, whose probability is

Pr(N = 0) + Pr(N = 2) Pr(X = 0)² + Pr(N = 3) Pr(X = 0)³ = 0.7 + (0.2)(0.64) + (0.1)(0.512) = 0.8792

The probability of non-zero claims is 1 − 0.8792 = 0.1208.  (E)


14.15. We calculate the mean and variance of N and X and use the compound variance formula. For the variance of N we use the Bernoulli shortcut. In the following, S is aggregate prizes.

E[N] = 0.8(1) + 0.2(2) = 1.2
Var(N) = (2 − 1)²(0.8)(0.2) = 0.16
E[X] = 0.7(100) + 0.1(1000) = 170
E[X²] = 0.7(10,000) + 0.1(1,000,000) = 107,000
Var(X) = 107,000 − 170² = 78,100
E[S] = (1.2)(170) = 204
Var(S) = 1.2(78,100) + 0.16(170²) = 98,344
E[S] + σ_S = 204 + √98,344 = 517.60  (E)

14.16. By the variance formula for compound Poisson distributions, equation (14.4), the variance is 12 E[X²], so we just have to calculate the second moment of the severity distribution.

E[X²] = (1/2)(1²) + (1/3)(2²) + (1/6)(3²) = 1/2 + 4/3 + 3/2 = 10/3

Then Var(S) = 12(10/3) = 40.  (E)

14.17. Using the usual N, X, S notation,

22,874² = 103(1,781²) + σ_N²(6,382²)
σ_N² = (22,874² − 103(1,781²)) / 6,382² = 4.8247
σ_N = 2.1965  (B)

2.17 may be funny rounding.

14.18.

E[N] = 0.15
Var(N) = 0.15(1.15) = 0.1725
E[X] = 0.7e^3.5 + 0.3e^7 = 352.17
E[X²] = 0.7e^8 + 0.3e^18 = 19,700,077.41
Var(X) = E[X²] − 352.17² = 19,576,053.70
Var(S) = 0.15(19,576,053.70) + 0.1725(352.17²) = 2,957,802.15

14.19. First of all, 48,000 = 700 E[N], so E[N] = rβ = 480/7. Then

Var(S) = E[N] Var(X) + Var(N) E[X]²
80 × 10⁶ = (480/7)(1,300) + (480/7)(1 + β)(700²)
79,910,857.14 = (480/7)(1 + β)(700²)
1 + β = 79,910,857.14 (7/480)(1/700²) = 2.38

So β = 1.38 and the answer is (B).


14.20. To make the numbers easier to handle, we’ll work them out per tire. Since 2,000,000 tires are sold, letting S be the aggregate cost per tire,

E[S] = 10,000,000 / 2,000,000 = 5
Var(S) = 40,000² / 2,000,000 = 800

Also, since total expected cost is \$10,000,000 at \$100 per tire, an average of 100,000 tires are damaged, or a probability of 100,000/2,000,000 = 0.05 of damage for a single tire. Using the compound variance formula, with X the damage per tire,

800 = (0.05)σ_X² + (0.05)(0.95)(100²)
(0.05)σ_X² = 800 − 475 = 325
σ_X² = 325/0.05 = 6500
σ_X = 80.6226  (E)
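Backing out the per-tire severity standard deviation in solution 14.20 can be scripted as a check (a sketch; the variable names are mine):

```python
import math

n_tires = 2_000_000
ES = 10_000_000 / n_tires            # expected cost per tire = 5
VarS = 40_000**2 / n_tires           # variance per tire = 800
q = ES / 100                         # Bernoulli damage probability = 0.05

# VarS = q*sigma² + q(1 − q)*100², so solve for sigma²
sigma2 = (VarS - q * (1 - q) * 100**2) / q
print(round(math.sqrt(sigma2), 2))   # 80.62
```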

14.21. The variance with no knowledge of claim count is

Var(S) = 8(40,000) + 15(100²) = 470,000

With knowledge of 13 claims, the variance is 13 times the individual variance, or 13(40,000) = 520,000. The ratio of standard deviations is √(47/52) = 0.9507, making the answer 0.9507 − 1 = −0.0493.  (B)

14.22. The best way to do this is to treat it as a compound distribution with a Bernoulli frequency. The frequency of claims over 100 is e^(−100/1000) = 0.904837, and the severity of claims above 100 is exponential with mean 1000 since exponentials are forgetful. By the compound variance formula

Var(S) = 0.904837(1000²) + (0.904837)(1 − 0.904837)(1000²) = 990,944  (D)

14.23. N is the frequency random variable. It is Bernoulli, and equal to 1 if the loss is greater than 5. E[N] = 0.95, Var(N) = (0.95)(0.05) = 0.0475. X is the payment random variable, and it is uniform on [0, 95], so E[X] = 47.5, Var(X) = 95²/12 = 752.0833. Then

Var(S) = 0.95(752.0833) + 0.0475(47.5²) = 821.6510

14.24. Now E[N] = e^(−0.1) = 0.90484 and Var(N) = e^(−0.1)(1 − e^(−0.1)) = 0.08611. X after the deductible is still exponential with mean 50, variance 2500. So

Var(S) = 0.90484(2500) + 0.08611(2500) = 2477.36

14.25. The number of payments per loss, N, is Bernoulli and equals 1 whenever the loss is greater than 100 and 0 otherwise. Payment size given that a payment is made is uniform on (0, 900), or (0, 950) after inflation. Let Y be the payment random variable before inflation, and Y′ after inflation. For a uniform variable, the variance is the range squared divided by 12.

Var(Y) = E[N] Var(Y | N) + Var(N) E[Y | N]² = 0.9(900²/12) + (0.9)(0.1)(450²) = 78,975


Var(Y′) = E[N′] Var(Y′ | N′) + Var(N′) E[Y′ | N′]²
        = (19/21)(950²/12) + (19/21)(2/21)(475²) = 87,487

√(87,487/78,975) − 1 = 0.052513  (B)
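The per-loss variances in solution 14.25 follow the same Bernoulli/compound pattern in both years, so they can be wrapped in a small helper (a sketch; the helper name is mine):

```python
import math
from fractions import Fraction

def per_loss_variance(loss_max, d):
    # Ground-up loss ~ Uniform(0, loss_max), ordinary deductible d.
    p = Fraction(loss_max - d, loss_max)      # Pr(payment > 0)
    mean_pay = Fraction(loss_max - d, 2)      # payment | payment > 0 is uniform
    var_pay = Fraction((loss_max - d)**2, 12)
    return p * var_pay + p * (1 - p) * mean_pay**2

v1994 = per_loss_variance(1000, 100)   # 78,975
v1995 = per_loss_variance(1050, 100)   # about 87,487
increase = math.sqrt(v1995 / v1994) - 1
print(round(increase, 4))              # 0.0525, i.e. about a 5.25% increase
```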

14.26. The correlation coefficient ρ is

ρ = (E[SS*] − E[S] E[S*]) / √(Var(S) Var(S*))

We first calculate the denominator.

Var(S*) = E[N] Var(Y) + Var(N) E[Y]² = 10(0.5²) + 10(1.5²) = 25
Var(S) = 1.5² Var(N) = 22.5
E[S] = E[S*] = 15

For the numerator,

E[SS*] = E[Σ_{i=1}^N Σ_{j=1}^N X_i Y_j]
E[X_i Y_j] = E[X_i] E[Y_j] = (1.5)(1.5) = 2.25

because X_i is constant and therefore independent of any other random variable. So

E[SS*] = E[Σ_{i=1}^N Σ_{j=1}^N 2.25] = 2.25 E[N²] = (2.25)(100 + 10) = 247.5

The last line is because E[N²] = Var(N) + E[N]², and for a Poisson Var(N) = E[N] = 10. So

ρ = (247.5 − 15²) / √((25)(22.5)) = 0.9487  (E)

14.27. We’ll use the conditional variance formula. Given that a claim occurs, the probability that it is type W is 0.02/0.1 = 0.2, the probability that it is type X is 0.03/0.1 = 0.3, the probability that it is type Y is 0.04/0.1 = 0.4, and the probability that it is type Z is 0.01/0.1 = 0.1. Let U be claim size and I be claim type. We would like to calculate

Var(U) = Var_I(E_U[U | I]) + E_I[Var_U(U | I)]

Let’s calculate Var_I(E_U[U | I]).

E_I[E_U[U | I]] = 0.2(200) + 0.3(1,000) + 0.4(100) + 0.1(1,500) = 530
E_I[E_U[U | I]²] = 0.2(200²) + 0.3(1,000²) + 0.4(100²) + 0.1(1,500²) = 537,000
Var_I(E_U[U | I]) = E_I[E_U[U | I]²] − E_I[E_U[U | I]]² = 537,000 − 530² = 256,100

Now let’s calculate E_I[Var_U(U | I)].

E_I[Var_U(U | I)] = 0.2(2,500) + 0.3(1,000,000) + 0.4(0) + 0.1(2,000,000) = 500,500

The variance of a claim U is 256,100 + 500,500 = 756,600.

14.28. Annual aggregate losses are the sum of four compound Poisson distributions. Claim counts and claim sizes are independent, and claim sizes are independent of each other, so we can apply the compound variance formula (14.4) to each type of claim:

Var(W) = 0.02(200² + 2,500) = 850
Var(X) = 0.03(1,000² + 1,000,000) = 60,000
Var(Y) = 0.04(100²) = 400
Var(Z) = 0.01(1,500² + 2,000,000) = 42,500

The variance of annual aggregate losses is the sum, 850 + 60,000 + 400 + 42,500 = 103,750.  (E)

14.29. Unlike in the previous exercise, there are four types of policyholder, not one, and the claim size distributions are not identical for all policyholders; they vary by type of policyholder. We can condition aggregate losses on the type of policyholder since the distribution of aggregate losses is distinct for each type of policyholder. Therefore, we use the conditional variance formula. Let I be the type of policyholder, and let S be aggregate losses.

E[S] = E[E[S | I]] = 0.2(0.1)(200) + 0.3(0.1)(1,000) + 0.4(0.1)(100) + 0.1(0.1)(1,500)
     = 0.2(20) + 0.3(100) + 0.4(10) + 0.1(150) = 53
E[E[S | I]²] = 0.2(20²) + 0.3(100²) + 0.4(10²) + 0.1(150²) = 5,370
Var(E[S | I]) = 5,370 − 53² = 2,561

To calculate the conditional variance for each type of policyholder, we use the Poisson compound variance formula.

Var(S | W) = 0.1(200² + 2,500) = 4,250
Var(S | X) = 0.1(1,000² + 1,000,000) = 200,000
Var(S | Y) = 0.1(100²) = 1,000
Var(S | Z) = 0.1(1,500² + 2,000,000) = 425,000
E[Var(S | I)] = 0.2(4,250) + 0.3(200,000) + 0.4(1,000) + 0.1(425,000) = 103,750

The variance of aggregate losses is then

Var(E[S | I]) + E[Var(S | I)] = 2,561 + 103,750 = 106,311
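Solution 14.29's double application of the conditional variance formula is easy to script as a check (a sketch; the data layout and names are mine):

```python
types = {  # type: (probability, severity mean, severity variance)
    "W": (0.2, 200, 2_500),
    "X": (0.3, 1_000, 1_000_000),
    "Y": (0.4, 100, 0),
    "Z": (0.1, 1_500, 2_000_000),
}
lam = 0.1  # Poisson claim-count mean for every type

ES_given = {t: lam * m for t, (p, m, v) in types.items()}             # E[S | I]
VarS_given = {t: lam * (m**2 + v) for t, (p, m, v) in types.items()}  # λ E[X²]

E_mean = sum(p * ES_given[t] for t, (p, m, v) in types.items())       # 53
E_mean2 = sum(p * ES_given[t]**2 for t, (p, m, v) in types.items())   # 5,370
var_of_means = E_mean2 - E_mean**2                                    # 2,561
mean_of_vars = sum(p * VarS_given[t] for t, (p, m, v) in types.items())  # 103,750

print(round(var_of_means + mean_of_vars))  # 106311
```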


Quiz Solutions

14-1.

E[N] = 0.3 + 0.2(2) + 0.1(3) = 1
Var(N) = 0.4(0 − 1)² + 0.2(2 − 1)² + 0.1(3 − 1)² = 1
E[X] = 1200/4 = 300
Var(X) = 2(1200²)/(4 · 3) − 300² = 150,000
Var(S) = (1)(150,000) + (1)(300²) = 240,000
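The quiz solution can be confirmed with exact arithmetic (a sketch; the variable names are mine):

```python
from fractions import Fraction

# Quiz 14-1: discrete frequency, Pareto(α = 5, θ = 1200) severity.
freq = {0: Fraction(2, 5), 1: Fraction(3, 10), 2: Fraction(1, 5), 3: Fraction(1, 10)}
EN = sum(n * p for n, p in freq.items())
VarN = sum(n**2 * p for n, p in freq.items()) - EN**2

alpha, theta = 5, 1200
EX = Fraction(theta, alpha - 1)                           # θ/(α − 1) = 300
EX2 = Fraction(2 * theta**2, (alpha - 1) * (alpha - 2))   # 2θ²/((α−1)(α−2))
VarX = EX2 - EX**2

VarS = EN * VarX + VarN * EX**2
print(VarS)  # 240000
```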

Lesson 15

Aggregate Loss Models: Approximating Distribution Reading: Loss Models Fourth Edition 9.3, 9.8.2 The aggregate distribution may be approximated with a normal distribution. This may be theoretically justified by the Central Limit Theorem if the group is large. If severity is discrete, then the aggregate loss distribution is discrete, and a continuity correction is required. This means that whenever the distribution X assumes values a and b (a < b) but no value in between, all of the following statements are equivalent: • X>a • X≥b

• X > c for any c ∈ ( a, b )

To assure they all resultin the same answer, you evaluate the probability that X is greater than the mid point of the interval, Pr X > ( a + b ) /2 . Similarly the following statements are all equivalent: • X≤a

• X 100) , you instead evaluate Pr ( S > 105) , and if you want to evaluate Pr ( S ≤ 100) , you instead evaluate Pr ( S ≤ 105) .





Example 15A For a group insurance policy, the number of claims from the group has a binomial distribution with mean 100 and variance 20. The size of each claim has the following distribution:

Claim Size    Probability
    1             0.50
    2             0.35
    3             0.10
    4             0.05

Using the approximating normal distribution, calculate the probability that aggregate claims from this group will be greater than 180.

Answer: The mean of the secondary distribution is E[X] = 0.50(1) + 0.35(2) + 0.10(3) + 0.05(4) = 1.70. The second moment is E[X²] = 0.50(1²) + 0.35(2²) + 0.10(3²) + 0.05(4²) = 3.60.

C/4 Study Manual—17th edition
Copyright ©2014 ASM


Therefore Var(X) = 3.6 − 1.7² = 0.71. We then calculate the moments of S:

E[S] = 100(1.7) = 170
Var(S) = 100(0.71) + 20(1.7²) = 128.8

Then the probability that aggregate losses are greater than 180, with the continuity correction, is

1 − Φ((180.5 − 170)/√128.8) = 1 − Φ(0.93) = 1 − 0.8238 = 0.1762

Without the continuity correction, the answer would have been 1 − Φ((180 − 170)/√128.8) = 1 − Φ(0.88) = 1 − 0.8106 = 0.1894.
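As a sketch of the computation above (ours, not the manual's), Example 15A can be reproduced with the standard library's statistics.NormalDist; the result differs slightly from 0.1762 only because the manual rounds the z-value to two decimal places:

```python
from math import sqrt
from statistics import NormalDist

# Example 15A: binomial frequency (mean 100, variance 20),
# discrete severity on {1, 2, 3, 4}
sizes = {1: 0.50, 2: 0.35, 3: 0.10, 4: 0.05}
mean_x = sum(x * p for x, p in sizes.items())          # 1.70
second_x = sum(x * x * p for x, p in sizes.items())    # 3.60
var_x = second_x - mean_x ** 2                         # 0.71

mean_n, var_n = 100, 20
mean_s = mean_n * mean_x                               # 170
var_s = mean_n * var_x + var_n * mean_x ** 2           # 128.8

# Continuity correction: Pr(S > 180) -> Pr(S > 180.5)
p = 1 - NormalDist().cdf((180.5 - mean_s) / sqrt(var_s))
print(round(p, 4))
```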

If severity has a continuous distribution, no continuity correction is made, since S has a continuous distribution when X is continuous.

Example 15B [1999 C3 Sample:25] For aggregate losses S = X1 + X2 + · · · + XN, you are given:

• N has a Poisson distribution with mean 500.
• X1, X2, . . . have mean 100 and variance 100.
• N, X1, X2, . . . are mutually independent.

You are also given:

• For a portfolio of insurance policies, the loss ratio is the ratio of aggregate losses to aggregate premiums collected.
• The premium collected is 1.1 times the expected aggregate losses.

Using the normal approximation to the compound Poisson distribution, calculate the probability that the loss ratio exceeds 0.95.

Answer: E[S] = E[N] E[X] = 500(100) = 50,000. Therefore, premium is 1.1(50,000) = 55,000. For the loss ratio to equal 0.95, losses must be (0.95)(55,000) = 52,250. Next we calculate the variance of S. For a compound Poisson distribution, this reduces to λ E[X²] = 500(100 + 100²) = 5,050,000. The probability we require is then

1 − Φ((52,250 − 50,000)/√5,050,000) = 1 − Φ(2,250/2,247.22) = 1 − Φ(1.00) = 1 − 0.8413 = 0.1587
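A similar sketch for Example 15B, using only the moments given in the problem (variable names are ours):

```python
from math import sqrt
from statistics import NormalDist

# Example 15B: compound Poisson, lambda = 500; E[X] = 100, Var(X) = 100
lam, mean_x, var_x = 500, 100, 100
second_x = var_x + mean_x ** 2        # E[X^2] = 10,100

mean_s = lam * mean_x                 # 50,000
var_s = lam * second_x                # Poisson shortcut: Var(S) = lambda E[X^2]

premium = 1.1 * mean_s
threshold = 0.95 * premium            # 52,250

p = 1 - NormalDist().cdf((threshold - mean_s) / sqrt(var_s))
print(round(p, 4))
```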

Quiz 15-1 Claim counts follow a geometric distribution with β = 0.15. Claim sizes follow an inverse gamma distribution with parameters α = 3, θ = 1000. Calculate the Value-at-Risk of aggregate losses at the 95% security level using the normal approximation.

When the sample isn't large enough, and the distribution has a heavy tail, the symmetric normal distribution isn't appropriate. Sometimes the lognormal distribution is used instead, even though there is no theoretical justification for it. The parameters of the lognormal are selected by matching the mean and the variance. This is a special case of fitting a parametric distribution using the method of moments. Fitting a distribution using the method of moments will be discussed in more generality in Lesson 30.


Example 15C The number of claims on a policy has a Poisson distribution with λ = 0.1. Claim sizes have a gamma distribution with α = 0.5, θ = 1000. Aggregate losses S for the policy are approximated with a lognormal distribution matching the mean and variance of the aggregate distribution.

Calculate FS(80) using this approximation.

Answer: The primary distribution is Poisson, so we use equation (14.4) to compute the variance of the aggregate distribution.

E[X] = (0.5)(1000) = 500
E[X²] = θ²(α + 1)(α) = (1000²)(1.5)(0.5) = 750,000
E[S] = λ E[X] = (0.1)(500) = 50
Var(S) = λ E[X²] = (0.1)(750,000) = 75,000
E[S²] = 75,000 + 50² = 77,500

Equating the first two raw moments of the distributions is equivalent to equating the means and variances of the distributions. We now equate the first two raw moments of a lognormal to the corresponding moments of the aggregate distribution.

E[S] = e^(µ+0.5σ²) = 50
E[S²] = e^(2µ+2σ²) = 77,500

Taking logarithms,

µ + 0.5σ² = ln 50
2µ + 2σ² = ln 77,500

Subtracting twice the first expression from the second expression,

σ² = ln 77,500 − 2 ln 50 = 3.4340
σ = √3.4340 = 1.8531

Solving the first expression for µ,

µ = ln 50 − 0.5(3.4340) = 2.1950

So

FS(80) = Φ((ln 80 − 2.1950)/1.8531) = Φ(1.18) = 0.8810
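The moment-matching step in Example 15C can be sketched in Python (our sketch; the gamma second-moment formula θ²(α + 1)α is the one used in the example):

```python
from math import log, sqrt
from statistics import NormalDist

# Example 15C: compound Poisson (lambda = 0.1), gamma severity (alpha = 0.5, theta = 1000)
lam, alpha, theta = 0.1, 0.5, 1000
mean_x = alpha * theta                           # 500
second_x = theta ** 2 * (alpha + 1) * alpha      # 750,000

mean_s = lam * mean_x                            # 50
second_s = lam * second_x + mean_s ** 2          # 77,500

# Match lognormal raw moments: exp(mu + sigma^2/2) = E[S], exp(2 mu + 2 sigma^2) = E[S^2]
sigma2 = log(second_s) - 2 * log(mean_s)
mu = log(mean_s) - 0.5 * sigma2

F80 = NormalDist().cdf((log(80) - mu) / sqrt(sigma2))
print(round(F80, 4))
```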

So far, we have been discussing the collective risk model. Similar approximating distributions can be used for the individual risk model. The following example deals with frequency only.

Example 15D For a group life insurance policy, you have the following statistics for the group:

Age group    Number in group    Assumed mortality rate
 30–34             22                  0.005
 35–39             18                  0.006
 40–44             15                  0.009
 45–49             10                  0.013

Using a normal approximation and a lognormal approximation, calculate the probability of 2 or more deaths in a year.

Answer: Since the number of deaths is discrete, a continuity correction is appropriate, so we will calculate the probability Pr(N ≥ 1.5). The individual distributions are binomial with m = 1 and q varying. All lives are assumed to be independent. Assume this even if the question doesn't say it, unless the question says otherwise. Therefore, the mean is the sum of the means and the variance is the sum of the variances. For a binomial, the mean is mq and the variance is mq(1 − q). We have

E[N] = 22(0.005) + 18(0.006) + 15(0.009) + 10(0.013) = 0.483
Var(N) = 22(0.005)(0.995) + 18(0.006)(0.994) + 15(0.009)(0.991) + 10(0.013)(0.987) = 0.478897
E[N²] = 0.483² + 0.478897 = 0.712186

For the normal distribution, the probability of 1.5 or more deaths is

Pr(N ≥ 1.5) = 1 − Φ((1.5 − 0.483)/√0.478897) = 1 − Φ(1.47) = 1 − 0.9292 = 0.0708

For the lognormal distribution, as in the previous example,

µ + 0.5σ² = ln 0.483
2µ + 2σ² = ln 0.712186
σ² = ln 0.712186 − 2 ln 0.483 = 1.1161
σ = √1.1161 = 1.0564
µ = ln 0.483 − 0.5(1.1161) = −1.2858

Pr(N ≥ 1.5) = 1 − Φ((ln 1.5 − (−1.2858))/1.0564) = 1 − Φ(1.60) = 1 − 0.9452 = 0.0548

You may be curious how close these approximations are to the actual probability. You can calculate the probability of 0 deaths, (0.995²²)(0.994¹⁸)(0.991¹⁵)(0.987¹⁰) = 0.6157, and the probability of 1 death, which works out to 0.3000. The probability of 2 or more deaths is then 0.0843. So neither approximation was that good.
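The exact probability quoted at the end of Example 15D can be verified directly (a sketch, ours; it matches the manual's 0.0843 up to rounding):

```python
from math import prod

# Example 15D: exact probability of 2 or more deaths among independent lives;
# each group is (number of lives, mortality rate q)
groups = [(22, 0.005), (18, 0.006), (15, 0.009), (10, 0.013)]

p0 = prod((1 - q) ** n for n, q in groups)          # no deaths
p1 = p0 * sum(n * q / (1 - q) for n, q in groups)   # exactly one death
p_2_or_more = 1 - p0 - p1
print(round(p0, 4), round(p1, 4), round(p_2_or_more, 4))
```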

Exercises

15.1. The annual number of losses for each insured follows a Poisson distribution with parameter λ. The parameter λ varies by insured according to a gamma distribution with mean 1/2 and variance 1/2, but does not vary by year for any single insured. There are 1500 insureds.

Using the normal approximation, calculate the probability of more than 1600 claims in two years.


15.2. [4B-S96:7] (3 points) You are given the following:

• The number of claims follows a negative binomial distribution with mean 800 and variance 3,200.
• Claim sizes follow a transformed gamma distribution with mean 3,000 and variance 36,000,000.
• The number of claims and claim sizes are independent.

Using the Central Limit Theorem, determine the approximate probability that the aggregate losses will exceed 3,000,000.

(A) Less than 0.005
(B) At least 0.005, but less than 0.01
(C) At least 0.01, but less than 0.10
(D) At least 0.10, but less than 0.50
(E) At least 0.50

15.3. [4B-F96:31] (2 points) You are given the following:

• A portfolio consists of 1,600 independent risks.
• For each risk, the probability of at least one claim is 0.5.

Using the Central Limit Theorem, determine the approximate probability that the number of risks in the portfolio with at least one claim will be greater than 850.

(A) Less than 0.01
(B) At least 0.01, but less than 0.05
(C) At least 0.05, but less than 0.10
(D) At least 0.10, but less than 0.20
(E) At least 0.20

15.4. The number of claims for a portfolio of insureds has a negative binomial distribution with parameters r = 10 and β = 3. The size of claims has the following distribution:

F(x) = 1 − 0.5(2000/(2000 + x))³ − 0.5(10,000/(10,000 + x))³,  x ≥ 0

Claim counts and claim sizes are independent.

Using the normal approximation, determine the probability that aggregate claims will be greater than 100,000.

15.5. The number of claims for an insurance coverage averages 3 per month, with standard deviation 2. Each claim has a gamma distribution with parameters α = 20 and θ = 0.1. Claim counts and claim sizes are independent.

Using the normal approximation, determine the probability that the aggregate losses for a year will be less than 60.


15.6. The number of claims for an insurance coverage has a Poisson distribution with mean λ. Claim size has the following distribution:

F(x) = 1 − (3000/(3000 + x))³,  x ≥ 0

Claim counts and claim sizes are independent. Using the normal approximation, the probability that aggregate losses will be greater than 4000 is 0.2743.

Determine λ.

15.7. For a group of insureds, you are given the following information regarding losses per individual per year:

                      Number    Amount
Mean                    0.2      1000
Standard deviation      0.1       700

Number of losses and amount of losses are independent. The insurance group has 76 insureds.

Using the normal approximation, determine the probability that aggregate losses for a year will exceed 15,000.

15.8. The annual number of claims on an insurance coverage has a Poisson distribution with mean 4. Claim size is uniformly distributed on [0, u]. Number of claims and claim sizes are independent. Using the normal approximation, you calculate that the probability that aggregate claims for a year will be greater than 50,000 is 0.2743.

Determine u.

15.9. [151-83-94:17] (3 points) An insurer offers group term life insurance coverage on 250 mutually independent lives for a premium of 350. The probability of a claim is 0.02 for each life. The distribution of number of lives by face amount is:

Face Amount    Number of Lives
     20              100
     50              100
    100               50

Reinsurance is purchased which costs 120% of expected claims above a retention limit of 40 per life. Using the normal approximation to the distribution of retained claims, determine the probability that the total of retained claims and reinsurance premiums exceeds the premium.

(A) 0.10   (B) 0.12   (C) 0.14   (D) 0.16   (E) 0.18


Use the following information for questions 15.10 and 15.11:

You are given the following:
(i) The number of claims per year follows a Poisson distribution with mean 300.
(ii) Claim sizes follow a Generalized Pareto distribution with parameters θ = 1000, α = 3, and τ = 2.
(iii) The number of claims and claim sizes are independent.

15.10. [4B-F99:12] (2 points) Determine the probability that annual aggregate losses will exceed 360,000.

(A) Less than 0.01
(B) At least 0.01, but less than 0.03
(C) At least 0.03, but less than 0.05
(D) At least 0.05, but less than 0.07
(E) At least 0.07

15.11. [4B-F99:13] (2 points) After a number of years, the number of claims per year still follows a Poisson distribution, but the expected number of claims per year has been cut in half. Claim sizes have increased uniformly by a factor of two.

Determine the probability that annual aggregate losses will exceed 360,000.

(A) Less than 0.01
(B) At least 0.01, but less than 0.03
(C) At least 0.03, but less than 0.05
(D) At least 0.05, but less than 0.07
(E) At least 0.07

15.12. [3-S00:16] You are given:

                      Mean     Standard Deviation
Number of Claims        8              3
Individual Losses    10,000          3,937

Using the normal approximation, determine the probability that the aggregate loss will exceed 150% of the expected loss.

(A) Φ(1.25)   (B) Φ(1.5)   (C) 1 − Φ(1.25)   (D) 1 − Φ(1.5)   (E) 1.5Φ(1)

15.13. [3-F00:2] In a clinic, physicians volunteer their time on a daily basis to provide care to those who are not eligible to obtain care otherwise. The number of physicians who volunteer in any day is uniformly distributed on the integers 1 through 5. The number of patients that can be served by a given physician has a Poisson distribution with mean 30.

Determine the probability that 120 or more patients can be served in a day at the clinic, using the normal approximation with continuity correction.

(A) 1 − Φ(0.68)   (B) 1 − Φ(0.72)   (C) 1 − Φ(0.93)   (D) 1 − Φ(3.13)   (E) 1 − Φ(3.16)


15.14. [3-F00:32] For an individual over 65:
(i) The number of pharmacy claims is a Poisson random variable with mean 25.
(ii) The amount of each pharmacy claim is uniformly distributed between 5 and 95.
(iii) The amounts of the claims and the number of claims are mutually independent.

Determine the probability that aggregate claims for this individual will exceed 2000 using the normal approximation.

(A) 1 − Φ(1.33)   (B) 1 − Φ(1.66)   (C) 1 − Φ(2.33)   (D) 1 − Φ(2.66)   (E) 1 − Φ(3.33)

15.15. [3-F02:6] The number of auto vandalism claims reported per month at Sunny Daze Insurance Company (SDIC) has mean 110 and variance 750. Individual losses have mean 1101 and standard deviation 70. The number of claims and the amounts of individual losses are independent.

Using the normal approximation, calculate the probability that SDIC's aggregate auto vandalism losses reported for a month will be less than 100,000.

(A) 0.24   (B) 0.31   (C) 0.36   (D) 0.39   (E) 0.49

15.16. [3-S01:16] A dam is proposed for a river which is currently used for salmon breeding. You have modeled:
(i) For each hour the dam is opened, the number of salmon that will pass through and reach the breeding grounds has a distribution with mean 100 and variance 900.
(ii) The number of eggs released by each salmon has a distribution with mean of 5 and variance of 5.
(iii) The number of salmon going through the dam each hour it is open and the numbers of eggs released by the salmon are independent.

Using the normal approximation for the aggregate number of eggs released, determine the least number of whole hours the dam should be left open so the probability that 10,000 eggs will be released is greater than 95%.

(A) 20   (B) 23   (C) 26   (D) 29   (E) 32

15.17. [3-F02:27] At the beginning of each round of a game of chance, the player pays 12.5. The player then rolls one die with outcome N. The player then rolls N dice and wins an amount equal to the total of the numbers showing on the N dice. All dice have 6 sides and are fair.

Using the normal approximation, calculate the probability that a player starting with 15,000 will have at least 15,000 after 1000 rounds.

(A) 0.01   (B) 0.04   (C) 0.06   (D) 0.09   (E) 0.12


15.18. [3-F01:7] You own a fancy light bulb factory. Your workforce is a bit clumsy—they keep dropping boxes of light bulbs. The boxes have varying numbers of light bulbs in them, and when dropped, the entire box is destroyed. You are given:

Expected number of boxes dropped per month:          50
Variance of the number of boxes dropped per month:  100
Expected value per box:                             200
Variance of the value per box:                      400

You pay your employees a bonus if the value of light bulbs destroyed in a month is less than 8000. Assuming independence and using the normal approximation, calculate the probability that you will pay your employees a bonus next month.

(A) 0.16   (B) 0.19   (C) 0.23   (D) 0.27   (E) 0.31

15.19. [SOA3-F03:4] Computer maintenance costs for a department are modeled as follows:
(i) The distribution of the number of maintenance calls each machine will need in a year is Poisson with mean 3.
(ii) The cost for a maintenance call has mean 80 and standard deviation 200.
(iii) The number of maintenance calls and the costs of the maintenance calls are all mutually independent.

The department must buy a maintenance contract to cover repairs if there is at least a 10% probability that aggregate maintenance costs in a given year will exceed 120% of the expected costs. Using the normal approximation for the distribution of the aggregate maintenance costs, calculate the minimum number of computers needed to avoid purchasing a maintenance contract.

(A) 80   (B) 90   (C) 100   (D) 110   (E) 120

15.20. [SOA3-F03:33] A towing company provides all towing services to members of the City Automobile Club. You are given:

(i)
Towing Distance     Towing Cost    Frequency
0–9.99 miles             80           50%
10–29.99 miles          100           40%
30+ miles               160           10%

(ii) The automobile owner must pay 10% of the cost and the remainder is paid by the City Automobile Club.
(iii) The number of towings has a Poisson distribution with mean of 1000 per year.
(iv) The number of towings and the cost of individual towings are all mutually independent.

Using the normal approximation for the distribution of aggregate towing costs, calculate the probability that the City Automobile Club pays more than 90,000 in any given year.

(A) 3%   (B) 10%   (C) 50%   (D) 90%   (E) 97%


15.21. [CAS3-F03:30] Speedy Delivery Company makes deliveries 6 days a week. The daily numbers of accidents involving Speedy vehicles are independent and follow a Poisson distribution with mean 3. In each accident, damage to the contents of Speedy's vehicles is distributed as follows:

Amount of damage    Probability
      $0               1/4
    $2,000             1/2
    $8,000             1/4

Using the normal approximation, calculate the probability that Speedy's weekly aggregate damages will not exceed $63,000.

(A) 0.24   (B) 0.31   (C) 0.54   (D) 0.69   (E) 0.76

15.22. [CAS3-F04:32] An insurance policy provides full coverage for the aggregate losses of the Widget Factory. The number of claims for the Widget Factory follows a negative binomial distribution with mean 25 and coefficient of variation 1.2. The severity distribution is given by a lognormal distribution with mean 10,000 and coefficient of variation 3.

To control losses, the insurer proposes that the Widget Factory pay 20% of the cost of each loss. Calculate the reduction in the 95th percentile of the normal approximation of the insurer's loss.

(A) Less than 5%
(B) At least 5%, but less than 15%
(C) At least 15%, but less than 25%
(D) At least 25%, but less than 35%
(E) At least 35%

15.23. [SOA3-F04:15] Two types of insurance claims are made to an insurance company. For each type, the number of claims follows a Poisson distribution and the amount of each claim is uniformly distributed as follows:

Type of Claim    Poisson Parameter λ for Number of Claims    Range of Each Claim Amount
      I                           12                                  (0, 1)
      II                           4                                  (0, 5)

The numbers of claims of the two types are independent and the claim amounts and claim numbers are independent.

Calculate the normal approximation to the probability that the total of claim amounts exceeds 18.

(A) 0.37   (B) 0.39   (C) 0.41   (D) 0.43   (E) 0.45

15.24. For an insurance coverage, the number of claims follows a Poisson distribution with mean 2 and the size of claims follows a Poisson distribution with mean 10. Number of claims and claim sizes are independent. Calculate the probability that aggregate losses will be less than or equal to 5, using the normal approximation.


15.25. For an insurance coverage, you are given
(i) The number of claims for each insured follows a Poisson distribution with mean λ.
(ii) λ varies by insured according to a gamma distribution with parameters α = 4, θ = 0.2.
(iii) Claim size, before application of policy limits, follows a single-parameter Pareto with parameters α = 3, θ = 10,000.
(iv) Coverage is subject to a maximum covered loss of 20,000.
(v) Number of claims and claim sizes are independent.

Calculate the probability that aggregate losses will be greater than 30,000, using the normal approximation.

15.26. Aggregate losses follow a collective risk model. Each loss follows a lognormal distribution with parameters µ = 5, σ = 1.2. The number of losses per year follows a Poisson distribution with mean 0.7.

Estimate the probability that aggregate losses will exceed 300 using the lognormal approximation of aggregate losses.

15.27. [CAS3-S04:38] You are asked to price a Workers' Compensation policy for a large employer. The employer wants to buy a policy from your company with an aggregate limit of 150% of total expected loss. You know the distribution for aggregate claims is Lognormal. You are also provided with the following:

                              Mean    Standard Deviation
Number of claims                50            12
Amount of individual loss    4,500         3,000

Calculate the probability that the aggregate loss will exceed the aggregate limit.

(A) Less than 3.5%
(B) At least 3.5%, but less than 4.5%
(C) At least 4.5%, but less than 5.5%
(D) At least 5.5%, but less than 6.5%
(E) At least 6.5%

15.28. For an insurance company, claim counts per policy follow a Poisson distribution with λ = 0.4. Claim sizes follow a Pareto distribution with α = 3 and θ = 10. Claim counts and claim sizes are independent. There are 500 policies in force.

Using the normal approximation, estimate the Tail-Value-at-Risk at the 90% security level for the company.

Additional released exam questions: SOA M-S05:40, CAS3-F05:30,34, SOA M-F05:18, C-S07:17

Solutions

15.1. For each insured, the Poisson parameter over two years is Λ = 2λ. Since E[Λ] = 2 E[λ] and Var(Λ) = 4 Var(λ), the parameter Λ follows a gamma distribution with mean 1 and variance 2. Let N be the number of losses over the two-year period. Then E[N] = E[E[N | Λ]] = E[Λ] = 1, and the variance of N is

Var(N) = E[Var(N | Λ)] + Var(E[N | Λ]) = E[Λ] + Var(Λ) = 1 + 2 = 3


For 1500 insureds, the aggregate mean is 1500 and the aggregate variance is 1500(3) = 4500. We make a continuity correction and check the probability that a normal distribution with these parameters is greater than 1600.5:

Pr(N > 1600) = 1 − Φ((1600.5 − 1500)/√4500) = 1 − Φ(1.50) = 1 − 0.9332 = 0.0668

15.2. The transformed gamma distribution is not in the tables, but all you need to do the calculation is the mean and variance of claim sizes. Let S be aggregate losses. Using equation (14.2),

E[S] = E[N] E[X] = (800)(3000) = 2,400,000
Var(S) = E[N] Var(X) + E[X]² Var(N)
       = 800(36 · 10⁶) + (9 · 10⁶)(3200) = 57,600 × 10⁶
√(57,600 × 10⁶) = 240,000
Pr(S > 3,000,000) = 1 − Φ(600,000/240,000) = 1 − Φ(2.5) = 1 − 0.9938 = 0.0062 (B)

15.3. This is a binomial distribution with parameters m = 1600, q = 0.5. The mean is 800; the variance is 400. We will make a continuity correction.

1 − Φ((850.5 − 800)/√400) = 1 − Φ(2.53) = 1 − 0.9943 = 0.0057 (A)

Even if you (by mistake) didn't make a continuity correction, you would get the right range: 1 − Φ((850 − 800)/√400) = 1 − Φ(2.5) = 0.0062.

15.4. For a negative binomial distribution, the mean is rβ and the variance is rβ(1 + β).

E[N] = (10)(3) = 30
Var(N) = (10)(3)(4) = 120

Severity is a mixture of two Pareto distributions with parameters (α, θ) = (3, 2000) and (3, 10,000). For each one, the mean is θ/(α − 1) and the second moment is 2θ²/((α − 1)(α − 2)).

E[X] = 0.5(2000/2) + 0.5(10,000/2) = 3000
E[X²] = 0.5(2(2000²)/2) + 0.5(2(10,000²)/2) = 52(10⁶)
Var(X) = 52(10⁶) − 3000² = 43(10⁶)
E[S] = (30)(3000) = 90,000
Var(S) = 30(43(10⁶)) + 120(3000²) = 2370(10⁶)
Pr(S > 100,000) = 1 − Φ((100,000 − 90,000)/√(2370(10⁶))) = 1 − Φ(0.21) = 1 − 0.5832 = 0.4168


15.5. Let N be the number of claims in a year. Then E[N] = 12(3) = 36 and Var(N) = 12(4) = 48. Let X be claim severity. Then E[X] = (20)(0.1) = 2 and Var(X) = (20)(0.1²) = 0.2.

E[S] = (36)(2) = 72
Var(S) = 36(0.2) + 48(4) = 199.2
Φ((60 − 72)/√199.2) = Φ(−0.85) = 0.1977

15.6. Claim size X is Pareto with parameters α = 3, θ = 3000, so E[X] = 3000/2 = 1500 and E[X²] = 3000². Thus for aggregate losses S,

E[S] = 1500λ
Var(S) = (3000²)λ     by formula (14.4)

Φ⁻¹(0.2743) = −0.6. Hence:

(4000 − 1500λ)/√((3000²)λ) = 0.6     (*)
4000 − 1500λ = 0.6(3000)√λ = 1800√λ
1500λ + 1800√λ − 4000 = 0
√λ = (−1800 ± √(1800² + 4(1500)(4000)))/(2(1500)) = 1.1397, −2.3397
λ = 1.299, 5.474

However, 5.474 gives −0.6 when plugged back into (*), so the correct answer is 1.299.

15.7.

Let S be the aggregate losses for 76 insureds.

E[S] = 76(200) = 15,200
Var(S) = 76(0.2(700²) + 0.01(1000²)) = 8,208,000
Pr(S > 15,000) = Φ((15,200 − 15,000)/√8,208,000) = Φ(0.07) = 0.5279
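Going back to exercise 15.6: the quadratic in √λ and the root-selection step can be checked numerically (a sketch under the solution's own setup; variable names are ours):

```python
from math import sqrt

# Exercise 15.6: 1500*lam + 1800*sqrt(lam) - 4000 = 0,
# a quadratic in x = sqrt(lam)
a, b, c = 1500, 1800, -4000
x = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)   # the positive root
lam = x * x
print(round(lam, 3))

# Root selection: the rejected lam = 5.474 makes this z equal -0.6, not +0.6
z = (4000 - 1500 * lam) / (3000 * sqrt(lam))
print(round(z, 2))
```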

15.8. Let X be severity. E[X] = u/2 and E[X²] = u²/3. Let S be aggregate losses.

E[S] = 4(u/2) = 2u
Var(S) = 4(u²/3) = 4u²/3

Since Φ⁻¹(0.2743) = −0.60, we have

(50,000 − 2u)/(2u/√3) = 0.60
(0.60)(2u) = 50,000√3 − 2√3·u
u = 50,000√3/((0.60)(2) + 2√3) = 18,568
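The closed-form expression for u in exercise 15.8 can be evaluated directly (our sketch):

```python
from math import sqrt

# Exercise 15.8: solve (50,000 - 2u) / (2u / sqrt(3)) = 0.60 for u
u = 50_000 * sqrt(3) / (0.60 * 2 + 2 * sqrt(3))
print(round(u))

# Check that the standardized value at 50,000 is indeed 0.60
z = (50_000 - 2 * u) / (2 * u / sqrt(3))
```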


15.9. Claims for each life have a binomial distribution with q = 0.02. Expected retained claims are

0.02(20(100) + 40(150)) = 160

Expected reinsured claims are

0.02(100(10) + 50(60)) = 80

Reinsurance premium is therefore 1.2(80) = 96, and total expected expenses are 160 + 96 = 256. The variance of retained claims is

(0.02)(0.98)(100(20²) + 150(40²)) = 5488

Using the normal approximation, we want

1 − Φ((350 − 256)/√5488) = 1 − Φ(1.27) = 1 − 0.8980 = 0.1020 (A)

Strictly speaking, we should make a continuity correction. Expenses are always 96 plus a multiple of 20. In order to be more than 350, they would have to be 356, or 20 more than 336. We should therefore calculate the probability that they are more than the midpoint, or 346. Then we get

1 − Φ((346 − 256)/√5488) = 1 − Φ(1.21) = 0.1131

Since this is not one of the five answer choices, they apparently didn't expect you to make a continuity correction.

15.10. We need to use the normal approximation, although the question didn't mention the normal approximation.

E[X] = 1000 Γ(3) Γ(2)/(Γ(3) Γ(2)) = 1000
E[X²] = 1000² Γ(4) Γ(1)/(Γ(3) Γ(2)) = 1000²(6)/2 = 3,000,000
E[S] = 300(1000) = 300,000
√Var(S) = √(300(3,000,000)) = 30,000
Pr(S > 360,000) = 1 − Φ((360,000 − 300,000)/30,000) = 1 − Φ(2) = 0.0228 (B)

15.11. We multiply θ by 2 to inflate claim sizes, so θ = 2000. E[S] doesn't change. E[X²] is multiplied by 2², so Var(S) is multiplied by 2²/2 = 2. Then

Pr(S > 360,000) = 1 − Φ(60,000/(30,000√2)) = 1 − Φ(1.41) = 0.0793 (E)

15.12. The overall mean is 8(10,000) = 80,000. The overall variance is

8(3937²) + 3²(10,000²) = 1,023,999,752

and the standard deviation is √1,023,999,752 ≈ 32,000. The probability of exceeding 120,000 is

1 − Φ(40,000/32,000) = 1 − Φ(1.25) (C)


15.13. The discrete uniform distribution has a mean of 3 and a variance of

((1 − 3)² + (2 − 3)² + (4 − 3)² + (5 − 3)²)/5 = 2

The aggregate mean is 3(30) = 90 and the aggregate variance is

Var(S) = 3(30) + 2(30²) = 1890

Since the number of patients is discrete, we must make a continuity correction and calculate the probability that the normal random variable is greater than 119.5. The probability of 119.5 or more is

1 − Φ((119.5 − 90)/√1890) = 1 − Φ(0.68) (A)

15.14. The mean of the uniform distribution is 50. The mean of aggregate claims is the product of the Poisson mean and the severity mean, or 25(50) = 1250. The second moment of the uniform distribution is the mean squared plus the variance ((95 − 5)²/12). For aggregate claims S, by equation (14.4), the variance is the Poisson mean times the severity second moment, or

Var(S) = 25(50² + 90²/12) = 25(2500 + 675) = 79,375

The probability of exceeding 2000, using the normal approximation, is

1 − Φ((2000 − 1250)/√79,375) = 1 − Φ(2.66) (D)

15.15.
E[S] = (110)(1101) = 121,110
Var(S) = (110)(70²) + 750(1101²) = 909,689,750
√909,689,750 = 30,161
Pr(S < 100,000) = Φ((100,000 − 121,110)/30,161) = Φ(−0.7) = 0.2420 (A)

15.16. The mean number of eggs in 1 hour is (100)(5) = 500 and the variance, by the compound variance formula, is

Var(S) = 100(5) + 900(5²) = 23,000

For n hours the mean is 500n and the variance is 23,000n. We'd like 10,000 to be the 5th percentile (so that the number of eggs will be higher 95% of the time), so we solve:

(10,000 − 500n)/√(23,000n) = −1.645
500n − 1.645√23,000 √n − 10,000 = 0
500n − 249.477√n − 10,000 = 0
√n = (249.477 ± √20,062,239)/1000 = 4.729, −4.230
n = 22.359, 17.890

Naturally, the lower number is the 5th percentile and the higher number is the 95th percentile. You can in fact plug 17.890 back into the original equation and you will get 1.645 instead of −1.645. So the answer is 23 hours are needed. (B)

15.17. For one die, the mean toss is 3.5 and the variance is 35/12. For one round, using the compound variance formula:

E[S] = 3.5² − 12.5 = −0.25

35  2  35 + 3.5  45.9375 Var ( S )  3.5 12 12

!

!

In 1000 rounds, the average gain is 1000 (−0.25)  −250. The probability of gaining at least 0 after 1000 rounds is (we make a continuity adjustment, but this has very little effect): Pr ( S > 0)  1 − Φ √

249.5

!

1000 (45.9375)

 1 − Φ (1.16)  1 − 0.8770  0.1230

(E)
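The quadratic-in-√n step of exercise 15.16 can likewise be checked numerically (our sketch; the quadratic coefficients come from the displayed equation 500n − 1.645√23,000·√n − 10,000 = 0):

```python
from math import ceil, sqrt

# Exercise 15.16: smallest whole n with Pr(eggs >= 10,000) > 0.95,
# where eggs over n hours ~ Normal(500n, 23,000n); quadratic in x = sqrt(n)
a, b, c = 500, -1.645 * sqrt(23_000), -10_000
x = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
hours = ceil(x * x)
print(hours)
```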

15.18. Let S be the value of the light bulbs.

E[S] = (50)(200) = 10,000
Var(S) = (50)(400) + (100)(200²) = 4,020,000
Pr(S < 8000) = Φ((8000 − 10,000)/√4,020,000) = Φ(−1) = 0.1587 (A)

15.19. Let S1 be the maintenance cost for one machine. The mean cost per machine is E[S1] = 3(80) = 240, and the variance, by the compound variance formula, is

Var(S1) = 3(200²) + 3(80²) = 139,200

So the mean cost for n machines is 240n and the variance is 139,200n. If S is aggregate maintenance costs, we want

Pr(S < 1.2 E[S]) > 0.9

Using the normal approximation,

Φ((1.2 E[S] − E[S])/√Var(S)) > 0.9
Φ(0.2(240n)/√(139,200n)) > 0.9
48√n/√139,200 > 1.282     since Φ⁻¹(0.9) = 1.282
n > 1.282²(139,200)/48² = 99.3

100 machines are needed. (C)
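The final inequality in exercise 15.19 can be sketched with an exact normal quantile instead of the rounded 1.282 (ours, not the manual's; the conclusion is the same):

```python
from math import ceil
from statistics import NormalDist

# Exercise 15.19: smallest n with Pr(S > 1.2 E[S]) < 0.10, where
# S ~ Normal(240n, 139,200n); reduces to 48 sqrt(n) / sqrt(139,200) > z_0.9
z90 = NormalDist().inv_cdf(0.90)            # about 1.2816
n_min = ceil(z90 ** 2 * 139_200 / 48 ** 2)
print(n_min)
```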


15.20. Let X be towing cost per tow, S aggregate towing cost. Then

E[X] = 0.5(80) + 0.4(100) + 0.1(160) = 96
E[X²] = 0.5(80²) + 0.4(100²) + 0.1(160²) = 9760
E[S] = 1000(96) = 96,000
Var(S) = 1000(9760) = 9,760,000

Since the club pays 90%, claims must be more than 100,000 before the club pays 90,000. To be precise, since the severity distribution is discrete, we should add a continuity correction. The interval between towing fees is 20, so we should add half of that, or 10, to 100,000. It hardly makes a difference with numbers of this magnitude, especially with SOA rounding rules on the argument of the standard normal's cumulative distribution function.

Pr(S > 100,010) = 1 − Φ((100,010 − 96,000)/√9,760,000) = 1 − Φ(1.28) = 1 − 0.8997 = 0.1003 (B)

15.21. We'll calculate the mean and variance for 6 days. Let S be weekly aggregate damage, and X the amount of damage per accident. Then

E[X] = (1/2)(2000) + (1/4)(8000) = 3000
E[X²] = (1/2)(2000²) + (1/4)(8000²) = 18,000,000
E[S] = 18(3000) = 54,000
Var(S) = 18(18,000,000) = 324,000,000
Pr(S ≤ 63,000) = Φ((63,000 − 54,000)/√324,000,000) = Φ(0.5) = 0.6915 (D)

Since severity is discrete, a continuity correction should be made. The loss random variable is a multiple of 2000. Thus in order to be more than 63,000, it must be at least 64,000, or 2000 more than the next possible value of 62,000. The midpoint is 63,000. Thus the continuity correction has no effect.

15.22. No calculations are needed. The aggregate mean and standard deviation are both reduced by 20%, so the normal approximation, which is a linear combination of the mean and standard deviation, is also reduced by 20%. (C)

15.23. The second moment of a uniform distribution starting at 0 is the upper bound squared over 3. Let S1 be aggregate claims of type I, S2 aggregate claims of type II, and S = S1 + S2. For type I,

E[S1] = 12(1/2) = 6
Var(S1) = 12(1/3) = 4

For type II,

E[S2] = 4(5/2) = 10
Var(S2) = 4(25/3) = 100/3

Adding together, total mean is 16 and total variance is 4 + 100/3 = 112/3. Then the normal approximation gives

Pr(S > 18) = 1 − Φ((18 − 16)/√(112/3)) = 1 − Φ(0.33) = 1 − 0.6293 = 0.3707  (A)
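Solution 15.23 adds the moments of two independent compound Poisson blocks. As an independent sanity check (the severity parameters below are inferred from the solution's numbers: 12 expected type-I claims with severity uniform on (0, 1), and 4 expected type-II claims with severity uniform on (0, 5)):

```python
from statistics import NormalDist

# Compound Poisson: mean = lam * E[X], variance = lam * E[X^2];
# for U(0, b), E[X] = b/2 and E[X^2] = b^2/3.
lam1, b1 = 12, 1
lam2, b2 = 4, 5
mean = lam1 * b1 / 2 + lam2 * b2 / 2        # 6 + 10 = 16
var = lam1 * b1**2 / 3 + lam2 * b2**2 / 3   # 4 + 100/3 = 112/3
z = (18 - mean) / var ** 0.5
prob = 1 - NormalDist().cdf(z)              # near 0.3717 without rounding z
```

The manual rounds z to 0.33 and reports 0.3707; the unrounded value differs only in the fourth decimal place.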

15. AGGREGATE LOSS MODELS: APPROXIMATING DISTRIBUTION


15.24. E[X 2 ]  100 + 10  110, so Var ( S )  2 (110)  220. We must make a continuity correction (use 5.5 instead of 5), so ! 5.5 − 20  Φ (−0.98)  0.1635 Pr ( S < 5.5)  Φ √ 220 15.25. Recall from the first paragraph of Lesson 12 that a gamma mixture of Poisson distributions is a negative binomial distribution with the same parameters: r  α (which is 4 here) and β  θ (which is 0.2 here). It follows that E[N]  0.8 Var ( N )  0.96 Alternatively, for variance, use the conditional variance formula, equation (4.2) on page 64, and the fact that for a gamma distribution with parameters α, θ, the mean is αθ and the variance is αθ2 : Var ( N )  E Var ( N | λ ) + Var E[N | λ]  E[λ] + Var ( λ )  (4)(0.2) + (4)(0.22 )  0.96

f

g





Now for severity. 3 · 10,000 10,0003 −  13,750 2 2 (20,0002 ) 3 · 10,0002 2 · 10,0003 E[ ( X ∧ 20,000) 2 ]  −  2 · 108 1 20,000 Var ( X )  2 · 108 − 13,7502  10,937,500 E[X ∧ 20,000] 

E[S]  (0.8)(13,750)  11,000

Var ( S )  0.8 (10,937,500) + 0.96 (13,7502 )  190,250,000 30,000 − 11,000 Pr ( X > 30,000)  1 − Φ √  1 − Φ (1.38)  0.0838 190,250,000

!
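The limited moments used in solution 15.25 can be verified numerically, assuming (as the formulas suggest) a single-parameter Pareto with α = 3, θ = 10,000. The sketch below uses the identities E[X ∧ u] = ∫₀ᵘ S(x) dx and E[(X ∧ u)²] = ∫₀ᵘ 2x S(x) dx, which hold for any nonnegative loss; the integration scheme and names are mine.

```python
# Midpoint-rule check of the limited moments at u = 20,000 for a
# single-parameter Pareto: S(x) = 1 for x < theta, (theta/x)^alpha for x >= theta.
ALPHA, THETA, U = 3, 10_000.0, 20_000.0

def surv(x):
    return 1.0 if x < THETA else (THETA / x) ** ALPHA

n = 100_000
h = U / n
mids = [(i + 0.5) * h for i in range(n)]
e1 = sum(surv(x) for x in mids) * h              # E[X ∧ u] ≈ 13,750
e2 = sum(2 * x * surv(x) for x in mids) * h      # E[(X ∧ u)^2] ≈ 2e8
var = e2 - e1 ** 2                               # ≈ 10,937,500
```

The kink in S(x) at x = θ falls exactly on a cell boundary here, so the midpoint rule is very accurate.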

15.26. We calculate aggregate mean and variance.

E[S] = 0.7e^(5+0.5(1.2²)) = 213.4334
Var(S) = 0.7e^(10+2(1.2²)) = 274,669.8
E[S²] = 274,669.8 + 213.4334² = 320,223.7

We solve for the µ and σ parameters of the lognormal having this mean and second moment.

µ + 0.5σ² = ln 213.4334
2µ + 2σ² = ln 320,223.7
σ² = ln 320,223.7 − 2 ln 213.4334 = 1.9501, so σ = 1.3965
µ = ln 213.4334 − 0.5(1.9501) = 4.3883

Now apply the lognormal approximation.

Pr(S > 300) = 1 − Φ((ln 300 − 4.3883)/1.3965) = 1 − Φ(0.94) = 0.1736


15.27. For the aggregate distribution S:

E[S] = 50(4500) = 225,000
Var(S) = 50(3000²) + 12²(4500²) = 3,366,000,000

We must fit a lognormal to this mean and variance. We have for the lognormal's µ and σ

µ + 0.5σ² = ln 225,000 = 12.32386
2µ + 2σ² = ln(3,366,000,000 + 225,000²) = 24.71208
σ² = 24.71208 − 2(12.32386) = 0.06436, so σ = 0.2537
µ = 12.32386 − 0.5(0.06436) = 12.2917

The probability that S is greater than 1.5(225,000) = 337,500 is

1 − Φ((ln 337,500 − 12.2917)/0.2537) = 1 − Φ(1.72) = 1 − 0.9573 = 0.0427  (B)
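The lognormal moment-matching step in solutions 15.26–15.27 follows a fixed pattern: from E[S] and E[S²], solve µ + σ²/2 = ln E[S] and 2µ + 2σ² = ln E[S²]. A sketch with the 15.27 numbers (names are illustrative):

```python
from math import log, sqrt
from statistics import NormalDist

mean_s, var_s = 225_000.0, 3_366_000_000.0
second = var_s + mean_s ** 2                 # E[S^2]
sigma2 = log(second) - 2 * log(mean_s)       # sigma^2 = ln E[S^2] - 2 ln E[S]
mu = log(mean_s) - sigma2 / 2
z = (log(1.5 * mean_s) - mu) / sqrt(sigma2)
prob = 1 - NormalDist().cdf(z)               # Pr(S > 337,500) under the fit
```

Without rounding z to 1.72, the probability is about 0.0423 versus the manual's 0.0427.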

15.28. The formula for TVaR for a normal distribution is equation (8.7):

TVaR0.9(X) = µ + σ φ(z0.9)/(1 − 0.9)

In our case, the 90th percentile of a standard normal distribution is z0.9 = 1.282, and

φ(1.282) = e^(−1.282²/2)/√(2π) = 0.4399/2.5066 = 0.1754
φ(1.282)/(1 − 0.9) = 1.754

Let n = 500 be the sample size and let S be aggregate losses. Mean aggregate losses are

E[S] = nλ θ/(α − 1) = (500)(0.4)(10/2) = 1000

and letting X be the claim size distribution, the variance of aggregate losses is, using the compound variance formula for a compound Poisson distribution,

Var(S) = nλ E[X²] = (500)(0.4) · 2θ²/((α − 1)(α − 2)) = (500)(0.4) · 2(10²)/((2)(1)) = 20,000

Therefore, TVaR at the 90% security level is approximated as 1000 + 1.754√20,000 = 1248.
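The normal-approximation TVaR used in solution 15.28 can be sketched directly; the names below are mine, and the z-value 1.282 is the tabulated 90th percentile of the standard normal.

```python
from math import exp, pi, sqrt

# TVaR_p(S) ≈ mu + sigma * phi(z_p) / (1 - p) under a normal approximation
mu, var, p, z = 1000.0, 20_000.0, 0.90, 1.282
phi = exp(-z * z / 2) / sqrt(2 * pi)   # standard normal density at z_p
tvar = mu + sqrt(var) * phi / (1 - p)  # ≈ 1248
```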


Quiz Solutions

15-1. Calculate the aggregate mean and variance.

E[N] = 0.15
Var(N) = (0.15)(1.15) = 0.1725
E[X] = 1000/2 = 500
Var(X) = 1000²/(2 · 1) − 500² = 250,000
E[S] = 0.15(500) = 75
Var(S) = 0.15(250,000) + 0.1725(500²) = 80,625

The 95th percentile of aggregate losses is 75 + 1.645√80,625 = 542.09.

Lesson 16

Aggregate Losses: Severity Modifications

Reading: Loss Models Fourth Edition 9.7

Individual losses may be subject to deductibles, limits, or coinsurance. These modifications reduce the expected annual payments made on losses, or the expected annual aggregate costs.

In previous lessons, when losses were subject to per-claim deductibles, we needed to distinguish between expected payment per loss and expected payment per payment, since not every loss resulted in a payment. In the previous paragraph, we introduced a third concept: expected annual payments, or expected annual aggregate costs. This concept is neither payment per payment nor payment per loss; it is payment per year. The word "annual" means "per year". Expected annual payments are the average total payments divided by the number of years. Do not divide by the number of losses, nor by the number of payments. Do not exclude years in which no payments are made from the denominator.

However, the concept of "expected annual payments" is related to the concepts of "payment per payment" and "payment per loss". You may calculate expected annual aggregate payments (sometimes called "expected annual aggregate costs") in one of two ways:

1. You may calculate expected number of losses times expected payment per loss. In other words, do not modify the frequency distribution for the deductible. Calculate the expected number of losses (not payments). But modify the severity distribution. Use the payment-per-loss random variable Y^L for the severity distribution, so that there will be a non-zero probability of 0. Multiply expected payment per loss times expected number of losses.

2. You may calculate expected number of payments times expected payment per payment. In other words, modify the frequency distribution for the deductible. Calculate the expected number of payments (not losses). Modify the severity distribution; use the payment-per-payment variable Y^P. Payments of 0 are excluded. Multiply expected payment per payment times the expected number of payments.

To repeat, you have two choices:

Expected payment per loss × Expected number of losses per year

OR

Expected payment per payment × Expected number of payments per year

Both of these formulas will result in the same answer. The one thing to avoid is mixing the two formulas. You may not use expected payment per payment times expected number of losses per year! In the first formula there is no modification to frequency. Therefore, it is usually easier to use the first formula for discrete severity distributions. In the second formula frequency must be modified. However, expected payment per payment is easier to calculate than expected payment per loss if severity is exponential, Pareto, or uniform, making the second formula preferable in those cases.


16. AGGREGATE LOSSES: SEVERITY MODIFICATIONS


Example 16A Annual frequency of losses follows a Poisson distribution with mean 0.4. Sizes of loss have the following discrete distribution:

Loss size    Probability
    5           0.4
   10           0.3
   20           0.2
   40           0.1

An insurance coverage has a per-loss ordinary deductible of 8. Calculate expected annual aggregate payments on this coverage.

Answer: The expected payment per loss after the deductible is (0.3)(10 − 8) + (0.2)(20 − 8) + (0.1)(40 − 8) = 6.2. The expected number of losses per year is 0.4. Hence expected annual aggregate payments are (0.4)(6.2) = 2.48.

This was the easier way to calculate it, but you could also use the other formula. Expected payment per payment, since only 0.6 of losses get any payment, is 6.2/0.6. Expected number of payments per year is (0.6)(0.4) = 0.24. Hence expected annual aggregate payments are 0.24(6.2/0.6) = 2.48.

Example 16B Annual number of losses follows a negative binomial distribution with parameters r = 2, β = 0.1. Sizes of individual losses follow a two-parameter Pareto distribution with α = 2, θ = 10. An insurance coverage has a per-claim ordinary deductible of 8. Calculate expected annual aggregate payments on this coverage.

Answer: The expected payment per payment is (θ + d)/(α − 1) = (10 + 8)/(2 − 1) = 18. The modified frequency distribution for the number of payments per year has mean rβ Pr(X > 8) = (2)(0.1)(10/18)². Expected annual aggregate payments are 18(0.2)(10/18)² = 10/9. It was easier to use this formula than the expected payment per loss times number of losses formula in this case.

Everything we said for expected values holds for variances as well. You may either use N (number of losses) in conjunction with Y^L, or N^P (number of payments) in conjunction with Y^P. Either way, you use the compound variance formula.
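The two routes in Example 16A can be sketched in code to confirm they agree; this is an illustrative check with names of my own choosing, not part of the manual's text.

```python
# Example 16A: Poisson frequency 0.4, discrete severity, ordinary deductible 8
freq_mean = 0.4
losses = {5: 0.4, 10: 0.3, 20: 0.2, 40: 0.1}
d = 8

# Route 1: unmodified frequency x expected payment per loss
epl = sum(max(x - d, 0) * p for x, p in losses.items())    # E[Y^L] = 6.2
route1 = freq_mean * epl                                    # 2.48

# Route 2: modified frequency x expected payment per payment
p_pay = sum(p for x, p in losses.items() if x > d)          # Pr(payment) = 0.6
route2 = (freq_mean * p_pay) * (epl / p_pay)                # 2.48
```

Mixing the routes (payment per payment times number of losses) would instead give 0.4 × 6.2/0.6 ≈ 4.13, which is wrong.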

Quiz 16-1 Annual frequency of losses follows a negative binomial distribution with parameters r = 1.5, β = 0.2. Individual loss sizes follow an exponential distribution with θ = 40. An insurance coverage has a per-loss ordinary deductible of 25. Calculate the variance of annual aggregate payments on this coverage.
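The quiz can be attacked with the compound variance formula on N and Y^L. The sketch below is my own, not the manual's worked solution; it relies on the memoryless property of the exponential, so that E[(Y^L)^k] is just Pr(X > d) times the corresponding exponential moment.

```python
from math import exp

r, beta, theta, d = 1.5, 0.2, 40.0, 25.0
en, varn = r * beta, r * beta * (1 + beta)    # E[N] = 0.3, Var(N) = 0.36
p = exp(-d / theta)                           # Pr(loss exceeds deductible)
e_yl = p * theta                              # E[Y^L]
e_yl2 = p * 2 * theta ** 2                    # E[(Y^L)^2]
var_yl = e_yl2 - e_yl ** 2
var_s = en * var_yl + varn * e_yl ** 2        # compound variance
```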

Exercises

16.1. A company insures a fleet of vehicles. Aggregate losses have a compound Poisson distribution. The expected annual number of losses is 10. Loss amounts, regardless of vehicle type, have a two-parameter Pareto distribution with parameters α = 1, θ = 200. Insurance is subject to a per-loss deductible of 100 and a per-loss maximum payment of 500. Calculate expected annual aggregate payments under this insurance.


EXERCISES FOR LESSON 16


16.2. [3-S01:26] A company insures a fleet of vehicles. Aggregate losses have a compound Poisson distribution. The expected number of losses is 20. Loss amounts, regardless of vehicle type, have an exponential distribution with θ = 200. In order to reduce the cost of the insurance, two modifications are to be made:

(i) A certain type of vehicle will not be insured. It is estimated that this will reduce loss frequency by 20%.
(ii) A deductible of 100 per loss will be imposed.

Calculate the expected aggregate amount paid by the insurer after the modifications.

(A) 1600    (B) 1940    (C) 2520    (D) 3200    (E) 3880

Use the following information for questions 16.3 and 16.4:

• Losses follow a Pareto distribution with parameters θ = 1000 and α = 2.
• 10 losses are expected each year.
• The number of losses and the individual loss amounts are independent.
• For each loss that occurs, the insurer's payment is equal to the entire amount of the loss if the loss is greater than 100. The insurer makes no payment if the loss is less than or equal to 100.

16.3. [4B-S95:22] (2 points) Determine the insurer's expected annual payments.

(A) Less than 8000
(B) At least 8000, but less than 9000
(C) At least 9000, but less than 9500
(D) At least 9500, but less than 9900
(E) At least 9900

16.4. [4B-S95:23] (2 points) Determine the insurer's expected number of annual payments if all loss amounts increased uniformly by 10%.

(A) Less than 7.9
(B) At least 7.9, but less than 8.1
(C) At least 8.1, but less than 8.3
(D) At least 8.3, but less than 8.5
(E) At least 8.5


16.5. [CAS3-S04:39] PQR Re provides reinsurance to Telecom Insurance Company. PQR agrees to pay Telecom for all losses resulting from "events", subject to a $500 per event deductible. For providing this coverage, PQR receives a premium of $250. Use a Poisson distribution with mean equal to 0.15 for the frequency of events. Event severity is from the following distribution:

Loss      Probability
  250        0.10
  500        0.25
  750        0.30
1,000        0.25
1,250        0.05
1,500        0.05

i = 0%

Using the normal approximation to PQR's annual aggregate losses on this contract, what is the probability that PQR will pay out more than it receives?

(A) Less than 12%
(B) At least 12%, but less than 13%
(C) At least 13%, but less than 14%
(D) At least 14%, but less than 15%
(E) At least 15%

Use the following information for questions 16.6 and 16.7:

Auto collision coverage is subject to a 100 deductible. Claims on this coverage occur at a Poisson rate of 0.3 per year. Claim size after the deductible has the following distribution:

Claim size    Probability
   100           0.4
   300           0.3
   500           0.2
   700           0.1

Claim frequency and severity are independent.

16.6. The deductible is raised to 200. Calculate the variance in claim frequency with the revised deductible.

16.7. The deductible is raised to 400. Calculate the variance in aggregate payments per year with the revised deductible.


16.8. [4B-F98:28] (2 points) You are given the following:

• Losses follow a lognormal distribution with parameters µ = 10 and σ = 1.
• One loss is expected each year.
• For each loss less than or equal to 50,000, the insurer makes no payment.
• For each loss greater than 50,000, the insurer pays the entire amount of the loss up to the policy limit of 100,000.

Determine the insurer's expected annual payments.

(A) Less than 7,500
(B) At least 7,500, but less than 12,500
(C) At least 12,500, but less than 17,500
(D) At least 17,500, but less than 22,500
(E) At least 22,500

Use the following information for questions 16.9 and 16.10:

An insurer has excess-of-loss reinsurance on auto insurance. You are given:

(i) Total expected losses in the year 2001 are 10,000,000.
(ii) In the year 2001 individual losses have a Pareto distribution with

F(x) = 1 − (2000/(x + 2000))²,  x > 0

(iii) Reinsurance will pay the excess of each loss over 3000.
(iv) Each year, the reinsurer is paid a ceded premium, C_year, equal to 110% of the expected losses covered by the reinsurance.
(v) Individual losses increase 5% each year due to inflation.
(vi) The frequency distribution does not change.

16.9. [3-F00:41] Calculate C2001.

(A) 2,200,000    (B) 3,300,000    (C) 4,400,000    (D) 5,500,000    (E) 6,600,000

16.10. [3-F00:42] Calculate C2002/C2001.

(A) 1.04    (B) 1.05    (C) 1.06    (D) 1.07    (E) 1.08

16.11. The number of losses per year on an insurance coverage follows a binomial distribution with m = 9, q = 0.1. The size of each loss is uniformly distributed on (0, 60]. The size of loss is independent of the number of losses. The insurance coverage has a per-loss ordinary deductible of 12. Calculate the variance of annual aggregate payments on the coverage.

16.12. Losses follow an exponential distribution with mean 1000. Insurance pays losses subject to a deductible of 500 and a maximum covered loss. The expected annual number of losses is 10. Determine the maximum covered loss needed so that expected aggregate annual payments equal 5000.


16.13. You are given the following:

(i) The underlying loss distribution for an individual claim amount is a single-parameter Pareto with α = 1 and θ = 500.
(ii) An insurance coverage has a deductible of 1,000 and a maximum covered loss of 10,000.
(iii) The expected number of losses per year for each policyholder is 3.
(iv) Annual claim counts and claim amounts are independent.

Calculate the expected annual claim payments for a single policyholder on this coverage.

Use the following information for questions 16.14 and 16.15:

You are given the following:

• Loss sizes for Risk 1 follow a Pareto distribution with parameters θ and α, α > 2.
• Loss sizes for Risk 2 follow a Pareto distribution with parameters θ and 0.8α, α > 2.
• The insurer pays all losses in excess of a deductible of k.
• 1 loss is expected for each risk each year.

16.14. [4B-F97:22] (2 points) Determine the expected amount of annual losses paid by the insurer for Risk 1.

(A) (θ + k)/(α − 1)
(B) θ^α/(θ + k)^α
(C) αθ^α/(θ + k)^(α+1)
(D) θ^(α+1)/((α − 1)(θ + k)^α)
(E) θ^α/((α − 1)(θ + k)^(α−1))

16.15. [4B-F97:23] (1 point) Determine the limit of the ratio of the expected amount of annual losses paid by the insurer for Risk 2 to the expected amount of annual losses paid by the insurer for Risk 1 as k goes to infinity.

(A) 0    (B) 0.8    (C) 1    (D) 1.25    (E) ∞


16.16. [CAS3-S04:19] A company has a machine that occasionally breaks down. An insurer offers a warranty for this machine. The number of breakdowns and their costs are independent. The number of breakdowns each year is given by the following distribution:

# of breakdowns    Probability
      0               50%
      1               20%
      2               20%
      3               10%

The cost of each breakdown is given by the following distribution:

Cost      Probability
1,000        50%
2,000        10%
3,000        10%
5,000        30%

To reduce costs, the insurer imposes a per claim deductible of 1,000. Compute the standard deviation of the insurer's losses for this year.

(A) 1,359    (B) 2,280    (C) 2,919    (D) 3,092    (E) 3,434

16.17. The annual number of losses on an insurance coverage has the following distribution:

Number of losses    Probability
       0               0.45
       1               0.25
       2               0.20
       3               0.10

Size of loss follows a two-parameter Pareto distribution with α = 4, θ = 100. The insurance coverage has a deductible of 80. Calculate the variance of aggregate annual payments on the coverage.

16.18. Annual claim counts follow a geometric distribution with β = 0.2. Loss sizes follow a uniform distribution on [0, 1000]. Losses are independent of each other and are independent of claim counts. A per-claim deductible of 200 is imposed. Calculate the raw second moment of annual aggregate payments.


16.19. [3-F01:6] A group dental policy has a negative binomial claim count distribution with mean 300 and variance 800. Ground-up severity is given by the following table:

Severity    Probability
   40          0.25
   80          0.25
  120          0.25
  200          0.25

You expect severity to increase 50% with no change in frequency. You decide to impose a per claim deductible of 100. Calculate the expected total claim payment after these changes.

(A) Less than 18,000
(B) At least 18,000, but less than 20,000
(C) At least 20,000, but less than 22,000
(D) At least 22,000, but less than 24,000
(E) At least 24,000

16.20. [SOA3-F04:17] The annual number of losses follows a Poisson distribution with a mean of 5. The size of each loss follows a two-parameter Pareto distribution with θ = 10 and α = 2.5. Claim counts and sizes are independent. An insurance for the losses has an ordinary deductible of 5 per loss. Calculate the expected value of the aggregate annual payments for this insurance.

(A) 8    (B) 13    (C) 18    (D) 23    (E) 28

16.21. On an auto collision coverage, the number of losses per year follows a Poisson distribution with mean 0.25. Loss size is exponentially distributed with mean 1200. An ordinary deductible of 500 is applied to each loss. Loss sizes and loss counts are independent. Calculate the probability that aggregate claim payments for a year will be greater than 100, using the normal approximation.

16.22. On an auto collision coverage, you are given:

(i) The number of claims per year per individual has the following distribution:

f(0) = 0.7,  f(1) = 0.2,  f(2) = 0.1

(ii) Loss sizes are exponentially distributed with mean 1200.
(iii) Loss sizes and claim counts are independent.

An ordinary deductible of 500 is applied to each loss. Calculate the probability that aggregate claim payments for a year will be greater than 100, using the normal approximation.


Use the following information for questions 16.23 and 16.24:

On an insurance coverage, loss size has the following distribution:

F(x) = 1 − (2000/x)³,  x ≥ 2000

The number of claims has a negative binomial distribution with mean 0.3, variance 0.6. Claim counts and loss sizes are independent.

16.23. A deductible of 2000 is applied to each claim. Calculate the variance of aggregate payments.

16.24. A deductible of 3000 is applied to each claim. Calculate the variance of aggregate payments.

16.25. For an insurance coverage, you are given:

• Losses in 2009, before application of any deductible or limit, have a distribution with density function

f_X(x) = 1/3000 for 0 < x ≤ 1000,  f_X(x) = e^(−(x−1000)/2000)/3000 for x ≥ 1000

• Losses in 2010, before application of any deductible or limit, are impacted by 10% uniform inflation.
• Insurance coverage is subject to a deductible of 500.
• Loss frequency is the same in both years.

You may use the following:

∫₀^∞ e^(−x/β) dx = β
∫₀^∞ x e^(−x/β) dx = β²

Determine the percentage increase in aggregate claim payments in 2010 over 2009.

Additional released exam questions: CAS3-S05:6

Solutions

16.1. Expected payment per loss is

E[X ∧ 600] − E[X ∧ 100] = −200 ln(200/800) + 200 ln(200/300) = 196.166

Multiplying by 10 losses per year, the answer is 1961.66. You could also calculate expected payment per payment and modify frequency, but that is a harder way.


16.2. The first change will reduce frequency to 16. The second change will multiply frequency by S(100) = e^(−100/200) = 0.606531. For an exponential, expected payment per payment with the deductible—the mean excess loss—is still θ = 200. So expected aggregate losses are expected payment per payment times number of payments, or 16(0.606531)(200) = 1940.90. (B)

You could also calculate expected payment per loss and multiply that by the unmodified frequency of 20, but that is harder.

16.3. There are two ways to calculate annual payments:

• Multiply expected number of losses times expected payment per loss.
• Multiply expected number of payments times expected payment per payment.

In this question, there is an easy formula for mean excess loss for a Pareto, equation (6.10), so the second method is preferable. The mean excess loss at 100 is (θ + d)/(α − 1) = 1100/1 = 1100. The expected payment per payment for a franchise deductible of 100 is therefore 1100 + 100 = 1200. The expected number of payments per year is

10 Pr(X > 100) = 10(1 − F(100)) = 10(1000/1100)²

Therefore, expected annual payments are 10(1000/1100)²(1200) = 9917.36. (E)

16.4. The inflated θ for the Pareto is 1000(1.1) = 1100. The expected number of annual payments is

10 Pr(X > 100) = 10(1100/(1100 + 100))² = 8.403  (D)

16.5. Mean loss size and second moment are, after the $500 deductible:

E[X] = 0.3(250) + 0.25(500) + 0.05(750) + 0.05(1000) = 287.5
E[X²] = 0.3(250²) + 0.25(500²) + 0.05(750²) + 0.05(1000²) = 159,375

Expected losses are 0.15(287.5) = 43.125, and the variance is 0.15(159,375) = 23,906.25. We need the probability of paying out more than 250. Since the aggregate distribution is discrete, this is the same as the probability of paying out at least 500, and we need to make a continuity correction. We'll calculate the probability of paying out more than 375, the midpoint of (250, 500).

1 − Φ((375 − 43.125)/√23,906.25) = 1 − Φ(2.15) = 1 − 0.9842 = 0.0158  (A)

The answer is way out of the range of the choices. Even without the continuity correction it would be far out of the range. We'll never know what the CAS was thinking about.

16.6. The probability of paying a claim is 0.6 of what it was before, so the new Poisson rate is (0.6)(0.3) = 0.18.

16.7. The deductible of 100 was increased to 400, so all claims are 300 less. Thus the revised payment per payment distribution is a 2/3 probability of paying 200 and a 1/3 probability of paying 400. The revised Poisson rate is (0.3)(0.3) = 0.09. We calculate the variance using the compound Poisson formula Var(S) = λ^P E[(Y^P)²], where S is aggregate payments and Y^P is individual payments.

E[(Y^P)²] = (2/3)(200²) + (1/3)(400²) = 80,000


Var ( S )  0.09 (80,000)  7200 Alternatively, and easier, you could let the compound model be number of losses and payment per loss. Then the Poisson rate is 0.3 and the payment per loss has a 0.7 probability of 0, 0.2 probability of 200, and 0.1 probability of 400. Let Y L be individual payments. E[ ( Y L ) 2 ]  0.2 (2002 ) + 0.1 (4002 )  24,000 Var ( S )  0.3 (24,000)  7200 16.8.

We want the payment per loss for a coverage with a franchise deductible.

Answer = E[X ∧ 100,000] − E[X ∧ 50,000] + 50,000(1 − F(50,000))

The last term is added in to cover the franchise deductible. As the following development shows, it cancels one of the terms in E[X ∧ 50,000].

E[X ∧ 100,000] = e^10.5 Φ(ln 100,000 − 11) + 100,000(1 − Φ(ln 100,000 − 10))
E[X ∧ 50,000] = e^10.5 Φ(ln 50,000 − 11) + 50,000(1 − Φ(ln 50,000 − 10))
50,000(1 − F(50,000)) = 50,000(1 − Φ(ln 50,000 − 10))

Answer = e^10.5 Φ(ln 100,000 − 11) + 100,000(1 − Φ(ln 100,000 − 10)) − e^10.5 Φ(ln 50,000 − 11)
       = e^10.5 Φ(0.51) + 100,000(1 − Φ(1.51)) − e^10.5 Φ(−0.18)
       = 36,316(0.6950) + 100,000(1 − 0.9345) − 36,316(0.4286)
       = 25,239 + 6,550 − 15,565 = 16,224  (C)

16.9. Each individual loss has a Pareto distribution with parameters α = 2 and θ = 2000, so the mean individual loss is

E[X] = θ/(α − 1) = 2000

Since expected (aggregate) losses are 10,000,000, this means the expected number of losses is 10,000,000/2000 = 5000. Frequency doesn't change, so the expected number of losses in 2001 and 2002 is also 5000. We need to calculate the expected reinsurance payment per loss, not per payment, since we know the expected number of losses, not the expected number of reinsurance payments. We need E[(X − 3000)+]. This can be calculated by using

E[(X − 3000)+] = E[X] − E[X ∧ 3000]

using the formula in the Loss Models appendix for E[X ∧ 3000], or using the special formula for the mean excess loss of a Pareto, equation (6.10) on page 100, in conjunction with formula (6.7). Let's do it the latter way, since these two formulas are easy to remember and use.

e(3000) = (θ + 3000)/(α − 1) = 5000
S(3000) = (2000/(2000 + 3000))² = 0.16
E[(X − 3000)+] = e(3000) S(3000) = 800

Therefore, C2001 = 5000(800)(1.1) = 4,400,000. (C)


16.10. The new Pareto parameters for individual loss size are α = 2, θ = 2100. Proceeding as in the previous exercise,

e(3000) = 5100
S(3000) = (2100/5100)²
E[(X − 3000)+] = 2100²/5100 = 864.7059
C2002/C2001 = 864.7059/800 = 1.080882  (E)

Note that the other factors (5000, 1.1) are the same in both years, and therefore don't affect the ratio.

16.11. Payment per payment is uniform on (0, 48]. The number of payments per year follows a binomial distribution with m = 9 and q = 0.1 Pr(X > 12) = 0.1(0.8) = 0.08. Using the compound variance formula for payments:

E[N^P] = mq = 9(0.08) = 0.72
Var(N^P) = mq(1 − q) = 9(0.08)(0.92) = 0.6624
E[X^P] = 48/2 = 24
Var(X^P) = 48²/12 = 192
Var(S) = (0.72)(192) + 0.6624(24²) = 519.7824

16.12. The average payment per loss is

E[X ∧ u] − E[X ∧ 500] = 1000((1 − e^(−u/1000)) − (1 − e^(−500/1000))) = 1000(e^(−1/2) − e^(−u/1000))

and we set this final expression equal to 500 so that with 10 expected losses it will equal 5000.

1000(e^(−1/2) − e^(−u/1000)) = 500
e^(−u/1000) = e^(−1/2) − 1/2 = 0.10653
u = 2239.32

16.13. The formulas in the tables don't work for a single-parameter Pareto with α = 1, so we work this out from first principles by integration. We integrate the survival function from 1000 to 10,000.

∫ (500/x) dx from 1000 to 10,000 = 500(ln 10,000 − ln 1000) = 1151.29

With 3 expected losses per year, expected annual claim payments are 3(1151.29) = 3453.87.

16.14. Expected annual losses are the expected number of losses per year (1) times expected payment per loss. Thus we want E[X] − E[X ∧ k], which from the tables is

θ/(α − 1) − [θ/(α − 1)](1 − (θ/(θ + k))^(α−1)) = [θ/(α − 1)](θ/(θ + k))^(α−1) = θ^α/((α − 1)(θ + k)^(α−1))  (E)


16.15. Using the previous exercise's solution, the ratio is

(α − 1) θ^(0.8α) (θ + k)^(α−1) / [θ^α (0.8α − 1) (θ + k)^(0.8α−1)] ∼ (θ + k)^(0.2α) → ∞  (E)

16.16. To make the numbers more manageable, we'll express cost in thousands. Let N be the number of breakdowns, X the cost per breakdown, S aggregate losses.

E[N] = 0.2(1) + 0.2(2) + 0.1(3) = 0.9
E[N²] = 0.2(1²) + 0.2(2²) + 0.1(3²) = 1.9
Var(N) = 1.9 − 0.9² = 1.09

The deductible makes each cost 1 (thousand) less.

E[X] = 0.1(1) + 0.1(2) + 0.3(4) = 1.5
E[X²] = 0.1(1²) + 0.1(2²) + 0.3(4²) = 5.3
Var(X) = 5.3 − 1.5² = 3.05

Var(S) = 0.9(3.05) + 1.09(1.5²) = 5.1975
σ_S = √5.1975 = 2.2798

The standard deviation is 1000 times 2.2798, or 2,280. (B)

16.17. Frequency is not in the (a, b, 0) or (a, b, 1) families, so it cannot be modified. We will use a compound model with number of losses and payment per loss. Frequency of losses has mean 0.25 + 2(0.2) + 3(0.1) = 0.95 and second moment 0.25 + 4(0.2) + 9(0.1) = 1.95, hence variance 1.95 − 0.95² = 1.0475. The modified distribution for payment per payment is a Pareto with α = 4, θ = 100 + 80 = 180, as we learned on page 100. Its mean is 180/3 = 60 and its variance is 2(180²)/((3)(2)) − 60² = 7200. The payment per loss is a two-component mixture: with probability (100/180)⁴ it is the Pareto payment, and with probability 1 − (100/180)⁴ it is 0. Let's calculate the mean and variance of this two-component mixture random variable Y^L, with I being the component of the mixture.

E[Y^L] = 60(100/180)⁴ = 5.715592
Var(Y^L) = E[Var(Y^L | I)] + Var(E[Y^L | I])
        = (100/180)⁴(7200) + (100/180)⁴(1 − (100/180)⁴)(60²)
        = 996.1386

Using the compound variance formula on our compound model,

Var(S) = (0.95)(996.1386) + (1.0475)(5.715592²) = 980.55

16.18. We'll calculate the raw second moment as the variance plus the mean squared. The payment per payment is uniform on (0, 800]. The probability of a payment is 0.8, so the modified claim count distribution is geometric with β = 0.2(0.8) = 0.16. Then

E[S] = 0.16(400) = 64
Var(S) = 0.16(800²/12) + (0.16)(1.16)(400²) = 38,229 1/3
E[S²] = 38,229 1/3 + 64² = 42,325 1/3

16.19. The new expectation for severity, the modified payment per loss, is

0.25(1.5(80) − 100) + 0.25(1.5(120) − 100) + 0.25(1.5(200) − 100) = 75













Then 75(300) = 22,500. (D)

16.20. The easier method for solving this is to calculate the expected number of payments—the expected number of claims above the deductible—and multiply by the expected payment per payment, e(d), which is easy to calculate for a Pareto. The expected number of claims above the deductible is 5(1 − F(5)), and

e(d) = (θ + d)/(α − 1) = 15/1.5 = 10,

by equation (6.10) on page 100. Putting it together, the answer is

5(10/15)^2.5 (10) = 18.1444  (C)

The alternative is to calculate the expected number of losses and multiply by the expected payment per loss.

16.21. Using the ideas in Lesson 13, we modify the Poisson parameter to reflect the frequency of claim payments. For each loss, a payment will be made only if it is over 500, i.e. with probability S(500) = e^(−5/12). Hence claim payment frequency has a Poisson distribution with parameter 0.25e^(−5/12). The size of the payment, given a payment is made, is exponential with mean 1200, since the exponential distribution is memoryless. Letting N be the frequency of payments and X the claim payment distribution, we calculate the moments of aggregate payments S the usual way.

E[S] = 0.25e^(−5/12)(1200) = 197.77
Var(S) = 0.25e^(−5/12)(1200²) + 0.25e^(−5/12)(1200²) = 474,653.25
Pr(S > 100) = Φ((197.77 − 100)/√474,653.25) = Φ(0.14) = 0.5557

16.22. Let N^P be the number of payments. This exercise is like the last exercise, except that the modified distribution of payment frequency is harder to calculate. There are two ways of making one payment:

1. One loss greater than 500; the probability of this is Pr(N = 1)(1 − F(500)) = 0.2e^(−5/12) = 0.13185.

2. Two losses, one greater than 500 and one less, which can happen in either order; the probability of this is 2 Pr(N = 2) F(500)(1 − F(500)) = 2(0.1)e^(−5/12)(1 − e^(−5/12)) = 0.04493.


The probability of one payment is therefore 0.13185 + 0.04493 = 0.17678. The probability of two payments is the probability of two losses times S(500)^2, or 0.1e^(−10/12) = 0.04346.

E[N^P] = 1(0.17678) + 2(0.04346) = 0.26370
E[(N^P)^2] = 1(0.17678) + 4(0.04346) = 0.35062
Var(N^P) = 0.35062 − 0.26370^2 = 0.28108

Since the exponential is memoryless, the distribution of payments given that a payment is made is exponential with mean 1200.

E[S] = 0.26370(1200) = 316.44
Var(S) = 0.26370(1200^2) + 0.28108(1200^2) = 784,478
Pr(S > 100) = Φ((316.44 − 100)/√784,478) = Φ(0.24) = 0.5948

The alternative is to calculate the number of losses and the payment per loss. Let N be the number of losses. Then E[N] = 0.4, E[N^2] = 0.2 + 0.1(2^2) = 0.6, and Var(N) = 0.6 − 0.4^2 = 0.44. The payment per loss distribution is a two-point mixture of 0 and an exponential with mean 1200, with weight S(500) on the latter. Let Y^L be the payment per loss random variable.

E[Y^L] = 1200e^(−5/12)
Var(Y^L) = e^(−5/12)(1200^2) + e^(−5/12)(1 − e^(−5/12))(1200^2) = 1,272,792

where Var(Y^L) was computed using the compound variance formula with a Bernoulli primary with q = e^(−5/12) and an exponential secondary with θ = 1200. Then

E[S] = 0.4(1200e^(−5/12)) = 316.44
Var(S) = 0.4(1,272,792) + 0.44(1200e^(−5/12))^2 = 784,478

as before.

16.23. Severity has a single-parameter Pareto distribution with α = 3, θ = 2000. The deductible of 2000 doesn't affect the frequency of claims since all losses are greater than 2000. Therefore, the variance of payment size is unaffected, and the expected value of payment size is reduced by the deductible of 2000. Let X be the loss and Y^P the payment size.

E[X] = (3)(2000)/2 = 3000
E[X^2] = (3)(2000^2)/1 = 12(10^6)

Var(Y^P) = Var(X) = 12(10^6) − 3000^2 = 3(10^6)
E[Y^P] = E[X] − 2000 = 1000

Var(S) = 0.3(3(10^6)) + 0.6(10^6) = 1,500,000
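As a numerical cross-check of 16.23 (a sketch of mine, not part of the manual), the single-parameter Pareto moments and the compound variance formula give the same result:

```python
# Single-parameter Pareto severity with a deductible equal to theta:
# frequency is unchanged, the payment is X - 2000, so the payment
# variance equals Var(X) and its mean is E[X] - 2000.
alpha, theta = 3, 2000
EX = alpha * theta / (alpha - 1)          # 3000
EX2 = alpha * theta ** 2 / (alpha - 2)    # 12,000,000
var_Y = EX2 - EX ** 2                     # 3,000,000
EY = EX - 2000                            # 1000
var_S = 0.3 * var_Y + 0.6 * EY ** 2       # E[N] Var(Y) + Var(N) E[Y]^2
print(var_S)  # 1500000.0
```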





16.24. This exercise is harder than the previous one, since the deductible now affects claim frequency. There are two ways we can do the exercise:

1. We can let N^P be the number of payments and Y^P the payment per payment, or
2. We can let N be the number of losses and Y^L the payment per loss.

Both methods require work. I think the first method is easier, but will demonstrate both ways.

First method: The negative binomial has rβ = 0.3 and rβ(1 + β) = 0.6, so β = 1, r = 0.3. The probability of a loss above 3000 is

Pr(X > 3000) = (θ/3000)^α = (2000/3000)^3 = 8/27

By Table 13.1, the modified negative binomial has r = 0.3, β = 8/27, so its moments are

E[N^P] = 2.4/27 = 0.088889
Var(N^P) = 0.3(8)(35)/27^2 = 0.115226

As we mentioned on page 100, Y^P is a two-parameter Pareto with modified parameters θ = 3000 and α = 3. Using the tables to calculate its mean and variance:

E[Y^P] = θ/(α − 1) = 3000/(3 − 1) = 1500
E[(Y^P)^2] = 2θ^2/((α − 1)(α − 2)) = 2(3000^2)/2 = 3000^2
Var(Y^P) = 3000^2 − 1500^2 = 6,750,000

The variance of aggregate payments is

Var(S) = E[N^P] Var(Y^P) + Var(N^P) E[Y^P]^2 = (0.088889)(6,750,000) + (0.115226)(1500^2) = 859,259

Second method: We computed the mean and variance of Y^P in the first method. Therefore,

E[Y^L] = E[Y^P] Pr(X > 3000) = (8/27)(1500) = 444.444
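The first method of 16.24 can be verified numerically; the following is my own sketch (variable names are mine):

```python
# Thin the negative binomial by Pr(X > 3000) = (2000/3000)^3 = 8/27,
# then combine with the Pareto(alpha = 3, theta = 3000) payment moments.
r, beta = 0.3, 1.0
p = (2000 / 3000) ** 3
beta_mod = beta * p                       # modified beta = 8/27
EN = r * beta_mod
VarN = r * beta_mod * (1 + beta_mod)
EY = 3000 / (3 - 1)                       # 1500
EY2 = 2 * 3000 ** 2 / ((3 - 1) * (3 - 2))
VarY = EY2 - EY ** 2                      # 6,750,000
var_S = EN * VarY + VarN * EY ** 2
print(round(var_S))  # 859259
```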

The variance is computed by treating Y^L as a compound distribution. The primary distribution is Bernoulli with q = Pr(X > 3000) = 8/27 and the secondary distribution is Y^P.

Var(Y^L) = (8/27)(6,750,000) + (8/27)(19/27)(1500^2) = 2,469,136

The variance of aggregate payments is

Var(S) = 0.3(2,469,136) + 0.6(444.444^2) = 859,259

16.25. Let X be the loss random variable in 2009, and Y the loss random variable in 2010. One way to compute the expected values is as follows:

E[X] = ∫_0^1000 x dx/3000 + ∫_1000^∞ x e^(−(x−1000)/2000) dx/3000

The first integral is

∫_0^1000 x dx/3000 = x^2/6000 |_0^1000 = 1000/6

Substituting y = x − 1000 in the second integral,

∫_1000^∞ x e^(−(x−1000)/2000) dx/3000 = (1/3000) ∫_0^∞ (y + 1000) e^(−y/2000) dy = (2000^2 + 1000(2000))/3000 = 2000

So E[X] = 1000/6 + 2000 = 2166 2/3. Now let's compute E[X ∧ 500]:

E[X ∧ 500] = ∫_0^500 x dx/3000 + 500(1 − F(500)) = 500^2/6000 + 500(5/6) = 458 1/3

So the average payment per loss is E[X] − E[X ∧ 500] = 2166 2/3 − 458 1/3 = 1708 1/3.

The number of losses doesn't change with inflation, so if we calculate the average payment per loss after inflation, we're done. But

E[Y] = E[1.1X] = 1.1 E[X] = 2383.333
E[Y ∧ 500] = E[1.1X ∧ 500] = 1.1 E[X ∧ 500/1.1]

Since 500/1.1 < 1000, the first definition of f(x) applies, and

F(500/1.1) = ∫_0^(500/1.1) dx/3000 = 500/(1.1(3000)) = 1/6.6

E[X ∧ 500/1.1] = (500/1.1)^2/6000 + (500/1.1)(1 − 1/6.6) = 34.4353 + (500/1.1)(5.6/6.6) = 420.1102

E[Y ∧ 500] = 1.1(420.1102) = 462.1212

So the average payment per loss is 2383.333 − 462.1212 = 1921.212. The percentage increase is

1921.212/1708.333 − 1 = 12.46%
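The inflation computation in 16.25 can be checked with a short script of mine; the density used below is the one implied by the integrals in the solution (uniform up to 1000, shifted exponential above).

```python
# f(x) = 1/3000 on [0, 1000]; shifted exponential (mean 2000) above 1000.
# For d <= 1000, E[X ^ d] = d^2/6000 + d(1 - d/3000).
def lim_ev(d):
    return d ** 2 / 6000 + d * (1 - d / 3000)

EX = 1000 / 6 + 2000
per_loss_2009 = EX - lim_ev(500)
per_loss_2010 = 1.1 * EX - 1.1 * lim_ev(500 / 1.1)
print(round(per_loss_2010 / per_loss_2009 - 1, 4))  # 0.1246
```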

Quiz Solutions

16-1. Because of the exponential, it is easier to use the compound variance formula on the payment distribution rather than on the loss distribution. The modified negative binomial has parameters r = 1.5 and β = 0.2 Pr(X > 25) = 0.2e^(−25/40) = 0.107052. Payments are exponential with mean 40. Then the variance of annual aggregate payments is (using N^P for the modified frequency and Y^P for the modified severity)

E[N^P] Var(Y^P) + Var(N^P) E[Y^P]^2 = 1.5(0.107052)(40^2) + (1.5)(0.107052)(1.107052)(40^2) = 541.36
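A numerical check of quiz solution 16-1 (a sketch of mine):

```python
# Thin the negative binomial by Pr(X > 25) = 0.2 e^(-25/40); the payment
# per payment is exponential with mean 40 (memoryless).
import math

r = 1.5
beta = 0.2 * math.exp(-25 / 40)           # modified beta = 0.107052
EN, VarN = r * beta, r * beta * (1 + beta)
var_S = EN * 40 ** 2 + VarN * 40 ** 2
print(round(var_S, 2))  # 541.36
```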


Lesson 17

Aggregate Loss Models: The Recursive Formula

Reading: Loss Models Fourth Edition 9.6 (before Section 9.6.1), 9.6.2–9.6.4 (skip 9.6.1)

In order to help us calculate the aggregate loss distribution, we will assume the distribution of the severities, the X_i's, is discrete. This may sound strange, since usually we assume severities are continuous. However, a continuous distribution may be approximated by a discrete distribution. We will discuss how to do that in Section 19.2.

Once we make the assumption that severities are discrete, aggregate losses are also discrete. To obtain the distribution of aggregate losses, we only need to calculate Pr(S = n) for every possible value n of the aggregate loss distribution. Using the law of total probability,

Pr(S = n) = Σ_k Pr(N = k) Pr(X_1 + ··· + X_k = n)    (*)

In principle, this expression can be calculated, although for high values of n it may require an inordinate amount of computing power.

We will use the following notation for the probability functions of the three distributions—frequency, severity, and aggregate loss:

p_n = Pr(N = n) = f_N(n)
f_n = Pr(X = n) = f_X(n)
g_n = Pr(S = n) = f_S(n)

To calculate g_n, we must sum up over all possibilities of k claims summing up to n. In other words, we have the following version of (*):

g_n = Σ_{k=0}^∞ p_k Σ_{i_1 + ··· + i_k = n} f_{i_1} f_{i_2} ··· f_{i_k}    (**)
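Formula (**) translates directly into code. The sketch below is mine (the helper name aggregate_pmf is not from the manual); it truncates the outer sum at len(p) − 1 claims, which is exact when the frequency distribution has finite support and only an approximation otherwise.

```python
def aggregate_pmf(p, f, n_max):
    """g[n] = Pr(S = n) for n = 0..n_max, given p[k] = Pr(N = k) and
    f[x] = Pr(X = x), by mixing the k-fold convolutions f^{*k}."""
    g = [0.0] * (n_max + 1)
    conv = [1.0] + [0.0] * n_max          # f^{*0}: point mass at 0
    for k in range(len(p)):
        for n in range(n_max + 1):
            g[n] += p[k] * conv[n]
        # one more convolution with f for the next value of k
        conv = [sum(conv[n - x] * f[x] for x in range(min(n, len(f) - 1) + 1))
                for n in range(n_max + 1)]
    return g

# Binomial(3, 0.2) counts with severity pmf (0.5, 0.35, 0.15), the data
# of Example 17B below:
g = aggregate_pmf([0.512, 0.384, 0.096, 0.008], [0.5, 0.35, 0.15], 2)
print([round(x, 5) for x in g])  # [0.729, 0.1701, 0.08613]
```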

The product of the f_{i_t}'s is called the k-fold convolution of the f's, or f^{*k}. If the probability of a claim size of 0, f_0, is zero, then the outer sum is finite; but if f_0 ≠ 0, the outer sum is an infinite sum, since any number of zeroes can be included in the inner sum. However, if the primary distribution is in the (a, b, 0) class, we showed in Lesson 13 how the primary distribution can be modified so that the probability of X = 0 can be removed.

On exams, you will only need to calculate Pr(S = n) for very low values of n. You will be able to use (*) or (**). However, for the (a, b, 0) and (a, b, 1) classes, there is a recursive formula for calculating g_n that is more efficient than (**) for higher values of n and automatically takes care of the probability that X = 0. For the (a, b, 0) class, the formula is given in Corollary 9.9 of the textbook:

g_k = (1/(1 − a f_0)) Σ_{j=1}^k (a + bj/k) f_j g_{k−j},    k = 1, 2, 3, . . .

For a Poisson distribution, where a = 0 and b = λ, the formula simplifies to

g_k = (λ/k) Σ_{j=1}^k j f_j g_{k−j},    k = 1, 2, 3, . . .

For the (a, b, 1) class, Theorem 9.8 of the textbook provides the following formula:

g_k = ( [p_1 − (a + b)p_0] f_k + Σ_{j=1}^k (a + bj/k) f_j g_{k−j} ) / (1 − a f_0),    k = 1, 2, 3, . . .

To start the recursion, we need g_0. If Pr(X = 0) = 0, this is p_0. In the general case, we can use the formula of Theorem 6.14, g_0 = P_N(f_0), where P_N(z) is the probability generating function of the primary distribution. Formulas for P_N(z) for the (a, b, 0) and (a, b, 1) classes are included in the tables you get with the exam. However, Theorem 6.14 is not on the syllabus.

Should you memorize the recursive formula? The recursive formula is more efficient than convolution for calculating g_n for large n, but for n ≤ 3 there's hardly a difference, and exam questions are limited to this range. Pre-2000 syllabus exams required knowledge of the formula for a Poisson primary; there were frequent problems requiring you to back out f_k's or other numbers using the formula. Exams since 2000, however, have not asked any question requiring it. You could solve any question on these exams through convolution, possibly eliminating f_0 using the techniques of Lesson 13, as we will illustrate in Example 17B. It is also noteworthy that the formula for g_0, as well as the p_n, f_n, g_n notation, is off the syllabus. Therefore it is not worthwhile memorizing the recursive formula. You will never use it. The following example is worked out both ways.

Example 17A  For an insurance coverage, the number of claims has a negative binomial distribution with mean 4 and variance 8. Claim size is distributed as follows:

    Claim Size    Probability
    1             0.50
    2             0.40
    3             0.10

Calculate Pr(S ≤ 3).

Answer: First we will use the convolution method. We can have from 0 to 3 claims. The negative binomial has parameters β = 1 and r = 4. We have:

p_0 = (1/2)^4 = 1/16
p_1 = C(4,1)(1/2)^5 = 1/8
p_2 = C(5,2)(1/2)^6 = 5/32
p_3 = C(6,3)(1/2)^7 = 5/32

Then g_0 = p_0 = 1/16 = 0.0625. For S to equal 1, there must be one claim of size 1, so g_1 = p_1 f_1 = (1/8)(0.5) = 1/16 = 0.0625. For S to equal 2, there must be one claim of size 2 or two claims of size 1, so g_2 = (5/32)(0.5^2) + (1/8)(0.4) = 0.0890625. For S to equal 3, there must be one claim of size 3, or two claims of sizes 2 and 1 which can happen in two orders, or three claims of size 1. Then

g_3 = (1/8)(0.1) + (5/32)(0.4)(0.5)(2) + (5/32)(0.5^3) = 0.09453125

C/4 Study Manual—17th edition Copyright ©2014 ASM

Finally, Pr(S ≤ 3) = g_0 + g_1 + g_2 + g_3 = 0.0625 + 0.0625 + 0.0890625 + 0.09453125 = 0.30859375.

Next we redo the example using the recursive method. g_0 = p_0 = 1/16. f_0 = 0, so the fraction in front of the sum in the formula is 1. a = β/(1 + β) = 1/2 and b = (r − 1)a = 3/2.

g_1 = (1/2 + 3/2)(0.5)(1/16) = 1/16 = 0.0625
g_2 = (1/2 + 3/4)(0.5)(1/16) + (1/2 + 3/2)(0.4)(1/16) = 0.0890625
g_3 = (1/2 + 1/2)(0.5)(0.0890625) + (1/2 + 1)(0.4)(1/16) + (1/2 + 3/2)(0.1)(1/16) = 0.09453125

Finally, Pr(S ≤ 3) = g_0 + g_1 + g_2 + g_3 = 0.0625 + 0.0625 + 0.0890625 + 0.09453125 = 0.30859375.
We can use the methods of Lesson 13 to work out questions regarding the aggregate claim distribution without the recursive formula if Pr(X = 0) ≠ 0. We modify the frequency distribution and the severity distribution so that Pr(X = 0) = 0.

Example 17B  The number of claims on an insurance coverage follows a binomial distribution with parameters m = 3, q = 0.2. The size of each claim has the following distribution:

    x    Pr(X = x)
    0    0.50
    1    0.35
    2    0.15

Calculate the probability of aggregate claims of 3 or more.

Answer: The probability of aggregate claims of 3 or more is 1 − Pr(S ≤ 2), so we will calculate Pr(S ≤ 2). First we will calculate it using convolutions. To eliminate the probability of 0, we revise the binomial distribution by multiplying q by 1 − 0.5. The modified binomial has parameters m′ = 3, q′ = (0.2)(1 − 0.5) = 0.1. The revised severity distribution has f_1 = 0.35/(1 − 0.5) = 0.7 and f_2 = 0.15/(1 − 0.5) = 0.3. We now compute the p_j's.

p_0 = 0.9^3 = 0.729
p_1 = C(3,1)(0.9^2)(0.1) = 0.243
p_2 = C(3,2)(0.9)(0.1^2) = 0.027

Then

g_0 = p_0 = 0.729
g_1 = (0.243)(0.7) = 0.1701
g_2 = (0.243)(0.3) + (0.027)(0.7^2) = 0.08613


So Pr(S ≤ 2) = 0.729 + 0.1701 + 0.08613 = 0.98523 and the answer is 1 − 0.98523 = 0.01477.

In this particular case, since the binomial distribution has only a finite number of possibilities with nonzero probability, we can calculate the probability as a finite sum even without modifying the frequency distribution. For the unmodified distribution

p_0 = 0.8^3 = 0.512
p_1 = C(3,1)(0.8^2)(0.2) = 0.384
p_2 = C(3,2)(0.8)(0.2^2) = 0.096
p_3 = 0.2^3 = 0.008

The probability of aggregate claims of 0 is the probability of no claims, or one 0, or two 0's, or three 0's:

g_0 = 0.512 + 0.384(0.5) + 0.096(0.5^2) + 0.008(0.5^3) = 0.729

The probability of aggregate claims of 1 is the probability of one 1, or two claims of 1 and 0 (in two possible orders), or three claims of 1, 0, 0 (in three possible orders):

g_1 = 0.384(0.35) + 0.096(2)(0.5)(0.35) + 0.008(3)(0.5^2)(0.35) = 0.1701

The probability of aggregate claims of 2 is the probability of one 2, or two claims of 2 and 0 (in two possible orders) or 1 and 1, or three claims of 1, 1, 0 (three orders) or 2, 0, 0 (three orders):

g_2 = 0.384(0.15) + 0.096((2)(0.5)(0.15) + 0.35^2) + 0.008(3)((0.35^2)(0.5) + (0.15)(0.5^2)) = 0.08613

Even though it is possible to work it out this way, it's easier to work it out with the modified distribution.

For comparison, we will work it out with the recursive formula too. For the unmodified binomial, a = −1/4 and b = 1. Then

g_0 = P(0.5) = (1 + 0.2(−0.5))^3 = 0.9^3 = 0.729
g_1 = (−1/4 + 1)(0.35)(0.729) / (1 + (1/4)(0.5)) = 0.1701
g_2 = ((−1/4 + 1/2)(0.35)(0.1701) + (−1/4 + 1)(0.15)(0.729)) / (1 + (1/4)(0.5)) = 0.08613

We see that we got the same probabilities and therefore the same answer.

Quiz 17-1  Claim counts and sizes on an insurance coverage are independent and have the following distributions:

    Claim Counts                       Claim Sizes
    Number of claims  Probability      Claim size  Probability
    0                 0.4              100         0.3
    1                 0.2              200         0.5
    2                 0.4              400         0.1
                                       1000        0.1

Let S be aggregate claims. Calculate Pr(S ≤ 300).

In Subsections 9.6.2–9.6.4, the textbook discusses the following considerations of the recursive method.

1. The recursion starts at the probability of 0, f_S(0). However, f_S(0) is likely to be small for a large group, possibly smaller than the smallest number a computer can store. To get around this, you can start at f_S(k), where k is selected so that f_S(k) doesn't underflow; for example, k could be selected 6 standard deviations below the mean. Assign an arbitrary set of values to f_S(0), . . . , f_S(k), like (0, 0, . . . , 1). These values are used to start the recursion. Compute f_S(n) for n > k until n is so large that f_S(n) < f_S(k). After computing all the probabilities, divide each of them by their sum so that they add up to 1. An alternative is to calculate the distribution for a smaller parameter, like λ/2^n instead of λ if frequency is Poisson, and then to convolve the results on themselves n times.

2. The algorithm is stable for Poisson and negative binomial since all factors in the summands are positive, but for a binomial there is a potential for numerical instability because a + bj/k could be negative.

3. There is an integral formula if severity is continuous.

I doubt the exam will test on these considerations.
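Consideration 1 can be illustrated concretely (an example of mine, not the textbook's): for a compound Poisson, the recursion is linear in its starting value, so you may seed it with 1 instead of a potentially underflowing e^(−λ) and renormalize at the end.

```python
import math

lam, f = 10.0, [0, 0.5, 0.3, 0.2]
n_max = 200   # far enough out that the remaining tail mass is negligible

g = [1.0] + [0.0] * n_max                 # seed with 1, not exp(-lam)
for k in range(1, n_max + 1):
    g[k] = (lam / k) * sum(j * f[j] * g[k - j]
                           for j in range(1, min(k, 3) + 1))

total = sum(g)
g = [x / total for x in g]                # renormalize to sum to 1
print(abs(g[0] - math.exp(-lam)) < 1e-12)  # True: f_S(0) is recovered
```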

Exercises

Use the following information for questions 17.1 and 17.2:

Customers arrive in a store at a Poisson rate of 0.5 per minute. The amount of profit the store makes on each customer is randomly distributed as follows:

    Profit   Probability
    0        0.7
    1        0.1
    2        0.1
    3        0.1

17.1. Determine the probability of making no profit in 10 minutes.

17.2. Determine the probability of making a profit of 2 in 10 minutes.

17.3. [1999 C3 Sample:14] You are given:

(i) An aggregate loss distribution has a compound Poisson distribution with expected number of claims equal to 1.25.
(ii) Individual claim amounts can take only the values 1, 2, or 3, with equal probability.

Determine the probability that aggregate losses exceed 3.


Use the following information for questions 17.4 and 17.5:

Taxis pass by a hotel. The number of taxis per minute has a binomial distribution with parameters m = 4 and q = 0.25. The number of passengers each taxi picks up has the following distribution:

    Number of passengers   Probability
    0                      0.5
    1                      0.3
    2                      0.2

17.4. Determine the probability that the total number of passengers picked up by the taxis in a minute is 0.

17.5. Determine the probability that the total number of passengers picked up by the taxis in a minute is 1.

17.6. [3-F02:36] The number of claims in a period has a geometric distribution with mean 4. The amount of each claim X follows Pr(X = x) = 0.25, x = 1, 2, 3, 4. The number of claims and the claim amounts are independent. S is the aggregate claim amount in the period. Calculate F_S(3).

(A) 0.27  (B) 0.29  (C) 0.31  (D) 0.33  (E) 0.35

17.7. The number of taxis arriving at an airport in a minute has a Poisson distribution with mean 4. Each taxi picks up 1 to 4 passengers, with the following probabilities:

    Number of passengers   Probability
    1                      0.70
    2                      0.20
    3                      0.05
    4                      0.05

Calculate the probability that in one minute 4 or more passengers leave the airport by taxi.

Use the following information for questions 17.8 and 17.9:

For an insurance coverage, you are given:
(i) Claim frequency, before application of deductibles and limits, follows a geometric distribution with mean 5.
(ii) Claim size, before application of deductibles and limits, follows a Poisson distribution with mean 1.
(iii) Claim frequency and claim size are independent.
(iv) There is a per-claim deductible of 1 and the maximum covered loss is 2 per loss.

17.8. Calculate average aggregate payments per year.

17.9. Calculate the probability that aggregate payments are greater than 3.


Use the following information for questions 17.10 and 17.11:

For an insurance coverage, you are given:
(i) Claim frequency, before application of deductibles and limits, follows a geometric distribution with mean 5.
(ii) Claim size, before application of deductibles and limits, follows a Poisson distribution with mean 1.
(iii) Claim frequency and claim size are independent.
(iv) There is a per-claim deductible of 1 and a maximum covered loss of 3 per loss.

17.10. Calculate average aggregate payments per year.

17.11. Calculate the probability that aggregate payments are greater than 2.

17.12. [151-82-93:12] (2 points) You are given:
(i) Aggregate claims has a compound Poisson distribution with λ = 0.8.
(ii) Individual claim amount distribution is

    x    Pr(X = x)
    1    0.5
    2    0.3
    3    0.2

(iii) The probabilities for certain values of the aggregate claims, S, are:

    x    Pr(S = x)
    2    0.1438
    3    0.1198
    5    0.0294

Determine Pr(S = 4).

(A) 0.051  (B) 0.064  (C) 0.076  (D) 0.089  (E) 0.102

17.13. [151-82-93:10] (2 points) Aggregate claims S has a compound Poisson distribution with individual claim amount distribution:

    x    Pr(X = x)
    1    1/3
    3    2/3

Also, Pr(S = 4) = Pr(S = 3) + 6 Pr(S = 1). Determine Var(S).

(A) 76  (B) 78  (C) 80  (D) 82  (E) 84

Additional released exam questions: SOA M-F05:27, CAS3-S06:36, C-S07:8


Solutions

17.1. Modify the Poisson to eliminate 0. The Poisson parameter for 10 minutes is (10)(0.5) = 5, and the probability of non-zero profit is 0.3, so the Poisson parameter for non-zero profits is (5)(0.3) = 1.5. Then the probability of 0 profit is e^(−1.5) = 0.2231.

17.2. Modify the secondary distribution to eliminate the probability of 0. The conditional probabilities of 1 and 2 given profit greater than 0 are then 1/3 and 1/3 respectively. The modified Poisson parameter is 1.5, as we stated in the solution to the previous exercise. Then the probability of 2 is the probability of one customer with profit of 2 or two customers with profit 1 apiece:

g_2 = p_1 f_2 + p_2 f_1^2 = 1.5e^(−1.5)(1/3) + (1.5^2 e^(−1.5)/2)(1/3)^2 = 0.1395

You can also work this out with the recursive formula combined with g_0 from the previous exercise:

g_1 = (5)(0.1)(0.2231) = 0.1116
g_2 = 2.5(0.1)(0.1116) + 5(0.1)(0.2231) = 0.1395

but this method is inferior.

17.3. The probability of 0 is e^(−1.25) = 0.2865048. We'll factor this out of the other probabilities. To get 1, there must be exactly 1 claim of size 1, so the probability is

e^(−1.25)(1.25/3) = 0.416667e^(−1.25)

To get 2, there must be 2 claims of size 1 or 1 claim of size 2, so the probability is

e^(−1.25)((1.25^2/2)(1/9) + 1.25/3) = 0.503472e^(−1.25)

To get 3, there must be 3 claims of size 1, or 2 claims of sizes 1 and 2 (in either order), or 1 claim of size 3, so the probability is

e^(−1.25)((1.25^3/6)(1/27) + 2(1.25^2/2)(1/9) + 1.25/3) = 0.602334e^(−1.25)

Also, e^(−1.25) = 0.2865048. So the probability that aggregate losses exceed 3 is

1 − 0.2865048(1 + 0.416667 + 0.503472 + 0.602334) = 1 − 0.2865048(2.522473) = 0.2773

17.4. Modify the binomial distribution to eliminate 0 passengers by multiplying q by 1 − f_0 = 0.5; the revised q = 0.125. Then the probability of 0 passengers is (1 − 0.125)^4 = 0.5862.

17.5. The modified binomial has m = 4 and q = 0.125, as mentioned in the solution to the last exercise. The modified probability of 1 passenger per taxi, given more than 0 passengers per taxi, is 0.3/0.5 = 0.6. The probability that the total number of passengers is 1 is

C(4,1)(0.875^3)(0.125)(0.6) = 0.2010


17.6. You could use the recursive formula. An alternative is to work out all ways to get 3 or less. For the geometric distribution, p_0 = 1/(1 + β) = 0.2, and each successive probability is obtained by multiplying the previous one by 0.8: p_1 = 0.16, p_2 = 0.128, p_3 = 0.1024. With one claim, there are three ways to get 3 or less (probability 0.75). With two claims, there are three ways to get a sum of 3 or less (1 + 2, 2 + 1, or 1 + 1), and with three claims, there's one way to get a sum of 3 or less (1 + 1 + 1). Adding these all up:

F_S(3) = 0.2 + 0.16(3)(0.25) + 0.128(3)(0.25^2) + 0.1024(1)(0.25^3) = 0.3456   (E)

17.7. Using the recursive formula:

g_0 = e^(−4)
g_1 = 4(0.7)e^(−4) = 2.8e^(−4)
g_2 = 2((0.7)(2.8) + 2(0.2))e^(−4) = 4.72e^(−4)
g_3 = (4/3)((0.7)(4.72) + 2(0.2)(2.8) + 3(0.05))e^(−4) = 6.098667e^(−4)

Using convolutions:

g_1 = (4e^(−4))(0.7) = 2.8e^(−4)
g_2 = (8e^(−4))(0.7^2) + (4e^(−4))(0.2) = 4.72e^(−4)
g_3 = (32/3)e^(−4)(0.7^3) + (8e^(−4))(2)(0.7)(0.2) + 4e^(−4)(0.05) = 6.098667e^(−4)

Either way, Pr(S ≥ 4) = 1 − (1 + 2.8 + 4.72 + 6.098667)e^(−4) = 1 − 14.61867e^(−4) = 0.73225.
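Solution 17.7's recursion can be reproduced in a few lines (my sketch):

```python
# Compound Poisson recursion g_k = (lam/k) sum_j j f_j g_{k-j}
# with lam = 4 and the passenger-count pmf from the exercise.
import math

lam, f = 4.0, [0, 0.70, 0.20, 0.05, 0.05]
g = [math.exp(-lam)]
for k in range(1, 4):
    g.append((lam / k) * sum(j * f[j] * g[k - j]
                             for j in range(1, min(k, 4) + 1)))
print(round(1 - sum(g), 5))  # Pr(S >= 4) = 0.73225
```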

17.8. For the Poisson severity, Pr(X = 0) = Pr(X = 1) = e^(−1). Hence the probability of a payment of 0 is 2e^(−1). The only other possible payment size is 1, with probability 1 − 2e^(−1). The average number of claims is 5, so the answer is 5(1 − 2e^(−1)) = 1.321.

17.9. Aggregate payments greater than 3 means more than three payments. Using what we learned about coverage modifications, the frequency distribution for losses of 2 or more is a geometric distribution with β equal to the original β times the probability of a loss of 2 or more, or 5(1 − 2e^(−1)) = 1.321. The probability of more than three payments is

(β/(1 + β))^4 = (1.321/2.321)^4 = 0.1050

17.10. We will use payment size as the subscript. When we take the deductible and limit into account, and apply a Poisson distribution with parameter 1:

    Payment size  Claim size   Probability
    0             0 or 1       f_0 = Pr(X = 0) + Pr(X = 1) = e^(−1) + e^(−1) = 0.73576
    1             2            f_1 = Pr(X = 2) = 0.5e^(−1) = 0.18394
    2             3 or more    f_2 = Pr(X > 2) = 1 − 0.73576 − 0.18394 = 0.08030

Therefore average payment size per claim is 0.18394(1) + 0.08030(2) = 0.34454. The average number of claims per year is 5, making average aggregate payments 5(0.34454) = 1.7228.

17.11. You can modify the geometric distribution to remove individual payments of 0. Use the formula in Table 13.1 on page 224 to adjust the geometric distribution for severity modifications. Here, the probability of a payment for each loss is 1 − f_0 = 1 − 0.73576 = 0.26424, so the adjusted geometric distribution for non-zero payments has parameter β′ = 0.26424β = 0.26424(5) = 1.3212, and the payment distribution conditional on non-zero payments would be (using Y for the payment variable)

Pr(Y = 1) = f_1/0.26424 = 0.18394/0.26424 = 0.69611
Pr(Y = 2) = f_2/0.26424 = 0.08030/0.26424 = 0.30389

Then the revised geometric distribution has the following probabilities. We use the (a, b, 0) formula for the geometric distribution,

p_k = (β′/(1 + β′)) p_{k−1} = (1.3212/2.3212) p_{k−1} = 0.5692 p_{k−1}

to calculate successive probabilities:

p_0 = 1 − β′/(1 + β′) = 1 − 0.5692 = 0.4308
p_1 = (0.5692)(0.4308) = 0.2452
p_2 = (0.5692)(0.2452) = 0.1396

The probability of zero aggregate payments is 0.4308. The probability of aggregate payments of 1 is the probability of one claim of size 1, or g_1 = (0.2452)(0.69611) = 0.1707. The probability of aggregate payments of 2 is the probability of two claims of size 1 or one claim of size 2:

g_2 = (0.1396)(0.69611^2) + (0.2452)(0.30389) = 0.1422

Pr(S ≥ 3) = 1 − 0.4308 − 0.1707 − 0.1422 = 0.2563.

The alternative is to use the recursive formula. First of all, the probability that aggregate claims are 0 is, using the off-syllabus formula, g_0 = P_N(f_0) = P_N(0.73576), where we've taken f_0 from the solution to the previous exercise. By the Loss Models Appendix, for a geometric distribution, P(z) = (1 − β(z − 1))^(−1). So

g_0 = (1 − 5(0.73576 − 1))^(−1) = 0.4308

For the geometric, a = β/(1 + β) = 5/6 and b = 0, and

1/(1 − a f_0) = 1/(1 − (5/6)(2e^(−1))) = 2.5849

Then

g_1 = 2.5849(5/6) f_1 g_0 = 2.5849(5/6)(0.18394)(0.4308) = 0.1707
g_2 = 2.5849(5/6)(f_1 g_1 + f_2 g_0) = 2.5849(5/6)((0.18394)(0.1707) + (0.08030)(0.4308)) = 0.1422

Then Pr(S ≥ 3) = 1 − 0.4308 − 0.1707 − 0.1422 = 0.2563.
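The severity-modification method of 17.11 checks out numerically; this is my own sketch:

```python
# Thin the geometric by the probability of a non-zero payment, then
# convolve the conditional payment distribution.
import math

f0 = 2 * math.exp(-1)                     # Pr(per-loss payment = 0)
f1 = 0.5 * math.exp(-1)
f2 = 1 - f0 - f1
beta = 5 * (1 - f0)                       # thinned geometric parameter
ratio = beta / (1 + beta)                 # = 0.5692
p0 = 1 / (1 + beta)
p1, p2 = p0 * ratio, p0 * ratio ** 2
y1, y2 = f1 / (1 - f0), f2 / (1 - f0)     # severity given a payment
g0, g1 = p0, p1 * y1
g2 = p2 * y1 ** 2 + p1 * y2
print(round(1 - g0 - g1 - g2, 4))  # Pr(S >= 3) = 0.2563
```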

17.12. This is an exercise where they virtually force you to use the recursive formula. One way to do it is to calculate g_1 with the recursive formula:

g_0 = Pr(S = 0) = e^(−0.8) = 0.449329
g_1 = λ(1)(0.5) g_0 = 0.4(0.449329) = 0.179732

Then calculate g_4 using the recursive formula:

g_4 = (λ/4)((1) f_1 g_3 + (2) f_2 g_2 + (3) f_3 g_1)
    = 0.2((1)(0.5)(0.1198) + (2)(0.3)(0.1438) + (3)(0.2)(0.179732))
    = 0.05080   (A)

With this method, g_5 = Pr(S = 5) is not needed. The official method backed out g_4 using the recursive formula and the values of g_5, g_3, and g_2:

0.0294 = (0.8/5)((1)(0.5) g_4 + (2)(0.3)(0.1198) + (3)(0.2)(0.1438))
0.5 g_4 = 0.0294/0.16 − 0.07188 − 0.08628
0.5 g_4 = 0.02559
g_4 = 0.05118   (A)

The difference between the two answers is because of the use of rounded values for g_2 and g_3.

To do it without the recursive formula, consider all ways S could equal 4: either four 1's, one 2 and two 1's (three orders), two 2's, or one 3 and one 1 (two orders).

Pr(1, 1, 1, 1) = e^(−0.8)(0.8^4/4!)(0.5^4) = 0.001067e^(−0.8)
Pr(2, 1, 1) = 3e^(−0.8)(0.8^3/3!)(0.3)(0.5^2) = 0.0192e^(−0.8)
Pr(2, 2) = e^(−0.8)(0.8^2/2!)(0.3^2) = 0.0288e^(−0.8)
Pr(3, 1) = 2e^(−0.8)(0.8^2/2!)(0.2)(0.5) = 0.064e^(−0.8)

Pr(S = 4) = (0.001067 + 0.0192 + 0.0288 + 0.064)e^(−0.8) = 0.113067e^(−0.8) = 0.050804

g4 





Since we are given g4  g3 +6g1 , we must have both λ/2  6 and λ/12  1, either one implying that λ  12. Without using the recursive formula, you’d have to set up an equation for Pr ( S  4) as follows: Pr ( S  1)  Pr ( N  1) Pr ( X  1) 

λe −λ 3

2 −λ 1 λ3 e −λ λe + 3 3 6 3 2 e −λ λ 1 λ4 e −λ 4 Pr ( S  4)  2 Pr ( N  2) | Pr ( X  1) Pr ( X  3) + Pr ( N  4) Pr ( X  1) 4  + 4 9 2 3 24

Pr ( S  3)  Pr ( N  1) Pr ( X  3) + Pr ( N  3) Pr ( X  1) 3 

Using Pr ( S  4)  Pr ( S  3) + 6 Pr ( S  1) , we get, after substituting the above expressions, multiplying out the denominators, and dividing by λe −λ , λ 3 − 12λ2 + 432λ − 5184 C/4 Study Manual—17th edition Copyright ©2014 ASM

17. AGGREGATE LOSS MODELS: THE RECURSIVE FORMULA

306

You can use the cubic equation if you know it, or notice that λ − 12 is a factor, so we get

( λ − 12)( λ2 + 432)  0

and λ  12. We now calculate the second moment of the secondary distribution: E[X 2 ]  13 (1) + 32 (9)  Var ( S )  12

19  76 3

19 3 .

Finally

!

(A)

Quiz Solutions 17-1. The probability of aggregate claims of 0 is 0.4. The probability of aggregate claims of 100 is (0.2)(0.3)  0.06. The probability of aggregate claims of 200 is the sum of the probabilities of two claims of 100 and one claim of 200, or fS (200)  0.2 (0.5) + 0.4 (0.32 )  0.136 The probability of aggregate claims of 300 is the probability of two claims of 100 and 200, or fS (300)  2 (0.4)(0.3)(0.5)  0.12 The sum of these probabilities is Pr ( S ≤ 300)  0.4 + 0.06 + 0.136 + 0.12  0.716 .

Lesson 18

Aggregate Losses—Aggregate Deductible

Reading: Loss Models Fourth Edition 9.3, 9.5

This lesson discusses expected aggregate payments in the presence of an aggregate deductible. An aggregate deductible is a deductible that is applied to aggregate losses rather than to individual losses. Stop-loss reinsurance is structured this way: a stop-loss reinsurance contract reimburses aggregate losses only above a certain level.

When we refer to aggregate losses, we mean the sum of individual claim payments after individual claim modifications such as policy limits and deductibles, but before consideration of aggregate modifications. In the following, we shall continue to use the notation N for the loss frequency distribution, X for the loss severity distribution, and S for the aggregate loss distribution. If frequency and severity are independent, then E[S] = E[N] E[X].

The expected value of aggregate losses above the deductible is called the net stop-loss premium. If S is aggregate losses and d is the deductible, we denote it by E[(S − d)_+], the same notation as in Lesson 6. To simplify matters, we will assume that severity is discrete. Since frequency is discrete, aggregate loss is then discrete as well. By equation (6.4),

E[(S − d)_+] = E[S] − E[S ∧ d]

and since we can evaluate E[S] as the product of E[N] and E[X] if N and X are independent, we only have to deal with E[S ∧ d], a finite integral. We have two formulas for limited expected value, equation (5.3) and equation (5.6).

In the following, we shall continue using the notation of Lesson 17:

p_n = Pr(N = n)
f_n = Pr(X = n)
g_n = Pr(S = n)

We will assume that for some h, Pr(S = n) is nonzero only for n a multiple of h. The first step in evaluating E[S ∧ d] is to calculate g_n for n < d using one of the methods of Lesson 17. This gets harder and harder as d increases. Exams will not make you evaluate more than four values of g_n.

After doing that, there are three methods you can use to evaluate E[S ∧ d], and none of them require formula memorization. It is helpful to see these formulas graphically. I will explain the three methods as part of solving the next example.

Example 18A  On an insurance coverage, the number of claims has a geometric distribution with mean 4. The distribution of claim sizes is as follows:

    x    Pr(X = x)
    2    0.45
    4    0.25
    6    0.20
    8    0.10

(i) Calculate E[(S − 2.8)_+].

(ii) Calculate E[(S − 4)+].

Answer: In this example, h = 2; all severities are multiples of 2. Let's start by calculating E[S], and the probabilities of aggregate losses of 0 and 2, since these will be necessary regardless of the method used to calculate E[S ∧ d]. We have

E[X] = 0.45(2) + 0.25(4) + 0.20(6) + 0.10(8) = 3.9
E[S] = 4(3.9) = 15.6

Claim counts follow a geometric distribution with mean β = 4. The probabilities of N = 0 and N = 1 are

p_0 = 1/(1 + 4) = 0.2
p_1 = 0.2(4/(1 + 4)) = 0.16

The probability that aggregate losses are 0 is the same as the probability of no claims, or g_0 = Pr(S = 0) = 0.2. The probability that aggregate losses are 2 is the probability of one claim of size 2, or g_2 = Pr(S = 2) = 0.16(0.45) = 0.072. We can then calculate the aggregate survival function S_S(x) for x < 4:

S_S(0) = Pr(S > 0) = 1 − g_0 = 1 − 0.2 = 0.8
S_S(2) = Pr(S > 2) = S_S(0) − g_2 = 0.8 − 0.072 = 0.728

and S_S(x) = S_S(0) for x < 2, S_S(x) = S_S(2) for 2 ≤ x < 4.

1. Using the definition of E[S ∧ d]

Equation (5.3) is the definition of E[S ∧ d]. For a discrete distribution in which the only possible values are multiples of h, it becomes

E[S ∧ d] = Σ_{j=0}^{u} (hj) g_{hj} + d Pr(S ≥ d)

where u = ⌈d/h⌉ − 1, so the sum is over all multiples of h less than d.¹ The sum can actually start at j = 1 instead of j = 0, since the j = 0 term is 0. If u < 1 (for example if d = 1 and h = 2), the sum is empty and the limited expected value is d Pr(S ≥ d).

For part (i), we sum up the probabilities of aggregate losses equal to values below 2.8 times those values, and then add 2.8 times the probability of aggregate losses greater than 2.8. In Subfigure 18.1a, this means summing rectangles A and B. The area of rectangle A is (0.072)(2) and the area of rectangle B is (0.728)(2.8). So we get

E[S ∧ 2.8] = 0.2(0) + 0.072(2) + 0.728(2.8) = 2.1824
E[(S − 2.8)+] = 15.6 − 2.1824 = 13.4176

For part (ii), we sum up the probabilities of the values 0 and 2 times those values, and then add 4 times the probability of losses greater than or equal to 4. In Subfigure 18.1b, this means summing rectangles A and B. The area of rectangle A is (0.072)(2) and the area of rectangle B is (0.728)(4). So we get

E[S ∧ 4] = 0.2(0) + 0.072(2) + 0.728(4) = 3.056
E[(S − 4)+] = 15.6 − 3.056 = 12.544

¹⌈x⌉ denotes the least integer greater than or equal to x. Thus if x = 3, ⌈x⌉ = 3, while if x = 3.1, ⌈x⌉ = 4.


[Figure 18.1: Calculating E[S ∧ d] using the definition. Two plots of the survival function S(x): (a) Example 18A(i), rectangles A and B up to x = 2.8; (b) Example 18A(ii), rectangles A and B up to x = 4.]

2. Calculating E[S ∧ d] by integrating the survival function

Equation (5.6) for a discrete distribution in which the only possible values are multiples of h becomes

E[S ∧ d] = Σ_{j=0}^{u−1} h S(hj) + (d − hu) S(hu)

where once again u = ⌈d/h⌉ − 1. If u − 1 < 0 (for example if d = 1 and h = 2), the sum is empty and the limited expected value is (d − hu) S(hu). This formula sums up the probabilities of S being above each of the possible values below d times the distance between the possible values (the first term), plus the distance between the highest possible value below d and d, times the probability of S being above that highest value (the second term).

In Example 18A(i), d = 2.8 and h = 2, so u = ⌈2.8/2⌉ − 1 = 1 and there is one term in the sum plus one additional term. This is shown in Subfigure 18.2a, where we sum up rectangles C and D. Please don't memorize the formula; try to understand it graphically and you will easily be able to reproduce it. The area of rectangle C is 2(0.8) = 1.6, and the area of rectangle D is 0.8(0.728) = 0.5824. So we have

E[S ∧ 2.8] = 2S(0) + 0.8S(2)
           = 1.6 + 0.5824 = 2.1824
E[(S − 2.8)+] = 15.6 − 2.1824 = 13.4176

In Example 18A(ii), d = 4 and h = 2, so u = ⌈4/2⌉ − 1 = 1, and

E[S ∧ 4] = 2(0.8) + 2(0.728) = 3.056
E[(S − 4)+] = 15.6 − 3.056 = 12.544

Because S(hj) may be computed recursively (for example, S(4) = S(2) − g_4), this is called the recursive formula, but it has no direct relationship to the recursive formula of Lesson 17.

[Figure 18.2: Calculating E[S ∧ d] by integrating the survival function. Two plots of S(x): (a) Example 18A(i), rectangles C and D up to x = 2.8; (b) Example 18A(ii), rectangles C and D up to x = 4.]

3. Proceeding backwards

A variant of the second method for calculating E[S ∧ d] is to express it as d minus something. The something is the sum, over the values k < d, of the probability of S = k times d − k. Thus in part (i), the deductible of 2.8 saves the insurance company 2.8 unless aggregate losses are 0 or 2. If aggregate losses are 0, the expected amount not saved is 2.8g_0. If aggregate losses are 2, the expected amount not saved is 0.8g_2, since the deductible only saves the company 2 rather than 2.8 in this case. So we have

E[S ∧ 2.8] = 2.8 − 2.8g_0 − 0.8g_2 = 2.8 − 2.8(0.2) − 0.8(0.072) = 2.1824

The graph is Subfigure 18.3a. We start with the rectangle from (0, 0) to (2.8, 1) and then subtract rectangles E and F.

In part (ii), the expected amount not saved is 4g_0 + 2g_2, so

E[S ∧ 4] = 4 − 4g_0 − 2g_2 = 4 − 4(0.2) − 2(0.072) = 3.056

The graph is Subfigure 18.3b. We start with the rectangle from (0, 0) to (4, 1) and then subtract rectangles E and F.
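The three methods can also be checked numerically. Below is a rough Python sketch (mine, not part of the manual; the function names are my own) that computes the g_n's for Example 18A with the Lesson 17 recursive formula for (a, b, 0) frequencies, then applies method 2:

```python
from math import ceil

beta = 4.0                                   # geometric frequency mean (Example 18A)
f = {2: 0.45, 4: 0.25, 6: 0.20, 8: 0.10}     # severity pmf
ES = beta * sum(x * p for x, p in f.items()) # E[S] = E[N] E[X] = 15.6

a = beta / (1 + beta)                        # (a, b, 0) parameters of the geometric
b = 0.0

def agg_pmf(k_max):
    """g_k = Pr(S = k) for k = 0..k_max by the recursive formula
    (severity here has no probability mass at 0)."""
    g = [0.0] * (k_max + 1)
    g[0] = 1 / (1 + beta)                    # Pr(N = 0)
    for k in range(1, k_max + 1):
        g[k] = sum((a + b * j / k) * f.get(j, 0.0) * g[k - j]
                   for j in range(1, k + 1))
    return g

def stop_loss(d, h=2):
    """E[(S - d)+] via method 2: E[S ∧ d] is a sum of rectangles h·S(jh)
    plus a final partial rectangle (d - hu)·S(hu), u = ceil(d/h) - 1."""
    u = ceil(d / h) - 1
    g = agg_pmf(u * h)
    surv, total = [], 1.0
    for j in range(u + 1):
        total -= g[j * h]                    # S(jh) = Pr(S > jh)
        surv.append(total)
    limited = sum(h * surv[j] for j in range(u)) + (d - h * u) * surv[u]
    return ES - limited
```

For the example's deductibles this reproduces E[(S − 2.8)+] = 13.4176 and E[(S − 4)+] = 12.544.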

Quiz 18-1 Claim counts and sizes on an insurance coverage are independent and have the following distributions:

Claim Counts
Number of claims    Probability
0                   0.5
1                   0.3
2                   0.1
3                   0.1

Claim Sizes
Claim size    Probability
100           0.2
200           0.4
300           0.3
400           0.1

A stop-loss reinsurance contract pays the excess of aggregate losses over 200. The premium for the contract is 150% of expected payments. Calculate the premium.


[Figure 18.3: Calculating E[S ∧ d] as d minus excesses of d over values of S. Two plots of S(x): (a) Example 18A(i), rectangles E and F subtracted from the rectangle of width 2.8; (b) Example 18A(ii), rectangles E and F subtracted from the rectangle of width 4.]

The second method is the best one for problems where you must solve for the deductible.

Example 18B (Continuation of previous example.) On an insurance coverage, the number of claims has a geometric distribution with mean 4. The distribution of claim sizes is as follows:

x    Pr(X = x)
2    0.45
4    0.25
6    0.20
8    0.10

A stop-loss reinsurance contract sets the deductible so that expected payments under the contract are 12. Calculate the deductible needed to satisfy this condition.

Answer: We calculate E[(S − d)+] recursively until it falls below 12. Since E[S] = 15.6, we need E[S ∧ d] = 15.6 − 12 = 3.6. We use the second method, and continue adding rectangles until the area adds up to 3.6. Using just S(2), we can calculate

E[S ∧ 4] = E[S ∧ 2] + 2S(2) = 1.6 + 2(0.728) = 3.056

In order to proceed, we will have to calculate additional g_n's. For the geometric distribution, which is the same as in Example 18A, we calculated that the probabilities of 0 and 1 claims are p_0 = 0.2 and p_1 = 0.16. Then p_2 = 0.8p_1 = 0.128, and

g_4 = 0.16(0.25) + 0.128(0.45²) = 0.06592
S(4) = 0.728 − 0.06592 = 0.66208

We solve for d:

E[S ∧ 6] = E[S ∧ 4] + 2S(4) = 3.056 + 2(0.66208) > 3.6

so 4 < d < 6, and

3.6 = E[S ∧ d] = 3.056 + (d − 4)(0.66208)
d = 4 + (3.6 − 3.056)/0.66208 = 4.8217
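This knot-by-knot search can be sketched in code. The following is my own illustration (not from the manual), hard-coding the g_k values already computed in Examples 18A and 18B, and exploiting the fact that E[S ∧ d] is linear between possible values of S:

```python
# Probabilities g_k = Pr(S = k) from Examples 18A/18B, and E[S] = 15.6.
ES = 15.6
g = {0: 0.2, 2: 0.072, 4: 0.06592}
h = 2
target = ES - 12.0                        # required E[S ∧ d] = 3.6

surv = 1.0                                # will hold S(k) = Pr(S > k)
limited = 0.0                             # E[S ∧ k] at the current knot k
k = 0
while True:
    surv -= g[k]
    if limited + h * surv >= target:      # d lies in (k, k + h]
        d = k + (target - limited) / surv # linear interpolation on the gap
        break
    limited += h * surv
    k += h

print(round(d, 4))                        # 4.8217, matching the answer above
```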




Note that between possible values of S, E[(S − u)+] is a linear function of u.

Example 18C For aggregate losses S, you are given:

(i) E[(S − 100)+] = 4200
(ii) E[(S − 150)+] = 4180
(iii) S does not assume any value in the interval (100, 150).

Calculate E[(S − 120)+].

Answer: By linear interpolation,

E[(S − 120)+] = 0.6(4200) + 0.4(4180) = 4192



Exercises

18.1. [151-83-94:15] (2 points) Aggregate claims has a compound Poisson distribution with λ = 2, Pr(X = 1) = 0.4, and Pr(X = 2) = 0.6, where X is individual claim size. An insurer charges a premium of 4.5 and returns a dividend of the excess, if any, of 2.7 over claims. Determine the excess of premiums over expected claims and dividends.

(A) 0.4  (B) 0.5  (C) 0.6  (D) 0.7  (E) 0.8

18.2. The number of claims on an insurance coverage has a negative binomial distribution with mean 2 and variance 6. The claim size distribution is binomial with parameters m = 3 and q = 0.4. A reinsurance contract pays aggregate claims over an aggregate deductible of 2. Determine the expected aggregate loss paid by reinsurance.

18.3. [151-81-96:16] (2 points) A stop-loss reinsurance pays 80% of the excess of aggregate claims above 20, subject to a maximum payment of 5. All claim amounts are non-negative integers. For aggregate claims S, you are given:

E[(S − 16)+] = 3.89    E[(S − 20)+] = 3.33    E[(S − 24)+] = 2.84
E[(S − 25)+] = 2.75    E[(S − 26)+] = 2.69    E[(S − 27)+] = 2.65

Determine the total amount of claims the reinsurer expects to pay.

(A) 0.46  (B) 0.49  (C) 0.52  (D) 0.54  (E) 0.56


18.4. The number of claims on an insurance coverage has a negative binomial distribution with parameters r = 2 and β = 1. The claim size distribution is as follows:

Amount    Probability
1         0.4
2         0.3
3         0.2
4         0.1

Reinsurance covers the aggregate loss with an aggregate deductible of d. The deductible d is set so that the expected reinsurance payment is 2. Determine d.

18.5. The Late Night Quiz Show gives out 4 prizes per night. Each prize has a 0.8 probability of being 1000 and a 0.2 probability of being 2000. Determine the probability that total prizes for a night will be 6000.

18.6. The number of claims per year on a homeowner's policy follows a Poisson distribution with mean 0.2 per year. The claim size distribution has the following probability function:

f(x) = (1/4)(4/5)^x,    x = 1, 2, 3, . . .

Reinsurance pays 90% of claims for a year subject to an aggregate deductible of 2. The deductible is applied after multiplying the claims by 90%. Determine the expected reinsurance payment per year.

Use the following information for questions 18.7 and 18.8:

The Late Night Quiz Show gives out prizes each night. The number of prizes given out is randomly distributed as follows:

Number of Prizes    Probability
1                   0.20
2                   0.35
3                   0.30
4                   0.15

Each prize has a 0.8 probability of being 1000 and a 0.2 probability of being 2000.

18.7. Determine the probability that total prizes are 6000.

18.8. The prize payment for a night, minus an aggregate deductible of 1500, is insured. The premium for the insurance equals 120% of expected claims. Calculate the premium.


18.9. For an insurance coverage, aggregate losses have the following distribution:

Amount          Probability
0               0.50
1000            0.10
2000            0.15
3000            0.10
4000 or more    0.15

Average aggregate losses are 2800. An insurance coverage pays 80% of aggregate losses, minus a deductible of 1000. The deductible is applied after the 80% factor. Determine the average payment of the insurance coverage.

18.10. You have a $10 gift card for use at Amigo's Department Store. You plan to buy several items there. The number of items you will buy has the following distribution:

Number of items    Probability
0                  0.2
1                  0.4
2                  0.3
3                  0.1

The price of each item is $4(X + 1), where X is a random variable having a binomial distribution with parameters m = 2, q = 0.2. You will use the gift card, regardless of the value of the amount spent. Calculate the expected amount of money you spend net of the value of the gift card.

18.11. A group life insurance policy covers 40 lives. Each policy is for 100,000. The probability of death for each life covered is 0.01. A reinsurance contract reimburses the insurance company for 80% of aggregate losses, subject to an aggregate deductible of 200,000. The aggregate deductible is subtracted after multiplying aggregate losses by 80%. The reinsurance premium is 110% of expected reinsurance payments. Calculate the reinsurance premium.

18.12. [151-82-92:16] A group policyholder's aggregate claims, S, has a compound Poisson distribution with λ = 1 and all claim amounts equal to 2. The insurer pays the group the following dividend:

D = 6 − S for S < 6, and D = 0 for S ≥ 6.

Pr(X = 1 | X > 0) = 0.432/0.784 = 0.5510

Therefore, the probability of aggregate claims of size 1 is

g_1 = (0.2377)(0.5510) = 0.1310

Now we can compute E[(S − 2)+].

E[S] = (rβ)(mq) = (1)(2)(3)(0.4) = 2.4
Pr(S ≥ 2) = 1 − 0.3894 − 0.1310 = 0.4796

Using the third method, the reinsurance pays 2.4, minus 1 if S = 1, minus 2 if S ≥ 2, or

E[(S − 2)+] = 2.4 − 0.1310(1) − 0.4796(2) = 1.3098

18.3. First of all, ignore the 80% coinsurance. Then the reinsurer pays for the amount of each claim between 20 and 26.25, since 0.8(6.25) = 5. All claims are integers, so the reinsurer pays

min(S, 26) − min(S, 20) + 0.25[min(S, 27) − min(S, 26)]

To express this in terms of (S − n)+ variables, remember that min(x, y) = x + y − max(x, y) = x − max(0, x − y), so the payment equals

[S − max(0, S − 26)] − [S − max(0, S − 20)] + 0.25([S − max(0, S − 27)] − [S − max(0, S − 26)])
= (S − 20)+ − (S − 26)+ + 0.25[(S − 26)+ − (S − 27)+]

and the expected value of this is

E[(S − 20)+] − E[(S − 26)+] + 0.25(E[(S − 26)+] − E[(S − 27)+]) = (3.33 − 2.69) + 0.25(2.69 − 2.65) = 0.65

Multiplying this by the coinsurance factor 0.8 we get 0.8(0.65) = 0.52. (C)

18.4. Let N be the number of claims and S the aggregate loss. We will calculate the probabilities of values of S directly and use the recursive method for calculating E[(S − x)+].

E[S] = 2(2) = 4

Pr(N = 0) = (1/2)² = 0.25
Pr(S = 0) = 0.25        S_S(0) = 1 − 0.25 = 0.75
E[(S − 1)+] = 4 − 0.75 = 3.25

Pr(N = 1) = 2(1/2)³ = 0.25
Pr(S = 1) = 0.25(0.4) = 0.1        S_S(1) = 0.75 − 0.1 = 0.65
E[(S − 2)+] = 3.25 − 0.65 = 2.60

Pr(N = 2) = 3(1/2)⁴ = 0.1875
Pr(S = 2) = 0.25(0.3) + 0.1875(0.4²) = 0.105        S_S(2) = 0.65 − 0.105 = 0.545
E[(S − 3)+] = 2.60 − 0.545 = 2.055

Pr(N = 3) = 4(1/2)⁵ = 0.125
Pr(S = 3) = 0.25(0.2) + 0.1875(2)(0.4)(0.3) + 0.125(0.4³) = 0.103        S_S(3) = 0.545 − 0.103 = 0.442
E[(S − 4)+] = 2.055 − 0.442 = 1.613


Interpolating between 3 and 4,

d = 3 + (2.055 − 2)/(2.055 − 1.613) = 3.124

18.5. If X is binomial with m = 4, q = 0.2, the loss variable is 1000X + 4000. The probability that this equals 6000 is the probability that X = 2, or (4 choose 2)(0.8²)(0.2²) = 6(0.64)(0.04) = 0.1536.

18.6. The size of claims X is 1 plus a geometric distribution with β = 4. Using this, or directly summing the geometric series, we have E[X] = 5. The probabilities are f_1 = 0.2, f_2 = 0.16. Letting S be aggregate losses, E[S] = 0.2(5) = 1. Subtracting 2 after multiplying by 90% is equivalent to first subtracting 2/0.9 = 20/9 and then multiplying by 90%, so the reinsurance payment is 0.9(S − 20/9)+. The aggregate probabilities are

g_0 = e^(−0.2) = 0.8187
g_1 = 0.2(0.2)e^(−0.2) = 0.04e^(−0.2) = 0.0327
g_2 = [(0.2²/2)(0.2²) + (0.2)(0.16)]e^(−0.2) = 0.02685
Pr(S > 2) = 1 − 0.8187 − 0.0327 − 0.02685 = 0.1217

E[(S − 20/9)+] = 1 − 0.0327 − 0.02685(2) − 0.1217(20/9) = 0.6432
0.9 E[(S − 20/9)+] = (0.9)(0.6432) = 0.5788

18.7. Total prizes are 6000 if three prizes are given and all are 2000, or if four prizes are given and two are 2000 and two are 1000. The probability of this is (0.3)(0.2³) + (0.15)(4 choose 2)(0.2²)(0.8²) = 0.0024 + 0.02304 = 0.02544.

18.8. If S is total prizes, then Pr(S = 1000) = 0.2(0.8) = 0.16. The average prize is 1200 and the average number of prizes is 1(0.20) + 2(0.35) + 3(0.30) + 4(0.15) = 2.4, so E[S] = 2.4(1200) = 2880. Then

E[(S − 1500)+] = 2880 − 1000(0.16) − 1500(0.84) = 1460

Multiplying by 120%, (1460)(1.2) = 1752.

18.9. If S is the loss variable, the payment variable is 0.8(S − 1250)+. We calculate E[(S − t)+] recursively; E[(S − 1250)+] is obtained by linear interpolation between E[(S − 1000)+] and E[(S − 2000)+].

S(0) = 0.5        S(1000) = 0.4
E[(S − 1000)+] = 2800 − 0.5(1000) = 2300
E[(S − 2000)+] = 2300 − 0.4(1000) = 1900
E[(S − 1250)+] = 0.75(2300) + 0.25(1900) = 2200


E[0.8(S − 1250)+] = 0.8(2200) = 1760

An alternative solution² is to calculate E[0.8S] − E[0.8S ∧ 1000]. Since we are given that E[S] = 2800, then E[0.8S] = 0.8(2800) = 2240. Now, 0.8S ∧ 1000 is 0 with probability 0.5, 800 with probability 0.1, and 1000 with probability 0.4, so

E[0.8S ∧ 1000] = 0.1(800) + 0.4(1000) = 480

The answer is 2240 − 480 = 1760.

18.10. E[N] = 1(0.4) + 2(0.3) + 3(0.1) = 1.3, so without the card expected spending would be (1.3)(1.4)(4) = 7.28. The gift card reduces the amount spent by 10, except that it reduces the amount spent by 0 if S = 0, by 4 if S = 4, and by 8 if S = 8. We calculate (using the third method)

g_0 = 0.2
g_4 = (0.8²)(0.4) = 0.256
g_8 = (0.4)(0.32) + (0.3)(0.64²) = 0.25088

So the expected amount covered by the gift card is 10 − 0.2(10) − 0.256(6) − 0.25088(2) = 5.96224. The expected amount spent is 7.28 − 5.96224 = 1.31776.

18.11. We have

p_0 = 0.99⁴⁰ = 0.66897
p_1 = 40(0.99³⁹)(0.01) = 0.27029
p_2 = 780(0.99³⁸)(0.01²) = 0.05324

The mean payment before the deductible is (40)(100,000)(0.01) = 40,000. Ignoring the coinsurance, the deductible removes 250,000 (which is multiplied by 80%), unless there are 0, 1, or 2 deaths. We have

E[max(0, 250,000 − X)] = 0.66897(250,000) + 0.27029(150,000) + 0.05324(50,000) = 210,448.65

Hence the expected reinsurance payment before coinsurance is 40,000 − 250,000 + 210,448.65 = 448.65. Multiplying this by 80% and then by 110%, we get 394.81.

18.12. The dividend is 6 if no claims, 4 if one claim, 2 if two claims, otherwise 0, so the expected value is

6e⁻¹ + 4e⁻¹ + 2(e⁻¹/2) = 11/e    (D)

18.13. The probability that the secondary distribution is 0 is e^(−0.5). We modify the primary geometric's parameter to 3(1 − e^(−0.5)) and zero-truncate the Poisson. Then β/(1 + β) = 3(1 − e^(−0.5))/(4 − 3e^(−0.5)) = 0.541370. We will use p_k for primary probabilities, f_k for secondary probabilities. We will use (a, b, i) methods to calculate probabilities recursively: for the geometric, repeated multiplication by 0.541370; for the zero-truncated Poisson, multiplication by λ/k.

p_0 = 1 − 0.541370 = 0.458630          f_1 = 0.5e^(−0.5)/(1 − e^(−0.5)) = 0.770747
p_1 = 0.541370(0.458630) = 0.248289    f_2 = (0.5/2)(0.770747) = 0.192687
p_2 = 0.541370(0.248289) = 0.134416    f_3 = (0.5/3)(0.192687) = 0.032114
p_3 = 0.541370(0.134416) = 0.072769

Let N be the total number of losses.

Pr(N = 0) = 0.458630
Pr(N = 1) = (0.248289)(0.770747) = 0.191368
Pr(N = 2) = (0.134416)(0.770747²) + (0.248289)(0.192687) = 0.127692
Pr(N = 3) = (0.072769)(0.770747³) + 2(0.134416)(0.770747)(0.192687) + (0.248289)(0.032114) = 0.081217
Pr(N ≥ 4) = 1 − 0.458630 − 0.191368 − 0.127692 − 0.081217 = 0.141094

18.14. Using the results of the previous exercise, we calculate E[N ∧ 4] and multiply it by 100.

E[N ∧ 4] = 0.458630(0) + 0.191368(1) + 0.127692(2) + 0.081217(3) + 0.141094(4) = 1.254779

The expected value of N is 3(0.5) = 1.5, so

100(E[N] − E[N ∧ 4]) = 100(1.5 − 1.254779) = 24.5221

²Shown to me by Bryn Clarke.
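The compound-count probabilities above can be checked another way (my own sketch, not the manual's method): a sum of n independent Poisson(0.5) secondaries is Poisson(0.5n), so one can mix over the geometric primary directly, without the zero-modification trick:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # lam = 0 correctly yields a point mass at 0, since 0.0**0 == 1.0
    return exp(-lam) * lam**k / factorial(k)

def geometric_pmf(n, beta=3.0):
    """Pr(primary = n) for a geometric with mean beta = 3."""
    return beta**n / (1 + beta)**(n + 1)

def total_losses_pmf(k, n_max=200):
    """Pr(N = k) for the compound geometric(3)/Poisson(0.5) count,
    truncating the (rapidly converging) primary sum at n_max."""
    return sum(geometric_pmf(n) * poisson_pmf(k, 0.5 * n) for n in range(n_max))
```

This reproduces Pr(N = 0) = 0.458630 and Pr(N = 1) = 0.191368 from the solution above.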





18.15. Without the deductible, E[S] = 0.3(10) + 0.3(20) + 0.4(50) = 29. The probabilities of aggregate losses of 0, 10, and 20 (computed directly, without the recursive formula) are:

g_0 = e⁻¹ = 0.367879
g_10 = e⁻¹(0.3) = 0.110364
g_20 = e⁻¹(0.3) + (e⁻¹/2)(0.3²) = 0.126918

The probability of 30 or greater is 1 − 0.367879 − 0.110364 − 0.126918 = 0.394838. So E[S ∧ 30], using the first method, is

E[S ∧ 30] = 0.110364(10) + 0.126918(20) + 0.394838(30) = 15.4872

The expected amount of claims over 30 is 29 − 15.4872 = 13.5128. (E)

18.16. The probabilities that aggregate losses are 0 or 1 are

g_0 = e⁻² = 0.135335
g_1 = e⁻²(2)(1/3) = 0.090224

So the probability that aggregate losses are 2 or more is 1 − 0.135335 − 0.090224 = 0.774441. Then

E[S] = 2(2) = 4
E[S ∧ 2] = 0.090224(1) + 0.774441(2) = 1.639106
E[(S − 2)+] = 4 − 1.639106 = 2.360894    (B)

18.17.

E[S] = 3[0.3(1) + 0.2(2) + 0.1(3)] = 3
Pr(S = 0) = 0.4³ = 0.064
E[S ∧ 1] = 1 − 0.064 = 0.936
E[(S − 1)+] = 3 − 0.936 = 2.064    (C)


18.18. E[S] = E[N]E[X] = (1.2)(170) = 204. Now we calculate the probabilities that S, aggregate prizes, is 0 or 100.

g_0 = 0.8(0.2) + 0.2(0.2²) = 0.168
g_100 = 0.8(0.7) + 2(0.2)(0.2)(0.7) = 0.616

In the calculation of g_100 = Pr(S = 100), we added the probability of one prize of 100 and of two prizes, either the first 0 and the second 100 or the first 100 and the second 0 (hence the multiplication by 2). Pr(S ≥ 200) = 1 − 0.168 − 0.616 = 0.216. So

E[S ∧ 200] = 0.616(100) + 0.216(200) = 104.8
E[(S − 200)+] = 204 − 104.8 = 99.2

The answer is 2.75(99.2) = 272.8. (D)

18.19. Retained claims are claims not paid by the reinsurer. Reinsurance pays 0 unless a member of the group has claims of 3, in which case it pays 1. The probability of a claim of 3 is 0.1, so the average reinsurance payment per person is (0.1)(1) = 0.1, and total reinsurance claims for the group are 2(0.1) = 0.2. The reinsurance premium is therefore (1.1)(0.2) = 0.22.

The dividend is 3 minus administrative expenses of (0.2)(3) = 0.6, reinsurance premium of 0.22, and claims, or 3 − 0.6 − 0.22 − retained claims = 2.18 − retained claims, but not less than 0. If retained claims are greater than 2, the dividend will be 0.

The probability of retained claims of 0 is the probability that both members have claims of 0, which is (0.4)(0.4) = 0.16. The probability of retained claims of 1 is the probability that one member has claims of 0 and the other has claims of 1, or 2(0.4)(0.3) = 0.24. Since reinsurance pays the amount of the claim above 2, the probability that retained claims for a member are 2 is the probability that claims are 2 or greater, or 0.3. The probability that retained claims for the group are 2 is the probability of 0 for one member and 2 retained for the other, 2(0.4)(0.3) = 0.24, plus the probability of 1 for both, (0.3)(0.3) = 0.09, so the total probability of retained claims of 2 is 0.24 + 0.09 = 0.33. Summarizing:

Retained Claims    Probability    Dividend
0                  0.16           2.18
1                  0.24           1.18
2                  0.33           0.18

The expected value of the dividend is therefore

(0.16)(2.18) + (0.24)(1.18) + (0.33)(0.18) = 0.6914    (A)

18.20. At each factory, expected repair costs are E[X] = 0.3(1) + 0.2(2) + 0.1(3) = 1, and the limited expected value at 1 is E[X ∧ 1] = 0.6. Therefore, the insurance premium is 1.1(1 − 0.6) = 0.44. Non-random profit is then 3 − 0.15(3) − 2(0.44) = 1.67. Profit will be non-negative if

1. both factories have 0 repair cost, probability 0.4² = 0.16, or
2. one factory has non-zero repair costs and the other has 0 repair costs, probability 2(0.4)(0.6) = 0.48.

The expected value of non-negative profit is 0.16(1.67) + 0.48(0.67) = 0.5888. (E)


18.21. E[S] = 4(40) = 160. To calculate E[S ∧ 100] we need the probabilities of aggregate losses of 0, 40, and 80, which are the geometric distribution's probabilities (p_n) of 0, 1, and 2 claims; these can be calculated recursively by repeated multiplication by β/(1 + β) = 0.8.

p_0 = 1/(1 + 4) = 0.2
p_1 = 0.8(0.2) = 0.16
p_2 = 0.8(0.16) = 0.128

So Pr(N > 2) = 1 − 0.2 − 0.16 − 0.128 = 0.512.

E[S ∧ 100] = 0.16(40) + 0.128(80) + 0.512(100) = 67.84
E[(S − 100)+] = 160 − 67.84 = 92.16    (C)

18.22. Expected aggregate claims, S, is E[N]E[X] = (1.3)(4) = 5.2. To exceed 4(5.2) = 20.8 there must be two claims, either one for 20 and one for 10, probability 2(0.4)(0.1)(0.2) = 0.016, or two for 20, probability 0.4(0.1²) = 0.004, so the total probability is 0.016 + 0.004 = 2%. (A)

18.23. The probability of paying less than or equal to $150 is the probability that all losses will be less than or equal to 500, since an individual loss greater than 500 is at least 800, on which 200 is paid after the per-event and annual deductibles. The modified frequency distribution can be calculated by multiplying the Poisson parameter λ = 0.15 by the probability of a claim above 500, as we learned in Lesson 13, so the new parameter is 0.15(1 − 0.10 − 0.25) = 0.0975. Then the probability of no losses above 500 is e^(−0.0975) = 0.9071 and the probability of at least one loss above 500 is 1 − 0.9071 = 9.29%. (E)

Quiz Solutions

18-1. We'll use the second method.

E[S] = [0.3 + 0.1(2) + 0.1(3)][0.2(100) + 0.4(200) + 0.3(300) + 0.1(400)] = (0.8)(230) = 184

g_0 = 0.5                    S_S(0) = 0.5
g_100 = (0.3)(0.2) = 0.06    S_S(100) = 0.5 − 0.06 = 0.44

E[S ∧ 200] = (100)(0.5) + (100)(0.44) = 94
E[(S − 200)+] = 184 − 94 = 90

The stop-loss reinsurance premium is 1.5(90) = 135.
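The quiz answer can also be verified by brute-force convolution of the two tables (a sketch of my own, not the manual's method):

```python
from itertools import product

counts = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.1}       # claim-count distribution
sizes = {100: 0.2, 200: 0.4, 300: 0.3, 400: 0.1}  # claim-size distribution

# Build the aggregate-loss pmf by enumerating every combination of claims.
agg = {0: counts[0]}
for n, pn in counts.items():
    if n == 0:
        continue
    for combo in product(sizes, repeat=n):
        pr = pn
        for c in combo:
            pr *= sizes[c]
        s = sum(combo)
        agg[s] = agg.get(s, 0.0) + pr

# Net stop-loss premium over a deductible of 200, loaded at 150%.
premium = 1.5 * sum(pr * max(s - 200, 0) for s, pr in agg.items())
print(round(premium, 6))   # 135.0
```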

Lesson 19

Aggregate Losses: Miscellaneous Topics

Reading: Loss Models Fourth Edition 9.4, 9.6.5

This lesson is devoted to topics on the syllabus that do not fit conveniently into the other lessons.

19.1 Exact Calculation of Aggregate Loss Distribution

In some special cases, one can combine the frequency and severity models into a closed form for the distribution function of aggregate losses. The distribution function of aggregate losses at x is the sum over n of the probabilities that the claim count equals n and the sum of n loss sizes is less than or equal to x. When the sum of loss random variables has a simple distribution, it may be possible to calculate the aggregate loss distribution. Two cases for which the sum of independent random variables has a simple distribution are:

1. Normal distribution. If the X_i are normal with mean µ and variance σ², their sum is normal.

2. Exponential or gamma distribution. If the X_i are exponential or gamma, their sum has a gamma distribution.

We shall now discuss these distributions in greater detail.

19.1.1 Normal distribution

If n random variables X_i are independent and normally distributed with parameters µ and σ², their sum is normally distributed with parameters nµ and nσ². Thus we can calculate the probability that the sum is less than a specific value by referring to the normal distribution table.

Example 19A For an insurance coverage, the number of losses follows a binomial distribution with m = 2, q = 0.3. Loss sizes are normally distributed with mean 1000 and variance 40,000.

(1) Determine the probability that aggregate losses are less than 1200. Do not use the normal approximation.
(2) Repeat (1) with the normal approximation.

Answer: 1. The probabilities of 0, 1, and 2 losses are p_0 = 0.7² = 0.49, p_1 = 2(0.3)(0.7) = 0.42, and p_2 = 0.09. If there is no loss, then aggregate losses are certainly below 1200. If there is N = 1 loss, then for the aggregate distribution S,

Pr(S < 1200 | N = 1) = Φ((1200 − 1000)/√40,000) = Φ(1) = 0.8413

If there are N = 2 losses, the sum of those 2 losses is normal with mean 2000 and variance 80,000, so

Pr(S < 1200 | N = 2) = Φ((1200 − 2000)/√80,000) = Φ(−2.83) = 0.0023

The probability that aggregate losses are less than 1200 is

Pr(S < 1200) = 0.49 + 0.42(0.8413) + 0.09(0.0023) = 0.8436

2. With the normal approximation, the aggregate mean is (0.6)(1000) = 600 and the aggregate variance is

Var(S) = E[N] Var(X) + Var(N) E[X]² = 0.6(40,000) + 0.42(1000²) = 444,000

The probability that aggregate losses are less than 1200 is

Pr(S < 1200) = Φ((1200 − 600)/√444,000) = Φ(0.90) = 0.8159
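Both parts of Example 19A can be reproduced numerically. The sketch below is mine, not the manual's; it builds the standard normal CDF from `math.erf` rather than a printed table, so the approximation comes out slightly different from the table value Φ(0.90) = 0.8159:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p = {0: 0.49, 1: 0.42, 2: 0.09}         # binomial(2, 0.3) count probabilities

# Exact: given N = n >= 1 losses, S is normal with mean 1000n, variance 40000n.
exact = p[0] + sum(p[n] * phi((1200 - 1000 * n) / sqrt(40_000 * n))
                   for n in (1, 2))

# Normal approximation to S itself.
mean = 0.6 * 1000                       # E[N] E[X]
var = 0.6 * 40_000 + 0.42 * 1000**2     # E[N] Var(X) + Var(N) E[X]^2
approx = phi((1200 - mean) / sqrt(var))
```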

19.1.2 Exponential and gamma distributions

The Erlang distribution function

The sum of n exponential random variables with common mean θ has a gamma distribution with parameters α = n and θ. When a gamma distribution's α parameter is an integer, the gamma distribution is also called an Erlang distribution. I will use the letter n instead of the letter α as the first parameter of an Erlang distribution.

Let's discuss how to compute the distribution function F(x) for an Erlang distribution with parameters n and θ. If n = 1, the Erlang distribution is an exponential distribution, and F(x) = 1 − e^(−x/θ). But let's develop a formula that works for any n.

In a Poisson process¹ with parameter 1/θ, the time between events is exponential with mean θ. Therefore, the time until the nth event occurs is Erlang with parameters n and θ. In other words, the probability that n events occur before time x is F_X(x), where X is Erlang(n, θ). Equivalently, the probability of at least n events occurring before time x in a Poisson process is equal to F_X(x). By Poisson probability formulas, the probability of exactly n events occurring before time x in this Poisson process is e^(−x/θ)(x/θ)^n/n!. The formula we have just developed for the Erlang distribution function F_X(x) is

F_X(x) = 1 − Σ_{j=0}^{n−1} e^(−x/θ) (x/θ)^j / j!        (19.1)

Example 19B Loss sizes on an insurance coverage are exponentially distributed with mean 1000. Three losses occur in a day. Calculate the probability that the total insurance reimbursement for these three losses is greater than 4000.

¹Poisson processes are covered in Exam ST. All Poisson processes mentioned in this lesson are homogeneous. If you have not taken Exam ST, just be aware of the following:
• In a Poisson process with parameter λ, the number of events occurring by time t has a Poisson distribution with mean λt.
• In a Poisson process with parameter λ, the time between events follows an exponential distribution with parameter 1/λ.


Answer: Let X be the total insurance reimbursement. Total insurance reimbursement follows an Erlang distribution with parameters n = 3 and θ = 1000. Notice that the example asks for the survival function at 4000 rather than the distribution function. As discussed above, the corresponding Poisson process has parameter 1/1000. The probability that the reimbursement is greater than 4000 is the probability of fewer than three events occurring before time 4000, or

Pr(X > 4000) = e^(−4000/1000) Σ_{n=0}^{2} (4000/1000)^n / n! = e⁻⁴(1 + 4 + 8) = 0.2381
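Equation (19.1) in the survival-function form used in Example 19B is easy to translate to code. This is a sketch of my own (the function name is an assumption, not from the manual):

```python
from math import exp, factorial

def erlang_sf(x, n, theta):
    """Pr(X > x) for X ~ Erlang(n, theta): the probability that fewer than
    n events of a Poisson process with rate 1/theta occur by time x."""
    lam = x / theta
    return sum(exp(-lam) * lam**j / factorial(j) for j in range(n))
```

For Example 19B, `erlang_sf(4000, 3, 1000)` reproduces e⁻⁴(1 + 4 + 8) ≈ 0.2381.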



Since we can calculate the Erlang distribution function, we can in principle calculate F_S(x) for any compound distribution with exponential severities. Suppose the mean of the exponential is θ. To calculate F_S(x), sum up over all n the probabilities of n events times the probability that an Erlang with parameters n and θ is no greater than x. In symbols:

F_S(x) = Σ_{n=0}^{∞} Pr(N = n) F_{X_n}(x)

where X_n follows an Erlang distribution with parameters n and θ. The only problem with this method is that the sum is usually infinite, and to decide when to stop summing, we'd have to determine when the sum has converged to the desired accuracy. However, if the frequency distribution has only a finite number of values (for example, if frequency is binomial), this formula provides an exact answer.

Example 19C Claim counts have a binomial distribution with m = 2, q = 0.2. Claim sizes are exponential with mean 1000. Calculate the probability that aggregate claims are less than their mean.

Answer: The mean is 2(0.2)(1000) = 400. We must calculate² F_S(400). The probability of zero claims is p_0 = 0.8² = 0.64. The probability of one claim is p_1 = 2(0.8)(0.2) = 0.32. The probability that one exponential claim is less than 400 is 1 − e^(−400/1000) = 0.329680. The probability of two claims is 0.04. The probability that an Erlang distribution with parameters n = 2 and θ = 1000 is less than 400 is the probability of at least two events by time 400 in a Poisson process with parameter 1/1000, or 1 − e^(−0.4)(1 + 0.4) = 0.061552. Summing up, the probability that aggregate claims are less than their mean is

F_S(400) = 0.64 + 0.32(0.329680) + 0.04(0.061552) = 0.74796
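The finite sum for a binomial frequency can be sketched directly (my own illustration, not from the manual; function names are assumptions):

```python
from math import comb, exp, factorial

def erlang_cdf(x, n, theta):
    """F(x) for an Erlang(n, theta): at least n Poisson(1/theta) events by x."""
    lam = x / theta
    return 1 - sum(exp(-lam) * lam**j / factorial(j) for j in range(n))

def compound_cdf(x, m, q, theta):
    """F_S(x) for binomial(m, q) claim counts and exponential(theta)
    severities; a sum of n claims is Erlang(n, theta)."""
    total = 0.0
    for n in range(m + 1):
        p_n = comb(m, n) * q**n * (1 - q)**(m - n)
        total += p_n * (1.0 if n == 0 else erlang_cdf(x, n, theta))
    return total
```

For Example 19C, `compound_cdf(400, 2, 0.2, 1000)` reproduces F_S(400) = 0.74796.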



We can similarly calculate the distribution of the compound model if the severities are Erlang (instead of exponential) with the same parameter θ, since the sum of n such Erlangs is an Erlang whose first parameter is the sum of the n first parameters and whose second parameter is θ. But since n will be a big number (for example, summing up two Erlangs each with n_i = 2 results in n = 4), this is not a realistic set-up for an exam question.

Negative binomial/exponential compound models

The following material is unlikely to appear on an exam. In the fourth edition of Loss Models, it only appears in an example.

²The aggregate distribution is continuous except at 0, so I will use "less than 400" and "no greater than 400" interchangeably.

19. AGGREGATE LOSSES: MISCELLANEOUS TOPICS


Suppose a compound model has negative binomial frequency with parameters r and β, and exponential severities with parameter θ. Moreover, suppose r is an integer. The textbook proves the remarkable result that this model is equivalent to a compound model with binomial frequency with parameters m = r and q = β/(1 + β) and exponential severities with parameter θ(1 + β). We can then use the method discussed above to calculate the compound model's probabilities. The textbook proves this by algebraic manipulation on the probability generating function of the compound distribution. You need not learn this derivation. To remember the above-mentioned parameters:

• The binomial m equals the negative binomial r. (Memorize this fact.)

• The probability of 0 must not change. For the negative binomial, p_0 = 1/(1 + β)^r. For which binomial would this be p_0? Since m = r, we must have q = β/(1 + β) to make the probability p_0.



• The expected value of the compound model must not change. For the original negative binomial/exponential model, it is rβθ. Therefore, for the binomial/exponential model with binomial mean rβ/(1 + β) as derived in the previous paragraph, the exponential must have mean θ(1 + β).

The easiest case (and considering how unlikely exam questions are on this topic, perhaps the only one that would appear on an exam) is when the r parameter of the negative binomial distribution is equal to 1. In this case, the negative binomial distribution is a geometric distribution, and the Erlang distribution is an exponential distribution. The binomial distribution in the equivalent model is a Bernoulli with p_0 = 1/(1 + β), p_1 = β/(1 + β). Thus the compound geometric/exponential model's distribution function is a two-point mixture of a degenerate distribution at 0 with weight 1/(1 + β) and an exponential distribution with mean θ(1 + β), weight β/(1 + β).

Example 19D Claim counts follow a geometric distribution with mean 0.2. Claim sizes follow an exponential distribution with mean 1000. A stop-loss reinsurance contract pays all losses with an aggregate deductible of 5000.
1. Determine the probability that aggregate losses will be greater than 5000.
2. Determine the expected losses paid under the contract.

Answer: The model is equivalent to a model having
• Bernoulli frequency with parameter β/(1 + β) = 0.2/1.2 = 1/6.
• Exponential severity with parameter (1 + β)θ = 1.2(1000) = 1200.
1. An exponential distribution with mean 1200 has survival function S(5000) = e^{−5000/1200} = 0.015504. Since the probability of a loss is 1/6, the probability that aggregate losses are greater than 5000 is 0.015504/6 = 0.002584.
2. We want E[(S − 5000)+] = E[S] − E[S ∧ 5000]. For an exponential, E[X ∧ x] = θ(1 − e^{−x/θ}) by the formulas in the distribution tables.
In our case, since the probability of a loss in the Bernoulli/exponential model is 1/6, the limited expected value E[S ∧ 5000] is 1/6 of the corresponding value for an exponential. Therefore,

E[(S − 5000)+] = (1/6)(1200) − (1/6)(1200)(1 − e^{−5000/1200}) = 200e^{−5000/1200} = 3.1008
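Both parts of Example 19D follow from the two-point-mixture representation; a minimal Python sketch (variable names are ours):

```python
import math

# Example 19D sketch: geometric(beta)/exponential(theta) aggregate losses are a
# two-point mixture: mass 1/(1+beta) at 0, else exponential with mean theta*(1+beta).
beta, theta, d = 0.2, 1000, 5000
q = beta / (1 + beta)        # 1/6, probability the aggregate is nonzero
scale = theta * (1 + beta)   # 1200

prob_exceed = q * math.exp(-d / scale)        # Pr(S > 5000)
stop_loss = q * scale * math.exp(-d / scale)  # E[(S - 5000)+]
print(round(prob_exceed, 6), round(stop_loss, 4))  # 0.002584 3.1008
```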

Now let’s do a non-geometric example.



 


Example 19E You are given:
(i) Claim counts follow a negative binomial distribution with r = 2, β = 0.25.
(ii) Claim sizes follow an exponential distribution with mean 800.
Calculate the probability that aggregate claims are less than 400.

Answer: The equivalent binomial model has binomial frequency m = 2, q = β/(1 + β) = 0.25/1.25 = 0.2 and exponential severity with mean θ(1 + β) = 800(1.25) = 1000. So this example reduces to Example 19C, and the answer is the same: 0.74796.
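The parameter translation used in Example 19E can be captured in a tiny helper (a sketch; the function name is ours):

```python
# Sketch of the equivalence: NegBin(r, beta) frequency with exponential(theta)
# severity matches Binomial(m=r, q=beta/(1+beta)) with exponential(theta*(1+beta)).
def equivalent_binomial_model(r, beta, theta):
    return r, beta / (1 + beta), theta * (1 + beta)

m, q, scale = equivalent_binomial_model(2, 0.25, 800)
print(m, q, scale)  # 2 0.2 1000.0
```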

19.1.3

Compound Poisson models

If S_j are a set of compound Poisson distributions with Poisson parameters λ_j and severity random variables X_j, the sum S = S_1 + · · · + S_n is a compound Poisson distribution with Poisson parameter λ = λ_1 + · · · + λ_n and severity equal to a weighted average, or a mixture, of the individual severities X_j. The weights are λ_j/λ. This means that if you are interested in the distribution function of S, you can calculate it directly rather than calculating the distribution functions of the S_j separately and convolving them.

The syllabus goes to the trouble of excluding the textbook's example of compound Poisson models, although it doesn't exclude the textbook's discussion preceding the example, which I've summarized in the above paragraph. Perhaps they think the textbook's example (which involves a sum of ten compound Poisson models with Erlang severities) is too complicated, so here's a simple example.

Example 19F An automobile liability policy covers bodily injury and property damage losses. Annual aggregate losses from bodily injury claims follow a compound Poisson process with mean 0.1 per year. Loss sizes follow an exponential distribution with mean 10,000. Annual aggregate losses from property damage claims follow a compound Poisson process with mean 0.3 per year. Loss sizes follow an exponential distribution with mean 5,000.
Calculate the median size of the next loss from this policy.

Answer: The combined losses form a compound Poisson process with mean 0.4 per year, in which loss sizes are a mixture of the two exponential distributions with weight 0.1/(0.1 + 0.3) = 0.25 on the distribution with mean 10,000 and weight 0.75 on the distribution with mean 5,000. Setting the survival function equal to 0.5,

0.25e^{−x/10,000} + 0.75e^{−x/5,000} = 0.5

Let u = e^{−x/10,000}, and multiply through by 4.

3u² + u − 2 = 0
u = (−1 + √25)/6 = 2/3
e^{−x/10,000} = 2/3
x = −10,000 ln(2/3) = 4055
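When the quadratic shortcut of Example 19F isn't available, the same median can be found numerically from the mixture's survival function; a minimal sketch:

```python
import math

# Example 19F check: solve 0.25*exp(-x/10000) + 0.75*exp(-x/5000) = 0.5
# by bisection (any root finder would do).
def survival(x):
    return 0.25 * math.exp(-x / 10000) + 0.75 * math.exp(-x / 5000)

lo, hi = 0.0, 50000.0
for _ in range(100):
    mid = (lo + hi) / 2
    if survival(mid) > 0.5:
        lo = mid
    else:
        hi = mid
print(round(lo))  # 4055
```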

19.2



Discretizing

This topic was briefly on the exam syllabus in the early 2000s, then removed, then returned in 2005. To my knowledge, no exam questions have ever been asked on it, so you're safe skipping it. At most, you probably only need to know the method of rounding, which is easy.


The recursive method for calculating the aggregate distribution as well as the direct convolution method require a discrete severity distribution. Usually the severity distribution is continuous. There are two methods for discretizing the distribution. In both methods, you pick a span, the distance between the points that will have a positive probability in the discretized distribution. For example, you may create a distribution with probabilities at all integers (span = 1), or only at multiples of 1000 (span = 1000). In the following example, recall our notational convention that f_n is the probability that the severity equals n. (We use p_n for the frequency and g_n for aggregate losses, but those won't come up in the following.)

19.2.1

Method of rounding

In the method of rounding, the severity values within a span are rounded to the endpoints. If h is the span, f_kh is set equal to F((k + 0.5)h − 0) − F((k − 0.5)h − 0), where −0 indicates that the lower bound is included but the upper bound isn't; as usual in rounding, 0.5 rounds up. This rounding convention makes no difference if F is continuous everywhere.

Example 19G Loss sizes follow a Pareto distribution with α = 2, θ = 3. The distribution will be discretized by the method of rounding with a span of 4.
Calculate the resulting probabilities of 0, 4, 8, and 12: f_0, f_4, f_8, and f_12.

Answer: Anything below 2 gets rounded to 0, so f_0 = F(2) = 1 − (3/5)² = 0.64.
Anything between 2 and 6 gets rounded to 4, so f_4 = F(6) − F(2) = (3/5)² − (3/9)² = 0.24889.
Anything between 6 and 10 gets rounded to 8, so f_8 = F(10) − F(6) = (3/9)² − (3/13)² = 0.05786.
Anything between 10 and 14 gets rounded to 12, so f_12 = F(14) − F(10) = (3/13)² − (3/17)² = 0.02211.
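The method of rounding is mechanical enough to automate; a minimal Python sketch for the Pareto of Example 19G (function names are ours):

```python
# Method of rounding with span h: f_{kh} = F((k + 0.5)h) - F((k - 0.5)h).
# Sketch for the Pareto of Example 19G (alpha = 2, theta = 3, h = 4).
def pareto_cdf(x, alpha=2, theta=3):
    return 1 - (theta / (theta + x)) ** alpha if x > 0 else 0.0

def discretize_rounding(h, n_points):
    probs = [pareto_cdf(h / 2)]  # everything below h/2 rounds to 0
    for k in range(1, n_points):
        probs.append(pareto_cdf((k + 0.5) * h) - pareto_cdf((k - 0.5) * h))
    return probs

probs = discretize_rounding(4, 4)
print([round(p, 5) for p in probs])  # [0.64, 0.24889, 0.05786, 0.02211]
```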

19.2.2


Method of local moment matching

The method of rounding will usually result in a distribution whose mean is different from the original mean. The method of local moment matching guarantees that the discretized distribution will have the same mean as the original distribution. In the method of local moment matching, the probabilities and partial moments within a span are distributed between the two endpoints. Let h be the span, the distance between endpoints. We'll only discuss the simplest case, in which only the first moments are matched. In this case, the endpoints of an interval are x_k and x_{k+1}, where x_k = x_0 + kh. (Usually x_0 = 0 for the distributions we use.) We assign masses m^0_k and m^1_k to these endpoints such that

1. m^0_k + m^1_k = F((k + 1)h) − F(kh). This means the probabilities match.

2. x_k m^0_k + x_{k+1} m^1_k = ∫_{kh}^{(k+1)h} x f(x) dx. This means that locally, the contribution of the span to the mean matches.

These are two equations in two unknowns: the two masses. After the masses are determined, they are added together to get the probabilities. In other words, f_kh is equal to the sum of m^0_k and m^1_{k−1}, the masses from the starting endpoint of the span and the ending endpoint of the previous span. As in the method of rounding, the convention (if F is not fully continuous) is to include the left endpoint but not the right endpoint in calculating the probabilities and moments of the spans.

Example 19H Repeat Example 19G using the method of local moment matching, matching the first moment. Calculate f_0 and f_4.


Answer: In this example, h = 4. We must calculate m^0_0, m^1_0, and m^0_1.

The sum of the two masses m^0_0 and m^1_0 for the first span should equal Pr(0 ≤ X < 4) = 1 − (3/7)² = 0.81633. Note that for this Pareto,

E[X ∧ x] = (θ/(α − 1))(1 − (θ/(x + θ))^{α−1}) = 3(1 − 3/(3 + x)) = 3x/(3 + x)

so that E[X ∧ 4] = 12/7 = 1.71429 and E[X ∧ 8] = 24/11. Also, in general,

∫_a^b x f(x) dx = (E[X ∧ b] − bS(b)) − (E[X ∧ a] − aS(a))

The sum 0m^0_0 + 4m^1_0 should equal

∫_0^4 x f(x) dx = E[X ∧ 4] − 4S(4) = 1.71429 − 4(0.18367) = 0.97959

Then m^1_0 = 0.97959/4 = 0.24490 and m^0_0 = 0.81633 − 0.24490 = 0.57143.

For the second span, the sum of the two masses m^0_1 and m^1_1 should equal F(8) − F(4) = 0.10929. The first moment matching is

4m^0_1 + 8m^1_1 = (E[X ∧ 8] − 8S(8)) − (E[X ∧ 4] − 4S(4)) = (24/11 − 8(3/11)²) − 0.97959 = 0.60719

Solving:

m^0_1 = 0.10929 − m^1_1
4(0.10929 − m^1_1) + 8m^1_1 = 0.60719
0.10929 + m^1_1 = 0.60719/4 = 0.15180
m^1_1 = 0.15180 − 0.10929 = 0.04250
m^0_1 = 0.10929 − 0.04250 = 0.06679

Then f_0 = m^0_0 = 0.57143 and f_4 = m^1_0 + m^0_1 = 0.24490 + 0.06679 = 0.31169. To calculate f_8, you'd need the left mass of the span [8, 12).



The textbook provides the general formulas

m^0_k = −∫_{x_k − 0}^{x_k + h − 0} ((x − x_k − h)/h) dF(x)
m^1_k = ∫_{x_k − 0}^{x_k + h − 0} ((x − x_k)/h) dF(x)

when matching one moment, where −0 as before means include the point mass at the lower bound only. For a continuous F, dF(x) = f(x) dx.

C/4 Study Manual—17th edition
Copyright ©2014 ASM


The textbook provides the following simpler set of formulas directly for the probabilities f_k (rather than for the masses m^0_k and m^1_k) in the exercises:

f_0 = 1 − E[X ∧ h]/h
f_ih = (2E[X ∧ ih] − E[X ∧ (i − 1)h] − E[X ∧ (i + 1)h])/h,   i = 1, 2, . . .   (19.2)

The textbook also generalizes to any number of moments. In this case, you select points uniformly throughout the span, p + 1 points including the endpoints for p moments, and get p + 1 equations in p + 1 unknowns. This means that probability gets assigned to the intermediate points as well; if you were matching second moments with a span of 4, you’d assign probabilities to 0, 2, 4, . . . . The probabilities assigned to the intermediate points are the masses, while at the endpoints you’d add up the masses for the two intervals ending and starting at the endpoint, as above. When matching first moments, you will never get negative masses, but you could get negative masses when matching higher moments. Due to the difficulty of the calculation, I doubt you will be asked to discretize matching higher moments on an exam. Actually, I doubt there will be any discretizing questions of any type on the exam.
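Formulas (19.2) make first-moment matching a one-liner per point; a sketch reproducing Example 19H (the helper name is ours):

```python
# Formula (19.2) applied to Example 19H: Pareto alpha = 2, theta = 3, span h = 4.
# lim_ev is our name for the limited expected value E[X ^ x].
def lim_ev(x, alpha=2, theta=3):
    return (theta / (alpha - 1)) * (1 - (theta / (theta + x)) ** (alpha - 1))

h = 4
f0 = 1 - lim_ev(h) / h
f4 = (2 * lim_ev(h) - lim_ev(0) - lim_ev(2 * h)) / h
print(round(f0, 5), round(f4, 5))  # 0.57143 0.31169
```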

Exercises

Exact Calculation of Aggregate Loss Distribution

19.1. In a collective risk model for aggregate losses, claim sizes follow a normal distribution with parameters µ = 25, σ² = 50. Claim sizes are independent of each other. Claim counts have the following distribution:

n    p_n
0    0.5
1    0.3
2    0.2

Claim sizes are independent of claim counts.
Calculate the probability that aggregate claims are greater than 40.

19.2. In a collective risk model for aggregate losses, claim counts follow a binomial distribution with m = 3, q = 0.4. Claim sizes follow a normal distribution with parameters µ = 100, σ² = 1000. Claim sizes are independent of each other and of claim counts.
Calculate the probability that aggregate claims are greater than their mean.

19.3. You are given:
(i) Claim counts follow a binomial distribution with parameters m = 2, q = 0.2.
(ii) Claim sizes follow an exponential distribution with mean 5000.
(iii) Claim sizes are independent of each other and of claim counts.
Determine the probability that aggregate losses are greater than 3000.



19.4. Claim sizes follow an exponential distribution with θ = 5, and are independent of each other. Claim counts are independent of claim sizes, and have the following distribution:

n    p_n
0    0.7
1    0.2
2    0.1

Calculate F_S(3).

19.5. For a block of 50 policies:
(i) Claim sizes on a coverage follow an exponential distribution with mean 1500.
(ii) Claim counts for each policy follow a negative binomial distribution with parameters r = 0.02 and β = 25.
(iii) Claim sizes are independent of each other. Claim counts and claim sizes are independent.
Determine the probability that aggregate claims are within 0.842 standard deviations of the mean.

19.6. Claim sizes on a coverage follow an exponential distribution with mean 500. 100 lives are covered under the contract. Claim counts for each life follow a negative binomial distribution with mean 0.1 and variance 1.1. Claim counts and claim sizes are independent.
A stop-loss reinsurance contract is available at 150% of expected claim cost. You are willing to pay 1500 for the contract.
Determine the aggregate deductible needed to make the cost of the contract 1500.

19.7. For a collective risk model:
(i) Claim counts follow a geometric distribution with β = 0.2.
(ii) Claim sizes follow an exponential distribution with θ = 8000.
(iii) Claim sizes are independent of each other and of claim counts.
Calculate TVaR0.9(S) for the aggregate loss distribution S.

Discretizing

19.8.

X has an exponential distribution with mean 1.

Calculate p_2 of the distribution discretized using the method of rounding with a span of 1.

19.9. Claim counts follow a Poisson distribution with mean 3. Claim sizes follow an exponential distribution with θ = 2. Claim counts and claim sizes are independent. The severity distribution is discretized using the method of rounding with span 1. A stop-loss reinsurance contract has an aggregate deductible of 1.6.
Calculate expected losses paid by the reinsurance contract.

19.10. X has a single-parameter Pareto distribution with θ = 2, α = 1. The distribution of X is discretized using the method of local moment matching with span 2, matching the first moment.
Calculate p_4.



Use the following information for questions 19.11 and 19.12:
You are given:
(i) Claim counts follow a negative binomial distribution with r = 2, β = 0.5.
(ii) Claim sizes follow an exponential distribution with θ = 3.
(iii) Claim counts and claim sizes are independent.

19.11. The severity distribution is discretized using the method of local moment matching with span 1. Calculate F_S(1).

19.12. Using the actual severity distribution, calculate F_S(1).

Additional released exam questions: CAS3-S06:36

Solutions

19.1. Let N be the number of claims and S aggregate claims. If there is one claim,

Pr(S > 40 | N = 1) = Φ((25 − 40)/√50) = Φ(−2.12) = 0.0170

If there are two claims, their sum is normal with mean 50 and variance 100, so the probability that aggregate losses are greater than 40 is

Pr(S > 40 | N = 2) = Φ((50 − 40)/√100) = Φ(1) = 0.8413

The probability aggregate claims are greater than 40 is 0.3(0.0170) + 0.2(0.8413) = 0.1734.

19.2. Mean aggregate claims is 3(0.4)(100) = 120. Conditional probability of aggregate claims greater than 120 for each number of claims N is

Pr(S > 120 | N = 1) = 1 − Φ((120 − 100)/√1000) = Φ(−0.63) = 0.2643
Pr(S > 120 | N = 2) = 1 − Φ((120 − 200)/√2000) = Φ(1.79) = 0.9633
Pr(S > 120 | N = 3) = 1 − Φ((120 − 300)/√3000) = Φ(3.29) = 0.9995

The probabilities of 1, 2, 3 claims are p_1 = 3(0.6²)(0.4) = 0.432, p_2 = 3(0.4²)(0.6) = 0.288, and p_3 = 0.4³ = 0.064. The probability aggregate losses are greater than their mean is 0.432(0.2643) + 0.288(0.9633) + 0.064(0.9995) = 0.4556.

19.3.

If there is one claim, the probability that aggregate losses are greater than 3000 is

Pr(S > 3000 | N = 1) = e^{−3000/5000} = 0.548812

If there are two claims, aggregate losses are Erlang with n = 2 and

Pr(S > 3000 | N = 2) = e^{−3000/5000}(1 + 0.6) = 0.878099


The distribution of claim counts is p_1 = 2(0.2)(0.8) = 0.32 and p_2 = 0.2² = 0.04. The probability that aggregate losses are greater than 3000 is

Pr(S > 3000) = 0.32(0.548812) + 0.04(0.878099) = 0.2107

19.4. The probability of 0 is 0.7. The probability that one claim is less than 3 is the probability that an exponential with mean 5 is less than 3, or 1 − e^{−3/5} = 0.45119. The probability that two claims are less than 3 is the probability that an Erlang with parameters n = 2 and θ = 5 is less than 3. That probability is the same as the probability that at least two events occur by time 3 in a Poisson process with parameter 1/5, or

1 − e^{−0.6}(1 + 0.6) = 0.12190

Summing up the three probabilities, F_S(3) = 0.7 + 0.2(0.45119) + 0.1(0.12190) = 0.80243.

19.5. For the block of 50 policies, claim counts follow a negative binomial with r = 50(0.02) = 1 and β = 25, or a geometric. The mean and variance of the geometric are β = 25 and β(1 + β) = 650 respectively. Letting S be aggregate losses for the entire block, E[S] = 25(1500) = 37,500. By the compound variance formula, Var(S) = 25(1500²) + 650(1500²), and the standard deviation is σ = 1500√(25 + 650) = 38,971. Then 37,500 − 0.842(38,971) = 4686 and 37,500 + 0.842(38,971) = 70,314, so we want the probability of 4686 ≤ S ≤ 70,314.

In the equivalent Bernoulli/exponential distribution, the Bernoulli has parameter β/(1 + β) = 25/26 and the exponential has parameter θ(1 + β) = (26)(1500) = 39,000. The probabilities of the aggregate distribution being below 4686 and 70,314 are

F_S(4686) = 1 − (25/26)e^{−4686/39,000} = 0.1473
F_S(70,314) = 1 − (25/26)e^{−70,314/39,000} = 0.8415

So the probability of being in the interval is 0.8415 − 0.1473 = 0.6942. If the normal approximation had been used, the probability would be 0.6 since Φ(0.842) = 0.8.

19.6. For each life, rβ = 0.1 and rβ(1 + β) = 1.1, so β = 10 and r = 0.01. For the group, r = 100(0.01) = 1 and β = 10, making the distribution geometric. In order to make the cost 1500, expected claim costs, E[(S − d)+], must be 1000. We have

E[(S − d)+] = ∫_d^∞ (1 − F_S(x)) dx
= ∫_d^∞ (β/(1 + β)) e^{−x/(θ(1+β))} dx
= (10/11) ∫_d^∞ e^{−x/5500} dx
= (10/11)(5500e^{−d/5500})
= 5000e^{−d/5500}

We set E[(S − d)+] equal to 1000.

5000e^{−d/5500} = 1000
e^{−d/5500} = 0.2
d = −5500 ln 0.2 = 8852

19.7. The aggregate loss distribution is equivalent to Bernoulli claim counts with q = β/(1 + β) = 0.2/1.2 = 1/6 and exponential claim sizes with mean θ(1 + β) = 8000(1.2) = 9600. To find the 90th percentile of S, since the probability of a loss is 1/6, we need x for which Pr(S > x) = 0.1, and Pr(S > x) = Pr(N = 1) Pr(X > x) = (1/6) Pr(X > x), so we need Pr(X > x) = 0.6. Thus

e^{−x/9600} = 0.6
x = −9600 ln 0.6 = 4903.93

The average value of S given S > 4903.93, due to lack of memory of the exponential, is 9600, so TVaR0.9(S) = 4903.93 + 9600 = 14,503.93.

19.8.

The interval [1.5, 2.5) goes to 2, so p_2 = e^{−1.5} − e^{−2.5} = 0.1410.

19.9. The discretized distribution will have p_0 = Pr(X < 0.5) = 1 − e^{−0.5/2} = 0.221199 and p_1 = e^{−0.5/2} − e^{−1.5/2} = 0.306434. One problem with the method of rounding is that the mean of the discretized distribution is not the same as the mean of the original distribution. Fortunately, it is easy to calculate for an exponential (but not for other distributions). Since p_k = e^{−(k−0.5)/2} − e^{−(k+0.5)/2} except for p_0, we have

E[X] = Σ_{k=0}^∞ k p_k
= (e^{−1/4} − e^{−3/4}) + 2(e^{−3/4} − e^{−5/4}) + 3(e^{−5/4} − e^{−7/4}) + · · ·
= e^{−1/4} + e^{−3/4} + e^{−5/4} + · · ·
= e^{−1/4}/(1 − e^{−1/2}) = 1.979318

Expected aggregate losses are E[S] = 3(1.979318) = 5.937953.

We'll modify the Poisson to be the frequency of claims above 0, so that we don't have to use the recursive formula; we'll make the parameter λ′ = λ(1 − p_0) = 3(1 − 0.221199) = 2.336402. Then, letting p_n be the probability of n claims,

p_0 = e^{−λ′} = e^{−2.336402} = 0.0966745
p_1 = λ′e^{−λ′} = 2.336402(0.0966745) = 0.225871

The modified probability of a claim size of 1 (given that the claim size is not 0) is f_1 = 0.306434/(1 − 0.221199) = 0.393469. Letting g_n be the probability that aggregate losses are n,

g_0 = p_0 = 0.0966745
g_1 = p_1 f_1 = (0.225871)(0.393469) = 0.0888734


Thus the survival function of aggregate losses is

S_S(0) = 1 − 0.0966745 = 0.903325
S_S(1) = 0.903325 − 0.0888734 = 0.814452

Then E[S ∧ 1.6] = 0.903325 + 0.6(0.814452) = 1.391996. The expected losses paid by the reinsurance contract are E[S] − E[S ∧ 1.6] = 5.937953 − 1.391996 = 4.545957.

19.10. We'll use the textbook's formulas for the masses, with x_1 = 2 and x_2 = 4.

m^1_1 = ∫_2^4 ((x − 2)/2)(2/x²) dx
= ∫_2^4 ((x − 2)/x²) dx
= [ln x + 2/x] from 2 to 4
= ln 4 + 1/2 − ln 2 − 1 = ln 2 − 1/2 = 0.19315

m^0_2 = −∫_4^6 ((x − 4 − 2)/2)(2/x²) dx
= ∫_4^6 ((6 − x)/x²) dx
= [−6/x − ln x] from 4 to 6
= −1 − ln 6 + 3/2 + ln 4 = 1/2 + ln(2/3) = 0.09453

So p_4 = 0.19315 + 0.09453 = 0.28768.

Alternatively, we can use formulas (19.2). For a single-parameter Pareto with α = 1 and x ≥ θ,

E[X ∧ x] = ∫_0^x S(u) du = ∫_0^θ 1 du + ∫_θ^x (θ/u) du = θ(1 + ln(x/θ))

so

E[X ∧ 2] = 2
E[X ∧ 4] = 2(1 + ln 2)
E[X ∧ 6] = 2(1 + ln 3)

p_4 = (2(2)(1 + ln 2) − 2 − 2(1 + ln 3))/2 = 2 ln 2 − ln 3 = 0.28768


19.11. We'll use formulas (19.2) for the probabilities of severities.

E[X ∧ 1] = 3(1 − e^{−1/3}) = 0.85041
E[X ∧ 2] = 3(1 − e^{−2/3}) = 1.45975
f_0 = 1 − 0.85041 = 0.14959
f_1 = 2(0.85041) − 1.45975 = 0.24106

We'll modify the negative binomial to have non-zero claims only by multiplying β by 1 − f_0, obtaining β′ = (0.5)(1 − 0.14959) = 0.42520. We must also modify the probabilities of severities to make them conditional on severity not being zero by dividing by 1 − f_0; the modified f_1 is 0.24106/(1 − 0.14959) = 0.28346. Then

Pr(S = 0) = (1/(1 + β′))^r = (1/1.42520)² = 0.49232
p_1 = p_0 (rβ′/(1 + β′)) = 0.49232(2(0.42520)/1.42520) = 0.29376
Pr(S = 1) = (0.28346)(0.29376) = 0.08327

Then F_S(1) = 0.49232 + 0.08327 = 0.57559.

19.12. In the equivalent binomial/exponential compound model, the binomial has parameters m = 2, q = β/(1 + β) = 1/3 and the exponential has parameter θ(1 + β) = 3(1.5) = 4.5. The probability of 0 claims is (2/3)² = 4/9. In the equivalent model:

• The probability of 1 claim is 2(2/3)(1/3) = 4/9. If there is one claim, the probability of the claim being less than 1 is 1 − e^{−1/4.5} = 0.19926.

• The probability of 2 claims is 1/9. If there are 2 claims, the sum of the two claims is Erlang with parameters n = 2, θ = 4.5. The probability of the sum being less than 1 is the probability of at least two events by time 1 in a Poisson process with parameter 1/4.5, or 1 − e^{−1/4.5}(1 + 1/4.5) = 0.02132.

Summing up these probabilities,

F_S(1) = 4/9 + (4/9)(0.19926) + (1/9)(0.02132) = 0.53537

The discretized estimate was 7.5% too high.
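The exact value computed in 19.12 can be checked with a short Python sketch of the equivalent binomial/exponential model:

```python
import math

# Exercise 19.12 check: NegBin(r=2, beta=0.5)/exp(theta=3) is equivalent to
# Binomial(m=2, q=1/3)/exp(4.5); sum the 0-, 1-, and 2-claim cases at x = 1.
q, scale, x = 1 / 3, 4.5, 1.0
lam = x / scale
F1 = 1 - math.exp(-lam)               # one exponential claim below x
F2 = 1 - math.exp(-lam) * (1 + lam)   # Erlang(n=2) below x
F_S = (1 - q) ** 2 + 2 * q * (1 - q) * F1 + q ** 2 * F2
print(round(F_S, 5))  # 0.53537
```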

Lesson 20

Supplementary Questions: Severity, Frequency, and Aggregate Loss

20.1. Loss sizes follow a spliced distribution. In the range (0, 100), the probability density function is of the form f(x) = c_1 e^{−x/θ_1}. In the range (100, ∞), the probability density function is of the form f(x) = c_2 θ_2²/(θ_2 + x)³. The parameters c_1, c_2, θ_1, and θ_2 are chosen so that F(50) = 0.5, F(100) = 0.7, and F(200) = 0.9.
Determine F(150).
(A) 0.80 (B) 0.81 (C) 0.82 (D) 0.83 (E) 0.84

20.2. For a zero-modified random variable N from the (a, b, 1) class, you are given

(i) Pr(N = 1) = 0.6
(ii) Pr(N = 2) = 0.18
(iii) Pr(N = 3) = 0.072
Determine Pr(N = 0).
(A) 0.06 (B) 0.07 (C) 0.08 (D) 0.09 (E) 0.10

20.3. For loss size, you are given:
(i) h(x) = α/(100 + x), x > 0.
(ii) E[X] = 50.
Calculate TVaR0.9(X).
(A) 223 (B) 230 (C) 237 (D) 244 (E) 250

(B) 484

(C) 485

(D) 486


(E) 487




20.5. Loss counts follow a negative binomial distribution with r = 2, β = 5. Loss sizes follow an inverse exponential distribution with θ = 10. Let N be the number of losses of amounts less than 20.
Determine the coefficient of variation of N.
(A) 0.40 (B) 0.60 (C) 0.64 (D) 0.66 (E) 0.82

20.6. For each of five tyrannosaurs with a taste for scientists:

(i) The number of scientists eaten has a binomial distribution with parameters m = 1, q = 0.6.
(ii) The number of calories of a scientist is uniformly distributed on (7000, 9000).
(iii) The number of calories of each scientist is independent of the others and independent of the number of scientists eaten.
Determine the probability that two or more scientists are eaten and that no more than two have at least 8000 calories each.
(A) 0.50 (B) 0.60 (C) 0.63 (D) 0.65 (E) 0.75

20.7.

The conditional hazard rate of a random variable X given Θ is

h(x | Θ) = 0.1Θ

The probability density function of Θ is

f(θ) = 100²θe^{−100θ},   θ > 0

Calculate the median of X.
(A) Less than 150
(B) At least 150, but less than 250
(C) At least 250, but less than 350
(D) At least 350, but less than 450
(E) At least 450

20.8. The number of snowstorms in January has a binomial distribution with m = 8, q = 0.5. The distribution of the number of inches of snow is:

Inches    Probability
1         0.2
2         0.3
3         0.2
4         0.1
5         0.1
6         0.1

The number of snowstorms and the number of inches of snow are independent.
Determine the expected amount of snow in January given that at least 4 inches of snow fall.
(A) 11.7 (B) 11.8 (C) 11.9 (D) 12.0 (E) 12.1



20.9.


You are given:

(i) The number of claims follows a binomial distribution with m = 3, q = 0.2.
(ii) Claim sizes follow the following distribution:

Claim size    Claim probability
0             0.2
1             0.5
2             0.2
3             0.1

(iii) A reinsurance policy has an aggregate deductible of 6.

Determine the expected aggregate amount paid by the reinsurer. (A) 0.000300

(B) 0.000312

(C) 0.000324

(D) 0.000336

(E) 0.000348

20.10. You are given: (i) Claim counts follow a negative binomial distribution with r  0.5, β  1 per year. (ii) Claim sizes follow a two-parameter Pareto distribution with α  3, θ  1000. (iii) Claim counts and claim sizes are independent. Using the normal approximation, determine the probability that annual aggregate claims are less than 150. (A) 0.15

(B) 0.25

(C) 0.35

(D) 0.45

(E) 0.55

20.11. For an insurance coverage loss sizes follow a Pareto distribution and are independent of the deductible. You are given: (i) With an ordinary deductible of 100, average payment size per paid claim is 2600. (ii) With an ordinary deductible of 500, average payment size per paid claim is 2800. Calculate the average payment size per paid claim for a policy with a franchise deductible of 1000. (A) (B) (C) (D) (E)

Less than 3000 At least 3000, but less than 3500 At least 3500, but less than 4000 At least 4000, but less than 4500 At least 4500

20.12. Losses follow a Pareto distribution with parameters θ  1000 and α. The loss elimination ratio at 600 is 0.4. Determine α. (A) (B) (C) (D) (E)

Less than 1.9 At least 1.9, but less than 2.0 At least 2.0, but less than 2.1 At least 2.1, but less than 2.2 At least 2.2




20.13. For a random variable N following zero-modified negative binomial distribution, r  2, Pr ( N  0)  0.8, and Pr ( N  1)  0.02. Determine Pr ( N  2) . (A) 0.0125

(B) 0.0150

(C) 0.0175

(D) 0.0200

(E) 0.0225

20.14. Earned premium for an insurance coverage is 10,000. An agent gets a bonus of 20% of the amount by which losses are below the level generating a loss ratio of x, but the bonus may not be less than 0. The loss ratio is the ratio of losses to earned premium. Losses follow a Pareto distribution with parameters α  2, θ  6000. The expected value of the bonus is 500. Determine x. (A) 53%

(B) 55%

(C) 58%

(D) 61%

(E) 63%

20.15. At a train station, two train lines stop there, the A and the C. You take the first train that arrives. The probability that the A comes first is 50%. The number of friends you meet on the train, given the train line, has the following distribution: Number of friends

Probability for A train C train

0 1 2 3

0.5 0.2 0.2 0.1

0.6 0.3 0.1 0

Let X be the number of friends you meet. Which of the following intervals constitutes the range of all 80th percentiles of X? (A) [1, 1]

(B) [2, 2]

(C) [1, 2)

(D) (1, 2]

(E) [1, 2]

20.16. Losses follow a lognormal distribution with µ  3, σ  2. Calculate the Tail-Value-at-Risk for losses at the 90% security level. (A) 539

(B) 766

(C) 951

(D) 1134

(E) 1301

20.17. Earned premium for an insurance coverage is 7,500. An agent gets a bonus of 50% of the amount by which losses are below the level generating a loss ratio of 60% but not less than 0, where the loss ratio is the ratio of losses to earned premium. Losses on an insurance coverage follow a Pareto distribution with parameters α  1, θ  5000. Determine the expected value of the bonus. (A) 382

(B) 645

(C) 764

(D) 1068

(E) 1605

20.18. An agent sells 10,000 of premium. He gets a bonus of 20% of the premium times the excess of 80% over the loss ratio, but not less than 0, where the loss ratio is the quotient of losses over premium. Losses follow a single-parameter Pareto distribution with θ  4000, α  3. Calculate the variance of the agent’s bonus. (A) 70,000

(B) 140,000

(C) 210,000

(D) 280,000

(E) 350,000



20.19. The random variable X follows a Pareto distribution with parameters α = 2 and θ = 4. X_e is a random variable having the equilibrium distribution for X.
Calculate F_{X_e}(5).
(A) 4/81 (B) 16/81 (C) 25/81 (D) 5/9 (E) 65/81

Solutions

20.1. [Section 4.3] F(50) is extraneous, since we only need the distribution function from 100 on, which is a multiple of a Pareto; let k be the multiplier. To match F(100) and F(200), we need

0.3 = k S(100) = k (θ/(θ + 100))²
0.1 = k S(200) = k (θ/(θ + 200))²

Dividing the second equation into the first,

((θ + 200)/(θ + 100))² = 3
(θ + 200)/(θ + 100) = √3
θ√3 + 100√3 = θ + 200
θ = (200 − 100√3)/(√3 − 1) = 36.6025

Then, from the second equation,

(36.6025/236.6025)² = 0.02393
k = 0.1/0.02393 = 4.179

So

F(150) = 1 − k (θ/(θ + 150))² = 1 − 4.179 (36.6025/186.6025)² = 1 − 0.16077 = 0.83923

20.2. [Section 11.2] Back out a and b:

a + b/2 = 0.18/0.6 = 0.3
a + b/3 = 0.072/0.18 = 0.4

Subtracting the second equation from the first,

b/6 = −0.1
b = −0.6

Substituting b = −0.6 into the first equation,

a − 0.3 = 0.3
a = 0.6

Then β/(1 + β) = 0.6, so β = 1.5, and since b = −a, then r − 1 = −1 and r = 0, making N logarithmic. For a logarithmic distribution,

p₁ᵀ = β/((1 + β) ln(1 + β)) = 0.6/ln 2.5 = 0.654814

and since p₁ᴹ = (1 − p₀ᴹ)p₁ᵀ, it follows that 1 − p₀ᴹ = 0.6/0.654814 and p₀ᴹ = 0.0837. (C)

20.3.

[Section 8.3] The survival function is developed as follows:

H(x) = ∫₀ˣ h(u) du = ∫₀ˣ α/(100 + u) du = α ln((100 + x)/100)
S(x) = e^(−H(x)) = (100/(100 + x))^α

We recognize this as a two-parameter Pareto with θ = 100. Since E[X] = 50, then θ/(α − 1) = 50, so α = 3. Using the tables,

VaR₀.₉(X) = 100(0.1^(−1/3) − 1) = 115.44
TVaR₀.₉(X) = 115.44 + (100 + 115.44)/2 = 223.17

(A)
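Solution 20.3's tabulated values can be verified numerically; a quick sketch (my own check, not part of the manual):

```python
# Two-parameter Pareto with theta = 100, alpha = 3, so S(x) = (theta/(theta+x))^alpha.
theta, alpha, p = 100, 3, 0.9

# VaR_p solves S(x) = 1 - p, giving the tabulated formula below.
var_p = theta * ((1 - p) ** (-1 / alpha) - 1)

# For a Pareto, TVaR_p = VaR_p + mean excess loss = VaR_p + (theta + VaR_p)/(alpha - 1).
tvar_p = var_p + (theta + var_p) / (alpha - 1)

print(round(var_p, 2), round(tvar_p, 2))  # 115.44 223.17
```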

20.4. [Lesson 3] Let X be claim costs and Y the proportion of claim adjustment costs. We want the variance of X(1 + Y). We will compute the first and second moments, then the variance.

E[X(1 + Y)] = E[X] E[1 + Y] = (αθ)(1 + 0.1) = (4)(10)(1.1) = 44
E[(X(1 + Y))²] = E[X²] E[(1 + Y)²]
E[X²] = α(α + 1)θ² = (4)(5)(10²) = 2000
E[(1 + Y)²] = E[1 + Y]² + Var(1 + Y) = 1.1² + 0.1²/12

because the variance of a uniform distribution is the range squared divided by 12. So

E[(1 + Y)²] = 1.210833
E[(X(1 + Y))²] = 2000(1.210833) = 2421.667
Var(X(1 + Y)) = 2421.667 − 44² = 485.667

(D)

An alternative solution uses conditional variance, and also uses the fact that the gamma distribution is a scale distribution with scale parameter θ, so total costs follow a gamma distribution with parameters α = 4 and θ = u, where u is uniformly distributed on (10.5, 11.5), since it is 10 times a random variable that is uniform on (1.05, 1.15). Let Z be total claim costs.

Var(Z) = E[Var(Z | u)] + Var(E[Z | u]) = E[4u²] + Var(4u)

The second moment of u is

∫ from 10.5 to 11.5 of v² dv = (11.5³ − 10.5³)/3

so

Var(Z) = 4 (11.5³ − 10.5³)/3 + 4² (1/12) = 485.667

20.5.

[Lesson 13] For an inverse exponential,

F(20) = e^(−θ/20) = e^(−10/20) = 0.60653

The coverage modification multiplies β by this factor, so β = 5(0.60653) = 3.0327. Then the coefficient of variation is the standard deviation over the mean, or

√(rβ(1 + β))/(rβ) = √((1 + β)/(rβ)) = √(4.0327/(2(3.0327))) = 0.8154

(E)

20.6. [Lesson 11] For the combination of 5 tyrannosaurs, the distribution is binomial(5, 0.6). Each scientist has a 0.5 probability of being over 8000 calories. There are 4 ways to satisfy the conditions:

1. Eat 2 scientists, probability C(5, 2)(0.6²)(0.4³) = 0.2304.

2. Eat 3 scientists, at least one below 8000 calories. Probability of 3 is C(5, 3)(0.6³)(0.4²) = 0.3456. Probability that all 3 are above 8000 calories is 0.5³. Multiplying, (0.3456)(1 − 0.5³) = 0.3024.

3. Eat 4 scientists, at least 2 below 8000 calories. Probability of 4 is C(5, 4)(0.6⁴)(0.4) = 0.2592. Probability that 3 or 4 out of 4 are above 8000 calories is C(4, 3)(0.5⁴) + C(4, 4)(0.5⁴) = 5/16. Multiplying, (0.2592)(1 − 5/16) = 0.1782.

4. Eat 5 scientists, at least 3 below 8000 calories. Probability of 5 is 0.6⁵ = 0.07776. Probability that 3, 4, or 5 scientists are at least 8000 is 1/2. Multiplying, 0.07776(1 − 1/2) = 0.03888.

The total probability is 0.2304 + 0.3024 + 0.1782 + 0.03888 = 0.74988. (E)

20.7.

[Subsection 4.1.3] The conditional survival function is

S(x | θ) = e^(−0.1θx)

and integrating over Θ,

S(x) = E[e^(−0.1Θx)] = M_Θ(−0.1x)

Since Θ's distribution is gamma with α = 2 and θ = 0.01,

S(x) = (1 + 0.001x)^(−2)

Setting this equal to 0.5,

(1 + 0.001x)² = 2
0.001x = √2 − 1
x = 1000(√2 − 1) = 414.2

(D)

20.8. [Lesson 17] Let X be the number of inches of snow per snowstorm. Then

E[X] = 0.2 + 2(0.3) + 3(0.2) + 4(0.1) + 5(0.1) + 6(0.1) = 2.9

The average number of snowstorms is (8)(0.5) = 4. The average amount of snow is 4(2.9) = 11.6. We need the probabilities of 0, 1, 2, and 3 inches of snow. We will calculate them directly, although the recursive formula could also be used. First we calculate the binomial probabilities for the number of snowstorms:

p₀ = 0.5⁸
p₁ = C(8, 1)(0.5⁸) = 8(0.5⁸)
p₂ = C(8, 2)(0.5⁸) = 28(0.5⁸)
p₃ = C(8, 3)(0.5⁸) = 56(0.5⁸)

Then for the aggregate distribution, the number of inches of snow,

g₀ = 0.5⁸ = 0.00390625
g₁ = 8(0.5⁸)(0.2) = 0.00625
g₂ = 0.5⁸(28(0.2²) + 8(0.3)) = 0.01375
g₃ = 0.5⁸(56(0.2³) + 28(2)(0.3)(0.2) + 8(0.2)) = 0.021125

The probability of 4 or more inches is 1 − g₀ − g₁ − g₂ − g₃ = 0.95496875. The expected amount of snow that falls if we ignore years with less than 4 inches is

11.6 − 0.00625 − 0.01375(2) − 0.021125(3) = 11.502875

The expected amount of snow conditioned on 4 inches or more is the quotient, or 11.502875/0.95496875 = 12.0453. (D)

20.9. [Lesson 18] It's easiest to calculate this directly: the probability of 7 times 1, plus the probability of 8 times 2, plus the probability of 9 times 3. Three claims are needed for any reinsurance payment, and the probability of 3 claims is 0.2³ = 0.008, which we'll multiply at the end. Aggregate claims of 7 can be obtained by two 3's and one 1: 3(0.1²)(0.5) = 0.015; or by two 2's and one 3: 3(0.2²)(0.1) = 0.012. Aggregate claims of 8 can be obtained by two 3's and one 2: 3(0.1²)(0.2) = 0.006. Aggregate claims of 9 can be obtained by three 3's: 0.1³ = 0.001. The answer is

0.008((0.015 + 0.012)(1) + (0.006)(2) + (0.001)(3)) = 0.008(0.042) = 0.000336

(D)

20.10. [Lesson 15] E[N] = 0.5, Var(N) = 0.5(1)(2) = 1, E[X] = 500, E[X²] = 1,000,000, Var(X) = 750,000.

E[S] = (0.5)(500) = 250
Var(S) = (0.5)(750,000) + (1)(500²) = 625,000
(150 − E[S])/√Var(S) = −100/790.569 = −0.126
Φ(−0.126) ≈ Φ(−0.13) = 1 − 0.5517 = 0.4483

(D)

Note: the normal approximation would not be used in real life in this example due to its high probability of being below 0.
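The manual rounds the z-score to −0.13 before using the normal table. A quick sketch with the exact normal CDF (via the error function; my own check, not from the manual) shows the unrounded answer is about 0.4497:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

e_s = 0.5 * 500                          # E[S] = E[N] E[X] = 250
var_s = 0.5 * 750_000 + 1 * 500**2       # E[N] Var(X) + Var(N) E[X]^2 = 625,000
z = (150 - e_s) / sqrt(var_s)
print(round(z, 3), round(phi(z), 4))  # -0.126 0.4497
```

The small gap versus 0.4483 is purely the table rounding of z to two decimal places.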

20.11. [Lesson 6] The mean excess loss at d for a Pareto is (θ + d)/(α − 1), so

(θ + 100)/(α − 1) = 2600
(θ + 500)/(α − 1) = 2800
400/(α − 1) = 200
α = 3
θ = 5100

For an ordinary deductible of 1000, the mean excess loss is (5100 + 1000)/2 = 3050. Add to this 1000, since under a franchise deductible the entire loss is paid once it is above the deductible, and we get the answer 4050. (D)

20.12. [Lesson 7] For a Pareto, the loss elimination ratio (dividing E[X ∧ d] by E[X]) is 1 − (θ/(θ + x))^(α−1). Therefore

1 − (1000/1600)^(α−1) = 0.4
(α − 1) ln(5/8) = ln 0.6
α − 1 = ln 0.6/ln(5/8) = 1.0869
α = 2.0869

(C)

20.13.

[Lesson 11] From the tables,

p₁ᵀ = rβ/((1 + β)^(r+1) − (1 + β))

and since p₀ = 0.8, the modified probability is 0.02 = p₁ᴹ = p₁ᵀ(1 − 0.8) = 0.2p₁ᵀ, so with r = 2,

0.02/0.2 = 2β/((1 + β)³ − (1 + β)) = 2/(β² + 3β + 2) = 0.1
β² + 3β + 2 = 20
β² + 3β − 18 = 0
β = (−3 + √(3² + 72))/2 = 3

The negative solution to the quadratic is rejected since β can't be negative. Now, a = β/(1 + β) = 3/4 = 0.75 and b = (r − 1)a = 0.75, so

p₂ᴹ = (a + b/2) p₁ᴹ = (9/8)(0.02) = 0.0225   (E)
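The quadratic and the final probability in solution 20.13 can be checked numerically; a sketch (the zero-truncated negative binomial setup with r = 2 is assumed from the solution, the code is mine):

```python
from math import sqrt

beta = (-3 + sqrt(3**2 + 72)) / 2        # positive root of beta^2 + 3*beta - 18 = 0
a = beta / (1 + beta)                    # (a, b, 0) parameters for r = 2
b = (2 - 1) * a
p2m = (a + b / 2) * 0.02                 # p2^M = (a + b/2) * p1^M
print(beta, round(p2m, 4))  # 3.0 0.0225
```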


20.14. [Lesson 10] If X is the loss random variable, the bonus is 0.2 max(0, 10,000x − X) and its expected value is

0.2 E[max(0, 10,000x − X)] = 0.2(10,000x − E[X ∧ 10,000x]) = 500

We divide by 0.2 and by 1000 (to make the numbers easier to handle):

10x − 0.001 (θ/(α − 1))(1 − θ/(10,000x + θ)) = 2.5
10x − 6(1 − 6000/(10,000x + 6000)) = 2.5
10x − 6(10x/(10x + 6)) = 2.5

Multiplying through by 10x + 6,

100x² + 60x − 60x = 25x + 15
100x² − 25x − 15 = 0
x = (25 + √(625 + 6000))/200 = 0.5320

(A)
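The final quadratic in solution 20.14 can be verified directly (my own check):

```python
from math import sqrt

# Positive root of 100x^2 - 25x - 15 = 0 from the last step of the solution.
x = (25 + sqrt(25**2 + 4 * 100 * 15)) / (2 * 100)
print(round(x, 4))  # 0.532
```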

20.15. [Section 1.2] Since the trains are equally likely, the joint probabilities of the number of friends are 0.5 times the conditional probabilities, and adding these up gives the marginal probabilities of meeting each number of friends:

    Number of    Joint probability     Total         Cumulative
    friends      A train   C train     probability   probability
    0            0.25      0.30        0.55          0.55
    1            0.10      0.15        0.25          0.80
    2            0.10      0.05        0.15          0.95
    3            0.05      0           0.05          1.00

Thus the cumulative distribution function F(x) is 0.8 for x ∈ [1, 2). Thus for any x ∈ [1, 2], Pr(X < x) ≤ 0.8 and Pr(X ≤ x) ≥ 0.8, so any x ∈ [1, 2] is an 80th percentile. (E)

20.16. [Lesson 8] The 90th percentile of the lognormal is e^(3+1.282(2)) = e^5.564. The partial expectation for x > e^5.564 is

e^(µ+0.5σ²)(1 − Φ((ln x − µ − σ²)/σ)) = e^(3+0.5(2²))(1 − Φ((5.564 − 3 − 2²)/2))
= 148.413(1 − Φ(−0.72))
= 148.413(0.7642) = 113.4

Dividing by the probability of being above the 90th percentile (0.1), we get 1134 as the final answer. (D)

20.17. [Lesson 10] The bonus is

0.5 max(0, 0.6(7500) − X) = 0.5(4500 − min(X, 4500)) = 2250 − 0.5(X ∧ 4500)

and its expected value is

2250 − 0.5 E[X ∧ 4500] = 2250 + 0.5θ ln(θ/(4500 + θ))
= 2250 + 2500 ln(5000/9500) = 2250 − 1604.63 = 645.37

(B)
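Solution 20.17's expected bonus can be reproduced with the Pareto (α = 1) limited expected value formula E[X ∧ d] = −θ ln(θ/(θ + d)); a quick sketch (mine, not the manual's):

```python
from math import log

theta, d = 5000, 4500
e_limited = -theta * log(theta / (theta + d))   # E[X ∧ 4500] for alpha = 1
bonus = 2250 - 0.5 * e_limited                  # E[bonus] = 2250 - 0.5 E[X ∧ 4500]
print(round(bonus, 2))  # 645.37
```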

20.18. [Lesson 10] If X is losses, then the bonus is

max(0, 0.2(10,000)(0.8 − X/10,000)) = 0.2 max(0, 8000 − X) = 0.2(8000 − X ∧ 8000)

The variance of the parenthesized expression, since 8000 has no variance, is the variance of X ∧ 8000. Using the tables, we calculate the moments:

E[X ∧ 8000] = αθ/(α − 1) − θ³/(2(8000²)) = 3(4000)/2 − 4000³/(2(8000²)) = 5500
E[(X ∧ 8000)²] = αθ²/(α − 2) − 2θ³/8000 = 3(4000²)/1 − 2(4000³)/8000 = 32,000,000
Var(X ∧ 8000) = 32,000,000 − 5500² = 1,750,000

The variance of 0.2(X ∧ 8000) is 0.2²(1,750,000) = 70,000. (A)
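The limited moments in solution 20.18 follow the tabulated single-parameter Pareto formulas; a quick numerical check (mine, not the manual's):

```python
alpha, theta, u = 3, 4000, 8000
# E[X ∧ u] and E[(X ∧ u)^2] for a single-parameter Pareto, from the tables:
m1 = alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * u**(alpha - 1))
m2 = alpha * theta**2 / (alpha - 2) - 2 * theta**alpha / ((alpha - 2) * u**(alpha - 2))
var_bonus = 0.2**2 * (m2 - m1**2)       # bonus = 0.2 (8000 - X ∧ 8000)
print(m1, m2, round(var_bonus))  # 5500.0 32000000.0 70000
```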

θ  20.19. [Section 8.4] E[X]  α−1 distribution has density function

4 2−1

 4, while S ( x ) 

 fe (X ) 

4/ (4 + x )

2 

4

θ α θ+x



4 2 4+x .

Then the equilibrium

4 (4 + x ) 2

which you should recognize as the density function of a two-parameter Pareto with α  1, θ  4, so Fe (5)  1 − 4/ (4 + 5)  5/9 . (D) If you didn’t recognize it, you could calculate Fe (5) by integrating f e ( x ) from 0 to 5:

R FX e ( 5 ) 

5 16dx 0 (4+x ) 2

4

1  −4 4+x

!5 0

5 1 1 −  4 4 9 9





(D)
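If you prefer a numerical check of solution 20.19, the equilibrium CDF can be integrated directly; a sketch (mine) using the midpoint rule:

```python
def s(x):
    """Pareto survival function, alpha = 2, theta = 4."""
    return (4 / (4 + x)) ** 2

e_x = 4 / (2 - 1)                        # E[X] = theta/(alpha - 1) = 4
n, upper = 100_000, 5
h = upper / n
# F_e(5) = (1/E[X]) * integral of S(x) from 0 to 5, by the midpoint rule
f_e_5 = sum(s((i + 0.5) * h) for i in range(n)) * h / e_x
print(round(f_e_5, 4))  # 0.5556, i.e. 5/9
```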


Part II

Empirical Models


In empirical models, a distribution is fitted to data without specifying an underlying model. The distribution is data-dependent; it can only be described by referring to all of the observations. This contrasts with parametric models, where the assumed distribution can be described by listing a short set of parameters. We begin with a fast review of statistics, which is used both here and when discussing parametric estimators. We then discuss the empirical fit, first with complete data, then with incomplete data. The empirical distribution function will have jumps. Kernel smoothing is a method for smoothing the empirical distribution. We finish off discussing approximations to the empirical model when there are large amounts of data, such as when constructing mortality tables.

Lesson 21

Review of Mathematical Statistics

Reading: Loss Models Fourth Edition 10

This lesson is a short review of statistics. If you've never studied statistics, you may have difficulty with it. However, a full-fledged introduction to statistics is beyond the scope of this manual. On exams, there is typically only one question on the material of this lesson, almost always on estimator quality, but much of the work we do later depends on the other concepts in this lesson, hypothesis testing and confidence intervals.

A statistic is a number calculated based purely on observed data. No assumptions or hypotheses are needed to calculate it. Examples of statistics are the sample mean, the sample variance, and the sample median. The field of statistics deals with two tasks: testing hypotheses and estimating parameters. We shall deal with the latter first.

21.1 Estimator quality

One of the tasks of statistics is to estimate parameters. Typically, a statistical method will provide a formula which defines an estimated parameter in terms of sample statistics. For example, suppose you have a random variable that is believed to follow a normal distribution. You would like to estimate the parameters µ and σ² of the distribution. You have a sample of n observations from the population. Here are examples of statistical methods (a hat on a variable indicates an estimate):

1. Ignore the sample data. Estimate µ̂ = 100 and σ̂² = 25.

2. Estimate µ̂ = 100, and estimate σ̂² as the average square difference from 100, or Σ(xᵢ − 100)²/n.

3. Estimate µ̂ = x̄, the sample mean, and σ̂² = Σ(xᵢ − x̄)²/n.

4. Estimate µ̂ = x̄ and σ̂² = Σ(xᵢ − x̄)²/(n − 1).

In the subsequent discussion, we will refer to this example as the "four-statistical-method" example.

All of these methods are possible estimators for µ and σ². Which estimator is the best? What are the advantages and disadvantages of each of them? We somehow suspect that the first two estimators aren't good, since they ignore the data, at least in estimating µ. However, they have the advantage of giving you a clear-cut answer. The estimate of µ is 100, regardless of what the data says. When using the last two estimators, you may estimate µ̂ = 50 after 1000 trials, yet after the 1001st trial you almost surely end up with a different estimate.

There are many non-mathematical reasons that an estimator may be bad. It may be based on the wrong population. It may be based on bad assumptions. We cannot quantify these errors, and will not discuss them further. What we can quantify is the intrinsic quality of an estimator. Assuming that all of our hypotheses are correct, there is built-in error in the estimator. We will discuss three measures of estimator quality.

In the following discussion, θ̂ is the estimator, θ̂ₙ is the estimator based on n observations, and θ is the parameter being estimated.

C/4 Study Manual—17th edition Copyright ©2014 ASM


21.1.1 Bias

Bias is the excess of the expected value of the estimator over its true value:

bias_θ̂(θ) = E[θ̂ | θ] − θ

An estimator is unbiased if bias_θ̂(θ) = 0, which means that based on our assumptions, the average value of the estimator will be the true value, obviously a desirable quality. Even if an estimator is biased, it may be asymptotically unbiased, meaning that as the sample size goes to infinity, the bias of the estimator goes to zero:

θ̂ is an asymptotically unbiased estimator for θ ⇔ lim_{n→∞} bias_θ̂ₙ(θ) = 0

An unbiased estimator is automatically asymptotically unbiased. In the four-statistical-method example above, let's calculate the bias of each estimator.

1. The bias of the estimator for µ is 100 − µ. If µ happens to be 100, then the estimator is unbiased. If µ is 0, then the bias is 100. This method may be very biased or it may be unbiased; it all depends on the true, presumably unknown, value of µ. Similar remarks apply to σ̂². If σ² = 25, the estimator is unbiased. Otherwise it is biased.

2. The remarks for µ̂ of the first estimator apply equally well here. Let's postpone discussing the bias of σ̂².

3. The expected value of the sample mean is

E[x̄] = E[(Σᵢ xᵢ)/n] = n E[X]/n = E[X]

and E[X] = µ, so the bias of the sample mean is E[X] − µ = µ − µ = 0. In general, for any distribution, the sample mean is an unbiased estimator of the true mean.

A theorem of probability states that in general (not just for normal populations), for a sample of size n with sample mean x̄,

E[Σᵢ (xᵢ − x̄)²] = (n − 1) Var(X)

Here, Var(X) = σ², so

E[σ̂²] = E[Σ(xᵢ − x̄)²/n] = ((n − 1)/n) σ²

Therefore, the bias of the estimator for σ² is

bias_σ̂²(σ²) = ((n − 1)/n)σ² − σ² = −σ²/n

While σ̂² is biased, it is asymptotically unbiased. As n → ∞, the bias goes to 0.

4. Once again, the sample mean is an unbiased estimator of the true mean. The estimator for σ² has expected value

E[Σ(xᵢ − x̄)²]/(n − 1) = (n − 1) Var(X)/(n − 1) = Var(X)

and therefore σ̂² is an unbiased estimator for σ². In general (not just for normal populations), the sum of square differences from the sample mean divided by n − 1 is an unbiased estimator of the variance.
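The n versus n − 1 distinction in methods 3 and 4 is easy to see by simulation; a sketch (the parameters are my own choices for illustration):

```python
import random

# Sample variance with division by n should average (n-1)/n * sigma^2:
# here sigma^2 = 4 and n = 5, so about (4/5)(4) = 3.2 rather than 4.
random.seed(7)
n, trials, sigma = 5, 100_000, 2.0
total = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    total += sum((x - xbar) ** 2 for x in xs) / n
print(round(total / trials, 1))  # ≈ 3.2
```

Dividing by n − 1 instead rescales the average to about 4.0, the true variance.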


Let's get back to σ̂² of the second method. A useful trick to calculate its expected value is to break out (xᵢ − µ)²:

Σᵢ (xᵢ − 100)² = Σᵢ ((xᵢ − µ) + (µ − 100))²
= Σᵢ (xᵢ − µ)² + Σᵢ (µ − 100)² + 2 Σᵢ (xᵢ − µ)(µ − 100)

The expected value of each summand of the first term is the variance, or σ²; in fact, E[(X − µ)²] is the definition of Var(X). The second term has a constant summand, so it equals n(µ − 100)². The expected value of xᵢ − µ is 0 (µ is the expected value of xᵢ), so the expected value of the third term is 0. So

E[σ̂²] = E[Σᵢ (xᵢ − 100)²/n] = σ² + (µ − 100)²

If µ = 100, the estimator is unbiased. Otherwise it is biased, and the bias is independent of sample size.

Example 21A In an urn, there are four marbles numbered 5, 6, 7, and 8. You draw three marbles from the urn without replacement. Let θ̂ be the maximum of the three marbles. Calculate the bias of θ̂ as an estimator for the maximum marble in the urn, θ.

Answer: There are four combinations of 3 marbles out of 4. Three of the combinations include 8, making the maximum 8. The remaining one is {5, 6, 7}, with a maximum of 7. Thus the expected value of θ̂ is (3/4)(8) + (1/4)(7) = 7.75, whereas the true maximum is 8. The bias is 7.75 − 8 = −0.25.



Example 21B X has a uniform distribution on [0, θ]. A sample {xᵢ} of size n is drawn from X. Let θ̂ = max xᵢ. Determine bias_θ̂(θ).

Answer: To calculate the expected value of the maximum, we need its density function. Let Y be the random variable for the maximum. For a uniform distribution on [0, θ], F_X(x) = Pr(X ≤ x) = x/θ for 0 ≤ x ≤ θ. Then for 0 ≤ x ≤ θ,

F_Y(x) = Pr(X₁ ≤ x) Pr(X₂ ≤ x) ⋯ Pr(Xₙ ≤ x) = xⁿ/θⁿ
f_Y(x) = nx^(n−1)/θⁿ

The expected value of Y is

E[Y] = ∫₀^θ y f(y) dy = ∫₀^θ n yⁿ dy/θⁿ = nθ/(n + 1)

We conclude that the bias of θ̂ is

bias_θ̂(θ) = nθ/(n + 1) − θ = −θ/(n + 1)

The estimator is asymptotically unbiased.
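A small simulation (mine, not part of the manual) illustrates Example 21B's result that the sample maximum underestimates θ on average: with θ = 10 and n = 5, E[max] = nθ/(n + 1) = 25/3 ≈ 8.33.

```python
import random

random.seed(1)
theta, n, trials = 10.0, 5, 200_000
total = 0.0
for _ in range(trials):
    total += max(random.uniform(0.0, theta) for _ in range(n))
est = total / trials
print(round(est, 2), round(n * theta / (n + 1), 2))  # both ≈ 8.33
```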




21.1.2 Consistency

An unbiased estimator is good on the average, but may be quite bad. It may be too high half the time and too low half the time, and never close to the true value. Consistency may be a better measure of quality. An estimator is consistent if it is, with probability 1, arbitrarily close to the true value if the sample is large enough. In symbols, an estimator is consistent if for all δ > 0, lim_{n→∞} Pr(|θ̂ₙ − θ| < δ) = 1. Sometimes this is called weak consistency.

A sufficient but not necessary condition for consistency is that the estimator be asymptotically unbiased and that its variance go to zero as the sample size goes to infinity. Let's use this condition to analyze the consistency of the sample mean as an estimator for the true mean. We already mentioned when discussing the bias of the four-statistical-method example that the sample mean is unbiased, so it's certainly asymptotically unbiased. The variance of the sample mean is Var(X)/n. If Var(X) is finite, then Var(X)/n → 0. So if Var(X) is finite, the sample mean is consistent. If Var(X) is not finite, such as for a Pareto distribution with α ≤ 2, the sample mean may not be consistent.

In our four-statistical-method example:

1. The estimators of µ and σ² are consistent if and only if µ = 100 and σ² = 25.

2. The estimators of µ and σ² are consistent if and only if µ = 100.

3. Since σ² is finite, the estimator for µ is consistent, as we just discussed. σ̂² is asymptotically unbiased. The variance of σ̂² is a function of the fourth and lower moments of a normal distribution, which are all finite, so σ̂² is consistent.

4. For the same reasons as the third estimator, the estimators of µ and σ² are consistent.

21.1.3 Variance and mean square error

Mean square error is the average square difference between the estimator and the true value of the parameter, or

MSE_θ̂(θ) = E[(θ̂ − θ)² | θ]

The lower the MSE, the better the estimator. In some textbooks, an estimator with low variance is called "efficient", but the textbook on the syllabus, Loss Models, avoids the use of this vague term, so you are not responsible for knowing the meaning of the word "efficient".

An estimator is called a uniformly minimum variance unbiased estimator (UMVUE) if it is unbiased and if there is no other unbiased estimator with a smaller variance for any true value θ. It would make no sense to make a similar definition for biased estimators (i.e., a uniformly minimum MSE estimator), since estimators like the first method of the four-statistical-method example have a mean square error of 0 when the constant equals the parameter, so no estimator can have the smallest error for all values of the parameter.

There is an important relationship between MSE, bias, and variance:

MSE_θ̂(θ) = Var(θ̂) + (bias_θ̂(θ))²    (21.1)

Example 21C [4B-F96:21] (2 points) You are given the following:

• The expectation of a given estimator is 0.50.

• The variance of this estimator is 1.00.

• The bias of this estimator is 0.50.

Determine the mean square error of this estimator.

(B) 1.00

(C) 1.25

(D) 1.50

(E) 1.75

Answer: MSE(θ̂) = Var(θ̂) + bias² = 1.00 + 0.50² = 1.25. (C)



Example 21D In an urn, there are four marbles numbered 5, 6, 7, and 8. You draw 3 marbles from the urn without replacement. Let θ̂ be the maximum of the 3 marbles. Calculate the mean square error of θ̂ as an estimator for the maximum marble in the urn, θ.

Answer: There are four combinations of 3 marbles out of 4. Three of the combinations have 8. The remaining one is {5, 6, 7}, with a maximum of 7. The true maximum is 8. Thus, the error is 7 − 8 = −1 one-fourth of the time, 0 otherwise, so the mean square error is (1/4)(1²) = 1/4.

The variance of the estimator (using the Bernoulli shortcut; see section 3.3 on page 54) is

(0.25)(0.75)(1²) = 3/16

and indeed (−1/4)² + 3/16 = 1/4; the bias squared plus the variance equals the mean square error.

Example 21E X has a uniform distribution on [0, θ]. A sample {xᵢ} of size n is drawn from X. Let θ̂ = max xᵢ. Determine MSE_θ̂(θ).

Answer: We calculated the density of Y = max xᵢ above in Example 21B. Now let's calculate the second moment of Y:

E[Y²] = ∫₀^θ y² f(y) dy = ∫₀^θ n y^(n+1) dy/θⁿ = nθ²/(n + 2)
Var(Y) = nθ²/(n + 2) − (nθ/(n + 1))² = nθ²/((n + 1)²(n + 2))    (21.2)

Combining this with our calculation of bias_θ̂(θ) in Example 21B, we conclude that the mean square error is

MSE_θ̂(θ) = (θ/(n + 1))² + nθ²/((n + 1)²(n + 2)) = 2θ²/((n + 1)(n + 2))

Quiz 21-1 To estimate the parameter µ of a random variable following a lognormal distribution with parameters µ and σ = 4:

(i) An observation x is made of the random variable.

(ii) µ is estimated by ln x.

Calculate the mean square error of this estimator.

21.2 Hypothesis testing

While there will probably be no exam questions directly on the material in this section, it will be used heavily in this part of the course.

We often have various ideas of how the world operates. We may come up with a hypothesis quantifying natural occurrences. Examples of hypotheses are:

• A certain medicine is effective in preventing Alzheimer's.


Table 21.1: Summary of Estimator Quality Concepts

In the following, θ̂ is an estimator for θ and θₙ is an estimator for θ based on a sample of size n.

Bias: bias_θ̂(θ) = E[θ̂ | θ] − θ.
If bias_θ̂(θ) = 0, the estimator is unbiased. If lim_{n→∞} bias_θₙ(θ) = 0, the estimator is asymptotically unbiased. The sample mean is unbiased. The sample variance (with division by n − 1) is unbiased.

Consistency: lim_{n→∞} Pr(|θₙ − θ| > ε) = 0.
A sufficient condition for consistency is that the estimator is asymptotically unbiased and the variance of the estimator goes to 0 as n → ∞.

Mean square error: MSE_θ̂(θ) = E[(θ̂ − θ)² | θ] = (bias_θ̂(θ))² + Var(θ̂)

• Smokers have a higher probability of getting lung cancer.

• Losses follow a Pareto distribution with α = 4, θ = 10,000.

Statistics helps decide whether to accept a hypothesis. To decide on a hypothesis, we set up two hypotheses: a null hypothesis, one that we will believe unless proved otherwise, and an alternative hypothesis. Usually, the null hypothesis is a fully specified hypothesis: it has a probability distribution associated with it. The alternative hypothesis may also be fully specified, or it may allow for a range of possibilities.

For example, suppose a new drug might prevent Alzheimer's disease. It is known that for a healthy person age 75, there is a 0.1 probability of contracting Alzheimer's disease within 5 years. To statistically test the drug, you would set up the following two hypotheses:

• H0 (the null hypothesis): The probability of contracting Alzheimer's disease if you use this drug is 0.1.

• H1 (the alternative hypothesis): The probability of contracting Alzheimer's disease if you use this drug is less than 0.1.

Notice that H0 is fully specified: a Bernoulli distribution with parameter q = 0.1. You can calculate probabilities assuming H0. On the other hand, H1 is not fully specified, and allows any value of q < 0.1.

The next thing you would do is specify a test. For example, we'll give 100 people age 75 this drug, and observe them for 5 years. The test statistic will be the number who get Alzheimer's disease. Let's call this number X. If X is low, we'll reject H0 and accept H1, while if it is high we won't reject H0. When we don't reject H0, we say that we accept H0. This does not mean we really believe H0; it just means we don't have enough evidence to reject it.

Now we have to decide on a boundary point for the test. This boundary point is called the critical value. Let's say the boundary point is c. Then we reject H0 if X < c, or possibly if X ≤ c. The set of values for which we reject H0 is called the critical region.

How do we go about setting c? The lower we make c, the more likely we'll accept H0 when it is false. The higher we make c, the more likely we'll reject H0 when it is true. Rejecting H0 when it is true is called a Type I error. The probability of a Type I error, assuming H0 is true, is the significance level of the test. Thus, the lower the significance level, the greater the probability of accepting H0. The letter α is often used for the significance level. The probability of getting the observed statistic (or a more extreme one) given that the null hypothesis is true is called the p-value; the lower the p-value, the greater the tendency to reject H0.


If we accept H0 when it is false, we've made a Type II error. The power of a test is the probability of rejecting H0 when it's false. We'd like as much power as possible, but increasing the power may also raise the significance level (which is no good). A uniformly most powerful test gives us the most power for a fixed significance level. When the alternative hypothesis is not fully specified, we cannot calculate the power for the entire range, but we can calculate the power for specific values of the alternative hypothesis.

Example 21F In the drug example discussed above, we will reject the null hypothesis if X ≤ 6. Determine the significance level of the test. Also determine the power of the test at q = 0.08.

Answer: The significance level of the test is the probability of 6 or fewer people out of 100 getting Alzheimer's if q = 0.1. The number of people getting Alzheimer's is binomial with m = 100, q = 0.1, and the probability of such a binomial variable being 6 or less is

Σᵢ₌₀⁶ C(100, i)(0.1ⁱ)(0.9^(100−i)) = 0.1171556

This is the exact significance level. Since this can be difficult to compute, the normal approximation is usually used. The mean of the binomial variable is 100(0.1) = 10 and the variance is 100(0.1)(0.9) = 9. Making a continuity correction, we calculate the approximate probability that a normal random variable with µ = 10, σ² = 9 is no greater than 6.5:

Pr(X ≤ 6.5 | H0) = Φ((6.5 − 10)/3) = Φ(−1.17) = 0.1210

This is the significance level using a normal approximation. The power of the test is the probability of X ≤ 6 if q = 0.08. Using the normal approximation, the mean is 100(0.08) = 8 and the variance is 100(0.08)(0.92) = 7.36, so

Pr(X ≤ 6.5 | q = 0.08) = Φ((6.5 − 8)/√7.36) = Φ(−0.55) = 0.2912
Example 21G X is a normally distributed random variable with variance 100. You are to test the null hypothesis H0: µ = 50 against the alternative hypothesis H1: µ = 55. The test consists of a random sample of 25 observations of X. If the sample mean is less than 54, H0 is accepted; otherwise H1 is accepted. Determine the significance and power of the test.

Answer: The variance of the sample mean of 25 observations is 100/25 = 4. The probability that x̄ ≥ 54 given that H0 is true is 1 − Φ((54 − 50)/√4) = 1 − Φ(2) = 0.0228, so the significance of the test is 2.28%.

If H1 is true, the probability that x̄ ≥ 54 is 1 − Φ((54 − 55)/2) = Φ(0.5) = 0.6915, so the power of the test is 69.15%.
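The significance and power computations in Example 21G can be reproduced with the standard normal CDF; a quick sketch (mine, not from the manual):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Sample mean of 25 draws from N(mu, 100) has standard deviation sqrt(100/25) = 2.
significance = 1 - phi((54 - 50) / 2)    # Pr(xbar >= 54 | mu = 50)
power = 1 - phi((54 - 55) / 2)           # Pr(xbar >= 54 | mu = 55)
print(round(significance, 4), round(power, 4))  # 0.0228 0.6915
```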





Table 21.2 summarizes the concepts of this section.

21.3 Confidence intervals

Rather than making a point estimate of a parameter, we may recognize that our estimates are not so precise and instead provide an interval which we believe has a high probability of containing the true value. A 100(1 − α)% confidence interval for a parameter θ is an interval (L, U) such that the probability of L being less than θ and U being greater than θ is 1 − α. Keep in mind that θ is a parameter, not a random variable; it's L and U, the statistics, that are random.


Table 21.2: Summary of Hypothesis Testing Concepts

Test: A procedure and an accompanying rule that determine whether or not to accept the null hypothesis.

Critical region: The set of values for which, if the test statistic is in the set, the null hypothesis is rejected. Usually an interval of numbers going to infinity, or two intervals of numbers going to positive and negative infinity.

Critical value(s): The boundary or boundaries of the critical region.

Type I error: Rejecting the null hypothesis when it is true.

Type II error: Accepting the null hypothesis when it is false.

Significance (level): The probability that the observation is in the critical region given that the null hypothesis is true. This is set before performing the test: the statistician selects a significance level and then sets the critical region accordingly.

p-value: The probability of the observation (or a more extreme observation) if the null hypothesis is true. This is calculated after the test is performed. Thus, at significance level α, the null hypothesis is rejected if the p-value of the test is less than the significance level of the test.

Power: The probability of rejecting the null hypothesis when it is false.

Uniformly most powerful test: A test with maximal power for a given significance level.

Typically, to construct a confidence interval for an estimate, we add to and subtract from the estimate the square root of the estimated variance times a standard normal coefficient appropriate for the level of confidence we're interested in. If we let z_p be the 100p percentile of a standard normal distribution, this means adding and subtracting z_(1−α/2) √(V̂ar(θ̂)) to the estimate of the parameter θ, where V̂ar(θ̂) is the estimate of the variance of the estimator of θ.

How do we estimate the variance? We often make no assumption about the distribution of the underlying data. Instead, we use the unbiased sample variance as an estimate of the variance of the underlying distribution. When using the sample mean as an estimate of the underlying mean, remember that the variance of the sample mean is the variance of the underlying distribution divided by the size of the sample. Thus when estimating the variance, the sum of square differences from the sample mean is divided by n − 1 (to obtain the unbiased sample variance) and also by n (to obtain the variance of the sample mean).

Example 21H For a sample of 100 loss sizes xᵢ, you have the following summary statistics:

Σ xᵢ = 325,890    Σ xᵢ² = 1,860,942,085

Construct a 95% confidence interval for mean loss size.

Answer: The sample mean is

x̄ = 325,890/100 = 3258.90

The raw second moment is

µ̂′₂ = 1,860,942,085/100 = 18,609,420.85

The biased sample variance is

18,609,420.85 − 3258.90² = 7,988,991.64

To make it unbiased, we multiply by n/(n − 1):

7,988,991.64 (100/99) = 8,069,688.53

That is the estimate of the underlying variance of the distribution. The estimate of the variance of the sample mean is

8,069,688.53/100 = 80,696.89

The confidence interval is 3258.90 ± 1.96 √80,696.89 = (2975, 3543).

The rest of this lesson is unlikely to appear on an exam and may be skipped.

If the sample size is small, we should use t distribution values instead of z_{1−α/2}. However, a t distribution table is not provided at the exam, so I doubt you would be expected to use t coefficients on an exam question.

If a specific distribution is assumed for the data, the variance may be a function of the mean. Then we can approximate the variance as a function of the sample mean instead of using the unbiased sample variance. For example, if the data can only have one of two possible values, then it follows a Bernoulli distribution. If q is the mean, then the variance is q(1 − q). We can approximate the variance of the underlying distribution as x̄(1 − x̄). This would be divided by n, the sample size, if we are approximating the variance of the sample mean.

Sometimes a better confidence interval can be constructed by using the true variance of the estimator of the parameter instead of the approximated variance. In the Bernoulli example of the last paragraph, this would mean using q(1 − q) instead of x̄(1 − x̄). When a single parameter is being estimated, the variance of the estimator of the parameter can be expressed in terms of the parameter: Var(θ̂) = v(θ) for some function v. We can then use the following equation, which is equation (10.3) of the fourth edition of Loss Models:

1 − α = Pr( −z_{1−α/2} ≤ (θ̂ − θ)/√v(θ) ≤ z_{1−α/2} )    (21.3)

where 1 − α is the confidence level and z_{1−α/2} is the 100(1 − α/2) percentile of the standard normal distribution. Solving the resulting inequality leads to a quadratic equation in θ when we assume a Poisson or a Bernoulli distribution for the data.

Example 21I A sample of 200 policies has 20 claims. It is assumed claim frequency for each policy has a Poisson distribution with mean λ. Construct symmetric 95% confidence intervals for λ using (1) the approximated variance and (2) the true variance.

Answer: The estimate for λ is the sample mean, λ̂ = 20/200 = 0.10. The variance of the sample mean is the distribution variance divided by the size of the sample. The distribution variance is λ, since the variance of a Poisson equals its mean, so the variance of the sample mean is λ/200.

In the approximated variance method, we substitute λ̂ for λ in the formula for the variance, so the variance of λ̂ is estimated as 0.10/200 = 0.0005. The 95% confidence interval is 0.1 ± 1.96 √0.0005 = (0.0562, 0.1438).

The true variance of λ̂ is λ/200. For the confidence interval using the true variance, we want:

−1.96 ≤ (0.10 − λ)/√(λ/200) ≤ 1.96


Square this inequality:

(0.10 − λ)²/(λ/200) ≤ 1.96²
0.01 − 0.20λ + λ² ≤ (1.96²/200) λ = 0.019208λ
λ² − 0.219208λ + 0.01 ≤ 0
λ = (0.219208 ± √(0.219208² − 0.04))/2 = 0.0647, 0.1545

Therefore the confidence interval is (0.0647, 0.1545). Although it doesn't look symmetric around 0.10, it is! This is because higher values of λ lead to higher variances, so more room is needed on the right to cover the same probability range.

Example 21J In a mortality study on 10 lives with complete data, 2 lives die before time 5. Construct a 95% symmetric confidence interval for S(5) using the true variance.

Answer: To make things simpler, let's work with the number of deaths, 2. Let the expected number of deaths before time 5 be θ. This is a binomial variable with m = 10, q = θ/10. The variance of this binomial variable is θ(10 − θ)/10. We then need to solve:

(2 − θ)/√(θ(10 − θ)/10) = 1.96

We square both sides and solve.

(2 − θ)² = 3.8416 θ(10 − θ)/10 = 3.8416θ − 0.38416θ²
0 = 1.38416θ² − 7.8416θ + 4
θ = (7.8416 ± √(7.8416² − 4(1.38416)(4)))/(2(1.38416)) = 0.5668, 5.0984

S(5) is 1 minus the number of deaths divided by the number in the study (10), so the interval for S(5) is obtained by dividing these two bounds by n = 10 and subtracting them from 1:

1 − 0.5668/10 = 0.94332
1 − 5.0984/10 = 0.49016

The interval is (0.49016, 0.94332).

For comparison, the approximate variance is θ̂(10 − θ̂)/10 = (2)(8)/10 = 1.6, so the confidence interval for the number of deaths would be 2 ± 1.96 √1.6 = (−0.47923, 4.47923). Since the number of deaths can't be less than 0, the confidence interval for S(5) would be (0.55208, 1), where 0.55208 = 1 − 4.47923/10.
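The manual gives no code, but both true-variance intervals above can be checked by solving the squared inequality directly. A standard-library Python sketch (the function name and the quadratic-variance parameterization v(t) = at² + bt + c are mine, not the manual's):

```python
import math

def true_variance_ci(est, a, b, c, z=1.96):
    """Solve (est - t)^2 = z^2 * v(t) for t, where the estimator's variance
    is a quadratic in the parameter: v(t) = a*t^2 + b*t + c.
    Rearranged: (1 - z^2*a)*t^2 - (2*est + z^2*b)*t + (est^2 - z^2*c) = 0."""
    A = 1 - z * z * a
    B = -(2 * est + z * z * b)
    C = est * est - z * z * c
    disc = math.sqrt(B * B - 4 * A * C)
    return (-B - disc) / (2 * A), (-B + disc) / (2 * A)

# Example 21I: Poisson, v(lambda) = lambda/200
lo, hi = true_variance_ci(0.10, 0.0, 1 / 200, 0.0)      # (0.0647, 0.1545)
# Example 21J: binomial deaths, v(theta) = theta*(10 - theta)/10
lo2, hi2 = true_variance_ci(2.0, -1 / 10, 1.0, 0.0)     # (0.5668, 5.0984)
```

Both quadratics above (λ² − 0.219208λ + 0.01 = 0 and 1.38416θ² − 7.8416θ + 4 = 0) are instances of the rearranged form in the docstring.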

In summary, we've mentioned three methods for obtaining the variance needed for constructing a normal confidence interval, from least to most refined:

1. Use the unbiased sample variance.
2. Assume the data comes from a distribution. Express the variance as a function of the mean, then set the variance equal to that function of the sample mean.
3. Assume the data comes from a distribution. Express the variance as a function of the mean, and solve for the confidence interval.

It would be hard for an exam question to specify that you should calculate a confidence interval in any manner other than the first one, adding and subtracting z_{1−α/2} s. They would have to say something like "First calculate the variance, then apply a normal approximation."¹ On the other hand, the first method is standard and comes up frequently in this course.
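The first, standard method is easy to mechanize. A minimal Python sketch (mine, not the manual's) of a normal confidence interval built from the unbiased sample variance:

```python
import math

def normal_ci(data, z=1.96):
    """Sample mean +/- z * sqrt(unbiased sample variance / n)."""
    n = len(data)
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)  # unbiased sample variance
    half = z * math.sqrt(s2 / n)                       # z times std. error of the mean
    return mean - half, mean + half
```

Calling `normal_ci` with z = 1.645 instead of 1.96 gives a 90% interval, the coefficient used for 90% confidence elsewhere in this lesson.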

Exercises

Estimator quality

The following three exercises are from old exams and use the term "efficient", which is no longer on the syllabus. However, they are easy exercises and are included as a review of the definitions of this lesson. If you wish to do them, just keep in mind that an efficient estimator is one with minimum variance.

21.1. [4B-S92:2] (1 point) Which of the following are true?

1. The expected value of an unbiased estimator of a parameter is equal to the true value of the parameter.
2. If an estimator is efficient, the probability that an estimate based on n observations differs from the true parameter by more than some fixed amount converges to zero as n grows large.
3. A consistent estimator is one with a minimal variance.

(A) 1 only (B) 3 only (C) 1 and 2 only (D) 1, 2 and 3
(E) The correct answer is not given by (A), (B), (C), or (D).

21.2. [4B-S91:28] (1 point) α̂ is an estimator of α. Match each of these properties with the correct mathematical description.

a. Consistent
b. Unbiased
c. Efficient

1. E[α̂] = α
2. Var[α̂] ≤ Var[α̃] where α̃ is any other estimator of α
3. For any ε > 0, Pr{|α̂ − α| < ε} → 1 as n → ∞, where n is the sample size.

(A) a = 1, b = 2, c = 3
(B) a = 2, b = 1, c = 3
(C) a = 1, b = 3, c = 2
(D) a = 3, b = 2, c = 1
(E) a = 3, b = 1, c = 2

¹There was an exam question long ago, in the CAS 4B days, when they made you use the refined method by specifying that you should construct the confidence interval using the method specified in textbook X.



21.3. [4B-F92:8] (1 point) You are given the following information:

• X is a random variable whose distribution function has parameter α = 2.00.
• Based on n random observations of X you have determined:
• E[α₁] = 2.05, where α₁ is an estimator of α having variance equal to 1.025.
• E[α₂] = 2.05, where α₂ is an estimator of α having variance equal to 1.050.
• As n increases to ∞, P(|α₁ − α| > ε) approaches 0 for any ε > 0.

Which of the following are true?

1. α₁ is an unbiased estimator of α.
2. α₂ is an efficient estimator of α.
3. α₁ is a consistent estimator of α.

(A) 1 only (B) 2 only (C) 3 only (D) 1,3 only (E) 2,3 only

21.4. Which of the following statements are true?

I. An estimator that is asymptotically unbiased and whose variance approaches 0 as the sample size goes to infinity is weakly consistent.
II. For an unbiased estimator, minimizing variance is equivalent to minimizing mean square error.
III. The estimator S² = (1/n) Σⱼ₌₁ⁿ (Xⱼ − X̄)² for the variance σ² is asymptotically unbiased.

(A) I and II (B) I and III (C) II and III (D) I, II, and III
(E) The correct answer is not given by (A), (B), (C), or (D).

21.5. [4B-S96:12] (1 point) Which of the following must be true of a consistent estimator?

1. It is unbiased.
2. For a small quantity ε, the probability that the absolute value of the deviation of the estimator from the true parameter value is less than ε tends to 1 as the number of observations tends to infinity.
3. It has minimal variance.

(A) 1 (B) 2 (C) 3 (D) 2,3 (E) 1,2,3

21.6. Which of the following statements is false?

(A) If two estimators are unbiased, a weighted average of them is unbiased.
(B) The sample mean is an unbiased estimator of the population mean.
(C) The sample mean is a consistent estimator of the population mean.
(D) For a uniform distribution on [0, θ], the sample maximum is a consistent estimator of the population maximum.
(E) The mean square error of an estimator cannot be less than the estimator's variance.


21.7. [4-F04:40] Which of the following statements is true?

(A) A uniformly minimum variance unbiased estimator is an estimator such that no other estimator has a smaller variance.
(B) An estimator is consistent whenever the variance of the estimator approaches zero as the sample size increases to infinity.
(C) A consistent estimator is also unbiased.
(D) For an unbiased estimator, the mean squared error is always equal to the variance.
(E) One computational advantage of using mean squared error is that it is not a function of the true value of the parameter.

21.8. θ̂ is an estimator for θ. E[θ̂] = 3 and E[θ̂²] = 13. If θ = 4, what is the mean square error of θ̂?

21.9. [4B-S95:27] (2 points) Two different estimators, ψ and φ, are available for estimating the parameter, β, of a given loss distribution. To test their performance, you have conducted 75 simulated trials of each estimator, using β = 2, with the following results:

Σᵢ₌₁⁷⁵ ψᵢ = 165,  Σᵢ₌₁⁷⁵ ψᵢ² = 375,  Σᵢ₌₁⁷⁵ φᵢ = 147,  Σᵢ₌₁⁷⁵ φᵢ² = 312.

Calculate MSE_ψ(β)/MSE_φ(β).

(A) Less than 0.50
(B) At least 0.50, but less than 0.65
(C) At least 0.65, but less than 0.80
(D) At least 0.80, but less than 0.95
(E) At least 0.95, but less than 1.00

21.10. [4B-S92:17] (2 points) You are given that the underlying size of loss distribution for disability claims is a Pareto distribution with parameters α and θ = 6000. You have used 10 random observations, maximum likelihood estimation, and simulation to determine the following for α̂, the maximum likelihood estimator of α:

E[α̂] = 2.20
MSE(α̂) = 1.00

Determine the variance of α̂ if α = 2.

(A) Less than 0.70
(B) At least 0.70, but less than 0.85
(C) At least 0.85, but less than 1.00
(D) At least 1.00, but less than 1.15
(E) At least 1.15

21.11. A population contains the values 1, 2, 4, 9. A sample of 3 without replacement is drawn from this population. Let Y be the median of this sample. Calculate the mean square error of Y as an estimator of the population mean.



21.12. A sample of n elements, x₁, . . . , xₙ, is selected from a random variable having a uniform distribution on [0, θ]. You wish to estimate θ with an estimator of the form θ̂ = k x̄.

Determine the k that minimizes the mean square error of the estimator as a function of the sample size n.

21.13. A sample of n elements, x₁, . . . , xₙ, is selected from a random variable having a uniform distribution on [0, θ]. Let Y = max(xᵢ). You wish to estimate the parameter θ with an estimator of the form kY. You may use the following facts:

(i) E[Y] = nθ/(n + 1)
(ii) Var(Y) = nθ²/((n + 2)(n + 1)²)

Determine the k that minimizes the mean square error of the estimator.

21.14. [4B-F93:13] (3 points) You are given the following:

• Two instruments are available for measuring a particular (non-zero) distance.
• X is the random variable representing the measurement using the first instrument and Y is the random variable representing the measurement using the second instrument.
• X and Y are independent.
• E[X] = 0.8m; E[Y] = m; Var(X) = m²; and Var(Y) = 1.5m², where m is the true distance.

Consider the class of estimators of m which are of the form Z = αX + βY.

Within this class of estimators of m, determine the value of α that makes Z an unbiased estimator of minimum variance.

(A) Less than 0.45
(B) At least 0.45, but less than 0.50
(C) At least 0.50, but less than 0.55
(D) At least 0.55, but less than 0.60
(E) At least 0.60

21.15. [4-S00:18] You are given two independent estimates of an unknown quantity µ:

(i) Estimate A: E(µ_A) = 1000 and σ(µ_A) = 400.
(ii) Estimate B: E(µ_B) = 1200 and σ(µ_B) = 200.

Estimate C is a weighted average of the two estimates A and B, such that

µ_C = w · µ_A + (1 − w) · µ_B

Determine the value of w that minimizes σ(µ_C).

(A) 0 (B) 1/5 (C) 1/4 (D) 1/3 (E) 1/2



21.16. For two estimators X and Y of λ:

(i) E[X] = λ and Var(X) = 3λ².
(ii) E[Y] = λ and Var(Y) = 4λ².
(iii) Cov(X, Y) = −λ².

Let Z = aX + bY. Determine the a and b that make Z an unbiased estimator of λ and minimize its variance.

21.17. [4-F02:31] You are given:

x:         0    1    2    3
Pr(X = x): 0.5  0.3  0.1  0.1

Using a sample of size n, the population mean is estimated by the sample mean X̄ and the variance is estimated by Sₙ² = Σ(Xᵢ − X̄)²/n.

Calculate the bias of Sₙ² when n = 4.

(A) −0.72 (B) −0.49 (C) −0.24 (D) −0.08 (E) 0.00

21.18. Losses follow a Pareto distribution with parameters α = 3, θ = 600. A sample of 100 is available. Determine the mean square error of the sample mean as an estimator for the mean.

21.19. A random variable follows an exponential distribution with mean θ. X₁ is an observation of this random variable. Express the bias of X₁² as an estimator for θ² as a function of θ.

(A) −2θ² (B) −θ² (C) 0 (D) θ² (E) 2θ²

21.20. A random variable follows an exponential distribution with mean θ. X₁ is an observation of this random variable. Express the mean square error of X₁² as an estimator for θ² as a function of θ.

(A) 20θ⁴ (B) 21θ⁴ (C) 22θ⁴ (D) 23θ⁴ (E) 24θ⁴

21.21. A random variable follows an exponential distribution with mean θ. A sample of n items, {x₁, . . . , xₙ}, is drawn from the random variable. The sample mean is x̄. Express the bias of x̄² as an estimator for θ² in terms of n and θ.

21.22. You are given a sample of n items, x₁, . . . , xₙ, from a uniform distribution on [0, θ]. As an estimator for θ, you use

θ̆ = (n + 1) min xᵢ

Calculate the mean square error of θ̆.



21.23. [C-S05:16] For the random variable X, you are given:

(i) E[X] = θ, θ > 0
(ii) Var(X) = θ²/25
(iii) θ̂ = kX/(k + 1), k > 0
(iv) MSE_θ̂(θ) = 2 [bias_θ̂(θ)]²

Determine k.

(A) 0.2 (B) 0.5 (C) 2 (D) 5 (E) 25

Confidence intervals

21.24. A sample of 200 policies yields the following information for claim counts xᵢ:

x̄ = 0.15,  Σ(xᵢ − x̄)² = 46

Construct a 90% normal confidence interval for mean claim counts per policy.

21.25. [4B-S91:29] (2 points) (This exercise is on a topic unlikely to appear on the exam and may be skipped.) A sample of 1000 policies yields an estimated claim frequency of 0.210. The number of claims for each policy is assumed to have a Poisson distribution. A 95% confidence interval for λ is constructed using the true variance of the parameter. Determine the confidence interval.

(A) (0.198, 0.225) (B) (0.191, 0.232) (C) (0.183, 0.240) (D) (0.173, 0.251) (E) (0.161, 0.264)

21.26. [4B-F97:1] (2 points) You are given the following:

• A portfolio consists of 10,000 identical and independent risks.
• The number of claims per year for each risk follows a Poisson distribution with mean λ.
• During the latest year, 1,000 claims have been observed for the entire portfolio.

Determine the lower bound of a symmetric 95% confidence interval for λ.

(A) Less than 0.0825
(B) At least 0.0825, but less than 0.0875
(C) At least 0.0875, but less than 0.0925
(D) At least 0.0925, but less than 0.0975
(E) At least 0.0975

Additional released exam questions: C-F05:28, C-F06:26

Solutions

21.1. Only 1 is true. The other two statements have interchanged definitions of consistency and efficiency. (A)

21.2. a = 3, b = 1, c = 2. (E)

21.3. Only 3 is true. α₂ has higher variance than α₁ and the same bias, so it is less efficient. (C)

2

21.4. I is True . In II, MSEθˆ ( θ )  Var ( θˆ ) + biasθˆ ( θ ) and biasθˆ ( θ )  0, so it is True . III is True ; although this estimator is biased, asymptotically (as n → ∞), dividing by n − 1 and dividing by n doesn’t make a difference. (D)



21.5.

(B)

21.6.

(A) If θ̂ and θ̃ are the two estimators, we are given E[θ̂] = θ and E[θ̃] = θ. It follows that w E[θ̂] + (1 − w) E[θ̃] = wθ + (1 − w)θ = θ. True.
(B) This was discussed in the lesson. See item 3 on page 354. True.
(C) The sample mean is not necessarily consistent, unless the variance of the underlying distribution is finite. See Subsection 21.1.2. False.
(D) The sample maximum is asymptotically unbiased (Example 21B) and its variance approaches zero as n → ∞ (Example 21E), hence consistent. True.
(E) The mean square error is the variance plus the bias squared. True.

The false statement is (C).

21.7. A correct version of (A) is "A uniformly minimum variance unbiased estimator is an unbiased estimator such that no other unbiased estimator has a smaller variance." An estimator which is a constant has no variance, but if it is not equal to the true parameter it must be inconsistent, so (B) is false. Consistency is an asymptotic property, so a biased estimator which is asymptotically unbiased could be consistent, making (C) false. (D) is true, since mean square error is bias squared plus variance. Mean square error is a function of the true value of the parameter; in fact, it is the expected value of the square of the difference between the estimator and the true parameter, so (E) is false. (D)

21.8. bias_θ̂(θ) = 3 − 4 = −1. Var(θ̂) = 13 − 3² = 4. MSE_θ̂(θ) = 4 + (−1)² = 5.

21.9. We must estimate the variance of each estimator. The question is vague on whether to use the population variance (divide by 75) or the sample variance (divide by 74). The original exam question said to work out the answer according to a specific textbook, which used the population variance. We then get:

MSE_ψ(β) = (165/75 − 2)² + (375/75 − (165/75)²) = 0.04 + 0.16 = 0.2
MSE_φ(β) = (147/75 − 2)² + (312/75 − (147/75)²) = 0.0016 + 0.3184 = 0.32

0.2/0.32 = 0.625. (B)

If the sample variance were used, we would multiply 0.16 and 0.3184 by 75/74 to get 0.1622 and 0.3227 respectively. The resulting quotient, (0.04 + 0.1622)/(0.0016 + 0.3227) = 0.6235, still leads to answer B.

21.10. The bias of α̂ is

bias_α̂(α) = 2.20 − 2 = 0.2

Since the mean square error is the variance plus the bias squared,

Var(α̂) = MSE(α̂) − [bias_α̂(α)]² = 1 − 0.2² = 0.96. (C)
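Solutions 21.9 and 21.10 both rest on MSE = bias² + variance. A quick Python check of the 21.9 numbers (the helper name is mine):

```python
def mse_from_trials(total, total_sq, n, true_value):
    """Empirical MSE from summary statistics of simulated estimates:
    (mean - true)^2 + population variance, which equals the average
    squared deviation of the estimates from the true value."""
    mean = total / n
    var = total_sq / n - mean ** 2          # population (biased) variance
    return (mean - true_value) ** 2 + var

# Exercise 21.9's summary statistics, with true beta = 2:
ratio = mse_from_trials(165, 375, 75, 2) / mse_from_trials(147, 312, 75, 2)  # 0.625
```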


21.11. Half the time (if 4 or 9 is omitted from the sample) the sample median is 2, and the other half of the time (if 1 or 2 is omitted from the sample) it is 4. The mean is (1 + 2 + 4 + 9)/4 = 4. So the MSE is (1/2)(2 − 4)² = 2.

21.12. The mean of the uniform distribution is θ/2, so the expected value of x̄ is θ/2, and the bias of the estimator is k(θ/2) − θ = θ(k/2 − 1). The variance of x̄ is the variance of the uniform distribution over n, or θ²/(12n), and multiplying x̄ by k multiplies the variance by k². We minimize the mean square error of θ̂, the square of the bias plus the variance, or

(k/2 − 1)² θ² + k² θ²/(12n)

as a function of k. Divide this expression by θ²:

g(k) = (k/2 − 1)² + k²/(12n)

Differentiate:

g′(k) = (k/2 − 1) + k/(6n) = 0
k (1/2 + 1/(6n)) = 1
k = 6n/(3n + 1)

21.13. The bias of kY is

k nθ/(n + 1) − θ = θ (n(k − 1) − 1)/(n + 1)

The variance of kY is

k²nθ²/((n + 2)(n + 1)²)

The MSE is then

θ² ( k²n/((n + 2)(n + 1)²) + (n(k − 1) − 1)²/(n + 1)² )

We shall minimize this by differentiating with respect to k. To simplify matters, divide the entire expression by θ² and multiply it by (n + 1)²; this has no effect on the minimizing k:

f(k) = k²n/(n + 2) + (n(k − 1) − 1)²
f′(k) = 2kn/(n + 2) + 2n(n(k − 1) − 1) = 0
k/(n + 2) + n(k − 1) − 1 = 0
k (1/(n + 2) + n) = n + 1
k (n(n + 2) + 1) = (n + 1)(n + 2)
k (n + 1)² = (n + 1)(n + 2)
k = (n + 2)/(n + 1)
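The minimizing k = (n + 2)/(n + 1) can be sanity-checked by simulation. A rough Monte Carlo sketch (assumptions mine: θ = 1, a fixed seed, and the same random draws reused for each k compared):

```python
import random

def mc_mse(k, n, theta=1.0, trials=50_000, seed=7):
    """Monte Carlo MSE of the estimator k * max(sample) for Uniform(0, theta)."""
    rng = random.Random(seed)   # same seed => identical samples for every k
    total = 0.0
    for _ in range(trials):
        y = max(rng.uniform(0, theta) for _ in range(n))
        total += (k * y - theta) ** 2
    return total / trials

n = 4
k_best = (n + 2) / (n + 1)   # 1.2, from the derivation above
```

With common random numbers, the empirical MSE is an exact quadratic in k, so the estimate at k_best should beat nearby values of k.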

21.14. From the unbiased condition:

E[αX + βY] = m
0.8α + β = 1

From the minimum variance condition, minimize

g(α) = Var(αX + βY) = α² Var(X) + β² Var(Y) = α²m² + (1 − 0.8α)²(1.5m²)

or

g(α)/m² = α² + 1.5 − 2.4α + 0.96α² = 1.96α² − 2.4α + 1.5

A quadratic ax² + bx + c is minimized at −b/(2a), so g(α) is minimized at α = 2.4/3.92 = 0.6122. (E)

21.15. The variance of the weighted average is

σ_C² = w²σ_A² + (1 − w)²σ_B² = 160,000w² + 40,000(1 − w)²

Differentiating,

2(160,000)w − 2(40,000)(1 − w) = 0
200,000w = 40,000
w = 1/5. (B)

21.16.

Z will be unbiased if and only if a + b = 1. The variance of Z is

Var(Z) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y) = λ²(3a² + 4b² − 2ab) = λ²(3a² + 4(1 − a)² − 2a(1 − a))

We'll minimize the parenthetical expression.

3a² + 4(1 − a)² − 2a(1 − a) = 3a² + 4 − 8a + 4a² − 2a + 2a² = 9a² − 10a + 4

The minimum of a quadratic px² + qx + r is at −q/(2p), so the minimum of this expression is at a = 5/9, so b = 4/9.

21.17. We know that S² = Σ(Xᵢ − X̄)²/(n − 1) is an unbiased estimator; in other words, E[S²] = σ². But then

E[Sₙ²] = ((n − 1)/n) E[S²] = ((n − 1)/n) σ²

and the bias is

E[Sₙ²] − σ² = ((n − 1)/n) σ² − σ² = −σ²/n

In this case, the true mean µ = 0.5(0) + 0.3(1) + 0.1(2) + 0.1(3) = 0.8 and the true variance is

σ² = 0.5(0 − 0.8)² + 0.3(1 − 0.8)² + 0.1(2 − 0.8)² + 0.1(3 − 0.8)² = 0.96

So the bias is −0.96/4 = −0.24. (C)
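The bias −σ²/n in solution 21.17 can be verified exactly from the distribution; a short Python sketch (function name mine):

```python
def bias_of_biased_variance(pmf, n):
    """Exact bias of S_n^2 = sum((X_i - Xbar)^2) / n, namely -sigma^2 / n."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return -var / n

# Exercise 21.17's distribution, with n = 4:
bias = bias_of_biased_variance({0: 0.5, 1: 0.3, 2: 0.1, 3: 0.1}, n=4)  # -0.24
```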


21.18. The estimator is unbiased because the sample mean is an unbiased estimator of the population mean. Therefore the mean square error equals the variance. The variance of the estimator is

Var(X̄) = Var(X)/100 = ( 2(600²)/((2)(1)) − (600/2)² )/100 = (360,000 − 90,000)/100 = 2700.

21.19. From the tables, the second moment of the exponential is E[X₁²] = 2θ². Therefore, the bias is

bias_{X₁²}(θ²) = E[X₁²] − θ² = 2θ² − θ² = θ². (D)

21.20. In the previous exercise, we calculated the bias as θ². The variance of X₁² is

Var(X₁²) = E[X₁⁴] − E[X₁²]² = 24θ⁴ − (2θ²)² = 20θ⁴

using the tables for the 4th and 2nd moments, so the mean square error is 20θ⁴ + (θ²)² = 21θ⁴. (B)

21.21. Y = Σᵢ₌₁ⁿ xᵢ is a gamma random variable with parameters n and θ. Our estimator is (Y/n)². The expected value of Y² is, using the second moment of a gamma from the table,

E[Y²] = n(n + 1)θ²

So the bias is

bias_{x̄²}(θ²) = n(n + 1)θ²/n² − θ² = θ²/n

21.22. In order to calculate the expected value of θ̆, we need the distribution function of Y = min xᵢ:

F_Y(x) = Pr(Y ≤ x) = 1 − Pr(Y > x) = 1 − Πᵢ₌₁ⁿ Pr(xᵢ > x) = 1 − ((θ − x)/θ)ⁿ

Notice that Y follows a beta distribution with the same parameter θ, a = 1, and b = n. Therefore, its mean and second moment are

E[Y] = θ/(n + 1)
E[Y²] = 2θ²/((n + 1)(n + 2))
Var(Y) = E[Y²] − E[Y]² = nθ²/((n + 1)²(n + 2))

Therefore, E[θ̆] = (n + 1) E[Y] = θ, making the estimator unbiased. The mean square error is therefore the variance of θ̆, or

Var(θ̆) = (n + 1)² Var(Y) = nθ²/(n + 2)

Note that the variance does not approach 0 as n → ∞.

21.23. Since

MSE_θ̂(θ) = [bias_θ̂(θ)]² + Var(θ̂)

and by (iv)

MSE_θ̂(θ) = 2 [bias_θ̂(θ)]²

it follows that

[bias_θ̂(θ)]² = Var(θ̂)    (*)

so we calculate bias_θ̂(θ) and Var(θ̂).

bias_θ̂(θ) = E[θ̂] − θ = (k/(k + 1)) E[X] − θ = kθ/(k + 1) − θ = −θ/(k + 1)

Var(θ̂) = Var(kX/(k + 1)) = (k/(k + 1))² (θ²/25)

By (*),

(−θ/(k + 1))² = (k/(k + 1))² (θ²/25)
1 = k²/25
k = 5. (D)

Since k > 0, we reject k = −5.

21.24. The unbiased sample variance is 46/199. The variance of the sample mean is estimated as the estimated variance of the distribution divided by the size of the sample, or

(46/199)/200 = 0.0011558

The confidence interval is 0.15 ± 1.645 √0.0011558 = (0.094, 0.206).

21.25.

−1.96 ≤ (0.21 − λ)/√(λ/1000) ≤ 1.96
(λ − 0.21)²/(λ/1000) ≤ 1.96² = 3.8416
1000λ² − 423.8416λ + 44.1 ≤ 0
(423.8416 − √3241.70)/2000 ≤ λ ≤ (423.8416 + √3241.70)/2000
0.183 ≤ λ ≤ 0.240. (C)


21.26. Using the true variance:

−1.96 ≤ (λ − 0.1)/√(λ/10000) ≤ 1.96
(λ − 0.1)² ≤ 1.96² λ/10000
λ² − 0.20038416λ + 0.01 ≤ 0
(0.20038416 − √0.0001538)/2 ≤ λ ≤ (0.20038416 + √0.0001538)/2
0.0940 ≤ λ ≤ 0.1064

The lower bound is 0.0940. (D) The multiple choice ranges are so wide that using the cruder approximation with the approximate variance, 0.1 − 1.96 √(0.1/10000) = 0.0938, results in the same answer.

Quiz Solutions

21-1. ln x is normally distributed with parameters µ and σ = 4, so E[ln x] = µ, making the estimator unbiased. Also, Var(ln x) = σ² = 16, so the mean square error is 16.

Lesson 22

The Empirical Distribution for Complete Data

Reading: Loss Models Fourth Edition 11

Complete data for a study means that every relevant observation is available and the exact value of every observation is known. Examples of data that are not complete are:

• Observations below a certain number are not available. You only obtain a data point if it is higher than that number.
• For observations above a certain number, you are only told that the observation is above that number. For example, in a mortality study, the data points may be amount of time until death. For some individuals, you may only be told that the person survived 5 years, but not told exactly how long he survived.

This lesson discusses an estimator for the underlying distribution when you are provided with complete data.

22.1 Individual data

If individual data, by which we mean the exact observation points, are provided, the empirical distribution, as defined in Section 1.5, may be used as the underlying distribution. The textbook uses a subscript n on a probability function to indicate the empirical distribution based on n observations. Thus Fₙ(x) is the empirical cumulative distribution function, fₙ(x) is the empirical probability or probability density function, and so on. Since the empirical distribution for individual data is discrete, fₙ(x) would be the probability of x, and would equal k/n, where k is the number of xᵢ in the sample equal to x.

The empirical cumulative hazard function is Hₙ(x) = −ln Sₙ(x). As an alternative, if for some reason you don't want to use the empirical distribution as the underlying distribution, the cumulative hazard function can be estimated using the Nelson-Åalen estimator, which we'll study in Section 24.2. Note that when complete individual data is available, the Nelson-Åalen estimate of H(x) is different from the empirical distribution estimate, whereas the product limit estimate of S(x) (which we'll study in Section 24.1) is the same as the empirical distribution estimate.

Example 22A In a mortality study on 10 lives, times at death are 22, 35, 78, 101, 125, 237, 350, 350, 484, 600. The empirical distribution is used as a model for the underlying distribution of time to death for the population.

Calculate F₁₀(100), f₁₀(350), and H₁₀(100).


Answer:

F₁₀(100) = Pr(X ≤ 100) = #{xᵢ ≤ 100}/10 = 3/10 = 0.3
f₁₀(350) = #{xᵢ = 350}/10 = 2/10 = 0.2
H₁₀(100) = −ln(1 − F₁₀(100)) = −ln 0.7 = 0.3567


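The three empirical quantities in Example 22A are mechanical to compute; a short Python sketch (not the manual's code):

```python
import math

deaths = [22, 35, 78, 101, 125, 237, 350, 350, 484, 600]
n = len(deaths)

def F(x):   # empirical cumulative distribution function F_n(x)
    return sum(1 for t in deaths if t <= x) / n

def f(x):   # empirical probability of exactly x
    return sum(1 for t in deaths if t == x) / n

def H(x):   # empirical cumulative hazard, H_n(x) = -ln S_n(x)
    return -math.log(1 - F(x))
```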

The empirical distribution is discrete, so it is a step function. In the above example, Fₙ(101) = 0.4 but Fₙ(101 − ε) = 0.3 for any small ε > 0.

The empirical distribution may be used as an estimator for discrete distributions as well. The procedure is the same as for continuous distributions: the probability of each observed value is the proportion of the observed values in the sample, and the probability of any other value is 0.

22.2 Grouped data

Grouped data has a set of intervals and the number of losses in each interval, but does not give the exact value of each loss. This means that we know the empirical cumulative distribution function only at endpoints of intervals. Strictly speaking, grouped data is not complete data, but we will consider a modification to the empirical distribution to handle it.

To generate the cumulative distribution function for all points, we "connect the dots": we interpolate linearly between endpoints of intervals. The resulting distribution function is denoted by Fₙ(x) and is called the ogive. The derivative of the ogive is denoted by fₙ(x). It is the density function corresponding to the ogive, and is called the histogram. It is constant between endpoints of intervals. At each point, it is equal to the number of points in the interval of that point divided by the product of the length of the interval and the total number of points (in all intervals). In other words:

fₙ(x) = nⱼ / (n (cⱼ − cⱼ₋₁))    (22.1)

where x is in the interval¹ [cⱼ₋₁, cⱼ), there are nⱼ points in the interval, and n points altogether.

Example 22B [4B-S95:1] (1 point) 50 observed losses have been recorded in millions and grouped by size of loss as follows:

Size of Loss (X)    Number of Observed Losses
( 0.5,    2.5]          25
( 2.5,   10.5]          10
( 10.5,  100.5]         10
(100.5, 1000.5]          5
Total                   50

What is the height of the relative frequency histogram, fₙ(x), at x = 50?

(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20

¹fₙ(x) is defined to be right-continuous, so the interval is closed on the left and open on the right. This is an arbitrary decision.


Answer: The length of the interval around 50 is 100.5 − 10.5 = 90, and n = 50. So

    f_n(x) = 10 / ((90)(50)) = 0.0022.    (A)

The choices appear to be designed for somebody who forgot to divide by 90.
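Equation (22.1) is mechanical enough to code directly. A minimal sketch (the function name is mine, not the manual's), reproducing Example 22B:

```python
def histogram_height(boundaries, counts, x):
    """Equation (22.1): f_n(x) = n_j / (n (c_j - c_{j-1})) for x in the
    interval [c_{j-1}, c_j) containing n_j of the n total observations."""
    n = sum(counts)
    for lo, hi, nj in zip(boundaries, boundaries[1:], counts):
        if lo <= x < hi:
            return nj / (n * (hi - lo))
    return 0.0

# Example 22B: 50 losses (in millions) grouped into four intervals
bounds = [0.5, 2.5, 10.5, 100.5, 1000.5]
counts = [25, 10, 10, 5]

print(histogram_height(bounds, counts, 50))  # 10/((90)(50)) = 0.00222...
```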



Exercises

Use the following information for questions 22.1 and 22.2:

You are given the following exact times to death in a study of a population of 20:

    Number of years    Number surviving this long
          3                     4
          4                     1
          5                     3
          7                     6
          9                     2
         10                     1
         11                     1
         12                     1
         15                     1

Let T be survival time.

22.1. Using the empirical distribution as a model, calculate the probability of surviving more than 8 but no more than 12, or Pr(8 < T ≤ 12).

22.2. Using the empirical distribution as a model, calculate the probability of surviving at least 5 but no more than 12, or Pr (5 ≤ T ≤ 12) .

22.3. For an insurance coverage, you observe the following 15 losses:

    12, 12, 15, 20, 30, 35, 43, 50, 50, 70, 85, 90, 100, 120, 150

Calculate the empirical estimate of H(50).

22.4. For a nursing home population, you observe the following survival times:

    Number of years    Number surviving this long
        0–1                     8
        1–2                    15
        2–5                    42
        5–10                   29
        10+                     6
                              100

Calculate the empirical density function at 6 years, f100 (6) .


22. THE EMPIRICAL DISTRIBUTION FOR COMPLETE DATA


22.5. [4B-S93:31] (2 points)

The following 20 wind losses, recorded in millions of dollars, occurred in 1992:

    1, 1, 1, 1, 1, 2, 2, 3, 3, 4,
    6, 6, 8, 10, 13, 14, 15, 18, 22, 25

To construct an ogive F_n(x), the losses were segregated into four ranges:

    (0.5, 2.5), (2.5, 8.5), (8.5, 15.5), (15.5, 29.5)

Determine the values of the probability density function f_n(x) corresponding to F_n(x) for the values x1 = 4 and x2 = 10.

(A) f_n(x1) = 0.300, f_n(x2) = 0.200
(B) f_n(x1) = 0.050, f_n(x2) = 0.050
(C) f_n(x1) = 0.175, f_n(x2) = 0.050
(D) f_n(x1) = 0.500, f_n(x2) = 0.700
(E) f_n(x1) = 0.050, f_n(x2) = 0.029

22.6. [4B-F94:12] (1 point) Nine observed losses have been recorded in thousands of dollars and are grouped as follows:

    Interval    Number of claims
    [0, 2)            2
    [2, 5)            4
    [5, ∞)            3

Determine the value of the relative frequency histogram for these losses at x = 3.

(A) Less than 0.15
(B) At least 0.15, but less than 0.25
(C) At least 0.25, but less than 0.35
(D) At least 0.35, but less than 0.45
(E) At least 0.45

22.7. [4B-F96:11] (2 points) You are given the following:

Ten losses (X) have been recorded as follows: 1000, 1000, 1000, 1000, 2000, 2000, 2000, 3000, 3000, 4000

An ogive, F_n(x), has been fitted to this data using endpoints for the connecting line segments with x-values as follows:

    x = c_0 = 500,  x = c_1 = 1500,  x = c_2 = 2500,  x = c_3 = 4500.

Determine the height of the corresponding relative frequency histogram, f_n(x), at x = 3000.

(A) 0.00010

(B) 0.00015

(C) 0.00020

(D) 0.00025

(E) 0.00030


22.8.

You are given the following information for claim sizes:

    Claim Size      Number of Claims
    0–1000                35
    1000–5000             45
    5000–10000            20

Use the ogive to estimate the probability that a randomly chosen claim is between 4000 and 5000.

22.9. [C-S05:26] You are given the following information regarding claim sizes for 100 claims:

    Claim Size           Number of Claims
    0 – 1,000                  16
    1,000 – 3,000              22
    3,000 – 5,000              25
    5,000 – 10,000             18
    10,000 – 25,000            10
    25,000 – 50,000             5
    50,000 – 100,000            3
    over 100,000                1

Use the ogive to estimate the probability that a randomly chosen claim is between 2,000 and 6,000.

(A) 0.36

(B) 0.40

(C) 0.45

(D) 0.47

(E) 0.50

22.10. You are given the following data for amount of time for 100 hospital stays:

    Number of days    Number of people
    (0, 1]                  14
    (1, 2]                  19
    (2, 4]                  16
    (4, 7]                  12
    (7, 10]                 14
    (10, ∞)                 25

Using the ogive, estimate h(3), the hazard rate function at 3 days.

Additional released exam questions: C-F05:1,33, C-F06:35

Solutions

22.1. F20(8) = (4 + 1 + 3 + 6)/20 = 0.7 and F20(12) = (4 + 1 + 3 + 6 + 2 + 1 + 1 + 1)/20 = 0.95, so Pr(8 < T ≤ 12) = F20(12) − F20(8) = 0.25. You could do this faster by counting the times between 8 and 12 (2 at 9, etc.). There are five times greater than 8 and not greater than 12, so Pr20(8 < T ≤ 12) = 5/20 = 0.25.

22.2. We have to be careful since F_n is discrete, and Pr20(T < 5) ≠ Pr20(T ≤ 5). We must count all observations in the interval [5, 12], including the ones at 5, so Pr20(5 ≤ T ≤ 12) = (3 + 6 + 2 + 1 + 1 + 1)/20 = 0.7.

22.3. There are 9 losses less than or equal to 50, so F15(50) = 9/15 = 0.6 and H15(50) = −ln(1 − 0.6) = 0.9163.


22.4. There are 29 observations in the interval 5–10, which contains 6, and the width of the interval 5–10 is 5. By equation (22.1):

    f(6) = 29 / ((5)(100)) = 0.058.

22.5. Note that the exercise gives you a list of 20 wind losses, in order from lowest to highest, as the comma after the 4 on the first line indicates. The 20 numbers are not a two-line unlabeled table in which the first line is a frequency and the second line is a number! There are 6 observations in (2.5, 8.5) and 4 observations in (8.5, 15.5).

    f_n(4) = 6 / ((20)(6)) = 0.050
    f_n(10) = 4 / ((20)(7)) = 0.029    (E)

22.6. 3 is in the interval [2, 5). There are 4 observed losses in this interval, a total of 9 observed losses, and the width of the interval is 3. By equation (22.1):

    4 / ((9)(3)) = 0.148.    (A)

22.7. 3000 is in the interval [2500, 4500]. There are 3 observed losses in this interval, a total of 10 observed losses, and the width of the interval is 2000. By equation (22.1):

    3 / ((10)(2000)) = 0.00015.    (B)

22.8. Since linear interpolation is used, 1/4 of the claims in the 1000–5000 range will be between 4000 and 5000, or 45/4 = 11.25 claims out of 100, and 11.25/100 = 0.1125.

22.9. Let X be claim size. Since there are a total of 100 claims,

    Pr(1000 < X < 3000) = 22/100 = 0.22
    Pr(3000 < X < 5000) = 25/100 = 0.25
    Pr(5000 < X < 10,000) = 18/100 = 0.18

Using the ogive, Pr(2000 < X < 3000) = (1/2) Pr(1000 < X < 3000) = 0.11 and Pr(5000 < X < 6000) = (1/5) Pr(5000 < X < 10,000) = 0.036. The answer is therefore 0.11 + 0.25 + 0.036 = 0.396. (B)
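The interpolation used in solutions 22.8 and 22.9 can be sketched in a few lines (the function is mine; the open-ended "over 100,000" interval is handled by passing the total claim count explicitly):

```python
def ogive_cdf(boundaries, counts, x, n_total=None):
    """F_n(x) by linear interpolation between interval endpoints.
    n_total overrides sum(counts) when some observations lie beyond
    the last finite endpoint."""
    n = n_total if n_total is not None else sum(counts)
    cum = 0
    for lo, hi, nj in zip(boundaries, boundaries[1:], counts):
        if x <= lo:
            return cum / n
        if x < hi:
            return (cum + nj * (x - lo) / (hi - lo)) / n
        cum += nj
    return cum / n

# Exercise 22.9: the seven finite intervals; one claim is over 100,000
bounds = [0, 1000, 3000, 5000, 10000, 25000, 50000, 100000]
counts = [16, 22, 25, 18, 10, 5, 3]

p = ogive_cdf(bounds, counts, 6000, 100) - ogive_cdf(bounds, counts, 2000, 100)
print(p)  # 0.396
```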

22.10. As we learned in Lesson 1, h(x) = f(x)/S(x). Here,

    f100(3) = 16 / ((2)(100)) = 0.08
    S100(3) = 1 − (14 + 19 + 0.5(16))/100 = 0.59
    h100(3) = 0.08/0.59 = 0.1356

Lesson 23

Variance of Empirical Estimators with Complete Data

Reading: Loss Models Fourth Edition 12.2

Very few released exam questions relate to the material in this lesson.

The textbook has several complicated-looking formulas for the variance of the estimators we discussed in the last lesson. Rather than memorizing these formulas, you are better off understanding the principles behind them, so you can derive them yourself as needed. The two distributions you need to understand in order to derive any formula you need for the variance of empirical estimators with complete data are:

1. The binomial distribution. If there are m items which are being placed in two categories, and the probability that an item is placed in the first category is q, then the variance of the number of items, X, in the first category (or in the second category, for that matter) is mq(1 − q). A variation of this is the random variable representing the proportion of items in the first category, or Y = X/m. The variance of Y is the variance of X divided by m², or q(1 − q)/m. This random variable is called a binomial proportion variable.

2. The multinomial distribution. If there are m items which are being placed in k categories, with probabilities q1, . . . , qk respectively (the probabilities must sum to 1), then the variance of the number of items in category i is mq_i(1 − q_i). The covariance of the number of items in categories i and j is −mq_i q_j. As with the binomial distribution, you may also consider the proportion of items in the categories, in which case you would divide these expressions by m²; the variance of the multinomial proportion is q_i(1 − q_i)/m and the covariance is −q_i q_j/m.
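These two facts can be written down directly. A sketch (function name is mine), for m trials with probability vector q:

```python
def multinomial_proportion_moments(m, q):
    """Variance of each category proportion X_i/m, and the covariance
    matrix of the proportions, for a multinomial with m trials:
    Var = q_i(1-q_i)/m, Cov = -q_i q_j/m for i != j."""
    k = len(q)
    var = [qi * (1 - qi) / m for qi in q]
    cov = [[var[i] if i == j else -q[i] * q[j] / m for j in range(k)]
           for i in range(k)]
    return var, cov

var, cov = multinomial_proportion_moments(50, [0.1, 0.2, 0.7])
print(var[0])     # 0.1*0.9/50 = 0.0018
print(cov[0][1])  # -0.1*0.2/50 = -0.0004
```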

23.1 Individual data

If the empirical distribution is being used as the model with individual data, then S_n(x) is the proportion of observations above x. Since the probability of an observation being above x is S(x), S_n(x) is a binomial proportion random variable with parameters m = n and q = S(x); its variance is therefore:

    Var(S_n(x)) = S(x)(1 − S(x)) / n

Since we don't know S(x), we estimate the variance using S_n(x) instead of S(x):

    \widehat{Var}(S_n(x)) = S_n(x)(1 − S_n(x)) / n        (23.1)

There are a couple of pre-2000 syllabus exam questions (which are provided below in the exercises) where S(x) is explicitly specified. In such a case, you would use the real S(x) in this formula, not S_n(x). But I don't expect such questions on current exams.


If n_x is the observed number of survivors past time x, then S_n(x) = n_x/n. Plugging this into equation (23.1), the estimated variance becomes

    \widehat{Var}(S_n(x)) = (n_x/n)(1 − n_x/n)/n = n_x(n − n_x)/n³        (23.2)
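Equation (23.2) in code form (a minimal sketch; the function name is mine):

```python
def var_empirical_survival(n, n_x):
    """Equation (23.2): estimated variance of S_n(x) = n_x/n,
    where n_x is the observed number of survivors past x."""
    return n_x * (n - n_x) / n**3

# In Example 23A below, 3 of the 10 observations are below 100, so the
# estimated variance of F_10(100) (same as that of S_10(100)) is:
print(var_empirical_survival(10, 3))  # 3*7/1000 = 0.021
```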

We can also empirically estimate the probability of survival past time y given survival past time x, Pr(X > y | X > x). We shall use the notation _{y−x}p_x to denote this probability, the probability of survival to time y, or to y − x time units past x, given survival to time x, and also the notation _{y−x}q_x to denote the complement, the probability of failure within y − x additional time units after x given survival to time x. (This notation is identical to the notation used in Exam MLC/LC.) The estimator for _{y−x}p_x is Pr_n(X > y | X > x) = n_y/n_x, where n_x and n_y are the observed numbers of survivors past times x and y respectively.

The variance of this conditional estimator, a quotient of two random variables, cannot be estimated unconditionally, since the estimator may not even exist (if everyone dies before time x). The best we can do is estimate the variance conditional on having the observed number of lives at time x, in effect making the denominator a constant. The estimator of the variance is then essentially the same as the unconditional estimator of \widehat{Var}(S_n(x)) (equation (23.2)), but replacing n with n_x and n_x with n_y. Thus, we have

    \widehat{Var}(_{y−x}p̂_x | n_x) = \widehat{Var}(_{y−x}q̂_x | n_x) = n_y(n_x − n_y)/n_x³        (23.3)

Notice that the variances of _{y−x}p̂_x and _{y−x}q̂_x are identical, since one is the complement of the other and Var(1 − X) = Var(X). The same applies to conditional versions of these variances.

f_n(x) is the proportion of observations equal to x. The right-hand sides of formulas (23.2) and (23.3) are used to estimate Var(f_n(x)), with n_x redefined to be the number of observations equal to x.

The empirical estimators of S(x) and f(x) with individual data are unbiased.

Example 23A (Same data as Example 22A) In a mortality study on 10 lives, times at death are

    22, 35, 78, 101, 125, 237, 350, 350, 484, 600.
The empirical distribution is used as a model for the underlying distribution of time to death for the population. Estimate Var(F10(100)) and Var(f10(350)).

Answer: From formula (23.2), with n_x = 3 since there are 3 observations below 100:

    \widehat{Var}(F10(100)) = 3(7)/10³ = 0.021

From formula (23.2), with n_x = 2 since there are 2 observations equal to 350:

    \widehat{Var}(f10(350)) = 2(8)/10³ = 0.016

23.2 Grouped data

For estimating the variance of the S(x) estimator for grouped data, the same formulas as for individual data can be used at boundaries of intervals.


Otherwise, S_n(x) is linearly interpolated. The textbook derives the following formulas using the multinomial distribution. For a point x in the interval (c_{j−1}, c_j), if Y is the number of observations less than or equal to c_{j−1} and Z is the number of observations in the interval (c_{j−1}, c_j], then

    \widehat{Var}(S_n(x)) = [\widehat{Var}(Y)(c_j − c_{j−1})² + \widehat{Var}(Z)(x − c_{j−1})² + 2\widehat{Cov}(Y, Z)(c_j − c_{j−1})(x − c_{j−1})] / [n²(c_j − c_{j−1})²]        (23.4)

and

    \widehat{Var}(f_n(x)) = \widehat{Var}(Z) / [n²(c_j − c_{j−1})²]        (23.5)

where

    \widehat{Var}(Y) = Y(n − Y)/n
    \widehat{Var}(Z) = Z(n − Z)/n
    \widehat{Cov}(Y, Z) = −YZ/n

The ogive is a biased estimator (except at boundaries), since we have no reason to believe the true distribution is linear between boundaries.

Rather than memorizing the above, I recommend deriving whatever you need from the multinomial formula. The following example illustrates how to do this.

Example 23B You are given the following data on 50 loss sizes:

    Interval        Number of losses in interval
    0–500                      25
    500–1000                   10
    1000–5000                  10
    5000–10000                  3
    > 10000                     2

An ogive is used to estimate the distribution of losses.

1. Estimate the variance of the estimator for the probability of a loss greater than 2500.
2. Estimate the variance of the estimator for the density function at 2500.

Answer: Let's first calculate the variance of the estimator for the probability of a loss greater than 2500.

Step 1— What's the estimator? What exactly, in mathematical form, are you calculating the variance of? The ogive estimator uses linear interpolation between endpoints. This means that we count up the number of observations above the next endpoint, add on a proportionate number of observations between 2500 and the next endpoint, and then divide by the total number of observations. In other words, 3 + 2 (the observations above 5000), plus 5/8 of 10 (the proportionate number of observations between 1000 and 5000), all divided by 50 (the total number of observations):

    \widehat{Pr}(X > 2500) = (3 + 2 + (5/8)(10)) / 50

Step 2— In this estimator, what is random and what is not? The only thing that is random is the number of observations in each interval. The total number of observations is not random—it is assumed we decided in advance how many cases to study. The endpoints aren't random—it is assumed that we designed the study this way. 2500 is not random—we decided what question to ask.

So in the above expression for \widehat{Pr}(X > 2500), the 3, 2, and 10 are random, but the 5/8 and 50 are not. Items which are not random have no variance.

Step 3— Write down an expression for the estimator with mathematical symbols for the random variables. Let Y be the number of observations above 5000, and Z the number of observations between 1000 and 5000. (The number of observations between 2500 and 5000 is random, but we don't know what it is; we're estimating it!) Then the estimator is

    \widehat{Pr}(X > 2500) = (Y + (5/8)Z)/50 = Y/50 + Z/80

Step 4— Calculate the variance of the random variables. Here's where the binomial and multinomial distributions come into play. Whenever we need a probability (and we don't have it, since we're estimating it), use the empirical, estimated probability.

Y, the number of observations above 5000, is a binomial random variable; either an observation is above 5000 or it isn't. The probability of an observation above 5000 is estimated as (3 + 2)/50 = 0.1. So Y is binomial with m = 50 and q = 0.1.

Z, the number of observations between 1000 and 5000, is a binomial random variable; either an observation is between 1000 and 5000 or it isn't. The probability of an observation between 1000 and 5000 is estimated as 10/50 = 0.2. So Z is binomial with m = 50 and q = 0.2.

Y and Z form a trinomial distribution; either an observation is less than 1000, between 1000 and 5000, or over 5000. The parameters are q_y = 0.1, q_z = 0.2, and m = 50.

Step 5— Calculate the variance. Use the usual formula for Var(aY + bZ):

    Var(aY + bZ) = a² Var(Y) + 2ab Cov(Y, Z) + b² Var(Z)

From step 3, a = 1/50 and b = 1/80. From the binomial and trinomial,

    Var(Y) = mq_y(1 − q_y) = 50(0.1)(0.9) = 4.5
    Var(Z) = mq_z(1 − q_z) = 50(0.2)(0.8) = 8
    Cov(Y, Z) = −mq_y q_z = −50(0.1)(0.2) = −1

    \widehat{Var}(\widehat{Pr}(loss > 2500)) = (1/2500)(4.5) + 2(1/50)(1/80)(−1) + (1/6400)(8)
                                             = 0.0018 − 0.0005 + 0.00125 = 0.00255

Now let's estimate the variance of the estimate of the density function at 2500 in the same way.

Step 1 The estimator, by equation (22.1), is 10/((50)(4000)).

Step 2 Y, the number of observations in the interval [1000, 5000], is random; 50 and 4000 are not.

Step 3 The expression is Y/200,000.

Step 4 Y is binomial (m = 50, q = 0.2). Its variance is 50(0.2)(0.8) = 8.

Step 5

    \widehat{Var}(Y/200,000) = (1/200,000²)(8) = 2 × 10⁻¹⁰
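The arithmetic of Example 23B can be verified in a few lines. A sketch following the five steps above (variable names are mine):

```python
m = 50
q_y = (3 + 2) / m   # estimated Pr(X > 5000)
q_z = 10 / m        # estimated Pr(1000 < X < 5000)
a, b = 1 / 50, (5 / 8) / 50   # coefficients of Y and Z from step 3

var_y = m * q_y * (1 - q_y)   # 4.5
var_z = m * q_z * (1 - q_z)   # 8.0
cov_yz = -m * q_y * q_z       # -1.0

var_est = a**2 * var_y + 2 * a * b * cov_yz + b**2 * var_z
print(var_est)  # 0.00255
```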


Exercises

Use the following information for questions 23.1 and 23.2:

You are given the following data on loss sizes for an insurance coverage:

    2, 3, 5, 8, 8, 10, 12, 15, 18, 25

23.1. Estimate the variance of the empirical estimator of F(11).

23.2. Assume that there is an ordinary deductible of 5, and losses 5 and lower are not submitted as claims. Estimate the variance of the probability of a claim resulting in a payment of more than 6.

23.3. In a mortality study on 50 lives, one death occurs at each of times 2, 4, and 8. There are no withdrawals. Estimate the variance of the empirical estimator of 4q2.

Use the following information for questions 23.4 through 23.6:

You are given the following information for loss sizes:

    Loss sizes       Number of losses
    0–1000                 32
    1000–2000              20
    2000–5000              35
    5000–10,000            31
    Over 10,000            12
                          130

23.4. The probability of a loss greater than 500 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate. 23.5. The probability of a loss between 1000 and 2000 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate. 23.6. The probability of a loss greater than 1500 is estimated empirically assuming uniform distribution of losses within each interval. Estimate the variance of this estimate.



23.7. You are given the following data from a mortality study on 100 individuals:

    Survival time    Number of individuals
    0–5                      4
    5–10                     3
    10–15                    5
    15+                     88

The density function at 7, f(7), is estimated empirically. Estimate the variance of this estimate.

23.8. You are given the following loss data:

    Size of loss        Number of losses
    0–1,000                   10
    1,000–5,000                8
    5,000–10,000               4
    10,000–20,000              3

S(1500) is estimated empirically using the ogive. Estimate the variance of the estimator.

23.9. You are given the following loss data:

    Size of loss     Number of losses
    0–500                  15
    500–2000               12
    Over 2000               8

The probability of a loss between 800 and 1000 is estimated empirically, assuming uniform distribution of losses within each interval. Estimate the variance of the estimator.

23.10. The following is data on claim frequency for one year:

    Number of claims    Number of policies
    0                        502
    1                         34
    2                         12
    3 or more                  2
                             550

The probability of at least 1 claim is estimated empirically using the ogive. Estimate the variance of this estimator.

23.11. [160-S87:7] Two mortality studies with complete data have produced independent estimators of S(10). Study 1 began with 300 lives and produced S300(10) = 0.60. Study 2 began with 100 lives and produced S100(10) = 0.50. Calculate the estimated variance of the difference between these estimators.

(A) 0.0008
(B) 0.0017
(C) 0.0025
(D) 0.0033
(E) The value cannot be determined from the information given.



23.12. [160-S88:7] You are given:

(i) A cohort of 12 individuals is observed from t = 0 until t = 9.
(ii) The observed times of death are 1, 2, 2, 3, 4, 4, 6, 6, 7, 8, 8, 9.
(iii) The cohort group is assumed to be subject to the uniform survival distribution over (0, 9].

Calculate the conditional variance of p̂2.

(A) 0.009

(B) 0.010

(C) 0.011

(D) 0.012

(E) 0.014

23.13. [160-F90:10] From a mortality study with complete data, you are given:

(i) The number of deaths during the third year is 9.
(ii) The number of lives beginning the fourth year is 18.
(iii) S_n(y_j) is the empirical survival function at the end of year y_j.
(iv) S_n(1) = 0.65 and S_n(3) = 0.30.
(v) Var(S_n(y_j)) is the variance of S_n(y_j) if the underlying distribution is known to be exponential with mean −1/ln 0.7.
(vi) \widehat{Var}(S_n(y_j)) is the estimated variance of S_n(y_j) if the underlying distribution is unknown.

Calculate the absolute difference between Var(S_n(2)) and \widehat{Var}(S_n(2)).

(A) 0.00001
(B) 0.00002
(C) 0.00003
(D) 0.00004
(E) 0.00005

23.14. [160-S91:10] A cohort of 10 lives is observed over the interval (0, 6].

You are given:

(i) The observed times of death are 1, 2, 3, 4, 4, 5, 5, 6, 6, 6.
(ii) Var^U(S10(3)) is the variance of S10(3) when the survival distribution is assumed to be uniform on (0, 6].
(iii) \widehat{Var}(S10(3)) is the estimated variance of S10(3) when no assumption about the underlying survival distribution is made.

Calculate Var^U(S10(3)) − \widehat{Var}(S10(3)).

(A) −0.004



(B) −0.002





(C) 0.000

(D) 0.002

(E) 0.004



23.15. [160-83-94:5] A cohort of terminally ill patients is studied beginning at time t = 0 until all have died at t = 5. You are given:

(i)
    Time t    Deaths at Time t
      1              6
      2              9
      3              5
      4             d4
      5             d5

(ii) \widehat{Var}(S_n(1)) = \widehat{Var}(S_n(3)) based on the actual data.

(iii) The average remaining lifetime for those who survive to t = 3 is 7/6.









Calculate the number of deaths at t = 4.

(A) 1

(B) 3

(C) 5

(D) 10

(E) 15

Solutions

23.1. Using formula (23.1) (the variance of F and S is the same), we have

    \widehat{Var}(F10(11)) = (0.6)(0.4)/10 = 0.024

23.2. Using formula (23.3), we have

    \widehat{Var}(6q̂5) = (3)(4)/7³ = 12/343

23.3. Using formula (23.3), we have

    \widehat{Var}(4q̂2) = (1)(48)/49³ = 0.0004080

23.4. The number of losses less than 1000 is a binomial random variable with m = 130 and q = Pr(X ≤ 1000), which is estimated by F130(1000) = 32/130. The proportion of losses less than 1000 is a binomial proportion random variable. Its variance is estimated as q(1 − q)/m = (32)(98)/130³. The probability of a loss greater than 500 is estimated as 1 minus half the losses less than 1000 divided by 130. The variance is then (1/2)² times the variance of the binomial proportion random variable, or

    \widehat{Var}(S130(500)) = (1/2)²(32)(98)/130³ = 0.0003569


23.5. This is the easy case. The estimator is the number of losses in this range divided by 130. The number of losses is a binomial variable with estimated parameters m = 130, q = 20/130, and variance mq(1 − q). So the variance of the probability of a loss is mq(1 − q) divided by 130², or:

    \widehat{Var}(\widehat{Pr}(1000 < X < 2000)) = (20/130)(110/130)/130 = 0.0010014

23.6. The ogive estimator of the probability of a loss greater than 1500 is (X + 0.5Y)/130, where X is the observed number of losses above 2000 and Y is the observed number of losses between 1000 and 2000. The variance is therefore

    [Var(X) + 0.5² Var(Y) + 2(0.5) Cov(X, Y)] / 130²

X and Y are multinomial with m = 130, X having estimated probability 78/130 and Y having estimated probability 20/130. Therefore, the variance of the ogive estimator of S130(1500) is

    \widehat{Var}(S130(1500)) = [Var(X) + 0.5² Var(Y) + 2(0.5) Cov(X, Y)] / 130²
                              = (78)(52)/130³ + (0.25)(20)(110)/130³ − (78)(20)/130³
                              = 0.001846 + 0.000250 − 0.000710 = 0.001386

23.7. The density function estimator is the number of losses in the interval, a multinomial random variable with m = 4 + 3 + 5 + 88 = 100 and q = 3/100 = 0.03 (estimated), divided by the length of the interval, 5, and the total number of losses, 100. The estimated variance of the multinomial random variable is (100)(0.03)(0.97) = 2.91. So the variance of the estimator is

    \widehat{Var}(f100(7)) = 2.91 / ((100²)(5²)) = 0.00001164

23.8. The random variable for losses in (1000, 5000) is multinomial with m = 10 + 8 + 4 + 3 = 25 and q2 = 8/25 = 0.32 (estimated). The random variable for losses greater than 5000 is multinomial with m = 25, q3 = (4 + 3)/25 = 0.28 (estimated). S25(1500) is estimated as (5000 − 1500)/(5000 − 1000) = 7/8 times the first, divided by 25, plus the second divided by 25. So the variance is:

    \widehat{Var}(S25(1500)) = (7/8)²(0.32)(0.68)/25 + (0.28)(0.72)/25 − 2(7/8)(0.32)(0.28)/25

 0.006664 + 0.008064 − 2 (0.003136)  0.008456 23.9. The estimate of the number of losses Ni in the subinterval (800, 1000) is the number of losses times the length of the subinterval, 1000−800  200, divided by the length of the total interval, 2000−500  1500, 2 or 15 of the number of losses in the interval. The number of losses in the interval is a multinomial random variable with m  15 + 12 + 8  35 and q  12 35 (estimated). So the variance of the number of losses in (800, 1000) is !2 ! ! 2 12 23 L Var ( Ni )  (35)  0.140190 15 35 35 C/4 Study Manual—17th edition Copyright ©2014 ASM


The probability of a loss in the subinterval is the number of losses in the subinterval divided by the total number of losses, 35, so we divide 0.140190 by 35² to obtain the variance of the probability of a loss:

    0.140190/35² = 0.00011444

23.10. By equation (23.2) with n_x = 34 + 12 + 2 = 48, the variance is

    (48)(502)/550³ = 0.0001448

23.11. Since the estimators are independent, there is no covariance between them, so the variance of the difference of the estimators is the sum of their variances. By equation (23.1), the variance of the first estimator is (0.60)(0.40)/300 = 0.0008 and the variance of the second estimator is (0.5)(0.5)/100 = 0.0025, so the variance of the difference is 0.0025 + 0.0008 = 0.0033. (D)

23.12. We use equation (23.1), but with S instead of S_n, as indicated in the paragraph after the equation. The conditional density of the survival function given survival to time 2 is uniform (since it is the unconditional density, which is uniform, divided by S(2), which is constant), and since it is defined between 2 and 9, the density is 1/7. Therefore the number of lives surviving the third year given that 9 lives begin the year is a binomial variable with m = 9, q = 6/7, and variance 9(6/7)(1/7). p̂2 is this binomial variable divided by 9, so its variance is 1/81 of the variance of the binomial variable, or:

    \widehat{Var}(p̂2 | n2 = 9) = (1/7)(6/7)/9 = 0.01361    (E)

23.13. If the underlying distribution is unknown, we use equation (23.1) with S_n(2). Since S_n(3) = 0.30 and 9 deaths occurred during the third year resulting in 18 survivors, the probability of death in the third year is q3 = 9/27 = 1/3, and q3 = 1 − S_n(3)/S_n(2), so S_n(2) = 0.30/(2/3) = 0.45. Then

    \widehat{Var}(S_n(2)) = (0.45)(0.55)/60 = 0.004125

If the underlying distribution is exponential, we use equation (23.1) with S(2). Under an exponential, S(2) = e^{−2/θ} = e^{2 ln 0.7} = 0.49, so

    Var(S_n(2)) = (0.49)(0.51)/60 = 0.004165

The difference is 0.004165 − 0.004125 = 0.00004. (D)

23.14.

    Var^U(S10(3)) = (1/2)(1/2)/10 = 0.025
    \widehat{Var}(S10(3)) = (0.3)(0.7)/10 = 0.021
    0.025 − 0.021 = 0.004    (E)






23.15. The average remaining lifetime equation is:

    (d4 + 2d5)/(d4 + d5) = 7/6

while the equality of the variances at times 1 and 3 equation is:

    (6)(14 + d4 + d5)/(20 + d4 + d5)³ = (20)(d4 + d5)/(20 + d4 + d5)³

or

    84 + 6(d4 + d5) = 20(d4 + d5)
    d4 + d5 = 6
    d4 + 2d5 = 7
    d5 = 1
    d4 = 5    (C)


Lesson 24

Kaplan-Meier and Nelson-Åalen Estimators

Reading: Loss Models Fourth Edition 12.1

Exams routinely feature questions based on the material in this lesson.

When conducting a study, we often do not have complete data, and therefore cannot use raw empirical estimators. Data may be incomplete in two ways:

1. No information at all is provided for certain ranges of data. Examples would be:

• An insurance policy has a deductible d. If a loss is for an amount d or less, it is not submitted. Any data you have regarding losses is conditional on the loss being greater than d.

• You are measuring amount of time from disablement to recovery, but the disability policy has a six-month elimination period. Your data only includes cases for which disability payments were made. If time from disablement to recovery is less than six months, there is no record in your data.

When data are not provided for a range, the data is said to be truncated. In the two examples just given, the data are left truncated, or truncated from below. It is also possible for data to be truncated from above, or right truncated. An example would be a study on time from disablement to recovery conducted on June 30, 2009 that considers only disabled people who recovered by June 30, 2009. For a group of people disabled on June 30, 2006, this study would truncate the data at time 3, since people who did not recover within 3 years would be excluded from the study.

2. The exact data point is not provided; instead, a range is provided. Examples would be:

• An insurance policy has a policy limit u. If a loss is for an amount greater than u, the only information you have is that the loss is greater than u, but you are not given the exact amount of the loss.

• In a mortality study on life insurance policyholders, some policyholders surrender their policy. For these policyholders, you know that they died (or will die) some time after they surrender their policy, but don't know the exact time of death.
When a range of values rather than an exact value is provided, the data is said to be censored. In the two examples just given, the data are right censored, or censored from above. It is also possible for data to be censored from below, or left censored. An example would be a study of smokers to determine the age at which they started smoking, in which the exact age is not provided for smokers who started below age 18.

We will discuss techniques for constructing data-dependent estimators in the presence of left truncation and right censoring. Data-dependent estimators in the presence of right truncation or left censoring are beyond the scope of the syllabus.¹

¹ However, parametric estimators in the presence of right truncation or left censoring are not excluded from the syllabus. We will study parametric estimators in Lessons 30–33.


[Figure 24.1: Illustration of the Kaplan-Meier product limit estimator. The survival function is initially a. After each event time, it is reduced in the same proportion as the proportion of deaths in the group: S(t) = a before y1; S(t) = a(1 − s1/r1) between y1 and y2, after s1 deaths out of r1 lives; and S(t) = a(1 − s1/r1)(1 − s2/r2) after y2, after s2 deaths out of r2 lives.]

24.1 Kaplan-Meier Product Limit Estimator

The first technique we will study is the Kaplan-Meier product limit estimator. We shall discuss its use for estimating survival distributions for mortality studies, but it may be used just as easily to estimate S(x), and therefore F(x), for loss data.

To motivate it, consider a mortality study starting with n lives. Suppose that right before time y1, we have somehow determined that the survival function S(y1−) is equal to a. Now suppose that there are r1 lives in the study at time y1. Note that r1 may differ from n, since lives may have entered or left the study between inception and time y1. Now suppose that at time y1, s1 lives died. See Figure 24.1 for a schematic. The proportion of deaths at time y1 is s1/r1. Therefore, it is reasonable to conclude that the conditional survival rate past time y1, given survival to time y1, is 1 − s1/r1. Then the survival function at time y1 should be multiplied by this proportion, making it a(1 − s1/r1). The same logic is repeated at the second event time y2 in Figure 24.1, so that the survival function at time y2 is a(1 − s1/r1)(1 − s2/r2).

Suppose we have a study where the event of interest, say death, occurs at times y_j, j ≥ 1. At each time y_j, there are r_j individuals in the study, out of which s_j die. Then the Kaplan-Meier estimator of S(t) sets S_n(t) = 1 for t < y1. Then recursively, at the jth event time y_j, S_n(y_j) is set equal to S_n(y_{j−1})(1 − s_j/r_j), with y0 = 0. For t in between event times, S_n(t) = S_n(y_j), where y_j is the latest event time no later than t. The Kaplan-Meier product limit formula is

Kaplan-Meier Product Limit Estimator

    S_n(t) = ∏_{i=1}^{j−1} (1 − s_i/r_i),    y_{j−1} ≤ t < y_j        (24.1)

r_i is called the risk set at time y_i. It is the set of all individuals subject to the risk being studied at the event time. If entries or withdrawals occur at the same time as a death—for example, if 2 lives enter at time 5, 3 lives leave, and 1 life dies—the lives that leave are in the risk set, while the lives that enter are not.

Example 24A In a mortality study, 10 lives are under observation. One death apiece occurs at times 3, 4, and 7, and two deaths occur at time 11. One withdrawal apiece occurs at times 5 and 10. The study concludes at time 12. Calculate the product limit estimate of the survival function.

Answer: In this example, the event of interest is death. The event times are the times of death: 3, 4, 7, and 11. We label these times y_i. The numbers of deaths at the four event times are 1, 1, 1, and 2 respectively. We label these numbers s_i. That leaves us with calculating the risk set at each event time. At time 3, there are 10 lives under observation. Therefore, the first risk set, the risk set for time 3, is r_1 = 10. At time 4, there are 9 lives under observation. The life that died at time 3 doesn't count. Therefore, r_2 = 9.

C/4 Study Manual—17th edition Copyright ©2014 ASM



Figure 24.2: Graph of y = S_10(x) computed in Example 24A

At time 7, there are 7 lives under observation. The lives that died at times 3 and 4, and the life that withdrew at time 5, don't count. Therefore, r_3 = 7. At time 11, the lives that died at times 3, 4, and 7 aren't in the risk set. Nor are the lives that withdrew at times 5 and 10. That leaves 5 lives in the risk set, so r_4 = 5. We now calculate the survival function S_10(t) for 0 ≤ t ≤ 12 recursively in the following table, using formula (24.1).

j    Time y_j    Risk Set r_j    Deaths s_j    Survival Function S_10(t) for y_j ≤ t < y_{j+1}
1    3           10              1             (10 − 1)/10 = 0.9000
2    4           9               1             S_10(4−) × (9 − 1)/9 = 0.8000
3    7           7               1             S_10(7−) × (7 − 1)/7 = 0.6857
4    11          5               2             S_10(11−) × (5 − 2)/5 = 0.4114

S_10(t) = 1 for t < 3. In the above table, y_5 should be construed to equal 12.
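The recursion in the table is easy to check numerically. Below is a minimal Python sketch of the product-limit recursion using the (event time, risk set, deaths) triples from Example 24A; the variable names are our own illustration, not the manual's notation:

```python
# Kaplan-Meier product-limit estimate from (event time, risk set, deaths).
events = [(3, 10, 1), (4, 9, 1), (7, 7, 1), (11, 5, 2)]

S = {}
s_hat = 1.0                     # S_n(t) = 1 before the first event time
for y, r, s in events:
    s_hat *= 1 - s / r          # multiply in the factor 1 - s_j/r_j
    S[y] = s_hat                # value of S_n on [y_j, y_{j+1})

for y in sorted(S):
    print(y, round(S[y], 4))    # 3 0.9, then 4 0.8, 7 0.6857, 11 0.4114
```

When there is no censoring, the factors telescope and this reduces to the ordinary empirical survival function, a fact several of the exercise solutions below rely on.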



We plot the survival function of Example 24A in Figure 24.2. Note that the estimated survival function is constant between event times, and for this purpose, only the event we are interested in—death—counts, not withdrawals. This means, for example, that whereas S_10(7) = 0.6857, S_10(6.999) = 0.8000, the same as S_10(4). The function is discontinuous. By definition, if X is the survival time random variable, S(x) = Pr(X > x). This means that if you want to calculate Pr(X ≥ x), this is S(x−), which may not be the same as S(x).

Example 24B Assume that you are given the same data as in Example 24A. Using the product limit estimator, estimate:

1. the probability of a death occurring at any time greater than 3 and less than 7.
2. the probability of a death occurring at any time greater than or equal to 3 and less than or equal to 7.

Answer:

1. This is Pr(3 < X < 7) = Pr(X > 3) − Pr(X ≥ 7) = S(3) − S(7−) = 0.9 − 0.8 = 0.1.

2. This is Pr(3 ≤ X ≤ 7) = Pr(X ≥ 3) − Pr(X > 7) = S(3−) − S(7) = 1 − 0.6857 = 0.3143.



24. KAPLAN-MEIER AND NELSON-ÅALEN ESTIMATORS


Example 24A had withdrawals but did not have new entries. New entries are treated as part of the risk set after they enter. The next example illustrates this, and also illustrates another notation system used in the textbook. In this notation system, each individual is listed separately. d_i indicates the entry time, u_i indicates the withdrawal time, and x_i indicates the death time. Only one of u_i and x_i is listed.

Example 24C You are given the following data from a mortality study:

i    d_i    x_i    u_i
1    0      —      7
2    0      5      —
3    2      —      8
4    5      7      —

Estimate the survival function using the product-limit estimator.

Answer: There are two event times, 5 and 7. At time 5, the risk set includes individuals 1, 2, and 3, but not individual 4. New entries tied with the event time do not count. So S_4(5) = 2/3. At time 7, the risk set includes individuals 1, 3, and 4, since withdrawals tied with the event time do count. So S_4(7) = (2/3)(2/3) = 4/9. The following table summarizes the results:

j    y_j    r_j    s_j    S_4(t) for y_j ≤ t < y_{j+1}
1    5      3      1      2/3
2    7      3      1      4/9



In any time interval with no withdrawals or new entries, if you are not interested in the survival function within the interval, you may merge all event times into one event time. The risk set for this event time is the number of individuals at the start of the interval, and the number of deaths is the total number of deaths in the interval. For example, in Example 24A, to calculate S_10(4), rather than multiplying two factors for times 3 and 4, you could group the deaths at 3 and 4 together, treat the risk set at time 4 as 10 and the number of deaths as 2, and calculate S_10(4) = 8/10.

These principles apply equally well to estimating severity with incomplete data.

Example 24D An insurance company sells two types of auto comprehensive coverage. Coverage A has no deductible and a maximum covered loss of 1000. Coverage B has a deductible of 500 and a maximum covered loss of 10,000. The company experiences the following loss sizes:

Coverage A: 300, 500, 700, and three claims above 1000
Coverage B: 700, 900, 1200, 1300, 1400

Let X be the loss size. Calculate the Kaplan-Meier estimate of the probability that a loss will be greater than 1200 but less than 1400, Pr(1200 < X < 1400).

Answer: We treat the loss sizes as if they're times! And the "members" of Coverage B enter at "time" 500. The inability to observe a loss below 500 for Coverage B is analogous to a mortality study in which members enter the study at time 500. The loss sizes above 1000 for Coverage A are treated as withdrawals; they are censored observations, since we know those losses are greater than 1000 but don't know exactly what they are. The Kaplan-Meier table is shown in Table 24.1. We will explain below how we filled it in.

At 300, only coverage A claims are in the risk set; coverage B claims are truncated from below. Thus, the risk set at 300 is 6. Similarly, the risk set at 500 is 5; remember, new entrants are not counted at the


Table 24.1: Survival function calculation for Example 24D

j    Loss Size y_j    Risk Set r_j    Losses s_j    Survival Function S_11(t) for y_j ≤ t < y_{j+1}
1    300              6               1             5/6
2    500              5               1             2/3
3    700              9               2             14/27
4    900              7               1             4/9
5    1200             3               1             8/27
6    1300             2               1             4/27
7    1400             1               1             0

time they enter, only after the time, so though the deductible is 500, coverage B losses do not count even at 500. So we have that S_11(500) = (5/6)(4/5) = 2/3. At 700, 4 claims from coverage A (the one for 700 and the 3 censored ones) and all 5 claims from coverage B are in the risk set, making the risk set 9. Similarly, at 900, the risk set is 7. So S_11(900) = (2/3)(7/9)(6/7) = 4/9. At 1200, only the 3 claims 1200 and above on coverage B are in the risk set. So S_11(1200) = (4/9)(2/3) = 8/27. Similarly, S_11(1300) = (8/27)(1/2) = 4/27.

The answer to the question is Pr_11(X > 1200) − Pr_11(X ≥ 1400) = S_11(1200) − S_11(1400−). S_11(1200) = 8/27. But S_11(1400−) is not the same as S_11(1400). In fact, S_11(1400−) = S_11(1300) = 4/27, while S_11(1400) = 0. The final answer is then

Pr_11(1200 < X < 1400) = 8/27 − 4/27 = 4/27.
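The risk-set bookkeeping in this example (entries at an event time excluded; deaths and withdrawals at that time included) can be scripted. The sketch below uses a record layout of our own choosing, with left truncation at the deductible treated as late entry and censoring at the policy limit treated as withdrawal:

```python
# Each record is (entry time, exit time, died?); the "times" are loss sizes.
records = (
    [(0, 300, True), (0, 500, True), (0, 700, True)]          # coverage A losses
    + [(0, 1000, False)] * 3                                  # A, censored at 1000
    + [(500, x, True) for x in (700, 900, 1200, 1300, 1400)]  # coverage B
)

S = {}
s_hat = 1.0
for y in sorted({t for _, t, died in records if died}):
    r = sum(1 for d, t, _ in records if d < y <= t)   # risk set at y
    s = sum(1 for _, t, died in records if died and t == y)
    s_hat *= 1 - s / r
    S[y] = s_hat

print(round(S[1200], 4), round(S[1300], 4))  # 0.2963 0.1481 (= 8/27 and 4/27)
```

The condition `d < y <= t` encodes exactly the tie-breaking rules of Table 24.1: an entrant at 500 is not in the risk set at 500, while the claim settled at 700 is in the risk set at 700.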



If all lives remaining in the study die at the last event time of the study, then S can be estimated as 0 past this time. It is less clear what to do if the last observation is censored. The two extreme possibilities are

1. to treat it as if it were a death, so that S(t) = 0 for t ≥ y_k, where y_k is the last observation time of the study.
2. to treat it as if it lives forever, so that S(t) = S(y_k) for t ≥ y_k.

A third option is to use an exponential whose value is equal to S(y_k) at time y_k.

Example 24E In Example 24A, you are to use the Kaplan-Meier estimator, with an exponential to extrapolate past the end of the study. Determine S_10(15).

Answer: S_10(12) = S_10(11) = 0.4114, as determined above. We extend exponentially from the end of the study at time 12. In other words, we want e^{−12/θ} = 0.4114, or θ = −12/ln 0.4114. Then

$$S_{10}(15) = \exp\left(\frac{15}{12}\ln 0.4114\right) = 0.4114^{15/12} = 0.3295$$



Notice in the above example that using an exponential to go from year 12 to year 15 is equivalent to raising the year 12 value to the 15/12 power. In general, if u is the ending time of the study, then exponential extrapolation sets S_n(t) = S_n(u)^{t/u} for t > u.

If a study has no members before a certain time—in other words, the study starts out with 0 individuals and the first new entries are at time y_0—then the estimated survival function is conditional on the estimated variable being greater than y_0. There is simply no estimate for values less than y_0. For example, if Example 24D is changed so that Coverage A has a deductible of 250, then the estimates are
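The extrapolation rule S_n(t) = S_n(u)^{t/u} is a one-liner; here is a small sketch checking it against the values of Example 24E (the function name is our own):

```python
# Exponential tail: match e**(-t/theta) to S(u) at t = u, i.e. S(t) = S(u)**(t/u).
def extrapolate(S_u, u, t):
    return S_u ** (t / u)

print(round(extrapolate(0.4114, 12, 15), 4))  # 0.3295
```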



Figure 24.3: Illustration of the Nelson-Åalen estimator of cumulative hazard function. The cumulative hazard function is initially b . After each event time, it is incremented by the proportion of deaths in the group.

for S_11(x | X > 250), and Pr_11(1200 < X < 1400 | X > 250) = 4/27. It is not possible to estimate the unconditional survival function in this case.

Note that the letter k is used to indicate the number of unique event times. There is a released exam question in which you were expected to know that this is the meaning of k.


Quiz 24-1 You are given the following information regarding six individuals in a study:

d_j    u_j    x_j
0      5      —
0      4      —
0      —      3
1      3      —
2      —      4
3      5      —

Calculate the Kaplan-Meier product-limit estimate of S(4.5).

Now we will discuss another estimator for survival time.

24.2 Nelson-Åalen Estimator

The Nelson-Åalen estimator estimates the cumulative hazard function. The idea is simple. Suppose the cumulative hazard rate before time y_1 is known to be b. If at that time s_1 lives out of a risk set of r_1 die, that means that the hazard at that time y_1 is s_1/r_1. Therefore the cumulative hazard function is increased by that amount and becomes b + s_1/r_1. See Figure 24.3. The Nelson-Åalen estimator sets Ĥ(0) = 0 and then at each time y_j at which an event occurs, Ĥ(y_j) = Ĥ(y_{j−1}) + s_j/r_j. The formula is:

Nelson-Åalen Estimator

$$\hat H(t) = \sum_{i=1}^{j-1} \frac{s_i}{r_i}, \qquad y_{j-1} \le t < y_j \tag{24.2}$$

Example 24F In a mortality study on 98 lives, you are given that

(i) 1 death occurs at time 5
(ii) 2 lives withdraw at time 5
(iii) 3 lives enter the study at time 5
(iv) 1 death occurs at time 8

Calculate the Nelson-Åalen estimate of H(8).


Table 24.2: Summary of Formulas in this Lesson

Kaplan-Meier Product Limit Estimator:

$$\hat S(t) = \prod_{i=1}^{j-1}\left(1 - \frac{s_i}{r_i}\right), \qquad y_{j-1} \le t < y_j \tag{24.1}$$

Nelson-Åalen Estimator:

$$\hat H(t) = \sum_{i=1}^{j-1} \frac{s_i}{r_i}, \qquad y_{j-1} \le t < y_j \tag{24.2}$$

Exponential extrapolation:

$$\hat S(t) = \hat S(t_0)^{t/t_0}, \qquad t \ge t_0$$

Answer: The table of risk sets and deaths is

j    Time y_j    Risk Set r_j    Deaths s_j    NA estimate Ĥ(y_j)
1    5           98              1             1/98
2    8           98              1             1/98 + 1/98

At time 5, the original 98 lives count, but we don't remove the 2 withdrawals or count the 3 new entrants. At time 8, we have the original 98 lives minus 2 withdrawals minus 1 death at time 5 plus 3 new entrants, or 98 − 2 − 1 + 3 = 98 in the risk set.

$$\hat H(8) = \frac{1}{98} + \frac{1}{98} = \frac{1}{49}$$



To estimate the survival function using Nelson-Åalen, exponentiate the negative of the Nelson-Åalen estimate: Ŝ(x) = e^{−Ĥ(x)}. In the above example, the estimate would be Ŝ(8) = e^{−1/49} = 0.9798. This will always be higher than the Kaplan-Meier estimate, except when Ĥ(x) = 0 (and then both estimates of S will be 1). In the above example, the Kaplan-Meier estimate would be (97/98)² = 0.9797.

Everything we said about extrapolating past the last time, or conditioning when there are no observations before a certain time, applies equally well to Ŝ(t) estimated using Nelson-Åalen.
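The accumulate-and-exponentiate pattern is short enough to sketch directly; the triples below hard-code the Example 24F table purely for illustration:

```python
import math

# Nelson-Aalen: H accumulates s_j / r_j at each event time; then S = exp(-H).
events = [(5, 98, 1), (8, 98, 1)]   # (event time, risk set, deaths)

H = 0.0
for y, r, s in events:
    H += s / r

print(round(H, 6), round(math.exp(-H), 4))  # 0.020408 0.9798
```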


Quiz 24-2 In a mortality study on 10 lives, 2 individuals die at time 4 and 1 individual at time 6. The others survive to time 10. Using the Nelson-Åalen estimator, estimate the probability of survival to time 10.


Calculator Tip

Usually it is easy enough to calculate the Kaplan-Meier product limit estimator by directly multiplying 1 − s_j/r_j. If you need to calculate several functions of s_j and r_j at once, such as both the Kaplan-Meier and the Nelson-Åalen estimator, it may be faster to enter s_j/r_j into a column of the TI-30XS/B Multiview's data table, and the function ln(1 − L1) into another column. The Kaplan-Meier estimator is a product, whereas the statistics registers only include sums, so it is necessary to log each factor, and then exponentiate the sum in the statistics register. Also, the sum is always of the entire column, so you must not have extraneous rows. If you need to calculate the estimator at two times, enter the rows needed for the earlier time, calculate the estimate, then add the additional rows for the second time.

Example 24G Seven times of death were observed: 5, 6, 6, 8, 10, 12, 15. In addition, there was one censored observation apiece at times 6, 7, and 11. Calculate the absolute difference between the product-limit and Nelson-Åalen estimates of S(10).

Answer: Only times up to 10 are relevant; the rest should be omitted. The r_i's and s_i's are

y_i    r_i    s_i
5      10     1
6      9      2
8      5      1
10     4      1

Here is the sequence of steps on the calculator:

Clear table: [data] [data] 4
Enter s_i/r_i in column L1: 1 ÷ 10, 2 ÷ 9, 1 ÷ 5, 1 ÷ 4 [enter]
Enter the formula for Kaplan-Meier in column L2: ln(1 − [data] 1) [enter]
Calculate statistics registers: [2nd] [stat] 2 (select L1 as the first variable and L2 as the second) [enter]
Clear display: [clear] [clear]


Extract sum x (statistic 8) and sum y (statistic 10, labeled A) from the table, and calculate the difference of the estimates:

[2nd] [e^x] (−) [2nd] [stat] 3 8 − [2nd] [e^x] [2nd] [stat] 3 (press 9 times to get to A) [enter]

The display shows e^{−Σx} − e^{Σy} = 0.041985293.

The answer is 0.041985293. Notice that the negative of the Nelson-Åalen estimator was exponentiated, but no negative sign is used for the sum of the logs of the factors of the product-limit estimator.
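The same comparison can be scripted rather than keyed into the calculator. A short Python check of Example 24G (our own sketch, using the risk-set table above):

```python
import math

# Event times up to 10 for Example 24G: (risk set, deaths) pairs.
events = [(10, 1), (9, 2), (5, 1), (4, 1)]

S_km, H = 1.0, 0.0
for r, s in events:
    S_km *= 1 - s / r       # product-limit factor
    H += s / r              # Nelson-Aalen increment
S_na = math.exp(-H)

print(round(abs(S_km - S_na), 6))  # 0.041985
```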

Exercises

24.1. [160-F86:2] The results of using the product-limit (Kaplan-Meier) estimator of S(x) for a certain data set are:

$$\hat S(x) = \begin{cases} 1.0, & 0 \le x < \cdots \\ \dfrac{49}{50}, \\ \dfrac{1{,}911}{2{,}000}, \\ \dfrac{36{,}309}{40{,}000}, \end{cases}$$

and P̂r(X > 100) = e^{−0.509524} = 0.6008.

(Kaplan-Meier) estimator of S ( x ) for a certain 0≤x 100)  e −0.509524  0.6008 . and Pr

24.26. We now have 10 observations plus the censored observation of 100, so we calculate the cumulative hazard rate using risk sets of 11 at 74, 10 at 89, and 9 at 95. The risk sets at 102, 106, and 122 are the same as in the previous exercise, so we’ll add the sum computed there, 0.509524, to the sum of the quotients from the lowest three observations. 1 1 1 Hˆ (125)  + + + 0.509524  0.811544 11 10 9

D ( X > 125)  e −0.811544  0.4442 . and Pr 24.27. The Nelson-Åalen estimate of Hˆ (12) is 2 1 1 2 Hˆ (12)  + + +  0.65 15 12 10 6 Then Sˆ (12)  e −0.65  0.5220 . (B) 24.28. Since there is no censoring, we have yi 1 2 3 5

ri 50 49 48 46

si 1 1 2 2

Hˆ ( y i ) 1/50  0.02 0.02 + 1/49  0.04041 0.04041 + 2/48  0.08207 0.08207 + 2/46  0.12555

EXERCISE SOLUTIONS FOR LESSON 24


24.29. To go from time 3.75 to time 4, since only one agent resigned in between, we multiply Ŝ(3.75) by (r_i − s_i)/r_i, where s_i = 1 for the one agent who resigned and r_i is the risk set at the time that agent resigned. Since 11 agents were employed longer, the risk set is r_i = 11 + 1 = 12 (counting the agent who resigned and the 11 who were employed longer). If we let y_i be the time of resignation, since nothing happens between y_i and 4,

$$\hat S(4) = \hat S(y_i) = 0.25\left(\frac{11}{12}\right) = 0.2292$$

The fact that 2 agents were employed for 6 years is extraneous.

24.30. The product-limit estimator up to time 5, taking the 2 censored observations at 4 and 6 into account, is:

y_i    r_i    s_i    Ŝ(y_i)
1      9      1      8/9
3      8      2      6/9
5      5      1      (6/9)(4/5) = 24/45

$$\hat{\Pr}(3 \le T \le 5) = \hat S(3^-) - \hat S(5) = \frac{8}{9} - \frac{24}{45} = \frac{16}{45} = 0.3556 \quad \text{(C)}$$

24.31. You can calculate all five possibilities, but let's reason it out. If the lapse occurred at time 5, 4 claims occurred; otherwise, only 3 claims occurred, so one would expect the answer to be 5. (E)

24.32. Since there is no censoring (in every case, r_{i+1} = r_i − s_i), the products telescope, and the product-limit estimator becomes the empirical estimator.

$$S_T(4) = \frac{(112 - 22) + (45 - 10)}{200 + 100} = \frac{125}{300} = 0.417$$
$$S_B(4) = \frac{45 - 10}{100} = 0.35$$
$$S_T(4) - S_B(4) = 0.067 \quad \text{(B)}$$

24.33. We have

$$\frac{1}{n} + \frac{1}{n-1} = \frac{23}{132}$$

which is a quadratic, but since n must be an integer, it is easier to approximate the equation as

$$\frac{2}{n - 1/2} \approx \frac{23}{132}, \qquad n - \frac{1}{2} \approx \frac{264}{23} = 11.48$$

so n = 12. Then

$$\frac{23}{132} + \frac{1}{10} + \frac{1}{9} = 0.3854 \quad \text{(C)}$$

24.34. Through time 5 there is no censoring, so Ŝ(5) = 6/10 (6 survivors out of 10 original lives). Then Ŝ(7) = (6/10)(3/5) (three survivors from 5 lives past 5), so Ŝ(7) = 0.36. There are no further claims between 7 and 8, so the answer is 0.36. (D)

24.35.

$$\hat H(n) = -\ln\left(1 - \hat F(n)\right) = 0.78$$
$$\sum_{i=1}^{n} \frac{i}{100} = 0.78$$
$$\frac{n(n+1)}{2} = 78$$

This quadratic can be solved directly, or by trial and error; approximate the equation with (n + 0.5)²/2 = 78, making n + 0.5 around 12.5, and we verify that n = 12 works. (E)

The Kaplan-Meier estimate is then

yj

rj

sj

Sˆ ( y j )

0.9 1.5

7 6

1 1

6/7 5/7

6 5 7 6



5 7

 0.7143 , or (E).

24.37. We must calculate n. Either you observe that the denominator 380 has divisors 19 and 20, or you estimate 2 39 ≈ n − 0.5 380

1 1 39 and you conclude that n  20, which you verify by calculating 20 + 19  380 . The Kaplan-Meier estimate is the empirical complete data estimate since no one is censored; after 9 deaths, the survival function is (20 − 9) /20  0.55 . (A)

Quiz Solutions 24-1. The risk set is 5 at time 3, since the entry at 3 doesn’t count. The risk set is 4 at time 4, after removing the third and fourth individuals, who left at time 3. The estimate of S (4.5) is (4/5)(3/4)  0.6 . 24-2.

The risk sets are 10 at time 4 and 8 at time 6. Therefore 2 1 Hˆ (10)  +  0.325 10 8 Sˆ (10)  e −0.325  0.7225

Lesson 25

Estimation of Related Quantities

Reading: Loss Models Fourth Edition 11, 12.1–12.2

If the empirical distribution, whether based on complete or incomplete data, is used as the model, questions for the distribution are answered based on it. It is treated as a genuine, full-fledged distribution, even though it's discrete. You can use the methods we learned in the first part of this course to calculate limited moments and similar items.

25.1 Moments

25.1.1 Complete individual data

When the distribution is estimated from complete individual data, the distribution mean is the same as the sample mean. Percentiles are not well-defined for a discrete distribution, so smoothing is desirable; we'll discuss smoothed empirical percentiles in Lesson 31. The variance of the empirical distribution is

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2}{n}.$$

When using the empirical distribution as the model, do not divide by n − 1 when calculating the variance! This is not an unbiased estimator of some underlying variance—it is the variance, since the empirical distribution is the model. Thus, using the empirical distribution as the model results in a biased estimate of the variance.
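A quick numerical illustration of the divide-by-n point, reusing the inland marine losses that appear later in Example 25D (our choice of data, purely for illustration):

```python
# Empirical-distribution variance divides by n, not n - 1.
xs = [2, 2, 3, 4, 6, 8, 9, 12]
n = len(xs)
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n   # the model's own variance
print(mean, var)  # 5.75 11.6875
```

Dividing by n − 1 instead would give the usual sample variance, which is not what is wanted when the empirical distribution itself is the model.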

25.1.2 Grouped data

When the distribution is estimated from grouped data, moments are calculated using the ogive or histogram. The mean can be calculated as the average of the averages—sum up the averages of the groups, weighted by the probability of being in the group. Higher moments, however, may require integration. The following example illustrates this.

Example 25A You have the following data for losses on an automobile liability coverage:

Loss Size       Number of Losses
0– 1000         23
1000– 2000      16
2000– 5000      6
5000–10,000     5

A policy limit of 8000 is imposed. Estimate the average claim payment and the variance of claim payments, taking the policy limit into account.

Answer: Remember that the value of the histogram f̂(x) is the number of observations in the group divided by the total number of observations and by the length of the group's interval. The histogram is constant in each interval.


In this study, there are a total of 50 observations. We therefore calculate:

Loss Size       Size of Interval    Number of Losses    Histogram f̂(x)
0– 1000         1000                23                  23/((50)(1000)) = 0.00046
1000– 2000      1000                16                  16/((50)(1000)) = 0.00032
2000– 5000      3000                6                   6/((50)(3000)) = 0.00004
5000–10,000     5000                5                   5/((50)(5000)) = 0.00002

Figure 25.1 graphs the histogram.


Figure 25.1: Histogram for Example 25A

The mean can be calculated using the shortcut mentioned before the example, namely as an average of averages. In each interval below the interval containing the policy limit, the average claim payment is the midpoint. However, we must split the interval (5000, 10,000) into two subintervals, (5000, 8000) and (8000, 10,000), because of the policy limit. In the interval (5000, 8000), the average claim payment is 6500, but in the interval (8000, 10,000), the average claim payment is 8000. The weight given to each interval is the probability of being in the interval. For the first three intervals, the probabilities are the number of claims in the interval divided by the total number of claims in the study. For example, in the interval (2000, 5000), the probability of a claim being in the interval is 6/50 = 0.12. The probability of a loss between 5000 and 8000 is determined from the ogive, F_50(8000) − F_50(5000). However, the ogive is a line, so the probability of a loss between 5000 and 8000 is the probability of a loss in the loss size interval (5000, 10,000), which is 0.1, times the proportion of the interval consisting of (5000, 8000), which is (8000 − 5000)/(10,000 − 5000) = 0.6. Thus the probability that a loss is between 5000 and 8000 is (0.1)(0.6) = 0.06. By similar logic, the probability that a loss is in the interval (8000, 10,000) is (0.1)(0.4) = 0.04. We therefore calculate the average claim payment as follows:


Interval        Probability of loss in interval    Average claim payment in interval    Product
0– 1000         0.46                               500                                  230
1000– 2000      0.32                               1500                                 480
2000– 5000      0.12                               3500                                 420
5000– 8000      0.06                               6500                                 390
8000–10,000     0.04                               8000                                 320
Total                                                                                   1840

The average claim payment is 1840. You could also calculate this using integrals, the way we will calculate the second moment, but that is harder.

To calculate the variance, we first calculate the second moment. This means integrating x² f_50(x) from 0 to 8000, as in this interval the payment is x, and integrating 8000² f_50(x) from 8000 to 10,000, as in this interval the payment is 8000. Since f_50 is piecewise constant, we integrate it interval by interval, using formula (22.1) to calculate f_50 on each interval.

$$\begin{aligned}
E\left[(X \wedge 8000)^2\right] &= \int_0^{1000} 0.00046x^2\,dx + \int_{1000}^{2000} 0.00032x^2\,dx + \int_{2000}^{5000} 0.00004x^2\,dx \\
&\quad + \int_{5000}^{8000} 0.00002x^2\,dx + \int_{8000}^{10{,}000} 0.00002(8000^2)\,dx \\
&= \tfrac{1}{3}\Big(0.00046(1000^3 - 0^3) + 0.00032(2000^3 - 1000^3) + 0.00004(5000^3 - 2000^3) \\
&\quad + 0.00002(8000^3 - 5000^3)\Big) + 0.00002(8000^2)(10{,}000 - 8000) \\
&= \tfrac{1}{3}(460{,}000 + 2{,}240{,}000 + 4{,}680{,}000 + 7{,}740{,}000) + 2{,}560{,}000 \\
&= 7{,}600{,}000
\end{aligned}$$

The variance is then 7,600,000 − 1840² = 4,214,400.

An alternative to integrating is to use equation (2.4) to calculate the second moment of the uniform distribution in each interval, and weight the results by the probabilities of losses in the interval. That formula says that for a uniform distribution on [d, u], the second moment is (1/3)(d² + du + u²). In this example, the calculation would be:

$$\begin{aligned}
E\left[(X \wedge 8000)^2\right] &= \tfrac{1}{3}\Big(0.46(1000^2) + 0.32\big(1000^2 + (1000)(2000) + 2000^2\big) \\
&\quad + 0.12\big(2000^2 + (2000)(5000) + 5000^2\big) + 0.06\big(5000^2 + (5000)(8000) + 8000^2\big)\Big) \\
&\quad + 0.04(8000^2)
\end{aligned}$$

An alternative method for calculating variance is to use the conditional variance formula. This avoids integration, since the conditional distribution of claims within each interval is uniform, and the variance of a uniform is the range squared divided by 12. Let I be the variable for the interval. Then

$$\operatorname{Var}(X \wedge 8000) = E\left[\operatorname{Var}(X \wedge 8000 \mid I)\right] + \operatorname{Var}\left(E[X \wedge 8000 \mid I]\right)$$
$$= E\left[\frac{1000^2}{12}, \frac{1000^2}{12}, \frac{3000^2}{12}, \frac{3000^2}{12}, 0\right] + \operatorname{Var}(500, 1500, 3500, 6500, 8000)$$

where each variance is based on the length of the interval (1000, 1000, 3000, 3000, and 0 for the five intervals 0–1000, 1000–2000, 2000–5000, 5000–8000, and the point 8000 respectively) squared and divided by 12, and each expected value is the midpoint of the interval.


The expected value of the five variances is weighted by the probabilities of the intervals, each of which is the number of losses divided by 50, so

$$E\left[\operatorname{Var}(X \wedge 8000 \mid I)\right] = \frac{23(1000^2/12) + 16(1000^2/12) + 6(3000^2/12) + 3(3000^2/12)}{50} = 200{,}000$$

The overall expected value is 1840, as computed above. The second moment of the five interval expected values is

$$\frac{23(500^2) + 16(1500^2) + 6(3500^2) + 3(6500^2) + 2(8000^2)}{50} = 7{,}400{,}000$$

so the variance of the expected values is 7,400,000 − 1840² = 4,014,400. The variance of claim payments is then 200,000 + 4,014,400 = 4,214,400.
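The whole Example 25A calculation, mean and variance of payments under the 8000 limit, can be verified with a short script. This is a sketch under the same assumption the example makes, namely that losses are uniform within each interval:

```python
# Mean and variance of X ^ 8000 for the grouped data of Example 25A,
# assuming losses are uniform within each interval.
groups = [(0, 1000, 23), (1000, 2000, 16), (2000, 5000, 6), (5000, 10000, 5)]
limit, n = 8000, 50

mean = second = 0.0
for lo, hi, cnt in groups:
    p = cnt / n
    if hi <= limit:                     # payment is uniform on (lo, hi)
        mean += p * (lo + hi) / 2
        second += p * (lo**2 + lo * hi + hi**2) / 3   # uniform second moment
    else:                               # split the interval at the limit
        p_below = p * (limit - lo) / (hi - lo)   # uniform piece on (lo, limit)
        p_at = p - p_below                       # mass point at the limit
        mean += p_below * (lo + limit) / 2 + p_at * limit
        second += p_below * (lo**2 + lo * limit + limit**2) / 3 + p_at * limit**2

print(round(mean, 6), round(second - mean**2, 4))  # mean 1840, variance 4,214,400
```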


Quiz 25-1 Use the same data as in Example 25A. A policy limit of 1000 is imposed. Estimate the variance of claim payments assuming uniform distribution of loss sizes within each interval.

25.1.3 Incomplete data

When data are incomplete due to censoring or truncation, the product limit estimator or the Nelson-Åalen estimator is used. You can then estimate S(x). To obtain the expected values or limited expected values, you can use the definition, namely Σ x p_x for the expected value, where p_x is Pr(X = x) as estimated, and a discrete counterpart of equation (5.3) for limited moments, or you can use formulas (5.2) or (5.6):

$$E[X] = \int_0^\infty S(x)\,dx$$
$$E[X \wedge d] = \int_0^d S(x)\,dx$$

Ŝ(x), whether it is a Kaplan-Meier or a Nelson-Åalen estimate, is a step function, so integrating it can be done by summing up the areas under the horizontal lines in the graph of the function. In other words (but please don't memorize this formula, just understand it):

$$\int_0^\infty \hat S(x)\,dx = \sum_{j=0}^{\infty} \hat S(y_j)(y_{j+1} - y_j)$$

where y_0 = 0 and the other y_j's are event times.

Example 25B [160-F86:6] You are given:

Individual    Age at Entry    Age at Withdrawal    Age at Death
A             0               6                    –
B             0               27                   –
C             0               –                    42
D             0               –                    42
E             5               –                    60
F             10              –                    24
G             15              50                   –
H             20              23                   –


Using the product-limit estimator of S(x), determine the expected future lifetime at birth.

(A) 46.0  (B) 46.5  (C) 47.0  (D) 47.5  (E) 48.0

Answer: Event times (deaths) are 24, 42, and 60. We have S(t) = 1 for t < 24 and:

y_j    r_j    s_j    S_8(x), y_j ≤ x < y_{j+1}
24     6      1      5/6
42     4      2      5/12
60     1      1      0

We therefore have

Pr(X = 24) = 1 − 5/6 = 1/6
Pr(X = 42) = 5/6 − 5/12 = 5/12
Pr(X = 60) = 5/12

and expected survival time is

E[X] = (1/6)(24) + (5/12)(42) + (5/12)(60) = 46.5

Alternatively, you could calculate the integral of the survival function. Figure 25.2 graphs S_8(x).

Figure 25.2: Graph of estimated survival function for Example 25B

The expected value, the integral of S_8(x), is the sum of the areas of the three rectangles under the graph of S_8(x):

E[X] = 24(1) + 18(5/6) + 18(5/12) = 46.5   (B)
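The area-under-the-survival-curve computation generalizes to any product-limit fit; a minimal sketch for Example 25B (variable names are ours):

```python
# E[X] = sum of S(y_j) * (y_{j+1} - y_j) over the steps of the fitted S.
times = [0, 24, 42, 60]       # y_0 = 0 plus the death times
S = [1, 5 / 6, 5 / 12]        # S_8(x) on each interval [y_j, y_{j+1})
mean = sum(S[j] * (times[j + 1] - times[j]) for j in range(len(S)))
print(round(mean, 4))  # 46.5
```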





Quiz 25-2 Using the data of Example 25B, determine the estimated variance of future lifetime at birth.

Example 25C In a mortality study, 10 lives are under observation. One death apiece occurs at times 3, 4, and 7, and two deaths occur at time 11. Withdrawals occur at times 5 and 10. The study concludes at time 12. (This is the same data as in Example 24A.) You estimate the survival function using the product limit estimator, and extrapolate past time 12 using an exponential. Estimate expected survival time.

Answer: We calculated the survival function in the answer to Example 24A on page 394. In Example 24E on page 397 we determined that the extrapolated function past time 12 is 0.4114^{t/12} at time t > 12. The graph of the estimated survival function is shown in Figure 25.3.

Figure 25.3: Graph of y = S_10(x) of Examples 24E and 25C

The expected value of a nonnegative

random variable is equal to the integral of the survival function, or the shaded area under the graph of the survival function. This area consists of rectangles A, B, C, D, and E, plus area F. The areas of the rectangles are (base times height):

Area(A) = (3 − 0)(1) = 3
Area(B) = (4 − 3)(0.9) = 0.9
Area(C) = (7 − 4)(0.8) = 2.4
Area(D) = (11 − 7)(0.6857) = 2.7429
Area(E) = (12 − 11)(0.4114) = 0.4114

The area of F is the integral of 0.4114^{t/12} from 12 to ∞:

$$\text{Area(F)} = \int_{12}^{\infty} 0.4114^{t/12}\,dt = \left[\frac{12\,(0.4114^{t/12})}{\ln 0.4114}\right]_{12}^{\infty} = 5.5583$$

An alternative method for evaluating the area of F is to use the fact that the distribution is exponential, and the integral we wish to calculate equals E[(X − 12)_+]. Since the survival function of the exponential (as seen in the integrand) is 0.4114^{t/12} = e^{t ln 0.4114/12}, it follows that the mean of the exponential is −12/ln(0.4114) =


13.51063. Then the conditional mean of survival time, given that it is greater than 12, is equal to the mean of the exponential distribution, since an exponential has no memory. Thus, the conditional mean of survival time given that it is greater than 12 is 13.51063. Multiplying this mean by the probability of the variable being greater than 12, or 0.4114, we get that the area of F is 0.4114(13.51063) = 5.5583. Then the expected survival time is

E[X] = 3 + 0.9 + 2.4 + 2.7429 + 0.4114 + 5.5583 = 15.0126
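Putting the rectangle areas and the exponential tail together for Example 25C can be checked in a few lines. This sketch uses the same rounded survival values as the manual; carrying exact fractions instead would shift the last decimal slightly:

```python
import math

# Rectangle areas up to time 12, then the exponential tail past 12:
# the tail integrates 0.4114**(t/12), whose mean is theta = -12 / ln(0.4114).
times = [0, 3, 4, 7, 11, 12]
S = [1, 0.9, 0.8, 0.6857, 0.4114]    # S_10 on each interval [t_j, t_{j+1})
area = sum(S[j] * (times[j + 1] - times[j]) for j in range(len(S)))
theta = -12 / math.log(0.4114)
area += 0.4114 * theta               # tail area = Pr(X > 12) * theta
print(round(area, 2))  # 15.01
```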

25.2 Range probabilities

The probability that x is in the range (a, b] is F(b) − F(a). Since the empirical distribution is discrete, however, you must be careful to include or exclude the boundaries depending upon whether you're interested in "less than" (or "greater than") or "less than or equal" (or "greater than or equal"). The following two simple examples demonstrate this.

Example 25D On an inland marine coverage, you experience loss sizes (in millions) of 2, 2, 3, 4, 6, 8, 9, 12. You use the empirical distribution based on this data as the model for loss size. Let X be the loss size. Determine Pr(3 ≤ X ≤ 6).

Answer: F(6) = 0.625 and F(3−) = 0.25 (there are only 2 claims below 3), so the answer is 0.625 − 0.25 = 0.375. Another way to see this is to note that exactly three of the loss sizes are in the range [3, 6], and 3/8 = 0.375.

Example 25E For a dental coverage, the amount of time to handle 8 claims (in days) was 15, 22, 27, 35, 40, 40, 45, 50. In addition, two claims were withdrawn at times 30 and 35 without being settled. Let X be the amount of time to handle a claim. Use the Nelson-Åalen estimator to model the amount of time for handling claims. Determine Pr(27 < X < 45).

Answer: Pr(X > 27) = S(27), so we need S(27). Pr(X < 45) = 1 − S(45−), so we need S(45−), which here is the same as S(40).

    Ĥ(27) = 1/10 + 1/9 + 1/8 = 0.33611
    Ĥ(45−) = 0.33611 + 1/6 + 2/4 = 1.00278
    Pr(27 < X < 45) = e^(−0.33611) − e^(−1.00278) = 0.34768
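The Nelson-Åalen computation in Example 25E can be reproduced programmatically. A small sketch, with the risk sets hardcoded from the data (the helper name is ours):

```python
import math

# (death time, deaths, risk set) for the dental-claim data of Example 25E.
# Withdrawals at 30 and 35 shrink the risk set before the next settlement.
increments = [(15, 1, 10), (22, 1, 9), (27, 1, 8), (35, 1, 6), (40, 2, 4)]

def H_hat(t):
    """Nelson-Aalen cumulative hazard estimate at time t."""
    return sum(s / r for (y, s, r) in increments if y <= t)

# Pr(27 < X < 45) = S(27) - S(45-) = exp(-H(27)) - exp(-H(40))
p = math.exp(-H_hat(27)) - math.exp(-H_hat(40))
print(round(H_hat(27), 5), round(H_hat(40), 5), round(p, 5))
```

This reproduces the example's 0.33611, 1.00278, and 0.34768.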

Quiz 25-3 In a study of survival time, the study begins with 10 lives. One death apiece occurs at times 4, 7, 9, 12, 20. One withdrawal apiece occurs at times 6 and 10. All other lives survive to time 50. Estimate the probability of death at exact time 7 using the product limit estimator.

25.3 Deductibles and limits

We use empirical limited expected values to calculate the average payment per loss and per payment in the presence of limits and deductibles. For a deductible of d and a maximum covered claim of u, the average payment per loss is

    E[X ∧ u] − E[X ∧ d]

25. ESTIMATION OF RELATED QUANTITIES


and the average payment per payment is

    (E[X ∧ u] − E[X ∧ d]) / (1 − F(d)).

If the data are grouped, these expected values are computed from the ogive. If an entire group is included in the range between the deductible and the maximum covered claim, this is equivalent to placing all claims in the group at the midpoint. Calculating variance is more cumbersome. Example 25A showed how to handle policy limits.

Example 25F You have the following data for losses on an automobile collision coverage:

    Claim Size      Number of Claims
    0–1000               23
    1000–2000            16
    2000–5000             6
    5000–10,000           5

You are pricing another coverage that has a deductible of 500. Estimate the average payment per payment on this coverage.

Answer: Split the first interval (0, 1000) into two subintervals, (0, 500] and (500, 1000). The average payment in the first interval is 0, and in the second interval the average payment is 250. The probability that a loss will be in the second interval is 0.5(23)/50 = 0.23. Similarly, the probabilities of the other three intervals are 16/50 = 0.32, 6/50 = 0.12, and 5/50 = 0.1. The average payment per loss in those intervals is the midpoint minus 500. So the overall average payment per loss is

    E[(X − 500)₊] = 0.23(250) + 0.32(1000) + 0.12(3000) + 0.1(7000) = 1437.5

Also, Pr(X ≤ 500) = 0.5(23)/50 = 0.23, so the average payment per payment is 1437.5/(1 − 0.23) = 1866.88.
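The ogive arithmetic of Example 25F is easy to verify in code. A minimal sketch, assuming losses are uniform within each interval exactly as the example does (all names are ours):

```python
# Grouped losses from Example 25F.
groups = [((0, 1000), 23), ((1000, 2000), 16), ((2000, 5000), 6), ((5000, 10000), 5)]
n = sum(c for _, c in groups)
d = 500  # deductible

# Average payment per loss: split each interval at the deductible;
# losses in a (sub)interval average its midpoint.
per_loss = 0.0
for (lo, hi), count in groups:
    if hi <= d:
        continue
    lo_eff = max(lo, d)
    prob = (count / n) * (hi - lo_eff) / (hi - lo)   # uniform within interval
    midpoint = (lo_eff + hi) / 2
    per_loss += prob * (midpoint - d)

# F(d) from the ogive, then average payment per payment.
F_d = sum((count / n) * max(0.0, min(1.0, (d - lo) / (hi - lo)))
          for (lo, hi), count in groups)
per_payment = per_loss / (1 - F_d)
print(round(per_loss, 2), round(per_payment, 2))
```

This reproduces 1437.5 and 1866.88.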

25.4 Inflation

To handle uniform inflation, multiply each observed loss by the inflation amount.

Example 25G You have the following data for losses on an automobile liability coverage:

    Claim Size      Number of Claims
    0–1000               23
    1000–2000            16
    2000–5000             6
    5000–10,000           5

You are pricing another coverage that has a policy limit of 10,000. It is expected that loss sizes will increase by 5% due to inflation. Estimate the average payment per loss before and after inflation.

Answer: Before inflation, the average claim size is (1/50)[23(500) + 16(1500) + 6(3500) + 5(7500)] = 1880. With inflation, each group is multiplied by 1.05, so the first group's claims are 0–1050, the second group's claims are 1050–2100, the third group's claims are 2100–5250, and the fourth group's claims are 5250–10,500. In the fourth group, claims are capped at 10,000. We use the ogive to get the average claim there.


Since a uniform distribution is implicit in the ogive, 500/5250 of the claims in the range (5250, 10,500) will be above 10,000. The remaining claims will have an average of the midpoint of the interval (5250, 10,000). So the average capped claim in this interval will be:

    (4750/5250) · (5250 + 10,000)/2 + (500/5250)(10,000) = 7851.19

The average claim after inflation is then

    (1/50)[23(525) + 16(1575) + 6(3675) + 5(7851.19)] = 1971.62
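A numeric check of Example 25G's post-inflation average payment per loss. The capping logic mirrors the ogive argument in the example; the helper names are ours:

```python
# Example 25G: 5% uniform inflation with a 10,000 policy limit, grouped data.
groups = [((0, 1000), 23), ((1000, 2000), 16), ((2000, 5000), 6), ((5000, 10000), 5)]
n = sum(c for _, c in groups)
r = 1.05       # inflation factor
u = 10000      # policy limit

avg = 0.0
for (lo, hi), count in groups:
    lo2, hi2 = lo * r, hi * r            # inflated interval
    if hi2 <= u:
        mean_capped = (lo2 + hi2) / 2    # whole interval below the limit
    else:
        p_over = (hi2 - u) / (hi2 - lo2)              # fraction capped at u
        mean_capped = p_over * u + (1 - p_over) * (lo2 + u) / 2
    avg += (count / n) * mean_capped

print(round(avg, 2))
```

This reproduces the example's 1971.62.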

Exercises

25.1. A sample has the following observations:

    2 observations of 400
    7 observations of 800
    1 observation of 1600

Calculate the coefficient of skewness for the empirical distribution.



25.2. [4B-S91:34] The following relative frequency histogram depicts the expected distribution of policyholder claims.

[Relative frequency histogram: Probability Density (0.00 to 0.20) against Size of Policyholder Loss (0 to 16)]

You are given that:

(i) The policyholder pays the first $1 of each claim.
(ii) The insurer pays the next $9 of each claim.
(iii) The reinsurer pays the remaining amount if the claim exceeds $10.

Determine the average net claim size paid by the insurer.

(A) Less than 3.8
(B) At least 3.8, but less than 4.0
(C) At least 4.0, but less than 4.2
(D) At least 4.2, but less than 4.4
(E) At least 4.4

25.3. You have the following data for 100 losses:

    Loss Size       Number of Losses
    0–1000               42
    1000–2000            21
    2000–5000            19
    5000–10000           18
    Total               100

Assuming that payments are uniformly distributed within each interval, calculate the empirical limited expected value at 1800.


25.4. Claim sizes have the following distribution:

    Claim Size      Number of Claims
    0–1000               25
    1000–5000            15
    5000–∞               10

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate E[X ∧ 3000], the limited expected value of payment at 3000.

25.5. Claim sizes have the following distribution:

    Claim Size      Number of Claims
    0–500                60
    500–1000             35
    1000–2000             5

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate E[(X ∧ 1000)²].

25.6. Claim sizes have the following distribution:

    Claim Size      Number of Claims
    0–1000               10
    1000–2000             7
    2000–∞                3

Let X be claim size. Assume a uniform distribution of claim sizes within each interval. Calculate Var(X ∧ 800).

25.7. [4B-S96:22] (2 points) Forty (40) observed losses have been recorded in thousands of dollars and are grouped as follows:

    Interval ($000)     Number of Losses
    (1, 4/3)                 16
    [4/3, 2)                 10
    [2, 4)                   10
    [4, ∞)                    4

Let X be the size of loss random variable. Estimate the average payment for a coverage having a limit of 2 (thousand).

(A) Less than 0.50
(B) At least 0.50, but less than 1.0
(C) At least 1.0, but less than 1.5
(D) At least 1.5, but less than 2.0
(E) At least 2.0



25.8. Eleven observed losses for a coverage with no deductible or limit have been recorded as follows:

    200  210  220  300  410  460  520  610  900  2000  2700

Estimate the average payment per payment for a coverage with a 250 deductible and a 1000 maximum covered claim using the empirical distribution.

25.9. Nine losses from a coverage with no deductible or limit have been recorded as follows:

    2000  3500  4000  4500  6000  9600  10,000  10,000  16,000

Estimate the average payment per payment after 5% inflation if a maximum covered claim of 10,000 is imposed, using the empirical distribution.

25.10. Six losses have been recorded as follows:

    500  1000  4500  6000  8000  10,000

Using the empirical distribution, estimate the variance of claim size.

25.11. A study is conducted on 5 year term policies. For 10 such policies, the number of years until lapse is as follows:

    Years   Number of lapses
    1             3
    2             2
    3             1
    4             0
    5             1

All lapses occur at the end of the year. In addition, there is one death apiece in the middle of years 2, 3, and 4. Using the product limit estimator, estimate the average number of years until lapse on 5-year term policies.

25.12. On a disability coverage, you have data on the length of the disability period, in months, for 15 lives. For 11 lives, the amounts of time are 3, 4, 6, 6, 8, 10, 12, 24, 36, 60, 72. The remaining 4 lives are still on disability, and have been on disability for 6, 12, 18, and 24 months respectively. You are to use the product limit estimator to estimate the amount of time on disability. If a 24 month limit is imposed, what will the average time on disability be?

25.13. An auto liability coverage is sold with two policy limits, 25,000 and 50,000. Your data include nine payments on this coverage for the following amounts:

    25,000 limit: 10,000, 20,000, 25,000, 25,000, 25,000
    50,000 limit: 20,000, 30,000, 40,000, 50,000

You use the product limit estimator to estimate the loss distribution. Determine the estimate of expected payment per loss for coverage with a policy limit of 45,000.



25.14. [1999 C4 Sample:7] Summary statistics for a sample of 100 losses are:

    Interval            Number of Losses    Sum          Sum of Squares
    (0, 2,000]               39             38,065           52,170,078
    (2,000, 4,000]           22             63,816          194,241,387
    (4,000, 8,000]           17             96,447          572,753,313
    (8,000, 15,000]          12            137,595        1,628,670,023
    (15,000, ∞)              10            331,831       17,906,839,238
    Total                   100            667,754       20,354,674,039

Determine the empirical limited expected value E[X ∧ 15,000].

25.15. [4-F01:2] You are given:

    Claim Size (X)      Number of Claims
    (0, 25]                  30
    (25, 50]                 32
    (50, 100]                20
    (100, 200]                8

Assume a uniform distribution of claim sizes within each interval. Estimate the second raw moment of the claim size distribution.

(A) Less than 3300
(B) At least 3300, but less than 3500
(C) At least 3500, but less than 3700
(D) At least 3700, but less than 3900
(E) At least 3900

25.16. [4-F03:37] You are given:

    Claim Size (X)      Number of Claims
    (0, 25]                  25
    (25, 50]                 28
    (50, 100]                15
    (100, 200]                6

Assume a uniform distribution of claim sizes within each interval. Estimate E[X²] − E[(X ∧ 150)²].

(A) Less than 200
(B) At least 200, but less than 300
(C) At least 300, but less than 400
(D) At least 400, but less than 500
(E) At least 500

Additional released exam questions: C-F06:3, C-S07:7


Solutions

25.1. The coefficient of skewness is scale-free, so let's divide all the observations by 100 to make the arithmetic easier. The mean is (2(4) + 7(8) + 16)/10 = 8. The variance is (2(4 − 8)² + (16 − 8)²)/10 = 9.6. It would make no sense to divide by 9 instead of 10; you are not estimating the underlying variance (or even the underlying skewness). You are using the empirical distribution to calculate skewness! The third central moment is (2(4 − 8)³ + (16 − 8)³)/10 = 38.4. The coefficient of skewness is 38.4/9.6^1.5 = 1.290994.

25.2. The question asks for average payment per payment, although the language is somewhat unclear.

    Pr(claim) = 1 − 0.1 = 0.9
    E[claim] = ∫₁³ 0.1(x − 1) dx + ∫₃⁵ 0.2(x − 1) dx + ∫₅¹⁰ 0.03(x − 1) dx + 0.15(9)
             = 0.2(1) + 0.4(3) + 0.15(6.5) + 1.35 = 3.725
    Average claim = 3.725/0.9 = 4.14  (C)

Another way to calculate the expected payment per loss is to use the usual trick for uniform distributions: use double expectation and the fact that on any uniform interval the mean is the midpoint. The probability that the loss is in the interval (1, 3) is 2(0.1) = 0.2. The probability that the loss is in the interval (3, 5) is 2(0.2) = 0.4. The probability that the loss is in the interval (5, 10) is 5(0.03) = 0.15. The probability that the loss is greater than 10 is 0.03(5) = 0.15. So the average payment per loss is

    E[claim] = 0.2(2 − 1) + 0.4(4 − 1) + 0.15(7.5 − 1) + 0.15(10 − 1) = 3.725

25.3. The average amount paid in the interval (0, 1000) is 500. Split the second interval into [1000, 1800] and (1800, 2000). In [1000, 1800] the average payment is 1400. The payment is 1800 for losses of 1800 and above. The empirical probability of (0, 1000) is 0.42. The empirical probability of [1000, 1800] is 0.8(0.21) = 0.168. Therefore,

    E[X ∧ 1800] = (1/100)[42(500) + 16.8(1400) + 41.2(1800)] = 1186.80

25.4. Split the interval [1000, 5000) into [1000, 3000] and (3000, 5000). The average payment in the interval [1000, 3000] is 2000, and 0.5(15) = 7.5 claims out of 50 are expected in that interval. 25 claims are expected in (0, 1000), and the remaining 17.5 claims are expected to be 3000 or greater. Therefore

    E[X ∧ 3000] = (1/50)[25(500) + 7.5(2000) + 17.5(3000)] = 1600

25.5. See Example 25A on page 419 to see how this type of exercise is solved. This exercise is easier, because the limit (1000) is an endpoint of an interval. Each value of the histogram is calculated by dividing the number of observations in the interval by the length of the interval and the total number of observations (100). In the following equation, the common denominator 100 is pulled outside the brackets.

    E[(X ∧ 1000)²] = (1/100)[60(500³)/((3)(500)) + 35(1000³ − 500³)/((3)(500)) + 5(1000²)] = 304,166⅔
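Solution 25.5's second limited moment can be verified as follows (a sketch; the interval data are from the exercise):

```python
# Solution 25.5: E[(X ^ 1000)^2] for grouped data, uniform within intervals.
groups = [((0, 500), 60), ((500, 1000), 35), ((1000, 2000), 5)]
n = sum(c for _, c in groups)
u = 1000

m2 = 0.0
for (lo, hi), count in groups:
    if hi <= u:
        # integral of x^2 against the uniform density count/(n*(hi-lo))
        m2 += count * (hi**3 - lo**3) / (3 * (hi - lo) * n)
    else:
        m2 += (count / n) * u**2    # losses above the limit contribute u^2
print(round(m2, 2))  # 304166.67
```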

25.6. See Example 25A on page 419 to see how this type of exercise is solved. Using the same technique,

    E[X ∧ 800] = (1/20)[8(400) + 12(800)] = 640
    E[(X ∧ 800)²] = ∫₀⁸⁰⁰ x² f₂₀(x) dx + ∫₈₀₀^∞ 800² f₂₀(x) dx

But

    ∫₈₀₀^∞ f₂₀(x) dx = 1 − F₂₀(800) = 12/20

and by equation (22.1),

    f₂₀(x) = 10/((20)(1000)) = 0.0005 in the interval (0, 1000)

so

    E[(X ∧ 800)²] = 0.0005(800³)/3 + (12/20)(800²) = 469,333⅓
    Var(X ∧ 800) = 469,333⅓ − 640² = 59,733⅓

However, an easier method is available here, using conditional variance and the Bernoulli shortcut. The variance of X ∧ 800 is conditioned on I, the indicator variable for whether the loss is less than 800:

    Var(X ∧ 800) = E[Var(X ∧ 800 | I)] + Var(E[X ∧ 800 | I])

If the loss is greater than 800, then E[X ∧ 800] = 800 and Var(X ∧ 800) = 0. If the loss is less than 800, then it is uniformly distributed on (0, 800) with mean 400 and variance 800²/12, since in general the variance of a uniform random variable on (0, a) is a²/12. The probability of X < 800 is 8/20 = 0.4. So

    E[Var(X ∧ 800 | I)] = 0.4(800²/12) + 0.6(0) = 21,333⅓

For the variance of the expectations, use the Bernoulli shortcut.

    Var(E[X ∧ 800 | I]) = (0.4)(0.6)(800 − 400)² = 38,400

Therefore, Var(X ∧ 800) = 21,333⅓ + 38,400 = 59,733⅓.

25.7.

    [16(7/6) + 10(5/3) + 14(2)]/40 = 63⅓/40 = 19/12 ≈ 1.58  (D)
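Both routes to Var(X ∧ 800) in solution 25.6 above can be checked side by side; a minimal sketch with the exercise's data hardcoded:

```python
# Solution 25.6: Var(X ^ 800) two ways -- direct moments, and conditional
# variance with the Bernoulli shortcut.
n = 20
p_below = 8 / 20    # Pr(X < 800): density 10/(20*1000) over (0, 1000)

# Direct: first and second limited moments.
mean = (8 * 400 + 12 * 800) / n
second = 0.0005 * 800**3 / 3 + (12 / 20) * 800**2
var_direct = second - mean**2

# Conditional variance: below 800 the loss is uniform(0, 800).
ev_of_var = p_below * 800**2 / 12
var_of_ev = p_below * (1 - p_below) * (800 - 400)**2   # Bernoulli shortcut
var_cond = ev_of_var + var_of_ev

print(round(var_direct, 2), round(var_cond, 2))  # both 59733.33
```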

25.8. Sum up the eight losses higher than 250, capping the ones higher than 1000 at 1000, subtract 250 from each one (or subtract 250 at the end), and divide by 8:

    (50 + 160 + 210 + 270 + 360 + 650 + 2(750))/8 = 400

25.9. After inflation, there will be four losses capped at 10,000 (since 1.05(9600) > 10,000). The other losses are multiplied by 1.05:

    (1.05(2000 + 3500 + 4000 + 4500 + 6000) + 4(10,000))/9 = 6777.78


25.10. The usual shortcut of using the second moment minus the square of the mean is available, and we will use it.

    x̄ = (500 + 1000 + 4500 + 6000 + 8000 + 10,000)/6 = 5000
    (1/6) Σ xᵢ² = (500² + 1000² + 4500² + 6000² + 8000² + 10,000²)/6 = 36,916,666.67
    Var(X) = 36,916,666.67 − 5000² = 11,916,666.67

25.11. We estimate S(x). Since the deaths occur in the middle of the year and the lapses occur at the end of the year, the risk set for each year must exclude the deaths occurring in that year.

    y_j    r_j    s_j    S₁₀(x), y_j ≤ x < y_{j+1}
    1      10     3      7/10
    2       6     2      (7/10)(4/6) = 7/15
    3       3     1      (7/15)(2/3) = 14/45
    5       1     1      0

We now calculate E[X] = ∫₀⁵ S(x) dx by summing up the areas of the rectangles under the graph of the estimated survival function.

    E[X] = (1 − 0)(1) + (2 − 1)(0.7) + (3 − 2)(7/15) + (5 − 3)(14/45)
         = 1 + 0.7 + 0.4667 + 0.6222 = 2.7889

25.12. The varying censoring times in this problem are by no means far-fetched. If four lives began disability on January 1, 2000, July 1, 2000, January 1, 2001, and July 1, 2001, and were still on disability as of year-end 2001, and you were using year-end 2001 data, you would have exactly this pattern of censoring.

Unlike the previous exercise, we are estimating the limited expected value at 24 here, E[X ∧ 24]. First we estimate S(x). However, we have no need to estimate S(x) for x ≥ 24.

    y_j    r_j    s_j    S₁₅(x), y_j ≤ x < y_{j+1}
    3      15     1      14/15
    4      14     1      13/15
    6      13     2      11/15
    8      10     1      (11/15)(9/10) = 0.66
    10      9     1      0.66(8/9) = 0.5867
    12      8     1      0.5867(7/8) = 0.5133

We now calculate E[X ∧ 24] = ∫₀²⁴ S₁₅(x) dx by summing up the areas of the rectangles under the graph of the estimated survival function.

    E[X ∧ 24] = (3 − 0)(1) + (4 − 3)(14/15) + (6 − 4)(13/15) + (8 − 6)(11/15)
                + (10 − 8)(0.66) + (12 − 10)(0.5867) + (24 − 12)(0.5133)
              = 3 + 0.9333 + 1.7333 + 1.4667 + 1.32 + 1.1733 + 6.16 = 15.7867


25.13. The idea is to use all the data, not just the data for the 50,000 policy limit. Therefore, we treat the three claims for 25,000 as censored, and estimate the survival function S₉(x):

    y_j       r_j    s_j    S₉(x), y_j ≤ x < y_{j+1}
    10,000    9      1      8/9
    20,000    8      2      (8/9)(6/8) = 2/3
    30,000    3      1      (2/3)(2/3) = 4/9
    40,000    2      1      (4/9)(1/2) = 2/9

To calculate the limited expected value at 45,000, we integrate S₉(x) between 0 and 45,000. We sum up the rectangles under the graph:

    E[X ∧ 45,000] = ∫₀⁴⁵⁰⁰⁰ S₉(x) dx
                  = (10,000 − 0)(1) + (20,000 − 10,000)(8/9) + (30,000 − 20,000)(2/3)
                    + (40,000 − 30,000)(4/9) + (45,000 − 40,000)(2/9)
                  = (1000/9)(90 + 80 + 60 + 40 + 10) = 31,111.11

25.14. We sum up the losses bounded by 15,000, then divide by the number of losses (100). For the first four intervals, the sum of losses is given. For the interval (15,000, ∞), since the losses are bounded by 15,000, the sum of the ten bounded losses is 10(15,000) = 150,000. The answer is

    (38,065 + 63,816 + 96,447 + 137,595 + 150,000)/100 = 4859.23

Since the sum of the losses is given, a faster way to perform the calculation is to start with the sum, subtract the losses above 15,000, and then add 150,000:

    (667,754 − 331,831 + 150,000)/100 = 4859.23

25.15. The histogram is 30/((25)(90)) in (0, 25], 32/((25)(90)) in (25, 50], 20/((50)(90)) in (50, 100], and 8/((100)(90)) in (100, 200]. In each interval, x² integrates to (cᵢ³ − cᵢ₋₁³)/3 for interval (cᵢ₋₁, cᵢ). So

    E[X²] = (1/((3)(90)))[30(25³)/25 + 32(50³ − 25³)/25 + 20(100³ − 50³)/50 + 8(200³ − 100³)/100]
          = (1/270)(18,750 + 140,000 + 350,000 + 560,000) = 3958⅓  (E)
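A quick verification of solution 25.15's moment integral (a sketch; the grouped counts are from the exercise):

```python
# Solution 25.15: second raw moment for grouped data, uniform within intervals.
groups = [((0, 25), 30), ((25, 50), 32), ((50, 100), 20), ((100, 200), 8)]
n = sum(c for _, c in groups)  # 90 claims

# Each interval contributes count * (hi^3 - lo^3) / (3 * width * n).
m2 = sum(c * (hi**3 - lo**3) / (3 * (hi - lo) * n) for (lo, hi), c in groups)
print(round(m2, 2))  # 3958.33, answer (E)
```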

25.16. The computation is shortened by noticing that the summands for calculating the two expected values differ only in the interval (100, 200]. We'll only calculate the summands in this interval, not the full expected value. For E[X²], the summand in this interval is

    (1/3)(6/74)(200³ − 100³)/100 = 420,000/222 = 1891.89

whereas for E[(X ∧ 150)²] the summands in this interval are

    (6/((74)(100)))[∫₁₀₀¹⁵⁰ x² dx + ∫₁₅₀²⁰⁰ 150² dx] = (6/7400)[(150³ − 100³)/3 + (150²)(50)] = 1554.05

The difference is 1891.89 − 1554.05 = 337.84. (C)


Quiz Solutions

25-1. Conditional variance will be easy to use, since there are only two possibilities: loss size below 1000 and loss size greater than 1000. Let I be the condition of whether the loss is below or above 1000.

    Var(X ∧ 1000) = E[Var(X ∧ 1000 | I)] + Var(E[X ∧ 1000 | I])

The expected value of the variances is 0.46(1000²/12) = 38,333⅓. The variance of the expected values, by the Bernoulli shortcut, is (0.46)(0.54)(1000 − 500)² = 62,100. The overall variance is

    Var(X ∧ 1000) = 38,333⅓ + 62,100 = 100,433⅓

25-2. We calculated the probabilities of death at 24, 42, and 60, and the expected lifetime, which is 46.5. The second moment of lifetime is

    E[X²] = (1/6)(24²) + (5/12)(42²) + (5/12)(60²) = 2331

The variance of future lifetime is Var(X) = 2331 − 46.5² = 168.75.

25-3. The risk set at 4 is 10 and the risk set at 7 is 8, so Ŝ(7−) = 0.9 and Ŝ(7) = 0.9(7/8) = 0.7875. It follows that P̂r(X = 7) = 0.9 − 0.7875 = 0.1125.

Lesson 26

Variance of Kaplan-Meier and Nelson-Åalen Estimators

Reading: Loss Models Fourth Edition 12.2

Exam questions from this lesson are frequent. The Kaplan-Meier estimator is an unbiased estimator of the survival function. Greenwood's approximation of the variance is:

    V̂ar(Ŝ(t)) = Ŝ(t)² · Σ_{y_j ≤ t} s_j / (r_j (r_j − s_j))    (26.1)

You can remember the summand in this formula as the number who died divided by the product of population before and after the deaths. A useful fact to remember is that if there is complete data—no censoring or truncation—the Greenwood approximation is identical to the empirical approximation of the variance, equation (23.1). The latter is easier to calculate.

In the following, we will use some notation from MLC and LC:

ₜpₓ is the probability that someone age x survives for another t years. In other words, it is the conditional probability of survival to x + t, given survival to x.

ₜqₓ is the complement of ₜpₓ. It is the probability that someone age x dies in the next t years. In other words, it is the conditional probability of death in the interval (x, x + t] given survival to age x.

To calculate variances for conditional probabilities like ₜpₓ, treat the study as if it started at time x and use the Greenwood approximation.

Example 26A In a mortality study on 82 lives, you are given:

    y_j    r_j    s_j
    1      82     2
    2      78     1
    3      74     1
    4      75     2

₂q₂ is estimated using the product limit estimator. Estimate the variance of the estimate.

Answer: Treat this as a study of 74 lives starting at duration 2. Then

    ₂p̂₂ = (73/74)(73/75) = 0.9602
    V̂ar(₂q̂₂) = (0.9602²)[1/((74)(73)) + 2/((75)(73))] = 0.0005075

Notice that V̂ar(₂q̂₂) = V̂ar(₂p̂₂). In both cases, the formula uses ₂p̂₂² as the first factor. Never use ₂q̂₂².
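Greenwood's formula in Example 26A can be verified numerically; a minimal sketch with the two relevant (r_j, s_j) pairs hardcoded:

```python
# Example 26A: Greenwood's approximation for Var(2q2-hat), treating the
# study as starting at duration 2 with 74 lives.
pairs = [(74, 1), (75, 2)]   # (r_j, s_j) at durations 3 and 4

p_hat = 1.0
greenwood_sum = 0.0
for r, s in pairs:
    p_hat *= (r - s) / r                 # product-limit factor
    greenwood_sum += s / (r * (r - s))   # Greenwood summand

var_hat = p_hat**2 * greenwood_sum
print(round(p_hat, 4), round(var_hat, 7))  # 0.9602 0.0005075
```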


The approximate variance of the Nelson-Åalen estimator¹ is:

    V̂ar(Ĥ(t)) = Σ_{y_j ≤ t} s_j / r_j²    (26.2)

This formula is a recursive formula:

    V̂ar(Ĥ(y_j)) = V̂ar(Ĥ(y_{j−1})) + s_j / r_j²

Example 26B In a mortality study on 82 lives, you are given:

    y_j    r_j    s_j
    1      82     2
    2      78     1
    3      74     1
    4      75     2

H(2) is estimated using the Nelson-Åalen estimator. Estimate the variance of the estimate.

Answer:

    V̂ar(Ĥ(2)) = 2/82² + 1/78² = 0.0004618
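The Nelson-Åalen estimate and its variance in Example 26B can be checked as follows (a sketch; the (r_j, s_j) pairs are taken from the example's table):

```python
# Example 26B: Nelson-Aalen estimate of H(2) and its approximate variance.
pairs = [(82, 2), (78, 1)]   # (r_j, s_j) for y_j = 1, 2

H_hat = sum(s / r for r, s in pairs)        # cumulative hazard estimate
var_hat = sum(s / r**2 for r, s in pairs)   # equation (26.2)
print(round(H_hat, 5), round(var_hat, 7))
```

This reproduces Ĥ(2) = 0.03721 and the variance 0.0004618.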

Quiz 26-1 You are given the following information regarding five individuals in a study:

    d_j    u_j    x_j
    0      2      —
    0      —      3
    1      5      —
    2      —      4
    4      5      —

Calculate the estimated variance of the Nelson-Åalen estimate of H(4).

Calculator Tip: The TI-30XS/B Multiview calculator may be useful for calculating variances. It is probably not worthwhile using the data table for a small number of times or for the variance of Nelson-Åalen, but for the variance of the Kaplan-Meier estimate, for which you must compute both the Kaplan-Meier estimate itself as well as a sum of quotients, the data table is useful. Enter r_i in column 1, s_i in column 2, and then compute the Kaplan-Meier estimate, save it, and compute the Greenwood sum. Multiply the sum by the square of the product-limit estimate. Remember that for the Kaplan-Meier estimate you must log the factors and save the sum register; at the end exponentiate twice the saved sum. The solution to exercise 26.9 illustrates the use of the Multiview for the purpose of calculating the Greenwood approximation of the variance.

¹In the textbook used on the pre-2002 syllabus, this was called the Åalen estimator of the variance, and you will find that old exam questions from 2000 and 2001 call it that.

The usual symmetric normal confidence intervals for S_n(t) and Ĥ(t), which we'll call linear confidence intervals, can be constructed by adding and subtracting to the estimator the standard deviation times a coefficient from the standard normal distribution based on the confidence level, like 1.96 for 95%. If z_p is the p-th quantile of a standard normal distribution, then the linear confidence interval for S(t) is

    ( S_n(t) − z_{(1+p)/2} √V̂ar(S_n(t)),  S_n(t) + z_{(1+p)/2} √V̂ar(S_n(t)) )

However, this confidence interval for S_n(t) frequently includes points higher than 1, and correspondingly can include points below 0 for Ĥ(t), which would imply that the other bound should be adjusted to truly include an interval with probability p. To avoid this problem, an alternative formula may be used. Confidence intervals are constructed for ln(−ln S_n(t)) and ln Ĥ(t) using the delta method, which will be discussed in Section 34.2, and then these confidence intervals are exponentiated back into confidence intervals for S_n(t) and Ĥ(t). The resulting interval for S_n(t) is

    ( S_n(t)^(1/U), S_n(t)^U )    (26.3)

where

    U = exp( z_{(1+p)/2} √v̂ / (S_n(t) ln S_n(t)) ),  v̂ = V̂ar(S_n(t))

and the resulting interval for Ĥ(t) is

    ( Ĥ(t)/U, Ĥ(t)·U )    (26.4)

where

    U = exp( z_{(1+p)/2} √V̂ar(Ĥ(t)) / Ĥ(t) )

You're probably best off memorizing these two equations rather than rederiving them using the delta method every time you need them. These alternative confidence intervals are called log-transformed confidence intervals. The usual confidence intervals may be called "linear confidence intervals" to distinguish them from the log-transformed confidence intervals.

Example 26C In a mortality study on 82 lives, you are given:

    y_j    r_j    s_j
    1      82     2
    2      78     1
    3      74     1
    4      75     2

H(2) is estimated using the Nelson-Åalen estimator. Construct a 90% linear confidence interval and a 90% log-transformed confidence interval for H(2).


Answer: In Example 26B, we calculated the approximate variance as 0.0004618. The estimate of H(2) is

    Ĥ(2) = 2/82 + 1/78 = 0.03721

A 90% linear confidence interval for H(2) is 0.03721 ± 1.645√0.0004618 = (0.00186, 0.07256).

A 90% log-transformed confidence interval is

    U = exp(1.645√0.0004618 / 0.03721) = e^0.9500 = 2.5857
    (Ĥ/U, ĤU) = (0.03721/2.5857, 0.03721(2.5857)) = (0.01439, 0.09622)

Note that the log transformation tends to move the confidence interval for H to the right.
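Both confidence intervals of Example 26C can be reproduced in a few lines (a sketch; the estimate and variance are taken from the example):

```python
import math

# Example 26C: 90% linear and log-transformed confidence intervals for H(2).
H, var = 0.03721, 0.0004618
z = 1.645                      # 95th percentile for a 90% two-sided interval
half = z * math.sqrt(var)

linear = (H - half, H + half)
U = math.exp(half / H)         # log-transformed multiplier, equation (26.4)
log_ci = (H / U, H * U)
print([round(x, 5) for x in linear], round(U, 4), [round(x, 5) for x in log_ci])
```

This reproduces (0.00186, 0.07256), U = 2.5857, and (0.01439, 0.09622).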


Example 26D The linear 95% confidence interval for S(t₀) is given by (0.44, 0.56). Construct the log-transformed 95% confidence interval for S(t₀).

Answer: The mean for the linear confidence interval, S_n(t), is the midpoint, 0.50. Then z₀.₉₇₅√v̂ is half the length of the linear confidence interval, or 0.06. By formula (26.3),

    U = exp( z₀.₉₇₅√v̂ / (S_n(t) ln S_n(t)) ) = exp( 0.06/(0.5 ln 0.5) ) = 0.8410

and the log-transformed confidence interval (S_n(t)^(1/U), S_n(t)^U) is

    (0.5^(1/0.8410), 0.5^0.8410) = (0.4386, 0.5582)

Note that the log transformation tends to move the confidence interval for S to the left.

Exercises

Greenwood formula

26.1. In a mortality study, there are initially 15 lives. Deaths occur at times 1, 3, 5, and 8. Individuals withdraw from the study at times 2 and 6. The remaining 9 lives survive to time 10. Calculate the estimated variance of the product limit estimator of S(10) using Greenwood's formula.

26.2. [160-F86:11] In a 3 year mortality study, we have the following data:

    y_j    r_j    s_j
    1      100    10
    2      200    20
    3      200    20

S(3) is estimated with the Kaplan-Meier estimator. Using Greenwood's formula, estimate the variance of the estimate.

(A) 0.0012   (B) 0.0015   (C) 0.0017   (D) 0.0020   (E) 0.0022



Table 26.1: Summary of Variance Formulas

Greenwood formula for variance of Kaplan-Meier estimator:

    V̂ar(S_n(t)) = S_n(t)² · Σ_{y_j ≤ t} s_j / (r_j (r_j − s_j))    (26.1)

Formula for variance of Nelson-Åalen estimator:

    V̂ar(Ĥ(y_j)) = Σ_{i=1}^{j} s_i / r_i²    (26.2)

Log-transformed confidence interval for S(t):

    ( S_n(t)^(1/U), S_n(t)^U ),  where U = exp( z_{(1+p)/2} √V̂ar(S_n(t)) / (S_n(t) ln S_n(t)) )    (26.3)

Log-transformed confidence interval for H(t):

    ( Ĥ(t)/U, Ĥ(t)·U ),  where U = exp( z_{(1+p)/2} √V̂ar(Ĥ(t)) / Ĥ(t) )    (26.4)

26.3.

[160-F87:6] In a 3 year mortality study, we have the following data:

    y_j    r_j    s_j
    1      100    10
    2      120    12
    3      110    15

S(3) is estimated with the Kaplan-Meier estimator. Using Greenwood's formula, estimate the standard deviation of the estimate.

(A) 0.0412   (B) 0.0415   (C) 0.0420   (D) 0.0425   (E) 0.0432



26.4. [160-S88:16] A mortality study is conducted on 15 individuals of age x. You are given:

(i) In addition to the 15 individuals of age x, one individual apiece enters the study at ages x + 0.4 and x + 0.7.
(ii) One individual apiece leaves the study alive at ages x + 0.2 and x + 0.6.
(iii) One death apiece occurs at ages x + 0.1, x + 0.3, x + 0.5, and x + 0.8.

Using Greenwood's formula, calculate the estimated variance of the product limit estimate of q_x.

(A) 0.01337   (B) 0.01344   (C) 0.01350   (D) 0.01357   (E) 0.01363

26.5.

[160-S90:9] From a two-year mortality study of 1000 lives beginning at exact age 40, you are given:

(i) Observed deaths are distributed uniformly over the interval (40, 42].
(ii) Greenwood's approximation of Var(Ŝ(2)) is 0.00016.
(iii) q̂₄₀ < 0.2.

Calculate the observed mortality rate q̂₄₀.

(A) 0.096

(B) 0.097

(C) 0.098

(D) 0.099

(E) 0.100

[160-S90:14] A two year mortality study is conducted on 10 individuals of age x. You are given:

(i) In addition to the 10 individuals of age x, one individual apiece enters the study at ages x + 0.8 and x + 1.0.
(ii) One individual leaves the study alive at age x + 1.5.
(iii) One death apiece occurs at ages x + 0.2, x + 0.5, x + 1.3, and x + 1.7.

Using Greenwood's formula, calculate the estimated variance of the product limit estimate of ₂qₓ.

26.7.

[160-F90:13] In a 3 year mortality study, we have the following data:

    y_j    r_j    s_j
    1      1000   20
    2      1400   14
    3      2000   10

S(3) is estimated with the Kaplan-Meier estimator. Using Greenwood's formula, estimate the variance of the estimate.

(A) 0.000028

(B) 0.000029

(C) 0.000030

(D) 0.000031

(E) 0.000032

[160-81-96:8] For a study of 1000 lives over three years, you are given:

(i) There are no new entrants or withdrawals.
(ii) Deaths occur at the end of the year of death.
(iii) s_j / (r_j (r_j − s_j)) = 1.11 × 10⁻⁴ for y_j = 1, 2.
(iv) The expected value of Ŝ(3) is 0.746.

Calculate the conditional variance of Ŝ(3) using Greenwood's approximation.

(A) 1.83 × 10⁻⁴

(B) 1.85 × 10−4

(C) 1.87 × 10−4

(D) 1.89 × 10−4

(E) 1.91 × 10−4



26.9.


[160-82-96:8] In a five year mortality study, you are given:

    y_j    s_j    r_j
    1      3      15
    2      24     80
    3      5      25
    4      6      60
    5      3      10

Calculate Greenwood's approximation of the conditional variance of the product limit estimator of S(4).

(A) 0.0055

(B) 0.0056

(C) 0.0058

(D) 0.0061

(E) 0.0063

26.10. [4-S00:38] A mortality study is conducted on 50 lives observed from time zero. You are given:

(i)

| Time t | Number of Deaths $d_t$ | Number Censored $c_t$ |
|---|---|---|
| 15 | 2 | 0 |
| 17 | 0 | 3 |
| 25 | 4 | 0 |
| 30 | 0 | $c_{30}$ |
| 32 | 8 | 0 |
| 40 | 2 | 0 |

(ii) $\hat S(35)$ is the Product-Limit estimate of S(35).
(iii) $\widehat{\operatorname{Var}}\big(\hat S(35)\big)$ is the estimate of the variance of $\hat S(35)$ using Greenwood's formula.
(iv) $\dfrac{\widehat{\operatorname{Var}}\big(\hat S(35)\big)}{\hat S(35)^2} = 0.011467$

Determine $c_{30}$, the number censored at time 30.

(A) 3  (B) 6  (C) 7  (D) 8  (E) 11

Variance of Nelson-Åalen estimator

Use the following information for questions 26.11 and 26.12:

92 lives are under observation in a mortality study. The first seven observation times are 2, 2, 3, 6+, 7+, 8, 8, where a plus sign indicates a censored observation.

26.11. Calculate the estimated variance of the product limit estimator $S_{92}(8)$ using Greenwood's formula.

26.12. Calculate the estimated variance of the Nelson-Åalen estimator of the cumulative hazard rate, $\hat H(8)$.

26.13. In a mortality study, 2 deaths occur at time 3 and 3 at time 5. No other deaths occur before time 5. The estimated variance of the Nelson-Åalen estimator of H(3) is 0.0003125, and the estimated variance of the Nelson-Åalen estimator of H(5) is 0.0008912. Determine the number of withdrawals between times 3 and 5.

C/4 Study Manual—17th edition
Copyright ©2014 ASM


26.14. In a mortality study, cumulative hazard rates are calculated using the Nelson-Åalen estimator. You are given:

$$\hat H(y_2) = 0.16000 \qquad \widehat{\operatorname{Var}}\big(\hat H(y_2)\big) = 0.0037000$$
$$\hat H(y_3) = 0.31625 \qquad \widehat{\operatorname{Var}}\big(\hat H(y_3)\big) = 0.0085828$$

No withdrawals occur between times $y_2$ and $y_3$. Determine the number of deaths at time $y_3$.

26.15. [4-F00:20] Fifteen cancer patients were observed from the time of diagnosis until the earlier of death or 36 months from diagnosis. Deaths occurred during the study as follows:

| Time in Months Since Diagnosis | Number of Deaths |
|---|---|
| 15 | 2 |
| 20 | 3 |
| 24 | 2 |
| 30 | d |
| 34 | 2 |
| 36 | 1 |

The Nelson-Åalen estimate $\hat H(35)$ is 1.5641.

Calculate the Åalen estimate of the variance of $\hat H(35)$.

(A) Less than 0.10
(B) At least 0.10, but less than 0.15
(C) At least 0.15, but less than 0.20
(D) At least 0.20, but less than 0.25
(E) At least 0.25

26.16. [4-S01:14] For a mortality study with right-censored data, you are given:

| $y_i$ | $s_i$ | $r_i$ |
|---|---|---|
| 1 | 15 | 100 |
| 8 | 20 | 65 |
| 17 | 13 | 40 |
| 25 | 31 | 31 |

Calculate the Åalen estimate of the standard deviation of the Nelson-Åalen estimator of the cumulative hazard function at time 20.

(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20


Linear confidence intervals

26.17. In a mortality study on 12 lives, 2 die at time 3. The Kaplan-Meier estimator is used to estimate S(3). Determine the upper bound for the linear 95% confidence interval for S(3).

26.18. In a mortality study on 100 lives, 2 die at time 2 and 1 at time 5. The product-limit estimator is used to estimate S(5). Determine the width of the linear 90% confidence interval for S(5).

26.19. [4-S00:19] For a mortality study with right-censored data, the cumulative hazard rate is estimated using the Nelson-Åalen estimator. You are given:

(i) No deaths occur between times $t_j$ and $t_{j+1}$.
(ii) A 95% linear confidence interval for $H(t_j)$ is (0.07125, 0.22875).
(iii) A 95% linear confidence interval for $H(t_{j+1})$ is (0.15607, 0.38635).

Calculate the number of deaths observed at time $t_{j+1}$.

(A) 4  (B) 5  (C) 6  (D) 7  (E) 8

26.20. [C-S05:15] Twelve policyholders were monitored from the starting date of the policy to the time of first claim. The observed data are as follows:

| Time of First Claim | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Number of Claims | 2 | 1 | 2 | 2 | 1 | 2 | 2 |

Using the Nelson-Åalen estimator, calculate the 95% linear confidence interval for the cumulative hazard rate function H(4.5).

(A) (0.189, 1.361)  (B) (0.206, 1.545)  (C) (0.248, 1.402)  (D) (0.283, 1.266)  (E) (0.314, 1.437)

Log-transformed confidence intervals

26.21. In a mortality study, $S_n(365) = 0.76$ and $\widehat{\operatorname{Var}}\big(S_n(365)\big) = 0.0019$. Determine the lower bound of the log-transformed 95% confidence interval for S(365).

26.22. In a mortality study, $S_n(23) = 0.55$ and $\widehat{\operatorname{Var}}\big(S_n(23)\big) = 0.022$. Determine the width of the log-transformed 90% confidence interval for S(23).

26.23. The log-transformed 95% confidence interval for $S(y_j)$ is given by (0.400, 0.556). Construct the linear 95% confidence interval for $S(y_j)$.

26.24. The log-transformed 90% confidence interval for $S(y_j)$ is given by (0.81, 0.88). Determine the width of the log-transformed 95% confidence interval.


26.25. [4-F02:8] For a survival study, you are given:

(i) The Product-Limit estimator $\hat S(t_0)$ is used to construct confidence intervals for $S(t_0)$.
(ii) The 95% log-transformed confidence interval for $S(t_0)$ is (0.695, 0.843).

Determine $\hat S(t_0)$.

(A) 0.758  (B) 0.762  (C) 0.765  (D) 0.769  (E) 0.779

26.26. In a study on 100 lives, 2 die at time 2 and 3 at time 8. Determine the lower bound of the log-transformed 99% confidence interval for H(8).

26.27. The linear 95% confidence interval for H(100) is given by (0.8, 1.0). Determine the width of the log-transformed 95% confidence interval for H(100).

26.28. The log-transformed 95% confidence interval for H(80) is given by (0.4, 0.625). Determine the upper bound of the linear 90% confidence interval for H(80).

26.29. In a mortality study, one death apiece occurred at times $y_1$ and $y_2$. No other deaths occurred before time $y_2$. The 95% log-transformed confidence interval for the cumulative hazard rate $H(y_2)$ calculated using the Nelson-Åalen estimator is (0.07837, 1.3477). There were no late entrants to the study. Determine the size of the risk set at time $y_2$.

26.30. [4-F01:37] A survival study gave (1.63, 2.55) as the 95% linear confidence interval for the cumulative hazard function $H(t_0)$.

Calculate the 95% log-transformed confidence interval for $H(t_0)$.

(A) (0.49, 0.94)  (B) (0.84, 3.34)  (C) (1.58, 2.60)  (D) (1.68, 2.50)  (E) (1.68, 2.60)

26.31. [4-F04:12] The interval (0.357, 0.700) is a 95% log-transformed confidence interval for the cumulative hazard rate function at time t, where the cumulative hazard rate function is estimated using the Nelson-Åalen estimator.

Determine the value of the Nelson-Åalen estimate of S(t).

(A) 0.50  (B) 0.53  (C) 0.56  (D) 0.59  (E) 0.61

Use the following information for questions 26.32 and 26.33:

For a survival study with censored and truncated data, you are given:

| Time (t) | Number at Risk at Time t | Failures at Time t |
|---|---|---|
| 1 | 30 | 5 |
| 2 | 27 | 9 |
| 3 | 32 | 6 |
| 4 | 25 | 5 |
| 5 | 20 | 4 |

26.32. [4-F03:21] The probability of failing at or before Time 4, given survival past Time 1, is $_3q_1$. Calculate Greenwood's approximation of the variance of $_3\hat q_1$.

(A) 0.0067  (B) 0.0073  (C) 0.0080  (D) 0.0091  (E) 0.0105


26.33. [4-F03:22] Calculate the 95% log-transformed confidence interval for H(3), based on the Nelson-Åalen estimate.

(A) (0.30, 0.89)  (B) (0.31, 1.54)  (C) (0.39, 0.99)  (D) (0.44, 1.07)  (E) (0.56, 0.79)

Additional released exam questions: C-F05:17, C-F06:7, C-S07:12,33

Solutions

26.1. The table of risk sets and deaths is

| $y_j$ | $r_j$ | $s_j$ |
|---|---|---|
| 1 | 15 | 1 |
| 3 | 13 | 1 |
| 5 | 12 | 1 |
| 8 | 10 | 1 |

$$S_{15}(10) = \left(\frac{14}{15}\right)\left(\frac{12}{13}\right)\left(\frac{11}{12}\right)\left(\frac{9}{10}\right) = 0.7108$$
$$\widehat{\operatorname{Var}}\big(\hat S_{15}(10)\big) = (0.7108)^2\left(\frac{1}{(15)(14)} + \frac{1}{(13)(12)} + \frac{1}{(12)(11)} + \frac{1}{(10)(9)}\right) = 0.0151$$

26.2. Using equation (26.1),
$$S_n(3) = (0.9)(0.9)(0.9) = 0.729$$
$$\widehat{\operatorname{Var}}\big(S_n(3)\big) = 0.729^2\left(\frac{10}{(100)(90)} + \frac{20}{(200)(180)} + \frac{20}{(200)(180)}\right) = 0.00118098 \quad \text{(A)}$$

26.3. Using equation (26.1),
$$S_n(3) = (0.9)(0.9)\left(\frac{95}{110}\right) = 0.6995$$
$$\widehat{\operatorname{Var}}\big(S_n(3)\big) = 0.6995^2\left(\frac{10}{(100)(90)} + \frac{12}{(120)(108)} + \frac{15}{(110)(95)}\right) = 0.001699$$
$$\sqrt{0.001699} = 0.04122 \quad \text{(A)}$$
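Solutions 26.1–26.3 all run the same Kaplan-Meier/Greenwood computation, so they are easy to check numerically. The sketch below is mine, not part of the manual; the function name `km_greenwood` is invented.

```python
# Sketch (not from the manual): Kaplan-Meier estimate and Greenwood
# variance from parallel lists of risk sets r_j and deaths s_j.
def km_greenwood(r, s):
    S = 1.0          # running product-limit estimate
    g = 0.0          # Greenwood sum of s_j / (r_j (r_j - s_j))
    for rj, sj in zip(r, s):
        S *= (rj - sj) / rj
        g += sj / (rj * (rj - sj))
    return S, S * S * g

# Solution 26.1: risk sets 15, 13, 12, 10 with one death each
S1, v1 = km_greenwood([15, 13, 12, 10], [1, 1, 1, 1])   # 0.7108, 0.0151

# Solution 26.3: risk sets 100, 120, 110 with deaths 10, 12, 15
S3, v3 = km_greenwood([100, 120, 110], [10, 12, 15])    # 0.6995, 0.001699
```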


26.4. The table of risk sets and deaths is:

| $y_j$ | $r_j$ | $s_j$ |
|---|---|---|
| 0.1 | 15 | 1 |
| 0.3 | 13 | 1 |
| 0.5 | 13 | 1 |
| 0.8 | 12 | 1 |

Using equation (26.1),

$$\widehat{\operatorname{Var}}(\hat q_x) = \left(\frac{14}{15}\cdot\frac{12}{13}\cdot\frac{12}{13}\cdot\frac{11}{12}\right)^2\left(\frac{1}{(14)(15)} + \frac{1}{(12)(13)} + \frac{1}{(12)(13)} + \frac{1}{(11)(12)}\right) = 0.01337 \quad \text{(A)}$$

26.5. We suspect from condition (iii) that this is going to be a quadratic. Let d be the number of deaths in each year. (Deaths are uniform.) Since there is no truncation or censoring, Greenwood's approximation is the empirical variance, so
$$\frac{\left(\frac{2d}{1000}\right)\left(\frac{1000 - 2d}{1000}\right)}{1000} = 0.00016$$
Multiplying out and solving:
$$2d(1000 - 2d) = 160{,}000$$
$$0 = 4d^2 - 2000d + 160{,}000$$
$$d = \frac{2000 \pm \sqrt{2000^2 - 16(160{,}000)}}{8} = \frac{2000 \pm 1200}{8} = 400 \text{ or } 100$$
Only 100 satisfies $\hat q_{40} < 0.2$, so $\hat q_{40} = 100/1000 = 0.1$. (E)
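The quadratic step above can be confirmed numerically; the following sketch (mine, not the manual's) applies the quadratic formula directly.

```python
import math

# Sketch (not from the manual): solve 4d^2 - 2000d + 160000 = 0 for the
# annual death count d in solution 26.5, then apply the q < 0.2 condition.
a, b, c = 4, -2000, 160_000
disc = math.sqrt(b*b - 4*a*c)            # sqrt(2000^2 - 16*160000) = 1200
roots = sorted(((-b - disc)/(2*a), (-b + disc)/(2*a)))   # [100.0, 400.0]
d = next(r for r in roots if r/1000 < 0.2)               # only d = 100 works
q40 = d/1000                                             # 0.1
```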

26.6. Develop the following table of risk sets and deaths:

| $y_j$ | $r_j$ | $s_j$ |
|---|---|---|
| 0.2 | 10 | 1 |
| 0.5 | 9 | 1 |
| 1.3 | 10 | 1 |
| 1.7 | 8 | 1 |

$$S_n(2) = (0.8)(0.9)(0.875) = 0.63$$
$$\widehat{\operatorname{Var}}\big(S_n(2)\big) = 0.63^2\left(\frac{1}{(10)(9)} + \frac{1}{(9)(8)} + \frac{1}{(10)(9)} + \frac{1}{(8)(7)}\right) = 0.02142$$

26.7.
$$S_n(3) = (0.98)(0.99)(0.995) = 0.965349$$
$$\widehat{\operatorname{Var}}\big(S_n(3)\big) = 0.965349^2\left(\frac{20}{(1000)(980)} + \frac{14}{(1400)(1386)} + \frac{10}{(2000)(1990)}\right) = 0.000028 \quad \text{(A)}$$


26.8. Apparently, the exam setters wanted to coax students into wasting their time backing out $s_3/\big(r_3(r_3 - s_3)\big)$ and then using Greenwood's formula. But you're too smart for that: you know that when you have complete data, Greenwood's formula reduces to the empirical variance, and you don't need any of the individual $s_j/\big(r_j(r_j - s_3)\big)$'s. (Believe it or not, the official solution did not use this shortcut, but actually went to the trouble of backing out $s_3/\big(r_3(r_3 - s_3)\big)$!)

$$\widehat{\operatorname{Var}}\big(S_{1000}(3)\big) = \frac{(0.746)(1 - 0.746)}{1000} = 0.000189484 \quad \text{(D)}$$

26.9. They tried to confuse you a little by giving you five years of data but then asking for $\widehat{\operatorname{Var}}\big(S_n(4)\big)$. (Did you get answer D? Shame on you!)

$$S_n(4) = (0.8)(0.7)(0.8)(0.9) = 0.4032$$
$$\widehat{\operatorname{Var}}\big(S_n(4)\big) = 0.4032^2\left(\frac{3}{(15)(12)} + \frac{24}{(80)(56)} + \frac{5}{(25)(20)} + \frac{6}{(60)(54)}\right) = 0.005507 \quad \text{(A)}$$

Calculator Tip

Here's how the calculation could be done on a Multiview calculator. Clear the table (data data 4), enter the $r_j$'s (15, 80, 25, 60) in column L1 and the $s_j$'s (3, 24, 5, 6) in column L2, and enter the Kaplan-Meier formula ln(1 − L2/L1) in column L3 (−0.223, −0.357, −0.223, −0.105). Compute the 1-variable statistics on L3 and store the sum in x: $\sum \ln(1 - s_j/r_j) = \ln S_n(4) = -0.908322562$. Then replace the formula in column L3 with the Greenwood terms L2/(L1(L1 − L2)) (0.0167, 0.0054, 0.01, 0.0019), get the sum statistic, and finish up by multiplying by $e^{2x}$:
$$\left(\sum \text{L3}\right) e^{2x} = 0.005507174$$
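Off the calculator, the same two-pass procedure can be replicated in a few lines; this is my illustration, not part of the manual.

```python
import math

# Sketch of the Multiview procedure in solution 26.9: first pass sums
# ln(1 - s_j/r_j); second pass sums the Greenwood terms and multiplies
# by e^(2 ln S) = S^2.
r = [15, 80, 25, 60]
s = [3, 24, 5, 6]
ln_S = sum(math.log(1 - sj/rj) for rj, sj in zip(r, s))   # -0.908322562
greenwood = sum(sj/(rj*(rj - sj)) for rj, sj in zip(r, s))
var = greenwood * math.exp(2*ln_S)                         # 0.005507
```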

26.10. The variance divided by $\hat S^2$ is
$$0.011467 = \frac{2}{(50)(48)} + \frac{4}{(45)(41)} + \frac{8}{(41 - c_{30})(33 - c_{30})}$$
$$= 0.0008333 + 0.002168 + \frac{8}{(41 - c_{30})(33 - c_{30})}$$
$$\frac{8}{(41 - c_{30})(33 - c_{30})} = 0.008466$$
$$(41 - c_{30})(33 - c_{30}) = 945$$

We can solve the quadratic, or we can note that 945 is about $31^2$ and the two factors differ by 8, so by making the factors 35 and 27, we can verify that they multiply out to 945. Then $c_{30} = 41 - 35 = 6$. (B)
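The integer search implicit in this solution can be sketched as follows (my illustration, not the manual's).

```python
# Sketch (not from the manual): find the censored count c30 in 26.10 by
# matching the leftover Greenwood term 8/((41 - c30)(33 - c30)).
leftover = 0.011467 - 2/(50*48) - 4/(45*41)              # about 0.008466
c30 = next(c for c in range(33)
           if abs((41 - c)*(33 - c) - 8/leftover) < 1)   # c30 = 6
```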

26.11. Use formula (26.1).
$$S_{92}(8) = \left(\frac{89}{92}\right)\left(\frac{85}{87}\right) = 0.9452$$
$$\widehat{\operatorname{Var}}\big(S_{92}(8)\big) = (0.9452^2)\left(\frac{2}{(92)(90)} + \frac{1}{(90)(89)} + \frac{2}{(87)(85)}\right) = 0.0005689$$

26.12. Use formula (26.2).
$$\widehat{\operatorname{Var}}\big(\hat H(8)\big) = \frac{2}{92^2} + \frac{1}{90^2} + \frac{2}{87^2} = 0.0006240$$

26.13.
$$\frac{2}{r_1^2} = 0.0003125$$
$$\frac{3}{r_2^2} = 0.0008912 - 0.0003125 = 0.0005787$$
$$r_1 = 80 \qquad r_2 = 72$$
There were $80 - 72 - 2 = 6$ withdrawals.
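The Greenwood and Åalen variances of 26.11–26.12 come from the same risk sets, so they can be checked together; this is my sketch, not the manual's.

```python
# Sketch (not from the manual) for the data of 26.11-26.12: death times
# have risk sets 92, 90, 87 and death counts 2, 1, 2.
r = [92, 90, 87]
s = [2, 1, 2]

S = 1.0; greenwood = 0.0; varH = 0.0
for rj, sj in zip(r, s):
    S *= (rj - sj)/rj
    greenwood += sj/(rj*(rj - sj))   # Greenwood term
    varH += sj/rj**2                 # Aalen variance term for Nelson-Aalen

var_km = S*S*greenwood               # 0.0005689  (26.11)
# varH is 0.0006240                  (26.12)
```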

26.14. As usual, $s_3$ will denote the number of deaths at time $y_3$ and $r_3$ will denote the risk set at time $y_3$.
$$0.0085828 - 0.0037000 = 0.0048828 = \frac{s_3}{r_3^2}$$
$$0.31625 - 0.16000 = 0.15625 = \frac{s_3}{r_3}$$
Dividing,
$$r_3 = \frac{0.15625}{0.0048828} = 32$$
$$s_3 = 32(0.15625) = 5$$

26.15. To back out d, we write
$$\frac{2}{15} + \frac{3}{13} + \frac{2}{10} + \frac{d}{8} + \frac{2}{8 - d} = 1.5641$$
$$0.5641 + \frac{d}{8} + \frac{2}{8 - d} = 1.5641$$
$$\frac{d}{8} + \frac{2}{8 - d} = 1$$
It's easiest to solve this by plugging in values of d, but if you wish to solve the quadratic:
$$d(8 - d) + 16 = 8(8 - d)$$
$$d^2 - 16d + 48 = 0$$
$$d = 4, 12$$
We reject 12 as being larger than the population, so d = 4. Now we calculate the variance as
$$\frac{2}{15^2} + \frac{3}{13^2} + \frac{2}{10^2} + \frac{4}{8^2} + \frac{2}{4^2} = 0.2341 \quad \text{(D)}$$

26.16. Straightforward; the only trick is that you must ignore time 25 and must take the square root at the end.
$$\widehat{\operatorname{Var}}\big(\hat H(20)\big) = \frac{15}{100^2} + \frac{20}{65^2} + \frac{13}{40^2} = 0.01439$$
$$\sqrt{0.01439} = 0.1198 \quad \text{(C)}$$

26.17.
$$S_{12}(3) = \frac{10}{12} = 0.8333$$
$$\widehat{\operatorname{Var}}\big(S_{12}(3)\big) = \left(\frac{5}{6}\right)^2 \frac{2}{(12)(10)} = 0.01157$$
The upper bound of the confidence interval is $0.8333 + 1.96\sqrt{0.01157} > 1$, so it is 1.

26.18.
$$S_{100}(5) = \frac{97}{100} = 0.97$$
$$\widehat{\operatorname{Var}}\big(S_{100}(5)\big) = 0.97^2\left(\frac{2}{(100)(98)} + \frac{1}{(98)(97)}\right) = 0.000291$$
Width of the confidence interval: $2(1.645)\sqrt{0.000291} = 0.05612$.




 0.15 26.19. The midpoints of the intervals are the estimates for H ( t j ) and H ( t j+1 ) ; they are 0.07125+0.22875 2 and 0.15607+0.38635  0.27121 respectively. It follows that s /r  0.27121 − 0.15  0.12121. j+1 j+1 2 The difference between the top of each interval and the midpoint is 1.96D σ , so D σ2 (the estimate of the variance) is the difference divided by 1.96, squared, or

!2   0.22875 − 0.15 L Var H ( t j )   0.001614 1.96

L H ( t j+1 )  0.38635 − 0.27121 Var 



!2

1.96

 0.003451

The difference, 0.003451 − 0.001614  0.001837 is s j+1 /r 2j+1 . Then s j+1  0.121212 /0.001837  8 . (E) 26.20. We have

| $y_i$ | $r_i$ | $s_i$ | $s_i/r_i$ | $\hat H(y_i) = \sum_{j \le i} s_j/r_j$ | $s_i/r_i^2$ | $\widehat{\operatorname{Var}}\big(\hat H(y_i)\big) = \sum_{j \le i} s_j/r_j^2$ |
|---|---|---|---|---|---|---|
| 1 | 12 | 2 | 0.166667 | 0.166667 | 0.013889 | 0.013889 |
| 2 | 10 | 1 | 0.100000 | 0.266667 | 0.010000 | 0.023889 |
| 3 | 9 | 2 | 0.222222 | 0.488889 | 0.024691 | 0.048580 |
| 4 | 7 | 2 | 0.285714 | 0.774603 | 0.040816 | 0.089397 |

The confidence interval is $0.774603 \pm 1.96\sqrt{0.089397} = (0.189, 1.361)$. (A)

26.21.
$$U = \exp\left(\frac{1.96\sqrt{0.0019}}{0.76 \ln 0.76}\right) = e^{-0.4096} = 0.6639$$
$$S^{1/U} = (0.76)^{1/0.6639} = 0.6614$$

26.22.
$$U = \exp\left(\frac{1.645\sqrt{0.022}}{0.55 \ln 0.55}\right) = \exp(-0.7420) = 0.4761$$
$$S^U = 0.55^{0.4761} = 0.7523 \qquad S^{1/U} = 0.55^{1/0.4761} = 0.2849$$
Width of interval is $0.7523 - 0.2849 = 0.4674$.
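The linear interval of 26.20 and the log-transformed bounds of 26.21–26.22 can be verified together. This is a sketch of mine, not part of the manual; `log_ci_S` is an invented helper name.

```python
import math

# Sketch (not from the manual): linear Nelson-Aalen CI for 26.20, then
# log-transformed CIs for the survival function in 26.21-26.22.
r = [12, 10, 9, 7]; s = [2, 1, 2, 2]             # 26.20 data through time 4
H = sum(sj/rj for rj, sj in zip(r, s))            # 0.774603
v = sum(sj/rj**2 for rj, sj in zip(r, s))         # 0.089397
linear_ci = (H - 1.96*math.sqrt(v), H + 1.96*math.sqrt(v))   # (0.189, 1.361)

def log_ci_S(S, var, z):
    """Log-transformed CI (S**(1/U), S**U) for a survival probability."""
    U = math.exp(z*math.sqrt(var)/(S*math.log(S)))
    return S**(1/U), S**U

lo_21, _ = log_ci_S(0.76, 0.0019, 1.96)           # lower bound 0.6614 (26.21)
lo_22, hi_22 = log_ci_S(0.55, 0.022, 1.645)
width_22 = hi_22 - lo_22                          # 0.4674 (26.22)
```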

26.23. Take the logarithms of the lower and upper bounds of the given confidence interval, $S^{1/U} = 0.400$ and $S^U = 0.556$.
$$U \ln S = \ln 0.556 \qquad \frac{1}{U}\ln S = \ln 0.4$$
$$U^2 = \frac{\ln 0.556}{\ln 0.4} = 0.64 \qquad U = 0.80$$
$$S = \exp(U \ln 0.4) = 0.4^{0.80} = 0.48$$
$$\exp\left(\frac{z_{0.975}\sqrt{\hat v}}{S \ln S}\right) = 0.8$$
$$z_{0.975}\sqrt{\hat v} = (\ln 0.8)(S \ln S) = (\ln 0.8)(0.48)(\ln 0.48) = 0.0786$$
The interval is $0.48 \pm 0.0786 = (0.4014, 0.5586)$.
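The conversion in solution 26.23 can also be checked numerically; the variable names in this sketch are mine.

```python
import math

# Sketch (not from the manual): recover the point estimate and the linear
# 95% CI from a log-transformed 95% CI (S**(1/U), S**U), as in 26.23.
lo, hi = 0.400, 0.556
U = math.sqrt(math.log(hi) / math.log(lo))   # U^2 = ln(hi)/ln(lo)
S = lo ** U                                  # S = exp(U ln lo), about 0.48
half = math.log(U) * S * math.log(S)         # z_0.975 * sqrt(v-hat), > 0
linear_ci = (S - half, S + half)             # about (0.4014, 0.5586)
```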

26.24. Take the logarithms of the lower and upper bounds of the given confidence interval, $S^{1/U} = 0.81$ and $S^U = 0.88$.
$$U \ln S = \ln 0.88 \qquad \frac{1}{U}\ln S = \ln 0.81$$
$$U^2 = \frac{\ln 0.88}{\ln 0.81} = 0.6066 \qquad U = 0.7789$$
$$\ln S = \frac{\ln 0.88}{U} = \frac{\ln 0.88}{0.7789} = -0.1641 \qquad S = e^{-0.1641} = 0.8486$$
$$\frac{z_{0.95}\sqrt{\hat v}}{S \ln S} = \ln 0.7789 = -0.2500$$
$$\frac{z_{0.975}\sqrt{\hat v}}{S \ln S} = -0.2500\left(\frac{1.96}{1.645}\right) = -0.2978$$
$$U' = \exp\left(\frac{z_{0.975}\sqrt{\hat v}}{S \ln S}\right) = e^{-0.2978} = 0.7425$$
$$S^{U'} = 0.8486^{0.7425} = 0.8852 \text{ and } S^{1/U'} =$$